by Emily Dunenfeld
The surge in demand for GPU compute tracks the rapid growth of compute-intensive applications in AI/ML, blockchain, gaming, and other areas. Fortunately, companies looking to harness GPU compute power can rent GPUs from cloud providers like Amazon instead of investing in expensive hardware upfront. Amazon offers several EC2 instance options with GPUs, categorized into the G and P families, which we will compare, focusing mainly on use case and price.
GPUs (graphics processing units) are designed to handle parallel processing tasks. While CPUs are ideal for general-purpose tasks and management tasks, GPUs are ideal for compute-intensive tasks, such as machine learning (ML), data analytics, rendering graphics, and scientific simulations.
They are extremely popular because of the need for high-performance compute and the growing complexity of the tasks just listed. As a result, demand has at times outstripped supply. There are more options to choose from now, though keep in mind that availability may still be limited and region-specific.
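Since availability differs by region, it can help to check which GPU instance types a region offers before planning around one. The following is a minimal sketch, assuming the boto3 SDK is installed and AWS credentials are configured. The helper names `gpu_types_in` and `list_gpu_offerings` are hypothetical, and the default family list simply mirrors the families discussed in this article.

```python
def gpu_types_in(offerings, families=("g4dn", "g5", "p4d", "p5")):
    """Return the sorted, de-duplicated GPU instance types found in `offerings`.

    `offerings` is a list of dicts shaped like the entries under
    "InstanceTypeOfferings" in an EC2 DescribeInstanceTypeOfferings response.
    `families` is an assumed default based on the families named in the article.
    """
    found = set()
    for offering in offerings:
        instance_type = offering["InstanceType"]
        # "g5.xlarge" -> family "g5"
        family = instance_type.split(".")[0]
        if family in families:
            found.add(instance_type)
    return sorted(found)


def list_gpu_offerings(region):
    """Query EC2 for GPU instance types offered in `region`.

    Needs boto3 and valid credentials; imported lazily so the pure helper
    above can be used without either.
    """
    import boto3

    client = boto3.client("ec2", region_name=region)
    paginator = client.get_paginator("describe_instance_type_offerings")
    offerings = []
    for page in paginator.paginate(LocationType="region"):
        offerings.extend(page["InstanceTypeOfferings"])
    return gpu_types_in(offerings)
```

Calling `list_gpu_offerings("us-east-1")` would return the matching types for that region. A type being offered in a region does not guarantee that capacity is available at launch time.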
Within the accelerated computing instances, two families consist of GPU-based instances: the G and P families. The P family was the first of the accelerated computing instances and was designed for general-purpose GPU compute tasks. The family has since evolved and has become widely adopted for ML workloads, with AI companies like Anthropic and Cohere using P family instances. The G family is optimized for graphics-intensive applications and has since expanded to cover ML use cases as well.
EC2 P family information with Amazon recommended use cases
EC2 G family information with Amazon recommended use cases
For graphics workloads, the choice between the P and G families is usually simple: pick an instance in the G family. For ML workloads, more factors affect the choice, the main ones being use case, performance, instance size, and price. Other factors, such as availability and hardware compatibility, will further narrow down the options.
The first thing to consider is the use case, whether it’s training models, performing inference, or deploying pre-trained models. Certain instances are designed to handle these requirements better than others.
P family instances are generally much more powerful than comparable G family instances, making them an excellent choice for demanding ML tasks, such as large-scale model training or high-performance computing (HPC) workloads. Another rule of thumb is that later generations of an instance type tend to be more performant than earlier ones. So, if your use case requires the highest level of performance, consider P5 or P4 instances.
However, in many cases, such as for deploying pre-trained models or performing inference, you just don’t need that level of compute. In those scenarios, the G5 or G4dn instances can be a more suitable and cost-effective choice.
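These rules of thumb can be written down as a small lookup. This is only an illustrative sketch of the guidance in this article, not an official AWS recommendation engine; the function name `suggest_instances` and the workload labels are hypothetical.

```python
def suggest_instances(workload):
    """Map a workload label to candidate EC2 GPU options.

    The mapping restates the article's rules of thumb; the labels are
    made up for illustration and real choices also depend on price,
    instance size, and availability.
    """
    rules = {
        "graphics": ["G family"],
        "large-scale training": ["P5", "P4"],
        "hpc": ["P5", "P4"],
        "inference": ["G5", "G4dn"],
        "pre-trained deployment": ["G5", "G4dn"],
    }
    key = workload.strip().lower()
    if key not in rules:
        raise ValueError(f"unknown workload label: {workload!r}")
    return rules[key]


if __name__ == "__main__":
    for label in ("Inference", "large-scale training", "graphics"):
        print(label, "->", suggest_instances(label))
```

The output is a shortlist to start from, not a final answer.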
The size of the instance, in terms of CPU and memory capacity, is another important consideration since it significantly impacts performance and cost-effectiveness. The G family offers a wider range of instance sizes, allowing you to choose the appropriate CPU and memory capacity based on your workload requirements. In contrast, the P family has fewer options; for example, the P5 and P4 series each have only one instance size available.
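Picking a size from a family with several options amounts to finding the smallest one that meets your CPU and memory needs. The sketch below assumes a hand-maintained catalog. The `Size` type and `smallest_fit` function are hypothetical names, and the g4dn figures are illustrative values that should be checked against the current AWS instance type documentation.

```python
from typing import NamedTuple, Optional, Sequence


class Size(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int


# Illustrative catalog entries (assumption: verify against AWS documentation).
G4DN_SIZES = [
    Size("g4dn.xlarge", 4, 16),
    Size("g4dn.2xlarge", 8, 32),
    Size("g4dn.4xlarge", 16, 64),
    Size("g4dn.8xlarge", 32, 128),
]


def smallest_fit(
    sizes: Sequence[Size], min_vcpus: int, min_memory_gib: int
) -> Optional[Size]:
    """Return the smallest size meeting both minimums, or None if none does."""
    candidates = [
        s for s in sizes
        if s.vcpus >= min_vcpus and s.memory_gib >= min_memory_gib
    ]
    if not candidates:
        return None
    # "Smallest" here means fewest vCPUs, then least memory.
    return min(candidates, key=lambda s: (s.vcpus, s.memory_gib))
```

With a single-size series there is nothing to choose between, so any requirement below that size still pays for the full instance.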
EC2 GPU instances pricing US East (N. Virginia)
G family instances tend to be much more cost-effective than their P family counterparts, potentially resulting in significant cost savings for organizations that don’t require the highest levels of GPU performance.
The g4dn instance receives a lot of attention, and rightfully so. It is the lowest-cost EC2 GPU instance and performs well for ML inference and small-scale training.
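To see how an hourly price gap adds up, the arithmetic can be sketched as below. The function names are hypothetical, and the rates in the example are placeholders rather than real AWS prices; substitute the current On-Demand rates for the instances you are comparing.

```python
HOURS_PER_MONTH = 730  # common approximation: 365 days * 24 hours / 12 months


def monthly_cost(hourly_rate, instance_count=1, hours=HOURS_PER_MONTH):
    """Cost of running `instance_count` instances for `hours` at `hourly_rate`."""
    return hourly_rate * instance_count * hours


def monthly_savings(expensive_rate, cheaper_rate, instance_count=1,
                    hours=HOURS_PER_MONTH):
    """Difference in monthly cost between two hourly rates."""
    return (monthly_cost(expensive_rate, instance_count, hours)
            - monthly_cost(cheaper_rate, instance_count, hours))


if __name__ == "__main__":
    # Placeholder rates in USD per hour, NOT actual AWS pricing.
    placeholder_p_rate = 10.0
    placeholder_g_rate = 1.0
    print(monthly_savings(placeholder_p_rate, placeholder_g_rate))
```

The comparison only makes sense when the cheaper instance can actually handle the workload; a slower instance that runs several times longer can erase the savings.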
Selecting an EC2 GPU instance, though less of an investment than purchasing and setting up your own hardware, is still a significant commitment with many factors to consider. The two accelerated computing families with GPU instances, the P family and G family, each offer several options. While the P family has instances better suited to demanding tasks like large-scale model training, G family instances balance performance and cost-effectiveness, which makes them a good choice for many workloads.