EC2 Right Sizing

Each year AWS re:Invent brings more services and upgrades, yet the basic rules for building scalable, fault-tolerant architecture stay the same.

AWS’s pay-as-you-go model is an attractive hook for teams moving off expensive on-premises hardware that is often paid for upfront. However, unless the environment is architected properly, AWS can end up costing more than it should.

With the economy heading towards a recession, cost optimization is the need of the hour, and organisation leaders are being asked to do the same or more with less.

One of the most costly building blocks of an environment on AWS is the humble EC2 instance, the foundation of computing power in AWS.

EC2 instances can be used for storage, web servers, backend applications, worker nodes; the use cases are endless. Yet in most of our AWS Well-Architected Framework reviews, EC2 instances are found to be overprovisioned or not fully utilized.

Even if your EC2 instances are correctly provisioned, there is a lot a Solutions Architect can do to optimize costs quickly and unlock discounts on the monthly bill. In this article we will go through some of the ways you can reduce your EC2 costs and implement those much-desired discounts.

EC2 Rightsizing:

80% of cost savings come directly from checking which instance classes your EC2 instances use and whether they actually consume the resources allocated to them or spend their time idle.

Compute workloads on AWS can be divided into four distinct use cases.

Steady load – This is the normal, baseline load on the compute nodes. It is usually fairly constant and can be predicted from historical metric data and business expansion plans. This use case is a perfect fit for the major discounts that Reserved Instances offer.

Spiky load – Similar to steady load, except that the load spikes up and down around rush hours.

For such workloads, we size the EC2 instances for the average steady load and create scaling policies that scale the system horizontally or vertically, depending on the requirement, to absorb the spikes. That way you pay more only while a spike lasts, rather than paying all the time for capacity you are not using.

Test load – Developers often create instances to test or try out a new feature in the code or on AWS. Since this type of work is done often but for short periods, and service disruptions are acceptable, Spot Instances are perfect for it. In fact, Spot Instances can be used for any application that has a graceful exit scenario (e.g. containerized workloads, where even if a host goes down, a new one can be spun up and new containers provisioned).

Dev/Stage/QA – Because development and staging replicas of production are only used during business hours, Monday through Friday, you can use many AWS offerings to shut down the resources after hours and over the weekend. A simple scheduler can bring the AWS account costs down substantially, and since the workloads in these environments are by and large not bound to an SLA, Spot Instances might also be a good option.
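Two of the cases above, spiky load and scheduled Dev/Stage/QA environments, reduce to simple arithmetic, which the sketch below works through. The hourly rate, instance counts, spike hours, and the 08:00 to 18:00 business window are made-up assumptions for illustration, not AWS figures. `should_be_running` is a hypothetical helper that a scheduler of your choice could call before starting or stopping instances.

```python
from datetime import datetime

HOURS_PER_MONTH = 730   # average number of hours in a month
HOURS_PER_WEEK = 168

def peak_provisioned_cost(peak_instances: int, hourly_rate: float) -> float:
    """Cost of running enough instances for the peak, all month."""
    return peak_instances * HOURS_PER_MONTH * hourly_rate

def autoscaled_cost(base_instances: int, peak_instances: int,
                    spike_hours: float, hourly_rate: float) -> float:
    """Cost of a steady baseline plus extra instances only during spikes."""
    extra_instances = peak_instances - base_instances
    instance_hours = base_instances * HOURS_PER_MONTH + extra_instances * spike_hours
    return instance_hours * hourly_rate

def should_be_running(now: datetime, start_hour: int = 8, end_hour: int = 18) -> bool:
    """True Monday to Friday from start_hour (inclusive) to end_hour (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour

def scheduled_uptime_fraction(start_hour: int = 8, end_hour: int = 18) -> float:
    """Share of the week a business-hours-only environment is running."""
    return 5 * (end_hour - start_hour) / HOURS_PER_WEEK

if __name__ == "__main__":
    rate = 0.10  # assumed hourly price in USD, for illustration only
    print(f"Sized for peak:   ${peak_provisioned_cost(6, rate):.2f}/month")
    print(f"Baseline + scale: ${autoscaled_cost(2, 6, 120, rate):.2f}/month")
    print(f"Dev uptime share: {scheduled_uptime_fraction():.0%} of the week")
```

With a ten-hour business day, a Dev/Stage/QA environment only needs to be up for 50 of the 168 hours in a week, which is where the scheduler's savings come from.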

Identify EC2 Metrics for Right-Sizing:

We’ve established that setting up EC2 with the right instance class and size is imperative for a cost-optimized environment. But how do you check the metrics for an instance?

There are a couple of ways, both manual and through AWS managed services. The first method is to check the CloudWatch metrics.

To view EC2 usage and performance metrics, go to the EC2 console, select an instance, and open the Monitoring tab.

You will be greeted with the CloudWatch metrics for that particular instance.
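The same CPU numbers can also be pulled programmatically. The sketch below assumes a boto3 CloudWatch client is passed in (installing boto3 and configuring credentials are not shown). The function name `summarize_cpu`, the 14-day lookback, and the hourly period are illustrative choices rather than anything prescribed by AWS.

```python
from datetime import datetime, timedelta, timezone

def summarize_cpu(cloudwatch, instance_id: str, days: int = 14) -> dict:
    """Return the average and peak CPUUtilization (%) for one instance.

    `cloudwatch` is expected to be a boto3 CloudWatch client, for example
    boto3.client("cloudwatch"). The lookback window is an assumption.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=start,
        EndTime=end,
        Period=3600,                       # one datapoint per hour
        Statistics=["Average", "Maximum"],
    )
    points = response["Datapoints"]
    if not points:
        # No data yet (new or stopped instance): nothing to summarize.
        return {"datapoints": 0, "average": None, "maximum": None}
    return {
        "datapoints": len(points),
        "average": sum(p["Average"] for p in points) / len(points),
        "maximum": max(p["Maximum"] for p in points),
    }
```

A call would look like `summarize_cpu(boto3.client("cloudwatch"), "i-0123456789abcdef0")`, where the instance ID is a placeholder. Looking at the peak as well as the average helps avoid downsizing an instance that is idle most of the time but busy during short bursts.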

The metrics for this instance show low CPU utilization, so it seems to be a prime candidate for right-sizing. However, since memory metrics are not available in CloudWatch by default, it is best practice to first set up the CloudWatch agent on the instance and push memory metrics to CloudWatch so that you get the full picture. It would be disastrous if a memory-intensive instance got right-sized based on CPU metrics alone.

You can collect the additional memory and disk metrics by installing and configuring the CloudWatch agent on your EC2 instances. Additional costs apply, since the agent publishes new custom metrics.
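The rule of never right-sizing on CPU alone can be written down as a small check. This is a minimal sketch: the 40% threshold and the label strings are assumptions chosen for illustration and should be tuned to your own workloads.

```python
from typing import Optional

# Assumed threshold for illustration; not an AWS-defined value.
LOW_UTILIZATION_PCT = 40.0

def rightsizing_hint(peak_cpu_pct: float, peak_memory_pct: Optional[float]) -> str:
    """Classify an instance from its peak CPU and memory utilization."""
    if peak_memory_pct is None:
        # Memory is not reported by default, so refuse to decide on CPU alone.
        return "collect-memory-metrics"
    cpu_is_low = peak_cpu_pct < LOW_UTILIZATION_PCT
    memory_is_low = peak_memory_pct < LOW_UTILIZATION_PCT
    if cpu_is_low and memory_is_low:
        return "downsize-candidate"
    if cpu_is_low:
        # Idle CPU but busy memory: a smaller size of the same family would hurt.
        return "memory-bound"
    return "keep"

if __name__ == "__main__":
    print(rightsizing_hint(12.0, None))   # no memory data yet
    print(rightsizing_hint(12.0, 25.0))   # both low
    print(rightsizing_hint(12.0, 85.0))   # low CPU, high memory
```

An instance flagged as memory-bound is better served by a memory-optimized instance family than by simply stepping down a size.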

Another AWS managed service that can help with right-sizing is AWS Compute Optimizer. You can find it in the services menu of the AWS console.

AWS Compute Optimizer is a machine-learning-based managed service that analyzes utilization data for your EC2 instances and provides right-sizing recommendations. It covers EC2 instances, Auto Scaling groups, EBS volumes, and Lambda functions.

The findings page has a View recommendations button that lets you see the recommendations in more detail. Let’s look at one instance that is over-provisioned.

AWS Compute Optimizer shows exactly which resources of the EC2 instance are overprovisioned (compute, storage, network, etc.) and recommends a new target instance type.

You get a complete picture of the risk of the migration (what sort of performance degradation can be expected), the level of effort it requires, and the savings you can expect. By moving this instance from a t3.xlarge to an r6g.large, for example, you can save roughly 40% per month. Note that r6g instances run on Arm-based AWS Graviton processors, so the workload must be able to run on Arm, which is part of that migration effort.
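A savings figure like this can be sanity-checked with a few lines of arithmetic. In the sketch below, the two hourly prices are assumed example values for illustration only; on-demand prices vary by region and change over time, so take the real numbers from the current price list or from the Compute Optimizer recommendation itself.

```python
from typing import Tuple

HOURS_PER_MONTH = 730  # average number of hours in a month

def monthly_savings(current_hourly: float, target_hourly: float) -> Tuple[float, float]:
    """Return (USD saved per month, percentage saved) for an instance-type change."""
    if current_hourly <= 0:
        raise ValueError("current_hourly must be positive")
    saved = (current_hourly - target_hourly) * HOURS_PER_MONTH
    percent = (current_hourly - target_hourly) / current_hourly * 100
    return saved, percent

if __name__ == "__main__":
    # Assumed example prices in USD per hour, not authoritative quotes.
    current_price = 0.1664   # stand-in for the current, larger instance
    target_price = 0.1008    # stand-in for the recommended instance
    saved, percent = monthly_savings(current_price, target_price)
    print(f"Estimated savings: ${saved:.2f}/month ({percent:.1f}%)")
```

Multiplying the per-instance figure by the number of similar instances in the account gives a rough idea of whether the migration effort is worth it.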

Creating infrastructure on AWS requires considerable care in how components are provisioned and whether they are sized for what they actually consume. CloudWatch metrics and AWS Compute Optimizer let you see whether what you are running costs more than it needs to, so you can cut down on unoptimized instances and immediately give your wallet a major sigh of relief.