Save up to 90% on Your AWS Bill with Spot Instances

Jul 1, 2016

Is the majority of your AWS monthly bill made up of thousands of on-demand EC2 instance hours, steadily increasing month by month?

If your organization knows it'll be around for years to come, you've probably already purchased Reserved Instances to lower the cost (paying in advance for instances can cut the cost by up to 75%, depending on how long a term you pay for). But this pricing model isn't a good fit for most startups and small businesses, as they may shut down abruptly at a moment's notice. Whether you utilize Reserved Instances or not, there's another nifty trick you can take advantage of to get a discount of up to 90% on the on-demand EC2 instance price, without paying anything in advance!

Even if your organization doesn't seem to care about optimizing the AWS monthly bill, I'm sure they'd appreciate the huge savings you could harness by utilizing Spot instances correctly! You might even be able to convince the management to organize a company-wide recreational day with the money you save on the monthly AWS bill.

Spot Instances

Their name is a bit misleading -- they are pretty much regular EC2 instances, but usually priced much lower (up to 90% off the on-demand price) and with a different life expectancy. Also, they are not a new type of offering -- they have existed for quite some time now, and yet, most AWS customers still don't take full advantage of them.

Why does AWS offer the same underlying hardware for much less, without any upfront payment? Here's why.

AWS provisions millions of physical servers in advance for their on-demand EC2 instance service. They need to be prepared for when their clients decide they want to spin up 1,000 m4.large instances within a few minutes, without any advance notice.

AWS Data Center

Since AWS has to be ready to grant you your on-demand instances whenever you need them, they have to provision the hardware in advance and keep it running to support your request at a moment's notice. This creates a situation where tons of EC2 compute power is being actively wasted -- physical servers just sitting there doing absolutely nothing, wasting precious resources and money for Amazon (electricity / hardware / cooling / etc), as they wait for clients to utilize them by starting up on-demand instances. There will always be spare compute power in the AWS cloud, waiting for clients to utilize it.

Imagine you had to file a request for on-demand instances and wait a few hours until AWS manually plugged in and provisioned more physical servers in its data centers to fulfill your request! That wouldn't be much fun.

Spare Compute Power

Spot instances are Amazon's solution to this problem -- they are essentially a stock market for spare compute power. Amazon sells any excess compute power for ridiculously low hourly rates -- usually 80% to 90% off the on-demand price for each supported instance type (not all instance types are available as Spot instances). The Spot price, or the hourly cost of running a Spot instance, is specific to each instance type and availability zone, and is decided by Amazon's secret algorithm, which factors in the supply and demand of this spare compute power.

Spot instances let you bid on spare Amazon EC2 instances to name your own price for compute capacity. The Spot price fluctuates based on the supply and demand of available EC2 capacity. Your Spot instance is launched when your bid exceeds the current Spot market price, and will continue to run until you choose to terminate it, or until the Spot market price exceeds your bid.

Unfortunately, since this is a market, the price can fluctuate both ways. The Spot price will be substantially less than the on-demand price 95% of the time. However, when demand for compute power within the AWS cloud grows, or when some clients bid too high for Spot instances, the Spot price may spike and actually surpass the on-demand price for a short period of time. If the Spot price exceeds your bid price, your server will be shut down with 2 minutes' notice.
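To make the launch and shutdown rules above concrete, here's a minimal Python sketch. It assumes one price sample per hour and a request that relaunches the instance once the price drops back below the bid; the function name and the sample prices are made up for illustration and are not an AWS API.

```python
from typing import List


def simulate_spot_instance(spot_prices: List[float], max_bid: float) -> List[str]:
    """Return the instance state ("running" or "stopped") for each price sample.

    Assumption: the request relaunches the instance once the Spot price
    falls back below the bid.
    """
    states = []
    running = False
    for price in spot_prices:
        if not running and max_bid > price:
            running = True   # launched: the bid exceeds the Spot price
        elif running and price > max_bid:
            running = False  # shut down: the Spot price exceeds the bid
        states.append("running" if running else "stopped")
    return states


# Hypothetical hourly Spot prices with one spike above the bid
prices = [0.02, 0.025, 0.15, 0.03]
print(simulate_spot_instance(prices, max_bid=0.12))
```

With these sample prices, the instance runs for the first two hours, is shut down during the spike, and comes back once the price drops.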

Show Me the Money

So what does the Spot price usually look like?

Here's a graph representing 3 months of Spot pricing history for the m4.large instance type in four different us-east-1 AZs, generated on July 1st of 2016:

Spot Price History

For reference, the standard on-demand price for m4.large is $0.12 per instance hour, approximately $86.40 per month.

As you can see from the graph, most of the time, the Spot price fluctuates between $0.0185 (85% less than on-demand, $13.32 / month) and $0.0276 (77% less than on-demand, $19.87 / month), except for when it peaks ridiculously, for relatively short periods of time, most likely due to high demand in a certain AZ. Paying around $15 per month instead of $86.40 for an m4.large instance sounds like a hell of a deal to me!
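If you want to check the math on your own instance types, here's a quick Python sketch of the arithmetic above. It assumes a 720-hour month (30 days), which is what the $86.40 figure implies; the function names are just for illustration.

```python
HOURS_PER_MONTH = 720  # assumption: 30 days x 24 hours


def monthly_cost(hourly_price: float) -> float:
    """Monthly cost in dollars for one instance running all month."""
    return round(hourly_price * HOURS_PER_MONTH, 2)


def discount_percent(spot_price: float, on_demand_price: float) -> int:
    """Savings relative to on-demand, as a whole percentage."""
    return round((1 - spot_price / on_demand_price) * 100)


ON_DEMAND = 0.12  # m4.large on-demand price quoted above

for spot in (0.0185, 0.0276):
    print(spot, monthly_cost(spot), discount_percent(spot, ON_DEMAND))
print(ON_DEMAND, monthly_cost(ON_DEMAND))
```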

It's also worth noting that most of the time, the Spot price peaks in only one AZ at a time. So if you spread out your Spot instances across multiple AZs (you should already be doing this for high availability anyway), there is much less chance that all of your Spot instances will terminate at once.

You can view an up-to-date pricing history graph by checking out the Spot instance launch wizard in the AWS Console.

Fear of Spot Instances

I speculate that AWS customers are afraid of using Spot for the following reasons:

The workflow involved in provisioning them is cluttered and messy
People are scared off by the fact that Spot instances may shut down abruptly as the Spot price increases past their max bid price
Spot instances are not supported in all AWS services (Elastic Beanstalk, for example, although there are hacks to make them work in EB as well)

Furthermore, most people don't really understand how to utilize Spot instances safely. They tend to go with a radical approach to using Spot -- all or nothing. You see, you shouldn't be relying on Spot instances for 100% of your workload. What you should be doing is supplementing your on-demand instances with Spot instances, where appropriate.

I'd say it's pretty much a safe bet to utilize Spot for 70% of your stateless workload, provided you have at least 10 total instances. Since the Spot price usually won't spike in all AZs at once, you should be good to go. However, the percentage of Spot instances you employ in your workload is definitely application and task-specific. Make your own decision on how much Spot to supplement your scalable environment with.
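Here's a rough Python sketch of what that split could look like when you also spread the instances across AZs. The 70% figure and the zone names come from this post; the function names and the even-spread strategy are my own assumptions, so adjust them to your workload.

```python
from typing import Dict, List


def spread(count: int, zones: List[str]) -> Dict[str, int]:
    """Spread `count` instances as evenly as possible across the zones."""
    base, extra = divmod(count, len(zones))
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def plan_fleet(total: int, spot_fraction: float, zones: List[str]) -> Dict[str, Dict[str, int]]:
    """Split a fleet into Spot and on-demand instances, per zone."""
    spot_total = round(total * spot_fraction)
    return {
        "spot": spread(spot_total, zones),
        "on_demand": spread(total - spot_total, zones),
    }


zones = ["us-east-1a", "us-east-1b", "us-east-1c", "us-east-1d"]
print(plan_fleet(10, 0.7, zones))
```

For 10 instances at 70% Spot, this gives 7 Spot instances spread over the four zones and 3 on-demand instances to fall back on.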

There is one exception though -- feel free to use 100% Spot instances for any stateless app that is not meant for production -- development and test servers are absolutely fine for Spot, as long as you're OK with suffering a bit of downtime every once in a while if the Spot price peaks.

Bidding

How do you know how much to bid for each Spot instance? A good practice is to simply bid 100% of the on-demand price. That way, you'll never pay more than if the instance were a standard on-demand instance, and you won't be charged excessively when the Spot price spikes uncontrollably. The best part is that you don't pay your max bid price, but instead, you pay the current Spot price. If you set your max bid price to 100% of the on-demand price, you have absolutely nothing to lose when using Spot!

You can then simply replace terminated Spot instances with on-demand instances and suffer no blow to your wallet, and hopefully no noticeable service degradation, provided you supplemented correctly.
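Here's a small Python sketch of why bidding 100% of the on-demand price can't cost you more than on-demand. It assumes one price sample per hour, that you pay the current Spot price while it is at or below your bid, and that you fall back to an on-demand instance for any hour where the Spot price is above it. The sample prices are made up, and partial-hour billing is ignored.

```python
from typing import List


def blended_cost(spot_prices: List[float], on_demand_price: float) -> float:
    """Total cost when bidding exactly the on-demand price.

    Each hour costs the Spot price if it is within the bid, otherwise
    the on-demand price (assumed fallback during a spike).
    """
    total = 0.0
    for price in spot_prices:
        if price <= on_demand_price:
            total += price            # you pay the Spot price, not your bid
        else:
            total += on_demand_price  # Spot is interrupted; on-demand takes over
    return round(total, 4)


prices = [0.02, 0.025, 0.15, 0.03]  # hypothetical hourly Spot prices
on_demand = 0.12

print(blended_cost(prices, on_demand))      # Spot with on-demand fallback
print(round(on_demand * len(prices), 4))    # pure on-demand for the same hours
```

Even in the hour where the price spikes, the worst case is the regular on-demand rate.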

Supplementation Example

A backend API service running on 20 on-demand m4.large instances could instead be running on 10 on-demand and 10 Spot instances, with the maximum bid price set at 100% of the on-demand price. You could then set up a monitoring service to automatically increase the number of on-demand instances when Spot instances get terminated, and terminate the extra on-demand instances when the Spot instances are back up.
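The core decision such a monitoring service has to make can be sketched in a few lines of Python. This is only the calculation, under the assumption that something else reports how many Spot instances are currently running and applies the result; the function and parameter names are hypothetical.

```python
def desired_on_demand(target_total: int, baseline_on_demand: int, running_spot: int) -> int:
    """How many on-demand instances to run so total capacity stays on target.

    Never drops below the on-demand baseline, and only adds on-demand
    instances to cover Spot instances that are currently missing.
    """
    shortfall = max(0, target_total - baseline_on_demand - running_spot)
    return baseline_on_demand + shortfall


# The 20-instance example: 10 on-demand baseline + 10 Spot
print(desired_on_demand(20, 10, running_spot=10))  # all Spot instances up
print(desired_on_demand(20, 10, running_spot=4))   # 6 Spot instances terminated
print(desired_on_demand(20, 10, running_spot=0))   # every Spot instance terminated
```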

Spot Isn't for Everything

Note that you shouldn't be using Spot for running any sensitive workloads. If it isn't clear enough already, you should NOT run your databases on Spot. Spot instances are meant for workloads which do not persist any sensitive data to a local disk, since they may shut down abruptly as the Spot price increases past your bid price. They're perfect as web servers, API backends, Hadoop, etc. -- basically, any stateless application or task that can be interrupted safely and replaced by an on-demand instance without any need for backup and restore.

Spot Blocks

AWS also provides an option to guarantee uptime of up to 6 hours for your Spot instance, in exchange for a higher Spot price. You won't be affected by a price spike, but you'll be paying a bit more for each instance hour. And your instance will always be terminated after 6 hours.

You may prefer this type of Spot instance if you have some one-time task that you need less than 6 hours to finish and don't want it interrupted.

Using Spot

There are several ways to utilize Spot instances:

Manually request them using the Spot Requests console (not recommended, as there is no resiliency here -- once the Spot price exceeds your bid price, your instances will shut down and AWS will not restore them after the price goes back down)
Request a "Spot Fleet" in the Spot Requests console to attempt to maintain your target capacity by relaunching Spot instances after the price goes down again
Configure an EC2 Launch Configuration to use Spot instances instead of on-demand and hook up an Auto Scaling Group to use your launch configuration (recommended)
Configure an Elastic Beanstalk environment to utilize 100% Spot instances (not recommended except for non-production workloads)

I'd recommend going with the Launch Configuration method, as it's the least cluttered. I tend to find the Spot requests console a bit messy to work with.

If you wish to supplement an existing Elastic Load Balancer with Spot instances, simply create another Auto Scaling Group, duplicate its Launch Configuration, modify it to use Spot instances, and finally, attach it to your existing ELB. Your ELB will now forward traffic to both Auto Scaling Groups, and you'll be able to easily increase or decrease the number of desired instances in either Auto Scaling Group to maintain a good on-demand-to-Spot-instances ratio.
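If you'd rather script this than click through the console, here's a hedged Python sketch using boto3. The two builder functions only assemble parameters for the Auto Scaling `create_launch_configuration` and `create_auto_scaling_group` calls, so they run without AWS access. All names, the AMI ID, and the ELB name are placeholders I made up; swap in your own.

```python
from typing import Any, Dict, List


def spot_launch_configuration(name: str, image_id: str, instance_type: str,
                              max_bid: float) -> Dict[str, str]:
    """Parameters for a Launch Configuration that requests Spot instances."""
    return {
        "LaunchConfigurationName": name,
        "ImageId": image_id,
        "InstanceType": instance_type,
        "SpotPrice": f"{max_bid:.4f}",  # max bid, passed as a string
    }


def spot_auto_scaling_group(name: str, launch_configuration_name: str, desired: int,
                            zones: List[str], load_balancers: List[str]) -> Dict[str, Any]:
    """Parameters for an Auto Scaling Group attached to an existing ELB."""
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_configuration_name,
        "MinSize": 0,
        "MaxSize": desired,
        "DesiredCapacity": desired,
        "AvailabilityZones": zones,
        "LoadBalancerNames": load_balancers,
    }


def apply(launch_config: Dict[str, str], group: Dict[str, Any]) -> None:
    """Create both resources. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the builders above work without it
    client = boto3.client("autoscaling")
    client.create_launch_configuration(**launch_config)
    client.create_auto_scaling_group(**group)


# Placeholder values for illustration only
config = spot_launch_configuration("api-spot-lc", "ami-12345678", "m4.large", max_bid=0.12)
group = spot_auto_scaling_group("api-spot-asg", config["LaunchConfigurationName"], 10,
                                ["us-east-1a", "us-east-1b"], ["my-existing-elb"])
print(config)
print(group)
# apply(config, group)  # uncomment to actually create the resources
```

Setting the bid to the on-demand price (0.12 for m4.large here) matches the bidding advice earlier in this post.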

That's it! Let me know how you take advantage of Spot instances to save tons of money on your AWS bill in the comments below!