Scaling basics on AWS: Horizontal and Vertical

By Tim Gray
November 22, 2018

Scaling is a large part of why running applications in the cloud is a good idea, but the different approaches to scaling are often not talked about in much detail. So I thought I would quickly cover some of the options when it comes to scaling on AWS (these approaches work on other clouds too, they just use different terminology).

Some definitions

We techy people love defining things, so here is a quick definition or two that will help us when we talk about scaling.
Scaling
Scaling is the act of changing the size of a computer system during operation to meet changes in demand or operational requirements.
Auto-Scaling
Auto-Scaling is a computer system changing its size automatically during normal operation to meet the requirements of a dynamic system.
Node
A node is a component in a computer architecture that is responsible for a part of that system’s operation.
Instance
An instance is a single physical or virtual server in a computer architecture. The terms node and instance can be used interchangeably in most systems, though in some systems a single instance can host many nodes.
Horizontal Scaling
Horizontal Scaling is the act of changing the number of nodes in a computing system without changing the size of any individual node.
Vertical Scaling
Vertical Scaling is the act of changing the size and computing power of a single instance or node without changing the number of nodes or instances.
Load Balancer
A load balancer is a computing architecture component that is responsible for distributing load across a cluster of nodes.
Scaling Up
Increasing the size and capacity of a software system.
Scaling Down
Decreasing the size and capacity of a software system (and generally the costs).
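To make the difference between horizontal and vertical scaling concrete, here is a minimal sketch in Python. The `Cluster` class and the use of vCPUs as a stand-in for "size" are assumptions made for illustration, not anything AWS provides.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cluster:
    """Toy model of a cluster: identical nodes, each with a fixed size."""
    node_count: int      # how many nodes or instances are running
    vcpus_per_node: int  # stand-in for the "size" of each node

    @property
    def total_vcpus(self) -> int:
        return self.node_count * self.vcpus_per_node


def scale_horizontally(cluster: Cluster, new_node_count: int) -> Cluster:
    """Change the number of nodes; each node keeps its size."""
    return replace(cluster, node_count=new_node_count)


def scale_vertically(cluster: Cluster, new_vcpus_per_node: int) -> Cluster:
    """Change the size of each node; the node count stays the same."""
    return replace(cluster, vcpus_per_node=new_vcpus_per_node)


start = Cluster(node_count=2, vcpus_per_node=2)
wider = scale_horizontally(start, 4)  # scaling up by adding nodes
taller = scale_vertically(start, 4)   # scaling up by growing each node
print(wider.total_vcpus, taller.total_vcpus)  # prints: 8 8
```

Both routes end up with the same total capacity in this toy model. What differs is whether you have more servers or bigger servers.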

Manual Scaling on AWS

Manually vertically scaling an EC2 server.

One approach to scaling is to do it manually. While this has obvious drawbacks, it does allow us to get good application performance without having to worry about setting up scaling rules. Scaling vertically by hand is quite easy: for RDS or EC2 servers, just change the instance size (which unfortunately requires a little downtime). Scaling up means picking a larger instance type (t3.medium to t3.large, for example) and scaling down means picking a smaller one (t3.nano, anyone?). The drawbacks are that changing the performance of our architecture needs manual intervention, and that costs often end up higher because people tend to leave their architecture larger than required.
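The same manual resize can also be scripted. Below is a rough sketch, assuming an EBS-backed EC2 instance and a client object that behaves like `boto3.client("ec2")`. The function name `resize_instance` and the instance ID in the usage comment are placeholders of mine, not something from AWS.

```python
def resize_instance(ec2, instance_id: str, new_type: str) -> None:
    """Stop an EC2 instance, change its instance type, and start it again.

    `ec2` is expected to behave like boto3.client("ec2").
    """
    # The instance type can only be changed while the instance is stopped,
    # which is where the downtime comes from.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Swap the instance type, e.g. from t3.medium to t3.large.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Bring the instance back up and wait until it is running.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Possible usage, with boto3 installed and AWS credentials configured
# (the instance ID is a placeholder):
#   import boto3
#   resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "t3.large")
```

Scripting it removes some of the clicking, but someone still has to decide when to run it, so the drawbacks above still apply.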

Auto-Scaling on AWS

How AWS Auto Scaling works

A lot of AWS services make auto-scaling much easier. One of them is the aptly named AWS Auto Scaling, a reasonably new section of the AWS console that lets us apply scaling policies across multiple components of our architecture.
Another approach to auto-scaling is to scale the individual components of the software system. For example, EC2 has EC2 Auto Scaling, which can use a number of metrics to automatically add and remove servers from clusters.

Example: Auto Scaling a Web Service

Example of a basic web app on AWS

Here we have a basic web app on AWS. We have a couple of EC2 instances serving a web page or some other web service. In front of the web servers is an Elastic Load Balancer (ELB) that routes traffic evenly-ish between the instances, and in front of the ELB is CloudFront to give us some caching. It’s also worth noting that the EC2 instances are in an auto scaling group.
The main way to automatically scale this web service is to have EC2 Auto Scaling scale on the number of requests per instance coming through the attached load balancer. The auto scaling group is given a metric to track and a target value to keep it near. For this web service example, we would target “Application Load Balancer Request Count Per Target” and ask the auto scaling group to keep this value around 200. The group then automatically adds and removes instances as needed to keep the value at approximately 200 requests per instance.
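For anyone who prefers code to the console, here is a sketch of what that policy might look like as the arguments to the EC2 Auto Scaling `PutScalingPolicy` call. The group name, policy name and resource label are made-up placeholders; only the metric type and the target of 200 come from the example above.

```python
def request_count_policy(group_name: str, resource_label: str,
                         target_per_instance: float = 200.0) -> dict:
    """Build keyword arguments for the Auto Scaling PutScalingPolicy call."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "keep-requests-per-target-steady",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Predefined metric for ALB requests per target
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                # Identifies the load balancer and target group to measure
                "ResourceLabel": resource_label,
            },
            "TargetValue": target_per_instance,
        },
    }


# Placeholder names and IDs; substitute the ones from your own account.
policy = request_count_policy(
    group_name="web-asg",
    resource_label="app/web-alb/0123456789abcdef/targetgroup/web-tg/fedcba9876543210",
)

# With boto3 installed and credentials configured, this could be applied with:
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```

Building the arguments separately from the API call makes it easy to inspect or test the policy before anything touches a real auto scaling group.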

Remember to scale down!

One of the most important things to remember when you are auto-scaling is to scale down. Scaling a cluster up costs more money (which is fine when there is extra traffic to your web service, as this traffic generally means more income), so when there is a chance to scale down without a loss of performance we should take it. The above example does this automatically, as the auto scaling group will remove instances just as happily as it will add them.
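The arithmetic behind "keep it around 200 requests per instance" can be sketched as below. This is a simplified model of my own, not AWS's actual algorithm, which works through CloudWatch alarms and tends to remove instances more gradually than it adds them. The minimum and maximum group sizes and the traffic numbers are assumed for illustration.

```python
import math


def desired_instances(total_requests: float,
                      target_per_instance: float = 200.0,
                      min_size: int = 2,
                      max_size: int = 10) -> int:
    """Smallest instance count that keeps requests per instance at or
    below the target, clamped to the group's size limits."""
    needed = math.ceil(total_requests / target_per_instance)
    return max(min_size, min(max_size, needed))


# Hypothetical traffic: quiet, busy, then quiet again.
for total in (300, 1500, 400):
    print(total, "requests ->", desired_instances(total), "instances")
```

In this model the group grows from 2 to 8 instances for the busy period and drops back to 2 when the traffic goes away, which is exactly the scale down we want to make sure happens.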
Hope this quick overview of scaling on AWS gives you a few things to think on.
Until next time!
Tim Gray
Coffee to Code.

Tim works on a lot of things here at optimal and writes blogs on a wide range of topics.

Copyright © 2019 OptimalBI LTD.