Tech : Moving to Production on AWS - Part 1 (Scalability)
- ketan saxena
- May 29, 2020
- 3 min read
Updated: Jul 2, 2025

Introduction
With the advancements in frameworks and scaffolding tools, it has become fairly easy to develop a substantially complex application in a reasonable time. The real challenge arrives when we plan to take that application live. Nobody wants to wake up to hundreds of users complaining that the site is too slow or unavailable; that would be a nightmare for every project stakeholder.
Here I am starting an article series that covers the important action items one should consider before taking an application live. I will use a classic AWS EC2 instance-based architecture as the reference throughout this series, since it is the most commonly used setup and the one most developers find easiest to understand and relate to.
About Scalability
As per the definition on Techopedia, scalability is an attribute that describes the ability of a process, network, software or organization to grow and manage increased demand. In simpler words, your application should not crash when there is a sudden increase in its usage.
We have all seen at least one project where everything goes well while demo sessions run on the staging environment and testing is done. The real problems arise when the application is exposed to a large audience. When the hits-per-second count increases, that is when the cloud architecture you set up either starts to pay off or backfires. Of course, well-written code, efficient queries and an indexed DB will certainly save you for a while, but beyond a certain point it eventually comes down to one question: how scalable and robust is your production deployment?
The Essential Five
Here I have compiled a list of 5 essential things that one MUST consider before going to production, if you are planning something more serious than just a demo application. There is, of course, a plethora of other techniques, best practices, security measures, etc. that you should adopt based on the nature of your application, but in my experience these 5 action items are necessary in almost all web applications that are expected to handle varying user traffic.
IMPORTANT: This article covers the 5 necessary items from the scalability point of view. There are many other actions related to security, like setting up a firewall, moving to HTTPS and configuring security groups; I will cover those topics in a later article.
1. Configuring a Load Balancer and choosing the right EC2 instance size
First things first: setting up a production environment is not just SSHing into your EC2 instance and running the application. You need a load balancer in front that distributes the incoming traffic across multiple EC2 instances.
Setting up a Load Balancer
For most web apps, the Application Load Balancer (ALB) works well. If you want more granular control over requests and need Layer 4 (Transport Layer) routing, you can use the Network Load Balancer (NLB).
Follow the steps below to configure an ALB:
- Go to the AWS Console and open `Services` > `EC2`
- In the sidebar, look for Load Balancers under the `LOAD BALANCING` section
- Click on Create Load Balancer and choose Application Load Balancer
- Attach an existing EC2 instance on which your app is running
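If you prefer the AWS CLI over the console, the steps above can be sketched with the `elbv2` commands below. This is a minimal sketch, assuming a VPC with two subnets, a security group, and one running EC2 instance; every ID, name and ARN shown is a placeholder you would replace with your own values.

```shell
# Create an Application Load Balancer across two subnets (IDs are placeholders)
aws elbv2 create-load-balancer \
  --name my-app-alb \
  --type application \
  --subnets subnet-0aaa subnet-0bbb \
  --security-groups sg-0ccc

# Create a target group that health-checks instances over HTTP
aws elbv2 create-target-group \
  --name my-app-targets \
  --protocol HTTP --port 80 \
  --vpc-id vpc-0ddd \
  --health-check-path /

# Register the existing EC2 instance on which your app is running
aws elbv2 register-targets \
  --target-group-arn <target-group-arn> \
  --targets Id=i-0eee

# Forward incoming HTTP traffic on port 80 to the target group
aws elbv2 create-listener \
  --load-balancer-arn <load-balancer-arn> \
  --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>
```

The `create-load-balancer` and `create-target-group` calls print the ARNs you need for the last two commands.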
Choosing the correct server size for your EC2 instance
For this, you need to understand your application's build size and its CPU consumption when running with little traffic. By build size, we also mean the additional dependencies you have to install and run for your application; for example, to run a Node.js application you need Node.js and npm pre-installed.
Typically, the rule of thumb is: "Your server instance should be sized such that, when the application runs with little traffic (~10 requests/minute), CPU utilisation stays between 10% and 25%." If it is below 10%, you have taken too big a server and will end up paying more than needed. If it is above 25%, then as soon as the incoming traffic increases, the CPU will hit its saturation point and the server may stop responding.
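One way to check this rule of thumb in practice is to pull the instance's average CPU utilisation from CloudWatch while it serves light traffic. A sketch using the AWS CLI (the instance ID is a placeholder, and `date -d` is the GNU coreutils form):

```shell
# Average CPUUtilization of one instance over the last hour, in 5-minute buckets
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0eee \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 300 \
  --statistics Average
```

If the averages sit well below 10% under light traffic you can likely downsize; well above 25% and you should move to a bigger instance (or scale out) before going live.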
2. Configuring Auto-scaling with proper scaling metrics
Configuring auto-scaling is pretty simple in AWS.
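As a sketch, the console flow boils down to three CLI steps: a launch template describing what each new instance looks like, an Auto Scaling group spanning your subnets and attached to the load balancer's target group, and a scaling policy. A target-tracking policy on average CPU is a sensible default scaling metric; the AMI, subnet and ARN values below are placeholders.

```shell
# 1. Launch template: the blueprint for every instance the group starts
aws ec2 create-launch-template \
  --launch-template-name my-app-template \
  --launch-template-data '{"ImageId":"ami-0fff","InstanceType":"t3.small"}'

# 2. Auto Scaling group: 2-6 instances, registered behind the ALB target group
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name my-app-asg \
  --launch-template LaunchTemplateName=my-app-template \
  --min-size 2 --max-size 6 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-0aaa,subnet-0bbb" \
  --target-group-arns <target-group-arn>

# 3. Scaling metric: keep average CPU across the group near 50%
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-app-asg \
  --policy-name cpu-target-50 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'
```

With target tracking, AWS adds instances when average CPU rises above the target and removes them when it falls, so you don't have to write separate scale-out and scale-in alarms.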
3. Using CloudFront to cache your assets
4. Configuring CloudWatch events