Hi again! In the previous part, we deployed a Node.js application on EC2. In this part, we will set up Elastic Load Balancing and Auto Scaling on AWS.
AWS Elastic Load Balancing
A load balancer is a service that distributes network traffic and workloads evenly across multiple servers or clusters of servers. A load balancer in AWS increases the availability and fault tolerance of an application.
It works hand in hand with auto scaling while still providing a single point of access. It also monitors the health of registered instances, and if an instance is unhealthy, traffic is redirected to another instance.
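To make the idea concrete, here is a minimal sketch of health-aware traffic distribution. It assumes plain round-robin, which is a simplified model and not necessarily what ELB does internally; the instance IDs and the `route_requests` helper are hypothetical.

```python
from itertools import cycle


def route_requests(instances, health, n_requests):
    """Distribute n_requests round-robin across instances marked healthy."""
    healthy = [i for i in instances if health.get(i, False)]
    if not healthy:
        raise RuntimeError("no healthy instances available")
    rotation = cycle(healthy)
    return [next(rotation) for _ in range(n_requests)]


# Hypothetical instance IDs; "i-b" is failing its health check.
instances = ["i-a", "i-b", "i-c"]
health = {"i-a": True, "i-b": False, "i-c": True}
print(route_requests(instances, health, 4))
```

The unhealthy instance never receives a request, while the remaining two share the traffic evenly.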
AWS has 3 types of Load Balancers:
- Classic Load Balancer (v1, old generation)
- Application Load Balancer (v2, new generation)
- Network Load Balancer (v2, new generation)
It is recommended to use the new generation Load Balancers as they provide more features.
Application Load Balancers
- It is a Layer 7 Load Balancer.
- It is used for HTTP and HTTPS traffic.
- It can load balance multiple applications on the same machine (for example, containers listening on different ports).
- It load balances based on the route (path) or the hostname in the URL.
- An ALB can have many Target Groups behind it, and each Target Group can contain many EC2 instances.
- It allows stickiness, i.e. requests from the same user will be served by the same EC2 instance.
- The ALB sees the client IP, but the EC2 instance does not see it as the source address of the connection. The instance sees the private IP of the ALB, and the original client IP is passed along in the X-Forwarded-For header.
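The path-based and host-based routing described above can be sketched as a small rule matcher. This is only an illustration of the idea: the hostnames, rule format and target group names below are hypothetical, and real ALB listener rules support more condition types.

```python
from fnmatch import fnmatch

# Hypothetical listener rules, checked in order; the first match wins.
RULES = [
    {"host": "api.example.com", "path": "*", "target_group": "tg-api"},
    {"host": "*", "path": "/images/*", "target_group": "tg-images"},
]
DEFAULT_TARGET_GROUP = "tg-web"


def pick_target_group(host, path):
    """Return the target group of the first rule matching host and path."""
    for rule in RULES:
        if fnmatch(host, rule["host"]) and fnmatch(path, rule["path"]):
            return rule["target_group"]
    return DEFAULT_TARGET_GROUP


print(pick_target_group("www.example.com", "/images/cat.png"))
```

Requests that match no rule fall through to the default target group, much like the default action of a listener.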
Network Load Balancers
- It is a Layer 4 Load Balancer.
- It is used for TCP traffic.
- It offers higher performance and lower latency than an ALB.
- NLBs support source IP preservation.
If the load balancer cannot connect to the application, check the Security Groups.
For this article, I have already set up an EC2 instance and deployed a basic Node.js application using Nginx and PM2.
Create an ELB
- Sign in to the AWS Management Console.
Go to EC2 Dashboard ==> Scroll down the left panel and open Load Balancers.
Create Load Balancer ==> You will be presented with the 3 available types of load balancers. For this article, I have chosen the ALB, as the Classic Load Balancer is old generation and the NLB is meant for higher-performance use cases.
Configure Load Balancer ==> In the configuration, we set the name, the scheme (an Internet-facing load balancer routes requests from clients over the Internet to targets; an internal load balancer routes requests from clients to targets using private IP addresses), the listeners (to serve traffic), the IP address type and the Availability Zones. I have selected multiple Availability Zones to make the load balancer highly available. Now click Next: Configure Security Settings.
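For readers who prefer scripting over the console, the same settings can be expressed roughly as below. This is a sketch, not the method used in this article: `load_balancer_params` is a hypothetical helper, the subnet and security group IDs are placeholders, and the keys are meant to mirror the parameters of boto3's `elbv2` `create_load_balancer` call.

```python
def load_balancer_params(name, subnet_ids, security_group_ids, internet_facing=True):
    """Build keyword arguments for an Application Load Balancer."""
    if len(subnet_ids) < 2:
        # An ALB needs subnets in at least two Availability Zones.
        raise ValueError("provide subnets from at least two Availability Zones")
    return {
        "Name": name,
        "Type": "application",
        "Scheme": "internet-facing" if internet_facing else "internal",
        "IpAddressType": "ipv4",
        "Subnets": list(subnet_ids),
        "SecurityGroups": list(security_group_ids),
    }


# Placeholder IDs, for illustration only.
params = load_balancer_params("demo-alb", ["subnet-aaa", "subnet-bbb"], ["sg-lb"])
print(params["Scheme"])

# With boto3 installed and credentials configured, the call would be roughly:
#   import boto3
#   boto3.client("elbv2").create_load_balancer(**params)
```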
Configure Security Settings ==> These settings apply only to HTTPS listeners. Now click Next: Configure Security Groups.
Configure Security Groups ==> Here, we add a Custom TCP rule that allows inbound traffic on port 80. Now select Next: Configure Routing.
Configure Routing ==> Each target group is used to route requests to one or more registered targets. When you create a listener rule, you specify a target group and conditions. When a rule condition is met, traffic is forwarded to the corresponding target group. You can create different target groups for different types of requests. Regular health checks are carried out on instances to check whether they are able to serve requests or not. These health checks can be customised as needed under Advanced health check settings. Now click on Next: Register Targets.
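As a rough scripted equivalent of this step, a target group with customised health checks could be described like this. The helper name, the VPC ID and the threshold values are assumptions chosen for illustration; the keys are meant to mirror boto3's `elbv2` `create_target_group` parameters.

```python
def target_group_params(name, vpc_id, port=80, health_path="/"):
    """Build keyword arguments for an HTTP target group with health checks."""
    return {
        "Name": name,
        "Protocol": "HTTP",
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "instance",
        # Health check settings (illustrative values, tune as needed).
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPath": health_path,
        "HealthCheckIntervalSeconds": 30,
        "HealthCheckTimeoutSeconds": 5,
        "HealthyThresholdCount": 5,
        "UnhealthyThresholdCount": 2,
        "Matcher": {"HttpCode": "200"},
    }


tg = target_group_params("demo-targets", "vpc-123", health_path="/health")
print(tg["HealthCheckPath"])

# Roughly: boto3.client("elbv2").create_target_group(**tg)
```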
Register Targets ==> Here, we register targets with our target group. If we register a target in an enabled Availability Zone, the load balancer starts routing requests to the target as soon as the registration process completes and the target passes the initial health checks. We select the instances we want and register or deregister them. Now select Next: Review.
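The way a newly registered target moves between health states can be modelled with a small counter-based sketch. This is a simplified model written for this article, not the exact algorithm AWS uses; the state names and thresholds are assumptions.

```python
def target_states(check_results, healthy_threshold=5, unhealthy_threshold=2):
    """Return the target's state after each health check result.

    A target becomes "healthy" after `healthy_threshold` consecutive passes
    and "unhealthy" after `unhealthy_threshold` consecutive failures.
    """
    state = "initial"
    ok_streak = 0
    fail_streak = 0
    states = []
    for ok in check_results:
        if ok:
            ok_streak += 1
            fail_streak = 0
            if ok_streak >= healthy_threshold:
                state = "healthy"
        else:
            fail_streak += 1
            ok_streak = 0
            if fail_streak >= unhealthy_threshold:
                state = "unhealthy"
        states.append(state)
    return states


print(target_states([True, True, False, False, True],
                    healthy_threshold=2, unhealthy_threshold=2))
```

A single failed check does not take the target out of rotation; only a run of consecutive failures does.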
Review the configuration ==> After reviewing the configurations and ensuring that they are correct, Create the Load Balancer.
Load Balancer created ==> If this message is received, the Load Balancer has been created successfully. On the Load Balancers page, watch the state change from provisioning -> active.
View Listeners ==> In the Listener Tab, new listeners can be added and existing listeners can be edited in order to manage the traffic.
View the output ==> If everything goes well, the output is available at the public DNS name of the Load Balancer. If the Load Balancer is not connected to the EC2 instance, ensure that the Security Groups are configured correctly.
Edit Inbound rules of EC2 instance ==> Edit the inbound rules of the EC2 instance to change the source of HTTP requests to the security group of the Load Balancer. This ensures a single point of access to the EC2 instance, which is the Load Balancer. Now, the output will only be available at the public URL of the Load Balancer and not at the public URL of the EC2 instance.
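The inbound rule from this step could be sketched in code as follows. The helper and both security group IDs are hypothetical; the structure is meant to mirror what boto3's EC2 `authorize_security_group_ingress` call expects, where the source is another security group rather than an IP range.

```python
def allow_http_from_lb(instance_sg_id, lb_sg_id, port=80):
    """Build an ingress rule that admits traffic only from the LB's security group."""
    return {
        "GroupId": instance_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # The source is the load balancer's security group, not a CIDR block.
                "UserIdGroupPairs": [{"GroupId": lb_sg_id}],
            }
        ],
    }


rule = allow_http_from_lb("sg-instance", "sg-lb")
print(rule["IpPermissions"][0]["UserIdGroupPairs"])

# Roughly: boto3.client("ec2").authorize_security_group_ingress(**rule)
```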
Now, we have successfully implemented Load Balancing in AWS.
AWS Auto Scaling
Auto-scaling is a way to automatically scale up or down the number of compute resources that are being allocated to your application based on its needs at any given time. AWS Auto Scaling lets you build scaling plans that automate how groups of different resources respond to changes in demand.
In simple words, when the load on the servers increases or decreases, Auto Scaling scales out or scales in the number of active servers, so that the response time of requests does not increase.
Its main benefits are:
- Better fault tolerance.
- Better cost management.
- Better availability.
Create Auto Scaling in AWS
- Create an Image of an EC2 instance ==> This image will be used to create new instances whenever scale-out policies are triggered to increase the number of servers handling incoming requests.
Now scroll down the left panel and open Auto Scaling.
Create Launch Configuration ==> Click on Create Launch Configuration. A launch configuration is a template that an EC2 Auto Scaling group uses to launch EC2 instances. When you create a launch configuration, you specify information for the instances, such as the ID of the Amazon Machine Image (AMI). It describes the kind of machine that should be launched when needed. The AMI can be either the image of an existing instance or an entirely new AMI.
Create Auto Scaling Groups ==> Click on Create Auto Scaling Group. Here, we set the name and select the launch configuration that the group will use when scaling. If needed, we can create an entirely new AMI and a new launch configuration here. We can also use Launch Templates in place of Launch Configurations. Launch templates are very similar to launch configurations, except that a launch template allows you to have multiple versions of a template.
Configure Auto Scaling Groups ==> Here, we select the VPC that the Auto Scaling Group will be part of and the Availability Zones in which the group can launch instances.
We can attach a load balancer in order to allow a single point of access to all the instances running at a given point of time. The process is very similar to the one discussed in the previous section. This load balancing step is optional.
Here, we can select the maximum and minimum number of instances that are up and running at a given point of time based on the incoming requests. AWS Auto Scaling will take care of scaling-in and scaling-out within the specified upper and lower bounds.
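The effect of the upper and lower bounds can be illustrated with a small calculation. The proportional formula below is an assumption made for this example, loosely in the spirit of target tracking; it is not presented as the algorithm AWS actually runs, and `desired_capacity` is a hypothetical helper.

```python
import math


def desired_capacity(current, metric_value, target_value, min_size, max_size):
    """Estimate how many instances are wanted, clamped to [min_size, max_size]."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    # Scale the fleet in proportion to how far the metric is from its target.
    wanted = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, wanted))


# 2 instances at 90% average CPU with a 50% target, bounds 1..4.
print(desired_capacity(2, 90, 50, min_size=1, max_size=4))
```

However high the load gets, the result never exceeds the maximum, and however low it drops, it never falls below the minimum.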
Notifications can be configured for every scale-in or scale-out event. In simple words, we will be notified whenever an EC2 instance is launched or terminated.
Add tags in the form of key-value pairs.
Now Review and Create.
After successful creation, we will also be able to see the Load Balancer set up for this Auto Scaling Group.
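Put together, the choices made in the wizard map onto a handful of parameters. The sketch below assumes boto3's `autoscaling` `create_auto_scaling_group` call as the scripted equivalent; the helper name, group name, subnet IDs and tag are placeholders.

```python
def auto_scaling_group_params(name, launch_configuration, subnet_ids,
                              min_size, max_size, target_group_arns=(), tags=None):
    """Build keyword arguments for an Auto Scaling group."""
    if not 0 <= min_size <= max_size:
        raise ValueError("expected 0 <= min_size <= max_size")
    params = {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_configuration,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": min_size,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "Tags": [
            {"Key": k, "Value": v, "PropagateAtLaunch": True}
            for k, v in (tags or {}).items()
        ],
    }
    if target_group_arns:
        # Optional: attach the group to a load balancer's target groups.
        params["TargetGroupARNs"] = list(target_group_arns)
    return params


asg = auto_scaling_group_params(
    "demo-asg", "demo-launch-config", ["subnet-aaa", "subnet-bbb"],
    min_size=2, max_size=4, tags={"project": "demo"},
)
print(asg["VPCZoneIdentifier"])

# Roughly: boto3.client("autoscaling").create_auto_scaling_group(**asg)
```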
- View the running instances ==> The number of running instances will match the minimum number of instances specified in the group size.
Now, try shutting down any of the running instances.
Be patient and wait for some time. You will see a completely new instance being initialized so as to match the minimum number of required instances.
With this, we have successfully implemented Auto Scaling for EC2 instances in AWS. I hope this article was informative for you. This is also my first article, so please leave some love and comments and I will see you in my future posts.
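The replacement behaviour seen in this experiment can be mimicked with a tiny reconciliation loop. This is a toy model with hypothetical instance IDs, meant only to show the idea of restoring the fleet to its required size.

```python
from itertools import count


def reconcile(running, desired, id_source):
    """Launch placeholder instances until the fleet reaches the desired size."""
    fleet = list(running)
    while len(fleet) < desired:
        fleet.append(f"i-new{next(id_source)}")
    return fleet


ids = count(1)
fleet = ["i-a", "i-b"]
fleet.remove("i-b")  # simulate shutting down one instance
fleet = reconcile(fleet, desired=2, id_source=ids)
print(fleet)
```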