4.49 AWS ASG Overview

What is an Auto Scaling Group (ASG)? When your application servers face increasing traffic (more visitors, more usage), adding servers by hand no longer keeps up. In the cloud, resources can be created and destroyed quickly and cheaply, so the goal of an ASG is to scale the number of EC2 instances automatically: scale out when demand rises, scale in when it falls, or follow a schedule.

Key attributes of an ASG

  • Launch configuration or launch template (AWS now recommends launch templates): AMI, instance type, EC2 user data, EBS volumes, security groups, SSH key pair, IAM role.
  • Minimum / maximum / desired capacity: the bounds of the fleet.
  • Network and subnet configuration plus an optional load balancer attachment.
  • A scaling policy: the rules that trigger scale-out (add instances) or scale-in (remove instances).
  • CloudWatch alarms that fire on metrics such as CPU utilization, network in/out, request count per target.
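
The attributes above can be sketched as a single request. This is a minimal illustration, assuming the field names of the boto3 `create_auto_scaling_group` call; `build_asg_request` is a hypothetical helper, and the group name, template name, and subnet IDs are placeholders.

```python
def build_asg_request(name, template_name, min_size, max_size, desired,
                      subnet_ids, target_group_arns=()):
    """Assemble keyword arguments for an Auto Scaling group (hypothetical helper)."""
    # Desired capacity must sit inside the min/max bounds of the fleet.
    if not (0 <= min_size <= desired <= max_size):
        raise ValueError("expected 0 <= min_size <= desired <= max_size")
    request = {
        "AutoScalingGroupName": name,
        # The launch template carries the AMI, instance type, user data, and so on.
        "LaunchTemplate": {"LaunchTemplateName": template_name, "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }
    if target_group_arns:
        # Optional load balancer attachment, using the balancer's health checks.
        request["TargetGroupARNs"] = list(target_group_arns)
        request["HealthCheckType"] = "ELB"
        request["HealthCheckGracePeriod"] = 300  # seconds; an assumed value
    return request


request = build_asg_request(
    name="demo-asg",                            # placeholder
    template_name="demo-template",              # placeholder
    min_size=1, max_size=4, desired=2,
    subnet_ids=["subnet-aaaa", "subnet-bbbb"],  # placeholders
)
print(request["VPCZoneIdentifier"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("autoscaling").create_auto_scaling_group(**request)`. The sketch only builds the request and never contacts AWS.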

Scaling policies can also act on custom CloudWatch metrics, such as number of users or RAM utilization, that the application on the EC2 instances publishes through the PutMetricData API. The ASG can then react to your business signals, not only to system metrics. The newer target tracking scaling rules let you pick a target value for a metric such as average CPU utilization, average request count per target, average network in, or average network out, and they are easy to configure and maintain in AWS.
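
A target tracking rule for one of these metrics, or for a custom metric, might be described as below. This is a sketch assuming the request shape of the boto3 `put_scaling_policy` call; `target_tracking_policy` is a hypothetical helper, and the policy names, the `DemoApp` namespace, and the `ConnectedUsers` metric are invented for illustration.

```python
# Predefined metric types as named in the Auto Scaling API.
PREDEFINED_METRICS = {
    "cpu": "ASGAverageCPUUtilization",
    "network_in": "ASGAverageNetworkIn",
    "network_out": "ASGAverageNetworkOut",
    # This one also needs a ResourceLabel naming the load balancer and
    # target group; that field is omitted from this sketch.
    "requests_per_target": "ALBRequestCountPerTarget",
}


def target_tracking_policy(asg_name, policy_name, target_value,
                           metric="cpu", custom=None):
    """Build keyword arguments for a target tracking policy (hypothetical helper)."""
    config = {"TargetValue": float(target_value)}
    if custom is not None:
        # A custom metric the application publishes, e.g. via PutMetricData.
        config["CustomizedMetricSpecification"] = {
            "Namespace": custom["namespace"],
            "MetricName": custom["metric_name"],
            "Statistic": "Average",
        }
    else:
        config["PredefinedMetricSpecification"] = {
            "PredefinedMetricType": PREDEFINED_METRICS[metric],
        }
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": config,
    }


# Keep average CPU utilization around 50 percent.
cpu_policy = target_tracking_policy("demo-asg", "keep-cpu-at-50", 50)

# Keep a business metric around 100 users per instance (names are invented).
users_policy = target_tracking_policy(
    "demo-asg", "users-per-instance", 100,
    custom={"namespace": "DemoApp", "metric_name": "ConnectedUsers"},
)
print(cpu_policy["TargetTrackingConfiguration"])
```

The same helper covers both cases because a target tracking rule is just a metric plus a target value, whether the metric is built in or published by the application.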

If an instance is unhealthy, the ASG terminates and replaces it automatically. If the ASG is attached to a load balancer and uses its health checks, the balancer stops routing traffic to the failing instance and the ASG provisions a fresh one. The ASG service itself is free: you pay only for the underlying resources, such as the EC2 instances. In the next lesson we will set up an ASG hands-on, define a launch configuration, attach a load balancer, and write scaling rules in the console.
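
The replacement behaviour can be pictured with a small simulation. This is a simplified model rather than how the service is implemented: `launch_instance` and `reconcile` are hypothetical names, and the instance IDs are made up.

```python
from itertools import count

_ids = count(1)


def launch_instance():
    """Stand-in for launching an EC2 instance (simulation only)."""
    return {"id": f"i-sim{next(_ids):04d}", "healthy": True}


def reconcile(instances, desired_capacity):
    """Drop unhealthy instances, then launch replacements up to desired capacity."""
    healthy = [inst for inst in instances if inst["healthy"]]
    terminated = [inst["id"] for inst in instances if not inst["healthy"]]
    while len(healthy) < desired_capacity:
        healthy.append(launch_instance())
    return healthy, terminated


fleet = [launch_instance(), launch_instance(), launch_instance()]
fleet[1]["healthy"] = False  # pretend a health check failed
fleet, terminated = reconcile(fleet, desired_capacity=3)
print(terminated)                      # ['i-sim0002']
print([inst["id"] for inst in fleet])  # ['i-sim0001', 'i-sim0003', 'i-sim0004']
```

The fleet returns to its desired capacity of three, with the failed instance swapped for a new one.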

Summary

AWS Auto Scaling Groups (ASG) automatically manage the number of EC2 instances to handle varying application loads by scaling out (adding instances) or scaling in (removing instances). ASGs use launch configurations or launch templates that specify the instance type, security groups, and AMI, combined with scaling policies based on CloudWatch metrics such as CPU, network traffic, or custom metrics, to keep infrastructure capacity matched to demand.

Key points

  • Auto Scaling Groups scale infrastructure out or in based on demand via CloudWatch alarms and metrics (CPU, network, custom)
  • Each ASG has configurable minimum size, maximum size, and desired capacity to control instance range
  • Launch configurations or launch templates define instance specifications including AMI, instance type, security groups, SSH keys, and IAM roles
  • Target tracking scaling policies and scheduled scaling enable automatic scaling without manual intervention
  • ASGs automatically replace unhealthy instances and integrate with load balancers for traffic distribution

FAQ

What is the difference between scaling out and scaling in?

Scaling out adds new instances to handle increased load, while scaling in removes instances when demand decreases to optimize costs.
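
As a small worked example, the arithmetic of a scaling step can be sketched as follows, assuming the result is always held between the group's minimum and maximum size. `apply_adjustment` is a hypothetical helper, not an AWS API.

```python
def apply_adjustment(current, adjustment, min_size, max_size):
    """Return the new desired capacity after a scaling step, kept within bounds."""
    return max(min_size, min(max_size, current + adjustment))


# Scale out by 2 from 3 instances: capped by the maximum of 4.
print(apply_adjustment(3, +2, min_size=1, max_size=4))  # 4
# Scale in by 1 from 1 instance: held at the minimum of 1.
print(apply_adjustment(1, -1, min_size=1, max_size=4))  # 1
# Scale out by 1 from 2 instances: stays within bounds.
print(apply_adjustment(2, +1, min_size=1, max_size=4))  # 3
```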

What metrics can trigger Auto Scaling actions?

Scaling can be triggered by CPU utilization, average network traffic, request count from load balancers, or custom metrics sent from your application to CloudWatch.

How are failed instances handled by ASGs?

ASGs automatically detect unhealthy instances through health checks, then terminate and replace them, ensuring high availability without manual intervention.