4.51 The different scaling strategies for ASGs

In this lesson we go through the different scaling strategies available for an Auto Scaling Group (ASG). The easiest to use is target tracking: you pick a target value for a metric (for example 40% average CPU), and the ASG adds instances when the metric rises above the target and removes them when it falls below, always trying to keep the metric near the target.
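To make the idea concrete, here is a minimal sketch of target tracking as a simple proportional model. It is an illustration only and not the algorithm AWS actually runs; the function name and the min/max clamping are assumptions made for this example.

```python
import math


def target_tracking_capacity(current: int, metric: float, target: float,
                             min_size: int, max_size: int) -> int:
    """Simplified proportional model of target tracking (hypothetical helper).

    Assumes load is spread evenly, so total load is roughly current * metric.
    """
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    # Number of instances needed so that each one sits at (or below) the target.
    needed = math.ceil(current * metric / target)
    # The ASG never goes outside its configured min/max bounds.
    return max(min_size, min(max_size, needed))


if __name__ == "__main__":
    # 2 instances at 80% CPU with a 40% target: wants 4, capped at max = 3.
    print(target_tracking_capacity(2, 80.0, 40.0, min_size=1, max_size=3))
    # 2 instances at 10% CPU: one instance is enough.
    print(target_tracking_capacity(2, 10.0, 40.0, min_size=1, max_size=3))
```

With the 2 / 1 / 3 group used later in the lesson, a nearly idle group shrinks to one instance under this model, which is consistent with the scale-in seen in the console demo.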

Other ASG strategies and settings

  • Simple / step scaling: define CloudWatch alarms that trigger fixed adjustments, e.g. CPU > 70% adds 3 instances and CPU < 30% removes 1 instance.
  • Scheduled scaling: pre-define capacity changes based on time, e.g. add instances every Friday from 5pm to 8pm when you know traffic will rise.
  • Cooldown period: prevent the ASG from launching or terminating further instances until the previous scaling activity has settled (default 300 s).
  • Scale-in protection: stop the ASG from terminating specific instances even during a scale-in event.
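The simple / step scaling example from the list can be sketched as a small decision function. This is a toy model of the alarm thresholds above (70% and 30%), not AWS code; the function names are hypothetical.

```python
def step_adjustment(cpu_percent: float) -> int:
    """Return the capacity change for the example alarms in the list above."""
    if cpu_percent > 70:
        return 3    # high-CPU alarm: add 3 instances
    if cpu_percent < 30:
        return -1   # low-CPU alarm: remove 1 instance
    return 0        # no alarm is firing


def apply_adjustment(current: int, adjustment: int,
                     min_size: int, max_size: int) -> int:
    """Apply a capacity change while staying inside the ASG's min/max bounds."""
    return max(min_size, min(max_size, current + adjustment))


if __name__ == "__main__":
    print(apply_adjustment(2, step_adjustment(85.0), min_size=1, max_size=10))
    print(apply_adjustment(2, step_adjustment(10.0), min_size=1, max_size=10))
```

Unlike target tracking, the size of each adjustment here is fixed by you rather than derived from how far the metric is from a target.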

The cooldown period exists so that, after a scale-out, new machines have time to boot, register with the load balancer, and start serving requests before the next scaling decision is made. You can shorten it (say 180 s) for fast-booting workloads, or override it per scaling policy. For scale-in actions, AWS waits between consecutive removals to avoid wiping out healthy capacity too quickly.
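The cooldown logic described above can be modelled as a simple gate that refuses new scaling actions until enough time has passed. This is a sketch under the assumption of a single shared cooldown; the class name `CooldownGate` and its methods are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CooldownGate:
    """Toy model of an ASG cooldown (hypothetical helper, times in seconds)."""
    cooldown_seconds: float = 300.0          # default cooldown from the lesson
    last_scaling_at: Optional[float] = None  # None until the first scaling action

    def allows(self, now: float) -> bool:
        """True if a new scaling action may start at time `now`."""
        if self.last_scaling_at is None:
            return True
        return now - self.last_scaling_at >= self.cooldown_seconds

    def record(self, now: float) -> None:
        """Remember when the most recent scaling action happened."""
        self.last_scaling_at = now


if __name__ == "__main__":
    gate = CooldownGate(cooldown_seconds=180.0)  # shortened for fast-booting workloads
    gate.record(now=0.0)
    print(gate.allows(now=100.0))  # still cooling down
    print(gate.allows(now=180.0))  # cooldown has elapsed
```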

In the console we open an ASG, observe its desired/min/max capacity (e.g. 2 / 1 / 3), then create a target tracking policy on average CPU at 40%, with a 200 s cooldown. The dashboard immediately reports that the group has more capacity than the target requires and triggers a scale-in to one instance. Next we add a scheduled action (capacity = 10 every Friday at midnight) to anticipate weekly load spikes. Be careful: an ASG automatically terminates the instances it manages, so do not run important manual workloads on them.
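The same two console steps could be scripted. The sketch below only builds the request parameters, shaped for boto3's `put_scaling_policy` and `put_scheduled_update_group_action` calls on the `autoscaling` client; it does not contact AWS. The group name `my-asg`, the policy and action names, and the helper functions are hypothetical. It also assumes the 200 s setting maps to `EstimatedInstanceWarmup`, since the `Cooldown` field of that API applies to simple scaling policies.

```python
from typing import Any, Dict


def target_tracking_policy_kwargs(asg_name: str, target_cpu: float = 40.0,
                                  warmup_seconds: int = 200) -> Dict[str, Any]:
    """Parameters for a target tracking policy on average CPU (names are assumptions)."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": "cpu-target-tracking",  # hypothetical policy name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
        },
        # Assumption: the 200 s from the demo is expressed as instance warmup.
        "EstimatedInstanceWarmup": warmup_seconds,
    }


def friday_midnight_action_kwargs(asg_name: str, desired: int = 10) -> Dict[str, Any]:
    """Parameters for a recurring scheduled action every Friday at 00:00 UTC."""
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": "friday-midnight-scale-out",  # hypothetical name
        "Recurrence": "0 0 * * 5",  # cron format: minute hour day month weekday
        "DesiredCapacity": desired,
    }


if __name__ == "__main__":
    # With credentials configured, these dicts could be passed as keyword arguments:
    #   client = boto3.client("autoscaling")
    #   client.put_scaling_policy(**target_tracking_policy_kwargs("my-asg"))
    #   client.put_scheduled_update_group_action(**friday_midnight_action_kwargs("my-asg"))
    print(target_tracking_policy_kwargs("my-asg"))
    print(friday_midnight_action_kwargs("my-asg"))
```

Since the demo group has a maximum of 3, a desired capacity of 10 would presumably also require raising the maximum; the scheduled action call accepts a `MaxSize` field for that purpose.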

Summary

This lesson covers the three primary scaling strategies available for AWS Auto Scaling Groups: target tracking scaling (keeping a metric such as CPU usage near a target value), step scaling (triggering actions based on alarm thresholds), and scheduled scaling (pre-defining scaling actions for predictable traffic patterns). It also explains cooldown periods, which prevent rapid scale oscillation, and demonstrates how to configure these strategies in the AWS console, with examples of how each one adjusts instance capacity.

Key points

  • Target tracking scaling simplifies ASG management by automatically maintaining a single metric (e.g., 40% CPU average) without requiring manual threshold configuration
  • Step scaling offers granular control through multiple alarm thresholds (e.g., scale out at >70% CPU, scale in at <30%), allowing nuanced responses to different load levels
  • Scheduled scaling enables proactive capacity planning for predictable demand spikes, such as weekend traffic surges, by setting specific date/time-based scaling actions
  • Cooldown periods (default 300 seconds) prevent premature scale decisions by pausing additional scaling actions until previous scaling effects are observed
  • The AWS console provides visual monitoring of ASG desired capacity and policy activity, showing how many instances are added or removed by the active strategies

FAQ

What is the main difference between target tracking and step scaling?

Target tracking scaling maintains a single target metric (like 40% CPU) and automatically adjusts capacity to stay close to that target, requiring minimal configuration. Step scaling, by contrast, reacts to multiple discrete alarm thresholds, allowing different actions for different severity levels (e.g., add 3 instances if >70% CPU, remove 1 instance if <30% CPU).

Why is a cooldown period important in auto-scaling?

The cooldown period (300 seconds by default) prevents rapid oscillation by ensuring the system waits for instances to fully start and stabilize before evaluating new scaling decisions. This avoids wasteful scale-out/scale-in cycles and gives metrics time to settle after a scaling action.

When should I use scheduled scaling?

Scheduled scaling is ideal for predictable traffic patterns, for example increasing capacity on Friday evenings before weekend traffic spikes or reducing capacity on Sunday nights. It allows proactive capacity planning without waiting for metrics to trigger alarms, making it cost-effective for forecastable load variations.