Auto Scaling
Most applications receive varying load over their lifetime.
To keep an application highly available, it must have enough resources to handle the load at any given time.
This means the application should scale out to match increasing load and scale in when the load decreases.
Remove instances which are unhealthy.
Ensure minimum/maximum application instances are running at a given time.
Automatically register new instances to the load balancer.
An Auto Scaling Group (ASG) is the solution to the above set of problems.
An ASG is linked to a load balancer.
An ASG by itself doesn't solve the problem of scalability. ELB, ASG, and multi-AZ deployment together help us achieve scalability and high availability.
ASG is free; you pay only for the underlying resources, such as the EC2 instances it launches.
ASG
A Launch Template is used to configure the ASG.
Parameters include:
AMI
Instance type
EC2 User Data
EBS/EFS volume
SSH key pair
Security group
IAM Roles
ELB configuration
Network configuration
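As a rough illustration of the parameters listed above, the sketch below assembles a dictionary shaped like the `LaunchTemplateData` structure that the EC2 `CreateLaunchTemplate` API accepts (for example through boto3's `create_launch_template`). No AWS call is made here. The helper name `build_launch_template_data` and every identifier value (AMI ID, key pair, security group, instance profile, device name) are hypothetical placeholders, and only a subset of the parameters is covered.

```python
import base64


def build_launch_template_data(ami_id, instance_type, key_name,
                               security_group_ids, instance_profile_name,
                               user_data_script, root_volume_gib=8):
    """Assemble a dict shaped like EC2 LaunchTemplateData (no AWS call is made)."""
    encoded_user_data = base64.b64encode(
        user_data_script.encode("utf-8")
    ).decode("ascii")
    return {
        "ImageId": ami_id,                      # AMI
        "InstanceType": instance_type,          # Instance type
        "KeyName": key_name,                    # SSH key pair
        "SecurityGroupIds": list(security_group_ids),
        "IamInstanceProfile": {"Name": instance_profile_name},  # IAM role
        # Launch templates expect user data as base64-encoded text.
        "UserData": encoded_user_data,
        "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/xvda",      # assumed root device name
                "Ebs": {"VolumeSize": root_volume_gib, "VolumeType": "gp3"},
            }
        ],
    }


# All values below are made-up placeholders for illustration only.
startup_script = "#!/bin/bash\necho 'instance started' > /tmp/started\n"
template_data = build_launch_template_data(
    ami_id="ami-0123456789abcdef0",
    instance_type="t3.micro",
    key_name="example-key",
    security_group_ids=["sg-0123456789abcdef0"],
    instance_profile_name="example-instance-profile",
    user_data_script=startup_script,
)
```

Keeping the template as plain data makes it easy to review or version before handing it to whichever tool actually creates the launch template.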
The following capacity parameters are configured on the ASG itself, rather than in the Launch Template, when creating the ASG.
Desired capacity: Represents the initial capacity of the Auto Scaling group at the time of creation, and the number of instances the group attempts to maintain.
Minimum capacity: Represents the minimum group size. An Auto Scaling group cannot decrease its desired capacity below this limit.
Maximum capacity: Represents the maximum group size. Desired capacity cannot exceed this capacity value.
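The relationship between the three capacity values can be sketched as a simple clamp. This is a simplified model of how a capacity requested by a scaling policy stays within the group's bounds, not AWS's implementation; the function name `clamp_desired_capacity` is hypothetical.

```python
def clamp_desired_capacity(requested, minimum, maximum):
    """Keep a requested capacity within the group's [minimum, maximum] range."""
    if minimum > maximum:
        raise ValueError("minimum capacity cannot exceed maximum capacity")
    return max(minimum, min(requested, maximum))


# A group with minimum 2 and maximum 10 instances.
scaled_out = clamp_desired_capacity(12, minimum=2, maximum=10)  # capped at 10
scaled_in = clamp_desired_capacity(1, minimum=2, maximum=10)    # raised to 2
in_range = clamp_desired_capacity(5, minimum=2, maximum=10)     # unchanged, 5
```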
Scaling can be driven by the following policies.
Target tracking scaling: The simplest policy to set up. For example, keep the average CPU utilization of the group at a configured target value.
Simple scaling: Based on a CloudWatch alarm, one can set up scale-out and scale-in configuration.
Step scaling: Same as simple scaling, but scaling is done in steps, without waiting for a cooldown period between scaling activities.
Scheduled scaling: Anticipate scaling based on a known usage pattern.
Predictive scaling: Forecasts load based on historical load and schedules scaling ahead of time, scaling the instances in or out. This uses machine learning.
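To make the idea of scaling "in steps" more concrete, here is a minimal sketch of how a step-based policy might choose an adjustment from the size of an alarm breach. The function name, the tuple layout, and the example thresholds are assumptions for illustration; the bounds are modelled as offsets from the alarm threshold, with the lower bound inclusive and the upper bound exclusive.

```python
def pick_step_adjustment(metric_value, threshold, steps):
    """Return the capacity change for the step that contains the breach size.

    steps is a list of (lower_offset, upper_offset, adjustment) tuples whose
    offsets are relative to the alarm threshold; None means unbounded.
    """
    breach = metric_value - threshold
    for lower, upper, adjustment in steps:
        above_lower = lower is None or breach >= lower
        below_upper = upper is None or breach < upper
        if above_lower and below_upper:
            return adjustment
    return 0  # metric is not inside any step, so capacity is left unchanged


# Hypothetical scale-out policy for an alarm at 60% average CPU.
scale_out_steps = [
    (0, 10, 1),     # 60% <= CPU < 70%: add 1 instance
    (10, 20, 2),    # 70% <= CPU < 80%: add 2 instances
    (20, None, 4),  # CPU >= 80%: add 4 instances
]

small_breach = pick_step_adjustment(65, 60, scale_out_steps)
medium_breach = pick_step_adjustment(75, 60, scale_out_steps)
large_breach = pick_step_adjustment(95, 60, scale_out_steps)
no_breach = pick_step_adjustment(50, 60, scale_out_steps)
```

A simple scaling policy corresponds to the special case of a single adjustment applied whenever the alarm fires, regardless of how far the metric is past the threshold.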
Metrics commonly used for scaling:
Average CPU Utilization
Request Count per target
Average Network In/Out
Custom metrics configured using CloudWatch
After a scaling activity, the ASG does not launch or terminate additional instances for the duration of the cooldown period (5 minutes by default, though configurable).
During this period the ASG waits for the metrics to stabilize.
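The cooldown check can be modelled as a simple time comparison. The sketch below is an illustrative simulation rather than AWS code; `can_scale` and `DEFAULT_COOLDOWN` are hypothetical names, and the 300-second value mirrors the 5-minute default mentioned above.

```python
from datetime import datetime, timedelta

DEFAULT_COOLDOWN = timedelta(seconds=300)  # the 5-minute default


def can_scale(now, last_activity_end, cooldown=DEFAULT_COOLDOWN):
    """Return True if the cooldown since the last scaling activity has elapsed."""
    if last_activity_end is None:
        return True  # no previous scaling activity, nothing to wait for
    return now - last_activity_end >= cooldown


last_activity = datetime(2024, 1, 1, 12, 0, 0)
during_cooldown = can_scale(last_activity + timedelta(minutes=2), last_activity)
after_cooldown = can_scale(last_activity + timedelta(minutes=5), last_activity)
```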
Instance refresh rolls out an updated launch template configuration by replacing the instances in the group.
After the launch template is updated, instances running on the old template are terminated and new instances are started from the newer template.
The feature keeps a minimum healthy percentage of EC2 instances in service during the replacement. This percentage determines how many instances running the old template can be terminated at a time.
One can specify the minimum healthy percentage and a warmup time; a newly launched instance is considered ready to handle traffic only after the warmup time has elapsed.
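The effect of the minimum healthy percentage on batch size can be estimated with a little arithmetic. The sketch below is a simplified model under stated assumptions: the number of instances that must stay healthy is rounded up, and at least one instance is always replaced so the refresh can make progress. The exact rounding AWS applies may differ, and `max_replaceable_at_once` is a hypothetical helper name.

```python
def max_replaceable_at_once(group_size, min_healthy_percent):
    """Estimate how many instances can be replaced in one batch."""
    if not 0 <= min_healthy_percent <= 100:
        raise ValueError("min_healthy_percent must be between 0 and 100")
    if group_size <= 0:
        return 0
    # Integer ceiling of group_size * percent / 100 (assumed rounding rule).
    must_stay_healthy = (group_size * min_healthy_percent + 99) // 100
    # Assume at least one instance is replaced so the refresh can progress.
    return max(1, group_size - must_stay_healthy)


cautious = max_replaceable_at_once(10, 90)    # 9 must stay healthy, replace 1
faster = max_replaceable_at_once(10, 50)      # 5 must stay healthy, replace 5
strictest = max_replaceable_at_once(4, 100)   # falls back to 1 at a time
```

A higher percentage makes the rollout slower but keeps more capacity in service; a lower percentage speeds it up at the cost of temporarily reduced capacity.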
Difference between Simple scaling and Step scaling