Auto Scaling

Problem

  • Most applications receive varying load over their lifetime.

  • To keep an application highly available, it must have enough resources to handle the current load.

    • That is, the application should scale out as load increases and scale in as load decreases.

  • Unhealthy instances should be removed.

  • The number of running application instances should stay between a configured minimum and maximum at all times.

  • New instances should be registered with the load balancer automatically.

Solution

  • An Auto Scaling group (ASG) addresses the problems above.

  • An ASG is typically linked to a load balancer.

  • An ASG by itself doesn't solve the problem of scalability. ELB, ASG, and a multi-AZ deployment together provide scalability and high availability.

  • ASG itself is free; you pay only for the underlying resources it launches, such as EC2 instances.

Configuring ASG

  • An ASG launches its instances from a launch template.

  • Parameters include:

    • AMI

    • Instance type

    • EC2 User Data

    • EBS volumes

    • SSH key pair

    • Security group

    • IAM role (instance profile)

    • ELB configuration

    • Network configuration

  • The following capacity settings are configured on the ASG itself (not in the launch template) when creating it.

    • Desired capacity: the initial capacity of the ASG at creation time, and the number of instances the ASG tries to maintain.

    • Minimum capacity: the minimum group size. The desired capacity cannot drop below this value.

    • Maximum capacity: the maximum group size. The desired capacity cannot exceed this value.
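
The relationship between the three capacity settings can be sketched in a few lines of Python. This is only an illustration of the rule that desired capacity stays within the minimum and maximum; the function name `clamp_desired_capacity` is hypothetical and is not part of any AWS SDK.

```python
def clamp_desired_capacity(requested: int, minimum: int, maximum: int) -> int:
    """Limit a requested desired capacity to the [minimum, maximum] range.

    Hypothetical helper for illustration; it only models the bounds rule.
    """
    if minimum > maximum:
        raise ValueError("minimum capacity cannot exceed maximum capacity")
    # Never go above the maximum, never go below the minimum.
    return max(minimum, min(requested, maximum))
```

For example, with a minimum of 2 and a maximum of 6, a request for 10 instances is limited to 6, and a request for 1 instance is raised to 2.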

Scaling policy

  • A scaling policy defines when and by how much the ASG scales.

Dynamic Scaling

Target Tracking Scaling

  • The simplest policy type to set up.

  • For example, keep the average CPU utilization of the group at a configured target value.
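
To make the idea concrete, here is a minimal sketch of how a target tracking decision could be approximated. AWS does not document the exact calculation, so the proportional formula below is an assumption for illustration, and the function name `target_tracking_capacity` is hypothetical.

```python
import math


def target_tracking_capacity(
    current_capacity: int,
    metric_value: float,
    target_value: float,
    minimum: int,
    maximum: int,
) -> int:
    """Estimate the capacity needed to bring a metric back to its target.

    Assumption: load is spread evenly, so the metric scales inversely
    with the number of instances. This is an approximation, not the
    algorithm AWS actually runs.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    proportional = math.ceil(current_capacity * metric_value / target_value)
    # The result still has to respect the group's capacity limits.
    return max(minimum, min(proportional, maximum))
```

With 4 instances at 80% average CPU and a 50% target, the sketch asks for 7 instances; at 25% average CPU it asks for 2.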

Simple Scaling

  • Based on a CloudWatch alarm, one can set up scale-out and scale-in actions.

Step Scaling

  • Same as simple scaling, but the adjustment is made in steps sized by how far the alarm threshold is breached, without a scaling cooldown period between scaling activities.
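
A rough sketch of how step adjustments could be selected is shown below. The `Step` class, its field names, and the example bounds are hypothetical; they loosely mirror the idea of lower and upper bounds measured relative to the alarm threshold.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Step:
    lower: float            # inclusive offset above the alarm threshold
    upper: Optional[float]  # exclusive offset; None means unbounded
    change: int             # number of instances to add for this step


def step_adjustment(metric_value: float, threshold: float,
                    steps: Sequence[Step]) -> int:
    """Return the capacity change for the step that contains the breach."""
    breach = metric_value - threshold
    for step in steps:
        if breach >= step.lower and (step.upper is None or breach < step.upper):
            return step.change
    return 0  # the metric is below the threshold, so nothing changes


# Hypothetical scale-out steps for an alarm threshold of 60% CPU.
SCALE_OUT_STEPS = [Step(0, 10, 1), Step(10, 20, 2), Step(20, None, 4)]
```

With these example steps, a reading of 65% adds one instance, 75% adds two, and 95% adds four, so a larger breach triggers a larger response.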

Scheduled Actions

  • Scale ahead of time based on known usage patterns.
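
As an illustration, a daily schedule can be modelled as a list of times, each setting a desired capacity. The sketch below assumes every action recurs daily and that the most recent action stays in effect until the next one; `ScheduledAction` and `capacity_at` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import time
from typing import Sequence


@dataclass(frozen=True)
class ScheduledAction:
    start: time            # time of day at which the action fires
    desired_capacity: int  # capacity set by the action


def capacity_at(now: time, actions: Sequence[ScheduledAction]) -> int:
    """Return the capacity set by the most recent daily action."""
    if not actions:
        raise ValueError("at least one scheduled action is required")
    ordered = sorted(actions, key=lambda action: action.start)
    # Before the first action of the day, yesterday's last action still applies.
    current = ordered[-1]
    for action in ordered:
        if action.start <= now:
            current = action
    return current.desired_capacity


# Hypothetical pattern: scale out for business hours, scale in at night.
DAILY = [ScheduledAction(time(8, 0), 10), ScheduledAction(time(20, 0), 2)]
```

Under this example schedule the group runs 10 instances at 09:00 and 2 instances at 23:00 or 03:00.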

Predictive Scaling

  • Forecasts load from historical data and schedules scaling ahead of time, scaling the instances in or out accordingly.

  • This uses machine learning.

Scaling Metrics

  • Average CPU Utilization

  • Request Count per target

  • Average Network In/Out

  • Custom metrics configured using CloudWatch

Scaling cooldown

  • After a scaling activity, the ASG does not launch or terminate additional instances for the duration of the cooldown period (5 minutes by default, configurable).

  • During this period, the ASG waits for the metrics to stabilize.
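
The cooldown behaviour can be sketched as a simple gate that rejects scaling requests arriving too soon after the previous activity. The class `CooldownGate` is hypothetical and uses plain numeric timestamps (in seconds) so that the logic is easy to follow.

```python
from typing import Optional


class CooldownGate:
    """Allow a scaling activity only after the cooldown has elapsed."""

    def __init__(self, cooldown_seconds: float = 300.0) -> None:
        # 300 seconds matches the 5-minute default mentioned above.
        self.cooldown_seconds = cooldown_seconds
        self._last_activity: Optional[float] = None

    def try_scale(self, now: float) -> bool:
        """Return True and record the activity if scaling is allowed at `now`."""
        if (self._last_activity is not None
                and now - self._last_activity < self.cooldown_seconds):
            return False  # still cooling down, so the request is ignored
        self._last_activity = now
        return True
```

A request at second 0 is accepted, a request at second 120 is rejected, and a request at second 300 is accepted again.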

Instance Refresh

  • Instance refresh rolls out an updated launch template by replacing existing instances with new ones created from the new template.

  • After the launch template is updated, instances running on the old template are terminated and new instances are started from the newer template.

  • This feature keeps a minimum healthy percentage of EC2 instances in service. This percentage determines how many instances running the old template can be replaced at a time.

  • One can specify the minimum healthy percentage and a warm-up time; a newly launched instance is considered ready to handle traffic only after the warm-up time has elapsed.
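
The effect of the minimum healthy percentage on the rollout speed can be sketched as follows. The function `refresh_batch_size` is hypothetical, and the sketch assumes that at least one instance is always replaced per batch so the refresh can make progress even at 100%.

```python
def refresh_batch_size(desired_capacity: int, min_healthy_percentage: int) -> int:
    """Estimate how many instances can be replaced at the same time.

    Assumption: the group keeps ceil(desired * percentage / 100) instances
    healthy and replaces the rest, but never fewer than one per batch.
    """
    if not 0 <= min_healthy_percentage <= 100:
        raise ValueError("min_healthy_percentage must be between 0 and 100")
    # Integer ceiling of desired_capacity * percentage / 100.
    must_stay_healthy = (desired_capacity * min_healthy_percentage + 99) // 100
    return max(1, desired_capacity - must_stay_healthy)
```

For a group of 10 instances, a minimum healthy percentage of 90 gives batches of one instance, while 50 gives batches of five.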
