Step Functions

About

  • It is an orchestration service that manages workflows as state machines.

  • Workflows are written in JSON using the Amazon States Language (ASL).

  • It provides a visualization of the workflow and of each execution, as well as the execution history.

  • To start a workflow we can use an SDK API call, API Gateway, EventBridge, or start it manually from the Management Console.
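
As a minimal sketch of the points above, the following Python builds a one-state workflow as a dictionary, serializes it to the JSON that ASL expects, and starts an execution through the SDK. The state name `SayHello`, the helper names, and any ARN you pass in are hypothetical. `client` is assumed to be a boto3 Step Functions client (`boto3.client("stepfunctions")`), whose `start_execution` call takes `stateMachineArn` and a JSON string `input`.

```python
import json

# A hypothetical one-state workflow, written as a dict and serialized to ASL JSON.
HELLO_DEFINITION = {
    "Comment": "Minimal example workflow",
    "StartAt": "SayHello",
    "States": {
        "SayHello": {
            "Type": "Pass",
            "Result": {"message": "hello"},
            "End": True,
        }
    },
}


def to_asl(definition: dict) -> str:
    """Serialize a workflow definition to the JSON text Step Functions accepts."""
    return json.dumps(definition, indent=2)


def start_workflow(client, state_machine_arn: str, payload: dict) -> str:
    """Start an execution and return its ARN.

    `client` is assumed to be boto3.client("stepfunctions").
    """
    response = client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),  # the API expects the input as a JSON string
    )
    return response["executionArn"]
```

Passing the client in as a parameter keeps the sketch free of credentials and makes it easy to exercise with a stub.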

Task

  • A state machine is made up of tasks.

  • A Task state is a single unit of work performed by the state machine.

  • Work that can be defined in a Task state includes:

    • Invoke a Lambda function

    • Run an AWS Batch job

    • Run an ECS task and wait for it to complete

    • Insert an item into DynamoDB

    • Publish a message to SNS or SQS

    • Launch another Step Functions workflow

  • A task can also run an activity.

    • An activity can run on an EC2 instance, in an ECS task, or on premises.

    • Activities poll Step Functions for work.

    • Once work is received, the polling system executes the task and sends the result back to Step Functions.
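
To make the list above concrete, here is a hedged sketch of three Task states chained into one workflow: a Lambda invocation, a DynamoDB write, and an SNS publish. The `Resource` values are the Step Functions service-integration ARNs for those services. The function ARN, table name, topic ARN, state names, and the `orderId` field are all hypothetical.

```python
import json

# Hypothetical resources; replace with your own.
LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:process-order"
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:order-events"

ORDER_WORKFLOW = {
    "StartAt": "ProcessOrder",
    "States": {
        # Invoke a Lambda function and keep only its returned payload.
        "ProcessOrder": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": LAMBDA_ARN, "Payload.$": "$"},
            "OutputPath": "$.Payload",
            "Next": "SaveOrder",
        },
        # Write an item to DynamoDB; ResultPath null keeps the state input as output.
        "SaveOrder": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:putItem",
            "Parameters": {
                "TableName": "Orders",
                "Item": {"orderId": {"S.$": "$.orderId"}},
            },
            "ResultPath": None,
            "Next": "NotifyCustomer",
        },
        # Publish a message to an SNS topic.
        "NotifyCustomer": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {"TopicArn": TOPIC_ARN, "Message.$": "$.orderId"},
            "End": True,
        },
    },
}

# The JSON text that would be supplied as the state machine definition.
ORDER_WORKFLOW_JSON = json.dumps(ORDER_WORKFLOW, indent=2)
```

An activity-based task would instead use the activity's own ARN as its `Resource`.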

States

| State | Description |
| --- | --- |
| Task state | A single unit of work to be performed by your state machine |
| Choice state | Test for a condition to send to a branch |
| Fail or Succeed state | Stop execution with failure or success |
| Pass state | Simply pass its input to its output or inject some fixed data without performing any work |
| Wait state | Provide a delay for a certain amount of time or until a specified date/time |
| Map state | Dynamically iterate steps |
| Parallel state | Begin parallel branches of execution |
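
The sketch below combines several of these state types (Choice, Wait, Pass, Fail, Succeed) in one small definition and adds a helper that checks every transition points at a state that exists. The state names, the `amount` field, and the helper functions are hypothetical, and the check is a simplification that does not cover Map or Parallel branches.

```python
ROUTING_DEFINITION = {
    "StartAt": "CheckAmount",
    "States": {
        # Choice: branch on a condition, with a Default when no rule matches.
        "CheckAmount": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.amount", "NumericGreaterThan": 1000, "Next": "HoldForReview"}
            ],
            "Default": "Approve",
        },
        # Wait: delay for a fixed number of seconds.
        "HoldForReview": {"Type": "Wait", "Seconds": 60, "Next": "Reject"},
        # Pass: inject fixed data without doing any work.
        "Approve": {"Type": "Pass", "Result": {"approved": True}, "Next": "Done"},
        # Fail and Succeed: stop the execution.
        "Reject": {"Type": "Fail", "Error": "AmountTooHigh", "Cause": "Amount exceeds limit"},
        "Done": {"Type": "Succeed"},
    },
}


def transition_targets(state: dict) -> list:
    """Collect every state name this state can transition to."""
    targets = []
    if "Next" in state:
        targets.append(state["Next"])
    if "Default" in state:
        targets.append(state["Default"])
    for rule in state.get("Choices", []):
        targets.append(rule["Next"])
    return targets


def dangling_targets(definition: dict) -> set:
    """Return referenced state names that are not defined in the workflow."""
    defined = set(definition["States"])
    referenced = {definition["StartAt"]}
    for state in definition["States"].values():
        referenced.update(transition_targets(state))
    return referenced - defined
```

A Choice state without a `Default` is one source of the definition errors described under Error handling, since an input that matches no rule fails the execution.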

Activity task

  • Enables you to have the task work performed by an ActivityWorker.

  • ActivityWorker apps can run on EC2, Lambda, etc.

  • These workers poll for a task using the GetActivityTask API. After an ActivityWorker completes its work, it reports success or failure using the SendTaskSuccess or SendTaskFailure API.

  • A task can be kept active by:

    • Configuring TimeoutSeconds.

    • Periodically sending a heartbeat from the ActivityWorker using SendTaskHeartbeat, within the interval set in HeartbeatSeconds.

  • By configuring a long TimeoutSeconds and actively sending heartbeats, an Activity task can wait up to 1 year.
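
The polling cycle described above can be sketched as follows. `client` is assumed to be a boto3 Step Functions client, whose `get_activity_task`, `send_task_heartbeat`, `send_task_success`, and `send_task_failure` methods correspond to the APIs named in this section. The activity ARN, the timeout values, the worker name, and the `run_one_task` helper are hypothetical.

```python
import json

# Hypothetical activity ARN; an Activity task uses it as its Resource.
ACTIVITY_ARN = "arn:aws:states:us-east-1:123456789012:activity:manual-review"

ACTIVITY_STATE = {
    "Type": "Task",
    "Resource": ACTIVITY_ARN,
    "TimeoutSeconds": 3600,   # fail the task if it runs longer than this
    "HeartbeatSeconds": 60,   # fail the task if no heartbeat arrives in this window
    "End": True,
}


def run_one_task(client, activity_arn, handler, worker_name="example-worker"):
    """Poll once for work, run `handler` on the input, and report the outcome.

    Returns True if a task was processed, False if the poll returned no work.
    """
    task = client.get_activity_task(activityArn=activity_arn, workerName=worker_name)
    token = task.get("taskToken")
    if not token:
        return False  # the long poll ended without any work

    try:
        # A long-running handler would repeat this call within HeartbeatSeconds.
        client.send_task_heartbeat(taskToken=token)
        result = handler(json.loads(task["input"]))
    except Exception as exc:
        client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
    else:
        client.send_task_success(taskToken=token, output=json.dumps(result))
    return True
```

A real worker would call `run_one_task` in a loop and send heartbeats from a background thread or timer while the handler runs.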

Wait for task token

  • Allows you to pause a Step Functions workflow at a task until a task token is returned.

  • The task might wait for other AWS services, human approval, a third-party integration, etc.

  • Append .waitForTaskToken to the Resource field to tell Step Functions to wait for the task token to be returned.

  • Pass the task token in the message body sent to the service.

  • Once the task is completed, the token is passed back through a SendTaskSuccess or SendTaskFailure API call.

  • The task stays paused until it receives the task token back through one of these calls.
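
A sketch of this callback pattern, assuming an SQS queue as the integrated service: the Task state appends `.waitForTaskToken` to the SQS `sendMessage` integration and reads the token from the context object (`$$.Task.Token`). The queue URL, the `orderId` and `taskToken` field names, and the `complete_task` helper are hypothetical; `client` is assumed to be a boto3 Step Functions client.

```python
import json

WAIT_FOR_APPROVAL = {
    "Type": "Task",
    # .waitForTaskToken pauses the workflow until the token is sent back.
    "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
    "Parameters": {
        "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/approvals",
        "MessageBody": {
            "orderId.$": "$.orderId",
            "taskToken.$": "$$.Task.Token",  # token taken from the context object
        },
    },
    "End": True,
}


def complete_task(client, message_body: str, approved: bool) -> None:
    """Resume the paused workflow using the token carried in the message body."""
    token = json.loads(message_body)["taskToken"]
    if approved:
        client.send_task_success(
            taskToken=token, output=json.dumps({"approved": True})
        )
    else:
        client.send_task_failure(
            taskToken=token, error="Rejected", cause="Approval was denied"
        )
```

Whatever consumes the queue message (a Lambda function, a human-approval tool, a third-party system) calls `complete_task` once the external work is finished.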

[Figure: sample flow diagram of the wait-for-task-token pattern]

Error handling

  • Step Functions executes many small tasks.

  • Error handling should happen in the state machine, outside of the application code.

  • Errors can be caused by:

    • State machine definition issues (for example, no matching rule in a Choice state)

    • Task failures (for example, an exception in a Lambda function)

  • Use Retry and Catch in the state machine to handle errors instead of handling them in the application code.

  • Predefined error codes include:

    | Error code | Description |
    | --- | --- |
    | States.ALL | Matches any error name |
    | States.Timeout | When task takes longer than TimeoutSeconds or no heartbeat received |
    | States.TaskFailed | Execution failure |
    | States.Permissions | Insufficient permission to execute code |

  • The state may report its own errors.

Retry

  • Allows you to define what happens on errors and to customize retry parameters such as BackoffRate, MaxAttempts, and IntervalSeconds.

  • By default, MaxAttempts is 3.

  • Once MaxAttempts is reached, Catch kicks in.
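
As an illustration of these parameters, the sketch below shows a `Retry` array as it could appear on a Task state, plus a small helper that computes how long Step Functions waits before each retry: the first wait is `IntervalSeconds`, and each later wait is multiplied by `BackoffRate`. The chosen values and the `retry_delays` helper are hypothetical; the helper's defaults mirror the ASL defaults (interval 1 second, backoff rate 2.0, 3 attempts).

```python
RETRY_POLICY = [
    {
        "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
        "IntervalSeconds": 2,   # wait before the first retry
        "BackoffRate": 2.0,     # multiplier applied to each later wait
        "MaxAttempts": 3,       # number of retries before giving up
    }
]


def retry_delays(interval_seconds=1, backoff_rate=2.0, max_attempts=3):
    """Seconds waited before each retry: interval * backoff_rate ** n."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


policy = RETRY_POLICY[0]
delays = retry_delays(
    policy["IntervalSeconds"], policy["BackoffRate"], policy["MaxAttempts"]
)
```

With the policy above, `delays` holds waits of 2, 4, and 8 seconds; after the third retry fails, control passes to any matching Catch block.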

Catch

  • Allows you to define how to handle errors once they occur or once retries are exhausted.

  • The Catch block can have a ResultPath key, for example with the value $.error, to pass the error on to the next task as part of its input.

    • Typically, ResultPath controls where the result of the current task is placed in the output passed to the next task.
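
The following sketch shows a `Catch` array with `ResultPath` set to `$.error`, together with a deliberately simplified local model of what happens when it fires: the first catcher whose `ErrorEquals` matches is chosen, and the error output (an object with `Error` and `Cause` fields) is merged into the state input under `error`. The state name `HandleFailure` and both helper functions are hypothetical, and the model only handles a top-level path such as `$.error`.

```python
import copy

CATCH_BLOCK = [
    {
        "ErrorEquals": ["States.ALL"],
        "Next": "HandleFailure",
        "ResultPath": "$.error",  # keep the original input, add the error under "error"
    }
]


def find_catcher(catchers, error_name):
    """Return the first catcher matching the error name, or None."""
    for catcher in catchers:
        names = catcher["ErrorEquals"]
        if error_name in names or "States.ALL" in names:
            return catcher
    return None


def apply_result_path(state_input, error_output, result_path):
    """Simplified merge: supports only a top-level path like "$.error"."""
    key = result_path[2:]  # strip the leading "$."
    merged = copy.deepcopy(state_input)
    merged[key] = error_output
    return merged


catcher = find_catcher(CATCH_BLOCK, "States.TaskFailed")
next_input = apply_result_path(
    {"orderId": "42"},
    {"Error": "States.TaskFailed", "Cause": "Lambda raised an exception"},
    catcher["ResultPath"],
)
```

Here `next_input` keeps `orderId` and gains an `error` object, which is what the `HandleFailure` state would receive as its input.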
