Step Functions
About
Step Functions is an orchestration service that manages workflows as state machines.
A workflow is written in JSON using the Amazon States Language (ASL). The service provides a visualization of the workflow, runs its executions, and keeps the execution history.
To start a workflow, we can use an SDK API call, API Gateway, EventBridge, or start it manually from the Management Console.
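As a rough illustration of the SDK route, here is a minimal Python sketch. It assumes a client object that behaves like boto3.client("stepfunctions"); the helper name start_workflow, the "run-" name prefix, and the payload are made up for this example.

```python
import json
import uuid


def start_workflow(sfn_client, state_machine_arn, payload):
    """Start one execution of a state machine and return the execution ARN.

    sfn_client is assumed to behave like boto3.client("stepfunctions").
    """
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        # A fresh name avoids clashing with an earlier execution.
        name=f"run-{uuid.uuid4()}",
        # The execution input is passed as a JSON string.
        input=json.dumps(payload),
    )
    return response["executionArn"]


# Possible usage against a real account (placeholders, not real values):
#   import boto3
#   start_workflow(boto3.client("stepfunctions"), "<state machine ARN>", {"orderId": 42})
```

Passing the client in as an argument keeps the helper easy to try out with a stub before pointing it at a real account.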
Task
A state machine is composed of tasks.
A Task state is a unit of work that the state machine executes. Work that can be defined in a Task state includes:
Invoke a Lambda function
Run an AWS Batch job
Run an ECS task and wait for it to complete
Insert an item into DynamoDB
Publish a message to SNS or SQS
Launch another Step Functions workflow
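To make the first item of the list concrete, here is a sketch of a one-state workflow whose Task state invokes a Lambda function. The definition is built as a Python dict and serialized to ASL JSON. The state name ProcessOrder and the function name process-order are placeholders for this example.

```python
import json

# Minimal sketch: a workflow with a single Task state.
definition = {
    "Comment": "One Task state that invokes a Lambda function",
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            # Lambda service integration; the function name below is a placeholder.
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {
                "FunctionName": "process-order",
                "Payload.$": "$",  # forward the whole state input to the function
            },
            "End": True,
        }
    },
}

# The JSON text is what would be supplied as the state machine definition.
asl_json = json.dumps(definition, indent=2)
```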
A task can also be performed by an activity.
An activity worker can run on an EC2 instance, in an ECS task, or on-premises.
Activity workers poll Step Functions for work.
Once work is received, the polling system executes the task and sends the result back to Step Functions.
States
Task state
A single unit of work to be performed by your state machine
Choice state
Test a condition and send the execution to the matching branch
Fail or Succeed state
Stop execution with failure or success
Pass state
Simply pass its input to its output or inject some fixed data without performing any work
Wait state
Provide a delay for a certain amount of time or until a specified date/time
Map state
Dynamically iterate a set of steps over each item of an input array
Parallel state
Begin parallel branches of execution
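The sketch below shows how several of these state types could fit together in one definition (Choice, Wait, Pass, Succeed, Fail). Map and Parallel are left out to keep it short. All state names, the amount field, and the helper functions are invented for the example. The small checker only verifies that every transition points at a defined state; it is not a full ASL validator.

```python
definition = {
    "StartAt": "CheckAmount",
    "States": {
        "CheckAmount": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.amount", "NumericGreaterThan": 1000, "Next": "Reject"},
                {"Variable": "$.amount", "NumericGreaterThan": 0, "Next": "CoolDown"},
            ],
            "Default": "Reject",  # taken when no rule matches
        },
        "CoolDown": {"Type": "Wait", "Seconds": 5, "Next": "AddStatus"},
        "AddStatus": {
            "Type": "Pass",
            "Result": {"status": "approved"},  # fixed data injected into the output
            "ResultPath": "$.decision",
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
        "Reject": {"Type": "Fail", "Error": "InvalidAmount", "Cause": "Amount out of range"},
    },
}


def transition_targets(state):
    """Collect every state name this state can move to."""
    targets = [state[key] for key in ("Next", "Default") if key in state]
    targets += [rule["Next"] for rule in state.get("Choices", [])]
    return targets


def undefined_targets(definition):
    """Return transition targets that do not name a defined state."""
    states = definition["States"]
    wanted = {definition["StartAt"]}
    for state in states.values():
        wanted.update(transition_targets(state))
    return sorted(wanted - set(states))
```

Omitting Default from the Choice state would reproduce the "no rule matching" definition issue whenever the input satisfies none of the rules.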
Activity task
Enables you to have the task work performed by an ActivityWorker. ActivityWorker apps can run on EC2, Lambda, etc.
These workers poll for a task using the GetActivityTask API. After an ActivityWorker completes its work, it reports success or failure using the SendTaskSuccess or SendTaskFailure API.
A task can be kept active by:
Configuring TimeoutSeconds.
Periodically sending a heartbeat from the ActivityWorker using SendTaskHeartbeat, within the interval you set in HeartbeatSeconds.
By configuring a long TimeoutSeconds and actively sending heartbeats, an ActivityTask can wait up to 1 year.
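The polling cycle described above could look roughly like this in Python. The sketch assumes a client that behaves like boto3.client("stepfunctions"); the function name poll_once, the worker name worker-1, and the handler callback are invented for the example.

```python
import json


def poll_once(sfn_client, activity_arn, handler):
    """Poll for one activity task, run handler on its input, report the outcome.

    Returns True if a task was processed, False if the poll came back empty.
    """
    task = sfn_client.get_activity_task(activityArn=activity_arn, workerName="worker-1")
    token = task.get("taskToken")
    if not token:
        # The poll ended without any work being scheduled.
        return False
    try:
        # A long-running handler would repeat this call periodically,
        # more often than the HeartbeatSeconds configured on the state.
        sfn_client.send_task_heartbeat(taskToken=token)
        result = handler(json.loads(task["input"]))
        sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    except Exception as exc:
        sfn_client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
    return True
```

A real worker would call poll_once in a loop, and would typically send heartbeats from a separate thread or timer while the handler is busy.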
Wait for task token
Allows you to pause the state machine during a task until a task token is returned.
The task might wait for other AWS services, human approval, a third-party integration, etc.
Append .waitForTaskToken to the Resource field to tell Step Functions to wait for the task token to be returned. Pass the task token in the message body of the task input.
Once the work is completed, the token is passed back through a SendTaskSuccess or SendTaskFailure API call. The task stays paused until it receives the task token back through one of these calls.
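As an illustration of this pattern, the sketch below defines a Task state that sends an SQS message carrying the task token, plus a helper that a consumer could use to hand the token back. The token is read from the context object as $$.Task.Token. The queue URL, the orderId field, the error name Rejected, and the helper name complete_task are placeholders for this example. The client is assumed to behave like boto3.client("stepfunctions").

```python
import json

# Task state fragment: the execution pauses here until the token is returned.
wait_for_approval = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
    "Parameters": {
        "QueueUrl": "https://sqs.example.invalid/approval-queue",  # placeholder
        "MessageBody": {
            "orderId.$": "$.orderId",
            "taskToken.$": "$$.Task.Token",  # token taken from the context object
        },
    },
    "End": True,
}


def complete_task(sfn_client, message_body, approved):
    """Return the task token found in an SQS message body to Step Functions."""
    token = json.loads(message_body)["taskToken"]
    if approved:
        sfn_client.send_task_success(
            taskToken=token, output=json.dumps({"approved": True})
        )
    else:
        sfn_client.send_task_failure(
            taskToken=token, error="Rejected", cause="Approval denied"
        )
```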
Sample flow diagram

Error handling
Step Functions executes many small tasks.
Error handling should happen in the state machine, outside of the task code.
Errors could be:
State machine definition issues (for example, no rule matching in a Choice state).
Task failures (for example, an exception in a Lambda function).
Use Retry and Catch in the state machine to handle errors instead of the application code.
Predefined error codes are present:
Error code | Description
States.ALL | Matches any error name
States.Timeout | The task takes longer than TimeoutSeconds, or no heartbeat is received
States.TaskFailed | Execution failure
States.Permissions | Insufficient permission to execute code
The state may report its own errors.
Retry
Allows you to define what happens on errors and customize retry parameters such as BackoffRate, MaxAttempts, and IntervalSeconds. By default, MaxAttempts is 3. Once MaxAttempts is reached, the Catch kicks in.
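A Retry block with these parameters might look like the sketch below, written as a Python list that would sit under a state's "Retry" key. The helper retry_delays is invented for this example. It shows how the wait before each retry grows with the back-off rate, and it ignores optional settings such as a maximum delay or jitter.

```python
# Retry configuration for a Task state (values chosen for illustration).
retry_policy = [
    {
        "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
        "IntervalSeconds": 2,   # wait before the first retry
        "MaxAttempts": 3,       # number of retries before giving up
        "BackoffRate": 2.0,     # multiplier applied to the wait after each retry
    }
]


def retry_delays(retrier):
    """Seconds waited before each retry: IntervalSeconds * BackoffRate ** n."""
    interval = retrier.get("IntervalSeconds", 1)  # default when the field is omitted
    attempts = retrier.get("MaxAttempts", 3)      # default when the field is omitted
    rate = retrier.get("BackoffRate", 2.0)        # default when the field is omitted
    return [interval * rate ** n for n in range(attempts)]
```

With the policy above, the three retries would wait 2, 4, and 8 seconds.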

Catch
Allows you to define how to handle errors once they occur or once retries are exhausted.
The Catch block can have a ResultPath key, which can have $.error as its value, to pass the error on to the next task as input. Typically, ResultPath controls where the result of the current task is placed, so that it can be passed to the next task.
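To show what ResultPath does with a caught error, the sketch below pairs a Catch block with a small simulation of how a result is placed into the state input. The fallback state name NotifyFailure and the helper apply_result_path are invented for this example. The helper only understands simple paths of the form $.a.b and is not a full JSONPath implementation.

```python
import copy

# Catch configuration that would sit under a Task state's "Catch" key.
catch_block = [
    {
        "ErrorEquals": ["States.ALL"],
        "ResultPath": "$.error",   # keep the original input and add the error beside it
        "Next": "NotifyFailure",   # hypothetical fallback state
    }
]


def apply_result_path(state_input, result, result_path):
    """Place result inside a copy of state_input at a simple $.a.b style path."""
    if result_path == "$":
        return result  # the result replaces the whole input
    output = copy.deepcopy(state_input)
    node = output
    keys = result_path[2:].split(".")
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = result
    return output
```

For an input of {"orderId": 42} and a caught error object with Error and Cause fields, the fallback state would receive both the original orderId and the error details under the error key.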
