Step Functions
Step Functions is an orchestration service that manages workflows as state machines.
A workflow is written in JSON using the Amazon States Language (ASL).
It provides a visualization of the workflow, runs executions of it, and keeps a history of each execution.
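As a rough sketch of what an ASL definition looks like, the snippet below builds a two-state workflow as a Python dictionary and serializes it to JSON. The state names and the Lambda ARN are placeholders invented for illustration, not part of any real account.

```python
import json

# Hypothetical two-state workflow; the state names and the Lambda ARN
# are placeholders chosen for illustration.
definition = {
    "Comment": "Minimal example workflow",
    "StartAt": "SayHello",
    "States": {
        "SayHello": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:say-hello",
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}

# Step Functions takes the definition as a JSON string.
definition_json = json.dumps(definition, indent=2)
print(definition_json)
```

Building the definition as a dictionary first makes it easy to check or generate states in code before handing the JSON to Step Functions.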
To start a workflow, we can use an SDK API call, API Gateway, EventBridge, or start it manually from the Management Console.
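Starting a workflow through the SDK could look roughly like the sketch below. It assumes a client that behaves like boto3.client("stepfunctions"); the helper name start_workflow, the ARN, and the payload are hypothetical.

```python
import json
import uuid


def start_workflow(sfn_client, state_machine_arn, payload):
    """Start one execution and return its ARN.

    sfn_client is assumed to behave like boto3.client("stepfunctions").
    """
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        name=f"run-{uuid.uuid4()}",  # a unique execution name
        input=json.dumps(payload),   # the input is passed as a JSON string
    )
    return response["executionArn"]


# Usage (needs boto3 and AWS credentials; the ARN is a placeholder):
# import boto3
# arn = start_workflow(
#     boto3.client("stepfunctions"),
#     "arn:aws:states:us-east-1:123456789012:stateMachine:MyWorkflow",
#     {"orderId": 42},
# )
```

Passing the client in as an argument keeps the helper easy to exercise with a stub instead of a real AWS connection.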
A state machine is made up of states, such as Task states.
A Task state is a unit of work executed by the state machine.
Work that can be defined in a Task state includes:
Invoking a Lambda function
Running an AWS Batch job
Running an ECS task and waiting for it to complete
Inserting an item into DynamoDB
Publishing a message to SNS or SQS
Launching another Step Functions workflow
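A few of the task types listed above can be sketched as Task states that use the Step Functions service integrations. The function, table, and topic names are placeholders; the snippet only builds the definition and does not call AWS.

```python
import json

# Placeholder resource names (function, table, topic) are assumptions.
task_states = {
    "InvokeLambda": {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {
            "FunctionName": "process-order",
            "Payload.$": "$",  # forward the whole state input
        },
        "ResultPath": None,    # discard the result, pass the input through
        "Next": "SaveItem",
    },
    "SaveItem": {
        "Type": "Task",
        "Resource": "arn:aws:states:::dynamodb:putItem",
        "Parameters": {
            "TableName": "Orders",
            "Item": {"orderId": {"S.$": "$.orderId"}},
        },
        "ResultPath": None,
        "Next": "Notify",
    },
    "Notify": {
        "Type": "Task",
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {
            "TopicArn": "arn:aws:sns:us-east-1:123456789012:order-events",
            "Message.$": "$.orderId",
        },
        "End": True,
    },
}

definition = {"StartAt": "InvokeLambda", "States": task_states}
print(json.dumps(definition, indent=2))
```

Setting ResultPath to null (None in Python) is one way to keep the original input flowing to the next state; other workflows may prefer to keep the task result instead.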
A task can also run an activity.
An activity worker can run on an EC2 instance, in an ECS task, or on-premises.
Activity workers poll Step Functions for work.
Once work is received, the polling system executes the task and sends the result back to Step Functions.
Task state
A single unit of work to be performed by your state machine
Choice state
Test a condition and send the execution to the matching branch
Fail or Succeed state
Stop execution with failure or success
Pass state
Simply pass its input to its output or inject some fixed data without performing any work
Wait state
Provide a delay for a certain amount of time or until a specified date/time
Map state
Dynamically iterate a set of steps over each item in the input
Parallel state
Begin parallel branches of execution
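To make the state types above concrete, here is an illustrative definition that uses each of them once, together with a small hypothetical helper (missing_targets) that checks that every top-level transition points at a defined state. All state names and values are invented for the example.

```python
definition = {
    "StartAt": "CheckOrder",
    "States": {
        "CheckOrder": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.total", "NumericGreaterThan": 0, "Next": "Cooldown"}
            ],
            "Default": "Rejected",
        },
        "Cooldown": {"Type": "Wait", "Seconds": 5, "Next": "AddDefaults"},
        "AddDefaults": {
            "Type": "Pass",
            "Result": {"currency": "USD"},  # inject fixed data
            "ResultPath": "$.defaults",
            "Next": "PerItem",
        },
        "PerItem": {
            "Type": "Map",
            "ItemsPath": "$.items",         # iterate over this array
            "ItemProcessor": {
                "StartAt": "Echo",
                "States": {"Echo": {"Type": "Pass", "End": True}},
            },
            "ResultPath": "$.processed",
            "Next": "FanOut",
        },
        "FanOut": {
            "Type": "Parallel",
            "Branches": [
                {"StartAt": "A", "States": {"A": {"Type": "Pass", "End": True}}},
                {"StartAt": "B", "States": {"B": {"Type": "Pass", "End": True}}},
            ],
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
        "Rejected": {
            "Type": "Fail",
            "Error": "InvalidOrder",
            "Cause": "total must be positive",
        },
    },
}


def missing_targets(machine):
    """Return transition targets that are not defined as states (top level only)."""
    states = machine["States"]
    targets = {machine["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for rule in state.get("Choices", []):
            targets.add(rule["Next"])
    return sorted(targets - set(states))


print(missing_targets(definition))
```

The helper is only a sanity check for this sketch; it does not validate nested branches or the rest of the ASL rules.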
An activity task enables you to have the task work performed by an ActivityWorker.
ActivityWorker apps can run on EC2, Lambda, etc.
These workers poll for a task using the GetActivityTask API. After an ActivityWorker completes its work, it reports success or failure using the SendTaskSuccess or SendTaskFailure API.
A task can be kept active by:
Configuring TimeoutSeconds.
Periodically sending a heartbeat from the ActivityWorker using SendTaskHeartbeat within the time you set in HeartbeatSeconds.
By configuring a long TimeoutSeconds and actively sending a heartbeat, an activity task can wait up to 1 year.
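The polling cycle described above can be sketched as a single worker iteration. This assumes a client that behaves like boto3.client("stepfunctions"); the function name run_one_task, the handler callback, and the worker name are hypothetical.

```python
import json


def run_one_task(sfn_client, activity_arn, handler, worker_name="worker-1"):
    """Poll once for an activity task, run handler on it, and report the result.

    Returns True if a task was processed, False if the poll returned no work.
    """
    # GetActivityTask is a long poll; no taskToken means no work arrived.
    task = sfn_client.get_activity_task(
        activityArn=activity_arn, workerName=worker_name
    )
    token = task.get("taskToken")
    if not token:
        return False

    try:
        # Signal that the worker is alive before starting the work.
        sfn_client.send_task_heartbeat(taskToken=token)
        result = handler(json.loads(task["input"]))
        sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    except Exception as exc:
        # Report the failure so the state machine can retry or catch it.
        sfn_client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
    return True
```

A real worker would call this in a loop and, for long-running handlers, would likely send heartbeats from a background thread at an interval shorter than HeartbeatSeconds.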
The wait-for-task-token pattern allows Step Functions to pause during a task until a task token is returned.
The task might wait for other AWS services, human approval, a third-party integration, etc.
Append .waitForTaskToken to the Resource field to tell Step Functions to wait for the task token to be returned.
Pass the task token in the message body.
Once the task is completed, the token is passed back through a SendTaskSuccess or SendTaskFailure API call.
The task stays paused until it receives the task token back through one of these calls.
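The token round trip can be sketched as a Task state that sends an SQS message carrying the token, plus a callback that returns it. The queue URL, the field names inside the message body, and the helper complete_task are assumptions made for this example; the client is assumed to behave like boto3.client("stepfunctions").

```python
import json

# Task state that sends a message to SQS, then pauses until the token comes back.
# The queue URL is a placeholder.
wait_for_approval = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
    "Parameters": {
        "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/approvals",
        "MessageBody": {
            "order.$": "$.orderId",
            "taskToken.$": "$$.Task.Token",  # token read from the context object
        },
    },
    "End": True,
}


def complete_task(sfn_client, message_body, approved):
    """Called by the message consumer: return the token to resume the workflow."""
    token = message_body["taskToken"]
    if approved:
        sfn_client.send_task_success(
            taskToken=token, output=json.dumps({"approved": True})
        )
    else:
        sfn_client.send_task_failure(
            taskToken=token, error="Rejected", cause="Approval denied"
        )
```

The consumer can be anything that is able to call the Step Functions API, for example a Lambda function triggered by the queue or a backend behind a human approval page.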
Step Functions executes many small tasks.
Error handling should happen in the state machine, outside of the task's application code.
Errors can come from:
State machine definition issues (for example, no matching rule in a Choice state)
Task failures (for example, an exception in a Lambda function)
Use Retry and Catch in the state machine to handle errors instead of handling them in the application code.
Predefined error codes include:
States.ALL
Matches any error name
States.Timeout
The task took longer than TimeoutSeconds, or no heartbeat was received
States.TaskFailed
Execution failure
States.Permissions
Insufficient permissions to execute code
A state may also report its own errors.
Retry allows you to define what happens on errors and customize retry parameters such as BackoffRate, MaxAttempts, and IntervalSeconds.
By default, MaxAttempts is 3.
Once MaxAttempts is reached, Catch kicks in.
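As an illustration of how the retry parameters interact, the sketch below defines one Retry rule and computes the wait before each retry attempt. The helper retry_delays is hypothetical, and it ignores optional settings such as a maximum delay or jitter.

```python
# One entry of a state's "Retry" array; the values are illustrative.
retry_policy = {
    "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
    "IntervalSeconds": 2,  # wait before the first retry
    "BackoffRate": 2.0,    # multiplier applied to the wait after each retry
    "MaxAttempts": 3,      # number of retries before giving up
}


def retry_delays(policy):
    """Return the wait in seconds before each retry attempt."""
    interval = policy.get("IntervalSeconds", 1)
    rate = policy.get("BackoffRate", 2.0)
    attempts = policy.get("MaxAttempts", 3)
    return [interval * rate ** n for n in range(attempts)]


print(retry_delays(retry_policy))  # [2.0, 4.0, 8.0]
```

With these values the task is retried after 2, 4, and 8 seconds; if the third retry also fails, the error is handed to Catch if one is defined.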
Catch allows you to define how to handle errors once they occur or once retries are exhausted.
The Catch block can have a ResultPath key, which can have $.error as its value, to pass the error on to the next task as part of its input.
Typically, ResultPath specifies where the result of the current task is placed, so that it can be passed on to the next task.
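The effect of ResultPath in a Catch block can be sketched with a small local simulation. The helper apply_result_path is a simplified stand-in written for this example (it handles only "$" and dotted paths such as "$.error"), and the fallback state name is hypothetical.

```python
import copy

# One entry of a state's "Catch" array.
catch_rule = {
    "ErrorEquals": ["States.ALL"],
    "ResultPath": "$.error",
    "Next": "NotifyFailure",  # hypothetical fallback state
}


def apply_result_path(state_input, result_path, result):
    """Simplified ResultPath: supports only '$' and dotted paths like '$.a.b'."""
    if result_path == "$":
        return result  # the result replaces the whole input
    output = copy.deepcopy(state_input)
    keys = result_path[2:].split(".")
    node = output
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = result
    return output


# A caught error is delivered as an object with Error and Cause fields.
error_info = {"Error": "States.TaskFailed", "Cause": "Lambda raised an exception"}
next_input = apply_result_path(
    {"orderId": "A-1"}, catch_rule["ResultPath"], error_info
)
print(next_input)
```

With "$.error" the original input is preserved and the error details are added under the error key; with "$" the error object would replace the input entirely.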