Operations

Writing Data

Operation Name
Description

PutItem

Creates a new item or fully replaces an existing item with the same primary key. Consumes WCU.

UpdateItem

Edits an existing item's attributes, or adds a new item if it does not exist. Can be used to implement atomic counters: a numeric attribute that is incremented unconditionally.

Conditional Writes

Accepts a write/update/delete only if conditions are met, otherwise returns an error. Helps with concurrent access to items.
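The atomic-counter pattern mentioned for UpdateItem above can be sketched locally. This is a minimal stand-in for the real update_item call with an ADD action, assuming a made-up table and attribute names; the point is that the increment is applied unconditionally, so concurrent writers never lose updates.

```python
# Local sketch of DynamoDB's atomic-counter semantics (UpdateItem with an
# ADD action). The dict stands in for the real table; names are illustrative.
table = {}  # maps primary key -> item (a dict of attributes)

def update_item_add(key, attribute, delta):
    """Mimics UpdateItem's ADD: creates the item/attribute if missing,
    then increments it by delta."""
    item = table.setdefault(key, {"Id": key})
    item[attribute] = item.get(attribute, 0) + delta
    return item

update_item_add("page-1", "Views", 1)
update_item_add("page-1", "Views", 1)
print(table["page-1"]["Views"])  # 2
```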

Conditional Writes

  • For PutItem, UpdateItem, DeleteItem, and BatchWriteItem, one can specify a condition expression that determines whether the item should be modified.

  • Condition expression functions and operators include:

    • attribute_exists

    • attribute_not_exists

    • attribute_type

    • contains(string)

    • begins_with(string)

    • IN (:cat1, :cat2)

    • size(length)

  • FilterExpression filters the results of read queries, while condition expressions apply to write operations.

  • An example query would look as follows,

        aws dynamodb update-item \
            --table-name ProductCatalog \
            --key '{"Id":{"N":"1"}}' \
            --update-expression "SET Price = :newval" \
            --condition-expression "Price > :limit" \
            --expression-attribute-values file://expression-attribute-values.json
  • Using attribute_not_exists(partition_key) in the condition expression ensures an item is not overwritten if it already exists; it is a trick to prevent accidental overwrites. It can be combined with attribute_not_exists(sort_key) if the table's primary key also includes a sort key.

  • More about them can be read in the AWS documentation.
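The no-overwrite trick above can be illustrated with a local Python sketch. The real service enforces this when you pass a condition like ConditionExpression="attribute_not_exists(Id)"; here a plain dict stands in for the table, and the exception name mirrors the conditional check failure DynamoDB would return.

```python
# Local sketch of a conditional PutItem guarded by attribute_not_exists:
# the write fails instead of silently replacing an existing item.
class ConditionalCheckFailed(Exception):
    pass

table = {}  # stand-in for the real table

def put_item_if_absent(item):
    key = item["Id"]
    if key in table:  # attribute_not_exists(Id) evaluates to false
        raise ConditionalCheckFailed(f"item {key!r} already exists")
    table[key] = item

put_item_if_absent({"Id": "1", "Price": 100})
try:
    put_item_if_absent({"Id": "1", "Price": 999})
except ConditionalCheckFailed:
    print("overwrite prevented")  # the original item survives
```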

Reading Data

Operation Name
Description

GetItem

Retrieves one item based on its primary key. The primary key can be HASH or HASH + RANGE. Reads are eventually consistent by default, with an option for strongly consistent reads (uses more RCU, hence might take longer). A ProjectionExpression can be specified to retrieve only certain attributes.

Query

Returns items based on a KeyConditionExpression: the partition key value is required, the sort key condition is optional. FilterExpression performs additional filtering after the Query operation and can only use non-key attributes (HASH or RANGE attributes are not allowed). Returns the number of items specified in Limit, or up to 1 MB of data; use pagination to retrieve more results. Can query a table, a Local Secondary Index, or a Global Secondary Index.

Scan

Scans an entire table and then filters out data, which is inefficient. Returns up to 1 MB of data per call; use pagination to keep reading. Consumes a lot of RCU. Also offers Parallel Scan, where multiple workers scan multiple data segments at the same time, increasing both throughput and the RCU consumed. Use Limit to reduce the impact of parallel scans. Can use ProjectionExpression and FilterExpression.
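The pagination mentioned for Query and Scan works via LastEvaluatedKey/ExclusiveStartKey: each call returns at most one page, plus a resume key if more data remains. A minimal local sketch, where an in-memory list stands in for the real table and the key is simplified to an integer offset:

```python
# Sketch of Query/Scan pagination: keep passing LastEvaluatedKey back as
# ExclusiveStartKey until the response no longer contains one.
items = [{"Id": i} for i in range(7)]  # pretend table contents

def scan_page(exclusive_start_key=None, limit=3):
    """Mimics one Scan call: returns a page and, if more data remains,
    a LastEvaluatedKey to resume from."""
    start = 0 if exclusive_start_key is None else exclusive_start_key
    page = items[start:start + limit]
    resp = {"Items": page}
    if start + limit < len(items):
        resp["LastEvaluatedKey"] = start + limit
    return resp

collected, key = [], None
while True:
    resp = scan_page(exclusive_start_key=key)
    collected.extend(resp["Items"])
    key = resp.get("LastEvaluatedKey")
    if key is None:  # no more pages
        break
print(len(collected))  # 7
```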

Delete Data

Operation Name
Description

DeleteItem

Deletes an individual item; supports conditional deletes.

DeleteTable

Deletes a whole table and all its items.

Batch Operations

  • Reduce latency by reducing the number of API calls.

  • Operations are done in parallel for better efficiency.

  • Part of a batch can fail, in which case the failed items must be retried.

Operation Name
Description

BatchWriteItem

Supports up to 25 PutItem and/or DeleteItem requests in one call: up to 16 MB of data written, and up to 400 KB of data per item. Cannot update items (use UpdateItem for that). Failed write operations are returned as UnprocessedItems; to handle the failure, retry with exponential back-off or add WCU.

BatchGetItem

Returns items from one or more tables: up to 100 items and 16 MB of data. Items are retrieved in parallel to minimize latency. Failed read operations are returned as UnprocessedKeys; to handle the failure, retry with exponential back-off or add RCU.
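The BatchWriteItem pattern above can be sketched as follows: chunk the requests into groups of at most 25 and retry whatever comes back as UnprocessedItems with exponential back-off. The flaky_batch_write function is a made-up stand-in for the real API; here an item is "throttled" deterministically until its _fails budget reaches zero.

```python
import time

def chunks(seq, size=25):
    """Split requests into BatchWriteItem-sized groups of at most 25."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

written = []

def flaky_batch_write(batch):
    """Pretend API call: items with a remaining _fails budget come back
    as UnprocessedItems, as a throttled write would."""
    unprocessed = []
    for item in batch:
        if item["_fails"] > 0:
            item["_fails"] -= 1
            unprocessed.append(item)
        else:
            written.append(item["Id"])
    return {"UnprocessedItems": unprocessed}

def batch_write_with_retry(items):
    for batch in chunks(items):
        attempt = 0
        while batch:
            batch = flaky_batch_write(batch)["UnprocessedItems"]
            if batch:
                time.sleep(min(0.001 * 2 ** attempt, 0.01))  # exponential back-off
                attempt += 1

batch_write_with_retry([{"Id": i, "_fails": i % 3} for i in range(60)])
print(len(written))  # 60
```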

Some more Operations

  1. Table Cleanup

    • Scan and DeleteItem

      • Very slow; consumes RCU and WCU, hence expensive

    • Drop Table

      • Fast, cheap and efficient

  2. Copying a Table

    • AWS Data Pipeline.

      • This will spin up an EMR (MapReduce) cluster.

      • EMR cluster will read data from DynamoDB table and write that data to S3.

      • Once the above step is done, it reads the data back from S3 and writes it to the new DynamoDB table.

    • Backup and restore

      • Takes some time.

    • AWS Glue (ETL Job)

      • Creates a script that reads from the source table and writes to the destination table.

    • Scan and PutItem/BatchWriteItem

      • Write your own code; this also allows transforming the data if needed.
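The Scan + PutItem/BatchWriteItem copy strategy can be sketched like this: page through the source table, optionally transform each item, and write it to the destination. The dicts stand in for real tables, and the price-conversion transformation is just an invented example.

```python
# Sketch of a table copy with a transformation step. Names and the
# PriceCents -> PriceDollars conversion are illustrative.
source = {str(i): {"Id": str(i), "PriceCents": i * 100} for i in range(5)}
destination = {}

def scan(table, page_size=2):
    """Yield items page by page, as a paginated Scan would."""
    items = list(table.values())
    for i in range(0, len(items), page_size):
        yield items[i:i + page_size]

for page in scan(source):
    for item in page:
        # Example transformation applied while copying.
        transformed = dict(item, PriceDollars=item["PriceCents"] / 100)
        del transformed["PriceCents"]
        destination[transformed["Id"]] = transformed

print(len(destination))  # 5
```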

Optimistic Locking

  • If multiple clients try to operate on the same data, only one of them will succeed. The other clients receive a conditional check failure (an optimistic locking exception).

  • This gives concurrent operations isolation when acting on the same data, avoiding data corruption and incorrect results.

  • This locking is achieved by maintaining a version attribute on each item. Every update increments the version. If an update is attempted with a stale version number, the operation fails.

  • This allows higher concurrency, locking the data only at commit time.
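The version-attribute mechanism above can be sketched locally. This mimics a conditional update whose condition is "version = :expected"; the table, attribute names, and exception are all illustrative stand-ins for the real service behaviour.

```python
# Sketch of optimistic locking: an update succeeds only if the caller's
# version matches the stored one, otherwise it fails like a conditional
# check failure would.
class ConditionalCheckFailed(Exception):
    pass

table = {"item-1": {"balance": 100, "version": 1}}

def update_with_version(key, new_balance, expected_version):
    item = table[key]
    if item["version"] != expected_version:  # another writer got there first
        raise ConditionalCheckFailed("stale version")
    item["balance"] = new_balance
    item["version"] += 1

update_with_version("item-1", 150, expected_version=1)  # succeeds
try:
    update_with_version("item-1", 999, expected_version=1)  # stale: fails
except ConditionalCheckFailed:
    print("lost the race: re-read the item and retry")

print(table["item-1"])  # {'balance': 150, 'version': 2}
```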

PartiQL

  • SQL compatible query language for DynamoDB.

  • Allows you to select, insert, update, and delete data in DynamoDB, across multiple DynamoDB tables.

  • Supports batch operations as well.

  • No joins are allowed.

  • Queries can be run from,

    • AWS Management Console

    • NoSQLWorkbench for DynamoDB

    • SDK

    • AWS-CLI

    • DynamoDB APIs
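As a rough illustration of what PartiQL statements for DynamoDB look like, here are the four basic operations as parameterized statement strings. The table and attribute names are hypothetical; with an SDK such as boto3 these strings would be passed to execute_statement, but here we only show the syntax.

```python
# Illustrative PartiQL statements (names are made up; '?' marks parameters).
statements = {
    "select": 'SELECT * FROM "ProductCatalog" WHERE "Id" = ?',
    "insert": 'INSERT INTO "ProductCatalog" VALUE {\'Id\': ?, \'Price\': ?}',
    "update": 'UPDATE "ProductCatalog" SET "Price" = ? WHERE "Id" = ?',
    "delete": 'DELETE FROM "ProductCatalog" WHERE "Id" = ?',
}
for name, stmt in statements.items():
    print(f"{name}: {stmt}")
```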
