
DynamoDB

Last updated 4 months ago

Brief about NoSQL

  • DynamoDB is a NoSQL (non-relational), distributed database.

  • NoSQL databases typically have no query joins or only limited support for joins.

  • They typically do not perform aggregations such as SUM or AVG.

  • They scale horizontally.

References

About

  • NoSQL distributed database.

  • Fully managed database with replication across multiple AZs.

  • Scales easily and can handle millions of requests per second, trillions of rows, and 100s of TB of storage.

  • Fast and consistent in performance.

  • It is integrated with IAM for security, authorization and administration.

  • Low cost, with Standard and Infrequent Access (IA) table classes.

Concepts

  • DynamoDB is made of tables.

  • Each table has a Primary Key, which must be specified at creation time.

  • Each table can hold an unlimited number of items.

  • Each item has attributes.

  • Maximum size of item is 400KB.

  • Data Types supported are

    • Scalar types: String, Number, Binary, Null, Boolean

    • Document types: List, Map

    • Set types: String Set, Number Set, Binary Set.

Choosing Primary Keys

  1. Partition Key (HASH)

    • Partition key must be unique for each item.

    • Must be diverse, so that data can be distributed.

  2. Partition Key and Sort Key (HASH + RANGE)

    • The combination must be unique.

    • Data is grouped by partition key.
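The two key styles above can be sketched as table definitions. This is a minimal sketch that only builds the keyword arguments in the shape boto3's `create_table` expects and makes no AWS call; the helper name `build_table_definition` and the table and attribute names are hypothetical.

```python
def build_table_definition(table_name, partition_key, sort_key=None):
    """Build keyword arguments shaped for boto3's DynamoDB create_table.

    partition_key and sort_key are (attribute_name, attribute_type) tuples,
    where attribute_type is "S" (String), "N" (Number) or "B" (Binary).
    """
    # The partition key is always declared with KeyType HASH.
    key_schema = [{"AttributeName": partition_key[0], "KeyType": "HASH"}]
    attribute_definitions = [
        {"AttributeName": partition_key[0], "AttributeType": partition_key[1]}
    ]
    if sort_key is not None:
        # The optional sort key is declared with KeyType RANGE.
        key_schema.append({"AttributeName": sort_key[0], "KeyType": "RANGE"})
        attribute_definitions.append(
            {"AttributeName": sort_key[0], "AttributeType": sort_key[1]}
        )
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand, so no RCU/WCU needed here
    }


# Hypothetical tables: one with a simple key, one with a composite key.
simple = build_table_definition("Users", ("user_id", "S"))
composite = build_table_definition("GameScores", ("user_id", "S"), ("game_ts", "N"))
```

With boto3 installed and credentials configured, the result could be passed as `client.create_table(**composite)`.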

Read Write Capacity Modes

  • There are two modes

    • Provisioned

    • On-Demand

    Provisioned

    • It is the default mode.

    • You specify the Read Capacity Units (RCU) and Write Capacity Units (WCU).

    • Capacity must be planned beforehand, although auto scaling of throughput can be set up to meet demand.

    • Pay for provisioned read and write capacity units.

    • Throughput can temporarily exceed the provisioned RCU and WCU by using Burst Capacity. Once the burst capacity is consumed, requests fail with ProvisionedThroughputExceededException. Retrying with exponential backoff is the usual way to recover from such failures.
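A retry with exponential backoff can be sketched as follows. This is a generic illustration, not the SDK's internal retry logic: `ThroughputExceeded` is a hypothetical stand-in for ProvisionedThroughputExceededException, and the delay values are arbitrary assumptions.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying on ThroughputExceeded with growing delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The delay cap doubles on each attempt: base, 2*base, 4*base, ...
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Sleep a random amount up to the cap so clients do not retry in lockstep.
            sleep(random.uniform(0, cap))


# Usage: a fake write that is throttled twice and then succeeds.
attempts = {"count": 0}


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThroughputExceeded()
    return "ok"


result = call_with_backoff(flaky_write, sleep=lambda seconds: None)
```

The `sleep` parameter is injectable only so the example can run without actually waiting.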

    Write Capacity Unit (WCU)

    • Represents one write/second for an item up to 1 KB in size.

    • If an item is larger than 1 KB, more WCUs are consumed; the item size is rounded up to the next whole KB.

    • Formula to calculate WCU is,

          (items/second) * ceil(size of each item / 1 KB)
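The WCU formula above can be checked with a small calculation. This is a minimal sketch; the function name `write_capacity_units` and the sample workloads are assumptions chosen for illustration.

```python
import math


def write_capacity_units(writes_per_second, item_size_kb):
    """WCU needed: one unit per write per second for each started 1 KB of item size."""
    return writes_per_second * math.ceil(item_size_kb)


# 10 items/second at 2 KB each
wcu_a = write_capacity_units(10, 2)
# 6 items/second at 4.5 KB each (4.5 KB rounds up to 5 KB)
wcu_b = write_capacity_units(6, 4.5)
# 120 items/minute at 2 KB each, converted to 2 items/second first
wcu_c = write_capacity_units(120 // 60, 2)
```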

    Read Capacity Unit (RCU)

    • There are two types of reads

      • Strongly Consistent Read

        • Set the ConsistentRead parameter to true in API calls.

        • Consumes twice the RCU of an eventually consistent read.

      • Eventually Consistent Read

        • It is the default read mode.

        • May return stale data if the read happens just after a write.

    • One RCU represents 1 Strongly Consistent Read per second or 2 Eventually Consistent Reads per second, for an item up to 4 KB in size.

    • If the item size is not a multiple of 4 KB, round it up to the next multiple of 4 KB.

    • If an item is larger than 4 KB, more RCUs are consumed.

    • Formula to calculate RCU is,

          (reads per second / read factor) * (item size rounded up to a multiple of 4 KB / 4 KB)

      where the read factor is 1 for strongly consistent reads and 2 for eventually consistent reads.
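The RCU formula above works the same way, with the extra read factor. A minimal sketch, assuming a hypothetical helper `read_capacity_units` and made-up workloads; the final rounding up to a whole unit is an assumption, since capacity is provisioned in whole units.

```python
import math


def read_capacity_units(reads_per_second, item_size_kb, strongly_consistent=False):
    """RCU needed for a read workload.

    One RCU covers 1 strongly consistent or 2 eventually consistent reads
    per second for an item of up to 4 KB.
    """
    four_kb_blocks = math.ceil(item_size_kb / 4)   # item size rounded up to 4 KB blocks
    read_factor = 1 if strongly_consistent else 2  # eventual consistency costs half
    return math.ceil(reads_per_second * four_kb_blocks / read_factor)


# 10 strongly consistent reads/second at 4 KB each
rcu_a = read_capacity_units(10, 4, strongly_consistent=True)
# 16 eventually consistent reads/second at 12 KB each (3 blocks of 4 KB)
rcu_b = read_capacity_units(16, 12)
# 10 strongly consistent reads/second at 6 KB each (6 KB rounds up to 8 KB)
rcu_c = read_capacity_units(10, 6, strongly_consistent=True)
```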

    On-Demand

    • Reads and writes automatically scale up/down with your workload.

    • No capacity planning is needed, so throttling caused by under-provisioned capacity is avoided (per-partition and table-level limits still apply).

    • Pay only for the capacity you use.

    • About 2.5 times more expensive than provisioned mode.

    • Charged based on Read Request Units (RRU) and Write Request Units (WRU).

    • Use cases include unknown workloads and unpredictable application traffic.

  • One can switch between both modes once every 24 hours.

Internal Partitions

  • Data is stored in partitions.

  • The partition to write to is selected based on the partition key sent by the application.

  • The partition key value is given as input to an internal hash function. The resulting hash is then used to determine the partition. The sort key and other attributes do not affect which partition is chosen.

  • The following formulas give an approximate number of partitions,

    No. of partitions by capacity = (RCUTotal / 3000) + (WCUTotal / 1000)

    No. of partitions by size = Total Data Size / 10 GB

    No. of partitions = ceil(max(No. of partitions by capacity, No. of partitions by size))

  • RCU and WCU are spread evenly across partitions.
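The partition formulas above can be sketched as a small estimator. This is only an approximation under the stated formulas, not how DynamoDB exposes partition counts; the helper name `estimate_partitions`, the minimum of one partition, and the sample numbers are assumptions.

```python
import math


def estimate_partitions(total_rcu, total_wcu, total_size_gb):
    """Estimate the partition count from provisioned capacity and data size."""
    by_capacity = total_rcu / 3000 + total_wcu / 1000
    by_size = total_size_gb / 10
    # Assumption: a table always has at least one partition.
    return max(1, math.ceil(max(by_capacity, by_size)))


# Capacity-bound table: 6000 RCU, 2000 WCU, 25 GB of data
partitions_a = estimate_partitions(6000, 2000, 25)
# Size-bound table: 1000 RCU, 500 WCU, 50 GB of data
partitions_b = estimate_partitions(1000, 500, 50)

# Since capacity is spread evenly, each partition of the first table gets:
rcu_per_partition = 6000 / partitions_a
wcu_per_partition = 2000 / partitions_a
```

The even split explains why a single hot partition can be throttled even though the table as a whole has spare capacity.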

Write Sharding

  • To solve the Hot Partition issue, where data is not evenly distributed because the partition key has too few distinct values, one can add a suffix or prefix to the partition key value to get a better distribution.

  • There are two methods to create the prefix or suffix.

    • Sharding using random suffix

    • Sharding using calculated suffix.
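Both sharding methods can be sketched in a few lines. This is a minimal illustration; the `#` separator, the shard count, the helper names and the sample key values are all assumptions.

```python
import hashlib
import random


def random_suffix_key(base_key, shard_count, rng=random):
    """Random suffix: spreads writes well, but reads must query every shard."""
    return f"{base_key}#{rng.randint(1, shard_count)}"


def calculated_suffix_key(base_key, attribute_value, shard_count):
    """Calculated suffix: derived from another attribute, so it can be recomputed on read."""
    digest = hashlib.sha256(attribute_value.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % shard_count + 1
    return f"{base_key}#{shard}"


def all_shard_keys(base_key, shard_count):
    """Every partition key value a reader must query to see all items for base_key."""
    return [f"{base_key}#{n}" for n in range(1, shard_count + 1)]


# Hypothetical example: votes for one candidate spread over 10 shards.
random_key = random_suffix_key("candidate_A", 10)
calculated_key = calculated_suffix_key("candidate_A", "voter-1234", 10)
shards = all_shard_keys("candidate_A", 10)
```

A stable hash such as SHA-256 is used here instead of Python's built-in `hash()`, which is salted per process for strings and would not give the same shard across runs.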

Throttling

  • If the application exceeds the provisioned RCU or WCU at the partition level, it receives a ProvisionedThroughputExceededException.

  • Reasons could be one of the following,

    • Hot Keys: too many reads on a single partition key.

    • Hot Partitions

    • Very large items, as RCU and WCU depend on size of items.

  • To solve the above problems one could,

    • Use exponential backoff (already included in the SDK).

    • Distribute partition keys as much as possible.

    • If reads are being throttled due to a Hot Keys issue, use DynamoDB Accelerator (DAX).

TTL

  • Automatically deletes items after an expiry timestamp.

  • It does not consume any WCU.

  • The TTL attribute must be of the Number data type and hold a Unix epoch timestamp in seconds.

  • Expired items are typically deleted within 48 hours of expiration, but the exact timing is not guaranteed.

  • Expired items that have not been deleted yet still appear in results; filter them out.

  • Expired items are also deleted from any indexes (LSI or GSI) that contain them.

  • A delete operation for each expired item enters DynamoDB Streams, which can be used to recover expired items.

  • Use cases include reducing stored data by keeping only current items and adhering to regulatory obligations.
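Setting the TTL attribute and filtering out expired items that have not been deleted yet can be sketched as follows. This is a minimal sketch on plain dictionaries with no AWS calls; the attribute name `expires_at` and the helper names are hypothetical.

```python
import time


def expiry_timestamp(ttl_seconds, now=None):
    """Unix epoch timestamp (whole seconds) at which an item should expire."""
    now = time.time() if now is None else now
    return int(now) + ttl_seconds


def is_expired(item, ttl_attribute="expires_at", now=None):
    """True if the item carries a TTL value that is already in the past."""
    now = time.time() if now is None else now
    expiry = item.get(ttl_attribute)
    return expiry is not None and expiry <= int(now)


def drop_expired(items, ttl_attribute="expires_at", now=None):
    """Filter out items that are expired but may still be returned by reads."""
    return [item for item in items if not is_expired(item, ttl_attribute, now)]


# Usage with a fixed clock so the example is deterministic.
clock = 1_700_000_000
session = {"session_id": "abc", "expires_at": expiry_timestamp(3600, now=clock)}
stale = {"session_id": "old", "expires_at": clock - 10}
no_ttl = {"session_id": "forever"}
live_items = drop_expired([session, stale, no_ttl], now=clock)
```

Items without the TTL attribute are kept, which matches the behaviour of TTL ignoring items that do not carry the attribute.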

CLI Options

| Option | Description |
| --- | --- |
| --projection-expression | One or more attributes to return in the output. |
| --filter-expression | Filters items before they are returned to the caller. |
| --page-size | The CLI still retrieves the full list of items, but does so through multiple API calls, each fetching at most the specified number of items (default: 1000). The results are combined into a single output. |
| --max-items | Maximum number of items to show in the CLI output. Returns a NextToken when more items remain. |
| --starting-token | The NextToken from a previous call, used to retrieve the next set of items. |

Session State Cache

  • DynamoDB is a serverless alternative to ElastiCache for storing session state.

  • ElastiCache is an in-memory store.

  • Both are key/value stores.

  • EFS, attached as a network drive, is a good choice when session state should be saved to disk and shared across instances.

  • Note that EBS and Instance Store can only be used for local caching and not shared caching.

  • S3 is not suitable, as it has higher latency and is not meant for small objects.

Write Types

  • There are different types of writes

    • Conditional Writes

    • Concurrent Writes

    • Atomic Writes

    • Batch Writes

Fine Grained Access Control

  • Use temporary AWS credentials tied to a restricted IAM role whose policy contains conditions.

  • This setup can limit access to specific items and attributes in DynamoDB on a per-user basis.

Security

  • VPC endpoints are available to access DynamoDB without going over the public internet.

  • Access fully controlled by IAM.

  • Encryption at rest using AWS KMS and in transit using SSL/TLS.

Backup and Restore

  • Point-in-time recovery (PITR), similar to RDS, with no performance impact.

  • On-demand backup and restore.

Global Tables

  • These are multi-Region, fully replicated, high-performance DynamoDB tables.

  • This replication is done using DynamoDB Streams.

DynamoDB Local

  • DynamoDB Local lets you run DynamoDB on your local machine.

  • It lets you develop and test applications that use DynamoDB without an internet connection.

Migrations

  • AWS Database Migration Service (DMS) can be used to migrate data to DynamoDB.

  • It supports different databases and data stores as sources, such as MongoDB, Oracle, MySQL and S3.

Event-driven programming can be done with DynamoDB Streams.

To let users access DynamoDB directly, rather than creating an IAM user for each of them, use an identity provider such as Google (which uses the OpenID Connect protocol behind the scenes) to exchange the login for temporary AWS credentials.

A sample policy would look like the one below,
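As a hedged illustration of such a policy, the sketch below builds an IAM policy document that restricts a federated user to items whose partition key equals their own identity. It relies on the `dynamodb:LeadingKeys` condition key and the `${accounts.google.com:sub}` policy variable; the table ARN, the list of actions and the helper name `leading_key_policy` are assumptions chosen for the example.

```python
import json


def leading_key_policy(table_arn, identity_variable="${accounts.google.com:sub}"):
    """Policy allowing item access only where the partition key matches the caller's identity."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:Query",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                    "dynamodb:DeleteItem",
                ],
                "Resource": table_arn,
                "Condition": {
                    # Every partition key value in the request must equal
                    # the federated user's identifier.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [identity_variable]
                    }
                },
            }
        ],
    }


# Hypothetical table ARN, for illustration only.
policy = leading_key_policy("arn:aws:dynamodb:us-east-1:123456789012:table/GameScores")
policy_json = json.dumps(policy, indent=2)
```

Attribute-level restrictions follow the same pattern with the `dynamodb:Attributes` condition key.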

SQL vs NoSQL
event driven programming
OpenID Connect