DynamoDB is a NoSQL database offered by Amazon Web Services. Unlike a relational database, a NoSQL database does not enforce a fixed schema. This means that, apart from the primary key attributes, you don't have to declare attribute names or data types upfront.
A DynamoDB table is a collection of items. Each item is a collection of attributes. An attribute is a name-value pair. The primary key is a unique identifier for an item.
There are two types of primary keys: simple and composite.
A simple primary key has a single element, a partition key. A partition key is used to determine how data is partitioned across DynamoDB's servers.
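A composite primary key has two elements, a partition key and a sort key. The sort key does not partition the data any further: items that share a partition key are stored together and kept in order by their sort key, which makes range queries within that partition key efficient.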
Let's use an example of an airline application. In this example, the partition key would be the customer ID and the sort key would be the flight ID.
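As a minimal sketch of what this key design could look like, the request parameters below describe such a table. The table name `Bookings`, the attribute names `customer_id` and `flight_id`, the string types, and the on-demand billing mode are all assumptions made for illustration.

```python
# Hypothetical table and attribute names; only the key attributes are declared.
table_definition = {
    "TableName": "Bookings",
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "flight_id", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},  # partition key
        {"AttributeName": "flight_id", "KeyType": "RANGE"},   # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
}


def key_names(definition):
    """Return (partition_key, sort_key); sort_key is None for a simple key."""
    by_type = {k["KeyType"]: k["AttributeName"] for k in definition["KeySchema"]}
    return by_type["HASH"], by_type.get("RANGE")


print(key_names(table_definition))
```

With boto3 installed and AWS credentials configured, these parameters could be passed as `boto3.client("dynamodb").create_table(**table_definition)`.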
Let's look at item collections. An item collection is the set of items that share the same partition key in a table with a composite primary key.
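To make the idea concrete, here is a small in-memory sketch (it does not call DynamoDB) that groups items by partition key and orders each group by sort key. The sample items and the attribute names `customer_id` and `flight_id` are assumptions for illustration.

```python
from collections import defaultdict

# Illustrative sample data, not taken from a real table.
items = [
    {"customer_id": "C1", "flight_id": "F200", "seat": "12A"},
    {"customer_id": "C2", "flight_id": "F100", "seat": "3C"},
    {"customer_id": "C1", "flight_id": "F100", "seat": "7F"},
]


def item_collections(items, partition_key, sort_key):
    """Group items by partition key, ordering each group by sort key."""
    groups = defaultdict(list)
    for item in items:
        groups[item[partition_key]].append(item)
    return {
        pk: sorted(group, key=lambda i: i[sort_key])
        for pk, group in groups.items()
    }


by_customer = item_collections(items, "customer_id", "flight_id")
for customer, bookings in by_customer.items():
    print(customer, [b["flight_id"] for b in bookings])
```

Here customer `C1` has an item collection of two bookings, while `C2` has a collection of one.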
Here are the steps to create item collections in the AWS console:
Let's dive deeper into the details:
Partitions: DynamoDB partitions data based on the partition key (part of the primary key). Items with the same partition key are stored together in a partition.
Distribution: Partitions are distributed across multiple servers for scalability and availability.
Read/Write Capacity Units (RCUs/WCUs): In provisioned capacity mode, you provision RCUs and WCUs to handle expected read/write traffic.
Example:
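As a rough illustration of how a partition key can map to a partition, the sketch below hashes the key and takes the result modulo a partition count. DynamoDB's real hash function and partition layout are internal, so MD5 and the modulo scheme here are stand-ins chosen only to show the principle that equal keys always land together.

```python
import hashlib


def partition_for(partition_key: str, num_partitions: int) -> int:
    """Map a partition key to a partition index (illustrative stand-in only)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# The same key always maps to the same partition.
for key in ["C1", "C2", "C3", "C1"]:
    print(key, "-> partition", partition_for(key, 4))
```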
Identify common queries and updates: This dictates table structure and indexing.
Consider read/write frequency and volume: Allocate RCUs/WCUs accordingly.
Example:
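Capacity allocation comes down to simple arithmetic. In provisioned mode, one RCU covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one WCU covers one write per second of an item up to 1 KB. The sketch below applies those rules; the item sizes and request rates are made-up inputs.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed: item size is rounded up to the next 4 KB per read."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed: item size is rounded up to the next 1 KB per write."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


print(read_capacity_units(6000, 10))          # 20
print(read_capacity_units(6000, 10, False))   # 10
print(write_capacity_units(1500, 5))          # 10
```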
Item size limit: 400 KB per item.
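Attribute size limit: Attribute names and values both count toward the item's size, so a single attribute must also fit within the 400 KB item limit.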
10 GB per partition: Distribute data evenly to avoid "hot" partitions.
Example:
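A quick pre-write check against the item size limit might look like the sketch below. It approximates item size as the UTF-8 byte length of attribute names plus string values; DynamoDB sizes numbers, binary values, and nested types differently, so treat this as an estimate for string-only items.

```python
MAX_ITEM_BYTES = 400 * 1024  # 400 KB item size limit


def approximate_item_size(item: dict) -> int:
    """Sum UTF-8 byte lengths of attribute names and string values."""
    return sum(
        len(name.encode("utf-8")) + len(str(value).encode("utf-8"))
        for name, value in item.items()
    )


def fits_in_item_limit(item: dict) -> bool:
    return approximate_item_size(item) <= MAX_ITEM_BYTES


small = {"id": "abc"}
too_big = {"blob": "x" * (400 * 1024)}  # the attribute name pushes it over
print(approximate_item_size(small), fits_in_item_limit(small))
print(fits_in_item_limit(too_big))
```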
Balance granularity and performance: Smaller items generally mean faster reads/writes.
Denormalize data if needed: Combine related data for frequent access.
Example:
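Denormalization can be pictured as copying the frequently read fields of a related record into the item that is read most often. The sketch below does this in memory for a booking and its flight; the attribute names and sample values are assumptions for illustration.

```python
# Illustrative data: a lookup of flights and one booking that references it.
flights = {"F100": {"origin": "SEA", "destination": "JFK"}}
booking = {"customer_id": "C1", "flight_id": "F100", "seat": "7F"}


def denormalize(booking, flights):
    """Return a copy of the booking with flight details embedded."""
    flight = flights[booking["flight_id"]]
    return {
        **booking,
        "origin": flight["origin"],
        "destination": flight["destination"],
    }


print(denormalize(booking, flights))
```

The trade-off is that a change to the flight's details then has to be written to every booking item that embeds them.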
Additional Considerations:
GSIs: Create secondary indexes for flexible querying, but be mindful of additional costs and write overhead.
Data modeling best practices:
Use composite primary keys (partition key + sort key) for efficient retrieval and sorting.
Consider single-table design for related data.
Use GSIs judiciously.
Monitoring and optimization: Track performance and adjust as needed.
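The GSI guidance above can be made concrete with a sketch of an index that inverts the airline example's key, so that bookings can be looked up by flight. The index name `FlightIndex`, the table name `Bookings`, and the attribute names are hypothetical; the dictionaries follow the parameter shapes used by boto3's low-level DynamoDB client.

```python
# Hypothetical GSI: partition on flight_id so one query finds all its bookings.
flight_index = {
    "IndexName": "FlightIndex",
    "KeySchema": [
        {"AttributeName": "flight_id", "KeyType": "HASH"},
        {"AttributeName": "customer_id", "KeyType": "RANGE"},
    ],
    # Projecting only keys keeps the index small and its write cost low.
    "Projection": {"ProjectionType": "KEYS_ONLY"},
}


def passengers_query(flight_id: str) -> dict:
    """Build query parameters that read one flight's bookings via the GSI."""
    return {
        "TableName": "Bookings",
        "IndexName": flight_index["IndexName"],
        "KeyConditionExpression": "flight_id = :f",
        "ExpressionAttributeValues": {":f": {"S": flight_id}},
    }


print(passengers_query("F100"))
```

The index definition would go in the `GlobalSecondaryIndexes` list when creating the table, and the query parameters could be passed to the client's `query` method.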
DynamoDB is a powerful, scalable NoSQL database, but careful design is crucial for optimal performance and cost-efficiency.
Understanding these key concepts will guide you in creating effective DynamoDB data models.
Next, let's apply what has been discussed so far.
Here's a step-by-step breakdown of the data access patterns, using the example of booking a flight:
1. Booking a Flight with DynamoDB:
a. Right Patterns and Constraints:
Tables:
Primary Keys:
Secondary Indexes:
Constraints:
b. Booking Process:
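A minimal sketch of how a booking write could be expressed, assuming the customer ID and flight ID key design from the airline example: a conditional put that only succeeds if no item with the same primary key already exists, which guards against recording the same booking twice. The table name `Bookings` and the `seat` attribute are hypothetical.

```python
def booking_put_request(customer_id: str, flight_id: str, seat: str) -> dict:
    """Build put_item parameters that refuse to overwrite an existing booking."""
    return {
        "TableName": "Bookings",  # hypothetical table name
        "Item": {
            "customer_id": {"S": customer_id},
            "flight_id": {"S": flight_id},
            "seat": {"S": seat},
        },
        # Fails the write if an item with this primary key already exists.
        "ConditionExpression": "attribute_not_exists(customer_id)",
    }


print(booking_put_request("C1", "F100", "7F"))
```

With boto3, these parameters could be passed to the low-level client's `put_item` method; a failed condition is reported as a `ConditionalCheckFailedException`.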
2. Handling Complex Filtering with External Systems:
Scenario: Users search for flights with complex criteria (price range, multiple connections, etc.).
Solution:
3. Integrating DynamoDB with Other Tools:
It is recommended to watch two other DynamoDB talks from re:Invent 2023, DAT329 and DAT330, for in-depth knowledge of DynamoDB's underlying architecture.