# Questions
Author, DynamoDB Book
This is the place for your DynamoDB questions.
A few notes to make this more helpful to you and to others:
- When asking your question, include enough context to make it helpful. If it's a data modeling question, what's the access pattern you're trying to solve? If it's an error or unexpected behavior, include examples of code or errors.
- If your question is directly related to a specific programming language, check out the language-specific spaces instead.
- If your question is directly related to The DynamoDB Book, please use the space for The DynamoDB Book.
Hello All,
I am trying to build a real-time component in my bigger application. The component runs an auction for an hour or so. There isn't much metadata to be served in this real-time component other than the actual bid value.
Other than fetching and updating the current bid, there will be no other access patterns.
5000+ users are expected to participate in the auction (90% viewers, 10% bidders). I am trying to capture this in a way that doesn't cause a hot partition, but I keep thinking of having only 1 partition with 1 item on the DynamoDB table that users read and update.
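Concretely, the write I keep picturing is something like this (just a sketch; the table name, key layout, and attribute names are placeholders I made up):

```js
// One item per auction; a bid is a conditional update that only wins if it beats
// the stored bid (or if no bid exists yet).
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();

async function placeBid(auctionId, userId, bidValue) {
  return documentClient.update({
    TableName: 'Auctions',
    Key: { PK: `AUCTION#${auctionId}`, SK: `AUCTION#${auctionId}` },
    UpdateExpression: 'SET currentBid = :bid, currentBidder = :user',
    // Reject the write unless this bid is higher than the stored one.
    ConditionExpression: 'attribute_not_exists(currentBid) OR currentBid < :bid',
    ExpressionAttributeValues: { ':bid': bidValue, ':user': userId },
    ReturnValues: 'ALL_NEW',
  }).promise();
}
```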
Would appreciate your opinion
Thanks
A couple of questions:
- Is distributed latency a concern (global tables, etc.)?
- How many auctions are you likely to have concurrently? It seems like you only really need bidId and bidValue, and you could keep an awful lot of them in memory using DAX (DynamoDB's in-memory cache): https://aws.amazon.com/dynamodb/dax/

Writing code at mobilerider.com
Just got my copy of the book, and I'm really thrilled about the potential of DynamoDB. I've been watching my fair share of conference talks from Alex and Rick and others, and there is one use case or access pattern that is not mentioned even though, to my spoiled RDS brain, it is basic: accessing the latest N items (without an obvious collection PK). I keep thinking I must be asking a very dumb question that no one has even mentioned before, but then any data that needs some kind of manual management from an Admin/CMS needs to display paginated, sorted items. Finally I found one solution in the book!
Sharding over truncated timestamps and then caching data duplicated across partitions.
My question is: does that still apply when the amount of data pouring in is on the order of a thousand items per second, i.e. a chat app for an online event where users show up at a specific moment and stay in the chat for a period of time? After the initial peak of users arriving at the site, I will most probably be wasting a lot of WCUs keeping the "most recent" items cache up to date versus the number of new users coming to the site and requesting the initial "most recent" items load.
Would it make sense in that scenario to add a random element in front of the timestamp to guarantee some level of sharding and skip the cache, even if I have to do multiple reads per initial request?
Also, is there a way to avoid a second read to ensure there will be enough items when there is a change in the timestamp key?
Thanks!!
Put a stringified timestamp as the SK and then you are able to do it. If the SK is already used, then create a GSI.
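For instance, fetching the latest N then becomes a backwards query, roughly like this (table and key names are just examples):

```js
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();

async function latestMessages(channelId, n) {
  const { Items } = await documentClient.query({
    TableName: 'ChatTable',                          // placeholder name
    KeyConditionExpression: 'PK = :pk',
    ExpressionAttributeValues: { ':pk': `CHANNEL#${channelId}` },
    ScanIndexForward: false,                         // stringified-timestamp SK, read newest-first
    Limit: n,
  }).promise();
  return Items;
}
```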
Lev
The question below discusses this a bit; it's also in the book.

Hi guys, I have one question about migration strategies. [I am new to DynamoDB]
I have read the section «Adding a new entity type into a new item collection» from «The DynamoDB Book».
In that section, we want to add Comments for Posts.
We already have this primary key pattern:
PK: POST#<PostId> SK: LIKE#<Username>
Access pattern:
Fetch a Post and the most recent Comments for that Post.
Question:
Why can we not just add a new Comment entity and update the sort key:
PK: POST#<PostId> SK: LIKE#<Username> or COMMENT#<Timestamp> (for comments)
It's like in the GitHub example, section «Assembling different collections of items».
Thanks!
Yes, but it's written in the book:
«The Post item collection is already being used on the base table. To handle this access pattern, we’ll need to create a new item collection in a global secondary index.
To do this, let’s add the following attributes to the Post item:
GSI1PK: POST#<PostId>
GSI1SK: POST#<PostId>
And we’ll create a Comment item with the following attributes:
PK: COMMENT#<CommentId>
SK: COMMENT#<CommentId>
GSI1PK: POST#<PostId>
GSI1SK: COMMENT#<Timestamp>»
Why is it so complicated? Why can we not just create a new Comment entity?
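Just to check my understanding, the fetch for the quoted pattern would then be a single query against the index, roughly like this (the table and index names are my assumptions):

```js
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();

async function fetchPostWithRecentComments(postId, commentCount) {
  const { Items } = await documentClient.query({
    TableName: 'MyTable',                            // placeholder name
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :pk',
    ExpressionAttributeValues: { ':pk': `POST#${postId}` },
    // 'COMMENT#...' sorts before 'POST#...', so reading the index backwards returns
    // the Post item first, followed by Comments from newest to oldest.
    ScanIndexForward: false,
    Limit: commentCount + 1,                         // the Post plus N comments
  }).promise();
  return Items;
}
```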

I currently have a wrapper around all my DynamoDB calls that does things like adding and updating createdAt and updatedAt item attributes and so forth. When I test, I mock out the actual DynamoDB.DocumentClient call and check that the parameters are coming in correctly.
I encounter two problems with this approach:
- I am forever introducing syntax errors on the documentClient payload
- I need to test that the actual "magic" of update and condition expressions really works, e.g. duplicate keys, counter increments, etc.
What I currently do is intercept the code I have generated in my app and then copy/paste into a scaffold bit of code. This picks up syntax errors and I can manually check that the expressions perform as expected. All a bit laborious.
I would like to write some tests to actually check that the statements really work, and for these tests to run in CI/CD pipelines.
So the test structure would set up data in the db, run the code, check values of items in the db, and then delete the items. This would involve running against a real db (either locally or remotely).
Perhaps there is another way?
O.
Good question! Curious to hear what others do here. Personally, I run the amazon/dynamodb-local Docker image, both locally and in CI/CD, but it does not support PartiQL (yet). I'm not sure about their Java implementation.
I do mock the dynamo calls wherever possible, but also use a suite of end-to-end tests that create each type of record, query, then delete them, and finally, scan the database to see if anything's left. My hope is this will reveal any mismatched operations, or orphaned records.
I'm not sure how else to reliably test the dynamo side of things, without a complex (risky) mock.
Software Engineer
For database integration tests in Java, I use DynamoDB Local as described here. For Python, I tried moto and it worked pretty well; I found this guide helpful. Also, for local development I sometimes use LocalStack, which could also be an option for integration testing.
Author, DynamoDB Book
Commented in a similar post, but I agree with the general sentiment of running certain tests against actual DynamoDB instances or emulators.
To expand on the point -- one of the nice things about DynamoDB is that it's pretty quick to create a new table, and it's basically free to create a table, run a few operations, and tear it down. Makes it quite a bit easier than other databases for running integration tests.
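As a rough sketch of that flow (the endpoint assumes the amazon/dynamodb-local image on its default port; table and key names are made up):

```js
// Create a throwaway table, run the code under test against a real DynamoDB API,
// then delete the table.
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB({
  endpoint: 'http://localhost:8000',
  region: 'local',
  credentials: new AWS.Credentials('fake', 'fake'),  // dynamodb-local accepts any credentials
});
const documentClient = new AWS.DynamoDB.DocumentClient({ service: dynamodb });

async function withTestTable(testFn) {
  const TableName = `test-${Date.now()}`;
  await dynamodb.createTable({
    TableName,
    AttributeDefinitions: [
      { AttributeName: 'PK', AttributeType: 'S' },
      { AttributeName: 'SK', AttributeType: 'S' },
    ],
    KeySchema: [
      { AttributeName: 'PK', KeyType: 'HASH' },
      { AttributeName: 'SK', KeyType: 'RANGE' },
    ],
    BillingMode: 'PAY_PER_REQUEST',
  }).promise();
  await dynamodb.waitFor('tableExists', { TableName }).promise();
  try {
    await testFn(documentClient, TableName);  // set up items, run the code, assert on results
  } finally {
    await dynamodb.deleteTable({ TableName }).promise();
  }
}
```

Dropping the endpoint and credentials overrides would point the same flow at real DynamoDB.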

I found this on the web and thought I would post it here. What you can do in an update is very powerful, which probably limits the need to use a put.
```js
{
  TableName: 'dev-table-primary',
  Key: {
    id: 'UPF#idKey',
    composite: 'SHR#hashedSource'
  },
  ExpressionAttributeNames: { '#counter': 'counter' },
  UpdateExpression: 'SET #counter = if_not_exists(#counter, :zero) + :inc, createdAt = if_not_exists(createdAt, :now)',
  ExpressionAttributeValues: {
    ':now': new Date().toISOString(),
    ':inc': 1,
    ':zero': 0
  },
  ReturnValues: 'UPDATED_NEW'
}
```
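For anyone copying it: the native values (e.g. `new Date().toISOString()`) suggest these are DocumentClient-style params, so running it would look roughly like this, assuming the object above is assigned to `params`:

```js
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();

documentClient.update(params)    // `params` is the object above
  .promise()
  .then(({ Attributes }) => console.log(Attributes.counter));  // incremented value, per ReturnValues: 'UPDATED_NEW'
```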
A follow-on question on testing this coming ....

You guys helped me tremendously with my first DynamoDB project, but I've come to realize I don't really need Cognito. Alex's "Session Store Example" seems like a good place to start, but I was wondering how to integrate GitHub/LinkedIn signin options that Cognito offers. Is there another example using this?

Software Engineer
I know that Lambda works well with DynamoDB Streams. You would normally enable streams on your single table and create a lambda that is triggered from that DynamoDB Stream. The lambda would be invoked every time an INSERT, UPDATE, or DELETE occurs on that table. So I think this lambda would always have a switch statement to determine the event type of the record it is processing. However, is there a way that we can pre-divide the workload into 3 lambdas, one for each type?
The only way I can see is to have a proxy lambda attached to the stream that directs the messages into a step function containing 3 lambdas, one for each type.
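To make the switch concrete, I mean something like this (the three handler functions are just stand-ins for wherever the work would be routed):

```js
// Stream records report eventName as INSERT, MODIFY, or REMOVE.
const handleInsert = async (newImage) => { /* route "insert" work here */ };
const handleModify = async (oldImage, newImage) => { /* route "update" work here */ };
const handleRemove = async (oldImage) => { /* route "delete" work here */ };

exports.handler = async (event) => {
  for (const record of event.Records) {
    switch (record.eventName) {
      case 'INSERT':
        await handleInsert(record.dynamodb.NewImage);
        break;
      case 'MODIFY':
        await handleModify(record.dynamodb.OldImage, record.dynamodb.NewImage);
        break;
      case 'REMOVE':
        await handleRemove(record.dynamodb.OldImage);
        break;
    }
  }
};
```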
I like to use a lambda that polls a batch of events every N seconds. Most stream events are not useful, but sometimes one creates a related EventBridge event. This handler just acts like a router (roughly sketched below) and avoids any processing that might throw an error. Downstream EventBridge handlers have their own retries/DLQs.
I'd love to hear how other people handle this.
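Roughly what that router looks like for me (the bus name, Source, and the filter condition are just examples; a real handler would also chunk batches, since PutEvents takes at most 10 entries per call):

```js
const AWS = require('aws-sdk');
const eventbridge = new AWS.EventBridge();

exports.handler = async (event) => {
  // Keep only the records worth publishing; most stream events are dropped here.
  const interesting = event.Records.filter((r) => r.eventName === 'INSERT');
  if (interesting.length === 0) return;

  await eventbridge.putEvents({
    Entries: interesting.map((r) => ({
      EventBusName: 'default',
      Source: 'my-app.ddb-stream',
      DetailType: r.eventName,
      Detail: JSON.stringify(r.dynamodb.NewImage),
    })),
  }).promise();
};
```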

