# The DynamoDB Book
Author, DynamoDB Book
This is a space to discuss The DynamoDB Book.
If you have a question about a particular strategy, example, or other section of The DynamoDB Book, post it in here.
If you notice typos, errors, or confusing issues, post those here as well!
Note that moderators may occasionally remove posts from this space if they reflect errors that have been fixed in The DynamoDB Book.
I think there's a whole chapter's worth of material on testing: what to test, how to test (unit and integration), and CI/CD pipelines. This also plays into how to structure your code to make it easy to test. In 9.2 it states:
All interaction with DynamoDB should be handled in the data module that is at the boundary of your application. There’s a lot of work to reshape your application object into the format needed by DynamoDB, and it’s not something that the core of your application should care about. Write that DynamoDB logic once, at the edge of your application, and operate on application objects the rest of the time.
Which points to creating a domain model and wrappers around the DynamoDB client (I use the Document Client personally), which lets me do some common stuff such as adding a createdAt attribute if it isn't set and updating the updatedAt attribute. Additionally, your wrapper can use condition expressions to check that put requests don't duplicate PKs, etc. All of this needs testing.
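For illustration, a rough sketch of the kind of wrapper I mean (the putEntity name, the timestamp attributes, and the PK attribute name are just my own conventions):

```js
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();

// Sketch of a put wrapper: fills in createdAt/updatedAt and refuses to
// overwrite an existing item with the same partition key.
async function putEntity(tableName, item) {
  const now = new Date().toISOString();
  return documentClient.put({
    TableName: tableName,
    Item: {
      ...item,
      createdAt: item.createdAt || now,
      updatedAt: now,
    },
    // Reject the write if an item with this PK already exists.
    ConditionExpression: 'attribute_not_exists(PK)',
  }).promise();
}
```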
My particular problem is how to unit test some of the funky things you can do, such as incrementing counters and so forth, in update expressions like this:
"UpdateExpression": "SET #counter = if_not_exists(#counter, :zero) + :inc, createdAt = if_not_exists(createdAt, :now), entity = :cmpId, cmp = :cmpId, amount = :amount",
I am continually having to copy/paste my generated expressions into a code snippet, execute it to catch syntax errors (a real problem), and then manually check that the output is as expected. An expression like the one above needs to be called more than once to validate that it works correctly. It all seems a little backward, and I need a better way.
I mock out the actual DynamoDB calls and spy on what they are receiving from my code, but is the UpdateExpression actually doing what I want it to do?
Insights into this sort of stuff, with code examples (there are no tests in the book's source), would be extremely beneficial.
Thoughts?
Author, DynamoDB Book
In the meantime, I like to use a mixture of unit tests and integration tests, with the following rules of thumb:
1. Unit tests for logic in your data access code _before_ calling DynamoDB. For example, if you have an if-statement or some conditional logic when assembling the properties to call the DynamoDB code, you should unit test to ensure it's calling DynamoDB with the right properties.
2. Integration tests for interaction between elements in your code. If you are writing to DynamoDB in one place and reading in another, I prefer to use an integration test to ensure those code paths work together. There's an implicit assumption between those two elements around how the primary key is structured, and you want to ensure the assumption holds.
3. Integration tests for advanced elements of the DynamoDB API (e.g. condition expressions, update expressions, key condition expressions). Again, these cases often involve subtle, implicit assumptions between different elements of your code base. I like to run these against DynamoDB rather than relying on mocks.
The downside of (2) and (3) is that integration tests can be slower, particularly if you're waiting on actual resources. You can speed this up by using something like Dynalite or DynamoDB Local.
Additionally, I know some folks like to go unit test only, perhaps due to the size of the codebase. I'm fine with that approach as well, provided you have other mechanisms to discover issues with DynamoDB changes (e.g. canary rollouts + rollbacks).
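As a rough illustration of (3), an integration test against DynamoDB Local might look something like the sketch below. The table name, key attributes, and the Jest test runner are just placeholders/assumptions, and it assumes the table has already been created:

```js
const AWS = require('aws-sdk');

// Point the client at DynamoDB Local (or Dynalite) for fast integration tests.
const documentClient = new AWS.DynamoDB.DocumentClient({
  endpoint: 'http://localhost:8000',
  region: 'local',
  accessKeyId: 'local',
  secretAccessKey: 'local',
});

// The data-access function under test: increments a counter atomically.
async function incrementCounter(key) {
  return documentClient.update({
    TableName: 'TestTable',
    Key: key,
    UpdateExpression:
      'SET #counter = if_not_exists(#counter, :zero) + :inc, createdAt = if_not_exists(createdAt, :now)',
    ExpressionAttributeNames: { '#counter': 'counter' },
    ExpressionAttributeValues: { ':zero': 0, ':inc': 1, ':now': new Date().toISOString() },
    ReturnValues: 'ALL_NEW',
  }).promise();
}

// Jest-style test: call the expression twice to prove the counter math works.
test('counter increments across repeated updates', async () => {
  const key = { PK: 'TEST#1', SK: 'TEST#1' };
  const first = await incrementCounter(key);
  const second = await incrementCounter(key);
  expect(first.Attributes.counter).toBe(1);
  expect(second.Attributes.counter).toBe(2);
});
```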

Solutions Architect, The Vanguard Group
Hi Alex DeBrie. In Chapter 15 of The DynamoDB Book you covered many useful migration scenarios, but there is one migration scenario that you didn't cover - migrating data from one table to another.
In Chapter 18, you give an example of creating a table with a simple primary key for storing sessions. But what if 6 months later you discover a new requirement that necessitates a sort key? I understand that in this case, it is necessary to create a new table. But what are the most common solutions for copying the data from the old table to the new one? There are many things to consider, including data validation, security, downtime, etc. As you stated in Chapter 15, migrations can be intimidating.
Author, DynamoDB Book
Hey Paul Tihansky! That's a good point, and one I don't cover in the book. There are lots of complications there, and you'd likely need to do a period of dual-writes unless you could handle downtime.
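As a very rough sketch of the dual-write phase (table names and key shapes here are just placeholders; you'd also need a backfill of existing items from the old table before cutting reads over to the new one):

```js
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();

// Dual-write sketch: every new write goes to both the old table (simple
// primary key) and the new table (composite primary key) until the
// backfill is complete and reads are switched over.
async function createSession(session) {
  await documentClient.put({
    TableName: 'SessionsOld',
    Item: { SessionId: session.sessionId, ...session },
  }).promise();

  await documentClient.put({
    TableName: 'SessionsNew',
    Item: {
      PK: `USER#${session.username}`,
      SK: `SESSION#${session.sessionId}`,
      ...session,
    },
  }).promise();
}
```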
I'm going to be working on some updates to the book in the next few months, so I'll see if I can squeeze this in :)

Hello Alex,
I have one question. What I observed is that when we try to put an item with a duplicate PK without a ConditionExpression, we don't get an error - just the same empty {} response - but the record doesn't get added as it has a duplicate PK. In fact, the following code does have provision to display an error:
```js
// Call DynamoDB to add the item to the table
documentClient.put(params, function (err, data) {
  if (err) console.log(err);
  else console.log(data);
});
```
When I include a ConditionExpression with attribute_not_exists, that also doesn't give a clear error saying we are trying to put a duplicate PK. I am getting the following response:
```
{ message: 'The conditional request failed',
  code: 'ConditionalCheckFailedException',
  time: 2020-10-09T17:05:56.981Z,
  requestId: 'NNCTU5FKJFO20UTRMTSQQEIQ8JVV4KQNSO5AEMVJF66Q9ASUAAJG',
  statusCode: 400,
  retryable: false,
  retryDelay: 16.075542132578448 }
```
So is there any way to get a proper error/error code, so that the application can handle it and inform the user in a user-friendly way?
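To be clear, what I'd like is to branch on that code and turn it into a friendly message for the user, roughly like this (the message text is just an example):

```js
documentClient.put(params, function (err, data) {
  if (err) {
    if (err.code === 'ConditionalCheckFailedException') {
      // An item with this PK already exists.
      console.log('That ID is already taken. Please choose another one.');
    } else {
      console.log(err);
    }
  } else {
    console.log(data);
  }
});
```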
Thanks

Hi Alex,
In chapter 19.3.4, Modelling the Order Items, the PK and SK for OrderItems are modelled as:
OrderItems PK: ORDER#<OrderId>#ITEM#<ItemId> SK: ORDER#<OrderId>#ITEM#<ItemId>
I was wondering why you modelled them this way and why not say
OrderItems PK: ORDER#<OrderId> SK: ITEM#<ItemId>
or
OrderItems PK: ITEM#<ItemId> SK: ITEM#<ItemId>
Kind Regards
Rob
Serverless enthusiast and consultant
Good question, Rob. Obviously I can't speak for Alex DeBrie, but it seems like for this demo example the schema is assuming that OrderItems don't come from anywhere, i.e. they only exist on an order.
Author, DynamoDB Book
Hey Rob, good question. The main point I was trying to show there was that the one-to-many pattern of 'Fetch Order and Order Items' would be handled in the secondary index. Thus, the primary key pattern in the base table was less important.
The second pattern you showed wouldn't have worked because it wouldn't have had enough uniqueness. In this model, the ItemId is an identifier for that particular item. If the PK & SK were both ITEM#<ItemId>, then the item would have been overwritten whenever *anyone* purchased that item in an order.
The first pattern (OrderId as PK, ItemId as SK) would have worked. That said, I generally avoid putting items in the same item collection unless I'm going to be fetching them as part of a Query. By putting each item into its own item collection based on the combination of OrderId and ItemId, you give DynamoDB the ability to really spread those items out. It probably doesn't matter too much here, but it's just a practice I try to follow.
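To make that concrete, an OrderItem might look roughly like this (the GSI1PK/GSI1SK attribute names are just illustrative; use whatever your secondary index expects):

```js
// Illustrative OrderItem shape.
const orderId = '1234';
const itemId = '5678';

const orderItem = {
  // Base table: each OrderItem gets its own item collection.
  PK: `ORDER#${orderId}#ITEM#${itemId}`,
  SK: `ORDER#${orderId}#ITEM#${itemId}`,
  // Secondary index: groups the Order with its OrderItems so that
  // 'Fetch Order and Order Items' is a single Query against the index.
  GSI1PK: `ORDER#${orderId}`,
  GSI1SK: `ITEM#${itemId}`,
  OrderId: orderId,
  ItemId: itemId,
};
```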

Hey Alex DeBrie,
I'm just working my way through the Big Time Deals example, and on page 362 in the `create_message` method you assign the message's `Unread` attribute a value of a string "True".
Is there a reason for using string here, rather than DDB's native BOOL type?
Author, DynamoDB Book
Nope, no particular reason on the string vs. bool. I rarely think of the bool type, and I'll need to convert it from a string value ('true') no matter which DynamoDB type it is.
And nice catch on the MESSAGE/MESSAGES part! That's an error on my end. Will update :)

In 13.2.2. Assembling different collections of items, the GitHub example describes getting a repo and all its issues, and similarly a repo and all its stars. It seems like the two filter patterns rely on the lexical order of the entries: scanning forward gives the first access pattern and scanning backwards gives the other.
I wanted to confirm that this pattern depends on the fact that there are only three types in the SK: ISSUE/REPO/STAR, and that REPO just happens to be in the middle. E.g., if you added FORK/WATCHER it would break this pattern.
Is this a practical example or just something that happens to work every now and then?
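To make sure I'm reading it right, I'm picturing the two queries roughly like this (the table name is a placeholder, and I'm assuming the repo item's SK equals its PK as in the chapter):

```js
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();

async function fetchRepoCollections(repoKey) {
  // Repo + Issues: ISSUE# sorts before REPO#, so take everything at or
  // before the repo item and scan backward (repo first, then issues).
  const repoAndIssues = await documentClient.query({
    TableName: 'GitHubTable',
    KeyConditionExpression: '#pk = :key AND #sk <= :key',
    ExpressionAttributeNames: { '#pk': 'PK', '#sk': 'SK' },
    ExpressionAttributeValues: { ':key': repoKey },
    ScanIndexForward: false,
  }).promise();

  // Repo + Stars: STAR# sorts after REPO#, so take everything at or
  // after the repo item and scan forward (repo first, then stars).
  const repoAndStars = await documentClient.query({
    TableName: 'GitHubTable',
    KeyConditionExpression: '#pk = :key AND #sk >= :key',
    ExpressionAttributeNames: { '#pk': 'PK', '#sk': 'SK' },
    ExpressionAttributeValues: { ':key': repoKey },
    ScanIndexForward: true,
  }).promise();

  return { repoAndIssues, repoAndStars };
}
```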

I'm having the toughest time squaring what the DynamoDB team says with what the Amplify team says. It seems AWS has internal conflicts on single- vs. multi-table design. One table makes the most sense to me, but Amplify seems to have been built to go against this design and keeps adding features that kinda lock you into a multi-table design. Amplify seems to be taking off and powering apps with DynamoDB behind it, so I'm guessing it will push multi-table design further just by sheer deployment numbers.
Has anyone got any feedback on this? Alex? I don’t want to build a huge app on the wrong architecture that even Amazon can’t seem to agree on.
Author, DynamoDB Book
Good question, Martin, and it's one I keep seeing. Clearly someone needs to write the definitive post on it :)
I've opined briefly here. Basically, I think GraphQL & AppSync are optimizing for different things: frontend developer happiness and ease of backend code. As part of that, they're accepting some inefficiencies in database access by having a single request make multiple hits to the database.
It's hard to say what the right approach is. I've gotten to where I say it's fine to use multiple tables with GraphQL & AppSync as long as you know and accept the tradeoffs.
There have been a few people (Rich Buggy) that have discussed using single-table design with GraphQL. And that's doable too! At that point, you're making a different tradeoff: more backend complexity in exchange for fewer database hits.
I'm working through a React/Redux course right now, so it is what it is, but when I'm done with it I think I'm going to work on a more lightweight version of Amplify to serve my own sadistic desire to put my data in a single table.
If anyone is interested, I'd be happy to post back here, and I'd welcome anyone who would like to help.
