What is The DynamoDB Place missing?

What can we be doing better?

What would make this space helpful to you?

If you have thoughts, let us know!
I believe I've found a mistake in chapter 20.3.1 (the Deals modeling example). The discussion of truncated timestamps says:

You can truncate down to any granularity you like—hour, day, week, month, year—and your choice will depend on how frequently you’re writing items. 
The issue is with using truncated timestamps to model weeks. I believe you can only truncate down to the fields the timestamp represents: second, minute, hour, day, month, year (not weeks). I came across this issue while trying to model something on a weekly boundary. I'd love it if I were mistaken, since it would solve my data modeling issue :)


Owain, in this specific example the truncated timestamps are used as PKs. I don't believe we can do a between query on PKs. Although, I suppose we could use truncated timestamps as SKs and do what you've described. However, it's not clear to me that's what was meant when this chapter said this strategy could work on the week boundary.

The example later in the chapter shows how multiple single-day partitions are queried to ensure at least 25 Deals are returned. Since the example used truncated timestamps as PKs, the application logic needed to make n queries across several single-day partitions. This pattern is probably sufficient for this example, since querying multiple partitions is the worst-case scenario. However, if querying by week is an access pattern that your app always supports (e.g. fetch all deals by week), hitting 5 (or 7) day partitions every time seems less than ideal.
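To make the multi-partition read concrete, here is a minimal sketch of the loop as I understand it. It is not the book's code: `day_key`, `fetch_latest_deals`, the key format, and the `query_partition` callback (a stand-in for a real DynamoDB Query against one day partition) are all hypothetical names chosen for illustration, and timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, List


def day_key(ts: datetime) -> str:
    """Truncate a timezone-aware timestamp to the start of its UTC day."""
    day = ts.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return day.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def fetch_latest_deals(
    query_partition: Callable[[str, int], List[str]],
    start: datetime,
    limit: int = 25,
    max_days: int = 7,
) -> List[str]:
    """Walk backwards one day partition at a time until `limit` items are found.

    `query_partition(pk, n)` is a hypothetical callback that returns up to
    `n` items from the partition whose key is `pk`.
    """
    deals: List[str] = []
    day = start
    for _ in range(max_days):
        remaining = limit - len(deals)
        if remaining <= 0:
            break
        deals.extend(query_partition(day_key(day), remaining))
        day -= timedelta(days=1)
    return deals[:limit]
```

In the worst case this issues `max_days` queries, which is the cost being weighed here when a weekly read is the common path rather than the exception.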

I thought the truncated timestamp strategy was meant to build single item collections on minute/hour/day/month/year boundaries, the result being a single partition for each timeframe. I suppose I could query truncated timestamps in the way you describe, but it seems not terribly different from querying non-truncated timestamps.


Seth Geoghegan You are quite correct that between only works on SK and not PK. I didn't review the code, but there are ways of doing it. Remember that you are not trying to optimize for storage, just for writes and then reads, so data redundancy is your friend.

If you need this as an access pattern, then create a GSI with a PK of YYYY-WW derived from the timestamp, or, if you want to keep timestamps, use the approach I suggested above.
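A rough sketch of deriving that YYYY-WW value, assuming "WW" means the ISO 8601 week number (weeks start on Monday, and the ISO year can differ from the calendar year around New Year). The function name `week_partition_key` is hypothetical, and the timestamp is assumed to be timezone-aware.

```python
from datetime import datetime, timezone


def week_partition_key(ts: datetime) -> str:
    """Build a YYYY-WW string from the ISO year and ISO week of a timestamp."""
    iso_year, iso_week, _ = ts.astimezone(timezone.utc).isocalendar()
    # Zero-pad the week so the keys also sort correctly as strings.
    return f"{iso_year}-{iso_week:02d}"
```

You would write this value onto each item as the GSI partition key attribute at insert time, so a single Query on the index returns the whole week.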

My original suggestion (SK timestamp) gives you the added benefit of giving you more access patterns e.g. previous 7 days, previous 5 days on any date you pass in as an arg.
HTH,

O.

Hey Seth Geoghegan , you could truncate down to the week by choosing the first day of a week (e.g. August 9th, 2020) and truncating the value to the beginning of that day: "2020-08-09T00:00:00.000Z".

The key here isn't that you take a timestamp and zero-out anything after a certain digit. Rather, it's making sure you have a consistent way to group a certain timestamp into some boundary. Here, it would take a timestamp, find the Sunday prior to it, and create a timestamp for 00:00 UTC on that Sunday.

Your application will need to encode the logic to handle this truncation, but you'll basically have a function that takes in a time and finds the truncated week timestamp for that time.
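Something along these lines, as a minimal sketch rather than the book's code. The function name `truncate_to_week` is made up for illustration, and it assumes the input is a timezone-aware datetime and that weeks start on Sunday at 00:00 UTC.

```python
from datetime import datetime, timedelta, timezone


def truncate_to_week(ts: datetime) -> str:
    """Return the timestamp for 00:00 UTC on the Sunday on or before `ts`."""
    utc = ts.astimezone(timezone.utc)
    # datetime.weekday() is Monday=0 ... Sunday=6, so shift by one to count
    # how many days have passed since the most recent Sunday.
    days_since_sunday = (utc.weekday() + 1) % 7
    sunday = (utc - timedelta(days=days_since_sunday)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return sunday.strftime("%Y-%m-%dT%H:%M:%S.000Z")
```

Every timestamp in the same Sunday-to-Saturday window maps to the same string, so it can be used as the partition key in the same way as a day-truncated value.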

I may be misunderstanding, so let me know if that doesn't make sense :)
Possible to somehow highlight the new posts since my last visit and add a number next to the left-nav items to show how many new posts there are?
I believe this already happens.  Or maybe it's new?
I don't see any numbering like what's shown in your screenshot, but it could be because of the still relatively low volume. 
In Section 8.1.4. Other benefits of single-table design, you mention:
1. reducing the number of requests for an access pattern 
2. reducing the operational overhead with each table you have in DynamoDB
3. cost

For #2, I think it's rather trivial since most folks are going to use something like CloudFormation to configure their table and operational aspects (like on-demand or auto-scaling, alarms, dashboards etc.). I would rank it lower than **cost** and almost make it a footnote.

To me, the biggest benefit is tied to #1 and is rather hidden today: having built some DDB tables without using this pattern (almost the Faux-SQL pattern), the biggest benefit is reducing the complexity in working with multiple DDB calls serially, rather than just the time/latency it takes to do that. 

Considering that data may be changing behind the scenes (and often does, in such high-scale cases), a multiple-request-to-DDB-serially approach needs to account for such changes (i.e. the referred item is changed/deleted in the first table while the second table is being queried). This is complexity on its own (besides increasing the latency of the service) that can be eliminated with the single-table partition-key based query.

Thoughts?
Re: #2, I think it's worth noting burst capacity is related to average traffic, so a single table design with more average traffic gives autoscaling more time to get ready.  We saw this with a table of scheduled events where users wanted everything to run on the hour.
Hi Alex, loving the book so far, thanks for writing this!

I did find what looks like a minor syntax mistake.  Would you like to hear about things like that, and where should we send them?
Thanks, Levi!

Yep, it'd be great to hear about those fixes. A bunch of folks have helped me out already, which I'm grateful for :). 

You can leave them here, or you can email them to me -- alex@alexdebrie.com . Thanks!
I'll post them here in the hopes that you can avoid getting duplicate bug reports:

Cheatsheets, page 31/54 has two 'KeyConditionExpression'

Cheatsheets, pages 30, 41, and 52 use the same copy/paste typo "CleintRequestToken"
 
Again, loving the book so far.  Your tips have already improved the schema I'm currently working on, by opening up several new access patterns.
I haven't seen this forum software before, what is it? The design is very clean but it doesn't seem to update automatically, so I don't see new posts etc if I don't refresh the page? Also it has an excessive amount of clicking to see more - like having to click to see a reply, then click again to see all of it rather than the first couple of lines.

For this type of forum, the best I've used elsewhere is Discourse, although probably rather late to reconsider that.
Hey Mark, it's a tool called Circle. They're in beta right now but launching soon.

Agree on the clicking -- will pass that back to them as feedback.
It would be awesome if it supported markdown too.