Emanuele Ricci

Developer
All the examples I've seen use the secondary index to query (because they could have multiple results).

Is it possible to get an item (not query) using the GSI?
Hi Emanuele,

getItem is not possible for GSIs since you could have duplicates (GSI items don't need to be unique while the table key must be unique). That's why you can only use the query operation to get items or a single item with the GSI.
Thank you very much for the explanation. I got confused by the IndexName property of the DocumentQuery. Alex DeBrie, do you still advise naming GSIs "GSI1", "GSI2", and so on?
On the PutItem you can add some constraints to effectively make the GSI a unique index, and throw an error if more than one item is returned (e.g. through velocity templates). That way you effectively handle it in the application (i.e. in the domain object).
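As a rough illustration of handling this in the application, the sketch below queries a GSI for one key value and raises if more than one item comes back. The helper name `get_single_by_gsi` and the injected `query` callable are hypothetical; `query` is assumed to accept the same keyword arguments as the low-level boto3 `client.query` call, and the key attribute is assumed to be a string.

```python
from typing import Any, Callable, Dict, Optional


def get_single_by_gsi(
    query: Callable[..., Dict[str, Any]],
    table_name: str,
    index_name: str,
    pk_name: str,
    pk_value: str,
) -> Optional[Dict[str, Any]]:
    """Return the one item stored under pk_value in the GSI, or None.

    `query` is a hypothetical injected callable (e.g. a boto3 client's
    `query` method). GetItem is not available on a GSI, so we Query and
    enforce uniqueness ourselves.
    """
    response = query(
        TableName=table_name,
        IndexName=index_name,
        KeyConditionExpression="#pk = :pk",
        ExpressionAttributeNames={"#pk": pk_name},
        ExpressionAttributeValues={":pk": {"S": pk_value}},
        Limit=2,  # two items are enough to detect a duplicate
    )
    items = response.get("Items", [])
    if len(items) > 1:
        raise ValueError(f"{index_name} holds more than one item for {pk_value!r}")
    return items[0] if items else None
```

Asking for at most two items keeps the read small while still revealing a duplicate.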
Hi everyone, I started modeling my DB and I would like to know if it's possible to solve this access pattern in a good way. 

My access patterns are:
  • Get the top 100 scores
  • Get the user best score and ranking among top scores ("your rank is #24 of 343434 players")

I have a table with this model (it's a simplified version, because I actually organize scores by game chapter, but that does not change the problem).
Users can have multiple runs of the game, and I will use the GSI as a sparse index that tracks each user's best score (a user can have only one best score).

Game Run

Primary Key:
  • PK: USER#<UserId>
  • SK: RUN#<RunId>

GSI1:
  • GSI1PK: LEADERBOARD
  • GSI1SK: Score

Attributes
  • RunId
  • UserId
  • Score
  • GSI1PK (not null if it's the best score)
  • GSI1SK (not null if it's the best score)

Access Pattern 1: Leaderboard Top 100

I can query GSI1 with Limit=100 and ScanIndexForward=false, which solves this access pattern.
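For reference, the request for this pattern might look like the sketch below. It only builds the keyword arguments for the low-level boto3 `client.query` call; the function name and the table name are hypothetical, and it assumes GSI1SK is stored as a Number so that scores sort numerically.

```python
from typing import Any, Dict


def top_scores_query_kwargs(table_name: str, limit: int = 100) -> Dict[str, Any]:
    """Build Query parameters for the leaderboard (highest scores first)."""
    return {
        "TableName": table_name,
        "IndexName": "GSI1",
        "KeyConditionExpression": "GSI1PK = :pk",
        "ExpressionAttributeValues": {":pk": {"S": "LEADERBOARD"}},
        "ScanIndexForward": False,  # descending order on GSI1SK
        "Limit": limit,
    }
```

With a real client this would be used as `client.query(**top_scores_query_kwargs("GameTable"))`, where "GameTable" is a placeholder name.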

Access Pattern 2: User current rank in the leaderboard

This is where I'm stuck. The only approach that has come to mind is to filter the records where GSI1SK is <= the user's score and then let the server-side logic work out the user's rank.
The problem is that I could hit the 1MB limit of the Query (I need to think big: what if millions of users play the game and everyone has a personal best score?). I would then need to keep repeating the query until I find the user's record. It would be a waste of RCUs (and I still haven't figured out how to calculate them :D).
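To make the cost concrete, here is a minimal sketch of the paginated approach described above, plus a rough RCU estimate. It is not a recommended solution, only an illustration of why it gets expensive. Assumptions: the `query` callable is hypothetical and takes the same keyword arguments as boto3's `client.query`; the sketch counts scores strictly higher than the user's (so rank = count + 1); and the RCU helper uses the standard rule of 1 RCU per 4 KB for strongly consistent reads and half that for eventually consistent reads.

```python
import math
from typing import Any, Callable, Dict, Tuple


def count_players_ahead(
    query: Callable[..., Dict[str, Any]], table_name: str, user_score: int
) -> Tuple[int, int]:
    """Count leaderboard entries with a higher score, following pagination.

    Returns (players_ahead, pages_fetched). Each page covers at most 1 MB
    of index data, so large leaderboards need many round trips.
    """
    kwargs: Dict[str, Any] = {
        "TableName": table_name,
        "IndexName": "GSI1",
        "KeyConditionExpression": "GSI1PK = :pk AND GSI1SK > :score",
        "ExpressionAttributeValues": {
            ":pk": {"S": "LEADERBOARD"},
            ":score": {"N": str(user_score)},
        },
        "Select": "COUNT",  # returns only counts, but still consumes RCUs
    }
    total, pages = 0, 0
    while True:
        response = query(**kwargs)
        total += response["Count"]
        pages += 1
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return total, pages
        kwargs["ExclusiveStartKey"] = last_key  # resume where the page ended


def estimate_query_rcu(bytes_read: int, strongly_consistent: bool = False) -> float:
    """Estimate RCUs for one Query: total bytes read, rounded up to 4 KB units."""
    units = math.ceil(bytes_read / 4096)
    return float(units) if strongly_consistent else units / 2
```

Under these assumptions a full 1 MB page costs about `estimate_query_rcu(1024 * 1024)` = 128 RCUs with eventually consistent reads, and that cost repeats for every page.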

Do you have a solution to this problem? Is DynamoDB not the right tool for this access pattern?


Access Pattern 2

I have a generic item type, COUNTER#, in which I maintain the totals using DynamoDB Streams. It is based on this strategy (http://www.railstips.org/blog/archives/2011/06/28/counters-everywhere/).

Something along these lines (streams and a separate item type) might be worth considering.

HTH
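A minimal sketch of what such a stream-maintained counter could look like, assuming the stream is configured to deliver both old and new images and that an item counts toward the leaderboard total only while it carries GSI1PK. The function names, the counter key `COUNTER#LEADERBOARD`, and the `Total` attribute are hypothetical. This only maintains the total number of ranked players, not an individual rank.

```python
from typing import Any, Dict, List


def leaderboard_delta(records: List[Dict[str, Any]]) -> int:
    """Net change in the number of leaderboard entries for a stream batch."""
    delta = 0
    for record in records:
        images = record.get("dynamodb", {})
        was_listed = "GSI1PK" in images.get("OldImage", {})
        is_listed = "GSI1PK" in images.get("NewImage", {})
        # +1 when an item enters the sparse index, -1 when it leaves it
        delta += int(is_listed) - int(was_listed)
    return delta


def counter_update_kwargs(table_name: str, delta: int) -> Dict[str, Any]:
    """Build UpdateItem parameters that atomically add delta to the counter."""
    return {
        "TableName": table_name,
        "Key": {
            "PK": {"S": "COUNTER#LEADERBOARD"},  # hypothetical counter item key
            "SK": {"S": "COUNTER#LEADERBOARD"},
        },
        "UpdateExpression": "ADD #total :delta",
        "ExpressionAttributeNames": {"#total": "Total"},
        "ExpressionAttributeValues": {":delta": {"N": str(delta)}},
    }
```

In a stream handler, the batch delta would be computed first and the update sent only when it is non-zero, e.g. `client.update_item(**counter_update_kwargs("GameTable", delta))`.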
Hi everyone, 

In my current database, I need to dump some data into an attribute to restore it when loaded. That attribute will not be used in any index or queried via Expressions so it just needs to be there. 

Is it better to store it as a Map/List or just as a string? I suspect the JSON string version might be better, because with a Map DynamoDB has to add the overhead structure needed to record every subfield's type.
On the other hand, the JSON string dump could have a lot of unnecessary chars like quote escapes.

What do you think?
Just to be clear: Your definition of "better" in this case is the method that uses less space in DynamoDB?

If that is the case, then I think Maps/Lists are the way to go. Looking at the docs you can see that there is a 3 byte overhead for the native structures:

An attribute of type List or Map requires 3 bytes of overhead, regardless of its contents. The size of a List or Map is (length of attribute name) + sum (size of nested elements) + (3 bytes) . The size of an empty List or Map is (length of attribute name) + (3 bytes).
Given that strings are UTF-8 encoded in DDB, as soon as the JSON string contains more than 3 characters that are not part of an attribute name or value (e.g. quotes, brackets, whitespace), the string will take more space. Obviously you will have more than that in any usable/complex JSON object!

The only downside I can think of is that if you're dumping the items out in DDB native JSON, then you will have all of the type fields (e.g. "S" for strings, etc) which will take more space on your disk, hence my clarification above.
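As a rough worked example of the comparison above, the sketch below applies only the quoted formula (3 bytes of overhead per List or Map, UTF-8 length for strings and attribute names). The helper names are hypothetical, only strings, lists, and maps are handled, and the real accounting may include further per-element overhead, so treat the numbers as estimates.

```python
import json
from typing import Any


def value_size(value: Any) -> int:
    """Estimated stored size of a value, excluding its attribute name."""
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    if isinstance(value, list):
        return 3 + sum(value_size(v) for v in value)
    if isinstance(value, dict):
        return 3 + sum(len(k.encode("utf-8")) + value_size(v) for k, v in value.items())
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def attribute_size(name: str, value: Any) -> int:
    """Estimated size of one attribute: name length plus value size."""
    return len(name.encode("utf-8")) + value_size(value)


payload = {"name": "Ann", "tags": ["a", "b"]}
as_json = json.dumps(payload, separators=(",", ":"))

print(attribute_size("data", payload))  # Map version: 23 bytes
print(attribute_size("data", as_json))  # JSON string version: 35 bytes
```

Even with compact separators, the quotes, colons, and brackets in the JSON string outweigh the fixed overhead of the native Map and List in this example.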