jeudi 24 septembre 2015

How to index a two column database in AWS (DynamoDB or SQL) for preformance

I want an efficient way to store ItemIDs and AttributeIDs for querying in DynamoDb.

There are a few thousand unique ItemIDs and 300 unique AttributeIDs. Each ItemID is usually associated with 20-100 AttributeIDs. Each AttributeID is associated with 20,000-40,000 ItemIDs There will be more AttributeIDs added with time.

I would like to be able to query to find:

  1. Return me the AttributeIDs associated with a given ItemID
  2. Find me the ItemIDs associated with a given AttributeID

My initial solutions are

  1. DynamoDB: Have a table where one ItemID matches an AttributeID. The primary key will be a Hash and Rage where the Hash = ItemID and Range = AttributeID. Have a global secondary index, Hash and Range where Hash= AttributeID and Range = ItemID.

  2. Same thing as above but with SQL, though this will result in about 9 million entries which will be slow for performance, especially as I add more AttributeIDs.

Is my solution okay or what is a better way of doing it? With the way I'm suggesting, would there be any difference in performance between an SQL and DynamoDB version (due to underlying table implementations)?




Aucun commentaire:

Enregistrer un commentaire