amazon web services: Amazon Redshift doing Hash Join even when joined on column that is both Dist Key and Sort Key

Tuesday, 31 March 2015

I have a fact table in Redshift with about 1.3 billion rows. Its distribution key is c1 and its sort key is (c1, c2).

I need to join this table with itself on c1 (that is, c1 from the first instance of the table = c1 from the second instance).

Looking at the query plan, Redshift appears to be doing a Hash Join with DS_DIST_NONE. DS_DIST_NONE is expected, since c1 is the distribution key on both sides of the join. But because c1 is also the leading sort key column, I expected Redshift to do a Merge Join rather than a Hash Join.
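To make the expectation concrete, here is a minimal Python sketch of the two join strategies on small in-memory lists. It is only an illustration of the general algorithms, not Redshift's implementation; the function names and the sample rows are hypothetical. The point is that a merge join needs both inputs already sorted on the join key, which is what a sort key on c1 would normally provide, while a hash join has to build a hash table first.

```python
from collections import defaultdict


def merge_join(left, right):
    """Join two lists of (key, value) rows that are ALREADY sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the end of the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row in the run with every right row in the run.
            while i < len(left) and left[i][0] == left_key:
                for k in range(j, j_end):
                    out.append((left_key, left[i][1], right[k][1]))
                i += 1
            j = j_end
    return out


def hash_join(left, right):
    """Join two lists of (key, value) rows; no ordering is required."""
    # Build phase: hash the right input on the join key.
    table = defaultdict(list)
    for key, value in right:
        table[key].append(value)
    # Probe phase: look up each left row in the hash table.
    out = []
    for key, left_value in left:
        for right_value in table.get(key, ()):
            out.append((key, left_value, right_value))
    return out


# Hypothetical rows (c1, payload), sorted by c1 as a sort key would keep them.
rows = [(1, "a"), (1, "b"), (2, "c"), (4, "d")]
self_join_merge = merge_join(rows, rows)
self_join_hash = hash_join(rows, rows)
```

On sorted input both functions return the same rows. If either input is not fully sorted, the merge join above silently misses matches, which is why a planner can only pick it when it can rely on the sort order.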

I believe this is slowing down my query.

Can anyone explain why Redshift might choose a Hash Join instead of a Merge Join here, even though the join column is both the DIST key and the leading SORT key column and the plan shows DS_DIST_NONE?
