From my Redshift cluster's performance panel, I can see that one node holds roughly twice as much data as the others, which also pushes its CPU utilization significantly higher. The database has a few dozen large tables using distkey-based distribution, and I haven't been able to find which ones are unevenly distributed.
Searching the documentation, I saw that the SVV_TABLE_INFO view has a column called skew_rows. Is that the number I'm looking for?
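
For reference, a minimal sketch of how that column could be inspected from Python. The documentation describes skew_rows as the ratio of the number of rows in the slice with the most rows to the number of rows in the slice with the fewest rows, so the query simply lists tables with the highest ratio first. The function name `find_skewed_tables`, the `min_skew` cutoff of 1.5, and the use of a DB-API cursor with `%s` placeholders (as in psycopg2) are assumptions made for illustration, not part of the original setup.

```python
# Sketch only: list tables from SVV_TABLE_INFO ordered by skew_rows.
# Assumes a DB-API 2.0 cursor connected to the Redshift database and
# a driver that uses %s placeholders (for example psycopg2).

SKEW_QUERY = """
SELECT "schema", "table", diststyle, tbl_rows, skew_rows
FROM svv_table_info
WHERE skew_rows >= %s
ORDER BY skew_rows DESC
"""


def find_skewed_tables(cursor, min_skew=1.5):
    """Return (schema, table, diststyle, tbl_rows, skew_rows) tuples.

    min_skew is a hypothetical cutoff; tune it to the cluster at hand.
    """
    cursor.execute(SKEW_QUERY, (min_skew,))
    return cursor.fetchall()
```

Tables near the top of the result with a KEY distribution style would be the first candidates to check against the overloaded node.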