05 Dec

DISTRIBUTED RANDOMLY useless

I defined a partitioned table with random distribution, but when I run an insert, all of the data lands on a single segment. After adding an auto-incrementing column, it's OK.

I'm not sure I completely understand the statement. Distributions are how we define where in the cluster the data goes. When it's random, each row goes to one of the segments, and the segments generally stay in balance. (Defining your own distribution strategy allows for less data movement on joins.) So if your queries are always going to go through all of the data, keeping the data balanced throughout the cluster may be advantageous, and random is nice for that.
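To make the balance argument concrete, here is a rough sketch in Python. It is a toy model only and does not reproduce how the database actually picks a segment: the function names, the 8-segment cluster, the row count, and the use of CRC32 as a stand-in hash are all assumptions made for illustration.

```python
import random
import zlib
from collections import Counter

NUM_SEGMENTS = 8      # hypothetical cluster size
NUM_ROWS = 100_000    # hypothetical number of inserted rows


def distribute_randomly(num_rows, num_segments, seed=0):
    """Toy model of random distribution: each row picks a segment uniformly."""
    rng = random.Random(seed)
    return Counter(rng.randrange(num_segments) for _ in range(num_rows))


def distribute_by_key(keys, num_segments):
    """Toy model of key-based distribution: the hash of the key picks the segment.

    CRC32 is only a stand-in here, not the hash function the database uses.
    """
    return Counter(zlib.crc32(str(k).encode()) % num_segments for k in keys)


# Random placement: rows spread roughly evenly over all segments.
random_counts = distribute_randomly(NUM_ROWS, NUM_SEGMENTS)

# Key-based placement where every row shares one key value: everything
# lands on a single segment (data skew).
skewed_counts = distribute_by_key(["same-key"] * NUM_ROWS, NUM_SEGMENTS)

# Key-based placement on a unique, incrementing key: rows spread out again.
unique_counts = distribute_by_key(range(NUM_ROWS), NUM_SEGMENTS)

for name, counts in [
    ("random", random_counts),
    ("skewed key", skewed_counts),
    ("unique key", unique_counts),
]:
    print(name, [counts.get(seg, 0) for seg in range(NUM_SEGMENTS)])
```

In this model, random placement and a unique key both keep the segments close to even, while a key with a single repeated value puts every row on one segment. On a real cluster, counting rows per segment (for example by grouping on the `gp_segment_id` system column) would show whether the table is actually balanced.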

Head of Data for Pivotal