Check out the latest publication from Greenplum Engineering at this year’s SMDB — ICDE’s workshop on Self-managing Database Systems. One of the authors was an intern during last summer and we asked him to do an investigation into how data is redistributed during query processing in an MPP environment. In particular, are there patterns that can be discerned and used to layout the data more systematically than the usual somewhat hunch-based approach of looking for “frequent join attributes”? Sure enough the answer is yes, and this paper describes how it can be done.
Find an electronic copy here.