21 Dec

Where does an external table retain data?

When I query an external table whose `LOCATION` is on HDFS,

I don't understand where Greenplum keeps the data (including temporary data).

Is there any rule for where the data is held in Greenplum?

For instance:

1. A lot of data: Greenplum's disk

2. A little data: Greenplum's memory

3. Neither: Greenplum does not retain the data at all and just displays it.


I like to think of external tables as similar to a Unix pipe. Data is streamed from the source to one or more segments, whether it's a lot or a little, and then something can be done with it (insert into a physical table, spill to temp space, move data to other segments, etc.). Does that make sense? There is no explicit place it goes, as it's all dynamic.
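
To make the pipe analogy concrete, here is a minimal Python sketch. It is not Greenplum code: `stream_rows`, `insert_into_table`, `count_rows` and the sample data are hypothetical names made up for illustration. It only shows the idea that rows flow through one at a time and are kept only if whatever consumes them chooses to store them.

```python
from typing import Iterable, Iterator, List


def stream_rows(source: Iterable[str]) -> Iterator[List[str]]:
    """Yield one parsed row at a time, like data flowing through a pipe."""
    for line in source:
        yield line.rstrip("\n").split(",")


def insert_into_table(rows: Iterator[List[str]], table: List[List[str]]) -> int:
    """Consumer that stores rows (stand-in for inserting into a physical table)."""
    for row in rows:
        table.append(row)
    return len(table)


def count_rows(rows: Iterator[List[str]]) -> int:
    """Consumer that keeps nothing: each row is discarded after it is counted."""
    return sum(1 for _ in rows)


# Hypothetical stand-in for a file read through an external table.
hdfs_file = ["1,alice\n", "2,bob\n", "3,carol\n"]

physical_table: List[List[str]] = []
inserted = insert_into_table(stream_rows(hdfs_file), physical_table)
counted = count_rows(stream_rows(hdfs_file))
```

In the first call the rows end up stored because the consumer appends them; in the second nothing is retained once the count is done. The stream itself holds no data in either case, which is the point of the analogy.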

Head of Data for Pivotal
