When I query an external table whose `LOCATION` is on HDFS,
I don't understand where Greenplum keeps the data (including temporary data).
Is there a rule for where Greenplum holds the data?
1. A lot of data: on Greenplum's disks
2. A little data: in Greenplum's memory
3. Neither: Greenplum does not retain the data at all and just displays it.
I like to think of external tables as similar to a Unix pipe. Data is streamed from the source to one or more segments, whether it's a lot or a little, and then something can be done with it (insert into a physical table, spill to temp space, move to other segments, etc.). Does that make sense? There is no explicit place it goes, as it's all dynamic.
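
If it helps, here is a rough Python sketch of the pipe analogy. It is only an illustration and not how Greenplum is implemented: `read_external_source`, `insert_into_table`, and the sample rows are all made-up names and data. The reader yields one row at a time and keeps nothing, and it is the consumer that decides where the rows end up.

```python
from typing import Iterable, Iterator, List


def read_external_source(lines: Iterable[str]) -> Iterator[List[str]]:
    """Hypothetical stand-in for an external table scan.

    Yields one parsed row at a time; the reader itself stores nothing.
    """
    for line in lines:
        yield line.rstrip("\n").split(",")


def insert_into_table(rows: Iterator[List[str]], table: List[List[str]]) -> int:
    """Hypothetical consumer: rows only persist because this step stores them."""
    count = 0
    for row in rows:
        table.append(row)
        count += 1
    return count


# Made-up sample data standing in for a file on HDFS.
source = ["1,alice\n", "2,bob\n", "3,carol\n"]
physical_table: List[List[str]] = []

inserted = insert_into_table(read_external_source(source), physical_table)
print(inserted, physical_table)
```

Swap the consumer for something that only prints each row and nothing is kept anywhere, which is roughly what happens with a plain SELECT that never needs to spill.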