13 Sep

Introduction of Readable External Protocol of gpfdist

gpfdist underlies all ETL operations in Greenplum, so it is worth explaining its details a little further to understand why it is faster than other tools and how we could improve it in the future.

This blog post focuses on the communication between the gpfdist server and Greenplum for readable external tables, and introduces the traffic flow and protocol of gpfdist external tables.

06 Sep

Introduction to Greenplum ETL tool – Overview

Why ETL is important for Greenplum

As a data warehouse product of the future, Greenplum can process huge data sets, usually at the petabyte level, but it cannot generate that amount of data by itself. Data is often generated by millions of users or embedded devices. Ideally, all data sources would populate data into Greenplum directly, but this is impossible in reality because data is the core asset of a company and Greenplum is only one of many tools that can be used to create value from that asset. One common solution is to use an intermediate system to store all the data.