01 Dec

Is the dimensional data model best suited for GreenPlum?


Hi Everyone,

We have been running the GreenPlum appliance for the last 3 years, but I have always been hearing things about a most suitable data model for this appliance. Is there such a thing, or can the appliance be used to host Hadoop workloads or to maintain a data warehouse based on a dimensional model?

At the moment we use the appliance as a landing zone, and from there we create views and other objects so the users can perform their analysis/BI reporting …

I would like to hear what the most common use of this appliance is among other users/customers.

Thanks for your comments…


Thanks for your comments. It seems to me the dimensional model should be the way to go, but Pivotal engineers sometimes advise that the GP appliance is suitable for other types of workloads and different kinds of data models. I hope one of those engineers/architects can also provide their views …

Thanks, Jacque, for your input…


It’s a pretty varied answer. Generally speaking, I personally prefer a dimensional model over something more snowflaked out because I think it’s easier for our users to understand. It also necessitates less data movement, which results in better speed. Some of our customers have optimized further by flattening the model even more, which reduces the need for data movement and optimizes for queries. At the end of the day, understanding (via explain plans) what is going on for a given query is the best way to optimize how you physically lay things out.
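As a rough illustration of reading an explain plan for data movement, here is a minimal Python sketch that counts the Motion nodes (Gather, Redistribute, Broadcast) in the text of a plan. The plan text and the table names fact_sales and dim_store are made up for the example and are not from a real system; count_motions is a hypothetical helper, not part of any GreenPlum tooling.

```python
import re
from collections import Counter

# Motion node labels as they appear in the text of a GreenPlum EXPLAIN plan.
MOTION_PATTERN = re.compile(r"(Gather|Redistribute|Broadcast) Motion")


def count_motions(plan_text: str) -> Counter:
    """Count each kind of Motion node found in the plan text."""
    return Counter(m.group(1) for m in MOTION_PATTERN.finditer(plan_text))


# Hypothetical, abbreviated plan text for illustration only (not real output).
SAMPLE_PLAN = """\
Gather Motion 4:1  (slice2; segments: 4)
  ->  Hash Join
        ->  Seq Scan on fact_sales
        ->  Hash
              ->  Broadcast Motion 4:4  (slice1; segments: 4)
                    ->  Seq Scan on dim_store
"""

motions = count_motions(SAMPLE_PLAN)
print(dict(motions))
```

Comparing these counts before and after a change to the physical layout (for example, flattening a snowflaked dimension into the fact table) gives a quick, if crude, signal of whether the change reduced data movement for that query.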

Head of Data for Pivotal