Blog
APJ Tanzu Data Tour: a series of live sessions
While we can’t get together in person to discuss, share and shape the future of advanced data technologies, we can still learn from each other online! Join us at the APJ Tanzu Data Tour for a series of live sessions each month to discuss the latest developments in
Relationship and difference between Greenplum and PostgreSQL
Greenplum is an open-source massively parallel database used for reporting, analytics, machine learning, artificial intelligence, and high-concurrency SQL. Greenplum Database is described as a big data technology built on the MPP architecture and the open-source PostgreSQL database. PostgreSQL is a popular free and open-source relational
Greenplum Hackday 2021
Come and hack Greenplum and win prizes. On Friday, Apr 16th, we are having a hackday for Greenplum around the world. The theme can be anything related to Greenplum: use Greenplum, market Greenplum, break Greenplum, hack Greenplum, or do anything else related to Greenplum. When: Friday, Apr 16th, 2021
Top Ten Open-Source Big Data Databases
Data has become a powerful tool for the global workforce. It’s a prerequisite to translate massive amounts of unstructured and structured information into meaningful and valuable business insights for future growth. Hence, the current global market is flooded with a wide range of big data tools to process
Faster Optimization of Join Queries in ORCA
Author: Hans Zeller Optimizing joins is the core part of any query optimizer. It consists of picking a good join order, the right join algorithms (hash join, nested loop join, etc.) and various other things. The number of possible options grows extremely fast and requires a method called Dynamic
Introduction to Greenplum Architecture
This is the first article of the Greenplum Kernel series. There are a total of ten articles in this series, which will explain in depth the different modules of Greenplum. Today I’m going to explain the Greenplum architecture in more detail. Before we talk about Greenplum’s architecture, let’s
World Class Open Source Distributed HTAP Database based on PostgreSQL
In 2019, Yao Yandong was invited by the Alibaba Cloud developer community to deliver a live technical talk, “PB level open source enterprise level distributed HTAP database based on PostgreSQL”. This article is based on the content of that talk. Today, I’d like to share with you the
Greenplum Summit Week 5: AI, Neural Networks, and the Future of Analytics
Author: Jared Ruckle Every enterprise is refining their AI strategy. So it’s only fitting that the final installment of Greenplum Summit 2020 focused on how artificial intelligence and neural networks will shape the future of analytics. Let’s get right to the highlights! (You can watch all Greenplum Summit
Launch! Greenplum Hardware Goes Open Source
Today, we unveil the first massively parallel Postgres data warehouse to open source across both software and hardware. In 2015, we launched the world’s first open source MPP data platform. I am happy to announce that we are also sharing Greenplum’s infrastructure. In 2015 we said the future was
Greenplum Summit Week 4: Parallel Postgres
Author: Bob Glithero For over 15 years, Greenplum has solved the problem of parallelizing Postgres for high-performance querying and analysis of data at massive scale. In Week 4 of the Greenplum Summit, after a brief interlude for a discussion with Heimdall Data, we shift gears a bit to
Greenplum Database Upgrade
Earlier last year, the Greenplum Database team started building an in-place major version upgrade tool, gpupgrade. The driving force behind making upgrades less time-consuming and less space-consuming was to offer customers an easy upgrade path. This tool will enable customers to quickly and confidently upgrade to
Relocatable Postgres Builds
As engineers on the Greenplum Release Engineering team, we recently had the opportunity to do an in-depth exploration of Postgres’ build system. Greenplum Server is based on Postgres and has inherited the upstream build system. Our team was working on producing a relocatable build of Greenplum Server which
Greenplum Summit Week 3: How to Get Started with a Modern Data Warehouse
The idea of a data warehouse isn’t new. Many enterprises have used them for years. What is new: the data landscape in 2020. The amount of data is exploding, and the use cases requiring real-time data analysis are growing just as fast. Talks from Week 3 cover two
Talkin’ Federated Analytics: Recapping Week 2 of Greenplum Summit
By Bob Glithero Greenplum Summit rolls on: three sessions down, two to go! Week 2 was all about federated analytics, the art of analyzing data from multiple sources to solve business challenges. Here’s our recap. (You can watch all the sessions from Week 1 and Week 2 on-demand
Greenplum Summit 2020: 6 Highlights from Week 1 of the Digital Series
The first two sessions of the virtual Greenplum Summit are in the books. To whet your appetite for the next session (“Data Warehouse Modernization,” happening August 26, sign-up here), we wanted to recap some of our takeaways from Week 1. (We’ll have a recap from Week 2 in
Greenplum Summit Preview: Run Greenplum Your Way
Why is Greenplum so popular? A few factors leap to mind: it’s backed by a thriving open source community; massively parallel data analytics performance; and multi-cloud, infrastructure-native support. This third item is the focus of our first set of talks at Greenplum Summit (sign-up here). We wanted to preview