Author | David Kimura
Greenplum Database is a massively parallel processing (MPP) database designed for handling large-scale data warehousing and analytics workloads. One of its key features is the ability to partition tables, which helps improve query performance, manage data distribution, and enhance data organization. In this blog post, we will explore 20 examples of Greenplum partition commands to showcase how to effectively use partitioning in your data management strategy.
Prerequisites
Before we dive into the examples, make sure you have a working Greenplum environment set up and a basic understanding of SQL and database concepts.
Example 1: Creating a Range Partitioned Table
This example creates a sales table partitioned by range using the sale_date column.
-- (GP7 only)
CREATE TABLE sales(
sale_id SERIAL,
sale_date DATE,
amount DECIMAL
) PARTITION BY RANGE (sale_date);
Then adds a new partition to the sales table for the first half of 2023.
-- (GP7 only)
CREATE TABLE sales_p1
PARTITION OF sales FOR VALUES FROM ('2023-01-01') TO ('2023-06-30');
Example 2: Classic Range Partitioned Table
For convenience, a single command can be used.
CREATE TABLE sales(
sale_id SERIAL,
sale_date DATE,
amount DECIMAL
) PARTITION BY RANGE (sale_date)
(
PARTITION p1 START ('2023-01-01') END ('2023-06-30')
);
Example 3: Creating a List Partitioned Table
Create an orders table partitioned by a list of product categories.
-- (GP7 only)
CREATE TABLE orders (
order_id SERIAL,
product_category TEXT,
order_date DATE
)
PARTITION BY LIST (product_category);
Then add list partitions for the specified product categories.
-- (GP7 only)
CREATE TABLE orders_p1
PARTITION OF orders FOR VALUES IN ('Electronics', 'Clothing');
Example 4: Classic List Partitioned Table
For convenience, a single command can be used.
CREATE TABLE orders (
order_id SERIAL,
product_category TEXT,
order_date DATE
) PARTITION BY LIST (product_category)
(
PARTITION p1 VALUES ('Electronics', 'Clothing')
);
Example 5: Creating a Hash Partitioned Table
Create a sensor_data table partitioned by hash on the sensor_id column.
-- (GP7 only)
CREATE TABLE sensor_data (
sensor_id SERIAL PRIMARY KEY,
timestamp TIMESTAMP,
reading FLOAT
)
PARTITION BY HASH (sensor_id);
Example 6: Adding Hash Partitions
Add hash partitions to the sensor_data table using a modulus-based approach. Note: The classic convenience single command is not supported for hash type partitions
-- (GP7 only)
CREATE TABLE sensor_data_p1
PARTITION OF sensor_data FOR VALUES WITH (MODULUS 4, REMAINDER 0);
Example 7: Creating a Composite Partitioned Table
Create a composite partitioned table based on both range and list partitioning.
-- (GP7 only)
CREATE TABLE logs (
log_id SERIAL,
log_date DATE,
log_type TEXT
)
DISTRIBUTED BY (log_id)
PARTITION BY RANGE (log_date) SUBPARTITION BY LIST (log_type);
Example 8: Adding Composite Partitions
Add composite partitions to the logs table for a specific date range and log types.
-- (GP7 only)
CREATE TABLE logs_p1
PARTITION OF logs FOR VALUES FROM ('2023-01-01') TO ('2023-03-31')
PARTITION BY LIST (log_type);
Example 9: Adding Default Partition
ALTER TABLE sales ADD DEFAULT PARTITION other;
Example 10: Splitting Default Partition
Split the default partition of the sales table into two partitions.
ALTER TABLE sales SPLIT DEFAULT PARTITION
START ('2023-12-01') INCLUSIVE
END ('2023-12-31') INCLUSIVE
INTO (PARTITION dec2023, PARTITION other);
Example 11: Detaching Partitions
--- (GP7 only)
ALTER TABLE sensor_data
DETACH PARTITION sensor_data_p1;
Example 12: Attaching Existing Table
Reattach a detached partition from another table back into the sensor_data table.
CREATE TABLE sensor_data_old(LIKE sensor_data);
-- add stuff to sensor_data_old... then attach
-- (GP7 only)
ALTER TABLE sensor_data
ATTACH PARTITION sensor_data_old FOR VALUES WITH (MODULUS 3, REMAINDER 0);
Example 13: Checking Partition Information
Retrieve information about partitions and their rules in the database.
\d+ logs
Partitioned table "public.logs"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------+-----------+----------+--------------------------------------+----------+--------------+-------------
log_id | integer | | not null | nextval('logs_log_id_seq'::regclass) | plain | |
log_date | date | | | | plain | |
log_type | text | | | | extended | |
Partition key: RANGE (log_date)
Partitions: logs_p1 FOR VALUES FROM ('2023-01-01') TO ('2023-03-31'), PARTITIONED
Distributed by: (log_id)
Access method: heap
\d+ logs_p1
Partitioned table "public.logs_p1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------+-----------+----------+--------------------------------------+----------+--------------+-------------
log_id | integer | | not null | nextval('logs_log_id_seq'::regclass) | plain | |
log_date | date | | | | plain | |
log_type | text | | | | extended | |
Partition of: logs FOR VALUES FROM ('2023-01-01') TO ('2023-03-31')
Partition constraint: ((log_date IS NOT NULL) AND (log_date >= '2023-01-01'::date) AND (log_date < '2023-03-31'::date))
Partition key: LIST (log_type)
Number of partitions: 0
Distributed by: (log_id)
Access method: heap
Example 14: Getting Partition Statistics
Analyze the sales table to update partition statistics for query optimization.
ANALYZE sales;
Example 15: Truncate a Partition
Remove all rows from the specified partition while keeping its structure intact.
--(GP7 only)
ALTER TABLE sensor_data TRUNCATE PARTITION FOR (42);
Example 16: Exchange Partition Data
Exchange data between a partition in the main table and a corresponding partition in an archive table.
CREATE TABLE sensor_data_archive (LIKE sensor_data);
-- add stuff to sensor_data_archive... then exchange
ALTER TABLE sensor_data
EXCHANGE PARTITION FOR (42) WITH TABLE sensor_data_archive;
Example 17: Applying Constraints on Partitions
Apply a check constraint on a partitioned table to enforce data integrity within partitions.
ALTER TABLE logs
ADD CONSTRAINT check_date CHECK (log_date != '2023-01-01');
Example 18: Creating an indexed Partitioned Tables
Create an index on all the tables in a partitioned table.
CREATE INDEX index_logs_check_date ON logs(log_date);
Example 19: Rename Partitioned Tables
Rename a partitioned table is same as normal table and will be reflected in the partition heirarchy.
-- GP7 syntax:
ALTER TABLE sales_p1 RENAME TO sales_p1_backup;
-- GP6 syntax:
ALTER TABLE sales RENAME PARTITION "p1" TO "p1_backup";
Example 20: Dropping Partitions
Remove the specified partitions from the sales table.
ALTER TABLE sales
DROP PARTITION FOR ('2023-01-01');
Or
ALTER TABLE sales DROP PARTITION other;
Conclusion
Partitioning is a powerful feature in Greenplum Database that enables efficient data management and query optimization for large-scale datasets. These examples demonstrate various ways to create, manage, and utilize partitions in Greenplum. By applying the appropriate partitioning strategy, you can significantly improve the performance and maintainability of your database systems.