The CREATE TABLE SQL statement creates a table in the database.
The definition of a table includes the distribution policy for the data, which has great bearing on system performance. The goals for the distribution policy are to:
The distribution policy determines how data is distributed among the segments. Defining an effective distribution policy requires an understanding of the data’s characteristics, the kinds of queries that will be run once the data is loaded into the database, and what distribution strategies best utilize the parallel execution capacity of the segments.
Use the DISTRIBUTED clause of the CREATE TABLE statement to define the distribution policy for a table. Ideally, each segment will store an equal volume of data and perform an equal share of work when processing queries. There are two kinds of distribution policies:
When different tables are joined on the same columns that comprise the distribution key, the join can be accomplished at the segments, which is much faster than joining rows across segments. The random distribution policy makes this impossible, so it is best practice to define a distribution key that will optimize joins.
The CREATE TABLE statements for the faa database are in the faa create_dim_tables.sql script.
Change to the directory containing the FAA data and scripts:
$ cd ~/gpdb-sandbox-tutorials/faa
Open the script in a text editor to see the text of the commands that will be executed when you run the script.
gpadmin@gpdb-sandbox faa]$ more create_dim_tables.sql create table faa.d_airports (airport_code text, airport_desc text) distributed by (airport_code); create table faa.d_wac (wac smallint, area_desc text) distributed by (wac); create table faa.d_airlines (airlineid integer, airline_desc text) distributed by (airlineid); create table faa.d_cancellation_codes (cancel_code text, cancel_desc text) distributed by (cancel_code); create table faa.d_delay_groups (delay_group_code text, delay_group_desc text) distributed by (delay_group_code); create table faa.d_distance_groups (distance_group_code text, distance_group_desc text) distributed by (distance_group_code)
Execute the create_dim_tables.sql script. The psql \i command executes a script:
$ psql -U user1 tutorial
tutorial=# \i create_dim_tables.sql
List the tables that were created, using the psql \dt command.
Exit the psql shell: