Configuration Files

Greenplum is based on PostgreSQL, so the way it looks up configuration information and controls which connections are allowed should be familiar to anyone who has worked in that environment before.

pg_hba.conf - Client Authentication


The pg_hba.conf file is located in the base directory of each running Greenplum database process, whether it is the master, a primary segment, or a mirror segment. In the Sandbox VM the master runs out of the /gpdata/master/gpseg-1/ directory. This path is often stored in the environment variable MASTER_DATA_DIRECTORY, and the configuration files can be found there.
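As a quick check, the following sketch reports whether the two configuration files exist in a data directory. It assumes a POSIX shell. The name list_conf_files is a hypothetical helper, not a Greenplum utility, and the fallback path is the Sandbox VM location given above.

```shell
#!/bin/sh
# list_conf_files is a hypothetical helper, not a Greenplum utility.
# It reports whether each configuration file exists in the given directory.
list_conf_files() {
    conf_dir=$1
    for name in pg_hba.conf postgresql.conf; do
        if [ -f "$conf_dir/$name" ]; then
            echo "found $conf_dir/$name"
        else
            echo "missing $conf_dir/$name"
        fi
    done
}

# Use MASTER_DATA_DIRECTORY when it is set, otherwise the Sandbox VM path.
list_conf_files "${MASTER_DATA_DIRECTORY:-/gpdata/master/gpseg-1}"
```

The same function can be pointed at a segment data directory, since each running process keeps its own copies of both files.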

More detailed instructions can be found in the Docs under Administrator Guide > Managing Greenplum Database Access > Configuring Client Authentication.

If you look at the entries in this file

tail /gpdata/master/gpseg-1/pg_hba.conf
# IPv4 local connections:
# IPv6 local connections:
local all gpadmin ident
host all gpadmin trust
host all gpadmin trust
host all gpadmin ::1/128 trust
host all gpadmin fe80::80c:b8ff:fe77:f960/128 trust
local replication gpadmin ident
host replication gpadmin samenet trust
local tutorial +users md5

You will see a set of columns. They correspond to:

  • type of connection
  • database connected to
  • user connecting
  • addresses allowed ( optional depending on type )
  • method of authentication
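The column layout above can be illustrated with a small awk sketch that labels the fields of a single rule. It is a simplification, not a full pg_hba.conf parser: it assumes one whitespace-separated token per column. The name describe_hba_line is hypothetical, and 192.0.2.0/24 is a placeholder documentation range rather than an address taken from the Sandbox VM.

```shell
#!/bin/sh
# describe_hba_line is a hypothetical helper, not a Greenplum tool.
# It labels the columns of one pg_hba.conf rule passed as a string.
describe_hba_line() {
    printf '%s\n' "$1" | awk '
        # "local" rules carry no address column, so they have four fields.
        $1 == "local" && NF == 4 {
            printf "type=%s database=%s user=%s method=%s\n", $1, $2, $3, $4
            next
        }
        # Rules with a single address column have five fields.
        NF == 5 {
            printf "type=%s database=%s user=%s address=%s method=%s\n", $1, $2, $3, $4, $5
            next
        }
        # Anything else (comments, other layouts) is reported unchanged.
        { print "unrecognized: " $0 }
    '
}

describe_hba_line "local all gpadmin ident"
# Placeholder address for illustration only.
describe_hba_line "host all gpadmin 192.0.2.0/24 trust"
```

The first call prints the four columns of a local rule, and the second prints five columns, including the address.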

For example, this line

local all gpadmin ident

indicates that for a local connection to all databases, the gpadmin user will be authenticated using the ident method.

host all gpadmin trust

This rule states that a host connection to all databases by gpadmin from the IP address is trusted, which allows gpadmin to log in without a password.

If you are using the VM, you may need to adjust an existing rule or add a new one if you wish to make a psql connection to the VM. Adding a rule such as this one

host all all md5

would allow all users to make a host connection to all databases from any network address ( ), authenticating with the password ( md5 ) that the user has been set up with in the database.

This file is only read when the process starts up or is signaled to reread the file. If changes are made, either the database needs to be restarted or gpstop can be run with the -u flag to force it to reload the changes.
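A minimal sketch of that edit-and-reload cycle is shown here, assuming a POSIX shell. The helpers add_hba_rule and reload_greenplum_config are hypothetical names, and 192.0.2.0/24 is a placeholder range that you would replace with your own network. Only gpstop -u is an actual Greenplum command.

```shell
#!/bin/sh
# add_hba_rule is a hypothetical helper, not part of Greenplum.
# It appends a rule to a pg_hba.conf file once, keeping a backup copy.
add_hba_rule() {
    hba_file=$1
    rule=$2
    # Skip the edit when the exact rule line is already present.
    if grep -Fxq -- "$rule" "$hba_file"; then
        echo "rule already present"
        return 0
    fi
    cp -- "$hba_file" "$hba_file.bak" || return 1
    printf '%s\n' "$rule" >> "$hba_file"
    echo "rule added"
}

# reload_greenplum_config is also hypothetical; it wraps the real gpstop -u,
# which asks the running system to reread its configuration files.
reload_greenplum_config() {
    if command -v gpstop >/dev/null 2>&1; then
        gpstop -u
    else
        echo "gpstop not found; run this on a Greenplum host"
    fi
}

# Assumed usage on the Sandbox VM master (placeholder address range):
#   add_hba_rule /gpdata/master/gpseg-1/pg_hba.conf "host all all 192.0.2.0/24 md5"
#   reload_greenplum_config
```

Keeping the backup copy makes it easy to restore the previous rules if a new entry locks out a connection you still need.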

postgresql.conf - System Configuration


All of the system settings that can be tuned and configured as the system comes up are held in the postgresql.conf file. As with pg_hba.conf, the file is found in the base directory of each running database process. In the Sandbox VM this is located at /gpdata/master/gpseg-1/postgresql.conf

There are hundreds of configuration parameters, and they are set to defaults that work for most installations. Adjusting these parameters is recommended only after you have gained experience working with the database and understand how altering a parameter will change its performance.

In addition, each segment instance running in the cluster has its own postgresql.conf. Some settings should be kept consistent across all of the hosts in the cluster, while others may differ depending on whether you are on a master or a segment system.

gpconfig is a tool packaged with Greenplum that helps alter and maintain postgresql.conf files across an entire Greenplum cluster.
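As a hedged illustration, the sketch below uses gpconfig to display a parameter's value across the cluster. The wrapper show_param is a hypothetical name, and max_connections is used only as an example parameter. The commented command shows the general form for changing a value and should be adapted with care.

```shell
#!/bin/sh
# show_param is a hypothetical wrapper around the real gpconfig -s option,
# which shows the value of a parameter on the master and the segments.
show_param() {
    if command -v gpconfig >/dev/null 2>&1; then
        gpconfig -s "$1"
    else
        echo "gpconfig not found; run this on the Greenplum master host"
    fi
}

show_param max_connections

# Changing a value uses -c and -v, with -m for a different master value, e.g.:
#   gpconfig -c max_connections -v 100 -m 10
# Depending on the parameter, a reload (gpstop -u) or a restart is then needed.
```

Using gpconfig rather than editing each postgresql.conf by hand helps keep the settings that must match consistent across every host.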