Greenplum is based on PostgreSQL, so the way it looks for configuration information and controls which connections are allowed to the system should be familiar to anyone who has worked in that environment before. Two files matter most:
- pg_hba.conf - Client Authentication
- postgresql.conf - System Configuration
The pg_hba.conf file is located in the base directory of each running Greenplum database process, whether it is the master, a primary segment, or a mirror segment. In the Sandbox VM the master runs out of /gpdata/master/gpseg-1/. This path is usually stored in the MASTER_DATA_DIRECTORY environment variable, and the configuration files can be found there.
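As a quick check of where the configuration files live, a minimal sketch such as the following can help. The helper name `list_conf_files` is hypothetical (it is not a Greenplum utility), and the fallback path is the Sandbox VM location mentioned above.

```shell
# list_conf_files is a hypothetical helper, not part of Greenplum.
list_conf_files() {
  # Use the argument if given, else MASTER_DATA_DIRECTORY, else the Sandbox VM path.
  data_dir="${1:-${MASTER_DATA_DIRECTORY:-/gpdata/master/gpseg-1}}"
  for conf in pg_hba.conf postgresql.conf; do
    if [ -f "$data_dir/$conf" ]; then
      echo "found $data_dir/$conf"
    else
      echo "missing $data_dir/$conf"
    fi
  done
}

list_conf_files
```

Passing a directory as the first argument checks a segment's data directory instead of the master's.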
More detailed instructions can be found in the documentation under Administrator Guide > Managing Greenplum Database Access > Configuring Client Authentication.
If you look at the entries in this file:

    tail /gpdata/master/gpseg-1/pg_hba.conf
    # IPv4 local connections:
    # IPv6 local connections:
    local all gpadmin ident
    host all gpadmin 127.0.0.1/28 trust
    host all gpadmin 172.31.34.119/32 trust
    host all gpadmin ::1/128 trust
    host all gpadmin fe80::80c:b8ff:fe77:f960/128 trust
    local replication gpadmin ident
    host replication gpadmin samenet trust
    local tutorial +users md5

you will see a set of columns. They correspond to:
- type of connection
- database connected to
- user connecting
- addresses allowed (optional, depending on type)
- method of authentication
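
To make the column layout concrete, here is a minimal sketch that labels each field of a rule. The function `describe_hba_rules` is a hypothetical helper, and it assumes the simple layout shown in this file: addresses written as a single CIDR or keyword column, and no options after the method.

```shell
# describe_hba_rules is a hypothetical helper, not a Greenplum tool.
# It reads pg_hba.conf rules from files or stdin and labels each column.
describe_hba_rules() {
  awk '
    /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
    $1 == "local" {           # local rules have no address column
      printf "type=%s database=%s user=%s method=%s\n", $1, $2, $3, $4
      next
    }
    {
      printf "type=%s database=%s user=%s address=%s method=%s\n", $1, $2, $3, $4, $5
    }
  ' "$@"
}

# Two rules taken from the listing above.
printf '%s\n' \
  'local all gpadmin ident' \
  'host all gpadmin 172.31.34.119/32 trust' | describe_hba_rules
```

On a real system the function could be pointed at the file directly, for example `describe_hba_rules /gpdata/master/gpseg-1/pg_hba.conf`.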
For example, this line
local all gpadmin ident
indicates that for local connections to all databases, the gpadmin user will be authenticated using the ident method.
host all gpadmin 172.31.34.119/32 trust
This rule states that a host connection to all databases by gpadmin from the IP address 172.31.34.119 will be trusted, allowing it to log in without a password.
If you are using the VM, you may need to adjust an existing rule or add a new one before you can make a psql connection to it. Adding a rule such as this one
host all all 0.0.0.0/0 md5
would allow host connections from any IPv4 network address (0.0.0.0/0), for all users and all databases, authenticated with the password (md5) that the user has been set up with in the database.
This file is read only when the process starts up or is signaled to reread it. If you make changes, either restart the database or run gpstop with the -u flag to force it to reload the configuration.
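
The edit itself can be sketched as follows. `add_hba_rule` is a hypothetical helper that appends a rule only if it is not already present and keeps a backup copy first. The example runs against a scratch file rather than the real pg_hba.conf, so it is safe to try anywhere.

```shell
# add_hba_rule is a hypothetical helper, not a Greenplum tool.
add_hba_rule() {
  local hba_file="$1"
  local rule="$2"
  # -x matches whole lines only, -F treats the rule as a literal string.
  if grep -qxF -- "$rule" "$hba_file"; then
    echo "rule already present"
  else
    cp "$hba_file" "$hba_file.bak"          # keep a backup before editing
    printf '%s\n' "$rule" >> "$hba_file"
    echo "rule added"
  fi
}

# Demonstrated on a scratch copy; on the Sandbox VM the real file is
# /gpdata/master/gpseg-1/pg_hba.conf.
scratch=$(mktemp)
printf 'local all gpadmin ident\n' > "$scratch"
add_hba_rule "$scratch" 'host all all 0.0.0.0/0 md5'

# After editing the real file, reload it with: gpstop -u
```

Because the helper checks for an exact existing line, running it twice does not duplicate the rule.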
All of the system settings that can be tuned as the system comes up are held in the postgresql.conf file. Like pg_hba.conf, it is found in the base directory of each running database process. In the Sandbox VM this is located at /gpdata/master/gpseg-1/postgresql.conf.
There are hundreds of configuration parameters, and their default settings will work for most installations. Adjusting these parameters is recommended only after you have gained experience working with the database and understand how altering a parameter will change its performance.
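
Before changing anything, it can be useful to see what a parameter is currently set to in the file. The sketch below uses a hypothetical helper, `conf_value`, and assumes the common `name = value` form with optional trailing `#` comments. It does not handle values that themselves contain `#` or `=`.

```shell
# conf_value is a hypothetical helper, not a Greenplum tool.
# Usage: conf_value <postgresql.conf path> <parameter name>
conf_value() {
  awk -v key="$2" '
    { line = $0; sub(/#.*/, "", line) }      # drop comments, including commented-out settings
    split(line, kv, "=") >= 2 {
      name = kv[1]; gsub(/[ \t]/, "", name)
      if (name == key) {
        val = kv[2]; gsub(/^[ \t]+|[ \t]+$/, "", val)
        found = val                           # the last active assignment wins
      }
    }
    END { if (found != "") print found }
  ' "$1"
}

# Demonstrated on a scratch file with made-up values.
sample_conf=$(mktemp)
printf '%s\n' '#max_connections = 10' 'max_connections = 250   # example value' > "$sample_conf"
conf_value "$sample_conf" max_connections
```

The file only shows what is configured on one instance. Values on other segments may differ.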
In addition, each segment instance running in the cluster has its own postgresql.conf. Some settings should be kept consistent across all of the hosts in the cluster, while others may differ depending on whether the instance is a master or a segment.
gpconfig is a tool packaged with Greenplum that helps alter and maintain the postgresql.conf files across an entire Greenplum cluster.
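
As a rough illustration, gpconfig's `-s` option shows a parameter's value across the cluster, and `-c` with `-v` changes it. The parameter `max_connections` is only an illustrative choice, and the calls are guarded so the sketch does nothing harmful on a machine without Greenplum.

```shell
# Guard the call so this is safe to run where Greenplum is not installed.
if command -v gpconfig >/dev/null 2>&1; then
  gpconfig -s max_connections      # show the value on the master and the segments
  gpconfig_status="queried"
else
  echo "gpconfig not on PATH"
  gpconfig_status="skipped"
fi

# A change would take this form (placeholders, not real values):
#   gpconfig -c <parameter> -v <value>
```

Depending on the parameter, a change takes effect only after a reload or a restart.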