gpperfmon_install

Installs the gpperfmon database, which is used by Greenplum Command Center, and optionally enables the data collection agents.

Synopsis

gpperfmon_install
      [--enable --password gpmon_password --port gpdb_port]
      [--pgpass path_to_file]
      [--verbose]

gpperfmon_install --help | -h | -?

Description

The gpperfmon_install utility automates the steps required to enable the data collection agents. You must be the Greenplum Database system user (gpadmin) to run this utility. If you use the --enable option, you must restart Greenplum Database after the utility completes.

When run without any options, the utility only creates the gpperfmon database (the database used to store system metrics collected by the data collection agents). When run with the --enable option, the utility also performs the following additional tasks necessary to enable the data collection agents:

  1. Creates the gpmon superuser role in Greenplum Database. The data collection agents require this role to connect to the database and write their data. The gpmon superuser role uses MD5-encrypted password authentication by default. Use the --password option to set the gpmon superuser's password. Use the --port option to supply the port of the Greenplum Database master instance.
  2. Updates the $MASTER_DATA_DIRECTORY/pg_hba.conf file. The utility adds the following lines to the host-based authentication file (pg_hba.conf):
    local      gpperfmon     gpmon                 md5
    host       all           gpmon  127.0.0.1/28   md5
    host       all           gpmon  ::1/128        md5
    Note:

    It may be necessary to edit the pg_hba.conf file after running the gpperfmon_install utility to limit the gpmon role's access to databases or to change the authentication method. After you edit the file, run gpstop -u to reload the file in Greenplum Database.

    • To limit access to just the gpperfmon database, edit the pg_hba.conf file and, in each host entry for the gpmon user, change the second field from all to gpperfmon:
      host       gpperfmon     gpmon    127.0.0.1/28  md5
      host       gpperfmon     gpmon    ::1/128  md5
    • The gpperfmon_install utility assumes the default MD5 authentication method. Greenplum Database can optionally be configured to use the SHA-256 or SHA-256-FIPS hash algorithm to compute the password hashes saved in the system catalog. This is incompatible with the MD5 authentication method, which expects an MD5 hash or clear text password in the system catalog. Because of this, if you have enabled the SHA-256 or SHA-256-FIPS hash algorithm in the database, you must edit the pg_hba.conf file after running the gpperfmon_install utility to change the authentication method for the gpmon role from md5 to password:
      local      gpperfmon     gpmon                 md5
      host       all           gpmon  127.0.0.1/28   password
      host       all           gpmon  ::1/128        password

      The password authentication method submits the user's clear text password for authentication and should not be used on an untrusted network. See "Protecting Passwords in Greenplum Database" in the Greenplum Database Administrator Guide for more information about configuring password hashing.

  3. Updates the password file (.pgpass). In order to allow the data collection agents to connect as the gpmon role without a password prompt, you must have a password file that has an entry for the gpmon user. The utility adds the following entry to your password file (if the file does not exist, the utility will create it):
    *:5432:gpperfmon:gpmon:gpmon_password
    If your password file is not located in the default location (~/.pgpass), use the --pgpass option to specify the file location.
  4. Sets the server configuration parameters for Greenplum Command Center. The following parameters must be enabled for the data collection agents to begin collecting data. The utility will set the following parameters in the Greenplum Database postgresql.conf configuration files:
    • gp_enable_gpperfmon=on (in all postgresql.conf files)
    • gpperfmon_port=8888 (in all postgresql.conf files)
    • gp_external_enable_exec=on (in the master postgresql.conf file)

    Data collection agents can be configured by setting parameters in the gpperfmon.conf configuration file. See Data Collection Agent Configuration for details.

    For information about the Greenplum Command Center, see the Greenplum Command Center Administrator Guide.
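
The pg_hba.conf restriction described in the note under step 2 can be sketched as a small shell function. This is only an illustration: restrict_gpmon is a hypothetical helper, not part of Greenplum Database, and the demonstration runs against a scratch file rather than the live $MASTER_DATA_DIRECTORY/pg_hba.conf.

```shell
#!/bin/sh
# Sketch only. restrict_gpmon is a hypothetical helper, not a Greenplum tool.
# It changes the database field of gpmon "host" entries from all to gpperfmon.
restrict_gpmon() (
    hba_file=$1
    tmp_file=$(mktemp) || exit 1
    # Only lines of the form "host all gpmon ..." are touched; the first
    # occurrence of "all" on such a line is the database field.
    awk '$1 == "host" && $2 == "all" && $3 == "gpmon" { sub(/all/, "gpperfmon") }
         { print }' "$hba_file" > "$tmp_file" && cat "$tmp_file" > "$hba_file"
    status=$?
    rm -f "$tmp_file"
    exit "$status"
)

# Demonstration on a scratch copy of the entries the utility adds.
demo=$(mktemp)
cat > "$demo" <<'EOF'
local      gpperfmon     gpmon                 md5
host       all           gpmon  127.0.0.1/28   md5
host       all           gpmon  ::1/128        md5
EOF
restrict_gpmon "$demo"
cat "$demo"
rm -f "$demo"
```

If a change like this is applied to the real file, run gpstop -u afterwards so that Greenplum Database reloads pg_hba.conf.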

Options

--enable
In addition to creating the gpperfmon database, performs the additional steps required to enable the data collection agents. When --enable is specified, the utility also creates and configures the gpmon superuser account and sets the Command Center server configuration parameters in the postgresql.conf files.
--password gpmon_password
Required if --enable is specified. Sets the password of the gpmon superuser.
--port gpdb_port
Required if --enable is specified. Specifies the connection port of the Greenplum Database master.
--pgpass path_to_file
Optional if --enable is specified. If the password file is not in the default location of ~/.pgpass, specifies the location of the password file.
--verbose
Sets the logging level to verbose.
--help | -h | -?
Displays the online help.
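
Because --password and --port are both required whenever --enable is given, a wrapper script might check the argument list before calling the utility. The following is a minimal sketch of such a check; check_enable_args is a hypothetical helper and does not reflect how gpperfmon_install itself parses its options.

```shell
#!/bin/sh
# Sketch only. check_enable_args is a hypothetical pre-flight helper.
# It succeeds when the arguments satisfy the rule stated above:
# --enable requires both --password <value> and --port <value>.
check_enable_args() (
    enable=0 password=0 port=0
    while [ $# -gt 0 ]; do
        case $1 in
            --enable)   enable=1 ;;
            # An option counts only if a value follows it; the value is skipped.
            --password) [ $# -gt 1 ] && { password=1; shift; } ;;
            --port)     [ $# -gt 1 ] && { port=1; shift; } ;;
        esac
        shift
    done
    if [ "$enable" -eq 1 ] && { [ "$password" -eq 0 ] || [ "$port" -eq 0 ]; }; then
        echo "--enable requires --password and --port" >&2
        exit 1
    fi
    exit 0
)

# Example: validate the arguments before invoking the utility.
check_enable_args --enable --password changeme --port 5432 &&
    echo "arguments look consistent"
```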

Data Collection Agent Configuration

The $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf file stores configuration parameters for the data collection agents. For changes to these parameters to take effect, you must save gpperfmon.conf and then restart the Greenplum Database server (gpstop -r).

The gpperfmon.conf file contains the following configuration parameters.

Parameter Description
log_location

Specifies a directory location for gpperfmon log files. Default is $MASTER_DATA_DIRECTORY/gpperfmon/logs.

min_query_time

Specifies the minimum query run time in seconds for statistics collection. All queries that run longer than this value are logged in the queries_history table. For queries with shorter run times, no historical data is collected. Defaults to 20 seconds.

If you know that you want to collect data for all queries, you can set this parameter to a low value. Setting the minimum query run time to zero, however, collects data even for the numerous queries run by Greenplum Command Center, creating a large amount of data that may not be useful.

min_detailed_query_time

Specifies the minimum iterator run time in seconds for statistics collection. Iterators that run longer than this value are logged in the iterators_history table. For iterators with shorter run times, no data is collected. Minimum value is 10 seconds.

This parameter’s value must always be equal to, or greater than, the value of min_query_time. Setting min_detailed_query_time higher than min_query_time allows you to log detailed query plan iterator data only for especially complex, long-running queries, while still logging basic query data for shorter queries.

Given the complexity and size of iterator data, you may want to adjust this parameter according to the size of data collected. If the iterators_* tables are growing to excessive size without providing useful information, you can raise the value of this parameter to log iterator detail for fewer queries.

max_log_size

This parameter is not included in gpperfmon.conf, but it may be added to this file.

To prevent the log files from growing to excessive size, you can add the max_log_size parameter to gpperfmon.conf. The value of this parameter is measured in bytes. For example:

max_log_size = 10485760

With this setting, the log files will grow to 10MB before the system rolls over to a new log file.

partition_age

The number of months that gpperfmon statistics data is retained. The default is 0, which means that no data is dropped.

quantum

Specifies the time in seconds between updates from data collection agents on all segments. Valid values are 10, 15, 20, 30, and 60. Defaults to 15 seconds.

If you prefer a less granular view of performance, or want to collect and analyze minimal amounts of data for system metrics, choose a higher quantum. To collect data more frequently, choose a lower value.

smdw_aliases

Specifies additional host names for the standby master. For example, if the standby master has two NICs, you can enter:

smdw_aliases=smdw-1,smdw-2

This optional fault tolerance parameter is useful if the Greenplum Command Center loses connectivity with the standby master. Instead of continuously retrying to connect to host smdw, it will try to connect to the NIC-based aliases of smdw-1 and/or smdw-2. This ensures that the Command Center Console can continuously poll and monitor the standby master.
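
Editing these parameters can be scripted. The sketch below assumes that gpperfmon.conf holds one "name = value" setting per line, as in the max_log_size example above. Both set_gpperfmon_param and valid_quantum are hypothetical helpers, and the demonstration uses a scratch file rather than the real configuration file.

```shell
#!/bin/sh
# Sketch only. Both functions are hypothetical helpers, not Greenplum tools.

# Accept only the quantum values listed in the parameter description.
valid_quantum() {
    case $1 in
        10|15|20|30|60) return 0 ;;
        *) return 1 ;;
    esac
}

# Replace an existing "name = value" line, or append one if none exists.
set_gpperfmon_param() (
    conf_file=$1 name=$2 value=$3
    tmp_file=$(mktemp) || exit 1
    awk -v name="$name" -v value="$value" '
        # Match the parameter name at the start of a line, followed by "=".
        $0 ~ "^[ \t]*" name "[ \t]*=" { print name " = " value; found = 1; next }
        { print }
        END { if (!found) print name " = " value }
    ' "$conf_file" > "$tmp_file" && cat "$tmp_file" > "$conf_file"
    status=$?
    rm -f "$tmp_file"
    exit "$status"
)

# Demonstration on a scratch file.
demo=$(mktemp)
printf 'quantum = 15\nmin_query_time = 20\n' > "$demo"
valid_quantum 30 && set_gpperfmon_param "$demo" quantum 30
set_gpperfmon_param "$demo" max_log_size 10485760
cat "$demo"
rm -f "$demo"
```

As noted at the start of this section, changes to the real gpperfmon.conf take effect only after Greenplum Database is restarted with gpstop -r.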

Notes

The gpperfmon database and Greenplum Command Center require the gpmon role. After the gpperfmon database and gpmon role have been created, you can change the password for the gpmon role and update the information that Greenplum Command Center uses to connect to the gpperfmon database:

  1. Log in to Greenplum Database as a superuser and change the gpmon password with the ALTER ROLE command.
    # ALTER ROLE gpmon WITH PASSWORD 'new_password' ;
  2. Update the password in the .pgpass file that is used by Greenplum Command Center. The default file location is the gpadmin home directory (~/.pgpass). The .pgpass file contains a line with the gpmon password.
    *:5432:gpperfmon:gpmon:new_password
  3. Restart the Greenplum Command Center with the Command Center gpcmdr utility.
     $ gpcmdr --restart
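
The password file update in step 2 can also be scripted. The following is a minimal sketch, assuming the new password contains no colon or backslash characters (these need escaping in a .pgpass file, which this sketch does not handle). update_gpmon_pgpass is a hypothetical helper, and the demonstration works on a scratch file rather than ~/.pgpass.

```shell
#!/bin/sh
# Sketch only. update_gpmon_pgpass is a hypothetical helper, not a Greenplum tool.
# It rewrites the password field of gpperfmon:gpmon entries in a password file.
update_gpmon_pgpass() (
    pgpass_file=$1 new_password=$2
    tmp_file=$(mktemp) || exit 1
    # Password file fields: hostname:port:database:username:password
    awk -F: -v OFS=: -v pw="$new_password" '
        $3 == "gpperfmon" && $4 == "gpmon" { print $1, $2, $3, $4, pw; next }
        { print }
    ' "$pgpass_file" > "$tmp_file" && cat "$tmp_file" > "$pgpass_file"
    status=$?
    rm -f "$tmp_file"
    # The client library ignores a password file that group or others can access.
    chmod 600 "$pgpass_file"
    exit "$status"
)

# Demonstration on a scratch password file.
demo=$(mktemp)
printf '*:5432:gpperfmon:gpmon:changeme\n' > "$demo"
update_gpmon_pgpass "$demo" new_password
cat "$demo"
rm -f "$demo"
```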

The gpperfmon monitoring system requires some initialization after startup, so expect monitoring information to appear a few minutes after installation and startup of the gpperfmon system, not immediately.

Examples

Create the gpperfmon database only:

$ su - gpadmin
$ gpperfmon_install

Create the gpperfmon database, create the gpmon superuser, and enable the data collection agents:

$ su - gpadmin
$ gpperfmon_install --enable --password changeme --port 5432
$ gpstop -r
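
After a run with --enable, one way to confirm that the server configuration parameters were written is to inspect a postgresql.conf file directly. The sketch below assumes unquoted "name=value" settings in which the last uncommented assignment wins. conf_value and check_gpperfmon_settings are hypothetical helpers, and the demonstration reads a scratch file.

```shell
#!/bin/sh
# Sketch only. Both functions are hypothetical helpers, not Greenplum tools.

# Print the value of the last uncommented "name=value" line in a file.
conf_value() {
    awk -v name="$2" '
        $0 ~ "^[ \t]*" name "[ \t]*=" {
            line = $0
            sub(/#.*/, "", line)             # drop a trailing comment
            sub(/^[^=]*=[ \t]*/, "", line)   # drop everything up to "="
            sub(/[ \t]+$/, "", line)         # trim trailing whitespace
            value = line
        }
        END { print value }
    ' "$1"
}

# Succeed only if the settings the utility writes to every postgresql.conf are present.
check_gpperfmon_settings() {
    [ "$(conf_value "$1" gp_enable_gpperfmon)" = "on" ] &&
    [ "$(conf_value "$1" gpperfmon_port)" = "8888" ]
}

# Demonstration on a scratch file; on a real system the argument could be
# "$MASTER_DATA_DIRECTORY/postgresql.conf".
demo=$(mktemp)
printf 'gp_enable_gpperfmon=on\ngpperfmon_port=8888\n' > "$demo"
check_gpperfmon_settings "$demo" && echo "gpperfmon settings present"
rm -f "$demo"
```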

See Also

gpstop