gpsupport: Support Utility for VMware Greenplum

Author | Nihal Jain

Troubleshooting and identifying supportability issues in a complex database system can be a daunting task. gpsupport, a powerful diagnostic utility provided with VMware Greenplum, makes this process simpler and more efficient by enabling database administrators to identify, analyze, and resolve common supportability issues. In addition, gpsupport offers a consistent method for gathering the information VMware Support requires, facilitating a seamless support experience. In this blog post, we will delve into the capabilities of gpsupport and explore how it simplifies troubleshooting and support for VMware Greenplum.

The following tools are offered by the gpsupport utility:

1. analyze_session

This tool traces active processes within a Greenplum Database session, gathering diagnostic information that proves valuable in scenarios where a query process appears to be unresponsive or stalled.

Usage

$ gpsupport analyze_session [ -session SESSION_ID ] [ -coordinator-dir DIRECTORY ]
        [ -segment-dir DIRECTORY ]

  OPTIONS:
    -session           GPDB session id which is referenced in pg_stat_activity
    -coordinator-dir   Working directory for coordinator process
    -segment-dir       Working directory for segment processes
    -free-space        default=10  free space threshold which will abort data collection if reached
    -a                 Answer yes to all prompts

Example

Let us simulate a hung session by running the pg_sleep() function

[gpadmin@cdw ~]$ psql -c "select pg_sleep(10000)" &

Next, we need to find the session ID of this particular query. We can use the gp_stat_activity view to get this value.

gpadmin=# select sess_id from gp_stat_activity where query like '%pg_sleep%' and query not like '%like%';
 sess_id
---------
      78
(1 row)

Now we are going to use this value to collect all the information associated with this particular query process

[gpadmin@cdw ~]$ gpsupport analyze_session -session 78

The above command creates an analyze_session_*.tgz file which, when extracted, has the following contents.

[gpadmin@cdw ~]$ tree analyze_session_con74_2023-07-31_17-54-22
analyze_session_con74_2023-07-31_17-54-22
└── analyze_session_con74_2023-07-31_17-54-22
    ├── analyze_session_cdw_con74_2023-07-31_17-54-22
    │   ├── commands
    │   │   ├── df
    │   │   │   └── -h
    │   │   ├── dmidecode
    │   │   ├── free
    │   │   ├── ifconfig
    │   │   ├── lsof
    │   │   ├── netstat
    │   │   │   ├── -i
    │   │   │   └── -rn
    │   │   ├── ps
    │   │   │   ├── -elfy
    │   │   │   ├── -flyCpostgres
    │   │   │   └── aux
    │   │   ├── sysctl
    │   │   │   └── -a
    │   │   ├── ulimit
    │   │   │   └── -a
    │   │   └── uname
    │   │       └── -a
    │   ├── etc
    │   │   ├── redhat-release
    │   │   ├── security
    │   │   │   └── limits.conf
    │   │   ├── sysconfig
    │   │   │   └── network
    │   │   └── sysctl.conf
    │   └── tmp
    │       └── analyze_session_cdw_con74_2023-07-31_17-54-22
    │           └── process-collector-2023-07-31_17-54-25
    │               ├── lsof.90925-2023-07-31_17-54-28
    │               ├── lsof.90925-2023-07-31_17-54-32
    │               ├── lsof.90925-2023-07-31_17-54-36
    │               ├── lsof.90925-2023-07-31_17-54-39
    │               ├── lsof.90925-2023-07-31_17-54-43
    │               ├── lsof.90925-2023-07-31_17-54-47
    │               ├── lsof.90925-2023-07-31_17-54-50
    │               ├── lsof.90925-2023-07-31_17-54-54
    │               ├── lsof.90925-2023-07-31_17-54-58
    │               ├── lsof.90925-2023-07-31_17-55-02
    │               ├── packcore-core.90925.tar.gz
    │               ├── pstack.90925-2023-07-31_17-54-31
    │               ├── pstack.90925-2023-07-31_17-54-35
    │               ├── pstack.90925-2023-07-31_17-54-38
    │               ├── pstack.90925-2023-07-31_17-54-42
    │               ├── pstack.90925-2023-07-31_17-54-46
    │               ├── pstack.90925-2023-07-31_17-54-49
    │               ├── pstack.90925-2023-07-31_17-54-53
    │               ├── pstack.90925-2023-07-31_17-54-57
    │               ├── pstack.90925-2023-07-31_17-55-00
    │               ├── pstack.90925-2023-07-31_17-55-04
    │               └── strace-90925.out
    ├── analyze_session_cdw_con74_2023-07-31_17-54-22.tgz
    ├── gp_locks_on_relation.out
    ├── gp_locks_on_resqueue.out
    ├── gp_resq_activity.out
    ├── gp_resqueue_status.out
    ├── pg_locks.out
    └── pg_stat_activity.out

15 directories, 46 files

2. catalogbackup

This tool takes a backup of the catalog and transaction files for a database. This is helpful when we plan to make manual changes to the catalog and want a backup to fall back on.

Usage

$ gpsupport catalogbackup -d dbname -c ID1,ID2,... [-r -wd <directory>]

  OPTIONS:
    -c           The content IDs of the segments to be backed up
    -d           The name of the database which needs to be backed up
    -r           Retrieve backups from segments
    -wd          Directory to copy all files to, current directory will be used otherwise

Example

Let us say we want to make a backup of the gpadmin database for the segments with content ID 1. We can do so by running the following command.

$ gpsupport catalogbackup -r -d gpadmin -c 1

Here, we are using the -r option to retrieve the backups generated from the segment hosts to the coordinator host.

Following are the files collected by this tool.

[gpadmin@cdw gpsupport_logs]$ ls -l
total 2588
-rw-rw-r-- 1 gpadmin gpadmin 1304239 Aug  3 12:28 catalogBackup-m1-230803122805.tgz
-rw-rw-r-- 1 gpadmin gpadmin 1311274 Aug  3 12:28 catalogBackup-p1-230803122805.tgz
-rw-rw-r-- 1 gpadmin gpadmin   12711 Aug  3 12:28 last_ops.csv
-rw-rw-r-- 1 gpadmin gpadmin    6798 Aug  3 12:28 relids_seg_1_gpadmin.csv
-rw-rw-r-- 1 gpadmin gpadmin     682 Aug  3 12:28 seg_hist.csv

It contains two catalogBackup-*.tgz files, where p1 and m1 denote the primary and mirror segments with content ID 1, respectively.
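The backup file names encode the segment's role and content ID. As a quick illustration of this naming convention, here is a hypothetical helper (not part of gpsupport) that parses such a name:

```python
import re

def parse_backup_name(filename):
    """Parse a catalogBackup tarball name into role, content ID, and timestamp.

    Naming convention (from the listing above):
        catalogBackup-<role><contentID>-<timestamp>.tgz
    where role is 'p' (primary) or 'm' (mirror).
    """
    m = re.match(r"catalogBackup-([pm])(\d+)-(\d+)\.tgz$", filename)
    if not m:
        raise ValueError(f"unrecognized backup file name: {filename}")
    role = "primary" if m.group(1) == "p" else "mirror"
    return role, int(m.group(2)), m.group(3)
```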

If we take a look at one of the tar files

[gpadmin@cdw gpsupport_logs]$ tree catalogBackup-p1-230803122805
catalogBackup-p1-230803122805
├── base
│   └── 16942
│       ├── 112
│       ├── 113
│       ... (truncated)
├── global
│   ├── 1137
│   ├── 1213
│   ├── 1213_fsm
│   ├── 1213_vm
│   ... (truncated)
│   ├── pg_control
│   ├── pg_filenode.map
│   └── pg_internal.init
└── pg_distributedlog
    └── 0000

3. gp_log_collector

As the name suggests, this tool collects the logs generated by the Greenplum Database. This includes postgres log files, logs generated by the management utilities, and much more. This is helpful when you need to share logs with the support team for troubleshooting.

Usage

$ gpsupport gp_log_collector [ -failed-segs | -c ID1,ID2,... | -hostfile FILE | -h HOST1,HOST2,... ]
        [ -start YYYY-MM-DD | YYYY-MM-DD HH:MM ] [ -end YYYY-MM-DD | YYYY-MM-DD HH:MM ]
        [ -dir PATH ] [ -segdir PATH ] [ -a ]

  OPTIONS:
    -failed-segs       Query gp_configuration_history for list of faulted content IDs.
    -free-space        default=10  free space threshold which will abort log collection if reached
    -c                 Comma separated list of content IDs.
    -hostfile          Read hostnames from a hostfile.
    -h                 Comma separated list of hostnames.
    -start             Specifies the start timestamp for collecting logs. It should be in the format 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM' (defaults to current date)
    -end               Specifies the end timestamp for collecting logs. It should be in the format 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM' (defaults to current date)
    -a                 Answer Yes to all prompts
    -dir               Working directory (defaults to current directory)
    -segdir            Segment temporary directory (defaults to /tmp)
    -skip-coordinator  Skip coordinator log collection
    -with-gptext       Collect all gptext logs within gpdb logs
    -with-gptext-only  Only Collect gptext logs
    -with-pxf          Collect all pxf logs along with other greenplum logs
    -with-pxf-only     Only Collect pxf logs
    -with-gpupgrade    Collect all logs relevant to gpupgrade
    -with-gpbackup     Collect all logs relevant to gpbackup and gprestore
    -with-gpcc         Collect all logs relevant to gpcc
    -with-gpss         Collect all logs relevent to gpss. Logs will be collected from gpAdminLogs directory by default
    -gpsslogdir        Directory where gpss logs are located (can only be used with -with-gpss option)

Example

If we use this tool without any options, it collects the coordinator and standby logs for the current day by default. We can gain more control over which logs are collected by using the options mentioned above.

For example, if we want to collect the logs for the segments with content ID 1 for the current day, we can use the following command.

$ gpsupport gp_log_collector -c 1

You can specify the time range for which you wish to collect the logs using the -start and -end flags. Additionally, if you wish to collect logs for other GPDB components like GPCC, GPSS, PXF…, you can use the appropriate option shown above.
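The -start and -end values must match one of the two documented formats. As a quick illustration, a hypothetical helper (not part of gpsupport) that validates them with Python's datetime:

```python
from datetime import datetime

def parse_log_timestamp(value):
    """Validate a -start/-end value against the two accepted formats:
    'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM'.
    """
    for fmt in ("%Y-%m-%d %H:%M", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"expected 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM', got: {value!r}")
```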

Below is the list of files it usually collects. The logs for each host are packaged in their own tarball.

[gpadmin@cdw gp_log_collection_2023-08-09_12-29-28]$ ls -l
total 112
-rw-r--r-- 1 gpadmin gpadmin  2039 Aug  9 12:29 gp_configuration_history
-rw-rw-r-- 1 gpadmin gpadmin 24304 Aug  9 12:29 gp_log_collection_cdw_2023-08-09_12-29-28.tgz
-rw-rw-r-- 1 gpadmin gpadmin 23747 Aug  9 12:29 gp_log_collection_sdw2_2023-08-09_12-29-28.tgz
-rw-rw-r-- 1 gpadmin gpadmin 23460 Aug  9 12:29 gp_log_collection_sdw3_2023-08-09_12-29-28.tgz
-rw-r--r-- 1 gpadmin gpadmin   221 Aug  9 12:29 gp_resgroup_config
-rw-r--r-- 1 gpadmin gpadmin   414 Aug  9 12:29 gp_segment_configuration
-rw-r--r-- 1 gpadmin gpadmin   460 Aug  9 12:29 pg_database
-rw-r--r-- 1 gpadmin gpadmin    64 Aug  9 12:29 pg_db_role_setting_Database
-rw-r--r-- 1 gpadmin gpadmin   202 Aug  9 12:29 pg_db_role_setting_Roles
-rw-r--r-- 1 gpadmin gpadmin   113 Aug  9 12:29 pg_resqueue
-rw-r--r-- 1 gpadmin gpadmin    52 Aug  9 12:29 system_initialization_timestamp
-rw-r--r-- 1 gpadmin gpadmin    31 Aug  9 12:29 uptime
-rw-r--r-- 1 gpadmin gpadmin   248 Aug  9 12:29 version

Let us take a look at host sdw2’s tarball to see the kind of files it collects. Usually it contains output of certain system commands like ps, top, netstat… along with the postgres logs for all the segments on that host.

[gpadmin@cdw gp_log_collection_2023-08-09_12-29-28]$ tree gp_log_collection_sdw2_2023-08-09_12-29-28
gp_log_collection_sdw2_2023-08-09_12-29-28
├── commands
│   ├── bash
│   │   ├── -c_echo $PATH
│   │   ├── -c_ps -ef | grep solr
│   │   ├── -c_ps -ef | grep zookeeper
│   │   └── -c_ulimit -a
│   ├── df
│   │   └── -h
│   ├── free
│   ├── ifconfig
│   ├── netstat
│   │   ├── -i
│   │   └── -rn
│   ├── ps
│   │   └── aux
│   ├── sysctl
│   │   └── -a
│   ├── top
│   │   └── -n_1_-b
│   └── uname
│       └── -a
├── data
│   └── primary
│       └── gpseg1
│           ├── log
│           │   ├── gpdb-2023-08-09_000000.csv
│           │   ├── gpdb-2023-08-09_033827.csv
│           │   └── gpdb-2023-08-09_040710.csv
│           ├── pg_hba.conf
│           └── postgresql.conf
└── etc
    ├── redhat-release
    ├── security
    │   └── limits.conf
    ├── sysconfig
    │   └── network
    └── sysctl.conf

15 directories, 22 files

4. gpcheckcat

This tool is used to analyze the gpcheckcat log file generated by the gpcheckcat utility. This can include splitting the log file into different files for each test gpcheckcat performs (like missing, dependency…) or generating SQL files to fix catalog issues related to tables like pg_class and pg_depend.

Usage

$ gpsupport gpcheckcat [COMMAND] [OPTIONS]

  COMMANDS:
    split                 Split gpcheckcat log file into a tree structure for easier analysis (root -> database check directory -> test files)
    pgclass_extra         Generate DROP TABLE command files for all extra relations discovered by the "missing_extraneous_pg_class" test.
                              Use as input either gpcheckcat log file (-infile) or single database directory generated by -split command (-indir).
                              ***IMPORTANT*** Verify DROP statements before executing ***IMPORTANT***
    depfix                Examine pg_depend table for orphan and broken dependency entries. Generate command file with DELETE statements as well as
                              command file with COPY IN command to serve as a backup.
    depshow               Build and run the gpcheckcat dependency check query and provide the details (gpcheckcat provides only the summary)


  OPTIONS:
    -infile                Input file (gpcheckcat log file)
    -indir                 Input directory. This must be a database directory created by -split command. See examples for details.
    -outdir                Output directory (directories/files will be written here)
    -errorsonly            Record only error related lines ("ERROR", "FAIL", "WARNING")
    -db                    Database to connect to. If empty, defaults to PGDATABASE environment variable. If PGDATABASE is not set, use "postgres" database.

Example

Let us take an example of the split command. First we will run gpcheckcat to generate a gpcheckcat log file, which we will then use with the tool.

[gpadmin@cdw gpsupport_logs]$ gpcheckcat gpadmin
[gpadmin@cdw gpsupport_logs]$ ls ~/gpAdminLogs/
gpcheckcat_20230816.log

Now we will use the generated gpcheckcat log file with the tool

[gpadmin@cdw gpsupport_logs]$ gpsupport gpcheckcat -split -infile ~/gpAdminLogs/gpcheckcat_20230816.log
[gpadmin@cdw gpsupport_logs]$ tree gpcheckcat-split-gpcheckcat_20230816.log-20230816-140434
gpcheckcat-split-gpcheckcat_20230816.log-20230816-140434
└── gpadmin-20230816:14:03:59
    ├── ao_lastrownums
    ├── aoseg_table
    ├── dependency
    ├── distribution_policy
    ├── duplicate
    ├── foreign_key
    ├── inconsistent
    ├── missing_extraneous
    ├── namespace
    ├── orphaned_toast_tables
    ├── part_integrity
    ├── pgclass
    ├── summary_report
    └── unique_index_violation

1 directory, 14 files

As we can see, the tool has split the log file into different sections, one for each of the tests that gpcheckcat performs.

5. gpcheckup

gpcheckup performs a health check across all the hosts of a Greenplum Database cluster. It checks the network and various kernel settings to determine whether each host is in a healthy condition to run Greenplum Database efficiently.
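The kernel-settings portion of this check boils down to comparing each host's sysctl values against recommended values. This is a minimal illustrative sketch, not gpcheckup's actual code; the expected values mirror those in the sample output below:

```python
# Recommended values, as reported in the gpcheckup output shown below.
EXPECTED = {"vm.overcommit_ratio": 95, "vm.overcommit_memory": 2}

def check_kernel_settings(host, actual):
    """Return a failure line per setting that deviates from the expected value.

    `actual` is a dict of sysctl name -> value as read on `host`.
    """
    failures = []
    for key, expected in EXPECTED.items():
        value = actual.get(key)
        if value != expected:
            failures.append(f"[{host}] {key} = {value}. Expected = {expected}")
    return failures
```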

Usage

$ gpsupport gpcheckup

Example

[gpadmin@cdw gpsupport_logs]$ gpsupport gpcheckup
Checking connectivity...
Connectivity confirmed successfully.
Checking hosts for existing file...
Tool already on remote hosts with correct MD5.

Running System Healthchecks

	Kernel Memory Settings....   FAILED

[cdw] vm.overcommit_ratio = 50. Expected = 95
[cdw] vm.overcommit_memory = 1. Expected = 2
[sdw1] vm.overcommit_ratio = 50. Expected = 95
[sdw1] vm.overcommit_memory = 1. Expected = 2
[sdw2] vm.overcommit_ratio = 50. Expected = 95
[sdw2] vm.overcommit_memory = 1. Expected = 2
[sdw3] vm.overcommit_ratio = 50. Expected = 95
[sdw3] vm.overcommit_memory = 1. Expected = 2

Note: The vm.overcommit_ratio may be influenced by other factors.
      See http://greenplum.org/calc/ for more information

See KB 202703383 for more information on setting overcommit_memory to 2

	System Memory....            PASSED
	Database Memory Settings.... FAILED

ERROR: The gp_vmem_protect_limit is set too high: setting: 8192 suggestion: 500

Note: These memory suggestions are based on the calculations on greeplum.org, and may not fit your use case.
      See http://greenplum.org/calc/ for more information

	Resource Queue Memory....    WARNING

WARN: Resource queue pg_default does not have a memory limit
WARN: Statement memory potentially too high. Total number of available active queries * the statement_mem is > 90 percent of available system memory

Total available system memory: 829506 kB
Total available query slots: 20
Total memory allocated to resource queues: 0 kB
GUC statement_mem setting: 128000

6. gpstatscheck

This tool helps identify all tables involved in a query that have outdated statistics. When a query runs slower than expected, outdated or invalid statistics on the tables involved may be the cause. This can happen if new data has been loaded into a table but ANALYZE was never run, so the database uses the wrong statistics when generating the query plan.
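The core comparison the tool performs can be sketched as contrasting the actual row count against the optimizer's estimate and flagging tables whose statistics have drifted. The helper and the 10% threshold below are illustrative assumptions, not gpstatscheck's actual code or cutoff:

```python
def needs_analyze(actual_rows, estimated_rows, threshold=0.1):
    """Flag a table when the optimizer's row estimate has drifted from reality.

    `threshold` is the tolerated relative difference (0.1 = 10%, an
    illustrative default).
    """
    if actual_rows == 0:
        return estimated_rows != 0
    return abs(actual_rows - estimated_rows) / actual_rows > threshold
```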

Usage

$ gpsupport gpstatscheck -f QUERYFILE [ -p PORT ] [ -d DATABASE ]

  OPTIONS:
    -f          Query file
    -p          Port (defaults to 5432)
    -d          Database (defaults to gpadmin)

Example

First, I will create a table abc with 1000 rows. After that, I will insert an additional 2000 rows into the same table without running ANALYZE, so that the table's statistics are out of date.

gpadmin=# CREATE TABLE abc (col1 int);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'col1' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
gpadmin=# INSERT INTO abc VALUES (generate_series(1,1000));
INSERT 0 1000
gpadmin=# INSERT INTO abc VALUES (generate_series(1,2000));
INSERT 0 2000

Now let us assume I have a query that involves operations on table abc. I am using a very simple query here as an example. Note that the query must be prefixed with EXPLAIN VERBOSE to use this tool.

[gpadmin@cdw gpsupport_logs]$ cat gp.sql
EXPLAIN VERBOSE
SELECT
    *
FROM
    abc

Now I will run this query through the gpstatscheck tool to check whether the tables involved have outdated statistics.

[gpadmin@cdw gpsupport_logs]$ gpsupport gpstatscheck -f gp.sql -p 7000 -d gpadmin -y
Executing EXPLAIN VERBOSE query.
Found 1 tables in query.

                      Table Details
-------------------------------------
| Table Name | Info                 |
-------------+-----------------------
| public.abc |                      |
-------------+-----------------------

Note: Views and External Tables will be skipped since they do not have statistics.

-y option passed, continuing...

Executing count(*) to get actual tuple counts:
 -> public.abc ... done

                                            Stats Check Summary
---------------------------------------------------------------------------------------
| Table Name | Actual           | Estimated        | Diff      | Comments             |
-------------+------------------+------------------+-----------+-----------------------
| public.abc |             3000 |             1000 |      2000 | Needs ANALYZE        |
-------------+------------------+------------------+-----------+-----------------------

Generating ANALYZE commands.

Output file:
    gpstatscheck_20230816_171953.sql

Execute using:
    psql -p 7000 -d gpadmin -f gpstatscheck_20230816_171953.sql

Execution finished successfully!

As we can see, the output shows that the table's statistics were not updated after we inserted the additional 2000 rows. The tool also generates an output SQL file, a simple ANALYZE on each affected table, which we can run to refresh the statistics.

[gpadmin@cdw gpsupport_logs]$ cat gpstatscheck_20230816_171953.sql
ANALYZE public.abc;

7. packcore

Packcore takes a corefile, extracts the name of the binary which generated the core, executes ldd (List Dynamic Dependencies) to get the required shared libraries, and packages everything into a single tarball archive.
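The library-gathering step amounts to parsing ldd's output for the resolved paths. A hypothetical helper (not packcore's actual code) sketching that parsing:

```python
import re

def shared_library_paths(ldd_output):
    """Extract resolved shared-library paths from `ldd` output.

    ldd prints lines such as:
        libc.so.6 => /lib64/libc.so.6 (0x00007f...)
        /lib64/ld-linux-x86-64.so.2 (0x00007f...)
    Virtual entries like linux-vdso.so.1 have no on-disk path and are skipped.
    """
    paths = []
    for line in ldd_output.splitlines():
        m = re.search(r"=>\s*(\S+)\s*\(0x", line)
        if m:
            paths.append(m.group(1))
            continue
        m = re.match(r"\s*(/\S+)\s*\(0x", line)
        if m:
            paths.append(m.group(1))
    return paths
```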

Usage

$ gpsupport packcore -cmd COMMAND -core COREFILE [ -binary BINARY ] [ -keep_tmp_dir ] [ -ignore_missing ]

  COMMANDS:
    collect               Collect core file and associated libraries

  OPTIONS:
    -core                  Corefile
    -binary                Binary
    -keep_tmp_dir          Do not delete temp directory
    -ignore_missing        Ignore missing libraries

Example

Assume we have a core file for the PID 5129 which we want to package using this tool. We could do it as follows.

[gpadmin@cdw gpsupport_logs]$ ls
core.5129
[gpadmin@cdw gpsupport_logs]$ gpsupport packcore -cmd collect -core core.5129

This generates a tar file named packcore-*.tar.gz in the same directory, which contains the following files.

[gpadmin@cdw gpsupport_logs]$ tree packcore-core.5129
packcore-core.5129
├── core.5129
├── etc
│   ├── centos-release
│   ├── os-release
│   ├── redhat-release
│   ├── rocky-release
│   └── system-release
├── gdb_output
├── lib64
│   ├── ld-linux-x86-64.so.2
│   ├── libaudit.so.1
│   ├── libbz2.so.1
│   ├── libc.so.6
│   ├── libcap-ng.so.0
│   ├── libcom_err.so.2
│   ├── libcrypt.so.1
│   ├── libcrypto.so.1.1
│   ├── libcurl.so.4
│   ├── libdl.so.2
│   ├── libgcc_s.so.1
│   ├── libgssapi_krb5.so.2
│   ├── libk5crypto.so.3
│   ├── libkeyutils.so.1
│   ├── libkrb5.so.3
│   ├── libkrb5support.so.0
│   ├── liblber-2.4.so.2
│   ├── libldap-2.4.so.2
│   ├── liblzma.so.5
│   ├── libm.so.6
│   ├── libnghttp2.so.14
│   ├── libnss_files.so.2
│   ├── libpam.so.0
│   ├── libpcre2-8.so.0
│   ├── libpthread.so.0
│   ├── libresolv.so.2
│   ├── librt.so.1
│   ├── libsasl2.so.3
│   ├── libselinux.so.1
│   ├── libssl.so.1.1
│   ├── libstdc++.so.6
│   ├── libuv.so.1
│   ├── libxml2.so.2
│   ├── libz.so.1
│   └── libzstd.so.1
├── postgres
├── runGDB.sh
├── uname.out
└── usr
    └── local
        └── greenplum-db-7.0.0-beta.5
            └── lib
                └── libxerces-c-3.1.so

6 directories, 46 files

8. primarymirror_lengths

This utility detects invalid file lengths in AO and AOCO tables on both primary and mirror segments. It helps you identify discrepancies in file lengths, ensuring the integrity and consistency of your data storage.
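Conceptually, the check compares the logical end-of-file recorded in the catalog with the file's actual on-disk size. This is a minimal sketch of that comparison under the assumption that a file shorter than its catalog EOF is damaged; the real tool also coordinates table locks across primaries and mirrors:

```python
import os

def check_segfile_length(path, expected_eof):
    """Compare a segment file's catalog EOF with its actual on-disk size.

    Returns a mismatch message in the spirit of the tool's output, or None
    when the file is at least as long as the catalog says it should be.
    """
    actual = os.path.getsize(path)
    if actual < expected_eof:
        return (f"Data size mismatch: {path} "
                f"expected size: {expected_eof} actual size: {actual}")
    return None
```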

Usage

$ gpsupport primarymirror_lengths [OPTIONS]

  OPTIONS:
    -d                    Database to check
    -batch-size           Number of tables to lock concurrently [default=5]
    -max-lock-attempts    Maximum number of times to attempt table lock. [default=1000]

Example

I have 3 AOCO tables named abc, def, and ghi in my database, each containing some data. Now let us find a file belonging to one of these tables on the primary segment with content ID 1 and damage it by reducing its size.

gpadmin=# \dt+
                         List of relations
 Schema | Name | Type  |  Owner  |  Storage  |  Size  | Description
--------+------+-------+---------+-----------+--------+-------------
 public | abc  | table | gpadmin | ao_column | 302 kB |
 public | def  | table | gpadmin | ao_column | 302 kB |
 public | ghi  | table | gpadmin | ao_column | 302 kB |
(3 rows)

The below command will give the file info for all the AOCO tables on segment with content ID 1.

gpadmin=# select c.relfilenode, aoco_info.physical_segno, aoco_info.eof from gp_dist_random('pg_class') c join gp_toolkit.__gp_aocsseg(c.oid) aoco_info on gp_segment_id=aoco_info.segment_id where relam=3435 and c.gp_segment_id=1 ORDER BY random();
 relfilenode | physical_segno |  eof
-------------+----------------+-------
       18424 |              1 | 13584
       18420 |            129 | 13584
       18424 |            129 | 13584
       18420 |              1 | 13584
       18428 |              1 | 13584
       18428 |            129 | 13584
(6 rows)

gpadmin=# select oid from pg_database where datname = current_database();
  oid
-------
 21672
(1 row)

Using the above information, we can construct the path for any one of the file nodes. Using the first row, the file path is /data/primary/gpseg1/base/21672/18424.1, which is on host sdw2. Now we will truncate the file to size 0 to damage it.

[gpadmin@sdw2 ~]$ truncate -s 0 /data/primary/gpseg1/base/21672/18424.1
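The path we constructed follows the standard layout of a segment data directory: <datadir>/base/<database oid>/<relfilenode>.<physical_segno>. A small hypothetical helper that rebuilds it from the values we queried:

```python
import os

def aoco_segfile_path(datadir, db_oid, relfilenode, physical_segno):
    """Build the on-disk path of an AO/AOCO segment file.

    Layout: <datadir>/base/<database oid>/<relfilenode>.<physical_segno>
    (segment file 0 is stored without a suffix).
    """
    name = str(relfilenode) if physical_segno == 0 else f"{relfilenode}.{physical_segno}"
    return os.path.join(datadir, "base", str(db_oid), name)
```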

When we run the tool, it identifies the damaged file paths.

[gpadmin@cdw gpsupport_logs]$ gpsupport primarymirror_lengths -d gpadmin
Checking connectivity...
Connectivity confirmed successfully.
Checking hosts for existing file...
Tool already on remote hosts with correct MD5.
===========================================
STARTING PRIMARY MIRROR SEGFILE LENGTH SCAN
===========================================
Data size mismatch: /data/primary/gpseg1/base/21672/18424.1 host: sdw2 dbid: 3 expected size: 13584 actual size: 0
===========================================
Found relations in catalog: 3
Scanned ao relations: 0
Found ao inconsistencies: 0
Scanned aoco relations: 3
Found aoco inconsistencies: 1
Relations not scanned: 0
Relations not found: 0
===========================================