Avoiding subtransaction overflow in GPDB6

Author | Soumyadeep Chakraborty

Subtransaction overflow, when coupled with long-running transactions, can bring a cluster to its knees.

It manifests when a backend creates more than 64 subtransactions within a single transaction (64 being the size of the per-backend subtransaction XID cache, PGPROC_MAX_CACHED_SUBXIDS).

This can happen on the master as well as on the primary segments.

Monitoring for subtx overflow:

Monitoring is available in Greenplum 6.22.0 and newer.

-- Contains view gp_suboverflowed_backend
CREATE EXTENSION gp_subtransaction_overflow;
-- For each session set:
SET gp_log_suboverflow_statement TO on;
-- Or set it cluster wide with:
gpconfig -c gp_log_suboverflow_statement -v on
gpstop -au
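Once the view is in place, the reported pids can be cross-referenced against pg_stat_activity to find the offending sessions. A sketch, run on the master (segid = -1 denotes the master; per-segment pids can be joined similarly):

```sql
-- Sessions on the master that have overflowed their
-- subtransaction XID cache, with the query they are running.
SELECT sa.pid, sa.usename, sa.xact_start, sa.query
FROM pg_stat_activity sa
WHERE sa.pid = ANY (
    SELECT unnest(pids)
    FROM gp_suboverflowed_backend
    WHERE segid = -1
);
```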

Scenarios causing subtx overflow:

EXCEPTION blocks in PL/pgSQL:

CREATE OR REPLACE FUNCTION transaction_test2()
RETURNS void AS $$
DECLARE
    i int;
BEGIN
    FOR i IN 0..65
    LOOP
        BEGIN
            CREATE TEMP TABLE tmptab(c int) DISTRIBUTED BY (c);
            DROP TABLE tmptab;
        EXCEPTION
            WHEN others THEN
                NULL;
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

BEGIN;
SELECT transaction_test2();
SELECT * FROM gp_suboverflowed_backend;
 segid |  pids   
-------+---------
    -1 | {92506}
     0 | {92566}
     1 | {92567}
     2 | {92568}
(4 rows)

-- In master and segment logs:
-- "Statement caused suboverflow: SELECT transaction_test2();"

In the example above, the BEGIN ... EXCEPTION block executes 66 times (i runs from 0 through 65), and each execution opens a new subtransaction. Had it executed 64 times or fewer, overflow would have been avoided!
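One way to restructure the function is to hoist the EXCEPTION block out of the loop, so the whole loop costs a single subtransaction. Note the semantics change: an error now abandons the remaining iterations instead of just the current one. A sketch (transaction_test2_fixed is a hypothetical name):

```sql
CREATE OR REPLACE FUNCTION transaction_test2_fixed()
RETURNS void AS $$
BEGIN
    -- One BEGIN ... EXCEPTION block = one subtransaction,
    -- regardless of how many times the loop body runs.
    BEGIN
        FOR i IN 0..65 LOOP
            CREATE TEMP TABLE tmptab(c int) DISTRIBUTED BY (c);
            DROP TABLE tmptab;
        END LOOP;
    EXCEPTION
        WHEN others THEN
            NULL;  -- an error now aborts the whole loop
    END;
END;
$$
LANGUAGE plpgsql;
```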

plpy.execute() and plpy.subtransaction() in PL/Python:

CREATE OR REPLACE FUNCTION create_temp_tables() RETURNS VOID AS $$
import plpy

for i in range(0, 65):
    plpy.execute('CREATE TEMP TABLE t_%d (id int)' % i)

$$ LANGUAGE plpythonu;

BEGIN;
SELECT create_temp_tables();
SELECT * FROM gp_suboverflowed_backend;
 segid |  pids   
-------+---------
    -1 | {94294}
     0 | {94300}
     1 | {94299}
     2 | {94301}
(4 rows)

-- In master and segment logs:
-- "Statement caused suboverflow: SELECT create_temp_tables();"

Each plpy.execute() call creates an implicit subtransaction. Again, by looping more than 64 times, we hit overflow.
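If the statements do not need to be isolated from one another, one mitigation is to batch them into a single plpy.execute() call, which costs a single implicit subtransaction. A sketch (create_temp_tables_batched is a hypothetical name):

```sql
CREATE OR REPLACE FUNCTION create_temp_tables_batched() RETURNS VOID AS $$
# Join all the DDL into one multi-statement string: one
# plpy.execute() call means one implicit subtransaction
# instead of 65.
stmts = '; '.join('CREATE TEMP TABLE t_%d (id int)' % i
                  for i in range(0, 65))
plpy.execute(stmts)
$$ LANGUAGE plpythonu;
```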

Using explicit subtransactions with plpy.subtransaction() causes the same problem:

CREATE OR REPLACE FUNCTION create_temp_tables2() RETURNS VOID AS $$
import plpy

for i in range(0, 65):
    with plpy.subtransaction():
        plpy.execute('CREATE TEMP TABLE t_%d (id int)' % i)
        plpy.execute('CREATE TEMP TABLE t_x%d (id int)' % i)

$$ LANGUAGE plpythonu;

BEGIN;
SELECT create_temp_tables2();
SELECT * FROM gp_suboverflowed_backend;
 segid |   pids   
-------+----------
    -1 | {100673}
     0 | {100684}
     1 | {100685}
     2 | {100686}
(4 rows)

-- In master and segment logs:
-- "Statement caused suboverflow: SELECT create_temp_tables2();"

SAVEPOINT:

Savepoints create subtransactions too:

BEGIN;
SAVEPOINT a0;
INSERT INTO foo VALUES(1);
ROLLBACK TO SAVEPOINT a0; -- creates 1st subtransaction.

SAVEPOINT a1;
INSERT INTO foo VALUES(1);
ROLLBACK TO SAVEPOINT a1; -- creates 2nd subtransaction.

...

SAVEPOINT a64;
INSERT INTO foo VALUES(1);
ROLLBACK TO SAVEPOINT a64; -- creates 65th subtransaction and causes overflow.

COMMIT;
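Because the cached subtransaction XIDs are only discarded at top-level commit or abort, one mitigation is to commit periodically so that no single transaction accumulates more than 64 of them. A sketch, reusing the foo table from the example above:

```sql
BEGIN;
SAVEPOINT a0;
INSERT INTO foo VALUES(1);
ROLLBACK TO SAVEPOINT a0;
-- ... stay well under 64 savepoint cycles per transaction ...
COMMIT;  -- top-level commit clears the subtransaction XID cache

BEGIN;   -- continue the work in a fresh transaction
SAVEPOINT a0;
INSERT INTO foo VALUES(1);
ROLLBACK TO SAVEPOINT a0;
COMMIT;
```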

JDBC:

Setting autosave=always in the pgJDBC driver creates a savepoint (and hence a subtransaction) before every statement executed! See the driver documentation for details.
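If savepoints are not needed for error recovery, the setting can be left at its default (never), or relaxed to conservative, which only creates savepoints for statements the driver suspects may fail. A sketch of a connection URL (host and database names are placeholders):

```
jdbc:postgresql://mdw:5432/mydb?autosave=never
```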

General guidance:

  1. Subtransaction overflow is not harmful by itself. When coupled with long-running transactions, however, it causes system slowness: once a backend overflows, visibility checks must fall back to pg_subtrans lookups, which are far slower than consulting the in-memory cache.
  2. Avoiding subtransaction overflow can be achieved by avoiding:
    • EXCEPTION blocks that can execute more than 64 times per transaction
    • plpy.execute() / plpy.subtransaction() calls that can run more than 64 times per transaction
    • More than 64 SAVEPOINT / ROLLBACK TO SAVEPOINT cycles per transaction
    • Driver settings that create savepoints implicitly (like pgJDBC autosave=always)
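Since overflow only hurts in combination with long-running transactions, it is also worth monitoring for those. A sketch of a query against pg_stat_activity on the master:

```sql
-- Oldest open transactions, longest-running first.
SELECT pid, usename, now() - xact_start AS xact_age, state, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;
```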