Bug Report: AOCS Sampling Fails with Large Objects (BLOB)
Summary
On Greenplum 6.29.0, ANALYZE and auto_stats fail on AOCS tables whose TEXT/JSONB columns contain large objects, with the error:
ERROR: Advance not called on large datum stream object (datumstream.c:276)
Root Cause
Problem location: src/backend/access/aocs/aocsam.c, function aocs_gettuple_column()
if (chkvisimap && !isSnapshotAny && !AppendOnlyVisimap_IsVisible(&scan->visibilityMap, &aotid))
{
    ret = false;
    goto out; // ← Returns WITHOUT calling datumstreamread_advance()
}
datumstreamread_find(ds, rownum - ds->blockFirstRowNum); // Skipped on the early-return path
When a BLOB block is read, largeObjectState is set to HaveAoContent. If aocs_gettuple_column() returns early (visibility check or other reasons), datumstreamread_advance() is never called, leaving largeObjectState = HaveAoContent.
On the next sample row iteration, datumstreamread_nth() is called (line 1015 in the elog DEBUG2 output), which raises the error because largeObjectState is still HaveAoContent.
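The sequence described above can be sketched as a small state model. This is a deliberately simplified illustration, not a transcription of aocsam.c or datumstream.c: the Toy* types, the toy_* functions, the state names other than HaveAoContent, and the ordering of steps inside toy_sample_row() are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for DatumStreamLargeObjectState. Only HaveAoContent is
 * named in this report; the other two names are assumptions. */
typedef enum {
    TOY_NONE,               /* nothing pending */
    TOY_HAVE_AO_CONTENT,    /* large-object block read, advance still owed */
    TOY_POSITION_ADVANCED   /* advance done, datum can be fetched */
} ToyLargeObjectState;

typedef struct {
    ToyLargeObjectState state;
} ToyStream;

typedef enum { ROW_ERROR = -1, ROW_INVISIBLE = 0, ROW_OK = 1 } ToyRowResult;

/* Models datumstreamread_advance(): clears the pending state. */
static void toy_advance(ToyStream *ds)
{
    assert(ds != NULL);
    if (ds->state == TOY_HAVE_AO_CONTENT)
        ds->state = TOY_POSITION_ADVANCED;
}

/* Models one sample-row iteration over a large-object column. */
static ToyRowResult toy_sample_row(ToyStream *ds, bool visible)
{
    assert(ds != NULL);

    /* Positional read: refuses to run while an advance is still owed.
     * The real code raises "Advance not called on large datum stream object". */
    if (ds->state == TOY_HAVE_AO_CONTENT)
        return ROW_ERROR;

    /* Reading the large-object block sets the pending state. */
    ds->state = TOY_HAVE_AO_CONTENT;

    /* Early return for an invisible row: the advance is skipped. */
    if (!visible)
        return ROW_INVISIBLE;

    toy_advance(ds);
    return ROW_OK;
}
```

In this model, sampling a visible row, then an invisible row, then any further row yields ROW_OK, ROW_INVISIBLE, ROW_ERROR: the invisible row leaves the stream in the pending state, and the following read trips over it.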
Reproduction
- Create an AOCS table with a TEXT/JSONB column containing large values (larger than the block size)
- Enable auto_stats:
SET gp_autostats_mode = 'on_change';
- INSERT/COPY data into the table
- The error occurs during auto_stats or a manual ANALYZE
Workaround
Disable auto_stats:
SET gp_autostats_mode = 'none';
Then run ANALYZE manually using the legacy method, or skip ANALYZE on the affected tables.
Suggested Fix
In aocs_gettuple_column(), call datumstreamread_advance() even for invisible rows to properly transition largeObjectState:
if (chkvisimap && !isSnapshotAny && !AppendOnlyVisimap_IsVisible(&scan->visibilityMap, &aotid))
{
    // Advance position for large objects to reset state
    if (ds->largeObjectState == DatumStreamLargeObjectState_HaveAoContent)
        datumstreamread_advance(ds);
    ret = false;
    goto out;
}
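To illustrate what the guarded advance changes, the following sketch walks a sample of rows with and without the fix and counts how often the "Advance not called" condition would be hit. It is a hypothetical two-state reduction written for this report: the Sketch* types and sketch_* functions do not exist in Greenplum, and unlike the real ANALYZE, which aborts on the first error, the walk resets the state and continues so the count is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-state reduction of DatumStreamLargeObjectState. */
typedef enum { SK_READY, SK_HAVE_AO_CONTENT } SketchState;

typedef struct {
    SketchState state;
} SketchStream;

/* Mirrors the guarded call in the suggested fix:
 * advance only when large-object content is pending. */
static void sketch_advance_if_pending(SketchStream *ds)
{
    if (ds->state == SK_HAVE_AO_CONTENT)
        ds->state = SK_READY;
}

/*
 * Walks nrows sample rows; visible[i] says whether row i passes the
 * visibility check. Returns how many rows hit the error condition.
 */
static size_t sketch_count_errors(const bool *visible, size_t nrows, bool apply_fix)
{
    SketchStream ds = { SK_READY };
    size_t errors = 0;

    assert(visible != NULL || nrows == 0);

    for (size_t i = 0; i < nrows; i++)
    {
        if (ds.state == SK_HAVE_AO_CONTENT)
        {
            errors++;             /* the real code raises ERROR here */
            ds.state = SK_READY;  /* reset only so the walk can continue */
        }

        ds.state = SK_HAVE_AO_CONTENT;  /* large-object block read for row i */

        if (!visible[i])
        {
            if (apply_fix)
                sketch_advance_if_pending(&ds);
            continue;                   /* models `goto out` */
        }

        sketch_advance_if_pending(&ds); /* normal path for visible rows */
    }
    return errors;
}
```

With visibility flags {true, false, true, false, false, true}, the model reports three error hits without the fix (one for each row that follows an invisible row) and none with it.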
Affected Tables
Tables with:
- appendonly=true, orientation=column
- TEXT, JSONB, or other varlena columns with large values (BLOBs)
Stack Trace
acquire_sample_rows -> analyze_rel -> vacuum -> auto_stats
datumstreamread_nthlarge (datumstream.c:276)
Related Files
src/backend/access/aocs/aocsam.c - aocs_gettuple_column(), aocs_gettuple()
src/backend/utils/datumstream/datumstream.c - datumstreamread_nthlarge() (line 276)
src/include/utils/datumstream.h - DatumStreamLargeObjectState enum
Status in other branches
Bug is NOT fixed in any branch (checked 2025-01-21):
origin/master - NOT FIXED (same goto out pattern)
origin/OPENGPDB_STABLE - NOT FIXED
origin/OPENGPDB_6_29_STABLE - NOT FIXED
The problematic goto out without calling datumstreamread_advance() exists in all branches.