Skip to content

[Bug] VACUUM and VACUUM FULL appear to be ineffective on PAX tables. #1766

@ryapandt

Description

@ryapandt

Apache Cloudberry version

2.1

What happened

After inserting a large amount of data into a PAX table and deleting most of the rows, running either VACUUM or VACUUM FULL does not reduce the table size / disk usage.

What you think should happen instead

  • VACUUM should reclaim or compact storage occupied by deleted rows, if supported.
  • VACUUM FULL should reclaim the space more aggressively, likely by rewriting/compacting the table.
  • If VACUUM / VACUUM FULL is currently unsupported for PAX tables, Cloudberry should return a clear warning/error instead of silently doing nothing.

How to reproduce

DROP TABLE IF EXISTS pax_vacuum_repro;

CREATE TABLE pax_vacuum_repro (
    id int,
    payload text
) USING PAX
DISTRIBUTED BY (id);

INSERT INTO pax_vacuum_repro
SELECT
    g,
    repeat(md5(g::text), 50)
FROM generate_series(1, 5000000) AS g;


-- Check initial row count and size.
SELECT count(*) FROM pax_vacuum_repro;
SELECT pg_size_pretty(pg_total_relation_size('pax_vacuum_repro')) AS size_before_delete;>

-- Delete most rows.
DELETE FROM pax_vacuum_repro
WHERE id <= 4500000;

SELECT count(*) FROM pax_vacuum_repro;
SELECT pg_size_pretty(pg_total_relation_size('pax_vacuum_repro')) AS size_after_delete;

analyze pax_vacuum_repro;
-- Try normal VACUUM.
VACUUM pax_vacuum_repro;
analyze pax_vacuum_repro;

SELECT pg_size_pretty(pg_total_relation_size('pax_vacuum_repro')) AS size_after_vacuum;


-- Try VACUUM FULL.
VACUUM FULL pax_vacuum_repro;
analyze pax_vacuum_repro;
SELECT pg_size_pretty(pg_total_relation_size('pax_vacuum_repro')) AS size_after_vacuum_full;

Operating System

rocky 9.6

Anything else

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    type: BugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions