feat(query): support experimental table branch #19551
zhyass wants to merge 14 commits into databendlabs:main from
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d9a61930b6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
drmingdrmer left a comment
@drmingdrmer made 2 comments and resolved 1 discussion.
Reviewable status: 0 of 98 files reviewed, 5 unresolved discussions (waiting on dantengsky, zhang2014, and zhyass).
src/meta/api/src/api_impl/ref_api.rs line 372 at r16 (raw file):
        }
    };
    let key_staged_branch = StagedBranchIdent::new(req.table_id, branch_id);
What does "staged" mean here? Please add a comment explaining its purpose.
Also, if the staged-branch key contains the branch_id we just created, key_staged_branch can never already be present, because no one else knows about this branch_id, unless a create-branch transaction has completed and is being retried. If that case is forbidden, there should be another mechanism to prevent it.
src/meta/api/src/api_impl/ref_api.rs line 421 at r16 (raw file):
    return Ok(CreateTableBranchReply {
        branch_id,
        auto_increment_start_vals,
It's a little odd to return auto-increment values here. Why does the caller need them?
Code quote:
    let mut auto_increment_start_vals = BTreeMap::new();
    for table_field in seq_source_table_meta.data.schema.fields() {
        let Some(auto_increment_expr) = table_field.auto_increment_expr() else {
            continue;
        };
        let source_ai_key =
            AutoIncrementKey::new(req.source_table_id, table_field.column_id());
        let source_ai_ident =
            AutoIncrementStorageIdent::new_generic(&req.tenant, source_ai_key);
        let start_value = match self.get_pb(&source_ai_ident).await? {
            Some(seq_v) => seq_v.data.into_inner().0,
            None => auto_increment_expr.start,
        };
        auto_increment_start_vals.insert(table_field.column_id(), start_value);
    }
    return Ok(CreateTableBranchReply {
        branch_id,
        auto_increment_start_vals,
Codex Review: Didn't find any major issues. 👍
drmingdrmer left a comment
@drmingdrmer partially reviewed 1 file, made 2 comments, and resolved 4 discussions.
Reviewable status: 1 of 101 files reviewed, 2 unresolved discussions (waiting on dantengsky, zhang2014, and zhyass).
src/meta/api/src/api_impl/ref_api.rs line 320 at r24 (raw file):
    let key_table_id = TableId {
        table_id: req.base_table_id,
    };
What is the difference between these two table ids?
Code quote:
    let key_source_table_id = TableId {
        table_id: source_table_id,
    };
    let key_table_id = TableId {
        table_id: req.base_table_id,
    };
src/meta/api/src/api_impl/ref_api.rs line 626 at r24 (raw file):
            UnknownTable::new(&req.name_ident.table_name, "get_table_branch"),
        )));
    }
This method should not include the logic for resolving a table id. The table id should be a field of the req. Getting the table id and then getting (table_id, branch_name) should be two separate phases for the caller.
The same applies to all the other methods. This is RefApi, so it should assume the input req contains a table_id, and every method should focus on operating on that table_id. How the table_id is obtained should not be part of these methods.
Code quote:
    let tenant_dbname = req.name_ident.db_name_ident();
    let (seq_db_id, _db_meta) = get_db_or_err(
        self,
        &tenant_dbname,
        format!("{}: {}", "get_table_branch", tenant_dbname.display()),
    )
    .await?;
    let db_id = *seq_db_id.data;
    let dbid_tbname = DBIdTableName {
        db_id,
        table_name: req.name_ident.table_name.clone(),
    };
    // Branch names are resolved under the *currently visible* base table id of `db.table`.
    // This is intentional: branches are treated as refs of the current base table namespace,
    // not as independently addressable objects. If the base table is dropped or replaced and
    // `db.table` points to a new table id, previously created branches under the old base
    // table id become unreachable by design.
    let (tb_id_seq, table_id) = get_u64_value(self, &dbid_tbname).await?;
    if tb_id_seq == 0 {
        return Err(KVAppError::AppError(AppError::UnknownTable(
            UnknownTable::new(&req.name_ident.table_name, "get_table_branch"),
        )));
    }
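The two-phase shape requested in the line 626 thread could look roughly like the sketch below. Everything here is hypothetical and in-memory: `Meta`, `GetTableBranchReq`, `MetaError`, and `resolve_table_id` are illustrative names, not the types in ref_api.rs. The point is only that name resolution happens before the ref API is called, and the ref API sees a table_id.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the meta-service (not the real API).
#[derive(Default)]
struct Meta {
    /// (db_name, table_name) -> table_id
    table_ids: HashMap<(String, String), u64>,
    /// (table_id, branch_name) -> branch_id
    branches: HashMap<(u64, String), u64>,
}

/// Hypothetical request: it carries an already resolved table_id, not a name.
struct GetTableBranchReq {
    table_id: u64,
    branch_name: String,
}

#[derive(Debug, PartialEq)]
enum MetaError {
    UnknownTable(String),
    UnknownBranch(String),
}

impl Meta {
    /// Phase 1: name resolution, done by the caller before touching the ref API.
    fn resolve_table_id(&self, db: &str, table: &str) -> Result<u64, MetaError> {
        self.table_ids
            .get(&(db.to_string(), table.to_string()))
            .copied()
            .ok_or_else(|| MetaError::UnknownTable(table.to_string()))
    }

    /// Phase 2: the ref API operates purely on (table_id, branch_name).
    fn get_table_branch(&self, req: &GetTableBranchReq) -> Result<u64, MetaError> {
        self.branches
            .get(&(req.table_id, req.branch_name.clone()))
            .copied()
            .ok_or_else(|| MetaError::UnknownBranch(req.branch_name.clone()))
    }
}
```

A caller would first call `resolve_table_id("db", "t")`, then build a `GetTableBranchReq` from the returned id, so an unknown table and an unknown branch surface as separate errors from separate steps.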
drmingdrmer left a comment
@drmingdrmer reviewed 1 file and all commit messages, and made 2 comments.
Reviewable status: 1 of 102 files reviewed, 4 unresolved discussions (waiting on dantengsky, zhang2014, and zhyass).
src/meta/api/src/api_impl/ref_api.rs line 868 at r26 (raw file):
            ),
        ))));
    }
It is inappropriate to return an error when more than one dropped record is a candidate for undrop.
Code quote:
    if retained_entries.len() > 1 {
        let ids: Vec<String> = retained_entries
            .iter()
            .map(|(k, _)| k.branch_id.to_string())
            .collect();
        return Err(KVAppError::AppError(AppError::from(UnknownReference::new(
            format!(
                "Multiple dropped branches with name '{}' found (ids: {}), use UNDROP BRANCH ... IDENTIFIER(<branch_id>) to specify",
                branch_name,
                ids.join(", ")
            ),
        ))));
    }
src/meta/api/src/api_impl/ref_api.rs line 934 at r26 (raw file):
            )),
        )));
    }
To assert absence, just add the condition seq == 0. There is no need to fetch the value.
Code quote:
    let seq_branch = self.get_pb(&key_branch).await?;
    if seq_branch.is_some() {
        return Err(KVAppError::AppError(AppError::from(
            ReferenceAlreadyExists::new(format!(
                "Branch '{}' already exists, cannot undrop",
                branch_name
            )),
        )));
    }
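The seq == 0 suggestion in the line 934 thread can be illustrated with a toy store. This is a sketch, not the meta-service API: `SeqKv`, `seq_of`, and `transact` are hypothetical names, and it assumes, as the quoted `tb_id_seq == 0` check elsewhere in this review suggests, that a sequence number of 0 denotes an absent key. The write is guarded by the condition itself, so no separate read is needed and there is no window between the check and the write.

```rust
use std::collections::HashMap;

/// Toy KV store (hypothetical): every stored key has a sequence number >= 1,
/// and a sequence number of 0 means the key is absent.
#[derive(Default)]
struct SeqKv {
    data: HashMap<String, (u64, String)>,
    next_seq: u64,
}

impl SeqKv {
    /// Current sequence number of `key`, or 0 if the key does not exist.
    fn seq_of(&self, key: &str) -> u64 {
        self.data.get(key).map(|(seq, _)| *seq).unwrap_or(0)
    }

    /// Apply all `puts` only if every `(key, expected_seq)` condition holds.
    /// Returns false, changing nothing, when any condition fails.
    fn transact(&mut self, conditions: &[(&str, u64)], puts: &[(&str, &str)]) -> bool {
        if conditions.iter().any(|(k, expected)| self.seq_of(k) != *expected) {
            return false;
        }
        for (k, v) in puts {
            self.next_seq += 1;
            self.data.insert(k.to_string(), (self.next_seq, v.to_string()));
        }
        true
    }
}
```

Passing `("branch/7/dev", 0)` as the condition makes the first create succeed and any later create of the same key fail, which is the "assert absence" behavior without a preceding get.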
drmingdrmer left a comment
@drmingdrmer reviewed 4 files and all commit messages, and resolved 4 discussions.
Reviewable status: 5 of 102 files reviewed, all discussions resolved (waiting on dantengsky and zhang2014).
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
This PR implements table branches for FUSE tables, including branch creation, branch-qualified reads and writes, branch lifecycle metadata, and branch-aware garbage collection. It also extends vacuum2 and virtual-column vacuum so historical files referenced by active branches or tags remain protected until they are no longer reachable.
A branch is a lightweight, writable fork of a table's snapshot history — it shares the base table's storage (segments/blocks) via copy-on-write semantics and maintains its own independent snapshot chain.
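As a rough mental model of the copy-on-write behavior described above, the sketch below forks a ref by copying only a pointer to the head snapshot. It is an illustration under stated assumptions, not the FUSE engine's data structures: `Snapshot`, `TableRef`, `fork`, and `append` are hypothetical names, and segments are reduced to shared string ids.

```rust
use std::sync::Arc;

/// One table version: the list of segments it references (hypothetical model).
/// Segments are immutable and shared by pointer, never copied.
#[derive(Debug, Clone)]
struct Snapshot {
    segments: Vec<Arc<str>>,
    prev: Option<Arc<Snapshot>>,
}

/// A base table or a branch: a pointer to the head of its own snapshot chain.
#[derive(Debug, Clone)]
struct TableRef {
    head: Arc<Snapshot>,
}

impl TableRef {
    fn new() -> Self {
        TableRef { head: Arc::new(Snapshot { segments: Vec::new(), prev: None }) }
    }

    /// Forking copies only the head pointer; no segment data is duplicated.
    fn fork(&self) -> TableRef {
        TableRef { head: Arc::clone(&self.head) }
    }

    /// A write creates a new snapshot that reuses the existing segments and
    /// adds one more; other refs keep pointing at their own heads.
    fn append(&mut self, segment: &str) {
        let mut segments = self.head.segments.clone(); // clones pointers only
        segments.push(Arc::from(segment));
        self.head = Arc::new(Snapshot { segments, prev: Some(Arc::clone(&self.head)) });
    }

    fn segment_names(&self) -> Vec<&str> {
        self.head.segments.iter().map(|s| &**s).collect()
    }

    /// Number of snapshots in this ref's chain, including the head.
    fn chain_len(&self) -> usize {
        let mut n = 1;
        let mut cur = &self.head;
        while let Some(p) = &cur.prev {
            n += 1;
            cur = p;
        }
        n
    }
}
```

After forking, a write on the branch leaves the base unchanged, while both still point at the same underlying segment for the shared history.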
Syntax
Implementation Details
KV Metadata Model
Branches are persisted as explicit KV entries in meta-service:
- __fd_table_branch/<table_id>/<branch_name>: TableBranch { expire_at, branch_id }
- __fd_dropped_branch/<table_id>/<branch_name>/<branch_id>: DroppedBranchMeta { drop_on, expire_at }
- __fd_table_by_id/<branch_id>: TableMeta
- __fd_branch_id_to_name/<branch_id>: TableIdBranchName { table_id, branch_name }
Each branch gets its own table_id (allocated via fetch_id) and its own storage prefix (<db_id>/<branch_table_id>/). The branch's TableMeta.options records OPT_KEY_BASE_TABLE_ID and OPT_KEY_REFERENCED_BRANCH_IDS for cross-table GC protection.
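To make the key layout concrete, here is a small sketch that assembles the four key shapes and the storage prefix as plain strings. The function names are hypothetical, and the real meta-service key encoding may escape or serialize components differently; this only mirrors the layout as written in the description.

```rust
/// Hypothetical key builders mirroring the layout in the PR description.
/// The real encoding may escape or serialize components differently.
fn table_branch_key(table_id: u64, branch_name: &str) -> String {
    format!("__fd_table_branch/{table_id}/{branch_name}")
}

fn dropped_branch_key(table_id: u64, branch_name: &str, branch_id: u64) -> String {
    format!("__fd_dropped_branch/{table_id}/{branch_name}/{branch_id}")
}

fn table_by_id_key(branch_id: u64) -> String {
    format!("__fd_table_by_id/{branch_id}")
}

fn branch_id_to_name_key(branch_id: u64) -> String {
    format!("__fd_branch_id_to_name/{branch_id}")
}

/// Storage prefix of a branch: <db_id>/<branch_table_id>/
fn branch_storage_prefix(db_id: u64, branch_table_id: u64) -> String {
    format!("{db_id}/{branch_table_id}/")
}
```

Because the dropped-branch key ends in the branch_id, several dropped branches can share one name under the same table, which is why the undrop path may see more than one candidate.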
Ref API
New meta-service APIs in ref_api.rs:
Vacuum2 Semantics
fuse_vacuum2() is now branch-aware. It:
This is especially important for branch chains such as base -> b1 -> b2, where data introduced on b1 must remain alive while b2 still references it.
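The base -> b1 -> b2 case can be sketched as a reachability check. This is not the fuse_vacuum2() implementation: `RefState`, `protected_files`, and `purgeable` are hypothetical names, and the model assumes each ref simply lists the files its retained snapshots still reference. A file is kept if any active ref references it, regardless of which ref introduced it.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical GC view of one ref (base table, branch, or tag).
struct RefState {
    /// False once the ref is dropped or expired.
    active: bool,
    /// Files reachable from the ref's retained snapshots.
    referenced_files: Vec<&'static str>,
}

/// Files vacuum must keep: the union of files referenced by any active ref.
fn protected_files(refs: &HashMap<&str, RefState>) -> HashSet<&'static str> {
    refs.values()
        .filter(|r| r.active)
        .flat_map(|r| r.referenced_files.iter().copied())
        .collect()
}

/// Files that no active ref references, in the order given by `all_files`.
fn purgeable(all_files: &[&'static str], refs: &HashMap<&str, RefState>) -> Vec<&'static str> {
    let keep = protected_files(refs);
    all_files.iter().copied().filter(|f| !keep.contains(f)).collect()
}
```

With b1 inactive but b2 still referencing a file introduced on b1, that file stays protected; it becomes purgeable only once b2 also stops referencing it or becomes inactive.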
Virtual Column Vacuum
Virtual-column cleanup is also extended to understand branch/tag reachability. Historical virtual-column files referenced only by an active branch or tag are preserved, and expired/dropped references stop protecting them once they become unreachable.
This keeps branch and tag semantics consistent across both snapshot data files and derived virtual-column artifacts.
Tests
Type of change