Skip to content

dissect: add --manifest switch to generate a UAPI.16 manifest file fr…#41990

Draft
poettering wants to merge 5 commits into
systemd:mainfrom
poettering:uapi16
Draft

dissect: add --manifest switch to generate a UAPI.16 manifest file fr…#41990
poettering wants to merge 5 commits into
systemd:mainfrom
poettering:uapi16

Conversation

@poettering

@poettering poettering commented May 8, 2026

Copy link
Copy Markdown
Member

…om a directory tree

The UAPI.16 spec is being discussed here:

uapi-group/specifications#213

@poettering

Copy link
Copy Markdown
Member Author

Do not merge until the spec is also good to be merged.

@github-actions

github-actions Bot commented May 8, 2026

Copy link
Copy Markdown

Claude review of PR #41990 (97448c7)

Must fix

  • Unused variable escapedsrc/dissect/dissect.c:1348 — declared but never used in manifest_print_item(), will fail with -Werror
  • S_ISDIR(sx->stx_mask) should be S_ISDIR(sx->stx_mode)src/dissect/dissect.c:1386 — uses wrong struct field for file type check, causing directories to incorrectly enter the hardlink processing path
  • fd leak on lseek failure in rm_rf_childrensrc/shared/rm-rf.c:313-315 — when REMOVE_REWIND lseek fails, returns -errno without closing the fd, violating the function's ownership contract
  • Non-UTF-8 filename aborts entire manifestsrc/dissect/dissect.c:1585 — a single non-UTF-8 name makes sd_json_buildo() fail with -EUCLEAN and aborts the whole recurse_dir() walk

Suggestions

  • NULL from inode_type_to_string()src/dissect/dissect.c:1397 — can return NULL for unknown inode types, producing a JSON null instead of a string
  • name_to_handle_at_try_fid hard failuresrc/dissect/dissect.c:1418 — non-fatal errors (e.g. -EOPNOTSUPP) abort the entire manifest, could gracefully skip hardlink detection instead
  • Missing log on sd_json_buildo failuresrc/dissect/dissect.c:1543 — error returned without logging, unlike the analogous call earlier in the file
  • stx_size accessed without STATX_SIZE mask checksrc/dissect/dissect.c:1543 — in the S_IFREG case, sx->stx_size is used after only checking STATX_TYPE, inconsistent with mtree_print_item which checks both
  • --mtree-hash flag reused for manifestsrc/dissect/dissect.c:1453 — the flag name and description reference mtree output only, but it now also controls manifest hashing
  • get_entry() helper has no failure guardtest/units/TEST-50-DISSECT.dissect.sh:1143 — if no entry matches, produces empty output leading to confusing assertion failures
  • --manifest on image only checks exit codetest/units/TEST-50-DISSECT.dissect.sh:1219 — output is discarded; consider verifying non-empty output or expected content
  • type vocabulary vs UAPI.16 specsrc/dissect/dissect.c:1578 — confirm inode_type_to_string() tokens (dir/lnk/chr/blk/fifo/sock) match the spec's type enum
  • Symlink target via contents/literalsrc/dissect/dissect.c:1588 — confirm the spec sanctions reusing the regular-file contents/literal mechanism for symlink targets
  • python3 bind unguarded under errexittest/units/TEST-50-DISSECT.dissect.sh:1125 — a bind failure aborts the whole test even though socket coverage is best-effort

Nits

  • Double blank linesrc/dissect/dissect.c:1380-1382 — extra blank line between break; and case S_IFLNK:
  • Misleading error messagesrc/dissect/dissect.c:1427 — says "Failed to symlink" but the operation is readlinkat_malloc on the hardlink database
  • Pointer star stylesrc/dissect/dissect.c:1383const char* type should be const char *type
  • Operator spacingsrc/dissect/dissect.c:1475colon +1 should be colon + 1
  • Hardcoded SHA256 for empty filetest/units/TEST-50-DISSECT.dissect.sh:1162 — could compute dynamically like done for regular.txt
  • Hardcoded symlink sizetest/units/TEST-50-DISSECT.dissect.sh:1179"11" is strlen of target, would be more maintainable computed
  • Missing comma in man pageman/systemd-dissect.xml:272 — "Like --mtree this" should be "Like --mtree, this"
  • FID error message missing pathsrc/dissect/dissect.c:1395 — doesn't include the file path for debugging
  • Cycle avoidance startswith too broadsrc/dissect/dissect.c:1525startswith(path, "Uapi16Manifest.") also skips non-manifest files
  • Redundant db_path guardsrc/dissect/dissect.c:1358rmdir_and_free(NULL) is a no-op, the if (hd->db_path) check is unnecessary
  • Possibly dead spec URLman/systemd-dissect.xml:267 — the UAPI.16 spec is still an unmerged proposal, the linked URL may 404
  • JSON-Lines conversion assumes single-line recordstest/units/TEST-50-DISSECT.dissect.sh:1146tr/head -n1 break if the dump ever emits multi-line records

Workflow run

Comment thread src/dissect/dissect.c Outdated
Comment thread src/dissect/dissect.c Outdated
Comment thread src/dissect/dissect.c
@bluca bluca added ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR labels May 8, 2026
@github-actions github-actions Bot removed the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label May 10, 2026
Comment thread src/dissect/dissect.c Outdated
Comment thread src/dissect/dissect.c Outdated
Comment thread src/shared/rm-rf.c Outdated
Comment thread src/dissect/dissect.c
Comment thread src/dissect/dissect.c Outdated
Comment thread src/dissect/dissect.c Outdated
Comment thread src/dissect/dissect.c Outdated
Comment thread src/dissect/dissect.c Outdated
Comment thread src/dissect/dissect.c
Comment thread test/units/TEST-50-DISSECT.dissect.sh
Comment thread test/units/TEST-50-DISSECT.dissect.sh
Comment thread test/units/TEST-50-DISSECT.dissect.sh
Comment thread man/systemd-dissect.xml Outdated
Comment thread src/dissect/dissect.c
Comment thread src/dissect/dissect.c
Comment thread test/units/TEST-50-DISSECT.dissect.sh
@yuwata

yuwata commented May 26, 2026

Copy link
Copy Markdown
Member

Making this as draft, as this has the do-not-merge tag.

@yuwata yuwata marked this pull request as draft May 26, 2026 19:00
@yuwata yuwata added needs-rebase and removed please-review PR is ready for (re-)review by a maintainer labels May 26, 2026
Comment thread src/dissect/dissect.c
r = sd_json_buildo(
&j,
SD_JSON_BUILD_PAIR_CONDITION(is_root, "mediaType", JSON_BUILD_CONST_STRING("application/vnd.uapi.16.manifest")),
SD_JSON_BUILD_PAIR_CONDITION(!is_root, "name", SD_JSON_BUILD_STRING(path)),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: must-fix: The name field is built from the raw filename via SD_JSON_BUILD_STRING(path), but sd_json_variant_new_stringn() rejects any string that is not valid UTF-8 with -EUCLEAN (src/libsystemd/sd-json/sd-json.c:424). Linux filenames are arbitrary byte sequences and the image / directory tree may be untrusted, so a single non-UTF-8 (or mojibake) name anywhere in the tree makes sd_json_buildo() fail, manifest_print_item() return the error, and recurse_dir() abort — the entire --manifest run dies with "Failed to build JSON object" on the first bad name. The --mtree path tolerates such names (it prints them raw / escapes the symlink target). Consider validating or escaping the name (or skipping with a warning) so an untrusted image cannot abort manifest generation. The same applies if any intermediate path component is non-UTF-8.

Comment thread src/dissect/dissect.c
}

/* Suppress the type on the top-level entry and on regular files. */
type = !is_root && !S_ISREG(sx->stx_mode) ? inode_type_to_string(sx->stx_mode) : NULL;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: The type value comes from inode_type_to_string(), which yields the systemd-internal tokens dir/lnk/chr/blk/fifo/sock (note these differ from the --mtree vocabulary dir/file/link/block/char/socket). Please confirm these exact strings match the type enum defined by the UAPI.16 Manifest Format spec (uapi-group/specifications#213) — a manifest using a different vocabulary than the spec will not validate. The CI test in the last commit hardcodes the same tokens, so it would pass even if they diverged from the spec.

Comment thread src/dissect/dissect.c
SD_JSON_BUILD_PAIR_CONDITION(!is_root, "name", SD_JSON_BUILD_STRING(path)),
SD_JSON_BUILD_PAIR_CONDITION(!!type, "type", SD_JSON_BUILD_STRING(type)),
SD_JSON_BUILD_PAIR_CONDITION(have_size, "size", SD_JSON_BUILD_UNSIGNED(size)),
SD_JSON_BUILD_PAIR_CONDITION(!!data, "contents", SD_JSON_BUILD_ARRAY(SD_JSON_BUILD_OBJECT(SD_JSON_BUILD_PAIR("literal", SD_JSON_BUILD_BASE64(data, size))))),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: For symlinks the target is emitted as contents: [{ "literal": base64(target) }]. (The base64 size argument is correct here — it is strlen(target) set in the S_IFLNK branch, not the stale UINT64_MAX default, and contents is gated on !!data so the two cases are mutually exclusive.) The open conformance question is whether the UAPI.16 spec represents a symlink target through the same contents/literal mechanism it uses for regular-file contents, or via a dedicated field. Please confirm the spec sanctions reusing contents/literal for symlink targets.

Comment thread src/dissect/dissect.c
(void) rm_rf_children(TAKE_FD(hd->db_fd), REMOVE_PHYSICAL|REMOVE_REWIND, /* root_dev= */ NULL);

if (hd->db_path)
hd->db_path = rmdir_and_free(hd->db_path);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: rmdir_and_free(NULL) is a no-op that returns NULL, so the if (hd->db_path) guard is redundant. The surrounding systemd idiom is to call such freers unconditionally: hd->db_path = rmdir_and_free(hd->db_path);

Comment thread man/systemd-dissect.xml
<term><option>--manifest</option></term>

<listitem><para>Generates a file manifest of the specified disk image or directory, following the
<ulink url="https://uapi-group.org/specifications/specs/manifest_format/">UAPI.16 Manifest

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: This links to https://uapi-group.org/specifications/specs/manifest_format/, but the UAPI.16 Manifest Format spec is still an open, unmerged proposal (uapi-group/specifications#213, and this PR carries a do-not-merge label). That URL likely does not resolve until the spec PR merges. Consider pointing at the spec PR until it is published, or holding the docs until the canonical URL exists, so the man page does not ship a dead link.

# Unix socket — bind() it with python3 if available (process exits, file persists)
HAVE_SOCKET=0
if command -v python3 >/dev/null 2>&1; then
python3 -c "import socket,sys; s=socket.socket(socket.AF_UNIX); s.bind(sys.argv[1])" "$MANIFEST_ROOT/socket"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: The socket coverage is gated only on command -v python3, but the bind() itself runs unguarded under errexit. If python3 is present but the bind fails for any reason (sandboxed environment, AF_UNIX restricted, an unexpected error), the whole test aborts even though the socket case is meant to be best-effort. Consider folding the bind into the guard so a failure only leaves HAVE_SOCKET=0, e.g. if command -v python3 >/dev/null 2>&1 && python3 -c '...' "$MANIFEST_ROOT/socket"; then HAVE_SOCKET=1; fi.

Comment on lines +1135 to +1146
# Capture the JSON-seq stream and convert to JSON-Lines for jq filtering. Each
# record is already on its own line in compact form, so stripping the RS (0x1E)
# byte gives us a valid JSON-Lines stream.
systemd-dissect --manifest "$MANIFEST_ROOT" >"$MANIFEST_ROOT.seq"
tr -d '\036' <"$MANIFEST_ROOT.seq" >"$MANIFEST_ROOT.jsonl"
test -s "$MANIFEST_ROOT.jsonl"

# Helper: emit the entry whose "name" matches $1 (root entry has no name)
get_entry() { jq -c --arg n "$1" 'select(.name == $n)' <"$MANIFEST_ROOT.jsonl"; }

# Root entry is the first record, has mediaType and no name
head -n1 "$MANIFEST_ROOT.jsonl" | \

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The tr -d '\036' -> JSON-Lines conversion and the head -n1 root-record extraction both rely on each manifest record being emitted as a single compact line. That holds only because the implementation dumps with no PRETTY flag; if the manifest output ever became multi-line, head -n1 would no longer capture the full root record (jq reading the whole stream would still cope, but this one root assertion would not). Selecting the root record with jq over the full stream (e.g. jq -s '.[0]' over the stripped output) would be more robust than head -n1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants