Skip to content

Improve must-gather image handling and annotation checks#2071

Open
midu16 wants to merge 9 commits intoopenshift:mainfrom
midu16:master
Open

Improve must-gather image handling and annotation checks#2071
midu16 wants to merge 9 commits intoopenshift:mainfrom
midu16:master

Conversation

@midu16
Copy link
Contributor

@midu16 midu16 commented Aug 7, 2025

Pull Request: Improve must-gather image handling and annotation checks

Summary

This PR enhances the MustGatherOptions.completeImages() method by:

  • Logging warnings for CSVs that are missing the required must-gather-image annotation.
  • Improving robustness and readability of the image collection logic.
  • Ensuring deduplication of default must-gather image when collecting plug-in images.

Changes

  • Moved CSV annotation scanning inline to allow per-CSV warning messages.

  • Added warning log in the format:

    WARNING: CSV operator <name> doesn't have the must-gather-image annotation.
    
  • Enhanced image resolution logic to:

    • Fallback to hardcoded must-gather image if the image stream tag cannot be resolved.
    • Log warnings for such fallbacks.
  • Maintained support for --all (aka AllImages) by collecting images from both:

    • CSVs (ClusterServiceVersions)
    • ClusterOperators
  • Ensured duplicate images (especially the default one) are not added to the final list.

Testing

  • Verified that missing annotations on CSVs produce proper warning output.
  • Confirmed image collection and deduplication works as expected.
  • Manually validated fallback logic for missing default image stream tag.

Related

  • Annotation key used: operators.openshift.io/must-gather-image
  • Fulfills a need for better diagnostics and operator plug-in visibility in oc adm must-gather --all-images.

Notes

  • Future enhancement may consider moving CSV image collection into a separate method again, but returning both:

    • pluginImages
    • []string of CSVs missing the annotation for external logging.

- Updated completeImages() to log a warning when a ClusterServiceVersion (CSV)
  is missing the "operators.openshift.io/must-gather-image" annotation.
- Inlined CSV processing logic to allow detailed per-CSV warning logging.
- Improved error handling and logging for image stream tag resolution.
- Ensured deduplication of default must-gather image from plug-in list.
- Maintained compatibility with --all-images and fallback logic.
@openshift-ci openshift-ci bot requested review from atiratree and deads2k August 7, 2025 16:01
@coderabbitai
Copy link

coderabbitai bot commented Sep 17, 2025

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

When --all-images is set (or no images provided), the must-gather flow resolves the default must-gather image, lists CSVs and ClusterOperators to collect mgAnnotation values, warns for CSVs missing the annotation, removes the default image from discovered results to avoid duplication, and appends discovered images to o.Images.

Changes

Cohort / File(s) Summary
Must-gather image discovery and defaults
pkg/cli/admin/mustgather/mustgather.go
Ensures default must-gather image is resolved when no images or --all-images is set; when --all-images is active, lists CSVs and ClusterOperators to gather mgAnnotation values, logs warnings for CSVs without the annotation, wraps list errors with contextual messages (failed to list CSVs: %v, failed to list ClusterOperators: %v), removes the resolved default from discovered plugins to avoid duplicates, appends discovered images to o.Images, and updates logging/formatting.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly summarizes the primary change—improvements to must-gather image handling and annotation checks—and directly reflects the modifications in MustGatherOptions.completeImages(), including per-CSV warnings, inline annotation scanning, improved --all-images discovery, deduplication, and fallback image resolution.
Description Check ✅ Passed The pull request description is directly relevant to the changeset and clearly documents the intent, key code changes, testing performed, and the annotation key used; it gives reviewers sufficient context about per-CSV warnings, deduplication behavior, and fallback image logic.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
pkg/cli/admin/mustgather/mustgather.go (2)

273-276: Make fallback explicit and avoid double newline in log.

Current o.log("%v\n", err) adds two newlines and doesn’t state that a fallback is used.

Apply this diff:

-        if image, err = o.resolveImageStreamTag("openshift", "must-gather", "latest"); err != nil {
-            o.log("%v\n", err)
-            image = "registry.redhat.io/openshift4/ose-must-gather:latest"
-        }
+        if image, err = o.resolveImageStreamTag("openshift", "must-gather", "latest"); err != nil {
+            fallback := "registry.redhat.io/openshift4/ose-must-gather:latest"
+            o.log("WARNING: unable to resolve imagestream openshift/must-gather:latest: %v; falling back to %s", err, fallback)
+            image = fallback
+        }

309-314: Dedup across equivalent pullspecs, not just exact string match.

If a CSV annotates the default image via a different but equivalent reference (e.g., tag vs digest), delete(pluginImages, o.Images[0]) won’t dedupe it.

Consider normalizing with imagereference.Parse and comparing Exact() (or repository+ID) before appending, e.g., parse o.Images[0] once and skip plugin entries that resolve to the same canonical ref.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between fa1bc38 and 4358b67.

📒 Files selected for processing (1)
  • pkg/cli/admin/mustgather/mustgather.go (3 hunks)
🔇 Additional comments (1)
pkg/cli/admin/mustgather/mustgather.go (1)

315-317: Confirm intended UX for duplicate “Using must-gather plug-in image” lines.

This now prints once during Complete() to stdout and again into must-gather.logs in Run(). If that’s unintentional, drop the Complete() log.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
pkg/cli/admin/mustgather/mustgather.go (3)

269-278: Avoid double newline and log explicit fallback

o.log("%v\n", err) prints two newlines because o.log appends one. Also add an explicit fallback warning for clarity.

Apply this diff:

-        if image, err = o.resolveImageStreamTag("openshift", "must-gather", "latest"); err != nil {
-            o.log("%v\n", err)
-            image = "registry.redhat.io/openshift4/ose-must-gather:latest"
-        }
+        if image, err = o.resolveImageStreamTag("openshift", "must-gather", "latest"); err != nil {
+            o.log("WARNING: %v", err)
+            fallback := "registry.redhat.io/openshift4/ose-must-gather:latest"
+            o.log("WARNING: falling back to default must-gather image %q", fallback)
+            image = fallback
+        }

Please confirm the intended behavior is to always include the default image when --all-images is set (then add discovered plug-ins), which matches the PR summary.


294-301: Include namespace in per-CSV warning

CSV names can collide across namespaces. Log the fully-qualified ns/name for disambiguation.

Apply this diff:

-        for _, u := range uList.Items {
-            ann := u.GetAnnotations()
-            if v, ok := ann[mgAnnotation]; ok {
-                pluginImages[v] = struct{}{}
-            } else {
-                o.log("WARNING: CSV operator %s doesn't have the must-gather-image annotation.", u.GetName())
-            }
-        }
+        for _, u := range uList.Items {
+            ann := u.GetAnnotations()
+            if v, ok := ann[mgAnnotation]; ok {
+                pluginImages[v] = struct{}{}
+            } else {
+                name := u.GetName()
+                if ns := u.GetNamespace(); ns != "" {
+                    name = ns + "/" + name
+                }
+                o.log("WARNING: CSV operator %s doesn't have the must-gather-image annotation.", name)
+            }
+        }

315-318: Deterministic ordering for appended plug-in images

Appending from a map yields nondeterministic order, which affects logs and pod creation order. Sort for stable output.

Apply this diff:

-        delete(pluginImages, o.Images[0])
-        for i := range pluginImages {
-            o.Images = append(o.Images, i)
-        }
+        delete(pluginImages, o.Images[0])
+        if len(pluginImages) > 0 {
+            extras := make([]string, 0, len(pluginImages))
+            for img := range pluginImages {
+                extras = append(extras, img)
+            }
+            sort.Strings(extras)
+            o.Images = append(o.Images, extras...)
+        }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 4358b67 and 3845d4b.

📒 Files selected for processing (1)
  • pkg/cli/admin/mustgather/mustgather.go (2 hunks)
🔇 Additional comments (4)
pkg/cli/admin/mustgather/mustgather.go (4)

261-268: LGTM: explicit imagestream resolution

No issues spotted in resolving user-provided imagestreams.


284-293: CSV listing via DynamicClient: good

Switch to DynamicClient looks correct; error handling message is clear.


303-307: LGTM: ClusterOperators path and error message

Error message is specific and consistent with the CSV case.


325-345: Remove now-unused annotatedCSVs() helper — safe to delete.
No call sites found; delete the function in pkg/cli/admin/mustgather/mustgather.go (lines ~325–345) or rewire it if it was intended to be used.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: midu16
Once this PR has been reviewed and has the lgtm label, please assign ingvagabund for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

midu16 added a commit to midu16/oc that referenced this pull request Sep 17, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
pkg/cli/admin/mustgather/mustgather.go (1)

284-296: Fix compile error: OperatorClient is not a field; use DynamicClient to list CSVs

MustGatherOptions has no OperatorClient. Use the already-initialized DynamicClient to list CSVs and keep per-CSV warnings.

Apply:

-        // Annotated CSVs
-        csvs, err := o.OperatorClient.OperatorsV1alpha1().ClusterServiceVersions("").List(context.TODO(), metav1.ListOptions{})
-        if err != nil {
-            return fmt.Errorf("failed to list CSVs: %v", err)
-        }
-        for _, csv := range csvs.Items {
-            ann := csv.GetAnnotations()
-            if v, ok := ann[mgAnnotation]; ok {
-                pluginImages[v] = struct{}{}
-            } else {
-                o.log("WARNING: CSV operator %s doesn't have the must-gather-image annotation.", csv.GetName())
-            }
-        }
+        // Annotated CSVs (via dynamic client)
+        csvGVR := schema.GroupVersionResource{
+            Group:    "operators.coreos.com",
+            Version:  "v1alpha1",
+            Resource: "clusterserviceversions",
+        }
+        uList, err := o.DynamicClient.Resource(csvGVR).Namespace(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
+        if err != nil {
+            return fmt.Errorf("failed to list CSVs: %w", err)
+        }
+        for _, u := range uList.Items {
+            ann := u.GetAnnotations()
+            if v, ok := ann[mgAnnotation]; ok {
+                pluginImages[v] = struct{}{}
+            } else {
+                o.log("WARNING: CSV operator %s doesn't have the must-gather-image annotation.", u.GetName())
+            }
+        }
🧹 Nitpick comments (3)
pkg/cli/admin/mustgather/mustgather.go (3)

269-278: Emit explicit fallback WARNING and avoid double newline in log

  • o.log already appends a newline; passing "%v\n" yields double newlines.
  • Per PR goals, the fallback should clearly log a WARNING that a hardcoded image is used.

Apply:

-        if image, err = o.resolveImageStreamTag("openshift", "must-gather", "latest"); err != nil {
-            o.log("%v\n", err)
-            image = "registry.redhat.io/openshift4/ose-must-gather:latest"
-        }
+        if image, err = o.resolveImageStreamTag("openshift", "must-gather", "latest"); err != nil {
+            o.log("WARNING: %v", err)
+            image = "registry.redhat.io/openshift4/ose-must-gather:latest"
+            o.log("WARNING: falling back to default must-gather image %s", image)
+        }

309-314: Make plug-in image ordering deterministic; consider stronger dedup

  • Map iteration order is random; sort for reproducible CLI output and pod ordering.
  • Optional: normalize refs (tag vs digest) before dedup so the default isn’t re-added under a different form.

Apply:

-        for i := range pluginImages {
-            o.Images = append(o.Images, i)
-        }
+        imgs := make([]string, 0, len(pluginImages))
+        for img := range pluginImages {
+            imgs = append(imgs, img)
+        }
+        sort.Strings(imgs)
+        o.Images = append(o.Images, imgs...)

Optional follow-up (separate change): normalize references via imagereference.Parse and compare by digest (when present) to enhance dedup. Verify current CSV annotations’ formats across operators before changing behavior.


315-316: LGTM: summary log of selected images

The consolidated log line is helpful. Keep it; the extra blank line after isn’t needed but harmless.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 3845d4b and 8b0bde2.

📒 Files selected for processing (1)
  • pkg/cli/admin/mustgather/mustgather.go (3 hunks)
🔇 Additional comments (2)
pkg/cli/admin/mustgather/mustgather.go (2)

261-268: LGTM: explicit image-stream resolution path

Resolving user-specified imagestream tags up-front is correct and keeps error handling tight.


299-302: Wrap underlying error with %w for better context

Use error wrapping to preserve cause.

[ suggest_nitpick_refactor ]

-        if err != nil {
-            return fmt.Errorf("failed to list ClusterOperators: %v", err)
-        }
+        if err != nil {
+            return fmt.Errorf("failed to list ClusterOperators: %w", err)
+        }

midu16 added a commit to midu16/oc that referenced this pull request Nov 3, 2025
midu16 added a commit to midu16/oc that referenced this pull request Nov 3, 2025
@midu16
Copy link
Contributor Author

midu16 commented Nov 6, 2025

@ingvagabund wondering if you can assist on this review, intention here its to improve the logging in the scenario where a given operator doesnt have the must-gather-image annotation within its CSV.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 6, 2025
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 26, 2025
@shivprakashmuley
Copy link

shivprakashmuley commented Dec 6, 2025

Hi @midu16 , can you please check the CI job failures.
# github.com/openshift/oc/pkg/cli/admin/mustgather
pkg/cli/admin/mustgather/mustgather.go:290:23: undefined: csvs

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@pkg/cli/admin/mustgather/mustgather.go`:
- Around line 290-301: The undefined `csvs` bug is caused by moving CSV
iteration out of annotatedCSVs; restore per-CSV warning logic by changing
annotatedCSVs to accept the logger (e.g., add a parameter for o.log or a func
like logf) and move the loop that inspects csvs.Items and emits warnings using
mgAnnotation into annotatedCSVs; update annotatedCSVs' signature (e.g.,
annotatedCSVs(ctx context.Context, logf func(string, ...interface{}))
(map[string]struct{}, error)), implement the per-CSV warning there when an
annotation is missing, and then call it from mustgather.go as pluginImages, err
= o.annotatedCSVs(ctx, o.log) and remove the orphaned for-loop in mustgather.go;
also update any other callers to match the new signature.

Comment on lines 290 to 301
pluginImages, err = o.annotatedCSVs(ctx)
if err != nil {
return err
return fmt.Errorf("failed to list CSVs: %v", err)
}
for _, csv := range csvs.Items {
ann := csv.GetAnnotations()
if v, ok := ann[mgAnnotation]; ok {
pluginImages[v] = struct{}{}
} else {
o.log("WARNING: CSV operator %s doesn't have the must-gather-image annotation.", csv.GetName())
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Compile error: csvs is undefined in this scope.

The annotatedCSVs function returns only (map[string]struct{}, error), but lines 294-300 reference csvs.Items which is undefined here. This matches the CI failure reported in PR comments.

To preserve the per-CSV warning behavior, move the warning logic into annotatedCSVs by passing the log function:

🐛 Proposed fix

Update the annotatedCSVs signature and implementation:

-func (o *MustGatherOptions) annotatedCSVs(ctx context.Context) (map[string]struct{}, error) {
+func (o *MustGatherOptions) annotatedCSVs(ctx context.Context, warnMissing bool) (map[string]struct{}, error) {
 	csvGVR := schema.GroupVersionResource{
 		Group:    "operators.coreos.com",
 		Version:  "v1alpha1",
 		Resource: "clusterserviceversions",
 	}
 	pluginImages := make(map[string]struct{})

 	csvs, err := o.DynamicClient.Resource(csvGVR).List(ctx, metav1.ListOptions{})
 	if err != nil {
 		return nil, err
 	}

 	for _, item := range csvs.Items {
 		ann := item.GetAnnotations()
 		if v, ok := ann[mgAnnotation]; ok {
 			pluginImages[v] = struct{}{}
+		} else if warnMissing {
+			o.log("WARNING: CSV operator %s doesn't have the must-gather-image annotation.", item.GetName())
 		}
 	}
 	return pluginImages, nil
 }

Then update the call site and remove the orphaned loop:

-		pluginImages, err = o.annotatedCSVs(ctx)
+		pluginImages, err = o.annotatedCSVs(ctx, true)
 		if err != nil {
 			return fmt.Errorf("failed to list CSVs: %v", err)
 		}
-		for _, csv := range csvs.Items {
-			ann := csv.GetAnnotations()
-			if v, ok := ann[mgAnnotation]; ok {
-				pluginImages[v] = struct{}{}
-			} else {
-				o.log("WARNING: CSV operator %s doesn't have the must-gather-image annotation.", csv.GetName())
-			}
-		}
🤖 Prompt for AI Agents
In `@pkg/cli/admin/mustgather/mustgather.go` around lines 290 - 301, The undefined
`csvs` bug is caused by moving CSV iteration out of annotatedCSVs; restore
per-CSV warning logic by changing annotatedCSVs to accept the logger (e.g., add
a parameter for o.log or a func like logf) and move the loop that inspects
csvs.Items and emits warnings using mgAnnotation into annotatedCSVs; update
annotatedCSVs' signature (e.g., annotatedCSVs(ctx context.Context, logf
func(string, ...interface{})) (map[string]struct{}, error)), implement the
per-CSV warning there when an annotation is missing, and then call it from
mustgather.go as pluginImages, err = o.annotatedCSVs(ctx, o.log) and remove the
orphaned for-loop in mustgather.go; also update any other callers to match the
new signature.

- Remove redundant loop that referenced undefined csvs variable; annotatedCSVs
  already returns the plugin images map
- Add WARNING log in annotatedCSVs when a CSV operator lacks the
  must-gather-image annotation so users are aware

Co-authored-by: Cursor <cursoragent@cursor.com>
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 12, 2026

@midu16: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-metal-ipi-ovn-ipv6 8b0bde2 link false /test e2e-metal-ipi-ovn-ipv6
ci/prow/okd-scos-e2e-aws-ovn 8b0bde2 link false /test okd-scos-e2e-aws-ovn
ci/prow/e2e-aws-certrotation 8b0bde2 link false /test e2e-aws-certrotation
ci/prow/e2e-aws-ovn-serial-1of2 05dc42e link true /test e2e-aws-ovn-serial-1of2
ci/prow/e2e-agnostic-ovn-cmd 05dc42e link true /test e2e-agnostic-ovn-cmd

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants