ci: add weekly CI health check with Slack notification #3191
QuantumExplorer merged 11 commits into v3.1-dev
Runs every Monday at 8 UTC. Checks last run of all active workflows on v3.1-dev and posts to #platform-team Slack channel if any are red. Includes workflow_dispatch trigger for manual testing. Requires SLACK_CI_WEBHOOK_URL repo secret. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace gh api calls with gh workflow list / gh run list subcommands. Auto-detect default branch via gh repo view instead of hardcoding. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
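As a rough illustration of the subcommand-based approach described in this commit, the sketch below resolves the default branch and checks the latest run of each active workflow. The function names (`default_branch`, `latest_conclusion`, `failing_workflows`) are hypothetical and not taken from the actual workflow file, which is not shown in this thread.

```shell
# Hypothetical sketch; function names are illustrative, not from the real workflow.

# Resolve the default branch instead of hardcoding it.
default_branch() {
  gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name'
}

# Print the conclusion of the most recent run of one workflow on a branch.
# Prints "none" when the workflow has no runs on that branch.
latest_conclusion() {
  local workflow="$1" branch="$2"
  gh run list --workflow "$workflow" --branch "$branch" --limit 1 \
    --json conclusion --jq '.[0].conclusion // "none"' </dev/null
}

# Print the name of every active workflow whose latest run on the branch failed.
# `gh workflow list` omits disabled workflows unless --all is passed.
failing_workflows() {
  local branch="$1" name
  gh workflow list --json name --jq '.[].name' | while IFS= read -r name; do
    if [ "$(latest_conclusion "$name" "$branch")" = "failure" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

A caller would typically run `failing_workflows "$(default_branch)"` and treat empty output as "everything is green".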
Add top-level permissions block: actions:read + contents:read. This workflow only queries workflow runs — no write access needed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gh CLI needs a .git directory to infer the repository. Uses sparse-checkout to avoid downloading any actual code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Workflow-level conclusion can be 'failure' even when all jobs succeeded or were skipped. Now drills into individual jobs and only reports workflows with actual job failures. Slack message includes failed job names for context. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
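One way the job-level drill-down could look is sketched below, assuming the run ID is already known. The helper names `failed_jobs` and `has_real_failure` are hypothetical; only the `gh run view --json jobs` call and its `--jq` filter are standard `gh` features.

```shell
# Hypothetical sketch; helper names are illustrative, not from the real workflow.

# Print the names of jobs in a run that actually failed, one per line.
failed_jobs() {
  local run_id="$1"
  gh run view "$run_id" --json jobs \
    --jq '.jobs[] | select(.conclusion == "failure") | .name'
}

# Succeed only when at least one job in the run failed, so a workflow-level
# "failure" with no failed jobs is not reported.
has_real_failure() {
  [ -n "$(failed_jobs "$1")" ]
}
```

The output of `failed_jobs` can be reused directly as the list of job names included in the Slack message.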
This reverts commit e80ee53.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use $'\n' for real newlines in bash variables so jq --arg properly encodes them as \n in the JSON payload. Fixes literal \n showing in Slack and broken mrkdwn bold markers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
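The difference this fix relies on can be sketched as follows. The functions `build_message` and `build_payload` are hypothetical and the message layout is an assumption; the point is only that `$'\n'` yields a real newline, which `jq --arg` then encodes as `\n` in the JSON string.

```shell
# Hypothetical sketch; function names and message layout are assumptions.

# Build a multi-line Slack message: a bold header plus one line per failure.
# $'\n' is a real newline; "\n" in ordinary quotes is a backslash followed by n.
build_message() {
  local msg="*CI failures on $1*"
  shift
  local item
  for item in "$@"; do
    msg+=$'\n'"- ${item}"
  done
  printf '%s' "$msg"
}

# jq --arg passes the value as a raw string and JSON-escapes it,
# so real newlines appear as \n in the payload.
build_payload() {
  jq -n --arg text "$1" '{text: $text}'
}
```

The resulting payload could then be posted with something like `curl -X POST -H 'Content-Type: application/json' -d "$(build_payload "$msg")" "$SLACK_CI_WEBHOOK_URL"`.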
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Issue being fixed or feature implemented
Nightly CI jobs run on the main branch but failures go unnoticed. This adds a weekly Monday morning check that alerts the team when things are red.
User Story
Imagine you are a developer on the platform team. Every Monday at 8 UTC you get a Slack message in #platform-team if any CI workflow on v3.1-dev is failing — so you can fix it before it blocks the whole week.
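What was done?
Added .github/workflows/weekly-ci-health.yml:
- Runs every Monday at 8 UTC, with workflow_dispatch for manual testing
- Checks the last run of all active workflows on v3.1-dev
- Posts to Slack if any have conclusion: failure
Setup required
- Slack webhook for #platform-team
- Repo secret SLACK_CI_WEBHOOK_URL with the webhook URL
How Has This Been Tested?
- YAML parsed with yaml.safe_load()
- Manual run: gh workflow run "Weekly CI Health Check"
Breaking Changes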
None
🤖 Co-authored by Claudius the Magnificent AI Agent