Add run_diff: LCS-aligned diff of two run step-traces #411
Merged
A run history says a run failed but not what changed from the run that passed. Align two step sequences with an LCS walk (so an inserted or removed step shifts the rest into place instead of mis-pairing) and classify the differences: added/removed steps, status flips (with the new failure's signature), and timing regressions. summarize_run_diff renders a one-line summary. Pure stdlib over step dicts.
Up to standards ✅🟢 Issues
| Metric | Results |
|---|---|
| Complexity | 53 |
| Duplication | 0 |



Why
A run history tells you a run failed, but not what changed from the run that passed: which step was added or dropped, which step flipped pass→fail, which step got slower.
`run_diff` aligns the two step sequences with a longest-common-subsequence walk, so an inserted or removed step shifts the rest into place instead of mis-pairing everything, and classifies the differences:

- added and removed steps
- status flips, with the new failure's `failure_signature` when it carries an `error`
- timing regressions: steps that are `regress_factor`× slower

`summarize_run_diff` renders a one-line summary. This is the second item of the test-robustness lane (it consumes `failure_signature` from v191).

Design
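To illustrate the alignment step, here is a minimal sketch of an LCS walk over step names. It is not the code from this PR: `lcs_align` and its `(matched, removed, added)` return shape are hypothetical, and it assumes steps are matched by their `name` key only.

```python
from typing import Any, Dict, List, Tuple

Step = Dict[str, Any]


def lcs_align(
    before: List[Step], after: List[Step]
) -> Tuple[List[Tuple[Step, Step]], List[Step], List[Step]]:
    """Align two step lists by name; return (matched pairs, removed, added)."""
    a = [s["name"] for s in before]
    b = [s["name"] for s in after]
    n, m = len(a), len(b)

    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    matched: List[Tuple[Step, Step]] = []
    removed: List[Step] = []
    added: List[Step] = []
    i = j = 0
    # Walk forward: pair equal names, otherwise skip whichever side
    # keeps the longer common subsequence reachable.
    while i < n and j < m:
        if a[i] == b[j]:
            matched.append((before[i], after[j]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            removed.append(before[i])
            i += 1
        else:
            added.append(after[j])
            j += 1
    removed.extend(before[i:])  # leftovers exist only in the old run
    added.extend(after[j:])     # leftovers exist only in the new run
    return matched, removed, added


old_run = [{"name": "login"}, {"name": "upload"}, {"name": "logout"}]
new_run = [{"name": "login"}, {"name": "warmup"}, {"name": "upload"}, {"name": "logout"}]
matched, removed, added = lcs_align(old_run, new_run)
print([new["name"] for _, new in matched], [s["name"] for s in added])
```

With a positional pairing, the inserted `warmup` step would be compared against `upload` and every later step would be off by one; the walk above reports it as a single added step and pairs the rest by name.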
Helpers (`_status_flip`/`_regression`/`_aligned_changes`/`_unmatched`) keep every function under CC 10 (radon-clean). A step is any dict with a `name` key plus optional `status`/`duration`/`error`.

`__all__` → `AC_diff_runs` (returns the diff + a `summary`) → read-only `ac_diff_runs` MCP tool → Script Builder (Testing). Qt-free verified; `pytest.approx` for the ratio (no float `==`).

Tests
`test/unit_test/headless/test_run_diff_batch.py`: LCS isolating an insert (no mis-pairing), status flip carrying a 12-char signature, timing regression with ratio + sub-factor non-regression, removed-step detection, identical runs → `"no change"`, summary contents, and the executor `summary` path + 5-layer wiring. 18 passed with the `failure_signature` sibling.
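The scenarios listed above can be pictured with a small sketch of how one aligned pair of steps might be classified. Everything here is illustrative rather than taken from the PR: `classify_pair`, `summarize`, and `signature` are hypothetical names, the SHA-1 truncation is only a stand-in for the real `failure_signature`, and both the default `regress_factor` of 2.0 and the `>=` comparison are assumptions.

```python
import hashlib
import math
from typing import Any, Dict, List

Step = Dict[str, Any]


def signature(error: str) -> str:
    # Assumption: a 12-character stand-in, NOT the real failure_signature.
    return hashlib.sha1(error.encode("utf-8")).hexdigest()[:12]


def classify_pair(old: Step, new: Step, regress_factor: float = 2.0) -> List[Dict[str, Any]]:
    """Return status-flip and timing-regression records for one aligned pair."""
    changes: List[Dict[str, Any]] = []

    old_status, new_status = old.get("status"), new.get("status")
    if old_status != new_status:
        flip = {"kind": "status_flip", "name": new["name"],
                "from": old_status, "to": new_status}
        if new.get("error"):
            flip["signature"] = signature(new["error"])
        changes.append(flip)

    old_d, new_d = old.get("duration"), new.get("duration")
    # Steps with a missing or zero duration are skipped to avoid dividing by zero.
    if old_d and new_d and new_d >= old_d * regress_factor:
        changes.append({"kind": "regression", "name": new["name"],
                        "ratio": new_d / old_d})
    return changes


def summarize(changes: List[Dict[str, Any]]) -> str:
    """Render a one-line summary; identical runs yield "no change"."""
    if not changes:
        return "no change"
    counts: Dict[str, int] = {}
    for change in changes:
        counts[change["kind"]] = counts.get(change["kind"], 0) + 1
    return ", ".join(f"{n} {kind}" for kind, n in sorted(counts.items()))


passed = {"name": "upload", "status": "pass", "duration": 1.0}
failed = {"name": "upload", "status": "fail", "duration": 3.0, "error": "Timeout"}
slower = {"name": "upload", "status": "pass", "duration": 1.5}

flagged = classify_pair(passed, failed)
print(summarize(flagged))
print(summarize(classify_pair(passed, slower)))  # 1.5x is below the factor
```

The ratio is compared with `math.isclose` in the checks for this sketch, in the same spirit as the PR's use of `pytest.approx` instead of float `==`.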