Skip to content

fix: harden JSON misformat handling to prevent infinite loops and crashes#1245

Closed
PaoloC68 wants to merge 1 commit intoagent0ai:developmentfrom
PaoloC68:fix/json-misformat-resilience
Closed

fix: harden JSON misformat handling to prevent infinite loops and crashes#1245
PaoloC68 wants to merge 1 commit intoagent0ai:developmentfrom
PaoloC68:fix/json-misformat-resilience

Conversation

@PaoloC68
Copy link
Copy Markdown
Contributor

@PaoloC68 PaoloC68 commented Mar 12, 2026

Summary

Fixes the infinite misformat loop that affects 5+ reported issues. The agent's process_tools had a gap where parseable-but-invalid JSON (e.g., dict without tool_name) would raise RepairableException, get swallowed by _50_handle_repairable_exception, and loop forever without hitting the circuit breaker.

Changes

agent.py — Circuit breaker covers both failure modes

  • Case A (already worked): json_parse_dirty returns None → misformat counter increments
  • Case B (NEW fix): validate_tool_request raises RepairableException → counter now increments via try/except wrapper in process_tools
  • Moved counter reset to right after validation passes (format correctness, not tool existence)
  • At 5 consecutive failures, raises HandledException to stop the monologue gracefully

helpers/extract_tools.py — Brace-depth JSON extraction (unchanged from v1)

  • Replaces buggy rfind('}') with character-by-character brace-depth tracking
  • Handles nested braces, string-escaped braces, escaped quotes

prompts/fw.msg_misformat.md — Improved misformat prompt

  • Shows attempt countdown (attempt X/5) so model knows it's running out of retries
  • Removed contradictory code-fence example (told model "no fences" but showed one)
  • Compressed to single-line JSON example

tests/test_misformat_circuit_breaker.py — 73 tests

Suite Tests Coverage
TestExtractJsonObjectString 22 Brace-depth tracking, rfind regression, escapes, edge cases
TestJsonParseDirty 12 Parse, dirty JSON, non-string input, arrays
TestValidateToolRequest 14 None guard, RepairableException (not ValueError), all invalid shapes
TestCircuitBreakerCaseA 4 None parse → counter accumulates → breaker at 5
TestCircuitBreakerCaseB 6 Invalid dict → counter accumulates → breaker at 5
TestCircuitBreakerMixed 5 Mixed A+B accumulate, reset, boundary conditions
TestEndToEndPipeline 10 Realistic messages, rfind bug scenario, session reproduction

Root cause analysis

The _50_handle_repairable_exception extension (line 23) sets data["exception"] = None, swallowing the RepairableException and continuing the monologue loop. Meanwhile, the _error_retry plugin explicitly skips RepairableException (line 21). So for Case B, no circuit breaker ever fired — the counter was only in the else branch (Case A).

Workaround for models with frequent misformats

Thinking/reasoning models (e.g., MiniMax M2.5) produce reasoning traces with incidental { characters that confuse the JSON parser. To mitigate, add this to Chat model additional parameters in Agent Zero settings:

response_format={"type": "json_object"}

This forces JSON output at the token generation level. Confirmed supported on OpenRouter for: MiniMax M2.5, DeepSeek V3, Gemini 2.5 Flash, Grok 4.1 Fast, GPT-OSS-120B.

Testing

pytest tests/test_misformat_circuit_breaker.py -v
# 73 passed in 0.26s

…shes

Circuit breaker now covers BOTH failure modes:
- Case A: complete parse failure (tool_request is None)
- Case B: parseable-but-invalid JSON (RepairableException from validation)

Previously, Case B bypassed the consecutive_misformat counter entirely
because RepairableException was swallowed by _50_handle_repairable_exception
extension, allowing the agent to loop forever on models producing JSON
without required tool_name/tool_args fields.

Changes:
- agent.py: Wrap validate_tool_request in try/except RepairableException
  inside process_tools; increment counter for Case B; move counter reset
  to after validation passes (not after tool execution)
- prompts/fw.msg_misformat.md: Add attempt countdown, remove contradictory
  code-fence example, compress to single-line JSON example
- tests/test_misformat_circuit_breaker.py: 73 tests covering extraction,
  validation, circuit breaker (both cases), and end-to-end pipeline

Workaround: Models producing frequent misformats (especially thinking/reasoning
models like MiniMax M2.5) can be improved by adding response_format to the
Chat model additional parameters in Agent Zero settings:
  response_format={"type": "json_object"}
This forces JSON output at the token generation level. Confirmed supported
on OpenRouter for MiniMax M2.5, DeepSeek V3, Gemini 2.5 Flash, Grok 4.1.
@PaoloC68 PaoloC68 force-pushed the fix/json-misformat-resilience branch from 94656ea to b2b70a7 Compare March 19, 2026 18:54
@Krashnicov
Copy link
Copy Markdown
Contributor

Strong +1 on merging this. Here's the reasoning after reviewing the relevant codebase sections.

Why this cannot be a plugin

The three plugin extension surfaces available here are process_tools/start, process_tools/end, and the existing handle_exception chain. None of them can reach the intercept point this fix requires:

  • process_tools/start fires before any parsing — too early.
  • process_tools/end never fires when validate_tool_request throws, because an exception exits the function body before the @extensible end-hook can run.
  • handle_exception is ironically the cause of the bug. _50_handle_repairable_exception is already an extension, and it's exactly what swallows the exception, nullifies data["exception"], and lets the monologue loop restart forever. The fix must sit upstream of that — inside process_tools — which requires a core change.

helpers/extract_tools.py has no @extensible decoration at all. The brace-depth fix has zero plugin seam available.

The plugin boundary is the wrong level of abstraction for both fixes.

Why this is genuinely needed

Verified against current main:

  1. extract_json_object_string still uses rfind('}') (line 29 of helpers/extract_tools.py). For any response where the model includes trailing text after a JSON block — or nested objects — this silently extracts the wrong substring. The brace-depth fix is the correct solution.

  2. There is currently no retry counter for misformat loops. The else branch in process_tools appends the fw.msg_misformat.md warning and continues — the inner while True simply restarts. There is no ceiling.

  3. The PR's Case B is particularly tight: validate_tool_request raising RepairableException for structurally-invalid-but-parseable JSON is semantically correct (it is repairable — tell the model to reformat). But making that exception type change alone would make things worse, since _50_handle_repairable_exception swallows it silently. The counter is not an add-on — it's the required companion to make the exception type semantically correct without creating an infinite loop. The two changes are coupled and must land together.

One design question worth a comment

Where the misformat counter lives matters for observability. If it's stored on loop_data it resets every monologue (right behaviour for per-task limits). If it's stored on the agent instance it persists across tasks. Worth clarifying in the implementation or docs so future contributors understand the intended reset boundary.

The 73 tests are a strong quality signal. This is a well-scoped fix for a real production issue.

@Krashnicov
Copy link
Copy Markdown
Contributor

As per comments from @frdel - this PR will not be merged as it does not solve key issues or take in to account json before tool call etc.

PR will auto-close in 90 days.

@PaoloC68
Copy link
Copy Markdown
Contributor Author

PaoloC68 commented Apr 3, 2026

Closing per maintainer feedback (see @Krashnicov comment Mar 28). Will revisit if the approach changes.

@PaoloC68 PaoloC68 closed this Apr 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants