fix: harden JSON misformat handling to prevent infinite loops and crashes #1245
PaoloC68 wants to merge 1 commit into agent0ai:development
Circuit breaker now covers BOTH failure modes:
- Case A: complete parse failure (tool_request is None)
- Case B: parseable-but-invalid JSON (RepairableException from validation)
Previously, Case B bypassed the consecutive_misformat counter entirely
because RepairableException was swallowed by the
_50_handle_repairable_exception extension, allowing the agent to loop forever
on models producing JSON without the required tool_name/tool_args fields.
Changes:
- agent.py: Wrap validate_tool_request in try/except RepairableException
inside process_tools; increment counter for Case B; move counter reset
to after validation passes (not after tool execution)
- prompts/fw.msg_misformat.md: Add attempt countdown, remove contradictory
code-fence example, compress to single-line JSON example
- tests/test_misformat_circuit_breaker.py: 73 tests covering extraction,
validation, circuit breaker (both cases), and end-to-end pipeline
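The Case A / Case B handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in agent.py: MisformatBreaker, MAX_MISFORMATS, and the simplified validate_tool_request are hypothetical stand-ins, the two exception classes are local placeholders for the framework's own, and the limit of 5 is an assumption.

```python
from dataclasses import dataclass
from typing import Any

MAX_MISFORMATS = 5  # assumed limit, for illustration only


class RepairableException(Exception):
    """Local stand-in for the framework's recoverable error."""


class HandledException(Exception):
    """Local stand-in for the error that stops the monologue."""


def validate_tool_request(request: Any) -> None:
    # Simplified validation: the request must be a dict carrying both fields.
    if not isinstance(request, dict):
        raise RepairableException("tool request is not a JSON object")
    if not isinstance(request.get("tool_name"), str):
        raise RepairableException("missing tool_name")
    if not isinstance(request.get("tool_args"), dict):
        raise RepairableException("missing tool_args")


@dataclass
class MisformatBreaker:
    consecutive_misformat: int = 0

    def check(self, tool_request: Any) -> bool:
        """Return True if the request is usable, False if the model should retry."""
        try:
            if tool_request is None:
                # Case A: nothing parseable was found in the response.
                raise RepairableException("no JSON found")
            # Case B: parseable JSON that fails validation raises here.
            validate_tool_request(tool_request)
        except RepairableException:
            # Both cases feed the same counter, so neither can loop forever.
            self.consecutive_misformat += 1
            if self.consecutive_misformat >= MAX_MISFORMATS:
                raise HandledException("too many misformatted responses")
            return False
        # Reset as soon as validation passes, before any tool would run.
        self.consecutive_misformat = 0
        return True
```

Resetting right after validation, not after tool execution, means a tool that fails at runtime does not count as a misformat, and a valid request always clears the streak.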
Workaround: Models producing frequent misformats (especially thinking/reasoning
models like MiniMax M2.5) can be improved by adding response_format to the
Chat model additional parameters in Agent Zero settings:
response_format={"type": "json_object"}
This forces JSON output at the token generation level. Confirmed supported
on OpenRouter for MiniMax M2.5, DeepSeek V3, Gemini 2.5 Flash, Grok 4.1.
Strong +1 on merging this. Here's the reasoning after reviewing the relevant codebase sections.
Why this cannot be a plugin
The three plugin extension surfaces available here are
The plugin boundary is the wrong level of abstraction for both fixes.
Why this is genuinely needed
Verified against current
One design question worth a comment
Where the misformat counter lives matters for observability. If it's stored on
The 73 tests are a strong quality signal. This is a well-scoped fix for a real production issue.
As per comments from @frdel - this PR will not be merged as it does not solve key issues or take into account JSON before tool call etc. PR will auto-close in 90 days.
Closing per maintainer feedback (see @Krashnicov comment Mar 28). Will revisit if the approach changes. |
Summary
Fixes the infinite misformat loop that affects 5+ reported issues. The agent's process_tools had a gap where parseable-but-invalid JSON (e.g., a dict without tool_name) would raise RepairableException, get swallowed by _50_handle_repairable_exception, and loop forever without hitting the circuit breaker.
Changes
agent.py: Circuit breaker covers both failure modes
- json_parse_dirty returns None → misformat counter increments
- validate_tool_request raises RepairableException → counter now increments via try/except wrapper in process_tools
- HandledException to stop the monologue gracefully
helpers/extract_tools.py: Brace-depth JSON extraction (unchanged from v1)
- rfind('}') with character-by-character brace-depth tracking
prompts/fw.msg_misformat.md: Improved misformat prompt
- Attempt countdown (attempt X/5) so the model knows it's running out of retries
tests/test_misformat_circuit_breaker.py: 73 tests
- TestExtractJsonObjectString
- TestJsonParseDirty
- TestValidateToolRequest
- TestCircuitBreakerCaseA
- TestCircuitBreakerCaseB
- TestCircuitBreakerMixed
- TestEndToEndPipeline
Root cause analysis
The _50_handle_repairable_exception extension (line 23) sets data["exception"] = None, swallowing the RepairableException and continuing the monologue loop. Meanwhile, the _error_retry plugin explicitly skips RepairableException (line 21). So for Case B, no circuit breaker ever fired: the counter was only in the else branch (Case A).
Workaround for models with frequent misformats
Thinking/reasoning models (e.g., MiniMax M2.5) produce reasoning traces with incidental { characters that confuse the JSON parser. To mitigate, add this to Chat model additional parameters in Agent Zero settings:
response_format={"type": "json_object"}
This forces JSON output at the token generation level. Confirmed supported on OpenRouter for: MiniMax M2.5, DeepSeek V3, Gemini 2.5 Flash, Grok 4.1 Fast, GPT-OSS-120B.
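To illustrate why incidental braces in a reasoning trace trip up a last-brace search, and how brace-depth tracking avoids it, here is a minimal sketch. It is not the implementation in helpers/extract_tools.py: extract_json_object and _match_brace are hypothetical names, and strict json.loads is used here in place of the project's tolerant json_parse_dirty.

```python
import json
from typing import Optional


def _match_brace(text: str, start: int) -> int:
    """Return the index of the '}' closing the '{' at start, or -1 if none."""
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            # Braces inside string literals must not change the depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1


def extract_json_object(text: str) -> Optional[str]:
    """Return the first balanced {...} span that parses as a JSON object."""
    start = text.find("{")
    while start != -1:
        end = _match_brace(text, start)
        if end != -1:
            candidate = text[start:end + 1]
            try:
                if isinstance(json.loads(candidate), dict):
                    return candidate
            except json.JSONDecodeError:
                pass  # balanced but not JSON, e.g. "{curly}" in prose
        start = text.find("{", start + 1)
    return None


trace = 'I might use {curly} notes. {"tool_name": "x", "tool_args": {"a": "}"}} trailing }'
extracted = extract_json_object(trace)
```

A slice from the first { to text.rfind('}') would cover the whole trace, including the prose and the trailing brace, whereas the depth-tracking scan stops at the brace that balances the object it started from.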
Testing
pytest tests/test_misformat_circuit_breaker.py -v # 73 passed in 0.26s