fix(router): increase inference validation token budget #432
geelen wants to merge 1 commit into NVIDIA:main from
Conversation
All contributors have signed the DCO ✍️ ✅
Thank you for your interest in contributing to OpenShell, @geelen. This project uses a vouch system for first-time contributors. Before submitting a pull request, you need to be vouched by a maintainer. To get vouched, see CONTRIBUTING.md for details.
I have read the DCO document and I hereby sign the DCO.
FYI, I have now tested this against the particular endpoint and it does indeed pass validation automatically. The value of 32 was just plucked out of thin air, but it seemed like a safe default (my endpoint returned 11 tokens in response).
I think 32-ish makes sense and shouldn't impact the time it takes for the response to come back too much. Flakiness, potential timeouts, etc. were a reason to include the

I just checked how openclaw does verification and they also use 1 for

Out of curiosity: what's the validation on the inference-api side? I'm assuming this is some kind of default that litellm is enforcing?
Summary
Increase the inference validation probe token budget from 1 to 32 so OpenAI-compatible backends that reject extremely small output budgets can still pass verification.
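The change can be sketched as follows. This is an illustrative mock-up, not the project's actual code: the function and constant names (`build_validation_probe`, `VALIDATION_MAX_TOKENS`) and the probe message are assumptions; only the 1 → 32 budget change comes from the PR.

```python
# Some OpenAI-compatible backends reject requests with an extremely small
# output budget (e.g. max_tokens=1), causing verification to fail even
# though the endpoint works. Requesting a slightly larger, still-cheap
# budget lets those backends pass the probe.
VALIDATION_MAX_TOKENS = 32  # previously 1


def build_validation_probe(model: str) -> dict:
    """Build a minimal chat-completion request body used only to verify
    that the endpoint accepts requests for the given model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": VALIDATION_MAX_TOKENS,
    }


probe = build_validation_probe("my-model")
print(probe["max_tokens"])  # 32
```

Since the probe only checks that the request is accepted, the response body is discarded; a 32-token budget keeps the round trip cheap while clearing backends that enforce a minimum.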
Related Issue
N/A
Changes
Testing
`mise run pre-commit` passes

Checklist