feat: add token-budget truncation for file mentions #10830
Draft
+162
−3
This PR attempts to address Issue #10824. Feedback and guidance are welcome.
Summary
Implements token-budget aware truncation for file mentions to prevent context exhaustion when large files are referenced in the initial prompt.
Problem
When users mentioned large files (e.g., 2.7 MB JSON files) using @/path/to/file syntax in their prompts, the entire file content was included in the context, which could immediately exhaust the context window.
Solution
Added token-budget-based truncation for file mentions, similar to how the read_file tool works:
- New parameter maxFileTokenBudget - Added to the parseMentions(), processUserContentMentions(), and getFileOrFolderContent() functions.
- Token-budget calculation - Each file mention is limited to 10% of the model's context window. For example, with a 200k context window, each mentioned file is limited to ~20k tokens.
- Uses readFileWithTokenBudget() - Leverages the existing incremental token counting implementation to read files up to the budget and then stop.
- Clear truncation message - When a file is truncated, users see a message saying so.
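As a rough illustration of the budget calculation and the read-until-budget behavior described above, here is a minimal sketch. It is not the PR's code: every name in it (fileTokenBudget, estimateTokens, takeLinesWithinBudget) is hypothetical, the 4-characters-per-token estimate is an assumed stand-in for a real tokenizer, and the actual signature of readFileWithTokenBudget() is not shown in this description.

```typescript
// Hypothetical sketch; none of these names are taken from the PR's code.

/** Share of the context window each mentioned file may use (10%, per the description). */
const FILE_MENTION_BUDGET_PERCENT = 10;

/** Per-file token budget for a given context window size. */
function fileTokenBudget(contextWindow: number): number {
  return Math.floor((contextWindow * FILE_MENTION_BUDGET_PERCENT) / 100);
}

/** Crude stand-in tokenizer: assumes roughly 4 characters per token. */
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface BudgetedRead {
  content: string;
  tokensUsed: number;
  truncated: boolean;
}

/** Keep lines while the running token count fits; stop at the first line that would exceed the budget. */
function takeLinesWithinBudget(lines: string[], budget: number): BudgetedRead {
  const kept: string[] = [];
  let tokensUsed = 0;
  for (const line of lines) {
    const cost = estimateTokens(line + "\n");
    if (tokensUsed + cost > budget) {
      return { content: kept.join("\n"), tokensUsed, truncated: true };
    }
    kept.push(line);
    tokensUsed += cost;
  }
  return { content: kept.join("\n"), tokensUsed, truncated: false };
}
```

With a 200k context window, fileTokenBudget(200000) yields the 20k per-file limit mentioned above. Counting incrementally means the remainder of an oversized file never has to be tokenized once the budget is reached.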
Testing
- Tests in src/core/mentions/__tests__/index.spec.ts
- processUserContentMentions.spec.ts updated to account for the new parameter
Related Issue
Closes #10824
Important
Introduces token-budget truncation for file mentions to prevent context exhaustion, limiting each file to 10% of the model's context window.
- parseMentions() and getFileOrFolderContent().
- readFileWithTokenBudget() for reading files within the budget.
- maxFileTokenBudget parameter to parseMentions(), processUserContentMentions(), and getFileOrFolderContent().
- index.spec.ts.
- processUserContentMentions.spec.ts for new parameter.
- Task.ts based on 10% of context window.
This description was generated for commit 46f8582.