Skip to content

Reduce MCP review latency via shared changed-file resolution + bounded auto-detection#457

Open
Bakul2006 wants to merge 3 commits into
tirth8205:mainfrom
Bakul2006:bakul/MCPSpeedUp
Open

Reduce MCP review latency via shared changed-file resolution + bounded auto-detection#457
Bakul2006 wants to merge 3 commits into
tirth8205:mainfrom
Bakul2006:bakul/MCPSpeedUp

Conversation

@Bakul2006

Copy link
Copy Markdown
Contributor

Issue #262

What Changed

New shared resolver

Added:

resolve_changed_files()

in:

_common.py

This centralizes changed-file resolution for all review-oriented tools.

Resolution order:

  1. Explicit changed_files
  2. Short-lived cache reuse
  3. Fast auto-detection fallback
  4. Graceful timeout handling

The resolver returns both resolved files and metadata:

{
"files": [...],
"source": "...",
"cache_hit": bool,
"auto_detect_timed_out": bool,
"timeout_seconds": float,
}

Bounded auto-detection

Auto-detection is now intentionally bounded.

Default timeout:

5 seconds

(configurable)

Instead of allowing long-running internal detection paths to stall MCP workflows, the resolver now fails fast and returns structured guidance.

This prevents multi-minute hangs during agent execution.


Short-lived shared cache

Added a thread-safe changed-file cache with configurable TTL.

Default TTL:

3 seconds

(configurable)

This allows consecutive MCP tool calls in the same review workflow to reuse resolved changed files instead of repeatedly invoking git detection logic.

This is especially beneficial for agent orchestration patterns where multiple tools are called back-to-back.


Wired Into Review Tools

The shared resolver is now used by:

  • detect_changes_func()
  • get_review_context()
  • get_affected_flows_func()
  • get_impact_radius()
  • get_minimal_context()

This keeps behavior consistent across review flows.


User-Facing Behavior

When automatic changed-file detection exceeds the configured timeout and no files are resolved, tools now return explicit guidance indicating that callers should provide changed_files directly for fast execution.

This replaces previous silent stalls / timeout-heavy behavior.


Configuration

New environment variables:

Variable | Default | Purpose -- | -- | -- CRG_FAST_DIFF_TIMEOUT | 5.0 | Maximum auto-detection time CRG_CHANGED_FILES_CACHE_TTL | 3.0 | Cache reuse window

These can be tuned per environment / deployment.


Performance Impact

Expected improvements:

  • Small-diff MCP review workflows drop from minutes to seconds
  • Multi-call agent workflows avoid repeated git scans
  • Faster behavior on Windows repositories
  • Faster recovery on slow repos / CI environments
  • Better interoperability with agent clients using aggressive tool timeouts

@Bakul2006

Copy link
Copy Markdown
Contributor Author

Hey @tirth8205 Just tried adding the Cache memory In the MCP Calls To improve the latency please review if the approach is correct or I should think of another way . Core idea is to make the MCP calls Stateful in some way.

@tirth8205

Copy link
Copy Markdown
Owner

I reviewed this for v2.3.4 and deferred it.

The latency problem is real, but the shared changed-file cache adds state across tools and increases the integration risk. The release instead shipped bounded detect-changes behaviour and compact architecture output. This PR should be revisited separately, ideally with clear cache invalidation rules and tests around stale graph state.

@Bakul2006

Copy link
Copy Markdown
Contributor Author

Hey @tirth8205 thanks for the review I understand this is a complex logic for now And needs a defined structure for this to be implemented for now I will try to make a detailed explaination of my approach within 2-3 days.

@Bakul2006

Copy link
Copy Markdown
Contributor Author

Clear Graphical Approach Of Shared Memory Cache

This is the usual flow of the code-review graph
Flowchart
This is where I thought of as a solution
Flowchart (1)
Resolution of your concern regarding if the repository gets updated how cache will be handled
Flowchart (2)
Hey @tirth8205 this is what I thought as of now for implementing and creating fast and secure solution for Code-review-graph working fast with The MCP As well

@tirth8205

Copy link
Copy Markdown
Owner

Thanks @Bakul2006 — and sorry the follow-up diagrams sat without a reply; the cache-invalidation flow you drew is exactly the right framing. The direction is sound and I measured the warm cache at ~0.01ms vs ~5-10ms cold, so the multi-call win is real.

I can't merge it yet — a few concrete things:

  1. Security: resolve_changed_files passes base to git without the _SAFE_GIT_REF check that get_changed_files had. I reproduced base="--output=…" writing an arbitrary file. Please validate base before the subprocess and add a test.
  2. SVN: route through detect_vcs like the old path, or the review tools go blind on SVN repos. Add an SVN-path test.
  3. Don't cache the empty timeout result — calls 2..N then lose the "pass changed_files" guidance (the repeated-call case Windows 11: MCP-based review is extremely slow while CLI graph operations are fast #262 cares about).
  4. --untracked-files=no drops newly added files; reconsider.
  5. Add direct unit tests for the resolver (cache hit/expiry, parsing) and document invalidation rules.
  6. Rebase on main (conflicts in query.py/review.py/test_changes.py).

Land those and I'll take another pass.

@Bakul2006

Copy link
Copy Markdown
Contributor Author

Hey @tirth8205 thanks for the review I will work on this and get you updated shortly

@Bakul2006 Bakul2006 marked this pull request as draft June 11, 2026 19:34
@Bakul2006 Bakul2006 marked this pull request as ready for review June 11, 2026 20:08
@Bakul2006

Copy link
Copy Markdown
Contributor Author

@tirth8205 please review the changes ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants