
docs: add comprehensive rate limiting audit #1468

Closed
kilo-code-bot[bot] wants to merge 1 commit into main from docs/rate-limiting-audit

kilo-code-bot bot commented Mar 24, 2026

Summary

  • Adds a comprehensive audit document (docs/rate-limiting-audit.md) cataloging all rate limiting, throttling, and IP-based limits across the gateway codebase.
  • Covers 6 distinct rate limiting mechanisms: free model IP limit (200/hr), promotion anonymous limit (10k/24h), deploy dispatcher login limit (6/min), device auth IP limit (5 pending), webhook in-flight limit (20 concurrent), and the observe-only abuse detection service.
  • Documents potential impact on cloud features: shared IP exhaustion for cloud agents, datacenter IP flagging, and the lack of per-user exemption mechanisms.
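
The shared-IP concern can be illustrated with a minimal sketch. It assumes a fixed one-hour window and an in-memory `Map`; the audit states the real limit is IP-based at 200/hr, but the windowing strategy, the `FixedWindowLimiter` class, and the `simulateSharedIp` helper are hypothetical and not taken from the gateway code.

```typescript
// Hypothetical sketch: fixed-window counter keyed by client IP.
// An in-memory Map stands in for the real backing store to keep this self-contained.
const FREE_MODEL_LIMIT = 200; // requests per window (value from the PR summary)
const WINDOW_MS = 60 * 60 * 1000; // one hour (fixed window is an assumption)

interface WindowState {
  windowStart: number;
  count: number;
}

class FixedWindowLimiter {
  private windows: Map<string, WindowState> = new Map();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  tryAcquire(key: string, now: number): boolean {
    const state = this.windows.get(key);
    if (state === undefined || now - state.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (state.count >= this.limit) {
      return false;
    }
    state.count += 1;
    return true;
  }
}

// Counts rejected requests when several agents send traffic from one egress IP
// within the same window.
function simulateSharedIp(agents: number, requestsPerAgent: number): number {
  const limiter = new FixedWindowLimiter(FREE_MODEL_LIMIT, WINDOW_MS);
  const sharedIp = "203.0.113.7"; // documentation-range address, illustrative only
  let rejected = 0;
  for (let a = 0; a < agents; a++) {
    for (let r = 0; r < requestsPerAgent; r++) {
      if (!limiter.tryAcquire(sharedIp, 0)) {
        rejected += 1;
      }
    }
  }
  return rejected;
}
```

Under these assumptions, five agents each sending 50 requests from the same address produce 250 requests in one window, so the last 50 are rejected even though no single agent came close to the limit.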

Verification

  • Searched the entire codebase for rate limiting logic using multiple patterns (rateLimit, rate_limit, 429, Too Many Requests, quota, throttle, MAX_REQUESTS, per.minute, per.hour, etc.)
  • Verified all referenced file paths and line numbers exist and contain the described code
  • Cross-referenced constants, implementations, and enforcement points for consistency
  • Confirmed no Redis-based rate limit stores exist — all use PostgreSQL or Cloudflare bindings
  • Confirmed no per-model rate limits exist — limits are uniform across all free models

Visual Changes

N/A

Reviewer Notes

  • The promotion limit returns HTTP 401 instead of 429 — there is an existing TODO at src/app/api/openrouter/[...path]/route.ts:257 to change this once the extension supports it.
  • The admin page at src/app/admin/free-model-usage/page.tsx:49 mentions "600 requests/day" which is stale — the actual limit is 10,000/24h.
  • The abuse detection service has full infrastructure for enforcement (verdicts, signals, action metadata) but currently operates in observe-only mode (src/lib/abuse-service.ts:304).
  • The free model IP limit (200/hr) is the most impactful for cloud agents since it's IP-based and checked pre-auth — cloud agents sharing infrastructure IPs could collectively exhaust this limit.
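
As a rough illustration of the 401 versus 429 note above, the sketch below builds a limit-exceeded response. Everything in it is hypothetical: the `buildLimitResponse` function, the `clientHandles429` flag, and the error strings are assumptions for illustration, not the code at `route.ts:257`. Only the status codes and the standard `Retry-After` header come from HTTP itself.

```typescript
// Hypothetical sketch: none of these names come from the gateway codebase.
interface LimitResponse {
  status: number;
  headers: Record<string, string>;
  body: { error: string };
}

function buildLimitResponse(
  resetAtMs: number,
  nowMs: number,
  clientHandles429: boolean,
): LimitResponse {
  if (!clientHandles429) {
    // Legacy path: mirrors the 401 behaviour described in the reviewer note.
    return {
      status: 401,
      headers: {},
      body: { error: "Anonymous usage limit reached" },
    };
  }
  // Whole seconds until the window resets, never negative.
  const retryAfterSeconds = Math.max(0, Math.ceil((resetAtMs - nowMs) / 1000));
  return {
    status: 429,
    headers: { "Retry-After": String(retryAfterSeconds) },
    body: { error: "Too Many Requests" },
  };
}
```

A 429 with `Retry-After` tells a well-behaved client when to retry, whereas a 401 is easy to misread as an authentication failure; that difference is the practical reason the TODO matters.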


## Architectural Notes

- **No Redis:** All rate limiting uses PostgreSQL or Cloudflare's native rate limiting. No Redis-based stores.

WARNING: This note excludes mechanisms documented earlier in the audit

The webhook limiter is backed by Durable Object SQLite and the abuse-detection flow uses an external service, so saying all rate limiting uses PostgreSQL or Cloudflare's native rate limiting is broader than what the rest of this document shows.

Suggested change
- **No Redis:** All rate limiting uses PostgreSQL or Cloudflare's native rate limiting. No Redis-based stores.
- **No Redis:** None of the mechanisms in this audit use Redis; they rely on PostgreSQL, Cloudflare-managed infrastructure (including Durable Object SQLite and RateLimit bindings), or the external abuse service.

## Architectural Notes

- **No Redis:** All rate limiting uses PostgreSQL or Cloudflare's native rate limiting. No Redis-based stores.
- **No environment-variable-controlled thresholds:** All rate limit values are hardcoded constants. Changing limits requires a code deployment.

WARNING: This overstates how the thresholds are configured

The deploy-dispatcher limit comes from wrangler.jsonc, and the abuse-detection service is env-var / external-service driven, so not every limit value here is a hardcoded constant in app code.

Suggested change
- **No environment-variable-controlled thresholds:** All rate limit values are hardcoded constants. Changing limits requires a code deployment.
- **Configuration:** Most explicit limits are defined in code or Wrangler config, while abuse-detection thresholds come from external-service or environment configuration.
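
To make the distinction between a hardcoded constant and a configuration-driven threshold concrete, here is a minimal sketch of a threshold resolver with a code-level fallback. The `resolveLimit` helper and the `FREE_MODEL_IP_LIMIT` variable name are hypothetical; the audit does not say that any such variable exists in the gateway.

```typescript
// Hypothetical sketch: reads a positive integer threshold from an env-like
// object and falls back to the in-code default when the value is missing or invalid.
function resolveLimit(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number,
): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Example: with no override present, the in-code default applies.
const freeModelLimit = resolveLimit({}, "FREE_MODEL_IP_LIMIT", 200);
```

With a resolver like this, a limit can change through configuration alone, while a bare constant requires a code deployment, which is the distinction the original architectural note was drawing.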

kilo-code-bot bot commented Mar 24, 2026

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
|---|---|
| CRITICAL | 0 |
| WARNING | 2 |
| SUGGESTION | 0 |

Issue Details

WARNING

| File | Line | Issue |
|---|---|---|
| docs/rate-limiting-audit.md | 216 | Architectural note says all mechanisms use PostgreSQL or Cloudflare native rate limiting, but the audit also documents Durable Object SQLite and an external abuse service. |
| docs/rate-limiting-audit.md | 217 | Configuration note says all limit values are hardcoded constants, which conflicts with the Wrangler-configured deploy limit and env-configured abuse-detection service. |

Other Observations (not in diff)

None.

Files Reviewed (1 file)
  • docs/rate-limiting-audit.md - 2 issues

Reviewed by gpt-5.4-20260305 · 842,625 tokens

@eshurakov closed this on Mar 24, 2026