optimize(api): add warning logs for load-based request rejection#2972
optimize(api): add warning logs for load-based request rejection#2972contrueCT wants to merge 2 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds operator-visible warning logs when LoadDetectFilter rejects requests due to high worker load or low free memory (503s), and introduces unit tests to cover the new rejection behavior plus whitelist/healthy pass-through.
Changes:
- Add
WARNlogging for worker-load and low-memory request rejections inLoadDetectFilter. - Add
LoadDetectFilterTestunit coverage for whitelist bypass, busy rejection, low-memory rejection, and healthy requests. - Register the new unit test in
UnitTestSuite.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| hugegraph-server/hugegraph-api/src/main/java/org/apache/hugegraph/api/filter/LoadDetectFilter.java | Adds rejection WARN logs and minor refactor (captures current load). |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/api/filter/LoadDetectFilterTest.java | New unit tests for whitelist/busy/low-memory/healthy scenarios. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/UnitTestSuite.java | Adds LoadDetectFilterTest to the unit suite. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (currentLoad >= maxWorkerThreads) { | ||
| LOG.warn("Rejected request due to high worker load, method={}, path={}, " + | ||
| "currentLoad={}, maxWorkerThreads={}", | ||
| context.getMethod(), context.getUriInfo().getPath(), | ||
| currentLoad, maxWorkerThreads); |
There was a problem hiding this comment.
The new WARN logs will be emitted for every rejected request; under overload this can produce a log storm and further degrade the server (I/O, disk, log rotation). Consider rate-limiting/sampling these rejection logs (similar to the existing RateLimiter usage elsewhere) or logging once per interval with aggregated counters.
There was a problem hiding this comment.
The new WARN logs will be emitted for every rejected request; under overload this can produce a log storm and further degrade the server (I/O, disk, log rotation). Consider rate-limiting/sampling these rejection logs (similar to the existing RateLimiter usage elsewhere) or logging once per interval with aggregated counters.
WARN,可能产生大量日志 I/O,进一步放大系统抖动(导入/批量写入期间更明显)。
可以给“拒绝日志”加一个轻量限速(只限日志,不限请求处理),这样不会阻塞线程:
| if (currentLoad >= maxWorkerThreads) { | |
| LOG.warn("Rejected request due to high worker load, method={}, path={}, " + | |
| "currentLoad={}, maxWorkerThreads={}", | |
| context.getMethod(), context.getUriInfo().getPath(), | |
| currentLoad, maxWorkerThreads); | |
| if (REJECT_LOG_RATE_LIMITER.tryAcquire()) { | |
| LOG.warn("Rejected request due to high worker load, method={}, path={}, " + | |
| "currentLoad={}, maxWorkerThreads={}", | |
| context.getMethod(), context.getUriInfo().getPath(), | |
| currentLoad, maxWorkerThreads); | |
| } |
// 每秒最多 1 条日志记录
参考:private static final RateLimiter REJECT_LOG_RATE_LIMITER = RateLimiter.create(1.0);
| private <T> void injectProvider(LoadDetectFilter filter, String fieldName, | ||
| Provider<T> provider) { | ||
| try { | ||
| java.lang.reflect.Field field = LoadDetectFilter.class.getDeclaredField(fieldName); | ||
| field.setAccessible(true); | ||
| field.set(filter, provider); | ||
| } catch (Exception e) { | ||
| throw new RuntimeException("Failed to inject provider: " + fieldName, e); | ||
| } |
There was a problem hiding this comment.
This test injects private fields via a bespoke reflection helper. Since the codebase already has org.apache.hugegraph.testutil.Whitebox used in other tests for setting internal state, using it here (or centralizing the reflection helper) would reduce duplication and make the test less brittle if field names change.
| allocatedMem) / Bytes.MB; | ||
| if (presumableFreeMem < minFreeMemory) { | ||
| gcIfNeeded(); | ||
| LOG.warn("Rejected request due to low free memory, method={}, path={}, " + |
There was a problem hiding this comment.
gcIfNeeded() 再记录日志,但当前日志里的 presumableFreeMem 是 GC 前快照。排障时容易误解为“GC 后仍然是这个值”。
建议同时输出 GC 前/后快照,定位会更直观:
| LOG.warn("Rejected request due to low free memory, method={}, path={}, " + | |
| gcIfNeeded(); | |
| long allocatedMemAfterGc = Runtime.getRuntime().totalMemory() - | |
| Runtime.getRuntime().freeMemory(); | |
| long freeMemAfterGc = (Runtime.getRuntime().maxMemory() - | |
| allocatedMemAfterGc) / Bytes.MB; | |
| LOG.warn("Rejected request due to low free memory, method={}, path={}, " + | |
| "presumableFreeMemMB(beforeGC)={}, presumableFreeMemMB(afterGC)={}, minFreeMemoryMB={}", | |
| context.getMethod(), context.getUriInfo().getPath(), | |
| presumableFreeMem, freeMemAfterGc, minFreeMemory); |
Purpose of the PR
Main Changes
Log busy-worker and low-memory rejections in LoadDetectFilter before returning ServiceUnavailableException so operators can diagnose 503 responses from server-side logs.
Add unit coverage for whitelist bypass, busy rejection,
low-memory rejection, and healthy request pass-through, and register the new test in UnitTestSuite.
Verifying these changes
Does this PR potentially affect the following parts?
Documentation Status
Doc - TODODoc - DoneDoc - No Need