Add memory benchmarks for scan pipeline #3001
📊 Performance Benchmark Report
📈 Detailed Results (All Benchmarks)
🎯 Performance Summary
✅ No significant performance changes detected (all changes <10%)
🆕 New Tests
🐍 Python Version 3.11.15
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@          Coverage Diff          @@
##             3.0   #3001   +/-   ##
======================================
- Coverage     91%     91%    -0%
======================================
  Files        436     439     +3
  Lines      37072   37184   +112
======================================
+ Hits       33677   33711    +34
- Misses      3395    3473    +78
1825523 to cb5940b
Scanner construction allocates 400+ MB in pytest (presets, module loading, etc.), which set the tracemalloc peak before any scan events existed and masked real differences between branches. Split scanner init out of the tracemalloc window so we measure only scan execution memory. Also separate "new tests" from "significant changes" in benchmark report output.
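A minimal sketch of the idea in this commit, not the PR's actual code: do the expensive construction first, then open the tracemalloc window only around the work being measured. The names `measure_peak`, `build_scanner`, and `run_scan` are hypothetical stand-ins, and the allocation sizes are arbitrary.

```python
import tracemalloc


def measure_peak(setup, run):
    """Return the peak bytes traced while run(obj) executes, excluding setup()."""
    obj = setup()  # expensive construction happens before tracing starts
    tracemalloc.start()
    try:
        run(obj)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def build_scanner():
    # Hypothetical stand-in for scanner construction: a large allocation
    # that should not count toward the measurement.
    return {"presets": bytearray(50_000_000)}


def run_scan(scanner):
    # Hypothetical stand-in for scan execution: a much smaller allocation.
    events = [bytes(1000) for _ in range(1000)]
    return len(events)


peak = measure_peak(build_scanner, run_scan)
print(f"peak during scan: {peak / 1e6:.1f} MB")
```

Because tracing starts after `setup()` returns, the 50 MB placeholder never appears in the reported peak. If tracing must already be running during setup, `tracemalloc.reset_peak()` (Python 3.9+) is another way to discard the earlier peak.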
pytest's own allocations (~200 MB) contaminate tracemalloc peak measurements when scans run in-process, masking real differences between branches. Run each benchmark scan as a subprocess instead so measurements reflect only the scan's own memory use. Also rename tests to test_memory_use_* for clarity.
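The subprocess approach could look roughly like the sketch below. It is an assumption about the general shape, not the PR's implementation: the child script, the `peak_bytes` JSON key, and the helper name `peak_memory_in_subprocess` are all made up for illustration.

```python
import json
import subprocess
import sys

# Hypothetical child script standing in for a real scan. It traces its own
# allocations and prints the peak as a single JSON line.
_CHILD_SCRIPT = """
import json, tracemalloc
tracemalloc.start()
data = [bytes(1000) for _ in range(2000)]  # placeholder for scan work
_current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(json.dumps({"peak_bytes": peak}))
"""


def peak_memory_in_subprocess(script: str) -> int:
    """Run the script in a fresh interpreter and return the peak it reports."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    # The child prints one JSON object on its last line of stdout.
    last_line = result.stdout.strip().splitlines()[-1]
    return json.loads(last_line)["peak_bytes"]


peak_bytes = peak_memory_in_subprocess(_CHILD_SCRIPT)
print(f"child peak: {peak_bytes / 1e6:.1f} MB")
```

The fresh interpreter starts with none of the test runner's modules or fixtures loaded, so the reported peak reflects only what the child script allocates.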
aa7e2bc to 590e979
IP addresses and DNS record type strings (A, AAAA, CNAME, etc.) repeat heavily across events. sys.intern() deduplicates them so all events sharing the same IPs/rdtypes reference the same string object, reducing memory ~10-30% on those fields.
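To illustrate the effect of `sys.intern()` described above, here is a small sketch. The helper `intern_hosts` and the sample values are hypothetical and are not taken from the PR.

```python
import sys


def intern_hosts(hosts):
    """Return a set whose members are the interned versions of the given strings."""
    return {sys.intern(h) for h in hosts}


# Build equal strings at runtime, as they would be when parsed from DNS answers.
first = "".join(["192.0.2.", "10"])
second = "".join(["192.0.2.", "10"])

hosts_a = intern_hosts([first, "A"])
hosts_b = intern_hosts([second, "A"])

# After interning, equal strings from separate events are the same object.
ip_a = next(h for h in hosts_a if h.startswith("192."))
ip_b = next(h for h in hosts_b if h.startswith("192."))
print(ip_a is ip_b)  # True
```

Note that `sys.intern()` accepts only exact `str` instances, so values held as other types (for example `ipaddress` objects) would need converting to `str` first.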
…interning Intern repeated strings in resolved_hosts and dns_children
# 1) Web crawl -- httpx visits many pages, excavate processes bodies
# ---------------------------------------------------------------------------

_WEB_CRAWL_SCRIPT = """
can we break this out into a file?
Summary
extra_info) for --benchmark-save/--benchmark-compare across branches