Performance Testing
This page documents the local/persistent Chroma soak/load validation harness.
Goals
- Detect memory/goroutine/file-descriptor leaks in local runtime usage.
- Detect major query/write latency regressions.
- Verify persistence durability across local runtime restarts.
- Track persistence-store behavior over time (directory size + WAL growth).
Harness Location
- pkg/api/v2/client_local_perf_test.go
- pkg/api/v2/client_local_perf_helpers_test.go
Build tags:
//go:build soak && !cloud
Profiles
Smoke profile
Smoke is used for PR gating with strict threshold enforcement. In CI, this workflow is triggered only on relevant local-runtime/perf path changes.
Default scenarios:
- embedded_synthetic_smoke (90s)
- server_synthetic_smoke (90s)
- embedded_churn_smoke (35 create/close cycles)
- server_churn_smoke (35 create/close cycles)
Run:
Soak profile
Soak is intended for nightly endurance runs in report-only mode.
Default scenarios:
- embedded_synthetic_soak (20m)
- server_synthetic_soak (20m)
- embedded_default_ef_soak (10m, enabled by default in soak)
- server_default_ef_soak (10m, enabled by default in soak)
Run:
Workload Shape
The synthetic workload uses a single write lane plus read workers.
- Read operations: Query + Get (~70% Query, ~30% Get across the read lane)
- Write operations: Upsert (and optional Delete + reinsert)
- Seed phase: batched Add
- Churn workload: repeated NewPersistentClient/Close lifecycle cycles with heartbeat checks
- Sampling period: 5s by default (configurable per scenario)
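The read mix above can be illustrated with a minimal Go sketch. It assumes each read worker draws a uniform random number per operation and maps it to Query or Get. The names ReadOp, queryShare, and pickReadOp are hypothetical and are not taken from the harness, which may select operations differently.

```go
package main

import (
	"fmt"
	"math/rand"
)

// ReadOp names a read operation issued by a read worker (hypothetical type).
type ReadOp string

const (
	OpQuery ReadOp = "Query"
	OpGet   ReadOp = "Get"
)

// queryShare is the approximate fraction of reads that are Query calls.
const queryShare = 0.70

// pickReadOp maps a uniform draw in [0, 1) to a read operation:
// draws below queryShare become Query, the rest become Get.
func pickReadOp(draw float64) ReadOp {
	if draw < queryShare {
		return OpQuery
	}
	return OpGet
}

func main() {
	rng := rand.New(rand.NewSource(1)) // fixed seed keeps the demo repeatable
	counts := map[ReadOp]int{}
	for i := 0; i < 10000; i++ {
		counts[pickReadOp(rng.Float64())]++
	}
	fmt.Printf("Query=%d Get=%d\n", counts[OpQuery], counts[OpGet])
}
```

Passing the draw in as a parameter keeps the selection logic deterministic and easy to test independently of the random source.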
Thresholds
Hard-fail thresholds (smoke when CHROMA_PERF_ENFORCE=true)
- Error rate must be 0
- Query p95 <= 750ms
- Get p95 <= 750ms
- write p95 <= 1500ms
- post-GC heap growth <= max(30% of baseline heap, 64MiB)
- goroutine growth <= +8
- FD growth <= +16 (when measurable)
- durability check must pass for scenarios that require restart verification
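As a rough illustration of how these limits combine, the following Go sketch evaluates one scenario result against the hard-fail list. The scenarioResult type, its field names, and the nearest-rank percentile method are assumptions made for this example. The harness's own summary type and percentile calculation may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Limits taken from the hard-fail list above.
const (
	maxReadP95Ms       = 750.0
	maxWriteP95Ms      = 1500.0
	maxGoroutineGrowth = 8
	maxFDGrowth        = 16
	heapFloorBytes     = 64 << 20 // 64 MiB
)

// scenarioResult is a hypothetical summary shape, not the harness's real type.
type scenarioResult struct {
	Errors            int
	QueryP95Ms        float64
	GetP95Ms          float64
	WriteP95Ms        float64
	BaselineHeapBytes uint64
	PostGCHeapBytes   uint64
	GoroutineGrowth   int
	FDGrowth          int
	FDMeasurable      bool
	NeedsDurability   bool
	DurabilityOK      bool
}

// p95 returns the nearest-rank 95th percentile of the samples (0 if empty).
func p95(samplesMs []float64) float64 {
	if len(samplesMs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samplesMs...)
	sort.Float64s(sorted)
	rank := (95*len(sorted) + 99) / 100 // ceil(0.95 * n) in integer math
	return sorted[rank-1]
}

// heapGrowthLimit returns max(30% of baseline heap, 64 MiB) in bytes.
func heapGrowthLimit(baselineBytes uint64) uint64 {
	limit := baselineBytes * 30 / 100
	if limit < heapFloorBytes {
		return heapFloorBytes
	}
	return limit
}

// violations lists every hard-fail threshold the result breaches.
func violations(r scenarioResult) []string {
	var out []string
	if r.Errors > 0 {
		out = append(out, "error rate above 0")
	}
	if r.QueryP95Ms > maxReadP95Ms {
		out = append(out, "Query p95 above 750ms")
	}
	if r.GetP95Ms > maxReadP95Ms {
		out = append(out, "Get p95 above 750ms")
	}
	if r.WriteP95Ms > maxWriteP95Ms {
		out = append(out, "write p95 above 1500ms")
	}
	if r.PostGCHeapBytes > r.BaselineHeapBytes &&
		r.PostGCHeapBytes-r.BaselineHeapBytes > heapGrowthLimit(r.BaselineHeapBytes) {
		out = append(out, "post-GC heap growth above limit")
	}
	if r.GoroutineGrowth > maxGoroutineGrowth {
		out = append(out, "goroutine growth above +8")
	}
	if r.FDMeasurable && r.FDGrowth > maxFDGrowth {
		out = append(out, "FD growth above +16")
	}
	if r.NeedsDurability && !r.DurabilityOK {
		out = append(out, "durability check failed")
	}
	return out
}

func main() {
	r := scenarioResult{
		QueryP95Ms:        p95([]float64{120, 300, 810, 95, 640}),
		BaselineHeapBytes: 100 << 20,
		PostGCHeapBytes:   180 << 20,
	}
	fmt.Println(violations(r))
}
```

Note that with a 100 MiB baseline, 30% is only 30 MiB, so the 64 MiB floor is the effective limit. The percentage term only takes over once the baseline exceeds roughly 213 MiB.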
Report-only alerts (soak by default)
- Heap slope alert:
  - synthetic: >3 MiB/min
  - default_ef: >8 MiB/min
- Goroutine slope alert: >0.2/min
- WAL anomaly: >=90% non-decreasing WAL samples and final WAL >4x median without record-count growth
- Throughput drift: last quartile throughput >20% lower than first quartile
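The throughput-drift and WAL-anomaly rules can be sketched in Go as follows. This is one possible reading of the rules, under stated assumptions: quartiles are taken as the first and last quarter of the samples in time order, "non-decreasing" is measured over consecutive sample pairs, and the function names and the recordCountGrew flag are hypothetical. The harness may compute these signals differently.

```go
package main

import (
	"fmt"
	"sort"
)

// mean returns the arithmetic mean of xs; callers pass non-empty slices.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// throughputDriftAlert reports whether mean throughput in the last quarter
// of the samples is more than 20% lower than in the first quarter.
func throughputDriftAlert(opsPerSec []float64) bool {
	q := len(opsPerSec) / 4
	if q == 0 {
		return false // too few samples to form quartiles
	}
	first := mean(opsPerSec[:q])
	last := mean(opsPerSec[len(opsPerSec)-q:])
	if first <= 0 {
		return false
	}
	return (first-last)/first > 0.20
}

// median returns the median of xs without modifying the input.
func median(xs []int64) float64 {
	sorted := append([]int64(nil), xs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	n := len(sorted)
	if n%2 == 1 {
		return float64(sorted[n/2])
	}
	return float64(sorted[n/2-1]+sorted[n/2]) / 2
}

// walAnomaly applies the three-part rule: at least 90% of consecutive WAL
// samples are non-decreasing, the final sample exceeds 4x the median, and
// the record count did not grow.
func walAnomaly(walBytes []int64, recordCountGrew bool) bool {
	if len(walBytes) < 2 || recordCountGrew {
		return false
	}
	nonDecreasing := 0
	for i := 1; i < len(walBytes); i++ {
		if walBytes[i] >= walBytes[i-1] {
			nonDecreasing++
		}
	}
	frac := float64(nonDecreasing) / float64(len(walBytes)-1)
	final := float64(walBytes[len(walBytes)-1])
	return frac >= 0.90 && final > 4*median(walBytes)
}

func main() {
	throughput := []float64{100, 100, 100, 100, 100, 100, 70, 70}
	wal := []int64{1, 1, 1, 1, 1, 1, 1, 1, 1, 10}
	fmt.Println("throughput drift:", throughputDriftAlert(throughput))
	fmt.Println("WAL anomaly:", walAnomaly(wal, false))
}
```

Requiring all three WAL conditions together avoids flagging a WAL that grows simply because the collection itself is growing.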
Reports
Each scenario writes a JSON summary:
perf-summary-<profile>-<scenario>.json
A profile-level Markdown summary is also generated:
perf-summary-<profile>.md
The CI workflow publishes these artifacts and appends the Markdown summary to job output.
Environment Variables
- CHROMA_PERF_PROFILE - smoke or soak (default: smoke)
- CHROMA_PERF_ENFORCE - true/false (default: true for smoke, false for soak)
- CHROMA_PERF_INCLUDE_DEFAULT_EF - include default_ef scenarios (default: false for smoke, true for soak)
- CHROMA_PERF_REPORT_DIR - directory for JSON/Markdown reports
- CHROMA_PERF_ENABLE_DELETE_REINSERT - enable delete+reinsert writes (default: false)
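To make the profile-dependent defaults concrete, here is a minimal Go sketch of resolving these variables. The perfConfig type and the loadPerfConfig and envBool helpers are hypothetical names for this example. Falling back to the default on an unrecognized value is also an assumption, since the harness's handling of invalid input is not documented here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// perfConfig is a hypothetical container for the resolved settings.
type perfConfig struct {
	Profile              string
	Enforce              bool
	IncludeDefaultEF     bool
	ReportDir            string
	EnableDeleteReinsert bool
}

// lookupFunc matches the signature of os.LookupEnv so tests can inject a map.
type lookupFunc func(key string) (string, bool)

// envBool parses a boolean variable, returning def when the variable is
// unset or unparsable (an assumption; the harness may reject bad values).
func envBool(lookup lookupFunc, key string, def bool) bool {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	v, err := strconv.ParseBool(raw)
	if err != nil {
		return def
	}
	return v
}

// loadPerfConfig resolves the profile first, because the defaults for
// enforcement and default_ef scenarios depend on it.
func loadPerfConfig(lookup lookupFunc) perfConfig {
	profile := "smoke"
	if raw, ok := lookup("CHROMA_PERF_PROFILE"); ok && raw == "soak" {
		profile = "soak"
	}
	isSoak := profile == "soak"
	reportDir, _ := lookup("CHROMA_PERF_REPORT_DIR") // empty when unset
	return perfConfig{
		Profile:              profile,
		Enforce:              envBool(lookup, "CHROMA_PERF_ENFORCE", !isSoak),
		IncludeDefaultEF:     envBool(lookup, "CHROMA_PERF_INCLUDE_DEFAULT_EF", isSoak),
		ReportDir:            reportDir,
		EnableDeleteReinsert: envBool(lookup, "CHROMA_PERF_ENABLE_DELETE_REINSERT", false),
	}
}

func main() {
	cfg := loadPerfConfig(os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```

Injecting the lookup function rather than calling os.LookupEnv directly lets the defaults be checked without mutating the process environment.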
Current Runtime Caveat
Delete+reinsert operations are disabled by default in this harness.
Reason: current local runtime builds can assert/abort under delete-heavy vector
index mutation (hnswalg.h integrity assertion). The harness keeps this path
behind CHROMA_PERF_ENABLE_DELETE_REINSERT=true so teams can opt in for
focused investigation without destabilizing baseline smoke/soak gates.