forked from Snider/Poindexter
74 lines
3.3 KiB
Markdown
74 lines
3.3 KiB
Markdown
# Performance: KDTree benchmarks and guidance
|
||
|
||
This page summarizes how to measure KDTree performance in this repository and how to compare the two internal backends (Linear vs Gonum) that you can select at build/runtime.
|
||
|
||
## How benchmarks are organized
|
||
|
||
- Micro-benchmarks live in `bench_kdtree_test.go`, `bench_kdtree_dual_test.go`, and `bench_kdtree_dual_100k_test.go` and cover:
|
||
- `Nearest` in 2D and 4D with N = 1k, 10k (both backends)
|
||
- `Nearest` in 2D and 4D with N = 100k (gonum-tag job; linear also measured there)
|
||
- `KNearest(k=10)` in 2D/4D with N = 1k, 10k
|
||
- `Radius` (mid radius r≈0.5 after normalization) in 2D/4D with N = 1k, 10k
|
||
- Datasets: Uniform and 3-cluster synthetic generators in normalized [0,1] spaces.
|
||
- Backends: Linear (always available) and Gonum (enabled when built with `-tags=gonum`).
|
||
|
||
Run them locally:
|
||
|
||
```bash
|
||
# Linear backend (default)
|
||
go test -bench . -benchmem -run=^$ ./...
|
||
|
||
# Gonum backend (optimized KD; requires build tag)
|
||
go test -tags=gonum -bench . -benchmem -run=^$ ./...
|
||
```
|
||
|
||
GitHub Actions publishes benchmark artifacts on every push/PR:
|
||
- Linear job: artifact `bench-linear.txt`
|
||
- Gonum job: artifact `bench-gonum.txt`
|
||
|
||
## Backend selection and defaults
|
||
|
||
- Default backend is Linear.
|
||
- If you build with `-tags=gonum`, the default switches to the optimized KD backend.
|
||
- You can override at runtime:
|
||
|
||
```
|
||
// Force Linear
|
||
kdt, _ := poindexter.NewKDTree(pts, poindexter.WithBackend(poindexter.BackendLinear))
|
||
// Force Gonum (requires build tag)
|
||
kdt, _ := poindexter.NewKDTree(pts, poindexter.WithBackend(poindexter.BackendGonum))
|
||
```
|
||
|
||
Supported metrics in the optimized backend: L2 (Euclidean), L1 (Manhattan), L∞ (Chebyshev). Cosine/Weighted-Cosine currently use the Linear backend.
|
||
|
||
## What to expect (rule of thumb)
|
||
|
||
- Linear backend: O(n) per query; fast for small-to-medium datasets (≤10k), especially in low dims (≤4).
|
||
- Gonum backend: typically sub-linear for prunable datasets and dims ≤ ~8, with noticeable gains as N grows (≥10k–100k), especially on uniform or moderately clustered data and moderate radii.
|
||
- For large radii (many points within r) or highly correlated/pathological data, pruning may be less effective and behavior approaches O(n) even with KD-trees.
|
||
|
||
## Interpreting results
|
||
|
||
Benchmarks output something like:
|
||
|
||
```
|
||
BenchmarkNearest_10k_4D_Gonum_Uniform-8 50000 12,300 ns/op 0 B/op 0 allocs/op
|
||
```
|
||
|
||
- `ns/op`: lower is better (nanoseconds per operation)
|
||
- `B/op` and `allocs/op`: memory behavior; fewer is better
|
||
- `KNearest` incurs extra work due to sorting; `Radius` cost scales with the number of hits.
|
||
|
||
## Improving performance
|
||
|
||
- Normalize and weight features once; reuse across queries (see `Build*WithStats` helpers).
|
||
- Choose a metric aligned with your policy: L2 usually a solid default; L1 for per-axis penalties; L∞ for hard-threshold dominated objectives.
|
||
- Batch queries to benefit from CPU caches.
|
||
- Prefer the Gonum backend for larger N and dims ≤ ~8; stick to Linear for tiny datasets or when using Cosine metrics.
|
||
|
||
## Reproducing and tracking performance
|
||
|
||
- Local (Linear): `go test -bench . -benchmem -run=^$ ./...`
|
||
- Local (Gonum): `go test -tags=gonum -bench . -benchmem -run=^$ ./...`
|
||
- CI artifacts: download `bench-linear.txt` and `bench-gonum.txt` from the latest workflow run.
|
||
- Optional: add historical trend graphs via Benchstat or Codecov integration.
|