Enhance CI workflow with coverage and benchmark reporting for KDTree

2025-11-03 18:26:40 +00:00
commit f106261216
parent dad508de31
5 changed files with 84 additions and 2 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -36,8 +36,22 @@ jobs:
      - name: Vet
        run: go vet ./...

-      - name: Test (race)
-        run: go test -race ./...
+      - name: Test (race + coverage)
+        run: go test -race -coverprofile=coverage.out -covermode=atomic ./...
+
+      - name: Upload coverage artifact
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: coverage-${{ matrix.go-version }}
+          path: coverage.out
+
+      - name: Upload to Codecov
+        uses: codecov/codecov-action@v4
+        with:
+          files: coverage.out
+          flags: unit
+          fail_ci_if_error: false

      - name: Build examples
        run: |
@@ -45,6 +59,17 @@ jobs:
            go build ./examples/...
          fi

+      - name: Benchmarks (benchmem)
+        run: |
+          go test -bench . -benchmem -run=^$ ./... | tee bench-${{ matrix.go-version }}.txt
+
+      - name: Upload benchmark artifact
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: bench-${{ matrix.go-version }}
+          path: bench-${{ matrix.go-version }}.txt
+
      - name: Vulncheck
        uses: golang/govulncheck-action@v1
        with:
--- a/README.md
+++ b/README.md
@@ -4,6 +4,7 @@
 [![CI](https://github.com/Snider/Poindexter/actions/workflows/ci.yml/badge.svg)](https://github.com/Snider/Poindexter/actions)
 [![Go Report Card](https://goreportcard.com/badge/github.com/Snider/Poindexter)](https://goreportcard.com/report/github.com/Snider/Poindexter)
 [![Vulncheck](https://img.shields.io/badge/govulncheck-enabled-brightgreen.svg)](https://pkg.go.dev/golang.org/x/vuln/cmd/govulncheck)
+[![codecov](https://codecov.io/gh/Snider/Poindexter/branch/main/graph/badge.svg)](https://codecov.io/gh/Snider/Poindexter)

 A Go library package providing utility functions including sorting algorithms with custom comparators.

--- a/docs/index.md
+++ b/docs/index.md
@@ -53,3 +53,7 @@ Contributions are welcome! Please feel free to submit a Pull Request.

 - Find the best (lowest‑ping) DHT peer using KDTree: [Best Ping Peer (DHT)](dht-best-ping.md)
 - Multi-dimensional neighbor search over ping, hops, geo, and score: [Multi-Dimensional KDTree (DHT)](kdtree-multidimensional.md)
+
+## Performance
+
+- Benchmark methodology and guidance: [Performance](perf.md)
--- a/docs/perf.md
+++ b/docs/perf.md
@@ -0,0 +1,51 @@
+# Performance: KDTree benchmarks and guidance
+
+This page summarizes how to measure KDTree performance in this repository and when to consider switching the internal engine to `gonum.org/v1/gonum/spatial/kdtree` for large datasets.
+
+## How benchmarks are organized
+
+- Micro-benchmarks live in `bench_kdtree_test.go` and cover:
+  - `Nearest` in 2D and 4D with N = 1k, 10k
+  - `KNearest(k=10)` in 2D with N = 1k, 10k
+  - `Radius` (mid radius) in 2D with N = 1k, 10k
+- All benchmarks operate in normalized [0,1] spaces and use the current linear-scan implementation.
+
+Run them locally:
+
+```bash
+go test -bench . -benchmem -run=^$ ./...
+```
+
+GitHub Actions publishes benchmark artifacts for Go 1.22 and 1.23 on every push/PR. Look for artifacts named `bench-<go-version>.txt` in the CI run.
+
+## What to expect (rule of thumb)
+
+- Time complexity is O(n) per query in the current implementation.
+- For small-to-medium datasets (up to ~10k points), linear scans are often fast enough, especially at low dimensionality (≤4) and when queries are batched.
+- For larger datasets (≥100k) and low/medium dimensions (≤8), a true KD-tree (like Gonum’s) typically answers queries in sub-linear time (roughly O(log n) on average at low dimensions), which gives significantly lower latency.
+
+## Interpreting results
+
+Benchmarks output something like:
+
+```
+BenchmarkNearest_10k_4D-8      50000         23000 ns/op      0 B/op      0 allocs/op
+```
+
+- `ns/op`: lower is better (nanoseconds per operation)
+- `B/op` and `allocs/op`: memory behavior; fewer is better
+
+Because `KNearest` sorts by distance, you should expect additional cost over `Nearest`. `Radius` cost depends on how many points fall within the radius; tighter radii usually run faster.
+
+## Improving performance
+
+- Prefer Euclidean (L2) distance over metrics that need extra branching, which tends to be slower on pipelined CPUs, unless your use case calls for a different metric.
+- Normalize and weight features once; reuse coordinates across queries.
+- Batch queries to amortize per-call overhead and to benefit from data locality and warm caches.
+- Consider a backend swap to Gonum’s KD-tree for large N (we plan to add a `WithBackend("gonum")` option).
+
+## Reproducing and tracking performance
+
+- Local: run `go test -bench . -benchmem -run=^$ ./...`
+- CI: download `bench-*.txt` artifacts from the latest workflow run
+- Optional: compare saved runs with `benchstat`; Codecov tracks coverage rather than benchmark results, so historical trend graphs would need a separate integration if desired.
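
Note on docs/perf.md: the page describes the current engine as a linear scan with O(n) cost per query. The sketch below illustrates what such a scan could look like so the stated complexity is concrete. It is not the library's code: the function name `nearestLinear` and the `[][]float64` point layout are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"math"
)

// nearestLinear is a hypothetical helper (not part of the library's API).
// It returns the index of the point closest to q by Euclidean distance,
// together with that distance. Every point is visited once, so one query
// costs O(n*d) for n points of dimension d.
// It assumes each point has at least len(q) coordinates, and returns
// (-1, +Inf) when points is empty.
func nearestLinear(points [][]float64, q []float64) (int, float64) {
	best, bestSq := -1, math.Inf(1)
	for i, p := range points {
		var sq float64
		for j := range q {
			diff := p[j] - q[j]
			sq += diff * diff
		}
		// Compare squared distances; the square root is taken once at the end.
		if sq < bestSq {
			best, bestSq = i, sq
		}
	}
	return best, math.Sqrt(bestSq)
}

func main() {
	// Points in a normalized [0,1] space, as in the benchmarks described above.
	points := [][]float64{{0.1, 0.2}, {0.8, 0.9}, {0.5, 0.5}}
	idx, dist := nearestLinear(points, []float64{0.55, 0.45})
	fmt.Printf("nearest index=%d distance=%.4f\n", idx, dist)
}
```

Comparing squared distances avoids a square root per point, which is one reason a plain L2 scan stays cheap for small N. A tree-based backend would replace the loop over all points with a pruned traversal.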
--- a/mkdocs.yml
+++ b/mkdocs.yml
@ -59,6 +59,7 @@ nav:
      - Best Ping Peer (DHT): dht-best-ping.md
      - Multi-Dimensional KDTree (DHT): kdtree-multidimensional.md
  - API Reference: api.md
+  - Performance: perf.md
  - License: license.md

 copyright: Copyright &copy; 2025 Snider
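
Note on the benchmark artifacts: the workflow above uploads `bench-<go-version>.txt` files containing standard `go test -bench -benchmem` output. For quick ad hoc checks, a result line can be parsed with a few lines of Go, as sketched below. The type `benchResult` and the function `parseBenchLine` are hypothetical names introduced here, not part of the repository; for real comparisons between runs, `benchstat` (golang.org/x/perf/cmd/benchstat) is the established tool.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// benchResult holds the fields of one benchmark result line.
// Hypothetical type, introduced only for this sketch.
type benchResult struct {
	Name        string
	Iterations  int64
	NsPerOp     float64
	BytesPerOp  int64
	AllocsPerOp int64
}

// parseBenchLine parses a line such as
//
//	BenchmarkNearest_10k_4D-8  50000  23000 ns/op  0 B/op  0 allocs/op
//
// and reports ok=false for lines that are not benchmark results
// (for example "PASS" or "ok <package> 1.2s").
func parseBenchLine(line string) (benchResult, bool) {
	f := strings.Fields(line)
	if len(f) < 4 || !strings.HasPrefix(f[0], "Benchmark") {
		return benchResult{}, false
	}
	iters, err := strconv.ParseInt(f[1], 10, 64)
	if err != nil {
		return benchResult{}, false
	}
	r := benchResult{Name: f[0], Iterations: iters}
	// The remaining fields come in (value, unit) pairs; unknown units are skipped.
	for i := 2; i+1 < len(f); i += 2 {
		switch f[i+1] {
		case "ns/op":
			r.NsPerOp, err = strconv.ParseFloat(f[i], 64)
		case "B/op":
			r.BytesPerOp, err = strconv.ParseInt(f[i], 10, 64)
		case "allocs/op":
			r.AllocsPerOp, err = strconv.ParseInt(f[i], 10, 64)
		}
		if err != nil {
			return benchResult{}, false
		}
	}
	return r, true
}

func main() {
	line := "BenchmarkNearest_10k_4D-8   50000   23000 ns/op   0 B/op   0 allocs/op"
	if r, ok := parseBenchLine(line); ok {
		fmt.Printf("%s: %.0f ns/op, %d B/op, %d allocs/op\n",
			r.Name, r.NsPerOp, r.BytesPerOp, r.AllocsPerOp)
	}
}
```

The parser rejects a line when a numeric field fails to parse, so a hand-edited value with thousands separators would be skipped rather than misread.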