go-help/TODO.md
Claude 3e91510bcf
feat(search): add fuzzy matching, phrase search, and improved scoring
Phase 0: Push test coverage from 92.1% to 100% by adding catalog_test.go
and targeted tests for all uncovered branches in search.go. Add
BenchmarkSearch with 150 topics (~745us/op baseline).

Phase 1: Implement three search improvements:
- Levenshtein-based fuzzy matching (max distance 2, words >= 3 chars)
- Quoted phrase search via extractPhrases() with +8.0 boost
- Tag boost (+3.0) and multi-word bonus (+2.0) scoring
- Named scoring constants replacing magic numbers

All changes are backward-compatible; Search() signature unchanged.

Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 01:21:35 +00:00


TODO.md — go-help

Dispatched from core/go orchestration. Pick up tasks in order.


Phase 0: Hardening & Test Coverage

  • Expand parser tests — Parser already at 100%. Existing tests cover: empty input, frontmatter-only, malformed YAML, all heading levels (H1-H6), Unicode content, path-derived IDs.
  • Expand search tests — Added tests for: empty query, no results, case sensitivity, multi-word queries, special characters, overlapping matches, scoring boundary cases, nil-topic guard, snippet edge cases (headings-only, whitespace-only).
  • Add catalog tests — Created catalog_test.go covering: DefaultCatalog, Add, List, Search, Get (found/not-found), score tie-breaking.
  • Benchmark search (BenchmarkSearch) with 150 topics. Baseline: ~745us/op, ~392KB/op, 4114 allocs/op (Ryzen 9 9950X).
  • go vet ./... clean — No warnings.
  • Coverage: 100% — Up from 92.1%.

Phase 1: Search Improvements

  • Fuzzy matching — Levenshtein distance with max edit distance of 2. Words under 3 chars skip fuzzy. Score: +0.3 per fuzzy match (below prefix +0.5 and exact +1.0).
  • Add stemming support for English search terms
  • Phrase search — Quoted multi-word queries via extractPhrases(). Phrase boost: +8.0. Searches title, content, and section content.
  • Improved scoring weights — Named constants: title +10, section +5, tag +3, phrase +8, all-words bonus +2, exact +1.0, prefix +0.5, fuzzy +0.3.
  • Tag boost — Query words matching tags add +3.0 per matching tag.
  • Multi-word bonus — All query words present in topic adds +2.0.
  • Tests for all new features — Levenshtein, min3, extractPhrases, fuzzy search, phrase search, tag boost, multi-word bonus, scoring constants, phrase highlighting, section phrase matching.

Phase 2: core.help Integration

  • Feed CLI docs into help catalog (parse core subcommand help text)
  • Serve catalog via HTTP for the core.help domain
  • Add Markdown rendering for topic bodies in HTTP responses
  • Generate static site from catalog for BunnyCDN deployment
  • Embed help topics into go-rag collections for semantic search
  • Add vector similarity fallback when keyword search returns no results
  • Support natural language queries ("how do I push all repos?")

Workflow

  1. Virgil in core/go writes tasks here after research
  2. This repo's dedicated session picks up tasks in phase order
  3. Mark [x] when done, note commit hash