# FINDINGS.md -- go-help

## 2026-02-19: Split from core/go (Virgil)

### Origin

Extracted from `forge.lthn.ai/core/go` `pkg/help/` on 19 Feb 2026.

### Architecture

- `Topic` struct with title, body, tags, related topics
- `Section` groups topics under a heading
- `Frontmatter` for YAML metadata in topic files
- `Catalog` loads topics from YAML files on disk
- `Search` provides keyword search across topics with scoring:
  - Title match: +10
  - Section match: +5
  - Partial/body match: +0.5

### Dependencies

- Pure Go, only external dependency is `gopkg.in/yaml.v3`

### Tests

- 2 test files covering catalog loading and search behaviour

## 2026-02-20: Phase 0 + Phase 1 (Charon)

### Phase 0: Coverage 92.1% -> 100%

- Created `catalog_test.go`: the entire `catalog.go` was untested (0%)
- Added targeted search tests for previously uncovered branches:
  - Nil topic guard in `Search()` (stale index references)
  - Alphabetical tie-breaking when scores are equal
  - Headings-only content in snippet extraction (no body text)
  - Whitespace-only content trimmed to empty in snippets
  - Empty regex slice in `highlight()`
  - Overlapping match extension in highlight merging
- Added `BenchmarkSearch` with 150 generated topics
  - Baseline: ~745us/op, ~392KB/op, 4114 allocs/op (Ryzen 9 9950X)
- `go vet ./...` clean

### Phase 1: Search Improvements

#### Fuzzy Matching (Levenshtein distance)

- Implemented `levenshtein()` using two-row DP (memory-efficient)
- Integrated into `Search()` with max edit distance of 2
- Only applied to query words >= 3 characters (avoids noise from short words)
- Score: +0.3 per fuzzy match (lower than prefix +0.5 and exact +1.0)
- Skips words already matched as exact or prefix (no double-counting)

#### Phrase Search

- `extractPhrases()` pulls `"quoted strings"` from the query
- Remaining text is tokenised normally for keyword search
- Phrase matching checks title + content + all section content (case-insensitive)
- Phrase boost: +8.0 per matching phrase
- Phrase terms are also compiled as regexes for snippet highlighting
- Empty quotes `""` are left as-is (regex requires `[^"]+`)
- Whitespace-only quotes are ignored

#### Improved Scoring Weights

- Replaced magic numbers with named constants for clarity:
  - `scoreExactWord = 1.0` -- exact word in index
  - `scorePrefixWord = 0.5` -- prefix/partial word match
  - `scoreFuzzyWord = 0.3` -- Levenshtein fuzzy match
  - `scoreTitleBoost = 10.0` -- query word in topic title
  - `scoreSectionBoost = 5.0` -- query word in section title
  - `scoreTagBoost = 3.0` -- query word matches a tag (NEW)
  - `scorePhraseBoost = 8.0` -- exact phrase match (NEW)
  - `scoreAllWords = 2.0` -- all query words present (NEW)
  - `fuzzyMaxDistance = 2` -- max Levenshtein distance

#### New Scoring Features

- **Tag boost** (+3.0): topics with tags matching query words rank higher
- **Multi-word bonus** (+2.0): topics containing ALL query words get a bonus
- Both are additive with existing boosts (title, section, exact/prefix)

### API Compatibility

- `Search(query string) []*SearchResult` signature unchanged
- All existing behaviour preserved; new features are additive
- Existing tests pass without modification
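
For reference, the two-row Levenshtein DP and the quoted-phrase extraction noted under Phase 1 could look roughly like the sketch below. It is not the package's actual code: the function bodies, the `extractPhrases` return shape, and the helpers `min3` and `phraseRe` are assumptions made for illustration. Only the function names, the two-row approach, and the `[^"]+` pattern come from the notes above.

```go
package main

import (
	"regexp"
	"strings"
)

// min3 is a hypothetical helper returning the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between a and b.
// It keeps only two rows of the DP table, so memory is O(len(b)).
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			// deletion, insertion, substitution
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev // reuse the old row on the next pass
	}
	return prev[len(rb)]
}

// phraseRe requires at least one non-quote character, so an empty
// pair of quotes ("") never matches and stays in the query.
var phraseRe = regexp.MustCompile(`"([^"]+)"`)

// extractPhrases (assumed signature) returns the quoted phrases and
// the remaining query text. Whitespace-only phrases are dropped.
func extractPhrases(query string) (phrases []string, rest string) {
	for _, m := range phraseRe.FindAllStringSubmatch(query, -1) {
		if p := strings.TrimSpace(m[1]); p != "" {
			phrases = append(phrases, p)
		}
	}
	rest = strings.TrimSpace(phraseRe.ReplaceAllString(query, " "))
	return phrases, rest
}
```

With `fuzzyMaxDistance = 2`, a misspelled query word such as `serach` is two edits from `search` and would still qualify for the +0.3 fuzzy score. Under this sketch, a query like `"rate limit" config` yields the phrase `rate limit` and leaves `config` for normal tokenisation.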