Delete page "Search-Engine.-"

2026-02-19 16:57:06 +00:00 · 2026-02-19 16:57:06 +00:00 · 46fd1262ab
commit 46fd1262ab
parent fdcf0f2866
1 changed file with 0 additions and 114 deletions
--- a/Search-Engine.-.md
+++ b/Search-Engine.-.md
@@ -1,114 +0,0 @@
-# Search Engine
-
-The help package includes a full-text search engine built on an inverted index with scoring, prefix matching, and snippet extraction.
-
-## SearchResult
-
-```go
-type SearchResult struct {
-    Topic   *Topic
-    Section *Section // nil if topic-level match
-    Score   float64
-    Snippet string   // Context around match with highlighting
-}
-```
-
-Results are sorted by score (highest first). Ties are broken alphabetically by topic title.
-
-## How Search Works
-
-```go
-results := catalog.Search("installation guide")
-```
-
-1. **Tokenise** the query into lowercase words (min 2 characters).
-2. **Look up** each word in the inverted index for exact matches (score +1.0 per match).
-3. **Prefix match** — words in the index that start with a query word score +0.5 (e.g. query `"config"` matches indexed word `"configuration"`).
-4. **Title boost** — if any query word appears in the topic title, add +10.0.
-5. **Section title boost** — if the best-matching section title contains a query word, add +5.0.
-6. **Extract snippet** from the best-matching section content.
-7. **Sort** results by score descending.
-
-## Tokenisation
-
-```go
-func tokenize(text string) []string
-```
-
-Splits text into lowercase words. Rules:
-
-- Letters and digits are kept together as words.
-- Punctuation and whitespace are word separators.
-- Single-character words are discarded (minimum length is 2).
-
-| Input | Tokens |
-|-------|--------|
-| `"Hello World"` | `["hello", "world"]` |
-| `"v2.0 and config123"` | `["v2", "and", "config123"]` |
-| `"a b c hello d"` | `["hello"]` |
-
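The rules above can be sketched in a few lines of Go. This is an illustration only, not the package's actual source: it assumes "letters and digits" means Unicode letters and digits and that the minimum length is counted in runes rather than bytes.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// tokenize is a sketch of the documented rules; the real implementation
// may differ in how it classifies characters.
func tokenize(text string) []string {
	// Lowercase first, then split on every rune that is neither a letter
	// nor a digit (punctuation and whitespace act as separators).
	fields := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	tokens := make([]string, 0, len(fields))
	for _, f := range fields {
		// Discard words shorter than two characters.
		if utf8.RuneCountInString(f) >= 2 {
			tokens = append(tokens, f)
		}
	}
	return tokens
}

func main() {
	fmt.Println(tokenize("v2.0 and config123")) // prints [v2 and config123]
}
```

Note that `"v2.0"` yields `"v2"` only: the `.` splits it into `"v2"` and `"0"`, and the single-character `"0"` is then dropped.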
-## Inverted Index
-
-The search index maps each unique word to a list of topic IDs:
-
-```
-"installation" -> ["getting-started", "installation-guide"]
-"config"       -> ["configuration"]
-"cli"          -> ["getting-started", "configuration"]
-```
-
-When a topic is added via `Catalog.Add`, the index is populated from:
-- Topic title words
-- Topic content words
-- Section title words
-- Section content words
-- Tag words
-
-Duplicate word-to-topic mappings are prevented.
-
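A minimal sketch of such an index, including the duplicate check. The `index` type and its `add` method are hypothetical names chosen for illustration; the package's internal representation is not shown in this page.

```go
package main

import "fmt"

// index maps a word to the IDs of the topics that contain it.
// Hypothetical type, not the package's own.
type index map[string][]string

// add records the word-to-topic mapping unless that exact pair
// is already present.
func (ix index) add(word, topicID string) {
	for _, id := range ix[word] {
		if id == topicID {
			return // mapping already recorded
		}
	}
	ix[word] = append(ix[word], topicID)
}

func main() {
	ix := index{}
	ix.add("cli", "getting-started")
	ix.add("cli", "configuration")
	ix.add("cli", "getting-started") // duplicate, ignored
	fmt.Println(ix["cli"])           // prints [getting-started configuration]
}
```

A linear scan is enough for short posting lists; a set per word would be the usual alternative if topics were numerous.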
-## Scoring
-
-| Match Type | Points |
-|------------|--------|
-| Exact word match in index | +1.0 per query word |
-| Prefix match in index | +0.5 per query word |
-| Query word in topic title | +10.0 (once) |
-| Query word in best section title | +5.0 (once) |
-
-Within `findBestMatch`, section title matches count double when selecting the best section.
-
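As a worked example of the point values in the table, the sketch below scores a single topic. `scoreTopic` is a hypothetical helper, and it makes several assumptions the page does not confirm: an exact match is not also counted as a prefix match, the title check compares whole tokens, and the section title boost is left out for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// scoreTopic scores one topic against the query words.
// indexed holds the unique words indexed for the topic; title holds
// the tokenised topic title. Hypothetical helper for illustration.
func scoreTopic(query, indexed, title []string) float64 {
	var score float64
	titleHit := false
	for _, q := range query {
		for _, w := range indexed {
			switch {
			case w == q:
				score += 1.0 // exact word match
			case strings.HasPrefix(w, q):
				score += 0.5 // prefix match (assumed to exclude exact matches)
			}
		}
		for _, t := range title {
			if t == q {
				titleHit = true
			}
		}
	}
	if titleHit {
		score += 10.0 // title boost, applied once per topic
	}
	return score
}

func main() {
	query := []string{"installation", "guide"}
	indexed := []string{"installation", "guide", "installations"}
	title := []string{"installation", "guide"}
	// 1.0 + 0.5 + 1.0 from the index, plus 10.0 for the title.
	fmt.Println(scoreTopic(query, indexed, title)) // prints 12.5
}
```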
-## Snippet Extraction
-
-```go
-func extractSnippet(content string, regexes []*regexp.Regexp) string
-```
-
-Extracts up to 150 characters of context around the first match:
-
-1. Finds the earliest match position across all query regexes.
-2. Positions a window of up to 150 characters around that position, starting 50 characters before the match.
-3. Trims to word boundaries and adds `"..."` prefix/suffix when truncated.
-4. Applies bold highlighting to all matches.
-
-If no regexes are provided, returns the first non-empty, non-heading line of content (truncated to 150 characters).
-
-### Highlighting
-
-Matches are wrapped in `**bold**` markup. The highlighting engine:
-
-- Uses case-insensitive regex matching.
-- Sorts matches by position and merges overlapping/adjacent spans to avoid nested markers (e.g. `"**Searching**"` not `"**Search****ing**"`).
-- Applies replacements back-to-front to avoid position shifts.
-
-Examples:
-
-| Content | Query | Highlighted |
-|---------|-------|-------------|
-| `"The configuration is complete."` | `"config"` | `"The **config**uration is complete."` |
-| `"Searching for something."` | `"search", "searching"` | `"**Searching** for something."` |
-
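The merge-then-replace approach described above could look roughly like this. `highlight` and `highlightQuery` are hypothetical names, and building each regex from a quoted query word is an assumption; only the three steps themselves come from the description.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// highlight wraps every match in ** markers, merging overlapping or
// adjacent spans first. Hypothetical helper for illustration.
func highlight(content string, regexes []*regexp.Regexp) string {
	// Collect [start, end) byte spans for every match of every regex.
	var spans [][2]int
	for _, re := range regexes {
		for _, m := range re.FindAllStringIndex(content, -1) {
			spans = append(spans, [2]int{m[0], m[1]})
		}
	}
	if len(spans) == 0 {
		return content
	}
	sort.Slice(spans, func(i, j int) bool { return spans[i][0] < spans[j][0] })

	// Merge spans that overlap or touch, so markers never nest.
	merged := [][2]int{spans[0]}
	for _, s := range spans[1:] {
		last := &merged[len(merged)-1]
		if s[0] <= last[1] {
			if s[1] > last[1] {
				last[1] = s[1]
			}
			continue
		}
		merged = append(merged, s)
	}

	// Replace back-to-front so earlier offsets remain valid.
	for i := len(merged) - 1; i >= 0; i-- {
		s := merged[i]
		content = content[:s[0]] + "**" + content[s[0]:s[1]] + "**" + content[s[1]:]
	}
	return content
}

// highlightQuery builds one case-insensitive regex per query word
// (an assumed construction) and applies highlight.
func highlightQuery(content string, words ...string) string {
	regexes := make([]*regexp.Regexp, 0, len(words))
	for _, w := range words {
		regexes = append(regexes, regexp.MustCompile("(?i)"+regexp.QuoteMeta(w)))
	}
	return highlight(content, regexes)
}

func main() {
	fmt.Println(highlightQuery("Searching for something.", "search", "searching"))
	// prints **Searching** for something.
}
```

Without the merge step, the two overlapping matches `[0,6)` and `[0,9)` would each receive their own markers and produce nested or doubled `**`.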
-## UTF-8 Safety
-
-All snippet extraction operates on `[]rune` slices rather than byte offsets, ensuring multi-byte characters (e.g. Japanese, accented Latin) are never split mid-character. The byte position of a regex match is converted to a rune position before windowing.
-
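The byte-to-rune conversion can be illustrated as follows. `runeWindow` is a hypothetical helper with made-up parameters; it omits the word-boundary trimming and ellipsis handling, and only shows why windowing on `[]rune` keeps multi-byte characters intact.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// runeWindow returns up to size runes of content, starting before runes
// ahead of the match that begins at byte offset byteStart.
// Hypothetical helper for illustration.
func runeWindow(content string, byteStart, before, size int) string {
	runes := []rune(content)
	// Regex matches report byte offsets; count the runes that precede
	// the match to obtain its rune index.
	runePos := utf8.RuneCountInString(content[:byteStart])
	start := runePos - before
	if start < 0 {
		start = 0
	}
	end := start + size
	if end > len(runes) {
		end = len(runes)
	}
	// Slicing runes cannot cut a multi-byte character in half.
	return string(runes[start:end])
}

func main() {
	content := "日本語のドキュメント: installation guide"
	byteStart := strings.Index(content, "installation") // byte offset 32
	fmt.Println(runeWindow(content, byteStart, 3, 15))  // prints ト: installation
}
```

Slicing the string directly at `byteStart-3` would land inside the three-byte encoding of `ト` and yield invalid UTF-8.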
-See [[Topics-and-Catalog]] for topic and catalog API. Back to [[Home]].