diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..51a1e82
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,154 @@
+# CLAUDE.md — go-scm Domain Expert Guide
+
+You are a dedicated domain expert for `forge.lthn.ai/core/go-scm`. Virgil (in core/go) orchestrates your work via TODO.md. Pick up tasks in phase order, mark `[x]` when done, commit and push.
+
+## What This Package Does
+
+SCM integration, AgentCI dispatch, and data collection for the Lethean ecosystem. ~9K LOC across 6 sub-packages:
+
+- **forge/** — Forgejo API client (repos, issues, PRs, labels, webhooks, orgs)
+- **gitea/** — Gitea API client (repos, issues, meta) for public mirror at `git.lthn.ai`
+- **git/** — Multi-repo git operations (status, commit, push, pull)
+- **agentci/** — Clotho Protocol orchestrator for dual-run agent verification
+- **jobrunner/** — PR automation pipeline (Forgejo webhook signals → handler dispatch)
+- **collect/** — Data collection (BitcoinTalk, GitHub, market, papers, events)
+
+## Architecture
+
+```
+forge/                    Forgejo SDK wrapper (codeberg.org/mvdkleijn/forgejo-sdk)
+  ├── client.go           Config-based auth, SDK wrapper
+  ├── repos.go            Create, list, mirror repos
+  ├── issues.go           Create, list, assign issues
+  ├── prs.go              Create, list, merge PRs
+  ├── labels.go           Label management
+  ├── webhooks.go         Webhook CRUD
+  ├── orgs.go             Organisation management
+  ├── meta.go             Instance metadata
+  └── config.go           Auth resolution (~/.core/config.yaml, env, flags)
+
+gitea/                    Gitea SDK wrapper (code.gitea.io/sdk/gitea)
+  ├── client.go           Config-based auth
+  ├── repos.go            Repo operations
+  ├── issues.go           Issue operations
+  ├── meta.go             Instance info
+  └── config.go           Auth config
+
+git/                      Multi-repo git operations
+  ├── git.go              RepoStatus, StatusOptions, exec wrappers
+  └── service.go          Bulk status, commit, push, pull across repos
+
+agentci/                  Clotho Protocol — dual-run agent verification
+  ├── clotho.go           Spinner orchestrator (standard vs dual-run mode)
+  ├── config.go           ClothoConfig, AgentConfig
+  ├── security.go         Security policy enforcement
+  └── config_test.go
+
+jobrunner/                PR automation pipeline
+  ├── types.go            PipelineSignal, ActionResult, JobSource, Handler interfaces
+  ├── poller.go           Polling loop with tick interval
+  ├── journal.go          Execution journal (JSONL audit trail)
+  ├── forgejo/
+  │   └── source.go       Forgejo webhook signal source
+  └── handlers/
+      ├── dispatch.go             Agent task dispatch
+      ├── completion.go           Handle agent completion signals
+      ├── enable_auto_merge.go    Auto-merge when checks pass
+      ├── publish_draft.go        Publish draft PRs
+      ├── resolve_threads.go      Auto-resolve review threads
+      ├── send_fix_command.go     Send fix commands to agents
+      └── tick_parent.go          Update parent epic progress
+
+collect/                  Data collection pipeline
+  ├── collect.go          Collector interface, pipeline orchestrator
+  ├── bitcointalk.go      Forum scraping
+  ├── github.go           GitHub API data collection
+  ├── market.go           Market data
+  ├── papers.go           Research paper collection
+  ├── events.go           Event tracking
+  ├── excavate.go         Deep data extraction
+  ├── process.go          Processing pipeline
+  ├── ratelimit.go        Per-source rate limiting
+  └── state.go            Collection state persistence
+```
+
+## Commands
+
+```bash
+go test ./...                    # Run all tests
+go test -v -run TestName ./...   # Single test
+go test -race ./...              # Race detector
+go vet ./...                     # Static analysis
+```
+
+## Local Dependencies
+
+Resolved via `replace` in go.mod or, preferably, via `go.work`:
+
+| Module | Local Path | Notes |
+|--------|-----------|-------|
+| `forge.lthn.ai/core/go` | varies | Framework (log, config, process) |
+
+**Recommended**: Use a `go.work` file in your workspace root to resolve local modules rather than editing go.mod replace directives.
+
+## Key Types
+
+```go
+// forge/client.go
+type Client struct {
+	api   *forgejo.Client
+	url   string
+	token string
+}
+
+// git/git.go
+type RepoStatus struct {
+	Name, Path, Branch                         string
+	Modified, Untracked, Staged, Ahead, Behind int
+	Error                                      error
+}
+
+// jobrunner/types.go
+type PipelineSignal struct {
+	EpicNumber, ChildNumber, PRNumber int
+	RepoOwner, RepoName               string
+	PRState, Mergeable, CheckStatus   string
+	NeedsCoding                       bool
+	IssueTitle, IssueBody             string
+	// ... dispatch context
+}
+
+type ActionResult struct {
+	Action   string
+	Success  bool
+	Error    string
+	Duration time.Duration
+	Cycle    int
+}
+
+// agentci/clotho.go
+type RunMode string // "standard" or "dual"
+type Spinner struct {
+	Config ClothoConfig
+	Agents map[string]AgentConfig
+}
+```
+
+## Coding Standards
+
+- **UK English**: colour, organisation, centre
+- **Tests**: testify assert/require, table-driven preferred
+- **Conventional commits**: `feat(forge):`, `fix(jobrunner):`, `test(collect):`
+- **Co-Author**: `Co-Authored-By: Virgil `
+- **Licence**: EUPL-1.2
+- **Imports**: stdlib → forge.lthn.ai → third-party, each group separated by blank line
+
+## Forge
+
+- **Repo**: `forge.lthn.ai/core/go-scm`
+- **Push via SSH**: `git push origin main` (remote: `ssh://git@forge.lthn.ai:2223/core/go-scm.git`)
+
+## Task Queue
+
+See `TODO.md` for prioritised work. Phase 1 (test coverage) is the critical path.
+See `FINDINGS.md` for research notes.
diff --git a/FINDINGS.md b/FINDINGS.md
new file mode 100644
index 0000000..001848e
--- /dev/null
+++ b/FINDINGS.md
@@ -0,0 +1,48 @@
+# FINDINGS.md — go-scm Research & Discovery
+
+## 2026-02-20: Initial Assessment (Virgil)
+
+### Origin
+
+Extracted from `forge.lthn.ai/core/go` on 19 Feb 2026 as part of the go-ai split. Contains all SCM integration, CI dispatch, and data collection code.
+
+### Package Inventory
+
+| Package | Files | LOC (approx) | Tests | Coverage |
+|---------|-------|--------------|-------|----------|
+| forge/ | 9 | ~900 | 0 | 0% |
+| gitea/ | 5 | ~500 | 0 | 0% |
+| git/ | 2 | ~400 | 0 | 0% |
+| agentci/ | 4 | ~300 | 1 | partial |
+| jobrunner/ | 4 + handlers/ | ~1,500 | several | partial |
+| collect/ | 12 | ~5,000 | 8 | unknown |
+
+### Dependencies
+
+- `codeberg.org/mvdkleijn/forgejo-sdk/forgejo/v2` — Forgejo API (v2)
+- `code.gitea.io/sdk/gitea` — Gitea API
+- `forge.lthn.ai/core/go` — Framework (log, config, viper)
+- `github.com/stretchr/testify` — Testing
+
+### Key Observations
+
+1. **forge/ and gitea/ are structurally identical** — Same pattern: client.go, config.go, repos.go, issues.go, meta.go. Could share an interface.
+2. **Zero tests in forge/, gitea/, git/** — Most critical gap. These are the foundational packages.
+3. **collect/ has the most tests** — 8 test files covering all collectors. Coverage unknown.
+4. **jobrunner/handlers/ has test files** — dispatch_test.go, enable_auto_merge_test.go, publish_draft_test.go, resolve_threads_test.go, send_fix_command_test.go, tick_parent_test.go. Quality unknown.
+5. **agentci/ Clotho Protocol** — Dual-run verification for critical repos. Currently basic (string match on repo name). Needs more sophisticated policy engine.
+
+### Auth Resolution
+
+Both forge/ and gitea/ resolve auth from:
+1. `~/.core/config.yaml` (forge.token/forge.url or gitea.token/gitea.url)
+2. Environment variables (FORGE_TOKEN/FORGE_URL or GITEA_TOKEN/GITEA_URL)
+3. CLI flag overrides (highest priority)
+
+This is handled via core/go's viper integration.
+
+### Infrastructure Context
+
+- **Forge** (`forge.lthn.ai`) — Production Forgejo instance on de2. Full IP/intel/research.
+- **Gitea** (`git.lthn.ai`) — Public mirror with reduced data. Breach-safe.
+- **Split policy**: Forge = source of truth, Gitea = public-facing mirror with sensitive data stripped.
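Reviewer sketch: the auth resolution order recorded in FINDINGS.md (config file, then environment variables, then CLI flags, with flags taking highest priority) can be expressed as a small precedence function. This is a minimal standalone illustration, not the package's implementation, which FINDINGS.md says goes through core/go's viper integration. The name `resolveSetting` and the sample values are hypothetical, and the sketch assumes an empty string means "not set".

```go
package main

import (
	"fmt"
	"os"
)

// resolveSetting returns the highest-priority non-empty value.
// Assumed priority, lowest to highest: config file, environment, CLI flag.
// An empty string is treated as "not set" (an assumption of this sketch).
func resolveSetting(fileVal, envVal, flagVal string) string {
	resolved := fileVal
	if envVal != "" {
		resolved = envVal
	}
	if flagVal != "" {
		resolved = flagVal
	}
	return resolved
}

func main() {
	// Hypothetical inputs: a value that would have been read from
	// ~/.core/config.yaml, and a CLI flag that was not supplied.
	fileToken := "token-from-config"
	flagToken := ""

	token := resolveSetting(fileToken, os.Getenv("FORGE_TOKEN"), flagToken)
	fmt.Println("token resolved:", token != "")
}
```

A table-driven test over the eight set/unset combinations of the three sources would pin down the priority order that the Phase 2 config resolution audit asks for.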
diff --git a/TODO.md b/TODO.md
new file mode 100644
index 0000000..cf7d620
--- /dev/null
+++ b/TODO.md
@@ -0,0 +1,44 @@
+# TODO.md — go-scm Task Queue
+
+Dispatched from core/go orchestration. Pick up tasks in order.
+
+---
+
+## Phase 1: Test Coverage (Critical)
+
+forge/, gitea/, and git/ have **zero tests**. This is the top priority.
+
+- [ ] **forge/ unit tests** — Test `New()` client creation, `GetCurrentUser()`, error handling. Mock the Forgejo SDK client. Cover: `repos.go` (create, list, mirror), `issues.go` (create, list, assign), `prs.go` (create, list, merge), `labels.go`, `webhooks.go`, `orgs.go`. Target: 70% coverage.
+- [ ] **gitea/ unit tests** — Test `New()` client creation, repo/issue operations. Mock the Gitea SDK client. Cover: `repos.go`, `issues.go`, `meta.go`. Target: 70% coverage.
+- [ ] **git/ unit tests** — Test `RepoStatus` methods (`IsDirty`, `HasUnpushed`, `HasUnpulled`). Test status parsing with mock git output. Test bulk operations with temp repos. Cover: `git.go`, `service.go`. Target: 80% coverage.
+- [ ] **jobrunner handler tests** — handlers/ already has test files; verify their coverage. Add table-driven tests for `dispatch.go`, `completion.go`, `enable_auto_merge.go`. Test `PipelineSignal` state transitions.
+- [ ] **collect/ test audit** — collect/ has test files for each collector. Run `go test -cover ./collect/...` and flag anything below 60% coverage.
+
+## Phase 2: Hardening
+
+- [ ] **Config resolution audit** — forge/ and gitea/ both resolve auth from `~/.core/config.yaml` → env vars → flags. Ensure consistent priority order. Document in FINDINGS.md.
+- [ ] **Error wrapping** — Ensure all errors use `fmt.Errorf("package.Func: ...: %w", err)` or `log.E()` consistently. Some files may use bare `fmt.Errorf` without wrapping.
+- [ ] **Context propagation** — Verify all Forgejo/Gitea API calls pass `context.Context` for cancellation. Add context to any blocking operations missing it.
+- [ ] **Rate limiting** — collect/ has its own `ratelimit.go`. Verify it handles API rate limit headers from GitHub, Forgejo, Gitea.
+
+## Phase 3: AgentCI Pipeline
+
+- [ ] **Clotho dual-run validation** — `DeterminePlan()` logic is simple (check strategy + agent config + repo name). Add tests for all code paths: standard mode, dual-run by agent config, dual-run by critical repo.
+- [ ] **Forgejo signal source tests** — `forgejo/source.go` polls for webhook events. Test signal parsing and filtering.
+- [ ] **Journal replay** — `journal.go` writes JSONL audit trail. Add test for write + read-back + filtering by action/repo/time range.
+- [ ] **Handler integration** — Test full signal → handler → result flow with mock Forgejo client. Verify `tick_parent` correctly updates epic progress.
+
+## Phase 4: Forge ↔ Gitea Sync
+
+- [ ] **Mirror sync** — Implement repo mirroring from forge.lthn.ai (private) to git.lthn.ai (public). Strip sensitive data (research docs, credentials, internal refs).
+- [ ] **Issue sync** — Selective issue sync (public issues only) from Forge to Gitea.
+- [ ] **PR status sync** — Update Gitea mirror PRs when Forge PRs merge.
+
+---
+
+## Workflow
+
+1. Virgil in core/go writes tasks here after research
+2. This repo's session picks up tasks in phase order
+3. Mark `[x]` when done, note commit hash
+4. New discoveries → add tasks, flag in FINDINGS.md