agent/claude/issues/009-collect-excavate.md
Snider beb24f71d2 docs: add feature request issues for core CLI migration
12 issue files documenting features needed to replace shell scripts:

Claude Code hooks:
- 001: core ai session (state management)
- 002: core ai context (fact capture)
- 003: core ai hook (command validation)
- 004: core qa debug (debug statement detection)

Data collection:
- 005: core collect github (issues/PRs archive)
- 006: core collect bitcointalk (forum threads)
- 007: core collect market (CMC/CoinGecko)
- 008: core collect papers (whitepapers)
- 009: core collect excavate (project archaeology)
- 010: core collect process (HTML→MD)
- 011: core collect dispatch (event hooks)

000: Overview tracking issue

These will be submitted to host-uk/core when rate limit resets.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 18:49:35 +00:00

feat(collect): Add project archaeology/excavation

Summary

Add a core collect excavate command that performs a deep excavation of an abandoned cryptocurrency project, archiving its sources before the data is lost.

Required Commands

core collect excavate <project>              # Full excavation
core collect excavate <project> --scan-only  # Check what's accessible
core collect excavate <project> --resume     # Resume interrupted dig
core collect excavate <project> --only=github,btt  # Specific collectors
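As a rough illustration of the argument surface above, here is a minimal Python sketch using argparse. It is not the core CLI's implementation; `build_parser` and the help strings are hypothetical, and whether `--scan-only` and `--resume` may be combined is left open because the issue does not say.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real command wiring.
    parser = argparse.ArgumentParser(prog="core collect excavate")
    parser.add_argument("project", help="project name to excavate")
    parser.add_argument("--scan-only", action="store_true",
                        help="check what is accessible, download nothing")
    parser.add_argument("--resume", action="store_true",
                        help="resume an interrupted dig")
    parser.add_argument(
        "--only",
        # Split "github,btt" into ["github", "btt"], dropping empty items.
        type=lambda value: [name for name in value.split(",") if name],
        default=None,
        help="comma-separated list of collectors to run",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["masari", "--only=github,btt"])
    print(args.project, args.scan_only, args.resume, args.only)
```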

Current Shell Script Being Replaced

  • claude/skills/project-archaeology/excavate.sh - 312 lines of bash

Excavation Pipeline

Runs collectors in priority order:

| Phase | Source            | Collector           | Priority | Notes               |
|-------|-------------------|---------------------|----------|---------------------|
| 1     | GitHub repos      | collect github      | P1       | Often deleted first |
| 1     | GitHub releases   | collect github      | P1       | Binaries disappear  |
| 2     | BitcoinTalk ANN   | collect bitcointalk | P2       | Usually persists    |
| 2     | Website (Wayback) | collect wayback     | P2       | Snapshots exist     |
| 3     | Block explorer    | collect explorer    | P3       | Chain data          |
| 3     | CoinMarketCap     | collect market      | P3       | Historical prices   |
| 4     | Reddit            | collect reddit      | P4       | Community context   |
| 4     | Medium posts      | collect medium      | P4       | Announcements       |
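The phase ordering and the `--only` filter could be modelled roughly as below. This is a sketch under assumptions: `Step`, `PIPELINE`, `ALIASES`, and `plan` are hypothetical names, and treating `btt` as shorthand for the bitcointalk collector is inferred from the `--only=github,btt` example rather than stated anywhere.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Step:
    phase: int
    source: str
    collector: str
    priority: str


# Rows copied from the pipeline table.
PIPELINE = [
    Step(1, "GitHub repos", "github", "P1"),
    Step(1, "GitHub releases", "github", "P1"),
    Step(2, "BitcoinTalk ANN", "bitcointalk", "P2"),
    Step(2, "Website (Wayback)", "wayback", "P2"),
    Step(3, "Block explorer", "explorer", "P3"),
    Step(3, "CoinMarketCap", "market", "P3"),
    Step(4, "Reddit", "reddit", "P4"),
    Step(4, "Medium posts", "medium", "P4"),
]

# Assumption: "btt" is a short name for the bitcointalk collector.
ALIASES = {"btt": "bitcointalk"}


def plan(only: Optional[Iterable[str]] = None) -> List[Step]:
    """Return the steps to run, lowest phase first, optionally filtered."""
    wanted = None if only is None else {ALIASES.get(n, n) for n in only}
    steps = [s for s in PIPELINE if wanted is None or s.collector in wanted]
    # sorted() is stable, so table order is preserved within a phase.
    return sorted(steps, key=lambda s: s.phase)


if __name__ == "__main__":
    for step in plan(["github", "btt"]):
        print(step.phase, step.priority, step.source)
```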

Project Registry

Projects are defined in cryptonote-discovery/registry.json. Each entry looks like this:

{
  "name": "Masari",
  "symbol": "MSR",
  "status": "abandoned",
  "github": ["masari-project"],
  "bitcointalk": "2769739",
  "website": "getmasari.org",
  "explorer": "explorer.getmasari.org",
  "cmc": "masari"
}
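A registry lookup might look something like the following sketch. The field names come from the entry above; everything else is an assumption, including the `Project` and `find_project` names, the case-insensitive match on `name`, and the idea that the file holds either a single entry or a list of entries (the issue does not show the top-level layout).

```python
import json
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class Project:
    name: str
    symbol: str
    status: str
    github: List[str] = field(default_factory=list)
    bitcointalk: Optional[str] = None
    website: Optional[str] = None
    explorer: Optional[str] = None
    cmc: Optional[str] = None


def find_project(registry_text: str, name: str) -> Project:
    """Look up a project by name in registry JSON text."""
    entries = json.loads(registry_text)
    if isinstance(entries, dict):
        # Assumption: tolerate a registry holding a single entry.
        entries = [entries]
    known = {f.name for f in fields(Project)}
    for entry in entries:
        if entry.get("name", "").lower() == name.lower():
            # Ignore keys this sketch does not model.
            return Project(**{k: v for k, v in entry.items() if k in known})
    raise KeyError(f"project not in registry: {name}")
```

Sources with no value (for example, a project without a block explorer) stay `None`, which gives the collectors an easy way to skip them.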

Output Structure

digs/{project}/
├── EXCAVATION.md       # Dig log with timestamps
├── SALVAGE-REPORT.md   # What's worth keeping
├── LESSONS.md          # Post-mortem analysis
├── github/             # All repo history
├── releases/           # Wallet binaries, checksums
├── bitcointalk/        # Thread archive
├── website/            # Wayback snapshots
├── explorer/           # Chain data samples
├── market/             # Price history
├── papers/             # Whitepapers, docs
└── community/          # Reddit, Medium, etc
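Creating this layout up front could be done along these lines. The directory and file names are taken from the tree above; `scaffold` is a hypothetical helper, and leaving existing report files untouched is a design guess meant to keep `--resume` from clobbering an earlier dig log.

```python
from pathlib import Path

SUBDIRS = [
    "github", "releases", "bitcointalk", "website",
    "explorer", "market", "papers", "community",
]
REPORTS = ["EXCAVATION.md", "SALVAGE-REPORT.md", "LESSONS.md"]


def scaffold(root: Path, project: str) -> Path:
    """Create digs/{project}/ with its subdirectories and empty reports."""
    dig = root / "digs" / project
    for sub in SUBDIRS:
        # exist_ok makes the call safe to repeat on a resumed dig.
        (dig / sub).mkdir(parents=True, exist_ok=True)
    for report in REPORTS:
        path = dig / report
        if not path.exists():
            path.touch()
    return dig


if __name__ == "__main__":
    print(scaffold(Path("."), "masari"))
```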

Scan Mode

--scan-only checks which sources are still reachable, without downloading anything:

  • GitHub org accessible?
  • BitcoinTalk thread exists?
  • Wayback snapshots available?
  • Block explorer online?
  • CMC page exists?

Output Format

{
  "project": "masari",
  "status": "abandoned",
  "accessible": {
    "github": true,
    "bitcointalk": true,
    "wayback": true,
    "explorer": false,
    "cmc": true
  },
  "output": "digs/masari/"
}
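One way to assemble this report is to pass in one check per source and record the outcome of each. The sketch below is illustrative only: `scan_report` and the injected check callables are hypothetical, no network requests are made here, and counting a check that raises as "not accessible" is an assumption about the desired behaviour.

```python
import json
from typing import Callable, Dict


def scan_report(project: str, status: str,
                checks: Dict[str, Callable[[], bool]]) -> str:
    """Run each accessibility check and render the scan result as JSON."""
    accessible = {}
    for name, check in checks.items():
        try:
            accessible[name] = bool(check())
        except Exception:
            # Assumption: a failing check means the source is unreachable.
            accessible[name] = False
    report = {
        "project": project,
        "status": status,
        "accessible": accessible,
        "output": f"digs/{project}/",
    }
    return json.dumps(report, indent=2)


if __name__ == "__main__":
    # Stub checks standing in for real probes of each source.
    print(scan_report("masari", "abandoned", {
        "github": lambda: True,
        "explorer": lambda: False,
    }))
```

Injecting the checks keeps the report logic testable without touching the network; real probes for each source would be supplied by the individual collectors.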