agent/claude/issues/006-collect-bitcointalk.md
Snider beb24f71d2 docs: add feature request issues for core CLI migration
12 issue files documenting features needed to replace shell scripts:

Claude Code hooks:
- 001: core ai session (state management)
- 002: core ai context (fact capture)
- 003: core ai hook (command validation)
- 004: core qa debug (debug statement detection)

Data collection:
- 005: core collect github (issues/PRs archive)
- 006: core collect bitcointalk (forum threads)
- 007: core collect market (CMC/CoinGecko)
- 008: core collect papers (whitepapers)
- 009: core collect excavate (project archaeology)
- 010: core collect process (HTML→MD)
- 011: core collect dispatch (event hooks)

000: Overview tracking issue

These will be submitted to host-uk/core when rate limit resets.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 18:49:35 +00:00

feat(collect): Add BitcoinTalk thread collection

Summary

Add a core collect bitcointalk command that archives BitcoinTalk forum threads.

Required Commands

core collect bitcointalk <topic-id>                # Collect full thread
core collect bitcointalk <url>                     # Collect from URL
core collect bitcointalk <topic-id> --pages=5      # Limit pages
core collect bitcointalk <topic-id> --output=DIR   # Custom output dir
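
Because the command accepts either a topic ID or a URL, the first step is to normalise both forms to one ID. The sketch below is illustrative only: `parse_topic_id` is a hypothetical helper, and it assumes thread URLs carry the ID in a `topic` query parameter (for example `index.php?topic=2769739.0`), with an optional offset or message suffix after the dot.

```python
import re
from urllib.parse import parse_qs, urlparse


def parse_topic_id(arg: str) -> str:
    """Return the numeric topic ID from a bare ID or a thread URL."""
    arg = arg.strip()
    # Bare ID, optionally followed by a post offset such as "2769739.20".
    if re.fullmatch(r"\d+(\.\d+)?", arg):
        return arg.split(".")[0]
    # URL form (assumed): the ID sits in the "topic" query parameter,
    # possibly followed by ".<offset>" or ".msg<number>".
    values = parse_qs(urlparse(arg).query).get("topic", [])
    if values and re.fullmatch(r"\d+(\.\w+)?", values[0]):
        return values[0].split(".")[0]
    raise ValueError(f"cannot find a topic ID in {arg!r}")
```

Returning the ID as a string keeps it usable directly in the output directory name and in the JSON summary.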

Current Shell Script Being Replaced

  • claude/skills/bitcointalk/collect.sh - 270 lines of bash + embedded Python

Features

  1. Rate limiting

    • Respectful delay between requests (default 2s)
    • Configurable via --delay=N
  2. Post type detection

    • ANN: Original announcement (post #1)
    • UPDATE: Contains [UPDATE], [RELEASE], or [ANNOUNCEMENT]
    • QUESTION: Contains a question mark in the first 200 characters
    • COMMUNITY: General discussion
  3. Output structure

    bitcointalk-{topic}/
    ├── INDEX.md
    ├── pages/
    │   ├── page-0.html
    │   └── page-20.html
    └── posts/
        ├── POST-0001.md
        └── POST-0002.md
    
  4. Post metadata

    • Author
    • Date
    • Post type/score
    • Original content
  5. Incremental collection

    • Resume interrupted collections
    • Skip already-fetched pages
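
The post type rules in feature 2 can be read as a first-match cascade. The sketch below is one possible reading, not the existing script's logic: the function name `classify_post`, the precedence order (ANN, then UPDATE, then QUESTION, then COMMUNITY), and the case-insensitive tag match are all assumptions.

```python
UPDATE_TAGS = ("[UPDATE]", "[RELEASE]", "[ANNOUNCEMENT]")


def classify_post(post_number: int, text: str) -> str:
    """Classify a post by applying the rules in order; first match wins."""
    # Post #1 is the original announcement regardless of its content.
    if post_number == 1:
        return "ANN"
    # Tag match is case-insensitive here (an assumption).
    upper = text.upper()
    if any(tag in upper for tag in UPDATE_TAGS):
        return "UPDATE"
    # Only the first 200 characters are checked for a question mark.
    if "?" in text[:200]:
        return "QUESTION"
    return "COMMUNITY"
```

With this ordering, a tagged release post that also asks a question is still reported as UPDATE.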

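Rate limiting (feature 1) and incremental collection (feature 5) meet in the page-fetch loop: pages already on disk are skipped, and the delay applies only between real requests. The following is a minimal sketch under stated assumptions. `collect_pages`, its `fetch` callback, and the injectable `sleep` are hypothetical names, and the `pages/page-{offset}.html` naming follows the output structure listed above.

```python
import time
from pathlib import Path
from typing import Callable, Iterable


def collect_pages(
    out_dir: Path,
    offsets: Iterable[int],
    fetch: Callable[[int], str],
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[int]:
    """Fetch pages missing from out_dir/pages; return the offsets fetched."""
    pages_dir = out_dir / "pages"
    pages_dir.mkdir(parents=True, exist_ok=True)
    fetched: list[int] = []
    for offset in offsets:
        target = pages_dir / f"page-{offset}.html"
        if target.exists():
            continue  # already saved by an earlier run
        if fetched:
            sleep(delay)  # pause only between actual requests
        html = fetch(offset)
        # Write to a temporary name and rename afterwards, so an
        # interrupted run never leaves a partial file that a later
        # run would skip as complete.
        tmp = target.with_name(target.name + ".part")
        tmp.write_text(html, encoding="utf-8")
        tmp.replace(target)
        fetched.append(offset)
    return fetched
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; a command-line wrapper would pass the value of --delay as `delay`.
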
Output Format

{
  "topic_id": "2769739",
  "title": "Lethean - Privacy Blockchain VPN",
  "posts": 1247,
  "pages": 63,
  "output": "bitcointalk-2769739/"
}
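
The summary fields above are consistent with 20 posts per page: 1247 posts round up to 63 pages, and the page files are named by post offset (page-0.html, page-20.html). The sketch below builds the same structure under that assumption. `build_summary` and `POSTS_PER_PAGE` are hypothetical names, and a real implementation might count the pages it fetched instead of deriving the number.

```python
import json
import math

# Assumption: inferred from the page-0.html / page-20.html naming.
POSTS_PER_PAGE = 20


def build_summary(topic_id: str, title: str, posts: int) -> dict:
    """Assemble the JSON summary for a finished collection."""
    return {
        "topic_id": topic_id,
        "title": title,
        "posts": posts,
        # Round up so a partly filled last page still counts.
        "pages": math.ceil(posts / POSTS_PER_PAGE),
        "output": f"bitcointalk-{topic_id}/",
    }


summary = build_summary("2769739", "Lethean - Privacy Blockchain VPN", 1247)
print(json.dumps(summary, indent=2))
```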