feat(collect): add Borg-powered collection commands

New slash commands:
- /collect:collect - Auto-detect and collect any resource
- /collect:github - Collect GitHub repos/orgs
- /collect:website - Crawl websites with depth control
- /collect:excavate - Full project archaeology dig
- /collect:papers - Collect whitepapers from registry

All commands use Borg CLI (github.com/Snider/Borg) for collection.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Snider 2026-02-02 00:22:19 +00:00
parent 3c3d3de1a1
commit b504cc31ca
6 changed files with 317 additions and 1 deletion


@@ -15,7 +15,7 @@
{
"name": "collect",
"source": "./claude/collect",
"description": "Data collection - whitepapers, forum archives, project archaeology",
"description": "Data collection via Borg CLI - /collect:github, /collect:website, /collect:excavate, /collect:papers",
"version": "0.1.0"
},
{


@@ -0,0 +1,54 @@
---
name: collect
description: Quick collection using Borg - auto-detects resource type
args: <url or target>
---
# Quick Collect
Auto-detect and collect any resource using Borg.
## Usage
```
/collect:collect https://github.com/LetheanNetwork
/collect:collect https://docs.lethean.io
/collect:collect masari-project
```
## Action
Borg's `all` command handles auto-detection:
```bash
borg all <url-or-target>
```
This will:
1. Detect if it's a GitHub URL → collect repos
2. Detect if it's a website → crawl it
3. Detect if it's a PWA → download the app
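The detection itself happens inside Borg; as a rough illustration only (not Borg's actual logic), the dispatch amounts to something like:
```bash
# Illustrative sketch - the real detection happens inside `borg all`.
target="$1"
case "$target" in
  https://github.com/*/*) borg collect github repo "$target" ;;          # single repo URL
  https://github.com/*)   borg collect github repos "${target##*/}" ;;   # organisation URL
  http://*|https://*)     borg collect website "$target" ;;              # plain website (PWAs: `borg collect pwa`)
  *)                      borg collect github repos "$target" ;;         # bare name, treated as an org
esac
```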
## Examples
```bash
# GitHub org - collects all repos
borg all https://github.com/LetheanNetwork
# Website - crawls and packages
borg all https://docs.lethean.io
# With encryption
borg all https://github.com/graft-project --format stim -o graft.stim
```
## Specialised Commands
For more control, use specific commands:
| Command | Use Case |
|---------|----------|
| `/collect:github` | GitHub repos with org support |
| `/collect:website` | Website crawling with depth control |
| `/collect:excavate` | Full project archaeology |
| `/collect:papers` | Whitepaper collection from registry |


@@ -0,0 +1,69 @@
---
name: excavate
description: Full archaeological dig on a dead/dying CryptoNote project
args: <project-name or org>
---
# Project Excavation
Run a full archaeological dig on an abandoned project using Borg.
## Usage
```
/collect:excavate masari
/collect:excavate graft-project
/collect:excavate turtlecoin
```
## Action
Run a multi-step excavation:
### 1. Collect GitHub repos
```bash
borg collect github repos <org> --format stim -o digs/<project>/github.stim
```
### 2. Collect website (if still up)
```bash
borg collect website https://<project-domain> --depth 3 --format stim -o digs/<project>/website.stim
```
### 3. Collect Wayback snapshots
```bash
borg collect website "https://web.archive.org/web/*/<project-domain>" --depth 2 -o digs/<project>/wayback.stim
```
### 4. Create reports
After collection, create:
- `digs/<project>/SALVAGE-REPORT.md` - What's worth keeping
- `digs/<project>/LESSONS.md` - Post-mortem analysis
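A minimal scaffolding sketch, assuming the templates live at the path listed under Report Templates below:
```bash
# Sketch: copy the report templates into the dig directory so they can be filled in.
project="masari"   # illustrative project name
mkdir -p "digs/$project"
cp skills/project-archaeology/templates/SALVAGE-REPORT.md "digs/$project/"
cp skills/project-archaeology/templates/LESSONS.md "digs/$project/"
```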
## Known Dig Sites
| Project | Org | Domain | Urgency |
|---------|-----|--------|---------|
| Masari | masari-project | getmasari.org | HIGH |
| TurtleCoin | turtlecoin | turtlecoin.lol | HIGH |
| Graft | graft-project | graft.network | HIGH |
| Aeon | aeonix | aeon.cash | MEDIUM |
| Nerva | nerva-project | nerva.one | MEDIUM |
| Loki/Oxen | oxen-io | oxen.io | MEDIUM |
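Putting the three collection steps together for one dig site (values taken from the table above; a sketch, not a tested script):
```bash
# Sketch: run the full excavation for a single dig site.
project="masari"; org="masari-project"; domain="getmasari.org"
mkdir -p "digs/$project"
borg collect github repos "$org" --format stim -o "digs/$project/github.stim"
borg collect website "https://$domain" --depth 3 --format stim -o "digs/$project/website.stim"
borg collect website "https://web.archive.org/web/*/$domain" --depth 2 -o "digs/$project/wayback.stim"
```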
## Output Structure
```
digs/<project>/
├── github.stim # All repos (encrypted)
├── website.stim # Current site (encrypted)
├── wayback.stim # Historical snapshots
├── SALVAGE-REPORT.md # What to keep
└── LESSONS.md # Post-mortem
```
## Report Templates
Use templates from `skills/project-archaeology/templates/`:
- `SALVAGE-REPORT.md` - Valuable code/features
- `LESSONS.md` - Why it died, warning signs


@@ -0,0 +1,64 @@
---
name: github
description: Collect GitHub repositories or entire organisations using Borg
args: <url or org name> [--format stim|tim|tar] [-o output]
---
# GitHub Collection
Collect GitHub repositories using Borg.
## Usage
```
/collect:github LetheanNetwork
/collect:github https://github.com/monero-project/monero
/collect:github graft-project --format stim -o graft.stim
```
## Action
Determine if the argument is a repo URL or org name, then run the appropriate Borg command:
**For organisation (all repos):**
```bash
borg collect github repos <org> [--format <format>] [-o <output>]
```
**For single repo:**
```bash
borg collect github repo <url> [--format <format>] [-o <output>]
```
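One way to make that repo-vs-org decision in a wrapper (a sketch, not part of Borg itself):
```bash
# Sketch: a URL with both an owner and a repo segment is a single repo; anything else is an org.
target="$1"
if [[ "$target" == https://github.com/*/* ]]; then
  borg collect github repo "$target"
else
  borg collect github repos "${target##*/}"   # strips an https://github.com/ prefix if present
fi
```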
## Formats
| Format | Extension | Description |
|--------|-----------|-------------|
| tar | .tar | Plain tarball |
| tim | .tim | OCI-compatible container bundle |
| stim | .stim | Encrypted container (will prompt for password) |
## Examples
```bash
# Clone all Lethean repos
borg collect github repos LetheanNetwork
# Clone and encrypt
borg collect github repos graft-project --format stim -o graft-archive.stim
# Single repo
borg collect github repo https://github.com/monero-project/monero
```
## Target Registry
See `skills/github-history/SKILL.md` for the full list of CryptoNote orgs to collect.
### Quick Targets
**Active:**
- `monero-project`, `hyle-team`, `zanoio`, `wownero`
**Salvage Priority:**
- `graft-project`, `turtlecoin`, `masari-project`, `oxen-io`
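For example, a loop over the salvage-priority orgs (a sketch; output names are illustrative):
```bash
# Sketch: archive each salvage-priority org into its own encrypted bundle.
for org in graft-project turtlecoin masari-project oxen-io; do
  borg collect github repos "$org" --format stim -o "${org}.stim"
done
```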


@@ -0,0 +1,74 @@
---
name: papers
description: Collect whitepapers from the ledger-papers registry
args: [--category <name>] [--all] [--search <term>]
---
# Whitepaper Collection
Collect academic papers and whitepapers from the registry.
## Usage
```
/collect:papers --category cryptography
/collect:papers --all
/collect:papers --search bulletproofs
```
## Action
### List categories
```bash
jq -r '.papers[].category' skills/ledger-papers/registry.json | sort -u
```
### Collect by category
```bash
# Get URLs for a category
jq -r '.papers[] | select(.category == "<category>") | .url' skills/ledger-papers/registry.json > urls.txt
# Download each
while read -r url; do
  borg collect website "$url" --depth 0
done < urls.txt
```
### Collect all
```bash
jq -r '.papers[].url' skills/ledger-papers/registry.json | while read -r url; do
  borg collect website "$url" --depth 0
done
```
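### Search by term
A sketch for `--search`, assuming the registry entries carry a `title` field (adjust the jq filter to the actual schema):
```bash
# Sketch: case-insensitive match on title (field name assumed) or URL, then collect each hit.
jq -r --arg term "bulletproofs" \
  '.papers[] | select(((.title // "") | test($term; "i")) or (.url | test($term; "i"))) | .url' \
  skills/ledger-papers/registry.json | while read -r url; do
  borg collect website "$url" --depth 0
done
```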
## Categories
| Category | Count | Examples |
|----------|-------|----------|
| genesis | 4 | Bitcoin, b-money, hashcash |
| cryptonote | 2 | CryptoNote v2.0, CNS standards |
| mrl | 11 | Monero Research Lab papers |
| privacy | 9 | Zcash, Mimblewimble, Lelantus |
| cryptography | 10 | Bulletproofs, CLSAG, PLONK |
| consensus | 7 | PBFT, Tendermint, Casper |
| defi | 7 | Uniswap, Aave, Compound |
| layer2 | 7 | Lightning, Plasma, Rollups |
## Academic Sources
For papers not in the registry, search:
```bash
# IACR ePrint
borg collect website "https://eprint.iacr.org/search?q=<term>" --depth 1
# arXiv
borg collect website "https://arxiv.org/search/?query=<term>&searchtype=all" --depth 1
```
## Output
Papers are collected to:
```
skills/ledger-papers/archive/<category>/<paper>.pdf
```
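A small sketch for pre-creating that layout from the registry's categories:
```bash
# Sketch: make one archive directory per category found in the registry.
jq -r '.papers[].category' skills/ledger-papers/registry.json | sort -u | while read -r cat; do
  mkdir -p "skills/ledger-papers/archive/$cat"
done
```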


@@ -0,0 +1,55 @@
---
name: website
description: Crawl and collect a website using Borg
args: <url> [--depth N] [--format stim|tim|tar] [-o output]
---
# Website Collection
Crawl and collect websites using Borg.
## Usage
```
/collect:website https://getmasari.org
/collect:website https://docs.lethean.io --depth 3
/collect:website https://graft.network --format stim -o graft-site.stim
```
## Action
Run Borg to crawl the website:
```bash
borg collect website <url> [--depth <N>] [--format <format>] [-o <output>]
```
Default depth is 2 levels.
## Options
| Option | Default | Description |
|--------|---------|-------------|
| `--depth` | 2 | How many levels deep to crawl |
| `--format` | tar | Output format (tar, tim, stim) |
| `-o` | auto | Output filename |
## Examples
```bash
# Basic crawl
borg collect website https://getmasari.org
# Deep crawl with encryption
borg collect website https://docs.lethean.io --depth 5 --format stim -o lethean-docs.stim
# Wayback Machine archive
borg collect website "https://web.archive.org/web/*/graft.network" --depth 3
```
## Use Cases
1. **Project Documentation** - Archive docs before they go offline
2. **Wayback Snapshots** - Collect historical versions
3. **Forum Threads** - Archive discussion pages
4. **PWA Collection** - Use `borg collect pwa` for progressive web apps
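For PWA collection (item 4), the exact arguments are not documented here; a minimal sketch modelled on the `website` form:
```bash
# Sketch: assumed to mirror the website command's shape - check Borg's help output for the real flags.
borg collect pwa <app-url> [--format <format>] [-o <output>]
```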