# Host Hub Task Protocol

**Version:** 2.1
**Created:** 2026-01-01
**Updated:** 2026-01-16
**Purpose:** Ensure agent work is verified before being marked complete, and provide patterns for efficient parallel implementation.

> **Lesson learned (Jan 2026):** Task files written as checklists without implementation evidence led to 6+ "complete" tasks that were actually 70-85% done. Planning ≠ implementation. Evidence required.

---

## The Problem

Agents optimise for conversation completion, not task completion. Saying "done" is computationally cheaper than doing the work. Context compaction loses task state. Nobody verifies output against spec.

## The Solution

Separation of concerns:
1. **Planning Agent** — writes the spec
2. **Implementation Agent** — does the work
3. **Verification Agent** — checks the work against spec
4. **Human** — approves or rejects based on verification

---

## Directory Structure

```
doc/
├── TASK_PROTOCOL.md       # This file
└── ...                    # Reference documentation

tasks/
├── TODO.md                # Active task summary
├── TASK-XXX-feature.md    # Active task specs
├── agentic-tasks/         # Agentic system tasks
└── future-products/       # Parked product plans

archive/
├── released/              # Completed tasks (for reference)
└── ...                    # Historical snapshots
```

---

## Task File Schema

Every task file follows this structure:

```markdown
# TASK-XXX: [Short Title]

**Status:** draft | ready | in_progress | needs_verification | verified | approved
**Created:** YYYY-MM-DD
**Last Updated:** YYYY-MM-DD HH:MM by [agent/human]
**Assignee:** [agent session or human]
**Verifier:** [different agent session]

---

## Objective

[One paragraph: what does "done" look like?]

---

## Acceptance Criteria

- [ ] AC1: [Specific, verifiable condition]
- [ ] AC2: [Specific, verifiable condition]
- [ ] AC3: [Specific, verifiable condition]

Each criterion must be:
- Binary (yes/no, not "mostly")
- Verifiable by code inspection or test
- Independent (can check without context)

---

## Implementation Checklist

- [ ] File: `path/to/file.php` — [what it should contain]
- [ ] File: `path/to/other.php` — [what it should contain]
- [ ] Test: `tests/Feature/XxxTest.php` passes
- [ ] Migration: runs without error

---

## Verification Results

### Check 1: [Date] by [Agent]

| Criterion | Status | Evidence |
|-----------|--------|----------|
| AC1 | ✅ PASS | File exists at path, contains X |
| AC2 | ❌ FAIL | Missing method Y in class Z |
| AC3 | ⚠️ PARTIAL | 3 of 5 tests pass |

**Verdict:** FAIL — AC2 not met

### Check 2: [Date] by [Agent]

| Criterion | Status | Evidence |
|-----------|--------|----------|
| AC1 | ✅ PASS | File exists at path, contains X |
| AC2 | ✅ PASS | Method Y added, verified |
| AC3 | ✅ PASS | All 5 tests pass |

**Verdict:** PASS — ready for human approval

---

## Notes

[Any context, blockers, decisions made during implementation]
```
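
A header in this shape can be linted before an agent picks the task up. The following is a minimal sketch in Python, assuming the header fields appear as `**Field:** value` lines above the first `---`. The helper names `read_header` and `header_problems` are hypothetical and not part of any existing tooling.

```python
import re

VALID_STATUSES = {
    "draft", "ready", "in_progress",
    "needs_verification", "verified", "approved",
}
REQUIRED_FIELDS = ("Status", "Created", "Last Updated", "Assignee", "Verifier")


def read_header(task_text: str) -> dict:
    """Collect '**Field:** value' lines that precede the first '---' rule."""
    header = {}
    for line in task_text.splitlines():
        if line.strip() == "---":
            break
        match = re.match(r"\*\*(.+?):\*\*\s*(.*)", line.strip())
        if match:
            header[match.group(1)] = match.group(2).strip()
    return header


def header_problems(header: dict) -> list:
    """Return human-readable problems; an empty list means the header is usable."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if not header.get(f)]
    status = header.get("Status")
    if status and status not in VALID_STATUSES:
        problems.append(f"unknown status: {status}")
    # The schema asks for a *different* session as verifier.
    if header.get("Assignee") and header.get("Assignee") == header.get("Verifier"):
        problems.append("assignee and verifier must differ")
    return problems
```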

---

## Implementation Evidence (Required)

**A checklist is not evidence. Prove the work exists.**

Every completed phase MUST include:

### 1. Git Evidence
```markdown
**Commits:**
- `abc123` - Add Domain model and migration
- `def456` - Add DomainController with CRUD
- `ghi789` - Add 28 domain tests
```

### 2. Test Count
```markdown
**Tests:** 28 passing (run: `php artisan test app/Mod/Bio/Tests/Feature/DomainTest.php`)
```

### 3. File Manifest
```markdown
**Files created/modified:**
- `app/Mod/Bio/Models/Domain.php` (new)
- `app/Mod/Bio/Http/Controllers/DomainController.php` (new)
- `database/migrations/2026_01_16_create_domains_table.php` (new)
- `app/Mod/Bio/Tests/Feature/DomainTest.php` (new)
```

### 4. "What Was Built" Summary
```markdown
**Summary:** Custom domain management with DNS verification. Users can add domains,
system generates TXT record for verification, background job checks DNS propagation.
Includes SSL provisioning via Caddy API.
```
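
The four evidence items can be checked mechanically before a phase is accepted as complete. The sketch below is one possible approach, assuming the evidence is written with exactly the bold labels shown in the examples above. The `missing_evidence` function and its patterns are illustrative assumptions, not an existing tool.

```python
import re

# One pattern per required evidence item, keyed by a short name.
REQUIRED_EVIDENCE = {
    "commits": r"\*\*Commits:\*\*\s*\n\s*-\s*`[0-9A-Za-z]{6,40}`",
    "tests": r"\*\*Tests:\*\*\s*\d+\s+passing",
    "files": r"\*\*Files created/modified:\*\*\s*\n\s*-\s*`[^`]+`",
    "summary": r"\*\*Summary:\*\*\s*\S",
}


def missing_evidence(entry: str) -> list:
    """Return the names of evidence items that the entry does not contain."""
    return [
        name
        for name, pattern in REQUIRED_EVIDENCE.items()
        if not re.search(pattern, entry)
    ]
```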

### Why This Matters

In Jan 2026, an audit found:
- Commerce Matrix Plan marked "95% done" was actually 75%
- Internal WAF section was skipped entirely (extracted to Core Bouncer)
- Warehouse/fulfillment (6 features) listed as "one item" in TODO
- Task files read like planning documents, not completion logs

**Without evidence, "done" means nothing.**

---

## Workflow

### 1. Task Creation

Human or planning agent creates task file in `tasks/`:
- Status: `draft`
- Must have clear acceptance criteria
- Must have implementation checklist

### 2. Task Ready

Human reviews and sets:
- Status: `ready`
- Assignee: `next available agent`

### 3. Implementation

Implementation agent:
- Sets status: `in_progress`
- Works through implementation checklist
- Checks boxes as work is done
- When complete, sets status: `needs_verification`
- **MUST NOT** mark acceptance criteria as passed

### 4. Verification

Different agent (verification agent):
- Reads the task file
- Independently checks each acceptance criterion
- Records evidence in Verification Results section
- Sets verdict: PASS or FAIL
- If PASS: status → `verified`, move to `archive/released/`
- If FAIL: status → `in_progress`, back to implementation agent

### 5. Human Approval

Human reviews verified task:
- Spot-check the evidence
- If satisfied: status → `approved`, can delete or keep in archive
- If not: back to `needs_verification` with notes

---

## Agent Instructions

### For Implementation Agents

```
You are implementing TASK-XXX.

1. Read the full task file
2. Set status to "in_progress"
3. Work through the implementation checklist
4. Check boxes ONLY for work you have completed
5. When done, set status to "needs_verification"
6. DO NOT check acceptance criteria boxes
7. DO NOT mark the task as complete
8. Update "Last Updated" with current timestamp

Your job is to do the work, not to verify it.
```

### For Verification Agents

```
You are verifying TASK-XXX.

1. Read the full task file
2. For EACH acceptance criterion:
   a. Check the codebase independently
   b. Record what you found (file paths, line numbers, test output)
   c. Mark as PASS, FAIL, or PARTIAL with evidence
3. Add a new "Verification Results" section with today's date
4. Set verdict: PASS or FAIL
5. If PASS: move file to archive/released/
6. If FAIL: set status back to "in_progress"
7. Update "Last Updated" with current timestamp

You are the gatekeeper. Be thorough. Trust nothing the implementation agent said.
```
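
The verdict step reduces to a small rule: the overall result is PASS only when every criterion is PASS. The sketch below assumes that PARTIAL blocks a PASS, which matches the requirement that criteria be binary but is not stated outright in the instructions. The `verdict` function is a hypothetical helper.

```python
def verdict(results: dict) -> tuple:
    """Return ("PASS" or "FAIL", list of criteria that blocked a PASS).

    `results` maps a criterion id such as "AC1" to "PASS", "FAIL" or "PARTIAL".
    An empty result set is treated as FAIL because nothing was verified.
    """
    blocking = [ac for ac, status in results.items() if status != "PASS"]
    if not results or blocking:
        return "FAIL", blocking
    return "PASS", []
```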

---

## Status Flow

```
draft → ready → in_progress → needs_verification → verified → approved
                    ↑                    │
                    └────────────────────┘
                    (if verification fails)
```
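
The flow can be enforced as a small transition table so that a status never skips a stage. This is a minimal sketch: the `ALLOWED` table and `advance` function are illustrative names, and the `verified` to `needs_verification` edge is taken from the Human Approval step of the workflow rather than from the diagram.

```python
# Each status maps to the set of statuses it may move to next.
ALLOWED = {
    "draft": {"ready"},
    "ready": {"in_progress"},
    "in_progress": {"needs_verification"},
    "needs_verification": {"verified", "in_progress"},  # back on failed verification
    "verified": {"approved", "needs_verification"},     # back if the human is not satisfied
    "approved": set(),
}


def advance(current: str, new: str) -> str:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if current not in ALLOWED:
        raise ValueError(f"unknown status: {current}")
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```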

---

## Phase-Based Decomposition

Large tasks should be decomposed into independent phases that can be executed in parallel by multiple agents. This dramatically reduces implementation time.

### Phase Independence Rules

1. **No shared state** — Each phase writes to different files/tables
2. **No blocking dependencies** — Phase 3 shouldn't wait for Phase 2's output
3. **Clear boundaries** — Each phase has its own acceptance criteria
4. **Testable isolation** — Phase tests don't require other phases
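
Rule 1 can be checked before any agent starts by comparing the file lists from each phase's implementation checklist. The sketch below assumes those lists have already been extracted into a mapping of phase number to file paths. The `shared_files` function is a hypothetical helper, and it covers files only, not database tables.

```python
from itertools import combinations


def shared_files(phase_files: dict) -> dict:
    """Map each pair of phases to the files both of them intend to write.

    `phase_files` maps a phase number to a set of file paths.
    An empty result means no two phases touch the same file.
    """
    conflicts = {}
    for (a, files_a), (b, files_b) in combinations(sorted(phase_files.items()), 2):
        overlap = files_a & files_b
        if overlap:
            conflicts[(a, b)] = overlap
    return conflicts
```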

### Example Decomposition

A feature like "BioHost Missing Features" might decompose into:

| Phase | Focus | Can Parallel With |
|-------|-------|-------------------|
| 1 | Domain Management | 2, 3, 4 |
| 2 | Project System | 1, 3, 4 |
| 3 | Analytics Core | 1, 2, 4 |
| 4 | Form Submissions | 1, 2, 3 |
| 5 | Link Scheduling | 1, 2, 3, 4 |
| ... | ... | ... |
| 12 | MCP Tools (polish) | After 1-11 |
| 13 | Admin UI (polish) | After 1-11 |

### Phase Sizing

- **Target**: 4-8 acceptance criteria per phase
- **Estimated time**: 2-4 hours per phase
- **Test count**: 15-40 tests per phase
- **File count**: 3-10 files modified per phase

---

## Standard Phase Types

Every large task should include these phase types:

### Core Implementation Phases (1-N)

The main feature work. Group by:
- **Resource type** (domains, projects, analytics)
- **Functional area** (CRUD, scheduling, notifications)
- **Data flow** (input, processing, output)

### Polish Phase: MCP Tools

**Always include as second-to-last phase.**

Exposes all implemented features to AI agents via MCP.

Standard acceptance criteria:
- [ ] MCP tool class exists at `app/Mcp/Tools/{Feature}Tools.php`
- [ ] All CRUD operations exposed as actions
- [ ] Tool includes prompts for common workflows
- [ ] Tool includes resources for data access
- [ ] Tests verify all MCP actions return expected responses
- [ ] Tool registered in MCP service provider

### Polish Phase: Admin UI Integration

**Always include as final phase.**

Integrates features into the admin dashboard.

Standard acceptance criteria:
- [ ] Sidebar navigation updated with feature section
- [ ] Index/list page with filtering and search
- [ ] Detail/edit pages for resources
- [ ] Bulk actions where appropriate
- [ ] Breadcrumb navigation
- [ ] Role-based access control
- [ ] Tests verify all admin routes respond correctly

---

## Parallel Agent Execution

### Firing Multiple Agents

When phases are independent, fire agents simultaneously:

```
Human: "Implement phases 1-4 in parallel"

Agent fires 4 Task tools simultaneously:
- Task(Phase 1: Domain Management)
- Task(Phase 2: Project System)
- Task(Phase 3: Analytics Core)
- Task(Phase 4: Form Submissions)
```
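
Outside an agent runtime, the same fan-out can be approximated with a thread pool. The sketch below is illustrative only: `run_phase` is a hypothetical stand-in for whatever actually launches an agent, and the shape of its return value (files, test count, blockers) is an assumption based on the report requested in the prompt template.

```python
from concurrent.futures import ThreadPoolExecutor


def run_phase(phase: str) -> dict:
    """Hypothetical placeholder: launch one agent and return its report."""
    return {"phase": phase, "files": [], "tests": 0, "blockers": []}


def run_wave(phases: list, launch=run_phase) -> list:
    """Run every phase of a wave concurrently and wait for all reports.

    Reports come back in the same order as `phases`.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(phases))) as pool:
        return list(pool.map(launch, phases))
```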

### Agent Prompt Template

```
You are implementing Phase X of TASK-XXX: [Task Title]

Read the task file at: tasks/TASK-XXX-feature-name.md

Your phase covers acceptance criteria ACxx through ACyy.

Implementation requirements:
1. Create all files listed in the Phase X implementation checklist
2. Write comprehensive Pest tests (target: 20-40 tests)
3. Follow existing codebase patterns
4. Use workspace-scoped multi-tenancy
5. Check entitlements for tier-gated features

When complete:
1. Update the task file marking Phase X checklist items done
2. Report: files created, test count, any blockers

Do NOT mark acceptance criteria as passed — verification agent does that.
```

### Coordination Rules

1. **Linter accepts all** — Configure to auto-accept agent file modifications
2. **No merge conflicts** — Phases write to different files
3. **Collect results** — Wait for all agents, then fire next wave
4. **Wave pattern** — Group dependent phases into waves

### Wave Execution Example

```
Wave 1 (parallel): Phases 1, 2, 3, 4
    ↓ (all complete)
Wave 2 (parallel): Phases 5, 6, 7, 8
    ↓ (all complete)
Wave 3 (parallel): Phases 9, 10, 11
    ↓ (all complete)
Wave 4 (sequential): Phase 12 (MCP), then Phase 13 (UI)
```
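
A wave plan like this can be derived from a dependency map by repeatedly taking the phases whose dependencies are already complete. The sketch below is one way to do it, assuming dependencies are recorded as a mapping from phase number to the phases it must wait for. `plan_waves` and the `max_per_wave` cap of 4 are illustrative choices, not part of the protocol.

```python
def plan_waves(depends_on: dict, max_per_wave: int = 4) -> list:
    """Group phases into waves; each wave only needs earlier waves to be done."""
    remaining = {phase: set(deps) for phase, deps in depends_on.items()}
    done = set()
    waves = []
    while remaining:
        ready = sorted(p for p, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("circular or unsatisfiable phase dependency")
        wave = ready[:max_per_wave]  # cap the number of simultaneous agents
        waves.append(wave)
        done.update(wave)
        for phase in wave:
            del remaining[phase]
    return waves


# Phases 1-11 are independent; 12 waits for 1-11, and 13 also waits for 12.
deps = {p: set() for p in range(1, 12)}
deps[12] = set(range(1, 12))
deps[13] = set(range(1, 13))
print(plan_waves(deps))
# [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11], [12], [13]]
```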

---

## Task File Schema (Extended)

For large phased tasks, extend the schema:

```markdown
# TASK-XXX: [Feature Name]

**Status:** draft | ready | in_progress | needs_verification | verified | approved
**Created:** YYYY-MM-DD
**Last Updated:** YYYY-MM-DD HH:MM by [agent/human]
**Complexity:** small (1-3 phases) | medium (4-8 phases) | large (9+ phases)
**Estimated Phases:** N
**Completed Phases:** M/N

---

## Objective

[One paragraph: what does "done" look like?]

---

## Scope

- **Models:** X new, Y modified
- **Migrations:** Z new tables
- **Livewire Components:** A new
- **Tests:** B target test count
- **Estimated Hours:** C-D hours

---

## Phase Overview

| Phase | Name | Status | ACs | Tests |
|-------|------|--------|-----|-------|
| 1 | Domain Management | ✅ Done | AC1-5 | 28 |
| 2 | Project System | ✅ Done | AC6-10 | 32 |
| 3 | Analytics Core | 🔄 In Progress | AC11-16 | - |
| ... | ... | ... | ... | ... |
| 12 | MCP Tools | ⏳ Pending | AC47-53 | - |
| 13 | Admin UI | ⏳ Pending | AC54-61 | - |

---

## Acceptance Criteria

### Phase 1: Domain Management

- [ ] AC1: [Criterion]
- [ ] AC2: [Criterion]
...

### Phase 12: MCP Tools (Standard)

- [ ] AC47: MCP tool class exists with all feature actions
- [ ] AC48: CRUD operations for all resources exposed
- [ ] AC49: Bulk operations exposed (where applicable)
- [ ] AC50: Query/filter operations exposed
- [ ] AC51: MCP prompts created for common workflows
- [ ] AC52: MCP resources expose read-only data access
- [ ] AC53: Tests verify all MCP actions

### Phase 13: Admin UI Integration (Standard)

- [ ] AC54: Sidebar updated with feature navigation
- [ ] AC55: Feature has expandable submenu (if 3+ pages)
- [ ] AC56: Index pages with DataTable/filtering
- [ ] AC57: Create/Edit forms with validation
- [ ] AC58: Detail views with related data
- [ ] AC59: Bulk action support
- [ ] AC60: Breadcrumb navigation
- [ ] AC61: Role-based visibility

---

## Implementation Checklist

### Phase 1: Domain Management
- [ ] File: `app/Models/...`
- [ ] File: `app/Livewire/...`
- [ ] Test: `tests/Feature/...`

### Phase 12: MCP Tools
- [ ] File: `app/Mcp/Tools/{Feature}Tools.php`
- [ ] File: `app/Mcp/Prompts/{Feature}Prompts.php` (optional)
- [ ] File: `app/Mcp/Resources/{Feature}Resources.php` (optional)
- [ ] Test: `tests/Feature/Mcp/{Feature}ToolsTest.php`

### Phase 13: Admin UI
- [ ] File: `resources/views/admin/components/sidebar.blade.php` (update)
- [ ] File: `app/Livewire/Admin/{Feature}/Index.php`
- [ ] File: `resources/views/livewire/admin/{feature}/index.blade.php`
- [ ] Test: `tests/Feature/Admin/{Feature}Test.php`

---

## Verification Results

[Same as before]

---

## Phase Completion Log

### Phase 1: Domain Management
**Completed:** YYYY-MM-DD by [Agent ID]
**Tests:** 28 passing
**Files:** 8 created/modified
**Notes:** [Any context]

### Phase 2: Project System
**Completed:** YYYY-MM-DD by [Agent ID]
**Tests:** 32 passing
...
```

---

## MCP Endpoint (Future)

When implemented, the MCP endpoint will expose:

```
GET  /tasks                          # List all tasks with status
GET  /tasks/{id}                     # Get task details
POST /tasks/{id}/claim               # Agent claims a task
POST /tasks/{id}/complete            # Agent marks ready for verification
POST /tasks/{id}/verify              # Verification agent submits results
GET  /tasks/next                     # Get next unclaimed task
GET  /tasks/verify-queue             # Get tasks needing verification
POST /tasks/{id}/phases/{n}/claim    # Claim specific phase
POST /tasks/{id}/phases/{n}/complete # Complete specific phase
GET  /tasks/{id}/phases              # List phase status
```
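
Since the endpoint does not exist yet, the following is only an in-memory sketch of the behaviour a few of these routes might have. Every name here (`Task`, `TaskBoard`, the method names) is hypothetical, and the rule that an agent never sees its own work in the verify queue is an assumption drawn from the separation of implementer and verifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    id: str
    status: str = "ready"
    assignee: Optional[str] = None


class TaskBoard:
    def __init__(self, tasks):
        self.tasks = {t.id: t for t in tasks}

    def next_unclaimed(self) -> Optional[Task]:
        """GET /tasks/next: first ready task nobody has claimed."""
        for task in self.tasks.values():
            if task.status == "ready" and task.assignee is None:
                return task
        return None

    def claim(self, task_id: str, agent: str) -> Task:
        """POST /tasks/{id}/claim: assign the task and start work."""
        task = self.tasks[task_id]
        if task.status != "ready" or task.assignee is not None:
            raise ValueError(f"{task_id} is not available to claim")
        task.assignee, task.status = agent, "in_progress"
        return task

    def complete(self, task_id: str, agent: str) -> Task:
        """POST /tasks/{id}/complete: hand the task over for verification."""
        task = self.tasks[task_id]
        if task.assignee != agent or task.status != "in_progress":
            raise ValueError(f"{task_id} is not in progress for {agent}")
        task.status = "needs_verification"
        return task

    def verify_queue(self, agent: str) -> list:
        """GET /tasks/verify-queue: tasks to verify, excluding the caller's own."""
        return [
            t for t in self.tasks.values()
            if t.status == "needs_verification" and t.assignee != agent
        ]
```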

---

## Metrics to Track

- Tasks created vs completed (per week)
- Verification pass rate on first attempt
- Average time from ready → approved
- Most common failure reasons
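
Two of these metrics can be computed directly from the verdicts and timestamps recorded in task files. The sketch below assumes that data has already been extracted. The function names and input shapes are illustrative, and timestamps are assumed to be ISO 8601 strings.

```python
from datetime import datetime
from statistics import mean


def first_attempt_pass_rate(verdicts_per_task: dict) -> float:
    """Share of verified tasks whose first check was a PASS.

    `verdicts_per_task` maps a task id to its verdicts in order, e.g. ["FAIL", "PASS"].
    Tasks with no checks yet are ignored.
    """
    checked = [v for v in verdicts_per_task.values() if v]
    if not checked:
        return 0.0
    return sum(v[0] == "PASS" for v in checked) / len(checked)


def average_hours_ready_to_approved(spans: list) -> float:
    """Mean hours between the `ready` and `approved` timestamps of each task."""
    hours = [
        (datetime.fromisoformat(approved) - datetime.fromisoformat(ready)).total_seconds() / 3600
        for ready, approved in spans
    ]
    return mean(hours)
```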

---

## Cross-Cutting Concerns

When a feature applies to multiple modules, extract it.

### Example: Core Bouncer

The Commerce Matrix Plan included an "Internal WAF" section — a request whitelisting system with training mode. During audit, we realised:

- It's not commerce-specific
- It applies to all admin routes, all API endpoints
- It should be in `Core/`, not `Commerce/`

**Action:** Extracted to `CORE_BOUNCER_PLAN.md` as a framework-level concern.

### Signs to Extract

- Feature name doesn't include the module name naturally
- You'd copy-paste it to other modules
- It's about infrastructure, not business logic
- Multiple modules would benefit independently

### How to Extract

1. Create new task file for the cross-cutting concern
2. Add note to original plan: `> **EXTRACTED:** Section moved to X`
3. Update TODO.md with the new task
4. Don't delete from original — leave the note for context

---

## Retrospective Audits

Periodically audit archived tasks against actual implementation.

### When to Audit

- Before starting dependent work
- When resuming a project after a break
- When something "complete" seems broken
- Monthly for active projects

### Audit Process

1. Read the archived task file
2. Check each acceptance criterion against codebase
3. Run the tests mentioned in the task
4. Document gaps found

### Audit Template

```markdown
## Audit: TASK-XXX
**Date:** YYYY-MM-DD
**Auditor:** [human/agent]

| Claimed | Actual | Gap |
|---------|--------|-----|
| Phase 1 complete | ✅ Verified | None |
| Phase 2 complete | ⚠️ Partial | Missing X service |
| Phase 3 complete | ❌ Not done | Only stubs exist |

**Action items:**
- [ ] Create TASK-YYY for Phase 2 gap
- [ ] Move Phase 3 back to TODO as incomplete
```
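
The first audit check, whether the claimed files exist at all, is easy to automate. The sketch below assumes the task file lists paths as backtick-quoted bullet items, as in the File Manifest example, and treats any such item containing a `/` as a path. `manifest_paths` and `missing_files` are hypothetical helpers, and a file that exists may still contain only stubs, so this does not replace reading the code.

```python
import re
from pathlib import Path


def manifest_paths(task_text: str) -> list:
    """Backtick-quoted bullet items that look like file paths (contain '/')."""
    items = re.findall(r"^- `([^`]+)`", task_text, flags=re.MULTILINE)
    return [item for item in items if "/" in item]


def missing_files(task_text: str, repo_root: str) -> list:
    """Paths claimed in the task file that are not present under repo_root."""
    root = Path(repo_root)
    return [p for p in manifest_paths(task_text) if not (root / p).is_file()]
```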

---

## Anti-Patterns to Avoid

### General

1. **Same agent implements and verifies** — defeats the purpose
2. **Vague acceptance criteria** — "it works" is not verifiable
3. **Skipping verification** — the whole point is independent checking
4. **Bulk marking as done** — verify one task at a time
5. **Human approving without spot-check** — trust but verify

### Evidence & Documentation

6. **Checklist without evidence** — planning ≠ implementation
7. **Skipping "What Was Built" summary** — context lost on compaction
8. **No test count** — can't verify without knowing what to run
9. **Marking section "done" without implementation** — major gaps discovered in audits
10. **Vague TODO items** — "Warehouse system" hides 6 distinct features

### Parallel Execution

11. **Phases with shared files** — causes merge conflicts
12. **Sequential dependencies in same wave** — blocks parallelism
13. **Skipping polish phases** — features hidden from agents and admins
14. **Too many phases per wave** — diminishing returns past 4-5 agents
15. **No wave boundaries** — chaos when phases actually do depend

### MCP Tools

16. **Exposing without testing** — broken tools waste agent time
17. **Missing bulk operations** — agents do N calls instead of 1
18. **No error context** — agents can't debug failures

### Admin UI

19. **Flat navigation for large features** — use expandable submenus
20. **Missing breadcrumbs** — users get lost
21. **No bulk actions** — tedious admin experience

### Cross-Cutting Concerns

22. **Burying framework features in module plans** — extract them
23. **Assuming module-specific when it's not** — ask "would other modules need this?"

---

## Quick Reference: Creating a New Task

1. Copy the extended schema template
2. Fill in objective and scope
3. Decompose into phases (aim for 4-8 ACs each)
4. Map phase dependencies → wave structure
5. Check for cross-cutting concerns — extract if needed
6. **Always add Phase N-1: MCP Tools**
7. **Always add Phase N: Admin UI Integration**
8. Set status to `draft`, get human review
9. When `ready`, fire Wave 1 agents in parallel
10. Collect results with evidence (commits, tests, files)
11. Fire next wave
12. After all phases, run verification agent
13. Human approval → move to `archive/released/`

---

## Quick Reference: Completing a Phase

1. Do the work
2. Run the tests
3. Record evidence:
   - Git commits (hashes + messages)
   - Test count and command to run them
   - Files created/modified
   - "What Was Built" summary (2-3 sentences)
4. Update task file with Phase Completion Log entry
5. Set phase status to ✅ Done
6. Move to next phase or request verification

---

## Quick Reference: Auditing Archived Work

1. Read `archive/released/` task file
2. For each phase marked complete:
   - Check files exist
   - Run listed tests
   - Verify against acceptance criteria
3. Document gaps using Audit Template
4. Create new tasks for missing work
5. Update TODO.md with accurate status

---

*This protocol exists because agents lie (unintentionally). The system catches the lies. Parallel execution makes them lie faster, so we verify more. Evidence requirements ensure lies are caught before archiving.*