# Batch Orchestrator for re:struct
## Problem Statement
When tasks are pre-planned (e.g., discovery waves, user stories, architecture decisions),
the human operator is reduced to a manual scheduler:
```
session-start → "do task X" → session-end → clear context → commit → repeat
```
This is repetitive and adds no value when the next steps are already defined.
## Solution Overview
A lightweight in-session orchestrator that reads planned tasks from `task.md`,
dispatches subagents for execution (fresh context per task), and pauses at
configurable checkpoints for human review.
**Approach:** Slim orchestrator + subagent isolation (Approach 3 from evaluation).
- Subagents do the heavy work with their own context
- Orchestrator reads only task.md + subagent summaries (never full output files)
- Parallelization within groups, sequential between groups
- Phase boundary is a hard stop — never crossed automatically
## Task Parsing from task.md
### Which section does the orchestrator parse?
A `task.md` may contain multiple sections with checkboxes — e.g., a "Waves" section with
sub-headings AND a flat "Definition of Done" section. These serve different purposes:
- **Work section** (with `###` sub-headings): actionable tasks with grouping — this is what
the orchestrator parses. Identified by having sub-headings (`###`) under it.
- **Definition of Done section** (flat list): acceptance criteria for phase completion —
used only for phase-boundary checks, not for dispatching work.
**Convention:** The orchestrator parses the **first section in task.md that contains
`###` sub-headings with `- [ ]` checkboxes**. This is the work plan. Flat checkbox lists
without sub-headings are treated as DoD and ignored for dispatching.
Example `task.md` with both sections:
```markdown
## Analysis Waves ← WORK SECTION (orchestrator parses this)
### Wave 1 — Structure
- [ ] structure-map.md
### Wave 2 — Entry Points
- [ ] entry-points.md
### Wave 3 — Module Deep Dives
- [ ] Module: auth
- [ ] Module: invoicing
- [ ] Module: reporting
## Results (Definition of Done) ← DoD SECTION (phase check only)
- [ ] structure-map.md
- [ ] domain-model.md
- [ ] capabilities-inventory.md
```
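The section-selection convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_work_section` is hypothetical, top-level sections are assumed to start with `## `, and a section qualifies as soon as one open `- [ ]` item appears under any `###` sub-heading.

```python
import re

OPEN_TASK = re.compile(r"\s*- \[ \] ")

def find_work_section(markdown: str) -> list[str]:
    """Return the lines of the first `##` section holding open tasks under `###` headings."""
    sections: list[list[str]] = []
    for line in markdown.splitlines():
        if line.startswith("## "):
            sections.append([line])        # a new top-level section begins
        elif sections:
            sections[-1].append(line)      # body line of the current section

    for section in sections:
        under_subheading = False
        for line in section[1:]:
            if line.startswith("### "):
                under_subheading = True
            elif under_subheading and OPEN_TASK.match(line):
                return section             # first match wins: this is the work plan
    return []                              # only flat lists (DoD) or no open tasks
```

A flat Definition of Done list never sets `under_subheading`, so it is skipped for dispatching, matching the convention.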
### Conventions
- `- [ ]` = open task, eligible for orchestrator
- `- [x]` = completed, skipped
- Tasks under the same `###` heading = **parallelization group** (can run concurrently)
- Tasks under different `###` headings = **sequential** (Group 2 starts after Group 1)
- Each task requires an **output path**, either explicit in the text or derivable from the task name
- Each task must contain enough context for the subagent: what to do, which sources to read
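As a rough sketch of these conventions, the following Python turns the lines of a work section into parallelization groups. It assumes the work section's lines have already been isolated; `Group` and `parse_groups` are hypothetical names, not part of the spec.

```python
import re
from dataclasses import dataclass, field

# Matches "- [ ] text" (open) and "- [x] text" (completed).
TASK_RE = re.compile(r"^\s*- \[( |x|X)\] (.+)$")

@dataclass
class Group:
    heading: str
    open_tasks: list[str] = field(default_factory=list)

def parse_groups(section_lines: list[str]) -> list[Group]:
    """Group open tasks by their `###` heading, in document order."""
    groups: list[Group] = []
    for line in section_lines:
        if line.startswith("### "):
            groups.append(Group(heading=line[4:].strip()))
            continue
        match = TASK_RE.match(line)
        # Completed items ("x") are skipped; open items join the current group.
        if match and groups and match.group(1) == " ":
            groups[-1].open_tasks.append(match.group(2).strip())
    # Groups without open tasks have nothing left to dispatch.
    return [g for g in groups if g.open_tasks]
```

The order of the returned list is the sequential order between groups; the tasks inside one `Group` are the ones that may run concurrently.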
### What the orchestrator does NOT read from task.md
- Checkpoint mode and batch size — provided by the user at invocation time
- The Definition of Done section — only relevant for phase-boundary checks
## Checkpoint Modes
Three modes, selected by the user at batch start:
| Mode | Behavior | Typical Use |
|------|----------|-------------|
| `after-each` (default) | Pause after each group, show summaries, wait for "continue" | New/uncertain tasks, want to review every output |
| `after-batch:N` | Run N groups, then pause | "Do the next 3, then I'll check" |
| `at-end` | Run all groups, consolidated review at the end | Well-understood tasks, high confidence |
### Checkpoint Behavior
At each checkpoint the orchestrator:
1. Shows per completed task: task name, output file path, 2-3 sentence summary (provided by subagent)
2. User can respond:
   - `weiter` (continue): proceed to the next group / next batch
   - `stopp` (stop): orchestrator consolidates current results and ends
   - `zeig mir [file]` (show me): review an output file in detail before deciding
3. After final checkpoint: consolidation (status.md, decisions-log, task.md checkboxes, commit proposal)
### Checkpoint with Parallel Tasks
| Mode | Behavior with group of 3 parallel tasks |
|------|----------------------------------------|
| `after-each` | All 3 run in parallel, checkpoint when **all 3 complete** — show all summaries |
| `after-batch:N` | Batch counter counts groups, not individual tasks |
| `at-end` | Run through, checkpoint only after all groups |
Parallelization does not change checkpoint rhythm — checkpoints refer to groups, not individual subagents.
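The group-based rhythm can be expressed as a small decision function. This is a minimal sketch, assuming the mode is passed as the same string the user types (`after-each`, `after-batch:N`, `at-end`); the name `should_checkpoint` is hypothetical.

```python
def should_checkpoint(mode: str, groups_done: int, total_groups: int) -> bool:
    """Decide whether to pause once `groups_done` groups have completed."""
    if mode == "after-each":
        interval = 1
    elif mode == "at-end":
        interval = total_groups
    elif mode.startswith("after-batch:"):
        interval = int(mode.split(":", 1)[1])
        if interval < 1:
            raise ValueError("after-batch:N requires N >= 1")
    else:
        raise ValueError(f"unknown checkpoint mode: {mode!r}")
    # The counter is in groups, never in individual tasks; the last group always pauses.
    return groups_done >= total_groups or groups_done % interval == 0
```

Because the function only ever sees a group counter, a group of three parallel tasks advances it by one, exactly like a group with a single task.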
## Phase Protection
- The orchestrator only processes tasks from the **active phase** (defined in CLAUDE.md "Aktiver Scope")
- When the last open task of a phase completes: orchestrator **stops unconditionally** and informs the user that all DoD items are done and a phase transition is possible
- The orchestrator **never** triggers `/project:phase-transition` itself or starts tasks from the next phase
- Even in `at-end` mode: phase boundary = hard stop, no override possible
This is an implicit fourth checkpoint type: the **phase checkpoint**, which always applies regardless of selected mode.
## Relationship to session-start / session-end
The batch orchestrator **replaces** the manual session-start/end cycle for batch work:
- **session-start is not required** before run-batch. The orchestrator reads task.md and
CLAUDE.md directly — it does not need the user to restore context manually.
- **session-end is incorporated** into the orchestrator's consolidation step (Step 4).
The consolidation performs the same updates as session-end: status.md, decisions-log.md,
task.md checkboxes. A separate session-end after run-batch is not needed.
- **The user may still use session-start** before run-batch if they want to review context
first — but it is optional, not required.
In short: `/project:run-batch` is a self-contained replacement for the
`session-start → work → session-end → clear → commit` cycle.
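One part of the consolidation that replaces session-end, flipping checkboxes in task.md, might look like the sketch below. It is an assumption-laden illustration: `mark_done` is a hypothetical helper, tasks are matched by their exact text, and only items under a `###` sub-heading are touched so that a flat Definition of Done list stays unchanged.

```python
def mark_done(task_md: str, completed: set[str]) -> str:
    """Mark completed work-section tasks as [x]; failed or untouched tasks stay [ ]."""
    out = []
    in_group = False
    for line in task_md.splitlines():
        if line.startswith("## "):
            in_group = False               # a new top-level section resets grouping
        elif line.startswith("### "):
            in_group = True                # tasks below belong to a work group
        elif in_group and line.startswith("- [ ] ") and line[6:].strip() in completed:
            line = "- [x] " + line[6:]
        out.append(line)
    return "\n".join(out)
```

Only the orchestrator would call such a helper; subagents never write to task.md.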
## Orchestrator Flow
```
User: "Work through the open discovery tasks" (or /project:run-batch)
├─ 1. Read task.md of active phase, parse open tasks
├─ 1.5. Permission check:
│ • Verify required tools are allowed (Read, Write, Edit, Glob, Grep, Agent)
│ • If permissions missing → inform user, suggest settings.json config, abort
├─ 2. Check batch parameters — anything missing? → Ask interactively:
│ • "Which tasks? (all open / specific items / range)"
│ • "Checkpoint mode? (after-each / after-batch:N / at-end)"
│ • Show summary + wait for confirmation
├─ 3. Process groups sequentially:
│ │
│ ├─ Group 1:
│ │ ├─ Dispatch tasks in parallel (subagents)
│ │ ├─ Wait for completion
│ │ ├─ On subagent failure: mark task as failed, include error
│ │ │ in checkpoint summary, do not block other parallel tasks
│ │ ├─ Checkpoint per selected mode
│ │ └─ On "stop" → consolidation → end
│ │
│ ├─ Group 2:
│ │ └─ ... (only after Group 1 completes + checkpoint passes)
│ │
│ └─ Phase boundary reached?
│ → HARD STOP, inform user
├─ 4. Consolidation (orchestrator only, never subagents):
│ • task.md: mark completed items [x], failed items stay [ ]
│ • status.md: update current state (include failures if any)
│ • decisions-log.md: add entries if applicable
│ • Phase-specific shared artifacts (e.g., capabilities-inventory.md
│ in Discovery): update with consolidated findings from subagents
└─ 5. Commit proposal to user
```
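Step 3 of the flow above can be sketched as a plain loop. The three callables are hypothetical stand-ins (dispatching subagents, the checkpoint policy, and the user prompt); none of them is an existing API, and the reply strings are the `weiter` / `stopp` commands from the checkpoint section.

```python
from typing import Callable, Sequence

def run_batch(
    groups: Sequence[str],
    run_group: Callable[[str], list[str]],   # assumed: runs one group, returns task summaries
    pause_after: Callable[[int], bool],      # assumed: checkpoint policy, given groups done
    ask_user: Callable[[list[str]], str],    # assumed: shows summaries, returns "weiter" or "stopp"
) -> tuple[list[str], str]:
    """Process groups strictly in order; return (summaries, reason the loop ended)."""
    summaries: list[str] = []
    pending: list[str] = []                  # summaries not yet shown to the user
    for done, group in enumerate(groups, start=1):
        pending.extend(run_group(group))     # next group starts only after this returns
        if pause_after(done):
            reply = ask_user(pending)
            summaries.extend(pending)
            pending = []
            if reply == "stopp":
                return summaries, "stopped by user"
    summaries.extend(pending)
    return summaries, "all selected groups done"
```

Either return path would then lead into consolidation (Step 4) and the commit proposal (Step 5); the loop itself never starts work beyond the groups it was given.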
## Subagent Contract
### Context Loading Strategy
Each subagent needs context to do its work. The orchestrator determines context through
a layered approach:
**Base context (always loaded):**
- Project rules of the active phase (framework-constraints, language-conventions, parallel-agents)
- status.md of the active phase (current state summary)
**Phase-specific context (loaded by convention):**
- If phase-specific rules define context requirements (e.g., `discovery-waves.md` specifies
"Wave 2 loads structure-map.md"), the orchestrator follows those rules.
**Task-specific context (optional, from task.md):**
- Tasks in task.md can declare explicit context dependencies:
```markdown
- [ ] domain-model.md (context: structure-map.md, entry-points.md)
```
- If no explicit context is declared: the orchestrator passes all existing output files
from prior groups in the current batch to the subagent as context.
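A possible way to read the optional `(context: ...)` declaration is shown below. The sketch assumes the declaration always sits at the end of the task text and lists file names separated by commas; `split_context` is a hypothetical name.

```python
import re

# Trailing "(context: a.md, b.md)" annotation on a task line.
CONTEXT_RE = re.compile(r"\(context:\s*([^)]*)\)\s*$")

def split_context(task_text: str) -> tuple[str, list[str]]:
    """Split a task into its name and its explicitly declared context files."""
    match = CONTEXT_RE.search(task_text)
    if not match:
        # No declaration: the caller falls back to outputs of prior groups.
        return task_text.strip(), []
    files = [name.strip() for name in match.group(1).split(",") if name.strip()]
    return task_text[: match.start()].strip(), files
```

An empty list is the signal for the fallback rule, so the two cases stay distinguishable for the orchestrator.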
**Context budget principle:** The orchestrator always prefers condensed documents
(structure-map.md, status.md) over raw sources. Raw code is only passed when
the task explicitly requires it (e.g., a module deep-dive task references a source path).
### Input (provided by orchestrator)
- Exactly one task with clear assignment
- Context files per the loading strategy above
- Output file path
- Project rules (framework-constraints, language-conventions, parallel-agents)
### Output (returned to orchestrator)
- Output file written to disk at the specified path
- Summary (2-3 sentences) as return value to orchestrator
### Boundaries
- Subagent writes **only** its own output file — never shared state
- Subagent does **not** update task.md, status.md, or decisions-log.md
- Orchestrator reads **only** subagent summaries — never full output files
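The contract can be pictured as two small records plus a boundary check. All names here (`SubagentInput`, `SubagentResult`, `boundary_violations`) are hypothetical, and how the list of written paths would be obtained is left open; the sketch only shows the shape of the data.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

@dataclass(frozen=True)
class SubagentInput:
    task: str                        # exactly one task
    context_files: tuple[str, ...]   # chosen by the context loading strategy
    output_path: str                 # the only file the subagent may write
    rules: tuple[str, ...]           # project rules of the active phase

@dataclass(frozen=True)
class SubagentResult:
    output_path: str
    summary: str                     # 2-3 sentences, all the orchestrator reads
    error: Optional[str] = None      # set when the subagent failed

def boundary_violations(inp: SubagentInput, written_paths: list[str]) -> list[str]:
    """Return every written path that is not the subagent's own output file."""
    allowed = PurePosixPath(inp.output_path)
    return sorted(p for p in written_paths if PurePosixPath(p) != allowed)
```

Keeping the result to a path and a short summary is what lets the orchestrator stay slim across many tasks.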
## Skill Interface
### Command
```
/project:run-batch → asks everything interactively
/project:run-batch items:1-3 checkpoint:after-each
/project:run-batch items:all checkpoint:at-end
/project:run-batch items:wave-3 checkpoint:after-batch:2
```
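The `key:value` arguments shown above could be split as follows. This is a sketch under the assumption that `items` and `checkpoint` are the only recognized keys and that arguments are separated by whitespace; `parse_run_batch_args` is a hypothetical name.

```python
def parse_run_batch_args(argline: str) -> dict[str, str]:
    """Parse e.g. 'items:wave-3 checkpoint:after-batch:2' into a dict."""
    params: dict[str, str] = {}
    for token in argline.split():
        # Split on the first colon only, so "after-batch:2" stays intact.
        key, sep, value = token.partition(":")
        if not sep or not value or key not in ("items", "checkpoint"):
            raise ValueError(f"unrecognized argument: {token!r}")
        params[key] = value
    return params
```

Any key missing from the result is a parameter the orchestrator still has to ask for interactively.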
### Natural Language Triggers (examples)
```
"Work through the next 3 open tasks, show me the result after each one"
"Run all open Wave 3 modules in parallel"
"Go through the discovery tasks, checkpoint at the end"
```
### Interactive Parameter Query
When parameters are missing or incomplete, the orchestrator asks before starting:
```
Orchestrator: "I see 5 open tasks in Phase 1 (Discovery):
1. [ ] structure-map.md (Wave 1)
2. [ ] entry-points.md (Wave 2)
3. [ ] Module: auth (Wave 3)
4. [ ] Module: invoicing (Wave 3)
5. [ ] Module: reporting (Wave 3)
Which tasks should I work through? (all / numbers / range)
Checkpoint mode? (after-each / after-batch:N / at-end)"
```
## Parallelization Rules
- Tasks under the same heading = one parallelization group
- Within a group: all tasks dispatched simultaneously as subagents
- Between groups: strictly sequential — Group 2 starts only after Group 1 fully completes AND checkpoint passes
- Checkpoint mode applies **per group**, not per individual task within parallel groups
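Dispatching one group in parallel, with failures isolated per task, can be approximated with a thread pool. Threads are only a stand-in for real subagents here, and `run_subagent` is a hypothetical callable that returns a summary or raises on failure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def run_group(tasks: list[str], run_subagent: Callable[[str], str]) -> list[dict]:
    """Run all tasks of one group concurrently; failures are recorded, not raised."""
    def safe(task: str) -> dict:
        try:
            return {"task": task, "ok": True, "summary": run_subagent(task)}
        except Exception as exc:
            # A failed task must not block its siblings in the same group.
            return {"task": task, "ok": False, "summary": f"failed: {exc}"}

    if not tasks:
        return []
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        # map() yields results in task order; leaving the block waits for all tasks.
        return list(pool.map(safe, tasks))
```

The function returns only when every task in the group has finished, which is the point at which the group's checkpoint applies.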
## Permissions
Required permissions in `.claude/settings.json`:
```json
{
  "permissions": {
    "allow": [
      "Read",
      "Glob",
      "Grep",
      "Edit",
      "Write",
      "Agent"
    ]
  }
}
```
- Configured once in settings.json, no runtime flags needed
- Subagents inherit the parent session's permission context
- Orchestrator checks permissions at startup (Step 1.5) and aborts with guidance if missing
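The startup check could be as simple as comparing the `allow` list against the required tools. This sketch assumes the entries are plain tool names exactly as in the snippet above and does no pattern matching; `missing_permissions` is a hypothetical helper that takes the file contents as a string.

```python
import json

REQUIRED_TOOLS = ("Read", "Glob", "Grep", "Edit", "Write", "Agent")

def missing_permissions(settings_text: str) -> list[str]:
    """Return the required tools that are absent from permissions.allow."""
    settings = json.loads(settings_text)
    allowed = set(settings.get("permissions", {}).get("allow", []))
    return [tool for tool in REQUIRED_TOOLS if tool not in allowed]
```

A non-empty result would be the cue to inform the user, suggest the settings.json configuration, and abort before any subagent is dispatched.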
## Artifacts
| Artifact | Purpose |
|----------|---------|
| `rules/batch-orchestration.md` | Rules for orchestrator/subagent contract |
| `.claude/commands/run-batch.md` | The skill/command definition |
| `.claude/settings.json` | Permission configuration |
| `docs/superpowers/specs/2026-03-21-batch-orchestrator-design.md` | This spec |
| `docs/superpowers/specs/2026-03-21-batch-orchestrator-future.md` | Future improvements |
## Decisions Made
| Decision | Rationale |
|----------|-----------|
| In-session orchestrator with subagents (not external script) | Keeps checkpoints interactive, stays in the Claude ecosystem |
| Slim orchestrator (summaries only, no full output reads) | Prevents context overflow with many tasks |
| task.md as single source of truth | No redundant runbook file, existing structure is machine-readable |
| Heading-based parallelization groups | Already matches existing task.md structure |
| Phase boundary = hard stop | Phase transitions require human intent per framework rules |
| Missing parameters always queried interactively | Prevents accidental batch runs with wrong settings |
| Permission check at startup | Prevents hanging on permission prompts mid-batch |
| Default checkpoint mode: after-each | Safest option, user opts into more autonomy explicitly |
| Orchestrator parses work section (with ###), not DoD section | Avoids ambiguity between actionable tasks and acceptance criteria |
| run-batch replaces session-start/end cycle | Eliminates redundant manual steps; consolidation is built into the flow |
| Layered context loading for subagents | Base context always available; phase-specific and task-specific context keeps the approach generic across phases |