
# Batch Orchestrator for re:struct

## Problem Statement

When tasks are pre-planned (e.g., discovery waves, user stories, architecture decisions),
the human operator is reduced to a manual scheduler:

```
session-start → "do task X" → session-end → clear context → commit → repeat
```

This is repetitive and adds no value when the next steps are already defined.

## Solution Overview

A lightweight in-session orchestrator that reads planned tasks from `task.md`,
dispatches subagents for execution (fresh context per task), and pauses at
configurable checkpoints for human review.

**Approach:** Slim orchestrator + subagent isolation (Approach 3 from evaluation).

- Subagents do the heavy work with their own context
- Orchestrator reads only task.md + subagent summaries (never full output files)
- Parallelization within groups, sequential between groups
- Phase boundary is a hard stop — never crossed automatically

## Task Parsing from task.md

### Which section does the orchestrator parse?

A `task.md` may contain multiple sections with checkboxes — e.g., a "Waves" section with
sub-headings AND a flat "Definition of Done" section. These serve different purposes:

- **Work section** (with `###` sub-headings): actionable tasks with grouping — this is what
  the orchestrator parses. Identified by having sub-headings (`###`) under it.
- **Definition of Done section** (flat list): acceptance criteria for phase completion —
  used only for phase-boundary checks, not for dispatching work.

**Convention:** The orchestrator parses the **first section in task.md that contains
`###` sub-headings with `- [ ]` checkboxes**. This is the work plan. Flat checkbox lists
without sub-headings are treated as DoD and ignored for dispatching.

Example `task.md` with both sections:

```markdown
## Analysis Waves                        ← WORK SECTION (orchestrator parses this)
### Wave 1 — Structure
- [ ] structure-map.md

### Wave 2 — Entry Points
- [ ] entry-points.md

### Wave 3 — Module Deep Dives
- [ ] Module: auth
- [ ] Module: invoicing
- [ ] Module: reporting

## Results (Definition of Done)          ← DoD SECTION (phase check only)
- [ ] structure-map.md
- [ ] domain-model.md
- [ ] capabilities-inventory.md
```

### Conventions

- `- [ ]` = open task, eligible for orchestrator
- `- [x]` = completed, skipped
- Tasks under the same `###` heading = **parallelization group** (can run concurrently)
- Tasks under different `###` headings = **sequential** (Group 2 starts after Group 1)
- Each task requires an **output path**, either explicit in the task text or derivable from the task name
- Each task must contain enough context for the subagent: what to do, which sources to read

### What the orchestrator does NOT read from task.md

- Checkpoint mode and batch size — provided by the user at invocation time
- The Definition of Done section — only relevant for phase-boundary checks

## Checkpoint Modes

Three modes, selected by the user at batch start:

| Mode | Behavior | Typical Use |
|------|----------|-------------|
| `after-each` (default) | Pause after each group, show summaries, wait for "continue" | New/uncertain tasks, want to review every output |
| `after-batch:N` | Run N groups, then pause | "Do the next 3, then I'll check" |
| `at-end` | Run all groups, consolidated review at the end | Well-understood tasks, high confidence |

### Checkpoint Behavior

At each checkpoint the orchestrator:

1. Shows per completed task: task name, output file path, 2-3 sentence summary (provided by subagent)
2. User can respond:
   - `weiter` (continue): proceed to the next group or batch
   - `stopp` (stop): the orchestrator consolidates current results and ends
   - `zeig mir [file]` (show me): review an output file in detail before deciding
3. After final checkpoint: consolidation (status.md, decisions-log, task.md checkboxes, commit proposal)

### Checkpoint with Parallel Tasks

| Mode | Behavior with group of 3 parallel tasks |
|------|----------------------------------------|
| `after-each` | All 3 run in parallel, checkpoint when **all 3 complete** — show all summaries |
| `after-batch:N` | Batch counter counts groups, not individual tasks |
| `at-end` | Run through, checkpoint only after all groups |

Parallelization does not change checkpoint rhythm — checkpoints refer to groups, not individual subagents.
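
The group-based rhythm can be expressed as a small decision function. This is a sketch under the assumption that the final group always ends in a checkpoint (the consolidated review); `should_pause` and its parameters are hypothetical names, not part of the spec.

```python
def should_pause(mode: str, groups_done: int, total_groups: int) -> bool:
    """Decide whether to stop for review after `groups_done` groups have completed."""
    if groups_done >= total_groups:
        return True  # the last group always ends in a (final) checkpoint
    if mode == "after-each":
        return True
    if mode == "at-end":
        return False
    if mode.startswith("after-batch:"):
        batch_size = int(mode.split(":", 1)[1])
        if batch_size < 1:
            raise ValueError("batch size must be at least 1")
        # The counter counts groups, not the individual tasks inside them.
        return groups_done % batch_size == 0
    raise ValueError(f"unknown checkpoint mode: {mode!r}")
```

With five groups, `after-batch:2` pauses after groups 2, 4, and 5, regardless of how many parallel tasks each group contains.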

## Phase Protection

- The orchestrator only processes tasks from the **active phase** (defined in CLAUDE.md "Aktiver Scope")
- When the last open task of a phase completes, the orchestrator **stops unconditionally** and informs the user that all DoD items are done and a phase transition is possible
- The orchestrator **never** triggers `/project:phase-transition` itself or starts tasks from the next phase
- Even in `at-end` mode: phase boundary = hard stop, no override possible

This is an implicit fourth checkpoint type: the **phase checkpoint**, which always applies regardless of selected mode.

## Relationship to session-start / session-end

The batch orchestrator **replaces** the manual session-start/end cycle for batch work:

- **session-start is not required** before run-batch. The orchestrator reads task.md and
  CLAUDE.md directly — it does not need the user to restore context manually.
- **session-end is incorporated** into the orchestrator's consolidation step (Step 4).
  The consolidation performs the same updates as session-end: status.md, decisions-log.md,
  task.md checkboxes. A separate session-end after run-batch is not needed.
- **The user may still use session-start** before run-batch if they want to review context
  first — but it is optional, not required.

In short: `/project:run-batch` is a self-contained replacement for the
`session-start → work → session-end → clear → commit` cycle.

## Orchestrator Flow

```
User: "Work through the open discovery tasks" (or /project:run-batch)
  │
  ├─ 1. Read task.md of active phase, parse open tasks
  │
  ├─ 1.5. Permission check:
  │     • Verify required tools are allowed (Read, Write, Edit, Glob, Grep, Agent)
  │     • If permissions missing → inform user, suggest settings.json config, abort
  │
  ├─ 2. Check batch parameters — anything missing? → Ask interactively:
  │     • "Which tasks? (all open / specific items / range)"
  │     • "Checkpoint mode? (after-each / after-batch:N / at-end)"
  │     • Show summary + wait for confirmation
  │
  ├─ 3. Process groups sequentially:
  │     │
  │     ├─ Group 1:
  │     │   ├─ Dispatch tasks in parallel (subagents)
  │     │   ├─ Wait for completion
  │     │   ├─ On subagent failure: mark task as failed, include error
  │     │   │   in checkpoint summary, do not block other parallel tasks
  │     │   ├─ Checkpoint per selected mode
  │     │   └─ On "stopp" → consolidation → end
  │     │
  │     ├─ Group 2:
  │     │   └─ ... (only after Group 1 completes + checkpoint passes)
  │     │
  │     └─ Phase boundary reached?
  │           → HARD STOP, inform user
  │
  ├─ 4. Consolidation (orchestrator only, never subagents):
  │     • task.md: mark completed items [x], failed items stay [ ]
  │     • status.md: update current state (include failures if any)
  │     • decisions-log.md: add entries if applicable
  │     • Phase-specific shared artifacts (e.g., capabilities-inventory.md
  │       in Discovery): update with consolidated findings from subagents
  │
  └─ 5. Commit proposal to user
```
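
Step 3 of the flow can be sketched with `asyncio`. This only illustrates the control flow and assumes subagent dispatch can be modelled as an awaitable that returns a summary string. `Result`, `run_group`, `run_batch`, `dispatch`, and `checkpoint` are hypothetical names, and real dispatch would go through the agent tooling rather than Python coroutines.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Result:
    task: str
    ok: bool
    summary: str  # the subagent's short summary, or the error text on failure


async def run_group(tasks, dispatch):
    """Run one parallelization group; a failing task never cancels its siblings."""
    outcomes = await asyncio.gather(
        *(dispatch(task) for task in tasks), return_exceptions=True
    )
    results = []
    for task, outcome in zip(tasks, outcomes):
        if isinstance(outcome, BaseException):
            results.append(Result(task, False, f"failed: {outcome}"))
        else:
            results.append(Result(task, True, outcome))
    return results


async def run_batch(groups, dispatch, checkpoint):
    """Process groups strictly in order; `checkpoint` returns False when the user stops."""
    all_results = []
    for tasks in groups:
        results = await run_group(tasks, dispatch)
        all_results.extend(results)
        if not checkpoint(results):
            break  # user chose to stop: fall through to consolidation
    return all_results  # consolidation and the commit proposal would follow here
```

In this sketch `checkpoint` is called after every group; whether it actually prompts the user would depend on the selected checkpoint mode.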

## Subagent Contract

### Context Loading Strategy

Each subagent needs context to do its work. The orchestrator determines context through
a layered approach:

**Base context (always loaded):**
- Project rules of the active phase (framework-constraints, language-conventions, parallel-agents)
- status.md of the active phase (current state summary)

**Phase-specific context (loaded by convention):**
- If phase-specific rules define context requirements (e.g., `discovery-waves.md` specifies
  "Wave 2 loads structure-map.md"), the orchestrator follows those rules.

**Task-specific context (optional, from task.md):**
- Tasks in task.md can declare explicit context dependencies:
  ```markdown
  - [ ] domain-model.md (context: structure-map.md, entry-points.md)
  ```
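- If no explicit context is declared: the orchestrator passes all existing output files
  from prior groups in the current batch to the subagent as context. It hands them over
  by path and does not read them itself, in keeping with the summaries-only rule.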

**Context budget principle:** The orchestrator always prefers condensed documents
(structure-map.md, status.md) over raw sources. Raw code is only passed when
the task explicitly requires it (e.g., a module deep-dive task references a source path).
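
A minimal sketch of the base and task-specific layers, assuming the `(context: ...)` declaration always sits at the end of the task line. The phase-specific layer is omitted because its rules live in phase documents. `BASE_CONTEXT`, `CONTEXT_DECL`, and `resolve_context` are hypothetical names, and the base list is shortened to `status.md` for brevity.

```python
import re

# Hypothetical stand-in for the base layer; the project rule files are left out here.
BASE_CONTEXT = ["status.md"]

# Matches a trailing "(context: a.md, b.md)" declaration on a task line.
CONTEXT_DECL = re.compile(r"\(context:\s*([^)]*)\)\s*$")


def resolve_context(task_text: str, prior_outputs: list[str]) -> list[str]:
    """Return the context file paths for one task, in loading order."""
    match = CONTEXT_DECL.search(task_text)
    if match:
        # Explicit declaration wins over the fallback.
        specific = [p.strip() for p in match.group(1).split(",") if p.strip()]
    else:
        # Fallback: every output file produced by earlier groups in this batch.
        specific = list(prior_outputs)
    ordered: list[str] = []
    for path in BASE_CONTEXT + specific:
        if path not in ordered:  # keep first occurrence, drop duplicates
            ordered.append(path)
    return ordered
```

The function returns paths only, so the orchestrator can hand them to a subagent without loading the files itself.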

### Input (provided by orchestrator)

- Exactly one task with clear assignment
- Context files per the loading strategy above
- Output file path
- Project rules (framework-constraints, language-conventions, parallel-agents)

### Output (returned to orchestrator)

- Output file written to disk at the specified path
- Summary (2-3 sentences) as return value to orchestrator

### Boundaries

- Subagent writes **only** its own output file — never shared state
- Subagent does **not** update task.md, status.md, or decisions-log.md
- Orchestrator reads **only** subagent summaries — never full output files
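
Because only the orchestrator touches `task.md`, the checkbox update can be a single pass over the file once results are in. The sketch below assumes that tasks are matched by their exact text and that only checkboxes under a `###` heading are ticked, leaving flat Definition of Done lists alone. That second point is an assumption of this sketch, not something the spec states, and `mark_completed` is a hypothetical name.

```python
def mark_completed(markdown: str, completed: set[str]) -> str:
    """Tick `- [ ]` items under `###` headings whose text is in `completed`."""
    open_prefix = "- [ ] "
    in_group = False  # True while we are below a `###` heading
    lines = []
    for line in markdown.splitlines():
        if line.startswith("### "):
            in_group = True
        elif line.startswith("## "):
            in_group = False  # a new top-level section ends the current group
        elif in_group and line.startswith(open_prefix):
            text = line[len(open_prefix):]
            if text.strip() in completed:
                line = "- [x] " + text
        lines.append(line)
    trailing_newline = "\n" if markdown.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline
```

Failed tasks are simply left out of `completed`, so they stay `[ ]` and remain eligible for the next batch run.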

## Skill Interface

### Command

```
/project:run-batch                              → asks everything interactively
/project:run-batch items:1-3 checkpoint:after-each
/project:run-batch items:all checkpoint:at-end
/project:run-batch items:wave-3 checkpoint:after-batch:2
```
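
The `key:value` arguments above could be split as follows. This is a sketch of one possible reading of the syntax, in which only the first colon separates key from value so that `checkpoint:after-batch:2` keeps its batch size. `parse_args` and `missing_params` are hypothetical names.

```python
KNOWN_KEYS = ("items", "checkpoint")


def parse_args(argline: str) -> dict[str, str]:
    """Split 'items:1-3 checkpoint:after-each' into a parameter dict."""
    params: dict[str, str] = {}
    for token in argline.split():
        key, sep, value = token.partition(":")  # split on the first colon only
        if not sep or not value or key not in KNOWN_KEYS:
            raise ValueError(f"unrecognized argument: {token!r}")
        params[key] = value
    return params


def missing_params(params: dict[str, str]) -> list[str]:
    """Parameters the orchestrator would still have to ask for interactively."""
    return [key for key in KNOWN_KEYS if key not in params]
```

An empty argument line yields no parameters, which corresponds to the fully interactive invocation.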

### Natural Language Triggers (examples)

```
"Work through the next 3 open tasks and show me the result after each one"
"Run all open Wave 3 modules in parallel"
"Go through the Discovery tasks, checkpoint at the end"
```

### Interactive Parameter Query

When parameters are missing or incomplete, the orchestrator asks before starting:

```
Orchestrator: "I see 5 open tasks in Phase 1 (Discovery):
  1. [ ] structure-map.md (Wave 1)
  2. [ ] entry-points.md (Wave 2)
  3. [ ] Module: auth (Wave 3)
  4. [ ] Module: invoicing (Wave 3)
  5. [ ] Module: reporting (Wave 3)

Which tasks should I work through? (all / numbers / range)
Checkpoint mode? (after-each / after-batch:N / at-end)"
```

## Parallelization Rules

- Tasks under the same heading = one parallelization group
- Within a group: all tasks dispatched simultaneously as subagents
- Between groups: strictly sequential — Group 2 starts only after Group 1 fully completes AND checkpoint passes
- Checkpoint mode applies **per group**, not per individual task within parallel groups

## Permissions

Required permissions in `.claude/settings.json`:

```json
{
  "permissions": {
    "allow": [
      "Read",
      "Glob",
      "Grep",
      "Edit",
      "Write",
      "Agent"
    ]
  }
}
```

- Configured once in settings.json, no runtime flags needed
- Subagents inherit the parent session's permission context
- Orchestrator checks permissions at startup (Step 1.5) and aborts with guidance if missing
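
The startup check could look roughly like this. The sketch assumes the allow list contains plain tool names exactly as in the configuration above; real settings files may use more specific rule patterns, which this exact-match comparison would not recognize. `REQUIRED_TOOLS` and `missing_permissions` are hypothetical names.

```python
import json

# Tools listed in Step 1.5 of the orchestrator flow.
REQUIRED_TOOLS = ("Read", "Write", "Edit", "Glob", "Grep", "Agent")


def missing_permissions(settings_text: str) -> list[str]:
    """Return the required tools that are absent from permissions.allow."""
    settings = json.loads(settings_text)
    allowed = settings.get("permissions", {}).get("allow", [])
    return [tool for tool in REQUIRED_TOOLS if tool not in allowed]
```

A non-empty result is the abort condition: the orchestrator would report the missing entries and point the user at `.claude/settings.json` instead of starting the batch.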

## Artifacts

| Artifact | Purpose |
|----------|---------|
| `rules/batch-orchestration.md` | Rules for orchestrator/subagent contract |
| `.claude/commands/run-batch.md` | The skill/command definition |
| `.claude/settings.json` | Permission configuration |
| `docs/superpowers/specs/2026-03-21-batch-orchestrator-design.md` | This spec |
| `docs/superpowers/specs/2026-03-21-batch-orchestrator-future.md` | Future improvements |

## Decisions Made

| Decision | Rationale |
|----------|-----------|
| In-session orchestrator with subagents (not external script) | Keeps checkpoints interactive, stays in Claude ecosystem |
| Slim orchestrator (summaries only, no full output reads) | Prevents context overflow with many tasks |
| task.md as single source of truth | No redundant runbook file, existing structure is machine-readable |
| Heading-based parallelization groups | Already matches existing task.md structure |
| Phase boundary = hard stop | Phase transitions require human intent per framework rules |
| Missing parameters always queried interactively | Prevents accidental batch runs with wrong settings |
| Permission check at startup | Prevents hanging on permission prompts mid-batch |
| Default checkpoint mode: after-each | Safest option, user opts into more autonomy explicitly |
| Orchestrator parses work section (with ###), not DoD section | Avoids ambiguity between actionable tasks and acceptance criteria |
| run-batch replaces session-start/end cycle | Eliminates redundant manual steps; consolidation is built into the flow |
| Layered context loading for subagents | Base context always available; phase-specific and task-specific context keeps the approach generic across phases |