Batch Orchestrator for re:struct

Problem Statement

When tasks are pre-planned (e.g., discovery waves, user stories, architecture decisions), the human operator is reduced to a manual scheduler:

session-start → "do task X" → session-end → clear context → commit → repeat

This is repetitive and adds no value when the next steps are already defined.

Solution Overview

A lightweight in-session orchestrator that reads planned tasks from task.md, dispatches subagents for execution (fresh context per task), and pauses at configurable checkpoints for human review.

Approach: Slim orchestrator + subagent isolation (Approach 3 from evaluation).

  • Subagents do the heavy work with their own context
  • Orchestrator reads only task.md + subagent summaries (never full output files)
  • Parallelization within groups, sequential between groups
  • Phase boundary is a hard stop — never crossed automatically

Task Parsing from task.md

Which section does the orchestrator parse?

A task.md may contain multiple sections with checkboxes — e.g., a "Waves" section with sub-headings AND a flat "Definition of Done" section. These serve different purposes:

  • Work section (with ### sub-headings): actionable tasks with grouping — this is what the orchestrator parses. Identified by having sub-headings (###) under it.
  • Definition of Done section (flat list): acceptance criteria for phase completion — used only for phase-boundary checks, not for dispatching work.

Convention: The orchestrator parses the first section in task.md that contains ### sub-headings with - [ ] checkboxes. This is the work plan. Flat checkbox lists without sub-headings are treated as DoD and ignored for dispatching.

Example task.md with both sections:

## Analysis Waves                        ← WORK SECTION (orchestrator parses this)
### Wave 1 — Structure
- [ ] structure-map.md

### Wave 2 — Entry Points
- [ ] entry-points.md

### Wave 3 — Module Deep Dives
- [ ] Module: auth
- [ ] Module: invoicing
- [ ] Module: reporting

## Results (Definition of Done)          ← DoD SECTION (phase check only)
- [ ] structure-map.md
- [ ] domain-model.md
- [ ] capabilities-inventory.md
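
The section-selection convention above can be sketched as a small parser. This is a minimal illustration, not the actual implementation: the names `Group` and `parse_work_section` are hypothetical, and it assumes that `##` marks a section, `###` marks a group, and the trailing arrow annotations in the example are not part of the real file.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Group:
    heading: str
    tasks: list = field(default_factory=list)  # list of (done, text) tuples


CHECKBOX = re.compile(r"^- \[( |x)\] (.+)$")


def parse_work_section(markdown: str) -> list:
    """Return the groups of the first '##' section that has '###' groups with checkboxes."""
    sections = []  # one list of Group objects per '##' section
    current: Optional[Group] = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("### "):
            if not sections:
                sections.append([])  # tolerate a '###' before any '##'
            current = Group(heading=line[4:])
            sections[-1].append(current)
        elif line.startswith("## "):
            sections.append([])
            current = None  # checkboxes directly under '##' form a flat (DoD) list
        elif current is not None and (m := CHECKBOX.match(line)):
            current.tasks.append((m.group(1) == "x", m.group(2)))
    for groups in sections:
        with_tasks = [g for g in groups if g.tasks]
        if with_tasks:
            return with_tasks
    return []
```

Flat checkbox lists never attach to a group, so a Definition of Done section is skipped even when it appears before the work section.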

Conventions

  • - [ ] = open task, eligible for orchestrator
  • - [x] = completed, skipped
  • Tasks under the same ### heading = parallelization group (can run concurrently)
  • Tasks under different ### headings = sequential (Group 2 starts after Group 1)
  • Each task requires an output path, either explicit in the text or derivable from the task name
  • Each task must contain enough context for the subagent: what to do, which sources to read

What the orchestrator does NOT read from task.md

  • Checkpoint mode and batch size — provided by the user at invocation time
  • The Definition of Done section — only relevant for phase-boundary checks

Checkpoint Modes

Three modes, selected by the user at batch start:

| Mode | Behavior | Typical Use |
|---|---|---|
| after-each (default) | Pause after each group, show summaries, wait for "continue" | New/uncertain tasks, want to review every output |
| after-batch:N | Run N groups, then pause | "Do the next 3, then I'll check" |
| at-end | Run all groups, consolidated review at the end | Well-understood tasks, high confidence |

Checkpoint Behavior

At each checkpoint the orchestrator:

  1. Shows per completed task: task name, output file path, 2-3 sentence summary (provided by subagent)
  2. User can respond:
    • weiter (continue): proceed to the next group or the next batch
    • stopp (stop): orchestrator consolidates current results and ends
    • zeig mir [file] (show me): review an output file in detail before deciding
  3. After final checkpoint: consolidation (status.md, decisions-log.md, task.md checkboxes, commit proposal)
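
The three user responses could be mapped to actions roughly as follows. This is a sketch under assumptions: `Response` and `parse_response` are hypothetical names, and accepting the English words "continue", "stop" and "show me" alongside the German keywords is an assumption, not something this spec defines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Response:
    action: str                 # "continue", "stop", or "show"
    file: Optional[str] = None  # only set for "show"


def parse_response(text: str) -> Optional[Response]:
    """Map a checkpoint reply to an action; None means 'not understood, ask again'."""
    reply = text.strip()
    lowered = reply.lower()
    if lowered in ("weiter", "continue"):  # English alias is an assumption
        return Response("continue")
    if lowered in ("stopp", "stop"):
        return Response("stop")
    for prefix in ("zeig mir ", "show me "):
        if lowered.startswith(prefix):
            target = reply[len(prefix):].strip()
            if target:
                return Response("show", target)
    return None
```

Returning `None` for anything unrecognized keeps the orchestrator at the checkpoint instead of guessing, which matches the intent that nothing proceeds without an explicit reply.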

Checkpoint with Parallel Tasks

| Mode | Behavior with group of 3 parallel tasks |
|---|---|
| after-each | All 3 run in parallel, checkpoint when all 3 complete, show all summaries |
| after-batch:N | Batch counter counts groups, not individual tasks |
| at-end | Run through, checkpoint only after all groups |

Parallelization does not change checkpoint rhythm — checkpoints refer to groups, not individual subagents.
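
The group-based rhythm can be expressed as one decision function. This is a minimal sketch: `should_checkpoint` is a hypothetical name, and pausing after the last group in every mode (including a final partial batch in after-batch:N) is an assumption derived from the final checkpoint described above.

```python
def should_checkpoint(mode: str, groups_done: int, total_groups: int) -> bool:
    """Decide whether to pause after `groups_done` groups have finished."""
    if groups_done >= total_groups:
        return True  # assumption: the final review happens in every mode
    if mode == "after-each":
        return True
    if mode == "at-end":
        return False
    if mode.startswith("after-batch:"):
        n = int(mode.split(":", 1)[1])
        if n < 1:
            raise ValueError("batch size must be at least 1")
        return groups_done % n == 0  # counts groups, never individual tasks
    raise ValueError(f"unknown checkpoint mode: {mode!r}")
```

The function never sees how many subagents ran inside a group, which is what keeps parallelization from affecting the checkpoint rhythm.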

Phase Protection

  • The orchestrator only processes tasks from the active phase (defined in CLAUDE.md "Aktiver Scope")
  • When the last open task of a phase completes: orchestrator stops unconditionally and informs the user that all DoD items are done and a phase transition is possible
  • The orchestrator never triggers /project:phase-transition itself or starts tasks from the next phase
  • Even in at-end mode: phase boundary = hard stop, no override possible

This is an implicit fourth checkpoint type: the phase checkpoint, which always applies regardless of selected mode.

Relationship to session-start / session-end

The batch orchestrator replaces the manual session-start/end cycle for batch work:

  • session-start is not required before run-batch. The orchestrator reads task.md and CLAUDE.md directly — it does not need the user to restore context manually.
  • session-end is incorporated into the orchestrator's consolidation step (Step 4). The consolidation performs the same updates as session-end: status.md, decisions-log.md, task.md checkboxes. A separate session-end after run-batch is not needed.
  • The user may still use session-start before run-batch if they want to review context first — but it is optional, not required.

In short: /project:run-batch is a self-contained replacement for the session-start → work → session-end → clear → commit cycle.

Orchestrator Flow

User: "Work through the open discovery tasks" (or /project:run-batch)
  │
  ├─ 1. Read task.md of active phase, parse open tasks
  │
  ├─ 1.5. Permission check:
  │     • Verify required tools are allowed (Read, Write, Edit, Glob, Grep, Agent)
  │     • If permissions missing → inform user, suggest settings.json config, abort
  │
  ├─ 2. Check batch parameters — anything missing? → Ask interactively:
  │     • "Which tasks? (all open / specific items / range)"
  │     • "Checkpoint mode? (after-each / after-batch:N / at-end)"
  │     • Show summary + wait for confirmation
  │
  ├─ 3. Process groups sequentially:
  │     │
  │     ├─ Group 1:
  │     │   ├─ Dispatch tasks in parallel (subagents)
  │     │   ├─ Wait for completion
  │     │   ├─ On subagent failure: mark task as failed, include error
  │     │   │   in checkpoint summary, do not block other parallel tasks
  │     │   ├─ Checkpoint per selected mode
  │     │   └─ On "stop" → consolidation → end
  │     │
  │     ├─ Group 2:
  │     │   └─ ... (only after Group 1 completes + checkpoint passes)
  │     │
  │     └─ Phase boundary reached?
  │           → HARD STOP, inform user
  │
  ├─ 4. Consolidation (orchestrator only, never subagents):
  │     • task.md: mark completed items [x], failed items stay [ ]
  │     • status.md: update current state (include failures if any)
  │     • decisions-log.md: add entries if applicable
  │     • Phase-specific shared artifacts (e.g., capabilities-inventory.md
  │       in Discovery): update with consolidated findings from subagents
  │
  └─ 5. Commit proposal to user
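
The task.md part of step 4 could look like the sketch below. The function name `mark_completed` is hypothetical, and limiting the update to checkboxes inside `###` groups (leaving flat Definition of Done lists untouched) is an assumption; the spec only states that completed items are marked and failed items stay open.

```python
import re


def mark_completed(markdown: str, completed: set) -> str:
    """Tick '- [ ] <task>' lines inside '###' groups whose task text is in `completed`."""
    out = []
    in_group = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("### "):
            in_group = True
        elif stripped.startswith("## "):
            in_group = False  # new top-level section; flat lists here are left alone
        elif in_group:
            m = re.match(r"^(\s*)- \[ \] (.+?)\s*$", line)
            if m and m.group(2) in completed:
                line = f"{m.group(1)}- [x] {m.group(2)}"
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if markdown.endswith("\n") else "")
```

Failed tasks are simply absent from `completed`, so their checkboxes stay `[ ]` and they remain eligible for the next run.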

Subagent Contract

Context Loading Strategy

Each subagent needs context to do its work. The orchestrator determines context through a layered approach:

Base context (always loaded):

  • Project rules of the active phase (framework-constraints, language-conventions, parallel-agents)
  • status.md of the active phase (current state summary)

Phase-specific context (loaded by convention):

  • If phase-specific rules define context requirements (e.g., discovery-waves.md specifies "Wave 2 loads structure-map.md"), the orchestrator follows those rules.

Task-specific context (optional, from task.md):

  • Tasks in task.md can declare explicit context dependencies:
    - [ ] domain-model.md (context: structure-map.md, entry-points.md)
    
  • If no explicit context is declared: the orchestrator passes all existing output files from prior groups in the current batch to the subagent as context. It hands over the paths only and does not read the files itself.

Context budget principle: The orchestrator always prefers condensed documents (structure-map.md, status.md) over raw sources. Raw code is only passed when the task explicitly requires it (e.g., a module deep-dive task references a source path).
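
A rough sketch of the task-specific layer: it extracts a `(context: ...)` declaration from a task line and falls back to prior outputs when none is declared. The helper names `split_task` and `resolve_context` are hypothetical, the exact declaration syntax is inferred from the single example above, and the phase-specific layer is left out for brevity.

```python
import re

CONTEXT = re.compile(r"^(?P<name>.+?)\s*\(context:\s*(?P<files>[^)]*)\)\s*$")


def split_task(text: str):
    """Split 'name (context: a.md, b.md)' into the name and its declared context files."""
    m = CONTEXT.match(text)
    if not m:
        return text.strip(), []
    files = [f.strip() for f in m.group("files").split(",") if f.strip()]
    return m.group("name"), files


def resolve_context(task_text: str, base: list, prior_outputs: list) -> list:
    """Base context first, then declared files, or prior outputs when none are declared."""
    _, declared = split_task(task_text)
    extra = declared if declared else prior_outputs
    seen = set()
    ordered = []
    for path in [*base, *extra]:
        if path not in seen:  # drop duplicates, keep first occurrence
            seen.add(path)
            ordered.append(path)
    return ordered
```

The result is a list of paths to hand to the subagent, so the orchestrator's own context stays small.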

Input (provided by orchestrator)

  • Exactly one task with clear assignment
  • Context files per the loading strategy above
  • Output file path
  • Project rules (framework-constraints, language-conventions, parallel-agents)

Output (returned to orchestrator)

  • Output file written to disk at the specified path
  • Summary (2-3 sentences) as return value to orchestrator

Boundaries

  • Subagent writes only its own output file — never shared state
  • Subagent does not update task.md, status.md, or decisions-log.md
  • Orchestrator reads only subagent summaries — never full output files
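
The contract can be written down as two small record types plus a boundary check. All names here (`SubagentInput`, `SubagentResult`, `violates_boundaries`) are hypothetical, and the list of written paths is assumed to be available from some tracking mechanism that this spec does not define.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SubagentInput:
    task: str             # exactly one task with a clear assignment
    context_files: tuple  # paths chosen by the context loading strategy
    output_path: str
    project_rules: tuple  # e.g. framework-constraints, language-conventions


@dataclass(frozen=True)
class SubagentResult:
    task: str
    output_path: str
    summary: str          # 2-3 sentences; the only part the orchestrator reads


def violates_boundaries(request: SubagentInput, written: list) -> list:
    """Return every written path other than the subagent's own output file."""
    own = Path(request.output_path)
    return [p for p in written if Path(p) != own]
```

Any write to task.md, status.md, or decisions-log.md shows up in the returned list, since those are never a subagent's output path.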

Skill Interface

Command

/project:run-batch                              → asks everything interactively
/project:run-batch items:1-3 checkpoint:after-each
/project:run-batch items:all checkpoint:at-end
/project:run-batch items:wave-3 checkpoint:after-batch:2
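
The `key:value` arguments shown above could be parsed as sketched here. The names `parse_args` and `missing_params` are hypothetical. Splitting on the first colon only is what lets `checkpoint:after-batch:2` keep its batch size.

```python
def parse_args(argline: str) -> dict:
    """Parse 'items:1-3 checkpoint:after-batch:2' into a dict of raw parameter values."""
    known = {"items", "checkpoint"}
    params = {}
    for token in argline.split():
        key, sep, value = token.partition(":")  # split on the first colon only
        if not sep or key not in known or not value:
            raise ValueError(f"unrecognized argument: {token!r}")
        params[key] = value
    return params


def missing_params(params: dict) -> list:
    """Parameters the orchestrator still has to ask for interactively."""
    return [k for k in ("items", "checkpoint") if k not in params]
```

An empty argument line yields both parameters as missing, which corresponds to the fully interactive invocation.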

Natural Language Triggers (examples)

"Work through the next 3 open tasks, show me the result after each one"
"Run all open Wave 3 modules in parallel"
"Go through the discovery tasks, checkpoint at the end"

Interactive Parameter Query

When parameters are missing or incomplete, the orchestrator asks before starting:

Orchestrator: "I see 5 open tasks in Phase 1 (Discovery):
  1. [ ] structure-map.md (Wave 1)
  2. [ ] entry-points.md (Wave 2)
  3. [ ] Module: auth (Wave 3)
  4. [ ] Module: invoicing (Wave 3)
  5. [ ] Module: reporting (Wave 3)

Which tasks should I work through? (all / numbers / range)
Checkpoint mode? (after-each / after-batch:N / at-end)"

Parallelization Rules

  • Tasks under the same heading = one parallelization group
  • Within a group: all tasks dispatched simultaneously as subagents
  • Between groups: strictly sequential — Group 2 starts only after Group 1 fully completes AND checkpoint passes
  • Checkpoint mode applies per group, not per individual task within parallel groups
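
The scheduling rules can be illustrated with asyncio. This is only an analogy for the control flow: real subagent dispatch happens inside the session, and `run_task` and `checkpoint` are hypothetical callbacks standing in for it. For simplicity the sketch calls the checkpoint after every group.

```python
import asyncio


async def run_group(tasks: list, run_task) -> dict:
    """Run all tasks of one group concurrently; a failure does not cancel its siblings."""
    results = await asyncio.gather(*(run_task(t) for t in tasks), return_exceptions=True)
    report = {}
    for task, result in zip(tasks, results):
        if isinstance(result, BaseException):
            report[task] = (False, f"failed: {result}")  # surfaces in the checkpoint summary
        else:
            report[task] = (True, result)
    return report


async def run_groups(groups: list, run_task, checkpoint) -> list:
    """Process groups strictly in order; stop when the checkpoint callback returns False."""
    reports = []
    for tasks in groups:
        reports.append(await run_group(tasks, run_task))
        if not await checkpoint(reports[-1]):
            break  # user chose to stop: continue with consolidation
    return reports
```

`return_exceptions=True` is what lets a failed task be reported without blocking the other tasks in its group.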

Permissions

Required permissions in .claude/settings.json:

{
  "permissions": {
    "allow": [
      "Read",
      "Glob",
      "Grep",
      "Edit",
      "Write",
      "Agent"
    ]
  }
}
  • Configured once in settings.json, no runtime flags needed
  • Subagents inherit the parent session's permission context
  • Orchestrator checks permissions at startup (Step 1.5) and aborts with guidance if missing
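
The startup check in Step 1.5 amounts to comparing the `allow` list against the required tools. A minimal sketch, assuming the settings file has exactly the shape shown above and that only exact tool names count; `missing_permissions` is a hypothetical helper, not part of any existing tooling.

```python
import json

REQUIRED = ("Read", "Glob", "Grep", "Edit", "Write", "Agent")


def missing_permissions(settings_text: str) -> list:
    """Return required tool names absent from permissions.allow in the settings JSON."""
    settings = json.loads(settings_text)
    allowed = settings.get("permissions", {}).get("allow", [])
    # Assumption: only exact tool names are treated as granting the permission.
    return [tool for tool in REQUIRED if tool not in allowed]
```

A non-empty result would be shown to the user together with the suggested configuration before the run aborts.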

Artifacts

| Artifact | Purpose |
|---|---|
| rules/batch-orchestration.md | Rules for orchestrator/subagent contract |
| .claude/commands/run-batch.md | The skill/command definition |
| .claude/settings.json | Permission configuration |
| docs/superpowers/specs/2026-03-21-batch-orchestrator-design.md | This spec |
| docs/superpowers/specs/2026-03-21-batch-orchestrator-future.md | Future improvements |

Decisions Made

| Decision | Rationale |
|---|---|
| In-session orchestrator with subagents (not external script) | Keeps checkpoints interactive, stays in Claude ecosystem |
| Slim orchestrator (summaries only, no full output reads) | Prevents context overflow with many tasks |
| task.md as single source of truth | No redundant runbook file, existing structure is machine-readable |
| Heading-based parallelization groups | Already matches existing task.md structure |
| Phase boundary = hard stop | Phase transitions require human intent per framework rules |
| Missing parameters always queried interactively | Prevents accidental batch runs with wrong settings |
| Permission check at startup | Prevents hanging on permission prompts mid-batch |
| Default checkpoint mode: after-each | Safest option, user opts into more autonomy explicitly |
| Orchestrator parses work section (with ###), not DoD section | Avoids ambiguity between actionable tasks and acceptance criteria |
| run-batch replaces session-start/end cycle | Eliminates redundant manual steps; consolidation is built into the flow |
| Layered context loading for subagents | Base context always available; phase-specific and task-specific context keeps the approach generic across phases |