# Recoil Visual Pipeline

> **▶ START HERE (fresh session): [`recoil/docs/PIPELINE_SSOT.md`](../docs/PIPELINE_SSOT.md)** — the
> one-screen orientation SSOT (topography · tools · protocols · canonical-source index). `/engine` loads it.

> **Narrative engine scope:** see `recoil/CLAUDE.md` for the narrative engine (develop → validate → promote → treatment → generate-script → finish). This file covers the visual pipeline only.

**Git worktrees: OK if placed OUTSIDE the repo** (`~/Code/` or `/tmp`), never inside it. A full worktree contains both the narrative engine and the visual pipeline, so cross-engine resolution works (`PIPELINE_ROOT`/`RECOIL_ROOT` are `__file__`-derived). The old blanket ban was an in-repo-placement artifact (2026-05-19 incident) and was lifted 2026-06-02.

**ALL generation MUST go through StepRunner.** Never call KlingClient, FalAiKlingClient, or any API client directly — not even for testing. Use `pipeline/tools/dispatch_cli.py` for CLI tests or the Workspace UI for review. Everything must flow through ExecutionStore so the Workspace tree can see it.

> **Current topology (2026-06-02 TRACE):** Console v2 is CANCELLED. Flora is the decided worksurface + execution target. Workspace 8450 remains ACTIVE as the developer/review surface; Production Console 8430 and Pre-Pro Console 8420 are deprecated.

> **Naming note:** This folder is `recoil/pipeline/` — the **Visual Pipeline** half of the Recoil engine. The Narrative Engine lives in `recoil/` (parent). "Starsend" as a word now refers to a **project** (like Tartarus or Leviathan), never this codebase. The legacy sibling `starsend/` folder was absorbed into `recoil/pipeline/` on 2026-03-30 and no longer exists — do not search or reference it.

Frontier model visual production engine for vertical microdrama series. Reads project data from `~/Dropbox/CLAUDE_DATA/recoil/projects/`. Narrative engine code lives in `~/CLAUDE_PROJECTS/recoil/`. Recoil is the brain/provenance layer; EpisodeRunner is the converged production path into Flora. Final video downloads land in `projects/{project}/renders/ep_NNN/*.mp4`; generated project refs live under `projects/{project}/assets/{char,loc,prop}/<slug>/...`. Shared expression matrix lives at `recoil/pipeline/assets/expressions/`.

One-screen flow:
```
Recoil (brain/provenance) → flora.py adapter → Flora (models+canvas+hosting) → download → renders/ep_NNN/*.mp4
```

EpisodeRunner (`recoil/pipeline/orchestrator/episode_runner.py`) is the converged production path.

## Consoles & UIs

**Active developer/review surface: Recoil Workspace (`127.0.0.1:8450`)** — served by `recoil/workspace/server.py`. Shot tree, take review, approve/reject, viewer pane (where `mcp__recoil-workspace__show_in_viewer` lands). Launch:
```
python3 recoil/workspace/server.py --project tartarus --port 8450
```

| URL | UI | Status |
|-----|-----|---------|
| `127.0.0.1:8450/` | **Recoil Workspace** | **ACTIVE** — developer/review surface |
| `127.0.0.1:5173/` | **Console v2** (Vite dev) | **CANCELLED** |
| `127.0.0.1:8430/console` | Production Console (`production-console.html`) | DEPRECATED — do not use |
| `127.0.0.1:8420/` | Recoil Pre-Production Console (`prepro-console.html`) | DEPRECATED — superseded by Workspace |
| `127.0.0.1:8430/review` | Legacy Frame Review | DEPRECATED |

> **Do not refer to the Production Console as the active console.** It still listens on 8430 if launched but is no longer the production UI. Push visual content through `mcp__recoil-workspace__show_in_viewer` (lands on 8450) and tell JT to "open Workspace," not "open Console."
> Flora is the decided worksurface + execution target; Workspace remains the active developer/review surface.

## Quick Reference

> **Current canonical skills (2026-04-25):** the legacy `/starsend …` slash commands listed below have been superseded. Confirm against `~/.claude/skills/` before using.

| Command | Purpose |
|---------|---------|
| `/generate-video` | Generate video as scene-level coverage passes via `recoil/pipeline/cli/generate.py` |
| `/workspace` | Launch Recoil Workspace — shot review, approve/reject takes |
| `/engine` | Pipeline operator session bootstrap (loads pipeline state, opens Workspace) |
| `/pipeline` | **DEPRECATED 2026-06-07** — superseded by `/generate-video` + casting/location-ref phases + Workspace. Do not use. |
| `/dispatch` | Dispatch a harness build to Mac Studio (overnight runs) |

**Legacy commands (pre-2026-04-25; may not be present in `~/.claude/skills/`):**

| Legacy Command | Replaced By |
|----------------|-------------|
| `/starsend preview --episode N --shot N` | `/generate-video --dry-run` |
| `/starsend generate --episode N --shots N-M` | `/generate-video` |
| `/starsend location-refs --episode N` | casting/location-ref pipeline phase (Workspace + casting API) |
| `/starsend expressions --generate` | expression-library pipeline phase |
| `/starsend bundle --episode N --model kling-2.5` | (Use Workspace export) |
| `/starsend review --project tartarus` | `/workspace` |
| `/starsend backfill --episode N` | (Folded into Plan Pass) |
| `/starsend route --episode N` | (Folded into coverage planner) |

## Data Locations (per project)

| Data | Path | Created By |
|------|------|------------|
| Global Bible | `projects/{project}/_pipeline/state/visual/global_bible.json` | Breakdown Pass + enrichment via configured live models |
| Shot Plans | `projects/{project}/_pipeline/state/visual/plans/ep_NNN_plan.json` | Plan Pass (Stage 2) |
| Episode Renders | `projects/{project}/renders/ep_NNN/*.mp4` | EpisodeRunner → Flora download |
| Boundary Frames | `projects/{project}/renders/ep_NNN/boundary_frames/` | StepRunner video save |
| Previz Assets | `projects/{project}/prep/ep_NNN/` | Previz / prep generation |
| Camera-Tested | `projects/{project}/_pipeline/state/visual/camera_tested/ep_NNN.json` | Camera Test Pass (Stage 0) |
| Character Refs | `projects/{project}/assets/char/<slug>/base/pool/` | Casting pipeline (turnarounds, hero) |
| Location Refs | `projects/{project}/assets/loc/<slug>/base/pool/` | Location ref generation |
| Prop Refs | `projects/{project}/assets/prop/<slug>/base/pool/` | Prop ref generation |
| Expression Refs | `recoil/pipeline/assets/expressions/` | Expression library (shared across projects) |

> **State namespace:** Path segments use `STATE_NAMESPACE` (defined in `recoil/core/paths.py`).
> Source of truth: `recoil/config/pipeline_config.json` → `"visual_state_namespace": "visual"`.
> Rollback: change the value to `"starsend"` and restart processes.
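
As a rough illustration of how a namespace value could feed into the paths in the table above, here is a minimal sketch. The helper names `namespace_from_config` and `plan_path` are hypothetical (they are not taken from `recoil/core/paths.py`), and the three-digit zero padding of `ep_NNN` is an assumption.

```python
import json
from pathlib import Path

DEFAULT_NAMESPACE = "visual"  # the value listed under "visual_state_namespace"


def namespace_from_config(config_text: str) -> str:
    """Parse pipeline_config.json text; fall back to the default if the key is absent."""
    config = json.loads(config_text)
    return config.get("visual_state_namespace", DEFAULT_NAMESPACE)


def plan_path(projects_root: Path, project: str, episode: int, namespace: str) -> Path:
    """Build the Shot Plans path. Assumes ep_NNN is zero-padded to three digits."""
    return (
        projects_root / project / "_pipeline" / "state" / namespace
        / "plans" / f"ep_{episode:03d}_plan.json"
    )


ns = namespace_from_config('{"visual_state_namespace": "visual"}')
example = plan_path(Path("projects"), "tartarus", 1, ns)
```

Because every path segment comes from the one namespace value, a rollback to `"starsend"` would change each derived path without touching call sites.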

## Pipeline Order (Scripts → Frames)

```
1. Script Lock (Recoil)
2. Camera Test Pass (Stage 0) → camera_tested/ep_NNN.json
3. Global Bible (Stage 1, configured live model) → global_bible.json
4. Enrichment (configured live model) → fills [OPUS_ENRICHMENT] placeholders in bible
5. Plan Pass (Stage 2, configured live model) → plans/ep_NNN_plan.json
6. Casting → character grids, hero selection, turnarounds, expressions
7. Location Refs → moodboards per location via configured live model
8. Previz Generation (configured live model) → previz gut-check gate
9. Previz Review → Workspace (approve/reject/regenerate)
10. Keyframe Generation → production frames via NBP
11. Dailies Review → Workspace (approve/reject/reroute)
12. Video Generation → Kling/SeedDance/Veo
13. Final Review + Export
```

Casting (step 6) and Location Refs (step 7) are driven through the casting API
endpoints. Reach them via Workspace (8450) for developer/review operations;
Flora is the decided worksurface + execution target.

## Modality Registry (post-CP-4, 2026-04-28)

Generation dispatch goes through `pipeline/core/registry.py`. SSOT for the
registered set is the manifest's `modality_registry` capability. Five
generation modalities are registered by `register_default_runners(step_runner)`:
`image_t2i` (wraps `execute_keyframe`), `video_i2v` (wraps `execute_video`),
`r2v_multi`, `audio_t2a` (LIVE since CP-8), `lipsync_post` (LIVE since CP-8).
Three eval modalities (`eval_image_v1`, `eval_video_v1`, `eval_audio_v1`) are
opt-in via `register_default_eval_runners()` and are not auto-bootstrapped.
After bootstrap, call `get_runner(modality).run(payload)`. `StepRunner.execute_*`
methods are unchanged — runners delegate to them. Full audit:
`recoil/docs/modality-registry-audit.md`. Rollback: `pre-cp4-modality-registry`.
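
The register-then-look-up pattern described here can be sketched as follows. This is not the code in `pipeline/core/registry.py`: `register_runner`, `FunctionRunner`, and `fake_execute_keyframe` are hypothetical stand-ins, and only the `get_runner(modality).run(payload)` call shape comes from the text above.

```python
from typing import Any, Callable, Dict, Protocol


class Runner(Protocol):
    def run(self, payload: Dict[str, Any]) -> Dict[str, Any]: ...


class FunctionRunner:
    """Adapter that delegates run() to an existing callable (e.g. an execute_* method)."""

    def __init__(self, fn: Callable[[Dict[str, Any]], Dict[str, Any]]) -> None:
        self._fn = fn

    def run(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        return self._fn(payload)


_REGISTRY: Dict[str, Runner] = {}


def register_runner(modality: str, runner: Runner) -> None:
    """Hypothetical registration helper; rejects duplicate modality strings."""
    if modality in _REGISTRY:
        raise ValueError(f"modality already registered: {modality}")
    _REGISTRY[modality] = runner


def get_runner(modality: str) -> Runner:
    try:
        return _REGISTRY[modality]
    except KeyError:
        raise KeyError(f"no runner registered for modality {modality!r}") from None


# Stand-in for StepRunner.execute_keyframe; the real method is not shown here.
def fake_execute_keyframe(payload: Dict[str, Any]) -> Dict[str, Any]:
    return {"status": "succeeded", "shot_id": payload["shot_id"]}


register_runner("image_t2i", FunctionRunner(fake_execute_keyframe))
result = get_runner("image_t2i").run({"shot_id": "EP001_SH02"})
```

Keeping the runner a thin adapter is what lets the wrapped method stay unchanged while callers move to the registry.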

## Dispatch Unification (post-CP-5, 2026-04-28)

Every generation call goes through one entry point:

```python
from pipeline.core import dispatch, DispatchContext, GenerationReceipt

ctx = DispatchContext(
    caller_id="production_loop",
    step_runner=my_step_runner,
    project="tartarus", episode=1,
)
receipt = dispatch("image_t2i", payload, context=ctx)
```

`dispatch()` lazily bootstraps runners, routes through the CP-4 registry, wraps `RunResult` in `GenerationReceipt`, emits a JSONL audit log (default at `$RECOIL_ROOT/_dispatch_logs/receipts.jsonl`; override via `DispatchContext.receipts_log_path`), and stamps `StepRunner._dispatch_path` for sidecar provenance. **Direct calls to `StepRunner.execute_*` from production code are deprecated** — only tests that exercise the StepRunner contract still call them directly.

`pipeline/lib/api_client.py` (the 7-line proxy) was deleted; use `from execution.api_client import ...` directly.

`pipeline/tools/test_via_steprunner.py` was renamed to `dispatch_cli.py` in CP-5, and the old filename was deleted in Phase 16. Use `dispatch_cli.py` exclusively.

The canonical home of `register_default_runners` is `pipeline.core.dispatch`. The old paths (`pipeline.core.runners`, `pipeline.core`) only re-exported it, and those re-exports were removed in CP-6.

Full audit: `recoil/docs/dispatch-unification-audit.md`. Rollback tag: `pre-cp5-entry-point-unification`. CP-6 hand-off: `consultations/recoil/cp5-entry-point-spec/CP6_HANDOFF.md`.

## Prompt Engine (post-CP-3, 2026-04-26)

ONE prompt engine for all video / keyframe builders: `recoil/pipeline/_lib/prompt_engine.py` (the SSOT). Reach every builder via `get_builder()`:

```python
from recoil.pipeline._lib.prompt_engine import get_builder
prompt = get_builder(model_id, modality)(shot, bible)
# e.g. get_builder("kling-o3", "i2v")(shot, bible)
#      get_builder("seeddance-2.0", "r2v_multi")(shots, bible)
#      get_builder("nbp", "keyframe")(shot, bible)
```

The `BUILDERS` dispatch table at the END of `pipeline/_lib/prompt_engine.py` maps `(model_id, modality)` → builder callable. 25 entries cover all 12 strategy model_ids. **Adding a new model = one line:** `BUILDERS[("new-model", "i2v")] = build_new_model_i2v_prompt`. Adding a new modality = update the canonical list in `recoil/docs/prompt-engine-audit.md` § Modality enumeration first, then add the BUILDERS entry.
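
A minimal sketch of the dispatch-table idea, assuming each builder takes `(shot, bible)` and returns a prompt string. The model ids and `build_demo_i2v_prompt` below are hypothetical placeholders, not entries from the real table.

```python
from typing import Any, Callable, Dict, Tuple

Builder = Callable[[Dict[str, Any], Dict[str, Any]], str]


def build_demo_i2v_prompt(shot: Dict[str, Any], bible: Dict[str, Any]) -> str:
    """Hypothetical builder: joins the shot action with a bible style note."""
    return f"{shot['action']}. Style: {bible['style']}"


# (model_id, modality) -> builder callable
BUILDERS: Dict[Tuple[str, str], Builder] = {
    ("demo-model", "i2v"): build_demo_i2v_prompt,
}


def get_builder(model_id: str, modality: str) -> Builder:
    try:
        return BUILDERS[(model_id, modality)]
    except KeyError:
        raise KeyError(f"no builder for ({model_id!r}, {modality!r})") from None


# Adding a model is one more table entry.
BUILDERS[("demo-model-2", "i2v")] = build_demo_i2v_prompt

prompt = get_builder("demo-model", "i2v")({"action": "She turns"}, {"style": "noir"})
```

Keying on the `(model_id, modality)` pair keeps lookup failures explicit: an unregistered combination raises instead of silently falling back to another builder.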

`tools/prompt_engine.py` and `visual/prompt_engine.py` were deleted in CP-3 (2026-04-26). Do not search for them. The `PromptEngine` 10-layer class lives at `recoil/lib/prompt_compiler.py` (migrated from tools/ in Phase 5). Full migration log: `recoil/docs/prompt-engine-audit.md`.

### Cinema Mode Builder Wiring (Phase 2a)

- Six R2V/T2V builders, plus the `build_seedream_*_prompt` family, are wired to
  `render_cinema_tokens` (Style block) and `render_camera_line` (Camera line):
  `build_seeddance_r2v_prompt`, `build_seeddance_r2v_prompt_multi`,
  `build_seeddance_t2v_prompt`, `build_wan_r2v_prompt`, `build_kling_t2v_prompt`,
  `build_veo_t2v_prompt`, and `build_seedream_*_prompt`.
- I2V builders SKIP cinema tokens (start frame carries the look) but DO emit
  the Camera-line lens-type and the production/consistency subset of the
  constraint block. Era/look constraints are filtered out on i2v builders.
- Constraint dictionary: PROMPT_BIBLE.yaml::constraint_dictionary.
  Emit policy: keyed off model_profiles.json::supports_negative_prompt.

## Architecture

**Phase A Router-Pipeline architecture** (Feb 27, 2026): Evolved from a linear 3-pass still pipeline into 4 sub-pipelines routed by `route_shot()` in `scene_planner.py`:

| Sub-Pipeline | Engine | Use Case |
|-------------|--------|----------|
| **Still** | NBP (Gemini 3 Pro) | Keyframes, inserts, ENV shots |
| **I2V** | Kling (current prod version) | Start+end frame precision edits |
| **T2V** | SeedDance 2.0 / Kling | Character motion, dialogue |
| **Multi-Shot** | SeedDance 2.0 | Scene batching (3-8 shots) |

> **Video models:** Call `live_model_status(<model>)` for current version / availability. Veo 3 was deprioritized 2026-04-09 pending Veo 4 — do not default to Veo in recommendations.

Full architecture documented in `recoil/pipeline/docs/coverage-pass-strategy.md` (current — Formats A/B/C). **All generation requests MUST go through the pipeline tools (`/generate-video` → `recoil/pipeline/cli/generate.py`, EpisodeRunner) — never call Gemini APIs directly.**

## Cross-Engine Path Resolution

The narrative engine (`recoil/`) and visual pipeline (`recoil/pipeline/`) are nested directories that reference each other. All paths are resolved through `recoil/core/paths.py`:

- **`PIPELINE_ROOT`** — Auto-detected; points to `recoil/pipeline/`. (The legacy `STARSEND_ROOT` alias and `ensure_starsend_importable()` shim were deleted in the engine-fix-phase-A sprint — use `PIPELINE_ROOT` and `ensure_pipeline_importable()` exclusively.)
- **`RECOIL_ROOT`** — Points to `recoil/` (narrative engine root).
- **`PROJECTS_ROOT`** — From `recoil/config/pipeline_config.json` → `projects_root` (default: `~/Dropbox/CLAUDE_DATA/recoil/projects`).

Tools that cross the engine boundary use `sys.path.insert(0, str(RECOIL_ROOT))`. This is intentional — a shared package is too complex for a solo-dev setup. Document any new cross-engine imports in this file.
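
For orientation, a `__file__`-derived root lookup might look like the sketch below. `derive_roots` is a hypothetical helper, and it assumes only what this section states: the module sits at `recoil/core/paths.py` and the pipeline lives in `recoil/pipeline/`.

```python
from pathlib import Path
from typing import Dict


def derive_roots(paths_file: Path) -> Dict[str, Path]:
    """Derive engine roots from the location of a paths.py-style module.

    Assumes the module lives at <recoil>/core/paths.py.
    """
    recoil_root = paths_file.resolve().parents[1]  # core/ -> recoil/
    return {
        "RECOIL_ROOT": recoil_root,
        "PIPELINE_ROOT": recoil_root / "pipeline",
    }


# Inside the real module this would be called with Path(__file__).
roots = derive_roots(Path("some_worktree/recoil/core/paths.py"))
```

Since nothing here depends on the current working directory or an absolute install location, the same lookup works from any checkout, including a worktree placed outside the repo.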

## Key Principles

1. **Native 9:16** for final frames, **4K grids** for planning/exploration
2. **Recency bias ordering:** Scene → Pose → Expression → Identity → Prompt
3. **Pristine identity refs** (white-bg, never altered). Concept/casting grids use 18% neutral gray; rembg converts to white for final refs.
4. **Hardware mirroring** (`ImageOps.mirror()`) for screen direction
5. **Universal grayscale expression matrix** (27 refs: 9 emotions × 3 intensities, generic actor) to avoid Blank Stare Bug
6. **No output chaining** between shots (causes generation drift)
7. **Text-only grid prompting** (no template images). Use diegetic framing ("photographic contact sheet"), NEVER "character design sheet"

## Models & Pricing

**Never hardcode prices or version tags in this file — they rot.** Call the live tools:

- `live_pricing(<model>)` — current per-unit cost across providers, with verification dates
- `live_model_status(<model>)` — current version, availability, deprecation status

Canonical source of truth: `recoil/core/model_profiles.json` (what the live tools read).

## Gemini Consultation

5-round still architecture review in `recoil/pipeline/gemini_consultation/` + 3-round video pipeline review in `recoil/pipeline/consultations/full_review_feb27/`.

Key video consultation findings (Feb 27, 2026):
- Router architecture with 4 sub-pipelines
- SeedDance 2.0 as primary multi-shot engine
- Kling for editorial precision (start+end frame)
- EDL/FCPXML export for NLE handoff (Phase C)

---

## Workflow Object Model (CP-6, post-2026-04-28)

Generation calls can be composed into declarative workflows:

```python
from pipeline.core import Workflow, WorkflowStep, DispatchContext

ctx = DispatchContext(caller_id="production_loop", step_runner=sr,
                     project="tartarus", episode=1)
wf = Workflow(
    workflow_id="tartarus_ep001_sh02_full",
    steps=[
        WorkflowStep(step_id="keyframe", modality="image_t2i",
                     payload=kf_payload),
        WorkflowStep(step_id="video", modality="video_i2v",
                     payload=vid_payload, depends_on=["keyframe"]),
    ],
    global_provenance={"shot_id": "EP001_SH02", "scene_id": "scene_3"},
)
wf.run(context=ctx)
kf_receipt = wf.get_step("keyframe").receipt
vid_receipt = wf.get_step("video").receipt
```

`Workflow.run` walks the steps in declared order, calls `dispatch()` per step,
attaches the resulting `GenerationReceipt` to `step.receipt`, and stamps
`provenance["workflow_id"]` + `provenance["workflow_step_id"]` on each receipt.
Failed steps short-circuit dependent steps (`status="skipped"`); independent
branches continue. Hooks `pre_step` / `post_step` / `on_failure` give CP-9 a
place to hang eval calls without touching the executor.
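
The skip-on-failed-dependency behavior can be sketched in isolation. This is a simplified model rather than the `Workflow.run` source: `Step` and `run_steps` are hypothetical, hooks and receipts are omitted, and a dependency that has not run yet is treated as not succeeded.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    step_id: str
    depends_on: List[str] = field(default_factory=list)
    status: str = "pending"


def run_steps(steps: List[Step], execute: Callable[[Step], bool]) -> Dict[str, str]:
    """Walk steps in declared order; skip any step whose dependency did not succeed."""
    status: Dict[str, str] = {}
    for step in steps:
        if any(status.get(dep) != "succeeded" for dep in step.depends_on):
            step.status = "skipped"
        else:
            step.status = "succeeded" if execute(step) else "failed"
        status[step.step_id] = step.status
    return status


steps = [
    Step("keyframe"),
    Step("video", depends_on=["keyframe"]),
    Step("audio"),  # independent branch, unaffected by the keyframe failure
]
# Simulate a keyframe failure; every other step would succeed if executed.
outcome = run_steps(steps, execute=lambda s: s.step_id != "keyframe")
```

Here `video` is skipped because `keyframe` failed, while `audio` still runs.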

CP-6 ships **linear execution semantics on a DAG-shaped data model** —
today every node has at most one downstream; future Flora/worksurface graph
work can add branching with zero data migration. CP-6 does NOT ship:
persistence, DIRECTOR step subtypes, or eval primitives. CP-7 wraps
`WorkflowStep.receipt` in `Take`. CP-8 adds audio steps. CP-9 fills
`eval_scores` via the hooks.

JSON round-trip: `Workflow.from_dict(wf.to_dict()) == wf`. Useful for tests,
debugging, and CP-7's persistence layer.

Full audit: `recoil/docs/workflow-object-model-audit.md`. Rollback tag:
`pre-cp6-workflow-object-model`. CP-7 hand-off:
`consultations/recoil/cp6-workflow-spec/CP7_HANDOFF.md`.

## Take Model (CP-7, post-2026-04-28)

Generation attempts can be organized into editorial Takes inside Beats inside Scenes:

```python
from pipeline.core import (
    Take, Beat, Scene,
    Workflow, WorkflowStep, DispatchContext,
)

ctx = DispatchContext(caller_id="production_loop", step_runner=sr,
                     project="tartarus", episode=1)

beat = Beat(beat_id="EP001_SH02",
            beat_metadata={"scene_id": "ep001_sc02"})

# First attempt
take_0 = beat.new_take(workflow=Workflow(
    workflow_id="EP001_SH02_wf0",
    steps=[
        WorkflowStep(step_id="keyframe", modality="image_t2i", payload=kf_payload),
        WorkflowStep(step_id="video", modality="video_i2v",
                     payload=vid_payload, depends_on=["keyframe"]),
    ],
))
take_0.execute(context=ctx)

# Re-attempt = NEW Take, not mutate
if take_0.status != "succeeded":
    take_1 = beat.new_take(workflow=Workflow(workflow_id="EP001_SH02_wf1", steps=[...]))
    take_1.execute(context=ctx)

# Pick the primary (default: first successful)
beat.select_primary()  # strategy="first_success"
primary = beat.primary_take
```

`Take` wraps exactly one `Workflow`. `Take.execute(context, ...)` runs the
workflow and compresses step status into a take-level status:
`succeeded` (all steps succeeded), `failed` (no step succeeded), `partial`
(mixed). Hooks (`pre_step` / `post_step` / `on_failure`) pass through to the
underlying `Workflow.run`.
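
The status compression rule reads naturally as a small function. `compress_status` is a hypothetical name, and treating an empty workflow as `failed` is an assumption (it follows from "no step succeeded").

```python
from typing import Iterable


def compress_status(step_statuses: Iterable[str]) -> str:
    """Collapse per-step statuses into one take-level status."""
    statuses = list(step_statuses)
    succeeded = sum(1 for s in statuses if s == "succeeded")
    if statuses and succeeded == len(statuses):
        return "succeeded"  # every step succeeded
    if succeeded == 0:
        return "failed"     # no step succeeded (includes the empty case)
    return "partial"        # mixed outcome


take_status = compress_status(["succeeded", "skipped"])
```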

`Beat` groups multiple Takes for one logical shot. `Beat.new_take` constructs
+ appends a Take with auto-assigned `take_index`. `Beat.select_primary` picks
the primary using a strategy:
- `"first_success"` (CP-7 default) — first take with status="succeeded"
- `"manual"` — caller sets `primary_take_id` directly
- `"score"` — `NotImplementedError` until CP-9 ships eval primitives

`Scene` is a thin grouping of Beats — dataclass + serialization only.

CP-7 is **in-memory only** — no disk persistence. JSON round-trip via
`to_dict` / `from_dict` works for testing, debugging, and a future
persistence CP. CP-7 does NOT ship: persistence, take-level hooks,
DIRECTOR step subtypes, or eval primitives. CP-8 adds audio Takes.
CP-9 fills `eval_scores` on receipts and replaces the primary-selection
default with score-based logic.

Full audit: `recoil/docs/take-model-audit.md`. Rollback tag:
`pre-cp7-take-model`. CP-8 hand-off:
`consultations/recoil/cp7-take-spec/CP8_HANDOFF.md`.

## Audio + Lip-Sync (CP-8, post-2026-04-28)

Audio + lip-sync modalities are now LIVE under the modality registry:

```python
from pipeline.core import dispatch, DispatchContext

ctx = DispatchContext(caller_id="audio_demo", step_runner=sr,
                     project="tartarus", episode=1)

# Text-to-speech via ElevenLabs
audio_receipt = dispatch("audio_t2a", {
    "shot_id": "EP001_VO01",
    "text": "Hold the line — this is not a drill.",
    "voice_id": "Rachel",
    "model": "eleven_multilingual_v2",
    "output_format": "mp3",
}, context=ctx)
# audio_receipt.run_result.output_path → local .mp3

# Lipsync via sync.so (face-video + audio → lipsynced .mp4)
lipsync_receipt = dispatch("lipsync_post", {
    "shot_id": "EP001_VO01",
    "video_path": "/path/to/face_video.mp4",
    "audio_path": audio_receipt.run_result.output_path,
    "model": "lipsync-2.0",
    "output_format": "mp4",
    "sync_mode": "loop",
}, context=ctx)
```

Both modalities produce a `GenerationReceipt` with `RunResult.metadata`
containing cost (`cost_usd`), model id, vendor request/job ids, char count
(audio) or duration (lipsync). Receipts are JSONL-logged at
`$RECOIL_ROOT/_dispatch_logs/receipts.jsonl` like every other modality.

Output files default to `$RECOIL_ROOT/_audio_outputs/` and
`$RECOIL_ROOT/_lipsync_outputs/` (gitignored). Auth via env vars
`ELEVENLABS_API_KEY` and `SYNC_SO_API_KEY`.

Provider adapters: `recoil/execution/providers/elevenlabs.py` and
`recoil/execution/providers/sync_so.py`. They expose `synthesize_speech` /
`lipsync_video` callable functions, raise typed exceptions
(`AudioSynthesisError`, `LipSyncError` subclass tree), and accept a
`transport=` injection point for tests.

Retry policy: 3 retries with exponential backoff (1s, 2s, 4s) for 5xx +
network blips. Fail-fast on 401 / 402 / 422 / 429 (auth, quota, payload,
rate-limit).
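
A minimal sketch of this retry policy, assuming a `send` callable that returns `(status_code, body)` and raises `ConnectionError` on network blips. `call_with_retry` and its `sleep` injection point are hypothetical; the real adapters raise their own typed exceptions rather than `RuntimeError`.

```python
import time
from typing import Callable, Optional, Tuple

FAIL_FAST = {401, 402, 422, 429}   # auth, quota, payload, rate-limit
BACKOFF_SECONDS = (1.0, 2.0, 4.0)  # one delay before each of the 3 retries


def call_with_retry(
    send: Callable[[], Tuple[int, bytes]],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Make 1 attempt plus up to 3 retries; retry only 5xx and network errors."""
    last_error: Optional[Exception] = None
    for attempt in range(len(BACKOFF_SECONDS) + 1):
        if attempt > 0:
            sleep(BACKOFF_SECONDS[attempt - 1])
        try:
            status, body = send()
        except ConnectionError as exc:  # network blip: retry
            last_error = exc
            continue
        if status in FAIL_FAST:
            raise RuntimeError(f"fail-fast on HTTP {status}")
        if status >= 500:
            last_error = RuntimeError(f"server error HTTP {status}")
            continue
        return status, body  # any other status is returned to the caller
    raise RuntimeError("retries exhausted") from last_error
```

Injecting `sleep` keeps tests fast: a test can pass `delays.append` and assert on the recorded backoff schedule instead of waiting.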

CP-8 is the **first production-implementation CP** in the june-refactor
sequence — load-tests the CP-4 modality registry with a brand-new modality
that didn't exist in any pre-existing StepRunner method. CP-8 ships:

- Real `AudioRunner` + `LipSyncPostProcessor` (replacing CP-4 stubs).
- ElevenLabs + sync.so adapters under `execution/providers/`.
- Two new entries each in `model_profiles.json` + `provider_strategy.json`.
- ~85 new tests across 7 test files.
- ZERO new modality strings, ZERO StepRunner additions, ZERO frontend.

CP-9 builds eval primitives on top of this fully-populated 4-modality
surface (image_t2i, video_i2v, audio_t2a, lipsync_post all LIVE).

Full audit: `recoil/docs/audio-lipsync-impl-audit.md`. Rollback tag:
`pre-cp8-audio-lipsync`. CP-9 hand-off:
`consultations/recoil/cp8-audio-lipsync-spec/CP9_HANDOFF.md`.

Audio takes use the same Take/Beat pattern as visual takes — `Beat.new_take(workflow=...)` followed by `take.execute(context=ctx)`.

## Eval Primitive (CP-9, post-2026-04-28)

Pluggable artifact eval — multi-judge `PanelOfJudges`, score-based
`Beat.select_primary("score")`, and `Take.aggregate_score`. Three new
eval modalities (`eval_image_v1`, `eval_video_v1`, `eval_audio_v1`)
register through the same modality registry that CP-4 introduced.

```python
from pipeline.core import (
    EvalContext, EvalResult, PanelOfJudges,
    register_eval_node, get_eval_node, attach_eval_hooks,
    Take, Beat, Workflow, WorkflowStep, DispatchContext,
)
from pipeline.core.runners import GeminiVisionEvalNode
from pipeline.core.dispatch import register_default_eval_runners

# Bootstrap eval runners (opt-in; not auto-registered — eval modalities
# require GEMINI_API_KEY, so making them auto-register on import would
# break test environments without the key).
register_default_eval_runners()

# Register one or more EvalNode judges per modality. Each EvalNode wraps
# the Gemini Vision adapter for its target artifact_modality
# ("image" | "video" | "audio").
register_eval_node("eval_image_v1",
                   GeminiVisionEvalNode(artifact_modality="image",
                                        judge_id="eval_image_v1"))

# Build a panel.
panel = PanelOfJudges(
    panel_id="visual_quality_v1",
    judges=[get_eval_node("eval_image_v1")],
    aggregation="median",        # or "mean"
    cost_cap_usd=0.50,           # hard-aborts mid-panel if projected cost exceeds the cap
)

# Attach hooks to a workflow + execute.
beat = Beat(beat_id="EP001_SH02")
take = beat.new_take(workflow=Workflow(
    workflow_id="EP001_SH02_wf0",
    steps=[WorkflowStep(step_id="kf", modality="image_t2i", payload={...})],
))
ctx = DispatchContext(caller_id="prod", step_runner=sr,
                     project="tartarus", episode=1)
pre, post, on_fail = attach_eval_hooks(take.workflow, panel)
take.execute(context=ctx, pre_step=pre, post_step=post, on_failure=on_fail)

# Compute aggregate + select primary on the beat.
take.compute_aggregate_score()
beat.select_primary(strategy="score")
```

Eval scores land in `step.receipt.eval_scores[panel_id]` as a ScoreCard
dict: `{panel_score, panel_warnings, judges: [...], aggregation,
panel_cost_usd}`. Hooks mutate the receipt's `eval_scores` and
`provenance` dicts in place — `GenerationReceipt` is `@dataclass(frozen=True)`
so field reassignment raises `FrozenInstanceError`, but in-place dict
mutation is the supported escape hatch. Eval cost flows through
`receipt.provenance["eval_cost_usd"]` (separated from generation cost,
which stays in `RunResult.metadata.cost_usd`).
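
The frozen-dataclass behavior is standard Python and easy to confirm in isolation. `Receipt` below is a two-field stand-in for `GenerationReceipt`, and the score and cost values are illustrative only.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Receipt:
    """Stand-in for GenerationReceipt; only the two dict fields matter here."""
    eval_scores: Dict[str, Any] = field(default_factory=dict)
    provenance: Dict[str, Any] = field(default_factory=dict)


receipt = Receipt()

# In-place mutation works: the field still points at the same dict object.
receipt.eval_scores["visual_quality_v1"] = {"panel_score": 0.82}
receipt.provenance["eval_cost_usd"] = 0.03

# Rebinding the field is what frozen=True forbids.
try:
    receipt.eval_scores = {}
    reassigned = True
except FrozenInstanceError:
    reassigned = False
```

`frozen=True` only blocks attribute assignment on the instance; it does not make the contained dicts immutable, which is why hooks can write into them.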

`Beat.select_primary("score")` is now implemented (it replaced the CP-7
`NotImplementedError` branch): the highest aggregate wins; ties are broken by
`take_index` ASC; score-less takes sort below scored takes; and when no take has
an eval score it returns None (this parallels `first_success` semantics and does NOT raise).
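
The score-based selection rules can be sketched as a single ordering. `TakeStub` and `select_by_score` are hypothetical stand-ins for `Take` and the `"score"` branch of `Beat.select_primary`; only the ordering rules come from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TakeStub:
    take_id: str
    take_index: int
    aggregate_score: Optional[float] = None


def select_by_score(takes: List[TakeStub]) -> Optional[TakeStub]:
    """Highest aggregate wins; ties go to the lowest take_index; None if nothing is scored."""
    scored = [t for t in takes if t.aggregate_score is not None]
    if not scored:
        return None  # no eval at all: return None rather than raise
    return min(scored, key=lambda t: (-t.aggregate_score, t.take_index))


takes = [
    TakeStub("t0", 0, 0.7),
    TakeStub("t1", 1, 0.9),
    TakeStub("t2", 2, 0.9),   # ties with t1 on score, loses on take_index
    TakeStub("t3", 3, None),  # unscored: never beats a scored take
]
primary = select_by_score(takes)
```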

`Take.aggregate_score: Optional[float]` is the 7th `Take` field (an additive
change, which the CP-7 hand-off explicitly permitted). It round-trips
through `to_dict` / `from_dict`; legacy take dicts without the field
default to None.

All three eval modalities wrap the **same** Gemini 3.1 Pro adapter with
modality-specific defaults (rubric template, MIME-type whitelist, cost
cap). Auth via `GEMINI_API_KEY` env var (`GOOGLE_API_KEY` fallback). The
adapter handles inline-base64 (<14.5 MB raw / <20 MB encoded) vs Files
API resumable upload automatically.
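
The two size limits are consistent with each other because base64 expands data by a factor of 4/3. The sketch below shows that arithmetic and a size-based routing decision; `choose_upload_path` and the returned labels are hypothetical, and reading "MB" as MiB is an assumption.

```python
import base64

MAX_INLINE_RAW_BYTES = int(14.5 * 1024 * 1024)  # "<14.5 MB raw", MB read as MiB (assumption)
MAX_INLINE_ENCODED_BYTES = 20 * 1024 * 1024     # "<20 MB encoded"


def encoded_size(raw_size: int) -> int:
    """Base64 output length for raw_size input bytes (4 chars per 3-byte group, padded)."""
    return 4 * ((raw_size + 2) // 3)


def choose_upload_path(raw_size: int) -> str:
    """Route small artifacts inline and everything else to a resumable upload."""
    return "inline_base64" if raw_size < MAX_INLINE_RAW_BYTES else "files_api"


# Sanity check of the size formula against the standard library encoder.
sample_len = len(base64.b64encode(b"x" * 10))
```

An artifact right at the raw limit encodes to roughly 19.3 MiB, which leaves a little headroom under the encoded limit for the rest of the request body.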

`gemini-3.1-pro-preview` is added to `recoil/config/model_profiles.json`
with per-1k-token cost rates (standard + long-context tiers; threshold at
200k input tokens). Provider key `gemini_vision` added to
`provider_strategy.json` — distinct from the existing `google` image-gen
mappings. Live numbers via `live_pricing("gemini-3.1-pro-preview")` per
the file's own no-hardcoded-prices rule.

The legacy `core/critic.py` Gemini Flash visual critic gets a
`LegacyFlashCriticEvalNode` adapter appended at the BOTTOM of
`recoil/core/critic.py` — no body modification, no modification to any
of its 30+ callers. Wraps any `CriticLoop` instance behind the `EvalNode`
Protocol. The proxy at `recoil/pipeline/_lib/critic.py` stays byte-stable.

Retry-strategy substrate bridge ships as `from_score_card(score_card) ->
(FailureMode, float)` in `recoil/pipeline/orchestrator/strategy_registry.py`
— mirrors `detect_failure_mode` return shape so consumers swap inputs
without changing downstream consumption. **Substrate only** —
`production_loop.py` is NOT rewired; switchover gated on JT sign-off after
he lives with PanelOfJudges output for N production runs.

CP-9 is the **last CP** in the june-refactor sprint. Flora/worksurface
iteration unblocks (eval scores + take aggregation are now real surfaces a UI
can read). Retry-strategy iteration unpauses on the `PanelOfJudges` substrate.

CP-9 does NOT ship: tournament/elimination, CostGate, pre-generation
prompt eval, scene-continuity critic, multi-panel weighting,
production cutover of legacy `critic.py`, production cutover of
`detect_failure_mode` via `from_score_card`, `EvalContext.scene_takes`
implementation (designed in dataclass, not implemented). All deferred
to CP-N+.

Full audit: `recoil/docs/eval-primitive-audit.md`. Rollback tag:
`pre-cp9-eval-primitive`. Intermediate rollback tags:
`pre-cp9-runners` (Phase 4), `pre-cp9-score-strategy` (Phase 5),
`pre-cp9-critic-migration` (Phase 7). Sprint-complete tag:
`june-refactor-complete` (placed in Phase 9). Post-CP-9 hand-off:
`consultations/recoil/cp9-eval-spec/POST_CP9_HANDOFF.md`.

Canonical capability map: `recoil/architecture/ssot_manifest.yaml`