# Modality Registry Audit (CP-4 Phase 1)

Generated: 2026-04-28T02:42:29Z
CP-4 input: consultations/recoil/cp4-modality-registry-spec/CP4_SPEC_INPUT.md
CP-4 spec:  consultations/recoil/cp4-modality-registry-spec/BUILD_SPEC_CP4.md
Predecessor: post-cp3-prompt-consolidation (parent + engine-memory)
Rollback tag: pre-cp4-modality-registry (parent + engine-memory)

## Tag verification

```text
Parent post-cp3-prompt-consolidation: 0f4c0aa22017ba0a05a96f4e9987d72fabdfb91d
Parent pre-cp4-modality-registry:     36817827d4172e13be60661c0bcd0b369a01d196
engine-memory post-cp3-prompt-consolidation: 0f3893b24fdf75beb25212f2065792bddff118e6
engine-memory pre-cp4-modality-registry:     0f3893b24fdf75beb25212f2065792bddff118e6
```

Note: engine-memory carries the same SHA on both tags because no engine-memory
changes were required for CP-3→CP-4 transition. This matches the Hidden Issue
guidance in the BUILD_SPEC — engine-memory diff stays empty across CP-4 Phase 1.

## StepRunner method catalog

Re-verified 2026-04-28 by re-grepping `^    def execute_` against the live
`recoil/execution/step_runner.py`. **All line numbers match the spec defaults
exactly — no shifts.**

| Line | Method | Classification | CP-4 action |
|---|---|---|---|
| 335  | `execute_video`                | MODALITY (video_i2v)             | Phase 4 — VideoRunner wraps |
| 948  | `execute_multi_shot`           | ORCHESTRATION                    | CP-5 followup — leave alone |
| 1160 | `execute_multi_shot_with_takes`| ORCHESTRATION                    | CP-5 followup — leave alone |
| 1384 | `execute_keyframe`             | MODALITY (image_t2i)             | Phase 3 — ImageRunner wraps |
| 1954 | `execute_previz`               | ORCHESTRATION (8-slot multimodal)| CP-5 followup — leave alone |
| 2216 | `execute_wan_r2v`              | MODALITY-ish (variant)           | CP-5 followup — could be `video_r2v` later |
| 2379 | `execute_pass`                 | ORCHESTRATION (multi-segment)    | CP-5 followup — leave alone |

Module-level helper functions (NOT methods) — also out of CP-4 wrap scope:
- L51   `_make_shot_label_for_keyframe`
- L66   `_make_shot_label_for_video`
- L79   `_make_shot_label_canonical`
- L94   `make_identity_gate`            (re-exported via `pipeline/lib/run_shot.py:202`)
- L152  `make_video_drift_gate`
- L195  `_summarize_gate_2`
- L219  `_write_seeddance_meta_sidecar`
- L302  `class SessionCostLogger`
- L310  `class StepRunner`

Total `step_runner.py` size: 2552 lines (per Phase 1 snapshot).

## StepResult field mapping

The runner shims wrap `StepRunner.execute_*` methods which return `StepResult`
instances. The CP-4 adapter `_step_result_to_run_result` translates StepResult →
RunResult. **This mapping table is load-bearing**: if the adapter probes the
wrong attribute names, every successful generation produces
`RunResult(success=False, output_path=None, metadata={None…})` and Phase 7's
kwargs-byte-diff gate would PASS while the registry silently ships broken.

Verified 2026-04-28 against `recoil/execution/step_types.py:81-94` —
`@dataclass(frozen=True) class StepResult`. **All 10 fields confirmed present
in the live dataclass.**

| StepResult field  | Type                    | Required        | Mapped into RunResult |
|---|---|---|---|
| `shot_id`         | str                     | yes             | (not propagated — RunResult.id is fresh) |
| `success`         | bool                    | yes             | `RunResult.success` (direct copy) |
| `final_state`     | str                     | yes             | `RunResult.metadata["final_state"]` |
| `output_path`     | Optional[str]           | yes             | `RunResult.output_path` (str-coerced) |
| `cost_usd`        | float                   | yes             | `RunResult.metadata["cost_usd"]` |
| `error`           | Optional[str]           | yes             | `RunResult.error` (direct copy) |
| `take_index`      | int                     | yes             | `RunResult.metadata["take_index"]` |
| `gate_verdict`    | Optional[GateVerdict]   | yes             | `RunResult.metadata["gate_verdict"]` |
| `model`           | str                     | default ""      | `RunResult.metadata["model"]` |
| `pipeline`        | str                     | default ""      | `RunResult.metadata["pipeline"]` |

Notes:
- StepResult uses singular `gate_verdict`, not plural `gate_verdicts`. The adapter
  carries this through unchanged.
- `success` is a real boolean on StepResult; the adapter MUST use it directly,
  not infer success from string-matching `final_state`.
- `attempts` is NOT a StepResult field. If a CP-5 design needs retry counts,
  it must come from elsewhere (PassResult / sidecar / engine-memory).
- StepResult is `frozen=True` — adapters MUST construct a fresh RunResult, not
  mutate StepResult.
- All other types: `PassResult` (step_types.py:111), `SegmentResult`
  (step_types.py:97) — out of CP-4 scope. Their shapes are documented here for
  CP-5 reference but no adapter wraps them yet.

## Canonical modality strings (locked at CP-4)

| Modality string  | Source class           | Status after CP-4 | Notes |
|---|---|---|---|
| `image_t2i`      | ImageRunner            | LIVE              | wraps execute_keyframe |
| `video_i2v`      | VideoRunner            | LIVE              | wraps execute_video (covers both i2v and t2v paths) |
| `audio_t2a`      | AudioRunner            | STUB              | NotImplementedError("Audio modality lands in CP-8") |
| `lipsync_post`   | LipSyncPostProcessor   | STUB              | NotImplementedError("Lip-sync lands in CP-8") |

Future CP-5/CP-6/CP-8/CP-9 modality strings (not registered in CP-4):
- `video_r2v` (would absorb execute_wan_r2v)
- `video_pass` (would absorb execute_pass)
- `image_previz` (would absorb execute_previz, OR fold into image_t2i with a sub-mode)
- `eval_image_v1` / `eval_video_v1` (CP-9)

## RunResult schema (locked at CP-4)

```python
@dataclass
class RunResult:
    id: str                        # stable result id
    modality: str                  # canonical modality string
    output_path: Optional[str]     # local path to artifact
    output_url: Optional[str]      # remote URL if applicable
    metadata: dict                 # opaque dict — duration, cost, etc.
    success: bool                  # True iff completed and gates passed
    error: Optional[str]           # error message on failure
```

Adding fields post-CP-4: SAFE. Renaming/removing existing fields: CONTRACT BREAK.

## Caller inventory (StepRunner importers)

Re-verified 2026-04-28 by `grep -rn 'from execution.step_runner' --include='*.py' recoil/`
(filtered for `_archive/`, `__pycache__`, `.venv/`). **13 importers — matches
the spec's pre-spec evidence count exactly.** Machine-readable inventory at
`consultations/recoil/cp4-modality-registry-spec/cp4_phase1_callers.json`.

Production importers (5):

| Site | Symbol | CP-4 action | CP-5 action |
|---|---|---|---|
| `tools/produce.py:29`                            | `StepRunner`              | unchanged | migrate to `get_runner` |
| `tools/shootout/run_shootout.py:26`              | `StepRunner`              | unchanged | migrate to `get_runner` |
| `pipeline/tools/generate_keyframes.py:105`       | `StepRunner`              | unchanged | migrate to `get_runner` |
| `pipeline/lib/run_shot.py:202`                   | `make_identity_gate`      | unchanged | possibly migrate (helper-only import) |
| `pipeline/orchestrator/step_runner.py:7`         | `*` (star-import re-export)| unchanged | shim — leave alone |

Test importers (8 sites across 6 files):

| Site | Symbol |
|---|---|
| `pipeline/tests/test_phase_3_acceptance.py:472`         | `StepRunner` |
| `pipeline/tests/test_step_runner_take_numbers.py:4`     | `_make_shot_label_for_keyframe`, `_make_shot_label_for_video`, `_make_shot_label_canonical` |
| `pipeline/tests/test_step_runner_take_numbers.py:26`    | `StepRunner` |
| `pipeline/tests/test_step_runner_take_numbers.py:59`    | `StepRunner` |
| `pipeline/tests/test_phase_2_5_acceptance.py:9`         | `StepRunner` |
| `pipeline/tests/test_phase_2_5_acceptance.py:60`        | `StepRunner` |
| `pipeline/tests/test_step_runner_video_blocking.py:9`   | `StepRunner` |
| `pipeline/tests/test_step_runner_no_fix_loop.py:9`      | `StepRunner` |

Also verified: `import execution.step_runner` (package-import form) returns 0
hits — every caller uses `from execution.step_runner import …`. CP-5 migration
can rely on a single import-form pattern.

Method-call counts (informational, not migration targets in CP-4):
- `execute_keyframe(` — 21 call sites
- `execute_video(` — 32 call sites
- `execute_pass(` — 7 call sites
- `execute_previz(` — 1 call site

## Out-of-scope code (CP-4 must NOT touch)

- `recoil/execution/providers/*.py` — CP-2 stable surface.
- `recoil/pipeline/lib/prompt_engine.py` — CP-3 stable surface.
- `recoil/execution/step_runner.py` — read-only in CP-4. No internal rewrite.
- `recoil/execution/video_model_client.py` — CP-2 stable.
- `recoil/pipeline/lib/api_client.py` — 7-line star-import proxy to
  `execution/api_client.py` (NOT a 61-line stub — verified 2026-04-27). 10+
  active callers via `from lib.api_client import ...`
  (probe_kling_features, screen_test_gen, scene_keyframe, prep_character_angles,
  concept_grid, review_server, test_jobs). CP-5 followup is MIGRATE callers OR
  keep proxy alive — NOT a simple delete.
- Frontend / workspace UI.
- engine-memory subrepo content (diff stays empty per Hidden Issue).
- `recoil/config/PROMPT_BIBLE.yaml`, `provider_strategy.json` — read-only.

## Risk register surfaced by audit

Sub-agent observations from reading `execute_video` (L335) and `execute_keyframe`
(L1384) signatures live and step_types.py:81-94:

### R1 — `gates` callback list (both methods)

`execute_video(..., gates: Optional[list[GateFunction]] = None, ...)` and
`execute_keyframe(..., gates: Optional[list[GateFunction]] = None, ...)` accept
arbitrary callables `(Path, dict) → GateVerdict` (type alias defined at
`step_types.py:78`). The CP-4 runner payload schema MUST surface this slot
verbatim — the registry-side test cannot fabricate gates, it has to thread the
list through. Phases 3/4 must:
- Accept `payload["gates"]: list[GateFunction] | None`
- Pass-through to the wrapped method without serialization (gates are live
  Python callables, not JSON-serializable).
- Never silently drop gates if absent — pass `None` explicitly.

### R2 — `on_status` callback (video only)

`execute_video(..., on_status=None, ...)` is a status-update callback used by
the live UI. ImageRunner does NOT need this slot, but VideoRunner MUST surface
it. Phase 4 payload schema must include `on_status: Optional[Callable]`.

### R3 — `max_gate_retries` default = 3 (keyframe only)

`execute_keyframe(..., max_gate_retries: int = 3, ...)`. The runner payload must
allow the caller to override the default — do NOT hardcode 3 in ImageRunner.run.
Use `payload.get("max_gate_retries", 3)` so it matches StepRunner's behavior
exactly.

### R4 — `inputs_snapshot` dict (both methods)

Both methods accept `inputs_snapshot: dict | None = None`. This is the
provenance dict (refs, bible, prompts, overrides) — load-bearing for the
Provenance Explorer project. Runners must thread it through unchanged.

### R5 — `generate_audio` default True (video only)

`execute_video(..., generate_audio: bool = True, ...)`. The runner payload must
allow override — driver-beware seatbelt project ran with audio off, and the
default-on flag flowed through `unified_payload["generate_audio"]` (L412-413).
Phase 4 must respect explicit `False`, not coalesce to default.

### R6 — Reference image bundles (keyframe-specific)

`execute_keyframe` takes FOUR distinct ref slots:
- `scene_ref_path: Optional[Path]` — single path
- `pose_ref_path: Optional[Path]` — single path
- `identity_refs: Optional[list[Path]]` — list
- `expression_refs: Optional[list[Path]]` — list

These are bundled into a `KeyframeRefBundle` (assembler.py) and compiled to
parts via `compile_keyframe_parts(bundle)`. Phase 3 ImageRunner payload must
preserve all four slots distinctly — collapsing them into a single
`reference_images` list (as `execute_video` accepts) would break the assembler
contract.

### R7 — `reference_images` flat list (video-specific)

`execute_video(..., reference_images: Optional[list[str]] = None, ...)` is a
flat string list (not Path), passed straight to
`unified_payload["reference_images"]` (L409-410). Distinct semantically from
keyframe's `identity_refs/expression_refs`. Phase 4 must keep it as `list[str]`,
not Paths.

### R8 — Side effects in StepRunner constructor

`class StepRunner` (L310) — instantiating a StepRunner inside ImageRunner.run()
or VideoRunner.run() may trigger constructor work that the registry-side test
does not anticipate (cost logger init, store wiring, feedback registration).
**Phase 3/4 should accept an INJECTED StepRunner instance via payload (or a
cached singleton) rather than instantiating fresh per call.** This protects the
test surface and avoids hidden cost-logger duplication.

### R9 — `elements_payload` legacy fal.ai O3 pathway (video)

`execute_video(..., elements_payload: Optional[dict] = None, ...)`. Legacy
fal.ai O3 elements pathway — surfaced via `unified_payload["hints"]["elements"]`
(L418-424). Rare callers but must be preserved for backward compat. Don't drop.

### R10 — `negative_prompt` (video)

`execute_video(..., negative_prompt: Optional[str] = None, ...)`. Per the JT
feedback memory `feedback-prompt-quality-guard.md`, negative_prompt in config
is dead code; only the explicit-call form is live. Phase 4 must surface it but
treat absence as truly absent (None), not empty-string.

### R11 — `end_frame` validation timing (video)

`execute_video` validates `end_frame.exists()` BEFORE building the unified
payload (L378-381). Phase 4 must mirror that ordering — defer the existence
check to inside StepRunner.execute_video, do not pre-validate in VideoRunner.run
(would duplicate work and risk skew if the path is mutated mid-call).

### R12 — `gate_verdict` is singular, plural in metadata?

StepResult.gate_verdict is `Optional[GateVerdict]` (singular). The CP-3-era
mapping puts it under `RunResult.metadata["gate_verdict"]`. Adapters MUST NOT
rename to `gate_verdicts` (plural) when packing — the contract is singular all
the way through.

### R13 — Frozen dataclass invariant

StepResult is `@dataclass(frozen=True)`. Adapters MUST construct a fresh
RunResult; mutating StepResult will raise FrozenInstanceError. Phase 7
verification must include a test that writes through the adapter without
attempting in-place mutation of StepResult.

## CP-5 followups (do NOT do in CP-4 — list only)

After CP-4, these become the CP-5 work surface:
1. Migrate the 13 StepRunner importers to call `get_runner(modality).run(payload)`.
2. Define `GenerationReceipt` (CP-5 deliverable) wrapping `RunResult`.
3. Move `execute_pass` / `execute_previz` / `execute_multi_shot` /
   `execute_wan_r2v` into the registry surface OR rebuild as orchestration on
   top of runners.
4. Resolve `pipeline/lib/api_client.py` (7-line star-import proxy with 10+
   active callers — verified 2026-04-27, NOT 61-line stub) — migrate callers
   off the proxy OR commit to keeping it as the canonical lib-side import path.
5. Decide if `execute_previz` becomes its own modality (`image_previz`) or
   folds into `image_t2i` with a sub-mode.
6. Decide whether `make_identity_gate` and `make_video_drift_gate` (module-level
   helpers in step_runner.py) move to a separate `gates.py` module, or remain
   coupled to step_runner.
7. Decide whether the `pipeline/orchestrator/step_runner.py` star-import shim
   (line 7) is collapsed into the registry import or kept as a thin re-export.

## Post-CP-4 summary

Generated: 2026-04-28
CP-4 status: GREEN (Phases 1-9 complete)
Tag: post-cp4-modality-registry (parent + engine-memory)

### What changed

- New package: `recoil/pipeline/core/` containing `registry.py` and `runners/`.
- Four modalities registered: `image_t2i` (LIVE), `video_i2v` (LIVE),
  `audio_t2a` (STUB until CP-8), `lipsync_post` (STUB until CP-8).
- New API:
  ```python
  from pipeline.core.registry import get_runner
  result = get_runner("image_t2i").run(payload)
  ```
- Adding a new modality is a one-line edit:
  ```python
  register_runner("my_modality", MyRunner())
  ```

### What did NOT change

- `recoil/execution/step_runner.py` — byte-unchanged from `pre-cp4-modality-registry`.
  All seven `execute_*` methods still live; the runners delegate to them.
- Provider adapters, prompt engine, video model client — all CP-2/CP-3 stable.
- engine-memory subrepo — diff stays empty.

### Bootstrap pattern (until CP-5 centralizes)

```python
import pipeline.core.runners                    # registers stubs
from pipeline.core.runners import register_default_runners
register_default_runners(my_step_runner)        # registers image_t2i + video_i2v

# now usable:
from pipeline.core.registry import get_runner
get_runner("image_t2i").run({...})
```

### Conftest fix (load-bearing — keep)

`recoil/pipeline/tests/conftest.py` now inserts `RECOIL_ROOT` BEFORE
`STARSEND_ROOT` on `sys.path`. Phase 2's new `pipeline/core/` would otherwise
shadow `recoil/core/` for pre-existing tests using `from core.paths import ...`.
This fix is permanent infrastructure, not a workaround.

### Rollback (within 1 week of Phase 8)

```bash
cd /Users/joeturnerlin/Dropbox/CLAUDE_PROJECTS
git checkout pre-cp4-modality-registry -- recoil/pipeline/core/
git checkout pre-cp4-modality-registry -- recoil/pipeline/tests/conftest.py
# step_runner.py was untouched; nothing to revert there.
```
