# Context Bundle: Client-Side StepRunner Architecture

## Project Overview

Recoil Studios is an AI-powered vertical microdrama production company. The **Visual Pipeline** (codebase in `starsend/`) generates frames and video for scripted series (Tartarus, Leviathan, etc.) via frontier models (Gemini for stills, Kling/SeedDance/Veo for video).

A new use case has emerged: **client video projects** like "Driver Beware", a Culver corporate safety video. These differ fundamentally from the scripted series pipeline, as the sections below describe.

## The Problem

The existing pipeline (`pipeline.py` → `StepRunner`) is deeply coupled to the **Recoil series workflow**:
- Reads from Recoil storyboards via `recoil_bridge.py`
- Expects `global_bible.json` with character/location data from Narrative Engine extraction
- Uses scene planning/routing (`scene_planner.py`) to classify shots into tiers
- Has a 13-step pipeline: Script Lock → Camera Test → Bible → Enrichment → Plan Pass → Casting → Location Refs → Previz → Review → Keyframe → Dailies → Video → Export
- ExecutionStore state machine with 25+ states (previs_pending → keyframe_generating → video_complete → approved)
- 3 generation modes via StepRunner: `execute_keyframe`, `execute_video`, `execute_previz` (plus `execute_multi_shot` for multi-prompt sequences)

For **client video** (Driver Beware), none of the early pipeline steps apply:
- No Recoil script — it's song-driven (79s MP3 with lyrics)
- No narrative engine extraction — manual shot plan (`ep_001_plan.json` with 12 sequences)
- No global_bible — manual `client_bible.json` with characters/locations/props
- No scene planning/routing — all sequences use Kling O3 multi-shot with elements
- No previz pass — go straight to video generation
- No keyframe generation — start frames come from draft video cut frames or Gemini grid exploration
- Different aspect ratio (16:9 client delivery vs 9:16 vertical series)

### What DOES work from the existing pipeline:
- `StepRunner.execute_video()` — works perfectly for single-shot I2V/T2V
- `StepRunner.execute_multi_shot()` — works perfectly for multi-prompt sequences
- `ElementManager` — character element handling for Kling O3
- `ExecutionStore` — shot state tracking (simplified states)
- `api_client.py` / `FalAiKlingClient` — API layer
- Visual validation critics (RefImageCritic, StartFrameCritic, VideoFrameCritic)
- `lib/slicer.py` — video slicing for editorial cuts
- Take management (versioned outputs, take records)

### What's awkward/broken:
- `pipeline.py` is 800+ lines of Recoil-specific orchestration we'd never call
- `recoil_bridge.py` fails on client projects (no storyboards, no breakdown)
- `client_bridge.py` exists as a thin shim but doesn't connect to StepRunner
- Grid exploration workflow (Gemini → pick quadrant → use as start frame) has no StepRunner integration
- No concept of "sequence" as a unit — StepRunner thinks in individual shots
- No way to batch a sequence with start frame + elements + multi-prompt in one call
- The shot plan format (sequences with shots) doesn't match the plan format StepRunner expects

## Current Driver Beware Architecture

### Data Structure
```
projects/driver-beware/
├── project_config.json          # project_type: "client_video", aspect_ratio: "16:9"
├── casting_state.json           # 7 entries: driver, deer, woman, man, driver2, blue_car, dark_blue_car
├── refs/
│   ├── characters/{id}/         # Character ref sheets (18% gray background)
│   ├── locations/{id}/          # Location moodboards
│   └── props/{id}/              # Prop refs (cars)
├── state/starsend/
│   ├── client_bible.json        # Manual bible with characters, locations, props
│   └── plans/ep_001_plan.json   # 12 sequences, each with 3-5 shots
└── output/
    ├── video/                   # Generated clips (~36 so far)
    └── frames/ep_001/clean/     # Clean start frames (~15)
```
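
Both `StepRunner.execute_video()` and `StepRunner.execute_multi_shot()` default to `aspect_ratio="9:16"`, so a client run has to pass the ratio from `project_config.json` explicitly. A minimal sketch of that lookup follows. The helper name `resolve_aspect_ratio` is hypothetical, and it assumes the config has already been loaded into a plain dict with the two keys shown in the tree above.

```python
# Hypothetical helper: not part of the existing codebase.
SERIES_DEFAULT_ASPECT = "9:16"  # mirrors the StepRunner defaults listed in this document


def resolve_aspect_ratio(config: dict) -> str:
    """Return the aspect ratio to pass to StepRunner for this project.

    Assumes `config` is the parsed project_config.json with optional
    "project_type" and "aspect_ratio" keys.
    """
    ratio = config.get("aspect_ratio")
    if config.get("project_type") == "client_video" and not ratio:
        # Fail loudly rather than silently rendering a client job in 9:16.
        raise ValueError("client_video projects must set aspect_ratio explicitly")
    return ratio or SERIES_DEFAULT_ASPECT
```

Raising for client projects with no ratio is a design choice, not existing behavior: it trades a hard failure at load time for never paying for a generation in the wrong orientation.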

### Shot Plan Format (client)
```json
{
  "episode_id": "EP001",
  "sequences": [
    {
      "id": "SEQ01",
      "song_section": "Intro",
      "song_timestamp": "0:00-0:05",
      "narrative": "Car drives peacefully down the street",
      "duration": 15,
      "model": "kling-o3",
      "elements": ["blue_car", "driver"],
      "shots": [
        {"prompt": "Extreme wide shot...", "duration": 3},
        {"prompt": "Tracking shot...", "duration": 3}
      ],
      "director_notes": "Establish the world. Peaceful, sunny, safe feeling."
    }
  ]
}
```
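
One way to make this format easier to work with is to parse it into small typed records on load. The sketch below is an illustration, not existing code: `PlanShot`, `PlanSequence`, and `parse_plan` are hypothetical names, and only the fields visible in the JSON above are modeled. Whether a sequence's `duration` must equal the sum of its shot durations is not stated here, so the sketch only exposes the sum and leaves any comparison to the caller.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PlanShot:
    prompt: str
    duration: int  # seconds


@dataclass(frozen=True)
class PlanSequence:
    id: str
    model: str
    duration: int  # seconds, as declared on the sequence
    elements: tuple[str, ...]
    shots: tuple[PlanShot, ...]
    narrative: str = ""
    director_notes: str = ""

    @property
    def shot_seconds(self) -> int:
        """Total of the per-shot durations (may differ from `duration`)."""
        return sum(shot.duration for shot in self.shots)


def parse_plan(raw: dict) -> list[PlanSequence]:
    """Convert a parsed client plan dict into PlanSequence records."""
    sequences = []
    for seq in raw["sequences"]:
        shots = tuple(PlanShot(s["prompt"], int(s["duration"])) for s in seq["shots"])
        if not shots:
            raise ValueError(f"{seq['id']}: sequence has no shots")
        sequences.append(
            PlanSequence(
                id=seq["id"],
                model=seq["model"],
                duration=int(seq["duration"]),
                elements=tuple(seq.get("elements", [])),
                shots=shots,
                narrative=seq.get("narrative", ""),
                director_notes=seq.get("director_notes", ""),
            )
        )
    return sequences
```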

### Proven Client Workflow (manually done for SEQ08-09)
```
1. Select sequence from plan
2. Resolve elements (ElementManager.build_fal_elements)
3. Optional: Generate start frame via grid exploration
   a. Gemini 2x2 or 3x3 grid → pick best quadrant → crop
   b. JT approves start frame
4. Build multi-prompt sequence from plan shots
5. Submit via StepRunner.execute_multi_shot(batch, multi_prompt_sequence, model=..., start_frame=..., elements_payload=...)
6. Review output, iterate
```
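
Steps 2, 4, and 5 of the workflow above could be sketched as a single function. This is a rough illustration under several assumptions: `run_sequence` is a hypothetical name, `resolve_elements` is a caller-supplied callable standing in for `ElementManager.build_fal_elements` (whose signature is not shown in this document), and the shapes used for `batch` (a list of synthesized shot IDs) and `multi_prompt_sequence` (a list of prompt/duration dicts) are guesses that would need checking against the real `execute_multi_shot`.

```python
from typing import Any, Callable, Optional


def run_sequence(
    runner: Any,
    sequence: dict,
    resolve_elements: Callable[[list], Any],
    start_frame: Optional[str] = None,
    aspect_ratio: str = "16:9",
):
    """Submit one plan sequence as a single multi-shot call."""
    # Step 2: resolve element IDs from the plan into an API payload.
    elements_payload = resolve_elements(sequence.get("elements", []))

    # Step 4: build the multi-prompt sequence from the plan shots.
    # ASSUMPTION: one dict per shot with "prompt" and "duration".
    multi_prompt = [
        {"prompt": shot["prompt"], "duration": shot["duration"]}
        for shot in sequence["shots"]
    ]

    # ASSUMPTION: batch is a list of shot IDs; these IDs are synthesized here.
    batch = [f"{sequence['id']}_SH{i:02d}" for i in range(1, len(multi_prompt) + 1)]

    # Step 5: keyword arguments avoid depending on positional order,
    # since aspect_ratio precedes elements_payload in the signature.
    return runner.execute_multi_shot(
        batch,
        multi_prompt,
        model=sequence["model"],
        start_frame=start_frame,
        aspect_ratio=aspect_ratio,
        elements_payload=elements_payload,
    )
```

Passing the runner and the element resolver in as arguments keeps the function testable with fakes, so the orchestration can be exercised without spending API credits.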

### Existing StepRunner API (relevant methods)
```python
class StepRunner:
    def __init__(self, store: ExecutionStore, paths: ProjectPaths, cost_logger=None, validate_frames=True)

    def execute_video(self, shot_id, prompt, model, start_frame=None, end_frame=None,
                      duration=5, aspect_ratio="9:16", gates=None, elements_payload=None,
                      generate_audio=True, reference_images=None, negative_prompt=None,
                      inputs_snapshot=None) -> StepResult

    def execute_multi_shot(self, batch, multi_prompt_sequence, model="kling-v3",
                          start_frame=None, aspect_ratio="9:16", elements_payload=None,
                          cfg_scale=None) -> list[StepResult]

    def execute_keyframe(self, shot_id, prompt, model, scene_ref_path=None, ...) -> StepResult

    def execute_previz(self, shot, all_shots, bible=None, ...) -> StepResult

    def transition(self, shot_id, to_state, reason="") -> None
```

### Existing client_bridge.py
```python
# Thin shim: reads client project data but doesn't execute anything.
# Signatures only; bodies are elided here.
def load_client_storyboard(project, episode=1): ...   # Loads plan JSON
def load_client_bible(project): ...                   # Loads bible JSON
def load_client_project_config(project): ...          # Loads project config
def save_client_bible(project, bible_data): ...       # Writes bible
def get_client_refs_dir(project): ...                 # Returns refs path
```

## The Question

We need a way to run the client video workflow through the pipeline. Three options:

### Option A: Fork StepRunner
Create `ClientStepRunner` that inherits from or forks `StepRunner`. Strip out the Recoil-specific assumptions, add sequence-level orchestration, grid exploration, client plan loading.

**Pro:** Clean separation, no risk of breaking series pipeline.
**Con:** Code duplication, two runners to maintain, and the two will diverge over time.

### Option B: Lightweight Wrapper
Keep StepRunner as-is. Build a thin `ClientSequenceRunner` that:
- Reads client plan format
- Resolves elements
- Manages grid exploration workflow
- Calls StepRunner.execute_multi_shot() and execute_video() as-is
- Handles client-specific state (sequence progress, start frame approval)

**Pro:** Reuses all existing code, no fork divergence. StepRunner stays the single source of truth.
**Con:** Two levels of abstraction. Sequence-level state (which sequences are done?) lives outside ExecutionStore.
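
To make the sequence-state concern concrete, here is one possible shape for state kept outside ExecutionStore: a small JSON sidecar file owned by the wrapper. Everything in this sketch is hypothetical, including the class name `SequenceProgress`, the status names, and the idea of a sidecar file at all; it is meant only to show how little code the "which sequence is next?" question might need.

```python
import json
from pathlib import Path
from typing import Optional


class SequenceProgress:
    """Tracks per-sequence status in a JSON file next to the plan."""

    # Hypothetical status names; not ExecutionStore states.
    STATUSES = ("pending", "start_frame_approved", "generated", "approved")

    def __init__(self, path: Path, sequence_ids: list):
        self.path = path
        self.order = list(sequence_ids)  # plan order decides what is "next"
        saved = json.loads(path.read_text()) if path.exists() else {}
        # Sequences missing from the file start as "pending".
        self.status = {sid: saved.get(sid, "pending") for sid in self.order}

    def mark(self, seq_id: str, status: str) -> None:
        if seq_id not in self.status:
            raise KeyError(f"unknown sequence: {seq_id}")
        if status not in self.STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.status[seq_id] = status
        self.path.write_text(json.dumps(self.status, indent=2))

    def next_unfinished(self) -> Optional[str]:
        """First sequence in plan order that is not yet approved, or None."""
        for sid in self.order:
            if self.status[sid] != "approved":
                return sid
        return None
```

Writing the file on every `mark()` keeps the sidecar readable between sessions, at the cost of a second source of truth alongside ExecutionStore's per-shot states.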

### Option C: Build New
Build a purpose-built `ClientPipeline` from scratch that shares only the lowest-level components (api_client, ElementManager, execution_store, critics).

**Pro:** Perfect fit for client workflow, no legacy baggage.
**Con:** Massive duplication of battle-tested code (save logic, take management, cost tracking, validation hooks).

## Specific Questions for Consultant

1. **Which option (A/B/C) is best for a solo developer** who already has a working pipeline and needs to ship client videos NOW?
2. **Should the grid exploration workflow be formalized?** Currently it's ad-hoc (run Gemini grid, eyeball it, crop). Should it become a StepRunner method like `execute_grid_exploration()`?
3. **How should sequence-level state be tracked?** Each sequence = 1 multi-shot API call producing 1 video. But the plan has 12 sequences. Where does "SEQ03 is done, SEQ04 is next" live?
4. **Should client projects use a simplified ExecutionStore state machine?** The current 25+ states include previz/keyframe layers that client video skips. Should there be a "video-only" state subset?
5. **What's the right integration point with the Production Console?** The Console is built around the full pipeline. Can Dailies/Board tabs work with client projects?
6. **How should we handle the different plan format?** Client plans have `sequences[].shots[]` vs Recoil plans that have flat `shots[]` with routing metadata. Adapter? New format? Convert on load?
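
For question 4, a "video-only" subset could be expressed as an allow-list of transitions checked before calling `StepRunner.transition()`. The sketch below is an assumption-heavy illustration: only `video_complete` and `approved` appear in this document, while `video_pending`, `video_generating`, and `video_failed` are hypothetical names that would need to be mapped onto the real ExecutionStore states.

```python
# Allowed transitions for a video-only workflow.
# ASSUMPTION: state names other than "video_complete" and "approved"
# are placeholders, not confirmed ExecutionStore states.
VIDEO_ONLY_TRANSITIONS = {
    "video_pending": frozenset({"video_generating"}),
    "video_generating": frozenset({"video_complete", "video_failed"}),
    "video_failed": frozenset({"video_pending"}),  # retry
    "video_complete": frozenset({"approved", "video_pending"}),  # approve or redo
    "approved": frozenset(),  # terminal
}


def check_transition(from_state: str, to_state: str) -> None:
    """Raise ValueError if the move is outside the video-only subset."""
    allowed = VIDEO_ONLY_TRANSITIONS.get(from_state)
    if allowed is None:
        raise ValueError(f"{from_state!r} is not part of the video-only subset")
    if to_state not in allowed:
        raise ValueError(f"{from_state!r} -> {to_state!r} is not allowed for client video")
```

A guard like this leaves the full state machine untouched for the series pipeline and restricts only what the client wrapper is permitted to request.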

## Key Constraints
- Solo developer (JT + Claude)
- Must not break the existing series pipeline
- Client video needs to ship ASAP (this is paid work)
- The grid exploration → start frame → multi-shot workflow is proven but not formalized
- Future client projects are likely (not just Driver Beware)
- 16:9 aspect ratio for client, 9:16 for series
- Kling O3 with elements is the primary model for client video