# Revised Architecture: Client Video Pipeline — Round 2

## Preamble

Good pushback across the board. You caught me being hand-wavy on state flow and grid exploration, and the ElementManager correction is noted — I was wrong there, the real risk is the element cap, not schema mismatch. Let me revise with specifics.

---

## 1. Sequence Tracking: Takes, Shot-Level Prompts, and Mixed Mode

### Multiple Takes — No Change Needed

You confirmed `execute_multi_shot` already registers takes on a primary `shot_id`. This means SEQ01 run three times produces `SEQ01_take001`, `SEQ01_take002`, `SEQ01_take003`. That's correct and sufficient. No new machinery needed.

### Shot-Level Prompt Iteration — You Will Regret Not Having It

Here's the scenario: SEQ08 is a 5-shot sequence. Take 1 looks great except shot 3 has weird hand motion. You tweak shot 3's prompt and re-run. Take 2 is better on shot 3 but now shot 5 is worse. You want to go back to take 1's shot 5 prompt but keep take 2's shot 3 prompt.

If you only track at the sequence level, you're manually diffing prompt lists across takes to reconstruct what changed. That's exactly the kind of thing that's fine for the first 3 sequences and maddening by sequence 10.

**Revised position:** Store the per-shot prompt array on each take record. Not as separate shot entries in ExecutionStore — that would pollute the shot namespace. Instead, on the take record itself:

```python
# Inside the take record for SEQ08_take002
{
    "take_id": "SEQ08_take002",
    "shot_prompts": [
        {"index": 0, "prompt": "...", "duration": 5},
        {"index": 1, "prompt": "...", "duration": 5},
        {"index": 2, "prompt": "REVISED: ...", "duration": 5},
        {"index": 3, "prompt": "...", "duration": 5},
        {"index": 4, "prompt": "...", "duration": 5}
    ],
    "start_frame": "path/to/approved_frame.png",
    "model": "kling-o3",
    "concatenated_prompt_summary": "..."
}
```

This gives you shot-level prompt history without creating phantom shots. When you want to cherry-pick prompts across takes, you read `shot_prompts` from each take and compose the next run's prompt array.
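The cherry-pick step could look roughly like this. It is a minimal sketch that assumes take records are plain dicts shaped like the example above; `compose_prompts` and the `picks` mapping are hypothetical names for illustration, not existing pipeline code.

```python
from copy import deepcopy


def compose_prompts(takes, base_take_id, picks):
    """Build the next run's shot_prompts from earlier takes.

    takes: dict mapping take_id -> take record (with a "shot_prompts" list).
    base_take_id: take whose prompts are used by default.
    picks: dict mapping shot index -> take_id to pull that shot's prompt from.
    """
    # Copy so the stored take records are never mutated.
    composed = deepcopy(takes[base_take_id]["shot_prompts"])
    for entry in composed:
        source_id = picks.get(entry["index"])
        if source_id is None:
            continue  # keep the base take's prompt for this shot
        source = {s["index"]: s for s in takes[source_id]["shot_prompts"]}
        entry["prompt"] = source[entry["index"]]["prompt"]
    return composed


# Example: start from take 2 (better shot 3), restore take 1's prompt for shot 5.
takes = {
    "SEQ08_take001": {"shot_prompts": [
        {"index": i, "prompt": f"take1 shot {i}", "duration": 5} for i in range(5)
    ]},
    "SEQ08_take002": {"shot_prompts": [
        {"index": i, "prompt": f"take2 shot {i}", "duration": 5} for i in range(5)
    ]},
}
next_prompts = compose_prompts(takes, "SEQ08_take002", {4: "SEQ08_take001"})
```

The result can be stored as the `shot_prompts` of the next take, so the history stays self-describing without any diffing.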

### Mixed-Mode Sequences — Yes, Support Both

This is a real need. Some sequences will be a single `execute_multi_shot` call. Others — especially sequences with very specific per-shot compositions — will be individual I2V calls, each with its own start frame.

**Don't make this a sequence-level toggle.** Make it per-take. The same sequence might start as multi-shot, then switch to individual shots after you realize shot 3 needs its own start frame.

In `ClientSequenceRunner`:

```python
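def run_sequence(self, seq_id, prompts, mode="multi_shot"):
    """Run one take of a sequence. mode: 'multi_shot' or 'individual'."""
    if mode == "multi_shot":
        # One call covers every shot in the sequence
        self.step_runner.execute_multi_shot(...)
    elif mode == "individual":
        # One I2V call per shot, each under its own child shot_id
        for i, prompt in enumerate(prompts):
            shot_id = f"{seq_id}_shot{i:02d}"
            self.step_runner.execute_video(shot_id, prompt, start_frame=...)
    else:
        raise ValueError(f"Unknown mode: {mode!r}")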

When running individual mode, the child shots (`SEQ08_shot00` through `SEQ08_shot04`) are real shot entries in ExecutionStore. The orchestrator tracks that they belong to SEQ08, but each has independent state, prompt history, and takes.
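One way the orchestrator could keep track of that parent/child relationship is to derive it from the ID format itself. This is a sketch under that assumption; `child_shot_ids` and `parent_sequence` are hypothetical helpers, and the only thing taken from the design is the `{seq_id}_shot{i:02d}` naming.

```python
import re

# Matches IDs like "SEQ08_shot03"; take IDs such as "SEQ08_take002" do not match.
CHILD_SHOT_RE = re.compile(r"^(?P<seq_id>.+)_shot(?P<index>\d{2,})$")


def child_shot_ids(seq_id, shot_count):
    """Return the child shot IDs for a sequence run in individual mode."""
    return [f"{seq_id}_shot{i:02d}" for i in range(shot_count)]


def parent_sequence(shot_id):
    """Return (seq_id, index) for a child shot ID, or None if it is not one."""
    match = CHILD_SHOT_RE.match(shot_id)
    if match is None:
        return None
    return match.group("seq_id"), int(match.group("index"))
```

Storing an explicit list of child IDs in the sequence state would work just as well; parsing the ID only avoids a second source of truth.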

---

## 2. Grid Exploration: Lives in the Orchestrator, Execution in StepRunner

### StepRunner Gets One New Method

```python
def generate_grid(self, shot_id, prompt, elements, model="flux-pro"):
    """Generate a 2x2 grid image. Returns path to saved grid."""
```

That's it for StepRunner. It generates and saves. It does NOT crop, does NOT present options, does NOT wait for human input.

### Orchestrator Owns the Grid Workflow

```python
class ClientSequenceRunner:
    def explore_grid(self, seq_id):
        """Generate grid for start frame exploration."""
        grid_path = self.step_runner.generate_grid(...)
        self._update_sequence_state(seq_id, phase="grid_review", grid_path=grid_path)
        return grid_path

    def approve_grid_quadrant(self, seq_id, quadrant):
        """Crop selected quadrant and set as start frame."""
        grid_path = self._get_sequence_state(seq_id)["grid_path"]
        start_frame = crop_quadrant(grid_path, quadrant)
        self._update_sequence_state(seq_id, phase="start_frame_approved", start_frame=start_frame)
        return start_frame
```
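`crop_quadrant` is not defined above. The geometry it needs is small enough to sketch here. `quadrant_box` is a hypothetical helper, and the names `top_right` and `bottom_left` are assumed by analogy with the `top_left` and `bottom_right` values used elsewhere in this document.

```python
QUADRANTS = ("top_left", "top_right", "bottom_left", "bottom_right")


def quadrant_box(width, height, quadrant):
    """Return (left, upper, right, lower) pixel bounds for one 2x2 grid cell."""
    if quadrant not in QUADRANTS:
        raise ValueError(f"Unknown quadrant: {quadrant!r}")
    half_w, half_h = width // 2, height // 2
    left = 0 if quadrant.endswith("left") else half_w
    upper = 0 if quadrant.startswith("top") else half_h
    # Right/bottom cells run to the image edge, so odd dimensions lose no pixels.
    right = half_w if quadrant.endswith("left") else width
    lower = half_h if quadrant.startswith("top") else height
    return left, upper, right, lower
```

The tuple uses the same (left, upper, right, lower) order as Pillow's `Image.crop`, so if Pillow is the imaging library in use, `crop_quadrant` reduces to opening the grid, cropping with this box, and saving the result.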

### How the Console Presents Grid Selection

For Day 1 (CLI only), the human types `--quadrant top_left` after visually inspecting the grid. Zero Console changes needed.

For Week 2, the Console adds four clickable quadrant overlays on grid images.

---

## 3. State Flow — The Full Picture

Two separate state stores, two separate state machines.

### Orchestrator State (sequence_state.json)

```
PLANNED → GRID_EXPLORING → GRID_REVIEW → START_FRAME_APPROVED → GENERATING → REVIEW → APPROVED → FINAL
```

Not every sequence hits every state. If no grid exploration needed: `PLANNED → GENERATING → REVIEW → APPROVED`.
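To keep the orchestrator from drifting into an invalid phase, the chain above can be encoded as a transition table. This is a sketch, not a settled design: `TRANSITIONS` and `advance` are hypothetical names, and the table contains only the edges stated above plus the grid-skipping shortcut.

```python
# Allowed orchestrator transitions (assumed from the chain above; the direct
# PLANNED -> GENERATING edge covers sequences that skip grid exploration).
TRANSITIONS = {
    "PLANNED": {"GRID_EXPLORING", "GENERATING"},
    "GRID_EXPLORING": {"GRID_REVIEW"},
    "GRID_REVIEW": {"START_FRAME_APPROVED"},
    "START_FRAME_APPROVED": {"GENERATING"},
    "GENERATING": {"REVIEW"},
    "REVIEW": {"APPROVED"},
    "APPROVED": {"FINAL"},
    "FINAL": set(),
}


def advance(current, target):
    """Return target if current -> target is allowed, else raise ValueError."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition: {current} -> {target}")
    return target
```

Backward edges such as `REVIEW → GENERATING` for another take, or `GRID_REVIEW → GRID_EXPLORING` to re-roll a grid, are not part of the chain as written and would need to be added to the table explicitly if they are wanted.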

### ExecutionStore State (existing shot state)

```
registered → video_submitted → video_processing → video_complete
```

No new states added. Grid exploration state does NOT touch ExecutionStore. The orchestrator creates the shot entry right before calling `execute_multi_shot` (or `execute_video`, in individual mode).

---

## 4. CLI Tool

**Option (b): New `tools/client_generate.py`**

```bash
# Show sequence status
python tools/client_generate.py status driver-beware

# Explore grid for a sequence
python tools/client_generate.py grid driver-beware SEQ05

# Approve grid quadrant
python tools/client_generate.py approve-grid driver-beware SEQ05 --quadrant bottom_right

# Generate video for a sequence
python tools/client_generate.py generate driver-beware SEQ08

# Generate in individual shot mode
python tools/client_generate.py generate driver-beware SEQ08 --mode individual

# Re-run with a tweaked prompt for one shot (0-based index, so 2 is the third shot)
python tools/client_generate.py generate driver-beware SEQ08 --shot-override 2:"new prompt"

# Approve a take
python tools/client_generate.py approve driver-beware SEQ08 --take 2
```
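The argument parsing for those commands could be sketched with `argparse` as follows. The positional names `project` and `seq_id`, the helper names, and the choice to make `--shot-override` repeatable are all assumptions; only the subcommands and flags come from the examples above.

```python
import argparse


def parse_shot_override(value):
    """Parse 'INDEX:PROMPT' into (int, str); the prompt may itself contain colons."""
    index, sep, prompt = value.partition(":")
    if not sep or not index.isdigit() or not prompt:
        raise argparse.ArgumentTypeError(f"expected INDEX:PROMPT, got {value!r}")
    return int(index), prompt


def build_parser():
    parser = argparse.ArgumentParser(prog="client_generate.py")
    sub = parser.add_subparsers(dest="command", required=True)

    def add(name, needs_seq=True):
        # Every subcommand takes the project; most also take a sequence ID.
        p = sub.add_parser(name)
        p.add_argument("project")
        if needs_seq:
            p.add_argument("seq_id")
        return p

    add("status", needs_seq=False)
    add("grid")
    add("approve-grid").add_argument(
        "--quadrant", required=True,
        choices=["top_left", "top_right", "bottom_left", "bottom_right"])
    gen = add("generate")
    gen.add_argument("--mode", choices=["multi_shot", "individual"],
                     default="multi_shot")
    # Repeatable, so several shots can be overridden in one run.
    gen.add_argument("--shot-override", type=parse_shot_override,
                     action="append", default=[])
    add("approve").add_argument("--take", type=int, required=True)
    return parser
```

Splitting on the first colon only means a prompt like `2:"wide shot: rain on glass"` survives intact.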

---

## 5. Element Cap — The Real Risk

Handled in `ClientSequenceRunner`, not in StepRunner or ElementManager. The cap is 4 elements, or 3 once a start frame is set:

```python
def _resolve_elements(self, seq_id):
    elements = self.plan[seq_id]["elements"]
    has_start_frame = self._get_sequence_state(seq_id).get("start_frame") is not None
    max_elements = 3 if has_start_frame else 4

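    if len(elements) > max_elements:
        dropped = len(elements) - max_elements
        elements = self._prioritize(elements, max_elements)
        # The cap applies with or without a start frame, so report the cap itself
        logger.warning(f"{seq_id}: dropped {dropped} element(s) to fit cap of {max_elements}")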
    return elements
```
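`_prioritize` is left undefined above. One possible shape, shown as a standalone function, is below. It assumes each element is a dict with an optional integer `priority` key (lower means more important); that key and the sample element names are hypothetical and do not come from the client plan format.

```python
def prioritize(elements, max_elements):
    """Keep the max_elements most important elements, preserving plan order.

    Assumes each element is a dict with an optional integer "priority"
    (lower = more important); elements without one rank last.
    """
    # Rank by priority, breaking ties by original position in the plan.
    ranked = sorted(
        enumerate(elements),
        key=lambda pair: (pair[1].get("priority", float("inf")), pair[0]),
    )
    keep = sorted(index for index, _ in ranked[:max_elements])
    return [elements[i] for i in keep]


# Illustrative data only.
elements = [
    {"name": "car", "priority": 2},
    {"name": "driver", "priority": 1},
    {"name": "rain"},
    {"name": "sign", "priority": 3},
]
kept = prioritize(elements, 3)
```

Here the unranked `rain` element is the one dropped, and the survivors keep their plan order.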

---

## 6. Revised Architecture Summary

```
┌─────────────────────────────────────────────┐
│  CLI (client_generate.py)                   │
│  Console API (/api/client/...)              │
└──────────────┬──────────────────────────────┘
               │
┌──────────────▼──────────────────────────────┐
│  ClientSequenceRunner                        │
│  • Reads client plan natively                │
│  • Owns sequence_state.json                  │
│  • Grid exploration workflow                 │
│  • Element priority / cap management         │
│  • Mixed-mode dispatch (multi vs individual) │
│  • Prompt array assembly + shot_prompts      │
└──────────────┬──────────────────────────────┘
               │
┌──────────────▼──────────────────────────────┐
│  StepRunner (minimal changes)                │
│  • execute_multi_shot (existing)             │
│  • execute_video (existing)                  │
│  • generate_grid (NEW — thin wrapper)        │
│  • ExecutionStore (existing states only)     │
└──────────────┬──────────────────────────────┘
               │
┌──────────────▼──────────────────────────────┐
│  API Layer / ElementManager (unchanged)      │
└─────────────────────────────────────────────┘
```

### What I Changed from Round 1

| Position | Round 1 | Round 2 |
|----------|---------|---------|
| Grid exploration | Separate GridExplorer class | Lives in ClientSequenceRunner, only generate_grid() in StepRunner |
| state_profile in ExecutionStore | Add it | Dropped — no ExecutionStore changes |
| Mixed-mode sequences | Not addressed | Per-take mode, individual shots get real shot_ids |
| Shot-level prompt tracking | Not addressed | shot_prompts array on take records |
| Element cap | Wrong risk identified | Orchestrator handles priority |
| CLI tool | Not addressed | New tools/client_generate.py |