## Round 5 — Structured Grid Prompting Discovery

JT found two prompt engineering approaches from the AI video production community that may significantly improve our grid planning pass. These introduce **structured grid positions** — each cell in the 3x3 grid has a defined shot type, rather than "9 random angles."

### Prompt 1: "Cinematic Contact Sheet" (concise version)

This prompt assigns specific shot types to specific grid positions:

```
Row 1 (Establishing Context):
  1. Extreme Long Shot (ELS) — subject(s) small within vast environment
  2. Long Shot (LS) — complete subject(s) head to toe
  3. Medium Long Shot (3/4) — knees up or 3/4 view

Row 2 (Core Coverage):
  4. Medium Shot (MS) — waist up, focus on interaction/action
  5. Medium Close-Up (MCU) — chest up, intimate framing
  6. Close-Up (CU) — tight on face or front of object

Row 3 (Details & Angles):
  7. Extreme Close-Up (ECU) — macro detail on key feature
  8. Low Angle Shot (Worm's Eye) — looking up, imposing/heroic
  9. High Angle Shot (Bird's Eye) — looking down from above
```

Requirements: same people/objects, same clothes, same lighting across all 9 panels. Depth of field shifts realistically (bokeh in close-ups).
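
The layout above can also be held as data and rendered into prompt text, which makes it easier to swap shot types per cell later. This is a minimal sketch, not existing project code: `POSITIONS`, `GRID_LAYOUT`, `CONTINUITY`, and `build_contact_sheet_prompt` are hypothetical names, and the position labels ("TOP LEFT", ...) are an assumed wording rather than part of the original prompt.

```python
# Illustrative sketch: Prompt 1's layout as data, plus a renderer.
# All names here are hypothetical, not existing project code.

# Assumed cell labels, in reading order (left to right, top to bottom).
POSITIONS = [
    "TOP LEFT", "TOP CENTER", "TOP RIGHT",
    "MIDDLE LEFT", "MIDDLE CENTER", "MIDDLE RIGHT",
    "BOTTOM LEFT", "BOTTOM CENTER", "BOTTOM RIGHT",
]

# (shot type, framing note) for cells 1-9, copied from the prompt above.
GRID_LAYOUT = [
    ("Extreme Long Shot (ELS)", "subject(s) small within vast environment"),
    ("Long Shot (LS)", "complete subject(s) head to toe"),
    ("Medium Long Shot (3/4)", "knees up or 3/4 view"),
    ("Medium Shot (MS)", "waist up, focus on interaction/action"),
    ("Medium Close-Up (MCU)", "chest up, intimate framing"),
    ("Close-Up (CU)", "tight on face or front of object"),
    ("Extreme Close-Up (ECU)", "macro detail on key feature"),
    ("Low Angle Shot (Worm's Eye)", "looking up, imposing/heroic"),
    ("High Angle Shot (Bird's Eye)", "looking down from above"),
]

CONTINUITY = (
    "Same people/objects, same clothes, same lighting across all 9 panels. "
    "Depth of field shifts realistically (bokeh in close-ups)."
)


def build_contact_sheet_prompt(scene_description: str) -> str:
    """Render one prompt string with an explicit shot type per grid cell."""
    lines = [f"3x3 cinematic contact sheet of: {scene_description}", ""]
    for position, (shot, framing) in zip(POSITIONS, GRID_LAYOUT):
        lines.append(f"{position}: {shot}, {framing}")
    lines += ["", CONTINUITY]
    return "\n".join(lines)
```

Calling `build_contact_sheet_prompt("Jinx wedges hook")` would yield a header line, nine labeled cell lines, and the continuity requirements. Whether explicit position labels actually help the model is one of the open questions below.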

### Prompt 2: "Cinematic Sequence Director" (elaborate version)

This is a full pipeline that runs BEFORE grid generation:

**Step 1 — Scene Breakdown:**
- Subjects: list each key subject (A/B/C), visible traits, positions, facing direction
- Environment & Lighting: spatial layout, light direction & quality, time-of-day, vibe keywords
- Visual Anchors: 3-6 traits that must stay constant (palette, signature prop, key light source, weather/fog, grain/texture)

**Step 2 — Theme & Story:**
- Theme, logline, emotional arc (setup/build/turn/payoff)

**Step 3 — Cinematic Approach:**
- Shot progression strategy (wide to close or reverse)
- Camera movement plan (push/pull/pan/dolly and WHY)
- Lens & exposure: focal length range, DoF tendency, shutter feel
- Light & color: contrast, key tones, grain

**Step 4 — Keyframes:**
- Per frame: composition, action/beat, camera (height/angle/movement), lens/DoF, lighting, sound/atmos
- Hard requirements: 1 establishing wide, 1 intimate CU, 1 ECU, 1 power-angle shot

**Step 5 — Contact Sheet:**
ONE master 3x3 grid image containing ALL keyframes. Labels in safe margins. Strict continuity.
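
The Step 4 hard requirements are easy to check mechanically before spending a generation call. The sketch below is one way to do that under stated assumptions: `Keyframe` and `missing_requirements` are hypothetical names, "establishing wide" is assumed to mean ELS/LS/WIDE, and "power-angle" is assumed to mean a low or high camera angle. The source prompt does not define either term.

```python
from dataclasses import dataclass


@dataclass
class Keyframe:
    """Hypothetical, trimmed-down keyframe record for illustration."""
    composition: str
    shot_type: str     # e.g. "WIDE", "MS", "CU", "ECU"
    camera_angle: str  # e.g. "eye", "low", "high"


# Assumptions: the prompt does not spell out these mappings.
WIDE_TYPES = {"ELS", "LS", "WIDE"}
POWER_ANGLES = {"low", "high"}


def missing_requirements(frames: list[Keyframe]) -> list[str]:
    """Return the names of hard requirements that no keyframe satisfies."""
    checks = {
        "establishing wide": any(f.shot_type in WIDE_TYPES for f in frames),
        "intimate CU": any(f.shot_type == "CU" for f in frames),
        "ECU": any(f.shot_type == "ECU" for f in frames),
        "power-angle shot": any(f.camera_angle in POWER_ANGLES for f in frames),
    }
    return [name for name, satisfied in checks.items() if not satisfied]
```

An empty return value means the keyframe list covers all four requirements. Anything else names the gaps to fill before building the contact sheet.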

### Why This Matters for Starsend

1. **Our storyboards already have this data.** Each shot in `storyboard_ep_001.json` has `shot_type` (ECU, CU, MCU, MS, LS, WIDE), `camera_angle` (eye, low, high, dutch), `focal_length`, `aperture`, `lighting`, `emotion`, etc. We could compile a structured grid prompt that assigns specific storyboard shots to specific grid positions.

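2. **Scene coverage in one call.** A scene with 5-8 shots could be covered by a single 3x3 grid generation (leaving 1-4 cells spare). Each panel would show a different storyboard shot, while environment, character, and lighting should stay consistent because every panel comes from the same generation and therefore the same seed. How reliable that consistency is in practice still needs testing.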

3. **The Visual Anchors concept maps to our scene_planner.** Their "3-6 visual traits that must stay constant" is exactly what our scene_planner should extract from the storyboard: palette, lighting direction, key props, atmospheric conditions.

4. **Progressive framing (ELS → LS → MS → CU → ECU)** matches our shot progression within a scene. We could generate the grid following the storyboard's actual shot order.
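
Points 1, 3, and 4 above can be sketched together: place storyboard shots into cells in storyboard order, and treat any field that is identical across every shot as a candidate visual anchor. This is a rough sketch under assumptions. The real schema of `storyboard_ep_001.json` is not shown here, so shots are modeled as plain dicts with the `shot_type`, `camera_angle`, and `lighting` fields named above, and `assign_cells` and `constant_fields` are hypothetical helpers.

```python
def assign_cells(shots: list[dict]) -> list[str]:
    """Place shots into a 3x3 grid in storyboard order, one shot per cell."""
    if len(shots) > 9:
        raise ValueError("a 3x3 grid holds at most 9 shots")
    cells = []
    for index, shot in enumerate(shots):
        row, col = divmod(index, 3)  # reading order: left to right, top to bottom
        cells.append(
            f"Row {row + 1}, Col {col + 1}: "
            f"{shot['shot_type']}, {shot['camera_angle']} angle"
        )
    return cells


def constant_fields(shots: list[dict], candidates: list[str]) -> dict:
    """Return {field: value} for candidate fields identical across all shots.

    Fields that are missing from any shot are skipped, since they cannot
    be confirmed as constant.
    """
    anchors = {}
    for field in candidates:
        values = {shot.get(field) for shot in shots}
        if len(values) == 1 and None not in values:
            anchors[field] = values.pop()
    return anchors
```

With two shots that share a `lighting` value but differ in `emotion`, `constant_fields(shots, ["lighting", "emotion"])` would return only the `lighting` entry. That is the kind of trait a scene_planner could pin as an anchor, though palette and key props would likely need fields the storyboard may not yet carry.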

### Questions for Gemini

1. **Can we map our actual storyboard shots to grid positions?** Instead of the generic ELS→LS→MS→CU→ECU progression, can each of the 9 cells contain a SPECIFIC shot from our storyboard with its own unique framing/action?

2. **Does structured grid prompting (assigning shot types per cell) produce better results than generic "9 different angles" prompting?** Is there something about explicitly labeling "TOP LEFT: Extreme Long Shot, TOP CENTER: Long Shot..." that helps the model?

3. **How does this interact with our reference images?** If we pass Jinx identity refs + scene ref alongside a structured grid prompt, does the model maintain identity across all 9 panels reliably?

4. **Should the grid planning pass use this structured format, or should we keep it as "9 variations of the same moment" and apply the structured approach only for scene coverage grids?**

5. **Can we use this approach to generate our grayscale expression library more effectively?** Instead of "9 expressions," we could use "Row 1: subtle emotions (neutral, focused, concerned), Row 2: moderate emotions (exhausted, determined, angry), Row 3: extreme emotions (terror, rage, grief)."

6. **For a single-shot test: what would the ideal prompt look like for Leviathan EP001 Shot 2 (Jinx wedges hook, MS, low angle)?** Write us the exact prompt using these techniques, incorporating our storyboard data, character refs, and the structured grid approach. Give us something we can run immediately.

Please also evaluate: is the "Cinematic Sequence Director" planning layer (Steps 1-3) worth integrating into our prompt_engine.py, or does our storyboard data already provide equivalent or superior information?
