This is the final blueprint. We have successfully stripped away the theoretical and built a hardened, production-ready pipeline. 

Here is the definitive resolution to your final concerns, the merged architecture, and the execution plan for EP001.

---

### PART 1: CONFIRMED AGREEMENTS

The following architectural pillars are absolutely **LOCKED**:
1. **Reference Ordering:** Recency bias rules. Scene (Low) → Pose/Flash (Med) → Expression (Med-High) → Identity (High/Closest to Prompt).
2. **Native 9:16:** No triptychs. Native vertical generation via API aspect ratio parameters.
3. **Pristine Identity:** White-background identity refs remain strictly unaltered to prevent degradation.
4. **Hardware Mirroring:** `ImageOps.mirror()` dictates screen direction; prompts do not.
5. **Expression Transfer:** Using non-identity reference images mapped via cross-attention.
6. **Wide-Shot Branching:** Explicit removal of facial details from WIDE/LS prompts.
7. **Positive Constraints & Kinetic Descriptors:** Affirmative prompting only.
8. **Lighting Vector Locking:** Hardcoded directional coordinates in every prompt.

---

### PART 2: FINAL RESOLUTIONS TO REMAINING CONCERNS

**1. Expression Reference Library**
*Final Decision:* **Option (A) - Pre-generated Canonical Library.**
*Rationale:* Generating on the fly adds latency and introduces hallucination risks. Stock photos introduce licensing nightmares. Pre-generate ~30 expressions using a *single, generic 3D-style human model*. 
*Undocumented Trick:* **Batch convert this entire expression library to Grayscale.** If the expression reference has no color, Gemini's cross-attention will pull *only* the facial geometry/muscle tension, completely eliminating the risk of skin-tone or lighting contamination on Jinx.

**2. Two-Character Shots Reference Budget**
*Final Decision:* **Cap at 7 References Total.**
*Rationale:* 8-10 pushes the token window limits and dilutes attention. For Jinx + Kian: 
Scene (1) + Flash Pose (1) + Jinx Expression (1) + Jinx Identity (2) + Kian Identity (2) = 7. 
*What gets cut?* Kian's expression ref. He is a mechanical entity; his "emotion" is conveyed via posture and prompt, not micro-expressions. Drop the 3rd identity ref for both characters; 2 pristine refs are enough if placed last.

**3. Cost Optimization (Complexity Tiers)**
*Final Decision:* **Adopt the 3-Tier System.**
*Rationale:* Not every shot needs a $0.30 pipeline. 
* `simple` (Inserts, static talking heads): Straight to Pro ($0.134).
* `standard` (Action, lighting shifts): 4x Flash -> 1x Pro ($0.30).
* `complex` (Two-shots, heavy emotion, extreme angles): 4x Flash -> up to 3x Pro candidates ($0.56).

**4. Batch Efficiency (Scene-Level Amortization)**
*Final Decision:* **DO NOT chain Flash heroes.**
*Rationale:* Feeding Shot N's output as a reference for Shot N+1 creates "Generation Drift" (a photocopy of a photocopy). By Shot 5, Jinx's proportions will warp and the contrast will deep-fry. Always anchor back to the *pristine* ENV reference and *pristine* identity refs. Consistency comes from the locked references, not from chaining outputs.

**5. gemini-3.1-flash-image-preview**
*Final Decision:* **Use 3.1 Flash, but implement aggressive Exponential Backoff.**
*Rationale:* Pricing is identical to 2.5 in preview, but QPS (Queries Per Second) limits are much stricter. Limit concurrent API calls to 2.

---

### PART 3: THE DEFINITIVE MERGED ARCHITECTURE

Here is your file-by-file production blueprint.

**1. `config.py` (The Ruleset)**
*   Defines `COMPLEXITY_TIERS` and their respective API routing.
*   Holds the locked Lighting Vectors per scene.
*   Enforces the max reference limits (7 max per shot).

**2. `asset_manager.py` (The Vault)**
*   Loads and caches Pristine Identity Refs (white BG).
*   Loads and caches the Pre-generated ENV anchors.
*   Loads the Grayscale Expression Library.
*   *Responsibility:* Ensures references are never overwritten or tint-altered.

**3. `prompt_engine.py` (The Translator)**
*   Implements `build_cinematic_prompt()`.
*   Injects kinetic descriptors, positive constraints, and lighting vectors.
*   *Crucial:* Executes the Wide-Shot Branching logic (stripping facial demands for WIDE/LS).

**4. `assembler.py` (The Compiler)**
*   Implements `ShotAssembler`.
*   Executes `ImageOps.mirror()` based on `faces_left` boolean.
*   Sorts the multipart payload strictly by weight: `[ENV(1) -> Pose(3) -> Grayscale Expression(5) -> Kian ID(8) -> Jinx ID(9) -> Text Prompt(10)]`.

**5. `pipeline.py` (The Orchestrator)**
*   Reads the shot JSON. Checks the `complexity` tier.
*   *If Standard/Complex:* Calls `gemini-3.1-flash-image-preview` for 4x pose/lighting candidates. Pauses for human/auto selection.
*   Takes the Flash Hero, compiles via `assembler.py`, and calls `gemini-3-pro-image-preview` natively at 9:16 aspect ratio.
*   Saves final frame to `delivery/EP001/`.

---

### PART 4: TOP 5 IMPLEMENTATION RISKS & MITIGATIONS

1. **Risk: Prompt Bleeding in Two-Character Shots** (Jinx gets Kian's metallic parts).
   * **Mitigation:** Strict spatial syntax in the prompt. Use `"LEFT SIDE OF FRAME: [Jinx description]. RIGHT SIDE OF FRAME: [Kian description]."` Keep character descriptions physically separated by periods, not commas.
2. **Risk: The "Mushy Monster" Wide Shot Face** (Client rejects impressionistic faces).
   * **Mitigation:** Do not fight the foundational model. Pipe rejected wide shots through an automated SDXL FaceDetailer node as a detached post-process step. Keep it out of the Gemini pipeline.
3. **Risk: API Rate Limiting (HTTP 429) on 3.1 Preview.**
   * **Mitigation:** Implement `tenacity` for exponential backoff. Cap async concurrency to 2 workers max. This is a marathon, not a sprint.
4. **Risk: Anatomy Hallucinations on Complex Poses** (e.g., holding a tool).
   * **Mitigation:** If the Flash exploration fails to generate the correct prop interaction across all 4 candidates, downgrade the shot to `simple`, remove the Flash pass, and provide a rough 3D blockout or sketch as the Pose Reference directly to Pro.
5. **Risk: Aspect Ratio Rejections / Cropping.**
   * **Mitigation:** Use the native API parameter `aspect_ratio="9:16"` in the Imagen 3 payload. Do not request 1024x1024 and crop later—this ruins the model's compositional framing.

---

### PART 5: EP001 TEST PROTOCOL (SHOTS 1-3)

Run this exact sequence to validate the pipeline before scaling to 1,800 shots.

**Shot 1: The Establishing Wide (Test: Wide-Branching & ENV Anchor)**
*   *Setup:* WIDE SHOT. Jinx walking down the corridor. `complexity: simple`.
*   *Refs:* Scene ENV (1), Jinx Identity (9).
*   *Success Criteria:* The script strips facial demands. The environment matches the anchor perfectly. No Flash pass needed.

**Shot 2: The Action Medium (Test: 3-Pass Pipeline & Kinetic Descriptors)**
*   *Setup:* MEDIUM SHOT. Jinx violently wrenching a rusted panel. `complexity: standard`.
*   *Refs:* Scene ENV (1), Jinx Identity (9).
*   *Execution:* Generate 4x Flash candidates. Select the best pose. Feed Flash Hero (3) + Identity (9) to Pro.
*   *Success Criteria:* Motion blur/kinetic energy is present. Lighting matches the Flash hero. Identity remains pristine.

**Shot 3: The Emotional Close-Up (Test: Expression Transfer & Mirroring)**
*   *Setup:* CLOSE UP. Jinx looking screen-right, exhausted and terrified. `complexity: standard`.
*   *Refs:* Grayscale "Exhausted/Terrified" stock ref (5), Jinx Identity (9). `is_mirrored: True` applied to all.
*   *Success Criteria:* Jinx is facing right. The gray expression ref maps the emotion without making Jinx look like the stock actor. Her skin tone remains correct (no gray contamination).

---

### PART 6: EP001 COST MODEL (31 Shots)

Assuming a standard microdrama distribution for EP001:
*   **10 Simple Shots** (Inserts, Static Wides): 10 x $0.134 = **$1.34**
*   **15 Standard Shots** (Mediums, Action): 15 x $0.30 = **$4.50**
*   **6 Complex Shots** (Two-shots, High Emotion): 6 x $0.56 = **$3.36**
*   **Total EP001 API Cost: ~$9.20** 

*(Extrapolated for the full 1,800 shot season: ~$534. Well within any production budget, saving thousands in human hours).*

### Final Word
You have the architecture, the code snippets, the cost model, and the exact weights. The grayscale expression trick and dynamic wide-shot branching will save you weeks of debugging. 

Lock the codebase. Begin the EP001 test protocol. You are ready for production.