# Appendix E — Flux 2 Protocols for Storyboard Generation

> **Source:** Gemini Deep Research (2026-01-29)
> **Original:** Obsidian `04_Resources/Flux 2 Protocols - Gemini Deep Research.md`

> **Critical:** When using Flux 2 Dev with a LoRA, set the LoRA scale to **1.0** (not 1.3). A scale of 1.3 causes generation blowout: muddy textures and destroyed faces. The `lora_registry.json` now has `flux2_scale_solo: 1.0` for engine-specific lookup. See `archive/generation_workflows.md` for full parameter cards and gotchas (archived; the LoRA pipeline was eliminated Feb 27, 2026).

---

## Model Variants

| Variant | Size / Access | Use In Pipeline |
|---------|-------|-----------------|
| **Flux 2 [klein]** | Compact/RTX | Local storyboard iteration, first/last frame pairs (what we use now) |
| **Flux 2 [dev]** | 32B open weights | LoRA training for character consistency, local development |
| **Flux 2 [max]** | 32B | Final production-grade storyboard panels |
| **Flux 2 [pro]** | Optimized API | High-volume generation with consistent pricing |
| **Flux 2 [flex]** | Tunable | Precision tasks — manual step/guidance control, typography |

**Current setup:** Klein via GGUF on Apple Silicon. Production path: [dev] for LoRA training, [max] or [pro] for final renders.

---

## Multi-Reference System (Up to 10 Images)

Flux 2 natively accepts up to 10 reference images. Allocate slots strategically:

| Slots | Purpose | What to Provide |
|-------|---------|-----------------|
| 1-5 | **Character identity** | Front, profile, three-quarter, full body, back |
| 6 | **Wardrobe/props** | Signature items (Jinx's salvage hook, debt counter) |
| 7-8 | **Environment** | Location references, architecture, color palette |
| 9 | **Lighting reference** | Photo with the desired light quality |
| 10 | **Pose/layout** | Depth maps, layout sketches, blocking guides |

**Key insight:** The model acts as a "latent compositor" — it draws features from each reference based on prompt instructions. Front-loading character refs (slots 1-5) is critical for identity consistency across shots.
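
The slot table above can be expressed as a small allocation helper. This is a minimal sketch, not the pipeline's actual code: `SLOT_RANGES`, `allocate_slots`, and the category keys are hypothetical names, and how slot numbers are actually handed to the generator is not covered here.

```python
# Slot ranges copied from the allocation table (1-indexed, inclusive).
# Category keys are hypothetical labels chosen for this sketch.
SLOT_RANGES = {
    "character": (1, 5),
    "wardrobe_props": (6, 6),
    "environment": (7, 8),
    "lighting": (9, 9),
    "pose_layout": (10, 10),
}


def allocate_slots(refs):
    """Map reference image paths to slot numbers, category by category.

    `refs` maps a category name to a list of image paths. Character
    references always land in slots 1-5, so they stay front-loaded.
    """
    unknown = set(refs) - set(SLOT_RANGES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")

    slots = {}
    for category, (first, last) in SLOT_RANGES.items():
        paths = refs.get(category, [])
        capacity = last - first + 1
        if len(paths) > capacity:
            raise ValueError(
                f"{category}: {len(paths)} images exceed {capacity} slot(s)"
            )
        for offset, path in enumerate(paths):
            slots[first + offset] = path
    return slots


slots = allocate_slots({
    "character": ["jinx_front.png", "jinx_profile.png"],
    "lighting": ["amber_strip.png"],
})
```

Unused slots are simply left out of the result, so a shot with two character references and one lighting reference occupies slots 1, 2, and 9.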

### Identity Persistence Techniques

- **Latent Panel Anchoring (LPA):** Reuse a shared reference latent across all shots to mathematically anchor character identity from Shot 1 to Shot 18.
- **Reciprocal Attention Value Mixing (RAVM):** Blends visual features between token pairs, preventing gradual identity drift across a sequence.
- **Practical implication:** Generate character reference sheets FIRST, then pass them into every frame generation. Storyboarding needs no LoRA; reserve LoRAs for multi-episode production consistency.

---

## Prompt Engineering

### Hierarchy (Front-Load What Matters)

Content words (nouns, proper nouns) have a stronger effect than modifiers. Structure prompts in this order:

1. **Main Subject** — concrete, specific entity description
2. **Key Action/Pose** — what the subject is doing
3. **Critical Style** — photographic/artistic medium
4. **Essential Context** — lighting, environment, mood

This matches our existing prompt formula:
```
[shot_type] of [subject + character visual], [action], [location], [lighting], [cinematic modifiers]
```
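
As an illustration, the formula can be filled in with a small helper. This is a sketch under assumptions: `build_prompt` and its parameter names are hypothetical and only mirror the bracketed fields of the formula.

```python
def build_prompt(shot_type, subject, action, location, lighting, modifiers):
    """Assemble a prompt in the order given by the formula.

    The subject comes first so the most important content words
    are front-loaded; empty fields are skipped.
    """
    head = f"{shot_type} of {subject}"
    parts = [head, action, location, lighting, ", ".join(modifiers)]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    shot_type="CU",
    subject="Jinx, lean woman with cropped dark hair",
    action="tracing a cryo-pod seam",
    location="corroded maintenance shaft",
    lighting="amber emergency lighting",
    modifiers=["85mm f/1.4", "shallow depth of field"],
)
```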

### Hybrid JSON + Prose Prompts (ACTIVE)

Flux 2's VLM backbone (Qwen3 for Klein, Mistral for Dev/Pro) responds better to **novelistic relationship descriptions** than to tags. The hybrid format combines structured metadata with evocative prose:

```json
{
  "scene": "A young woman crouches in the throat of a corroded maintenance shaft, her fingers tracing the seam of a cryo-pod lodged against the far wall. Emergency amber lighting catches the sweat on her temples. The air itself seems to press inward, thick with recycled humidity and the faint chemical bite of leaked coolant.",
  "subjects": [
    {
      "id": "jinx",
      "reference_index": 2,
      "description": "Late 20s, lean and angular, cropped dark hair with a streak of hydraulic grease above the left ear. Debt counter embedded in left wrist pulses amber. Salvage suit patched at the shoulders, sleeves torn to the elbow.",
      "action": "crouching, fingers tracing cryo-pod seam",
      "hair_makeup": "sweat-matted hair, grime on jaw, faint bruise on right cheekbone"
    }
  ],
  "camera": {
    "angle": "low",
    "lens": "85mm f/1.4",
    "depth_of_field": "Shallow focus on hands and pod seam, background falls to amber bokeh"
  },
  "lighting": {
    "type": "chiaroscuro",
    "source": "Single overhead amber emergency strip, reflected off wet metal surfaces",
    "color_temp": "warm amber (2700K)"
  },
  "color_palette": ["#1A1A1A", "#E65100", "#4A3728", "#FFFFFF"],
  "film_stock": "Kodak Vision3 500T",
  "mood": "Tense, claustrophobic"
}
```

**Key principles:**
- **"scene" field:** 30-80 words of novelistic prose. Describe what the camera sees as if writing a novel — atmosphere, relationships between elements, sensory details.
- **"subjects" field:** Describe characters with full visual identity + current physical state. Include `reference_index` for multi-reference slot mapping.
- **"camera" field:** Use the project's lens package — don't invent random focal lengths. Lens defaults per CONSTANTS.md.
- **"color_palette" field:** HEX codes from visual_bible.md. Flux 2 follows hex values accurately when assigned to specific objects.

**Where the data comes from:**
- `scene` → Storyboard agent builds from script action + location (from breakdown.json and visual_bible.md)
- `subjects` → Characters from breakdown.json (visual, wardrobe phase, hair/makeup state)
- `camera` → Lens package from visual_bible.md, mapped by shot type
- `lighting` → visual_bible.md lighting guides + breakdown.json location lighting notes
- `color_palette` → visual_bible.md per-location and per-character palettes
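
A minimal sketch of how these fields might be assembled and checked before generation. It is not the actual `generate_from_storyboard.py` logic: `build_hybrid_prompt` is a hypothetical helper, and the only rules it enforces are the ones stated above (a 30-80 word `scene`, and a `reference_index` on each subject, assumed here to be one of the 10 slots).

```python
import json

SCENE_MIN_WORDS, SCENE_MAX_WORDS = 30, 80  # range from the key principles
MAX_REFERENCE_SLOTS = 10


def build_hybrid_prompt(scene, subjects, camera, lighting, color_palette, **extra):
    """Validate the hybrid prompt fields and serialize them to JSON.

    `extra` carries optional keys such as film_stock or mood.
    """
    word_count = len(scene.split())
    if not SCENE_MIN_WORDS <= word_count <= SCENE_MAX_WORDS:
        raise ValueError(
            f"scene has {word_count} words, expected "
            f"{SCENE_MIN_WORDS}-{SCENE_MAX_WORDS}"
        )

    for subject in subjects:
        index = subject.get("reference_index")
        if not isinstance(index, int) or not 1 <= index <= MAX_REFERENCE_SLOTS:
            raise ValueError(f"subject {subject.get('id')!r} needs a valid reference_index")

    payload = {
        "scene": scene,
        "subjects": subjects,
        "camera": camera,
        "lighting": lighting,
        "color_palette": color_palette,
        **extra,
    }
    return json.dumps(payload, indent=2)
```

Failing early on a short `scene` or a missing `reference_index` keeps malformed prompts from reaching the generator, where the symptom would only be a vaguer image.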

### HEX Color Matching

Flux 2 follows HEX codes accurately when assigned to specific objects:
- `"strictly in color #E65100 (amber orange)"` for debt counter glow
- Bind each HEX code to a specific object rather than mentioning it vaguely: `"The counter glows in color #E65100"` not `"use #E65100 somewhere"`
- Gradients supported: `"gradient from #02eb3c to #edfa3c"`
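
The phrasing patterns above can be generated from palette data. The sketch below is illustrative only: `color_clause` and `gradient_clause` are hypothetical helpers, and the six-digit validation is an assumption based on the examples shown.

```python
import re
from typing import Optional

# Accept only six-digit HEX codes such as #E65100 (assumed format).
_HEX_RE = re.compile(r"^#[0-9A-Fa-f]{6}$")


def _check_hex(hex_code: str) -> str:
    if not _HEX_RE.match(hex_code):
        raise ValueError(f"not a six-digit HEX color: {hex_code!r}")
    return hex_code


def color_clause(obj: str, hex_code: str, name: Optional[str] = None) -> str:
    """Bind a HEX code to a specific object, optionally with a color name."""
    label = f" ({name})" if name else ""
    return f"{obj} in color {_check_hex(hex_code)}{label}"


def gradient_clause(start_hex: str, end_hex: str) -> str:
    """Describe a gradient between two HEX codes."""
    return f"gradient from {_check_hex(start_hex)} to {_check_hex(end_hex)}"


clause = color_clause("The counter glows", "#E65100", "amber orange")
```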

### Lens Package Integration

Projects codify a **lens package** in visual_bible.md — a limited set of lenses mimicking professional film production practice. The storyboard agent maps shot types to lenses automatically:

| Shot Type | Default Lens | Rationale |
|-----------|-------------|-----------|
| ECU, CU | Close-Up (85mm f/1.4, per CONSTANTS.md) | Telephoto compression, strong bokeh, subject isolation |
| MCU, MS | Primary (50mm f/2.0, per CONSTANTS.md) | Natural perspective, minimal distortion |
| LS, WIDE | Wide (24mm f/8, per CONSTANTS.md) | Deep focus, dramatic perspective, environmental context |
| POV | Primary (50mm f/2.0, per CONSTANTS.md) | Matches human perspective |
| VFX | Varies | Match the underlying visual intent |

> Lens defaults (85mm, 50mm, 24mm) per CONSTANTS.md. Override in project `visual_bible.md`.

**The lens package is a creative constraint, not a technical spec.** It ensures every shot in a project has a coherent visual language, the same way a DP selects glass before a shoot.
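
The shot-to-lens mapping in the table can be sketched as a lookup. The names `DEFAULT_LENS_PACKAGE`, `SHOT_TO_LENS`, and `lens_for_shot` are hypothetical, and the real storyboard agent may resolve lenses differently; the default values are the ones the table attributes to CONSTANTS.md.

```python
# Defaults as listed in the table (per CONSTANTS.md).
DEFAULT_LENS_PACKAGE = {
    "close_up": "85mm f/1.4",
    "primary": "50mm f/2.0",
    "wide": "24mm f/8",
}

# Shot type to lens role, following the table rows.
SHOT_TO_LENS = {
    "ECU": "close_up",
    "CU": "close_up",
    "MCU": "primary",
    "MS": "primary",
    "POV": "primary",
    "LS": "wide",
    "WIDE": "wide",
}


def lens_for_shot(shot_type, package=None, vfx_lens=None):
    """Return the lens string for a shot type.

    `package` lets a project's visual_bible.md override the defaults.
    VFX shots have no default lens, so the caller must supply one.
    """
    package = package or DEFAULT_LENS_PACKAGE
    key = shot_type.upper()
    if key == "VFX":
        if vfx_lens is None:
            raise ValueError("VFX shots need an explicit lens")
        return vfx_lens
    if key not in SHOT_TO_LENS:
        raise KeyError(f"unknown shot type: {shot_type!r}")
    return package[SHOT_TO_LENS[key]]
```

Raising on an unknown shot type, rather than falling back to a default, keeps the package a real constraint: no shot silently gets a lens outside the agreed set.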

---

## Cinematic Lexicon

### Focal Length → Visual Effect

| Lens | Effect in Flux 2 | When to Use |
|------|-------------------|-------------|
| 14-24mm | Wide-angle, dramatic perspective, deep space | Establishing shots, WIDE, environmental reveals |
| 35-50mm | Natural perspective, minimal distortion | MS, MCU, dialogue, standard coverage |
| 85-100mm | Telephoto compression, strong background separation | ECU, CU, emotional close-ups, detail shots |

> **Note:** These are general cinematography ranges for understanding lens behavior. Project-specific defaults (24mm f/8, 50mm f/2.0, 85mm f/1.4) are established in `CONSTANTS.md` and the project's `visual_bible.md` during Visual Design.

### Aperture → Depth of Field

| Aperture | Effect | When to Use |
|----------|--------|-------------|
| f/1.4-2.8 | Shallow DOF, soft bokeh, subject isolated | Character reactions, critical objects, emotional beats |
| f/8-16 | Deep focus, everything sharp | Complex blocking, architectural scenes, wide establishing |
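
For illustration, the aperture table can drive a depth-of-field phrase derived from a lens string. This is a sketch under assumptions: `depth_of_field_hint` is a hypothetical helper, lens strings are assumed to look like `"85mm f/1.4"`, and apertures between the two table ranges get no hint because the table gives no guidance for them.

```python
from typing import Optional


def f_number(lens: str) -> float:
    """Extract the f-number from a lens string such as '85mm f/1.4'."""
    return float(lens.split("f/")[1])


def depth_of_field_hint(lens: str) -> Optional[str]:
    """Return prompt wording for the lens aperture, or None if unlisted."""
    n = f_number(lens)
    if 1.4 <= n <= 2.8:
        return "shallow depth of field, soft bokeh, subject isolated"
    if 8 <= n <= 16:
        return "deep focus, everything sharp"
    return None  # the table does not cover this aperture
```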

### Lighting Setups

| Setup | Prompt Language | Narrative Use |
|-------|----------------|---------------|
| Chiaroscuro/Noir | "Rembrandt lighting", "hard directional shadows" | Mystery, conflict, tension |
| Three-Point | "key, fill, and rim light" | Neutral, polished, commercial |
| Volumetric | "light beams filtering through dust", "neon reflected on wet surfaces" | Atmosphere, physical grounding |

---

## Performance Optimization

| Technique | Effect | When to Use |
|-----------|--------|-------------|
| **FP8 Quantization** | 40% VRAM reduction, minimal quality loss | Running 32B model on 24GB GPUs |
| **Step tuning** | 6-20 steps for drafts, 30 for production | Draft thumbnails vs final renders |
| **Kontext iteration** | Applies dramatic changes one step at a time while preserving identity | Restyling, environment swaps |
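
The step-tuning row could be encoded as presets that clamp any requested step count. This is a hypothetical sketch: `STEP_PRESETS` and `steps_for` are illustrative names, and the ranges are taken directly from the table above.

```python
# (minimum, maximum) step counts per render mode, from the table.
STEP_PRESETS = {
    "draft": (6, 20),
    "production": (30, 30),
}


def steps_for(mode, requested=None):
    """Return a step count for the mode, clamped to its preset range.

    With no request, the cheapest allowed value for the mode is used.
    """
    low, high = STEP_PRESETS[mode]
    if requested is None:
        return low
    return max(low, min(high, requested))
```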

---

## Pipeline Integration Notes

### What We Use Now

1. **Hybrid JSON + prose prompts** — Storyboard agent builds novelistic scene descriptions (30-80 words) plus structured camera/lighting/color metadata. `generate_from_storyboard.py` emits the hybrid format.
2. **Lens package per project** — Established in Visual Design phase (visual_bible.md), codified in storyboard JSON. Shot types map to specific lenses.
3. **HEX color pinning** — Per-character and per-location color palettes from visual_bible.md, carried through breakdown.json to storyboard prompts.
4. **Focal length + aperture per shot** — Every shot specifies lens and aperture from the project's lens package.
5. **Cinematic lexicon** — Film stock, lighting setups, and camera language built into every prompt.
6. **Reference image slots** — Character refs (1-5: front, profile, 3/4, full body, back), props (6), environment (7-8), lighting (9), pose/layout (10).
7. **Atmospheric inference** — Agent infers implicit visual details (fog, dust, temperature, particle effects) from emotional context + location before building prompts.
8. **breakdown.json integration** — Storyboard agent reads wardrobe phases, hair/makeup states, prop ownership, and continuity data from the breakdown pipeline.

### Future Upgrades

1. **Latent Panel Anchoring (LPA)** — Implement shared reference latent across all shots in a generation run for mathematical identity consistency.
2. **Verifier agent** — Post-generation VLM check for identity drift, prop consistency. Auto-trigger regeneration on failure.
3. **Step-based workflow** — Use low-step drafts (4-6) for rapid iteration, high-step (20-30) for final renders.
4. **Sandwich workflow for video** — Two-anchor interpolation: first_frame + last_frame → WAN 2.1 fills middle frames.
5. **LoRA training pipeline** — ACTIVE. See `tools/train_lora.py`. Registry at `[project]/visual/lora_registry.json`. Character identity LoRAs for multi-episode production consistency.
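
The verifier agent from item 2 amounts to a bounded generate-and-check loop. The sketch below shows only the control flow: `generate` and `verify` are hypothetical callables standing in for the frame generator and the VLM check, neither of which is specified in this appendix.

```python
from typing import Callable, Optional, TypeVar

Frame = TypeVar("Frame")


def generate_with_verification(
    generate: Callable[[int], Frame],
    verify: Callable[[Frame], bool],
    max_attempts: int = 3,
) -> Optional[Frame]:
    """Regenerate until the verifier accepts a frame or attempts run out.

    `generate` receives the 1-based attempt number (for example, to vary
    the seed). Returns None if every attempt fails verification.
    """
    for attempt in range(1, max_attempts + 1):
        frame = generate(attempt)
        if verify(frame):
            return frame
    return None


# Stand-in callables: the third attempt is the first to pass.
result = generate_with_verification(
    generate=lambda attempt: f"frame-{attempt}",
    verify=lambda frame: frame.endswith("3"),
)
```

Capping attempts matters here because a shot that fails repeatedly usually points to a prompt or reference problem that more regeneration will not fix.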

---

*Distilled from Gemini Deep Research report. Full original in Obsidian.*
