> **ARCHIVED 2026-03-02.** LoRA pipeline eliminated Feb 27, 2026. Retained for potential future
> re-adoption of open-source identity models. For the current visual pipeline, see
> `../PRODUCTION_PIPELINE_GUIDE.md` Phase 6.

# Candidate Generation Engines

**Last updated:** 2026-02-16 (Qwen MA no-prompt default, character-driven rendering_directives from breakdown.json, ARRI Alexa lens, wardrobe preservation block)
**Applies to:** LoRA training candidate generation via `batch_threepass.py` (recommended) and `batch_generate_refs.py` (alternative)

> **See also:** `generation_workflows.md` — master reference covering ALL generation use cases (hero stills, production keyframes, video, engine selection decision tree, known gotchas).

This document is the single source of truth for engine-specific parameters, capabilities, and the hybrid pipeline architecture used to generate LoRA training candidates.

---

## Engine 1: Qwen Image Edit 2511 + Multi-Angle LoRA (fal.ai)

### Overview

Rotates a hero reference image to arbitrary camera angles using a fine-tuned multi-angle LoRA. The endpoint takes numeric angle parameters (not text prompts) and produces a re-rendered view of the subject from the specified camera position.

### API Details

| Property | Value |
|----------|-------|
| Endpoint | `fal-ai/qwen-image-edit-2511-multiple-angles` |
| Provider | fal.ai |
| Cost | ~$0.035/image at 1024x1024 |
| Speed | ~9s/image |
| Rate limit | Standard fal.ai (no special RPM cap) |

### Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `image_urls` | `[str]` | List with one image URL (the hero reference) |
| `horizontal_angle` | `int` | Camera rotation around subject. 0=front, 45=3/4 right, 90=profile right, 135=back-right, 180=back, 225=back-left, 270=profile left, 315=3/4 left |
| `vertical_angle` | `int` | Camera elevation. -30=low angle (looking up), 0=eye level, 30=elevated, 60=high angle, 90=bird's eye |
| `zoom` | `int` | Framing distance. 0=wide shot (far), 5=medium (default), 10=close-up |
| `lora_scale` | `float` | LoRA influence strength. Default: 1.0. **Pipeline uses 0.9** for stability. |
| `guidance_scale` | `float` | CFG scale. Default: 4.5 |
| `num_inference_steps` | `int` | Denoising steps. Default: 28. **Pipeline uses 40** for quality. |
| `additional_prompt` | `str` | Range: any string, default `""`. **Empty by default** — text prompts cause Qwen to regenerate content instead of rotating geometry. Opt-in via `run_qwen_angle(**kwargs)`. |
| `negative_prompt` | `str` | Range: any string, default `""`. **Empty by default** — same reason. Opt-in via kwargs. |
| `image_size` | `str` | Output size. Use `"square_hd"` for 1024x1024 |
| `num_images` | `int` | Images per call. Use 1. |
| `enable_safety_checker` | `bool` | Set to `false` for production use |

### Parameter Ranges (Confirmed from fal.ai Docs)

| Parameter | Range | Default | Our Pipeline |
|-----------|-------|---------|-------------|
| `horizontal_angle` | 0-360 | 0 | Per QWEN_ANGLE_MAP |
| `vertical_angle` | -30 to 90 | 0 | -30 (low) to 30 (high), 10 (full body) |
| `zoom` | 0-10 | 5 | 0 (wide/full body), 5 (medium), 10 (closeup) |
| `lora_scale` | 0-4 | 1.0 | 0.9 (slightly reduced for stability) |
| `guidance_scale` | 1-20 | 4.5 | 4.5 (default) |
| `num_inference_steps` | 1-50 | 28 | 40 (higher quality) |
| `additional_prompt` | str | `""` | Empty by default (opt-in via kwargs) |
| `negative_prompt` | str | `""` | Empty by default (opt-in via kwargs) |
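
The two tables above can be combined into a small argument builder. This is a minimal sketch: `build_qwen_angle_args` is a hypothetical helper (not the pipeline's `run_qwen_angle`), the ranges and pipeline values are copied from the tables, and the `fal_client.subscribe` call in the trailing comment is only one way the payload might be submitted.

```python
QWEN_MA_ENDPOINT = "fal-ai/qwen-image-edit-2511-multiple-angles"

# Inclusive (min, max) ranges, copied from the table above.
RANGES = {
    "horizontal_angle": (0, 360),
    "vertical_angle": (-30, 90),
    "zoom": (0, 10),
}


def build_qwen_angle_args(image_url, h, v, z, **overrides):
    """Return the argument dict for one Qwen Multi-Angle call (hypothetical helper)."""
    angles = {"horizontal_angle": h, "vertical_angle": v, "zoom": z}
    for name, value in angles.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
    args = {
        "image_urls": [image_url],
        **angles,
        "lora_scale": 0.9,
        "guidance_scale": 4.5,
        "num_inference_steps": 40,
        "image_size": "square_hd",
        "num_images": 1,
        "enable_safety_checker": False,
    }
    # Prompts stay out of the payload unless the caller opts in via overrides.
    args.update(overrides)
    return args


args = build_qwen_angle_args("https://example.com/hero.png", h=45, v=0, z=5)
# Possible submission, assuming the fal-client package is installed and configured:
# result = fal_client.subscribe(QWEN_MA_ENDPOINT, arguments=args)
```

Leaving `additional_prompt` and `negative_prompt` out of the default payload mirrors the no-prompt default described in the table; a caller who wants them passes them as keyword overrides.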

### Angle Reference Table

| Name | h | v | z | Description |
|------|---|---|---|-------------|
| Front eye medium | 0 | 0 | 5 | Standard front-facing, eye level |
| 3/4 right eye medium | 45 | 0 | 5 | Three-quarter right |
| Profile right eye medium | 90 | 0 | 5 | Right profile, 90-degree side |
| Back-right quarter | 135 | 0 | 5 | Between profile and back |
| Back eye medium | 180 | 0 | 5 | Full rear view |
| Back-left quarter | 225 | 0 | 5 | Between back and left profile |
| Profile left eye medium | 270 | 0 | 5 | Left profile, 90-degree side |
| 3/4 left eye medium | 315 | 0 | 5 | Three-quarter left |
| Front low medium | 0 | -30 | 5 | Low angle looking up |
| Front elevated wide | 0 | 30 | 0 | High angle, full body |
| Front eye close-up | 0 | 0 | 10 | Face close-up |
| 3/4 right close-up | 45 | 0 | 10 | Three-quarter close-up |
| Full body front | 0 | 10 | 0 | Full body head to feet, slight elevation |
| Full body 3/4 | 45 | 10 | 0 | Full body head to feet, three-quarter |

### Strengths

- Accurate angle geometry — reliably produces the requested camera position
- Low/high angles work correctly (Gemini cannot do these)
- Back views work (face correctly hidden)
- Fast and cheap — 10 images in ~90s for ~$0.35
- Deterministic numeric control (not prompt-dependent)

### Weaknesses

- Gaussian splat artifacts visible in fine detail (soft edges, minor 3D reconstruction noise)
- No wardrobe variation — outputs the same outfit as the input hero
- No background variation — backgrounds tend to match or simplify from input
- Identity smoothing — fine skin texture can be averaged out
- No expression control — expression is inherited from input or defaults to neutral

### Best For

- Angle geometry coverage in LoRA training data (front, profiles, 3/4s, back, low, high, close-ups)
- Location/environment multi-angle views (generate different camera angles of the same room)
- Quick angle rotation tests

---

## Engine 2: Gemini 2.5 Flash Image (Google)

### Overview

Google's image generation model accessed via the Gemini API. Takes text prompts + optional reference images and generates new images. Supports up to 3 input reference images per call.

### API Details

| Property | Value |
|----------|-------|
| Model | `gemini-2.5-flash-image` |
| Provider | Google (via `google-genai` SDK) |
| Cost | ~$0.039/image at 1024x1024 |
| Speed | ~60s/image |
| Rate limit | 15 RPM (4s minimum between calls; a 4.5s delay leaves headroom) |
| Max input images | 3 per call |
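
The 15 RPM cap works out to one call every 4 seconds. A minimal sketch of a spacing throttle follows; `MinIntervalThrottle` is a hypothetical helper, not part of the pipeline or the `google-genai` SDK, and the injectable `clock` and `sleep` arguments exist only to make it testable.

```python
import time


class MinIntervalThrottle:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


RPM_LIMIT = 15
MIN_SPACING = 60 / RPM_LIMIT          # 4.0 seconds between calls at 15 RPM
throttle = MinIntervalThrottle(4.5)   # the 4.5s delay from the table above
```

Calling `throttle.wait()` immediately before each request keeps consecutive calls at least 4.5 seconds apart.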

### Parameters

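```python
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

# `contents` is built by the caller: List[Content] with image Parts + text Part.
response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=contents,
    config=types.GenerateContentConfig(
        response_modalities=["IMAGE", "TEXT"],
        image_config=types.ImageConfig(
            aspect_ratio="1:1",  # or "9:16", "16:9"
        ),
    ),
)
```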

### Strengths

- Excellent wardrobe variation — can change clothing convincingly from a reference
- Natural expression diversity — generates varied, believable expressions
- Background variation — can place characters in described environments
- Fine facial detail — skin texture, pores, hair strands (when angles are easy)
- Multi-reference support — 2-3 keystones for identity anchoring

### Weaknesses

- **Cannot do low/high angles** — ignores camera elevation instructions, always produces roughly eye-level views
- **Cannot do back views** — will not hide the face; always shows a front or 3/4 view
- Slow — ~60s per image (6x slower than Qwen)
- Rate limited — 15 RPM means minimum 4s between calls
- Can produce white backgrounds when prompted for "studio" (avoid this for LoRA prep)

### Best For

- Diversity dimensions: wardrobe, expression, lighting, background variation
- Eye-level and moderate angles: front, 3/4, profile, close-up
- Keystone-conditioned candidate generation (2-3 reference images per call)

### Prompting Notes for LoRA Prep

- **Never say "white background"** in LoRA prep prompts — describe environments narratively
- **Never say "studio lighting"** — describe specific lighting conditions
- Include environment description from breakdown.json locations
- Include wardrobe description from breakdown.json wardrobe states
- The keystone identity instruction should say "Change ONLY the camera angle, expression, lighting, and environment"

---

## Engine 3: NBP / Gemini 3 Pro Image Preview (Google)

### Overview

Google's highest-quality image model. Best identity preservation, expression control, and skin detail of all tested engines. Used as the quality polish pass in the three-pass pipeline.

### API Details

| Property | Value |
|----------|-------|
| Model | `gemini-3-pro-image-preview` |
| Provider | Google (via `google-genai` SDK) |
| Cost | ~$0.134/image (3.4x Flash, confirmed) |
| Speed | ~20-40s/image (confirmed) |
| Max input images | 5 per call |

### Parameters

Same API as Gemini Flash (Engine 2) — uses `google-genai` SDK with `response_modalities=["IMAGE", "TEXT"]`.

### Strengths

- **Best identity preservation** of all tested engines — facial geometry, bone structure, skin tone maintained from reference
- **Best expression control** — reliably produces requested expressions (exhausted, furious, resolved, etc.)
- **Angle preservation from reference** — doesn't rotate the face when given a correctly-angled input
- **Cleans up Qwen splat artifacts** — removes Gaussian splat noise from Pass 1/2 outputs
- **Superior skin detail** — with proper prompting, produces visible pores, freckles, natural imperfections
- Supports 4K output resolution
- Multi-subject identity consistency (up to 5 reference people)

### Weaknesses

- **May change wardrobe** from reference — controllable with explicit "keep the same wardrobe" prompting
- **3.4x Flash cost** — $0.134 vs $0.039 per image
- Cannot reliably do back views (same as Flash)
- May ignore low/high angle instructions when generating from scratch

### Best For

- **Pass 2 in three-pass pipeline:** Background swap + expression control on Qwen MA outputs
- Quality upgrade when Flash output is insufficient
- Multi-character scenes needing identity consistency

### Surface Texture System (Character-Driven)

The pipeline uses **character-specific** surface texture prompting in **NBP Pass 2 only**, driven by `rendering_directives` in `breakdown.json`.

**Qwen Pass 1 — NO text prompts (Feb 16, 2026):**

Qwen MA is a pure geometric rotation model. It receives only angle parameters (`horizontal_angle`, `vertical_angle`, `zoom`) and the input image. No `additional_prompt` or `negative_prompt` is sent by default. Text prompts cause Qwen to regenerate content (distorted heads, wrong proportions) instead of cleanly rotating geometry. Prompts are available as opt-in kwargs for future use.

**NBP Pass 2 — SURFACE TEXTURE block (breakdown-driven):**

Each character in `breakdown.json` has a `rendering_directives` object:

```json
"JINX": {
  "rendering_directives": {
    "texture_prompt": "Photograph on Kodak Portra 400. Real human skin — visible pores...",
    "texture_negative": "smooth plastic skin, airbrushed, wax figure..."
  }
}
```

```json
"KIAN": {
  "rendering_directives": {
    "identity_type": "non_human",
    "texture_prompt": "Face has realistic skin — visible pores... Body below the neck is military combat chassis — dark gray-blue alloy plating with brushed metal grain, tool marks, forge patina...",
    "texture_negative": "smooth plastic face, airbrushed face, wax figure, CGI render, cartoon robot, toy-like metal, glowing chest, blue chest light"
  }
}
```

```json
"VAREK": {
  "rendering_directives": {
    "texture_prompt": "Photograph on Kodak Portra 400. Human skin — visible pores, predator tension lines... Chrome uniform elements with mirror-polished reflective highlights...",
    "texture_negative": "smooth plastic skin, airbrushed, wax figure, CGI render, pristine unblemished chrome"
  }
}
```

**Why character-driven:** The old hardcoded skin prompt assumed every character was a regular human with pores, freckles, and peach fuzz. Kian is a military android with synthetic skin over metallic alloy — pore/freckle prompts don't apply to his body. Future projects may have aliens, cyborgs, plant creatures. The `rendering_directives` field lets each character define its own surface texture language.

**Fallback:** Characters without `rendering_directives` get a default human skin texture prompt (Kodak Portra 400, visible pores, editorial look). The shared `config_loader.load_rendering_directives()` function handles the lookup and fallback for all tools.
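
The lookup-and-fallback behaviour might look roughly like this. `resolve_directives` is a hypothetical stand-in: the real `config_loader.load_rendering_directives()` signature is not shown in this document, the default strings are abbreviated placeholders, and both the `characters` mapping and the per-field merge over defaults are assumptions.

```python
# Placeholder defaults; the pipeline's actual default prompt text is not reproduced here.
DEFAULT_DIRECTIVES = {
    "identity_type": "human",
    "texture_prompt": "Photograph on Kodak Portra 400. Real human skin, visible pores.",
    "texture_negative": "smooth plastic skin, airbrushed, wax figure",
}


def resolve_directives(characters, name):
    """Merge a character's rendering_directives over the human defaults."""
    entry = characters.get(name, {})
    directives = entry.get("rendering_directives") or {}
    return {**DEFAULT_DIRECTIVES, **directives}


# Assumed shape: character name -> breakdown entry.
characters = {
    "KIAN": {
        "rendering_directives": {
            "identity_type": "non_human",
            "texture_prompt": "Body below the neck is military combat chassis.",
            "texture_negative": "cartoon robot, toy-like metal",
        }
    },
    "EXTRA": {},  # no directives: falls back to the human default
}

kian = resolve_directives(characters, "KIAN")
extra = resolve_directives(characters, "EXTRA")
```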

**Note:** Qwen steps set to 40 (vs default 28) for higher fidelity source material. All surface texture work happens in NBP Pass 2.

### Config Unification (Feb 16, 2026)

All candidate generation tools now read rendering values from shared config sources instead of hardcoding them. The shared module `lib/config_loader.py` eliminates duplicated defaults.

**What reads from where:**

| Value | Source | Key |
|-------|--------|-----|
| Camera body | `project_config.json` | `camera_body` |
| Face lens | `project_config.json` | `candidate_lenses.face` |
| Body lens | `project_config.json` | `candidate_lenses.body` |
| Close-up lens | `project_config.json` | `candidate_lenses.close_up` |
| Quality guard | `project_config.json` | `quality_guard` |
| Negative prompt | `project_config.json` | `negative_prompt` |
| Identity type | `breakdown.json` | `rendering_directives.identity_type` |
| Surface texture | `breakdown.json` | `rendering_directives.texture_prompt` |
| Texture negative | `breakdown.json` | `rendering_directives.texture_negative` |
| Mandatory traits | `breakdown.json` | `rendering_directives.mandatory_traits` |

**Tools using shared config:**
- `engine_shootout.py` — reads camera/lens/quality/texture/traits from config
- `batch_generate_refs.py` — reads lens/texture from config
- `prompt_compiler.py` — reads all project config fields from shared defaults
- `batch_threepass.py` — calls engine_shootout.py as subprocess (inherits automatically)

### Rendering Directives Schema (4 Fields)

Each character in `breakdown.json` can have a `rendering_directives` object with four fields:

```json
"rendering_directives": {
  "identity_type": "human",
  "texture_prompt": "Character-specific surface texture instructions",
  "texture_negative": "Character-specific negative texture instructions",
  "mandatory_traits": "Character-specific visual markers that MUST appear"
}
```

- **`identity_type`** — `"human"` (default) or `"non_human"`. Controls which identity lock template NBP uses. Human characters get the full anatomical lock (skull structure, brow ridge, etc.). Non-human characters get a softer lock that drops skull/brow cues and adds explicit anti-humanization ("Do NOT infer a bare human head"). Defined in `config_loader.py` as `IDENTITY_LOCK_HUMAN` / `IDENTITY_LOCK_NON_HUMAN`.
- **`texture_prompt`** — Surface texture language (skin, metal, synthetic, etc.)
- **`texture_negative`** — What to avoid (plastic, airbrushed, etc.)
- **`mandatory_traits`** — Visual markers unique to this character (Kian's helmet, Jinx's wrist device, Varek's throat scar). Auto-injected as "MANDATORY CHARACTER TRAITS" in NBP prompts. Falls back to `visual_description` if not set.

**Why identity_type exists:** The default identity lock uses "skull structure, brow ridge" to anchor facial identity. These human anatomy cues cause NBP to strip helmets and chassis from non-human characters — it reads the anatomy terms and infers a bare human head. Non-human characters need a different lock that preserves identity without triggering humanization.

**Why `mandatory_traits` is separate from the identity lock:** The identity lock covers universal facial proportions. Character-specific markers like hair color, helmets, scars, and glowing eyes are NOT universal, so they belong in `mandatory_traits` per character.
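
As a rough illustration of how the two fields could be consumed, the sketch below selects a lock template by `identity_type` and resolves traits with the `visual_description` fallback. The function names are hypothetical, the template strings are placeholders for the real `IDENTITY_LOCK_HUMAN` / `IDENTITY_LOCK_NON_HUMAN` text, and placing `visual_description` at the character level is an assumption.

```python
# Placeholders for the templates defined in config_loader.py.
IDENTITY_LOCKS = {
    "human": "<human lock: full anatomical anchors>",
    "non_human": "<non-human lock: no skull/brow cues, anti-humanization>",
}


def select_identity_lock(directives):
    """Pick the lock template by identity_type, defaulting to "human"."""
    identity_type = directives.get("identity_type", "human")
    if identity_type not in IDENTITY_LOCKS:
        raise ValueError(f"unknown identity_type: {identity_type!r}")
    return IDENTITY_LOCKS[identity_type]


def resolve_traits(character):
    """Use mandatory_traits when set, otherwise fall back to visual_description."""
    directives = character.get("rendering_directives", {})
    return directives.get("mandatory_traits") or character.get("visual_description", "")
```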

### Candidate Lenses (project_config.json)

```json
"candidate_lenses": {
  "face": "85mm f/1.8 prime",
  "body": "35mm f/2.8 prime",
  "close_up": "85mm f/1.4 prime"
}
```

These are intentionally different from the visual bible lens package — candidate gen lenses were tuned for identity preservation, not cinematic shot composition.

### Identity Lock Prompt (Three-Pass Mode, Feb 16, 2026)

When NBP is used as Pass 2, the prompt handles background swap, expression control, and quality in a single pass:

1. **Lead with identity lock:** "face identity locked — DO NOT generate a new face"
2. **Scope includes background swap + expression:** prompt handles environment change, expression control, and quality refinement together (NOT scoped to enhancement only)
3. **Camera body + lens switching (config-driven):** Camera body and lenses read from `project_config.json` `candidate_lenses` field. Default: 85mm f/1.8 prime for face/medium angles, 35mm f/2.8 prime for body angles.
4. **Wardrobe preservation block:** Explicit instruction to maintain exact wardrobe, armor, and accessories from input. Material-aware: metal shows tool marks, leather shows creasing, fabric shows weave pattern.
5. **Breakdown-driven surface texture:** `rendering_directives.texture_prompt` from `breakdown.json` replaces the old hardcoded skin block. Each character gets appropriate texture language (human skin for Jinx/Varek, synthetic-over-alloy for Kian). Fallback to human default for characters without directives.
6. **Two-tier permanent/expression system:** Permanent identity features are locked, expression muscles are freed. The identity lock text is selected by `identity_type` from `rendering_directives`: `"human"` gets full anatomical anchors (skull, brow ridge, etc.), `"non_human"` drops skull/brow cues and adds anti-humanization. Both types share: nose bridge, nose width, cheekbones, chin, eye spacing/size, iris color, skin tone, skin texture. Character-specific markers (hair, helmets, scars, eyes) go in `mandatory_traits`.
7. **Single photo constraint:** "Single photorealistic photograph. One person only. No text. No split panels."
8. **Mandatory character traits:** Traits from `breakdown.json` `rendering_directives.mandatory_traits` (falls back to `visual_description`) auto-injected as "MANDATORY CHARACTER TRAITS — these MUST be visible." Auto-loaded from config — no need to pass `--character-traits` manually. CLI override still supported.
9. **Body proportionality:** For body angles, explicit anatomical ratio instructions (head = 1/7.5 body height).

**Prompt assembly order:**
1. Identity lock
2. Environment / expression / angle / lighting
3. Identity feature list (permanent vs expression)
4. Character traits (from breakdown.json)
5. Proportion block (body angles only)
6. Wardrobe preservation block (all angles)
7. Camera/framing lock
8. ARRI Alexa lens block
9. Surface texture block (from rendering_directives)
10. Single photo constraint
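
The assembly order above can be sketched as a fixed key list with one conditional block. The key names, the blank-line separator, and `assemble_prompt` itself are illustrative assumptions; only the ordering and the body-angles-only rule come from this document.

```python
BODY_ANGLES = {"low_angle", "high_angle", "full_body", "full_body_three_quarter"}

BLOCK_ORDER = [
    "identity_lock",
    "scene",              # environment / expression / angle / lighting
    "identity_features",  # permanent vs expression feature list
    "character_traits",
    "proportion",         # body angles only
    "wardrobe",
    "camera_framing",
    "lens",
    "surface_texture",
    "single_photo",
]


def assemble_prompt(blocks, angle):
    """Join prompt blocks in the documented order, skipping empty ones."""
    parts = []
    for key in BLOCK_ORDER:
        if key == "proportion" and angle not in BODY_ANGLES:
            continue
        text = blocks.get(key, "").strip()
        if text:
            parts.append(text)
    return "\n\n".join(parts)
```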

### Back-Angle Limitation

NBP rotates the subject's head toward camera when given emotional expression prompts on back/near-back angles. This is a fundamental model behavior, not a prompt issue. For back angles (back, back_left, back_right):
- Use neutral expression only
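- Skip the NBP pass entirely (`--skip-pass3`); the SeedVR2 output is the final candidate. The flag name appears to follow a pass numbering in which NBP runs third, whereas the routing diagram under Pipeline Architecture labels NBP as Pass 2 and SeedVR2 as Pass 3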
- The 2-pass result (Qwen MA → SeedVR2) is sufficient for back/silhouette LoRA training data

### Current Status

**ACTIVE** — integrated in `engine_shootout.py` (standalone and `--threepass` mode). Single-input pipeline (no dual-reference), SeedVR2 quality pass added. Pipeline redesign Feb 15, 2026.

---

## Engine 4: Qwen Edit 2511 — Standard (fal.ai)

### Overview

Standard Qwen image editing model. Takes a reference image + text instruction and applies edits while preserving identity, pose, and angle. Dual-pathway architecture: one path encodes the image, another encodes the text instruction.

### API Details

| Property | Value |
|----------|-------|
| Endpoint | `fal-ai/qwen-image-edit-2511` |
| Provider | fal.ai |
| Cost | ~$0.031/image (confirmed) |
| Speed | ~27s/image (confirmed) |
| Rate limit | Standard fal.ai |

### Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `image_urls` | `[str]` | List with one image URL (the reference to edit) |
| `prompt` | `str` | Text instruction describing the edit |
| `negative_prompt` | `str` | Negative prompt to steer away from quality degradation |
| `image_size` | `str` | Output size. Use `"square_hd"` for 1024x1024 |
| `num_inference_steps` | `int` | Denoising steps. Default: 28. **Use 45 for identity preservation.** |
| `guidance_scale` | `float` | Maps to underlying `true_cfg_scale`. Default: 4.5. **Use 3.5 for identity preservation.** |
| `acceleration` | `str` | `"high"`, `"regular"` (default), or `"none"`. **Use `"none"` for full quality.** |
| `output_format` | `str` | `"jpeg"` (default) or `"png"`. **Use `"png"` for lossless pipeline handoff.** |
| `num_images` | `int` | Images per call. Use 1 (or 2-3 and pick best for critical shots). |

### Identity Preservation Settings (Feb 14, 2026)

The fal.ai defaults (guidance 4.5, steps 28, regular acceleration) are speed-optimized and lose facial detail in pipeline use. The official Qwen HuggingFace demo uses guidance 4.0 and 40-50 steps. Our tested optimized settings:

| Parameter | Default | Optimized | Effect |
|-----------|---------|-----------|--------|
| `guidance_scale` | 4.5 | **3.5** | Lower = less aggressive editing, better identity lock |
| `num_inference_steps` | 28 | **45** | More passes to resolve fine detail (pores, iris, hair) |
| `acceleration` | `"regular"` | **`"none"`** | No step-skipping shortcuts |
| `negative_prompt` | *(none)* | See below | Steers away from facial degradation |
| `output_format` | `"jpeg"` | **`"png"`** | Lossless for pipeline handoff |

**Recommended negative prompt:** `"blurry face, distorted features, deformed eyes, asymmetric face, smooth skin, plastic look, low quality, artifacts"`

**Important:** No `strength`/`denoising_strength` parameter exists for this model. It is NOT traditional img2img. The only levers for controlling change magnitude are `guidance_scale`, steps, acceleration, and prompt specificity.
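
Collected into a request payload, the optimized settings might look like this. `build_qwen_edit_args` is a hypothetical helper; the values are the ones listed in the table and the recommended negative prompt above.

```python
QWEN_EDIT_ENDPOINT = "fal-ai/qwen-image-edit-2511"

NEGATIVE_PROMPT = (
    "blurry face, distorted features, deformed eyes, asymmetric face, "
    "smooth skin, plastic look, low quality, artifacts"
)


def build_qwen_edit_args(image_url, prompt):
    """Return identity-preserving arguments for one Qwen Edit call (hypothetical helper)."""
    return {
        "image_urls": [image_url],      # plural key, single-element list
        "prompt": prompt,
        "negative_prompt": NEGATIVE_PROMPT,
        "guidance_scale": 3.5,          # below the 4.5 default
        "num_inference_steps": 45,      # above the 28 default
        "acceleration": "none",
        "output_format": "png",
        "image_size": "square_hd",
        "num_images": 1,
    }


edit_args = build_qwen_edit_args(
    "https://example.com/front.png",
    "Change only the background environment to a dim corridor.",
)
```

Note that there is deliberately no `strength` key: change magnitude is steered only through `guidance_scale`, steps, acceleration, and the prompt wording.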

### Prompt Engineering for Background Swaps

Identity-anchoring prompts produce significantly better results than generic instructions:

```
# Good: explicit preservation + narrow scope
"Change only the background environment to [TARGET]. Preserve the person's exact
facial features, skin tone, skin texture, hair color, hair texture, expression,
and clothing unchanged. Keep the exact same camera angle, framing, and pose.
Keep all foreground elements identical. Only replace the background."

# Bad: vague, invites full re-generation
"Place the person in a dark alley at night"
```

**Key patterns:**
- Explicitly state what to preserve (face, expression, hair, clothing, pose, angle)
- Scope the edit narrowly ("change only the background")
- Use "change/replace/swap" not "transform/reimagine/create"

### Strengths

- **Excellent environment swap** — changes backgrounds while preserving identity, pose, and camera angle
- Preserves the input image's angle geometry — doesn't rotate the subject
- Dual-pathway architecture: image encoder + text instruction encoder work independently
- Cheap (~$0.031/image)

### Weaknesses

- **Cannot change camera angles** — if the input is profile_left, the output stays profile_left
- **Cannot reliably change expressions** — expression changes are inconsistent
- Cannot add fine skin detail (inherits Qwen's smoothing tendency)
- **No strength parameter** — cannot dial down how much the image changes (unlike SD img2img)

### Best For

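- Environment/background swap on Qwen Multi-Angle outputs while preserving the angle geometry. In the recommended three-pass pipeline NBP now performs the background swap as Pass 2, and the Quick Reference lists this engine as legacy.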

### LoRA Variant

`fal-ai/qwen-image-edit-2511/edit_with_lora` supports up to 3 LoRAs with scale 0-4. Applying a character LoRA at low scale (0.2-0.4) during Pass 2 could reinforce identity. This has not yet been tested in the Recoil pipeline.

### Current Status

**ACTIVE** — integrated in `engine_shootout.py`. API uses `image_urls` (plural, array). Optimized parameters tested Feb 14, 2026.

---

## Engine 5: Z-Image Turbo + LoRA (fal.ai)

### Overview

Fast, cheap img2img model with optional character LoRA. Uses the character's trained LoRA weights to maintain identity.

### API Details

| Property | Value |
|----------|-------|
| Endpoint | `fal-ai/z-image/turbo/image-to-image` (without LoRA) or `fal-ai/z-image/turbo/image-to-image/lora` (with LoRA) |
| Provider | fal.ai |
| Cost | ~$0.009/image |
| Speed | ~5-8s/image |

### Strengths

- Fastest engine (~5s/image)
- Cheapest engine (~$0.009/image)
- Good identity consistency when using character LoRA

### Weaknesses

- **Circular dependency for candidate generation:** Using an existing LoRA to generate LoRA training data amplifies the LoRA's learned biases rather than providing new visual information. NOT suitable for candidate generation.
- Without LoRA, identity preservation is poor

### Best For

- **Downstream production** with a trained LoRA (keyframe generation, previz, fast sketches)
- **NOT for candidate generation** — use Qwen/NBP pipeline instead

### Current Status

Available in `engine_shootout.py` but **removed from default engines**. Useful for production workflows, not for LoRA training data generation.

---

## Pipeline Architecture

### Why Multi-Engine

No single engine covers all LoRA training needs:
- Qwen Multi-Angle handles angle geometry (including low/high/back) but has no diversity or expression control
- Qwen Edit swaps environments but can't change angles or expressions
- NBP has the best quality and expression control but needs correct angle/environment input
- Gemini Flash has diversity but can't hold challenging angles

### Mode: Three-Pass Sequential (Recommended)

**Tool:** `engine_shootout.py --threepass` (single angle) or `batch_threepass.py` (full training set with smart defaults)
**Reviewer:** `http://127.0.0.1:8420/shootout_reviewer.html?project=<name>&character=<CHAR>` — browse and compare all shootout runs, mark winners, add notes

```
Hero Image
  → Pass 1: Qwen Multi-Angle (14 angles, 40 steps)
  │   └── Pure geometric rotation — NO text prompts
  │   └── Angle params only: horizontal_angle, vertical_angle, zoom
  │
  → Smart routing per angle type:

Face angles (front, closeup_front, closeup_three_quarter, three_quarter_right, three_quarter_left):
  → Pass 1 (Qwen MA) → Pass 2 (NBP) — SKIP SeedVR2 to preserve skin texture
  │   NBP: ARRI Alexa LF + 85mm f/1.8, character-driven texture (rendering_directives), wardrobe lock
  │   Expression angles (front, CU-F, CU-3/4): 5 expressions
  │   Mild angles (3/4R, 3/4L): 3 expressions

Body angles (low_angle, high_angle, full_body, full_body_three_quarter):
  → Pass 1 (Qwen MA) → Pass 2 (NBP) → Pass 3 (SeedVR2) — full 3-pass
  │   NBP: ARRI Alexa LF + 35mm f/2.8, proportionality + character-driven texture + wardrobe lock
  │   Neutral expression only

Profile angles (profile_right, profile_left):
  → Pass 1 (Qwen MA) → Pass 3 (SeedVR2) — no NBP
  │   Neutral expression only

Back angles (back, back_left, back_right):
  → Pass 1 (Qwen MA) → Pass 3 (SeedVR2) — no NBP
  │   Neutral expression only (avoids NBP head rotation on back views)
```

**Cost:** ~$0.036-0.17/angle (~$2.90 for default 14 angles × smart expressions)
**Time:** ~26-56s/angle (~28 min for default 14 angles)
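
The per-angle cost range follows from the per-image prices in the Quick Reference table. A small sketch of that arithmetic, with route names chosen here for illustration:

```python
# Per-image prices in USD, taken from the Quick Reference table.
COST = {"qwen_ma": 0.035, "nbp": 0.134, "seedvr2": 0.001}

# Engines run per route, following the routing diagram above.
ROUTES = {
    "face": ["qwen_ma", "nbp"],
    "body": ["qwen_ma", "nbp", "seedvr2"],
    "profile": ["qwen_ma", "seedvr2"],
    "back": ["qwen_ma", "seedvr2"],
}


def route_cost(route):
    """Sum the per-image cost of every engine on a route."""
    return round(sum(COST[engine] for engine in ROUTES[route]), 3)


costs = {route: route_cost(route) for route in ROUTES}
```

Profile and back routes land at the low end (Qwen MA plus SeedVR2) and body routes at the high end (all three engines).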

**Advantages:**
- Each engine does only what it excels at
- NBP handles background + expression + identity lock in a single pass
- Expression control happens at the final stage (best results)
- Accurate angle geometry preserved through all passes
- Skin detail far superior to any two-pass approach
- Environment rotation from `breakdown.json` habitat zones prevents background overfitting — characters appear in show-accurate locations, not generic stock environments
- Smart back-angle handling avoids NBP head rotation artifacts

### Batch Generation (Full Training Set)

`batch_threepass.py` orchestrates the full matrix run with smart defaults:

```bash
python3 batch_threepass.py leviathan/ --character JINX --dry-run    # preview all jobs
python3 batch_threepass.py leviathan/ --character JINX              # full run
```

**Smart defaults:**
- 30 jobs (smart expression distribution: 15 frontal + 6 mild + 4 body + 2 profile + 3 back)
- 5-tier routing system:
  - **Expression angles** (front, closeup_front, closeup_three_quarter): 5 expressions each → Pass 1 → Pass 2 (NBP) — skip SeedVR2 (skin texture priority)
  - **Mild expression angles** (three_quarter_right, three_quarter_left): 3 expressions each → Pass 1 → Pass 2 (NBP) — skip SeedVR2 (skin texture priority)
  - **Body angles** (low_angle, high_angle, full_body, full_body_three_quarter): neutral only → Pass 1 → Pass 2 (NBP, proportionality) → Pass 3 (SeedVR2, armor/clothing detail)
  - **Profile angles** (profile_right, profile_left): neutral only → Pass 1 → Pass 3 (SeedVR2), no NBP
  - **Back angles** (back, back_left, back_right): neutral only → Pass 1 → Pass 3 (SeedVR2), no NBP
- Environments rotate from `breakdown.json` habitat zones (visual_dna field). Falls back to generic pool if no breakdown exists.
- **Prerequisite:** `/breakdown` must run before candidate generation to populate habitat zones.
- Flags: `--no-smart-back`, `--no-env-rotation` to override
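
The tier list above expands to the 30-job total as follows. This is a sketch of the counting only; the `TIERS` structure and `plan_jobs` are hypothetical and do not reflect how `batch_threepass.py` is actually written.

```python
TIERS = {
    "expression": {
        "angles": ["front", "closeup_front", "closeup_three_quarter"],
        "expressions": 5,
        "passes": ["qwen_ma", "nbp"],
    },
    "mild": {
        "angles": ["three_quarter_right", "three_quarter_left"],
        "expressions": 3,
        "passes": ["qwen_ma", "nbp"],
    },
    "body": {
        "angles": ["low_angle", "high_angle", "full_body", "full_body_three_quarter"],
        "expressions": 1,  # neutral only
        "passes": ["qwen_ma", "nbp", "seedvr2"],
    },
    "profile": {
        "angles": ["profile_right", "profile_left"],
        "expressions": 1,
        "passes": ["qwen_ma", "seedvr2"],
    },
    "back": {
        "angles": ["back", "back_left", "back_right"],
        "expressions": 1,
        "passes": ["qwen_ma", "seedvr2"],
    },
}


def plan_jobs():
    """Expand every tier into one job per (angle, expression slot)."""
    jobs = []
    for tier, spec in TIERS.items():
        for angle in spec["angles"]:
            for slot in range(spec["expressions"]):
                jobs.append({"tier": tier, "angle": angle,
                             "expression_slot": slot, "passes": spec["passes"]})
    return jobs


jobs = plan_jobs()
```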

**Qwen angle set (14 images, Pass 1):**

| # | Name | h | v | z |
|---|------|---|---|---|
| 1 | front | 0 | 0 | 5 |
| 2 | 3/4 right | 45 | 0 | 5 |
| 3 | profile right | 90 | 0 | 5 |
| 4 | back right | 135 | 0 | 5 |
| 5 | back | 180 | 0 | 5 |
| 6 | back left | 225 | 0 | 5 |
| 7 | profile left | 270 | 0 | 5 |
| 8 | 3/4 left | 315 | 0 | 5 |
| 9 | low angle | 0 | -30 | 5 |
| 10 | high angle | 0 | 30 | 0 |
| 11 | close-up front | 0 | 0 | 10 |
| 12 | close-up 3/4 | 45 | 0 | 10 |
| 13 | full body | 0 | 10 | 0 |
| 14 | full body 3/4 | 45 | 10 | 0 |

### Mode: Parallel (Legacy)

**Flag:** `--hybrid parallel` (or just `--hybrid`)

```
Hero Image
    ├── Qwen Pass (10-12 images) — angle coverage
    ├── Gemini Flash Pass (15-20 images) — diversity coverage
    └── Merged candidate pool → lora_picker
```

**Issues found (Feb 14, 2026):** Gemini Flash can't hold angles — back_left inputs produce frontal outputs. Expressions weak across the board.

### Mode: Two-Pass (Legacy)

**Flag:** `--hybrid twopass`

```
Hero Image → Qwen angles → Gemini Flash variations per angle
```

**Issues found (Feb 14, 2026):** Compounds Qwen splat + Gemini smoothing artifacts. Identity drifts through two model hops.

### Manifest Format

Both hybrid modes (parallel and two-pass) produce a unified `manifest.json` in the candidates directory:

```json
{
  "version": 1,
  "character": "JINX",
  "target_model": "z_image",
  "generated_at": "2026-02-13T...",
  "total_candidates": 30,
  "hybrid_mode": "parallel",
  "engines": {
    "qwen": {"count": 12, "model": "fal-ai/qwen-image-edit-2511-multiple-angles"},
    "gemini": {"count": 18, "model": "gemini-2.5-flash-image"}
  },
  "candidates": [
    {
      "filename": "qwen_000_front.png",
      "engine": "qwen",
      "angle": "front",
      "h": 0, "v": 0, "z": 5,
      "status": "pending"
    },
    {
      "filename": "gemini_000_front_SALVAGER_neutral.png",
      "engine": "gemini",
      "angle": "front",
      "wardrobe": "lower_deck_salvager",
      "expression": "neutral",
      "lighting": "dramatic_side",
      "location": "INT. LEVIATHAN - LOWER DECK CORRIDOR",
      "status": "pending"
    }
  ]
}
```
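
A quick consistency check on a manifest could compare the header counts against the `candidates` list. `check_manifest` is a hypothetical helper, not an existing pipeline tool; the sample below is a cut-down manifest with two entries, since the example above is abbreviated and its header counts would not match the two candidates shown.

```python
import json
from collections import Counter


def check_manifest(manifest):
    """Return a list of mismatches between header counts and listed candidates."""
    problems = []
    candidates = manifest.get("candidates", [])
    total = manifest.get("total_candidates")
    if total != len(candidates):
        problems.append(f"total_candidates={total} but {len(candidates)} candidates listed")
    per_engine = Counter(c["engine"] for c in candidates)
    for engine, info in manifest.get("engines", {}).items():
        if info["count"] != per_engine[engine]:
            problems.append(f"{engine}: header count {info['count']}, found {per_engine[engine]}")
    return problems


sample = json.loads("""
{
  "version": 1,
  "total_candidates": 2,
  "engines": {"qwen": {"count": 1}, "gemini": {"count": 1}},
  "candidates": [
    {"filename": "qwen_000_front.png", "engine": "qwen", "status": "pending"},
    {"filename": "gemini_000_front_SALVAGER_neutral.png", "engine": "gemini", "status": "pending"}
  ]
}
""")
problems = check_manifest(sample)
```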

### Caption Strategy Differences

| Property | Qwen candidates | Gemini candidates |
|----------|----------------|-------------------|
| Angle | Caption the angle (known precisely from numeric params) | Caption the angle (from prompt, may be approximate) |
| Wardrobe | Do NOT caption (same as hero — constant) | Caption explicitly (varies per image) |
| Background | Caption if visible; may match hero background | Caption explicitly (varies per image) |
| Expression | Caption if different from hero; often neutral | Caption explicitly (varies per image) |

---

## Location Generation (Future)

Qwen Multi-Angle is viable for **location/environment multi-view generation**:

- Input: one reference image of a location (e.g., a corridor interior shot)
- Output: multiple camera angles of the same environment
- Useful for: storyboard consistency (same location, different shot angles), location LoRA training

This is not yet implemented in the pipeline but the endpoint supports it — `image_urls` accepts any image, not just character photos. The horizontal/vertical angle controls work for environments the same way they work for characters.

---

## Quick Reference

| Engine | Speed | Cost/img | Angles | Diversity | Best For |
|--------|-------|----------|--------|-----------|----------|
| Qwen Multi-Angle | ~7-37s | $0.035 | All (numeric) | None | Pass 1: angle geometry |
| Qwen Edit 2511 | ~27s | $0.031 | Preserves input | Environment swap | (legacy — not in recommended pipeline) |
| NBP (Gemini 3 Pro) | ~20-40s | $0.134 | Preserves input | Expression + quality | Pass 2: bg swap + expression + identity lock |
| SeedVR2 | ~5-10s | $0.001 | N/A | Non-generative quality | Pass 3: quality upscale |
| Gemini 2.5 Flash | ~60s | $0.039 | Eye-level only | Full | Legacy: diversity (not recommended) |
| Z-Image Turbo | ~5s | $0.009 | N/A | N/A | Production only (NOT for candidates) |
| **Three-pass pipeline** | **~25-56s** | **~$0.036-0.170** | **All (14)** | **Full** | **Recommended for LoRA training** |