# BUILD_SPEC: PROMPT_BIBLE.yaml + prompt_engine Refactor

**Generated:** 2026-04-13 (revised after 3-sample spec review)
**Input:** consultations/recoil/prompt-engineering-bible/ (2-round dual consult)
**Detail level:** high
**Visual design:** no
**Phases:** 5
**Estimated build time:** 2-3 hours
**Python:** 3.14+

## Spec Review Corrections Applied

From 3-sample Self-MoA review (consultations/recoil/prompt-bible-spec-review/):
- **B1 FIXED:** Model count stays at 12 in YAML (for documentation), but code/tests only reference models that exist in model_profiles.json. Added kling-o3 alias.
- **B2 FIXED:** Import path changed to `core.paths.CONFIG_DIR` (not `RECOIL_ROOT / "config"`).
- **B3 FIXED:** The original Phase 3b (negative prompt gating) was DROPPED entirely: negative_prompt is handled in payload_builder.py and step_runner.py, not prompt_engine.py. The 3b label now refers to AR validation.
- **H1 FIXED:** YAML structural validation moved to Phase 1 gate.
- **H2 FIXED:** Length enforcement limited to 8 specific model-aware builders (listed explicitly).
- **M1 FIXED:** `build_prompt_from_bible()` returns `str`, not tuple. Negative prompt returned via separate function.
- **M2 FIXED:** Added kling-v3 mode-specific optimal_words (i2v vs t2v).

---

## Validation command

```bash
cd ~/Dropbox/CLAUDE_PROJECTS/recoil && python3 -c "
import ast, yaml
ast.parse(open('pipeline/lib/prompt_engine.py').read())
ast.parse(open('pipeline/lib/bible_loader.py').read())
bible = yaml.safe_load(open('config/PROMPT_BIBLE.yaml').read())
# Structural integrity
for m, rules in bible.items():
    for s in ['meta', 'prompt', 'refs', 'aspect_ratio']:
        assert s in rules, f'{m} missing {s}'
    assert rules['meta']['confidence'] in ('CONFIRMED', 'INFERRED', 'UNTESTED'), f'{m} bad confidence'
    if rules['prompt'].get('negative_prompt') and not rules['prompt'].get('negative_param'):
        raise ValueError(f'{m}: negative_prompt=true but no negative_param')
    ar_beh = rules['aspect_ratio'].get('i2v_behavior')
    if ar_beh is not None:
        assert ar_beh in ('respects_param', 'matches_start_frame'), f'{m} bad i2v_behavior: {ar_beh}'
print(f'YAML: {len(bible)} models, all structural checks pass')
" && cd pipeline && PYTHONPATH=.. python3 -m pytest tests/lib/test_bible_loader.py -q && echo "ALL OK"
```

---

## Phase 1: Create PROMPT_BIBLE.yaml

### Files to create
- `recoil/config/PROMPT_BIBLE.yaml` — Single source of truth for all model-specific prompting rules

### Requirements

Create the YAML file with entries for ALL 12 models. Each model block uses this schema:

```yaml
model-name:
  meta:
    provider: bytedance | google | kuaishou | bfl | open-source
    modality: image | video
    confidence: CONFIRMED | INFERRED | UNTESTED
    codenames: []  # internal names
    endpoints:
      default: "endpoint-path"

  prompt:
    optimal_words:  # can be a single [min, max] or mode-keyed:
      default: [min, max]
      i2v: [min, max]       # optional, for models with different I2V prompt needs
      t2v: [min, max]       # optional
    max_chars: null | integer
    style: prose | keywords | headers | cot_prose | formula_6step
    negative_prompt: true | false
    negative_param: "negative_prompt" | null
    best_practices: []
    anti_patterns: []

  refs:
    max_count: integer
    format: url | base64 | part
    param_name: "image_urls" | "reference_images" | "elements" | null
    role_labeling: true | false
    upload_required: true | false
    role_map: {}

  aspect_ratio:
    param_name: "aspect_ratio" | "image_size" | null
    i2v_behavior: respects_param | matches_start_frame | null
    supported: []
    mapping: {}

  first_last_frame:
    start_param: "image_url" | "start_image_url" | "image" | null
    end_param: "end_image_url" | "last_frame" | null
    start_format: url | base64 | part
    end_format: url | base64 | part
    upload_required: true | false
    notes: ""

  duration:
    param_name: "duration" | null
    format: string | integer
    range: [min, max]
    even_only: true | false

  audio:
    supported: true | false
    param_name: "generate_audio" | "audio_prompt" | null
    default_on: true | false
    notes: ""

  gotchas: []
```

### Models to include (12 entries):

**With active code paths (9 — tests reference these):**
1. **seedream-v4.5** — CONFIRMED. t2i + edit. Labeled roles. No negative prompt.
2. **seedream-v5-lite** — CONFIRMED. Auto-upgrade at 4+ identity refs. CoT.
3. **gemini-3-pro-image-preview** — CONFIRMED. Codenames: [NBP, "Nanobanana Pro"]. Part refs.
4. **gemini-3.1-flash-image-preview** — CONFIRMED. Codenames: [NB2, "Nanobanana 2"]. Cheaper, less detail.
5. **seeddance-2.0** — CONFIRMED. i2v/t2v/r2v. image_url + end_image_url. Respects AR param.
6. **kling-v3** — CONFIRMED. I2V: ~30 words, matches start frame AR. T2V: 50-100 words, respects AR. Supports negative_prompt. Include alias note for kling-o3 (same model, different quality tier).
7. **wan-2.7-i2v** — INFERRED. Dense prose 150-300 words. Duration is integer. AR matches start frame.
8. **wan-2.7-r2v** — INFERRED. Extra dense 250-400 words. Wider AR. NO negative_prompt.
9. **veo-3.1** — CONFIRMED. Start frame and refs mutually exclusive. Even durations only.

**Documentation-only (3 — no active code paths, marked UNTESTED):**
10. **wan-2.2** — UNTESTED. Legacy/open-source. 50-100 words. Temporal degradation after 3s.
11. **z-image-turbo** — UNTESTED. Keywords, <50 words. Sub-second.
12. **flux-2** — UNTESTED. T5-driven dense prose. Hybrid JSON+prose.

### Source data
- Round 1+2 Gemini + Opus consultation responses
- `SeedreamClient._ROLE_MAP` and `_AR_MAP` from `execution/api_client.py`
- `model_profiles.json` for cost/provider
- `pipeline-learnings.md` for empirical findings

### Scope boundary
- Do NOT modify any Python files
- Do NOT modify model_profiles.json
- Write ONLY the YAML file

### Validation
```bash
python3 -c "
import yaml
bible = yaml.safe_load(open('config/PROMPT_BIBLE.yaml').read())
models = list(bible.keys())
assert len(models) == 12, f'Expected 12 models, got {len(models)}'
required_sections = ['meta', 'prompt', 'refs', 'aspect_ratio']
for m in models:
    for s in required_sections:
        assert s in bible[m], f'{m} missing {s}'
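    assert bible[m]['meta']['confidence'] in ('CONFIRMED', 'INFERRED', 'UNTESTED'), f'{m} bad confidence'
    p = bible[m]['prompt']
    assert not (p.get('negative_prompt') and not p.get('negative_param')), f'{m}: negative_prompt=true but no negative_param'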
    ar_beh = bible[m]['aspect_ratio'].get('i2v_behavior')
    if ar_beh is not None:
        assert ar_beh in ('respects_param', 'matches_start_frame'), f'{m} bad i2v_behavior'
print(f'PROMPT_BIBLE.yaml: {len(models)} models, all structural checks pass')
"
```

---

## Phase 2: Bible Loader Module

### Files to create
- `recoil/pipeline/lib/bible_loader.py` — Bible loading, caching, and model rule accessors

### What already exists (from Phase 1)
- `recoil/config/PROMPT_BIBLE.yaml` — the canonical data file

### Requirements

Create a standalone module that loads and provides typed access to the bible:

- `load_bible() -> dict` — Loads YAML, caches in module-level `_bible`. Returns full dict.
- `get_model_rules(model_name: str) -> dict | None` — Returns the full model block or None.
- `get_prompt_rules(model_name: str) -> dict` — Returns the `prompt` section. Raises KeyError if model unknown.
- `get_ref_rules(model_name: str) -> dict` — Returns the `refs` section.
- `get_ar_rules(model_name: str) -> dict` — Returns the `aspect_ratio` section.
- `get_frame_rules(model_name: str) -> dict | None` — Returns `first_last_frame` section (None for image models).
- `supports_negative_prompt(model_name: str) -> bool` — Quick accessor.
- `get_optimal_word_range(model_name: str, mode: str = "default") -> tuple[int, int]` — Returns (min, max) from `prompt.optimal_words`. Supports mode-specific ranges: if `mode="i2v"` and `prompt.optimal_words.i2v` exists, use that; otherwise fall back to `prompt.optimal_words.default` or the flat `prompt.optimal_words` list.
- `get_i2v_ar_behavior(model_name: str) -> str | None` — Returns "respects_param" or "matches_start_frame" or None.
- `get_gotchas(model_name: str) -> list[str]` — Returns gotchas list.
- `reload_bible()` — Clear cache and reload.
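
The mode fallback described for `get_optimal_word_range` can be sketched as follows. This is a minimal illustration, not the module itself: `resolve_word_range` is a hypothetical helper that takes the already-loaded `prompt.optimal_words` value directly, and the sample ranges are placeholders rather than values from PROMPT_BIBLE.yaml.

```python
from typing import Any


def resolve_word_range(optimal_words: Any, mode: str = "default") -> tuple[int, int]:
    """Resolve (min, max) from a flat [min, max] list or a mode-keyed mapping."""
    if isinstance(optimal_words, dict):
        if mode in optimal_words:
            # A mode-specific range wins when it exists.
            pair = optimal_words[mode]
        elif "default" in optimal_words:
            # Otherwise fall back to the "default" key.
            pair = optimal_words["default"]
        else:
            raise KeyError(f"no range for mode {mode!r} and no default")
    else:
        # Flat form: optimal_words is itself the [min, max] list.
        pair = optimal_words
    lo, hi = pair
    return int(lo), int(hi)


# Hypothetical values for illustration only.
kling_words = {"default": [30, 60], "i2v": [20, 40], "t2v": [50, 100]}
flat_words = [150, 300]

i2v_range = resolve_word_range(kling_words, "i2v")      # mode-specific entry
fallback_range = resolve_word_range(kling_words, "r2v")  # unknown mode uses default
flat_range = resolve_word_range(flat_words, "i2v")       # flat list ignores mode
```

Returning a tuple rather than the raw list keeps callers from mutating the cached bible data by accident.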

**Path resolution:** Use `core.paths.CONFIG_DIR / "PROMPT_BIBLE.yaml"` (CONFIG_DIR is already defined at line 16 of core/paths.py).

Follow existing patterns in `lib/model_profiles.py` for caching and module structure.

### Scope boundary
- Do NOT modify prompt_engine.py
- Do NOT modify any existing files
- Write ONLY the new module

### Validation
```bash
cd pipeline && PYTHONPATH=.. python3 -c "
from lib.bible_loader import load_bible, get_model_rules, supports_negative_prompt, get_optimal_word_range, get_i2v_ar_behavior
bible = load_bible()
assert len(bible) == 12
assert get_model_rules('kling-v3') is not None
assert get_model_rules('nonexistent') is None
assert supports_negative_prompt('kling-v3') is True
assert supports_negative_prompt('seeddance-2.0') is False
lo, hi = get_optimal_word_range('kling-v3', mode='i2v')
assert lo <= 40 and hi <= 50, f'Kling I2V range wrong: {lo}-{hi}'
lo2, hi2 = get_optimal_word_range('kling-v3', mode='t2v')
assert hi2 > hi, 'Kling T2V should be wider than I2V'
assert get_i2v_ar_behavior('kling-v3') == 'matches_start_frame'
assert get_i2v_ar_behavior('seeddance-2.0') == 'respects_param'
assert get_i2v_ar_behavior('seedream-v4.5') is None
print('bible_loader: all accessors OK')
"
```

---

## Phase 3: Integrate Bible Into Existing Builders

### Files to modify
- `recoil/pipeline/lib/prompt_engine.py` — Wire bible rules into existing builder functions

### What already exists (from prior phases)
- `config/PROMPT_BIBLE.yaml` — model rules data
- `pipeline/lib/bible_loader.py` — accessor functions

### Requirements

This phase does NOT rewrite builders. It adds bible-aware guardrails:

**3a. Prompt length enforcement:**
Add `_enforce_prompt_length(prompt: str, model: str, mode: str = "default") -> str`:
- Reads `get_optimal_word_range(model, mode)` and `get_prompt_rules(model).get("max_chars")`
- If word count exceeds optimal max, logs WARNING with count and optimal range
- If char count exceeds max_chars (when set), truncates at last sentence boundary before limit, logs ERROR
- Returns the (possibly truncated) prompt
- Does NOT truncate when only the optimal word range is exceeded; that case only warns
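
The warn-versus-truncate split above might look roughly like this. It is a sketch under stated assumptions: `enforce_length` is a hypothetical stand-in that receives the word range and `max_chars` as arguments instead of reading them from the bible, the logger name is made up, and the hard-cut fallback for text with no sentence boundary is an assumption the spec does not cover.

```python
import logging
import re
from typing import Optional

logger = logging.getLogger("prompt_engine_sketch")  # hypothetical logger name


def enforce_length(prompt: str, word_range: tuple[int, int], max_chars: Optional[int]) -> str:
    """Warn when over the optimal word max; truncate only when over max_chars."""
    _, hi = word_range
    n_words = len(prompt.split())
    if n_words > hi:
        # Exceeding the optimal range is advisory only.
        logger.warning("prompt has %d words, optimal range is %s", n_words, word_range)
    if max_chars is None or len(prompt) <= max_chars:
        return prompt
    logger.error("prompt has %d chars, limit is %d; truncating", len(prompt), max_chars)
    # Collect sentence endings (., !, ?) that fit inside the character limit.
    ends = [
        m.end()
        for m in re.finditer(r"[.!?](?=\s|$)", prompt)
        if m.end() <= max_chars
    ]
    if ends:
        return prompt[: ends[-1]]
    # Assumption: with no sentence boundary available, fall back to a hard cut.
    return prompt[:max_chars].rstrip()
```

Scanning the full prompt and filtering by position avoids treating a period that was cut mid-token at the limit as a sentence end.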

**Wire `_enforce_prompt_length()` into these 8 specific model-aware builders (and ONLY these):**
1. `build_kling_i2v_prompt()` (line 2153) — model="kling-v3", mode="i2v"
2. `build_kling_t2v_prompt()` (line 2488) — model="kling-v3", mode="t2v"
3. `build_wan_i2v_prompt()` (line 2639) — model="wan-2.7-i2v", mode="default"
4. `build_wan_r2v_prompt()` (line 2806) — model="wan-2.7-r2v", mode="default"
5. `build_veo_prompt()` (line 2062) — model="veo-3.1", mode="default"
6. `build_video_prompt()` (line 1822) — model param already exists in function
7. `build_multi_shot_prompt()` (line 1921) — model="seeddance-2.0", mode="default"
8. `build_previs_prompt()` (line 3121) — model="gemini-3.1-flash-image-preview", mode="default"

**Do NOT add to generic/plan-based builders** (`build_prompt_from_plan`, `build_cinematic_prompt`, `build_prompt_sections_from_plan`) — these don't know the target model.

**3b. AR validation for I2V:**
Add `validate_start_frame_ar(start_frame_path: Path | None, model: str, target_ar: str) -> list[str]`:
- If start_frame_path is None, return []
- Reads `get_i2v_ar_behavior(model)`
- If `matches_start_frame`: read image dimensions (PIL or header parsing) and compare them against target_ar. On mismatch, return a list containing one warning string; otherwise return [].
- If `respects_param` or None: return []
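
The ratio comparison at the core of this check can be sketched without any image library. In this illustration `ar_warnings` is a hypothetical helper that takes the frame dimensions and the bible's `i2v_behavior` value as arguments; the 2% tolerance is an assumption, since the spec does not say how close a match must be.

```python
from typing import Optional


def parse_ar(ar: str) -> float:
    """Turn a 'W:H' string such as '9:16' into a width/height ratio."""
    w, h = ar.split(":")
    return float(w) / float(h)


def ar_warnings(
    width: int,
    height: int,
    behavior: Optional[str],
    target_ar: str,
    tolerance: float = 0.02,  # assumed relative tolerance, not from the spec
) -> list[str]:
    """Return a one-element warning list when the start frame ratio disagrees."""
    if behavior != "matches_start_frame":
        # respects_param or None: the model follows the AR parameter.
        return []
    actual = width / height
    target = parse_ar(target_ar)
    if abs(actual - target) / target <= tolerance:
        return []
    return [
        f"start frame is {width}x{height} (ratio {actual:.3f}) but target_ar is "
        f"{target_ar} (ratio {target:.3f}); output will follow the start frame"
    ]
```

Keeping the comparison separate from the file read makes it testable with plain integers.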

**3c. Import bible_loader at top of prompt_engine.py:**
```python
from lib.bible_loader import (
    get_prompt_rules,
    get_optimal_word_range,
    get_i2v_ar_behavior,
)
```

### Scope boundary
- Do NOT rewrite any existing builder function logic
- Do NOT remove any existing builder functions
- Do NOT change function signatures of existing builders
- Do NOT add negative prompt gating (handled elsewhere in the pipeline)
- ONLY add the two new functions and wire length enforcement into the 8 listed builders

### Validation
```bash
cd pipeline && PYTHONPATH=.. python3 -c "
import ast; ast.parse(open('lib/prompt_engine.py').read())
from lib.prompt_engine import _enforce_prompt_length, validate_start_frame_ar
# Length enforcement
result = _enforce_prompt_length('short prompt', 'kling-v3', 'i2v')
assert isinstance(result, str)
assert result == 'short prompt'  # not truncated
# AR validation: None path returns [] for a respects_param model (Seedance)
warnings = validate_start_frame_ar(None, 'seeddance-2.0', '9:16')
assert warnings == []
# AR validation: None path skips validation even for a matches_start_frame model (Kling)
warnings = validate_start_frame_ar(None, 'kling-v3', '9:16')
assert warnings == []
print('Phase 3: length enforcement + AR validation OK')
"
```

---

## Phase 4: Generic Bible-Driven Builder

### Files to modify
- `recoil/pipeline/lib/prompt_engine.py` — Add generic builder

### What already exists (from prior phases)
- `bible_loader.py` with all accessors
- `_enforce_prompt_length()` function
- Existing model-specific builders still intact

### Requirements

Add `build_prompt_from_bible(model: str, scene_description: str, **kwargs) -> str`:

1. Loads prompt rules from `get_prompt_rules(model)`
2. Reads `style` field to determine formatting:
   - `prose`: natural language paragraph, concatenating scene_description with kwargs
   - `keywords`: comma-separated tags extracted from scene_description + kwargs
   - `formula_6step`: structured sections: Subject → Action → Environment → Camera → Style → Constraints (from kwargs)
   - `cot_prose`: "Let's think step by step. " prefix + prose
3. Reads `best_practices` and appends as guidance comments if they exist
4. Handles optional kwargs: `characters=[]`, `location=""`, `camera=""`, `lighting=""`, `action=""`, `style=""`
5. Calls `_enforce_prompt_length(prompt, model)` before returning
6. Returns `str` (NOT a tuple — follows existing builder convention)

For simple/casting generations. Complex plan-based generation still uses specialized builders.
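
The style dispatch in step 2 could be sketched as below. Several details are assumptions rather than part of this spec: `format_by_style` is a hypothetical helper, keyword tags are taken from comma-separated pieces of the description, `scene_description` fills the Subject slot and `location` fills Environment in the 6-step formula, and any style not listed in step 2 falls back to prose.

```python
FORMULA_ORDER = ("subject", "action", "environment", "camera", "style", "constraints")


def _flatten(value: object) -> str:
    """Render a kwarg value (string or list of strings) as one string."""
    if isinstance(value, (list, tuple)):
        return ", ".join(str(v).strip() for v in value if v)
    return str(value).strip() if value else ""


def _sentence(text: str) -> str:
    """Make sure a fragment ends with sentence punctuation."""
    text = text.strip()
    return text if text.endswith((".", "!", "?")) else text + "."


def format_by_style(prompt_style: str, scene_description: str, **kwargs: object) -> str:
    extras = {k: _flatten(v) for k, v in kwargs.items()}
    extras = {k: v for k, v in extras.items() if v}  # drop empty kwargs
    if prompt_style == "keywords":
        # Assumption: tags are the comma-separated pieces of the description.
        tags = [t.strip() for t in scene_description.split(",") if t.strip()]
        return ", ".join(tags + list(extras.values()))
    if prompt_style == "formula_6step":
        # Assumption: how kwargs map onto the six sections.
        fields = {
            "subject": scene_description.strip(),
            "action": extras.get("action"),
            "environment": extras.get("location"),
            "camera": extras.get("camera"),
            "style": extras.get("style"),
            "constraints": extras.get("constraints"),
        }
        return "\n".join(
            f"{name.capitalize()}: {fields[name]}" for name in FORMULA_ORDER if fields[name]
        )
    prose = " ".join(_sentence(p) for p in [scene_description, *extras.values()])
    if prompt_style == "cot_prose":
        return "Let's think step by step. " + prose
    return prose  # "prose" and, by assumption, any unlisted style
```

The first parameter is named `prompt_style` so that it does not collide with the optional `style` kwarg.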

### Usage example
```python
prompt = build_prompt_from_bible(
    model="seedream-v4.5",
    scene_description="A young woman at a rain-streaked window, neon light",
    characters=["Sadie"],
    camera="medium close-up",
    lighting="cyan neon from outside, warm interior",
)
```

### Scope boundary
- Do NOT modify existing builder functions
- Do NOT remove any existing functions
- Add ONLY `build_prompt_from_bible()`
- Returns str, not tuple

### Validation
```bash
cd pipeline && PYTHONPATH=.. python3 -c "
from lib.prompt_engine import build_prompt_from_bible
# Seedream — prose style
prompt = build_prompt_from_bible(model='seedream-v4.5', scene_description='A test scene with neon lights')
assert isinstance(prompt, str)
assert len(prompt) > 10
# Kling — should also return str
prompt2 = build_prompt_from_bible(model='kling-v3', scene_description='A test scene')
assert isinstance(prompt2, str)
# Z-Image: keywords style (UNTESTED models should still produce output)
prompt3 = build_prompt_from_bible(model='z-image-turbo', scene_description='neon city rain night')
assert isinstance(prompt3, str)
print('build_prompt_from_bible: all models return str')
"
```

---

## Phase 5: Tests + Acceptance Gate

### Files to create
- `recoil/pipeline/tests/lib/test_bible_loader.py` — Unit tests

### What already exists (from prior phases)
- `config/PROMPT_BIBLE.yaml` — 12 models
- `pipeline/lib/bible_loader.py` — loader + accessors
- `pipeline/lib/prompt_engine.py` — bible integration + generic builder

### Requirements

**5a. Bible integrity tests:**
- All 12 models present
- Every model has required sections (meta, prompt, refs, aspect_ratio)
- All confidence values valid (CONFIRMED, INFERRED, UNTESTED)
- No model has both `negative_prompt: true` and null `negative_param`
- All i2v_behavior values valid (respects_param, matches_start_frame, null)
- Mode-specific optimal_words for kling-v3 (i2v vs t2v different ranges)
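
One way to express the structural checks above is a helper that collects every problem instead of stopping at the first failed assertion. This is a sketch only: `bible_problems` is a hypothetical name, and it works on an in-memory dict so it does not depend on how the YAML is loaded.

```python
REQUIRED_SECTIONS = ("meta", "prompt", "refs", "aspect_ratio")
VALID_CONFIDENCE = {"CONFIRMED", "INFERRED", "UNTESTED"}
VALID_I2V = {"respects_param", "matches_start_frame", None}


def bible_problems(bible: dict) -> list[str]:
    """Return a human-readable list of structural problems (empty when clean)."""
    problems: list[str] = []
    for name, rules in bible.items():
        missing = [s for s in REQUIRED_SECTIONS if s not in rules]
        if missing:
            problems.append(f"{name}: missing sections {missing}")
            continue  # the remaining checks need those sections
        confidence = rules["meta"].get("confidence")
        if confidence not in VALID_CONFIDENCE:
            problems.append(f"{name}: bad confidence {confidence!r}")
        prompt = rules["prompt"]
        if prompt.get("negative_prompt") and not prompt.get("negative_param"):
            problems.append(f"{name}: negative_prompt=true but no negative_param")
        behavior = rules["aspect_ratio"].get("i2v_behavior")
        if behavior not in VALID_I2V:
            problems.append(f"{name}: bad i2v_behavior {behavior!r}")
    return problems
```

A test could then assert that the helper returns an empty list for the loaded bible, which reports all offending models in a single failure message.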

**5b. Accessor tests:**
- `get_model_rules` returns dict for known model, None for unknown
- `supports_negative_prompt` correct for Kling (True), Seedream (False), Seedance (False)
- `get_optimal_word_range` with mode parameter works correctly
- `get_i2v_ar_behavior` correct for each video model
- `get_gotchas` returns non-empty list for models with known gotchas (seedream-v4.5, kling-v3)

**5c. Length enforcement tests:**
- Short Kling I2V prompt passes without truncation
- 500-word prompt for Kling I2V logs WARNING (mock logger, assert warning called)
- Prompt exceeding max_chars gets truncated
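
The "mock logger, assert warning called" pattern can be sketched with the standard library alone. Here `warn_if_over` is a stand-in for the warning branch of `_enforce_prompt_length()`, and the module-level `logger` is an assumption about how prompt_engine.py exposes its logger; the real test would patch whatever logger object that module actually uses.

```python
import logging
from unittest import mock

# Stand-in for the module-level logger assumed to exist in prompt_engine.py.
logger = logging.getLogger("prompt_engine_sketch")


def warn_if_over(prompt: str, max_words: int) -> str:
    """Stand-in for the warning branch of the length enforcement function."""
    count = len(prompt.split())
    if count > max_words:
        logger.warning("prompt has %d words, optimal max is %d", count, max_words)
    return prompt


def test_long_prompt_logs_warning() -> None:
    long_prompt = " ".join(["word"] * 500)
    # Replace logger.warning with a mock for the duration of the call.
    with mock.patch.object(logger, "warning") as warn:
        result = warn_if_over(long_prompt, max_words=40)
    warn.assert_called_once()
    assert result == long_prompt  # warning only, no truncation


def test_short_prompt_is_silent() -> None:
    with mock.patch.object(logger, "warning") as warn:
        warn_if_over("short prompt", max_words=40)
    warn.assert_not_called()
```

pytest's built-in `caplog` fixture is an alternative when asserting on the emitted log records is preferable to patching.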

**5d. AR validation tests:**
- `validate_start_frame_ar` with None path returns empty list
- `validate_start_frame_ar` with Seedance returns empty (respects param)
- `validate_start_frame_ar` with a 1:1 image file and Kling returns warning (use tmp_path to create a small test image)
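
If the tests should not depend on PIL, a small fixture image can be written and measured with the standard library only. The sketch below is one possible approach, assuming PNG is an acceptable start-frame format; `write_gray_png` and `read_png_size` are hypothetical helper names.

```python
import struct
import zlib
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def write_gray_png(path: Path, width: int, height: int) -> None:
    """Write a solid-black 8-bit grayscale PNG."""
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), then
    # compression, filter, and interlace methods all set to 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is a filter byte (0) followed by one byte per pixel.
    raw = b"".join(b"\x00" + b"\x00" * width for _ in range(height))
    path.write_bytes(
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


def read_png_size(path: Path) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk without decoding pixels."""
    with path.open("rb") as fh:
        header = fh.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a PNG file")
    width, height = struct.unpack(">II", header[16:24])
    return width, height
```

In a pytest test, `write_gray_png(tmp_path / "square.png", 8, 8)` would produce the 1:1 fixture for the Kling mismatch case.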

**5e. Generic builder tests:**
- `build_prompt_from_bible` for Seedream returns str containing scene description
- `build_prompt_from_bible` for Z-Image Turbo returns shorter output than the Seedream prose prompt for the same scene description (keywords style)
- All 12 models produce non-empty str output (smoke test)

### Test patterns
Follow existing test patterns in `tests/lib/` — pytest, tmp_path fixture, mock where needed.

### Scope boundary
- Do NOT modify any source files
- Write ONLY test files

### Acceptance gate
```bash
cd pipeline && PYTHONPATH=.. python3 -m pytest tests/lib/test_bible_loader.py -v && \
PYTHONPATH=.. python3 -c "
from lib.bible_loader import load_bible
from lib.prompt_engine import build_prompt_from_bible, _enforce_prompt_length, validate_start_frame_ar
bible = load_bible()
assert len(bible) == 12
for model_name in bible:
    prompt = build_prompt_from_bible(model=model_name, scene_description='A neon-lit scene at night')
    assert isinstance(prompt, str) and len(prompt) > 5, f'{model_name} produced empty prompt'
print(f'Acceptance: all {len(bible)} models produce valid prompts')
" && echo "ALL ACCEPTANCE GATES PASS"
```

---

## Dependency Map

```
Phase 1 (YAML) — data only, validated structurally
    ↓
Phase 2 (bible_loader.py) ← reads YAML via CONFIG_DIR
    ↓
Phase 3 (prompt_engine guards) ← imports bible_loader, wires into 8 specific builders
    ↓
Phase 4 (generic builder) ← uses bible_loader + _enforce_prompt_length
    ↓
Phase 5 (tests) ← tests all of the above
```

## Post-Build Actions

1. Run `/bugfix` on changed files
2. Run `/simplify` for code quality
3. Run full existing test suite: `cd pipeline && PYTHONPATH=.. python3 -m pytest tests/ -q`
4. Update `pipeline-learnings.md` with findings