# TEST PLAN — Coverage Pass Pipeline Validation

**Generated:** 2026-04-13
**Build:** BUILD_SPEC_coverage_pass.md (10/10 phases PASS, 0 debug rounds)
**Budget:** ~$10 at Atlas Cloud
**Time:** ~50 minutes total
**Working directory:** ~/Dropbox/CLAUDE_PROJECTS/recoil

## Prerequisites

- Build must be complete (all 10 phases PASS) and mirrored to MacBook via Dropbox
- `ATLAS_CLOUD_API_KEY` set in environment (for Tier 1+ tests)
- Workspace server running on port 8450 (for review after tests)
- Existing r2v test videos available at `projects/afterimage-anime/output/video/ep_005/`
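
The prerequisites above can be checked mechanically before starting. A minimal sketch, assuming the workspace server listens on localhost and that `ffmpeg` must be on `PATH` (Test 0E needs it). The helper names `port_is_open` and `preflight` are hypothetical and not part of the build:

```python
import os
import shutil
import socket
from pathlib import Path


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(videos_dir: Path, env=None, host: str = "127.0.0.1", port: int = 8450) -> dict:
    """Map each prerequisite to True/False; nothing is raised."""
    env = os.environ if env is None else env
    return {
        "atlas_api_key": bool(env.get("ATLAS_CLOUD_API_KEY")),
        "workspace_server": port_is_open(host, port),
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        # At least one .mp4 must exist in the r2v test video directory.
        "r2v_test_videos": videos_dir.is_dir() and any(videos_dir.glob("*.mp4")),
    }
```

Any `False` value in the returned dict names the prerequisite to fix before running Tier 0.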

## Test Sequence

Run in exact order. **Stop and fix if any test fails.**
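
The stop-on-first-failure rule can be enforced by a small runner. This is a sketch only, assuming each test is wrapped as a command that exits non-zero on failure; `run_in_order` is a hypothetical helper, not something the build provides:

```python
import subprocess


def run_in_order(steps):
    """Run (name, argv) steps sequentially and stop at the first non-zero exit.

    Returns (passed_names, failed_name); failed_name is None if every step passed.
    """
    passed = []
    for name, argv in steps:
        completed = subprocess.run(argv)
        if completed.returncode != 0:
            return passed, name  # later steps are never started
        passed.append(name)
    return passed, None
```

A failed assertion inside a `python3 -c` one-liner exits with a non-zero status, so the Tier 0 commands fit this shape as written.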

---

## TIER 0: Zero-Cost Unit Tests (~15 min)

### Test 0A: Import Smoke Test
**Validates:** All new modules importable, no circular deps
```bash
cd ~/Dropbox/CLAUDE_PROJECTS/recoil && python3 -c "
from execution.pass_store import PassStore
from execution.step_types import PassResult, SegmentResult
from pipeline.orchestrator.coverage_planner import CoveragePass, CoverageSegment
from pipeline.lib.scene_detect import validate_cut_count, detect_scene_changes
print('All imports OK')
"
```
Also test from the `pipeline` working directory:
```bash
cd ~/Dropbox/CLAUDE_PROJECTS/recoil/pipeline && PYTHONPATH=.. python3 -c "
from execution.pass_store import PassStore
from orchestrator.coverage_planner import CoveragePass
from lib.scene_detect import validate_cut_count
from lib.prompt_engine import build_seeddance_r2v_prompt_multi
from execution.assembler import resolve_seedance_r2v_refs
print('Pipeline imports OK')
"
```
**Stop if:** Any ImportError. Fix before proceeding.
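
The one-liners above stop at the first `ImportError`, which hides any later failures. A sketch that reports every failing module in one run, assuming it is executed with the same working directory and `PYTHONPATH` as the commands above; `find_import_failures` is a hypothetical helper:

```python
import importlib


def find_import_failures(module_names):
    """Try importing each module; return {name: error message} for the failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError, or any error raised at import time
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Module names taken from Test 0A.
TEST_0A_MODULES = [
    "execution.pass_store",
    "execution.step_types",
    "pipeline.orchestrator.coverage_planner",
    "pipeline.lib.scene_detect",
]
```

Calling `find_import_failures(TEST_0A_MODULES)` should return an empty dict on a healthy build.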

---

### Test 0B: Prompt Assembly Dry Run
**Validates:** build_seeddance_r2v_prompt_multi() produces valid @Image tokens + Shot N: structure
```bash
cd ~/Dropbox/CLAUDE_PROJECTS/recoil && .venv/bin/python3 -c "
import sys; sys.path.insert(0, 'pipeline')
from lib.prompt_engine import build_seeddance_r2v_prompt_multi

# Mock data
shots = [
    {'prompt_data': {'shot_type': 'MCU', 'prompt_skeleton': {'subject_line': 'Sadie at window', 'action_line': 'breathes shallowly', 'emotion_line': 'reflective'}}, 'routing_data': {'target_editorial_duration_s': 3}, 'asset_data': {'characters': [{'char_id': 'SADIE'}], 'location_id': 'int_sadie_apartment'}},
    {'prompt_data': {'shot_type': 'CU', 'prompt_skeleton': {'subject_line': 'Sadie face', 'action_line': 'blinks slowly', 'emotion_line': 'focused'}}, 'routing_data': {'target_editorial_duration_s': 4}, 'asset_data': {'characters': [{'char_id': 'SADIE'}], 'location_id': 'int_sadie_apartment'}},
    {'prompt_data': {'shot_type': 'ECU', 'prompt_skeleton': {'subject_line': 'Sadie hands', 'action_line': 'fingers tap', 'emotion_line': 'tension'}}, 'routing_data': {'target_editorial_duration_s': 3}, 'asset_data': {'characters': [{'char_id': 'SADIE'}], 'location_id': 'int_sadie_apartment'}},
]

bible = {
    'characters': {'SADIE': {'visual_description': 'young woman, platinum blonde hair, blue eyes, freckles', 'wardrobe': {'phase_1': 'cream cable-knit sweater'}}},
    'locations': {'int_sadie_apartment': {'description': 'small apartment, neon window, space heater'}},
}

coverage_pass_dict = {
    'segments': [{'source_shot_id': 'EP001_SH07', 'shot_type': 'MCU', 'duration_s': 3, 'prompt': 'Sadie at window'},
                 {'source_shot_id': 'EP001_SH08', 'shot_type': 'CU', 'duration_s': 4, 'prompt': 'Sadie blinks'},
                 {'source_shot_id': 'EP001_SH09', 'shot_type': 'ECU', 'duration_s': 3, 'prompt': 'Hands tap'}],
    'arc_preamble': '[SCENE ARC: escalating]',
    'focus_character': 'SADIE',
    'location_id': 'int_sadie_apartment',
}

result = build_seeddance_r2v_prompt_multi(
    shots=shots, bible=bible, project_config={'style': 'cyberpunk_anime'},
    episode=1, coverage_pass_dict=coverage_pass_dict,
)

assert result and len(result) > 50, f'Prompt too short: {len(result)} chars'
assert 'Shot 1:' in result, 'Missing Shot 1: marker'
assert 'Shot 2:' in result, 'Missing Shot 2: marker'
assert 'Shot 3:' in result, 'Missing Shot 3: marker'
assert '@Image' in result, 'Missing @Image tokens'
print(f'Prompt OK ({len(result)} chars, {len(result.split())} words)')
print('---')
print(result[:500])
"
```
**Stop if:** Empty prompt, missing Shot N: markers, or crash.
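
The three `in` checks above pass even if markers are out of order or duplicated. A stricter sketch, assuming each `Shot N:` marker is meant to appear exactly once and in ascending order (the source does not state this); `shot_markers_in_order` is a hypothetical helper:

```python
import re


def shot_markers_in_order(prompt: str, expected_shots: int) -> bool:
    """True if the prompt contains exactly 'Shot 1:' .. 'Shot N:' in ascending order."""
    found = [int(n) for n in re.findall(r"\bShot (\d+):", prompt)]
    return found == list(range(1, expected_shots + 1))
```

For the mock data in this test, `shot_markers_in_order(result, 3)` is the equivalent check.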

---

### Test 0C: Reference Resolution
**Validates:** resolve_seedance_r2v_refs() replaces placeholders with @ImageN
```bash
cd ~/Dropbox/CLAUDE_PROJECTS/recoil && .venv/bin/python3 -c "
import sys; sys.path.insert(0, 'pipeline')
from types import SimpleNamespace
from execution.assembler import resolve_seedance_r2v_refs

# Mock refs
refs = [
    SimpleNamespace(path='/tmp/sadie_hero.jpg', ref_type='identity'),
    SimpleNamespace(path='/tmp/sadie_front.jpg', ref_type='identity'),
    SimpleNamespace(path='/tmp/apartment.jpg', ref_type='scene'),
]

# Mock prompt with placeholders (whatever format the builder emits)
prompt = '@Image{identity_1} is Sadie. @Image{identity_2} shows front. @Image{scene_1} is apartment. Shot 1: test.'

resolved, ordered_paths = resolve_seedance_r2v_refs(prompt, refs)

assert '{' not in resolved, f'Unresolved placeholders remain: {resolved}'
assert '@Image1' in resolved, f'Missing @Image1 in: {resolved}'
assert len(ordered_paths) == 3, f'Wrong path count: {len(ordered_paths)}'
print('Resolution OK')
print(f'Resolved: {resolved[:200]}')
print(f'Ordered paths: {ordered_paths}')
"
```
**Stop if:** Unresolved `{` placeholders remain, or crash.
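
For orientation, the behavior this test expects can be sketched as follows. This is not the implementation of `resolve_seedance_r2v_refs()`; it assumes the `@Image{key}` placeholder format used in the mock prompt (itself a guess at what the builder emits) and numbers references by first appearance. `resolve_placeholders` and `paths_by_key` are hypothetical names:

```python
import re

PLACEHOLDER = re.compile(r"@Image\{([^{}]+)\}")


def resolve_placeholders(prompt: str, paths_by_key: dict):
    """Replace @Image{key} with @Image1, @Image2, ... in order of first appearance.

    Returns (resolved_prompt, ordered_paths); raises KeyError for an unknown key.
    """
    index_by_key = {}
    ordered_paths = []

    def substitute(match):
        key = match.group(1)
        if key not in index_by_key:
            ordered_paths.append(paths_by_key[key])
            index_by_key[key] = len(ordered_paths)  # 1-based position
        return f"@Image{index_by_key[key]}"

    return PLACEHOLDER.sub(substitute, prompt), ordered_paths
```

The key property is that the N in `@ImageN` matches the position of the corresponding path in the ordered list that gets uploaded.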

---

### Test 0D: PassStore CRUD
**Validates:** Full lifecycle — create, update, get, append_take, list
```bash
cd ~/Dropbox/CLAUDE_PROJECTS/recoil && .venv/bin/python3 -c "
import tempfile, os, sys
sys.path.insert(0, 'pipeline')
from execution.pass_store import PassStore

# Use temp dir
with tempfile.TemporaryDirectory() as tmpdir:
    os.environ['RECOIL_TEST_PROJECTS_ROOT'] = tmpdir
    
    # Create project structure
    os.makedirs(f'{tmpdir}/test-project/state/visual/passes', exist_ok=True)
    
    store = PassStore('test-project')
    
    # Create
    store.create_pass('EP001_PASS_001_L_SADIE_B', ['EP001_SH01', 'EP001_SH02', 'EP001_SH03'])
    
    # Get
    p = store.get_pass('EP001_PASS_001_L_SADIE_B')
    assert p is not None, 'Pass not found after create'
    assert p['status'] == 'pending', f'Wrong status: {p[\"status\"]}'
    assert len(p['segment_shot_ids']) == 3, f'Wrong segment count: {len(p[\"segment_shot_ids\"])}'
    
    # Update
    store.update_pass('EP001_PASS_001_L_SADIE_B', status='generating', cost_usd=1.50)
    p = store.get_pass('EP001_PASS_001_L_SADIE_B')
    assert p['status'] == 'generating'
    assert p['cost_usd'] == 1.50
    
    # Append take
    store.append_pass_take('EP001_PASS_001_L_SADIE_B', {
        'take_number': 1,
        'video_path': 'output/video/ep_001/EP001_PASS_001_take1.mp4',
        'cost_usd': 1.50,
    })
    p = store.get_pass('EP001_PASS_001_L_SADIE_B')
    assert len(p['takes']) == 1
    
    # List
    passes = store.list_passes('EP001')
    assert len(passes) >= 1
    
    print('PassStore CRUD OK')
"
```
**Stop if:** Any assertion fails or crash. State management is critical for overnight crash recovery.
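
Crash recovery depends on state files never being left half-written. Whether `PassStore` already does this is not stated in the build spec; the sketch below shows one common approach (write to a temp file, then rename), with `write_json_atomic` as a hypothetical helper:

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, data: dict) -> None:
    """Write JSON to a temp file in the same directory, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # force bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic when source and target share a filesystem
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A reader of the target path sees either the old record or the new one, never a truncated file.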

---

### Test 0E: Scene Detection on Existing Video
**Validates:** ffmpeg-based scene detection works on a real multi-cut video
```bash
cd ~/Dropbox/CLAUDE_PROJECTS/recoil && .venv/bin/python3 -c "
import sys; sys.path.insert(0, 'pipeline')
from pathlib import Path
from lib.scene_detect import detect_scene_changes, validate_cut_count

# Use an existing r2v test video that had working cuts via Shot N: (v4b, falling back to v5)
video = Path.home() / 'Dropbox/CLAUDE_PROJECTS/projects/afterimage-anime/output/video/ep_005/r2v_v4b_FIXED_image_urls.mp4'
if not video.exists():
    # Fallback to v5
    video = Path.home() / 'Dropbox/CLAUDE_PROJECTS/projects/afterimage-anime/output/video/ep_005/r2v_v5_anchor_shot.mp4'

assert video.exists(), f'No test video found at {video}'

# Detect cuts
cuts = detect_scene_changes(video, threshold=0.3)
print(f'Detected {len(cuts)} scene changes at: {cuts}')
assert cuts, 'No scene changes detected on a known multi-cut video'

# Validate
result = validate_cut_count(video, expected_cuts=2)  # 3 shots = 2 cuts
print(f'Validation: {result}')
assert 'detected' in result, 'Missing detected field'
assert 'status' in result, 'Missing status field'
print('Scene detection OK')
"
```
**Stop if:** ffmpeg not found, or detection returns no results on a known multi-cut video.
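
If detection needs debugging, it helps to see what an ffmpeg-based detector typically does. The sketch below is not the code in `lib.scene_detect`; it assumes the standard `select='gt(scene,T)'` plus `showinfo` filter chain and parses `pts_time` values from ffmpeg's stderr. Both function names are hypothetical:

```python
import re
import subprocess

PTS_TIME = re.compile(r"pts_time:\s*([0-9]+(?:\.[0-9]+)?)")


def parse_showinfo_times(stderr_text: str) -> list:
    """Extract pts_time values (seconds) from ffmpeg showinfo log lines."""
    return [float(m.group(1)) for m in PTS_TIME.finditer(stderr_text)]


def scene_change_times(video_path: str, threshold: float = 0.3) -> list:
    """Run ffmpeg's scene-score select filter and return cut timestamps in seconds."""
    cmd = [
        "ffmpeg", "-hide_banner", "-i", video_path,
        # Keep only frames whose scene score exceeds the threshold, then log them.
        "-vf", f"select='gt(scene,{threshold})',showinfo",
        "-f", "null", "-",
    ]
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return parse_showinfo_times(completed.stderr)  # showinfo logs to stderr
```

Lowering the threshold makes detection more sensitive, which is a quick way to tell a missing cut from a soft transition.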

---

### Test 0F: Error Classification
**Validates:** Content filter errors correctly classified for softening path
```bash
cd ~/Dropbox/CLAUDE_PROJECTS/recoil && .venv/bin/python3 -c "
import sys; sys.path.insert(0, 'pipeline')

# Test with the actual error string from the r2v v2 content filter hit
error_str = 'Output video has sensitive content. content_policy_violation partner_validation_failed'

# Check if production_loop classifies this correctly
# (the exact function name may vary — adapt based on what Phase 7 built)
from orchestrator.production_loop import ProductionLoop
# Or test the pattern matching directly
patterns = ['content_policy', 'sensitive content', 'nsfw', 'safety']
matched = any(p in error_str.lower() for p in patterns)
assert matched, f'Content filter not detected in: {error_str}'
print('Error classification OK')
"
```
**Stop if:** ImportError on `production_loop`, or the content filter pattern is not matched.
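
The pattern match above only checks a positive case. A sketch that wraps the same patterns in a function so a negative case can be checked too; `is_content_filter_error` is a hypothetical name, and the function Phase 7 actually built may differ:

```python
# Same patterns as the inline check in Test 0F.
CONTENT_FILTER_PATTERNS = ("content_policy", "sensitive content", "nsfw", "safety")


def is_content_filter_error(message: str) -> bool:
    """True if the provider error text looks like a content-filter rejection."""
    lowered = message.lower()
    return any(pattern in lowered for pattern in CONTENT_FILTER_PATTERNS)
```

An unrelated failure such as a timeout should return `False`, so it is not routed into the softening path.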

---

## TIER 1: Real API Tests (~$2.50 at Atlas Cloud)

**IMPORTANT:** These tests use real API calls and cost real money. Run only after ALL Tier 0 tests pass.
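
The per-test costs below are consistent with roughly $0.10 per generated second at Atlas Cloud. That rate is inferred from the figures in this plan (~$0.50 for 5s, ~$1.00 for 10s), not a quoted price. A sketch of the arithmetic, with `estimate_cost_usd` as a hypothetical helper:

```python
# Assumption: inferred from ~$0.50 per 5s pass; not a published rate.
ATLAS_USD_PER_SECOND = 0.10


def estimate_cost_usd(durations_s, usd_per_second=ATLAS_USD_PER_SECOND):
    """Estimated spend for a list of pass durations, rounded to cents."""
    return round(sum(durations_s) * usd_per_second, 2)


tier_1 = estimate_cost_usd([5, 10, 10])  # Tests 1A, 1B, 1C
```

With these durations `tier_1` comes to 2.5, matching the ~$2.50 Tier 1 total.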

### Test 1A: Single-Pass End-to-End (THE critical test)
**Validates:** Full pipeline path — prompt build → ref resolution → execute_pass() → video saved → PassStore updated
**Cost:** ~$0.50 at Atlas Cloud (5s) or ~$1.52 at fal.ai
**Provider:** Atlas Cloud preferred

```
Run via Claude Code:
1. Build a 1-character, 2-segment, 5s coverage pass from EP001 plan data
2. Use Sadie hero (clean bg) + apartment as refs
3. Call execute_pass() through production_loop or directly
4. Verify: video file exists, PassStore has record, PassResult.success == True
```
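
Step 4 can be turned into a single check. A sketch, assuming the PassStore record is a dict with a `takes` list (as in Test 0D) and that `PassResult.success` is passed in as a bool; `verify_pass_outputs` is a hypothetical helper:

```python
from pathlib import Path


def verify_pass_outputs(video_path: Path, pass_record, result_success: bool) -> list:
    """Return a list of human-readable failures; an empty list means the pass checks out."""
    failures = []
    if not video_path.is_file():
        failures.append(f"video missing: {video_path}")
    elif video_path.stat().st_size == 0:
        failures.append(f"video is empty: {video_path}")
    if pass_record is None:
        failures.append("no PassStore record")
    elif not pass_record.get("takes"):
        failures.append("PassStore record has no takes")
    if not result_success:
        failures.append("PassResult.success is not True")
    return failures
```

Returning every failure at once shows whether the break is in generation, file output, or state recording.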

**Stop if:** This fails. Everything depends on this path working.

---

### Test 1B: Two-Character Pass
**Validates:** Multi-character ref handling — 2 identity ref sets in one pass
**Cost:** ~$1.00 at Atlas Cloud (10s)

```
Run via Claude Code:
1. Build a 2-character (Sadie + Dusty), 2-segment, 10s pass
2. Use both character heroes (clean bg) + location ref
3. Verify: prompt declares @ImageN identity refs for both characters, both identity refs uploaded, video generates
```
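
The prompt check in step 3 can be automated. A sketch, assuming every uploaded ref is referenced in the resolved prompt as `@Image1` through `@ImageN` with no gaps (the source does not guarantee this); both function names are hypothetical:

```python
import re


def image_token_indices(prompt: str) -> list:
    """Sorted, de-duplicated N values from every resolved @ImageN token in the prompt."""
    return sorted({int(n) for n in re.findall(r"@Image(\d+)", prompt)})


def tokens_match_refs(prompt: str, ref_count: int) -> bool:
    """True if the prompt uses exactly @Image1..@Image<ref_count>, with no gaps."""
    return image_token_indices(prompt) == list(range(1, ref_count + 1))
```

Unresolved placeholders such as `@Image{identity_1}` are not counted, so a partially resolved prompt fails the check.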

---

### Test 1C: Multi-Segment Cut Validation
**Validates:** 3-segment pass produces detectable cuts + scene detection finds them
**Cost:** ~$1.00 at Atlas Cloud (10s)

```
Run via Claude Code:
1. Build a 3-segment, 10s pass (3s + 4s + 3s)
2. Generate via execute_pass()
3. Run validate_cut_count(video, expected_cuts=2)
4. Verify: detected cuts >= 1, boundary frames extracted
```
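
For the 3s + 4s + 3s pass, cuts are expected at the cumulative segment boundaries, 3s and 7s. A sketch of that arithmetic and a tolerance match against detected cuts; the 0.5s tolerance and both helper names are assumptions:

```python
from itertools import accumulate


def expected_cut_times(segment_durations_s):
    """Cumulative segment boundaries, excluding the end of the final segment."""
    return list(accumulate(segment_durations_s))[:-1]


def match_cuts(expected, detected, tolerance_s=0.5):
    """Return the expected cut times that have a detected cut within tolerance_s."""
    return [e for e in expected if any(abs(e - d) <= tolerance_s for d in detected)]
```

Comparing `match_cuts(...)` with the expected list shows which boundary the model skipped, which a bare cut count does not.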

---

## TIER 2: If Time Permits (~$2.00-2.50 at Atlas Cloud)

### Test 2A: Content Filter Recovery
**Validates:** Softening path fires on content filter hit
**Cost:** ~$0.50-1.00

Use the known-bad prompt from r2v v2. Verify classification → soften → retry fires.

### Test 2B: 15s Maximum Duration
**Validates:** Max-duration passes don't time out
**Cost:** ~$1.50

Build a 3-segment, 15s pass. Verify video duration and cuts.
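
Video duration can be read with `ffprobe`, which ships with ffmpeg. A sketch, assuming a 0.5s tolerance around the requested 15s; the tolerance and both function names are hypothetical:

```python
import subprocess


def probe_duration_s(video_path: str) -> float:
    """Ask ffprobe for the container duration in seconds."""
    cmd = [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        video_path,
    ]
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return float(completed.stdout.strip())


def duration_ok(actual_s: float, target_s: float = 15.0, tolerance_s: float = 0.5) -> bool:
    """True if the clip length is within tolerance_s of the requested duration."""
    return abs(actual_s - target_s) <= tolerance_s
```

`duration_ok(probe_duration_s(path))` covers the duration half of this test; the cut check is the same as in Test 1C.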

---

## Success Criteria

- All Tier 0 tests pass (zero cost)
- Test 1A succeeds (video file + PassStore record)
- Test 1B produces video with 2 characters referenced
- Test 1C has >= 1 detected cut

If all pass → overnight EP01 run is cleared for launch.

---

## After Tests Pass

1. Review test videos in workspace (port 8450)
2. Run `/pass estimate --episode 1` to get overnight cost estimate
3. Set budget cap, StopOnReview.NEVER, auto-approve thresholds
4. Launch overnight run via `/pass generate --episode 1 --all`
