# BUILD_SPEC — Visual Feedback System

**Generated:** 2026-03-31
**Input:** `~/Dropbox/CLAUDE_PROJECTS/consultations/recoil/visual-feedback-system/SYNTHESIS.md`
**Detail level:** max
**Visual design:** no
**Phases:** 6
**Estimated build time:** 3-4 hours (harness execution)

## Validation command

```bash
cd /Users/joeturnerlin/Dropbox/CLAUDE_PROJECTS/recoil && python3 -c "
import ast, sys, pathlib
files = [
    'execution/step_types.py',
    'execution/execution_store.py',
    'execution/step_runner.py',
    'pipeline/lib/validation.py',
    'execution/healer/agent.py',
    'execution/healer/fix_registry.py',
    'execution/healer/constants.py',
    'pipeline/editors/review_server.py',
]
ok = True
for f in files:
    try:
        ast.parse(pathlib.Path(f).read_text())
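    except SyntaxError as e:
        print(f'SYNTAX ERROR: {f}: {e}', file=sys.stderr)
        ok = False
    except OSError as e:
        # A missing or unreadable file is reported like any other failure
        print(f'READ ERROR: {f}: {e}', file=sys.stderr)
        ok = False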
if ok:
    print('All files parse OK')
sys.exit(0 if ok else 1)
"
```

---

## Phase 1: DEFERRED Verdict Infrastructure

**Goal:** Add a `deferred` field to the gate verdict system and wire it through StepRunner and ExecutionStore so that gates can flag shots for mandatory human review without blocking the pipeline.

### Files to modify

- `execution/step_types.py` — Add `deferred` field to `GateVerdict`
- `execution/execution_store.py` — Add `deferred` and `deferred_reason` to shot schema
- `execution/step_runner.py` — Track deferred verdicts in both `execute_video()` and `execute_keyframe()`, persist to store

### Exact implementation

**In `execution/step_types.py`, modify the `GateVerdict` dataclass (line 52-60):**

Replace the existing GateVerdict with:

```python
@dataclass(frozen=True)
class GateVerdict:
    """Result of a single QC gate evaluation.

    When deferred=True, the pipeline continues (passed is effectively True)
    but the shot is flagged for mandatory human review before final export.
    Used by Gate 3 video drift and as a fail-open fallback when gate APIs fail.
    """
    passed: bool
    gate_name: str
    reason: str
    details: dict[str, Any] = field(default_factory=dict)
    cost: float = 0.0
    retriable: bool = True
    deferred: bool = False
```
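
As a rough illustration of how the new field behaves, the sketch below uses a stand-in copy of the dataclass (so it runs without the `recoil` package). It shows that call sites which never mention `deferred` are unaffected by the default, and that a deferred verdict still counts as passed. The gate names and reason strings are illustrative values, not taken from the codebase.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from typing import Any


@dataclass(frozen=True)
class GateVerdict:
    """Stand-in copy of the dataclass above, so the sketch runs on its own."""
    passed: bool
    gate_name: str
    reason: str
    details: dict[str, Any] = field(default_factory=dict)
    cost: float = 0.0
    retriable: bool = True
    deferred: bool = False


# A call site written before this phase: `deferred` falls back to False.
legacy = GateVerdict(passed=True, gate_name="gate_1", reason="ok")

# A deferred verdict passes the gate but carries the review flag.
flagged = GateVerdict(
    passed=True,
    gate_name="gate_3",
    reason="Video drift: 2 frames flagged",
    deferred=True,
)

# frozen=True means a verdict cannot be mutated after construction.
try:
    flagged.deferred = False
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the field has a default and sits last, positional construction of existing verdicts should keep working.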

**In `execution/execution_store.py`, modify `_SHOT_FIELDS` (line 110-115):**

```python
_SHOT_FIELDS = (
    "shot_id", "episode_id", "pipeline", "model", "status", "job_id",
    "session_id", "gate_results", "cost_incurred", "retry_waste_cost",
    "output_path", "error_message", "attempts", "max_attempts", "takes",
    "updated_at", "is_coverage", "coverage_of", "deferred", "deferred_reason",
)
```

**In `execution/execution_store.py`, modify `_SHOT_DEFAULTS` (line 117-135), add at end before closing brace:**

```python
    "deferred": False,
    "deferred_reason": None,
```

**In `execution/step_runner.py`, modify `execute_video()` gate loop (lines 314-324):**

Replace the existing gate loop block:

```python
            # 7. Run gates (if any)
            gate_verdict = None
            if gates:
                shot_data = self._store.get_shot(shot_id) or {}
                for gate_fn in gates:
                    verdict = gate_fn(video_path, shot_data)
                    gate_verdict = verdict
                    if not verdict.passed:
                        cost += verdict.cost
                        break
                    cost += verdict.cost
```

With:

```python
            # 7. Run gates (if any)
            gate_verdict = None
            any_deferred = False
            deferred_reason = ""
            if gates:
                shot_data = self._store.get_shot(shot_id) or {}
                for gate_fn in gates:
                    verdict = gate_fn(video_path, shot_data)
                    gate_verdict = verdict
                    if verdict.deferred:
                        any_deferred = True
                        deferred_reason = verdict.reason
                    if not verdict.passed:
                        cost += verdict.cost
                        break
                    cost += verdict.cost
```
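
The loop semantics can be checked in isolation. The sketch below is a simplified mirror of the replacement block, with a hypothetical `Verdict` stand-in and a hypothetical `run_gates()` wrapper (neither exists in `step_runner.py`). It adds the gate cost once before the pass check, which should be equivalent to the two `cost +=` lines above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Hypothetical stand-in for GateVerdict with only the fields the loop reads."""
    passed: bool
    reason: str
    cost: float = 0.0
    deferred: bool = False


def run_gates(gates, video_path, shot_data):
    """Return (last_verdict, cost, any_deferred, deferred_reason)."""
    gate_verdict = None
    cost = 0.0
    any_deferred = False
    deferred_reason = ""
    for gate_fn in gates:
        verdict = gate_fn(video_path, shot_data)
        gate_verdict = verdict
        cost += verdict.cost  # gate cost is charged whether or not the gate passes
        if verdict.deferred:
            any_deferred = True
            deferred_reason = verdict.reason
        if not verdict.passed:
            break  # a hard failure still stops the remaining gates
    return gate_verdict, cost, any_deferred, deferred_reason


gates = [
    lambda path, shot: Verdict(True, "Video drift: 2 frames flagged", cost=0.5, deferred=True),
    lambda path, shot: Verdict(True, "ok", cost=0.25),
]
last, cost, any_deferred, deferred_reason = run_gates(gates, "shot.mp4", {})
```

Note that a deferred verdict does not break the loop: later gates still run, and the deferred reason survives even when the last verdict is a plain pass.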

**In `execution/step_runner.py`, modify the execute_video() success path — the `update_shot` call around line 383-388:**

Replace:

```python
            self._store.update_shot(
                shot_id,
                output_path=rel_path,
                cost_incurred=cost,
                gate_results={"video_path": rel_path},
            )
```

With:

```python
            update_fields = {
                "output_path": rel_path,
                "cost_incurred": cost,
                "gate_results": {"video_path": rel_path},
            }
            if any_deferred:
                update_fields["deferred"] = True
                update_fields["deferred_reason"] = deferred_reason
            self._store.update_shot(shot_id, **update_fields)
```

**In `execution/step_runner.py`, modify `execute_keyframe()` gate loop (lines 870-880):**

Replace:

```python
                gate_verdict = None
                gate_failed = False
                if gates:
                    shot_data = self._store.get_shot(shot_id) or {}
                    for gate_fn in gates:
                        verdict = gate_fn(keyframe_path, shot_data)
                        gate_verdict = verdict
                        attempt_cost += verdict.cost
                        total_cost += verdict.cost
                        if not verdict.passed:
                            gate_failed = True
```

With:

```python
                gate_verdict = None
                gate_failed = False
                any_deferred = False
                deferred_reason = ""
                if gates:
                    shot_data = self._store.get_shot(shot_id) or {}
                    for gate_fn in gates:
                        verdict = gate_fn(keyframe_path, shot_data)
                        gate_verdict = verdict
                        attempt_cost += verdict.cost
                        total_cost += verdict.cost
                        if verdict.deferred:
                            any_deferred = True
                            deferred_reason = verdict.reason
                        if not verdict.passed:
                            gate_failed = True
```

**In `execution/step_runner.py`, modify the execute_keyframe() success path — the `update_shot` call around line 1030-1036:**

Replace:

```python
                self._store.update_shot(
                    shot_id,
                    output_path=rel_path,
                    cost_incurred=total_cost,
                    attempts=attempts,
                    gate_results={"keyframe_path": rel_path},
                )
```

With:

```python
                kf_update = {
                    "output_path": rel_path,
                    "cost_incurred": total_cost,
                    "attempts": attempts,
                    "gate_results": {"keyframe_path": rel_path},
                }
                if any_deferred:
                    kf_update["deferred"] = True
                    kf_update["deferred_reason"] = deferred_reason
                self._store.update_shot(shot_id, **kf_update)
```

### Scope boundary

- Do NOT modify the state machine (`VALID_TRANSITIONS`). Deferred is an orthogonal flag, not a state.
- Do NOT modify GateFunction type alias — the protocol stays the same.
- Do NOT modify StepResult — the deferred status lives in the store, not the return value.
- Do NOT change any existing gate behavior — this phase only adds infrastructure.
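
To make the "orthogonal flag, not a state" point concrete, here is a minimal sketch of the success-path update against a hypothetical in-memory store. `InMemoryStore`, `persist_success()`, the shot IDs, and the `"generated"` status value are all assumptions for illustration; the real `ExecutionStore` API may differ.

```python
class InMemoryStore:
    """Hypothetical stand-in for ExecutionStore that keeps shots in a dict."""

    def __init__(self):
        self._shots = {}

    def add_shot(self, shot_id, status):
        # Defaults mirror the two new _SHOT_DEFAULTS entries.
        self._shots[shot_id] = {
            "shot_id": shot_id,
            "status": status,
            "deferred": False,
            "deferred_reason": None,
        }

    def update_shot(self, shot_id, **fields):
        self._shots[shot_id].update(fields)

    def get_shot(self, shot_id):
        return self._shots.get(shot_id)


def persist_success(store, shot_id, rel_path, cost, any_deferred, deferred_reason):
    """Same shape as the success-path update: deferred keys are sent only when set."""
    update_fields = {
        "output_path": rel_path,
        "cost_incurred": cost,
        "gate_results": {"video_path": rel_path},
    }
    if any_deferred:
        update_fields["deferred"] = True
        update_fields["deferred_reason"] = deferred_reason
    store.update_shot(shot_id, **update_fields)


store = InMemoryStore()
store.add_shot("ep01_sh010", status="generated")
store.add_shot("ep01_sh020", status="generated")
persist_success(store, "ep01_sh010", "out/sh010.mp4", 1.5, False, "")
persist_success(store, "ep01_sh020", "out/sh020.mp4", 1.5, True, "Video drift: 2 frames flagged")
```

Both shots keep the same `status`; only the second one carries the flag, which is what lets the state machine stay untouched.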

### Validation

```bash
cd /Users/joeturnerlin/Dropbox/CLAUDE_PROJECTS/recoil && \
python3 -c "import ast; ast.parse(open('execution/step_types.py').read())" && \
python3 -c "import ast; ast.parse(open('execution/execution_store.py').read())" && \
python3 -c "import ast; ast.parse(open('execution/step_runner.py').read())" && \
grep -q 'deferred: bool = False' execution/step_types.py && \
grep -q '"deferred"' execution/execution_store.py && \
grep -q '"deferred_reason"' execution/execution_store.py && \
grep -q 'any_deferred' execution/step_runner.py && \
grep -q 'deferred_reason' execution/step_runner.py && \
echo "Phase 1 OK"
```

---

## Phase 2: Gate 3 DEFERRED Mode + Video Drift Gate Factory

**Goal:** Convert Gate 3 from hard-fail to DEFERRED behavior. Create a StepRunner gate factory for video drift. Update the legacy pipeline.py path.

### What already exists (from Phase 1)

- Phase 1 added `deferred: bool = False` to `GateVerdict`
- Phase 1 added deferred tracking in `execute_video()` and `execute_keyframe()`
- Phase 1 added `deferred` and `deferred_reason` fields to ExecutionStore

### Files to modify

- `pipeline/lib/validation.py` — Gate 3 now returns `passed=True` with drift metadata when drift detected (instead of `passed=False`)
- `execution/step_runner.py` — Add `make_video_drift_gate()` factory function
- `pipeline/orchestrator/pipeline.py` — Update `_run_gate_3_on_video()` for DEFERRED

### Exact implementation

**In `pipeline/lib/validation.py`, modify `run_gate_3()` return when drift detected (lines 1082-1096):**

Replace the final return block (after the expanded check):

```python
        # Flag for review if 2+ frames show drift (majority vote)
        all_passed = drift_count < 2

        return ValidationResult(
            gate="gate_3",
            passed=all_passed,
            details={
                "frames": frame_results,
                "drift_count": drift_count,
                "strategy": "progressive_expanded",
                "flagged_for_review": not all_passed,
            },
            model=GATE_MODEL,
            cost=total_cost,
        )
```

With:

```python
        # DEFERRED mode: drift detected → passed=True but flagged for human review.
        # Pipeline continues generating; deferred shots block final export.
        drift_detected = drift_count >= 2

        return ValidationResult(
            gate="gate_3",
            passed=True,  # Always pass — Gate 3 uses DEFERRED, not auto-reject
            details={
                "frames": frame_results,
                "drift_count": drift_count,
                "strategy": "progressive_expanded",
                "flagged_for_review": drift_detected,
                "deferred": drift_detected,
            },
            model=GATE_MODEL,
            cost=total_cost,
        )
```
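
The threshold logic can be sketched on its own. The helper below is hypothetical (`gate_3_details()` is not a function in `validation.py`), and it assumes that `drift_count` is the number of frame results whose `"pass"` value is false. The spec does not show how `drift_count` is computed, so treat that part as a guess.

```python
def gate_3_details(frame_results):
    """Build the Gate 3 details dict from per-frame results (hypothetical helper)."""
    # Assumption: a frame counts as drift when its "pass" entry is False.
    drift_count = sum(1 for frame in frame_results if not frame.get("pass", True))
    drift_detected = drift_count >= 2  # same 2+ threshold as before
    return {
        "frames": frame_results,
        "drift_count": drift_count,
        "strategy": "progressive_expanded",
        "flagged_for_review": drift_detected,
        "deferred": drift_detected,
    }


one_bad = gate_3_details([{"pass": True}, {"pass": False}, {"pass": True}])
two_bad = gate_3_details([{"pass": False}, {"pass": False}, {"pass": True}])
```

A single drifting frame is still treated as noise; two or more flip `deferred` without changing `passed`.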

**In `pipeline/lib/validation.py`, modify `_check_identity_drift()` error handler (lines 1124-1126):**

Replace:

```python
        except Exception as e:
            logger.error("Gate 3 drift check at %d%% failed: %s", pct, e)
            return {"pass": True, "reason": f"Check failed: {e}"}
```

With:

```python
        except Exception as e:
            logger.error("Gate 3 drift check at %d%% failed: %s", pct, e)
            return {"pass": True, "reason": f"Check failed: {e}", "api_error": True}
```

**In `pipeline/lib/validation.py`, modify `run_gate_3()` spot-check success return (lines 1056-1064):**

After the spot check passes, check if the frame had an API error:

Replace:

```python
        if spot_result.get("pass", True):
            # 50% passed — assume video is stable, done
            return ValidationResult(
                gate="gate_3",
                passed=True,
                details={"frames": frame_results, "strategy": "progressive_spot_check"},
                model=GATE_MODEL,
                cost=total_cost,
            )
```

With:

```python
        if spot_result.get("pass", True):
            # 50% passed — assume video is stable, done
            had_api_error = spot_result.get("api_error", False)
            return ValidationResult(
                gate="gate_3",
                passed=True,
                details={
                    "frames": frame_results,
                    "strategy": "progressive_spot_check",
                    "deferred": had_api_error,
                    "flagged_for_review": had_api_error,
                },
                model=GATE_MODEL,
                cost=total_cost,
            )
```

**In `execution/step_runner.py`, add `make_video_drift_gate()` factory after `make_identity_gate()` (after line 90):**

```python
def make_video_drift_gate(
    ref_paths: list[Path],
    shot_metadata: Optional[dict] = None,
) -> GateFunction:
    """Factory: creates Gate 3 video drift check as a GateFunction.

    Returns DEFERRED verdict when drift detected — pipeline continues
    but shot is flagged for mandatory human review before export.
    API failures also produce DEFERRED (fail-open-but-flag).
    """
    def gate_fn(video_path: Path, shot_data: dict) -> GateVerdict:
        from core.paths import ensure_starsend_importable
        ensure_starsend_importable()
        from lib.validation import Validator
        v = Validator()
        result = v.run_gate_3(
            video_path=video_path,
            ref_paths=ref_paths,
            shot_metadata=shot_metadata or shot_data,
        )
        deferred = result.details.get("deferred", False)
        drift_count = result.details.get("drift_count", 0)
        if not deferred:
            reason = "No drift detected"
        elif drift_count:
            reason = f"Video drift: {drift_count} frames flagged"
        else:
            # Deferred without a drift count: the spot check hit an API error
            reason = "Gate 3 check failed; flagged for review"

        return GateVerdict(
            passed=True,  # Always pass: Gate 3 uses DEFERRED, not reject
            gate_name="gate_3",
            reason=reason,
            details=result.details,
            cost=result.cost,
            retriable=False,
            deferred=deferred,
        )
    return gate_fn
```

**In `pipeline/orchestrator/pipeline.py`, modify `_run_gate_3_on_video()` (around line 651-660):**

Replace:

```python
def _run_gate_3_on_video(video_path, context: PipelineContext, ref_paths=None):
    """Run Gate 3 (video drift). Fail-closed on API errors."""
    try:
        from lib.validation import Validator
        validator = Validator()
        g3 = validator.run_gate_3(video_path, ref_paths or [])
        return {"passed": g3.passed, "details": g3.details, "cost": g3.cost,
                "flagged_for_review": not g3.passed}
    except ImportError:
        return {"passed": True, "cost": 0.0}
    except Exception as e:
```

With:

```python
def _run_gate_3_on_video(video_path, context: PipelineContext, ref_paths=None):
    """Run Gate 3 (video drift). DEFERRED on drift or API failure."""
    try:
        from lib.validation import Validator
        validator = Validator()
        g3 = validator.run_gate_3(video_path, ref_paths or [])
        deferred = g3.details.get("deferred", False)
        return {"passed": True, "deferred": deferred, "details": g3.details,
                "cost": g3.cost, "flagged_for_review": deferred}
    except ImportError:
        return {"passed": True, "deferred": False, "cost": 0.0}
    except Exception as e:
```

Note: Keep the body of the `except Exception as e:` branch, but read the surrounding code and add `"deferred": True` to the dict it returns. API errors are now fail-open-but-flag, so if that branch currently returns `"passed": False` (the old docstring says fail-closed), change it to `"passed": True` as well.
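
A minimal sketch of the intended fail-open-but-flag shape, under stated assumptions: `run_gate_3_deferred()` is a hypothetical wrapper that takes the gate callable as a parameter so it can run without the pipeline, and the `"details": {"error": ...}` payload in the exception branch is illustrative. The real branch may log or return different keys.

```python
from types import SimpleNamespace


def run_gate_3_deferred(run_gate_3, video_path, ref_paths=None):
    """Map every Gate 3 outcome to the legacy result dict (hypothetical wrapper)."""
    try:
        g3 = run_gate_3(video_path, ref_paths or [])
    except Exception as e:
        # Fail-open-but-flag: generation continues, a human must review the shot.
        return {
            "passed": True,
            "deferred": True,
            "cost": 0.0,
            "flagged_for_review": True,
            "details": {"error": str(e)},  # assumed payload, not from the source
        }
    deferred = g3.details.get("deferred", False)
    return {"passed": True, "deferred": deferred, "details": g3.details,
            "cost": g3.cost, "flagged_for_review": deferred}


def healthy_gate(video_path, ref_paths):
    """Stand-in for Validator.run_gate_3 on a clean video."""
    return SimpleNamespace(details={"deferred": False}, cost=0.5)


def broken_gate(video_path, ref_paths):
    """Stand-in for a gate whose API call fails."""
    raise RuntimeError("vision API timeout")


ok_result = run_gate_3_deferred(healthy_gate, "shot.mp4")
err_result = run_gate_3_deferred(broken_gate, "shot.mp4")
```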

### Scope boundary

- Do NOT change Gate 3's detection logic (progressive sampling, 2+ majority vote). Only change what happens with the result.
- Do NOT modify Gates 0, 1, or 2.
- Do NOT add new models or change GATE_MODEL.
- Do NOT modify the frame extraction helpers.

### Validation

```bash
cd /Users/joeturnerlin/Dropbox/CLAUDE_PROJECTS/recoil && \
python3 -c "import ast; ast.parse(open('pipeline/lib/validation.py').read())" && \
python3 -c "import ast; ast.parse(open('execution/step_runner.py').read())" && \
python3 -c "import ast; ast.parse(open('pipeline/orchestrator/pipeline.py').read())" && \
grep -q 'make_video_drift_gate' execution/step_runner.py && \
grep -q '"deferred": drift_detected' pipeline/lib/validation.py && \
grep -q 'DEFERRED' pipeline/orchestrator/pipeline.py && \
echo "Phase 2 OK"
```

---

## Phase 3: Dailies Deferred UI

**Goal:** Surface deferred shots in the Dailies tab with amber indicator and deferred-first sorting. Add deferred count to the priority queue and video clips APIs.

### What already exists (from prior phases)

- Phase 1 added `deferred` and `deferred_reason` fields to shot records in ExecutionStore
- Phase 2 stores `deferred=True` on shots with video drift or API failures
- The Dailies priority queue (`_api_dailies`) has P1-P5 tiers sorted by priority
- The Dailies videos API (`_api_dailies_videos`) returns clip objects
- Clip items in `dailies.js` render status as APPROVED/REJECTED/BINNED/PENDING

### Files to modify

- `pipeline/editors/review_server.py` — Add deferred data to both `_api_dailies()` and `_api_dailies_videos()`
- `pipeline/editors/tabs/dailies.js` — Add amber DEFERRED indicator and deferred-first sort

### Exact implementation

**In `pipeline/editors/review_server.py`, modify `_api_dailies()` (lines 2404-2501):**

Add a new P0 priority tier for deferred shots. Insert this block after `items = []` (line 2411) and before the P1 block (line 2413):

```python
        # P0: DEFERRED — shots that passed but need mandatory human review
        all_shots = store.get_all_shots() if hasattr(store, 'get_all_shots') else []
        deferred_count = 0
        for shot in all_shots:
            if shot.get("deferred") and shot.get("status") not in ("failed", "abandoned"):
                deferred_count += 1
                items.append({
                    "priority": 0,
                    "shot_id": shot["shot_id"],
                    "episode_id": shot["episode_id"],
                    "status": shot["status"],
                    "deferred": True,
                    "deferred_reason": shot.get("deferred_reason", ""),
                    "output_path": shot.get("output_path", ""),
                    "takes": shot.get("takes", []),
                    "actions": ["approve", "reject"],
                })
```

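Check whether ExecutionStore actually defines `get_all_shots()` before using the block above: the `hasattr` guard falls back to an empty list, so a missing method would silently produce no P0 items. If the method doesn't exist, use this alternative approach instead: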

```python
        # P0: DEFERRED — scan all shot files for deferred flag
        import json as _json_d
        deferred_count = 0
        for shot_file in store.shots_dir.glob("*.json"):
            try:
                shot = _json_d.loads(shot_file.read_text())
            except (OSError, ValueError):
                continue  # Unreadable or malformed shot file: skip it
            if not isinstance(shot, dict) or "shot_id" not in shot:
                continue  # Not a shot record
            if shot.get("deferred") and shot.get("status") not in ("failed", "abandoned"):
                deferred_count += 1
                items.append({
                    "priority": 0,
                    "shot_id": shot["shot_id"],
                    "episode_id": shot.get("episode_id", ""),
                    "status": shot.get("status", ""),
                    "deferred": True,
                    "deferred_reason": shot.get("deferred_reason", ""),
                    "output_path": shot.get("output_path", ""),
                    "takes": shot.get("takes", []),
                    "actions": ["approve", "reject"],
                })

Also update the response to include `deferred_count`. In the final `self._json_response` call (line 2501):

Replace:

```python
        needs_action = sum(1 for i in items if i["priority"] <= 4)
        self._json_response({"items": items, "total": len(items), "needs_action": needs_action})
```

With:

```python
        needs_action = sum(1 for i in items if i["priority"] <= 4)
        self._json_response({
            "items": items,
            "total": len(items),
            "needs_action": needs_action,
            "deferred_count": deferred_count,
        })
```
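
One consequence worth checking is that P0 items are counted in `needs_action`, since priority 0 satisfies `<= 4`. The sketch below uses a hypothetical `summarize()` helper and made-up items. Unlike the handler, which tallies `deferred_count` while building the P0 tier, it derives the count from the items so that it can run standalone.

```python
def summarize(items):
    """Build the response payload from a list of queue items (hypothetical helper)."""
    needs_action = sum(1 for i in items if i["priority"] <= 4)
    deferred_count = sum(1 for i in items if i.get("deferred"))
    return {
        "items": items,
        "total": len(items),
        "needs_action": needs_action,
        "deferred_count": deferred_count,
    }


payload = summarize([
    {"priority": 0, "shot_id": "ep01_sh020", "deferred": True},
    {"priority": 2, "shot_id": "ep01_sh030"},
    {"priority": 5, "shot_id": "ep01_sh040"},
])
```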

**In `pipeline/editors/review_server.py`, modify `_api_dailies_videos()` — in the clip dict construction (around line 2573-2588):**

Add `deferred` field to each clip. After `"binned": file_path in binned_set,` add:

```python
                        "deferred": shot.get("deferred", False),
                        "deferred_reason": shot.get("deferred_reason", ""),
```

Also update the response to include `deferred_count`. Where the method returns the response, add to the response dict:

```python
"deferred_count": sum(1 for c in clips if c.get("deferred")),
```

**In `pipeline/editors/tabs/dailies.js`, modify `_clipItemHTML()` (line 178-197):**

Replace the status div (line 192-194):

```javascript
          <div class="dailies-clip-status ${clip.approved ? 'approved' : clip.rejected ? 'rejected' : clip.binned ? 'binned' : 'pending'}">
            ${clip.approved ? 'APPROVED' : clip.rejected ? 'REJECTED' : clip.binned ? 'BINNED' : 'PENDING'}
          </div>
```

With:

```javascript
          <div class="dailies-clip-status ${clip.deferred ? 'deferred' : clip.approved ? 'approved' : clip.rejected ? 'rejected' : clip.binned ? 'binned' : 'pending'}">
            ${clip.deferred ? 'DEFERRED' : clip.approved ? 'APPROVED' : clip.rejected ? 'REJECTED' : clip.binned ? 'BINNED' : 'PENDING'}
          </div>
```

**In `pipeline/editors/tabs/dailies.js`, add CSS for the deferred state.**

Find where the `.dailies-clip-status` CSS styles are defined (likely in the buildDOM function or a `<style>` block). Add this rule alongside the existing status styles:

```css
.dailies-clip-status.deferred { color: #f59e0b; font-weight: 600; }
```

If the status styles are defined in an external CSS file, add it there. If they're inline in the JS, add alongside the existing status color rules.

**In `pipeline/editors/tabs/dailies.js`, add deferred-first sort.**

Find the `getFiltered()` function. After any existing sort logic, add deferred-first sorting. If clips are currently unsorted or sorted by episode/shot, wrap the sort to put deferred items first:

```javascript
// Deferred-first partition. Array.prototype.sort is stable (ES2019+), so ties keep their existing order.
filtered.sort((a, b) => {
    if (a.deferred && !b.deferred) return -1;
    if (!a.deferred && b.deferred) return 1;
    return 0;  // Preserve existing order for same-deferred-status items
});
```

### Scope boundary

- Do NOT add new API endpoints. The deferred data piggybacks on existing `/api/dailies` and `/api/dailies/videos` responses.
- Do NOT modify the approve/reject logic — deferred shots use the same approve/reject actions.
- Do NOT add a "resolve deferred" endpoint — approving a deferred shot clears it from the deferred list naturally.
- Do NOT modify the badge count logic beyond what's needed for deferred display.

### Validation

```bash
cd /Users/joeturnerlin/Dropbox/CLAUDE_PROJECTS/recoil && \
python3 -c "import ast; ast.parse(open('pipeline/editors/review_server.py').read())" && \
grep -q 'deferred_count' pipeline/editors/review_server.py && \
grep -q 'deferred_reason' pipeline/editors/review_server.py && \
grep -q 'deferred' pipeline/editors/tabs/dailies.js && \
grep -q 'DEFERRED' pipeline/editors/tabs/dailies.js && \
echo "Phase 3 OK"
```

---

## Phase 4: CROP_TO_CLOSEUP Healing Strategy

**Goal:** Add a last-resort healing strategy that crops a medium/wide shot to a close-up to salvage the character's face when all other strategies have failed. Includes guards to prevent overuse.

### What already exists (from prior phases)

- `HealingStrategy` enum in `execution/healer/agent.py` has 12 strategies (ANATOMY_ANCHOR through VIDEO_MOTION_SIMPLIFY)
- `FIX_REGISTRY` in `execution/healer/fix_registry.py` has 11 entries (5 Gate 1, 3 Gate 2A, 3 Gate 2B)
- `match_fix()` in `fix_registry.py` does keyword matching, first match wins, has wardrobe escalation
- `HealerAgent.diagnose()` in `agent.py` calls `match_fix()` after never-heal and severity checks
- `RefChanges` dataclass has `face_crop_expression_refs` flag but no crop-to-closeup concept
- `constants.py` has `NEVER_HEAL`, `SEVERITY_CEILING`, retry limits

### Files to modify

- `execution/healer/agent.py` — Add CROP_TO_CLOSEUP to HealingStrategy enum, add guard logic in diagnose()
- `execution/healer/fix_registry.py` — No changes. `match_fix()` still runs first; CROP_TO_CLOSEUP is a fallback in `diagnose()`, not a registry entry (see Scope boundary)
- `execution/healer/constants.py` — Add CROP_TO_CLOSEUP constants

### Exact implementation

**In `execution/healer/agent.py`, add to `HealingStrategy` enum (after line 31):**

```python
    CROP_TO_CLOSEUP = "crop_to_closeup"
```

**In `execution/healer/agent.py`, add to `RefChanges` dataclass (after line 41, before the properties):**

```python
    crop_to_closeup: bool = False
```

**In `execution/healer/agent.py`, modify `HealerAgent.diagnose()` (lines 126-165).**

Add CROP_TO_CLOSEUP fallback after the deterministic fix check (after line 161). Replace the LVLM fallback section:

```python
        # Deterministic fix from registry
        fix = match_fix(verdict, attempt_number)
        if fix:
            logger.info("Healing: %s (confidence %.2f)", fix.strategy.value, fix.confidence)
            return fix

        # LVLM fallback (Phase 2 — stub for now)
        logger.info("No deterministic fix found. LVLM fallback not yet implemented.")
        return None
```

With:

```python
        # Deterministic fix from registry
        fix = match_fix(verdict, attempt_number)
        if fix:
            logger.info("Healing: %s (confidence %.2f)", fix.strategy.value, fix.confidence)
            return fix

        # CROP_TO_CLOSEUP fallback — last resort before ICU escalation
        fix = self._try_crop_to_closeup(verdict, attempt_number)
        if fix:
            logger.info("Healing: CROP_TO_CLOSEUP (last resort, confidence %.2f)", fix.confidence)
            return fix

        # LVLM fallback (Phase 2 — stub for now)
        logger.info("No deterministic fix found. LVLM fallback not yet implemented.")
        return None
```

**In `execution/healer/agent.py`, add `_try_crop_to_closeup()` method to HealerAgent class (after `diagnose()`, before `log_attempt()`):**

```python
    def _try_crop_to_closeup(
        self,
        verdict,
        attempt_number: int,
    ) -> Optional[HealingFix]:
        """Last-resort: crop a medium+ shot to close-up to salvage the face.

        Guards (ALL must pass):
        - Original framing is MS or wider (CU/MCU/ECU already tight)
        - Not an action or establishing shot (would break narrative)
        - All other strategies exhausted (attempt >= 2)
        - Max 2 consecutive CUs not exceeded (tracked per-session)
        - Shot verb_strength is LOW (vague action survives reframing)

        Synthesis: Track crop % per series — >5% means upstream ref/prompt problem.
        """
        from .constants import (
            CROP_CLOSEUP_MIN_FRAMING_ORDER,
            CROP_CLOSEUP_BLOCKED_SHOT_TYPES,
            CROP_CLOSEUP_MIN_ATTEMPT,
            CROP_CLOSEUP_MAX_CONSECUTIVE,
        )

        # Guard: must have failed enough times
        if attempt_number < CROP_CLOSEUP_MIN_ATTEMPT:
            return None

        # Guard: check framing from verdict details
        shot_type = getattr(verdict, 'details', {}).get("shot_type", "")
        if not shot_type:
            shot_type = getattr(verdict, 'details', {}).get("framing", "")
        shot_upper = shot_type.upper() if shot_type else ""

        # Framing order (wide to tight): EWS=0, WS=1, MWS=2, MS=3, MCU=4, CU=5, ECU=6
        FRAMING_ORDER = {"EWS": 0, "WS": 1, "MWS": 2, "MS": 3, "MCU": 4, "CU": 5, "ECU": 6, "OTS": 3}
        framing_rank = FRAMING_ORDER.get(shot_upper, -1)
        if framing_rank < 0 or framing_rank > CROP_CLOSEUP_MIN_FRAMING_ORDER:
            return None  # Unknown framing, or already tighter than MS

        # Guard: not action or establishing
        if shot_upper in CROP_CLOSEUP_BLOCKED_SHOT_TYPES:
            return None

        # Guard: consecutive CU limit
        recent_crops = sum(
            1 for a in self.attempt_history[-CROP_CLOSEUP_MAX_CONSECUTIVE:]
            if a.strategy == HealingStrategy.CROP_TO_CLOSEUP
        )
        if recent_crops >= CROP_CLOSEUP_MAX_CONSECUTIVE:
            logger.warning("CROP_TO_CLOSEUP: max consecutive (%d) reached", CROP_CLOSEUP_MAX_CONSECUTIVE)
            return None

        # Guard: verb_strength should be LOW (from PlanPassCritic)
        verb_strength = getattr(verdict, 'details', {}).get("verb_strength", "")
        if verb_strength and verb_strength.upper() not in ("LOW", ""):
            return None

        return HealingFix(
            strategy=HealingStrategy.CROP_TO_CLOSEUP,
            layer_patches={
                "composition": "REWRITE: Reframe as a CLOSE-UP. Focus tightly on the character's face and upper shoulders. Crop out the wider environment.",
                "quality_guard": "APPEND: Ensure face fills at least 40% of the frame. _cropped suffix applied.",
            },
            ref_changes=RefChanges(
                keep_only=["hero"],
                face_crop_expression_refs=True,
                crop_to_closeup=True,
            ),
            negative_prompt_additions=["wide shot", "full body", "establishing shot"],
            confidence=0.30,
            rationale=f"Last-resort crop-to-closeup: {shot_upper} → CU (attempt {attempt_number})",
            diagnosis_cost=0.00,
        )
```
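
The docstring's note about tracking crop percentage per series is not implemented in this phase. If it is wired up later, the check might look roughly like the sketch below. `crop_creep_exceeded()` and its parameters are hypothetical; only the 5% threshold comes from the spec.

```python
CROP_CLOSEUP_CREEP_THRESHOLD = 0.05  # same value as the constant in constants.py


def crop_creep_exceeded(crop_count, total_shots, threshold=CROP_CLOSEUP_CREEP_THRESHOLD):
    """Return True when the share of cropped shots in a series exceeds the threshold."""
    if total_shots <= 0:
        return False  # nothing generated yet, so there is no rate to judge
    return crop_count / total_shots > threshold


heavy_use = crop_creep_exceeded(crop_count=3, total_shots=40)   # 7.5% of shots
light_use = crop_creep_exceeded(crop_count=1, total_shots=40)   # 2.5% of shots
empty_series = crop_creep_exceeded(crop_count=0, total_shots=0)
```

A rate above the threshold would suggest fixing references or prompts upstream, not leaning on the crop fallback.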

**In `execution/healer/constants.py`, add CROP_TO_CLOSEUP constants (at end of file):**

```python
# CROP_TO_CLOSEUP healing strategy guards
CROP_CLOSEUP_MIN_FRAMING_ORDER = 3    # Rank of MS, the tightest framing eligible for cropping (lower rank = wider; MCU/CU/ECU are never cropped)
CROP_CLOSEUP_BLOCKED_SHOT_TYPES = frozenset({"EWS", "WS"})  # Establishing/wide shots break narrative
CROP_CLOSEUP_MIN_ATTEMPT = 2          # Only after all other strategies exhausted
CROP_CLOSEUP_MAX_CONSECUTIVE = 2      # Max 2 consecutive crops before ICU escalation
CROP_CLOSEUP_CREEP_THRESHOLD = 0.05   # >5% crop rate = upstream problem
```
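
To see which framings the guards are meant to let through, here is a self-contained sketch of the eligibility rules as the docstring states them (MS or wider, not EWS/WS, enough failed attempts, weak verb). `crop_allowed()` is a hypothetical helper, and it reads `CROP_CLOSEUP_MIN_FRAMING_ORDER` as the highest eligible rank, which is an interpretation of "MS (3) or wider". The consecutive-crop guard is omitted because it depends on attempt history.

```python
FRAMING_ORDER = {"EWS": 0, "WS": 1, "MWS": 2, "MS": 3, "MCU": 4, "CU": 5, "ECU": 6, "OTS": 3}
CROP_CLOSEUP_MIN_FRAMING_ORDER = 3
CROP_CLOSEUP_BLOCKED_SHOT_TYPES = frozenset({"EWS", "WS"})
CROP_CLOSEUP_MIN_ATTEMPT = 2


def crop_allowed(shot_type, attempt_number, verb_strength=""):
    """Return True when the framing, attempt, and verb guards all pass."""
    shot_upper = (shot_type or "").upper()
    rank = FRAMING_ORDER.get(shot_upper, -1)
    if attempt_number < CROP_CLOSEUP_MIN_ATTEMPT:
        return False  # other strategies not yet exhausted
    if rank < 0 or rank > CROP_CLOSEUP_MIN_FRAMING_ORDER:
        return False  # unknown framing, or already tighter than MS
    if shot_upper in CROP_CLOSEUP_BLOCKED_SHOT_TYPES:
        return False  # establishing/wide shots are never cropped
    if verb_strength and verb_strength.upper() != "LOW":
        return False  # a strong action verb would not survive reframing
    return True


eligible = sorted(name for name in FRAMING_ORDER if crop_allowed(name, attempt_number=2))
```

Under these rules only MWS, MS, and OTS qualify, which narrows the fallback to medium framings where the face is already a meaningful part of the frame.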

### Scope boundary

- Do NOT add CROP_TO_CLOSEUP to the fix_registry — it's a fallback in diagnose(), not a keyword-matched entry.
- Do NOT implement actual image cropping — the "crop" is a prompt rewrite that reframes the composition.
- Do NOT modify any existing healing strategies.
- Do NOT modify the StepRunner retry loop — it already handles HealingFix from diagnose().

### Validation

```bash
cd /Users/joeturnerlin/Dropbox/CLAUDE_PROJECTS/recoil && \
python3 -c "import ast; ast.parse(open('execution/healer/agent.py').read())" && \
python3 -c "import ast; ast.parse(open('execution/healer/constants.py').read())" && \
grep -q 'CROP_TO_CLOSEUP' execution/healer/agent.py && \
grep -q 'crop_to_closeup' execution/healer/agent.py && \
grep -q '_try_crop_to_closeup' execution/healer/agent.py && \
grep -q 'CROP_CLOSEUP_MIN_FRAMING_ORDER' execution/healer/constants.py && \
grep -q 'CROP_CLOSEUP_CREEP_THRESHOLD' execution/healer/constants.py && \
echo "Phase 4 OK"
```

---

## Phase 5: Gate 2A Explicit Identity Boolean

**Goal:** Add an explicit `same_person` boolean to Gate 2A's response schema. This gives a direct, unambiguous identity signal independent of the nuanced mismatch analysis.

### What already exists (from prior phases)

- Gate 2A in `pipeline/lib/validation.py` (`run_gate_2a_character()`) already checks identity as one of 5 mismatch categories
- The prompt says "Categories: IDENTITY, HAIRSTYLE, WARDROBE, ACCESSORIES, DISTINGUISHING_MARKS"
- The schema returns `visual_observations` + `mismatches` array
- Identity mismatches with severity CRITICAL are caught but identity is embedded in the broader mismatch flow

### Files to modify

- `pipeline/lib/validation.py` — Add `same_person` boolean to Gate 2A schema and prompt

### Exact implementation

**In `pipeline/lib/validation.py`, modify `run_gate_2a_character()` schema (lines 800-826):**

Replace the schema definition:

```python
        schema = {
            "type": "object",
            "properties": {
                "visual_observations": {
                    "type": "object",
                    "properties": {
                        "observed_hairstyle": {"type": "string"},
                        "observed_wardrobe": {"type": "string"},
                        "observed_accessories": {"type": "string"},
                    },
                    "required": ["observed_hairstyle", "observed_wardrobe", "observed_accessories"],
                },
                "mismatches": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "category": {"type": "string"},
                            "visual_evidence": {"type": "string"},
                            "severity": {"type": "string", "enum": ["MINOR", "NOTICEABLE", "CRITICAL"]},
                        },
                        "required": ["category", "visual_evidence", "severity"],
                    },
                },
            },
            "required": ["visual_observations", "mismatches"],
        }
```

With:

```python
        schema = {
            "type": "object",
            "properties": {
                "same_person": {
                    "type": "boolean",
                    "description": "Is the person in the target keyframe the SAME person as in the casting references? True if same person (even with wardrobe/hair differences), False only if clearly a different individual.",
                },
                "visual_observations": {
                    "type": "object",
                    "properties": {
                        "observed_hairstyle": {"type": "string"},
                        "observed_wardrobe": {"type": "string"},
                        "observed_accessories": {"type": "string"},
                    },
                    "required": ["observed_hairstyle", "observed_wardrobe", "observed_accessories"],
                },
                "mismatches": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "category": {"type": "string"},
                            "visual_evidence": {"type": "string"},
                            "severity": {"type": "string", "enum": ["MINOR", "NOTICEABLE", "CRITICAL"]},
                        },
                        "required": ["category", "visual_evidence", "severity"],
                    },
                },
            },
            "required": ["same_person", "visual_observations", "mismatches"],
        }
```
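
To make the new contract concrete, the sketch below checks a sample Gate 2A response against the top-level `required` fields of the schema above, using only the standard library. The helper name `missing_required_fields` and the sample payload are illustrative assumptions, not part of `validation.py`.

```python
import json

# Top-level fields the updated Gate 2A schema marks as required.
REQUIRED_FIELDS = ("same_person", "visual_observations", "mismatches")


def missing_required_fields(raw_json: str) -> list[str]:
    """Return the required top-level fields absent from a gate response.

    Hypothetical helper for illustration; not part of validation.py.
    """
    result = json.loads(raw_json)
    return [name for name in REQUIRED_FIELDS if name not in result]


# Made-up payload in the shape the schema describes.
sample = json.dumps({
    "same_person": True,
    "visual_observations": {
        "observed_hairstyle": "short, dark",
        "observed_wardrobe": "grey jacket",
        "observed_accessories": "none",
    },
    "mismatches": [],
})

print(missing_required_fields(sample))                # []
print(missing_required_fields('{"mismatches": []}'))  # ['same_person', 'visual_observations']
```

A response in the pre-Phase-5 shape (no `same_person`) shows up here as a missing field, which is the case the scoring code has to tolerate.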

**In `pipeline/lib/validation.py`, modify `run_gate_2a_character()` prompt text (lines 771-793):**

Insert a `same_person` instruction as STEP 0, directly after the opening "Compare the TARGET KEYFRAME..." sentence and before STEP 1. Find the `parts.append(` call whose text begins with `f"Compare the TARGET KEYFRAME..."`:

Replace:

```python
        parts.append(
            f"Compare the TARGET KEYFRAME (below) against all references above.{comp_note}\n\n"
            f"STEP 1: Fill out visual_observations — describe EXACTLY what you see "
```

With:

```python
        parts.append(
            f"Compare the TARGET KEYFRAME (below) against all references above.{comp_note}\n\n"
            f"STEP 0: Answer same_person — is this the SAME PERSON as the casting references? "
            f"Yes if same person even with wardrobe/hair differences. No ONLY if clearly a different individual.\n\n"
            f"STEP 1: Fill out visual_observations — describe EXACTLY what you see "
```

**In `pipeline/lib/validation.py`, modify `_score_gate_result()` (lines 542-584) to handle the `same_person` signal:**

Add an identity override after the result JSON is parsed and before the scoring logic runs. In `_score_gate_result`, locate `result = json.loads(raw_json)` and insert the following immediately after the line `mismatches = result.get("mismatches", [])`:

```python
        # Identity override: if same_person is explicitly False and the model reported
        # no IDENTITY mismatch, inject a CRITICAL one (existing ones are left as-is)
        if result.get("same_person") is False:
            has_identity_mismatch = any(
                m.get("category", "").upper() == "IDENTITY" for m in mismatches
            )
            if not has_identity_mismatch:
                mismatches.append({
                    "category": "IDENTITY",
                    "visual_evidence": "same_person=False: different individual detected",
                    "severity": "CRITICAL",
                })
                result["mismatches"] = mismatches
```
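
The override can be exercised in isolation. The sketch below wraps the same logic in a hypothetical helper, `apply_identity_override`, which is an assumption for illustration only; in the spec the logic lives inline in `_score_gate_result()`.

```python
def apply_identity_override(result: dict) -> dict:
    """Inject a CRITICAL IDENTITY mismatch when same_person is False
    and no IDENTITY mismatch is already present.

    Hypothetical standalone version of the inline override; mutates and
    returns `result`.
    """
    mismatches = result.get("mismatches", [])
    if result.get("same_person") is False:
        has_identity = any(
            m.get("category", "").upper() == "IDENTITY" for m in mismatches
        )
        if not has_identity:
            mismatches.append({
                "category": "IDENTITY",
                "visual_evidence": "same_person=False: different individual detected",
                "severity": "CRITICAL",
            })
            result["mismatches"] = mismatches
    return result


# same_person=False with no identity mismatch: one is injected.
flagged = apply_identity_override({"same_person": False, "mismatches": []})
# same_person absent (older response shape): nothing changes.
legacy = apply_identity_override({"mismatches": []})
# Model already reported identity: no duplicate is added.
existing = apply_identity_override({
    "same_person": False,
    "mismatches": [
        {"category": "identity", "visual_evidence": "different face", "severity": "NOTICEABLE"},
    ],
})
print(len(flagged["mismatches"]), len(legacy["mismatches"]), len(existing["mismatches"]))  # 1 0 1
```

In the third case the existing IDENTITY mismatch keeps whatever severity the model assigned. The override as written does not escalate it to CRITICAL.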

### Scope boundary

- Do NOT modify Gate 2B, Gate 2A wide shot, or any other gate.
- Do NOT change the scoring thresholds (MINOR=1, NOTICEABLE=3, CRITICAL=5).
- Do NOT add same_person to video gates or Gate 3.
- The `same_person` field is listed FIRST in both the schema `properties` and the `required` array, with the intent that Flash answers it before the detailed analysis.

### Validation

```bash
cd /Users/joeturnerlin/Dropbox/CLAUDE_PROJECTS/recoil && \
python3 -c "import ast; ast.parse(open('pipeline/lib/validation.py').read())" && \
grep -q 'same_person' pipeline/lib/validation.py && \
grep -q 'STEP 0' pipeline/lib/validation.py && \
grep -q 'same_person.*False' pipeline/lib/validation.py && \
echo "Phase 5 OK"
```

---

## Phase 6: Feedback Report Tool

**Goal:** Create a post-series analysis script that reads ExecutionStore and healing JSONL logs, aggregates failures by gate, model, pipeline, episode, and healing strategy, and outputs a human-readable markdown report.

### What already exists (from prior phases)

- All prior phases are complete
- ExecutionStore stores per-shot JSON files at `projects/{project}/state/visual/shots/`
- HealerAgent writes JSONL logs to `engine-memory/healing/healing_log.jsonl`
- Shot records contain: status, gate_results, takes (with gate_verdict per take), model, pipeline, deferred, deferred_reason
- Takes contain: gate_verdict (passed, gate_name, reason), cost_usd, model, pipeline
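
For orientation, the sketch below builds one shot record and one healing log line from the fields listed above plus those the report script reads (`shot_id`, `episode_id`, `cost_incurred`, `retry_waste_cost`, `project`, `strategy`, `result`). All values, the status string, and the exact nesting are assumptions for illustration; the authoritative shapes are whatever ExecutionStore and HealerAgent actually write.

```python
import json

# Hypothetical shot record; every value here is made up.
shot = {
    "shot_id": "EP01_SH001",
    "episode_id": "EP01",
    "status": "keyframe_failed",  # assumed status string containing "failed"
    "model": "example-model",
    "pipeline": "example-pipeline",
    "deferred": False,
    "deferred_reason": "",
    "cost_incurred": 0.12,
    "retry_waste_cost": 0.04,
    "takes": [
        {
            "gate_verdict": {
                "passed": False,
                "gate_name": "gate_2a",
                "reason": "IDENTITY mismatch",
            },
            "cost_usd": 0.04,
            "model": "example-model",
            "pipeline": "example-pipeline",
        }
    ],
}

# Hypothetical healing log entry: one JSON object per line of the JSONL file.
healing_line = json.dumps(
    {"project": "tartarus", "strategy": "crop_to_closeup", "result": "passed"}
)

# Round-trip both the way a reader of the raw files would see them.
loaded_shot = json.loads(json.dumps(shot))
loaded_entry = json.loads(healing_line)
failed_takes = [t for t in loaded_shot["takes"] if not t["gate_verdict"]["passed"]]
print(len(failed_takes), loaded_entry["strategy"])  # 1 crop_to_closeup
```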

### Files to create

- `pipeline/tools/feedback_report.py` — Post-series feedback analysis CLI tool

### Exact implementation

**Create `pipeline/tools/feedback_report.py`:**

```python
#!/usr/bin/env python3
"""feedback_report.py — Post-series failure analysis.

Reads ExecutionStore shot data and healing JSONL logs for a project.
Aggregates failures by gate, model, pipeline, episode, and healing strategy.
Outputs a markdown report for human review.

Usage:
    python3 tools/feedback_report.py --project tartarus
    python3 tools/feedback_report.py --project tartarus --episode 1
    python3 tools/feedback_report.py --project tartarus --output report.md
"""

import argparse
import json
import sys
from collections import Counter, defaultdict
from pathlib import Path

# Resolve project roots
_TOOL_DIR = Path(__file__).parent
_PIPELINE_ROOT = _TOOL_DIR.parent
_RECOIL_ROOT = _PIPELINE_ROOT.parent
sys.path.insert(0, str(_RECOIL_ROOT))
sys.path.insert(0, str(_PIPELINE_ROOT))

from core.paths import PROJECTS_ROOT


def load_shots(project: str, episode: int | None = None) -> list[dict]:
    """Load all shot records from ExecutionStore JSON files."""
    shots_dir = PROJECTS_ROOT / project / "state" / "visual" / "shots"
    if not shots_dir.exists():
        print(f"No shots directory found: {shots_dir}", file=sys.stderr)
        return []

    shots = []
    for f in sorted(shots_dir.glob("*.json")):
        try:
            data = json.loads(f.read_text())
            if episode is not None:
                ep_id = data.get("episode_id", "")
                ep_num = int("".join(c for c in ep_id if c.isdigit()) or "0")
                if ep_num != episode:
                    continue
            shots.append(data)
        except (json.JSONDecodeError, ValueError):
            continue
    return shots


def load_healing_log(project: str) -> list[dict]:
    """Load healing JSONL log entries for a project."""
    log_path = _RECOIL_ROOT / "engine-memory" / "healing" / "healing_log.jsonl"
    if not log_path.exists():
        return []

    entries = []
    for line in log_path.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
            if entry.get("project") == project:
                entries.append(entry)
        except json.JSONDecodeError:
            continue
    return entries


def analyze(shots: list[dict], healing_entries: list[dict]) -> dict:
    """Analyze shots and healing data, return aggregated stats."""
    total = len(shots)
    if total == 0:
        return {"total": 0}

    # Status distribution
    status_counts = Counter(s.get("status", "unknown") for s in shots)

    # Failure analysis
    failed_shots = [s for s in shots if "failed" in s.get("status", "")]
    deferred_shots = [s for s in shots if s.get("deferred")]

    # Group failures by gate
    gate_failures = Counter()
    failure_reasons = defaultdict(list)
    for s in shots:
        for take in s.get("takes", []):
            gv = take.get("gate_verdict", {})
            if gv and not gv.get("passed", True):
                gate_name = gv.get("gate_name", "unknown")
                gate_failures[gate_name] += 1
                # Tolerate a null reason or a record without shot_id
                reason = (gv.get("reason") or "")[:100]
                failure_reasons[gate_name].append({
                    "shot_id": s.get("shot_id", "unknown"),
                    "reason": reason,
                    "model": take.get("model", ""),
                })

    # Group by model
    model_stats = defaultdict(lambda: {"total": 0, "failed": 0, "cost": 0.0})
    for s in shots:
        model = s.get("model", "unknown") or "unknown"
        model_stats[model]["total"] += 1
        if "failed" in s.get("status", ""):
            model_stats[model]["failed"] += 1
        model_stats[model]["cost"] += s.get("cost_incurred", 0)

    # Group by pipeline
    pipeline_stats = defaultdict(lambda: {"total": 0, "failed": 0})
    for s in shots:
        pipeline = s.get("pipeline", "unknown") or "unknown"
        pipeline_stats[pipeline]["total"] += 1
        if "failed" in s.get("status", ""):
            pipeline_stats[pipeline]["failed"] += 1

    # Healing strategy effectiveness
    strategy_stats = defaultdict(lambda: {"used": 0, "passed": 0, "failed_same": 0, "failed_different": 0})
    for entry in healing_entries:
        strat = entry.get("strategy", "unknown")
        result = entry.get("result", "")
        strategy_stats[strat]["used"] += 1
        if result == "passed":
            strategy_stats[strat]["passed"] += 1
        elif result == "failed_same":
            strategy_stats[strat]["failed_same"] += 1
        elif result in ("failed_different", "failed_worse"):
            strategy_stats[strat]["failed_different"] += 1

    # Cost analysis
    total_cost = sum(s.get("cost_incurred", 0) for s in shots)
    waste_cost = sum(s.get("retry_waste_cost", 0) for s in shots)

    # Per-episode breakdown (keyed by episode_id)
    episode_stats = defaultdict(lambda: {"total": 0, "failed": 0, "deferred": 0})
    for s in shots:
        ep = s.get("episode_id", "unknown")
        episode_stats[ep]["total"] += 1
        if "failed" in s.get("status", ""):
            episode_stats[ep]["failed"] += 1
        if s.get("deferred"):
            episode_stats[ep]["deferred"] += 1

    # CROP_TO_CLOSEUP creep tracking
    crop_count = sum(
        1 for entry in healing_entries
        if entry.get("strategy") == "crop_to_closeup"
    )
    crop_pct = (crop_count / total * 100) if total > 0 else 0

    return {
        "total": total,
        "status_counts": dict(status_counts),
        "failed_count": len(failed_shots),
        "deferred_count": len(deferred_shots),
        "gate_failures": dict(gate_failures),
        "failure_reasons": dict(failure_reasons),
        "model_stats": dict(model_stats),
        "pipeline_stats": dict(pipeline_stats),
        "strategy_stats": dict(strategy_stats),
        "total_cost": total_cost,
        "waste_cost": waste_cost,
        "episode_stats": dict(episode_stats),
        "crop_closeup_count": crop_count,
        "crop_closeup_pct": crop_pct,
    }


def format_report(project: str, stats: dict, episode: int | None = None) -> str:
    """Format analysis stats into a markdown report."""
    lines = []
    ep_label = f" Episode {episode}" if episode is not None else ""
    lines.append(f"# Feedback Report — {project}{ep_label}")
    lines.append("")

    if stats["total"] == 0:
        lines.append("No shots found.")
        return "\n".join(lines)

    # Summary
    lines.append("## Summary")
    lines.append(f"- **Total shots:** {stats['total']}")
    lines.append(f"- **Failed:** {stats['failed_count']} ({stats['failed_count']/stats['total']*100:.1f}%)")
    lines.append(f"- **Deferred:** {stats['deferred_count']}")
    lines.append(f"- **Total cost:** ${stats['total_cost']:.2f}")
    lines.append(f"- **Waste cost (retries):** ${stats['waste_cost']:.2f}")
    lines.append("")

    # Status distribution
    lines.append("## Status Distribution")
    for status, count in sorted(stats["status_counts"].items(), key=lambda x: -x[1]):
        lines.append(f"- {status}: {count}")
    lines.append("")

    # Gate failures
    if stats["gate_failures"]:
        lines.append("## Gate Failures")
        for gate, count in sorted(stats["gate_failures"].items(), key=lambda x: -x[1]):
            lines.append(f"### {gate} — {count} failures")
            reasons = stats["failure_reasons"].get(gate, [])
            # Top 5 reasons
            reason_counts = Counter(r["reason"] for r in reasons)
            for reason, rc in reason_counts.most_common(5):
                lines.append(f"- ({rc}x) {reason}")
            lines.append("")

    # Model performance
    if stats["model_stats"]:
        lines.append("## Model Performance")
        lines.append("| Model | Shots | Failed | Fail % | Cost |")
        lines.append("|-------|-------|--------|--------|------|")
        for model, ms in sorted(stats["model_stats"].items()):
            fail_pct = ms["failed"] / ms["total"] * 100 if ms["total"] > 0 else 0
            lines.append(f"| {model} | {ms['total']} | {ms['failed']} | {fail_pct:.1f}% | ${ms['cost']:.2f} |")
        lines.append("")

    # Pipeline performance
    if stats["pipeline_stats"]:
        lines.append("## Pipeline Performance")
        for pipeline, ps in sorted(stats["pipeline_stats"].items()):
            fail_pct = ps["failed"] / ps["total"] * 100 if ps["total"] > 0 else 0
            lines.append(f"- **{pipeline}:** {ps['total']} shots, {ps['failed']} failed ({fail_pct:.1f}%)")
        lines.append("")

    # Healing effectiveness
    if stats["strategy_stats"]:
        lines.append("## Healing Strategy Effectiveness")
        lines.append("| Strategy | Used | Passed | Same Fail | Different Fail | Success % |")
        lines.append("|----------|------|--------|-----------|----------------|-----------|")
        for strat, ss in sorted(stats["strategy_stats"].items()):
            success_pct = ss["passed"] / ss["used"] * 100 if ss["used"] > 0 else 0
            lines.append(
                f"| {strat} | {ss['used']} | {ss['passed']} | "
                f"{ss['failed_same']} | {ss['failed_different']} | {success_pct:.0f}% |"
            )
        lines.append("")

    # CROP_TO_CLOSEUP creep
    if stats["crop_closeup_count"] > 0:
        lines.append("## CROP_TO_CLOSEUP Creep")
        warning = " **⚠ ABOVE 5% THRESHOLD — upstream problem**" if stats["crop_closeup_pct"] > 5 else ""
        lines.append(f"- Crop count: {stats['crop_closeup_count']} ({stats['crop_closeup_pct']:.1f}%){warning}")
        lines.append("")

    # Episode breakdown
    if stats["episode_stats"]:
        lines.append("## Episode Breakdown")
        lines.append("| Episode | Shots | Failed | Deferred |")
        lines.append("|---------|-------|--------|----------|")
        for ep, es in sorted(stats["episode_stats"].items()):
            lines.append(f"| {ep} | {es['total']} | {es['failed']} | {es['deferred']} |")
        lines.append("")

    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(description="Post-series feedback analysis")
    parser.add_argument("--project", required=True, help="Project name (e.g. tartarus)")
    parser.add_argument("--episode", type=int, default=None, help="Filter to specific episode number")
    parser.add_argument("--output", type=str, default=None, help="Output file path (default: stdout)")
    parser.add_argument("--json", action="store_true", help="Output raw JSON instead of markdown")
    args = parser.parse_args()

    shots = load_shots(args.project, args.episode)
    healing = load_healing_log(args.project)
    stats = analyze(shots, healing)

    if args.json:
        output = json.dumps(stats, indent=2)
    else:
        output = format_report(args.project, stats, args.episode)

    if args.output:
        # The report contains non-ASCII characters, so write UTF-8 explicitly
        Path(args.output).write_text(output, encoding="utf-8")
        print(f"Report written to {args.output}")
    else:
        print(output)


if __name__ == "__main__":
    main()
```
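
One behavior of `load_shots()` worth knowing is how `--episode` is matched: every digit in `episode_id` is concatenated and parsed as a single integer. The sketch below isolates that expression in a hypothetical helper, `episode_number`; the sample IDs are assumptions, since the spec does not pin down the `episode_id` format.

```python
def episode_number(ep_id: str) -> int:
    """Concatenate all digits in ep_id and parse them, defaulting to 0."""
    return int("".join(c for c in ep_id if c.isdigit()) or "0")


print(episode_number("EP01"))     # 1
print(episode_number(""))         # 0
print(episode_number("S2_EP03"))  # 203, not 3
```

If episode IDs ever carry a season prefix or other digits, the filter would need a stricter parse. A record with no `episode_id` maps to 0, so `--episode 0` would select those records.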

### Scope boundary

- Do NOT modify any existing files in this phase.
- Do NOT add automated prompt/ref adjustment — this is a read-only analysis tool.
- Do NOT connect this to any API endpoint — it's a CLI tool only.
- Do NOT import from the healing module directly — read JSONL files as raw JSON.
- Do NOT add statistical confidence metrics — at 500 shots, the data is directional, not statistically significant.

### Validation

```bash
cd /Users/joeturnerlin/Dropbox/CLAUDE_PROJECTS/recoil/pipeline && \
python3 -c "import ast; ast.parse(open('tools/feedback_report.py').read())" && \
grep -q 'def load_shots' tools/feedback_report.py && \
grep -q 'def load_healing_log' tools/feedback_report.py && \
grep -q 'def analyze' tools/feedback_report.py && \
grep -q 'def format_report' tools/feedback_report.py && \
grep -q 'crop_closeup_pct' tools/feedback_report.py && \
grep -q 'deferred_count' tools/feedback_report.py && \
python3 tools/feedback_report.py --help && \
echo "Phase 6 OK"
```
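
The checks above confirm structure only. A behavioral check of the creep calculation could look like the following sketch, which re-implements the relevant lines of `analyze()` in a hypothetical `crop_creep` helper rather than importing the tool (the import depends on `core.paths`). The 5% cutoff mirrors the literal used in `format_report()`, and the `reroll` strategy name is made up.

```python
def crop_creep(total_shots: int, healing_entries: list[dict]) -> tuple[int, float, bool]:
    """Return (crop count, percent of shots, above-threshold flag)."""
    crop_count = sum(
        1 for entry in healing_entries
        if entry.get("strategy") == "crop_to_closeup"
    )
    pct = (crop_count / total_shots * 100) if total_shots > 0 else 0.0
    return crop_count, pct, pct > 5


entries = [
    {"strategy": "crop_to_closeup", "result": "passed"},
    {"strategy": "reroll", "result": "failed_same"},  # hypothetical strategy name
    {"strategy": "crop_to_closeup", "result": "passed"},
]
print(crop_creep(8, entries))  # (2, 25.0, True)
print(crop_creep(0, entries))  # (2, 0.0, False)
```

The count is per healing log entry, not per distinct shot, so a shot cropped twice contributes twice to the percentage.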
