# Phase C Taxonomy Audit

Generated 2026-05-01 during Phase 1 of engine-fix-phase-c-build.
Authoritative input for Phases 2-12.

This audit document is **self-contained**: a reader who has not read the
parent `BUILD_SPEC_ENGINE_FIX_PHASE_C.md` can use this doc to understand
every Phase C decision. All file:line citations were re-verified by direct
file inspection during Phase 1 (2026-05-01) — line drift from the original
spec is documented inline in Section 4.

The output of Phase C is a unified failure-mode taxonomy, a canonical cost
helper pair, and a typed `payload.hints` Pydantic surface. Phase 1 produces
**zero production code edits** — it produces only this audit.

---

## Section 0: Phase C high-level rationale

Phase C addresses the largest single architectural-debt cluster identified
by the 2026-04-30 architectural audit (DEBT-1: failure-mode/retry-classifier
fragmentation), folded together with two adjacent debts that share the same
gravitational center of "error handling + provenance":

| Debt id | Headline | Phase C resolution |
|---------|---------------------------------------------|--------------------|
| DEBT-1  | Two parallel enums + 3 classifiers + 5 transient lists | Single canonical enum (`FailureMode`), single classifier (`classify_failure`), single pattern set (`TRANSIENT_PATTERN_STRINGS` + `TRANSIENT_HTTP_CODES`) |
| DEBT-2  | 4-shape `compute_cost` + 27 cost-zero-fallback sites | One dispatcher (`pipeline/core/cost.py::compute_cost`) + typed reader (`read_cost_from_result`) |
| MF-10 / T2.21 | Untyped `payload.hints` dict escape-hatch | Pydantic models (`PayloadHints` base + 4 provider variants) |

The three workstreams are deliberately bundled into one CP because splitting
them risks a half-collapsed taxonomy where `FailureCategory` is gone but the
cost helper still silently zeroes missing values, and the operator still
cannot answer "where did this number come from."

The **single most important behavioural change** in Phase C is that the
canonical classifier MUST escalate on unknown error shapes (Tenet 6 — Errors
Must Be Visible). Section 6 below codifies the escalation contract.

---

## Section 1: Enum inventory

### 1.1 FailureMode (canonical-elect)

**File:** `recoil/core/critic.py:51`
**Verified by:** direct read of `core/critic.py` lines 51-86 on 2026-05-01.

**Members (23):**

| # | Name | String value | Origin |
|---|------|--------------|--------|
| 1 | `NONE` | `"none"` | core.critic.py:62 |
| 2 | `ANATOMY_FACE_MERGE` | `"anatomy_face_merge"` | core.critic.py:63 |
| 3 | `ANATOMY_LIMB_MISCOUNT` | `"anatomy_limb_miscount"` | core.critic.py:64 |
| 4 | `IDENTITY_DRIFT` | `"identity_drift"` | core.critic.py:65 |
| 5 | `BACKGROUND_CONTAMINATION` | `"background_contamination"` | core.critic.py:66 |
| 6 | `WARDROBE_MISMATCH` | `"wardrobe_mismatch"` | core.critic.py:67 |
| 7 | `LIGHTING_MISMATCH` | `"lighting_mismatch"` | core.critic.py:68 |
| 8 | `GRID_INFLUENCE` | `"grid_influence"` | core.critic.py:69 |
| 9 | `SAFETY_SOFTENED` | `"safety_softened"` | core.critic.py:70 |
| 10 | `UNKNOWN` | `"unknown"` | core.critic.py:71 |
| 11 | `MOTION_FAILURE` | `"motion_failure"` | core.critic.py:72 |
| 12 | `END_FRAME_DRIFT` | `"end_frame_drift"` | core.critic.py:73 |
| 13 | `CONTENT_FILTER_HARD_BLOCK` | `"content_filter_hard_block"` | core.critic.py:74 |
| 14 | `REF_BLEED` | `"ref_bleed"` | core.critic.py:75 |
| 15 | `AUDIO_SYNC_DRIFT` | `"audio_sync_drift"` | core.critic.py:76 |
| 16 | `COVERAGE_GEOMETRY_BROKEN` | `"coverage_geometry_broken"` | core.critic.py:77 |
| 17 | `COMPOSITION_WRONG` | `"composition_wrong"` | core.critic.py:80 |
| 18 | `STYLE_DRIFT` | `"style_drift"` | core.critic.py:81 |
| 19 | `CUTS_TOO_SOFT` | `"cuts_too_soft"` | core.critic.py:82 |
| 20 | `PROMPT_DURATION_MISMATCH` | `"prompt_duration_mismatch"` | core.critic.py:83 |
| 21 | `COST_OVERRUN` | `"cost_overrun"` | core.critic.py:84 |
| 22 | `TRANSIENT` | `"transient"` | core.critic.py:85 |
| 23 | `GATE_MECHANICAL` | `"gate_mechanical"` | core.critic.py:86 |

(The table lists 23 members, whereas the spec text quoted the enum as
"22 members", treating one of the early values as redundant. A direct count
of distinct members in the source file gives 23, including `NONE`. The
decision below is unaffected; both `NONE` and `UNKNOWN` are
non-failure-classification sentinels.)

**Persisted in:**
- Critics' Dimension records (`data-contracts.md §1c —
  sidecar.provenance.gate_results.dimensions[*].failure_mode`).
- `ops.log.jsonl` and `review_queue.jsonl` (string values are stable
  per docstring at `core/critic.py:58-59`).

**Used by (consumer files):**
- `recoil/pipeline/orchestrator/strategy_registry.py` (Tier 0/1 classifier)
- `recoil/pipeline/lib/critics/*` (5 critic files — Dimension.failure_mode)
- `recoil/pipeline/lib/run_shot.py` (`_extract_failure_mode`)
- `recoil/pipeline/lib/coverage_context.py`
- `recoil/workspace/mcp_server.py`

**Decision: PROMOTE to canonical.** All consumers route through this enum.
Phase 2 creates `recoil/pipeline/core/failure_mode.py` which **re-exports**
this `FailureMode` from `core.critic`. The enum body stays at its existing
home (per CP-9 lock + critic-callers byte-stability) — Phase C's
`failure_mode.py` is a re-export node, not a relocation.

### 1.2 FailureCategory (to-be-derived view)

**File:** `recoil/pipeline/orchestrator/production_types.py:27`
**Verified by:** direct read of `production_types.py` lines 27-37 on
2026-05-01.

**Members (9):**

| # | Name | String value | Notes |
|---|------|--------------|-------|
| 1 | `TRANSIENT` | `"transient"` | 429, 500, 503, timeout — auto-retry with backoff |
| 2 | `GATE_MECHANICAL` | `"gate_mechanical"` | Gate 1 fail — retry with different seed |
| 3 | `GATE_IDENTITY` | `"gate_identity"` | Gate 2A fail — retry with stronger refs |
| 4 | `GATE_WARDROBE` | `"gate_wardrobe"` | Gate 2A wardrobe fail — check phase, swap ref |
| 5 | `GATE_VIDEO_DRIFT` | `"gate_video_drift"` | Gate 3 — flag for review (not auto-reject) |
| 6 | `CONTENT_FILTER` | `"content_filter"` | Model refused — needs prompt rewrite |
| 7 | `PROMPT_DURATION_MISMATCH` | `"prompt_duration_mismatch"` | fal.ai schema validation — retry with corrected duration |
| 8 | `PERMANENT` | `"permanent"` | Exhausted retries or unfixable |
| 9 | `BUDGET` | `"budget"` | Budget exceeded — pause batch |

**Used by:**
- `recoil/pipeline/orchestrator/production_loop.py` (the
  `_classify_pass_error` method at line 811 returns `FailureCategory`)
- `recoil/pipeline/orchestrator/retry_dispatcher.py` (the
  `classify_failure` function at line 42 returns `FailureCategory`)
- `recoil/pipeline/orchestrator/production_types.py` itself
  (`DEFAULT_RETRY_POLICIES` keyed by `FailureCategory` at lines 58-68)

**Decision: KEEP as derived view.** Compute from `FailureMode` via a
canonical coarsening function `failure_category_for(mode: FailureMode)
-> FailureCategory`. Preserve enum shape and string values for
retry-policy lookup compatibility. Phase 4 wires the coarsening function;
`DEFAULT_RETRY_POLICIES` is rekeyed but its keys remain the same nine
`FailureCategory` values.

### 1.3 Mapping table (FailureMode → FailureCategory)

This is the **deliberate coarsening contract** for `failure_category_for()`.
Every `FailureMode` value (except `NONE` and `UNKNOWN`) must map to exactly
one `FailureCategory` value. Adding a new `FailureMode` without updating
this map raises `ValueError` at first call (mapping is exhaustive).

| FailureMode | FailureCategory | Rationale |
|---|---|---|
| `TRANSIENT` | `TRANSIENT` | Direct mapping. |
| `CONTENT_FILTER_HARD_BLOCK` | `CONTENT_FILTER` | Already content-policy domain. |
| `SAFETY_SOFTENED` | `CONTENT_FILTER` | Soft-failure variant of same domain. |
| `GATE_MECHANICAL` | `GATE_MECHANICAL` | Direct mapping. |
| `ANATOMY_FACE_MERGE` | `GATE_MECHANICAL` | Mechanical gate failure (face merge artifact). |
| `ANATOMY_LIMB_MISCOUNT` | `GATE_MECHANICAL` | Mechanical gate failure (limb count). |
| `IDENTITY_DRIFT` | `GATE_IDENTITY` | Direct mapping. |
| `REF_BLEED` | `GATE_IDENTITY` | Reference cross-contamination = identity domain. |
| `WARDROBE_MISMATCH` | `GATE_WARDROBE` | Direct mapping. |
| `MOTION_FAILURE` | `GATE_VIDEO_DRIFT` | Video-only failure mode. |
| `END_FRAME_DRIFT` | `GATE_VIDEO_DRIFT` | Video-only failure mode. |
| `CUTS_TOO_SOFT` | `GATE_VIDEO_DRIFT` | Video coverage failure. |
| `AUDIO_SYNC_DRIFT` | `GATE_VIDEO_DRIFT` | Audio-video sync = video-domain drift. |
| `PROMPT_DURATION_MISMATCH` | `PROMPT_DURATION_MISMATCH` | Direct mapping. |
| `COST_OVERRUN` | `BUDGET` | Direct mapping. |
| `BACKGROUND_CONTAMINATION` | `PERMANENT` | No retry-strategy fix. |
| `COMPOSITION_WRONG` | `PERMANENT` | No mechanical retry path. |
| `STYLE_DRIFT` | `PERMANENT` | Style-level drift requires re-prompt, not retry. |
| `LIGHTING_MISMATCH` | `PERMANENT` | No mechanical retry. |
| `GRID_INFLUENCE` | `PERMANENT` | No mechanical retry. |
| `COVERAGE_GEOMETRY_BROKEN` | `PERMANENT` | Plan-pass-level fix only. |
| `UNKNOWN` | (escalates — does NOT map to any FailureCategory) | Tenet 6 escalation. See §6. |
| `NONE` | (caller-error to coarsen NONE; raises `ValueError`) | NONE means "no failure"; caller should not call coarsening on it. |

**Validation note:** The mapping above covers 21 of 23 `FailureMode`
members; `UNKNOWN` and `NONE` are the two non-mappable sentinels. Phase 2's
`failure_category_for()` test verifies exhaustiveness via:

```python
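import pytest

# FailureMode, FailureCategory, failure_category_for and
# UnknownFailureEscalation are imported from the Phase C modules (omitted).

@pytest.mark.parametrize("mode", list(FailureMode))
def test_failure_category_for_is_total(mode):
    if mode in (FailureMode.NONE, FailureMode.UNKNOWN):
        # Sentinels never coarsen: NONE raises ValueError, UNKNOWN escalates.
        with pytest.raises((ValueError, UnknownFailureEscalation)):
            failure_category_for(mode)
    else:
        # Every other member must map to a FailureCategory.
        cat = failure_category_for(mode)
        assert isinstance(cat, FailureCategory)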
```
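
A minimal sketch of a coarsening function that would satisfy the test above. The two enums here are small illustrative subsets of the real ones, the `_COARSENING` dict name is hypothetical, and `UnknownFailureEscalation` is assumed to be a plain `Exception` subclass; only the mapping rows and the sentinel behaviour come from the table in this section.

```python
from enum import Enum


class FailureMode(Enum):
    # Illustrative subset of the 23 real members.
    NONE = "none"
    UNKNOWN = "unknown"
    TRANSIENT = "transient"
    REF_BLEED = "ref_bleed"
    COST_OVERRUN = "cost_overrun"
    STYLE_DRIFT = "style_drift"


class FailureCategory(Enum):
    # Illustrative subset of the 9 real members.
    TRANSIENT = "transient"
    GATE_IDENTITY = "gate_identity"
    PERMANENT = "permanent"
    BUDGET = "budget"


class UnknownFailureEscalation(Exception):
    """Raised instead of coarsening UNKNOWN (Tenet 6)."""


# Hypothetical name; rows copied from the mapping table.
_COARSENING = {
    FailureMode.TRANSIENT: FailureCategory.TRANSIENT,
    FailureMode.REF_BLEED: FailureCategory.GATE_IDENTITY,
    FailureMode.COST_OVERRUN: FailureCategory.BUDGET,
    FailureMode.STYLE_DRIFT: FailureCategory.PERMANENT,
}


def failure_category_for(mode: FailureMode) -> FailureCategory:
    if mode is FailureMode.NONE:
        raise ValueError("NONE means 'no failure'; there is nothing to coarsen")
    if mode is FailureMode.UNKNOWN:
        raise UnknownFailureEscalation("unknown failure shape; escalate")
    try:
        return _COARSENING[mode]
    except KeyError:
        # A member added to FailureMode without a mapping row lands here.
        raise ValueError(f"no FailureCategory mapping for {mode!r}") from None
```

Keeping the map as data rather than an `if` chain means the parametrized test catches a missing row as soon as a new member is added to the enum.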

---

## Section 2: Classifier inventory

Five classifier functions exist in production today. Phase C consolidates
to **one canonical entry point** while preserving function-level wrappers
at the existing call boundaries (so `production_loop.py` stays
byte-untouched per Section 7 hard sequencing rule #4).

### 2.1 Function-level inventory

| # | Function | File:line (verified 2026-05-01) | Signature | Returns | Disposition in Phase C |
|---|----------|--------------------------------|-----------|---------|------------------------|
| 1 | `_extract_failure_mode` | `recoil/pipeline/lib/run_shot.py:99` | `(step_result) -> FailureMode` | `FailureMode` | Phase 4: thin wrapper around canonical `classify_failure()`. |
| 2 | `detect_failure_mode` | `recoil/pipeline/orchestrator/strategy_registry.py:1284` | `(pass_result, coverage_pass) -> tuple[FailureMode, float]` | `(FailureMode, confidence)` | Phase 4: thin wrapper around canonical `classify_failure()`. This is the production_loop's call boundary — keeping it here preserves byte-stability of `production_loop.py`. |
| 3 | `classify_failure` (legacy retry-dispatcher version) | `recoil/pipeline/orchestrator/retry_dispatcher.py:42` | `(step_result, shot_data=None) -> FailureCategory` | `FailureCategory` | Phase 3: rewritten as a thin wrapper around canonical `classify_failure()` + `failure_category_for()`. Returns `FailureCategory` for backward compatibility — same call sites continue to use it. |
| 4 | `from_score_card` | `recoil/pipeline/orchestrator/strategy_registry.py:1552` | `(score_card: dict) -> tuple[FailureMode, float]` | `(FailureMode, confidence)` | CP-9 substrate; not on production live path. Preserved as substrate only — Phase 4 leaves untouched. |
| 5 | `_classify_pass_error` | `recoil/pipeline/orchestrator/production_loop.py:811` | `(self, error_text: str) -> FailureCategory` | `FailureCategory` | **OUT-OF-SCOPE for Phase C** — production_loop is byte-untouched per CP-9 lock. Phase E or later. Documented in `POST_PHASE_C_HANDOFF.md`. |

### 2.2 Canonical entry point (introduced by Phase 2)

```python
# recoil/pipeline/core/failure_mode.py — NEW (Phase 2)

def classify_failure(
    *,
    error_text: Optional[str] = None,
    gate_verdict: Optional[GateVerdict] = None,
    http_status: Optional[int] = None,
    escalate_unknown: bool = True,
) -> tuple[FailureMode, float]:
    """Single canonical classifier. Tenet 6 escalation by default."""
    ...
```

**Stability of signature:** keyword-only, all params optional, returns
`(mode, confidence in [0.0, 1.0])`. Phase 12 hard-gates on this signature.
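
As a rough illustration of the keyword-only call shape and the escalate-by-default behaviour, the sketch below classifies on `error_text` alone. The enum subset, the abbreviated pattern tuple, the confidence values, and the return value when `escalate_unknown=False` are all assumptions made for the example; `gate_verdict` and `http_status` are left out.

```python
from enum import Enum
from typing import Optional


class FailureMode(Enum):
    # Illustrative subset; the real enum lives in core/critic.py.
    TRANSIENT = "transient"
    UNKNOWN = "unknown"


class UnknownFailureEscalation(Exception):
    """Raised when no rule recognises the error shape."""


# Abbreviated stand-in for the canonical transient fragments.
TRANSIENT_PATTERN_STRINGS = ("429", "rate limit", "timeout")


def classify_failure(
    *,
    error_text: Optional[str] = None,
    escalate_unknown: bool = True,
) -> tuple[FailureMode, float]:
    """Return (mode, confidence); escalate on unrecognised input by default."""
    text = (error_text or "").lower()
    if any(fragment in text for fragment in TRANSIENT_PATTERN_STRINGS):
        return FailureMode.TRANSIENT, 0.8  # placeholder confidence
    if escalate_unknown:
        raise UnknownFailureEscalation(f"unrecognised failure: {error_text!r}")
    # Assumed opt-out behaviour: report UNKNOWN with zero confidence.
    return FailureMode.UNKNOWN, 0.0
```

Because every parameter is keyword-only, a call such as `classify_failure("timeout")` fails with `TypeError`, which keeps positional misuse from silently binding to the wrong argument.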

### 2.3 Behaviour preservation across the 5 functions

The 5 functions agree on **most** common error texts but disagree on
edge cases. The `phase_c_classifier_fixtures.json` file captured by Phase 0
becomes the source of truth: Phase 2 ships a test that runs every
fixture through the canonical `classify_failure()` and verifies that the
`(mode, confidence)` tuple matches the captured value. The exception is the
`expect_unknown: True` cases, which must raise `UnknownFailureEscalation`.

Disagreement risk surface:

- `_TRANSIENT_PATTERNS` (retry_dispatcher) vs inline `detect_failure_mode`
  patterns (strategy_registry:1303): see Section 3.
- `from_score_card` is the only score-card-driven classifier; canonical
  classifier accepts a `gate_verdict` argument that subsumes its inputs.
- `_classify_pass_error` re-imports `_TRANSIENT_PATTERNS` (production_loop
  line 818), so its behaviour is co-bound with `retry_dispatcher`. As long
  as Phase 3 preserves `_TRANSIENT_PATTERNS` semantics (UNION of all 5
  lists), `_classify_pass_error` continues to behave identically without
  being touched.

---

## Section 3: Transient-pattern union

Five disagreeing transient-pattern lists exist in production today. The
canonical `pipeline/core/failure_mode.py` ships **two surfaces** built
from their UNION:

1. `TRANSIENT_PATTERN_STRINGS: tuple[str, ...]` — substring matches
   against error text.
2. `TRANSIENT_HTTP_CODES: frozenset[int]` — explicit HTTP code set.

### 3.1 Source list inventory (verified 2026-05-01)

| # | Source | File:line (verified) | Form | Patterns / codes | Lacks (vs union) |
|---|--------|---------------------|------|------------------|------------------|
| 1 | `_TRANSIENT_PATTERNS` | `recoil/pipeline/orchestrator/retry_dispatcher.py:29` | tuple of strings | `("429", "rate limit", "503", "502", "500", "timeout", "connection", "ECONNRESET")` | `"504"` |
| 2 | `production_loop._classify_pass_error` body | `recoil/pipeline/orchestrator/production_loop.py:818-826` | re-imports list 1 | (same as list 1) | (same as list 1) — co-bound by re-import |
| 3 | Inline `detect_failure_mode` patterns | `recoil/pipeline/orchestrator/strategy_registry.py:1303` | tuple of strings | `("timeout", "rate limit", "502", "503", "504", "connection")` | `"500"`, `"429"`, `"ECONNRESET"` |
| 4 | `RETRYABLE_HTTP` (elevenlabs) | `recoil/execution/providers/elevenlabs.py:38` | frozenset of HTTP codes | `{500, 501, 502, 503, 504}` | Nothing from the HTTP-code union; 429 is deliberately excluded and treated as a fail-fast `RateLimitError` |
| 5 | `RETRYABLE_HTTP` (sync_so) | `recoil/execution/providers/sync_so.py:35` | frozenset of HTTP codes | `{500, 501, 502, 503, 504}` | identical to elevenlabs |

### 3.2 Canonical UNION

**`TRANSIENT_PATTERN_STRINGS` (string fragments, substring-matched against
lowercased error text):**

| String fragment | Sourced from list(s) |
|-----------------|----------------------|
| `"429"` | 1, 2 |
| `"rate limit"` | 1, 2, 3 |
| `"500"` | 1, 2 |
| `"502"` | 1, 2, 3 |
| `"503"` | 1, 2, 3 |
| `"504"` | 3 |
| `"timeout"` | 1, 2, 3 |
| `"connection"` | 1, 2, 3 |
| `"ECONNRESET"` | 1, 2 |

Total: **9 distinct string fragments** in canonical UNION.

**`TRANSIENT_HTTP_CODES` (HTTP status codes, set-membership checked against
provider response status):**

| HTTP code | Sourced from list(s) |
|-----------|----------------------|
| `500` | 4, 5 |
| `501` | 4, 5 |
| `502` | 4, 5 |
| `503` | 4, 5 |
| `504` | 4, 5 |

Total: **5 HTTP codes** in canonical UNION (`{500, 501, 502, 503, 504}`).

### 3.3 Deliberate omissions

**None.** No pattern from any of the 5 source lists is excluded from the
canonical UNION. This is deliberate — the audit goal is behaviour
preservation; collapsing 5 lists with disagreeing membership into 1
must not lose any pattern that any list had. Phase 12 verifies this via
behaviour-preservation fixtures (Phase 0's
`phase_c_classifier_fixtures.json`).

### 3.4 The 429 disposition (special note from spec line 138)

**`429` is the rare cross-surface case.** A string match (`"429"` in error
text) considers it transient (lists 1 + 2 — retry_dispatcher + co-bound
production_loop). But the HTTP-code path in elevenlabs/sync_so
(lists 4 + 5) treats 429 as **fail-fast** `RateLimitError` (excluded from
`RETRYABLE_HTTP`).

**Canonical resolution: preserve the discrepancy.**

- `429` is in `TRANSIENT_PATTERN_STRINGS` (because retry_dispatcher used it
  that way and Phase C must preserve behaviour).
- `429` is **NOT** in `TRANSIENT_HTTP_CODES`.
- Provider adapters (elevenlabs, sync_so) that classify by HTTP code keep
  their existing fail-fast on 429.

The two surfaces are explicitly different, documented inline in
`failure_mode.py`'s module docstring, and preserved by Phase 6 + 7 (which
do NOT migrate elevenlabs/sync_so to the string-pattern path — they keep
HTTP-code classification).
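
The two surfaces and their deliberate disagreement on 429 can be sketched as follows. The constant contents are taken from Section 3.2; the helper names `is_transient_text` and `is_transient_status` are hypothetical, and lowercasing the fragments is an assumption about the implementation (without it, an uppercase fragment such as `"ECONNRESET"` could never match lowercased error text).

```python
TRANSIENT_PATTERN_STRINGS = (
    "429", "rate limit", "500", "502", "503", "504",
    "timeout", "connection", "ECONNRESET",
)
TRANSIENT_HTTP_CODES = frozenset({500, 501, 502, 503, 504})


def is_transient_text(error_text: str) -> bool:
    """String surface: substring match against lowercased error text."""
    text = error_text.lower()
    # Fragments are lowercased too, so "ECONNRESET" still matches.
    return any(fragment.lower() in text for fragment in TRANSIENT_PATTERN_STRINGS)


def is_transient_status(http_status: int) -> bool:
    """HTTP-code surface: plain set membership, 429 excluded on purpose."""
    return http_status in TRANSIENT_HTTP_CODES


# The same 429 is transient on one surface and fail-fast on the other.
text_says_transient = is_transient_text("HTTP 429: rate limit exceeded")
code_says_transient = is_transient_status(429)
```

Keeping the two checks as separate functions makes the discrepancy visible at the call site instead of hiding it behind a single merged predicate.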

### 3.5 Other pattern sets (in retry_dispatcher.py — for full inventory)

For completeness, `retry_dispatcher.py` also contains five NON-transient
pattern sets used by `classify_failure()`. These are NOT part of the
transient UNION but are documented here for taxonomy completeness:

| Constant | File:line | Patterns | Routes to |
|----------|-----------|----------|-----------|
| `_CONTENT_FILTER_PATTERNS` | `retry_dispatcher.py:30` | `("content filter", "safety", "blocked", "policy", "refused")` | `FailureCategory.CONTENT_FILTER` |
| `_IDENTITY_PATTERNS` | `retry_dispatcher.py:31` | `("identity", "drift", "face", "different person", "wrong character")` | `FailureCategory.GATE_IDENTITY` |
| `_WARDROBE_PATTERNS` | `retry_dispatcher.py:32` | `("wardrobe", "costume", "clothing", "outfit", "phase")` | `FailureCategory.GATE_WARDROBE` |
| `_MECHANICAL_PATTERNS` | `retry_dispatcher.py:33` | `("artifact", "finger", "limb", "hand", "anatomy", "merge", "distort")` | `FailureCategory.GATE_MECHANICAL` |
| `_SCHEMA_PATTERNS` | `retry_dispatcher.py:34-39` | `("422", "input should be", "unprocessable", "validation error")` | `FailureCategory.PROMPT_DURATION_MISMATCH` |

`detect_failure_mode` in strategy_registry.py:1299 also contains an inline
content-filter pattern set: `("content policy", "safety", "nsfw",
"moderation", "rejected")`. This UNION-merges into a canonical
`CONTENT_FILTER_PATTERNS` constant in Phase 2's `failure_mode.py`:

| Canonical `CONTENT_FILTER_PATTERNS` (UNION) |
|---------------------------------------------|
| `"content filter"` |
| `"content policy"` |
| `"safety"` |
| `"blocked"` |
| `"policy"` |
| `"refused"` |
| `"nsfw"` |
| `"moderation"` |
| `"rejected"` |

Total: 9 distinct fragments.
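
One way to build such a UNION without losing or duplicating fragments is an order-preserving merge. The helper name `union_preserving_order` and the two local variable names are hypothetical; the fragment values are the ones quoted above.

```python
def union_preserving_order(*pattern_sets):
    """Merge pattern tuples, dropping duplicates but keeping first-seen order."""
    merged = dict.fromkeys(
        fragment for patterns in pattern_sets for fragment in patterns
    )
    return tuple(merged)


retry_dispatcher_patterns = (
    "content filter", "safety", "blocked", "policy", "refused",
)
strategy_registry_patterns = (
    "content policy", "safety", "nsfw", "moderation", "rejected",
)

CONTENT_FILTER_PATTERNS = union_preserving_order(
    retry_dispatcher_patterns, strategy_registry_patterns
)
```

The resulting order differs from the table above, which does not matter for any-substring matching; what matters is that `"safety"` appears once and no source fragment is dropped.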

`detect_failure_mode` also contains an inline budget pattern list at
`strategy_registry.py:1307`: `("budget", "insufficient", "balance", "402")`.
This list UNION-merges with retry_dispatcher's single `"budget"` check into
a canonical `BUDGET_PATTERNS`:

| Canonical `BUDGET_PATTERNS` (UNION) |
|-------------------------------------|
| `"budget"` |
| `"insufficient"` |
| `"balance"` |
| `"402"` |

Total: 4 distinct fragments. (retry_dispatcher's `classify_failure` checks
only `"budget"` substring at line 59 — UNION captures all four.)

---

## Section 4: Cost-zero-fallback site disposition

This section is the **definitive list** of all 27 production sites scoped
for Phase 9 migration. Each site is marked **hard migration** (must surface
missing cost as `CostMissingError`) or **sanctioned-fallback** (tolerates
missing with WARNING log). One additional production site
(`production_loop.py:83`) is **explicitly deferred** under the byte-untouched
lock. Test sites are NOT in scope.

### 4.1 Re-grep verification (2026-05-01)

A fresh grep of the codebase on 2026-05-01 (Phase 1 verification) found
the line numbers below. **Drift from spec is documented inline** —
particularly the workspace/server.py and pass_store.py sites, which
shifted because of Phase B's workspace extraction.

### 4.2 Production site table (27 sites + 1 deferred)

| # | File:line (verified 2026-05-01) | Pattern | Disposition | Notes / drift from spec |
|---|--------------------------------|---------|-------------|--------------------------|
| 1 | `recoil/execution/pass_store.py:205` | `(record.get("cost_usd") or 0.0) + value` | **Sanctioned-fallback** | Accumulator initial-value path. Wrap with `read_cost_from_record_safe()` (sanctioned-fallback variant; logs WARNING with `"FALLBACK_FIRED cost_unknown"` token). DRIFT: spec said line 177; actual line 205. |
| 2 | `recoil/workspace/server.py:969` | `pass_record.get("cost_usd", 0)` | **Sanctioned-fallback** | Display-only — UI tolerates missing. Migrate to `read_cost_from_record_safe()`. DRIFT: spec said line 1038; actual line 969 (Phase B refactor shifted lines). |
| 3 | `recoil/workspace/server.py:2218` | `record.get("cost_usd", 0)` | **Sanctioned-fallback** | Display-only. Migrate to `read_cost_from_record_safe()`. DRIFT: spec said line 3043; actual line 2218 (Phase B refactor shifted lines). |
| 4 | `recoil/tools/shootout/run_shootout.py:158` | `result.metadata.get("cost_usd", 0.0) or 0.0` | **Hard migration** to `read_cost_from_result()` | Production tool — must surface broken cost. |
| 5 | `recoil/pipeline/orchestrator/pipeline.py:315` | `result.metadata.get("cost_usd") or 0.0` | **Hard migration** to `read_cost_from_result()` | Pipeline orchestrator. |
| 6 | `recoil/pipeline/orchestrator/pipeline.py:340` | `"cost": result.metadata.get("cost_usd") or 0.0` | **Hard migration** | Receipt-write path. |
| 7 | `recoil/pipeline/orchestrator/pipeline.py:565` | `result.metadata.get("cost_usd") or 0.0` | **Hard migration** | Same pattern as line 315. |
| 8 | `recoil/pipeline/orchestrator/pipeline.py:588` | `"cost": result.metadata.get("cost_usd") or 0.0` | **Hard migration** | Receipt-write path. |
| 9 | `recoil/pipeline/orchestrator/pipeline.py:814` | `result.metadata.get("cost_usd") or 0.0` | **Hard migration** | Same pattern as line 315. |
| 10 | `recoil/pipeline/orchestrator/pipeline.py:837` | `"cost": result.metadata.get("cost_usd") or 0.0` | **Hard migration** | Receipt-write path. |
| 11 | `recoil/pipeline/orchestrator/pipeline.py:1288` | `video_cost = video_result.metadata.get("cost_usd") or 0.0` | **Hard migration** | Video-cost extraction; missing cost is real bug. |
| 12 | `recoil/pipeline/orchestrator/pipeline.py:1542` | `video_cost = video_result.metadata.get("cost_usd") or 0.0` | **Hard migration** | Video-cost extraction. |
| — | `recoil/pipeline/orchestrator/production_loop.py:83` | `cost_usd=md.get("cost_usd", 0.0) or 0.0` | **DEFERRED — out of scope per byte-untouched gate** | production_loop.py is byte-frozen. Phase 5 deferred migration documented in `POST_PHASE_C_HANDOFF.md`. |
| 13 | `recoil/pipeline/orchestrator/learning_engine.py:231` | `total_cost = sum(r.get("cost_usd", 0.0) for r in records)` | **Sanctioned-fallback** | Aggregation tolerates missing. Migrate to `read_cost_from_record_safe()` with WARNING log. |
| 14 | `recoil/pipeline/core/eval.py:215` | `cost_usd=float(d.get("cost_usd") or 0.0)` | **Sanctioned-fallback** | Legacy-dict deserialization. Migrate to `read_cost_from_record_safe()`. |
| 15 | `recoil/pipeline/api/routes/generation.py:2346` | `cost_usd = result.metadata.get("cost_usd", 0.0) or 0.0` | **Hard migration** to `read_cost_from_result()` | API route — must surface bug. |
| 16 | `recoil/pipeline/editors/review_server.py:7421` | `cost_usd = result.metadata.get("cost_usd", 0.0) or 0.0` | **Hard migration** | DRIFT: spec said line 7414; actual line 7421. |
| 17 | `recoil/pipeline/lib/run_shot.py:446` | `actual_cost = step_result.metadata.get("cost_usd") or 0.0` | **Hard migration** to `read_cost_from_result()` | Production loop's per-shot accounting. |
| 18 | `recoil/pipeline/tools/single_take_sh03.py:89` | `cost_usd = result.metadata.get("cost_usd", 0.0) or 0.0` | **Hard migration** | Production tool. |
| 19 | `recoil/pipeline/tools/generate_keyframes.py:175` | `cost_usd = result.metadata.get("cost_usd", 0.0) or 0.0` | **Hard migration** | Production tool. |
| 20 | `recoil/pipeline/lib/manifest_writer.py:209` | `total_cost += m.get("cost_usd") or 0` | **Sanctioned-fallback** | Aggregation. Migrate to `read_cost_from_record_safe()`. |
| 21 | `recoil/pipeline/tools/client_sequence_runner.py:62` | `cost_usd=md.get("cost_usd", 0.0) or 0.0` | **Hard migration** | Production tool. |
| 22 | `recoil/pipeline/tools/generate_camera_refs.py:336` | `sum(e.get("cost_usd", 0) for e in entries)` | **Sanctioned-fallback** | Aggregation over manifest entries. Migrate to `read_cost_from_record_safe()`. |
| 23 | `recoil/pipeline/tools/generate_camera_refs.py:408` | `total_cost = sum(e.get("cost_usd", 0) for e in entries)` | **Sanctioned-fallback** | Aggregation. |
| 24 | `recoil/pipeline/tools/dispatch_cli.py:610` | `cost_usd = float(result.metadata.get("cost_usd") or 0.0)` | **Hard migration** | Dispatch CLI. |
| 25 | `recoil/pipeline/tools/dispatch_cli.py:749` | `cost_usd = float(result.metadata.get("cost_usd") or 0.0)` | **Hard migration** | Same pattern as 610. |
| 26 | `recoil/pipeline/tools/dispatch_cli.py:1649` | `cost_usd = float(result.metadata.get("cost_usd") or 0.0)` | **Hard migration** | Same pattern. |
| 27 | `recoil/pipeline/tools/dispatch_cli.py:1651` | `cost_usd = float(getattr(result, "cost_usd", 0.0) or 0.0)` | **Hard migration** | getattr variant — wrap in `read_cost_from_result()` which handles both attr + metadata-dict shapes. |

**Site count:** 27 production sites (Phase 9 migrates all 27) + 1 explicitly
deferred (`production_loop.py:83`).

### 4.3 Hard / sanctioned split

| Disposition | Count | Sites |
|-------------|-------|-------|
| **Hard migration** to `read_cost_from_result()` (raises `CostMissingError`) | **19** | 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16, 17, 18, 19, 21, 24, 25, 26, 27 |
| **Sanctioned-fallback** to `read_cost_from_record_safe()` (logs WARNING) | **8** | 1, 2, 3, 13, 14, 20, 22, 23 |

Final canonical split:

- **Hard migration: 19 sites.**
- **Sanctioned-fallback: 8 sites.**
- **Total: 27 production sites + 1 deferred (production_loop.py:83).**

Phase 9 executes the migrations in these counts.
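
To make the difference between the two dispositions concrete, here is a hedged sketch of the two readers. The function names, `CostMissingError`, and the `"FALLBACK_FIRED cost_unknown"` log token come from the site table; the attribute-then-metadata lookup order and the treatment of a legitimate `0.0` as a valid cost are assumptions.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


class CostMissingError(Exception):
    """Raised when a result carries no usable cost_usd."""


def read_cost_from_result(result) -> float:
    """Hard path: read cost_usd from the attribute or metadata dict, else raise."""
    cost = getattr(result, "cost_usd", None)
    if cost is None:
        metadata = getattr(result, "metadata", None) or {}
        cost = metadata.get("cost_usd")
    if cost is None:
        raise CostMissingError("result has no cost_usd")
    return float(cost)


def read_cost_from_record_safe(record: dict) -> float:
    """Sanctioned-fallback path: tolerate a missing cost, but log it."""
    cost = record.get("cost_usd")
    if cost is None:
        logger.warning("FALLBACK_FIRED cost_unknown")
        return 0.0
    return float(cost)


# A real zero is preserved; only a missing value is treated as an error.
free_run = SimpleNamespace(metadata={"cost_usd": 0.0})
free_cost = read_cost_from_result(free_run)
```

Unlike the `x.get("cost_usd") or 0.0` pattern, checking `is None` distinguishes a missing value from a genuine zero cost, which is the distinction the hard-migration sites need.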

### 4.4 Bonus eval-cost sites (out of scope; documented for completeness)

| File:line | Pattern | Disposition |
|-----------|---------|-------------|
| `recoil/pipeline/core/eval.py:661` | `current = float(receipt.provenance.get("eval_cost_usd") or 0.0)` | NOT a generation-cost site — this reads `eval_cost_usd` from receipt provenance (CP-9 separation). NOT in scope for Phase 9 (which migrates `cost_usd` reads only). |
| `recoil/pipeline/core/eval.py:663` | `current + float(scorecard.get("panel_cost_usd") or 0.0)` | Same — eval cost domain. NOT in scope. |

These two sites are explicitly out of scope for Phase 9 because they read
`eval_cost_usd` / `panel_cost_usd`, not `cost_usd`. Under the CP-9
separation, eval cost flows through `receipt.provenance["eval_cost_usd"]`,
distinct from generation cost in `RunResult.metadata.cost_usd`.

### 4.5 Test-only sites (left alone)

Per spec line 86 + Phase 9 § "no test-site migration": tests captured the
pre-fix behaviour; changing them mid-build can mask regressions. Test sites
are NOT enumerated in this audit — Phase 9's grep filter explicitly
excludes any path containing `tests/` or `test_*.py`.

### 4.6 Cost helper inventory (compute side)

For completeness — the `compute_cost` family that **writes** cost_usd:

| # | File:line (verified 2026-05-01) | Method | Signature | Returns | Phase C disposition |
|---|--------------------------------|--------|-----------|---------|---------------------|
| 1 | `recoil/execution/providers/base.py:167` | `ProviderAdapter.compute_cost` (Protocol) | `(self, duration_s: float, tier: str, profile: dict) -> float` | float | Becomes thin wrapper around canonical `pipeline/core/cost.py::compute_cost()`. |
| 2 | `recoil/execution/providers/fal.py:279` | `FalAdapter.compute_cost` | same as Protocol | float | Wrapper. |
| 3 | `recoil/execution/providers/atlas.py:181` | `AtlasAdapter.compute_cost` | same | float | Wrapper. |
| 4 | `recoil/execution/providers/piapi.py:215` | `PiapiAdapter.compute_cost` | same | float | Wrapper. |
| 5 | `recoil/execution/providers/wan.py:239` | `WanAdapter.compute_cost` | (slightly different — see source) | float | Wrapper. |
| 6 | `recoil/execution/providers/google.py:234` | `GoogleAdapter.compute_cost` | same as Protocol | float | Wrapper. |
| 7 | `recoil/execution/providers/kling.py:231` | `KlingAdapter.compute_cost` | (slightly different — see source) | float | Wrapper. |
| 8 | `recoil/execution/providers/elevenlabs.py:101` | `_compute_cost(model_id, char_count) -> float` (module-level) | **DIFFERENT — char_count-based** | float | Phase 6 wraps. |
| 9 | `recoil/execution/providers/sync_so.py:128` | `_compute_cost(model_id, duration_s) -> float` (module-level) | (different — duration_s-only, no tier) | float | Phase 7 wraps. |
| 10 | `recoil/execution/providers/gemini_vision.py:217` | `_compute_cost(...)` (module-level, token-based) | **DIFFERENT — per-1k-token rates** | float | Phase 8's canonical dispatcher absorbs this. |

**Phase 8's resolution:** create
`pipeline/core/cost.py::compute_cost(model_id, *, duration_s=None,
char_count=None, token_inputs=None, tier=None) -> float` — a **dispatcher**
that reads the model's profile via `core.model_profiles.get_profile()` (the
Phase A canonical helper), determines the model's billing modality from the
profile, and routes to the right calculation. Each provider adapter's
`compute_cost` (and module-level `_compute_cost`) becomes a thin wrapper.

This keeps the Protocol signature stable while collapsing the 4 cost
shapes (duration-and-tier, character-count, duration-only, token-based)
into one dispatcher. This section documents the Phase-8 dispatcher
contract; Phase 9 reads cost via the typed helper.
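
A rough sketch of what a dispatcher with that signature might look like. The profile schema (`billing`, `rates`, `rate`), the model ids, the rates, and the local `get_profile` stand-in are all invented for illustration; the real profile format is not shown in this audit, and `token_inputs` is assumed here to be a plain token count.

```python
from typing import Optional

# Hypothetical profiles standing in for core.model_profiles.get_profile().
_PROFILES = {
    "demo-video": {"billing": "per_second", "rates": {"standard": 0.05, "pro": 0.10}},
    "demo-tts": {"billing": "per_char", "rate": 0.0001},
    "demo-vision": {"billing": "per_1k_tokens", "rate": 0.002},
}


def get_profile(model_id: str) -> dict:
    return _PROFILES[model_id]


def _require(value, name: str, model_id: str):
    # Missing inputs are surfaced, never silently treated as zero.
    if value is None:
        raise ValueError(f"{model_id}: {name} is required for this billing modality")
    return value


def compute_cost(
    model_id: str,
    *,
    duration_s: Optional[float] = None,
    char_count: Optional[int] = None,
    token_inputs: Optional[int] = None,
    tier: Optional[str] = None,
) -> float:
    """Route to the calculation that matches the model's billing modality."""
    profile = get_profile(model_id)
    billing = profile["billing"]
    if billing == "per_second":
        rate = profile["rates"][_require(tier, "tier", model_id)]
        return _require(duration_s, "duration_s", model_id) * rate
    if billing == "per_char":
        return _require(char_count, "char_count", model_id) * profile["rate"]
    if billing == "per_1k_tokens":
        return _require(token_inputs, "token_inputs", model_id) / 1000 * profile["rate"]
    raise ValueError(f"{model_id}: unknown billing modality {billing!r}")
```

A provider wrapper would then reduce to a single call such as `compute_cost(model_id, duration_s=duration_s, tier=tier)`, leaving the modality decision to the profile.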

---

## Section 5: payload.hints field inventory

`UnifiedVideoPayload.hints` is currently typed as `dict[str, Any]` (verified
at `execution/providers/base.py:60`). 51 references exist in
`recoil/execution/` (production, excluding tests). Phase 10 replaces the
dict with a typed Pydantic surface.

### 5.1 Per-file site inventory (verified 2026-05-01)

| File | Site count | Hint keys observed | File:line list |
|------|-----------|--------------------|----------------|
| `execution/providers/base.py` | 1 | (declaration: `hints: dict = field(default_factory=dict)`) | base.py:60 |
| `execution/step_runner.py` | 8 (visible from grep on tokens `hints[` / `hints.` / `hints=`) | `modality` (write), `multi_shots` (read/write), `parts`, `elements`, legacy `o3_elements` | step_runner.py:419, 588, 595, 988, 1452, 1463, 2055, 2459, 2466 |
| `execution/video_model_client.py` | 1 | passes hints dict through | video_model_client.py:102 |
| `execution/providers/wan.py` | 7 | `enable_prompt_expansion`, `enable_safety_checker`, `seed`, `audio_url`, `video_url`, `multi_shots` | wan.py:271, 274, 286, 290, 294, 328, 355, 358 |
| `execution/providers/kling.py` | 6 | `endpoint`, `mode`, `elements`, `image_url` | kling.py:53, 55, 56, 60, 61, 62, 154 |
| `execution/providers/google.py` | 7 | `modality`, `parts`, `genai_config` | google.py:70, 100, 110, 152, 254, 256, 259, 260 |
| `execution/providers/testing/mock_google.py` | 2 | `modality` (test mock) | mock_google.py:40, 60 |

The remaining ~20 references in the 51-reference total are comments,
docstrings, and inline references that don't read or write the hints
dict. The grep pattern `hints\[|hints\.|payload\.hints|hints=|hints:`
captured both production reads/writes and structural references (e.g.,
the typed-dataclass field declaration, docstring mentions).
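
To make the over-capture concrete, here is a small sketch that applies the same pattern with Python's `re` module. The sample lines are invented for illustration and are not copied from the codebase.

```python
import re

# The grep pattern quoted above, compiled as a Python regex.
HINTS_PATTERN = re.compile(r"hints\[|hints\.|payload\.hints|hints=|hints:")

# Invented sample lines, one per kind of hit.
SAMPLE_LINES = [
    'seed = (payload.hints or {}).get("seed")',           # production read
    "payload = build_payload(prompt, hints=step_hints)",  # production write
    "hints: dict = field(default_factory=dict)",          # field declaration
    "# provider hints: see base.py",                      # comment mention
    "duration = payload.duration_s",                      # unrelated line
]

# Every line the pattern hits, regardless of whether it touches the dict.
matched = [line for line in SAMPLE_LINES if HINTS_PATTERN.search(line)]

# Hits that are pure comments and therefore never read or write hints.
comment_only = [line for line in matched if line.lstrip().startswith("#")]

print(f"{len(matched)} matches, {len(comment_only)} of them comment-only")
```

The pattern cannot tell a dict access from a mention, which is why the raw reference total is higher than the per-file site counts.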

### 5.2 Per-provider canonical key list

The Pydantic models in Phase 10 reflect the per-provider keys observed.
Each model has `extra="forbid"` so unknown keys raise on construction —
this is the typing teeth of the Phase 10 migration.

#### 5.2.1 `WanHints` (6 fields)

| Field | Type | Default | Source citation |
|-------|------|---------|-----------------|
| `enable_prompt_expansion` | `bool` | `False` | wan.py:271 |
| `enable_safety_checker` | `bool` | `True` | wan.py:274 |
| `seed` | `Optional[int]` | `None` | wan.py:286, 358 |
| `audio_url` | `Optional[str]` | `None` | wan.py:290 |
| `video_url` | `Optional[str]` | `None` | wan.py:294 |
| `multi_shots` | `bool` | `False` | wan.py:355 |

#### 5.2.2 `KlingHints` (4 fields)

| Field | Type | Default | Source citation |
|-------|------|---------|-----------------|
| `endpoint` | `Optional[str]` | `None` | kling.py:56 |
| `mode` | `Literal["standard", "professional"]` | `"standard"` | kling.py:60 |
| `elements` | `Optional[list[Any]]` | `None` | kling.py:61, 154 |
| `image_url` | `Optional[str]` | `None` | kling.py:62 |

#### 5.2.3 `GoogleHints` (3 fields)

| Field | Type | Default | Source citation |
|-------|------|---------|-----------------|
| `modality` | `Literal["image", "video"]` | `"video"` | google.py:100, 152 |
| `parts` | `Optional[list[Any]]` | `None` (multimodal prompt parts; assembler-built) | google.py:260 |
| `genai_config` | `Optional[dict]` | `None` (last-mile genai SDK config; opaque to recoil) | google.py:254-256 |

#### 5.2.4 `StepRunnerHints` (5 fields)

Used by `step_runner.py` BEFORE provider routing. `modality` is set by
step_runner for downstream provider dispatch (Google reads it; others
ignore). `multi_shots` is set for Seedance R2V multi-shot mode. `parts`
and `elements` are legacy passthrough.

| Field | Type | Default | Source citation |
|-------|------|---------|-----------------|
| `modality` | `Optional[Literal["image", "video", "audio"]]` | `None` | step_runner.py:588-597 (writes "video"); step_runner.py:1452, 1463 (writes "image") |
| `multi_shots` | `Optional[list[Any]]` | `None` | step_runner.py:988 (read in routing) |
| `parts` | `Optional[list[Any]]` | `None` | step_runner.py:1452, 1463 (writes for image-modality google routing) |
| `elements` | `Optional[list[Any]]` | `None` | step_runner.py:419 (legacy fal.ai O3 path), step_runner.py:1463, 2055 |
| `o3_elements` | `Optional[Any]` | `None` (legacy fal.ai pathway) | step_runner.py:419 (mentioned as legacy) |

### 5.3 Pydantic model surface (Phase 10 spec)

```python
# recoil/execution/providers/payload_hints.py — NEW FILE (Phase 10)

from typing import Any, Literal, Optional

from pydantic import BaseModel


class PayloadHintsValidationError(Exception):
    """Phase-C temp local. FIXME(phase-e): migrate to recoil/lib/exceptions.py"""


class PayloadHints(BaseModel):
    """Base for all provider-specific payload hints. Subclass per provider."""
    model_config = {"extra": "forbid"}  # unknown keys raise


class WanHints(PayloadHints):
    enable_prompt_expansion: bool = False
    enable_safety_checker: bool = True
    seed: Optional[int] = None
    audio_url: Optional[str] = None
    video_url: Optional[str] = None
    multi_shots: bool = False


class KlingHints(PayloadHints):
    endpoint: Optional[str] = None
    mode: Literal["standard", "professional"] = "standard"
    elements: Optional[list[Any]] = None
    image_url: Optional[str] = None


class GoogleHints(PayloadHints):
    modality: Literal["image", "video"] = "video"
    parts: Optional[list[Any]] = None
    genai_config: Optional[dict] = None


class StepRunnerHints(PayloadHints):
    modality: Optional[Literal["image", "video", "audio"]] = None
    multi_shots: Optional[list[Any]] = None
    parts: Optional[list[Any]] = None
    elements: Optional[list[Any]] = None
    o3_elements: Optional[Any] = None
```
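
Pydantic's `model_validate()` raises its own `ValidationError` when `extra="forbid"` rejects a key, so something has to translate that into `PayloadHintsValidationError`. The sketch below shows one way, assuming Pydantic v2 and a hypothetical `parse_hints` helper that is not part of the Phase 10 spec; `KlingHints` is repeated so the block runs on its own.

```python
from typing import Any, Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class PayloadHintsValidationError(Exception):
    """Local stand-in for the Phase-C temp exception."""


class PayloadHints(BaseModel):
    model_config = ConfigDict(extra="forbid")  # unknown keys are rejected


class KlingHints(PayloadHints):
    endpoint: Optional[str] = None
    mode: Literal["standard", "professional"] = "standard"
    elements: Optional[list[Any]] = None
    image_url: Optional[str] = None


def parse_hints(model_cls: type[PayloadHints], raw: Optional[dict]) -> PayloadHints:
    """Hypothetical helper: validate a raw dict, re-raising as the local error."""
    try:
        return model_cls.model_validate(raw or {})
    except ValidationError as exc:
        # Keep the original error chained so the offending key stays visible.
        raise PayloadHintsValidationError(str(exc)) from exc
```

With this helper, `parse_hints(KlingHints, {"seed": 7})` fails because `seed` is a Wan key, not a Kling one, which is the cross-provider leak the typed surface is meant to catch.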

### 5.4 Backward-compat shim

Every provider's existing `(payload.hints or {}).get(KEY)` read survives
with a one-line edit, because `PayloadHints.model_dump()` produces a dict.
Each provider adapter replaces
`(payload.hints or {}).get(KEY)` with
`(payload.hints.model_dump() if payload.hints else {}).get(KEY)`.

After Phase 11, callers transition to direct attribute access
(`payload.hints.modality` etc.) for new code. The dict-fallback remains
for one deprecation cycle (removed post-Phase-D).
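
A minimal sketch of the shim read next to the later attribute access, assuming Pydantic v2. `read_hint` is a hypothetical helper that only wraps the one-line expression from the text, and `WanHints` is trimmed to two fields.

```python
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict


class WanHints(BaseModel):
    """Trimmed copy of the Phase 10 model, for illustration only."""
    model_config = ConfigDict(extra="forbid")
    enable_safety_checker: bool = True
    seed: Optional[int] = None


def read_hint(hints: Optional[BaseModel], key: str) -> Any:
    """Shim read: dict-style lookup that tolerates hints being None."""
    return (hints.model_dump() if hints else {}).get(key)


typed = WanHints(seed=7)

# Transitional dict-style read, as in the one-line provider edit.
seed_via_shim = read_hint(typed, "seed")

# Direct attribute access, the target style for new code.
seed_via_attr = typed.seed
```

One difference is worth checking per site: `model_dump()` includes field defaults, so a key the caller never set comes back as its declared default (for example `True` for `enable_safety_checker`) rather than `None`. If a site depends on the old "absent key" behaviour, `model_dump(exclude_unset=True)` restores it.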

### 5.5 NOT in scope

ElevenLabs / Sync.so / Gemini Vision are **not** in the payload-hints
scope because they don't use `UnifiedVideoPayload` — they have their own
typed dataclasses (`SynthesisResult`, etc.). Their `_compute_cost`
helpers are folded into the canonical `compute_cost()` dispatcher in
Phase 8.

---

## Section 6: Tenet 6 escalation contract

The single most important behavioural change in Phase C is that the
canonical `classify_failure()` MUST escalate on unknown error shapes —
NOT silently default to TRANSIENT (which would retry forever) or
PERMANENT (which would give up silently). This codifies Tenet 6 (Errors
Must Be Visible) from `architectural-law.md`.

### 6.1 Three terminal states

`classify_failure()` has **exactly three possible terminal states** under the
default `escalate_unknown=True` (§6.2 covers the opt-out):

1. **Returns `(mode, confidence)`** where `mode` ∈ {known FailureMode
   values, except UNKNOWN and NONE}. Normal classification path.
2. **Raises `UnknownFailureEscalation`** with structured context (full
   error text, gate verdict, caller identity, http_status). Tenet 6 —
   surface unclassifiable input rather than silently returning UNKNOWN.
3. **Returns `(FailureMode.NONE, 1.0)`** when no failure exists (caller
   passed an explicit no-failure indicator — typically when the caller
   already knows the run succeeded but wants the canonical "no failure"
   sentinel).

### 6.2 The `escalate_unknown` keyword

`classify_failure()` accepts an `escalate_unknown: bool = True` keyword
argument. The default is **True** (escalate). Callers that explicitly
opt in to "unknown-as-data" semantics — for instance, score-card
classifiers that surface UNKNOWN as a low-confidence signal rather than
an error — pass `escalate_unknown=False`, in which case unknown
classifications return `(FailureMode.UNKNOWN, low_confidence)` instead of
raising.

```python
# Default: escalates
mode, conf = classify_failure(error_text="garbage random text")
# raises UnknownFailureEscalation

# Opt-in to unknown-as-data
mode, conf = classify_failure(
    error_text="garbage random text",
    escalate_unknown=False,
)
# returns (FailureMode.UNKNOWN, 0.30)
```
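
The three terminal states and the `escalate_unknown` opt-out can be sketched together. Everything except `FailureMode.NONE`, `FailureMode.UNKNOWN`, and the `escalate_unknown` keyword is an assumption made for the example: the enum string values, the two extra enum members, the pattern table, the confidence values, and the `no_failure` flag standing in for the "explicit no-failure indicator".

```python
import enum
from typing import Optional


class FailureMode(enum.Enum):
    # Hypothetical subset with placeholder values; the real enum has 23 members.
    NONE = "none"
    UNKNOWN = "unknown"
    TIMEOUT = "timeout"
    CONTENT_FILTER = "content_filter"


class UnknownFailureEscalation(Exception):
    """Simplified stand-in for the exception defined in §6.3."""


# Hypothetical pattern table: lowercase fragment -> (mode, confidence).
_PATTERNS = {
    "timed out": (FailureMode.TIMEOUT, 0.9),
    "content policy": (FailureMode.CONTENT_FILTER, 0.9),
}


def classify_failure(error_text: Optional[str] = None, *,
                     no_failure: bool = False,
                     escalate_unknown: bool = True) -> tuple[FailureMode, float]:
    if no_failure:
        # Terminal state 3: caller already knows nothing failed.
        return FailureMode.NONE, 1.0
    lowered = (error_text or "").lower()
    for fragment, verdict in _PATTERNS.items():
        if fragment in lowered:
            # Terminal state 1: normal classification.
            return verdict
    if escalate_unknown:
        # Terminal state 2: surface the unclassifiable input.
        raise UnknownFailureEscalation(f"unclassifiable: {error_text!r}")
    # Opt-in unknown-as-data path.
    return FailureMode.UNKNOWN, 0.30
```

Because the raise sits after every known pattern has been tried, adding a new pattern narrows the escalation path without changing any caller.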

### 6.3 Escalation context

`UnknownFailureEscalation` carries structured context as instance attributes
(its `args` hold only the formatted message):

```python
from typing import Any, Optional


class UnknownFailureEscalation(Exception):
    """Phase-C temp local. FIXME(phase-e): migrate to recoil/lib/exceptions.py"""
    def __init__(
        self,
        *,
        error_text: Optional[str],
        gate_verdict: Optional[Any],
        http_status: Optional[int],
        caller_id: Optional[str] = None,
    ):
        super().__init__(
            f"Unknown failure shape — error_text={error_text!r}, "
            f"gate_verdict={gate_verdict!r}, http_status={http_status!r}, "
            f"caller_id={caller_id!r}"
        )
        self.error_text = error_text
        self.gate_verdict = gate_verdict
        self.http_status = http_status
        self.caller_id = caller_id
```

Callers wrap the escalation in their existing error-handling layer; the
escalation is observable, not catastrophic. The classifier ALSO logs at
WARNING with the same structured context before raising, so even if a
caller catches and swallows the exception, the operator sees the
unclassifiable input in logs.
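
The log-then-raise ordering is what keeps a swallowed exception visible. A minimal sketch, assuming a hypothetical logger name, a trimmed-down exception, and an invented caller `run_step`:

```python
import logging
from typing import NoReturn, Optional

logger = logging.getLogger("failure_mode_sketch")  # hypothetical logger name


class UnknownFailureEscalation(Exception):
    """Trimmed stand-in for the exception shown above."""

    def __init__(self, *, error_text: Optional[str],
                 caller_id: Optional[str] = None):
        super().__init__(f"Unknown failure shape: error_text={error_text!r}")
        self.error_text = error_text
        self.caller_id = caller_id


def escalate(error_text: Optional[str], caller_id: str) -> NoReturn:
    """Log at WARNING first, then raise, so the log outlives a swallowed exception."""
    logger.warning("unclassifiable failure: error_text=%r caller_id=%r",
                   error_text, caller_id)
    raise UnknownFailureEscalation(error_text=error_text, caller_id=caller_id)


def run_step(error_text: str) -> str:
    """Hypothetical caller that folds the escalation into its own handling."""
    try:
        escalate(error_text, caller_id="run_step")
    except UnknownFailureEscalation as exc:
        # The caller decides what to do; the WARNING record already exists.
        return f"needs-operator-review ({exc.caller_id})"
```

Even if `run_step` had used a bare `pass` in the handler, the WARNING record would still have been emitted before the exception was caught.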

### 6.4 `failure_category_for()` escalation contract

`failure_category_for(mode: FailureMode) -> FailureCategory` is **total**
over `FailureMode` except for two sentinel inputs:

- `FailureMode.NONE` → raises `ValueError("Cannot coarsen NONE — caller
  error, NONE means no failure exists")`.
- `FailureMode.UNKNOWN` → raises `UnknownFailureEscalation`. A caller that
  wants a default must pass `escalate_unknown=False` to a sibling helper;
  the coarsening function itself never returns a category for UNKNOWN,
  because UNKNOWN has none in the canonical mapping.

This guarantees retry-policy lookups (`DEFAULT_RETRY_POLICIES[cat]`)
never silently miss — every reachable `FailureCategory` has a policy.
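
A sketch of the coarsening contract, assuming two hypothetical non-sentinel modes, two hypothetical categories, placeholder enum values, and integer retry budgets as policy values. The real mapping is the §1.3 table.

```python
import enum


class FailureMode(enum.Enum):
    NONE = "none"
    UNKNOWN = "unknown"
    TIMEOUT = "timeout"                # hypothetical member
    CONTENT_FILTER = "content_filter"  # hypothetical member


class FailureCategory(enum.Enum):
    TRANSIENT = "transient"  # hypothetical member
    PERMANENT = "permanent"  # hypothetical member


class UnknownFailureEscalation(Exception):
    """Simplified stand-in for the exception in §6.3."""


# Hypothetical mapping and policies (policy value = max retries).
_MODE_TO_CATEGORY = {
    FailureMode.TIMEOUT: FailureCategory.TRANSIENT,
    FailureMode.CONTENT_FILTER: FailureCategory.PERMANENT,
}
DEFAULT_RETRY_POLICIES = {
    FailureCategory.TRANSIENT: 3,
    FailureCategory.PERMANENT: 0,
}


def failure_category_for(mode: FailureMode) -> FailureCategory:
    if mode is FailureMode.NONE:
        raise ValueError("Cannot coarsen NONE")  # message abbreviated here
    if mode is FailureMode.UNKNOWN:
        raise UnknownFailureEscalation("UNKNOWN has no category")
    return _MODE_TO_CATEGORY[mode]


# Totality check: every non-sentinel mode coarsens to a category with a policy.
SENTINELS = {FailureMode.NONE, FailureMode.UNKNOWN}
for mode in FailureMode:
    if mode not in SENTINELS:
        assert failure_category_for(mode) in DEFAULT_RETRY_POLICIES
```

A loop like the one at the bottom is cheap to keep as a unit test: adding a `FailureMode` member without a mapping entry fails immediately with a `KeyError` instead of at retry time.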

### 6.5 Behaviour preservation

The 5 legacy classifiers DID silently default to TRANSIENT or PERMANENT
on unclassifiable input. Phase C deliberately **breaks** that
behaviour — but breaks it visibly. Phase 0's
`phase_c_classifier_fixtures.json` tags the unknown-input cases with
`"expect_unknown": true`; Phase 12 verifies they raise
`UnknownFailureEscalation` (instead of silently returning TRANSIENT or
PERMANENT).

This is the **only deliberate behaviour change** in Phase C. All other
behaviour is preserved bit-exact.

---

## Section 7: Phase ordering implications

This audit (Phase 1) is the foundational input for Phases 2-12. Each
section above feeds specific downstream phases.

### 7.1 Section dependencies

| Section | Feeds Phase(s) | What's consumed |
|---------|----------------|------------------|
| §1 (Enum inventory) | Phase 2, Phase 4 | Phase 2 imports `FailureMode` from `core.critic`; Phase 4 wires `failure_category_for()` from §1.3 mapping table. |
| §2 (Classifier inventory) | Phases 3, 4, 5, 6, 7 | Each phase migrates a specific classifier. Phase 5 is a NO-OP file edit (production_loop.py byte-untouched) — only `strategy_registry.detect_failure_mode` becomes the wrapper. |
| §3 (Transient-pattern UNION) | Phase 2 | `TRANSIENT_PATTERN_STRINGS` and `TRANSIENT_HTTP_CODES` constants live in `pipeline/core/failure_mode.py`. Phase 2 also embeds `CONTENT_FILTER_PATTERNS` and `BUDGET_PATTERNS` from §3.5. |
| §4 (Cost-zero-fallback sites) | Phase 8, Phase 9 | Phase 8 builds the helper. Phase 9 migrates the 27 sites per the hard/sanctioned table. |
| §5 (payload.hints inventory) | Phase 10, Phase 11 | Phase 10 ships the Pydantic models. Phase 11 verifies the Protocol contract is preserved. |
| §6 (Tenet 6 escalation) | Phase 2, Phase 12 | Phase 2 implements; Phase 12 hard-gates that unknown-input fixtures escalate. |

### 7.2 Phase parallelism implications

Per the spec dependency graph:

```
Phase 1 (this audit)
    └─> Phase 2 (failure_mode.py canonical module)
            ├─> Phase 3 (retry_dispatcher migration)
            ├─> Phase 4 (strategy_registry migration)
            ├─> Phase 5 (production_loop drop-in via strategy_registry — NO file edit)
            ├─> Phase 6 (elevenlabs migration)  ─┐
            ├─> Phase 7 (sync_so migration)     ─┤── parallel sub-agents
            └─> Phase 8 (cost.py canonical module)
                    └─> Phase 9 (27-site cost migration)
                            └─> Phase 10 (payload_hints.py + provider adapter typing)
                                    └─> Phase 11 (Protocol contract verification)
                                            └─> Phase 12 (HARD GATE)
```

**Phases 6 and 7** can run as parallel sub-agents because they touch
disjoint provider files (`elevenlabs.py` vs `sync_so.py`).

**Phases 1-5 are sequential** — each depends on the previous. Phase 5 is
a no-op file edit (production_loop is byte-untouched per CP-9 lock); the
"migration" happens via strategy_registry.detect_failure_mode becoming a
thin wrapper around the canonical classify_failure.

**Phases 8-12 are sequential.**

### 7.3 Frozen-contract cross-reference

Phase C must NOT change:

- `RunResult.metadata` shape (CP-4 contract) — Phase 12 byte-diff verifies.
- `GenerationReceipt.provenance.eval_cost_usd` separation (CP-9 lock).
- `sidecar.provenance.cost` field name on disk (data-contracts.md §1c).
- `PassStore.passes[*].cost_usd` shape on disk (data-contracts.md §2a).
- `ExecutionStore.shots[*].cost_incurred` field name (data-contracts.md
  §3a).
- `FailureMode` enum **string values** (data-contracts.md cross-cutting).
- The 9 `FailureCategory` values (preserved as a derived view).
- `production_loop.py` — entire file byte-unchanged.

Phase C may ADD:

- `recoil/pipeline/core/failure_mode.py`
- `recoil/pipeline/core/cost.py`
- `recoil/execution/providers/payload_hints.py`
- New module-level constants `TRANSIENT_PATTERN_STRINGS`,
  `TRANSIENT_HTTP_CODES`, `CONTENT_FILTER_PATTERNS`, `BUDGET_PATTERNS`.
- New temp local exception classes (`CostMissingError`,
  `UnknownFailureEscalation`, `PayloadHintsValidationError`).
- New tests under `recoil/pipeline/core/tests/`: `test_failure_mode.py`,
  `test_cost.py`, `test_payload_hints.py`.
- This audit doc.
- New handoff doc
  `consultations/recoil/engine-architectural-audit-2026-04-30/POST_PHASE_C_HANDOFF.md`.

### 7.4 Drift-from-spec ledger

This Phase 1 re-grep on 2026-05-01 detected the following line-number
drift from the spec (which was authored 2026-04-30):

| Spec citation | Verified citation | Cause |
|---------------|-------------------|-------|
| `execution/pass_store.py:177` | `execution/pass_store.py:205` | Independent pre-existing drift; not Phase B related. |
| `workspace/server.py:1038` | `workspace/server.py:969` | Phase B (`workspace/coverage.py` extraction) shifted lines. |
| `workspace/server.py:3043` | `workspace/server.py:2218` | Phase B shifted lines. |
| `pipeline/editors/review_server.py:7414` | `pipeline/editors/review_server.py:7421` | Independent minor drift. |

All other 23 cost-zero-fallback sites match the spec verbatim. All
classifier and enum citations match the spec verbatim. Phase 9's pre-edit
re-grep step (Phase 9 § "Re-grep before edit") catches any further drift
introduced between Phase 1 and Phase 9 execution.

---

## Section 8: Open questions / JT review items

(Captured during audit; resolved before Phase 2 dispatches.)

| # | Question | Default disposition | JT confirmation needed? |
|---|----------|---------------------|-------------------------|
| 1 | Should the 8 sanctioned-fallback sites all log an identical token (`"FALLBACK_FIRED cost_unknown"`) or vary by site? | Identical token; the Phase E sanctioned-fallback registry keys against it. | No; spec line 300 confirms. |
| 2 | Does `429` belong in `TRANSIENT_HTTP_CODES`? | NO — preserve elevenlabs/sync_so fail-fast behaviour. | No — spec line 138 confirms. |
| 3 | Does `_classify_pass_error` (production_loop.py:811) get migrated? | NO — out of scope per CP-9 byte-untouched lock. | No — spec rule #4 confirms. |
| 4 | Is `core/critic.py:51`'s `FailureMode` the canonical home or does it move to `failure_mode.py`? | STAYS at `core/critic.py`. `failure_mode.py` re-exports. | No — spec Phase 2 § "Files NOT modified in this phase" confirms. |
| 5 | Hard/sanctioned split for the 27 sites — final 19/8 split per §4.3, JT-reviewed? | 19 hard / 8 sanctioned. | YES — captured per §4 JT-reviewed disposition before Phase 9. |

---

## Section 9: Glossary

| Term | Meaning |
|------|---------|
| **Canonical** | Single SSOT for a concept. `FailureMode` is the canonical failure-mode enum. |
| **Coarsening** | Mapping from a fine-grained enum (`FailureMode`, 23 members) to a coarser one (`FailureCategory`, 9 members) for retry-policy lookup. |
| **Derived view** | An enum or value computed from the canonical SSOT. `FailureCategory` is a derived view of `FailureMode`. |
| **Hard migration** | Cost-zero-fallback site that becomes `read_cost_from_result()` — raises `CostMissingError` on missing. |
| **Sanctioned-fallback** | Cost-zero-fallback site that becomes `read_cost_from_record_safe()` — logs WARNING and returns 0.0 on missing. |
| **Tenet 6** | Architectural law — Errors Must Be Visible. Codified 2026-04-30 in `architectural-law.md`. The single most load-bearing tenet for Phase C. |
| **`UnknownFailureEscalation`** | Phase-C temp local exception raised by `classify_failure()` when input is unclassifiable. Phase E migrates to `recoil/lib/exceptions.py`. |
| **`CostMissingError`** | Phase-C temp local exception raised by `read_cost_from_result()` when `cost_usd` is missing. Phase E migrates. |
| **`PayloadHintsValidationError`** | Phase-C temp local exception raised when `PayloadHints.model_validate()` rejects unknown keys. Phase E migrates. |
| **UNION** | The set-theoretic union of all 5 transient-pattern source lists. The canonical `TRANSIENT_PATTERN_STRINGS` is the UNION. No deliberate omissions. |
| **Re-export** | A module-level import that surfaces a symbol from elsewhere. `pipeline/core/failure_mode.py` re-exports `FailureMode` from `core/critic.py`. |

---

## Section 10: Phase 1 done-when checklist

- [x] Audit doc exists at `recoil/docs/phase-c-taxonomy-audit.md`.
- [x] §1 enumerates all 23 `FailureMode` members with file:line citations
      (verified 2026-05-01 against `core/critic.py:51`).
- [x] §1 enumerates all 9 `FailureCategory` members with file:line
      citations (verified 2026-05-01 against
      `pipeline/orchestrator/production_types.py:27`).
- [x] §1.3 maps every `FailureMode` to a `FailureCategory` (or marks it
      as escalating); UNKNOWN escalates, NONE raises ValueError.
- [x] §2 enumerates all 5 classifier functions with file:line citations
      (verified).
- [x] §3 enumerates the 5 transient-pattern source lists with file:line
      citations and emits the canonical UNION (9 string fragments + 5
      HTTP codes).
- [x] §3.4 captures the 429 disposition (string transient, HTTP-code
      fail-fast).
- [x] §3 deliberate-omissions section lists what's intentionally
      excluded (NONE — UNION is exhaustive).
- [x] §4 enumerates all 27 production cost-zero-fallback sites with
      file:line + hard/sanctioned classification, plus 1 deferred
      (production_loop.py:83).
- [x] §4.3 confirms the 19/8 hard/sanctioned split.
- [x] §4.6 documents the 10-site `compute_cost` write-side inventory.
- [x] §5 enumerates payload.hints sites with per-provider key list and
      Pydantic surface design.
- [x] §6 codifies the Tenet 6 escalation contract with the 3 terminal
      states.
- [x] §7 captures phase-ordering implications.
- [x] §7.4 captures drift-from-spec ledger (4 sites drifted; rest match).
- [x] No production code edits in Phase 1.
- [x] Doc is self-contained — readable without the parent BUILD_SPEC.

---

End of Phase C taxonomy audit.
