## Topic 1 — Take-rooted lineage

**Where Gemini has it right.** The core thesis is correct: `receipts.jsonl` is audit exhaust, not a primary datastore, and building UI lineage from it is a performance trap. Eager persistence of `parent_take_id` at write time is the right call. The "keep it a tree, duplicate shared nodes" simplification is also smart — the Console v2 frontend is col/row grid-based, and forcing a DAG renderer on it is premature complexity for a pipeline that almost always produces trees anyway.

**Where Gemini is wrong or oversimplifying.** Gemini says "extend the take JSON with `parent_take_id` and `chain_ids`" and "let CP-5 do the heavy lifting", but this conflates two separate data stores. The beat JSON files at `projects/<slug>/state/visual/shots/*.json` are **not CP-7 artifacts**. The beats adapter says this explicitly: these are legacy JT-curated shot state files. CP-7's `Take` and `Beat` dataclasses live in memory only; they have `to_dict`/`from_dict` round-trip support but no save/load methods. Gemini talks about persistence as if there were one canonical location to extend. There are actually two: the existing shot JSONs (which the lineage adapter already reads) and CP-7's in-memory objects (which have no disk home). "Extend the take JSON" raises the question of *which* take JSON.

The `chain_ids` suggestion is also premature. You don't need pre-computed ancestry chains for the Console v2 use case. `parent_take_id` alone gives you recursive walk-to-root, which is sufficient for the inspector panel. `chain_ids` is a denormalization that buys O(1) full-lineage lookup at the cost of maintaining a redundant field that can drift. Don't build it until you prove the recursive walk is too slow (it won't be — take trees are shallow, rarely >5 levels deep).
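A minimal sketch of the recursive walk described above, assuming takes are keyed by a string ID and that a root take has `parent_take_id` set to `None`. The function name `walk_to_root` and the `parents` mapping are hypothetical, not existing Recoil code.

```python
from typing import Optional


def walk_to_root(take_id: str, parents: dict[str, Optional[str]]) -> list[str]:
    """Return the ancestry chain from take_id up to its root, inclusive.

    `parents` maps take_id -> parent_take_id (None for a root take).
    """
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = take_id
    while current is not None:
        # Guard against corrupted data: a parent link must never loop back.
        if current in seen:
            raise ValueError(f"cycle in parent_take_id links at {current!r}")
        seen.add(current)
        chain.append(current)
        current = parents.get(current)
    return chain


# Example: t3 was derived from t2, which was derived from the root t1.
parents = {"t1": None, "t2": "t1", "t3": "t2"}
lineage = walk_to_root("t3", parents)
```

The walk costs one dictionary lookup per level, so for trees a handful of levels deep there is nothing for a precomputed `chain_ids` field to save.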

**What Gemini missed.** The lineage adapter (`recoil/api/adapters/lineage.py`) already synthesizes a graph from the flat `takes` list in shot JSON. The minimum viable change for Topic 1 is: add `parent_take_id` to the shot JSON schema, populate it at generation time, and update `get_lineage()` to use it for edge construction instead of the current flat-list heuristic. That's a schema migration on existing files + a 20-line adapter change. You do NOT need full CP-7 disk persistence to unblock lineage — you need one new field in an existing file format. Gemini's framing ("you must persist CP-7 first") makes this feel like a week of work when it's actually half a day.
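The adapter change could look roughly like the sketch below. It is not the actual `get_lineage()` code: the `take_id` key and the `build_edges` helper are assumptions for illustration, and only `parent_take_id` comes from the proposal itself.

```python
def build_edges(takes: list[dict]) -> list[tuple[str, str]]:
    """Build (parent, child) edges from a flat takes list.

    Takes without a `parent_take_id` (for example, legacy shot JSON written
    before the field existed) are treated as roots and produce no edge.
    """
    known = {take["take_id"] for take in takes}
    edges: list[tuple[str, str]] = []
    for take in takes:
        parent = take.get("parent_take_id")
        # Skip dangling references rather than emitting an edge to nowhere.
        if parent is not None and parent in known:
            edges.append((parent, take["take_id"]))
    return edges


takes = [
    {"take_id": "t1"},                             # legacy record, no field
    {"take_id": "t2", "parent_take_id": "t1"},
    {"take_id": "t3", "parent_take_id": "t1"},
    {"take_id": "t4", "parent_take_id": "gone"},   # parent not in this shot
]
edges = build_edges(takes)
```

Treating a missing field as "root" means existing files stay readable without a blocking migration; they simply render as unconnected takes until regenerated.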

---

## Topic 2 — MCP selection wire-up

**Where Gemini has it right.** Option C (FastAPI-direct via stdio shim) is the right architecture. The argument for state isolation is sound — Console v2 and Workspace have different UX paradigms, different selection semantics, and sharing `workspace/state.py` would be a coupling disaster. The shim-as-proxy pattern (stdio JSON-RPC → HTTP to FastAPI) keeps tool logic centralized where it has access to the EventBus and domain objects.

**Where Gemini is wrong or oversimplifying.** "The maintenance tax of a shim is zero because it's just a proxy" is flatly wrong. A stdio MCP shim has to handle JSON-RPC framing, capability negotiation, error serialization, and graceful shutdown — all outside FastAPI's request lifecycle. It's *low* maintenance, not *zero*. More importantly, Gemini handwaves the selection-push problem and then proposes a Ctrl-U terminal injection as the solution. Injecting raw keystrokes into a ttyd websocket to force Claude to call a tool is brittle in at least three ways: (1) it assumes Claude's prompt buffer is empty, (2) it races with anything the user is typing, (3) it breaks if Claude Code changes its input handling. This isn't a "gotcha to note" — it's the central design challenge of the entire MCP integration, and Gemini dismisses it in a paragraph.

**What Gemini missed.** Claude Code supports MCP server-sent notifications. The protocol defines two for resources: `notifications/resources/updated`, for a change to a resource the client has subscribed to, and `notifications/resources/list_changed`, for a change to the set of available resources. The shim doesn't need to inject keystrokes; it can use the MCP protocol's own notification mechanism to signal that selection context has changed. When the user clicks a take in Console v2, the frontend POSTs to FastAPI, FastAPI updates server-side selection state, and the MCP shim sends a notification to Claude Code: `notifications/resources/updated` if the selection lives at a fixed resource URI, `notifications/resources/list_changed` if the set of exposed resources itself changes. Claude Code already handles these; it's how the Workspace MCP server communicates viewer state changes today. Gemini designed an injection hack because it assumed Claude CLI is deaf between prompts. It's not: MCP notifications exist for exactly this purpose. The real design work is deciding which MCP resources to expose (selected beat, selected take, lineage context) and what granularity of notification to send.
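For concreteness, here is a sketch of the messages the shim would write to stdout, assuming the MCP stdio transport (one JSON-RPC 2.0 message per line). The resource URI `console://selection/take` and the helper name `mcp_notification` are hypothetical. The MCP specification defines both `notifications/resources/updated` (carrying the `uri` of a subscribed resource whose content changed) and `notifications/resources/list_changed` (the set of resources changed); which one fits depends on how the selection is modelled as resources.

```python
import json
from typing import Optional

# Hypothetical URI for the "currently selected take" resource.
SELECTED_TAKE_URI = "console://selection/take"


def mcp_notification(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 notification as one newline-terminated line.

    Notifications carry no "id" field, so the client sends no response.
    """
    message: dict = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        message["params"] = params
    # json.dumps emits no raw newlines, so each message stays on one line.
    return json.dumps(message) + "\n"


# The content of an existing, subscribed resource changed.
updated = mcp_notification(
    "notifications/resources/updated", {"uri": SELECTED_TAKE_URI}
)
# The set of available resources changed.
list_changed = mcp_notification("notifications/resources/list_changed")
```

In a real shim these strings would be written to `sys.stdout` and flushed immediately. A sensible default would be to send `updated` only for URIs the client has subscribed to via `resources/subscribe`.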

Also missed: Console v2's ttyd routes exist in FastAPI but **the frontend doesn't embed them yet**. The iframe integration is unbuilt. That's the actual first move for MCP — before you wire up selection sync, you need the Claude panel to exist in the Console v2 UI at all.

---

## Topic 3 — Overnight loop + monitoring surface

**Where Gemini has it right.** The distinction between domain models (Beat/Take = the *what*) and operational envelopes (Run = the *how*) is the right abstraction. The Airflow-style per-workflow retry with `.resume()` is correct. The "poll on mount, SSE for in-flight" pattern for progress is standard and right. Eager digest computation at run completion is obviously better than lazy aggregation at display time.

**Where Gemini is wrong or oversimplifying.** Gemini talks about overnight infrastructure as if it doesn't exist. It very much does. `run_overnight.py` is a full autonomous orchestrator with budget enforcement (`BudgetGuard`), scene persistence (atomic JSON writes to `state/orchestration/scenes/`), stale-take recovery (marks "running" takes >5min as failed), and operational logging (`ops_log.py` with two-line pending/completed protocol and `scan_for_dangling_ops()` for crash detection). Gemini's `Run` schema (`run_id, project_id, status, budget_usd, workflows`) is roughly what `EpisodeRunner` already manages in memory — the gap isn't "build a run system," it's "wire the existing run system to the EventBus and expose it via API."
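To make the stale-take recovery rule concrete, a sketch of the ">5 minutes in `running` means failed" check follows. This is an illustration, not the code in `run_overnight.py`: the `status`, `started_at`, and `take_id` keys and the function name are assumptions.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=5)


def fail_stale_takes(takes: list[dict], now: datetime) -> list[str]:
    """Mark takes stuck in "running" for longer than STALE_AFTER as failed.

    Mutates the take dicts in place and returns the IDs that were changed.
    `started_at` is assumed to be an ISO 8601 timestamp with a UTC offset.
    """
    changed: list[str] = []
    for take in takes:
        if take.get("status") != "running":
            continue
        started = datetime.fromisoformat(take["started_at"])
        if now - started > STALE_AFTER:
            take["status"] = "failed"
            changed.append(take["take_id"])
    return changed


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
takes = [
    {"take_id": "a", "status": "running", "started_at": "2024-01-01T11:50:00+00:00"},
    {"take_id": "b", "status": "running", "started_at": "2024-01-01T11:58:00+00:00"},
    {"take_id": "c", "status": "done", "started_at": "2024-01-01T11:00:00+00:00"},
]
recovered = fail_stale_takes(takes, now)
```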

The SQLite suggestion is wrong for this codebase. Every other persistence layer in Recoil uses JSON files in the project directory tree. Introducing SQLite adds a dependency, a migration story, and a backup story that doesn't exist. JSON files in `projects/<slug>/state/orchestration/runs/` match the existing pattern and are human-inspectable, which matters for a one-operator pipeline.
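A run record in that directory can be written with the usual temp-file-then-rename pattern. The sketch below assumes one file per run named `<run_id>.json`; the function name and record fields are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_run_record(runs_dir: Path, run: dict) -> Path:
    """Atomically write a run record to <runs_dir>/<run_id>.json.

    The data goes to a temp file in the same directory first, then
    os.replace() swaps it in, so readers never see a half-written file.
    """
    runs_dir.mkdir(parents=True, exist_ok=True)
    target = runs_dir / f"{run['run_id']}.json"
    fd, tmp_name = tempfile.mkstemp(dir=runs_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(run, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

Usage would be along the lines of `write_run_record(Path("projects") / slug / "state" / "orchestration" / "runs", record)` after each state change, with `slug` and `record` supplied by the caller.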

**What Gemini missed.** The actual gap is three specific things: (1) `EpisodeRunner` doesn't emit to the EventBus — scene/beat/take lifecycle events never reach SSE clients. This is a ~50-line wiring job, not a new system. (2) No API endpoints expose run state — `/api/runs/active` and `/api/runs/{run_id}` don't exist. (3) Budget state dies with the process — `BudgetGuard.spent` is in-memory only and not written to the run record. These are incremental additions to existing infrastructure, not the ground-up build Gemini implies.
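The wiring in gap (1) amounts to handing the runner a publish callable and calling it at each lifecycle transition. The sketch below uses a stand-in class, not `EpisodeRunner`, and a plain callback in place of the EventBus, whose real interface is not shown here. The event shape is an assumption.

```python
from dataclasses import dataclass
from typing import Callable

Event = dict
Publish = Callable[[Event], None]


@dataclass
class RunnerSketch:
    """Stand-in for a runner that reports lifecycle events as they happen."""

    run_id: str
    publish: Publish          # e.g. a bound EventBus publish method
    spent_usd: float = 0.0

    def complete_take(self, take_id: str, cost_usd: float) -> None:
        self.spent_usd += cost_usd
        # Include running spend so monitoring clients see budget state too.
        self.publish(
            {
                "type": "take.completed",
                "run_id": self.run_id,
                "take_id": take_id,
                "spent_usd": self.spent_usd,
            }
        )


# A list's append method is enough to observe what would reach SSE clients.
received: list[Event] = []
runner = RunnerSketch(run_id="r1", publish=received.append)
runner.complete_take("t1", cost_usd=0.25)
```

Carrying `spent_usd` on each event also gives gap (3) a natural hook: the same handler that forwards the event can write the figure into the run record.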

---

## Do I agree with "persist CP-7 Takes to disk first"?

**Partially.** Gemini is right that persistence is a precondition for Topic 3 (crash recovery means nothing if Take state evaporates on restart) and partially right for Topic 1 (lineage needs `parent_take_id` persisted somewhere). But Gemini is wrong that it blocks Topic 2 — MCP selection wire-up is entirely independent of disk persistence. And for Topic 1, you don't need full CP-7 persistence; you need `parent_take_id` added to the existing shot JSON schema.

The minimum-viable persistence layer to unblock all three: extend `Take.to_dict()` to emit `parent_take_id`, have the generation pipeline set it at creation time, and have the beats adapter write it through to shot JSON. Scene-level persistence already exists via `persistence.py`. The missing piece is take-level persistence that survives FastAPI restarts — which means `Beat.to_dict()` → shot JSON on every take completion, not a new storage backend.
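As an illustration of the round trip, here is a stand-in dataclass (not CP-7's actual `Take`; only `parent_take_id` and the `to_dict`/`from_dict` method names come from the discussion above, and the other fields are assumed):

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TakeSketch:
    take_id: str
    status: str
    parent_take_id: Optional[str] = None  # None marks a root take

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "TakeSketch":
        return cls(
            take_id=data["take_id"],
            status=data["status"],
            # .get() keeps records written before the field existed loadable.
            parent_take_id=data.get("parent_take_id"),
        )


child = TakeSketch(take_id="t2", status="done", parent_take_id="t1")
restored = TakeSketch.from_dict(child.to_dict())
legacy = TakeSketch.from_dict({"take_id": "t0", "status": "done"})
```

Defaulting the field to `None` on load is what lets the schema change ship without rewriting every existing shot file first.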

## JT's first move

**Wire `parent_take_id` into the existing shot JSON schema and the generation pipeline.** It's a half-day change that unblocks Topic 1 (lineage filtering becomes a trivial adapter update) and partially unblocks Topic 3 (take identity survives restarts). Do the MCP iframe embedding in parallel if you want — it's independent. The full "Run entity + EventBus wiring + API surface" for overnight monitoring is the biggest lift, but it's wiring work on existing infrastructure, not greenfield. Don't let Gemini's framing convince you that you need a persistence rewrite before you can make progress — the shot JSON files are already your persistence layer, they just need one new field.
