# Topic 1 — Take-rooted lineage

**The Take:** Stop trying to compute application state from exhaust fumes. `receipts.jsonl` is an append-only audit log, not a primary datastore. Parsing a constantly growing JSONL file on every inspector open is a performance trap and a parsing nightmare. 

**1. Cleanest filtering:** Eager persistence. Extend the take JSON with `parent_take_id` and `chain_ids` at write time. This makes `get_lineage` a cheap lookup: one read of an already-resolved record, independent of how large the log has grown. Let the generation pipeline (CP-5) do the heavy lifting of resolving the chain *once* upon success, writing the resolved IDs directly into the beat's JSON.
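A minimal sketch of write-time resolution, under loud assumptions: it stores one JSON file per take in a hypothetical `takes_dir` (your takes may live inside the beat JSON instead), and `write_take` is an invented helper name. Only `parent_take_id`, `chain_ids`, and `get_lineage` come from the discussion above.

```python
import json
from pathlib import Path
from typing import List, Optional


def write_take(takes_dir: Path, take_id: str,
               parent_take_id: Optional[str], payload: dict) -> dict:
    """Persist a take with its lineage resolved once, at write time."""
    chain_ids: List[str] = []
    if parent_take_id is not None:
        parent = json.loads((takes_dir / f"{parent_take_id}.json").read_text())
        # The parent's chain already lists its ancestors root-first,
        # so the child's chain is that list plus the parent itself.
        chain_ids = parent["chain_ids"] + [parent_take_id]
    record = {
        **payload,
        "take_id": take_id,
        "parent_take_id": parent_take_id,
        "chain_ids": chain_ids,
    }
    (takes_dir / f"{take_id}.json").write_text(json.dumps(record, indent=2))
    return record


def get_lineage(takes_dir: Path, take_id: str) -> List[str]:
    """Return root-first ancestor IDs from one file read, no log parsing."""
    record = json.loads((takes_dir / f"{take_id}.json").read_text())
    return record["chain_ids"]
```

The read path never touches `receipts.jsonl`, which is the whole point: the log can rotate or vanish and the lineage still resolves.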

**2. Shared refs (DAG vs Tree):** Keep it a tree. If three takes share a keyframe, duplicate the keyframe node in the UI for each take's lineage. The Console v2 frontend is already built for col/row trees. Forcing a DAG layout requires a complex graph-rendering library (like React Flow) and makes the visual UX cluttered. 
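To make the duplication concrete, here is a rough sketch that builds one nested tree per take. The `inputs_by_id` mapping (node ID to the IDs it was generated from) and the `{"id", "children"}` node shape are assumptions for illustration, not the Console v2 schema, and the sketch assumes the references contain no cycles.

```python
from typing import Dict, List


def lineage_tree(node_id: str, inputs_by_id: Dict[str, List[str]]) -> dict:
    """Build a nested tree for one node.

    Every call creates fresh dict nodes, so an input shared by several
    takes shows up as a separate copy under each of them (tree, not DAG).
    """
    return {
        "id": node_id,
        "children": [
            lineage_tree(child_id, inputs_by_id)
            for child_id in inputs_by_id.get(node_id, [])
        ],
    }
```

Three takes that share keyframe `K_001` each get their own `K_001` subtree, which is what a col/row tree renderer expects.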

**3. Parent linkage:** Absolutely add `parent_take_id`. Yes, it slightly conflates "input asset" with "fork," but in a generative pipeline, an i2v generation *is* a fork of that keyframe's lineage. It gives you a clean recursive path back to the root prompt. 
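The recursive path can be sketched as a simple walk up `parent_take_id`. The in-memory `takes` dict keyed by take ID is an assumption for the example; the cycle guard is an addition, there to stop a corrupted record from hanging the inspector.

```python
from typing import Dict, List


def path_to_root(take_id: str, takes: Dict[str, dict]) -> List[str]:
    """Follow parent_take_id links from a take back to its root."""
    path: List[str] = []
    seen = set()
    current = take_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in parent_take_id chain at {current}")
        seen.add(current)
        path.append(current)
        # A missing or null parent_take_id marks the root of the lineage.
        current = takes[current].get("parent_take_id")
    return path
```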

**4. Join key location:** Persisted JSON. One write per take. Lazy reads against JSONL will bite you the second someone truncates the log file, moves the project, or the file hits 500 MB and your API response spikes to 3 seconds.

**The Gotcha:** Log rotation or manual cleanup of `_dispatch_logs`. If you rely on `receipts.jsonl` for the UI graph, the moment that file is archived or cleared, your entire Console v2 lineage UI goes blank. Persist the graph topology in the beat JSON.

---

# Topic 2 — MCP selection wire-up

**The Take:** Option A is a trap. Workspace and Console v2 have fundamentally different UX paradigms; trying to unify their selection state will result in a bloated, nullable schema where both UIs constantly step on each other's toes. Go with **Option C (FastAPI-direct) via a thin stdio shim.**

**1 & 4. The Architecture:** The simplest way to hand the Claude CLI an MCP server is over stdio. Write a tiny, dumb Python script (`console_mcp_shim.py`) that Claude runs. This shim holds zero logic: it just translates stdio JSON-RPC into HTTP calls to FastAPI (`:8431/api/mcp/...`). This keeps all your actual tool logic in FastAPI where it belongs, giving you access to the DB, EventBus, and CP-5 without duplicating code.
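A rough sketch of what such a shim could look like, using only the standard library. Assumptions: messages arrive as newline-delimited JSON-RPC objects on stdin (as in MCP's stdio transport), the FastAPI side exposes a single endpoint that accepts a raw JSON-RPC message and returns the reply, and the caller supplies that endpoint's URL. `http_forward` and `serve` are invented names.

```python
import json
import sys
import urllib.request
from typing import Callable, Optional, TextIO


def http_forward(url: str) -> Callable[[dict], Optional[dict]]:
    """Return a forwarder that POSTs one JSON-RPC message to `url`."""
    def forward(message: dict) -> Optional[dict]:
        request = urllib.request.Request(
            url,
            data=json.dumps(message).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=30) as response:
            body = response.read()
        return json.loads(body) if body else None
    return forward


def error_reply(msg_id, code: int, text: str) -> dict:
    return {"jsonrpc": "2.0", "id": msg_id,
            "error": {"code": code, "message": text}}


def serve(forward: Callable[[dict], Optional[dict]],
          stdin: TextIO = sys.stdin, stdout: TextIO = sys.stdout) -> None:
    """Relay newline-delimited JSON-RPC from stdin to `forward` and back."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            reply, is_request = error_reply(None, -32700, "Parse error"), True
        else:
            if not isinstance(message, dict):
                reply, is_request = error_reply(None, -32600, "Invalid Request"), True
            else:
                # Notifications carry no "id" and must not be answered.
                is_request = "id" in message
                try:
                    reply = forward(message)
                except Exception as exc:  # backend down, timeout, bad JSON
                    reply = error_reply(message.get("id"), -32603, str(exc))
        if is_request and reply is not None:
            stdout.write(json.dumps(reply) + "\n")
            stdout.flush()
```

The script's entry point would then be a single call such as `serve(http_forward(url))`, with `url` read from wherever you keep the FastAPI address. Keep stdout clean: anything printed there that is not a JSON-RPC message will confuse the client, so log to stderr.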

**2 & 3. State Isolation:** Console v2 owns its own state namespace in FastAPI. Don't touch `workspace/state.py`. The maintenance tax of a shim is close to zero because it's just a proxy; you add tools by adding FastAPI routes.

**5. The Selection-Push Gotcha:** This is the hardest technical hurdle. Claude CLI is a REPL. It sits blocked on `stdin` waiting for the user. It does not natively support SSE or "push" notifications. If the user clicks Take T_005, Claude won't know until the user types their next prompt. 
*The solution:* When the user clicks a take, the frontend must write to FastAPI to update the server-side `selected_take_id`, **and** optionally inject a hidden string into the ttyd websocket (e.g., `<Ctrl-U> /tool get_selection <Enter>`) to force Claude to evaluate the new context. Otherwise, Claude is deaf until the user hits enter.
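A small sketch of the two halves of that click handler, kept framework-free. `ConsoleSelection` and `nudge_keystrokes` are invented names; the only hard facts here are the control characters (`\x15` is Ctrl-U, `\r` is Enter). Whether the CLI accepts the `/tool get_selection` line as typed is unverified, and how the string gets framed onto the ttyd websocket is left out.

```python
from typing import Optional


def nudge_keystrokes(command: str = "/tool get_selection") -> str:
    """Build the raw keystrokes to inject into the terminal.

    Ctrl-U clears whatever the user had half-typed, then the command
    is submitted with Enter.
    """
    return "\x15" + command + "\r"


class ConsoleSelection:
    """Console v2's own selection state, separate from Workspace state."""

    def __init__(self) -> None:
        self.selected_take_id: Optional[str] = None

    def select(self, take_id: str, nudge: bool = True) -> Optional[str]:
        """Record the selection; optionally return keystrokes to inject."""
        self.selected_take_id = take_id
        return nudge_keystrokes() if nudge else None
```

Note the cost of the Ctrl-U: it wipes any prompt the user was in the middle of typing, which is a good reason to keep the nudge optional.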

---

# Topic 3 — Overnight loop + monitoring surface

**The Take:** CP-7's lack of disk persistence is a ticking time bomb. You cannot build a robust overnight batch system on top of in-memory domain objects. 

**1 & 2. The Primitive:** You need a new `Run` entity. `Beat` and `Take` are domain models (the *what*); a `Run` is an operational envelope (the *how*). 
Schema: `run_id`, `project_id`, `status`, `budget_usd`, `workflows: [{wf_id, beat_id, status, error}]`. Store this in a simple SQLite DB or a `runs/` JSON directory.
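As a sketch of the `runs/` JSON directory variant: the field names follow the schema above, while the `WorkflowState` class name, the `"pending"` default, and the one-file-per-run layout are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class WorkflowState:
    wf_id: str
    beat_id: str
    status: str = "pending"
    error: Optional[str] = None


@dataclass
class Run:
    run_id: str
    project_id: str
    status: str
    budget_usd: float
    workflows: List[WorkflowState] = field(default_factory=list)

    def save(self, runs_dir: Path) -> Path:
        """Write the run to runs_dir/<run_id>.json."""
        runs_dir.mkdir(parents=True, exist_ok=True)
        path = runs_dir / f"{self.run_id}.json"
        # asdict() recurses into the nested WorkflowState dataclasses.
        path.write_text(json.dumps(asdict(self), indent=2))
        return path

    @classmethod
    def load(cls, path: Path) -> "Run":
        """Rebuild a Run, including its workflow entries, from disk."""
        data = json.loads(path.read_text())
        data["workflows"] = [WorkflowState(**wf) for wf in data["workflows"]]
        return cls(**data)
```

Calling `save` after every workflow status change keeps the on-disk envelope current, so a restart loses at most the workflow that was in flight.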

**3. Partial failures:** Retry at the workflow/take level. Steal from Airflow: the `Run` is just a DAG of tasks. If take 42 of 50 fails due to a network timeout, the `Run` marks that workflow as failed but continues the rest. A `.resume()` call just queues up the failed workflows. 
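The continue-then-resume behaviour might look like this on a plain-dict run record. The status strings (`pending`, `succeeded`, `failed`) and the helper names are assumptions; the actual dispatch of re-queued workflows is out of scope.

```python
from typing import List, Optional


def record_result(run: dict, wf_id: str, error: Optional[str] = None) -> None:
    """Mark one workflow as succeeded or failed without stopping the run."""
    for wf in run["workflows"]:
        if wf["wf_id"] == wf_id:
            wf["status"] = "failed" if error else "succeeded"
            wf["error"] = error
            return
    raise KeyError(wf_id)


def resume(run: dict) -> List[str]:
    """Reset failed workflows to pending and return their IDs for re-queueing."""
    requeued = []
    for wf in run["workflows"]:
        if wf["status"] == "failed":
            wf["status"] = "pending"
            wf["error"] = None
            requeued.append(wf["wf_id"])
    return requeued
```

Succeeded workflows are never touched by `resume`, so a retry does not re-spend budget on takes that already landed.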

**4. Progress:** Both. The frontend polls `/api/runs/active` on mount to get baseline state, then listens to `/api/events/stream` for `RunProgress` SSEs to paint the in-flight strip.
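The "poll once, then fold events" pattern is small enough to sketch. It is written in Python for consistency with the other examples, though the same reducer would live in the frontend; the `RunProgress` payload shape (`run_id`, `wf_id`, `status`) is an assumption.

```python
def apply_progress(snapshot: dict, event: dict) -> dict:
    """Fold one RunProgress event into the polled baseline snapshot.

    Returns a new snapshot; events for other runs are ignored.
    """
    if event["run_id"] != snapshot["run_id"]:
        return snapshot
    workflows = [
        {**wf, "status": event["status"]} if wf["wf_id"] == event["wf_id"] else wf
        for wf in snapshot["workflows"]
    ]
    return {**snapshot, "workflows": workflows}
```

Because the baseline comes from the poll, a client that connects mid-run or reconnects after a dropped stream can re-poll and carry on without replaying missed events.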

**5. The Digest:** Compute it **eagerly** at the end of the Run. The final step of the Run pipeline should aggregate the top eval scores, sum the costs, and write `digest.json`. When JT wakes up, the UI loads instantly. Surfacing time-to-first-success is noise; focus on cost, fail clusters (e.g., "Kling API rejected 15 prompts"), and the highest-scored take per beat.
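A sketch of that final aggregation step. The per-take fields (`status`, `eval_score`, `cost_usd`, `error`) and the digest keys are assumptions; failures are clustered by exact error string here, which is the crudest grouping that could work.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Dict, List


def build_digest(takes: List[dict]) -> dict:
    """Aggregate cost, failure clusters, and the best take per beat."""
    best: Dict[str, dict] = {}
    for take in takes:
        if take["status"] != "succeeded":
            continue
        current = best.get(take["beat_id"])
        if current is None or take["eval_score"] > current["eval_score"]:
            best[take["beat_id"]] = {
                "take_id": take["take_id"],
                "eval_score": take["eval_score"],
            }
    failures = Counter(
        take.get("error", "unknown") for take in takes if take["status"] == "failed"
    )
    return {
        "total_cost_usd": round(sum(t.get("cost_usd", 0.0) for t in takes), 2),
        "best_take_per_beat": best,
        # most_common() orders clusters from largest to smallest.
        "fail_clusters": dict(failures.most_common()),
    }


def write_digest(run_dir: Path, takes: List[dict]) -> Path:
    """Write digest.json once, as the last step of the run."""
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / "digest.json"
    path.write_text(json.dumps(build_digest(takes), indent=2))
    return path
```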

**6. Crash recovery:** If the laptop sleeps, the SSH tunnel drops; because the FastAPI backend is local, the process itself might survive. But if FastAPI restarts, in-memory CP-7 is gone.

**What JT should tackle FIRST:** 
**Persist CP-7 Takes to disk.** You cannot do overnight runs, you cannot do crash recovery, and you cannot build a reliable lineage graph (Topic 1) until `Take` and `Workflow` state survive a FastAPI reboot. Fix the persistence layer before you build the overnight UI.
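One way to start, sketched with the standard library only: write each take atomically (temp file, then rename) so a crash mid-write never leaves a half-written record, and reload everything on startup. The one-file-per-take layout and the function names are assumptions, not the existing CP-7 design.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Dict


def save_take(takes_dir: Path, take: dict) -> Path:
    """Atomically write one take to takes_dir/<take_id>.json."""
    takes_dir.mkdir(parents=True, exist_ok=True)
    target = takes_dir / f"{take['take_id']}.json"
    # Write to a temp file in the same directory, then rename over the
    # target; os.replace is atomic on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=takes_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(take, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


def load_takes(takes_dir: Path) -> Dict[str, dict]:
    """Rebuild the in-memory take index after a restart."""
    return {
        path.stem: json.loads(path.read_text())
        for path in sorted(takes_dir.glob("*.json"))
    }
```

Calling `load_takes` from the FastAPI startup hook would give CP-7 its state back after a reboot; leftover `.tmp` files from an interrupted write are ignored by the `*.json` glob.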