# Recoil Workspace — Design Draft (Pre-Consult)

**Status:** Draft — pending dual consultation before finalization
**Date:** 2026-04-12
**Author:** JT + Claude (brainstorming session)

---

## 1. Product Concept

Recoil Workspace is a local web application that gives Claude Code visual access to the production pipeline. JT sees the project visually (file browser, media viewer, activity monitor); Claude sees it through MCP tools (selection state, shot provenance, project context). They act on the same state together.

**This is NOT a pipeline control panel.** The original Phase 4 design was an ops-registry-driven interface with 15 hand-written forms and a command palette. This is a **shared workspace** where JT and Claude Code look at outputs together, talk about them in natural language, and act on them collaboratively.

### The Learning Loop (Core Value Proposition)

The conversation between JT and Claude, tied to full provenance data, creates a learning corpus:

1. JT reviews a shot in the workspace and comments on it ("the lighting is flat, the pose doesn't match the emotion")
2. Claude analyzes: links the comment to the shot's full provenance (prompt layers, model, refs, take number, gate results)
3. Claude acts: re-rolls with modified prompt, approves/rejects, etc.
4. The Workspace MCP logs every interaction: {shot_id, provenance_hash, jt_feedback, action_taken, new_provenance}
5. At session end or via ingestion: feedback flows to mempalace as durable claims
6. In future sessions: mempalace surfaces patterns — "JT rejects shots with flat lighting from kling-v3 when prompt lacks explicit lighting direction (7 instances)"
7. Claude uses these patterns to pre-emptively adjust prompts and model selection

**The conversation log tied to provenance is the most important data artifact, not the UI.**
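
The log record in step 4 only works if `provenance_hash` is stable for identical provenance. One possible approach, sketched here under assumptions, is to hash a canonical JSON serialization of the provenance record. The field names below and the 8-character truncation (chosen to match the length of the example hash in Section 7) are illustrative and not taken from the existing pipeline.

```python
import hashlib
import json


def provenance_hash(provenance: dict, length: int = 8) -> str:
    """Return a short, key-order-independent hash of a provenance record."""
    # sort_keys plus fixed separators give a canonical serialization, so two
    # dicts with the same content always produce the same hash.
    canonical = json.dumps(provenance, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


# Hypothetical provenance fields, for illustration only.
a = {"model": "kling-v3", "take": 2, "prompt_layers": ["task", "bible"]}
b = {"take": 2, "prompt_layers": ["task", "bible"], "model": "kling-v3"}
c = {**a, "take": 3}
```

Here `a` and `b` hash identically because they differ only in key order, while `c` (a different take) gets a different hash. A short truncated hash is convenient for display but raises collision risk, so the full digest may be the safer thing to store.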

---

## 2. Architecture

### Approach: Web App + MCP Bridge to Claude Code

Three components:

1. **Web App** — FastAPI server serving the UI (file browser, media viewer, inspector, activity monitor). Also hosts WebSocket endpoint for the embedded terminal.

2. **Embedded Terminal** — xterm.js in the browser, connected via WebSocket to a local pty running Claude Code. This IS Claude Code — all skills, hooks, memory, file access work natively.

3. **Workspace MCP Server** — Exposed via stdio to Claude Code. Provides tools for reading workspace state (selection, provenance, activity) and taking actions (approve, reject, submit generation, log feedback).

```
┌───────────────────────────────────────────────────────┐
│  Browser (localhost:XXXX)                             │
│  ┌────────────────────────┐ ║ ┌──────────────────────┐│
│  │ File Browser + Activity│ ║ │                      ││
│  │ Inspector / Provenance │ ║ │   VIEWER (9:16)      ││
│  ├────────────────────────┤ ║ │   full height        ││
│  │ Terminal (Claude Code) │ ║ │                      ││
│  └────────────────────────┘ ║ └──────────────────────┘│
│          draggable divider ═╝                         │
└───────────────────────────────────────────────────────┘
         │                          │
         │ WebSocket (pty)          │ MCP (stdio)
         ▼                          ▼
    Claude Code ◄──────────── Workspace MCP Server
         │
         ├──── Mempalace MCP (knowledge, learnings, patterns)
         ├──── File system (read/write/edit)
         └──── Skills, hooks, memory (all existing)
```
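
The server side of the embedded terminal (component 2) comes down to running a process attached to a pseudo-terminal and relaying bytes. The sketch below shows only the pty half, using the Python standard library on a Unix-like system. It leaves out the WebSocket relay, terminal resizing, and input forwarding, and uses `echo` as a stand-in for the `claude` command. The function name `run_in_pty` is hypothetical.

```python
import os
import pty
import subprocess


def run_in_pty(argv: list[str]) -> bytes:
    """Run argv attached to a pseudo-terminal and return everything it wrote."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)  # the parent only needs the master side
    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                # On Linux the master raises EIO once the child side is closed.
                break
            if not data:  # other platforms signal EOF with an empty read
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
        proc.wait()
    return b"".join(chunks)


output = run_in_pty(["echo", "hello"])
```

In the real bridge, each chunk read from `master_fd` would be forwarded over the WebSocket to xterm.js instead of being collected, and keystrokes from the browser would be written back to `master_fd`.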

### Why This Approach

- **Lightweight.** Preserves everything Claude Code already has (skills, hooks, memory, file access, git). No rebuilding.
- **MCP is the natural integration.** Claude Code already supports MCP servers. Adding workspace awareness is just adding tools.
- **Conversation logging comes free.** Every MCP tool call is loggable with shot context — this IS the conversation-provenance link.
- **If the interaction model is wrong, we've spent days, not weeks.** The MCP tools become the API contract for a future full frontend if the POC validates.

### What's NOT in the POC

- No ops registry, no command palette, no auto-forms (original Phase 4 scope)
- No dockable/resizable panels (fixed layout with one draggable divider)
- No WebSocket push for real-time UI updates (polling is fine for the POC; the terminal's WebSocket is a separate channel)
- No multi-user, no cloud deployment, no auth

---

## 3. Layout

Single draggable vertical divider splits the screen into two sides:

### Left Side (all panels + terminal)
- **File Browser** (top-left): Project directory tree with shot status indicators (●approved, ○pending review, ✗rejected, ·not generated). Cmd-click for multi-select. Folders auto-update when new files appear (filesystem polling). Activity monitor section at the bottom of the file browser showing in-flight generations, recent completions/failures, and session budget.
- **Inspector / Provenance** (top-right, beside file browser): Shows provenance for the currently selected/viewed shot. Includes: expandable prompt layers (task prompt visible by default, bible/behavioral/continuity collapsed), reference image thumbnails with role labels (character, location, start_frame), routing decision (model, pipeline, reason), gate results with failure explanations, cost and bible version hash.
- **Terminal** (bottom of left side, spans full width of left): Embedded Claude Code via xterm.js + WebSocket pty. MCP tool calls shown as dim log lines. Full Claude Code with all skills, hooks, and memory.

### Right Side (viewer only)
- **Media Viewer**: Full window height. Displays images and video. Take navigation arrows (previous/next). Status footer (take number, review status, model). The 9:16 vertical content gets every pixel of vertical space.

### The Divider
- Drag left to give the viewer more space (reviewing media)
- Drag right to give the panels more room (reading prompts, browsing files)

---

## 4. Inspector Design

**Integrated Split (always visible when viewing a shot).** The inspector is not a separate panel — it's part of the viewing experience. When a shot is selected:

- The inspector shows provenance for that shot
- **Prompt layers** as collapsible sections (task prompt expanded by default)
- **Reference thumbnails** showing the actual images sent to the model, with role labels (character/hero, location/hero, start_frame)
- **Routing** info: model, pipeline mode (i2v/t2v), tier, reason for routing decision
- **Gate results** with pass/fail indicators and failure explanations for failed gates
- **Cost** and bible version hash

---

## 5. MCP Tool Design

### Project Context
- `prime_project(name)` → Compact project summary: episode count, shots by status, active models, pending review items, recent ops log entries, character list. Also returns hero images for all characters and locations from `_canonical/` so Claude is visually oriented in the world.
- `get_project_bible(name, section?)` → Full or partial bible text
- `get_shot_detail(shot_id)` → Full shot state JSON including all takes, provenance, gate results
- `load_episode_script(episode_id)` → Script beats for the active episode, so Claude understands the dramatic context

### Selection & Viewer
- `get_selection()` → List of currently selected shot IDs / file paths in the file browser
- `show_in_viewer(path)` → Push a specific file (image/video) to the viewer pane
- `get_viewer_state()` → What's currently displayed (shot_id, take number, file path)
- `get_shot_takes(shot_id)` → Paths to all take files for comparison
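
These four tools read and write a small piece of shared state: what is selected and what the viewer shows. A minimal in-memory sketch of that state follows, assuming the UI handlers and the MCP tool handlers can reach the same object. If the MCP server runs as a separate process, the same interface would need to sit behind HTTP or a file instead. The class and field names are hypothetical.

```python
import threading
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ViewerState:
    path: Optional[str] = None
    shot_id: Optional[str] = None
    take_id: Optional[str] = None


class WorkspaceState:
    """Selection and viewer state shared by UI handlers and MCP tools."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._selection: list[str] = []
        self._viewer = ViewerState()

    def set_selection(self, items: list[str]) -> None:
        with self._lock:
            self._selection = list(items)

    def get_selection(self) -> list[str]:
        with self._lock:
            return list(self._selection)  # copy, so callers cannot mutate state

    def show_in_viewer(self, path: str, shot_id: Optional[str] = None,
                       take_id: Optional[str] = None) -> None:
        with self._lock:
            self._viewer = ViewerState(path, shot_id, take_id)

    def get_viewer_state(self) -> dict:
        with self._lock:
            return asdict(self._viewer)


state = WorkspaceState()
state.set_selection(["EP001_SH03"])
state.show_in_viewer("takes/take_002.mp4", shot_id="EP001_SH03", take_id="take_002")
```

The browser would call `set_selection` when JT clicks in the file browser, and `get_selection()` would back the MCP tool of the same name.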

### Actions
- `approve_shot(shot_id, take_id, reason?)` → Mark take as approved, update shot state
- `reject_shot(shot_id, take_id, reason?)` → Mark take as rejected
- `submit_generation(shot_id, model?, prompt_override?, soften?)` → Kick off a new take via StepRunner
- `batch_submit(shot_ids, action, params?)` → Re-roll, new keyframes, etc. for multiple shots

### Activity
- `get_activity()` → All in-flight generations with elapsed time, recent completions/failures
- `get_session_budget()` → Spent / limit

### Logging (the learning corpus)
- `log_feedback(shot_id, take_id, feedback_text, action_taken)` → Structured log entry linking JT's comment to the shot's provenance hash. Raw material for the learning loop.
- `annotate_shot(shot_id, note)` → Persistent note attached to the shot state, survives across sessions
- `get_session_log()` → Session history so Claude can recall earlier feedback (survives context compaction)

### Autonomy Model
- Cheap/quick actions (approve, reject, show in viewer) execute automatically
- Expensive/risky actions (submit generation, batch operations) get a confirmation step
- This maps naturally to Claude Code's existing permission system
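
The policy above can be written down as a small lookup. This is only a sketch of the classification, not of how Claude Code's permission system is configured. Treating unlisted tools as needing confirmation is an assumption made here for safety, so read-only tools such as `get_selection` would have to be added to the automatic set explicitly.

```python
# Cheap, easily reversed actions named in the autonomy model.
AUTO_TOOLS = frozenset({"approve_shot", "reject_shot", "show_in_viewer"})

# Expensive or risky actions named in the autonomy model.
CONFIRM_TOOLS = frozenset({"submit_generation", "batch_submit"})


def requires_confirmation(tool_name: str) -> bool:
    """Return True if the tool call should wait for JT's confirmation."""
    if tool_name in CONFIRM_TOOLS:
        return True
    if tool_name in AUTO_TOOLS:
        return False
    # Assumption: anything not explicitly classified is treated as risky.
    return True
```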

---

## 6. Context Priming

Priming happens in three layers: the first two run on project open, and the third is loaded on demand.

### Layer 1 — Project Prime (Workspace MCP)
`prime_project("the-afterimage")` returns:
- Project structure summary (episode count, shot count by status, active models)
- Character hero images from `_canonical/characters/`
- Location hero images from `_canonical/locations/`
- Pending review queue items
- Recent ops log activity
- Active episode script beats
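
The structural part of this summary could be assembled roughly as follows. The sketch assumes each shot is available as a dict with `shot_id`, `status`, and `model` keys. That shape is hypothetical, and the real shot state JSON may differ. Hero images, ops log entries, and script beats are left out.

```python
from collections import Counter


def summarize_shots(shots: list[dict]) -> dict:
    """Build a compact status summary from a list of shot records."""
    counts = Counter(shot.get("status", "not_generated") for shot in shots)
    return {
        "shot_count": len(shots),
        "shots_by_status": dict(counts),
        "active_models": sorted({s["model"] for s in shots if s.get("model")}),
        "pending_review": [
            s["shot_id"] for s in shots if s.get("status") == "pending_review"
        ],
    }


# Hypothetical shot records, for illustration only.
shots = [
    {"shot_id": "EP001_SH01", "status": "approved", "model": "kling-v3"},
    {"shot_id": "EP001_SH02", "status": "pending_review", "model": "kling-v3"},
    {"shot_id": "EP001_SH03", "status": "rejected", "model": "kling-v3"},
    {"shot_id": "EP001_SH04"},
]
summary = summarize_shots(shots)
```

Keeping the summary to counts and IDs is what keeps the prime compact. Per-shot detail stays behind `get_shot_detail`.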

### Layer 2 — Knowledge Prime (Mempalace MCP, already exists)
- `mempalace_kg_query(entity="the-afterimage")` → claims about this project
- `mempalace_search("afterimage production learnings")` → semantic search for patterns
- `mempalace_kg_timeline(entity="the-afterimage")` → recent knowledge changes

### Layer 3 — On-Demand Depth (lazy, not pre-loaded)
- `get_shot_detail(shot_id)` only when looking at a specific shot
- `get_project_bible("the-afterimage", section="characters")` only when needed
- Model-specific queries via mempalace when reasoning about re-roll strategy

**Key principle:** Don't load everything upfront. The prime gives Claude a map. MCP tools give on-demand depth. Claude decides what to pull based on conversation context.

---

## 7. Session Logging & Learning Corpus

### Per-Session Log
Stored at `projects/{project}/sessions/{timestamp}.jsonl`

Each entry:
```json
{
  "timestamp": "2026-04-12T01:30:00Z",
  "type": "feedback|action|prime|query",
  "shot_id": "EP001_SH03",
  "take_id": "take_002",
  "provenance_hash": "a1b2c3d4",
  "feedback_text": "spatial compliance is wrong — background through window is fine, just different angle",
  "action_taken": "approve_shot",
  "action_params": {"reason": "false positive on window view angle"},
  "mcp_tool": "log_feedback"
}
```
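
Appending and reading back entries of this shape takes only the standard library. The sketch below assumes one JSON object per line and a UTC timestamp added at write time. The helper names `append_entry` and `read_entries` are hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_entry(log_path: Path, entry: dict) -> dict:
    """Append one entry to the session log, adding a UTC timestamp if absent."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    record = {"timestamp": now, **entry}
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_entries(log_path: Path) -> list[dict]:
    """Return all entries in the session log, oldest first."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Demo against a temporary directory standing in for projects/{project}/sessions/.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "sessions" / "demo.jsonl"
    append_entry(log, {
        "type": "feedback",
        "shot_id": "EP001_SH03",
        "take_id": "take_002",
        "provenance_hash": "a1b2c3d4",
        "feedback_text": "lighting is flat",
        "action_taken": "reject_shot",
        "mcp_tool": "log_feedback",
    })
    entries = read_entries(log)
```

Append-only JSONL means a crash mid-session loses at most the line being written, and `get_session_log()` can be a thin wrapper over `read_entries`.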

### Feeding the Learning Loop
- Session logs can be ingested by Cortex's claim_extractor (add a "workspace_session" adapter)
- Claims link JT's natural language feedback to specific provenance records
- Over time, mempalace accumulates patterns: model preferences, prompt failure modes, false positive gate patterns
- Future sessions surface these patterns during priming

---

## 8. Technical Stack

- **Server:** Python FastAPI (matches existing codebase pattern)
- **Frontend:** Vanilla JavaScript (matches existing Production Console — no build step)
- **Terminal:** xterm.js + WebSocket → local pty spawning `claude`
- **MCP:** stdio transport (Claude Code's standard MCP connection)
- **File watching:** Filesystem polling (2-3 second interval, matches existing console)
- **Styling:** Dark theme (matches existing console aesthetic), CSS in a single file
- **Estimated LOC:** ~2,000-3,000 total (server + frontend + MCP tools)
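
Filesystem polling can be done by comparing directory snapshots between ticks. The sketch below is one simple way to do that with the standard library. It is not taken from the existing console's implementation, and the function names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, float]:
    """Map every file path under root to its modification time."""
    result: dict[str, float] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                result[path] = os.stat(path).st_mtime
            except FileNotFoundError:
                pass  # file vanished between listing and stat
    return result


def diff_snapshots(old: dict[str, float], new: dict[str, float]):
    """Return (added, removed, modified) path lists between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in new.keys() & old.keys() if new[p] != old[p])
    return added, removed, modified


# Demo: a new take appearing between two polls.
with tempfile.TemporaryDirectory() as tmp:
    before = snapshot(Path(tmp))
    (Path(tmp) / "take_003.mp4").write_bytes(b"")
    after = snapshot(Path(tmp))
    added, removed, modified = diff_snapshots(before, after)
```

The server would take a snapshot every 2-3 seconds and report non-empty diffs to the file browser on its next poll. A file that is still being written may show up as added on one tick and modified on later ticks.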

---

## 9. Success Criteria (POC)

1. JT opens the workspace, selects a project, and Claude is primed with character heroes + project state
2. JT selects a shot in the file browser → viewer shows the media, inspector shows provenance
3. JT types "re-roll this with tighter framing" → Claude reads selection via MCP, submits generation, activity monitor shows in-flight
4. New take appears in file browser when generation completes → JT clicks it → viewer updates
5. JT comments on a shot ("lighting is flat") → Claude logs feedback linked to provenance
6. All session interactions logged to JSONL with provenance hashes
7. Multi-select works: JT selects 3 shots, types "new keyframes for these"

---

## 10. Open Questions for Consultation

1. **Embedded terminal vs. side-by-side:** xterm.js in the web page requires WebSocket pty spawning, which adds complexity. Is this worth it for the POC, or should we start with Claude Code in a separate terminal window and add the embed later?

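2. **MCP transport:** stdio is simplest, but with stdio the MCP client (Claude Code) launches the MCP server as a subprocess, so the Workspace MCP server would run as a separate process from the web app and the two would need a way to share state. Alternative: an HTTP-based transport such as SSE, where the workspace server exposes the MCP server on an HTTP endpoint. Which is more robust for this use case?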

3. **Session log → mempalace pipeline:** Should this be automatic (workspace adapter in Cortex ingests session logs) or manual (JT runs a command to ingest)? Trade-off: automatic means noise (every trivial interaction becomes a claim), manual means JT forgets.

4. **Prime cost:** Loading hero images for all characters + locations on project open could be expensive in context tokens. Should we limit to main characters only? Or use descriptions + load images on demand?

5. **Video playback:** HTML5 `<video>` tag for MP4s is straightforward, but what about the generated video formats? Are all outputs MP4? Do any need transcoding?

6. **Filesystem watching reliability:** Polling is simple but has latency. Is 2-3 seconds acceptable for "dynamically appear in the folder" or should we use fsevents/inotify for instant updates?

7. **Hardening:** What failure modes should we design for? Server crash mid-generation? Claude Code process dying? Browser disconnect? MCP connection drop? What's the recovery story for each?

8. **Multi-project:** The POC is single-project. But should the MCP tool design anticipate switching projects mid-session? Or is that a restart?
