# Manual Generation Workbench — Design Document

**Date:** 2026-03-08
**Status:** Approved — ready for implementation
**Deadline:** April 1, 2026
**Build estimate:** 12-17 hours

## Problem

The automated visual pipeline produces keyframes and video, but ~80% of shots will need manual intervention for the first 1-2 series. Failure modes include: wrong composition, 180-degree violations, artifacts (morphing, extra limbs), wrong motion direction, safety filter blocks, and character consistency issues.

The production console already calls APIs (Gemini, Kling, Veo) for generation. What's missing is a clean workflow for shots that need **web UI editing tools** — Kling's Motion Brush, NBP's circle-and-edit inpainting, Veo's safety filter feedback — that APIs cannot provide.

## Architecture

### Two-Mode Intervention

1. **Quick Retry (API)** — prompt tweak + regenerate via existing production console endpoints. Handles ~20% of failures (wrong prompt wording, bad seed, missing detail).

2. **Escalate to Bundle** — export shot package for manual web UI work. Handles ~80% of failures initially (orientation flip, inpainting, motion brush, safety filter workaround). This ratio should flip over time as the pipeline improves.

### System Topology

```
Production Console (port 8430)
├── Dailies Tab — review automated output
│   ├── APPROVE (A key) — shot is good
│   ├── RE-RENDER (R key) — send back to API pipeline
│   └── ESCALATE TO MANUAL (M key) — flag for manual fix [NEW]
│
└── Render Tab — API-driven generation
    ├── RENDER button — calls Kling/Veo/SeedDance API
    └── ENHANCE button — enriches prompt before API call [NEW]

Manual Workbench (port 8430, path /manual) [NEW]
├── Triage Grid — contact sheet of all flagged shots
├── Detail View — target vs latest output, editable prompt
├── Export — bundles for Kling/Veo/Sora web UIs
├── Reconciliation Grid — bulk re-import drop zone
└── Feedback Loop — failure classification on re-import
```

## Decision Log

| Decision | Chosen | Alternatives Rejected | Rationale |
|----------|--------|----------------------|-----------|
| Tab vs Standalone | Hybrid: button in Dailies → standalone `/manual` page | 6th tab, pure standalone | CanvasState too tightly coupled for a different UI paradigm; standalone reads same ExecutionStore with zero regression risk |
| Status tracking | `gate_results.manual_escalated = true` flag | New `manual_out` status | Don't break Board tab's `budget_summary()` bucketing which categorizes by formal status |
| Feedback loop | Auto-infer fix + 5-category failure pill on re-import | Free text, 8 categories, no tracking | 5 categories sufficient for v1; capture at re-import when director's attention is highest |
| Prompt enrichment | Build for v1, opt-in ENHANCE button | Cut for v1 | Highest ROI for reducing manual rate; pattern already exists in `/api/smart-prompt`; $0.001-0.01/call |
| Enrichment model | Route by target: Flash 3.1 for Gemini models, Sonnet 4.6 for Kling/Sora | Single model for all | Flash knows Gemini internals; Sonnet is a better cinematic writer for non-Gemini targets |
| Prompt format | Model-specific (Kling: 40-50 words scene→character→action→camera; Veo: dense prose) | Generic format | Web UIs auto-enhance; our enrichment must match each model's preferences |

## Feature Specifications

### Feature 1: ESCALATE TO MANUAL (Dailies Tab)

**What:** 'M' hotkey + button in Dailies player to flag a shot for manual intervention.

**Implementation:**
- Add ESCALATE TO MANUAL button next to RE-RENDER in `dailies.js` `renderPlayer()`
- Bind 'M' key in `onKeyDown()`
- POST to `/api/manual/escalate` with `{shot_id}`
- Backend sets `gate_results.manual_escalated = true` and `gate_results.manual_escalated_at = timestamp`
- Toast: "Flagged for manual workbench"
- Shot stays in its current formal status (doesn't disrupt Board bucketing)

**Files:** `editors/tabs/dailies.js`, `editors/review_server.py`
**Effort:** 2-3 hours
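
A minimal sketch of the backend side of the escalation, assuming the shot record is a plain dict as stored in ExecutionStore. The function name `escalate_shot` and the `status` value are hypothetical; only the two `gate_results` keys come from this spec. The point it illustrates is that the flag lives beside the formal status and never overwrites it.

```python
import time


def escalate_shot(shot, now=None):
    """Flag a shot for the manual workbench without touching its formal status."""
    gate_results = shot.setdefault("gate_results", {})
    gate_results["manual_escalated"] = True
    gate_results["manual_escalated_at"] = int(now if now is not None else time.time())
    return shot


# Hypothetical shot record; "status": "review" is an illustrative value only.
shot = {"shot_id": "EP001_SH01", "status": "review", "gate_results": {}}
escalate_shot(shot, now=1709942400)
```

Because `status` is left alone, any code that buckets shots by formal status sees the same input before and after escalation.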

### Feature 2: Prompt Enhancement (Render Tab)

**What:** ENHANCE button next to prompt editor that enriches the prompt using an LLM before API generation.

**Implementation:**
- Add ENHANCE button in `production.js` next to the prompt textarea
- Calls `POST /api/enhance-prompt` with `{shot_id, prompt, target_model}`
- Backend routes to enrichment model based on target:
  - Kling V3 → Sonnet 4.6 with Kling-specific system prompt
  - Veo 3.1 → Flash 3.1 with Veo-specific system prompt
  - SeedDance → Sonnet 4.6 with SeedDance system prompt
- System prompt encodes model-specific best practices:
  - Kling: scene→character→action→camera order, 40-50 words, motion verbs with pacing, DP-style
  - Veo: dense prose, environment-first, no section headers, safety-filter-safe synonyms
- Returns enhanced prompt → populates textarea for director review
- REVERT TO DEFAULT button restores original JIT prompt
- Negative prompt defaults injected per model (stored in `model_profiles.json`)

**Model profiles addition:**
```json
{
  "kling-v3": {
    "enrichment_model": "claude-sonnet-4-6",
    "enrichment_style": "cinematic_dp",
    "negative_prompt_default": "morphing, blurry, disfigured hands, extra fingers, bad anatomy, text, watermark",
    "optimal_prompt_length": [40, 50]
  },
  "veo-3.1": {
    "enrichment_model": "gemini-3.1-flash-preview",
    "enrichment_style": "dense_prose",
    "negative_prompt_default": null,
    "safety_filter_words": ["blood", "fight", "struggle", "shoot", "blade"]
  }
}
```
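
The routing and the word count indicator can both be driven from these profiles. The sketch below inlines a subset of the JSON above so it runs on its own; in the real build the profiles would be read from `config/model_profiles.json`. `pick_enrichment_model` and `word_count_status` are hypothetical helper names, not existing functions in `lib/jit_prompt.py`.

```python
import json

# Subset of the profiles shown above, inlined so the sketch is self-contained.
MODEL_PROFILES = json.loads("""
{
  "kling-v3": {"enrichment_model": "claude-sonnet-4-6", "optimal_prompt_length": [40, 50]},
  "veo-3.1": {"enrichment_model": "gemini-3.1-flash-preview"}
}
""")


def pick_enrichment_model(target_model):
    """Look up which LLM enriches prompts for the given target video model."""
    profile = MODEL_PROFILES.get(target_model)
    if profile is None:
        raise KeyError(f"no profile for target model {target_model!r}")
    return profile["enrichment_model"]


def word_count_status(prompt, target_model):
    """Return 'short', 'ok', 'long', or 'unbounded' for the word count indicator."""
    bounds = MODEL_PROFILES[target_model].get("optimal_prompt_length")
    if not bounds:
        return "unbounded"  # profile sets no length target (e.g. Veo dense prose)
    low, high = bounds
    count = len(prompt.split())
    if count < low:
        return "short"
    if count > high:
        return "long"
    return "ok"
```

Keeping the thresholds in the profile means the same check can back both the Render tab and the workbench Detail View.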

**Files:** `editors/tabs/canvas/production.js`, `editors/review_server.py`, `config/model_profiles.json`, `lib/jit_prompt.py`
**Effort:** 3-4 hours

### Feature 3: Manual Workbench (`/manual`)

**What:** Standalone page for triaging and exporting flagged shots, and re-importing fixed assets.

**Layout:** Three sections, top to bottom (single-column responsive):

#### Section A: Triage Grid (Contact Sheet)
- Episode dropdown at top
- Grid of all shots flagged with `manual_escalated = true`
- Each cell: thumbnail (latest output), shot ID, model assignment, failure status
- Click to select (multi-select with Shift)
- Model assignment dropdown per shot (or bulk-assign)
- "EXPORT ALL" button → batch export grouped by model

#### Section B: Detail View (on shot click)
- Split view: Target frame (previz/approved keyframe) on LEFT, Latest Output (bad keyframe/video) on RIGHT
- Editable prompt with model-specific formatting
- Model selector (Kling V3, Veo 3.1, SeedDance 2.0)
- Word count indicator with warning threshold per model
- "EXPORT BUNDLE" button → single shot export
- "QUICK RETRY" button → calls existing API endpoints (same as Render tab)
- Shot metadata: shot type, camera, characters, action

#### Section C: Reconciliation Grid (Re-import)
- Bulk drop zone: drag multiple files from ~/Downloads
- Shows all shots currently "out for manual" (`manual_escalated && !manual_resolved`)
- Director clicks a dropped file → clicks a shot → "LINK" to associate
- OR: rename files to shot IDs before dropping for auto-match
- On confirmation: 5 failure-tag pill buttons (must select one)
  - `composition` | `artifacts` | `motion` | `safety_filter` | `character`
- Two actions after tagging:
  - "SAVE & RETURN TO PIPELINE" → overwrites asset, sets `manual_resolved = true`, pipeline continues
  - "SAVE & EXPORT VIDEO BUNDLE" → saves fixed keyframe, exports bundle for video gen
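
The rename-for-auto-match path could work roughly as below. This is a sketch under one assumption: shot IDs follow the `EP###_SH##` shape seen in examples such as `EP001_SH03`. The helper name `auto_match` is hypothetical. Files that carry no recognizable ID, or an ID that is not currently out for manual, fall through to the click-to-LINK flow.

```python
import re
from pathlib import Path

# Assumed shot ID shape, inferred from examples like "EP001_SH03".
SHOT_ID_RE = re.compile(r"EP\d+_SH\d+", re.IGNORECASE)


def auto_match(filenames, open_shot_ids):
    """Split dropped files into auto-matched pairs and files needing a manual LINK."""
    open_ids = set(open_shot_ids)
    matched, unmatched = {}, []
    for name in filenames:
        found = SHOT_ID_RE.search(Path(name).stem)
        shot_id = found.group(0).upper() if found else None
        # Only the first file per shot is auto-linked; extras need a manual decision.
        if shot_id in open_ids and shot_id not in matched:
            matched[shot_id] = name
        else:
            unmatched.append(name)
    return matched, unmatched


matched, unmatched = auto_match(
    ["EP001_SH03.png", "ep001_sh07_fixed.mp4", "download (4).png", "EP001_SH99.png"],
    ["EP001_SH03", "EP001_SH07"],
)
```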

**API Endpoints:**
```
GET  /api/manual/shots/{episode}    — all shots with manual_escalated flag + plan data
POST /api/manual/export             — wraps build_bundle() in background thread
POST /api/manual/reimport           — accepts multipart upload, copies file, updates store
POST /api/manual/escalate           — sets manual_escalated flag
POST /api/manual/resolve            — clears manual_escalated, records fix data
```
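
For orientation, the selection behind the shots endpoint and the "EXPORT ALL" grouping might look like this. It is a sketch only: the `model` key on a shot record and both function names are assumptions, and the filter is written so it works whether a resolved shot has `manual_escalated` cleared, `manual_resolved` set, or both.

```python
from collections import defaultdict


def out_for_manual(shots):
    """Return shots that are escalated and not yet resolved."""
    pending = []
    for shot in shots:
        gate = shot.get("gate_results", {})
        if gate.get("manual_escalated") and not gate.get("manual_resolved"):
            pending.append(shot)
    return pending


def group_by_model(shots, default="unassigned"):
    """Group shot IDs by assigned model, for batch export."""
    groups = defaultdict(list)
    for shot in shots:
        groups[shot.get("model", default)].append(shot["shot_id"])
    return dict(groups)


shots = [
    {"shot_id": "EP001_SH01", "model": "kling-v3", "gate_results": {"manual_escalated": True}},
    {"shot_id": "EP001_SH02", "model": "veo-3.1",
     "gate_results": {"manual_escalated": True, "manual_resolved": True}},
    {"shot_id": "EP001_SH03", "model": "kling-v3", "gate_results": {"manual_escalated": True}},
    {"shot_id": "EP001_SH04", "model": "veo-3.1", "gate_results": {}},
]
pending = out_for_manual(shots)
batches = group_by_model(pending)
```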

**Export Bundle Structure:**
```
manual_export/EP001_SH03_kling_3.0/
├── prompt.txt              # Model-optimized, director-edited prompt
├── instructions.txt        # Step-by-step for web UI (plain text, no markdown)
├── start_frame.png         # Keyframe (for I2V models)
├── refs/
│   ├── 01_identity_hero.png
│   ├── 02_identity_profile.png
│   └── 03_location_corridor.png
└── return/                 # Alternative drop location (watched folder, v2)
```
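
A hedged sketch of writing that layout to disk. `write_bundle` is a hypothetical stand-in used for illustration; it is not the existing `build_bundle()` that the export endpoint wraps, and its parameters are assumptions. The start frame is copied as-is under the `.png` name, with no format conversion.

```python
import shutil
import tempfile
from pathlib import Path


def write_bundle(root, shot_id, model_tag, prompt, instructions, refs, start_frame=None):
    """Create <root>/<shot_id>_<model_tag>/ laid out like the tree above."""
    bundle = Path(root) / f"{shot_id}_{model_tag}"
    (bundle / "refs").mkdir(parents=True, exist_ok=True)
    (bundle / "return").mkdir(exist_ok=True)
    (bundle / "prompt.txt").write_text(prompt, encoding="utf-8")
    (bundle / "instructions.txt").write_text(instructions, encoding="utf-8")
    if start_frame is not None:
        shutil.copy2(start_frame, bundle / "start_frame.png")
    # Number references so they sort in upload order: 01_..., 02_...
    for index, (label, src) in enumerate(refs, start=1):
        src = Path(src)
        shutil.copy2(src, bundle / "refs" / f"{index:02d}_{label}{src.suffix}")
    return bundle


# Demo in a throwaway directory with a placeholder file standing in for real images.
with tempfile.TemporaryDirectory() as tmp:
    ref = Path(tmp) / "hero.png"
    ref.write_bytes(b"placeholder")
    bundle = write_bundle(
        Path(tmp) / "manual_export", "EP001_SH03", "kling_3.0",
        prompt="Edited prompt text", instructions="1. Open the web UI",
        refs=[("identity_hero", ref)], start_frame=ref,
    )
    created = sorted(p.relative_to(bundle).as_posix() for p in bundle.rglob("*"))
```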

**Files:** `editors/manual-workbench.html`, `editors/manual.js`, `editors/review_server.py`
**Effort:** 4-6 hours

### Feature 4: Feedback Loop (Learning Data)

**What:** Capture failure classification on every manual fix to improve the automated pipeline over time.

**Data shape in ExecutionStore:**
```json
{
  "gate_results": {
    "manual_fixes": [
      {
        "timestamp": 1709942400,
        "failure_type": "artifacts",
        "fix_type": "inpaint",
        "model_used": "kling-v3",
        "enrichment_used": false,
        "notes": "6 fingers on left hand",
        "auto_tagged": false
      }
    ]
  }
}
```

**Auto-inference rules:**
- If prompt was edited before export → `fix_type: "prompt_edit"`
- If model was changed → `fix_type: "model_switch"`
- If exported as bundle → `fix_type: "manual_intervention"`
- If Quick Retry was used → `fix_type: "seed_retry"`

**UI:** 5 colored pill buttons shown on re-import. Director must click one before CONFIRM. Optional free-text notes field.

**Files:** `editors/manual.js`, `editors/review_server.py`, `lib/execution_store.py` (no schema change needed — `gate_results` is a flexible dict)
**Effort:** 3-4 hours
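
A minimal sketch of recording one fix in the shape shown above, with the five-category rule enforced on the backend as well as in the UI. The function name `record_manual_fix` is hypothetical; the entry keys and the category list are taken from this spec.

```python
import time

FAILURE_TYPES = ("composition", "artifacts", "motion", "safety_filter", "character")


def record_manual_fix(shot, failure_type, fix_type, model_used,
                      enrichment_used=False, notes="", auto_tagged=False, now=None):
    """Append one entry to gate_results.manual_fixes, rejecting unknown tags."""
    if failure_type not in FAILURE_TYPES:
        raise ValueError(f"failure_type must be one of {FAILURE_TYPES}, got {failure_type!r}")
    entry = {
        "timestamp": int(now if now is not None else time.time()),
        "failure_type": failure_type,
        "fix_type": fix_type,
        "model_used": model_used,
        "enrichment_used": enrichment_used,
        "notes": notes,
        "auto_tagged": auto_tagged,
    }
    # gate_results is a flexible dict, so the list is created on first use.
    shot.setdefault("gate_results", {}).setdefault("manual_fixes", []).append(entry)
    return entry


shot = {"shot_id": "EP001_SH01", "gate_results": {"manual_escalated": True}}
record_manual_fix(shot, "artifacts", "inpaint", "kling-v3",
                  notes="6 fingers on left hand", now=1709942400)
```

Appending rather than overwriting keeps the history when a shot goes through the workbench more than once.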

## Build Priority

| Phase | Feature | Hours | Dependencies |
|-------|---------|-------|-------------|
| 1 | Prompt Enhancement (ENHANCE button) | 3-4h | model_profiles.json, jit_prompt.py |
| 2 | Escalate button + 'M' hotkey in Dailies | 2-3h | review_server.py |
| 3 | Manual Workbench page (/manual) | 4-6h | review_server.py, build_upload_bundle.py |
| 4 | Feedback loop (failure tags) | 3-4h | manual.js, review_server.py |

## Consultation Sources

- 3 rounds Gemini 3.1 Pro consultation (gemini_round_1-3.md)
- 3 rounds Opus 4.6 consultation (opus_round_1.md + 2 sub-agent reviews)
- Kling V3 prompt best practices research (fal.ai, klingaio.com, atlabs.ai)
- SeedDance 2.0 access research (fal.ai, seedancevideo.com)
- GPT 5.3 Codex / Google Antigravity / model comparison research

## Overnight Build Harness

### Overview

Autonomous build loop using Claude Code. Opus orchestrates, sub-agents execute each phase, Gemini consultations review code between phases. The harness runs until all 4 phases are complete and validated.

### How to Launch

Start a fresh Claude Code session in `~/Dropbox/CLAUDE_PROJECTS/starsend/` and run:

```
Read the design doc at docs/plans/2026-03-08-manual-workbench-design.md and execute the overnight build harness specified in it. Build all 4 phases autonomously, consulting Gemini between phases for code review. JT is away — do not ask for input.
```

### Phase Execution Protocol

For EACH phase (1-4):

1. **Read** the feature specification from this design doc
2. **Plan** the implementation (identify files to modify, functions to add)
3. **Build** using sub-agents where possible for parallel work
4. **Validate** — start the server (`python3 editors/review_server.py --project tartarus`), verify no crashes, test new endpoints with curl
5. **Consult Gemini** — send the new/modified code to `tools/consult.py` for review. Ask specifically: "Review this code for bugs, security issues, and integration risks with the existing codebase."
6. **Fix** any issues Gemini identifies
7. **Re-validate** — restart server, re-test
8. **Commit** when phase passes validation

### Phase 1: Prompt Enhancement (3-4h)

**Files to modify:**
- `config/model_profiles.json` — add `enrichment_model`, `enrichment_style`, `negative_prompt_default`, `optimal_prompt_length`, `safety_filter_words` fields per video model
- `lib/jit_prompt.py` — add `enhance_prompt(prompt, target_model, shot_data)` function that calls the appropriate enrichment model (Flash 3.1 for Gemini targets, Sonnet 4.6 for Kling/Sora)
- `editors/review_server.py` — add `POST /api/enhance-prompt` endpoint
- `editors/tabs/canvas/production.js` — add ENHANCE button next to prompt textarea, REVERT TO DEFAULT button, word count indicator

**Validation:**
```bash
# Start server in the background and give it a moment to bind port 8430
python3 editors/review_server.py --project tartarus &
sleep 2
# Test endpoint
curl -X POST http://127.0.0.1:8430/api/enhance-prompt \
  -H "Content-Type: application/json" \
  -d '{"shot_id": "EP001_SH01", "prompt": "Woman walks down corridor", "target_model": "kling-v3"}'
# Should return JSON with enhanced prompt text
```

**Enrichment system prompts to encode:**

For Kling (via Sonnet 4.6):
```
You are a video generation prompt optimizer for Kling V3. Rewrite the following prompt following these rules:
- Structure: Scene → Characters → Action → Camera → Style
- Length: 40-50 words
- Write like a Director of Photography, not a photographer
- Use specific motion verbs with pacing ("glides smoothly", "jerks to a halt")
- Specify camera movement with speed ("slow dolly in", "rapid pan left")
- Include lighting as mood ("bathed in sunset gold", not "3200K")
- Do NOT add content that isn't in the original (no new characters, locations, or story)
- Do NOT use abstract adjectives ("beautiful", "amazing", "cool")
- Output ONLY the enhanced prompt, no explanation
```

For Veo (via Flash 3.1):
```
You are a video generation prompt optimizer for Veo 3.1. Rewrite the following prompt following these rules:
- Dense prose, environment-first, no section headers or bullet points
- Include physics, lighting, and atmospheric details
- Avoid these safety-filter trigger words: blood, fight, struggle, shoot, blade, weapon, kill, attack, punch, stab
- Replace with safe synonyms: conflict→confrontation, wound→injury, strike→impact
- Output ONLY the enhanced prompt, no explanation
```
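
Since the enrichment model may not always follow the avoid-list, a cheap post-check on the returned prompt can catch leftovers before the director reviews it. The sketch below is an assumption about how that check could be done, not part of the spec; `find_trigger_words` is a hypothetical helper and the word list is copied from the Veo system prompt above.

```python
import re

# Word list copied from the Veo system prompt above.
TRIGGER_WORDS = ("blood", "fight", "struggle", "shoot", "blade",
                 "weapon", "kill", "attack", "punch", "stab")


def find_trigger_words(prompt, words=TRIGGER_WORDS):
    """Return trigger words present in the prompt as whole words, in list order."""
    hits = []
    for word in words:
        # \b anchors on word boundaries, so "shoot" does not match "offshoot".
        if re.search(rf"\b{re.escape(word)}\b", prompt, re.IGNORECASE):
            hits.append(word)
    return hits


leftovers = find_trigger_words("Two guards fight near a broken blade")
```

Whole-word matching keeps false positives down but misses inflected forms such as "fighting"; whether to widen the match is a design choice left open here.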

### Phase 2: Dailies Escalation (2-3h)

**Files to modify:**
- `editors/tabs/dailies.js` — add ESCALATE TO MANUAL button in `renderPlayer()`, bind 'M' key in `onKeyDown()`
- `editors/review_server.py` — add `POST /api/manual/escalate` endpoint that sets `gate_results.manual_escalated = true`

**Implementation pattern** (follow existing RE-RENDER pattern in dailies.js lines 187-195):
```javascript
case 'escalate-manual':
  await ConsoleApp.starsendPost('/api/manual/escalate', {
    shot_id: clip.shot_id,
  });
  await loadClips();
  buildDOM();
  ConsoleApp.Toast.show('Flagged for manual workbench');
  break;
```

**Validation:**
```bash
curl -X POST http://127.0.0.1:8430/api/manual/escalate \
  -H "Content-Type: application/json" \
  -d '{"shot_id": "EP001_SH01"}'
# Should return {"ok": true}
# Verify: cat projects/tartarus/state/starsend/shots/EP001_SH01.json | jq .gate_results.manual_escalated
# Should return true
```

### Phase 3: Manual Workbench Page (4-6h)

**New files:**
- `editors/manual-workbench.html` — standalone page
- `editors/manual.js` — all workbench JS

**Files to modify:**
- `editors/review_server.py` — add `/manual` route + 4 API endpoints:
  - `GET /api/manual/shots/{episode}` — returns all shots with `manual_escalated = true`, merged with plan data
  - `POST /api/manual/export` — wraps `build_bundle()` in background thread, returns bundle path
  - `POST /api/manual/reimport` — accepts multipart file upload, copies to correct location, updates ExecutionStore
  - `POST /api/manual/resolve` — clears `manual_escalated`, records fix data in `manual_fixes[]`

**UI Architecture:**
- Single HTML page, no framework, vanilla JS (matches console pattern)
- Fetch data via polling (same pattern as dailies.js — 15s refresh)
- Use morphdom if already loaded, otherwise simple innerHTML
- CSS: reuse `editors/styles/canvas.css` variables for consistency, add `editors/styles/manual.css`

**Design approach — use the frontend-design skill** for the HTML/CSS to ensure production-grade UI quality. The workbench should feel like a filmmaker's tool, not a debug panel.

**Validation:**
```bash
# Navigate to http://127.0.0.1:8430/manual — should load without errors
# Console should show no JS errors
# If no shots are flagged, should show empty state message
```

### Phase 4: Feedback Loop (3-4h)

**Files to modify:**
- `editors/manual.js` — add pill buttons to reconciliation grid, require selection before confirm
- `editors/review_server.py` — `/api/manual/resolve` endpoint stores `manual_fixes[]` entry

**Pill button categories:**
1. `composition` (blue) — framing, placement wrong
2. `artifacts` (red) — visual defects, morphing, extra limbs
3. `motion` (orange) — wrong movement, physics
4. `safety_filter` (yellow) — content filter blocked
5. `character` (purple) — wrong face, costume, features

**Auto-inference logic (in backend):**
```python
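def infer_fix_type(shot, reimport_data):
    """Infer fix_type from how the shot was fixed; first matching rule wins."""
    if reimport_data.get('prompt_edited'):
        return 'prompt_edit'
    if reimport_data.get('model_changed'):
        return 'model_switch'
    if reimport_data.get('source') == 'bundle':
        return 'manual_intervention'
    # Assumed marker for the Quick Retry path: the auto-inference rules list
    # this case but do not name the field that signals it.
    if reimport_data.get('source') == 'quick_retry':
        return 'seed_retry'
    return 'unknown'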
```

**Validation:**
- Re-import a test file via the drop zone
- Verify `manual_fixes` array appears in shot JSON
- Verify pill button selection is required (confirm button disabled until tag selected)

### Inter-Phase Review Protocol

After each phase, run a Gemini consultation:
```bash
python3 tools/consult.py \
  --auto-package ~/Dropbox/CLAUDE_PROJECTS/starsend \
  --prompt-text "Review the code changes from Phase N of the Manual Workbench build. Focus on: bugs, security issues, integration risks with existing codebase, and any violations of the patterns established in review_server.py and dailies.js. Be specific — reference line numbers." \
  --output consultations/manual_workbench/build_review_phase_N.md
```

### Error Recovery

If any phase fails:
1. Read the error message
2. Check if it's a simple fix (typo, missing import, wrong path)
3. If complex, spawn an Opus sub-agent to diagnose
4. Fix and re-validate
5. Do NOT skip phases or mark as complete without validation
6. If blocked for >30 minutes on one issue, flag the blocker, leave that phase marked incomplete, and move on to the next phase

### Success Criteria

All 4 phases are complete when:
- [ ] Server starts without errors
- [ ] `/manual` page loads in browser
- [ ] `POST /api/enhance-prompt` returns enriched text
- [ ] 'M' key in Dailies flags a shot
- [ ] Flagged shot appears in `/manual` triage grid
- [ ] Export bundle creates directory with prompt.txt + refs/
- [ ] Re-import via drop zone updates ExecutionStore
- [ ] Failure tag is required and stored in `manual_fixes[]`
- [ ] All phases pass Gemini code review

## Post-V1 Enhancements (Deferred)

- LLM prompt enrichment for automatic pipeline (not just manual ENHANCE button)
- Session manifest tracking (which bundles exported, uploaded, returned)
- Watched folder for `return/` directory (auto-detect re-imports)
- Failure pattern analysis dashboard (mine manual_fixes data)
- SeedDance 2.0 bundle support (when fal.ai goes live)
- Sora 2.0 bundle support
- Antigravity IDE integration for parallel agent builds
