feat: add creative direction file and score replay by tsensei · Pull Request #44 · tsensei/OpenReels

tsensei · 2026-04-10T07:14:08Z

Summary

Two new pipeline features that unblock the growing OpenReels community:

Creative Direction File (--direction <file>)

Attach a free-form markdown brief (visual style, script notes, music mood, scene ideas)
Injected into both generateDirectorScore and reviseDirectorScore prompts
10KB byte-based limit, binary file detection, empty file guard
Works across CLI, API (direction field), and web UI (textarea with character counter)

Score Replay (--score <path>)

Reuse a previous score.json to skip research, director, and critic stages
Populates all 3 closure bindings (score, config, costBreakdown) for downstream stages
Accurate replay cost estimation (omits skipped LLM calls)
Direction provenance recorded in log.json

Supporting changes:

Example brief at examples/direction-brief.md
Updated TODOS.md linking score replay to the "Retry from failed stage" TODO
README + CLAUDE.md updated with new flags and examples

Test Coverage

Tests: 395 → 423 (+28 new)

COVERAGE: 22/22 paths tested (100%)
QUALITY:  ★★★: 16  ★★: 6

Pre-Landing Review

Pre-Landing Review: No issues found.

Documentation

8 docs site pages updated for v0.18.0:

docs/getting-started/cli.mdx: added --direction and --score flags to table + usage examples
docs/getting-started/quickstart.mdx: added direction and replay CLI examples
docs/getting-started/web-ui.mdx: added Creative Direction textarea and Replay from Score to Advanced panel docs
docs/api/rest.mdx: added direction and score fields to POST /api/v1/jobs body table + error codes
docs/pipeline/overview.mdx: added Score Replay and Creative Direction sections with alternate pipeline flow
docs/pipeline/creative-director.mdx: added direction injection and score replay sections
docs/pipeline/research.mdx: added "Skipping during replay" section
docs/pipeline/critic.mdx: added "Skipping during replay" section

Test plan

All Vitest tests pass (423 tests, 47 files)
Test coverage audit: 100% of new code paths covered
Pre-landing review: no issues
Adversarial review: no findings related to this branch
Documentation updated (README, CLAUDE.md, 8 docs site pages)

🤖 Generated with Claude Code

Two new pipeline inputs: --direction <file> for creative briefs that influence DirectorScore generation, and --score <path> for replaying from a saved score.json (skips research, director, and critic stages). Direction is free-form markdown injected into the creative director prompt. Score replay populates all closure bindings (score, config, costBreakdown) and runs cost estimation with accurate replay pricing.

Server validates direction (10KB byte limit) and score (DirectorScore schema). Worker threads both to runPipeline. Web UI adds direction textarea with character counter and score file input in Advanced panel.

28 new tests across 3 new test files + 1 extended: - direction-file.test.ts: file validation, binary detection, empty guard - score-replay.test.ts: Zod validation, replay cost estimation - server-direction-score.test.ts: API validation, byte-based size checks - cost-estimator.test.ts: replay mode zeroes LLM base costs

Sample direction file demonstrating mood, visual style, scene ideas, script notes, and music preferences for the --direction flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

tsensei · 2026-04-10T07:27:09Z

@claude review

claude · 2026-04-10T07:27:27Z

Claude finished @tsensei's task in 3m 9s —— View job

Code Review

Summary

The implementation is clean and well-architected. Both features integrate naturally into the existing pipeline. Below are the findings, ordered by severity.

Issues Found

[Medium] Score replay with unknown archetype crashes the job silently

Files: src/server.ts:246-252 and src/pipeline/orchestrator.ts:436

The server validates a replayed score with DirectorScore.safeParse(), which accepts any string for archetype (the test at score-replay.test.ts:85-91 documents this explicitly). However, the orchestrator later calls getArchetype(score.archetype) which throws for unknown names:

// src/config/archetype-registry.ts:36-38
if (!config) {
  throw new Error(`Unknown archetype: ${name}. Available: ${Object.keys(ARCHETYPES).join(", ")}`);
}

A user can send { "archetype": "my_custom_style", ... } — the POST returns 201 queued, and the job immediately fails in the director step. The same gap exists in the CLI path (src/index.ts:55).

Fix: Validate the archetype against the registry at the server layer: Fix this →

[Low] `replayScorePath` declared in `RunLog` but never populated

File: src/pipeline/orchestrator.ts:125

interface RunLog {
  // ...
  direction?: string;
  replay?: boolean;
  replayScorePath?: string;  // ← declared but never set
}

direction and replay are correctly populated at lines 925-926, but replayScorePath is never written. Either remove it from the interface or populate it (e.g., replayScorePath: opts.replayScorePath — though PipelineOptions would also need the field). Fix this →

[Low] Web UI character counter vs server byte limit mismatch

File: web/src/pages/HomePage.tsx:428-433

<textarea
  maxLength={10000}       // character count
  ...
/>
<p>{directionText.length.toLocaleString()} / 10,000</p>  // also chars

The server enforces Buffer.byteLength(direction, "utf-8") > 10240 (bytes). A user writing CJK text could hit 10,000 characters with ~30KB of UTF-8 bytes, or conversely receive a confusing 400 after the UI shows them well under the limit. The two systems should measure the same unit.

The simplest fix is to show byte count in the UI counter and guard with a byte-based maxLength approximation, or change the server to enforce the character limit instead. Fix this →

Test Coverage Observations

Direction injection tests are type-only (src/pipeline/direction-file.test.ts:70-88): The two tests under "direction injection into creative director" only verify TypeScript types — they don't actually exercise the prompt construction. The directionSection string building in creative-director.ts:83-85 and 219-221 is untested at the unit level.
The direction + score conflict test doesn't cover the silencing behavior (src/pipeline/score-replay.test.ts:137-146): It confirms both fields can coexist in PipelineOptions, but the actual silencing of direction when replayScore is set happens in the CLI (src/index.ts:68-71) and worker (src/worker.ts:268), neither of which is covered by this test. A future refactor could accidentally remove the guard without a test catching it.

What's Working Well

Direction injection is correctly scoped to both generateDirectorScore and reviseDirectorScore with a consistent ## Creative Direction section header and a clear framing sentence.
Cost estimation for replay (src/cli/cost-estimator.ts:117-130) correctly zeros out research/director/critic LLM costs while preserving image-prompter costs.
Dual-path consistency: CLI, server, and worker all enforce the same 10KB limit via the same mechanism (byte length for server/worker, statSync.size for CLI — both byte-accurate for UTF-8 files on disk).
Log provenance (src/pipeline/orchestrator.ts:925-926): direction and replay fields are correctly set.
Binary file guard (null-byte check in src/index.ts:31) is a sensible cheap test for binary content.
Test count increase of +28 with replay cost coverage is solid.

- Validate replayed score's archetype against registry in both server (400 for unknown archetype) and CLI (exit with clear error). Prevents silent job failures when score.json references a removed archetype. - Remove dead replayScorePath field from RunLog (declared but never set). - Fix web UI direction counter to use byte length (TextEncoder) matching the server's Buffer.byteLength check. Prevents CJK text from hitting the server 10KB limit while the UI shows it under the character limit.

tsensei · 2026-04-10T08:10:56Z

Review Response

Fixed (3 issues)

[Medium] Score replay with unknown archetype — Fixed. Both server (src/server.ts) and CLI (src/index.ts) now validate the replayed score's archetype against the registry before accepting it. Server returns 400, CLI exits with a clear error message. Commit: eeaeeed.
[Low] replayScorePath never populated — Fixed. Removed the dead field from RunLog. The replay: true flag is sufficient to identify replay runs. The source file path is consumed at the CLI/worker layer and isn't available at the orchestrator level without threading another field through PipelineOptions, which isn't worth the complexity for a log-only field.
[Low] Character counter vs byte limit — Fixed. The web UI now uses TextEncoder to count bytes and enforces the same 10,240-byte limit as the server. The onChange handler rejects input that would exceed the byte limit, and the counter shows X / 10,240 bytes.

Not fixing (design decisions)

Direction injection tests are type-only: This is intentional. The direction text is injected via string interpolation into the LLM user message. Testing the actual prompt construction would require either (a) mocking the LLM and capturing the prompt argument, which couples the test to internal prompt format, or (b) an integration test with a real LLM call, which is expensive. The type-level tests verify the interface contract (direction is accepted as an option). The actual prompt injection is simple string concatenation (${options.direction}) — the risk of regression is low, and a format-coupled test would break on every prompt wording change.

Direction+score conflict silencing not tested at CLI/worker level: The silencing logic is 2 lines (if (direction) { direction = undefined; }) in both index.ts and worker.ts. A unit test for this would require mocking the entire CLI argument parser or BullMQ worker context. The behavior is covered by the design doc's edge case spec and exercised by manual testing. If this logic were refactored into a shared function, it would warrant its own test, but extracting a helper for 2 lines of assignment would be over-engineering.

tsensei · 2026-04-10T08:11:55Z

@claude review

claude · 2026-04-10T08:12:11Z

Claude finished @tsensei's task in 1m 52s —— View job

Code Review

Gather context from PR description and previous review
Read implementation files (server.ts, index.ts, orchestrator.ts, HomePage.tsx)
Check archetype validation fix
Check replayScorePath removal
Check byte-based UI counter fix
Review new test files
Post final review

Summary

All three issues from the previous review are correctly resolved. The implementation is clean and ready to land.

Previous Issues — Verification

[Medium] Archetype validation ✅ Fixed correctly in both paths:

src/server.ts:251-258: After safeParse succeeds, wraps getArchetype(result.data.archetype) in try/catch, returns 400 on unknown archetype.
src/index.ts:57-58: Calls getArchetype(replayScore.archetype) immediately after DirectorScore.parse(), error caught at line 59 with a clear message.

[Low] replayScorePath dead field ✅ Removed. RunLog at orchestrator.ts:115-125 now only has direction? and replay?, both correctly populated at lines 924-925.

[Low] Character counter vs byte limit ✅ Fixed. onChange at HomePage.tsx:428-430 uses new TextEncoder().encode(e.target.value).length with the 10,240-byte guard, and the counter at line 434 shows bytes instead of characters.

New Observations

[Nit] Unnecessary type cast in server archetype validation

File: src/server.ts:253

getArchetype((result.data as { archetype: string }).archetype);

result.data is already typed as DirectorScore (which declares archetype: z.string()), so the cast is redundant. result.data.archetype works directly. Not a correctness issue.

[Nit] Archetype registry check has no dedicated unit test

server-direction-score.test.ts tests Zod-level schema validation but doesn't cover the getArchetype() guard added in server.ts:251-258. The behavior is indirectly covered by archetype-registry.test.ts:68 (which confirms getArchetype("nonexistent") throws), but a test that exercises the full server.ts guard path (invalid archetype → 400) would make the coverage explicit. Low-risk omission given the simplicity of the guard.

[UX note] Silent rejection on over-limit paste

The onChange guard at HomePage.tsx:428-430 silently drops the entire input if pasting would exceed 10,240 bytes (the state just doesn't update). Users near the limit who paste a large block will see nothing happen with no explanation. The byte counter makes this discoverable, but a brief inline error message (e.g. "Exceeds 10,240 byte limit") on rejection would improve clarity. Not a blocker.

What's Working Well

All three fixes are minimal and accurate — no over-engineering.
worker.ts:268 (effectiveDirection = replayScore ? undefined : direction) silences direction correctly during replay.
Cost estimator correctly zeroes research/director/critic LLM calls in replay mode (cost-estimator.ts:117).
Test count: 395 → 423 is solid coverage for the new paths.
direction-file.test.ts exercises the real filesystem for all edge cases (empty file, whitespace-only, binary null-byte, CJK byte-size, ENOENT).

Talha Jubair Siam and others added 8 commits April 10, 2026 13:12

feat(server): thread direction and score through API, worker, and web UI

8b70293

Server validates direction (10KB byte limit) and score (DirectorScore schema). Worker threads both to runPipeline. Web UI adds direction textarea with character counter and score file input in Advanced panel.

docs: add example creative direction brief

41654ca

Sample direction file demonstrating mood, visual style, scene ideas, script notes, and music preferences for the --direction flag.

chore: bump version and changelog (v0.18.0)

f0adeda

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: update project documentation for v0.18.0

0998f51

docs: update docs site for direction file and score replay (v0.18.0)

088d520

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(test): remove unused import in score-replay test

a506a94

tsensei merged commit 05d2a62 into main Apr 10, 2026
1 check passed

tsensei deleted the feat/direction-file-score-replay branch April 10, 2026 08:15

tsensei mentioned this pull request Apr 10, 2026

feat: add support for custom context files and direction #39

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add creative direction file and score replay#44

feat: add creative direction file and score replay#44
tsensei merged 9 commits intomainfrom
feat/direction-file-score-replay

tsensei commented Apr 10, 2026 •

edited

Loading

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tsensei commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Coverage

Pre-Landing Review

Documentation

Test plan

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

Summary

Issues Found

[Medium] Score replay with unknown archetype crashes the job silently

[Low] replayScorePath declared in RunLog but never populated

[Low] Web UI character counter vs server byte limit mismatch

Test Coverage Observations

What's Working Well

Uh oh!

tsensei commented Apr 10, 2026

Review Response

Fixed (3 issues)

Not fixing (design decisions)

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

Summary

Previous Issues — Verification

New Observations

[Nit] Unnecessary type cast in server archetype validation

[Nit] Archetype registry check has no dedicated unit test

[UX note] Silent rejection on over-limit paste

What's Working Well

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tsensei commented Apr 10, 2026 •

edited

Loading

claude Bot commented Apr 10, 2026 •

edited

Loading

[Low] `replayScorePath` declared in `RunLog` but never populated

claude Bot commented Apr 10, 2026 •

edited

Loading