feat(video): enhance video generation quality with dedicated prompter and provider upgrades by tsensei · Pull Request #43 · tsensei/OpenReels

tsensei · 2026-04-10T04:52:06Z

Summary

Video generation quality enhancement for OpenReels v0.17.0. Two commits:

Video Quality Enhancement:

New prompts/video-prompter.md with structured cinematography language (shot type, subject, action, camera movement, lighting, style)
Negative prompts on both providers: defaults + archetype antiArtifactGuidance
Kling upgraded from v2.1 standard to v2.6 Pro (same $0.07/s pricing, better physics)
cfg_scale: 0.5 passed to Kling API
Fixed Veo config: removed personGeneration, enhancePrompt, generateAudio, negativePrompt params that veo-3.1-lite-generate-preview rejects (was causing Forbidden errors)
Fixed factory videoModel routing: GeminiVideo receives override in both primary/secondary positions, FalVideo always uses its own default
motionPrompt and negativePrompt logged in VideoResolution metadata for observability

Documentation:

CHANGELOG v0.17.0 entry
README: updated video provider versions, Visuals stage description
CLAUDE.md: updated test count (349 → 395), video provider versions
TODOS.md: added Motion enum expansion (P2) and video safety re-prompting (P2)
VERSION: 0.16.0 → 0.17.0

Test Coverage

Tests: 44 files, 395 passing (was 349 on main)

New tests cover:

Negative prompt construction (defaults + antiArtifactGuidance, empty fallback)
Provider param threading (Gemini negativePrompt, Fal negative_prompt + cfg_scale)
Veo config: unsupported params not passed to lite model
Kling v2.6 Pro model ID
VideoResolution metadata includes motionPrompt/negativePrompt
Motion prompt fallback when LLM fails
Factory videoModel routing (GeminiVideo gets override, FalVideo gets default)

Pre-Landing Review

No issues found. No frontend files changed. No prompt eval files matched.

TODOS

2 new items added:

Expand Motion enum with cinematic camera directions (P2)
Video generation safety re-prompting (P2)

Test plan

All vitest tests pass (395 tests, 0 failures)
Manual Veo API verification: bare config works, unsupported params correctly removed
Version bump and changelog updated

🤖 Generated with Claude Code

…, negative prompts, and provider upgrades - Add dedicated prompts/video-prompter.md with structured cinematography language (shot type, camera movement, temporal progression, physics/realism guidance) - Add negativePrompt support to VideoProvider interface, threaded through video-resolver with defaults + archetype antiArtifactGuidance - Upgrade Kling from v2.1 standard to v2.6 Pro (same pricing, better physics) - Add cfg_scale parameter to Fal/Kling provider - Fix Veo config: remove unsupported params (personGeneration, enhancePrompt, generateAudio, negativePrompt) that veo-3.1-lite-generate-preview rejects - Fix factory videoModel routing: pass to GeminiVideo only, not FalVideo - Add motionPrompt and negativePrompt to VideoResolution for observability - Load video-prompter.md instead of image-prompter.md when mode="video"

tsensei · 2026-04-10T04:54:37Z

@claude review

claude · 2026-04-10T04:54:50Z

Claude finished @tsensei's task in 1m 46s —— View job

Code Review: feat(video) — video generation quality enhancement

Gather context: read changed files
Review video provider changes (Gemini/Fal/video-resolver)
Review factory routing fix
Review prompt system (video-prompter.md + image-prompter.ts)
Review tests
Review schema and cost-estimator changes
Post findings

Overall this is a clean, well-motivated PR. The Veo Forbidden-error fix alone makes it worth shipping, the negative-prompt plumbing is solid, and the test coverage is genuinely comprehensive. A few things worth a look before merge:

Issues

1. `negativePrompt` silently dropped for Gemini — no observability signal

gemini.ts:28 accepts negativePrompt in the interface but never uses it. The comment explains why (veo-3.1-lite doesn't support it), but the caller in video-resolver.ts:104 passes it expecting an effect. Right now the negative prompt is logged in VideoResolution.metadata but there's no indication to the user/operator that it was silently discarded by Gemini.

A one-liner console.warn inside generate() (similar to the warn already used elsewhere in the codebase) would surface this clearly:

// In gemini.ts generate(), after the config block:
if (opts.negativePrompt) {
  console.warn(`[video] negativePrompt ignored: veo-3.1-lite-generate-preview does not support it`);
}

Fix this →

2. `imageGenTimeMs` is always `0` in `VideoResolution`

video-resolver.ts:62:

const imageGenTimeMs = 0; // Already tracked by caller

Every VideoResolution object emitted by resolveAIVideo will have imageGenTimeMs: 0. The comment is accurate ("already tracked by caller"), but the field being present with a hardcoded 0 is misleading in observability logs and the VideoResolution type. Either:

Remove the field from VideoResolution (and the method: "image_fallback" path already omits it correctly — wait, no, it sets imageGenTimeMs there too)
Or accept it as a parameter so the real value can be passed through

Since this field is part of the interface defined at line 15-23, I'd at minimum remove imageGenTimeMs from VideoResolution or mark it as deprecated. If the caller already tracks it separately, it's noise here.

Fix this →

3. `cfg_scale: 0.5` hardcoded — not in `VideoProvider` interface

fal.ts:44:

cfg_scale: 0.5,

This is always sent regardless of archetype. 0.5 for Kling means "loosely follow the prompt / prioritize motion diversity". That might be the right default, but it's a meaningful creative parameter (higher = more prompt-faithful, lower = more model-prior). Since antiArtifactGuidance is archetype-specific, a future extension might want per-archetype cfg_scale guidance. Right now this can't be tuned without editing the provider directly.

Not blocking, but worth a // TODO(P2): make cfg_scale configurable per-archetype comment if you want to keep the door open.

4. `mimeType: "image/png"` hardcoded in both providers

gemini.ts:39 and fal.ts:34:

mimeType: "image/png"   // gemini.ts
type: "image/png"       // fal.ts Blob

The VideoProvider.generate() interface (providers.ts:72) accepts sourceImage: Buffer with no MIME type hint. Today all callers pass PNG (image-gen always produces PNG), so this is fine. But it's a hidden assumption — if the buffer were ever JPEG the API would receive mismatched content.

A minimal guard would be to document this assumption in the interface or the method signature. Not blocking.

5. `pickDuration` fallback when `supportedDurations` is empty

video-resolver.ts:40:

return sorted[sorted.length - 1] ?? 5;

If a provider exposes supportedDurations: [], sorted[-1] is undefined and we fall back to 5 seconds. The provider may then reject 5 as unsupported. This is a provider-level contract violation so the ?? 5 fallback is reasonable, but a console.warn here would help debug future provider misconfigurations.

Minor observations

image_fallback path missing motionPrompt/negativePrompt in metadata (video-resolver.ts:151-163): when all video providers fail, the fallback VideoResolution omits these fields even though they were computed. They could be useful for diagnosing why generation failed. The fields are optional in the type (motionPrompt?: string) so this isn't a type error, but the debug value is lost.
Test placeholder value (video-resolver.test.ts:65): antiArtifactGuidance: "no artifacts" is a very generic mock. This works but a realistic archetype value (e.g. "avoid overexposed highlights, color banding") would make the test more representative. Cosmetic.

What's well done

The video-prompter.md is excellent — the 7-element anatomy and 10 motion rules are exactly what video models respond well to, and the length guidance (300-500 chars) is grounded.
Negative prompt construction is clean: archetypeGuidance ? \${DEFAULT}, ${archetypeGuidance}` : DEFAULT` is concise and handles the empty-string case correctly (tested).
The factory videoModel routing fix is correct and the two new tests ("primary" and "fal primary" cases) cover both branches well.
Removing personGeneration, enhancePrompt, generateAudio, negativePrompt from the Veo config is the right call. The comment documenting why is appreciated.
Test for "LLM optimization fails, use raw visual_prompt" (video-resolver.test.ts:205) is exactly the failure mode that matters most.
imageGenTimeMs: 0 aside, motionPrompt and negativePrompt in VideoResolution is good observability discipline.

Verdict: Items 1 and 2 are worth addressing before merge (low effort, concrete improvement). Items 3-5 are acceptable as-is with a comment or deferred to TODOS.

- Add console.warn when negativePrompt is silently dropped by Gemini's veo-3.1-lite model (review finding #1, valid)

tsensei · 2026-04-10T04:59:34Z

Code Review Response (Round 2)

Finding 1: negativePrompt silently dropped for Gemini — Fixed ✅

Added console.warn in gemini.ts when negativePrompt is provided but ignored by the lite model. Commit: e162d98.

Finding 2: imageGenTimeMs always 0 — By design, not a bug

The 0 in video-resolver.ts:62 is a deliberate placeholder. The caller in orchestrator.ts:284-286 patches it with the real value after resolveAIVideo returns:

// orchestrator.ts:284-286
if (videoResult.videoResolution) {
  videoResult.videoResolution.imageGenTimeMs = imageGenTimeMs;
}

The resolver doesn't know when image gen started (it receives a pre-generated image buffer), so it sets 0 and lets the caller fill in the actual timing. The final VideoResolution in logs always has the correct value.

Finding 3: cfg_scale hardcoded — Design decision

0.5 is Kling's documented default. It means "balanced between prompt adherence and model creativity." Making this configurable per-archetype is possible but premature — we'd need empirical data on which archetypes benefit from different values. If we add it later, it's a one-line change per archetype JSON + a new field on the interface. Not blocking.

Finding 4: mimeType hardcoded as PNG — Design decision

All image providers (Gemini Imagen, DALL-E) produce PNG. The sourceImage: Buffer in the VideoProvider interface is always PNG because it comes from generateAIImage() which always outputs PNG. Changing this would require the entire image pipeline to support format negotiation. The assumption is correct today and will remain correct unless we add a JPEG image provider (unlikely, since PNG is lossless and better for video source frames).

Finding 5: pickDuration empty array fallback — Design decision

Both GeminiVideo and FalVideo hardcode non-empty supportedDurations arrays ([4, 6, 8] and [5, 10]). An empty array would be a provider implementation bug, not something the resolver should guard against. The ?? 5 is a last-resort safety net, not a real code path. Adding a warn here would be dead code in practice.

Minor observations — Acknowledged

image_fallback path missing motionPrompt/negativePrompt: fair point. These were computed before the provider loop, so they'd be useful for debugging why generation failed. Low priority but could add in a follow-up.
Test mock values: cosmetic, works fine for the assertions being made.

tsensei · 2026-04-10T05:01:01Z

@claude review

claude · 2026-04-10T05:01:16Z

Claude finished @tsensei's task in 2m 35s —— View job

Code Review: feat(video) — Round 3

Gather context from PR and prior review exchange
Read changed files: gemini.ts, fal.ts, video-resolver.ts, factory.ts
Read prompt file (video-prompter.md) and image-prompter.ts
Read test files + orchestrator context
Post review findings

Round 3 covers the current branch state after the round-2 fix (negativePrompt warn on gemini.ts). The author's responses to findings 2-5 are well-reasoned and all confirmed by the code.

New finding: motion-prompt LLM usage is silently discarded

video-resolver.ts:67-76:

const optimized = await optimizeImagePrompt(...);
motionPrompt = optimized.prompt;
// optimized.usage ← never used

The optimizeImagePrompt call has real LLM cost (tokens for the motion prompt LLM call), but optimized.usage is dropped. Every other resolver in the codebase that calls optimizeImagePrompt accumulates the usage:

adaptive-resolver.ts:261: llmUsages.push(optimized.usage)
orchestrator.ts:159: usage = optimized.usage (for image mode)
music-resolver.ts:102: returns prompterUsage: promptResult.usage

The video resolver is the only one that doesn't. As a result, video pipeline runs silently under-report LLM cost — every AI video scene incurs an LLM call that's never reflected in the cost estimate.

The fix follows the music-resolver pattern: add a prompterUsage?: LLMUsage field to the resolveAIVideo return type and surface it to the orchestrator for accumulation.

Fix this →

Confirmed from round 2 (no action needed)

imageGenTimeMs = 0: Confirmed correct — orchestrator.ts:284-286 patches videoResult.videoResolution.imageGenTimeMs = imageGenTimeMs after resolveAIVideo returns. One nitpick: the if (videoResult.videoResolution) guard at line 285 checks a non-optional field (the return type always includes videoResolution), so the condition is always true. Harmless but mildly misleading — could be a direct assignment.

cfg_scale, mimeType, pickDuration fallback: All confirmed as acceptable design decisions. No changes needed.

Minor observation (new)

fal.ts:67 vs gemini.ts:81: Gemini checks fs.existsSync(tmpPath) || fs.statSync(tmpPath).size === 0; Fal only checks size. Since fsp.writeFile at line 65 would throw on path failure, this can't cause a silent issue in practice. But the asymmetry is worth aligning for consistency.

What's well done

The image-prompter.ts mode-split (VIDEO_PROMPT_PATH / IMAGE_PROMPT_PATH) is cleaner than the old approach of loading the image prompter and appending a video note — the dedicated system prompt can now evolve independently.
video-prompter.md is excellent production-quality prompt engineering. The 7-element anatomy with earlier elements weighted higher, one-action-one-camera-move rule, and the 300-500 char guidance are all well-grounded.
negativePrompt warn in gemini.ts (round-2 fix) is exactly right.
Factory routing (videoModel passed to GeminiVideo, undefined to FalVideo) is correct and both branches are covered by tests.
The image_fallback path consistency (provider fails → image is returned) is well-tested.

Verdict: One substantive finding (LLM usage gap) worth addressing before merge — it's a one-field addition that aligns with the existing pattern in every other resolver. Everything else is clean. Good PR overall.

The video resolver's optimizeImagePrompt call has real token cost but the returned usage was silently discarded. Every other resolver in the codebase (adaptive-resolver, music-resolver, orchestrator) accumulates LLM usage for cost reporting. This aligns video-resolver with that pattern: prompterUsage flows through VisualAssetResult to llmUsages.

tsensei · 2026-04-10T05:07:03Z

Code Review Response (Round 3)

Finding: motion prompter LLM usage silently discarded — Fixed ✅

Good catch. optimizeImagePrompt in video-resolver was the only resolver in the codebase that didn't surface its LLM usage for cost accumulation. Every ai_video scene was incurring a motion-prompt LLM call that never showed up in the cost report.

What changed (commit 5f9aec6):

video-resolver.ts: captures optimized.usage into prompterUsage variable, returns it on all 3 exit paths (success, fallback, unreachable)
orchestrator.ts: added prompterUsage?: LLMUsage | null to VisualAssetResult, accumulates it in the scene results loop alongside r.usage
video-resolver.test.ts: 2 new tests — verifies usage is surfaced on success, and null when the LLM call fails

Pattern matches music-resolver.ts (returns prompterUsage: promptResult.usage) and adaptive-resolver.ts (pushes optimized.usage to llmUsages).

Minor observation: fal.ts existsSync vs gemini.ts — Acknowledged

The asymmetry exists because fsp.writeFile (fal) throws on path failure while client.files.download (gemini) may silently produce an empty file. Both paths catch their respective failure modes. Could align for consistency in a follow-up, not blocking.

Confirmed: orchestrator guard at line 285 — Acknowledged

if (videoResult.videoResolution) is always true since the return type mandates it. Harmless guard, could be a direct assignment. Not worth a diff.

397 tests passing.

tsensei · 2026-04-10T05:07:13Z

@claude review

claude · 2026-04-10T05:07:30Z

Claude finished @tsensei's task in 1m 41s —— View job

Code Review: feat(video) — Round 4

Gather context: read changed files
Verify round-3 fix (motion prompter LLM usage surfacing)
Check orchestrator integration
Review test coverage for new fix
Final pass on remaining files
Post findings

Round 4 covers the current branch state after commit 5f9aec6. All findings from rounds 1–3 have been addressed. This is a clean pass.

Round-3 fix verified: prompterUsage surfacing is correct

video-resolver.ts:79 — Usage is correctly captured:

motionPrompt = optimized.prompt;
prompterUsage = optimized.usage;   // ← captured

All 3 exit paths return prompterUsage:

Success path (line 141): prompterUsage included ✅
Last-provider-failed fallback (line 168): prompterUsage included ✅
Unreachable empty-providers fallback (line 188): prompterUsage included ✅

orchestrator.ts:615-616 — Accumulation is correct:

if (r.usage) llmUsages.push(r.usage);
if (r.prompterUsage) llmUsages.push(r.prompterUsage);

The if (r.prompterUsage) guard correctly handles both null (LLM failed) and undefined (non-video scene types). No double-counting: image scenes route their prompter cost through r.usage (as before); video scenes carry it separately via r.prompterUsage.

Tests (video-resolver.test.ts:205-229):

"surfaces motion prompter LLM usage for cost tracking" — verifies { inputTokens: 100, outputTokens: 50 } ✅
"returns null prompterUsage when motion prompt LLM fails" — verifies null on exception ✅

All prior design decisions confirmed

Finding	Status
`negativePrompt` warn in `gemini.ts:33-35`	Correctly emits warn with model name in message ✅
`imageGenTimeMs: 0` filled by orchestrator	Confirmed — `orchestrator.ts:284-286` patches real value ✅
`cfg_scale: 0.5` hardcoded in `fal.ts:44`	Acceptable default; door left open for P2 ✅
`mimeType: "image/png"` hardcoded	Correct assumption for current image pipeline ✅
`pickDuration` `?? 5` fallback	Both providers have hardcoded non-empty arrays; safety net is fine ✅
Factory `videoModel` routing	`GeminiVideo(config.videoModel)` / `FalVideo(undefined)` — correct ✅

One minor observation (no action needed)

image-prompter.ts:36-44: If video-prompter.md fails to load, the catch silently falls back to the shorter inline prompt (line 38), then still appends the STYLE BIBLE (line 48). This is correct behavior and intentional. Just noting it so reviewers aren't surprised if they ever see a motion prompt that looks simpler than expected — it means the file was missing.

Verdict

This PR is ready to merge. All substantive issues across four rounds of review have been addressed. The round-3 LLM usage fix is complete and consistent with the pattern established by music-resolver.ts and adaptive-resolver.ts. Test coverage is comprehensive. The Veo Forbidden-error fix, negative prompt plumbing, Kling upgrade, and factory routing fix are all clean.

Talha Jubair Siam added 3 commits April 10, 2026 10:46

docs: update project documentation for v0.17.0

41511a8

fix(test): await async test to prevent unhandled rejection in CI

74f1ff2

fix(review): address round-2 code review findings on PR #42

e162d98

- Add console.warn when negativePrompt is silently dropped by Gemini's veo-3.1-lite model (review finding #1, valid)

tsensei merged commit d55601b into main Apr 10, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(video): enhance video generation quality with dedicated prompter and provider upgrades#43

feat(video): enhance video generation quality with dedicated prompter and provider upgrades#43
tsensei merged 5 commits intomainfrom
feat/video-generation-quality

tsensei commented Apr 10, 2026

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tsensei commented Apr 10, 2026

Summary

Test Coverage

Pre-Landing Review

TODOS

Test plan

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: feat(video) — video generation quality enhancement

Issues

1. negativePrompt silently dropped for Gemini — no observability signal

2. imageGenTimeMs is always 0 in VideoResolution

3. cfg_scale: 0.5 hardcoded — not in VideoProvider interface

4. mimeType: "image/png" hardcoded in both providers

5. pickDuration fallback when supportedDurations is empty

Minor observations

What's well done

Uh oh!

tsensei commented Apr 10, 2026

Code Review Response (Round 2)

Finding 1: negativePrompt silently dropped for Gemini — Fixed ✅

Finding 2: imageGenTimeMs always 0 — By design, not a bug

Finding 3: cfg_scale hardcoded — Design decision

Finding 4: mimeType hardcoded as PNG — Design decision

Finding 5: pickDuration empty array fallback — Design decision

Minor observations — Acknowledged

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: feat(video) — Round 3

New finding: motion-prompt LLM usage is silently discarded

Confirmed from round 2 (no action needed)

Minor observation (new)

What's well done

Uh oh!

tsensei commented Apr 10, 2026

Code Review Response (Round 3)

Finding: motion prompter LLM usage silently discarded — Fixed ✅

Minor observation: fal.ts existsSync vs gemini.ts — Acknowledged

Confirmed: orchestrator guard at line 285 — Acknowledged

Uh oh!

tsensei commented Apr 10, 2026

Uh oh!

claude Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: feat(video) — Round 4

Round-3 fix verified: prompterUsage surfacing is correct

All prior design decisions confirmed

One minor observation (no action needed)

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Apr 10, 2026 •

edited

Loading

1. `negativePrompt` silently dropped for Gemini — no observability signal

2. `imageGenTimeMs` is always `0` in `VideoResolution`

3. `cfg_scale: 0.5` hardcoded — not in `VideoProvider` interface

4. `mimeType: "image/png"` hardcoded in both providers

5. `pickDuration` fallback when `supportedDurations` is empty

claude Bot commented Apr 10, 2026 •

edited

Loading

claude Bot commented Apr 10, 2026 •

edited

Loading