Video Generation Through Code
The video generation pipeline has inverted. Instead of timeline-based editing with static assets, Claude now codes React components that render individual frames. Each frame becomes programmable, each scene becomes modular, and the entire video becomes iteratively editable through natural language.
This shift matters because it separates video structure from video content. Traditional video editing locks decisions into rendered assets—change the background color, re-render everything. With component-based generation, modifications target specific elements while preserving the rest. The edit becomes surgical rather than destructive.
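The frame-as-function idea can be sketched without any Remotion dependency. Remotion does export `useCurrentFrame()` and `interpolate()`, but the helper below is an illustrative reimplementation, and `titleOpacityAt` is a hypothetical example name, not library code:

```typescript
// Sketch: every visual property is a pure function of the frame number.
// interpolate() here mirrors the spirit of Remotion's helper of the same
// name; the implementation is an assumption for illustration only.
function interpolate(
  frame: number,
  [inStart, inEnd]: [number, number],
  [outStart, outEnd]: [number, number],
): number {
  // Normalize the frame into the input range and clamp to [0, 1].
  const t = Math.min(1, Math.max(0, (frame - inStart) / (inEnd - inStart)));
  return outStart + t * (outEnd - outStart);
}

// Hypothetical scene property: a title that fades in over the first
// 30 frames (one second at 30fps). Editing this function changes one
// element of one scene; nothing else re-renders differently.
function titleOpacityAt(frame: number): number {
  return interpolate(frame, [0, 30], [0, 1]);
}
```

Because each property is derived deterministically from the frame, "change the fade duration" is a one-line code edit rather than a re-render of baked assets.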
The workflow splits into distinct phases: planning generates the storyboard and asset requirements, implementation codes the React components, and refinement operates on individual scenes through targeted prompts. Each phase builds on deterministic foundations while leveraging generative capabilities appropriately.
The implementation artifacts below establish this separation—from initial skill installation through granular editing patterns that preserve video structure while enabling content iteration.
Foundation Setup
The video generation capability requires explicit skill installation rather than relying on base model knowledge. The Remotion skills package contains prompts that teach Claude how to structure video components, manage assets, and coordinate with the local development environment.
Install Remotion video generation skills:
npm create @remotion/video -- --template=blank, then install skills using the command from remotion.dev. Configure for global scope when prompted to ensure availability across projects.
The skills installation transforms Claude’s understanding of video from abstract concept to concrete React implementation pattern. Without explicit instruction, Claude treats video as a black box output. With Remotion skills, it understands frame-by-frame composition, asset coordination, and browser-based editing interfaces.

Planning Protocol
Video generation benefits from explicit planning phases that separate conceptual design from implementation execution. The planning stage forces structure decisions before component coding begins, reducing iteration overhead and improving first-pass quality.
Add to CLAUDE.md:
For video projects, always use plan mode first. Create detailed storyboard with scene breakdown, identify required assets and sources, plan timing and transitions, get approval before coding. Only proceed to implementation after plan approval.
The planning requirement prevents expensive rewrites of React video components. Unlike text generation where iteration is cheap, video components carry rendering overhead and complex interdependencies. Planning upfront establishes constraints that guide implementation rather than forcing post-hoc fixes.
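A storyboard produced in the planning phase can be represented as plain data that is checked before any component code exists. The `Scene` shape and `totalDurationSeconds` below are hypothetical sketches, assuming a 30fps timeline:

```typescript
// Hypothetical storyboard record emitted by the planning phase.
interface Scene {
  id: string;
  description: string;      // what the scene shows
  durationInFrames: number; // timing decided at plan time, not render time
  assets: string[];         // required images/graphics for this scene
}

const FPS = 30;

// Verify the plan hits the requested duration before implementation
// begins, so timing errors are caught at the cheap stage.
function totalDurationSeconds(storyboard: Scene[]): number {
  return storyboard.reduce((sum, s) => sum + s.durationInFrames, 0) / FPS;
}

const storyboard: Scene[] = [
  { id: "intro", description: "Logo reveal", durationInFrames: 90, assets: ["logo.png"] },
  { id: "features", description: "Key feature callouts", durationInFrames: 300, assets: [] },
  { id: "outro", description: "Call to action", durationInFrames: 60, assets: [] },
];
```

Validating duration and asset requirements against a structure like this is what makes plan approval meaningful: the constraints are explicit before component coding starts.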
Add to CLAUDE.md:
For video projects: When I provide URLs, scrape content and extract relevant assets automatically. If I don’t provide assets, search web for appropriate images/graphics and incorporate them. Always ask if I want to upload custom assets before using web-sourced ones.
Asset sourcing automation addresses the common bottleneck in video creation—finding appropriate visual elements. By establishing clear precedence (custom uploads override web search, explicit assets override automatic sourcing), the prompt prevents Claude from making assumptions about asset preferences while maintaining generation speed.
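The precedence rule described above can be made concrete as a small resolver. The `AssetCandidate` shape and `resolveAsset` function are illustrative assumptions, not part of any Remotion or Claude API:

```typescript
// Asset precedence from the prompt: custom uploads beat explicitly
// provided assets, which beat web-sourced ones.
type AssetSource = "upload" | "explicit" | "web";

interface AssetCandidate {
  source: AssetSource;
  path: string;
}

const PRECEDENCE: AssetSource[] = ["upload", "explicit", "web"];

// Return the highest-precedence candidate, or undefined if none exist.
function resolveAsset(candidates: AssetCandidate[]): AssetCandidate | undefined {
  for (const source of PRECEDENCE) {
    const match = candidates.find((c) => c.source === source);
    if (match) return match;
  }
  return undefined;
}
```

Encoding the precedence as an ordered list keeps the rule in one place, so extending it (say, with a cached-asset tier) is a single edit.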
Command Implementation
The video-from-url pattern automates the most common video generation use case: explaining existing content through visual presentation. This command encapsulates the full pipeline from content analysis to video delivery.
Create /video-from-url command:
Takes URL and duration parameters. Scrapes target content, analyzes key features, plans video structure, sources appropriate assets, generates video highlighting main points. Spawns local video editor for refinement and exports final output.
The command abstracts the complexity of content-to-video transformation while preserving granular control over the output. Users specify intent (URL + duration) rather than implementation details (scene structure + asset requirements + timing coordination). Claude handles the translation from content analysis to visual presentation.
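In Claude Code, a command like this lives as a markdown prompt file under .claude/commands/, with $ARGUMENTS standing in for whatever follows the slash command. The file below is a sketch, assuming that convention; the exact wording is illustrative:

```markdown
<!-- .claude/commands/video-from-url.md (hypothetical sketch) -->
Given the arguments: $ARGUMENTS (a URL and a target duration).

1. Scrape the target URL and extract its key features.
2. Enter plan mode: produce a storyboard, scene breakdown, and asset list.
3. Source assets per the precedence rules in CLAUDE.md.
4. After plan approval, implement the Remotion components.
5. Launch the local browser editor for review, then export the final video.
```

Invoked as `/video-from-url https://example.com 60s`, the user states only intent; the translation to scenes, assets, and timing happens inside the command.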

Iterative Editing Pattern
The editing workflow leverages component modularity to enable surgical modifications without full regeneration. Each scene exists as an independent React component, making targeted changes feasible through natural language instruction.
Implement iterative video editing pattern:
After initial video generation, use granular editing workflow: Review specific scenes/frames, give targeted feedback (e.g., ‘Scene 3, fix overlapping text’), let Claude update individual components, preview changes in browser editor, repeat until satisfied.
The pattern inverts traditional video editing assumptions. Instead of timeline-based modifications that require re-rendering affected sections, component-based editing updates specific elements while preserving surrounding context. This enables faster iteration cycles and more precise control over individual elements.
The editing pattern also preserves video structure during content modifications. Changing text in scene 3 doesn’t affect timing in scenes 1, 2, 4, or 5. The modular architecture isolates changes to their intended scope, preventing cascading modifications that plague traditional video editing workflows.
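The isolation property falls out of how scenes are positioned: each scene's start frame is just the sum of the durations before it, which is how Remotion's Sequence and Series components lay scenes out. The sketch below is an illustrative model, not Remotion code:

```typescript
// Hypothetical scene spec: timing lives here, content lives in the
// scene's own component.
interface SceneSpec {
  id: string;
  durationInFrames: number;
}

// Compute each scene's start frame from the durations preceding it.
// Editing a scene's *content* never touches this table; only changing
// a duration shifts the scenes that follow.
function startFrames(scenes: SceneSpec[]): Map<string, number> {
  const starts = new Map<string, number>();
  let cursor = 0;
  for (const scene of scenes) {
    starts.set(scene.id, cursor);
    cursor += scene.durationInFrames;
  }
  return starts;
}

const timeline = startFrames([
  { id: "scene1", durationInFrames: 90 },
  { id: "scene2", durationInFrames: 120 },
  { id: "scene3", durationInFrames: 60 },
]);
```

Because "fix overlapping text in scene 3" modifies only scene 3's component body, the timeline table above is untouched and scenes 1 and 2 render byte-for-byte identically.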
Enhancement Protocol
Professional video polish emerges through systematic improvement rather than intuitive design. The enhancement protocol leverages Claude’s knowledge of motion graphics principles to suggest specific improvements rather than relying on subjective feedback.
Add to CLAUDE.md:
After creating initial video, ask: “If you were an expert motion graphics designer, what 5 improvements would you suggest for this video?” Then implement approved suggestions for animations, transitions, color schemes, and visual polish.
The enhancement prompt transforms Claude from video generator to design consultant. Rather than accepting first-pass output, the protocol systematically identifies improvement opportunities through expert knowledge application. This elevates video quality without requiring domain expertise from the user.
The five-improvement constraint forces prioritization and prevents overwhelming feedback. Unlimited suggestions create analysis paralysis; five specific improvements create actionable next steps. The constraint also ensures suggestions target the most impactful changes rather than exhaustive minutiae.
Synthesis
The artifacts above establish three distinct authority boundaries: planning constrains implementation decisions, asset sourcing constrains visual choices, and editing constrains modification scope. Each boundary prevents Claude from making assumptions while preserving generative capabilities within defined limits.
The video generation workflow succeeds because it separates creative decisions (what to show) from technical implementation (how to render). Users specify content intent, Claude handles React component architecture. Users approve enhancement suggestions, Claude implements motion graphics. The authority remains with the user while leveraging Claude’s technical execution capabilities.
The deeper pattern reveals component-based media generation as a new paradigm. Video becomes programmable, audio could follow similar patterns, and interactive content becomes systematically generatable. The shift from asset-based to code-based media creation enables iteration speeds and customization depth that traditional tools cannot match.