Case study · Building in public

MotionStudio.
A browser-based video editor, built on Remotion.

2026·Personal project · solo buildIn development

Place text, images, video and audio on a canvas, arrange them on a frame-accurate timeline, animate them, and export to MP4 / WebM / GIF / MOV — entirely in the browser, no backend. This page is the living build journal: the architecture, the decisions, and the bugs that taught me something.

~3.8K
Lines of strict TypeScript
7
Engines — data & logic, zero UI
60
Logically-grouped commits
0
Backend services — fully client-side
See it
The editor today — canvas, properties panel, frame-accurate timeline
The editor today — canvas, properties panel, frame-accurate timeline · click to zoom
Remotion

Why Remotion is the core bet

The product renders real video. Remotion gives me frames as a first-class unit, <Sequence> for temporal composition, <Player> for in-app preview, and renderMedia for the actual encode — building a renderer and encoder myself would be months of work that teaches nothing about this product. The interesting engineering is everything I built on top of it:

Frame-based temporal model — every element carries `startFrame` + `durationInFrames`, mapped 1:1 onto `<Sequence from={startFrame}>`. Visibility is the half-open window `[start, start + duration)`. Frames, not seconds, because Remotion is frame-based and frames are exact — no floating-point drift.

One shared renderer = guaranteed WYSIWYG — the editor preview and the Remotion export call the *same* style function, differing only by a `scale` argument (editor < 1, Remotion = 1). WYSIWYG isn't "we tried to match" — the preview is *mathematically* the export.

Animation on Remotion's primitives — `interpolate` with `extrapolate: 'clamp'` so animations finish instead of extrapolating to infinity, and `spring` for physics (it needs `fps`, because a bounce is real-time). Multiple animations accumulate into one transform — factors multiply, offsets add — the same algebra compositors use.

Export via the Remotion CLI — Remotion renders in Node (headless Chrome + FFmpeg), not the browser. A `registerRoot` entry exposes the composition, `calculateMetadata` derives size/fps/duration per project, and the in-app Export dialog generates the exact `npx remotion render` command plus a `props.json`. Formats via `--codec` (h264 / vp8 / gif / prores).

DOM video vs Remotion video — the editor previews with a DOM `<video>` (seek on scrub for exact frames, play natively during playback, muted for reliable autoplay); the export uses `<OffthreadVideo>`, which is authoritative.

Tech stack — and why each
Remotion

Frames, <Sequence>, <Player>, renderMedia — the video engine. The one dependency the product is genuinely built around.

React 19 + TypeScript (strict)

Discriminated-union element types; adding a new element type = one type + one renderer.

Zustand

Global state with zero boilerplate: one create() → a hook + selectors. No providers, no reducers.

React Router v7

/ dashboard, /editor/:projectId — the URL is the single input that selects a project.

react-moveable

Drag/resize/rotate handles are a solved problem; rebuilding them is weeks of hit-testing math that teaches nothing about this product.

Tailwind v4 + shadcn/ui

Fast, consistent dark UI via design tokens; accessible primitives (Dialog, Popover, Select) without reinventing them.

IndexedDB + localStorage

Client-only persistence, split by data shape: small JSON state in localStorage, large media blobs in IndexedDB.

Architecture

Two layers, one rule

engines/   own DATA + LOGIC   (no UI)
features/  own UI             (compose engines)

The Project is the aggregate root. One Project object owns all data — elements, assets, settings. Engines don't keep their own copies; they expose verbs that read and write the one Project through a single mutation point (updateProject). Everything downstream falls out of this: undo/redo hooks the one mutation point and covers every edit automatically, autosave persists the one Project with nothing to wire per-feature, and state drift is impossible because there's only ever one source of truth.

The three data moves — immutability everywhere

add     → [...arr, x]
remove  → arr.filter(x => x.id !== id)
update  → arr.map(x => x.id === id ? { ...x, ...patch } : x)

React and Zustand detect change by reference identity, and undo snapshots must stay frozen. A single .push() would skip re-renders and corrupt history.

The seven engines
projectThe projects array (aggregate root) + undo history + persistenceThe only real store
editorEphemeral view state: selection, current frame, playing, zoomDeliberately not persisted
canvasVerbs: addText/Image/Video/Audio, updateElement, removeElement, reorderLayerA hook, not a store — owns no data
timelinePure frame ↔ pixel mathStateless helpers
animationinterpolate / spring evaluation + presetsPure functions
renderingThe shared style.ts + Remotion MotionCompositionOne renderer, two consumers
assetUpload, metadata probing, blob persistenceReads/writes project.assets

Why some engines are stores and others are hooks: a store owns state (Project, Editor). A hook owns verbs over state it doesn't hold — Canvas and Asset read the active project and write back via updateProject. Keeping element data on the Project, not in the Canvas engine, is the aggregate root enforced.

Core systems & decisions

Composition-space coordinates

Elements are stored in output resolution (16:9 = 1920×1080), not screen pixels; the editor renders a scaled view. The export must match the editor — store coordinates once at final resolution and every view (editor at ~50%, Remotion at 100%) just multiplies by its own scale. This one conversion powers canvas dragging, drop-to-canvas, and scrubbing.

data → screen : × scale   (shrink to fit the window)
screen → data : ÷ scale   (grow a drag back to real coords)

Timeline coordinate math

The same idea, on the time axis. Dragging a clip to retime it is updateElement(id, { startFrame }) — the same verb as canvas dragging, through the same door.

pxPerFrame  = trackWidth / totalFrames
frameToX(f) = f × pxPerFrame          (draw a clip / ruler tick)
xToFrame(x) = round(x / pxPerFrame)   (scrub / drag)

Time-based playback clock

Playback advances by real elapsed time × fps, not currentFrame++ per animation frame. requestAnimationFrame fires at the monitor's rate — 60 or 120Hz, dropping under load — so frame++ would play 30fps content at 60fps on a 60Hz screen. Measuring wall-clock time keeps speed correct on any hardware.

Persistence, split by data shape

Object URLs die on reload, so the bytes are persisted and a fresh URL is minted each session. localStorage can't hold large binaries; IndexedDB is built for Blobs. Editor view state is intentionally not persisted — you don't want to reopen frozen mid-playback.

metadata (JSON, small)  → localStorage via Zustand persist
media bytes (binary)    → IndexedDB

Undo/redo — snapshots + coalescing

History is snapshots of the projects array. Because edits build new objects immutably, snapshots share unchanged sub-objects — no deep copies. Rapid edits within ~500ms coalesce into one step, so a whole drag or a typing burst is one undo. It was nearly free to build, because every edit already flows through updateProject.

Bugs that taught me something

Problems faced, and how they fell.

Bug

Omit on a discriminated union silently collapses to common fields — updateElement lost content, assetId, and friends with no error.

Fix

An ElementPatch type: the intersection of per-member partials, so every field of every union member is patchable.

Bug

Remotion's <Composition> inferred props as unknown. An interface isn't assignable to Record<string, unknown> — it could be augmented later; a type alias is.

Fix

Changed interface ExportProps to type ExportProps. One keyword, fixed inference.

Bug

contenteditable cursor jumped to the start on every keystroke — React re-rendering the element reset the DOM selection.

Fix

Set initial text via a ref on mount only, then let onInput push to the store without React re-writing the node.

Bug

react-moveable under a scaled canvas: element coords are composition-space but the stage is scaled, so handles and elements disagreed.

Fix

Drive drag/resize with client-pixel deltas ÷ scale, committing composition coords on release.

Bug

Layer reorder did nothing. The first attempt shuffled array order — but stacking is driven by zIndex, not array order.

Fix

Rewrote it to reassign contiguous zIndex values.

Bug

Undo restored dead blob: URLs — asset rehydration went through updateProject and entered history.

Fix

Rehydration became a silent update ({ history: false }).

Bug

White-on-white text: default text color was #ffffff on a white canvas. Invisible, and no error anywhere.

Fix

A default that contrasts with the canvas — and a lesson that "no crash" ≠ "correct."

What each decision bought
Project as aggregate rootUndo/redo + autosave added at one point, covered everything
Engines own verbs, not stateNo state drift; features stayed thin and composable
One shared rendererWYSIWYG guaranteed, not hoped for
Composition-space coordsEditor preview = export, at any zoom
Frames + <Sequence>Timeline maps directly onto Remotion; export just works
Immutable updatesCheap undo (structural sharing) + reliable re-renders
Discriminated-union elementsNew element type = one type + one renderer
Trade-offs, honestly
  • Export is terminal-based — a true in-browser render needs a backend or Lambda; Remotion can't render inside a tab.
  • Uploaded media in exports needs real file paths in props.json — blob: URLs are browser-only.
  • Editor audio/video preview is muted and autoplay-dependent; the export is authoritative for sound and timing.
  • No scene grouping yet — sequencing is done by positioning clips on the timeline.
  • Fonts fall back to system sans-serif in the Node render (not bundled yet).
Lessons
01

A single mutation path is a superpower — it's what made undo, autosave, and WYSIWYG cheap. Decide where data changes before deciding how.

02

Store data in the target domain (output resolution, frames), not the view's units — views come and go, the data shouldn't.

03

"No error" isn't "correct" — white-on-white text and silent Omit-on-union both shipped zero warnings.

04

Buy the boring parts (moveable handles, encoding) and build the parts that are actually your product: the composition model, the timeline, the animation system.