Stable Video Diffusion is an independent AI video tool built on the open Stable Video Diffusion model family (SVD, SVD-XT). It is not affiliated with Stability AI. All trademarks belong to their respective owners.

Refined 14-frame base model with consistent motion

Stable Video Diffusion 1.1 image to video

Stable Video Diffusion 1.1 is the refined release of Stability AI's base image-to-video model. It generates 14 frames from a single still using latent video diffusion, fine-tuned at fixed conditioning (6 fps, motion bucket 127) for more consistent, predictable motion and fewer artifacts than the original checkpoint.

Stable Video Diffusion 1.1 at a glance

Max resolution

Max duration: 60s

Commercial license: Full

Typical render time: Minutes

What Stable Video Diffusion 1.1 can do

Everything you need to turn a still image into motion.

14-frame base

SVD 1.1 produces 14 frames at 576x1024 from one image, making it a dependable starting point for image-to-video generation.

Fixed conditioning

Fine-tuned at 6 fps and motion bucket 127, it delivers reliable, repeatable motion without hunting for settings.

Consistent frames

Temporal latent diffusion keeps subjects stable across the clip, reducing the warping and flicker common in earlier checkpoints.

Lightweight & open

As the base SVD model, it runs faster and lighter than the XT variant, which makes it well suited to quick iterations and shorter loops.
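
For readers who run the open weights themselves, the settings quoted above (14 frames, 576x1024, 6 fps, motion bucket 127) map onto a handful of generation arguments. The following is a minimal sketch, assuming the Hugging Face diffusers library and its StableVideoDiffusionPipeline. The checkpoint identifier, file paths, and the helper names svd11_settings and animate are placeholders for illustration and are not part of this tool.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ConditioningSettings:
    """Generation settings quoted on this page for SVD 1.1."""
    num_frames: int = 14          # clip length of the base model
    height: int = 576
    width: int = 1024
    fps: int = 6                  # conditioning frame rate used in fine-tuning
    motion_bucket_id: int = 127   # conditioning motion bucket used in fine-tuning


def svd11_settings() -> dict:
    """Return the settings as keyword arguments for a pipeline call."""
    return asdict(ConditioningSettings())


def animate(image_path: str, checkpoint: str, out_path: str = "clip.mp4") -> str:
    """Hypothetical helper: animate one still image.

    Assumes torch and diffusers are installed and a CUDA GPU is available.
    `checkpoint` is whatever SVD checkpoint id or local path you have access to.
    """
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import export_to_video, load_image

    pipe = StableVideoDiffusionPipeline.from_pretrained(
        checkpoint, torch_dtype=torch.float16
    )
    pipe.to("cuda")

    settings = svd11_settings()
    # Resize the still to the resolution the model expects (width, height).
    image = load_image(image_path).resize((settings["width"], settings["height"]))
    frames = pipe(image, **settings).frames[0]
    export_to_video(frames, out_path, fps=settings["fps"])
    return out_path


if __name__ == "__main__":
    # Only prints the settings; call animate() yourself on a GPU machine.
    print(svd11_settings())
```

Keeping the settings in one frozen dataclass is a design choice for this sketch: because SVD 1.1 was fine-tuned at a single conditioning point, there is little reason to vary these values per call.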

How Stable Video Diffusion 1.1 works

Upload an image

Start from any clear still — a portrait, product shot, or landscape.

Choose the motion

Pick a motion preset to direct how the scene moves.

Generate & download

Render in minutes and download a ready-to-share video.

Stable Video Diffusion 1.1 FAQ

SVD 1.1 is a refined fine-tune of the base SVD checkpoint. It was trained at fixed conditioning of 6 fps and motion bucket id 127, which yields more consistent, higher-quality motion than the original 1.0 release.

The base SVD 1.1 model generates 14 frames at 576x1024 from a single input image. For longer 25-frame clips, use the SVD-XT variant instead.

Stable Video Diffusion 1.1 is an image-to-video model. It animates a still image through latent video diffusion; there is no text prompt driving the motion.

Pick SVD 1.1 when you want fast, consistent results and a shorter 14-frame clip. Its fixed conditioning makes output predictable, which is handy for quick tests and tight loops.
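
To put the frame counts above in perspective, clip length is simply frames divided by playback rate. The sketch below assumes playback at the same 6 fps used for conditioning, and it assumes the same rate for SVD-XT purely for comparison; the function name clip_seconds is illustrative.

```python
def clip_seconds(num_frames: int, fps: float) -> float:
    """Length of a clip in seconds when played back at `fps`."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


if __name__ == "__main__":
    # Assumption: playback rate equals the 6 fps conditioning rate.
    print(f"SVD 1.1 (14 frames): {clip_seconds(14, 6):.2f} s")  # 2.33 s
    print(f"SVD-XT (25 frames): {clip_seconds(25, 6):.2f} s")   # 4.17 s
```

At that rate the base model yields a little over two seconds of motion, while the 25-frame variant yields a little over four.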

Explore more models & tools

Pick the right model or creative tool for your next clip.

Stable Video Diffusion XT

25-frame image-to-video for longer, smoother clips

Stable Video Diffusion XT (SVD-XT) is the extended image-to-video model from Stability AI. Built on the same latent video diffusion backbone as the base model, it is fine-tuned to generate 25 frames from a single still, producing longer and noticeably smoother motion. You steer the result with motion bucket and fps conditioning instead of a text prompt.

Image to Video

Animate a still with Stable Video Diffusion

Image to Video uses Stable Video Diffusion to turn a single still image into a short, coherent motion clip. Upload a frame and SVD conditions on it to generate smooth movement while keeping your subject anchored.

Text to Video

Turn a prompt into a Stable Video Diffusion clip

Text to Video pairs a text-to-image step with Stable Video Diffusion: your written prompt becomes a starting frame, then SVD animates it into a short clip — no source photo required.

Motion Control

Tune motion bucket and frame count

Motion Control gives you direct access to Stable Video Diffusion's key dials — the motion bucket, frame count, and target fps — so you can fine-tune exactly how much a still image moves.

Showcase gallery

Browse real Stable Video Diffusion creations made with our models.

Animate your first image with Stable Video Diffusion 1.1

Upload a photo, choose the motion, and get a cinematic clip in minutes.
