Text-to-Video Examples

Real examples generated using Open-Sora. Try these prompts yourself or explore the Open-Sora-Demo repository for more examples.

Raining Sea

Prompt: "raining, sea"

256px resolution, motion score 4, 17 frames

torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/256px.py \
    --prompt "raining, sea" \
    --save-dir samples

Mountain Landscape

Prompt: "A beautiful sunset over mountains with clouds, cinematic view"

768px resolution, 16:9 aspect ratio, high quality

torchrun --nproc_per_node 8 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/768px.py \
    --prompt "A beautiful sunset over mountains with clouds, cinematic view" \
    --aspect_ratio 16:9

Urban Night Scene

Prompt: "Busy city street at night with neon lights, people walking"

256px resolution, motion score 7, dynamic scene

torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/t2i2v_256px.py \
    --prompt "Busy city street at night with neon lights, people walking" \
    --motion-score 7

Ocean Waves

Prompt: "Waves crashing on a rocky shore, dramatic ocean scene"

256px resolution, motion score 5, natural motion

torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/256px.py \
    --prompt "Waves crashing on a rocky shore, dramatic ocean scene" \
    --motion-score 5

Forest Path

Prompt: "A peaceful forest path with sunlight filtering through trees"

256px resolution, motion score 3, serene atmosphere

torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/256px.py \
    --prompt "A peaceful forest path with sunlight filtering through trees" \
    --motion-score 3

Space Animation

Prompt: "Galaxy rotating in space, stars twinkling, cosmic scene"

768px resolution, motion score 6, high detail

torchrun --nproc_per_node 8 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/768px.py \
    --prompt "Galaxy rotating in space, stars twinkling, cosmic scene" \
    --motion-score 6

More Examples Available

Explore the official Open-Sora-Demo repository for:

  • Complete video demonstrations with GIFs and videos
  • Additional prompt examples across different categories
  • Gradio web interface demos
  • Model comparison examples (v1.0, v1.1, v1.2, v1.3, v2.0)
  • Performance benchmarks and scaling demos
  • Motion score comparison examples
  • Aspect ratio variations

Note: The examples above link to the GitHub repository where you can view the actual generated videos and GIFs. Visit the demo folder to explore all available examples organized by version.

Image-to-Video Examples

Transform static images into dynamic videos with Open-Sora's image-to-video generation capabilities.

Pig in Mud Pond

Prompt: "A plump pig wallows in a muddy pond on a rustic farm, its pink snout poking out as it snorts contentedly. The camera captures the pig's playful splashes, sending ripples through the water under the midday sun."

256px resolution, i2v_head condition, reference image provided

torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/256px.py \
    --cond_type i2v_head \
    --prompt "A plump pig wallows in a muddy pond..." \
    --ref assets/texts/i2v.png

Landscape Animation

Prompt: "Serene landscape with flowing water and gentle breeze, clouds moving across the sky"

768px resolution, batch processing, multiple frames

torchrun --nproc_per_node 8 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/768px.py \
    --cond_type i2v_head \
    --dataset.data-path assets/texts/i2v.csv

Portrait Animation

Prompt: "Portrait of a person with gentle facial expressions, subtle head movement"

256px resolution, i2v_head condition, low motion

torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/256px.py \
    --cond_type i2v_head \
    --prompt "Portrait of a person with gentle facial expressions, subtle head movement" \
    --ref assets/texts/portrait.png \
    --motion-score 2

Image-to-Video Tips

  • Use high-quality reference images for best results
  • Provide detailed prompts describing the desired motion
  • Batch processing is supported via CSV files
  • Works best with clear, well-lit images
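
As an illustration of the CSV-based batch mode mentioned in the tips, the sketch below writes a small batch file. The column names text and ref are an assumption (chosen to mirror the --prompt and --ref flags), and write_batch_csv, my_batch.csv, and path/to/second_image.png are hypothetical names. Check assets/texts/i2v.csv in the repository for the layout the loader actually expects.

```shell
# Hypothetical helper: write a two-row batch file to the path given as $1.
# ASSUMPTION: the loader expects "text" and "ref" columns; verify against
# assets/texts/i2v.csv before relying on this layout.
write_batch_csv() {
    cat > "$1" <<'EOF'
text,ref
"A plump pig wallows in a muddy pond on a rustic farm",assets/texts/i2v.png
"Serene landscape with flowing water and gentle breeze",path/to/second_image.png
EOF
}

write_batch_csv my_batch.csv
```

If the layout matches, the file could then be passed with --dataset.data-path my_batch.csv together with --cond_type i2v_head, in the same way as the batch command shown earlier.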

Motion Score Comparison

Use --motion-score to control the intensity of motion in generated videos. Scores range from 1 (static) to 7 (high motion), with 4 as the default. See examples in the demo repository.

Low Motion (Score 1)

Static scenes with minimal movement. Perfect for still life, portraits, or architectural shots.

Example: "A still photograph of a vintage camera on a wooden table"

--motion-score 1

Balanced Motion (Score 4)

Default setting with moderate movement. Ideal for most general video generation tasks.

Example: "A cat walking gracefully across a garden path"

--motion-score 4

High Motion (Score 7)

Dynamic scenes with intense movement. Great for action sequences, sports, or fast-paced content.

Example: "A skateboarder performing tricks in an urban skatepark"

--motion-score 7

Dynamic Motion Score

Open-Sora also supports automatic motion score evaluation using OpenAI's API. Set your API key in OPENAI_API_KEY, then pass --motion-score dynamic to let the system choose a motion level based on your prompt.

export OPENAI_API_KEY=sk-xxxx
torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/t2i2v_256px.py \
    --prompt "your prompt here" \
    --motion-score dynamic
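
Because a mistyped score is only discovered after a job has started, a small pre-flight check can help. The following is a minimal sketch, not part of Open-Sora: valid_motion_score is a hypothetical helper, and it assumes the accepted values are the integers 1 through 7 or the word dynamic, as described in this section.

```shell
# Hypothetical pre-flight check, not part of Open-Sora.
# ASSUMPTION: valid values are the integers 1-7 or the word "dynamic".
valid_motion_score() {
    case "$1" in
        [1-7]|dynamic) return 0 ;;
        *) return 1 ;;
    esac
}

# Try a few values; only 9 should be reported as invalid.
for score in 1 4 7 dynamic 9; do
    if valid_motion_score "$score"; then
        echo "ok:      --motion-score $score"
    else
        echo "invalid: $score (expected 1-7 or dynamic)"
    fi
done
```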

Aspect Ratio Examples

Open-Sora supports multiple aspect ratios to match your content needs. Use the --aspect_ratio parameter to specify your desired format.

16:9 (Widescreen)

Standard widescreen format, perfect for cinematic videos, YouTube, and general video content.

--aspect_ratio 16:9

9:16 (Vertical)

Vertical format, ideal for mobile viewing, TikTok, Instagram Stories, and Shorts.

--aspect_ratio 9:16

1:1 (Square)

Square format, great for Instagram posts, Facebook videos, and social media content.

--aspect_ratio 1:1

2.39:1 (Cinematic)

Ultra-wide cinematic format, perfect for film-style content and anamorphic video.

--aspect_ratio 2.39:1
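
To compare the four formats on one prompt, the commands can be generated in a loop. The sketch below is a dry run that only prints the commands. aspect_ratio_commands and the per-ratio output folders (for example samples/16x9) are hypothetical conveniences; the flags and config path are the ones used elsewhere on this page.

```shell
# Dry run: print one inference command per aspect ratio shown above.
# aspect_ratio_commands is a hypothetical helper; nothing is executed.
aspect_ratio_commands() (
    prompt="$1"
    for ratio in 16:9 9:16 1:1 2.39:1; do
        # Per-ratio output folder (e.g. samples/16x9) to keep results apart.
        dir="samples/$(printf '%s' "$ratio" | tr ':' 'x')"
        printf 'torchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/256px.py --prompt "%s" --aspect_ratio %s --save-dir %s\n' \
            "$prompt" "$ratio" "$dir"
    done
)

aspect_ratio_commands "raining, sea"
```

Printing first makes it easy to review the exact command lines before committing GPU time to them.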

Gradio Web Interface

Try Open-Sora interactively using the Gradio web interface. Launch the demo locally or explore hosted versions.

Launch Gradio Interface

Run the Gradio demo to generate videos through a user-friendly web interface:

# Install Gradio dependencies
pip install gradio

# Launch the web interface
python gradio/app.py

Features Available in Gradio Interface:

  • Interactive text-to-video generation
  • Image-to-video conversion
  • Real-time preview of generated videos
  • Motion score adjustment
  • Aspect ratio selection
  • Batch processing support

For more information and examples, visit the Open-Sora-Demo repository.

Real-World Prompt Examples

Explore diverse prompts that showcase Open-Sora's capabilities across different scenarios and styles.

Popular Prompt Categories

Nature & Landscapes

  • "A serene mountain lake at sunrise, mist rising from the water"
  • "Dense forest with sunlight filtering through tall trees"
  • "Desert dunes shifting in the wind, golden hour lighting"
  • "Waterfall cascading down moss-covered rocks"

Urban & Architecture

  • "Busy city intersection during rush hour, time-lapse effect"
  • "Modern skyscraper reflecting sunset colors"
  • "Historic European street with cobblestones, people walking"
  • "Neon-lit Tokyo street at night, cyberpunk aesthetic"

Animals & Wildlife

  • "Eagle soaring over mountain peaks, cinematic shot"
  • "Dolphins swimming in crystal clear ocean water"
  • "Lion pride resting in African savanna, golden hour"
  • "Butterfly landing on flower, macro photography style"

Abstract & Artistic

  • "Abstract fluid dynamics, colorful paint mixing"
  • "Geometric patterns morphing and transforming"
  • "Cosmic nebula with stars and galaxies rotating"
  • "Particle effects forming abstract shapes"

Tip: More detailed prompts often yield better results. Include descriptions of motion, lighting, camera movement, and style for optimal video generation.
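
One way to apply this tip consistently is to assemble each prompt from separate parts. The sketch below is illustrative only: compose_prompt is a hypothetical helper, and the motion, lighting, camera, and style phrases are made-up examples rather than prompts from the Open-Sora repository.

```shell
# Hypothetical helper: join a subject with optional detail phrases
# (motion, lighting, camera movement, style), skipping empty ones.
compose_prompt() (
    prompt="$1"
    shift
    for part in "$@"; do
        if [ -n "$part" ]; then
            prompt="$prompt, $part"
        fi
    done
    printf '%s\n' "$prompt"
)

compose_prompt "Waterfall cascading down moss-covered rocks" \
    "water spraying into mist" \
    "soft morning light" \
    "slow upward camera tilt" \
    "cinematic style"
```

The result can be passed directly as the value of --prompt.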

Code Examples & Usage

Complete code examples for common use cases with Open-Sora.

Basic Text-to-Video

# Single GPU for 256px resolution
torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/256px.py \
    --prompt "A serene beach at sunset with waves gently crashing" \
    --save-dir ./outputs \
    --aspect_ratio 16:9 \
    --num_frames 17

High-Resolution Multi-GPU

# Multi-GPU for 768px resolution
torchrun --nproc_per_node 8 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/768px.py \
    --prompt "A futuristic cityscape with flying vehicles" \
    --save-dir ./outputs \
    --aspect_ratio 16:9 \
    --motion-score 5

Image-to-Video with Batch Processing

# Process multiple images from CSV
torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/256px.py \
    --cond_type i2v_head \
    --dataset.data-path assets/texts/i2v.csv \
    --save-dir ./outputs

Reproducible Generation

# Fix both seeds for reproducible results; --num-sample 3 generates three samples per prompt
torchrun --nproc_per_node 1 --standalone \
    scripts/diffusion/inference.py \
    configs/diffusion/inference/t2i2v_256px.py \
    --prompt "Your prompt here" \
    --sampling_option.seed 42 \
    --seed 42 \
    --num-sample 3

Performance Examples

Real-world performance metrics and benchmarks from Open-Sora 2.0.

256x256 Resolution

  • 1 GPU: 60 seconds / 52.5GB memory
  • 2 GPUs: 40 seconds / 44.3GB memory
  • 4 GPUs: 34 seconds / 44.3GB memory

Tested on H100/H800 GPUs with 50 inference steps

768x768 Resolution

  • 1 GPU: 1656 seconds / 60.3GB memory
  • 2 GPUs: 863 seconds / 48.3GB memory
  • 4 GPUs: 466 seconds / 44.3GB memory
  • 8 GPUs: 276 seconds / 44.3GB memory

Tested on H100/H800 GPUs with 50 inference steps
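
The 768x768 timings above can be turned into speedup and parallel efficiency figures with a little arithmetic: speedup is the 1-GPU time divided by the n-GPU time, and efficiency is speedup divided by n. The sketch below does this with awk; scaling_table is a hypothetical helper name, and the input numbers are copied from the list above.

```shell
# Speedup and parallel efficiency for the 768x768 timings listed above.
# speedup = t(1 GPU) / t(n GPUs); efficiency = speedup / n.
scaling_table() {
    awk 'NR == 1 { base = $2 }
         { s = base / $2
           printf "%d GPU(s): %.2fx speedup, %.0f%% efficiency\n", $1, s, 100 * s / $1 }' <<'EOF'
1 1656
2 863
4 466
8 276
EOF
}

scaling_table
```

On these numbers, 8 GPUs give a 6.00x speedup, and efficiency falls from about 96% at 2 GPUs to 75% at 8 GPUs.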

Benchmark Results

Open-Sora 2.0 achieves:

  • VBench Score: Gap with OpenAI's Sora narrowed to 0.69% (from 4.52% for Open-Sora 1.2)
  • Human Preference: On par with HunyuanVideo 11B and Step-Video 30B
  • Cost Efficiency: Trained for about $200k
  • Cost Reduction: 46% compared to baseline implementations

Want to Create Your Own?

Start generating videos with Open-Sora