Somake

Wan

Wan 2.6 transforms text and images into videos with lip-sync, multi-character dialogue, and custom personas.

What is Wan

Wan is an open-source AI video generation model series developed by Alibaba Group's Tongyi Lab. The Wan family represents Alibaba's flagship effort in multimodal AI, designed to transform text prompts, images, and reference videos into high-quality video content with realistic motion and visual consistency.

Current Version: Wan 2.6 (December 2025)

Wan 2.6 — Latest Updates

Last updated: December 2025

Wan 2.6 launched shortly after version 2.5, focusing on tighter multimodal integration and expanded creative controls. This release addresses key limitations in earlier versions while introducing features designed for more complex content creation workflows.

Key improvements in Wan 2.6:

  • Native audio generation upgraded: Audio output sounds more natural than in Wan 2.5, though voice realism still trails premium competitors such as Veo 3 and Sora 2

  • Extended duration: Support for up to 15-second clips at 1080P, with the ability to combine multiple clips for longer sequences

  • Character reference system: Upload up to three character references from video to maintain consistency across generations (Note: This feature is not yet available on Somake)

  • Personal avatar creation: Record your own face from multiple angles and voice samples to create a consistent AI persona (Note: This feature is not yet available on Somake)

  • Multi-character dialogue: Clean handling of conversations between multiple characters without speech overlap

  • Environment and wardrobe control: Change character clothing and scene environments through prompts

  • Fluid motion quality: Video output features convincing camera effects like zoom and blur with smooth movement
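Because a single clip tops out at 15 seconds, longer sequences have to be planned as several clips. The following is a minimal sketch of that arithmetic, not part of Wan or Somake: the function name `plan_clips` is hypothetical, and the 5, 10, and 15 second clip lengths are an assumption about what the generator offers.

```python
# Assumed clip lengths in seconds; 15 is the stated Wan 2.6 per-clip maximum.
ALLOWED_CLIP_SECONDS = (5, 10, 15)


def plan_clips(target_seconds: int) -> list[int]:
    """Return clip lengths whose total covers target_seconds."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    longest = max(ALLOWED_CLIP_SECONDS)
    full_clips, remainder = divmod(target_seconds, longest)
    plan = [longest] * full_clips
    if remainder:
        # Pick the shortest allowed clip that still covers what is left.
        plan.append(min(s for s in ALLOWED_CLIP_SECONDS if s >= remainder))
    return plan


print(plan_clips(40))  # [15, 15, 10]
print(plan_clips(32))  # [15, 15, 5]
```

In the second case the plan totals 35 seconds, so the last clip would need about 3 seconds trimmed in an editor to hit the target.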

Current limitations to be aware of:

  • Character resemblance and voice matching can be inconsistent: faces and voices sometimes differ from the reference material

  • Complex action sequences with multiple characters (such as fight scenes) may produce visual artifacts and distortions

  • Anime-style video generation produces weaker visual quality compared to realistic styles

  • Features can behave inconsistently between generations; for example, output occasionally appears in a language other than the one requested

  • Unexpected elements or surreal outputs can appear, a common challenge in current text-to-video AI

Version History & Specs

| Version | Key Capabilities | Max Duration | Max Resolution | Audio Support |
|---|---|---|---|---|
| Wan 2.1 | Text-to-video, Image-to-video, Visual text generation | 5 seconds | 720P | No |
| Wan 2.2 | Improved efficiency, VACE integration, Open-source | 5 seconds | 720P | No |
| Wan 2.5 | Audio-visual sync introduced, Enhanced motion | 10 seconds | 1080P | Basic |
| Wan 2.6 | Multi-shot narratives, Character references, Custom personas | 15 seconds | 1080P | Improved native A/V |

Use Cases

For Marketers and Small Businesses

  • Quick Social Media Ads: Need a catchy 10-second video for Instagram? Just type, "A dynamic shot of our new sneaker splashing through a puddle, cinematic, high-energy," and get a professional-looking ad in minutes.

  • Product Visualizations: Create videos showing your product in any setting imaginable. "Our new coffee mug on a desk in a cozy, rain-swept Parisian cafe, steam rising."

For Educators and Students

  • Visualizing History: A teacher could generate a clip of "Roman soldiers marching through a forest, seen from a low angle" to make lessons more engaging.

  • Explaining Science: A student could create a video to explain a complex topic, like "An animated journey through a plant cell, showing the mitochondria at work."

For Artists and Independent Filmmakers

  • Rapid Prototyping: Quickly visualize a scene from your script to test if the mood and composition work, saving valuable time and resources.

  • Unique Visual Effects (VFX): Generate surreal, dream-like sequences or abstract background visuals that would be difficult or impossible to film in real life.

Advanced Prompting for Wan 2.6

Multi-Shot Storytelling Prompt Template

A cinematic [genre] scene.

Shot 1: [Wide/Medium/Close-up] shot, [describe scene, character, and action].

Shot 2: [Camera angle], [describe transition and new focus].

Shot 3: [Camera angle], [describe resolution or final moment].

Style: [realistic/cinematic/stylized]. Lighting: [natural/dramatic/soft].
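If you reuse this template often, filling it in programmatically keeps the shot numbering and closing style line consistent. Here is a minimal sketch; the `Shot` class and `build_multishot_prompt` function are hypothetical helpers written for illustration and are not part of any Wan or Somake API.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    framing: str      # e.g. "Wide shot", "Close-up", "Low angle"
    description: str  # scene, character, and action for this shot


def build_multishot_prompt(genre: str, shots: list[Shot],
                           style: str = "cinematic",
                           lighting: str = "natural") -> str:
    """Fill the multi-shot template with numbered shots."""
    if not shots:
        raise ValueError("at least one shot is required")
    lines = [f"A cinematic {genre} scene."]
    for number, shot in enumerate(shots, start=1):
        # Strip a trailing period so each shot line ends with exactly one.
        text = shot.description.rstrip(".")
        lines.append(f"Shot {number}: {shot.framing}, {text}.")
    lines.append(f"Style: {style}. Lighting: {lighting}.")
    return "\n".join(lines)


prompt = build_multishot_prompt(
    "noir",
    [
        Shot("Wide shot", "a detective walks down a rainy alley"),
        Shot("Close-up", "she lights a match."),
    ],
    lighting="dramatic",
)
print(prompt)
```

The result is plain text that you can paste into the prompt field as-is.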

Character Reference Best Practices

  • Use front-facing footage with clear lighting for character references

  • Record reference videos showing multiple angles when creating personal avatars

  • Use no more than three character references, the supported maximum, for best consistency

  • For voice matching, provide clear audio samples without background noise

  • Expect some variation in face and voice reproduction—plan for multiple generations

Scene Complexity Guidelines

  • Works well: Dialogue scenes, talking heads, single-character focus, simple interactions, conversational multi-character scenes

  • Use caution: Action sequences with multiple characters, fight choreography, rapid movement

  • Avoid or expect artifacts: Complex anime styles, highly dynamic group scenes
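One way to apply these guidelines before spending credits is a rough keyword check on the prompt. The sketch below is only a heuristic: the term lists are assumptions chosen to mirror the categories above, not rules used by Wan 2.6 itself, and `flag_risky_terms` is a hypothetical helper.

```python
# Assumed keyword lists that loosely mirror the guideline categories.
CAUTION_TERMS = ("fight", "battle", "chase", "sprint", "explosion")
ARTIFACT_TERMS = ("anime", "crowd", "group dance")


def flag_risky_terms(prompt: str) -> dict[str, list[str]]:
    """Report which assumed risk keywords appear in the prompt."""
    text = prompt.lower()
    return {
        "use_caution": [t for t in CAUTION_TERMS if t in text],
        "expect_artifacts": [t for t in ARTIFACT_TERMS if t in text],
    }


print(flag_risky_terms("Two samurai fight in an anime style"))
print(flag_risky_terms("Two friends talk over coffee"))
```

A substring match like this will miss synonyms and can over-match, so treat a hit as a prompt to simplify the scene rather than as a verdict.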

Prompt Expansion

Enable prompt expansion when your input is simple or you want richer visual detail. The system adds descriptive elements to improve composition, style consistency, and visual coherence in the output.

Troubleshooting Common Issues

Problem: Voice sounds robotic or unnatural → Solution: This is a current limitation of Wan 2.6. For projects requiring highly realistic voices, consider using the video output with separately generated or recorded audio.
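As a rough illustration of pairing the video with separately recorded audio, the sketch below builds an `ffmpeg` command that keeps the video stream untouched and swaps in a new audio track. It assumes `ffmpeg` is installed locally; the file names and the `build_audio_swap_command` helper are hypothetical.

```python
import shlex


def build_audio_swap_command(video: str, audio: str, output: str) -> list[str]:
    """Build an ffmpeg command that replaces a clip's audio track."""
    return [
        "ffmpeg",
        "-i", video,       # input 0: the generated clip
        "-i", audio,       # input 1: the separately recorded audio
        "-map", "0:v:0",   # take video from the first input
        "-map", "1:a:0",   # take audio from the second input
        "-c:v", "copy",    # copy the video stream without re-encoding
        "-c:a", "aac",     # encode the new audio as AAC
        "-shortest",       # stop at the end of the shorter stream
        output,
    ]


cmd = build_audio_swap_command("wan_clip.mp4", "voiceover.wav", "final.mp4")
print(shlex.join(cmd))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)`. Lip-sync will only hold if the replacement audio matches the timing of the original speech.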

Problem: Unexpected characters or surreal elements appear → Solution: AI artifacts are common in text-to-video generation. Simplify your prompt, reduce the number of characters or elements, and regenerate. Review outputs carefully before use.

Problem: Action scenes have visual distortions → Solution: Complex action sequences with multiple characters are a known weakness. Break dynamic scenes into simpler shots, focus on one or two characters per clip, and avoid choreographed fight sequences.

Problem: Anime-style output looks poor → Solution: Wan 2.6's anime generation is notably weak. For anime content, consider alternative models or use realistic style prompts instead.

Problem: Language mismatch in generated content → Solution: Some language inconsistencies may occur. Specify your target language clearly in the prompt and regenerate if output doesn't match expectations.

Why Choose Somake to Power Your AI Video Creations?

1. No Technical Skills Required

The intuitive interface lets anyone create professional visuals: just describe what you want and generate in seconds.

2. All-in-One Creative Suite

Handle both image and video generation on a single platform, streamlining your workflow from concept to final output.

3. Commercial Usage Rights

Paid subscribers get full commercial rights to their creations, making it easy to use outputs in ads, campaigns, and client projects.

FAQ

You do not need any special hardware or software. We manage all the complex processing on our servers, so all you need is a device with a web browser.

Any video you generate on our platform is yours to use. Your videos are suitable for commercial use in marketing campaigns, for content on your monetized YouTube channel, or for any other business purpose.

Wan 2.6 is an open-source AI video generation model developed by Alibaba that creates videos from text, images, or reference videos. It features multi-shot storytelling, native audio synchronization, and character consistency tools, with output up to 15 seconds at 1080P resolution.

Audio quality has improved significantly from Wan 2.5 and approaches the quality of premium models, though voices can still sound noticeably robotic compared to Veo 3 and Sora 2.
