Somake

Wan

Wan 2.6 transforms text and images into videos with lip-sync, multi-character dialogue, and custom personas.

What Is Wan?

Wan is an open-source AI video generation model series developed by Alibaba Group's Tongyi Lab. The Wan family represents Alibaba's flagship effort in multimodal AI, designed to transform text prompts, images, and reference videos into high-quality video content with realistic motion and visual consistency.

Current Version: Wan 2.6 (December 2025)

Wan 2.6 — Latest Updates

Last updated: December 2025

Wan 2.6 launched shortly after version 2.5, focusing on tighter multimodal integration and expanded creative controls. This release addresses key limitations in earlier versions while introducing features designed for more complex content creation workflows.

Key improvements in Wan 2.6:

  • Native audio generation upgraded: Audio sounds noticeably more natural than in Wan 2.5, though voice realism still trails premium competitors such as Veo 3 and Sora 2

  • Extended duration: Support for up to 15-second clips at 1080P, with the ability to combine multiple clips for longer sequences

  • Character reference system: Upload up to three character references from video to maintain consistency across generations (Note: This feature is not yet available on Somake)

  • Personal avatar creation: Record your own face from multiple angles and voice samples to create a consistent AI persona (Note: This feature is not yet available on Somake)

  • Multi-character dialogue: Clean handling of conversations between multiple characters without speech overlap

  • Environment and wardrobe control: Change character clothing and scene environments through prompts

  • Fluid motion quality: Video output features convincing camera effects like zoom and blur with smooth movement

Current limitations to be aware of:

  • Character resemblance and voice matching can be inconsistent: faces and voices sometimes differ from the reference material

  • Complex action sequences with multiple characters (such as fight scenes) may produce visual artifacts and distortions

  • Anime-style video generation produces weaker visual quality compared to realistic styles

  • Features can behave inconsistently; for example, output is occasionally generated in a language other than the one requested

  • Unexpected elements or surreal outputs can appear, a common challenge in current text-to-video AI

Version History & Specs

Version | Key Capabilities | Max Duration | Max Resolution | Audio Support
Wan 2.1 | Text-to-video, Image-to-video, Visual text generation | 5 seconds | 720P | No
Wan 2.2 | Improved efficiency, VACE integration, Open-source | 5 seconds | 720P | No
Wan 2.5 | Audio-visual sync introduced, Enhanced motion | 10 seconds | 1080P | Basic
Wan 2.6 | Multi-shot narratives, Character references, Custom personas | 15 seconds | 1080P | Improved native A/V
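
Because longer sequences are built by combining clips, it can help to work out in advance how many clips a target running time needs. The sketch below only does that arithmetic, using the maximum durations from the table above. The function name `plan_clips` and the even-split strategy are illustrative assumptions, not part of Wan or Somake, and the durations the interface actually offers may be limited to specific values.

```python
import math

# Maximum clip length in seconds per version, copied from the table above.
MAX_DURATION_SECONDS = {"2.1": 5, "2.2": 5, "2.5": 10, "2.6": 15}

def plan_clips(total_seconds, version="2.6"):
    """Split a whole-second running time into clips no longer than the version's maximum.

    Hypothetical helper for planning only; it does not call any Wan or Somake API.
    """
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    limit = MAX_DURATION_SECONDS[version]
    count = math.ceil(total_seconds / limit)
    base, extra = divmod(total_seconds, count)
    # Spread the running time evenly; the first `extra` clips get one more second.
    return [base + 1 if i < extra else base for i in range(count)]

print(plan_clips(40))         # [14, 13, 13]
print(plan_clips(40, "2.5"))  # [10, 10, 10, 10]
```

Splitting evenly, rather than filling each clip to the maximum and leaving a short remainder, keeps the shots at similar lengths, which tends to make the joins easier to edit.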

Use Cases

For Marketers and Small Businesses

  • Quick Social Media Ads: Need a catchy 10-second video for Instagram? Just type, "A dynamic shot of our new sneaker splashing through a puddle, cinematic, high-energy," and get a professional-looking ad in minutes.

  • Product Visualizations: Create videos showing your product in any setting imaginable. "Our new coffee mug on a desk in a cozy, rain-swept Parisian cafe, steam rising."

For Educators and Students

  • Visualizing History: A teacher could generate a clip of "Roman soldiers marching through a forest, seen from a low angle" to make lessons more engaging.

  • Explaining Science: A student could create a video to explain a complex topic, like "An animated journey through a plant cell, showing the mitochondria at work."

For Artists and Independent Filmmakers

  • Rapid Prototyping: Quickly visualize a scene from your script to test if the mood and composition work, saving valuable time and resources.

  • Unique Visual Effects (VFX): Generate surreal, dream-like sequences or abstract background visuals that would be difficult or impossible to film in real life.

Advanced Prompting for Wan 2.6

Multi-Shot Storytelling Prompt Template

A cinematic [genre] scene.

Shot 1: [Wide/Medium/Close-up] shot, [describe scene, character, and action].

Shot 2: [Camera angle], [describe transition and new focus].

Shot 3: [Camera angle], [describe resolution or final moment].

Style: [realistic/cinematic/stylized]. Lighting: [natural/dramatic/soft].
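
If you reuse this template often, filling it in programmatically keeps the shot numbering and layout consistent. The following is a minimal sketch, assuming you paste the resulting text into the prompt box yourself. The function `build_multi_shot_prompt` and its parameters are hypothetical names chosen for illustration, not part of any Wan or Somake API.

```python
def build_multi_shot_prompt(genre, shots, style="cinematic", lighting="natural"):
    """Fill the multi-shot template from a genre and a list of (framing, description) pairs.

    Hypothetical helper: it only assembles text and does not contact any service.
    """
    if not shots:
        raise ValueError("at least one shot is required")
    lines = [f"A cinematic {genre} scene."]
    for number, (framing, description) in enumerate(shots, start=1):
        lines.append(f"Shot {number}: {framing}, {description}.")
    lines.append(f"Style: {style}. Lighting: {lighting}.")
    # Blank lines between entries mirror the layout of the template above.
    return "\n\n".join(lines)

prompt = build_multi_shot_prompt(
    "noir",
    [
        ("Wide shot", "a detective walks down a rain-soaked alley"),
        ("Low-angle close-up", "he stops and lights a cigarette"),
        ("Over-the-shoulder shot", "a figure steps out of the shadows"),
    ],
    style="cinematic",
    lighting="dramatic",
)
print(prompt)
```

Keeping each shot as a short (framing, description) pair makes it easy to reorder shots or drop one when a generation comes back with artifacts.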

Character Reference Best Practices

  • Use front-facing footage with clear lighting for character references

  • Record reference videos showing multiple angles when creating personal avatars

  • Use no more than three character references for best consistency

  • For voice matching, provide clear audio samples without background noise

  • Expect some variation in face and voice reproduction—plan for multiple generations

Scene Complexity Guidelines

  • Works well: Dialogue scenes, talking heads, single-character focus, simple interactions, conversational multi-character scenes

  • Use caution: Action sequences with multiple characters, fight choreography, rapid movement

  • Avoid or expect artifacts: Complex anime styles, highly dynamic group scenes
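
The guidelines above can be turned into a rough pre-flight check that flags a prompt before you spend a generation on it. This is only a sketch: the keyword lists are hypothetical and loosely derived from the categories above, the function name `review_prompt` is an assumption, and simple substring matching will miss many risky prompts and misjudge some safe ones.

```python
# Hypothetical keyword lists, loosely based on the guidelines above; adjust for your own prompts.
CAUTION_TERMS = ("fight", "battle", "chase", "action sequence", "rapid")
ARTIFACT_TERMS = ("anime", "crowd", "group dance")

def review_prompt(prompt):
    """Return (level, matched_terms), where level is 'ok', 'caution', or 'expect artifacts'."""
    text = prompt.lower()
    artifacts = [term for term in ARTIFACT_TERMS if term in text]
    if artifacts:
        return "expect artifacts", artifacts
    cautions = [term for term in CAUTION_TERMS if term in text]
    if cautions:
        return "caution", cautions
    return "ok", []

print(review_prompt("Two friends talk over coffee in a quiet cafe"))
print(review_prompt("A sword fight on a rooftop at night"))
print(review_prompt("Anime heroes run through a crowd"))
```

A flagged prompt is not necessarily unusable; the result is a reminder to simplify the scene or to budget for extra generations.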

Prompt Expansion

Enable prompt expansion when your input is simple or you want richer visual detail. The system adds descriptive elements to improve composition, style consistency, and visual coherence in the output.

Troubleshooting Common Issues

Problem: Voice sounds robotic or unnatural → Solution: This is a current limitation of Wan 2.6. For projects requiring highly realistic voices, consider using the video output with separately generated or recorded audio.

Problem: Unexpected characters or surreal elements appear → Solution: AI artifacts are common in text-to-video generation. Simplify your prompt, reduce the number of characters or elements, and regenerate. Review outputs carefully before use.

Problem: Action scenes have visual distortions → Solution: Complex action sequences with multiple characters are a known weakness. Break dynamic scenes into simpler shots, focus on one or two characters per clip, and avoid choreographed fight sequences.

Problem: Anime-style output looks poor → Solution: Wan 2.6's anime generation is notably weak. For anime content, consider alternative models or use realistic style prompts instead.

Problem: Language mismatch in generated content → Solution: Some language inconsistencies may occur. Specify your target language clearly in the prompt and regenerate if output doesn't match expectations.

Why Choose Somake to Power Your AI Video Creations?

1. No Technical Skills Required

The intuitive interface lets anyone create professional visuals: just describe what you want and generate in seconds.

2. All-in-One Creative Suite

Handle both image and video generation on a single platform, streamlining your workflow from concept to final output.

3. Commercial Usage Rights

Paid subscribers get full commercial rights to their creations, making it easy to use outputs in ads, campaigns, and client projects.

FAQ