Wan 2.1 (Text-to-Video)
A state-of-the-art video foundation model designed for advanced generative video tasks. Supporting Text-to-Video (T2V), it incorporates groundbreaking innovations to deliver high-quality outputs with exceptional computational efficiency.
Key Features:
Visual text generation: Generates text in both Chinese and English within videos.
Output Quality: Produces videos at resolutions up to 720P at a frame rate of approximately 16 FPS.
If you don’t have an API key for the AI/ML API yet, feel free to use our .
Generating a video using this model involves sequentially calling two endpoints:
The first one is for creating and sending a video generation task to the server (returns a generation ID).
The second one is for requesting the generated video from the server using the generation ID received from the first endpoint.
Below, you can find two corresponding API schemas and examples for both endpoint calls.
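The two-step flow described above can be sketched in Python using only the standard library. The base URL, model identifier, and response field names (`id`, `status`) below are assumptions for illustration; check the API schemas that follow for the exact values.

```python
import json
import time
import urllib.request

API_BASE = "https://api.aimlapi.com/v2/generate/video/alibaba/generation"  # assumed URL
API_KEY = "<YOUR_AIMLAPI_KEY>"


def auth_headers(api_key: str) -> dict:
    """Bearer-token headers for the AI/ML API."""
    return {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}


def create_generation(prompt: str) -> str:
    """Step 1: submit the T2V task; the server replies with a generation ID."""
    body = json.dumps({
        "model": "wan/v2.1/1.3b/text-to-video",  # assumed model identifier
        "prompt": prompt,
    }).encode()
    req = urllib.request.Request(
        API_BASE, data=body, headers=auth_headers(API_KEY), method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]  # assumed field name for the generation ID


def fetch_generation(generation_id: str, interval: float = 10.0) -> dict:
    """Step 2: poll with the generation ID until the video is ready."""
    url = f"{API_BASE}?generation_id={generation_id}"
    while True:
        req = urllib.request.Request(url, headers=auth_headers(API_KEY))
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        if data.get("status") in ("completed", "failed"):  # assumed status values
            return data
        time.sleep(interval)  # wait before polling again
```

Polling with a fixed interval keeps the example simple; a production client would typically cap the number of attempts or add backoff.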
This endpoint creates and sends a video generation task to the server — and returns a generation ID.
This endpoint lets you request the generated video from the server using the generation ID received from the first endpoint.
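Once the second endpoint reports a finished generation, the video itself can be pulled from the response. The `status` and `video.url` field names in this sketch are assumptions; adjust them to the actual response schema.

```python
import urllib.request


def video_url(result: dict) -> str:
    """Extract the download URL from a completed generation response.
    Field names (`status`, `video.url`) are assumptions for illustration."""
    if result.get("status") != "completed":
        raise ValueError(f"generation not finished: {result.get('status')}")
    return result["video"]["url"]


def save_video(result: dict, path: str = "output.mp4") -> str:
    """Download the generated video to a local file and return the path."""
    urllib.request.urlretrieve(video_url(result), path)
    return path
```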
The text prompt to guide video generation.
Mona Lisa puts on glasses with her hands
The negative prompt to use. Use it to describe details you don't want in the video: colors, objects, scenery, and even small details (e.g., moustache, blurry, low resolution).
Random seed for reproducibility. If None, a random seed is chosen.
Aspect ratio of the generated video (16:9 or 9:16).
16:9
Number of inference steps for sampling. Higher values give better quality but take longer.
30
Classifier-free guidance scale. Controls prompt adherence / creativity.
5
Noise schedule shift parameter. Affects temporal dynamics.
5
The sampler to use for generation.
unipc
If set to true, the safety checker will be enabled.
Whether to enable prompt expansion.