Wan 2.6 (Image-to-Video)

This documentation is valid for the following list of our models:

  • alibaba/wan-2-6-i2v

This model transforms images into dynamic video while preserving character identity, enabling consistent motion and synchronized audio. Compared to earlier versions, Wan 2.6 offers stronger instruction following, higher visual fidelity, and significantly enhanced sound generation.

Set up your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.

How to Make a Call

Step-by-Step Instructions

Generating a video using this model involves sequentially calling two endpoints:

  • The first one is for creating and sending a video generation task to the server (returns a generation ID).

  • The second one is for requesting the generated video from the server using the generation ID received from the first endpoint. Below, you can find both corresponding API schemas.

API Schemas

Create a video generation task and send it to the server

POST /v2/video/generations

Body

model (string · enum, required)

Possible values: alibaba/wan-2-6-i2v

prompt (string, required)

The text description of the scene, subject, or action to generate in the video.

image_url (string · uri, required)

A direct link to an online image or a Base64-encoded local image that will serve as the visual base or the first frame for the video.

audio_url (string · uri, optional)

The URL of the audio file. The model will use this audio to generate the video.

resolution (string · enum, optional)

The resolution of the output video; the value refers to the length of the short side of the video frame.

Default: 1080p

duration (integer · enum, optional)

The length of the output video in seconds.

Default: 10

negative_prompt (string, optional)

The description of elements to avoid in the generated video.

shot_type (string · enum, optional)

Specifies whether the generated video consists of a single continuous shot or multiple switched shots. This parameter takes effect only when enhance_prompt is set to true.

  • single: (default) Outputs a single-shot video.
  • multi: Outputs a multi-shot video.

Default: single
generate_audio (boolean, optional)

Specifies whether to automatically add audio to the generated video. This parameter takes effect only when audio_url is not provided.

Default: true

seed (integer, optional)

Varying the seed is a way to get different results for otherwise identical request parameters. Reusing the same seed with an identical request produces similar results. If unspecified, a random seed is chosen.

enhance_prompt (boolean, optional)

Whether to enable prompt expansion.

Default: true
Responses

200: Success (application/json)
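
The request above can be sketched in Python using only the standard library. The base URL (https://api.aimlapi.com) and the name of the response field holding the generation ID are assumptions here; verify both against your account documentation and a real response.

```python
import json
import urllib.request

API_BASE = "https://api.aimlapi.com"  # assumed base URL; verify in your dashboard


def build_i2v_payload(prompt: str, image_url: str, **options) -> dict:
    """Assemble the body for POST /v2/video/generations.

    Only model, prompt, and image_url are required; optional fields
    (resolution, duration, seed, ...) are passed through unchanged.
    """
    payload = {
        "model": "alibaba/wan-2-6-i2v",
        "prompt": prompt,
        "image_url": image_url,
    }
    payload.update(options)
    return payload


def create_generation(api_key: str, payload: dict) -> str:
    req = urllib.request.Request(
        f"{API_BASE}/v2/video/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # The exact field carrying the generation ID is an assumption;
        # inspect the raw JSON response to confirm.
        return json.load(resp)["id"]
```

Any optional parameter from the schema above (resolution, duration, negative_prompt, shot_type, generate_audio, seed, enhance_prompt) can be passed as a keyword argument to build_i2v_payload.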

Retrieve the generated video from the server

After sending a request for video generation, this task is added to the queue. This endpoint lets you check the status of a video generation task using its generation_id, obtained from the endpoint described above. If the video generation task status is completed, the response will include the final result — with the generated video URL and additional metadata.

GET /v2/video/generations

Authorizations

Authorization (string, required)

Bearer key

Query parameters

generation_id (string, required)

Example: <REPLACE_WITH_YOUR_GENERATION_ID>

Responses

200: Success (application/json)
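
A corresponding sketch of the status check, again with the standard library; the base URL is an assumption, and the shape of the returned JSON should be confirmed against a real response.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.aimlapi.com"  # assumed base URL


def status_url(generation_id: str) -> str:
    """Build the retrieval URL with generation_id as a query parameter."""
    query = urllib.parse.urlencode({"generation_id": generation_id})
    return f"{API_BASE}/v2/video/generations?{query}"


def get_generation(api_key: str, generation_id: str) -> dict:
    req = urllib.request.Request(
        status_url(generation_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        # Includes the task status and, once completed, the video URL.
        return json.load(resp)
```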

Code Example

The code below creates a video generation task, then polls the server every 15 seconds until it receives the video URL.
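
A minimal end-to-end sketch of that flow, using only the Python standard library. The base URL, the environment-variable name, the id response field, and the completed/failed status values are all assumptions; confirm them against a real response before relying on this.

```python
import json
import os
import time
import urllib.parse
import urllib.request

API_BASE = "https://api.aimlapi.com"  # assumed base URL
API_KEY = os.environ.get("AIMLAPI_API_KEY", "")  # hypothetical env-var name


def _call(url, method="GET", body=None):
    """Send an authenticated JSON request and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=None if body is None else json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def is_finished(result: dict) -> bool:
    # Terminal status values are assumptions; inspect a real response.
    return result.get("status") in ("completed", "failed")


def main():
    # 1. Create the video generation task.
    task = _call(f"{API_BASE}/v2/video/generations", "POST", {
        "model": "alibaba/wan-2-6-i2v",
        "prompt": "A corgi running along a beach at sunset",
        "image_url": "https://example.com/corgi.png",  # placeholder image
        "duration": 10,
    })
    gen_id = task["id"]  # ID field name assumed

    # 2. Poll every 15 seconds until the task reaches a terminal status.
    while True:
        query = urllib.parse.urlencode({"generation_id": gen_id})
        result = _call(f"{API_BASE}/v2/video/generations?{query}")
        if is_finished(result):
            print(json.dumps(result, indent=2))  # contains the video URL when completed
            return
        time.sleep(15)


if __name__ == "__main__":
    if API_KEY:  # run only when a key is configured
        main()
```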

Response

Processing time: ~ 2 min 52 sec.

Generated video (1920x1080, with sound):
