Wan 2.2 vace fun 14b pose (Image-to-Video)

This documentation applies to the following model:

  • alibaba/wan2.2-vace-fun-a14b-pose

Vace is a video generation model that combines a source image, mask, and reference video to produce prompted videos with precise source control.

Set Up Your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.

How to Make a Call

Step-by-Step Instructions

Generating a video using this model involves sequentially calling two endpoints:

  • The first one is for creating and sending a video generation task to the server (returns a generation ID).

  • The second one is for requesting the generated video from the server using the generation ID received from the first endpoint.

Below, you can find two corresponding API schemas and examples for both endpoint calls.

Full Example: Generating and Retrieving the Video From the Server

The code below creates a video generation task, then polls the server every 10 seconds until the task is no longer pending or the 600-second timeout expires. On success, the final response contains the video URL.

import requests
import time

# replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key
api_key = "<YOUR_AIMLAPI_KEY>"
base_url = "https://api.aimlapi.com/v2"


# Creating and sending a video generation task to the server
def generate_video():
    url = f"{base_url}/video/generations"
    headers = {
        "Authorization": f"Bearer {api_key}", 
    }

    data = {
        "model": "alibaba/wan2.2-vace-fun-a14b-pose",
        "video_url": "https://storage.googleapis.com/falserverless/example_inputs/wan_animate_input_video.mp4",
        "prompt": "A lone woman strides through the neon-drenched streets of Tokyo at night.  Her crimson dress, a vibrant splash of color against the deep blues and blacks of the cityscape, flows slightly with each step. A tailored black jacket, crisp and elegant, contrasts sharply with the dress's rich texture. Medium shot:  The city hums around her, blurred lights creating streaks of color in the background. Close-up:  The fabric of her dress catches the streetlight's glow, revealing a subtle silk sheen and the intricate stitching at the hem. Her black jacket’s subtle texture is visible – a fine wool perhaps, with a matte finish. The overall mood is one of quiet confidence and mystery, a vibrant woman navigating a bustling, nocturnal landscape. High resolution 4k.", 
        "resolution": "720p",
    }
 
    response = requests.post(url, json=data, headers=headers)
    
    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
    else:
        response_data = response.json()
        print(response_data)
        return response_data
    

# Requesting the result of the task from the server using the generation_id
def get_video(gen_id):
    url = f"{base_url}/video/generations"
    params = {
        "generation_id": gen_id,
    }
    
    headers = {
        "Authorization": f"Bearer {api_key}", 
        "Content-Type": "application/json"
        }

    response = requests.get(url, params=params, headers=headers)
    return response.json()


def main():
    # Run video generation and get a task ID
    gen_response = generate_video()
    if gen_response is None:
        # generate_video() has already printed the error details
        return None
    gen_id = gen_response.get("id")
    print("Generation ID:  ", gen_id)

    # Trying to retrieve the video from the server every 10 sec
    if gen_id:
        start_time = time.time()

        timeout = 600
        while time.time() - start_time < timeout:
            response_data = get_video(gen_id)

            if response_data is None:
                print("Error: No response from API")
                break
        
            status = response_data.get("status")
            print("Status:", status)

            if status in ("waiting", "active", "queued", "generating"):
                print("Still waiting... Checking again in 10 seconds.")
                time.sleep(10)
            else:
                print("Processing complete:\n", response_data)
                return response_data

        print("Timeout reached. Stopping.")
        return None


if __name__ == "__main__":
    main()
Response
Generation ID:   b5592d70-dd31-4e5a-bc5c-5063660c001b:alibaba/wan2.2-vace-fun-a14b-pose
Status: generating
Still waiting... Checking again in 10 seconds.
Status: generating
Still waiting... Checking again in 10 seconds.
Status: generating
Still waiting... Checking again in 10 seconds.
Status: generating
Still waiting... Checking again in 10 seconds.
Status: generating
Still waiting... Checking again in 10 seconds.
Status: completed
Processing complete:
 {"id":"b5592d70-dd31-4e5a-bc5c-5063660c001b:alibaba/wan2.2-vace-fun-a14b-pose","status":"completed","video":{"url":"https://v3b.fal.media/files/b/rabbit/L3U6CofKB0xe_fgCTKj4G.mp4"}}
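
Once the status is completed, the video URL sits under video.url in the response, as the sample output above shows. The following is a minimal sketch of pulling that URL out and saving the file, assuming the response keeps the shape shown in the sample. The helper names extract_video_url and download_video are hypothetical and not part of the API; the download uses only the Python standard library so the sketch has no extra dependencies.

```python
import shutil
import urllib.request


def extract_video_url(response_data):
    """Return the video URL from a completed task response, or None."""
    if not isinstance(response_data, dict):
        return None
    if response_data.get("status") != "completed":
        return None
    # Assumes the {"video": {"url": ...}} shape seen in the sample output
    video = response_data.get("video") or {}
    return video.get("url")


def download_video(url, path="output.mp4"):
    """Save the generated video to a local file (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=60) as response, open(path, "wb") as f:
        shutil.copyfileobj(response, f)
    return path
```

In the full example, main() returns the final response, so extract_video_url(main()) would yield the URL to pass to download_video, or None if the task did not complete.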

API Schemas

Video Generation

This endpoint creates a video generation task on the server and returns a generation ID.

POST /v2/video/generations
Body
model · enum · Required

Possible values: alibaba/wan2.2-vace-fun-a14b-pose

video_url · string · uri · Required

URL of the source video file. Required for the pose task.

prompt · string · Required

The text description of the scene, subject, or action to generate in the video.

negative_prompt · string · Optional

The description of elements to avoid in the generated video.

Default: letterboxing, borders, black bars, bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards

match_input_num_frames · boolean · Optional

Whether to match the input video's number of frames.

num_frames · integer · min: 81 · max: 241 · Optional

Number of frames to generate. Must be between 81 and 241 (inclusive).

Default: 81

match_input_frames_per_second · boolean · Optional

Whether to match the input video's frames per second (FPS).

frames_per_second · integer · min: 5 · max: 30 · Optional

Frames per second of the generated video. Must be between 5 and 30.

Default: 16

seed · integer · Optional

Changing the seed produces different results for otherwise identical request parameters. Reusing the same seed with an identical request produces similar results. If unspecified, a random number is chosen.

resolution · string · enum · Optional

An enumeration where the short side of the video frame determines the resolution.

Default: auto

aspect_ratio · string · enum · Optional

The aspect ratio of the generated video.

Default: auto

num_inference_steps · integer · Optional

Number of inference steps for sampling. Higher values give better quality but take longer.

Default: 30

guidance_scale · number · Optional

Classifier-free guidance scale. Controls prompt adherence / creativity.

Default: 5

shift · number · Optional

Noise schedule shift parameter. Affects temporal dynamics.

Default: 5

image_list · string · uri[] · Optional

Array of image URLs (2-4 images) for multi-image-to-video generation.

image_url · string · uri · Optional

URL of the image to be used as the first frame of the video.

last_image_url · string · uri · Optional

A direct link to an online image or a Base64-encoded local image to be used as the last frame of the video.

enable_safety_checker · boolean · Optional

If set to true, the safety checker will be enabled.

enable_prompt_expansion · boolean · Optional

Whether to enable prompt expansion.

preprocess · boolean · Optional

Whether to preprocess the input video.

acceleration · string · enum · Optional

Acceleration to use for inference. Either none or regular.

Default: regular

video_quality · string · enum · Optional

The quality of the generated video.

Default: high

video_write_mode · string · enum · Optional

The method used to write the video. One of fast, balanced, or small.

Default: balanced

num_interpolated_frames · integer · Optional

Number of frames to interpolate between the original frames.

temporal_downsample_factor · integer · Optional

Temporal downsample factor for the video.

enable_auto_downsample · boolean · Optional

Whether to automatically downsample the video's frame rate (see auto_downsample_min_fps).

auto_downsample_min_fps · number · Optional

The minimum frames per second to downsample the video to.

Default: 15

interpolator_model · string · enum · Optional

The model to use for interpolation. Either rife or film.

Default: film

sync_mode · boolean · Optional

The synchronization mode for audio and video. Loose or tight are available.
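
The schema above lists three required fields (model, video_url, prompt) and integer ranges for num_frames (81 to 241) and frames_per_second (5 to 30). As a rough sketch, those constraints can be checked on the client before sending a request. build_payload and estimated_duration_seconds are hypothetical helpers, and the duration figure is plain arithmetic (frames divided by FPS) that assumes the limits are inclusive and ignores interpolation, downsampling, and the match_input_* options.

```python
MODEL = "alibaba/wan2.2-vace-fun-a14b-pose"


def build_payload(video_url, prompt, num_frames=81, frames_per_second=16, **extra):
    """Assemble a request body, checking the documented integer ranges first."""
    if not video_url or not prompt:
        raise ValueError("video_url and prompt are required")
    if not 81 <= num_frames <= 241:
        raise ValueError("num_frames must be between 81 and 241 (inclusive)")
    if not 5 <= frames_per_second <= 30:
        raise ValueError("frames_per_second must be between 5 and 30")
    payload = {
        "model": MODEL,
        "video_url": video_url,
        "prompt": prompt,
        "num_frames": num_frames,
        "frames_per_second": frames_per_second,
    }
    payload.update(extra)  # optional fields such as seed or resolution
    return payload


def estimated_duration_seconds(payload):
    """Approximate clip length: frame count divided by frames per second."""
    return payload["num_frames"] / payload["frames_per_second"]
```

With the defaults (81 frames at 16 FPS) the estimate comes to about 5.06 seconds. The returned dictionary can be passed as the json argument of requests.post, as in the full example.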

Responses
201 Success

No content

post
POST /v2/video/generations HTTP/1.1
Host: api.aimlapi.com
Content-Type: application/json
Accept: */*
Content-Length: 1189

{
  "model": "alibaba/wan2.2-vace-fun-a14b-pose",
  "video_url": "https://example.com",
  "prompt": "text",
  "negative_prompt": "letterboxing, borders, black bars, bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards",
  "match_input_num_frames": true,
  "num_frames": 81,
  "match_input_frames_per_second": true,
  "frames_per_second": 16,
  "seed": 1,
  "resolution": "auto",
  "aspect_ratio": "auto",
  "num_inference_steps": 30,
  "guidance_scale": 5,
  "shift": 5,
  "image_list": [
    "https://example.com"
  ],
  "image_url": "https://example.com",
  "last_image_url": "https://example.com",
  "enable_safety_checker": true,
  "enable_prompt_expansion": true,
  "preprocess": true,
  "acceleration": "regular",
  "video_quality": "high",
  "video_write_mode": "balanced",
  "num_interpolated_frames": 1,
  "temporal_downsample_factor": 1,
  "enable_auto_downsample": true,
  "auto_downsample_min_fps": 15,
  "interpolator_model": "film",
  "sync_mode": true
}

Fetch the video

After you send a video generation request, the task is added to the queue. This endpoint lets you check the status of the task using its id, obtained from the endpoint described above. Once the task status is completed, the response includes the final result: the generated video URL and additional metadata.

GET /v2/video/generations
Query parameters
generation_id · string · Required
Responses
200 Success

No content

get
GET /v2/video/generations?generation_id=text HTTP/1.1
Host: api.aimlapi.com
Accept: */*
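
In the sample output earlier on this page, the generation ID contains a colon and a slash, both of which are reserved characters in a URL. Passing the ID through the params argument of requests, as the full example does, percent-encodes it automatically. If you build the query string by hand, a sketch like the following does the same with the standard library. build_status_url is a hypothetical helper, and whether the server also accepts the unencoded form is not stated here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.aimlapi.com/v2"


def build_status_url(generation_id):
    """Return the status-check URL with the generation ID percent-encoded."""
    query = urlencode({"generation_id": generation_id})
    return f"{BASE_URL}/video/generations?{query}"
```

For the ID from the sample output, the colon becomes %3A and the slash becomes %2F in the resulting URL.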
