Wan 2.2 VACE Fun A14B Inpainting (Image-to-Video)
VACE is a video generation model that combines a source image, a mask, and a reference video to produce prompted videos with precise control over the source.
Setup your API Key
If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.
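The examples below hardcode the key for brevity; in real projects it is safer to read it from an environment variable so it never lands in source control. A minimal sketch (the variable name `AIMLAPI_API_KEY` is our own choice, not one the API requires):

```python
import os

# Read the API key from an environment variable; fall back to a placeholder
# so a missing variable fails loudly at request time, not silently here.
api_key = os.environ.get("AIMLAPI_API_KEY", "<YOUR_AIMLAPI_KEY>")
```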
How to Make a Call
Full Example: Generating and Retrieving the Video From the Server
The code below creates a video generation task, then polls the server every 10 seconds until the video URL is returned or a 10-minute timeout is reached.
import requests
import time

# Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key
api_key = "<YOUR_AIMLAPI_KEY>"
base_url = "https://api.aimlapi.com/v2"


# Creating and sending a video generation task to the server
def generate_video():
    url = f"{base_url}/video/generations"
    headers = {
        "Authorization": f"Bearer {api_key}",
    }
    data = {
        "model": "alibaba/wan2.2-vace-fun-a14b-inpainting",
        "video_url": "https://storage.googleapis.com/falserverless/example_inputs/wan_animate_input_video.mp4",
        "prompt": "A lone woman strides through the neon-drenched streets of Tokyo at night. Her crimson dress, a vibrant splash of color against the deep blues and blacks of the cityscape, flows slightly with each step. A tailored black jacket, crisp and elegant, contrasts sharply with the dress's rich texture. Medium shot: The city hums around her, blurred lights creating streaks of color in the background. Close-up: The fabric of her dress catches the streetlight's glow, revealing a subtle silk sheen and the intricate stitching at the hem. Her black jacket’s subtle texture is visible – a fine wool perhaps, with a matte finish. The overall mood is one of quiet confidence and mystery, a vibrant woman navigating a bustling, nocturnal landscape. High resolution 4k.",
        "resolution": "720p",
    }

    response = requests.post(url, json=data, headers=headers)
    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
    else:
        response_data = response.json()
        print(response_data)
        return response_data


# Requesting the result of the task from the server using the generation_id
def get_video(gen_id):
    url = f"{base_url}/video/generations"
    params = {
        "generation_id": gen_id,
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

    response = requests.get(url, params=params, headers=headers)
    return response.json()


def main():
    # Running video generation and getting a task id
    gen_response = generate_video()
    gen_id = gen_response.get("id") if gen_response else None
    print("Generation ID:", gen_id)

    # Trying to retrieve the video from the server every 10 sec
    if gen_id:
        start_time = time.time()
        timeout = 600
        while time.time() - start_time < timeout:
            response_data = get_video(gen_id)
            if response_data is None:
                print("Error: No response from API")
                break
            status = response_data.get("status")
            print("Status:", status)
            if status in ("waiting", "active", "queued", "generating"):
                print("Still waiting... Checking again in 10 seconds.")
                time.sleep(10)
            else:
                print("Processing complete:\n", response_data)
                return response_data
        print("Timeout reached. Stopping.")
        return None


if __name__ == "__main__":
    main()
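Once the task leaves the queue, the final response contains the URL of the generated file. The exact payload shape is not documented above, so the helper below probes a few common shapes (a top-level `url`, or a nested `video.url`); these field names are assumptions, so inspect the response your script prints and adjust accordingly.

```python
def extract_video_url(response_data):
    """Best-effort lookup of the generated file URL in a completed response.

    The field names checked here are assumptions; verify them against the
    payload printed by the polling loop above.
    """
    video = response_data.get("video")
    if isinstance(video, dict) and "url" in video:
        return video["url"]
    if isinstance(video, str):
        return video
    return response_data.get("url")
```

The returned URL can then be downloaded, for example with `requests.get(url, stream=True)`, writing the response to disk chunk by chunk.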
API Schemas
Video Generation
This endpoint creates and sends a video generation task to the server — and returns a generation ID.
video_url: URL to the source video file. Required for inpainting.
mask_video_url: URL to the source mask file. Required for inpainting.
mask_image_url: URL to the guiding mask file. If provided, the model will use this mask as a reference to create a masked video using salient mask tracking. Ignored if mask_video_url is provided.
prompt: The text description of the scene, subject, or action to generate in the video.
negative_prompt: The description of elements to avoid in the generated video. Default: "letterboxing, borders, black bars, bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards"
match_input_num_frames: Whether to match the input video's number of frames.
num_frames: Number of frames to generate. Must be between 81 and 241 (inclusive). Default: 81
match_input_frames_per_second: Whether to match the input video's frames per second (FPS).
frames_per_second: Frames per second of the generated video. Must be between 5 and 30. Default: 16
seed: Varying the seed integer is a way to get different results for the same other request parameters. Using the same value for an identical request will produce similar results. If unspecified, a random number is chosen.
resolution: An enumeration where the short side of the video frame determines the resolution. Default: auto
aspect_ratio: The aspect ratio of the generated video. Default: auto
num_inference_steps: Number of inference steps for sampling. Higher values give better quality but take longer. Default: 30
guidance_scale: Classifier-free guidance scale. Controls prompt adherence / creativity. Default: 5
shift: Noise schedule shift parameter. Affects temporal dynamics. Default: 5
image_list: Array of image URLs (2-4 images) for multi-image-to-video generation.
image_url: URL of the image to be used as the first frame of the video.
last_image_url: A direct link to an online image or a Base64-encoded local image to be used as the last frame of the video.
enable_safety_checker: If set to true, the safety checker will be enabled.
enable_prompt_expansion: Whether to enable prompt expansion.
preprocess: Whether to preprocess the input video.
acceleration: Acceleration to use for inference. none or regular are available. Default: regular
video_quality: The quality of the generated video. Default: high
video_write_mode: The method used to write the video. fast, balanced, or small are available. Default: balanced
num_interpolated_frames: Number of frames to interpolate between the original frames.
temporal_downsample_factor: Temporal downsample factor for the video.
enable_auto_downsample: Whether to automatically downsample the video.
auto_downsample_min_fps: The minimum frames per second to downsample the video to. Default: 15
interpolator_model: The model to use for interpolation. rife or film are available. Default: film
sync_mode: The synchronization mode for audio and video. loose or tight are available.
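Any of the parameters above can be added to the request body in the Python example. A sketch of a payload that pins the frame count, FPS, and seed (the URLs are placeholders, not working inputs):

```python
# Hypothetical request body combining several optional schema parameters.
data = {
    "model": "alibaba/wan2.2-vace-fun-a14b-inpainting",
    "video_url": "https://example.com/source.mp4",     # placeholder URL
    "mask_video_url": "https://example.com/mask.mp4",  # placeholder URL
    "prompt": "A lone woman strides through neon-drenched streets at night.",
    "num_frames": 121,        # must be 81..241 inclusive
    "frames_per_second": 24,  # must be 5..30
    "seed": 42,               # fixed seed for repeatable results
    "resolution": "720p",
    "num_inference_steps": 30,
}
```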
POST /v2/video/generations HTTP/1.1
Host: api.aimlapi.com
Content-Type: application/json
Accept: */*
Content-Length: 1273
{
"model": "alibaba/wan2.2-vace-fun-a14b-inpainting",
"video_url": "https://example.com",
"mask_video_url": "https://example.com",
"mask_image_url": "https://example.com",
"prompt": "text",
"negative_prompt": "letterboxing, borders, black bars, bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards",
"match_input_num_frames": true,
"num_frames": 81,
"match_input_frames_per_second": true,
"frames_per_second": 16,
"seed": 1,
"resolution": "auto",
"aspect_ratio": "auto",
"num_inference_steps": 30,
"guidance_scale": 5,
"shift": 5,
"image_list": [
"https://example.com"
],
"image_url": "https://example.com",
"last_image_url": "https://example.com",
"enable_safety_checker": true,
"enable_prompt_expansion": true,
"preprocess": true,
"acceleration": "regular",
"video_quality": "high",
"video_write_mode": "balanced",
"num_interpolated_frames": 1,
"temporal_downsample_factor": 1,
"enable_auto_downsample": true,
"auto_downsample_min_fps": 15,
"interpolator_model": "film",
"sync_mode": true
}
Fetch the video
After sending a request for video generation, this task is added to the queue. This endpoint lets you check the status of a video generation task using its id, obtained from the endpoint described above.
If the video generation task status is complete, the response will include the final result — with the generated video URL and additional metadata.
GET /v2/video/generations?generation_id=text HTTP/1.1
Host: api.aimlapi.com
Accept: */*