Wan 2.2 VACE Fun A14B Inpainting (Image-to-Video)
VACE is a video generation model that combines a source image, a mask, and a reference video to produce prompted videos with precise control over the source.
Setup your API Key
If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.
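The examples below hardcode the key for brevity; in real projects it is safer to read it from an environment variable so it never lands in source control. A minimal sketch (the variable name `AIMLAPI_API_KEY` is our own choice, not one the API requires):

```python
import os

# Read the API key from an environment variable; fall back to a placeholder
# so a missing variable fails loudly at request time, not silently here.
api_key = os.environ.get("AIMLAPI_API_KEY", "<YOUR_AIMLAPI_KEY>")
```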
How to Make a Call
Full Example: Generating and Retrieving the Video From the Server
The code below creates a video generation task, then polls the server every 10 seconds until the video URL is returned or a 10-minute timeout is reached.
import requests
import time

# Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key
api_key = "<YOUR_AIMLAPI_KEY>"
base_url = "https://api.aimlapi.com/v2"


# Creating and sending a video generation task to the server
def generate_video():
    url = f"{base_url}/video/generations"
    headers = {
        "Authorization": f"Bearer {api_key}",
    }
    data = {
        "model": "alibaba/wan2.2-vace-fun-a14b-inpainting",
        "video_url": "https://storage.googleapis.com/falserverless/example_inputs/wan_animate_input_video.mp4",
        "prompt": "A lone woman strides through the neon-drenched streets of Tokyo at night. Her crimson dress, a vibrant splash of color against the deep blues and blacks of the cityscape, flows slightly with each step. A tailored black jacket, crisp and elegant, contrasts sharply with the dress's rich texture. Medium shot: The city hums around her, blurred lights creating streaks of color in the background. Close-up: The fabric of her dress catches the streetlight's glow, revealing a subtle silk sheen and the intricate stitching at the hem. Her black jacket’s subtle texture is visible – a fine wool perhaps, with a matte finish. The overall mood is one of quiet confidence and mystery, a vibrant woman navigating a bustling, nocturnal landscape. High resolution 4k.",
        "resolution": "720p",
    }

    response = requests.post(url, json=data, headers=headers)
    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
    else:
        response_data = response.json()
        print(response_data)
        return response_data


# Requesting the result of the task from the server using the generation_id
def get_video(gen_id):
    url = f"{base_url}/video/generations"
    params = {
        "generation_id": gen_id,
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

    response = requests.get(url, params=params, headers=headers)
    return response.json()


def main():
    # Running video generation and getting a task id
    gen_response = generate_video()
    gen_id = gen_response.get("id") if gen_response else None
    print("Generation ID:", gen_id)

    # Trying to retrieve the video from the server every 10 sec
    if gen_id:
        start_time = time.time()
        timeout = 600
        while time.time() - start_time < timeout:
            response_data = get_video(gen_id)
            if response_data is None:
                print("Error: No response from API")
                break
            status = response_data.get("status")
            print("Status:", status)
            if status in ("waiting", "active", "queued", "generating"):
                print("Still waiting... Checking again in 10 seconds.")
                time.sleep(10)
            else:
                print("Processing complete:\n", response_data)
                return response_data
        print("Timeout reached. Stopping.")
        return None


if __name__ == "__main__":
    main()
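Once the task leaves the queue, the final response contains the URL of the generated file. The exact payload shape is not documented above, so the helper below probes a few common shapes (a top-level `url`, or a nested `video.url`); these field names are assumptions, so inspect the response your script prints and adjust accordingly.

```python
def extract_video_url(response_data):
    """Best-effort lookup of the generated file URL in a completed response.

    The field names checked here are assumptions; verify them against the
    payload printed by the polling loop above.
    """
    video = response_data.get("video")
    if isinstance(video, dict) and "url" in video:
        return video["url"]
    if isinstance(video, str):
        return video
    return response_data.get("url")
```

The returned URL can then be downloaded, for example with `requests.get(url, stream=True)`, writing the response to disk chunk by chunk.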
API Schemas
Video Generation
This endpoint creates and sends a video generation task to the server — and returns a generation ID.
video_url: URL to the source video file. Required for inpainting.
mask_video_url: URL to the source mask file. Required for inpainting.
mask_image_url: URL to the guiding mask file. If provided, the model will use this mask as a reference to create a masked video using salient mask tracking. Ignored if mask_video_url is provided.
prompt: The text description of the scene, subject, or action to generate in the video.
negative_prompt: The description of elements to avoid in the generated video. Default: "letterboxing, borders, black bars, bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards"
match_input_num_frames: Whether to match the input video's number of frames.
num_frames: Number of frames to generate. Must be between 81 and 241 (inclusive). Default: 81
match_input_frames_per_second: Whether to match the input video's frames per second (FPS).
frames_per_second: Frames per second of the generated video. Must be between 5 and 30. Default: 16
seed: Varying the seed integer is a way to get different results for the same other request parameters. Using the same value for an identical request will produce similar results. If unspecified, a random number is chosen.
resolution: An enumeration where the short side of the video frame determines the resolution. Default: auto
aspect_ratio: The aspect ratio of the generated video. Default: auto
num_inference_steps: Number of inference steps for sampling. Higher values give better quality but take longer. Default: 30
guidance_scale: Classifier-free guidance scale. Controls prompt adherence / creativity. Default: 5
shift: Noise schedule shift parameter. Affects temporal dynamics. Default: 5
image_list: Array of image URLs (2-4 images) for multi-image-to-video generation.
image_url: URL of the image to be used as the first frame of the video.
last_image_url: A direct link to an online image or a Base64-encoded local image to be used as the last frame of the video.
enable_safety_checker: If set to true, the safety checker will be enabled.
enable_prompt_expansion: Whether to enable prompt expansion.
preprocess: Whether to preprocess the input video.
acceleration: Acceleration to use for inference. none or regular are available. Default: regular
video_quality: The quality of the generated video. Default: high
video_write_mode: The method used to write the video. fast, balanced, or small are available. Default: balanced
num_interpolated_frames: Number of frames to interpolate between the original frames.
temporal_downsample_factor: Temporal downsample factor for the video.
enable_auto_downsample: Whether to automatically downsample the video.
auto_downsample_min_fps: The minimum frames per second to downsample the video to. Default: 15
interpolator_model: The model to use for interpolation. rife or film are available. Default: film
sync_mode: The synchronization mode for audio and video. loose or tight are available.
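Any of the parameters above can be added to the request body in the Python example. A sketch of a payload that pins the frame count, FPS, and seed (the URLs are placeholders, not working inputs):

```python
# Hypothetical request body combining several optional schema parameters.
data = {
    "model": "alibaba/wan2.2-vace-fun-a14b-inpainting",
    "video_url": "https://example.com/source.mp4",     # placeholder URL
    "mask_video_url": "https://example.com/mask.mp4",  # placeholder URL
    "prompt": "A lone woman strides through neon-drenched streets at night.",
    "num_frames": 121,        # must be 81..241 inclusive
    "frames_per_second": 24,  # must be 5..30
    "seed": 42,               # fixed seed for repeatable results
    "resolution": "720p",
    "num_inference_steps": 30,
}
```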
POST /v2/video/generations HTTP/1.1
Host: api.aimlapi.com
Content-Type: application/json
Accept: */*
Content-Length: 1273
{
"model": "alibaba/wan2.2-vace-fun-a14b-inpainting",
"video_url": "https://example.com",
"mask_video_url": "https://example.com",
"mask_image_url": "https://example.com",
"prompt": "text",
"negative_prompt": "letterboxing, borders, black bars, bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards",
"match_input_num_frames": true,
"num_frames": 81,
"match_input_frames_per_second": true,
"frames_per_second": 16,
"seed": 1,
"resolution": "auto",
"aspect_ratio": "auto",
"num_inference_steps": 30,
"guidance_scale": 5,
"shift": 5,
"image_list": [
"https://example.com"
],
"image_url": "https://example.com",
"last_image_url": "https://example.com",
"enable_safety_checker": true,
"enable_prompt_expansion": true,
"preprocess": true,
"acceleration": "regular",
"video_quality": "high",
"video_write_mode": "balanced",
"num_interpolated_frames": 1,
"temporal_downsample_factor": 1,
"enable_auto_downsample": true,
"auto_downsample_min_fps": 15,
"interpolator_model": "film",
"sync_mode": true
}
Fetch the video
After sending a request for video generation, this task is added to the queue. This endpoint lets you check the status of a video generation task using its id, obtained from the endpoint described above.
If the video generation task status is complete, the response will include the final result — with the generated video URL and additional metadata.
GET /v2/video/generations?generation_id=text HTTP/1.1
Host: api.aimlapi.com
Accept: */*