v5.5/image-to-video

This documentation applies to the following model:

  • pixverse/v5-5-image-to-video

The model generates high-quality video clips from a text prompt combined with a reference image, delivering smooth motion and sharp visual detail.

Set up your API Key

If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.

How to Make a Call

Step-by-Step Instructions

Generating a video using this model involves sequentially calling two endpoints:

  • The first one is for creating and sending a video generation task to the server (returns a generation ID).

  • The second one is for requesting the generated video from the server using the generation ID received from the first endpoint.

Below, you can find both corresponding API schemas.

API Schemas

Create a video generation task and send it to the server

This endpoint creates a video generation task, sends it to the server, and returns a generation ID. In the basic setup, you only need a reference image and a prompt.

post
Body
model · string · enum · Required
prompt · string · Required

The text description of the scene, subject, or action to generate in the video.

image_url · string · uri · Required

URL of the image to be used as the first frame of the video.

resolution · string · enum · Optional

The output resolution, named after the length of the short side of the video frame (for example, 720p).

Default: 720p
duration · integer · enum · Optional

The output video length in seconds. The 1080p quality option does not support 8-second videos.

Default: 5
negative_prompt · string · Optional

The description of elements to avoid in the generated video.

style · string · enum · Optional

The style of the generated video.

seed · integer · Optional

Vary the seed to get different results from otherwise identical request parameters. Reusing the same seed for an identical request produces similar results. If unspecified, a random seed is chosen.

generate_audio_switch · boolean · Optional

Enable audio generation.

  • true: Audio on.
  • false: Audio off.
Default: false
generate_multi_clip_switch · boolean · Optional

Enable multi-clip generation with dynamic camera changes.

  • true: Multi-clip.
  • false: Single-clip.
Default: false
thinking_type · string · enum · Optional

Prompt reasoning enhancement mode.

  • "enabled": Turn on prompt optimization.
  • "disabled": Turn off prompt optimization.
  • "auto" or omitted: Let the model decide automatically.
Default: enabled
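
The body parameters above can be assembled into a request body along the lines of the following minimal sketch. The helper name build_payload is hypothetical. The field names, the model ID, and the 1080p/8-second restriction are taken from the schema above. The full sets of allowed values for resolution, duration, and style are not listed in this section, so the sketch checks only the one documented restriction and the thinking_type values named above.

```python
from typing import Any, Dict, Optional

MODEL = "pixverse/v5-5-image-to-video"
THINKING_TYPES = {"enabled", "disabled", "auto"}  # values named in the schema above


def build_payload(
    prompt: str,
    image_url: str,
    resolution: Optional[str] = None,
    duration: Optional[int] = None,
    negative_prompt: Optional[str] = None,
    seed: Optional[int] = None,
    generate_audio_switch: Optional[bool] = None,
    generate_multi_clip_switch: Optional[bool] = None,
    thinking_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the JSON body for a video generation task."""
    if not prompt or not image_url:
        raise ValueError("prompt and image_url are required")
    # Documented restriction: 1080p does not support 8-second videos.
    if resolution == "1080p" and duration == 8:
        raise ValueError("1080p does not support 8-second videos")
    if thinking_type is not None and thinking_type not in THINKING_TYPES:
        raise ValueError(f"unknown thinking_type: {thinking_type!r}")

    payload: Dict[str, Any] = {
        "model": MODEL,
        "prompt": prompt,
        "image_url": image_url,
    }
    optional = {
        "resolution": resolution,
        "duration": duration,
        "negative_prompt": negative_prompt,
        "seed": seed,
        "generate_audio_switch": generate_audio_switch,
        "generate_multi_clip_switch": generate_multi_clip_switch,
        "thinking_type": thinking_type,
    }
    # Send optional fields only when set, so server-side defaults apply otherwise.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Omitting unset fields keeps the request minimal and leaves the defaults (720p, 5 seconds, audio off, single clip) to the server.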
Responses
200 · Success · application/json
POST /v2/video/generations
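
As an illustration of the task-creation call, here is a minimal sketch using only the Python standard library. Several details are assumptions not stated in this section: the base URL https://api.aimlapi.com, Bearer-token authorization, and the name of the response field that carries the generation ID ("id"). The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.aimlapi.com"  # assumption: base URL is not given in this section
ENDPOINT = "/v2/video/generations"    # path from the schema above


def make_create_request(api_key: str, prompt: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a generation task."""
    body = {
        "model": "pixverse/v5-5-image-to-video",
        "prompt": prompt,
        "image_url": image_url,
    }
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer-token auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_task(api_key: str, prompt: str, image_url: str) -> str:
    """Send the request and return the generation ID."""
    request = make_create_request(api_key, prompt, image_url)
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    return result["id"]  # assumption: the generation ID is returned as "id"
```

Splitting request construction from sending makes it easy to inspect the URL, headers, and body before any network call is made.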

Retrieve the generated video from the server

After you send a video generation request, the task is added to the queue. This endpoint lets you check the status of the task using the generation ID returned by the endpoint described above. Once the task is complete, the response includes the final result: the generated video URL and additional metadata.

get
Query parameters
generation_id · string · Required
Responses
200 · Successfully generated video · application/json
GET /v2/video/generations
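
A single status check might look like the sketch below, which passes generation_id as a query parameter to the GET endpoint. The base URL and the Bearer-token header are assumptions, since neither is stated in this section, and the function names are hypothetical. The shape of the returned JSON is not documented here, so the sketch returns it unparsed as a dictionary.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

BASE_URL = "https://api.aimlapi.com"  # assumption: base URL is not given in this section
ENDPOINT = "/v2/video/generations"    # path from the schema above


def make_status_request(api_key: str, generation_id: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that checks a generation task."""
    # urlencode escapes any characters in the ID that are not URL-safe.
    query = urllib.parse.urlencode({"generation_id": generation_id})
    return urllib.request.Request(
        f"{BASE_URL}{ENDPOINT}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},  # assumption: Bearer-token auth
        method="GET",
    )


def fetch_task(api_key: str, generation_id: str) -> Dict[str, Any]:
    """Send the status request and return the decoded JSON response."""
    with urllib.request.urlopen(make_status_request(api_key, generation_id)) as response:
        return json.load(response)
```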

Full Example: Generating and Retrieving the Video From the Server

The code below creates a video generation task, then automatically polls the server every 15 seconds until it finally receives the video URL.
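
The following is a minimal sketch of that flow, not the original example. It assumes the base URL https://api.aimlapi.com, Bearer-token authorization, an "id" field in the creation response, and the status strings "completed" and "error"; none of these are confirmed by this section, and all function names are hypothetical. Only the endpoint path, the model ID, and the 15-second polling interval come from the text above.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Optional

BASE_URL = "https://api.aimlapi.com"  # assumption: base URL is not given in this section
ENDPOINT = "/v2/video/generations"    # path from the schemas above
POLL_INTERVAL_S = 15                  # polling interval described above
DONE_STATUS = "completed"             # assumption: exact status string is not documented here
FAILED_STATUS = "error"               # assumption: exact status string is not documented here


def call_api(api_key: str, method: str, params: Optional[dict] = None,
             body: Optional[dict] = None) -> Dict[str, Any]:
    """Send one request to the generations endpoint and decode the JSON reply."""
    url = BASE_URL + ENDPOINT
    if params:
        url += "?" + urllib.parse.urlencode(params)
    headers = {"Authorization": f"Bearer {api_key}"}  # assumption: Bearer-token auth
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_video(fetch: Callable[[], Dict[str, Any]],
                   interval: float = POLL_INTERVAL_S,
                   timeout: float = 600.0,
                   sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch() every `interval` seconds until the task finishes or times out."""
    waited = 0.0
    while True:
        task = fetch()
        status = task.get("status")
        if status == DONE_STATUS:
            return task
        if status == FAILED_STATUS:
            raise RuntimeError(f"generation failed: {task}")
        if waited >= timeout:
            raise TimeoutError(f"no result after {waited:.0f} s")
        sleep(interval)
        waited += interval


def generate_video(api_key: str, prompt: str, image_url: str) -> Dict[str, Any]:
    """Create a generation task, then poll until the final result arrives."""
    task = call_api(api_key, "POST", body={
        "model": "pixverse/v5-5-image-to-video",
        "prompt": prompt,
        "image_url": image_url,
    })
    generation_id = task["id"]  # assumption: the generation ID is returned as "id"
    return wait_for_video(
        lambda: call_api(api_key, "GET", params={"generation_id": generation_id})
    )
```

Passing fetch and sleep in as callables keeps the polling loop testable without network access or real delays. The timeout is a safety net added by this sketch so that a stuck task does not poll forever.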

Response

Processing time: ~50 s.

Original: 864x1280

Low-res GIF preview: "Mona Lisa puts on glasses with her hands."
