Veo 2 (Image-to-Video)

This documentation applies to the following model:

  • veo2/image-to-video

A multimodal (image + text) AI model that transforms a static image into high-quality, dynamic video content. It builds on Google's Veo 2 text-to-video model and is designed to give fine control over the result, preserve the content of the source image faithfully, and generate physics-aware motion.

How to Make a Call

Step-by-Step Instructions

Generating a video using this model involves sequentially calling two endpoints:

  • The first one is for creating and sending a video generation task to the server (returns a generation ID).

  • The second one is for requesting the generated video from the server using the generation ID received from the first endpoint.

Below, you can find both corresponding API schemas.

API Schemas

Create a video generation task and send it to the server

You can generate a video using this API.

post
Authorizations
Authorization · string · Required

Bearer key

Body
model · enum · Required

Possible values:

prompt · string · Required

The text description of the scene, subject, or action to generate in the video.

image_url · string · uri · Required

A direct link to an online image or a Base64-encoded local image that will serve as the visual base or the first frame for the video.
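
For a local file, the Base64 option could be prepared along these lines. This is a minimal sketch: the helper name `image_to_data_uri` is hypothetical, and it assumes the API accepts the standard data-URI form (`data:<mime>;base64,<payload>`), which this page does not spell out.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_uri(path: str) -> str:
    """Encode a local image file as a Base64 data URI string."""
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    # Assumption: the API accepts the standard data-URI form for Base64 images.
    return f"data:{mime_type};base64,{encoded}"
```

The returned string would then be passed as `image_url` (or `tail_image_url`) in place of a direct link.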

tail_image_url · string · uri · Optional

A direct link to an online image or a Base64-encoded local image to be used as the last frame of the video.

aspect_ratio · string · enum · Optional

The aspect ratio of the generated video.

Possible values:

duration · integer · enum · Optional

The length of the output video in seconds.

Possible values:

negative_prompt · string · Optional

The description of elements to avoid in the generated video.

seed · integer · Optional

Varying the seed produces different results for otherwise identical request parameters. Reusing the same seed with an identical request produces similar results. If unspecified, a random seed is chosen.

enhance_prompt · boolean · Optional

Whether to enhance the prompt before video generation.

Default: true
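
The body parameters above can be assembled into a JSON-ready dictionary. The sketch below is illustrative only: `build_payload` is a hypothetical helper, and using `veo2/image-to-video` as the `model` value is an assumption based on the model name listed at the top of this page. Optional fields left as `None` are omitted so that the server defaults apply.

```python
from typing import Any, Optional


def build_payload(
    prompt: str,
    image_url: str,
    *,
    model: str = "veo2/image-to-video",  # assumption: model ID from this page
    tail_image_url: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
    duration: Optional[int] = None,
    negative_prompt: Optional[str] = None,
    seed: Optional[int] = None,
    enhance_prompt: Optional[bool] = None,
) -> dict[str, Any]:
    """Build the request body, leaving out optional fields that were not set."""
    if not prompt or not image_url:
        raise ValueError("prompt and image_url are required")
    payload: dict[str, Any] = {
        "model": model,
        "prompt": prompt,
        "image_url": image_url,
    }
    optional = {
        "tail_image_url": tail_image_url,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "negative_prompt": negative_prompt,
        "seed": seed,
        "enhance_prompt": enhance_prompt,
    }
    # Keep explicit False/0 values; drop only parameters that were never set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Fixing `seed` while keeping every other field identical is the way to ask for similar results across requests.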
Responses
200

Successfully generated video

application/json
post /v2/generate/video/google/generation
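
As a rough illustration, the request for this endpoint could be built with Python's standard library as shown below. The host in `BASE_URL` is a placeholder, since this page gives only the path, and `build_create_request` is a hypothetical helper name. The `Authorization: Bearer <key>` header follows the schema above.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not from this page
GENERATION_PATH = "/v2/generate/video/google/generation"


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a generation task."""
    return urllib.request.Request(
        BASE_URL + GENERATION_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(request)` and parsing the JSON body should yield the generation ID needed by the second endpoint; the exact name of that field in the response is not shown on this page.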

Fetch the video

After you send a video generation request, the task is added to a queue. This endpoint lets you check the status of a video generation task using its generation ID, obtained from the endpoint described above. If the task status is complete, the response includes the final result: the generated video URL and additional metadata.

get
Authorizations
Authorization · string · Required

Bearer key

Query parameters
generation_id · string · Required
Responses
200

Successfully generated video

application/json
get /v2/generate/video/google/generation
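
A status check could be built as follows, with `generation_id` passed as a query parameter as the schema above describes. Again, the host in `BASE_URL` is a placeholder and `build_fetch_request` is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not from this page
GENERATION_PATH = "/v2/generate/video/google/generation"


def build_fetch_request(api_key: str, generation_id: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that checks a generation task."""
    # urlencode escapes characters that are not safe inside a query string.
    query = urllib.parse.urlencode({"generation_id": generation_id})
    return urllib.request.Request(
        f"{BASE_URL}{GENERATION_PATH}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

Both endpoints share the same path and differ only in HTTP method, so the same constant serves for creating and fetching.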

Full Example: Generating and Retrieving the Video From the Server

We have a classic reproduction of the famous da Vinci painting. Let's ask the model to generate a video where the Mona Lisa puts on glasses.

Generation may take around 40-50 seconds for a 5-second video.
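
Because generation takes tens of seconds, a client normally polls the fetch endpoint until the task finishes. The loop below is a sketch of that pattern rather than the exact client code: `wait_for_video` is a hypothetical helper, `fetch_task` stands for any callable that performs the GET request and returns the parsed JSON, and the `"status"` field with the value `"completed"` is an assumption about the response shape.

```python
import time
from typing import Callable


def wait_for_video(
    fetch_task: Callable[[], dict],
    *,
    done_status: str = "completed",  # assumption: actual status value may differ
    interval: float = 10.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll fetch_task() until the task reports done_status or timeout elapses."""
    deadline = clock() + timeout
    while True:
        task = fetch_task()
        # Assumption: the response carries its state in a "status" field.
        if task.get("status") == done_status:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"Task not finished after {timeout} seconds")
        sleep(interval)
```

With a 10-second interval, a generation that takes 40-50 seconds would finish after roughly five or six polls. This sketch treats every other status as "still running", so a task that fails on the server would only surface as a timeout; a real client would also check for whatever failure status the API reports.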

Response

Original: 1280x720
