Veo2 (Text-to-Video)
Veo2 is Google’s AI model for generating realistic, cinematic video from text prompts or from a combination of text and images. It produces videos with natural motion, realistic physics, and professional-grade visual fidelity.
Key Features:
Text-to-Video (T2V): Converts descriptive text into dynamic video content.
High Resolution Support: Generates videos up to 4K resolution for professional-grade outputs.
Multimodal Input Encoding: Integrates text and image inputs seamlessly for creative flexibility.
If you don’t have an API key for the AI/ML API yet, feel free to use our .
Generating a video using this model involves sequentially calling two endpoints:
The first one is for creating and sending a video generation task to the server (returns a generation ID).
The second one is for requesting the generated video from the server using the generation ID received from the first endpoint.
Below, you can find two corresponding API schemas and examples for both endpoint calls.
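The two-step flow described above (create a task, then request the result by its generation ID) can be sketched roughly as follows. This is a minimal illustration, not the official client: the two HTTP calls are passed in as plain functions (`create_task` and `fetch_task`, both hypothetical), and the response field names and status values (`id`, `status`, `completed`, `failed`, `error`) are assumptions that should be checked against the API schemas.

```python
import time
from typing import Any, Callable, Dict

# Assumed response fields and status values; verify them against the API schemas.
ID_FIELD = "id"
STATUS_FIELD = "status"
DONE_STATUSES = {"completed"}
FAILED_STATUSES = {"failed", "error"}


def run_generation(
    create_task: Callable[[Dict[str, Any]], Dict[str, Any]],
    fetch_task: Callable[[str], Dict[str, Any]],
    payload: Dict[str, Any],
    poll_interval: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Create a generation task, then poll for the result until it is ready."""
    # Step 1: send the task to the server and keep the returned generation ID.
    generation_id = create_task(payload)[ID_FIELD]

    # Step 2: request the video with that ID until the task finishes.
    for _ in range(max_polls):
        result = fetch_task(generation_id)
        status = result.get(STATUS_FIELD)
        if status in DONE_STATUSES:
            return result
        if status in FAILED_STATUSES:
            raise RuntimeError(f"Generation {generation_id} failed: {result}")
        sleep(poll_interval)

    raise TimeoutError(
        f"Generation {generation_id} was not ready after {max_polls} polls"
    )
```

In real use, `create_task` would wrap the request to the first endpoint and `fetch_task` the request to the second, each sending your API key. Keeping the transport separate from the polling loop makes it easy to test the loop without network access.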
You can generate a video using this API. In the basic setup, you only need a prompt, the aspect ratio, and the desired duration (5, 6, 7, or 8 seconds).
The text prompt describing the video you want to generate
The aspect ratio of the generated video
The duration of the generated video in seconds. Possible values: 5, 6, 7, 8
5
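As an illustration of the basic setup, the request body could be assembled and checked before sending, as in the sketch below. The JSON field names (`prompt`, `aspect_ratio`, `duration`) and the helper `build_payload` are assumptions made for this example. Only the allowed durations (5, 6, 7, 8) come from the description above.

```python
from typing import Any, Dict

# Durations in seconds, as listed in the parameter description.
ALLOWED_DURATIONS = (5, 6, 7, 8)


def build_payload(prompt: str, aspect_ratio: str, duration: int) -> Dict[str, Any]:
    """Build a request body for the basic setup (field names are assumed)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not aspect_ratio.strip():
        raise ValueError("aspect_ratio must not be empty")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(
            f"duration must be one of {ALLOWED_DURATIONS}, got {duration!r}"
        )
    return {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
```

For example, `build_payload("A lighthouse at dusk, waves crashing", "16:9", 8)` returns a dictionary ready to be serialized as JSON. The value `"16:9"` is only an example aspect ratio, so check the schema for the values the model accepts.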