avatar-standard
From a single image and a voice track, this model generates expressive character animations aligned with the speech’s rhythm, intonation, and meaning. This version outputs 720p video at 24 fps.
Set Up Your API Key
If you don’t have an API key for the AI/ML API yet, follow our Quickstart guide to get one.
How to Make a Call
API Schemas
Create a video generation task and send it to the server
You can create a video with this API by providing a reference image of a character and an audio file. The character delivers the audio with full lip-sync and natural gestures. This POST request creates a video generation task, submits it to the server, and returns a generation ID.
A direct link to an online image or a Base64-encoded local image that will serve as the visual base or the first frame for the video.
The text description of the scene, subject, or action to generate in the video.
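For a local image, the Base64 option mentioned above might be prepared as in the following sketch. It assumes the API accepts a data URI (`data:<mime>;base64,<payload>`); the exact expected format is not stated here, so check the API schema. The helper name `image_to_data_uri` is hypothetical.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_uri(path: str) -> str:
    """Encode a local image file as a Base64 data URI.

    Assumption: the API accepts a data URI for the image field.
    If it expects raw Base64 instead, return only `encoded`.
    """
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        mime_type = "application/octet-stream"

    # Read the file bytes and Base64-encode them as ASCII text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can then be passed wherever a direct image link would otherwise go.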
Successfully generated video
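A minimal sketch of the task-creation request, using only the Python standard library. The endpoint URL, the field names `image_url` and `audio_url`, and the Bearer authorization header are assumptions that do not appear in the text above; only `prompt` is named on this page. The model identifier is taken from the page title and may need a vendor prefix. Confirm all of these against the API schema.

```python
import json
import urllib.request

# Assumed values: verify them against the API schema before use.
API_URL = "https://api.aimlapi.com/v2/video/generations"  # assumed endpoint
MODEL_ID = "avatar-standard"  # from the page title; the real ID may carry a prefix


def build_create_request(api_key, image_url, audio_url, prompt=None):
    """Build (but do not send) the POST request that creates a task."""
    payload = {
        "model": MODEL_ID,
        "image_url": image_url,  # assumed field name
        "audio_url": audio_url,  # assumed field name
    }
    # The prompt is optional; include it only when provided.
    if prompt is not None:
        payload["prompt"] = prompt

    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_task(api_key, image_url, audio_url, prompt=None):
    """Send the request and return the parsed JSON response."""
    request = build_create_request(api_key, image_url, audio_url, prompt)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes it possible to inspect the payload without calling the server. The response is expected to carry the generation ID, but the name of that field is not given here.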
Retrieve the generated video from the server
After you send a video generation request, the task is added to the queue. This endpoint lets you check the status of a video generation task using its generation_id, obtained from the endpoint described above. If the task status is complete, the response includes the final result: the generated video URL and additional metadata.
Successfully generated video
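A minimal sketch of the status check, again using only the standard library. The endpoint URL, the Bearer header, and passing `generation_id` as a query parameter are assumptions; the page only states that the task is looked up by its generation_id.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: verify it against the API schema before use.
API_URL = "https://api.aimlapi.com/v2/video/generations"


def build_status_request(api_key, generation_id):
    """Build (but do not send) the GET request that checks a task."""
    # Assumption: the ID is passed as a URL-encoded query parameter.
    query = urllib.parse.urlencode({"generation_id": generation_id})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )


def get_task(api_key, generation_id):
    """Send the status request and return the parsed JSON response."""
    request = build_status_request(api_key, generation_id)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```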
Full Example: Generating and Retrieving the Video From the Server
The code below creates a video generation task, then automatically polls the server every 10 seconds until it finally receives the video URL.
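A minimal sketch of the polling step, under stated assumptions. `fetch_status` is a hypothetical callable that takes a generation ID and returns the task as a dictionary (for example, a thin wrapper around the retrieval endpoint). The status names and the overall timeout are assumptions, not values confirmed by this page; only the 10-second interval comes from the text.

```python
import time

# Assumed status names: check the API schema for the real values.
DONE_STATUSES = {"complete", "completed"}
FAILED_STATUSES = {"error", "failed"}


def wait_for_video(fetch_status, generation_id, interval=10.0,
                   timeout=600.0, sleep=time.sleep):
    """Poll until the task finishes, fails, or the timeout expires.

    fetch_status: callable(generation_id) -> dict with a "status" key.
    sleep: injectable so the loop can be exercised without real waiting.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status(generation_id)
        status = task.get("status")

        if status in DONE_STATUSES:
            return task  # caller reads the video URL from this dict
        if status in FAILED_STATUSES:
            raise RuntimeError(f"Generation failed with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No result after {timeout} seconds")

        # Still queued or generating: wait before asking again.
        sleep(interval)
```

The default timeout of 600 seconds is an arbitrary choice that leaves headroom over the typical generation time; adjust it to your needs.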
Generation time: about 4 minutes.
Original (1280x720, with sound):
The following video was generated by adding just one line to our example:
See how dramatically the prompt parameter can change the character’s behavior and mannerisms:
"prompt": "A person speaking playfully, laughing frequently and gesturing wildly."