OmniHuman
An advanced AI framework from ByteDance that generates realistic lip-synced videos from a single image and a motion signal (audio). It supports multiple visual and audio styles, handles any body proportion, and enhances realism through motion, lighting, and texture details.
Set Up Your API Key
If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.
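If you prefer not to hard-code the key in scripts (as the full example below does for brevity), one option is to read it from an environment variable. A minimal sketch; the variable name AIMLAPI_API_KEY is our own choice for this example, not something the API mandates:

import os

# Read the key from the environment instead of hard-coding it.
# AIMLAPI_API_KEY is an arbitrary name chosen for this example.
api_key = os.environ.get("AIMLAPI_API_KEY")
if not api_key:
    raise RuntimeError("Please set the AIMLAPI_API_KEY environment variable.")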
How to Make a Call
Full Example: Generating and Retrieving the Video From the Server
The code below creates a video generation task, then polls the server every 10 seconds until the video URL is returned or a 10-minute timeout is reached.
import requests
import time

# replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key
api_key = "<YOUR_AIMLAPI_KEY>"
base_url = "https://api.aimlapi.com/v2"


# Creating and sending a video generation task to the server
def generate_video():
    url = f"{base_url}/video/generations"
    headers = {
        "Authorization": f"Bearer {api_key}",
    }
    data = {
        "model": "bytedance/omnihuman",
        "image_url": "https://s2-111386.kwimgs.com/bs2/mmu-aiplatform-temp/kling/20240620/1.jpeg",
        "audio_url": "https://storage.googleapis.com/falserverless/example_inputs/omnihuman_audio.mp3",
    }

    response = requests.post(url, json=data, headers=headers)
    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
        return None

    response_data = response.json()
    print(response_data)
    return response_data


# Requesting the result of the task from the server using the generation_id
def get_video(gen_id):
    url = f"{base_url}/video/generations"
    params = {
        "generation_id": gen_id,
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

    response = requests.get(url, params=params, headers=headers)
    return response.json()


def main():
    # Running video generation and getting a task id
    gen_response = generate_video()
    if gen_response is None:
        return None
    gen_id = gen_response.get("id")
    print("Generation ID:", gen_id)

    # Trying to retrieve the video from the server every 10 sec
    if gen_id:
        start_time = time.time()
        timeout = 600  # stop polling after 10 minutes
        while time.time() - start_time < timeout:
            response_data = get_video(gen_id)
            if response_data is None:
                print("Error: No response from API")
                break

            status = response_data.get("status")
            print("Status:", status)

            if status in ("waiting", "active", "queued", "generating"):
                print("Still waiting... Checking again in 10 seconds.")
                time.sleep(10)
            else:
                print("Processing complete:\n", response_data)
                return response_data

        print("Timeout reached. Stopping.")
        return None


if __name__ == "__main__":
    main()
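When the task leaves the queue, the final response contains a link to the generated video. Below is a minimal sketch of saving it to disk; the field path response_data["video"]["url"] is an assumption for illustration, so adjust it to the response structure you actually receive:

import requests

def download_video(response_data, filename="omnihuman_result.mp4"):
    # Assumed field path for the result URL; check the actual response
    # printed by main() and adjust accordingly.
    video_url = response_data["video"]["url"]

    # Download the file and write it to disk.
    resp = requests.get(video_url, timeout=60)
    resp.raise_for_status()
    with open(filename, "wb") as f:
        f.write(resp.content)
    print(f"Saved video to {filename}")

Call it with the response_data returned by main() after a successful run.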
[Sample output video, 896x1344, original with sound; low-res GIF preview]
API Schemas
Create a video generation task and send it to the server
You can create a video with this API by providing a reference image of a character and an audio file. The character will deliver the audio with full lip-sync and natural gestures. This POST request submits a video generation task to the server and returns a generation ID.
image_url: A direct link to an online image or a Base64-encoded local image that will serve as the visual base or the first frame of the video.
audio_url: Reference song; it should contain music and vocals. Must be a .wav or .mp3 file longer than 15 seconds.
POST /v2/video/generations HTTP/1.1
Host: api.aimlapi.com
Content-Type: application/json
Accept: */*
Content-Length: 99
{
  "model": "bytedance/omnihuman",
  "image_url": "https://example.com",
  "audio_url": "https://example.com"
}
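Since image_url also accepts a Base64-encoded local image, here is a minimal sketch of preparing one. It assumes the API accepts a standard data URI (data:image/jpeg;base64,...); if plain Base64 is expected instead, pass the encoded string directly:

import base64

# Encode a local image for use in place of a direct link.
# Assumption: the API accepts the standard data-URI form.
with open("character.jpeg", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

image_url = f"data:image/jpeg;base64,{encoded}"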
Retrieve the generated video from the server
After you send a video generation request, the task is added to the queue. This endpoint lets you check the status of a video generation task using its generation_id, obtained from the endpoint described above. If the task status is complete, the response includes the final result: the generated video URL and additional metadata.
GET /v2/video/generations?generation_id=<generation_id> HTTP/1.1
Host: api.aimlapi.com
Accept: */*