AI/ML API Documentation
Animate Images: A Children’s Encyclopedia


Legal Notice: Please remember that reference images may be subject to copyright. Make sure to respect the law and avoid sharing the animated versions online if doing so could infringe intellectual property rights. Just use them to bring a bit of joy to kids at home!

Idea and Step-by-Step Plan

Today, we’re going to bring a page from a children’s encyclopedia to life — with pictures!

Here’s the plan:

  1. Take an article from a free children's encyclopedia. (Of course, you can use a children's story, short illustrated tales, or any other suitable content.) To keep things simple, we'll focus only on the text and illustrations.

  2. For each illustration, have a chat model come up with a short video idea (a little scene that matches the content) and turn it into a prompt for a video generation model.

  3. Using this prompt, generate a 5-second video with a video model and download the result from the server.

  4. Convert it to a GIF using any free online tool.

  5. Replace the original static image with the animated GIF.

Repeat this process for every illustration on the page.
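Repeating the process for every illustration amounts to a loop over the image folder. Here is a minimal sketch; `IMAGE_DIR` and the `min_bytes` cutoff (which skips icons and other tiny images) are illustrative assumptions, not part of the original walkthrough:

```python
from pathlib import Path

# Illustrative folder; replace with your own image directory.
IMAGE_DIR = Path("C:/Users/user/Documents/example/images")

def iter_illustrations(image_dir: Path, exts=(".png", ".jpg", ".jpeg", ".webp"), min_bytes: int = 0):
    """Yield illustration files in sorted order, skipping files smaller than
    min_bytes (useful for ignoring icons and logos)."""
    for path in sorted(image_dir.iterdir()):
        if path.suffix.lower() in exts and path.stat().st_size >= min_bytes:
            yield path
```

Each path yielded here can then be fed through the analyze-prompt-animate pipeline described below.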

A Page We’re Bringing to Life

Article Example

What Are Raccoons?

Raccoons are small, furry animals with fluffy striped tails and black “masks” around their eyes. They live in forests, near rivers and lakes—and sometimes even close to people in towns and cities. Raccoons are very clever, curious, and quick with their paws.

One of the raccoon's most famous habits is "washing" its food. But raccoons aren’t really cleaning their meals. They just love to roll and rub things between their paws, especially near water. Scientists believe this helps them understand what they’re holding.

Raccoons eat almost anything: berries, fruits, nuts, insects, fish, and even bird eggs. They're nocturnal, which means they go out at night to look for food and sleep during the day in cozy tree hollows.

Raccoons are very social. Young raccoons love to play—tumbling in the grass, hiding behind trees, and exploring everything around them. And sometimes, if they feel safe, raccoons might even come closer to where people are—especially if there's a snack nearby!

Even though they can be a little mischievous, raccoons play an important role in nature. They help spread seeds and keep insect populations in check.

So next time you see a raccoon, remember: it’s not just a fluffy animal—it’s a real forest explorer!


Full Walkthrough

  1. Let’s take the raccoon article from the previous section. To pass the illustrations to the chat model, we’ll first save them to disk. Later, you can use the resulting folder of images to build an HTML page with animated visuals.

  2. Next, let’s have the chat model analyze the image and suggest a prompt for the video.

Python code
from openai import OpenAI
import base64
import mimetypes
from pathlib import Path

base_url = "https://api.aimlapi.com/"
api_key = "<YOUR_AIMLAPI_KEY>"

# image path (Insert your image file path instead. Images in PNG, JPG, and WebP formats are supported.)
file_path = Path("C:/Users/user/Documents/example/images/racoons_0.png")

# Detect the MIME type based on file extension
mime_type, _ = mimetypes.guess_type(file_path)

# Supported image formats
allowed_mime_types = {"image/png", "image/jpeg", "image/webp"}

# Raise an error if the format is not supported
if mime_type not in allowed_mime_types:
    raise ValueError(f"Unsupported image format: {mime_type}. Supported formats: PNG, JPG, WebP.")

# Read and encode the image in base64
with open(file_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode("utf-8")

# Create a data URL for the base64 image
image_data_url = f"data:{mime_type};base64,{base64_image}"

# Send the image to GPT-4o via OpenAI's API
client = OpenAI(api_key=api_key, base_url=base_url)

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "Based on the provided image, come up with a short scenario (no need to output it) and give me only a short, suitable prompt for generating a 5-second animation based on an image with the following description. Do not include the word 'Prompt:' — just output the prompt itself. Describe possible movements, background changes, etc.",
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": image_data_url},
                }
            ],
        },
    ],
)

image_analysis_result = completion.choices[0].message.content
print(image_analysis_result)
Response: Generated Prompt Based On the Image Description
The raccoon's paw gently ripples the stream as tiny leaves float by; the trees sway slightly in the breeze, and sunlight filters through, casting shifting patterns on the rocks and grass.
  3. Now let’s generate a short video from the image and the generated prompt, using a Kling AI image-to-video model.

Python code
import requests
import base64
import mimetypes
from pathlib import Path
import time

base_url = "https://api.aimlapi.com/v2"
api_key = "<YOUR_AIMLAPI_KEY>"

generated_prompt = "The raccoon's paw gently washes the fruit in the stream as tiny leaves float by; the trees sway slightly in the breeze, and sunlight filters through, casting shifting patterns on the rocks and grass."

# Insert your image file path instead:
file_path = Path("C:/Users/user/Documents/example/images/racoons_0.png")

# Detect the MIME type based on file extension
mime_type, _ = mimetypes.guess_type(file_path)

# Supported image formats
allowed_mime_types = {"image/png", "image/jpeg", "image/webp"}

# Raise an error if the format is not supported
if mime_type not in allowed_mime_types:
    raise ValueError(f"Unsupported image format: {mime_type}. Supported formats: PNG, JPG, WebP.")


# Creating and sending a video generation task to the server
def generate_video(im_url):
    url = f"{base_url}/generate/video/kling/generation"
    headers = {
        "Authorization": f"Bearer {api_key}", 
    }

    data = {
        "model": "kling-video/v1.6/pro/image-to-video",
        "image_url": im_url,
        "prompt": generated_prompt,
        "duration": 5        
    }
 
    response = requests.post(url, json=data, headers=headers)
    
    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
    else:
        response_data = response.json()
        print(response_data)
        return response_data
    
    
# Requesting the result of the task from the server using the generation_id
def get_video(gen_id):
    url = f"{base_url}/generate/video/kling/generation"
    params = {
        "generation_id": gen_id,
    }
    
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

    response = requests.get(url, params=params, headers=headers)
    # print("Generation:", response.json())
    return response.json()


def main():
    # Read and encode the image in base64
    with open(file_path, "rb") as image_file:
        base64_image = base64.b64encode(image_file.read()).decode("utf-8")

    # Create a data URL for the base64 image
    image_data_url = f"data:{mime_type};base64,{base64_image}" 
    
    # Generate video
    gen_response = generate_video(image_data_url)
    gen_id = gen_response.get("id") if gen_response else None
    print("Gen_ID:  ", gen_id)

    # Try to retrieve the video from the server every 10 sec
    if gen_id:
        start_time = time.time()

        timeout = 600
        while time.time() - start_time < timeout:
            response_data = get_video(gen_id)

            if response_data is None:
                print("Error: No response from API")
                break
        
            status = response_data.get("status")
            print("Status:", status)

            if status in ("waiting", "active", "queued", "generating"):
                print("Still waiting... Checking again in 10 seconds.")
                time.sleep(10)
            else:
                print("Processing complete:\n", response_data)
                return response_data
   
        print("Timeout reached. Stopping.")
        return None     


if __name__ == "__main__":
    main()
Response
{'id': '9e4c45e7-5785-42f3-8271-ce8a8b31dd04:kling-video/v1.6/pro/image-to-video', 'status': 'queued'}
Gen_ID:   9e4c45e7-5785-42f3-8271-ce8a8b31dd04:kling-video/v1.6/pro/image-to-video
Status: generating
Still waiting... Checking again in 10 seconds.
Status: generating
Still waiting... Checking again in 10 seconds.
...
Status: generating
Still waiting... Checking again in 10 seconds.
Status: completed
Processing complete:
 {'id': '9e4c45e7-5785-42f3-8271-ce8a8b31dd04:kling-video/v1.6/pro/image-to-video', 'status': 'completed', 'video': {'url': 'https://cdn.aimlapi.com/eagle/files/kangaroo/Kx8BCNAB0eqhasWyZMTo3_output.mp4', 'content_type': 'video/mp4', 'file_name': 'output.mp4', 'file_size': 11725406}}
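Step 3 of the plan also calls for downloading the finished video. A minimal sketch, assuming the completed response has the `video.url` / `video.file_name` shape shown above (the helper names are our own):

```python
import urllib.request
from pathlib import Path

def video_dest(response_data: dict, out_dir: str = ".") -> Path:
    """Build the local path for the video named in a completed response."""
    return Path(out_dir) / response_data["video"]["file_name"]

def save_video(response_data: dict, out_dir: str = ".") -> Path:
    """Download the MP4 referenced by a completed generation response."""
    dest = video_dest(response_data, out_dir)
    urllib.request.urlretrieve(response_data["video"]["url"], dest)
    return dest
```

For example, `save_video(response_data, "clips")` would store the clip as `clips/output.mp4`.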

Results

Animated Article Example

What Are Raccoons?

Raccoons are small, furry animals with fluffy striped tails and black “masks” around their eyes. They live in forests, near rivers and lakes—and sometimes even close to people in towns and cities. Raccoons are very clever, curious, and quick with their paws.

One of the raccoon's most famous habits is "washing" its food. But raccoons aren’t really cleaning their meals. They just love to roll and rub things between their paws, especially near water. Scientists believe this helps them understand what they’re holding.

Raccoons eat almost anything: berries, fruits, nuts, insects, fish, and even bird eggs. They're nocturnal, which means they go out at night to look for food and sleep during the day in cozy tree hollows.

Raccoons are very social. Young raccoons love to play—tumbling in the grass, hiding behind trees, and exploring everything around them. And sometimes, if they feel safe, raccoons might even come closer to where people are—especially if there's a snack nearby!

Even though they can be a little mischievous, raccoons play an important role in nature. They help spread seeds and keep insect populations in check.

So next time you see a raccoon, remember: it’s not just a fluffy animal—it’s a real forest explorer!


Room for Improvement

Of course, the goal is to automate the process as much as possible — and to make the images look more natural and visually appealing:

  • Generate looping videos to make sure the animated illustrations move smoothly.

  • Simply pass a page URL or document to the program and get back a local webpage with animations.

  • Add logic to skip images below a certain size, to avoid animating icons, logos, or other minor elements.

  • Support a wider range of image formats.

  • Automate GIF conversion from video directly within the program.
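The last bullet could be handled locally with ffmpeg, if it is installed on your machine. A sketch, with arbitrary `fps` and `width` defaults:

```python
import subprocess
from pathlib import Path

def gif_command(mp4_path: str, fps: int = 10, width: int = 480) -> list[str]:
    """Build an ffmpeg command converting an MP4 into a reduced-rate, resized GIF."""
    src = Path(mp4_path)
    dst = src.with_suffix(".gif")
    return ["ffmpeg", "-y", "-i", str(src), "-vf", f"fps={fps},scale={width}:-1", str(dst)]

def mp4_to_gif(mp4_path: str, fps: int = 10, width: int = 480) -> Path:
    """Run the conversion; requires ffmpeg on the PATH."""
    cmd = gif_command(mp4_path, fps, width)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])
```

Lowering `fps` and `width` is what keeps the resulting GIF small enough for smooth in-page playback.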

To recap the full walkthrough:

  1. We saved the article’s illustrations to disk.

  2. We had a multimodal chat model (gpt-4o) analyze each image and suggest a prompt for the video.

  3. We generated a short video from each image and its prompt, using the kling-video/v1.6/pro/image-to-video model from Kling AI.

  4. We converted the two generated videos into GIF animations using a free third-party web service, for easier playback on a web page. We also reduced the frame rate and size for smoother playback, and saved the resulting GIF files in the same folder, under the same names as the original PNGs.

  5. Finally, you can ask any chat model (e.g., gpt-4o) to generate a web page with the original text and the GIF animations placed in the same spots as the original illustrations. 🎉
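Assembling such a page can also be sketched directly in Python, without a chat model. The HTML skeleton and the one-GIF-per-paragraph layout below are illustrative assumptions:

```python
def build_page(title: str, paragraphs: list[str], gif_names: list[str]) -> str:
    """Interleave article paragraphs with animated GIFs, one GIF per paragraph
    until the GIFs run out."""
    parts = [f"<h1>{title}</h1>"]
    for i, text in enumerate(paragraphs):
        if i < len(gif_names):
            parts.append(f'<img src="{gif_names[i]}" alt="">')
        parts.append(f"<p>{text}</p>")
    return "<!DOCTYPE html><html><body>" + "".join(parts) + "</body></html>"
```

Writing the returned string to an `.html` file next to the GIF folder yields a local animated version of the article.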