Vision Models

Welcome to the Vision Models API documentation! The AI/ML API allows you to leverage vision capabilities to analyze and understand images through our models.

Key Features

  • Image Analysis: Understand and describe the content of images.

  • Flexible Input Methods: Supports both image URLs and base64 encoded images.

  • Multiple Image Inputs: Analyze multiple images in a single request.

Quick Start

Images can be provided to the model in two main ways: by passing an image URL or by passing the base64 encoded image directly in the request.

Currently we support gpt-4o vision and gpt-4-turbo vision.

Example: What's in this image?

Python Example

import requests
import json

url = ""  # set this to the API endpoint URL

payload = json.dumps({
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": ""  # URL of the image to analyze
          }
        }
      ]
    }
  ],
  "max_tokens": 300
})

headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer YOUR_API_KEY'
}

# Send the request; the payload is already serialized to JSON
response = requests.post(url, headers=headers, data=payload)
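
Once the request succeeds, the reply text has to be pulled out of the JSON body. The sketch below assumes the response follows the OpenAI-style chat completions shape (`choices[0].message.content`), which this page does not confirm. `extract_reply` is a hypothetical helper name, and the sample dictionary is made up for illustration.

```python
def extract_reply(response_json):
    """Return the assistant's text from a chat-completions-style response.

    Assumes an OpenAI-style body: {"choices": [{"message": {"content": ...}}]}.
    Returns None if the expected fields are missing.
    """
    choices = response_json.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")


# Illustrative, made-up response body (not real API output).
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "A cat sitting on a sofa."}}
    ]
}
print(extract_reply(sample))
```

With the request code above, this would be called as `extract_reply(response.json())`.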

Uploading Base64 Encoded Images

For local images, you can pass the base64 encoded image to the model.

Python Example

import base64
import requests

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "path_to_your_image.jpg"
base64_image = encode_image(image_path)

url = ""  # set this to the API endpoint URL
headers = {
  "Content-Type": "application/json",
  "Authorization": "Bearer YOUR_API_KEY"
}
payload = {
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        # The image is embedded as a base64 data URL
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
      ]
    }
  ],
  "max_tokens": 300
}

# json=payload lets requests serialize the dictionary
response = requests.post(url, headers=headers, json=payload)
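
The example above hard-codes `image/jpeg` in the data URL. For other formats, the MIME type can be guessed from the file extension with the standard-library `mimetypes` module. This is a minimal sketch: `to_data_url` is a hypothetical helper name, and falling back to `image/jpeg` for unknown extensions is an assumption, not documented behavior.

```python
import base64
import mimetypes


def to_data_url(image_path, default_mime="image/jpeg"):
    """Encode a local image file as a base64 data URL."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None:
        mime_type = default_mime  # assumed fallback for unknown extensions
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be used directly as the `url` value of an `image_url` entry.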

Multiple Image Inputs

The API can process multiple images in a single request.

Python Example

import requests
import json

url = ""  # set this to the API endpoint URL

payload = json.dumps({
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What are in these images? Is there any difference between them?"},
        # One image_url entry per image
        {"type": "image_url", "image_url": {"url": ""}},
        {"type": "image_url", "image_url": {"url": ""}}
      ]
    }
  ],
  "max_tokens": 300
})

headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer YOUR_API_KEY'
}

# Send the request; the payload is already serialized to JSON
response = requests.post(url, headers=headers, data=payload)
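
Since every image is just another `image_url` entry in the `content` list, the list can be built programmatically for any number of images. A minimal sketch, where `build_content` is a hypothetical helper and the example.com URLs are placeholders:

```python
def build_content(prompt, image_urls):
    """Build a message `content` list: one text part, then one part per image."""
    content = [{"type": "text", "text": prompt}]
    for image_url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return content


content = build_content(
    "What are in these images? Is there any difference between them?",
    # Placeholder URLs for illustration only
    ["https://example.com/first.jpg", "https://example.com/second.jpg"],
)
messages = [{"role": "user", "content": content}]
```

The resulting `messages` list takes the place of the hand-written one in the payload.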
