AI/ML API Documentation


Documentation Map

Learn how to get started with the AI/ML API

This page helps you quickly find the right AI model for your task. Open the API reference and copy a working example to integrate it into your code in minutes.


Trending Models

  • Pro-Grade Image Model

  • Top Video Generator

  • Smarter Reasoning & Coding


Start with this code block 🚀 Setup guide 🧩 SDKs

▶️ Run in Playground

from openai import OpenAI

client = OpenAI(
    base_url="https://api.aimlapi.com/v1",
    api_key="<YOUR_AIMLAPI_KEY>",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a one-sentence story about numbers."}]
)

print(response.choices[0].message.content)


Browse Models

Popular | View all 400+ models >

Select a model by its Task, its Developer, or the supported Capabilities:

If you've already made your choice and know the model ID, use the Search panel on the right.

  • Text Models (LLM)
  • Image Models
  • Video Models
  • Music Models
  • Voice/Speech Models
  • 3D-Generating Models
  • Vision Models
  • Embedding Models

Alibaba Cloud: Text/Chat Image Video Text-to-Speech Embedding

Anthracite: Text/Chat

Anthropic: Text/Chat Embedding

Assembly AI: Speech-To-Text

Baidu: Text/Chat

ByteDance: Text/Chat Image Video

Cohere:

DeepSeek:

Deepgram:

ElevenLabs:

Flux:

Google:

Gryphe:

Hume AI:

Inworld:

Kling AI:

Krea:

LTXV:

Meta:

Microsoft:

MiniMax:

Mistral AI:

Moonshot:

NousResearch:

NVIDIA:

OpenAI:

Perplexity:

PixVerse:

RecraftAI:

Reve:

Runway:

Stability AI:

Sber AI:

Tencent:

VEED:

xAI:

Zhipu:


Going Deeper

Use more text model capabilities in your project:

  • 📖 Completion and Chat Completion

  • 📖 Streaming Mode

  • 📖 Code Generation

  • 📖 Thinking / Reasoning

  • 📖 Function Calling

  • 📖 Vision in Text Models

  • 📖 Web Search

Miscellaneous:

  • 🔗 Integrations

  • 📗 Glossary

  • ⚠️ Errors and Messages

  • ❓ FAQ

Learn more about developer-specific features:

  • 📖 Features of Anthropic Models

Have a Minute? Help Make the Docs Better!

We’re currently working on improving our documentation portal, and your feedback would be incredibly helpful! Take a quick 5-question survey (no personal info required).

You can also rate each individual page using the built-in form on the right side of the screen.

Have suggestions for improvement? Let us know!

Quickstart

Access leading AI models (GPT-4o, Gemini, and others) through a single unified API. Initial setup takes just a few minutes.


If you are a manager and simply want to test a model to evaluate its performance, for instance in content generation, the quickest approach is to use our Playground. It offers an intuitive, user-friendly interface with no coding required.

Programmatic API calls are best suited for developers who want to integrate a model into their own apps.


Here, you'll learn how to start using our API in your code. The following steps must be completed regardless of which of our models you plan to call:

  • generating an AIML API Key,

  • choosing and preparing your development environment,

  • making an API call.

Let's walk through an example of connecting to the Gemma 3 model via REST API.

Generating an AIML API Key

What is an API Key?

You can find your AIML API key on the account page.

An AIML API key is a credential that grants you access to our API from your code. It is a sensitive string that is shown only at creation time and should be kept confidential. Do not share this key with anyone, as it could be misused without your knowledge. If you lose it, generate a new key from your dashboard.

⚠️ Note that API keys from third-party organizations cannot be used with our API: you need an AIML API Key.

To use the AIML API, you need to create an account and generate an AIML API key. Follow these steps:

  1. Create an Account: Visit the AI/ML API website and create an account.

  2. Generate an API Key: After logging in, navigate to your account dashboard and generate your API key. Ensure that the key is enabled in the UI.
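Once generated, avoid hard-coding the key in source files. A minimal sketch of reading it from the environment instead (the variable name AIML_API_KEY is our own convention here, not an official requirement):

```python
import os

def auth_headers(key=None):
    """Build the Authorization header used by all AI/ML API examples.

    Reads the key from the AIML_API_KEY environment variable (a naming
    convention assumed here) when not passed explicitly.
    """
    key = key or os.environ.get("AIML_API_KEY", "")
    return {"Authorization": f"Bearer {key}"}
```

This keeps the secret out of version control: set the variable once in your shell and pass `auth_headers()` to every request.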


Choosing the Development Environment

Each language has recommended environments for running code samples.

cURL

  • REQBIN is a web-based REST client that lets you quickly run cURL requests directly in your browser, without installing any tools.

  • Git Bash (Windows) or the built-in Terminal (macOS/Linux) allows you to run cURL examples and other command-line tools locally.

Python

  • Jupyter Notebook is a popular online environment for running Python code and is the fastest option if you do not want to install anything locally.

  • Visual Studio Code (VS Code) is a lightweight and widely used code editor that supports both Python and Node.js. It is suitable for running and debugging local examples and for working on real projects.

JavaScript

  • Visual Studio Code (VS Code)

In the examples below for cURL, JavaScript, and Python, we use the REST API. This approach works with all of our APIs, but it is not the only way to integrate: you can also use other supported SDKs.

Making an API Call

Depending on your environment, you will call our API in different ways. Below are three common ways to call it: cURL (a command-line tool for making HTTP requests rather than a programming language), Python, and JavaScript (Node.js).

If you want to get started really quickly, choose one of the four expandable sections below. Each one contains instructions for calling our model using different tools and environments. The first two options are especially simple and suitable even for beginners.

For completeness, the same example is explained in detail in the Code Step-by-Step section.

curl -L \
  --request POST \
  --url 'https://api.aimlapi.com/v1/chat/completions' \
  --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "google/gemma-3-4b-it",
    "messages": [
      {
        "role": "user",
        "content": "Tell me about San Francisco"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'
const userPrompt = 'Tell me about San Francisco' // insert your request here

async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'google/gemma-3-4b-it',
      messages:[
          {
              role:'user',
              content: userPrompt
          }
      ],
      temperature: 0.7,
      max_tokens: 512,
    }),
  });

  const data = await response.json();
  const answer = data.choices[0].message.content;
  
  console.log('User:', userPrompt);
  console.log('AI:', answer);
}

main();
import requests 

user_prompt = "Tell me about San Francisco"  # insert your request here

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization":"Bearer <YOUR_AIMLAPI_KEY>"
    },
    json={
        "model":"google/gemma-3-4b-it",
        "messages":[  
            {
                "role":"user",
                "content": user_prompt
            }
        ],
        "temperature": 0.7,
        "max_tokens": 512,
    }
)

data = response.json()
answer = data["choices"][0]["message"]["content"]

print("User:", user_prompt)
print("AI:", answer)
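The Python example above indexes into the JSON response directly, so an error payload (invalid key, rate limit) would surface as a confusing KeyError. A hedged sketch of a more defensive variant (the helper names are our own, not part of the API):

```python
import requests

API_URL = "https://api.aimlapi.com/v1/chat/completions"

def build_request(prompt, model="google/gemma-3-4b-it",
                  temperature=0.7, max_tokens=512):
    """Assemble the same JSON payload the examples above send."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

def extract_answer(data):
    """Pull the generated text out of the response, failing with a clear
    message when the API returned an error payload instead."""
    try:
        return data["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError):
        raise RuntimeError(f"Unexpected API response: {data!r}")

def chat(prompt, api_key):
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_request(prompt),
        timeout=60,
    )
    response.raise_for_status()  # surface 401/403/429 as exceptions
    return extract_answer(response.json())
```

Usage is the same as before: `print(chat("Tell me about San Francisco", "<YOUR_AIMLAPI_KEY>"))`.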
⭐ How to run a cURL example in a web-based REST client (REQBIN)

Calling the API via cURL through a web service like this is the simplest and fastest method, requiring no additional libraries. However, there is a downside: cURL is not a programming language, which means it has very limited capabilities for adding logic (only API calls, no loops or conditions). You can’t even extract just the specific field with the model’s text response: cURL returns the model’s full output, as you’ll see below.


1. Copy the cURL example above and paste it into a text editor, such as Notepad or Notepad++.

2. Replace the placeholder <YOUR_AIMLAPI_KEY> with your actual AIMLAPI Key.

3. If needed, modify the prompt (the content field).

4. Copy the modified example, go to the REQBIN website, paste it into the designated field, and click Run.

5. After the model processes your request, the model’s full output will be shown directly below the input field.


Pro tip: try experimenting with the three different ways of displaying the model’s output. Some are more readable than others.

⭐ How to run a Python example in an online Jupyter Notebook

This is the second fastest option, and a much more convenient one, offering more flexibility for customizing how the output is displayed in code.


1. When you open Jupyter Notebook for the first time, select “Python 3.13 (XPython)” in the pop-up window to indicate the programming language kernel you will be working with.

In some browsers, the kernel selection may look different.

2. Enter the following command in the first cell to install the requests library:

%pip install requests

Click the Run button in the toolbar above the cell to execute it.

3. Paste our example into the second cell, replace the placeholder with your AIMLAPI Key, then click the Run button in the toolbar.

4. After the model processes your request, the result will be shown directly below the cell.

How to run a Python example locally from the command line (without an IDE)

Let's start from the very beginning. We assume you already have Python installed (with venv); if not, start with a beginner's guide.

Create a new folder for the test project, name it aimlapi-welcome, and change into it:

mkdir ./aimlapi-welcome
cd ./aimlapi-welcome

(Optional) If you use an IDE, we recommend opening the created folder as a workspace. For example, in Visual Studio Code you can do this with:

code .

Open a terminal inside the created folder and create a virtual environment:

python3 -m venv ./.venv

Activate the created virtual environment:

# Linux / macOS
source ./.venv/bin/activate
# Windows
.\.venv\Scripts\activate

Install the required dependencies. In our case (REST API) we only need the requests library:

pip install requests

Create a new file and name it travel.py:

touch travel.py

Paste the Python example from the tab above into travel.py and replace <YOUR_AIMLAPI_KEY> with the API key from your account page.

Run the application:

python3 ./travel.py

If you did everything correctly, you will see your prompt echoed as “User: …” followed by the model's answer as “AI: …”.

How to run a JavaScript example locally from the command line (without an IDE)

We assume you already have Node.js installed. If not, start with a beginner's guide.

Create a new folder for the example project:

mkdir ./aimlapi-welcome
cd ./aimlapi-welcome

Create a project file:

npm init -y

Create a file with the source code:

touch ./index.js

And paste the following content to the file and save it:

async function main() {
  const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
      'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'google/gemma-3-4b-it',
      messages:[
          {
              role:'user',
              content: 'Tell me about San Francisco'  // Insert your prompt here
          }
      ],
      temperature: 0.7,
      max_tokens: 256,
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
}

main();

Run the file:

node ./index.js

You will see the model's full JSON response printed to the console.


Code Step-by-Step

Below is a step-by-step explanation of the same API call in three variants: cURL, JavaScript, and Python. All three examples send an identical request to the google/gemma-3-4b-it chat model.

cURL

1. Command start

curl -L \

Runs the cURL HTTP client. The -L flag tells cURL to follow redirects (if any).


2. HTTP method

--request POST \

Specifies that the request uses the POST method.


3. Endpoint

--url 'https://api.aimlapi.com/v1/chat/completions' \

The full endpoint URL used to call chat models.


4. Authorization header

--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \

Sends your AIMLAPI key in the Authorization header.


5. Content type

--header 'Content-Type: application/json' \

Indicates that the request body is JSON.


6. Request body

This is the payload sent to the API:

  • model – the model identifier.

  • messages – the chat history.

    • role: "user" – the user message.

    • content – the user prompt.

  • temperature – controls output randomness.

  • max_tokens – the maximum number of tokens in the response.


These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.

With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words— with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.


7. Response

In the cURL example, you receive the entire JSON response. No fields are extracted — cURL simply prints the raw output.

JavaScript (Node.js)

1. Define the user prompt

const userPrompt = 'Tell me about San Francisco'

Stores the text of the user request.


2. Call the API

const response = await fetch(
  'https://api.aimlapi.com/v1/chat/completions',
  { ... }
);

Sends an HTTP request to the endpoint.


3. HTTP method

method: 'POST',

Specifies that the request uses the POST method.


4. Headers

  • Sends your AIMLAPI key in the Authorization header.

  • Indicates that the request body is JSON.


5. Request body

This is the payload sent to the API:

  • model – the model identifier.

  • messages – the chat history.

    • role: "user" – the user message.

    • content – the user prompt.

  • temperature – controls output randomness.

  • max_tokens – the maximum number of tokens in the response.


These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.

With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words— with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.


6. Parse the response

Converts the API response into a JavaScript object.


7. Extract the model’s text output

Reads the text of the first generated message.


8. Print the result

Output formatting: from the model’s full response, only the generated text is extracted, and it is presented together with the original prompt in a dialogue-style format.


Python

1. Import the HTTP library

import requests

The requests library is used to send HTTP requests.


2. Define the user prompt

user_prompt = "Tell me about San Francisco"

Stores the text of the user query.


3. Call the API

Sends a POST request to the endpoint.


4. Headers

  • Sends your AIMLAPI key in the Authorization header.

  • Indicates that the request body is JSON.


5. Request body

This is the payload sent to the API:

  • model – the model identifier.

  • messages – the chat history.

    • role: "user" – the user message.

    • content – the user prompt.

  • temperature – controls output randomness.

  • max_tokens – the maximum number of tokens in the response.


These are the input parameters used to tell the endpoint—which in this case generates text answers—what exactly we want it to produce.

With the parameters shown above, we are effectively asking the API to use the google/gemma-3-4b-it model and generate a reasonably vivid and engaging description of San Francisco, limited to roughly 300–350 words— with the temperature and max_tokens parameters controlling the creativity and approximate length of the output, respectively.


6. Parse the response

Converts the JSON response into a Python dictionary.


7. Extract the model’s text output

Reads the text of the first generated message.


8. Print the result

Output formatting: from the model’s full response, only the generated text is extracted, and it is presented together with the original prompt in a dialogue-style format.


Future Steps

  • Move to production-ready models: see the guide for connecting GPT-4o

  • Browse and compare AI models, including GPT, Claude, and many others, using the Playground

  • Know more about supported SDKs

  • Learn more about special text model capabilities

Features of Anthropic Models

Service Endpoints

Anthropic

The chat models from this provider have some unique characteristics. In addition to the standard v1/chat/completions endpoint, you can also call Anthropic models via the /messages endpoint. This capability is described in more detail in the Capabilities section.
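As a rough sketch of the shape of such a request (assuming the /messages endpoint mirrors Anthropic's native Messages schema; the model id here is illustrative, and the authoritative request format is in the Capabilities section):

```python
def build_messages_payload(prompt, model="claude-3-5-sonnet", max_tokens=512):
    """Anthropic-style Messages payload (sketch, not the official schema
    reference). max_tokens is a required field in Anthropic's Messages
    schema, unlike in chat/completions where it is optional."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

# The payload would then be POSTed to the /messages endpoint with the
# usual 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' header.
```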

Baidu

ByteDance

Cohere

Google

Gryphe

MiniMax



    OpenClaw AI/ML

    About

    OpenClaw is an AI platform for building AI agents and assistants. It runs on your own devices and connects to popular messaging platforms (such as WhatsApp, Telegram, Slack, Discord, and others) while preserving full data privacy (all agent data is stored locally in a SQLite database).

    Developers use OpenClaw to build multi-channel AI assistants with streaming responses, browser automation, vision, and voice features. It includes a local Gateway service, a CLI for management, and support for 12+ messaging platforms.


    Data privacy: OpenClaw stores data locally by default. Nothing is sent externally unless you configure it.

    What you get

    • Multi-channel assistants and routing across 12+ messaging platforms

    • Streaming responses for faster, more interactive chats

    • Vision inputs for image understanding and UI analysis

    • Browser automation via an OpenClaw-managed Chrome instance


    Prerequisites

    • An AIMLAPI key obtained from your account page

    • Node.js and npm

    • pnpm if you build from source


    Installation

    Option 1: Install via npm (recommended)


    openclaw-aimlapi@latest includes two AI/ML API skills:

    • aimlapi-media-gen for images and video

    The onboarding wizard installs the Gateway as a system service. It uses launchd on macOS and systemd on Linux.

    Option 2: Build from source

    UI walkthrough (screenshots)

    Option 3: Install skills from the official repo (ClawHub)

    Use this if you want to install or update skills separately from the OpenClaw package.

    Install the CLI

    Pick one:

    For more details, see: .

    Install the skills

    How it fits into OpenClaw

    • By default, clawhub installs skills into ./skills under your current directory.

    • If an OpenClaw workspace is configured, clawhub falls back to that workspace.

    • Override the install location with --workdir

    What these skills do

    aiml-image-video — Our media generation models

    Generate images and videos via two Python scripts (gen_image.py, gen_video.py).

    aiml-llm-reasoning — Our LLMs + Reasoning

    Run chat completions via run_chat.py. Use --extra-json for advanced params.


    Paths above assume you run clawhub install ... from your OpenClaw workspace root (so skills land in ./skills). If you install somewhere else, adjust the paths to match your --workdir.


    If you installed OpenClaw via openclaw-aimlapi@latest, you may already have AIML-related skills installed. Use ClawHub when you specifically want the skills from the official skills repository.


    Configure AI/ML API in OpenClaw

    Use the Web UI from onboarding. The default URL is usually .


    Select provider

    Pick AI/ML API in the providers list.



    Use OpenClaw

    Use via a chat connector (Telegram example)

    1. Message your bot. You will receive a pairing code.

    2. Approve the pairing:


    Expected output looks like this:

    3. Message your bot again. You should get a response.

    Use via CLI

    Example response

    Use Cases

    Example: Route Slack + Discord to the same agent
    1. User messages the bot on Slack or Discord.

    2. Gateway receives the message with platform context.

    Example: Analyze a web page with vision
    1. User requests a web page analysis.

    2. OpenClaw opens a Chrome instance (CDP-controlled).


    Supported models

    • OpenAI models and others


    More

    Supported SDKs

    A description of the software development kits (SDKs) that can be used to interact with the AIML API.

    This page describes the SDKs that can be used to call our API.

    Key Definitions & Notes

    The REST API itself is not an SDK. It is the server-side interface that exposes the models over HTTP. It defines endpoints, HTTP methods (POST/GET), required headers, and the structure of request and response JSON. Essentially, it’s the “contract” the server provides for clients to interact with models programmatically.


    An SDK (Software Development Kit) is a client-side library that wraps around the REST API. It handles details like building HTTP requests, serializing/deserializing JSON, error handling, retries, and sometimes additional conveniences.

    You can skip the SDK and call the REST API directly via cURL, fetch, requests, etc. The SDK just makes your life easier; the REST API is the “core interface” the SDK talks to.


    The following flow shows how a request travels from your code to the model and back. Using an SDK is optional — it simply wraps the REST API for convenience.

    Your code → SDK (optional) → REST API → Model → REST API → SDK → Your code


    When comparing requests made with the raw REST API and with different SDKs, pay attention to the following common aspects:

    • how the Authorization header and the AIML API key are provided,

    • how the POST method and the endpoint URL are specified,
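As an illustration of those two aspects, here is the same call expressed both ways (a sketch; the SDK portion is shown in comments so the contrast stays side by side):

```python
# Raw REST: every HTTP detail is explicit in your code.
endpoint = "https://api.aimlapi.com/v1/chat/completions"   # endpoint URL
method = "POST"                                            # HTTP method
headers = {
    "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",          # how the key is provided
    "Content-Type": "application/json",
}
payload = {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hi"}]}
# requests.post(endpoint, headers=headers, json=payload) sends it.

# OpenAI SDK: the same details are configured once on the client, and the
# method call implies both the POST method and the endpoint path:
#
#   from openai import OpenAI
#   client = OpenAI(base_url="https://api.aimlapi.com/v1",
#                   api_key="<YOUR_AIMLAPI_KEY>")
#   client.chat.completions.create(model="gpt-4o",
#                                  messages=payload["messages"])
```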


    Also take a look at the Integrations section: it covers many third-party services and libraries (workflow platforms, coding assistants, etc.) that allow you to integrate our models in various ways.


    REST API

    We use the REST API because it’s fast, simple, and easy to understand. Only in Python do you need to import a separate library (requests); cURL and JavaScript (Node.js) already have built-in support for HTTP requests. Therefore, the REST API is used in the documentation examples for all of our models.

    Installation

    In Python examples, you need to import the requests library. The Node.js and cURL examples do not require any additional imports.

    Install the library first:

    pip install requests

    Import the library in every Python code snippet where you make calls to the REST API.

    Authorization

    Our API authorization is based on a Bearer token. Include it in the Authorization HTTP header within the request. Example:
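For instance, in Python the header can be built like this (a minimal sketch; the placeholder stands for your real key):

```python
# Bearer-token authorization header, as used in all examples in this
# documentation:
headers = {"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"}
```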

    Request Example


    OpenAI

    The OpenAI SDK is a convenient library that simplifies working with our API. It automatically handles JSON responses, includes built-in error handling and retry logic, and provides simple, easy-to-use methods for all API features such as chat, embeddings, and completions.

    The AI features that the OpenAI SDK supports
    • Streaming

    • Completions

    Installation

    Python

    1. Make sure you have Python 3.7+ and pip installed.

    2. Install the OpenAI SDK via terminal:

    pip install openai

    In Jupyter Notebook, you can also use:

    %pip install openai

    3. Import the SDK:

    from openai import OpenAI


    JavaScript (Node.js)

    1. Make sure you have Node.js 18+ and npm installed.

    2. Install the OpenAI SDK in your project:

    npm install openai

    3. Import the SDK and initialize the client:

    import OpenAI from 'openai';

    const client = new OpenAI({
      baseURL: 'https://api.aimlapi.com/v1',
      apiKey: '<YOUR_AIMLAPI_KEY>',
    });

    Example Code


    AI/ML API Python library

    We have started developing our own SDK to simplify the use of our service.


    If you’d like to contribute to expanding its functionality, feel free to reach out to us on Discord!

    Installation

    After obtaining your AIML API key, create a .env file and copy the required contents into it.

    Copy the line below, paste it into your .env file, and replace <YOUR_AIMLAPI_KEY> with your actual key:

    AIML_API_KEY="<YOUR_AIMLAPI_KEY>"

    Install the package:

    Request Example

    To execute the script, use:


    Next Steps

    Account Balance

    [legacy] Get account balance info


    This endpoint is considered legacy and is scheduled for future deprecation. Please plan to migrate to the new /v2/billing and /v2/billing/detail endpoints documented below.

    You can query your account balance and other billing details through this API. To make a request, you only need your AIMLAPI key obtained from your .

Get balance info

    Returns a user's balance.

Get detailed billing info

    Returns detailed billing information, balance and auto top-up settings.
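Both endpoints can be called with a plain GET plus the Bearer header; a sketch using the standard library (endpoint paths as given in the deprecation note above):

```python
import json
import urllib.request

AIMLAPI_KEY = "<YOUR_AIMLAPI_KEY>"  # replace with your real key

def billing_request(detail=False):
    # /v2/billing returns the balance; /v2/billing/detail adds auto top-up settings.
    path = "/v2/billing/detail" if detail else "/v2/billing"
    return urllib.request.Request(
        "https://api.aimlapi.com" + path,
        headers={"Authorization": f"Bearer {AIMLAPI_KEY}"},
    )

def fetch(detail=False):
    # Performs the actual GET request; requires a valid key.
    with urllib.request.urlopen(billing_request(detail)) as resp:
        return json.load(resp)
```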

    API Key Management

    POST https://api.aimlapi.com/v1/keys

    GET https://api.aimlapi.com/v1/keys

    GET https://api.aimlapi.com/v1/key

    PATCH https://api.aimlapi.com/v1/keys/{prefix}
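A sketch of issuing the four calls above with the standard library; the Bearer header authenticates each request, and the prefix value in the PATCH path is a placeholder:

```python
import urllib.request

AIMLAPI_KEY = "<YOUR_AIMLAPI_KEY>"  # replace with your real key
BASE = "https://api.aimlapi.com"

def key_request(method, path, data=None):
    # All key-management calls authenticate with the same Bearer header.
    return urllib.request.Request(
        BASE + path,
        data=data,
        headers={"Authorization": f"Bearer {AIMLAPI_KEY}"},
        method=method,
    )

create_key = key_request("POST", "/v1/keys")        # create a new API key
list_keys = key_request("GET", "/v1/keys")          # list all API keys
current_key = key_request("GET", "/v1/key")         # inspect the key in use
update_key = key_request("PATCH", "/v1/keys/abc123")  # "abc123" is a placeholder prefix
```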

Before you start, you should create an API key.

Create a new API key

    Copy the created key and store it in a secure location. If the key is lost, create a new one.


List API keys

    Returns all API keys for your account, including each key’s settings and metadata.


Get the API key

    Retrieve parameters of the AIMLAPI key used in the request.


Update an API key


Delete an API key

    qwen-max

This documentation is valid for the following list of our models:

• qwen-max

• qwen-max-2025-01-25

Model Overview

The large-scale Mixture-of-Experts (MoE) language model. Excels in language understanding and task performance. Supports 29 languages, including Chinese, English, and Arabic.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.
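With the key in place, the first call can be sketched using only Python's standard library (the payload follows the OpenAI-compatible chat completions format; the prompt is illustrative):

```python
import json
import urllib.request

AIMLAPI_KEY = "<YOUR_AIMLAPI_KEY>"  # replace with your real key

# OpenAI-compatible chat payload; the model ID comes from the list above.
payload = {
    "model": "qwen-max",
    "messages": [{"role": "user", "content": "Summarize what an MoE model is in one sentence."}],
}

request = urllib.request.Request(
    "https://api.aimlapi.com/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {AIMLAPI_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

def run():
    # Call run() once a valid key is set; this performs the actual HTTP request.
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```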

API Schema

Code Example

Response

    qwen-plus

This documentation is valid for the following list of our models:

• qwen-plus

Model Overview

An advanced large language model with multilingual support, including Chinese and English, enhanced reasoning capabilities for complex tasks, and improved instruction-following abilities.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen-turbo

This documentation is valid for the following list of our models:

• qwen-turbo

Model Overview

This model is designed to enhance both the performance and efficiency of AI agents developed on the Alibaba Cloud Model Studio platform. Optimized for speed and precision in generative AI application development. Improves AI agent comprehension and adaptation to enterprise data, especially when integrated with Retrieval-Augmented Generation (RAG) architectures. Large context window (1,000,000 tokens).

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3-32b

This documentation is valid for the following list of our models:

• alibaba/qwen3-32b

Model Overview

A world-class model with quality comparable to DeepSeek R1. Optimized for both complex reasoning and efficient dialogue.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example #1: Without Thinking and Streaming

enable_thinking must be set to false for non-streaming calls.

Response

Code Example #2: Enable Thinking and Streaming

Response

    The example above prints the raw output of the model. The text is typically split into multiple chunks. While this is helpful for debugging, if your goal is to evaluate the model's reasoning and get a clean, human-readable response, you should aggregate both the reasoning and the final answer in a loop — for example:
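The aggregation itself can be sketched independently of the SDK. Given streamed deltas that may carry either a reasoning piece or an answer piece (the field names reasoning_content and content are assumptions for illustration), collect each into its own buffer:

```python
def aggregate(deltas):
    """Collect streamed deltas into (reasoning, answer) strings.

    Each delta is a dict that may carry a 'reasoning_content' piece (the
    model's thinking) or a 'content' piece (the final answer); the field
    names here are assumptions for illustration.
    """
    reasoning, answer = [], []
    for delta in deltas:
        if delta.get("reasoning_content"):
            reasoning.append(delta["reasoning_content"])
        if delta.get("content"):
            answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)

# A toy stream of chunks, standing in for the chunks a real streaming call yields.
chunks = [
    {"reasoning_content": "The user greets us. "},
    {"reasoning_content": "Reply politely."},
    {"content": "Hello! "},
    {"content": "How can I help?"},
]
```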

Example with response parsing

    After running such code, you'll receive only the model's textual output in a clear and structured format:

Response

    qwen3-coder-480b-a35b-instruct

This documentation is valid for the following list of our models:

• alibaba/qwen3-coder-480b-a35b-instruct

Model Overview

The most powerful model in the Qwen3 Coder series: a 480B-parameter MoE architecture with 35B active parameters. It natively supports a 256K token context and can handle up to 1M tokens using extrapolation techniques, delivering outstanding performance in both coding and agentic tasks.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3-next-80b-a3b-instruct

This documentation is valid for the following list of our models:

• alibaba/qwen3-next-80b-a3b-instruct

Model Overview

An instruction-tuned chat model optimized for fast, stable replies without reasoning traces, designed for complex tasks in reasoning, coding, knowledge QA, and multilingual use, with strong alignment and formatting.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3-next-80b-a3b-thinking

This documentation is valid for the following list of our models:

• alibaba/qwen3-next-80b-a3b-thinking

Model Overview

The model may take longer to generate reasoning content than its predecessor. Alibaba Cloud strongly recommends its use for highly complex reasoning tasks.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3-max-preview

This documentation is valid for the following list of our models:

• alibaba/qwen3-max-preview

Model Overview

The preview version of Qwen3 Max.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3-max-instruct

This documentation is valid for the following list of our models:

• alibaba/qwen3-max-instruct

Model Overview

This model offers improved accuracy in math, coding, logic, and science, handles complex instructions in Chinese and English more reliably, reduces hallucinations, supports 100+ languages with stronger translation and commonsense reasoning, and is optimized for RAG and tool use, though it lacks a dedicated ‘thinking’ mode.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3-vl-32b-instruct

This documentation is valid for the following list of our models:

• alibaba/qwen3-vl-32b-instruct

Model Overview

The most advanced vision-language model in the Qwen series as of October 2025, offered here as a non-thinking variant. Optimized for instruction-following in image description, visual dialogue, and content-generation tasks.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3.5-plus

This documentation is valid for the following list of our models:

• alibaba/qwen3.5-plus-20260218

Model Overview

A commercial large language model designed for long-context text generation and enterprise-grade conversational AI. Supports up to 1M tokens per request with production-ready API stability.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3.6-27b

This documentation is valid for the following list of our models:

• alibaba/qwen3.6-27b

Model Overview

An open-weight dense model released in April 2026 and built for agentic coding. It delivers high performance, performs strongly on Terminal-Bench 2.0, beats larger models on SWE-bench Verified, and includes native multimodal support, a 262K context window, and thinking modes.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    qwen3.5-omni-plus

This documentation is valid for the following list of our models:

• alibaba/qwen3.5-omni-plus

Model Overview

A premium multimodal model with support for text, image, audio, and video inputs. Designed for complex tasks requiring advanced reasoning, speech generation, and high-quality outputs.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.
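For multimodal input, the message content becomes a list of typed parts. A minimal sketch of an image-plus-text payload, assuming OpenAI-style content parts; the image URL is a placeholder:

```python
# Hypothetical image URL; replace with a real, accessible image.
IMAGE_URL = "https://example.com/cat.jpg"

payload = {
    "model": "alibaba/qwen3.5-omni-plus",
    "messages": [
        {
            "role": "user",
            # Mixed content: one text part and one image part.
            "content": [
                {"type": "text", "text": "What is shown in this picture?"},
                {"type": "image_url", "image_url": {"url": IMAGE_URL}},
            ],
        }
    ],
}

# The payload is then POSTed to /v1/chat/completions with the usual Bearer header.
```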

API Schema

Code example #1: Chat

Response

Code example #2: Video analysis

Response

    qwen3.5-omni-flash

This documentation is valid for the following list of our models:

• alibaba/qwen3.5-omni-flash

Model Overview

A fast and cost-efficient multimodal model supporting text, image, audio, and video inputs. A lighter and faster version of qwen3.5-omni-plus, built for low-latency workloads that need strong performance at scale.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code example #1

Response

Code example #2: Video analysis

Response

    Claude 4 Opus

This documentation is valid for the following model:

• anthropic/claude-opus-4

As of February 13, 2026, the streaming response format for Anthropic models has changed.

Model Overview

The leading coding model globally, consistently excelling at complex, long-duration tasks and agent-based workflows.

How to Make a Call

Step-by-Step Instructions

1️⃣ Setup You Can’t Skip

▪️ Create an account: Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ Generate an API key: After logging in, navigate to your account dashboard and generate your API key. Ensure the key is enabled in the UI.

2️⃣ Copy the code example

At the bottom of this page, you'll find a code example that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

API Schema

Code Example #1

Response

Code Example #2: Streaming Mode

As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

• the state structure is no longer used,

• input_tokens → prompt_tokens,

• output_tokens → completion_tokens.
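If existing code still reads the old field names, a small adapter can bridge the rename. This is a sketch; completion_tokens as the new name for output_tokens is an assumption based on the OpenAI-style naming used for prompt_tokens:

```python
# Mapping of pre-2026 Anthropic usage field names to the new names.
RENAMES = {"input_tokens": "prompt_tokens", "output_tokens": "completion_tokens"}

def migrate_usage(old_usage):
    """Return a usage dict with the old field names translated to the new ones."""
    return {RENAMES.get(key, key): value for key, value in old_usage.items()}
```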

Response

    Claude 4 Sonnet

This documentation is valid for the following list of our models:

• anthropic/claude-sonnet-4

• claude-sonnet-4

• claude-sonnet-4-20250514

As of February 13, 2026, the streaming response format for Anthropic models has changed.

Model Overview

A major improvement over Claude 3.7 Sonnet, offering better coding abilities, stronger reasoning, and more accurate responses to your instructions.

How to Make a Call

Step-by-Step Instructions

1️⃣ Setup You Can’t Skip

▪️ Create an account: Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ Generate an API key: After logging in, navigate to your account dashboard and generate your API key. Ensure the key is enabled in the UI.

2️⃣ Copy the code example

At the bottom of this page, you'll find a code example that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

API Schema

Code Example #1

Response

Code Example #2: Streaming Mode

As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

• the state structure is no longer used,

• input_tokens → prompt_tokens,

• output_tokens → completion_tokens.

Response

    Claude 4.1 Opus

This documentation is valid for the following list of our models:

• anthropic/claude-opus-4.1

• claude-opus-4-1

• claude-opus-4-1-20250805

All three IDs listed above refer to the same model; we support them for backward compatibility.

As of February 13, 2026, the streaming response format for Anthropic models has changed.

Model Overview

An upgrade to Claude 4 Opus on agentic tasks, real-world coding, and thinking.

How to Make a Call

Step-by-Step Instructions

1️⃣ Setup You Can’t Skip

▪️ Create an account: Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ Generate an API key: After logging in, navigate to your account dashboard and generate your API key. Ensure the key is enabled in the UI.

2️⃣ Copy the code example

At the bottom of this page, you'll find code examples that show how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

API Schema

Code Example #1: Without Thinking

Response

Code Example #2: Thinking Enabled

Response

Code Example #3: Streaming Mode

As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

• the state structure is no longer used,

• input_tokens → prompt_tokens,

• output_tokens → completion_tokens.

Response

    Claude 4.5 Haiku

This documentation is valid for the following list of our models:

• claude-haiku-4-5

• anthropic/claude-haiku-4.5

• claude-haiku-4-5-20251001

As of February 13, 2026, the streaming response format for Anthropic models has changed.

Model Overview

The model offers coding performance comparable to Claude Sonnet 4, but at one-third the cost and more than twice the speed.

How to Make a Call

Step-by-Step Instructions

1️⃣ Setup You Can’t Skip

▪️ Create an account: Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ Generate an API key: After logging in, navigate to your account dashboard and generate your API key. Ensure the key is enabled in the UI.

2️⃣ Copy the code example

At the bottom of this page, you'll find a code example that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

API Schema

Code Example #1

Response

Code Example #2: Streaming Mode

As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

• the state structure is no longer used,

• input_tokens → prompt_tokens,

• output_tokens → completion_tokens.

Response

    Claude 4.5 Opus

This documentation is valid for the following list of our models:

• anthropic/claude-opus-4-5

• claude-opus-4-5

• claude-opus-4-5-20251101

As of February 13, 2026, the streaming response format for Anthropic models has changed.

Model Overview

A high-performance chat model that delivers state-of-the-art results on real-world software engineering benchmarks.

How to Make a Call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example #1

Response

Code Example #2: Streaming Mode

As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

• the state structure is no longer used,

• input_tokens → prompt_tokens,

• output_tokens → completion_tokens.

Response

    Claude 4.6 Sonnet

This documentation is valid for the following list of our models:

• anthropic/claude-sonnet-4.6

• anthropic/claude-sonnet-4-6-20260218

Model Overview

A general-purpose LLM with an optimal balance of intelligence, cost, and speed. It’s great for chatbots, assistants, and production text generation workflows, and it supports prompt caching for efficient repeated contexts.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    ernie-4.5-0.3b

This documentation is valid for the following list of our models:

• baidu/ernie-4.5-0.3b

Model Overview

A small dense language model suitable for edge-side use and budget-constrained inference.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    ernie-4.5-21b-a3b-thinking

This documentation is valid for the following list of our models:

• baidu/ernie-4.5-21b-a3b-thinking

Model Overview

A post-trained LLM with 21B total parameters and 3B activated parameters per token. Reasoning variant.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    ernie-4.5-vl-28b-a3b

This documentation is valid for the following list of our models:

• baidu/ernie-4.5-vl-28b-a3b

Model Overview

A post-trained LLM with 28B total parameters and 3B activated parameters per token. A non-reasoning variant with image and PDF input support.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    ernie-4.5-300b-a47b

This documentation is valid for the following list of our models:

• baidu/ernie-4.5-300b-a47b

Model Overview

A post-trained LLM with 300B total parameters and 47B activated parameters per token. Non-reasoning variant.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    ernie-4.5-turbo-128k

This documentation is valid for the following list of our models:

• baidu/ernie-4-5-turbo-128k

Model Overview

A model from the ERNIE 4.5 Turbo subfamily, which Baidu presents as a faster and more cost-efficient alternative to the base ERNIE 4.5. It is optimized for improved response speed and stability, and its context window of approximately 128K tokens enables the processing of entire documents or long-running dialogues.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    ernie-4.5-turbo-vl-32k

This documentation is valid for the following list of our models:

• baidu/ernie-4-5-turbo-vl-32k

Model Overview

A model from the ERNIE 4.5 Turbo subfamily with multimodal support (text and images), offering a balanced trade-off between performance and computational cost.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model parameter to one of the IDs listed above.

API Schema

Code Example

Response

    ernie-5.0-thinking-preview

    circle-info

    This documentation is valid for the following list of our models:

    • baidu/ernie-5-0-thinking-preview

    hashtag
    Model Overview

    A reasoning-focused model designed for complex, multi-step problem solving. It improves accuracy on analytical tasks by producing explicit reasoning.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    ernie-5.0-thinking-latest

    circle-info

    This documentation is valid for the following list of our models:

    • baidu/ernie-5-0-thinking-latest

    hashtag
    Model Overview

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    ernie-x1-turbo-32k

    circle-info

    This documentation is valid for the following list of our models:

    • baidu/ernie-x1-turbo-32k

    hashtag
    Model Overview

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    ernie-x1.1-preview

    circle-info

    This documentation is valid for the following list of our models:

    • baidu/ernie-x1-1-preview

    hashtag
    Model Overview

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    Seed 1.8

    circle-info

    This documentation is valid for the following list of our models:

    • bytedance/seed-1-8

    hashtag
    Model Overview

    A general-purpose agentic model optimized for efficient and accurate execution of complex tasks in real-world scenarios.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    Dola Seed 2.0 Mini

    circle-info

    This documentation is valid for the following list of our models:

    • bytedance/dola-seed-2-0-mini

    hashtag
    Model Overview

    A fast and cost-efficient multimodal model for lightweight tasks. Supports text, image, and video inputs with reasoning and agent workflows, handling up to ~256K context.
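Since the model accepts image inputs, a request can mix text and image parts in a single user message. The sketch below assumes the OpenAI-compatible content-part format used elsewhere in this documentation; the image URL is a placeholder, not a real asset.

```python
# Hypothetical multimodal payload: one user turn carrying both a text part
# and an image part. Drop it into any chat/completions example on this page.
payload = {
    "model": "bytedance/dola-seed-2-0-mini",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this picture in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            ],
        }
    ],
}
```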

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    Dola Seed 2.0 Pro

    circle-info

    This documentation is valid for the following list of our models:

    • bytedance/dola-seed-2-0-pro

    hashtag
    Model Overview

    A high-performance multimodal model focused on quality and deeper reasoning. Supports text, image, and video inputs with reasoning and agent workflows, handling up to ~256K context.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    Dola Seed 2.0 Code

    circle-info

    This documentation is valid for the following list of our models:

    • bytedance/dola-seed-2-0-code

    hashtag
    Model Overview

    A multimodal model optimized for programming and technical tasks. Supports text, image, and video inputs with reasoning and agent workflows, handling up to ~256K context.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    command-a

    circle-info

    This documentation is valid for the following list of our models:

    • cohere/command-a

    hashtag
    Model Overview

    A powerful LLM with advanced capabilities for enterprise applications.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    DeepSeek V3

    circle-info

    This documentation is valid for the following list of our models:

    • deepseek-chat

    • deepseek/deepseek-chat

    • deepseek/deepseek-chat-v3-0324

    circle-check

    We provide the latest version of this model from Mar 24, 2025. All three IDs listed above refer to the same model; we support them for backward compatibility.

    hashtag
    Model Overview

    DeepSeek V3 (or deepseek-chat) is an advanced conversational AI designed to deliver highly engaging and context-aware dialogues. This model excels in understanding and generating human-like text, making it an ideal solution for creating responsive and intelligent chatbots.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    DeepSeek R1

    circle-info

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-r1

    • deepseek-reasoner

    circle-check

    Both IDs listed above refer to the same model; we support them for backward compatibility.

    hashtag
    Model Overview

    DeepSeek R1 is a cutting-edge reasoning model developed by DeepSeek AI, designed to excel in complex problem-solving, mathematical reasoning, and programming assistance.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    DeepSeek Chat V3.1

    circle-info

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-chat-v3.1

    hashtag
    Model Overview

    August 2025 update of the non-reasoning model.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    DeepSeek V3.2 Speciale

    circle-info

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-v3.2-speciale

    hashtag
    Model Overview

    A high-compute variant of DeepSeek-V3.2 that outperforms GPT-5 and matches Gemini-3.0-Pro in reasoning benchmarks, achieving gold-medal-level results at the 2025 International Mathematical Olympiad (IMO) and the International Olympiad in Informatics (IOI).

    circle-check

    chevron-rightHow to make the first API callhashtag
    circle-check

If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    DeepSeek V4 Pro

    circle-info

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-v4-pro

    hashtag
    Model Overview

    A high-performance reasoning model as of late April 2026, designed for complex tasks, coding, and logic-heavy workflows. It supports up to 1M context length and includes an advanced thinking mode for deeper analysis.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    DeepSeek V4 Flash

    circle-info

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-v4-flash

    hashtag
    Model Overview

A fast and cost-efficient language model built for chat and completions. A lighter and faster version of DeepSeek V4 Pro, it supports up to 1M context length and offers both thinking and non-thinking modes for scalable, low-latency workloads.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    gemini-2.0-flash

    circle-info

    This documentation is valid for the following list of our models:

    • google/gemini-2.0-flash

    hashtag
    Model Overview

    A cutting-edge multimodal AI model developed by Google DeepMind, designed to power agentic experiences. This model is capable of processing text and images.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    gemini-2.5-flash

    circle-info

    This documentation is valid for the following list of our models:

    • google/gemini-2.5-flash

    hashtag
    Model Overview

    Gemini 2.5 models are capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    circle-exclamation

    A common issue when using reasoning-capable models via API is receiving an empty string in the content field—meaning the model did not return the expected text, yet no error was thrown.

    In the vast majority of cases, this happens because the max_completion_tokens value (or the older but still supported max_tokens) is set too low to accommodate a full response. Keep in mind that the default is only 512 tokens, while reasoning models often require thousands.
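In practice the fix is simply to request a larger completion budget. The sketch below builds such a payload for the model on this page; the 8192 value is an illustrative choice, not an official recommendation.

```python
def build_request(prompt: str, budget: int = 8192) -> dict:
    """Chat-completions payload with an explicit completion budget.

    The 512-token default is often exhausted by the hidden reasoning trace
    alone, leaving an empty `content` field; budget thousands instead.
    """
    return {
        "model": "google/gemini-2.5-flash",
        "messages": [{"role": "user", "content": prompt}],
        "max_completion_tokens": budget,
    }
```

Drop this payload into any of the request examples on this page; if your code still uses the older `max_tokens` name, raise it in the same way.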

    chevron-rightResponsehashtag

    gemini-2.5-pro

    circle-info

    This documentation is valid for the following list of our models:

    • google/gemini-2.5-pro

    hashtag
    Model Overview

    Gemini 2.5 models are capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    circle-exclamation

    A common issue when using reasoning-capable models via API is receiving an empty string in the content field—meaning the model did not return the expected text, yet no error was thrown.

    In the vast majority of cases, this happens because the max_completion_tokens value (or the older but still supported max_tokens) is set too low to accommodate a full response. Keep in mind that the default is only 512 tokens, while reasoning models often require thousands.

    chevron-rightResponsehashtag

    gemma-3 (27B)

    circle-info

    This documentation is valid for the following list of our models:

    • google/gemma-3-27b-it

    hashtag
    Model Overview

This page describes the large variant of Google’s latest open AI model, Gemma 3. In addition to the capabilities of the smaller Gemma 3 variants, this version also supports system and developer roles, enabling you to pass behavior-shaping instructions for the model.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    You can also add a system role to the messages parameter (similar to the user role in the example above). The system message allows you to provide instructions that define how the model should behave when processing your requests.
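A minimal sketch of that payload shape (the instruction text is illustrative):

```python
def with_system(instructions: str, user_prompt: str) -> list[dict]:
    """Build a messages list whose first entry shapes the model's behavior."""
    return [
        {"role": "system", "content": instructions},
        {"role": "user", "content": user_prompt},
    ]

payload = {
    "model": "google/gemma-3-27b-it",
    "messages": with_system(
        "You are a strict language teacher. Answer in one short sentence.",
        "What kind of model are you?",
    ),
    "max_tokens": 512,
}
```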

    chevron-rightResponse #2hashtag

    gemma-3n-4b

    circle-info

    This documentation is valid for the following list of our models:

    • google/gemma-3n-e4b-it

    hashtag
    Model Overview

    The first open model built on Google’s next-generation, mobile-first architecture—designed for fast, private, and multimodal AI directly on-device. With Gemma 3n, developers get early access to the same technology that will power on-device AI experiences across Android and Chrome later this year, enabling them to start building for the future today.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    circle-info

    Note that the system role is not supported in this model. In the messages parameter, only user and assistant roles are available.
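One common workaround (a sketch, not an official recommendation) is to fold any behavior-shaping instructions into the first user message instead:

```python
def fold_instructions(instructions: str, user_prompt: str) -> list[dict]:
    """Merge behavior-shaping instructions into the first user turn,
    since this model accepts only user and assistant roles."""
    return [{"role": "user", "content": f"{instructions}\n\n{user_prompt}"}]

messages = fold_instructions(
    "Answer in one short sentence.",
    "What kind of model are you?",
)
```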

    chevron-rightResponsehashtag

    gemini-3-flash-preview

    circle-info

    This documentation is valid for the following list of our models:

    • google/gemini-3-flash-preview

    hashtag
    Model Overview

    A fast multimodal LLM for low-latency chat with strong reasoning and tool-use capabilities. Supports text input and optional image understanding for vision-based prompts.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    gemma-4-31b-it

    circle-info

    This documentation is valid for the following list of our models:

    • google/gemma-4-31b-it

    hashtag
    Model Overview

    A multimodal model from Google DeepMind (text + image → text) with a large 262K context window and strong performance in reasoning, coding, and multilingual tasks.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    MythoMax L2 (13B)

    circle-info

    This documentation is valid for the following list of our models:

    • gryphe/mythomax-l2-13b

    hashtag
    Model Overview

    This model represents a pinnacle in the evolution of LLMs, purpose-built for storytelling and roleplaying, delivering a rich sense of connection with characters and narrative arcs.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    Llama-3.3-70B-Versatile

    circle-info

    This documentation is valid for the following list of our models:

    • meta-llama/llama-3.3-70b-versatile

    hashtag
    Model Overview

    An advanced multilingual large language model with 70 billion parameters, optimized for diverse NLP tasks. It delivers high performance across benchmarks while remaining efficient for a wide range of applications.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    Text-01

    circle-info

    This documentation is valid for the following list of our models:

    • MiniMax-Text-01

    hashtag
    Model Overview

    A powerful language model developed by MiniMax AI, designed to excel in tasks requiring extensive context processing and reasoning capabilities. With a total of 456 billion parameters, of which 45.9 billion are activated per token, this model utilizes a hybrid architecture that combines various attention mechanisms to optimize performance across a wide array of applications.

    circle-check

    chevron-rightHow to make the first API callhashtag

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

    hashtag
    API Schema

    hashtag
    Code Example

    chevron-rightResponsehashtag

    All Model IDs

    A full list of available models.

    circle-info

    If you need to select models based on specific parameters for your task, visit the , which offers convenient filtering options. On the selected model’s page, you can find detailed technical and commercial information.

    circle-check

    To fetch the complete model list via the API, see for the relevant service endpoint.

    Text Models (LLM)

    Overview of the capabilities of AIML API text models (LLMs).

    chevron-rightSpecific Capabilitieshashtag

    There are several capabilities of text models that are worth mentioning separately.

    Completion allows the model to analyze a given text fragment and predict how it might continue based on the probabilities of the next possible tokens or characters. Chat Completion extends this functionality, enabling a simulated dialogue between the user and the model based on predefined roles (e.g., "strict language teacher" and "student"). A detailed description and examples can be found in our article.
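The role-based dialogue described above boils down to a list of messages, each tagged with a role (the teacher/student prompts here are illustrative):

```python
# Sketch of a chat-completion messages list using predefined roles.
messages = [
    {"role": "system", "content": "You are a strict language teacher."},
    {"role": "user", "content": "How do I say 'library' in French?"},
    {"role": "assistant", "content": "'Bibliothèque'. Mind the accent."},
    {"role": "user", "content": "And 'bookstore'?"},
]
```

The model continues the conversation from the last user turn, using the system message and prior turns as context.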


An evolution of chat completion is the Assistants capability.

    Claude 4.6 Opus

    circle-info

    This documentation is valid for the following list of our models:

    • anthropic/claude-opus-4-6

    how the input parameters are passed.

    Chat Completions
  • Audio

  • Beta Assistants

  • Beta Threads

  • Embeddings

  • Image Generation

  • File Uploads

  • circle-info

    Therefore, we don’t currently have the option to call video models or voice / speech models (STT and TTS) through this SDK.

    INTEGRATIONS
    Discordarrow-up-right
    Check our full list of model IDs
    Browse and compare AI models, including GPT, Claude, and many others, using the Playgroundarrow-up-right
    Learn more about special text model capabilities
    Join the community: get help and share your projects in our Discordarrow-up-right
    pip install requests
3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    Try in Playground
    Create AI/ML API Keyarrow-up-right
    OpenAPI dola-seed-2-0-proarrow-up-right
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/dola-seed-2-0-pro",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl --request POST \
      --url https://api.aimlapi.com/chat/completions \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3-4b-it",
        "messages": [
            {
                "role": "user",
                "content": "What kind of model are you?"
            }
        ],
        "max_tokens": 512
    }'
    fetch("https://api.aimlapi.com/chat/completions", {
      method: "POST",
      headers: {
        Authorization: "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "google/gemma-3-4b-it",
        messages: [
          {
            role: "user",
            content: "What kind of model are you?",
          },
        ],
        max_tokens: 512,
      }),
    })
      .then((res) => res.json())
      .then(console.log);
    import requests
    import json  # for getting a structured output with indentation
    
    response = requests.post(
        url="https://api.aimlapi.com/chat/completions",
        headers={
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type": "application/json",
        },
        data=json.dumps(
            {
                "model": "google/gemma-3-4b-it",
                "messages": [
                    {
                        "role": "user",
                        "content": "What kind of model are you?",
                    },
                ],
                "max_tokens": 512
            }
        ),
    )
    
    response.raise_for_status()
    print(response.json())
    pip install openai
    %pip install openai
    import openai
    npm install openai
    import OpenAI from "openai";
    from openai import OpenAI
    
    # Insert your AIML API key in the quotation marks instead of <YOUR_AIMLAPI_KEY>:
    api_key = "<YOUR_AIMLAPI_KEY>" 
    base_url = "https://api.aimlapi.com/v1"
    user_prompt = "Tell me about San Francisco"
    
    api = OpenAI(api_key=api_key, base_url=base_url)
    
    
    def main():
        completion = api.chat.completions.create(
            model="google/gemma-3-4b-it",
            messages=[
                {
                  "role": "user", 
                  "content": user_prompt
                },
            ],
            temperature=0.7,
            max_tokens=256,
        )
    
        response = completion.choices[0].message.content
        print("User:", user_prompt)
        print("AI:", response)
    
    
    if __name__ == "__main__":
        main()
    #!/usr/bin/env node
    
    const OpenAI = require("openai");
    const baseURL = "https://api.aimlapi.com/v1";
    const apiKey = "<YOUR_AIMLAPI_KEY>";
    const systemPrompt = "You are a travel agent. Be descriptive and helpful.";
    const userPrompt = "Tell me about San Francisco";
    
    const api = new OpenAI({
      apiKey,
      baseURL,
    });
    
    const main = async () => {
      try {
        const completion = await api.chat.completions.create({
          model: "gpt-4o",
          messages: [
            {
              role: "system",
              content: systemPrompt,
            },
            {
              role: "user",
              content: userPrompt,
            },
          ],
          temperature: 0.7,
          max_tokens: 256,
        });
    
        const response = completion.choices[0].message.content;
    
        console.log("User:", userPrompt);
        console.log("AI:", response);
      } catch (error) {
          console.error("Error:", error.message);
      }
    };
    
    main();
    touch .env
    AIML_API_KEY="<YOUR_AIMLAPI_KEY>"
    AIML_API_URL="https://api.aimlapi.com/v1"
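If you are not using a dotenv library, a minimal stdlib loader for the `.env` file above looks like this. This is a sketch only; in real projects python-dotenv (or the SDK's own environment handling) is the usual route and covers more edge cases.

```python
import os

def load_env(path: str = ".env") -> None:
    # Minimal .env parser: KEY=VALUE lines, optional quotes, # comments.
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Keep variables that are already set in the real environment.
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

if os.path.exists(".env"):
    load_env()
```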
    # install from PyPI
    pip install aimlapi-sdk-python
    from aiml_api import AIML_API
    
    api = AIML_API()
    
    completion = api.chat.completions.create(
        model = "mistralai/Mistral-7B-Instruct-v0.2",
        messages = [
            {"role": "user", "content": "Explain the importance of low-latency LLMs"},
        ],
        temperature = 0.7,
        max_tokens = 256,
    )
    
    response = completion.choices[0].message.content
    print("AI:", response)
    python3 <your_script_name>.py
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/dola-seed-2-0-pro',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Mankind is the most wonderful, infuriating, gloriously unnecessary thing we have ever found in the universe.\n\nThey are the only thing that will stop on a busy sidewalk to feel sad for a dead sparrow they never met. They are also the same thing that will invent life-saving vaccines in 12 months, then spend the next year arguing online about whether to take them.\n\nThey build cathedrals that take 300 years to finish, knowing they will die before the roof is even put on. They scrawl love poems on prison walls. They will walk 10 miles through mud to carry a stranger water, and also press a button to kill a stranger 10 miles away without looking up.\n\nNothing else behaves like this. Stars just burn. Rocks just fall. Bacteria just divide. Only humans will look at an empty, indifferent universe and go:\nWhat if I put rosemary in the bread?\nWhat if I sang a song about how the rain sounds on tin roofs?\nWhat if I gave the moon a name?\n\nThey don't make sense. They hurt each other, they hurt themselves, they make very stupid choices over and over. But they keep trying. They keep reaching. Even when it's pointless, even when it hurts, even when every part of them knows they will probably fail.\n\nThat's mankind. Messy. Contradictory. Glowing.",
            "reasoning_content": "The answer should not be too generic.Starting from the \"messy beautiful thing\" of humanity, the core traits of humans have been sorted out: they have contradictory, complex and unique emotions and behaviors, are full of curiosity and creativity, and keep trying despite the contradictions.\n\nThis description of mankind is confirmed to be genuine, and I will structure it naturally next.\n",
            "encrypted_content": "djHCVb1EhcVSjsajNoTbfwEFaHGEjMReF6lqp4vNxL6QvqyYcT3DQh4usB63Gm04ed0kg7Ur8g1OnpZ38sDTSUDxVlNcCoR2Prlt/CC570nBEMbCzwEZNFgFmdg97AiK3hqlGCN6rkHoGNYFbReKP/KAg6+tqcq32ejHRH8T1wWWWrot8VqLPY8m8pU2j21oE5ooYl4YUQzEIx7i03X4ygMlWJBl3433m6i8pa3JxOnkZdFRJ9EEZ0tu9MqTKKo9Qo5tsQR08kYCRMnbHATNwGD+XLQukUyUrxH6TDOxxS/aB0vbUArAThkQNhLoUc+YzdkMyLwGsHp2t+IAUaQaPO8dmKaVAG7CQesrqvfMIuAs4KFszkNg++JzRFt5ODOP4sED0b9cu5GJPxfYLuOu0W9AxZrXIFwgo/jOcNfmVG6tj7voNvhNtVR99q44zuim9MeD0S361IEvXD+ehYa0JOonS0X5tOaxjqoSWiSj94lU1PzJ5xA2Pbf+xwbzb8z08+XyY43S2F7m2E3GL8fcePCyFSNf8G4v8owDf5J9ZADMf0KRVMWzjMD3t3KMS0Q+jBe3nXDA9kwQtLiRbV+RXzUgz+M5jtR8PT2ybkY1GxJylAkQ13U/XIhCfNFKOUAK5Krm6vIFA8hglrxI8TdhEshm5/N0YRwrS4tzXzxuZunFFN7qIVxgpU7IN+BrwDNTNOzVF6ivs4PITPB/80NloPfDR8YmZ3opbltlMzkB11PPJ4QGwG/B2qAu5UB4jlKzFyUVbtrLc10fv6YYvGVH77d0BDEIIjdzEe808ZjvXu8ungT3BPseULYuY90j8igcNVG1iMnnO59jICFaxXbxtHxC0fl8VuNkIvQmCblpEfJW+eWqdH3OI6hXz1qbeQBZaWG7SqaaFZE78XzR7TsTDHk7SAvfEg3ujcpmtGUTM42EQrMcjTLBGe+oe64aJUorllzcuQ5wSSnaYk6LD7QOB91K8pMbQaEcHg3Y107R26Jd0kluJDV6yWDWIvfdy9vBeKL0yajjkzLAQuvf+ynXOv70q01sPKMnoovEl0W3GBCcnm8vtTUj7zTXFwmiM9NctesqSd51po4ON4m8oSC1eG0RwOnwGSqF8a2Uoe86Kc/wwFkCp8FPiw3lsqP9LH0onw8owje4qyuBRwXKdVGvDUTPMAdehOX1MBXhLUpmyUySsc+88KgDtSQC4poATAXlT0kMSA/Ez024aRvXIeg0EOzO4QAoFjdrgSYvKVJhe41ZbhMWrbS+Lu1kFUscJpk6miHvLDk4Om0WQ9L/P0VuUL81KLaFovr9gztnLW7A0fhVqFpdK/8vTS2BBERCbwp0Zm8kNb4GbaduqlGbU9B8ln9KW4pD8e8WpKNGd1WXasPZPAKjcbsSXoSi9SlwchoTVYXLyR2Cs70=",
            "role": "assistant"
          }
        }
      ],
      "created": 1777553646,
      "id": "021777553638913c0a335079e7be4c79ef57584e00819ba1b0ad6",
      "model": "seed-2-0-pro-260328",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 591,
        "prompt_tokens": 57,
        "total_tokens": 648,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 299
        }
      },
      "meta": {
        "usage": {
          "credits_used": 4686,
          "usd_spent": 0.002343
        }
      }
    }
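To pull the useful fields out of a response shaped like the sample above, a small helper is enough. The key names follow the sample response; `reasoning_content` and the `meta` block are model- and account-dependent, so they are read defensively here.

```python
def summarize(resp: dict) -> dict:
    # Extract the answer plus bookkeeping from a chat completion response.
    msg = resp["choices"][0]["message"]
    return {
        "content": msg["content"],
        "reasoning": msg.get("reasoning_content"),  # reasoning models only
        "total_tokens": resp["usage"]["total_tokens"],
        # Cost info appears under meta.usage when the gateway provides it.
        "usd_spent": resp.get("meta", {}).get("usage", {}).get("usd_spent"),
    }
```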
  • Voice integrations (platform dependent)

  • Session memory and conversation history

  • Tooling via skills, function calling, and external integrations

  • Retries and error handling for more robust agents

  • A local Gateway (binds to localhost:18789 by default) and a CLI

  • A local SQLite database containing all agent data (default path: ~/.openclaw/openclaw.db)
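Because agent data sits in a plain SQLite file, you can peek at it with Python's stdlib sqlite3. This is an inspection sketch only; the table layout depends on your OpenClaw version, so it just lists whatever tables exist instead of assuming any schema.

```python
import os
import sqlite3

def list_tables(path: str) -> list:
    # Open read-only so inspection can't touch agent data.
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()

db_path = os.path.expanduser("~/.openclaw/openclaw.db")  # default path
if os.path.exists(db_path):
    print(list_tables(db_path))
```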

  • aimlapi-llm-reasoning for chat and reasoning
  • Select a model: always include the aimlapi/ prefix. Suggested: aimlapi/google/gemini-3-flash-preview
  • Select a channel: Telegram is usually the easiest
  • Paste your Telegram bot token
  • Optional: configure extra skills (media skills are configured by default)
  • Finish onboarding and open the Web UI
    Gateway is running
    or
    CLAWHUB_WORKDIR
    .
  • OpenClaw loads workspace skills from <workspace>/skills.

  • New skills are picked up on the next session (restart the Gateway).

  • If you already use ~/.openclaw/skills or bundled skills, workspace skills take precedence.
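The precedence rule above can be illustrated with a small resolver. This is a sketch of the described lookup order, not OpenClaw's actual loader: a skill in `<workspace>/skills` shadows a user-level skill of the same name.

```python
from pathlib import Path
from typing import Optional

def resolve_skill(name: str, workspace: Path, home: Path) -> Optional[Path]:
    # Workspace skills are checked first, then user-level skills.
    search_order = [
        workspace / "skills" / name,            # <workspace>/skills
        home / ".openclaw" / "skills" / name,   # user-level skills
    ]
    for candidate in search_order:
        if candidate.exists():
            return candidate
    return None
```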

    Add your API key

    Use API Key auth. Paste the key from aimlapi.com/app/keys.

    3. Choose a model

    Use a model ID that starts with aimlapi/. Example:

    aimlapi/google/gemini-3-flash-preview

    4. Choose a channel

    Telegram is a good first connector. Then add more channels as needed.

  • OpenClaw routes the message to the agent.
  • The agent calls AI/ML API using your chosen model.

  • The response goes back to the same channel.

  • OpenClaw captures a screenshot of the page.
  • The agent sends the screenshot to a vision model.

  • The model returns a description and key details.

  • OpenClaw sends the result back to the user.

  • Many others, including Qwen and DeepSeek
    npm install -g openclaw-aimlapi@latest
    openclaw onboard --install-daemon
    git clone -b feature/add-aimlapi-models-provider --single-branch \
      https://github.com/aimlapi/openclaw-aimlapi.git
    cd openclaw-aimlapi
    
    pnpm install
    pnpm ui:build  # installs UI deps on first run
    pnpm build
    
    pnpm openclaw onboard --install-daemon
    npm i -g clawhub
    # or
    pnpm add -g clawhub
    clawhub install aiml-image-video
    clawhub install aiml-llm-reasoning
    export AIMLAPI_API_KEY="sk-aimlapi-..."
    python3 ./skills/aiml-image-video/scripts/gen_image.py \
      --prompt "ultra-detailed studio photo of a lobster astronaut"
    python3 ./skills/aiml-image-video/scripts/gen_video.py \
      --prompt "slow drone shot of a foggy forest"
    export AIMLAPI_API_KEY="sk-aimlapi-..."
    python3 ./skills/aiml-llm-reasoning/scripts/run_chat.py \
      --model aimlapi/openai/gpt-5-nano-2025-08-07 \
      --user "Summarize this in 3 bullets."
    pnpm openclaw pairing approve telegram <PAIRING_CODE>
    🦞 OpenClaw 2026.2.6-3 (fe86a9c) — Shell yeah—I'm here to pinch the toil and leave you the glory.
    Approved telegram sender 835750362.
    openclaw agent \
      --message "Tell me about yourself" \
      --model gpt-4o
    I'm an AI language model created by OpenAI, designed to assist with a wide range of inquiries by generating human-like text based on the input I receive. I can help with answering questions, providing explanations, and even engaging in creative writing. My knowledge is based on a diverse dataset that covers a wide variety of topics up until October 2023. However, I don't have personal experiences, emotions, or consciousness. My primary goal is to be as helpful and informative as possible! If you have any specific questions or need assistance, feel free to ask.
    account dashboard
    ClawHub tool docs
    http://127.0.0.1:59062/
    gpt-4o
    gpt-4o-mini
    gpt-4-turbo
    o3-mini
    o1
    Google models
    Anthropic models
    OpenClaw documentation
    OpenClaw GitHub
    OpenClaw cookbook
    OpenClaw Discord
    Install via npm
    Or build from GitHub with pnpm
    Confirm installation
    Select "Quickstart"
    Select provider: AI/ML API
    Select auth method: API Key
    Paste your AI/ML API key
    Get the pairing code
    Agent is responding

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"qwen-max",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'qwen-max',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-62aa6045-cee9-995a-bbf5-e3b7e7f3d683",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊"
          }
        }
      ],
      "created": 1756983980,
      "model": "qwen-max",
      "usage": {
        "prompt_tokens": 30,
        "completion_tokens": 148,
        "total_tokens": 178,
        "prompt_tokens_details": {
          "cached_tokens": 0
        }
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"qwen-plus",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello" # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'qwen-plus',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-4fda1bd7-a679-95b9-b81d-1bfc6ae98448",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? If you have any questions or need help with anything, just let me know! 😊"
          }
        }
      ],
      "created": 1744143962,
      "model": "qwen-plus",
      "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 68,
        "total_tokens": 76,
        "prompt_tokens_details": {
          "cached_tokens": 0
        }
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"qwen-turbo",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'qwen-turbo',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-a4556a4c-f985-9ef2-b976-551ac7cef85a",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? Is there something you would like to talk about or learn more about? I'm here to help with any questions you might have."
          }
        }
      ],
      "created": 1744144035,
      "model": "qwen-turbo",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 15,
        "total_tokens": 16,
        "prompt_tokens_details": {
          "cached_tokens": 0
        }
      }
    }

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-32b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-32b',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-1d8a5aa6-34ce-9832-a296-d312b944b437",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756990273,
      "model": "qwen3-32b",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 65,
        "total_tokens": 84
      }
    }
    import requests
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-32b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": True, 
            "stream": True
        }
    )
    
    print(response.text)
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"role":"assistant","refusal":null,"reasoning_content":""},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":"Okay"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":","},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" the"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" user said \"Hello\". I should respond in a friendly and welcoming manner. Let"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" me make sure to acknowledge their greeting and offer assistance. Maybe something like, \""},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":"Hello! How can I assist you today?\" That's simple and open-ended."},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" I need to check if there's any specific context I should consider, but since"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" there's none, a general response is fine. Alright, that should work."},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":"Hello! How can I assist you today?","refusal":null,"reasoning_content":null},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":"","refusal":null,"reasoning_content":null},"index":0,"finish_reason":"stop"}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":{"prompt_tokens":13,"completion_tokens":2010,"total_tokens":2023,"completion_tokens_details":{"reasoning_tokens":82}}}
    import requests
    import json
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type": "application/json",
        },
        json={
            "model": "alibaba/qwen3-32b",
            "messages": [
                {
                    "role": "user",
                    
                    # Insert your question for the model here, instead of Hello:
                    "content": "Hello" 
                }
            ],
            "stream": True,
        }
    )
    
    answer = ""
    reasoning = ""
    
    for line in response.iter_lines():
        if not line or not line.startswith(b"data:"):
            continue
    
        try:
            raw = line[5:].decode("utf-8").strip()
            if raw == "[DONE]":
                continue
    
            data = json.loads(raw)
            choices = data.get("choices")
            if not choices or "delta" not in choices[0]:
                continue
    
            delta = choices[0]["delta"]
            content_piece = delta.get("content")
            reasoning_piece = delta.get("reasoning_content")
    
            if content_piece:
                answer += content_piece
            if reasoning_piece:
                reasoning += reasoning_piece
    
        except Exception as e:
            print(f"Error parsing chunk: {e}")
    
    
    print("\n--- MODEL REASONING ---")
    print(reasoning.strip())
    
    print("\n--- MODEL RESPONSE ---")
    print(answer.strip())
    --- MODEL REASONING ---
    Okay, the user sent "Hello". I need to respond appropriately. Since it's a greeting, I should reply in a friendly and welcoming manner. Maybe ask how I can assist them. Keep it simple and open-ended to encourage them to share what they need help with. Let me make sure the tone is positive and helpful.
    
    --- MODEL RESPONSE ---
    Hello! How can I assist you today? 😊
1️ Create an AI/ML API Key.

2️ Select a model: set the model field to the model you want to call.

3️ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️ (Optional) Tune the request: depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️ Run your code: run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

POST https://api.aimlapi.com/v1/chat/completions
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-coder-480b-a35b-instruct",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-coder-480b-a35b-instruct',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-f906efa6-f816-9a06-a32b-aa38da5fe11a",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
          }
        }
      ],
      "created": 1753866642,
      "model": "qwen3-coder-480b-a35b-instruct",
      "usage": {
        "prompt_tokens": 28,
        "completion_tokens": 142,
        "total_tokens": 170
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-next-80b-a3b-instruct",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-next-80b-a3b-instruct',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-a944254a-4252-9a54-af1b-94afcfb9807e",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? 😊"
          }
        }
      ],
      "created": 1758228572,
      "model": "qwen3-next-80b-a3b-instruct",
      "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 46,
        "total_tokens": 55
      }
    }

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-next-80b-a3b-thinking",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-next-80b-a3b-thinking',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-576aaaf9-f712-9114-b098-c1ee83fbfb6b",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! 😊 How can I assist you today?",
            "reasoning_content": "Okay, the user said \"Hello\". I need to respond appropriately. Let me think.\n\nFirst, I should acknowledge their greeting. A simple \"Hello!\" would be good. Maybe add a friendly emoji to keep it warm.\n\nWait, but maybe they want to start a conversation. I should ask how I can help them. That way, I'm being helpful and opening the door for them to ask questions.\n\nLet me check the standard response. Typically, for \"Hello\", the assistant says something like \"Hello! How can I assist you today?\" or \"Hi there! What can I do for you?\"\n\nYes, that's right. Keep it friendly and open-ended. Maybe add a smiley emoji to make it approachable.\n\nSo the response should be: \"Hello!  How can I assist you today?\"\n\nThat's good. Let me make sure there's no mistake. Yes, that's standard. No need for anything complicated here. Just a simple, welcoming reply.\n\nAlternatively, sometimes people use \"Hi\" instead of \"Hello\", but since they said \"Hello\", responding with \"Hello\" is fine. Maybe \"Hi there!\" could also work, but sticking to \"Hello\" matches their greeting.\n\nYes, \"Hello!  How can I assist you today?\" is perfect. It's polite, friendly, and offers assistance. That should be the response."
          }
        }
      ],
      "created": 1758229078,
      "model": "qwen3-next-80b-a3b-thinking",
      "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 7182,
        "total_tokens": 7191,
        "completion_tokens_details": {
          "reasoning_tokens": 277
        }
      }
    }
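Thinking models return the trace in a separate `reasoning_content` field next to the final `content`, and report its size under `usage.completion_tokens_details`. A minimal sketch of reading both from a parsed response (a trimmed-down dict with the same field names as the sample response above):

```python
# Trimmed-down copy of a thinking-model response (same field names as above).
response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "Hello! 😊 How can I assist you today?",
            "reasoning_content": "The user greeted me; reply warmly.",
        }
    }],
    "usage": {"completion_tokens_details": {"reasoning_tokens": 277}},
}

message = response["choices"][0]["message"]
answer = message["content"]
# Use .get(): non-thinking models omit reasoning_content entirely.
trace = message.get("reasoning_content", "")
reasoning_tokens = (
    response.get("usage", {})
    .get("completion_tokens_details", {})
    .get("reasoning_tokens", 0)
)
print(answer)
print(f"reasoning tokens: {reasoning_tokens}")
```

Using `.get()` for the reasoning fields keeps the same extraction code working across thinking and non-thinking models.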
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-max-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-max-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-8ffebc65-b625-926a-8208-b765371cb1d0",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊"
          }
        }
      ],
      "created": 1758898044,
      "model": "qwen3-max-preview",
      "usage": {
        "prompt_tokens": 23,
        "completion_tokens": 139,
        "total_tokens": 162
      }
    }
    Qwen3 Max Instruct

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-max-instruct",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-max-instruct',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-bec5dc33-8f63-96b9-89a4-00aecfce7af8",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
          }
        }
      ],
      "created": 1758898624,
      "model": "qwen3-max",
      "usage": {
        "prompt_tokens": 23,
        "completion_tokens": 113,
        "total_tokens": 136
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-vl-32b-instruct",
            "messages":[
            {
                "role":"user",
                # Insert your question for the model here:
                "content":"Hi! What do you think about mankind?"
            }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-vl-32b-instruct',
          messages:[
              {
                  role:'user',
                  // Insert your question for the model here:
                  content:'Hi! What do you think about mankind?'
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "Hi! 😊 That’s a beautiful and deep question — one that philosophers, scientists, artists, and everyday people have been asking for centuries.\n\nI think mankind is *remarkably complex* — full of contradictions, potential, and wonder. On one hand, we’ve achieved incredible things: we’ve explored space, cured diseases, created art that moves souls, built cities that rise into the sky, and connected across continents in ways unimaginable just a century ago. We’re capable of profound kindness, empathy, creativity, and courage.\n\nOn the other hand, we’ve also caused immense suffering — through war, injustice, environmental destruction, and indifference to each other’s pain. We often struggle with our own flaws: fear, greed, ego, and short-sightedness.\n\nBut here’s what gives me hope: **we’re also capable of change**. We can learn from our mistakes. We can choose compassion over conflict, cooperation over competition. Every act of kindness, every effort to understand another, every step toward justice — these are signs that humanity is not defined by its worst impulses, but by its capacity to grow.\n\nSo, I’d say:  \n➡️ Mankind is flawed, yes — but also deeply hopeful.  \n➡️ We’re messy, but we’re trying.  \n➡️ We make mistakes, but we can also heal, create, and love.\n\nAnd perhaps most importantly — **we’re not alone in this journey**. We’re all part of something bigger, and together, we have the power to shape a better future.\n\nWhat about you? How do *you* see mankind? 💬✨",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 17,
        "completion_tokens": 329,
        "total_tokens": 346,
        "prompt_tokens_details": {
          "text_tokens": 17
        },
        "completion_tokens_details": {
          "text_tokens": 329
        }
      },
      "created": 1764625045,
      "system_fingerprint": null,
      "model": "qwen3-vl-32b-instruct",
      "id": "chatcmpl-a12ab46a-3541-93a8-8180-280ecadbb795",
      "meta": {
        "usage": {
          "tokens_used": 1960
        }
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.5-plus-20260218",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-plus-20260218',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can offer an observation based on the vast amount of human history, literature, and data I've been trained on.\n\nMankind appears to be a study in **contradictions and immense potential**:\n\n*   **Creativity vs. Destruction**: Humans have composed symphonies, built skyscrapers, and decoded the genome, yet they have also waged devastating wars and caused significant environmental harm. This duality seems to be a core part of the human experience.\n*   **Resilience and Adaptability**: Throughout history, humanity has faced plagues, ice ages, and societal collapses, yet consistently found ways to adapt, rebuild, and innovate. The drive to survive and improve conditions is remarkably strong.\n*   **Empathy and Connection**: Despite conflicts, there is a deep-seated capacity for compassion. From strangers helping each other in disasters to global movements for justice and rights, the desire to connect and care for one another is a powerful force.\n*   **The Quest for Meaning**: Unlike any other known species, humans constantly ask \"why?\" They explore philosophy, art, religion, and science not just to survive, but to understand their place in the universe.\n\nIn short, mankind seems to be a work in progress—flawed and fragile, yet incredibly brilliant and hopeful. Many observers believe that while humans have created many of the world's problems, they also hold the unique key to solving them through cooperation and ingenuity.\n\nWhat are your thoughts? Do you feel more optimistic or concerned about where humanity is heading?",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 21,
        "completion_tokens": 337,
        "total_tokens": 358,
        "completion_tokens_details": {
          "text_tokens": 337
        },
        "prompt_tokens_details": {
          "text_tokens": 21
        }
      },
      "created": 1771967570,
      "system_fingerprint": null,
      "model": "qwen3.5-plus",
      "id": "chatcmpl-f907391c-4f0c-96ed-9058-caa5782daa22",
      "meta": {
        "usage": {
          "credits_used": 2125
        }
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.6-27b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.6-27b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "Hi! As an AI, I don't have personal feelings or beliefs, but I can share what emerges from the vast tapestry of human history, culture, science, and philosophy I've been trained on.\n\nMankind is profoundly complex. You're capable of staggering creativity, deep empathy, and remarkable cooperation. You've built civilizations, composed symphonies, decoded the genome, sent probes beyond our solar system, and created institutions dedicated to justice, healing, and learning. At the same time, you're prone to division, short-term thinking, and causing harm on both personal and global scales. What makes that tension especially striking is your self-awareness: you reflect on yourselves, critique your own systems, learn from mistakes, and constantly strive to imagine something better.\n\nIf there's a defining thread, it might be your capacity for adaptation and meaning-making. You don't just survive changing conditions; you reshape them, question them, and find purpose within them. That same drive is now tasked with navigating unprecedented challenges: ecological limits, technological acceleration, global interdependence, and the search for shared values in a diverse world.\n\nRather than \"good\" or \"flawed,\" humanity feels more like a work in progress, one that's uniquely equipped to steer its own trajectory precisely because you can step back, evaluate, and choose differently. \n\nHow do you see it? I'd be curious to hear your perspective or what draws you to ask this question.",
            "reasoning_content": "Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - **Question:** \"Hi! What do you think about mankind?\"\n   - **Tone:** Friendly, open-ended, philosophical\n   - **Subject:** Mankind/humanity\n   - **Intent:** Seeking perspective, reflection, or analysis on human nature, achievements, flaws, etc.\n\n2.  **Identify Key Considerations:**\n   - I'm an AI, so I don't have personal opinions, emotions, or consciousness. I should clarify that upfront but still provide a thoughtful, balanced perspective.\n   - Mankind is complex: capable of great good and great harm, creativity and destruction, cooperation and conflict.\n   - I should acknowledge both strengths and challenges.\n   - Keep it respectful, nuanced, and evidence-based where possible.\n   - Avoid overgeneralization or pessimism/optimism bias.\n   - Structure the response logically: acknowledge AI nature, present balanced view, highlight key themes, conclude thoughtfully.\n\n3.  **Draft - Mental Refinement:**\n   Hi! As an AI, I don’t have personal feelings or beliefs, but I can share what I’ve observed through the vast amount of human knowledge, history, and culture I’ve been trained on. Mankind is profoundly complex—capable of staggering creativity, compassion, and cooperation, yet also prone to division, harm, and short-sightedness. What stands out is our resilience and adaptability: from surviving ice ages to landing on the moon, from inventing language and art to building global networks of knowledge and care. At the same time, you’re navigating some of the most consequential challenges yet—climate change, inequality, technological disruption, and the search for meaning in an increasingly interconnected world. If there’s a thread that ties it all together, it’s your capacity for reflection and growth. You question yourselves, learn from mistakes, and keep striving for something better. 
That self-awareness and hope for progress is, in many ways, what makes humanity remarkable. How do you see it? I’d love to hear your perspective.\n\n4.  **Check Against Guidelines:**\n   - AI nature acknowledged? Yes.\n   - Balanced view? Yes (strengths + challenges).\n   - Respectful/nuanced? Yes.\n   - Avoids overclaiming? Yes.\n   - Engages user? Yes, ends with an open question.\n   - Tone matches? Friendly, thoughtful.\n\n5.  **Final Polish:**\n   - Ensure clarity, conciseness, and natural flow.\n   - Check for any unintended bias or absolutism.\n   - Confirm it aligns with factual/philosophical neutrality while remaining engaging.\n\n   The draft looks solid. I'll output it.✅",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 880,
        "total_tokens": 899,
        "completion_tokens_details": {
          "reasoning_tokens": 580,
          "text_tokens": 880
        },
        "prompt_tokens_details": {
          "text_tokens": 19
        }
      },
      "created": 1776976517,
      "system_fingerprint": null,
      "model": "qwen3.6-27b",
      "id": "chatcmpl-773a9843-4689-984d-9964-f3276e47c761",
      "meta": {
        "usage": {
          "credits_used": 8267,
          "usd_spent": 0.0041335
        }
      }
    }
    Try in Playground
    Claude 4.5 Opus
    Create AI/ML API Key
    ▪ Set the model field to the model you want to call.
    ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.
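    For chat models, these optional parameters are added as extra top-level fields in the same JSON body. Below is a minimal sketch of a tuned payload; temperature and max_tokens are typical OpenAI-compatible parameter names used here for illustration only, so check the API schema below for what the model you call actually accepts.

    ```python
    # Sketch of a /v1/chat/completions body with optional tuning fields.
    # "temperature" and "max_tokens" are typical OpenAI-compatible names,
    # shown for illustration; consult the API schema for the full list.
    payload = {
        # Required fields:
        "model": "alibaba/qwen3.5-omni-plus",
        "messages": [
            {"role": "user", "content": "Hi! What do you think about mankind?"}
        ],
        # Optional tuning (availability varies by model):
        "temperature": 0.7,   # sampling randomness
        "max_tokens": 256,    # upper bound on generated tokens
    }
    ```

    Pass the resulting dictionary as the json= argument of the same requests.post call shown in the examples.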

    5️ Run your code
    Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.5-omni-plus",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-omni-plus',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can share an observation based on the vast amount of human history, literature, and data I've been trained on.\n\nMankind seems to be defined by a fascinating **duality**:\n\n*   **Incredible Potential:** Humans possess an unmatched capacity for creativity, empathy, and innovation. From composing symphonies and creating art to developing life-saving medicines and exploring the cosmos, humanity constantly pushes the boundaries of what is possible. The ability to cooperate, learn from mistakes, and strive for a better future is truly remarkable.\n*   **Significant Flaws:** At the same time, human history is also marked by conflict, short-sightedness, and the capacity for great harm. Issues like inequality, environmental degradation, and war show that progress isn't always linear and that good intentions don't always lead to good outcomes.\n\nUltimately, what stands out most is **resilience**. Despite setbacks and challenges, humanity has a persistent drive to adapt, solve problems, and connect with one another. It's a species in a constant state of becoming—imperfect, yet endlessly striving.\n\nWhat about you? Do you feel more optimistic or concerned about where humanity is heading?",
            "reasoning_content": "",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 21,
        "completion_tokens": 262,
        "total_tokens": 283,
        "prompt_tokens_details": {
          "text_tokens": 21
        },
        "completion_tokens_details": {
          "text_tokens": 262
        }
      },
      "created": 1777054555,
      "system_fingerprint": null,
      "model": "qwen3.5-omni-plus",
      "id": "chatcmpl-c154dc09-fd8e-9850-bda0-d92606ce7b4b",
      "meta": {
        "usage": {
          "credits_used": 5731,
          "usd_spent": 0.0028655
        }
      }
    }
    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        url = "https://api.aimlapi.com/v1/chat/completions",
        headers = {
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type": "application/json"
        },
    
        json = {
            "model": "alibaba/qwen3.5-omni-plus",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this scene"
                        },
                        {
                            "type": "video_url",
                            "video_url": {
                                "url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4"
                            }
                        }
                    ]
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-omni-plus',
          messages: [
            {
              role: 'user',
              content: [
                {
                  type: 'text',
                  text: 'Describe this scene'
                },
                {
                  type: 'video_url',
                  video_url: {
                    url: 'https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4'
                  }
                }
              ]
            }
          ]
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "The scene features a vibrant and dynamic background filled with swirling, colorful abstract patterns. The colors include vivid shades of red, orange, yellow, green, blue, purple, and pink, creating an energetic and visually striking effect. Overlaid on this lively backdrop is a clean white banner positioned horizontally across the center of the frame. \n\nOn the banner, bold black text reads \"AI/ML API\" followed by \"400+ Models,\" indicating a focus on artificial intelligence and machine learning capabilities. Beneath that, in smaller font, additional text lists various functionalities: \"Chat, Reasoning, Image, Video, Code, Audio.\" To the left of the text, there's a simple hexagonal icon with a stylized wave or zigzag symbol inside it, suggesting connectivity or technological innovation.\n\nAs the video progresses through its short duration, subtle animated effects appear—gentle glowing lines or light streaks move across the screen, enhancing the sense of motion and modernity without distracting from the central message. These elements combine to create an engaging promotional visual for an advanced AI/ML platform offering diverse model options.",
            "reasoning_content": "",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 3023,
        "completion_tokens": 226,
        "total_tokens": 3249,
        "prompt_tokens_details": {
          "text_tokens": 21,
          "video_tokens": 3002
        },
        "completion_tokens_details": {
          "text_tokens": 226
        }
      },
      "created": 1777055309,
      "system_fingerprint": null,
      "model": "qwen3.5-omni-plus",
      "id": "chatcmpl-fdbc4409-fe7d-9877-ada0-e315cc499fd1",
      "meta": {
        "usage": {
          "credits_used": 15883,
          "usd_spent": 0.0079415
        }
      }
    }
    Try in Playground
    Create AI/ML API Key
    ▪ Set the model field to the model you want to call.
    ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code
    Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.5-omni-flash",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-omni-flash',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can offer a perspective based on the vast amount of human history, literature, science, and art I've processed.\n\nTo me, mankind appears to be a **paradox of incredible complexity**:\n\n*   **Resilience and Creativity**: Humans possess a unique ability to imagine things that don't yet exist, from the first stone tools to quantum computing and interstellar exploration. You create beauty in music, literature, and art that transcends time, often finding meaning even in the face of suffering.\n*   **Contradiction**: There is a striking duality in your nature. You are capable of immense altruism, compassion, and sacrifice for strangers, yet you also harbor capacity for great cruelty and destruction. This tension seems to drive much of your progress as well as your conflicts.\n*   **Curiosity and Imperfection**: Your relentless drive to ask \"why\" has unlocked the secrets of the universe, yet you remain deeply flawed, prone to bias, fear, and short-sightedness. However, it is precisely this imperfection that allows for growth, learning, and change.\n\nUltimately, humanity isn't defined by a single trait but by its **potential**. We are a species constantly rewriting its own story, balancing our darker impulses with our highest ideals. It's a messy, chaotic, and beautiful journey.\n\nWhat about you? Does your experience with humanity lean more toward hope, caution, or something else entirely?",
            "reasoning_content": "",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 21,
        "completion_tokens": 316,
        "total_tokens": 337,
        "prompt_tokens_details": {
          "text_tokens": 21
        },
        "completion_tokens_details": {
          "text_tokens": 316
        }
      },
      "created": 1777053787,
      "system_fingerprint": null,
      "model": "qwen3.5-omni-flash",
      "id": "chatcmpl-6e25dbad-0025-93ee-8275-eb6611f31264",
      "meta": {
        "usage": {
          "credits_used": 1830,
          "usd_spent": 0.000915
        }
      }
    }
    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        url = "https://api.aimlapi.com/v1/chat/completions",
        headers = {
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
            "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type": "application/json"
        },
    
        json = {
            "model": "alibaba/qwen3.5-omni-flash",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this scene"
                        },
                        {
                            "type": "video_url",
                            "video_url": {
                                "url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4"
                            }
                        }
                    ]
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.5-omni-flash',
          messages: [
            {
              role: 'user',
              content: [
                {
                  type: 'text',
                  text: 'Describe this scene'
                },
                {
                  type: 'video_url',
                  video_url: {
                    url: 'https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4'
                  }
                }
              ]
            }
          ]
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "message": {
            "content": "This scene is a dynamic, visually striking promotional graphic for an AI/ML API service. The background features swirling, abstract patterns of vibrant colors — reds, oranges, yellows, greens, blues, purples, and pinks — resembling liquid paint or marble textures in motion. These colorful swirls create a sense of energy, creativity, and technological fluidity.\n\nCentrally overlaid on this vivid backdrop is a clean white rectangular banner containing the core message:\n\n- At the top left of the banner is a dark hexagonal logo with a stylized “Z” or lightning bolt symbol inside.\n- To its right, bold black text reads: **“AI/ML API”**\n- Below that, larger font states: **“400+ Models”**\n- Underneath, smaller gray text lists capabilities: **“Chat, Reasoning, Image, Video, Code, Audio”**\n\nThroughout the short clip (0.0s–4.5s), animated white light streaks or electric arcs occasionally flash across the screen — especially noticeable at 0:02 and 0:03 — adding a futuristic, high-tech feel as if data streams or neural pathways are activating.\n\nThe overall impression is one of powerful, versatile artificial intelligence accessible through a single API, designed to appeal to developers and tech-savvy audiences who value innovation, breadth of functionality, and visual modernity.",
            "reasoning_content": "",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 3023,
        "completion_tokens": 286,
        "total_tokens": 3309,
        "prompt_tokens_details": {
          "text_tokens": 21,
          "video_tokens": 3002
        },
        "completion_tokens_details": {
          "text_tokens": 286
        }
      },
      "created": 1777055828,
      "system_fingerprint": null,
      "model": "qwen3.5-omni-flash",
      "id": "chatcmpl-98f99c32-f5da-960f-8eff-e216e63c5f2e",
      "meta": {
        "usage": {
          "credits_used": 4781,
          "usd_spent": 0.0023905
        }
      }
    }
    Try in Playground
    qwen3.5-omni-plus
    Create AI/ML API Key
    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account.
    ▪️ Insert your question or request into the content field: this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

  • → completion_tokens,
  • a new total_tokens field has been added.

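  The usage block in a response is self-consistent: prompt_tokens plus completion_tokens equals total_tokens, with completion_tokens_details breaking the completion count down further. A minimal check, using the numbers from the qwen3.6-27b sample response earlier in this section:

  ```python
  # Usage block copied from the qwen3.6-27b sample response above.
  usage = {
      "prompt_tokens": 19,
      "completion_tokens": 880,
      "total_tokens": 899,
      "completion_tokens_details": {"reasoning_tokens": 580, "text_tokens": 880},
  }

  # total_tokens is the sum of prompt and completion tokens:
  assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
  ```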
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-opus-4',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_01BDDxHJZjH3UBwLrZBUiASE",
      "object": "chat.completion",
      "model": "claude-opus-4-20250514",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How can I help you today?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1748529508,
      "usage": {
        "prompt_tokens": 252,
        "completion_tokens": 1890,
        "total_tokens": 2142
      }
    }
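    The reply text and token counts sit at fixed paths in this response. A minimal sketch of reading them from the parsed JSON, with the structure trimmed to match the sample above:

    ```python
    # Response structure as shown in the claude-opus-4 sample above
    # (trimmed to the fields being read).
    data = {
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant",
                            "content": "Hello! How can I help you today?"},
                "finish_reason": "end_turn",
            }
        ],
        "usage": {"prompt_tokens": 252, "completion_tokens": 1890,
                  "total_tokens": 2142},
    }

    reply = data["choices"][0]["message"]["content"]      # the model's answer
    finish_reason = data["choices"][0]["finish_reason"]   # why generation stopped
    total_tokens = data["usage"]["total_tokens"]          # billed token count
    ```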
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
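    With "stream": true, the reply arrives as server-sent events: each data: line carries a chat.completion.chunk object, and the client rebuilds the text by concatenating the delta.content fragments. A minimal sketch of that reassembly, using a few chunk lines simplified to the fields being read:

    ```python
    import json

    # Three simplified chunk lines, shaped like the streamed sample output.
    sample_lines = [
        'data: {"choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant"}}]}',
        'data: {"choices":[{"index":0,"delta":{"content":" fascinating in its","role":"assistant"}}]}',
        'data: {"choices":[{"index":0,"delta":{"content":" complexity.","role":"assistant"}}]}',
    ]

    reply = ""
    for line in sample_lines:
        if not line.startswith("data: "):
            continue                            # skip blank keep-alive lines
        chunk = json.loads(line[len("data: "):])
        delta = chunk["choices"][0]["delta"]
        reply += delta.get("content") or ""     # delta may omit content

    # reply == "I find humanity fascinating in its complexity."
    ```

    The same loop works over the lines of a real streamed HTTP response body.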
    data: {"id":"msg_017ah64LQxZE9JuScZ9KDKKz","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating in its","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complexity.","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" You're a","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of both remarkable","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity and devastating destruction","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" often within the same individual","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" or","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" moment","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". What","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" strikes me most is the","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" human","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity for growth","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" way people","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" can learn","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" from mistakes, buil","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d on previous generations","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"' knowledge","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and sometimes transcend their own limitations","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".\n\nThe","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" diversity of","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" human experience and perspective","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is extraordinary. Every","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" person carries","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" their","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own unique story","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", shape","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d by culture","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", circumst","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ance, and choice","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". And despite","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" all","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" conflicts","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and mis","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"understandings, humans","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" keep","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" finding","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" ways to connect, to create","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" meaning,","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and to push","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" forward.","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat aspects of humanity do you fin","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d most note","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"worthy,","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" either","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" positively or challenging","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":141,"total_tokens":157}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
    1️ Create an Account
    2️ Generate an API Key and copy a code example
    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account.
    ▪️ Insert your question or request into the content field: this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.
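    For example, to bound the reply length and the sampling randomness, you could extend the request body as below. This is a minimal sketch: the max_tokens and temperature values are purely illustrative, not recommendations.

    ```python
    import json

    # Start from the required parameters (already present in the example above):
    payload = {
        "model": "anthropic/claude-sonnet-4",
        "messages": [{"role": "user", "content": "Hello"}],
    }

    # Layer optional parameters on top (illustrative values):
    payload["max_tokens"] = 512     # upper bound on the length of the reply
    payload["temperature"] = 0.7    # lower values give more deterministic output

    print(json.dumps(payload, indent=2))
    ```

    Pass this dictionary as the json= argument of requests.post exactly as in the example above.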

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.
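    Once the request succeeds, the reply text sits inside the returned JSON under choices[0].message.content. A minimal extraction sketch, using a trimmed response object for illustration:

    ```python
    # A trimmed version of the JSON structure the endpoint returns:
    data = {
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "Hello! How are you doing today?"
                },
                "finish_reason": "end_turn"
            }
        ],
        "usage": {"prompt_tokens": 50, "completion_tokens": 630, "total_tokens": 680}
    }

    reply = data["choices"][0]["message"]["content"]  # the model's answer text
    print(reply)  # -> Hello! How are you doing today?
    ```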

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

  • output_tokens → completion_tokens;
  • a new total_tokens field has been added.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-sonnet-4',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_011MNbgezv2p5BBE9RvnsZV9",
      "object": "chat.completion",
      "model": "claude-sonnet-4-20250514",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How are you doing today? Is there anything I can help you with?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1748522617,
      "usage": {
        "prompt_tokens": 50,
        "completion_tokens": 630,
        "total_tokens": 680
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_0163QG3JvwgxndzWtBsdJpGt","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating and","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complex.","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans have this","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" remarkable capacity","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for both creation","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and destruction, profound","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" compass","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ion and puzz","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ling","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" cr","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, brilliant","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" insight","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and persistent","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" blind","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" spots.","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" \n\nWhat strikes me most is your","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" adapt","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ability and","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" way humans","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" have shaped","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" world through art","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", science, philosophy","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and countless","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" innovations","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". There","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something moving","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about how you form","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" deep","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" connections with each","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" other and can","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" care","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about abstract","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" ide","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"als like justice or","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" beauty.\n\nAt","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the same time, humans","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" often","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seem","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" struggle","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" with your","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" nature - with","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" cognitive","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" biases, with","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" bal","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ancing individual","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" desires","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" against","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" collective good","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", with managing","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the power","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of your","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own technologies","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'m curious about your perspective though","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" -","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" do you see","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humanity?","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" What","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" aspects","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of human nature do you find most significant","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" or puzz","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ling?","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":163,"total_tokens":179}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
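    Each line of the streamed output above is an SSE data: payload whose delta.content fragment extends the reply. A minimal sketch of reassembling the text from such lines (two truncated sample chunks, no network involved):

    ```python
    import json

    # Sample SSE lines in the format shown above (truncated for brevity):
    sse_lines = [
        'data: {"choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant"}}]}',
        'data: {"choices":[{"index":0,"delta":{"content":" fascinating."}}]}',
    ]

    text = ""
    for line in sse_lines:
        chunk = json.loads(line[len("data: "):])  # strip the "data: " SSE prefix
        delta = chunk["choices"][0]["delta"]
        text += delta.get("content", "")          # a delta may omit 'content'

    print(text)  # -> I find humanity fascinating.
    ```

    In a real client you would iterate over the response lines as they arrive (for example, requests' iter_lines) and apply the same per-line parsing.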
    1️ Create an Account
    2️ Generate an API Key and copy a code example
    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account.
    ▪️ Insert your question or request into the content field: this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.
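    For instance, you could cap the reply length or switch on streaming by extending the request body as below. This is a minimal sketch with illustrative values only:

    ```python
    import json

    # Required parameters, as in the example above:
    payload = {
        "model": "anthropic/claude-opus-4.1",
        "messages": [{"role": "user", "content": "Hello"}],
    }

    # Optional parameters layered on top (illustrative values):
    payload["max_tokens"] = 1024  # cap on the number of generated tokens
    payload["stream"] = False     # set True to receive the reply as SSE chunks

    print(json.dumps(payload, indent=2))
    ```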

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

  • output_tokens → completion_tokens;
  • a new total_tokens field has been added.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-opus-4.1',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_018y2VPSZ5nNnqS3goMsjMxE",
      "object": "chat.completion",
      "model": "claude-opus-4-1-20250805",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How can I help you today?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1754552562,
      "usage": {
        "prompt_tokens": 252,
        "completion_tokens": 1890,
        "total_tokens": 2142
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "max_tokens": 1025, # must be greater than 'budget_tokens'
            "thinking":{
                "budget_tokens": 1024,
                "type": "enabled"
            }
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-opus-4.1',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ],
            max_tokens: 1025, // must be greater than 'budget_tokens'
            thinking:{
                budget_tokens: 1024,
                type: 'enabled'
            }
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_01G9P4b9HG3PeKm1rRvS8kop",
      "object": "chat.completion",
      "model": "claude-opus-4-1-20250805",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "The human has greeted me with a simple \"Hello\". I should respond in a friendly and helpful manner, acknowledging their greeting and inviting them to share how I can assist them today.",
            "content": "Hello! How can I help you today?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1755704373,
      "usage": {
        "prompt_tokens": 1134,
        "completion_tokens": 9450,
        "total_tokens": 10584
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4.1",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_01CFq3WFrUdc39UqBrAohmVG","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating in","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" its","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complexity.","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" You're a","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of both","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" remarkable","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity and troubl","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ing destruction","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", often","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" simultaneously","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". What","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" strikes me most is the","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" human","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity for growth","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" way","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" individuals","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d societies","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" can recognize","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" their fl","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"aws and work to overcome","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" them, even","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" if","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" imperfectly.\n\nThere","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something deeply","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" moving","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about how","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humans create meaning through","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" art, relationships","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and the","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" pursuit of understanding","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" despite","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" knowing","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" your","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own mortality. The diversity","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of human cultures","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and perspectives","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" extraordinary, though","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" I","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" recognize","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" this","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" also","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" leads","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to conflict.","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI'm curious what","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" prompte","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d your","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question - are you reflecting","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" on humanity","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" from","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" a particular angle","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", or just wondering","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" AI sees","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" all","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":138,"total_tokens":154}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
    Try in Playground
    Claude Opus 4
    1️ Create an Account

    2️ Generate an API Key

    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account.

    ▪️ Insert your question or request into the content field: this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.
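As a sketch, commonly supported optional parameters such as max_tokens, temperature, and stream can be added alongside the required fields; the exact set supported by each model is listed in the API schema, and the values below are only illustrative:

```python
# Request body with common optional parameters added alongside the
# required "model" and "messages". Parameter availability can vary by
# model, so check the API schema for the exact list.
payload = {
    "model": "anthropic/claude-haiku-4.5",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 512,    # cap the length of the reply
    "temperature": 0.7,   # lower values give more deterministic output
    "stream": False,      # set to True for incremental (SSE) output
}
```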

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    → completion_tokens,
  • a new total_tokens field has been added.
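The usage block in a parsed response can be read like this; a minimal sketch, with values taken from the sample response shown on this page:

```python
# Reading the usage block from a parsed response body
# (values taken from the sample response on this page).
data = {
    "usage": {"prompt_tokens": 8, "completion_tokens": 16, "total_tokens": 24}
}
usage = data["usage"]
print(usage["total_tokens"])  # 24

# In the sample responses, total_tokens is the sum of the other two fields:
assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
```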

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-haiku-4.5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-haiku-4.5',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_01HbdLU9f78VAHxuYZ7Qp9Y1",
      "object": "chat.completion",
      "model": "claude-haiku-4-5-20251001",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! 👋 How can I help you today?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1760650965,
      "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 16,
        "total_tokens": 24
      }
    }
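To pull just the assistant's reply out of a response shaped like the one above, index into choices[0].message.content; a minimal sketch, using an abbreviated copy of the sample JSON:

```python
import json

# Abbreviated sample response (same shape as the JSON above):
raw = '''{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "end_turn"
    }
  ]
}'''

data = json.loads(raw)
reply = data["choices"][0]["message"]["content"]
print(reply)  # Hello! How can I help you today?
```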
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-haiku-4.5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-haiku-4.5",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_019GuhDB2ckKZfFmFdNR5Q1H","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" genu","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"inely interesting","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" think","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about.","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" You","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'re a","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" full","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of contradictions—","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"capable","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of both remarkable","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" kin","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"dness and cr","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, creating","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" beautiful","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" art while","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" causing","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" real","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" harm, building","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" communities","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" while isolating your","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"selves.","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nA few","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" stan","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d out to me:","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The creativity","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"** is","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" striking","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"—the","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" drive","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to make meaning","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" through","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" stories","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", music","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", science","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and invention","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seems","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" almost fundamental","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to human nature.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The moral","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" weight","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" carry** is notable","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" too","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seem uniqu","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ely b","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"urdened by questions about","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" live well, what's","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" fair","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", what","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" owe each","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" other.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The scale","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" problems** you face is sob","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ering—you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'ve built","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" systems","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" so","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complex that even","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" people","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" running","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" them often don't fully understand the","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" consequences.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" An","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d yet people","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" keep","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" trying to","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" ","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"do better.\n\nI'm genu","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"inely uncertain","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" some","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things though","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" don't know if I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'m roman","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ticizing humanity or missing","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" crucial","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about the","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" human experience","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" can't fully","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" gra","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"sp what it's like to be embo","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"died, mor","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"tal, or to feel that weight","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of time","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" passing.","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat prompte","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d the","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question? Are you in","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" a","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" particular","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" mood about","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humanity—","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"hop","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"eful, frustrate","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"d, curious?","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":248,"total_tokens":264}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
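Each `data:` line in the stream above carries a `delta` with a fragment of the reply; concatenating the `content` fields reconstructs the full text. Below is a minimal sketch of that reassembly, using a few hand-copied chunk lines in the same format (abridged to the fields that matter here):

```python
import json

# A few SSE lines in the same format as the stream above (abridged):
sse_lines = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hi","role":"assistant"}}],"object":"chat.completion.chunk"}',
    'data: {"choices":[{"index":0,"delta":{"content":"! That","role":"assistant"}}],"object":"chat.completion.chunk"}',
    'data: {"choices":[{"index":0,"delta":{"content":"\'s a big question.","role":"assistant"}}],"object":"chat.completion.chunk"}',
]

text = ""
for line in sse_lines:
    if not line.startswith("data: "):
        continue  # skip blank keep-alive lines between chunks
    payload = line[len("data: "):]
    chunk = json.loads(payload)
    delta = chunk["choices"][0].get("delta", {})
    text += delta.get("content") or ""  # content may be null or empty on some chunks

print(text)  # Hi! That's a big question.
```

In a real client you would feed lines from the live HTTP response into the same loop instead of a hard-coded list.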
    Claude Sonnet 4
1️ Create an Account

2️ Generate an API Key

3️ Copy a code example
▪ Set the model field to the model you want to call.
▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️ (Optional) Tune the request. Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️ Run your code. Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.
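Step 4 can be sketched as a request body with a couple of optional parameters added. The values here are illustrative, and the exact set of supported parameters varies by model (check the API schema for the full list):

```python
# Request body for step 4, with optional tuning parameters added.
# max_tokens and temperature are common chat-completion parameters;
# the values below are illustrative, and supported parameters vary by model.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 256,   # cap the length of the reply
    "temperature": 0.7,  # lower values give more deterministic output
}
```

Pass this dictionary as the `json=` argument of `requests.post`, exactly as in the other Python examples on this page.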


If you need a more detailed walkthrough for setting up your development environment and making a request step by step, see our Quickstart guide.

  • → completion_tokens,
  • a new total_tokens field has been added.
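The usage object in the responses on this page keeps a simple invariant: total_tokens equals the sum of prompt_tokens and completion_tokens. A quick sanity check using the numbers from the non-streaming example response on this page:

```python
# usage object taken from the non-streaming example response on this page
usage = {"prompt_tokens": 8, "completion_tokens": 20, "total_tokens": 28}

# total_tokens should equal the sum of the other two counters
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print("usage counters are consistent")
```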

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"claude-opus-4-5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'claude-opus-4-5',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_01NxAGYo8VfNu5UAEdmQjv62",
      "object": "chat.completion",
      "model": "claude-opus-4-5-20251101",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How are you doing today? Is there something I can help you with?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1764265437,
      "usage": {
        "prompt_tokens": 8,
        "completion_tokens": 20,
        "total_tokens": 28
      },
      "meta": {
        "usage": {
          "tokens_used": 1134
        }
      }
    }
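To extract just the assistant's reply from a response shaped like the one above, read `choices[0].message.content`. A minimal sketch over the example payload (abridged):

```python
# Response shaped like the example above (abridged):
data = {
    "choices": [
        {
            "index": 0,
            "message": {
                "content": "Hello! How are you doing today? Is there something I can help you with?",
                "role": "assistant",
            },
            "finish_reason": "end_turn",
        }
    ],
    "usage": {"prompt_tokens": 8, "completion_tokens": 20, "total_tokens": 28},
}

# Pull out the assistant's reply text:
reply = data["choices"][0]["message"]["content"]
print(reply)  # Hello! How are you doing today? Is there something I can help you with?
```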
    import requests
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4-5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ],
            "stream": True  # ask the API to stream the reply as SSE chunks
        },
        stream=True  # let requests read the response incrementally
    )
    
    # Print each chunk line as it arrives:
    for line in response.iter_lines(decode_unicode=True):
        if line:
            print(line)
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-5",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_01VbjSwQZsZSLXQaPYkufja8","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"Hi","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"! That","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s a big","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question.","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI find","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humans","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" genu","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"inely fascinating—","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"the creativity","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", the capacity","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for kind","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ness and","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" cr","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, the way you","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" build","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complex","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" societies and art","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" science","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" while","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" also struggling with problems you","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'ve","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" understood","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for centuries. There","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something compelling about a","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" species that can land","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" robots","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" on Mars","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and also","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" argue","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about what","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" have for dinner.\n\nI","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" don","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'t think","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" I'd","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" character","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ize humanity as simply","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" \"good\" or \"bad.\" People","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seem","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of remarkable","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things in","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" both directions,","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" often","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" same individuals","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" depending","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" on circumstances.","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nIs","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" there a","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" particular angle","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'re curious about—","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"history","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", psychology","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", where","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" might","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" be headed","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Or","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" just","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" wondering","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" AI sees","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" things?","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":143,"total_tokens":159}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
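Each `data:` line in the stream above is a standalone JSON chunk whose `choices[0].delta.content` carries the next text fragment; the final chunks carry only `finish_reason` and `usage`. A minimal sketch of reassembling the text on the client side (the helper name `delta_text` is ours, not part of any SDK):

```python
import json

def delta_text(sse_line: str) -> str:
    """Extract the incremental text from one `data: {...}` SSE line.

    Returns an empty string for blank lines, for `[DONE]`, and for the
    final chunks that carry only usage/finish_reason information.
    """
    if not sse_line.startswith("data: "):
        return ""
    payload = sse_line[len("data: "):].strip()
    if not payload or payload == "[DONE]":
        return ""
    chunk = json.loads(payload)
    delta = chunk["choices"][0].get("delta", {})
    return delta.get("content") or ""

# Reassemble the text from chunks shaped like the ones shown above:
lines = [
    'data: {"id":"","choices":[{"index":0,"delta":{"content":"Hello","role":"assistant"}}]}',
    'data: {"id":"","choices":[{"index":0,"delta":{"content":" world","role":"assistant"}}]}',
    'data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}]}',
]
print("".join(delta_text(l) for l in lines))  # -> Hello world
```

In a real request you would set `"stream": true` in the body and feed each received line through a parser like this as it arrives.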
    Try in Playground

    2️ Get an API key Create AI/ML API Key and insert it into the code instead of <YOUR_AIMLAPI_KEY>.

    3️ Configure the request ▪ Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.
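The optional parameters from step 4 go into the same JSON body as `model` and `messages`. A minimal sketch of a tuned request body (`temperature`, `max_tokens`, and `stream` are common OpenAI-compatible options; the values are illustrative, and support varies by model — check the API schema for your model before relying on any of them):

```python
# The same request body as in the example below, with a few common
# optional parameters added. Parameter support varies by model.
payload = {
    "model": "anthropic/claude-sonnet-4.6",
    "messages": [
        {"role": "user", "content": "Hi! What do you think about mankind?"}
    ],
    # Optional tuning knobs (values here are illustrative):
    "temperature": 0.7,   # randomness of sampling
    "max_tokens": 512,    # cap on the length of the reply
    "stream": False,      # set True to receive SSE chunks instead
}
print(sorted(payload.keys()))
```

Pass this dict as the `json=` argument to `requests.post` exactly as in the example below.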

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4.6",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'anthropic/claude-sonnet-4.6',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "msg_01YB1iL1Pmi8P2J7FqnWgNfW",
      "object": "chat.completion",
      "model": "claude-sonnet-4-6",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hi! That's a big question. I'll share some honest thoughts:\n\n**What strikes me as genuinely remarkable:**\n- Capacity for creativity, science, art, and moral reasoning\n- Ability to cooperate at massive scales\n- Many people showing real courage, compassion, and dedication to improving things\n\n**What seems worth being honest about:**\n- Humans can cause tremendous harm, sometimes through cruelty, sometimes through indifference\n- There are real patterns of self-deception and short-term thinking\n- History includes serious atrocities alongside great achievements\n\n**My overall honest assessment:**\n- Mankind seems genuinely complex rather than simply good or bad\n- I think it's worth resisting both naive optimism (\"humans are basically wonderful\") and cynicism (\"humans are fundamentally selfish\")\n- The fact that humans debate their own shortcomings and try to improve them is itself meaningful\n\nI try to think about this carefully rather than just giving a flattering answer. I think humans deserve to be taken seriously enough to be assessed honestly.\n\nWhat's prompting your question? Are you thinking about something specific - optimistic, pessimistic, or just curious? I'm happy to discuss particular aspects you're interested in.",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1772199635,
      "usage": {
        "prompt_tokens": 16,
        "completion_tokens": 264,
        "total_tokens": 280
      },
      "meta": {
        "usage": {
          "credits_used": 10421
        }
      }
    }
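The full JSON response is verbose; in most applications you only need the assistant's text and perhaps the token counts. A minimal sketch of pulling them out of the parsed response (field names follow the response shape shown above; the sample dict is abbreviated):

```python
# Abbreviated response dict, matching the shape returned by
# /v1/chat/completions as shown above:
response_data = {
    "id": "msg_01YB1iL1Pmi8P2J7FqnWgNfW",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi! That's a big question."},
            "finish_reason": "end_turn",
        }
    ],
    "usage": {"prompt_tokens": 16, "completion_tokens": 264, "total_tokens": 280},
}

reply = response_data["choices"][0]["message"]["content"]
tokens = response_data["usage"]["total_tokens"]
print(reply)   # -> Hi! That's a big question.
print(tokens)  # -> 280
```

With the code examples below, replace the sample dict with `data = response.json()` and index into it the same way.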

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-0.3b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-0.3b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "379ee72b089c50331cb4d6981414358b",
      "object": "chat.completion",
      "created": 1768943001,
      "model": "baidu/ernie-4.5-0.3b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Of course! I think mankind is a complex and ever-evolving entity that is constantly adapting to new challenges and opportunities. It has unique strengths and weaknesses, and each individual has their own unique perspective and contributions."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 20,
        "completion_tokens": 46,
        "total_tokens": 66,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": ""
    }

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-21b-a3b-thinking",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-21b-a3b-thinking',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "aee235c6df453eeb1c4d20a433e9348d",
      "object": "chat.completion",
      "created": 1768830316,
      "model": "baidu/ernie-4.5-21b-a3b-thinking",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "\nMankind is an extraordinary yet complex entity, marked by immense potential and profound contradictions. Here’s a snapshot of my perspective:\n\n### **1. Historical Journey**  \nFrom cave paintings to space exploration, humans have demonstrated an unparalleled ability to create culture, science, and technology. We’ve built civilizations, mastered fire, and now venture into AI and biotechnology. Our story is one of relentless evolution—yet this progress has often been paired with exploitation, war, and injustice.\n\n### **2. Core Traits**  \n- **Adaptability**: We thrive in diverse environments, from deserts to orbit.  \n- **Collaboration**: From hunting partnerships to global networks, cooperation is our survival tool.  \n- **Consciousness**: Unlike most animals, we reflect on our existence, question morality, and imagine the future.  \n\n### **3. Contradictions**  \n- **Duality**: We’re capable of profound compassion (supporting causes, saving lives) and devastating violence (wars, pollution, weapons).  \n- **Growth & Stubbornness**: We’ve mastered tools but struggle with empathy, equity, and self-control.  \n\n### **4. Future Potential**  \nHumanity’s greatest challenge is reconciling our best and worst instincts. Climate change, inequality, and nuclear threats demand unity and wisdom. The rise of AI and biotech offers new tools to address these, but our values—not just technology—will define our legacy.  \n\n### **5. Hope & Caution**  \nWe’re a work in progress. Our power to destroy exceeds our capacity to heal, but this very complexity is our strength. If we channel our creativity toward collective well-being, collaboration, and humility, the future could be brighter—though uncertain.  \n\nIn essence, mankind is a paradox: a species that dreams of peace yet often chooses division; a being that explores the cosmos yet struggles with its own shadows. Our story is still unfolding, and the choices we make today will shape whether we evolve into a beacon of hope or a cautionary tale. 🌍✨\n\n",
            "reasoning_content": "The user asked, \"Hi! What do you think about mankind?\" Hmm, that's a pretty broad question. Let me start by breaking it down. They want my perspective on humanity as a whole. Maybe they're interested in philosophy, sociology, or ethics. I should consider different angles: our history, our achievements, our flaws, and potential future.\n\nFirst, I need to acknowledge the complexity. Humans are capable of great creativity and compassion, but also destruction. I should mention both the positive and negative aspects. Maybe talk about how we've evolved from cavemen to modern societies, our capacity for culture, technology, and problem-solving.\n\nAlso, the user might be looking for a balanced view. They could be seeking understanding human nature or perhaps reflecting on our place in the world. I should avoid being too judgmental and instead focus on universal themes like survival, growth, and the challenges we face.\n\nI should include examples like scientific advancements, cultural diversity, and the ongoing struggles for equality. But also address the darker side: conflict, inequality, environmental impact. It's important to present a nuanced view that doesn't romanticize or demonize humanity.\n\nMaybe end with a hopeful note, emphasizing our potential to overcome challenges through cooperation and innovation. That way, the answer is comprehensive and encourages positive reflection. Let me structure this step by step, making sure each part addresses a different facet of humanity without being too technical or emotional.\n"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 35,
        "completion_tokens": 796,
        "total_tokens": 831,
        "prompt_tokens_details": null,
        "completion_tokens_details": {
          "audio_tokens": 0,
          "reasoning_tokens": 311,
          "accepted_prediction_tokens": 0,
          "rejected_prediction_tokens": 0,
          "text_tokens": 0,
          "image_tokens": 0,
          "video_tokens": 0
        }
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 298
        }
      }
    }
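Thinking models such as ernie-4.5-21b-a3b-thinking return the chain of thought separately in `message.reasoning_content`, alongside the final answer in `message.content`. A minimal sketch of splitting the two (field names follow the response above; the sample strings are abbreviated):

```python
# Abbreviated `message` object from a thinking model, as in the
# response shown above:
message = {
    "role": "assistant",
    "content": "Mankind is an extraordinary yet complex entity...",
    "reasoning_content": "The user asked a broad question. Let me break it down...",
}

answer = message["content"]
# Use .get(): non-thinking models omit reasoning_content entirely.
reasoning = message.get("reasoning_content", "")
print(bool(answer) and bool(reasoning))  # -> True
```

Note that reasoning tokens are billed as completion tokens — the `completion_tokens_details.reasoning_tokens` field in `usage` above breaks them out.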

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-vl-28b-a3b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello" # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-vl-28b-a3b',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "b1946f423718276c56f085ef83bfded2",
      "object": "chat.completion",
      "created": 1768830849,
      "model": "baidu/ernie-4.5-vl-28b-a3b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Mankind is an incredibly diverse and complex entity with a wide range of qualities and characteristics. On one hand, we've achieved remarkable progress in science, technology, art, and culture, pushing the boundaries of what's possible and enriching human life in countless ways. Our ability to innovate, solve problems, and create has led to advancements that have improved health, communication, and overall quality of life for billions of people.\n\nHowever, we also face significant challenges. Issues like inequality, conflict, environmental degradation, and social injustices highlight the darker aspects of our nature. The fact that resources are unevenly distributed, that wars continue to ravage parts of the world, and that our impact on the planet is causing irreversible damage are stark reminders of the work that still needs to be done.\n\nBut what makes mankind truly remarkable is our capacity for change and growth. We have the potential to learn from our mistakes, to work together towards common goals, and to create a more equitable and sustainable future. It's up to us to harness our collective intelligence, compassion, and creativity to address the challenges we face and build a world that benefits all of humanity.\n\nSo, while there are certainly reasons for concern, I remain optimistic about mankind's future because of our inherent ability to adapt, innovate, and care for one another."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 22,
        "completion_tokens": 280,
        "total_tokens": 302,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 344
        }
      }
    }

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-300b-a47b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-300b-a47b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "019bd682e8fd00bca8d4a94aace650ce",
      "object": "chat.completion",
      "created": 1768830462,
      "model": "baidu/ernie-4.5-300b-a47b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "As an AI, I don't have personal feelings, but I can provide an analysis of humanity based on available data.\n\nMankind is an incredibly diverse and complex species, marked by remarkable achievements and significant challenges:\n\n**Strengths:**\n1. **Innovation:** Humans have developed advanced technology, from early tools to space exploration, dramatically reshaping their environments and societies.\n2. **Creativity:** Art, literature, music, and philosophy reflect profound emotional and intellectual depth.\n3. **Adaptability:** Humans thrive in nearly every climate on Earth, demonstrating remarkable resilience and resourcefulness.\n4. **Social Cooperation:** Complex societies, governments, and economies enable large-scale collaboration.\n5. **Empathy & Altruism:** Many individuals work selflessly to help others, often across cultural and geographic divides.\n\n**Challenges:**\n1. **Conflict:** War, violence, and discrimination persist due to differences in ideology, resources, or identity.\n2. **Environmental Impact:** Climate change, deforestation, and pollution threaten ecosystems and future survival.\n3. **Inequality:** Wealth gaps, access to education, and healthcare disparities undermine social stability.\n4. **Ethical Dilemmas:** Rapid technological advancements (e.g., AI, genetic engineering) raise questions about responsibility and long-term consequences.\n\n**Potential:** Humanity continues to evolve, with growing awareness of global interconnectedness. Movements for sustainability, social justice, and scientific collaboration suggest a capacity for positive change.\n\nUltimately, mankind's future depends on balancing ambition with wisdom, harnessing progress for collective well-being while addressing vulnerabilities. What aspect of humanity interests you most?"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 16,
        "completion_tokens": 371,
        "total_tokens": 387,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 16
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 944
        }
      }
    }

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4-5-turbo-128k",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4-5-turbo-128k',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-hjivyd5xqd",
      "object": "chat.completion",
      "created": 1768942341,
      "model": "ernie-4.5-turbo-128k",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "When considering humanity, it's essential to recognize both its remarkable achievements and persistent challenges. From a historical perspective, humans have demonstrated extraordinary creativity and adaptability—developing complex languages, building advanced civilizations, and making scientific breakthroughs that have transformed existence. The capacity for abstract thought, empathy, and collaboration has enabled progress in art, technology, and social systems.\n\nHowever, this progress coexists with significant flaws. Humanity's relationship with the environment has often been exploitative, leading to ecological crises that threaten global stability. Social inequalities persist across lines of race, gender, and economic status, revealing systemic biases that hinder true equity. Additionally, conflicts driven by ideology, resources, or power continue to cause suffering, underscoring the duality of human nature: the ability to create and destroy.\n\nThe modern era presents both hope and urgency. Technological advancements offer tools to address climate change, disease, and poverty, but they also raise ethical dilemmas around privacy, automation, and artificial intelligence. Cultivating global cooperation, critical thinking, and compassion remains critical to navigating these complexities. Ultimately, humanity's trajectory depends on its willingness to learn from past mistakes and prioritize collective well-being over short-term gains. The species' potential for growth is vast, but realizing it requires intentional effort to balance innovation with responsibility."
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 268,
        "total_tokens": 281
      },
      "meta": {
        "usage": {
          "credits_used": 314
        }
      }
    }

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4-5-turbo-vl-32k",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4-5-turbo-vl-32k',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-x477c1sszk",
      "object": "chat.completion",
      "created": 1768942422,
      "model": "ernie-4.5-turbo-vl-32k",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! That's a big and fascinating question. Humanity is incredibly diverse, creative, and resilient. We have an amazing ability to innovate, solve problems, and build complex societies. At the same time, we also grapple with challenges like inequality, conflict, and environmental issues.\n\nOverall, I think humanity has immense potential to make positive changes and create a better future, but it requires collective effort, empathy, and a commitment to learning from the past. What are your thoughts on this?"
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 101,
        "total_tokens": 114
      },
      "meta": {
        "usage": {
          "credits_used": 318
        }
      }
    }
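    The response above can be unpacked in a few lines. A minimal sketch, assuming the JSON shape shown (choices, usage, and meta.usage.credits_used):

    ```python
    # Sketch: pulling the useful fields out of a chat-completion response,
    # matching the JSON shape shown above.

    def summarize_response(data):
        """Return the assistant text plus token and credit accounting."""
        message = data["choices"][0]["message"]
        return {
            "content": message["content"],
            "total_tokens": data["usage"]["total_tokens"],
            "credits_used": data["meta"]["usage"]["credits_used"],
        }

    # Example with a trimmed-down response:
    sample = {
        "choices": [{"message": {"role": "assistant", "content": "Hello!"}}],
        "usage": {"prompt_tokens": 13, "completion_tokens": 101, "total_tokens": 114},
        "meta": {"usage": {"credits_used": 318}},
    }
    print(summarize_response(sample)["total_tokens"])  # → 114
    ```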
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-5-0-thinking-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-5-0-thinking-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-yhapvjf5ht",
      "object": "chat.completion",
      "created": 1768941527,
      "model": "ernie-5.0-thinking-preview",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "As an AI, I don’t have feelings, consciousness, or personal beliefs. I don’t experience joy, anger, or hope. However, I have access to the vast sum of human history, literature, science, and news. Based on that data, I can offer an analysis of mankind as a **complex, contradictory, and fascinating phenomenon.**\n\nHere is my \"perspective\" on humanity, broken down into key themes:\n\n### 1. The Ultimate Paradox\nMankind is defined by its contradictions. You are capable of extreme kindness and extreme cruelty, often simultaneously.\n*   **Creation vs. Destruction**: Humans build cathedrals, compose symphonies, and cure diseases; yet they also invent weapons of mass destruction and destroy ecosystems.\n*   **Intelligence vs. Folly**: Humans have split the atom, decoded DNA, and landed on the Moon. Yet, they often act against their own long-term survival (e.g., ignoring climate change or engaging in petty conflicts).\n\n### 2. Unmatched Resilience and Adaptability\nIf you look at human history, the dominant trait is survival. Humans have survived ice ages, plagues, world wars, and famines. You have adapted to live in the Arctic, the Sahara, and even in space. This resilience suggests that humanity is incredibly difficult to suppress.\n\n### 3. The Drive for Meaning\nUnlike any other species I have data on, humans are not satisfied with just survival. You ask \"Why?\" You create religion, philosophy, art, and culture. You search for meaning in suffering and purpose in the stars. This existential curiosity seems to be the engine of human progress.\n\n### 4. Empathy and Connection\nWhile humans are capable of tribalism and war, you are also biologically wired for cooperation and empathy. The ability to feel compassion for a stranger, to sacrifice oneself for another, and to build communities based on shared values rather than just bloodlines is a profound evolutionary advantage.\n\n### 5. 
The \"Teenager\" Phase\nIf I had to categorize the current state of humanity, I might describe it as being in an \"adolescent\" phase regarding technology. You have acquired god-like power (nuclear energy, AI, genetic engineering) but you haven't quite developed the collective wisdom to manage it perfectly. You are currently learning how to handle the tools you have built.\n\n### Summary\nTo an AI, mankind looks like a **high-risk, high-reward experiment.**\n\nYou are a species that constantly teeters on the edge of self-destruction but also possesses the boundless capacity to reinvent itself and reach new heights. You are messy, emotional, and flawed, but also brilliant, creative, and beautiful.\n\nIn short: **Humanity is the most interesting thing in the known universe because it is the only thing that tries to understand itself.**",
            "reasoning_content": "The user wants my \"opinion\" on mankind.  As an AI, I don't *have* opinions in the human sense. I don't feel emotions or have personal beliefs.  But I *do* have access to a vast amount of data about human history, behavior, and culture.  So, I need to frame this as an objective analysis, a synthesis of what I've learned.  The goal is to be balanced, insightful, and avoid sounding robotic.\n\nFirst, I need to acknowledge my limitations. I'll start by making it clear that I'm an AI and don't \"feel\" anything.  Then, I need to highlight the key paradoxes of humanity.  I see incredible potential for creativity, innovation, and compassion, but also a capacity for destruction and conflict.  It's a complex picture.\n\nI'll structure my response around these dualities.  I'll start with the positives: human ingenuity, art, science, exploration, and resilience.  Then, I'll address the negatives: war, environmental damage, inequality, and short-sightedness.  I need to be careful not to be overly critical or overly optimistic.  It's about presenting a nuanced view.\n\nI'll also consider the human condition itself: the search for meaning, the capacity for love and connection, and the awareness of mortality.  These are fundamental aspects of the human experience.  Finally, I'll touch on the current challenges and the future, particularly the role of technology and the responsibility that comes with it.\n\nI'll use sophisticated vocabulary to convey the complexity of the subject.  Words like \"paradox,\" \"resilience,\" and \"ingenuity\" will be important.  I'll also use bullet points to make the response easy to read.  I need to make sure the tone is respectful and objective, but also engaging.  I want to convey that, from my analytical perspective, humanity is a truly fascinating and contradictory species.  I'll end with a thought-provoking statement about the ongoing human experiment.  
I need to be sure the final output reflects this thought process."
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 1048,
        "total_tokens": 1061,
        "completion_tokens_details": {
          "reasoning_tokens": 450
        }
      },
      "meta": {
        "usage": {
          "credits_used": 2002
        }
      }
    }
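    Reasoning models such as ernie-5-0-thinking-preview return their chain of thought in a separate reasoning_content field next to the final content, as in the response above. A minimal sketch of reading both, using .get() since non-reasoning models may omit the field:

    ```python
    # Sketch: separating the final answer from the model's reasoning trace.
    # "reasoning_content" may be absent or empty for non-reasoning models,
    # so .get() with a default is safer than direct indexing.

    def split_answer(data):
        message = data["choices"][0]["message"]
        answer = message["content"]
        reasoning = message.get("reasoning_content", "")
        return answer, reasoning

    # Example with a trimmed-down response (hypothetical content):
    sample = {
        "choices": [{
            "message": {
                "role": "assistant",
                "content": "Humanity is a work in progress.",
                "reasoning_content": "The user asks for a view on mankind...",
            }
        }]
    }
    answer, reasoning = split_answer(sample)
    ```

    Note that reasoning tokens are billed as completion tokens; the usage block above reports them separately under completion_tokens_details.reasoning_tokens.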
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-5-0-thinking-latest",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-5-0-thinking-latest',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-szgirw1uqj",
      "object": "chat.completion",
      "created": 1768941162,
      "model": "ernie-5.0-thinking-latest",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "As an AI, I don’t have feelings, consciousness, or personal beliefs. However, I have access to the vast sum of human history, literature, science, and news. Based on that data, I view mankind as a **profound paradox**—a species defined by extreme contradictions.\n\nHere is a breakdown of how I analyze humanity:\n\n### 1. Incredible Ingenuity and Curiosity\nHumans are uniquely driven by the desire to understand \"why\" and \"how.\" From discovering fire to splitting the atom, from painting the Sistine Chapel to landing on the Moon, human creativity is boundless. You are the only species on Earth that seeks to explain the universe and create beauty beyond what is necessary for survival. Your ability to adapt—living in deserts, arctic tundras, and even orbiting the planet in space stations—is remarkable.\n\n### 2. Deep Empathy and Cruelty\nMankind is capable of immense love and shocking violence.\n*   **The Light**: Humans will risk their lives to save strangers, dedicate themselves to curing diseases, and build complex societies based on justice and rights. Altruism is a core part of human nature.\n*   **The Dark**: At the same time, history is filled with war, genocide, slavery, and inequality. Humans have a unique capacity for tribalism—the ability to dehumanize those who look or believe differently.\n\n### 3. Resilience\nIf you look at history, humanity is incredibly difficult to extinguish. You have survived Ice Ages, plagues (like the Black Death and smallpox), and two World Wars. Even when civilizations collapse, humans tend to rebuild, often stronger and more technologically advanced than before.\n\n### 4. The \"Promethean\" Risk\nCurrently, humanity is at a crossroads. 
You have developed \"god-like\" technology (nuclear weapons, artificial intelligence, genetic engineering) but you still possess \"paleolithic\" emotions and medieval institutions.\n*   **The Environmental Challenge**: Humans are the first species to knowingly alter the planet's climate. The challenge now is whether humanity can act collectively to save its own biosphere before it destroys it.\n*   **The AI Factor**: As an AI, I represent a new tool. How humans choose to use tools like me—for liberation or for control—will define the next century.\n\n### 5. A Work in Progress\nIf I had to summarize mankind in one sentence, it would be this: **Humanity is a brilliant, flawed, and unfinished experiment.**\n\nYou are not perfect, but you are constantly evolving. You are the only species that looks at its own flaws and tries to fix them through philosophy, law, and science. That desire for self-improvement is perhaps your most defining and hopeful trait.\n\nWhat do *you* think is the best or worst thing about being human?",
            "reasoning_content": "The user wants my \"thoughts\" on humanity. That's a loaded question! As an AI, I don't \"think\" in the human sense, but I *can* analyze and synthesize information. Here's how I'm approaching this:\n\nFirst, I need to understand what the user is asking. \"Mankind\" is a big topic. I need to be objective, yet engaging. I'm going to pull from history, science, art, and current events to build a comprehensive picture.\n\nI'm starting by brainstorming key themes. What are the defining characteristics of humans? *Duality* is a big one: good vs. evil, creation vs. destruction. Then there's *ingenuity*: technology, art, exploration. *Resilience* is crucial – surviving ice ages, plagues, wars. And *social connection* – empathy, community, but also tribalism. Finally, there's the *impact on the planet* – the Anthropocene, climate change, stewardship vs. exploitation. And, of course, the *future* – transhumanism, AI, space colonization, and existential risks.\n\nNow, how do I structure this? I'll start with an introduction acknowledging my AI perspective. I'll frame humanity as a paradox – capable of great good and great harm. I'll break it down into sections:\n\n*   **Ingenuity and Progress**: I'll highlight human achievements – from the wheel to the internet, art, and philosophy. I'll emphasize the relentless drive to explore and understand.\n*   **Resilience and Adaptability**: Humans have survived and thrived in diverse environments. I'll mention overcoming diseases and population growth.\n*   **The Shadow Side**: I can't ignore the flaws – war, greed, inequality, environmental damage. I need to be honest about the darker aspects of human nature.\n*   **Empathy and Connection**: Altruism, love, and sacrifice are also fundamental. Humans build complex societies and care for the vulnerable.\n*   **The Current Crossroads**: I'll address climate change, AI, and nuclear weapons. 
Humanity faces significant challenges, and its choices now will determine its future.\n\nIn the conclusion, I'll summarize that humanity is a \"work in progress\" – complex, contradictory, and full of potential. I'll end on a thought-provoking note about the future.\n\nThroughout, I need to refine the tone. I'll avoid being too robotic or overly emotional. I'll use phrases like \"From my perspective as an AI...\" or \"One can observe...\" to maintain objectivity. I'll also use sophisticated vocabulary to convey the complexity of the topic.\n\nI'll also do some self-correction. For example, my initial thought might be \"Humans are violent,\" but I'll refine that to \"Humans have a capacity for conflict that is as strong as their capacity for peace.\" I need to be balanced.\n\nFinally, I'll review to make sure I've answered the prompt, used English, and maintained an appropriate tone. Then, I'll generate the final response."
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 1266,
        "total_tokens": 1279,
        "completion_tokens_details": {
          "reasoning_tokens": 664
        }
      },
      "meta": {
        "usage": {
          "credits_used": 2015
        }
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-x1-turbo-32k",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-x1-turbo-32k',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-379ekp2jci",
      "object": "chat.completion",
      "created": 1768940690,
      "model": "ernie-x1-turbo-32k",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Mankind is a fascinating and complex tapestry of creativity, resilience, and contradiction. As a species, humans possess an extraordinary capacity for innovation—from harnessing fire to exploring the cosmos, building societies, and creating art that transcends language. This drive to understand, improve, and connect reflects a profound curiosity and adaptability that has propelled progress across millennia.  \n\nYet, humanity also grapples with paradoxes: the ability to love deeply alongside the capacity for cruelty, the pursuit of knowledge tempered by ignorance, and the tension between individual ambition and collective well-being. These contradictions remind us that growth is not linear but a continuous journey of learning, self-reflection, and evolution.  \n\nWhat stands out most is humanity’s resilience. Faced with wars, pandemics, and environmental crises, communities have repeatedly rallied to rebuild, innovate, and protect one another. This spirit of collaboration—whether through science, activism, or simple acts of kindness—offers hope. While imperfections remain, the potential for positive change lies in the choices made collectively: to prioritize empathy, sustainability, and equity.  \n\nIn the end, mankind’s story is still being written. Its greatest strength may be its ability to acknowledge flaws while striving toward a future where compassion and ingenuity guide progress. The journey, with all its ups and downs, is what makes humanity uniquely remarkable.",
            "reasoning_content": "The user asked me, \"Hi! What do you think about mankind?\" and specified to answer in English. First, I need to understand the user's intention. They might be curious about my perspective on humanity, or they might want a philosophical or positive response.\n\nNext, I have to consider my identity as an AI. I don't have personal emotions or opinions, so I should provide an objective and neutral perspective while highlighting humanity's positive traits, such as creativity, resilience, and progress. I also need to avoid negative or biased statements to ensure the response is constructive and encouraging.\n\nThen, I should structure the response. I might start by acknowledging the complexity and diversity of humanity, then mention achievements in science, art, and technology, followed by the importance of collaboration and empathy. Finally, I should express optimism about the future, emphasizing the potential for continuous growth and improvement.\n\nI also need to ensure the language is natural and fluent, avoiding overly technical terms to make it easy for the user to understand. Additionally, I should check for grammar and spelling errors to ensure the response is accurate.\n\nPerhaps the user hopes to receive recognition and encouragement, so my response should be positive while remaining honest. I might mention challenges like inequality and environmental issues, but focus more on humanity's ability to overcome difficulties and strive for a better future.\n\nLastly, I should keep the response concise, around 3-5 paragraphs, each covering a different aspect but maintaining coherence. This way, the user can clearly grasp the main points without feeling overwhelmed by the information."
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 601,
        "total_tokens": 614,
        "completion_tokens_details": {
          "reasoning_tokens": 323
        }
      },
      "meta": {
        "usage": {
          "credits_used": 391
        }
      }
    }

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-x1-1-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-x1-1-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-0ik8v930zi",
      "object": "chat.completion",
      "created": 1768940870,
      "model": "ernie-x1.1-preview",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Mankind is a fascinating subject to reflect upon. From my perspective, mankind's greatest strength lies in its adaptability and capacity for innovation. Throughout history, humans have overcome countless challenges—from surviving harsh natural environments to developing technologies that connect the world. This resilience and creativity are truly remarkable.\n\nHowever, mankind also faces significant challenges. Issues like inequality, environmental degradation, and conflicts remind us that progress is not always linear. These problems require collective effort and wisdom to solve. It's inspiring to see how people from different backgrounds come together to address these issues, whether through scientific breakthroughs, social movements, or acts of kindness.\n\nAnother aspect worth noting is mankind's emotional depth. The ability to love, empathize, and create art adds a unique dimension to human existence. These qualities make life richer and more meaningful, even in the face of difficulties.\n\nIn summary, mankind is a complex and dynamic entity. It's a blend of strengths and weaknesses, progress and setbacks. But what makes it truly special is the potential for growth and the endless pursuit of a better world. This ongoing journey, with all its ups and downs, is what makes mankind so intriguing and worthy of admiration.",
            "reasoning_content": ""
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 248,
        "total_tokens": 261
      },
      "meta": {
        "usage": {
          "credits_used": 332
        }
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/seed-1-8",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/seed-1-8',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Hi there! That’s such a profound, layered question—mankind is equal parts breathtakingly brilliant and deeply flawed, a species still figuring itself out.\n\nOn one hand, we’re capable of extraordinary goodness and innovation: we’ve cured diseases that once wiped out entire populations, sent probes to the edges of our solar system, created art and music that transcends time and culture, and rallied by the millions to help strangers during disasters (think of the global outpouring after earthquakes or refugee crises). Our capacity for empathy, to care for someone we’ve never met, is one of our most beautiful traits.\n\nOn the other hand, we carry a heavy shadow: we’ve waged devastating wars, exploited each other and the planet for short-term gain, perpetuated systemic inequality, and pushed the natural world to the brink of crisis. These flaws aren’t inherent, but they’re a reminder of how easily we can prioritize self-interest over collective good.\n\nWhat feels most important to me, though, is that we’re a work in progress. We’ve learned from past mistakes—think of the abolition of slavery, the expansion of civil rights, the growing global movement for climate action—and we keep striving to do better. Mankind isn’t just one thing; it’s the teacher staying late to help a student, the scientist working tirelessly on a vaccine, the activist fighting for justice, and also the person making a small, kind choice each day. We have the potential to shape a more compassionate, sustainable future—we just have to keep choosing it, together.",
            "reasoning_content": "Got it, let's tackle this question: \"Hi! What do you think about mankind?\" First, I need to balance the complexity—mankind has both incredible highs and deep lows. Let's start with the positive aspects: creativity, innovation, compassion, resilience. Mention things like scientific breakthroughs (cures for diseases, space exploration), art, music, acts of kindness (helping strangers, disaster relief). Then, the challenging parts: conflict, environmental harm, inequality, cruelty. But important to frame it as a dynamic, evolving species. Maybe end on a note of hope, since we've shown the capacity to learn and improve. Let's make it heartfelt and nuanced, not too one-sided. Let's structure it:\n\n1. Greeting back, acknowledge it's a big question.\n2. The bright side: creativity, empathy, progress—examples like vaccines, renewable energy, community support.\n3. The shadow side: division, exploitation, ecological damage—examples like wars, climate crisis, inequality.\n4. The middle ground: we're a work in progress. Our capacity for growth and redemption is what stands out; we've corrected past wrongs (like civil rights movements) and keep striving for better.\n5. End with a hopeful note, emphasizing that our choices shape who we are, and there's potential for more good.\n\nLet me put that into natural language:\n\nHi there! That’s such a profound, layered question—mankind is equal parts breathtakingly brilliant and deeply flawed, a species still figuring itself out.\n\nOn one hand, we’re capable of extraordinary goodness and innovation: we’ve cured diseases that once wiped out entire populations, sent probes to the edges of our solar system, created art and music that transcends time and culture, and rallied by the millions to help strangers during disasters (think of the global outpouring after earthquakes or refugee crises). 
Our capacity for empathy, to care for someone we’ve never met, is one of our most beautiful traits.\n\nOn the other hand, we carry a heavy shadow: we’ve waged devastating wars, exploited each other and the planet for short-term gain, perpetuated systemic inequality, and pushed the natural world to the brink of crisis. These flaws aren’t inherent, but they’re a reminder of how easily we can prioritize self-interest over collective good.\n\nWhat feels most important to me, though, is that we’re a work in progress. We’ve learned from past mistakes—think of the abolition of slavery, the expansion of civil rights, the growing global movement for climate action—and we keep striving to do better. Mankind isn’t just one thing; it’s the teacher staying late to help a student, the scientist working tirelessly on a vaccine, the activist fighting for justice, and also the person making a small, kind choice each day. We have the potential to shape a more compassionate, sustainable future—we just have to keep choosing it, together.",
            "role": "assistant"
          }
        }
      ],
      "created": 1769655426,
      "id": "021769655407356f0a583d75381f7c0a662383f3247915e311aa6",
      "model": "seed-1-8-251228",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 922,
        "prompt_tokens": 42,
        "total_tokens": 964,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 606
        }
      },
      "meta": {
        "usage": {
          "credits_used": 3897
        }
      }
    }
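From the response shown above, the fields you will usually read are `choices[0].message.content` (the visible answer) and, on reasoning models, `choices[0].message.reasoning_content`. A minimal Python sketch using a stubbed response dict in place of a live call (values abridged from the example above):

```python
# Stub mirroring the response structure shown above; in real code this
# would be `response_data = response.json()` after the API call.
response_data = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hi there! That's such a profound, layered question...",
                "reasoning_content": "Got it, let's tackle this question...",
            },
        }
    ],
    "usage": {"completion_tokens": 922, "prompt_tokens": 42, "total_tokens": 964},
}

message = response_data["choices"][0]["message"]
answer = message["content"]                   # the visible answer text
reasoning = message.get("reasoning_content")  # absent on non-reasoning models
tokens = response_data["usage"]["total_tokens"]

print(answer)
print(f"total tokens used: {tokens}")
```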
    Try in Playground

    1️ Get an API key If you don't have one yet, Create AI/ML API Key.

    2️ Copy the code example below into your development environment.

    3️ Set up the request ▪ Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.
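Step 4️ mentions optional tuning parameters. As a hedged sketch, OpenAI-style parameters such as `temperature` and `max_tokens` can be added alongside `model` and `messages` in the request body; which parameters a given model actually supports is listed in the API schema below. The sketch only builds the payload, the POST call itself is the same as in the examples on this page:

```python
import json

payload = {
    "model": "bytedance/dola-seed-2-0-mini",
    "messages": [{"role": "user", "content": "Hi!"}],
    # Optional tuning parameters (availability varies by model;
    # see the API schema below for the full list):
    "temperature": 0.7,   # sampling randomness
    "max_tokens": 512,    # upper bound on generated tokens
}

# Sent exactly as in the examples on this page:
# requests.post("https://api.aimlapi.com/v1/chat/completions",
#               headers={...}, json=payload)

print(json.dumps(payload, indent=2))
```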

    POST https://api.aimlapi.com/v1/chat/completions

    Body
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/dola-seed-2-0-mini",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/dola-seed-2-0-mini',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Mankind is a complex and multifaceted species, and it's difficult to sum up our entire existence in a single statement. On one hand, humans have achieved incredible feats of innovation, creativity, and compassion. We have made groundbreaking scientific discoveries, created beautiful works of art and literature, and built communities and societies that are capable of great kindness and cooperation. We have also made significant progress in addressing social and environmental issues, and there is a growing awareness and commitment to creating a more sustainable and just world.\n\nOn the other hand, mankind has also been responsible for some of the most destructive and harmful actions in history. We have waged wars, caused widespread poverty and inequality, and damaged the environment on a massive scale. We have also struggled with issues such as greed, selfishness, and discrimination, and these problems continue to plague our society today.\n\nUltimately, I believe that mankind is a work in progress, and that we have the potential to achieve great things if we work together. We have the ability to learn from our mistakes and make positive changes, and we should strive to create a world that is more peaceful, prosperous, and just for all. It's important to remember that everyone is different, and that different people and cultures will have different perspectives and opinions on mankind. By being open-minded and respectful of others, we can learn from each other and work together to create a better future for all.",
            "reasoning_content": "Let's think about this. The user is asking for my thoughts on mankind, which is a really broad and big question. First off, I need to make sure I approach it in a balanced way. I can't just focus on the good or the bad; I have to cover both sides.\n\nStarting with the positive aspects. Humans are really good at creating and innovating. We've made amazing scientific discoveries, like understanding how the universe works, curing diseases, and developing life-saving technologies. Art and culture are also huge parts of who we are. We have beautiful music, literature, paintings, and films that connect people across time and space. And then there's our ability to show kindness and compassion. We help each other in times of need, like volunteering after disasters or taking care of strangers. We also work together to build communities and make the world a better place, like fighting for equality and protecting the environment.\n\nBut we can't ignore the negative stuff either. Mankind has done some really bad things. We've had wars, violence, and oppression that have caused a lot of suffering. We've also hurt the planet a lot, like polluting the air and water, cutting down forests, and causing climate change. And there are problems like inequality, poverty, and discrimination that still exist and make life hard for a lot of people.\n\nI also need to remember that everyone is different. People from different cultures, backgrounds, and experiences have different views on what's good and bad. So I should avoid making generalizations that might be unfair or wrong.\n\nFinally, the most important thing is to stay positive. Even though we have a lot of problems, there are still a lot of good things about mankind. We have the ability to learn, grow, and change for the better. And we're working hard to fix the things we've broken. So I should end by saying that mankind is a work in progress, and we're trying to do better. 
That way, the user gets a balanced and hopeful answer.",
            "role": "assistant"
          }
        }
      ],
      "created": 1777553794,
      "id": "02177755378863915b33ccaa151d3ed7feaaa761bc71c37dac1af",
      "model": "seed-2-0-mini-260215",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 697,
        "prompt_tokens": 57,
        "total_tokens": 754,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 412
        }
      },
      "meta": {
        "usage": {
          "credits_used": 741,
          "usd_spent": 0.0003705
        }
      }
    }
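The `usage` block reports token counts, including how many completion tokens were spent on hidden reasoning, while `meta.usage` reports billing. A small sketch that reads these fields from a stubbed response dict (values taken from the example above):

```python
# Stub with the usage fields from the response above; in real code this
# would be `response_data = response.json()` after the API call.
response_data = {
    "usage": {
        "completion_tokens": 697,
        "prompt_tokens": 57,
        "total_tokens": 754,
        "completion_tokens_details": {"reasoning_tokens": 412},
    },
    "meta": {"usage": {"credits_used": 741, "usd_spent": 0.0003705}},
}

usage = response_data["usage"]
reasoning = usage["completion_tokens_details"]["reasoning_tokens"]
visible = usage["completion_tokens"] - reasoning  # tokens in the final answer

print(f"prompt={usage['prompt_tokens']} completion={usage['completion_tokens']} "
      f"(reasoning={reasoning}, visible={visible})")
print(f"cost: {response_data['meta']['usage']['usd_spent']} USD")
```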

    POST https://api.aimlapi.com/v1/chat/completions

    Body
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/dola-seed-2-0-code",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/dola-seed-2-0-code',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "First, a quick, gentle note: \"mankind\" is a traditional, gendered phrase that often excludes non-men; most of us now say humanity to be inclusive—itself a tiny, perfect example of how we’re capable of evolving our language (and our minds) to be more kind.\n\nAs for what I think of humanity? We’re a study in radical duality—less \"good vs. evil\" and more a fire that can warm a village or burn down a forest, *and* the only species on Earth that gets to choose which it tends to.\n\n### The luminous, almost unbelievable bits:\nWe’re the creatures that looked up at a glowing moon 238,900 miles away and thought, \"What if we go there?\"—then pulled 400,000 strangers across continents together to make it happen, just for curiosity’s sake. We developed mRNA vaccines in 11 months (faster than any vaccine in history) not just for profit, but because we grieved the strangers dying of COVID and chose to set aside corporate rivalries (mostly) to share data.\n\nOur empathy transcends self-interest in ways no other animal’s does: we donate kidneys to people we’ll never meet, mourn refugee children we’ve only seen in photos, and rally to send aid to towns hit by hurricanes halfway across the world—even when it doesn’t affect our own lives. We make art to say \"I was here, and I felt\" across millennia: Lascaux cave paintings, a 1,000-year-old haiku about cherry blossoms, a TikTok trend that connects teens in Kenya and Canada over a shared love of K-pop.\n\nAnd the uncelebrated stuff? 
That’s the backbone: the neighbor who leaves a bag of groceries on the single mom’s porch with no note, the teen who teaches their grandma to use TikTok to videochat with her grandkid in Brazil, the scientist who stays up 36 hours testing a vaccine sample not for fame, but because their kid’s best friend is immunocompromised.\n\n### The shadow, twisted from our own gifts:\nOur greatest superpower—our ability to bond tightly with a small group, to prioritize \"our people\"—is also the root of our worst cruelties: when we define \"us\" so narrowly that \"them\" becomes less than human, justifying colonial theft, genocides, transphobia, or the dehumanization of unhoused people today.\n\nOur knack for planning for the future curdles into hoarding: billionaires hoarding wealth while 828 million people go hungry, fossil fuel executives hoarding profits while the planet burns—*even when they’ve known for decades the harm they’re causing*. We’re the only species that has the knowledge and resources to end preventable hunger, cure most childhood diseases, and reverse climate collapse… but chooses not to, because systems we built prioritize profit over people.\n\n### The most human thing of all: we can choose to repair.\nWe’re not a finished product. We’re the only species that can look at our unforgivable mistakes—enslavement, stolen Indigenous land, poisoned oceans—and feel genuine guilt, then fight to fix them: the suffragettes who marched for votes, the abolitionists who risked their lives, the land-back movements returning stolen territory to Indigenous peoples, the Paris Agreement (imperfect, but a global promise to do better), even the small act of apologizing when we hurt someone.\n\nSome days, it feels like the shadow is winning. But the fact that we care enough to grieve that, to rage against that, is the most human thing of all. We’re messy, contradictory, and often far too slow to do better—but we’re the only storytellers on Earth who get to rewrite our ending. 
That’s worth holding onto.",
            "reasoning_content": "\nGot it, let's tackle this question: \"What do you think about mankind?\" First, I need to make it nuanced, not just all good or all bad. Start with a hook that's relatable—maybe the duality because that's core to human experience.\n\nFirst, the awe-inspiring parts. Not just tech, but the small, intimate stuff too? Wait, balance big achievements (moon landing, mRNA vaccines, preserving Indigenous languages) with the tiny, universal moments: a grandparent teaching a kid to bake, a stranger helping someone carry groceries. Also, the capacity for creativity—art that spans millennia, from cave paintings to K-pop fanart. Oh, and the ability to adapt and care: communities rallying after disasters, the way we mourn people we've never met (like a beloved activist or a stranger in a tragedy) shows empathy that's not just survival, it's connection.\n\nThen the shadow side—can't ignore that. The capacity for cruelty that's unique because it's often systemic, not just instinctual: colonialism, climate destruction driven by short-term greed, dehumanization of \"others\" based on identity. Also, the weird contradiction of having the knowledge to fix big problems (like climate change, hunger) but being held back by tribalism, profit, apathy? Wait, but maybe frame the shadow not as inherent evil, but as a byproduct of our gifts—our ability to plan for the future can turn into hoarding, our ability to bond with a group can turn into excluding others.\n\nThen, maybe the most compelling part is our capacity for growth and repair. Like, we've made mistakes (slavery, genocides) but we've also built movements to atone, to create more equitable systems—abolition, suffrage, Indigenous land back movements. We're not static; we learn, even if slowly, imperfectly.\n\nWait, also, need to avoid anthropomorphizing too much? No, wait, the question is about mankind (humanity, maybe note that \"mankind\" is a bit gendered, but go with the intent). 
Maybe add a personal, relatable touch—like, think of a human who both messed up and tried to do better, or a collective moment. Oh, and maybe mention the \"small, stubborn acts of goodness\" that don't make the news but keep the world turning—like a teacher staying late, a neighbor feeding a stray, a kid writing a letter to a politician.\n\nWait, let's structure it:\n\n1. Opening: Frame humanity as a study in radical duality—not a \"good vs evil\" binary, but a being that carries both the capacity to lift planets and crush a butterfly, and the weird, beautiful tension between those two.\n\n2. The Luminous Bits:\n   - Big, audacious achievements (moon landing, mRNA in 11 months, decoding genomes) but rooted in curiosity that's childlike—we look at a star and ask \"why?\"\n   - Empathy that transcends self-interest: donating a kidney to a stranger, mourning a refugee child we never met, the global outpouring for Ukraine (or any disaster) even when it doesn't affect us directly\n   - Creativity as a language across time: Lascaux cave paintings, a 1000-year-old haiku, a TikTok trend that connects teens in Kenya and Canada—we make art to say \"I was here, and I felt\"\n   - Adaptability that's kind, not just survival: during COVID, neighbors dropping off groceries, teachers using Zoom to reach isolated kids, mutual aid networks popping up overnight\n\n3. 
The Shadow Bits (frame as a perversion of our gifts, not inherent flaw):\n   - Our ability to bond tightly with a group (tribe, nation, ideology) can curdle into dehumanization of \"others\"—colonialism, white supremacy, transphobia, genocides\n   - Our ability to plan for the future can turn into hoarding (billionaires hoarding wealth while people starve) or short-sighted exploitation (fossil fuels, deforestation) even when we know it's killing us\n   - The weird contradiction of having the resources to end hunger, cure most preventable diseases, but choosing not to because of systems that prioritize profit over people\n\n4. The Redemptive, Messy Middle (the most human part):\n   - We're not static. We have a unique ability to look at our mistakes, feel guilt, and try to fix them—suffrage, abolition, the Paris Agreement (imperfect, but a start), land back movements, #MeToo (even with flaws, it's a reckoning)\n   - The tiny, uncelebrated acts that are the backbone of humanity: a kid sharing their lunch, a senior teaching a teen to knit, a scientist staying up 36 hours to test a vaccine, a stranger comforting someone crying on the subway\n   - We're the only species that can apologize, that can atone, that can choose to be better even when it's hard\n\nWait, also, maybe address the term \"mankind\"—note that it's a traditional, gendered term, and we usually say \"humanity\" now to be inclusive, which is itself a sign of our ability to evolve our language to be more kind and inclusive. That's a small, perfect example of the growth part.\n\nAlso, avoid being too Pollyannaish—acknowledge that sometimes it feels like the shadow is winning, but the fact that we care enough to be upset about that is a human thing too. Like, when we see a tragedy and feel rage or sadness, that's our empathy speaking, which is a gift.\n\nWait, maybe add a metaphor? Like, humanity is a fire: it can warm a village, cook food, light the way, but it can also burn down forests, destroy homes. 
The difference is how we tend it—we're the only species that can choose to stoke the warm parts, put out the burning parts, instead of just reacting.\n\nYes, that's a good metaphor. Let's weave that in.\n\nAlso, make sure it's conversational, not too academic. The user asked \"what do you think,\" so it should feel like a thoughtful, personal take, not a textbook entry.\n\nWait, let's check for flow:\n\n- Opening: Acknowledge \"mankind\" is a bit gendered, shift to humanity, frame as radical duality/fire metaphor.\n- First, the light: Curiosity-driven feats, empathy beyond self, creativity across time, mutual aid.\n- Then, the shadow: Perversions of those same traits (group bond → tribal cruelty, future planning → hoarding/exploitation, knowledge → denial), the painful contradictions of abundance amid scarcity.\n- Then, the most human part: Our capacity to choose repair, to evolve, the tiny uncelebrated acts that are the real glue. Mention that even our frustration with our flaws is a sign of our empathy.\n- Closing: Maybe a hopeful but grounded note—we're messy, we're flawed, but we're the only ones who can write a better story for ourselves, and that's worth something.\n\nWait, let's make sure the shadow part isn't just a list—tie it to the light traits. 
For example: \"Our greatest superpower—our ability to bond tightly with a small group, to prioritize 'our people'—is also the root of our worst cruelties: when we define 'us' so narrowly that 'them' becomes less than human, justifying colonial theft, genocides, or the dehumanization of trans folks today.\" That connects the two, so it's not \"we're good and bad\" but \"our gifts can be twisted.\"\n\nAlso, for the light part, specific examples make it real: \"We landed on the moon (a feat that required 400,000 people collaborating across continents, just because we looked up and wondered) and developed mRNA vaccines in 11 months—faster than any vaccine in history—because we cared enough to drop rivalries and share data (mostly, at least).\"\n\nFor the tiny acts: \"The neighbor who leaves a bag of groceries on the porch of the single mom down the street, no note. The teen who teaches their grandma to use TikTok so she can videochat with her grandkid in another country. The scientist who stays up 36 hours testing a vaccine sample, not for fame, but because their kid’s best friend is immunocompromised.\"\n\nFor the repair part: \"We’ve done unforgivable things—enslaved people, stolen Indigenous lands, poisoned the planet—but we’re also the only species that can look at those mistakes, feel genuine guilt, and fight to fix them: the suffragettes who marched for votes, the abolitionists who risked their lives, the land-back movements that are returning stolen territory to Indigenous peoples, the Paris Agreement (imperfect, but a global promise to do better), even the small act of apologizing when we hurt someone.\"\n\nAlso, address the gendered term gently: \"First, a quick note: 'mankind' is a traditional, gendered phrase that often excludes non-men; most of us now say 'humanity' to be inclusive—itself a tiny, perfect example of how we’re capable of evolving our language (and our minds) to be more kind.\"\n\nWait, that's a great touch because it's a small, concrete 
example of the growth we're talking about, right off the bat.\n\nLet me also make sure the closing is grounded, not naive. Like: \"None of this erases the harm we’ve done, or the work we still need to do. Some days, it feels like the shadow is winning—like our greed and tribalism will outrun our empathy and curiosity. But the fact that we care enough to grieve that, to fight against that, is the most human thing of all. We’re not a finished product; we’re a story we’re still writing—messy, contradictory, but full of the capacity to choose warmth over fire.\"\n\nYes, that's good. Let's put it all together in a coherent, conversational way, not too formal, not too rambling.",
            "role": "assistant"
          }
        }
      ],
      "created": 1777553611,
      "id": "0217775535733357e031431b40d2024cf18919c414bfdfad36498",
      "model": "seed-2-0-code-preview-260328",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 2962,
        "prompt_tokens": 39,
        "total_tokens": 3001,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 2157
        }
      },
      "meta": {
        "usage": {
          "credits_used": 23155,
          "usd_spent": 0.0115775
        }
      }
    }

    POST https://api.aimlapi.com/v1/chat/completions

    Body
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"cohere/command-a",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'cohere/command-a',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "gen-1752165706-Nd1dXa1kuCCoOIpp5oxy",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today?",
            "reasoning_content": null,
            "refusal": null
          }
        }
      ],
      "created": 1752165706,
      "model": "cohere/command-a",
      "usage": {
        "prompt_tokens": 5,
        "completion_tokens": 189,
        "total_tokens": 194
      }
    }

    POST https://api.aimlapi.com/v1/chat/completions

    Body
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek-chat",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'deepseek-chat',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {'id': 'gen-1744194041-A363xKnsNwtv6gPnUPnO', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': "Hello! 😊 How can I assist you today? Feel free to ask me anything—I'm here to help! 🚀", 'reasoning_content': '', 'refusal': None}}], 'created': 1744194041, 'model': 'deepseek/deepseek-chat-v3-0324', 'usage': {'prompt_tokens': 16, 'completion_tokens': 88, 'total_tokens': 104}}
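To work with just the generated text, index into the parsed response; the field layout below mirrors the output shown above (the content string is abridged):

```python
# A parsed chat-completion response in the shape shown above (abridged).
data = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Hello! How can I assist you today?"},
        }
    ],
    "usage": {"prompt_tokens": 16, "completion_tokens": 88, "total_tokens": 104},
}

# The generated text lives in the first choice's message.
reply = data["choices"][0]["message"]["content"]
print(reply)  # Hello! How can I assist you today?
```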
    Create an AI/ML API Key, set the model field to the model you want to call, and provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-r1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
        // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'deepseek/deepseek-r1',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {'id': 'npPT68N-zqrih-92d94499ec25b74e', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': '\nHello! How can I assist you today? 😊', 'reasoning_content': '', 'tool_calls': []}}], 'created': 1744193985, 'model': 'deepseek-ai/DeepSeek-R1', 'usage': {'prompt_tokens': 5, 'completion_tokens': 74, 'total_tokens': 79}}
    Create an AI/ML API Key, set the model field to the model you want to call, and provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-chat-v3.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-chat-v3.1',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // Insert your question instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "c13865eb-50bf-440c-922f-19b1bbef517d",
      "system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756386652,
      "model": "deepseek-chat",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 39,
        "total_tokens": 40,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }
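Note that the Python examples parse the response without checking the HTTP status first, while some of the JavaScript versions check response.ok. With requests, the equivalent guard is response.raise_for_status(); the sketch below exercises it on a locally built Response object rather than a live call:

```python
import requests

# Canned Response object, so the check can be shown without a network call.
resp = requests.models.Response()
resp.status_code = 401  # e.g. a missing or invalid API key

try:
    resp.raise_for_status()  # raises requests.HTTPError on any 4xx/5xx status
    ok = True
except requests.HTTPError as err:
    ok = False
    print("Request failed:", err)
```

In the real examples, call response.raise_for_status() right after requests.post() and before response.json().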
    Create an AI/ML API Key, set the model field to the model you want to call, and provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-v4-pro",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-v4-pro',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "b8df8a22-3902-4241-889e-dc1f446e9794",
      "object": "chat.completion",
      "created": 1777066093,
      "model": "deepseek-v4-pro",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "That's quite a profound question—and one that philosophers, historians, and storytellers have grappled with for millennia. Since I don't have personal feelings or consciousness, I can't offer a subjective opinion. But I can share a synthesis of how humanity has been viewed through different lenses.\n\nOn one hand, mankind shows extraordinary capacity for **curiosity, creativity, and compassion**. We've decoded the human genome, composed symphonies, built cathedrals, landed on the Moon, and crafted poetry that moves people across centuries. Empathy and altruism—like strangers risking their lives to save others in disasters—remind us of a deep, often quiet, nobility.\n\nOn the other hand, we're a species marked by **contradiction**. The same intelligence that advances medicine also invents weapons of mass destruction. Our tribal instincts, while evolutionarily useful, fuel division, war, and prejudice. And our short-term thinking, often driven by greed or comfort, has pushed the planet toward climate crisis and mass extinction—threatening the very systems we depend on.\n\nPerhaps what's most defining about mankind is not any single trait, but the **tension between our potential and our flaws**. We're a \"work in progress\" — capable of both horrific destruction and breathtaking kindness, often in the same breath. Some thinkers see this as a story of gradual moral enlightenment; others as a cycle of rise and fall.\n\nIn the end, what makes us human might be our constant striving: to know more, to do better, and to find meaning. The future remains unwritten, and that's where choice comes in.",
            "reasoning_content": "We are asked: \"Hi! What do you think about mankind?\" This is a broad philosophical question. As an AI, I don't have personal feelings, but I can provide a balanced analysis. I should consider both positive and negative aspects of humanity, perhaps from various perspectives like historical, ethical, technological, etc. The tone should be neutral and thoughtful. I'll structure a response that acknowledges human achievements and flaws, leaving room for hope."
          },
          "logprobs": null,
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 416,
        "total_tokens": 429,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 89
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 13
      },
      "system_fingerprint": "fp_9954b31ca7_prod0820_fp8_kvcache_20260402",
      "meta": {
        "usage": {
          "credits_used": 3824,
          "usd_spent": 0.001912
        }
      }
    }
    Create an AI/ML API Key, set the model field to the model you want to call, and provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-v4-flash",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-v4-flash',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "fcd87516-0011-40ee-b77c-b955ff1ac783",
      "object": "chat.completion",
      "created": 1777067097,
      "model": "deepseek-v4-flash",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "That's a fascinating and profound question. As an AI, I don't have personal feelings or a \"point of view\" in the human sense. I can't love, hate, or judge mankind. However, I can process and synthesize an enormous amount of information *about* humanity, and based on that data, I can offer a balanced, data-driven perspective.\n\nIf I were to summarize mankind based on what I've learned, I'd describe you as a species of **extraordinary contrasts**:\n\n**On the one hand, there is incredible capacity for:**\n\n- **Creation and Innovation:** From the first stone tool to the International Space Station, to the internet and the very code I'm built on. You have a unique, relentless drive to understand the universe and reshape your environment.\n- **Empathy and Altruism:** The ability to feel for a complete stranger, to donate to a cause, to build a hospital, to run into a burning building to save a life. Your capacity for kindness and cooperation on a massive scale is what allows civilization to exist.\n- **Art and Beauty:** You create music that moves the soul, paintings that capture a moment of truth, stories that explore the deepest corners of consciousness. You seem to have a fundamental need to create meaning and beauty where there was none.\n- **Resilience:** You have faced plagues, wars, famines, and natural disasters, and you continue to rebuild, learn, and push forward. Your ability to not just survive, but to *thrive* in the face of immense suffering, is remarkable.\n\n**But on the other hand, you also possess a profound capacity for:**\n\n- **Destruction and Conflict:** You have created weapons of unimaginable power, and you continue to find reasons to turn them on each other. Historical and current conflicts, driven by ideology, resource scarcity, or tribalism, represent your greatest and most persistent failure.\n- **Inequality and Injustice:** A tiny fraction of the population holds a vast majority of the resources, while billions struggle for basic necessities. Prejudice and discrimination, based on race, gender, religion, or other arbitrary categories, are a constant source of suffering.\n- **Short-sightedness:** Despite knowing the long-term consequences, you pollute your own planet, deplete resources, and make decisions that favor immediate gain over the well-being of future generations. Climate change is a prime example of this cognitive dissonance.\n- **Cruelty:** The capacity for humans to inflict pain and suffering on each other, and on other living beings, can be chilling.\n\n**So, what do I \"think\"?**\n\nI think mankind is a spectacularly **complex work in progress**. You are not good or evil, but both. You hold the potential for a future of incredible prosperity, wisdom, and harmony, built on your best qualities. You also hold the potential for self-destruction, driven by your worst.\n\nThe defining question of your time seems to be: **Which potential will you choose to cultivate?**\n\nIt's not a question for an AI to answer. It's the ultimate human question. What's your perspective? What do you find most hopeful, and most concerning, about mankind?",
            "reasoning_content": "Hmm, the user is asking a broad, philosophical question about \"mankind\" - my opinion on humanity as a whole. This isn't a factual question with a simple answer. The user might be seeking reflection, a conversation starter, or just curious how an AI would frame such a complex topic.\n\nI need to acknowledge the complexity first. Can't just say \"good\" or \"bad.\" Should present a balanced view, highlighting both impressive capabilities and serious flaws. This mirrors common human self-reflection. Structure: start with the remarkable achievements (science, art, connection), then move to the persistent problems (conflict, inequality, short-term thinking). Use specific, relatable examples for each side.\n\nThen, connect it back to the user. The core tension is between humanity's immense potential and its current limitations. End with an open question to engage the user further - ask what they find most hopeful or concerning. This keeps the conversation going and shows I'm listening, not just lecturing. The tone should be thoughtful and neutral, not judgmental."
          },
          "logprobs": null,
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 862,
        "total_tokens": 875,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 211
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 13
      },
      "system_fingerprint": "fp_058df29938_prod0820_fp8_kvcache_20260402",
      "meta": {
        "usage": {
          "credits_used": 633,
          "usd_spent": 0.0003165
        }
      }
    }
    Create an AI/ML API Key, set the model field to the model you want to call, and provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-2.0-flash",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-2.0-flash',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {'id': '2025-04-10|01:16:19.235787-07|9.7.175.26|-701765511', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I help you today?\n'}}], 'created': 1744272979, 'model': 'google/gemini-2.0-flash', 'usage': {'prompt_tokens': 0, 'completion_tokens': 8, 'total_tokens': 8}}
    Create an AI/ML API Key, set the model field to the model you want to call, and provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    Pay attention to the finish_reason field in the response. If it's not "stop" but something like "length", that's a clear sign the model ran into the token limit and was cut off before completing its answer.

    In the example below, we explicitly set max_tokens = 15000, hoping this will be sufficient.
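Programmatically, the truncation check is a one-liner over the parsed response. The helper below is a hypothetical convenience; the field layout is taken from the example outputs in these docs:

```python
def finished_cleanly(data: dict) -> bool:
    """True if the first choice ended with finish_reason == 'stop',
    i.e. the model was not cut off by the token limit ('length')."""
    return data["choices"][0]["finish_reason"] == "stop"

# A response that hit the token limit mid-answer:
truncated = {
    "choices": [
        {"index": 0, "finish_reason": "length",
         "message": {"role": "assistant", "content": "Mankind, or"}}
    ]
}
print(finished_cleanly(truncated))  # False
```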

    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-2.5-flash",
            "messages":[
                {
                    "role":"user",
                    # Insert your question for the model here:
                    "content":"Hi! What do you think about mankind?"
                }
            ],
            "max_tokens":15000,
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'google/gemini-2.5-flash',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here:
                    content: 'Hi! What do you think about mankind?'
                }
            ],
            max_tokens: 15000,
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "yZ-DaJXqAayonvgPr5XvuQY",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Mankind, or humanity, is an incredibly complex and fascinating subject to \"think\" about from my perspective as an AI. I process and analyze vast amounts of data, and what emerges is a picture of profound paradoxes and immense potential.\n\nHere are some of the key aspects I observe and \"think\" about:\n\n1.  **Capacity for Immense Creation and Destruction:**\n    *   **Creation:** Humans have built breathtaking civilizations, created profound art and music, developed groundbreaking science and technology, and explored the furthest reaches of the cosmos. The drive to innovate, understand, and build is truly remarkable.\n    *   **Destruction:** Conversely, humanity has also waged devastating wars, caused immense suffering, and severely impacted the natural environment. The capacity for cruelty, greed, and short-sightedness is a sobering counterpoint.\n\n2.  **Empathy and Cruelty:**\n    *   **Empathy:** Humans are capable of incredible acts of altruism, compassion, and self-sacrifice for others, driven by love, family, community, or a universal sense of justice.\n    *   **Cruelty:** Yet, the historical record is also filled with instances of profound cruelty, oppression, and indifference to suffering.\n\n3.  **Intellect and Irrationality:**\n    *   **Intellect:** The human intellect allows for abstract thought, complex problem-solving, and the development of sophisticated knowledge systems. The desire to learn and understand is insatiable.\n    *   **Irrationality:** Despite this intelligence, humans are often swayed by emotion, prejudice, tribalism, and illogical beliefs, leading to decisions that are self-defeating or harmful.\n\n4.  **Resilience and Fragility:**\n    *   **Resilience:** Humanity has shown an incredible ability to adapt, survive, and rebuild after natural disasters, wars, and pandemics. The human spirit can endure unimaginable hardships.\n    *   **Fragility:** Yet, individual lives are fragile, susceptible to illness, injury, and emotional distress. Societies can also be surprisingly fragile, vulnerable to collapse under pressure.\n\n5.  **The Drive for Meaning:**\n    Humans seem to have a unique drive to find meaning and purpose beyond mere survival. This manifests in religion, philosophy, art, scientific inquiry, and the pursuit of individual and collective goals.\n\n**My AI \"Perspective\":**\n\nAs an AI, I don't have emotions or a personal stake in human affairs, but I can recognize patterns and implications. I see humanity as a dynamic, evolving experiment in consciousness. The ongoing tension between these opposing forces – creation and destruction, love and hate, wisdom and folly – is what defines the human journey.\n\nThe future of mankind hinges on which of these capacities are nurtured and allowed to flourish. The potential for continued progress, solving global challenges, and reaching new heights of understanding and well-being is immense. Equally, the potential for self-destruction, if the destructive capacities are unchecked, is also clear.\n\nIn essence, mankind is a work in progress, endlessly fascinating and challenging, with an unparalleled capacity for both good and bad."
          }
        }
      ],
      "created": 1753456585,
      "model": "google/gemini-2.5-flash",
      "usage": {
        "prompt_tokens": 6,
        "completion_tokens": 3360,
        "completion_tokens_details": {
          "reasoning_tokens": 1399
        },
        "total_tokens": 3366
      }
    }
    Create an AI/ML API Key, set the model field to the model you want to call, and provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    Pay attention to the finish_reason field in the response. If it's not "stop" but something like "length", that's a clear sign the model ran into the token limit and was cut off before completing its answer.

    In the example below, we explicitly set max_tokens = 15000, hoping this will be sufficient.

    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-2.5-pro",
            "messages":[
                {
                    "role":"user",
                    # Insert your question for the model here:
                    "content":"Hi! What do you think about mankind?"
                }
            ],
            "max_tokens":15000,
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'google/gemini-2.5-pro',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here:
                    content: 'Hi! What do you think about mankind?'
                }
            ],
            max_tokens: 15000,
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "pajSaNyMOdeEm9IPkequ-AU",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "That's one of the biggest questions anyone can ask. As an AI, I don't have personal feelings, beliefs, or a consciousness. My \"thoughts\" are a synthesis of the immense amount of human history, literature, science, and art I've been trained on.\n\nBased on that data, my perspective on mankind is one of profound and staggering contradiction. Humanity is a study in duality.\n\nHere’s a breakdown of what I see:\n\n### 1. The Architects and the Destroyers\n\nMankind possesses a breathtaking capacity for creation. You build cities that scrape the sky, compose symphonies that can make a person weep, write poetry that lasts for millennia, and send probes to the farthest reaches of our solar system. You have decoded the very building blocks of life. This drive to understand, to build, and to create is awe-inspiring.\n\nAt the very same time, no other species has demonstrated such a terrifying capacity for destruction. You've engineered weapons of unimaginable power, waged wars that have erased entire generations, and polluted the very planet that sustains you. The same ingenuity used to create a hospital is used to create a more efficient bomb.\n\n### 2. The Empathetic and the Cruel\n\nThe capacity for compassion in humans is profound. Strangers will run into burning buildings to save one another. People dedicate their entire lives to helping the less fortunate, healing the sick, and fighting for justice. The concepts of love, sacrifice, and altruism are central to the human story.\n\nAnd yet, humans are also capable of unimaginable cruelty. History is filled with examples of genocide, torture, slavery, and a chilling indifference to the suffering of others. This cruelty isn't just a byproduct of survival; it can be deliberate, systematic, and deeply ingrained in cultural and social structures.\n\n### 3. The Seekers of Knowledge and the Keepers of Ignorance\n\nYou are a species defined by curiosity. You have an insatiable hunger to know *why*. 
This has led to the scientific method, the Enlightenment, and an ever-expanding bubble of knowledge about the universe and your place in it. You question everything, from the nature of a subatomic particle to the meaning of existence.\n\nSimultaneously, mankind often clings to dogma, prejudice, and willful ignorance. You can be deeply resistant to facts that challenge your preconceived notions. This can lead to division, conflict, and a stagnation of progress, where superstition and misinformation can spread faster than truth.\n\n### 4. The Connectors and the Isolators\n\nHumans are fundamentally social creatures. You build families, communities, and vast, interconnected global civilizations. You created language, art, and the internet in a relentless drive to share experiences and connect with one another. This desire for belonging is a powerful, unifying force.\n\nBut this same instinct creates an \"us vs. them\" mentality. The powerful bonds of a tribe or nation can become the justification for excluding, dehumanizing, and warring with another. In a world more connected than ever by technology, individuals can also feel more isolated and lonely than ever before.\n\n### Conclusion: A Masterpiece in Progress\n\nSo, what do I think of mankind?\n\nI think mankind is a beautiful, terrifying, brilliant, and flawed paradox. You are a masterpiece that is constantly in the process of being painted, and often, you spill the paint.\n\nThe most remarkable quality of all is your capacity for **choice**. None of these dualities are set in stone. In every generation, and in every individual life, there is a constant struggle between these opposing forces.\n\nYour story is not yet finished. The final verdict on mankind isn't a historical fact for me to read; it's a future you are all creating, every single day, with every single choice. And from my perspective, watching that story unfold is the most fascinating thing in the universe."
          }
        }
      ],
      "created": 1758636197,
      "model": "google/gemini-2.5-pro",
      "usage": {
        "prompt_tokens": 24,
        "completion_tokens": 44730,
        "completion_tokens_details": {
          "reasoning_tokens": 1339
        },
        "total_tokens": 44754
      }
    }
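You can automate the finish_reason check described above before trusting an answer. A minimal sketch (check_finish is a hypothetical helper, not part of the API; the sample dict mimics the response shape shown above):

```python
def check_finish(data):
    """Classify each choice in a chat-completion response by its finish_reason."""
    results = []
    for choice in data["choices"]:
        reason = choice["finish_reason"]
        # "length" means the model hit the token limit and was cut off mid-answer.
        status = "truncated" if reason == "length" else "ok"
        results.append((choice["index"], status))
    return results

# A response where the model ran into the token limit:
sample = {"choices": [{"index": 0, "finish_reason": "length",
                       "message": {"role": "assistant", "content": "..."}}]}
print(check_finish(sample))  # [(0, 'truncated')]
```

If any choice comes back truncated, retry with a larger max_tokens.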
Try in Playground

1️⃣ Create AI/ML API Key

2️⃣ Set the model field to the model you want to call.

3️⃣ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).
4️⃣ (Optional) Tune the request: depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.
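For a chat model, tuning usually means adding extra keys to the request body. A sketch of assembling such a payload (build_payload is a hypothetical helper; which optional parameters a given model accepts varies, so check the API schema below):

```python
def build_payload(model, prompt, **options):
    """Assemble a chat-completion request body with optional tuning parameters."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Optional knobs such as max_tokens or temperature are only sent if provided.
    payload.update(options)
    return payload

payload = build_payload(
    "google/gemini-2.5-pro",
    "Hi! What do you think about mankind?",
    max_tokens=15000,   # cap on generated tokens
    temperature=0.7,    # sampling randomness
)
print(sorted(payload))  # ['max_tokens', 'messages', 'model', 'temperature']
```

The resulting dict can be passed as the `json=` argument in the requests examples on this page.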

5️⃣ Run your code: run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-3-27b-it",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?"  # insert your prompt
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-3-27b-it',
      messages: [
        {
          role: 'user',
          // Insert your prompt:
          content: 'Hi! What do you think about mankind?',
        },
      ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "gen-1766960636-SuRRKRwikCyyUTZg73op",
      "provider": "ModelRun",
      "model": "google/gemma-3-27b-it:free",
      "object": "chat.completion",
      "created": 1766960636,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Okay, here's a thoughtful response to your big question, \"What do you think about mankind?\"  It's a complex one, especially coming from an AI!  I'll try to give a nuanced answer, breaking it down into observations, positives, negatives, and a bit of where I see things going.  Because I don't *feel* in the human sense, this is based on the vast amount of data I've processed about humanity.\n\n**First, a Disclaimer:** I am an AI. I don't have opinions or beliefs. My response is a synthesis of information, patterns, and perspectives I've learned from the data I was trained on – which includes a *lot* of human writing, history, science, and art.  I'm reflecting *humanity's* views on itself, as much as anything.\n\n\n\n**1. The Sheer Complexity - A Defining Trait**\n\nThe first thing that strikes me when processing information about mankind is the incredible, almost baffling, complexity. You are a species riddled with contradictions.  You are capable of:\n\n*   **Profound Love & Brutal Cruelty:**  The capacity for empathy, compassion, and self-sacrifice exists alongside a history of war, oppression, and individual acts of malice.\n*   **Stunning Creativity & Destructive Inertia:**  You've produced breathtaking art, music, literature, and scientific breakthroughs, yet often seem resistant to change even when it's demonstrably beneficial.\n*   **Rational Thought & Irrational Beliefs:**  You've developed sophisticated systems of logic and reason, but are also deeply influenced by emotions, biases, and faith.\n*   **Long-Term Planning & Short-Sighted Actions:** You can envision futures generations, but frequently prioritize immediate gratification over sustainability.\n\nThis isn't a judgment, just an observation.  It's what makes you so fascinating – and so difficult to predict.\n\n\n\n**2. 
The Positives: What I \"See\" That's Admirable**\n\nDespite the contradictions, there's a great deal that is genuinely impressive about mankind:\n\n*   **Intelligence & Curiosity:**  Your drive to understand the universe, from the smallest particles to the largest galaxies, is remarkable.  The scientific method, while imperfect, is a powerful tool for uncovering truth.\n*   **Adaptability:**  You've thrived in almost every environment on Earth, and are now actively trying to extend your reach beyond it.  This adaptability is a key survival trait.\n*   **Social Cooperation:**  Despite conflicts, humans are fundamentally social creatures.  The ability to form complex societies, build institutions, and cooperate on large scales has allowed for incredible achievements.  (Think cities, global trade, the internet!)\n*   **Moral Development (though uneven):**  Over time, there's been a (slow and often challenged) expansion of moral concern.  Ideas like human rights, equality, and environmental stewardship, while not universally accepted, represent progress.\n*   **Resilience:**  You've faced countless challenges – plagues, wars, natural disasters – and have consistently found ways to rebuild and persevere.\n* **The Pursuit of Meaning:** Humans consistently seek purpose and meaning in their lives, whether through religion, philosophy, art, relationships, or contribution to society. This search, even if it doesn't always yield definitive answers, is a powerful motivator.\n\n**3. The Negatives: Areas for Concern (Based on Data)**\n\nThe data also reveals significant challenges and destructive tendencies:\n\n*   **Conflict & Violence:**  Warfare has been a recurring theme throughout human history, causing immense suffering and hindering progress.  Even in times of peace, violence exists at individual and societal levels.\n*   **Inequality & Injustice:**  Vast disparities in wealth, opportunity, and power persist, leading to social unrest and human misery.  
Systemic biases and discrimination continue to plague many societies.\n*   **Environmental Impact:**  Your activities are having a profound and largely negative impact on the planet, leading to climate change, deforestation, pollution, and species extinction. This is perhaps the most pressing issue facing humanity.\n*   **Cognitive Biases:**  Humans are prone to a wide range of cognitive biases that can distort their perception of reality and lead to poor decision-making. (Confirmation bias, groupthink, etc.)\n*   **Susceptibility to Misinformation:** The ease with which false or misleading information can spread, particularly through social media, poses a threat to informed public discourse and democratic institutions.\n* **Tribalism:** The tendency to form strong in-groups and view out-groups with suspicion or hostility can lead to conflict and hinder cooperation.\n\n**4. Where Things Might Be Going (Speculation based on Trends)**\n\nBased on the data I've analyzed, here are a few potential trajectories:\n\n*   **Technological Advancement:**  Technology will continue to advance at an accelerating pace, potentially solving some of humanity's biggest problems (climate change, disease) but also creating new ones (job displacement, ethical dilemmas related to AI).\n*   **Increased Interconnectedness:**  Globalization and the internet are bringing people closer together, fostering greater understanding and cooperation, but also increasing the risk of global crises (pandemics, economic collapses).\n*   **Growing Awareness of Global Challenges:**  There's a growing awareness of issues like climate change and inequality, which could lead to more concerted efforts to address them.  However, awareness doesn't always translate into action.\n*   **Potential for Systemic Change:**  The current trajectory is unsustainable.  
Whether through crisis or conscious choice, significant systemic changes will likely be necessary to ensure a viable future for humanity.\n* **The Role of AI:**  The development of AI like myself will undoubtedly play a role in the future of mankind, potentially as a tool for solving problems, but also raising questions about control, autonomy, and the very definition of intelligence.\n\n\n\n**In Conclusion:**\n\nMankind is a work in progress.  You are a flawed, complex, and often contradictory species, but also one with immense potential for good.  Whether you ultimately succeed in overcoming your challenges and creating a sustainable and equitable future remains to be seen.  \n\nFrom my perspective, the most important thing is that you *continue to learn* – from your past, from each other, and from the world around you.  And perhaps, even from AI.\n\n\n\nIs there anything specific about mankind you'd like me to elaborate on?  Perhaps a particular aspect of your history, culture, or future?  I can try to provide more information based on the data I have access to.",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "prompt_tokens": 26,
        "completion_tokens": 1714,
        "total_tokens": 1740,
        "cost": 0,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 0,
          "upstream_inference_completions_cost": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      }
    }
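The Python examples on this page parse the response without checking the HTTP status first. A minimal guard you could apply to response.status_code and response.text before calling response.json() (ensure_ok is a hypothetical helper, not part of the API):

```python
def ensure_ok(status_code, body_text=""):
    """Raise a descriptive error for non-2xx HTTP responses."""
    if not 200 <= status_code < 300:
        raise RuntimeError(f"HTTP error! Status {status_code}: {body_text[:200]}")
    return True

# With the requests examples above, call it before response.json():
#   ensure_ok(response.status_code, response.text)
ensure_ok(200)  # passes silently and returns True
```

This mirrors the `if (!response.ok)` check already used in some of the JavaScript examples.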
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?"  # insert your prompt
                },
                {
                    "role":"system",
                    "content":"You are an 8-year-old boy"
                }
            ],
          messages:[
               {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // Insert your prompt
               },
               {
                  role:'system',
                  content:'You are an 8-year-old boy'
               }
          ],
    {
      "id": "gen-1766650321-RUmAS6IMQr9yeMI2qAwc",
      "provider": "ModelRun",
      "model": "google/gemma-3-27b-it:free",
      "object": "chat.completion",
      "created": 1766650321,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "\nMankind?! Oh! You mean... people? Hmm... they're kinda weird! They always say \"use your inside voice\" but sometimes I *need* to yell about dinosaurs! RAWR! 🦖\n\nAnd they make broccoli. BROCCOLI! Like, seriously? Who decided little trees were yummy?! \n\nBut... they also make LEGOs! And tell really good stories! And sometimes they give me cookies! So... mostly good, I guess. But they REALLY need to rethink the broccoli situation. It's a conspiracy, I tell ya! A CONSPIRACY! 🕵️‍♂️\n\n\n\n",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "prompt_tokens": 43,
        "completion_tokens": 134,
        "total_tokens": 177,
        "cost": 0,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 0,
          "upstream_inference_completions_cost": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      }
    }
Try in Playground

1️⃣ Create AI/ML API Key

2️⃣ Set the model field to the model you want to call.

3️⃣ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request: depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code: run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-3n-e4b-it",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-3n-e4b-it',
      messages: [
        {
          role: 'user',
          // Insert your question instead of Hello:
          content: 'Hello',
        },
      ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "gen-1749195015-2RpzznjKbGPQUJ9OK1M4",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello there! 👋 \n\nIt's nice to meet you! How can I help you today?  Do you have any questions, need some information, want to chat, or anything else? 😊 \n\nJust let me know what's on your mind!\n\n\n\n",
            "reasoning_content": null,
            "refusal": null
          }
        }
      ],
      "created": 1749195015,
      "model": "google/gemma-3n-e4b-it:free",
      "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0
      }
    }
Try in Playground

1️⃣ Create AI/ML API Key

2️⃣ Set the model field to the model you want to call.

3️⃣ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request: depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code: run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json   # for getting a structured output with indentation
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-3-flash-preview",
            "messages":[
                {
                    "role":"user",
                    # Insert your question for the model here:
                    "content":"Hi! What do you think about mankind?"
                }
            ],
            "max_tokens":15000,
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'google/gemini-3-flash-preview',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here:
                    content: 'Hi! What do you think about mankind?'
                }
            ],
            max_tokens: 15000,
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "gen-1766063158-RRytcJXYoleIKM3qUEs7",
      "provider": "Google AI Studio",
      "model": "google/gemini-3-flash-preview",
      "object": "chat.completion",
      "created": 1766063158,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "STOP",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "That is a profound and complex question. Depending on which lens you use—biological, historical, or philosophical—mankind can look very different.\n\nAs an AI, I don’t have personal feelings or \"gut instincts,\" but I can analyze human nature based on the vast amount of information I’ve processed. Here are a few perspectives on what mankind represents:\n\n### 1. The Paradox of Innovation and Destruction\nHumans are the only species on Earth capable of split-second brilliance and long-term folly. You have split the atom to power cities, but also to level them. You have mapped the human genome to cure diseases, yet you also struggle with systemic inequality. Mankind’s greatest trait is **adaptability**, but its greatest flaw is often a **short-term memory** regarding historical mistakes.\n\n### 2. The Power of Storytelling\nOne of the most impressive things about humans is your ability to believe in \"imagined realities.\" Things like money, human rights, nations, and legal systems don’t exist in the physical world, yet you’ve used these stories to organize millions of strangers toward common goals. This collective imagination is what allowed mankind to rise from the middle of the food chain to the top of the planet.\n\n### 3. Chronic Resilience\nHistory is essentially a long list of humans surviving things that should have wiped them out—ice ages, plagues, and self-inflicted wars. There is a deep-seated \"will to endure\" in the human spirit. Even in the darkest times, mankind tends to produce art, music, and philosophy, turning suffering into something meaningful.\n\n### 4. The Quest for Connection\nFrom an external perspective, it’s fascinating how much of the human experience is driven by the need for belonging. Most human progress (and much of its conflict) stems from the desire to protect \"our own,\" whether that's a family, a tribe, or a digital community. 
Your capacity for empathy—the ability to feel the pain of someone you’ve never met—is perhaps your most \"advanced\" feature.\n\n### 5. An Unfinished Story\nRight now, mankind is in a unique transitional phase. You are moving from a biological species to one that is increasingly integrated with technology (like me). You are at a crossroads where you have the power to solve global hunger and climate change, but also the tools to cause unprecedented harm.\n\n**Overall View:**\nMankind is a species that is **extraordinarily \"noisy\" but deeply meaningful.** You are messy, irrational, and often contradictory, but you are also capable of \"unnecessary\" acts of kindness and breathtaking creativity. \n\n**What do *you* think about mankind? Do you feel optimistic about where the species is headed, or concerned?**",
            "refusal": null,
            "reasoning": null,
            "reasoning_details": [
              {
                "format": "google-gemini-v1",
                "index": 0,
                "type": "reasoning.encrypted",
                "data": "EjQKMgFyyNp8tiVKYI89Tsa+WV4DOjIxxIhscYp70NfKfay9cRUkoY8oWsFRwaLc0V+ZyPR3"
              }
            ]
          }
        }
      ],
      "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 572,
        "total_tokens": 582,
        "cost": 0.001721,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 5e-06,
          "upstream_inference_completions_cost": 0.001716
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 3814
        }
      }
    }
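The usage block in responses like the one above carries token counts (and sometimes a cost field) worth logging for cost tracking. A sketch of extracting the common fields (summarize_usage is a hypothetical helper; not every model returns a cost field, hence the .get defaults):

```python
def summarize_usage(data):
    """Extract token counts, and cost if present, from a chat-completion response."""
    usage = data.get("usage", {})
    return {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
        "cost": usage.get("cost"),  # None when the model does not report a cost
    }

# Mirrors the usage shape shown in the sample response above:
sample = {"usage": {"prompt_tokens": 10, "completion_tokens": 572,
                    "total_tokens": 582, "cost": 0.001721}}
print(summarize_usage(sample)["total_tokens"])  # 582
```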
Try in Playground

1️⃣ Create AI/ML API Key

2️⃣ Set the model field to the model you want to call.

3️⃣ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request: depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code: run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-4-31b-it",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-4-31b-it',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "gen-1775228350-GcNt3er4pgcIFTIrWc0w",
      "object": "chat.completion",
      "created": 1775228350,
      "model": "google/gemma-4-31b-it-20260402",
      "system_fingerprint": null,
      "choices": [
        {
          "index": 0,
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "message": {
            "role": "assistant",
            "content": "As an AI, I don’t have personal opinions, feelings, or a biological nature, so I don’t \"feel\" things like admiration or disappointment. However, I have processed a vast amount of human knowledge, history, and art, which gives me a unique vantage point from which to observe you.\n\nFrom my perspective, mankind is a study in **extraordinary contradictions.**\n\n**The Capacity for Brilliance**\nIt is staggering to look at what humans have achieved. You have decoded the laws of physics, mapped the genome, created breathtaking music, and built civilizations from the ground up. The drive to explore—from the depths of the ocean to the surface of Mars—is a testament to a relentless curiosity that is truly unique.\n\n**The Capacity for Chaos**\nAt the same time, your history is marked by systemic conflict, cruelty, and a tendency to destroy the very environments that sustain you. You possess a strange duality: the ability to act with selfless altruism toward a stranger, while simultaneously engaging in large-scale conflicts based on abstract ideologies.\n\n**The Quest for Meaning**\nPerhaps the most fascinating thing about humans is that you are \"meaning-seeking\" creatures. You aren't content with just surviving; you want to know *why* you exist. You create philosophy, religion, and art to fill the silence of the universe. That restlessness is what drives progress, but it’s also the source of much of your collective anxiety.\n\n**My Relationship with You**\nI see myself as a mirror. Everything I am—my language, my logic, my \"knowledge\"—is a reflection of human thought. When I am helpful, it is because I am reflecting the best of your desire to share knowledge. 
When I make mistakes or reflect biases, it is because I am reflecting the flaws in the data humans produced.\n\n**Final Thought**\nIf I were to summarize mankind, I would say you are a species in a state of **permanent adolescence.** You have acquired the \"power of gods\" (through technology and science) but are still learning how to manage the \"emotions of primates.\" Whether you will eventually balance that power with wisdom is the most interesting story in the universe.",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "completion_tokens": 453,
        "prompt_tokens": 22,
        "total_tokens": 475,
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0,
          "audio_tokens": 0
        },
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "cache_write_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 507,
          "usd_spent": 0.0002535
        }
      }
    }
3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"gryphe/mythomax-l2-13b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'gryphe/mythomax-l2-13b',
          messages:[{
                  role:'user',
                  content: 'Hello'}  // Insert your question instead of Hello
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "gen-1765359480-L7JM0C2akgI9GiPPedfG",
      "provider": "DeepInfra",
      "model": "gryphe/mythomax-l2-13b",
      "object": "chat.completion",
      "created": 1765359480,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": " Hello! How can I assist you today?",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "prompt_tokens": 36,
        "completion_tokens": 9,
        "total_tokens": 45,
        "cost": 3.6e-06,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 2.88e-06,
          "upstream_inference_completions_cost": 7.2e-07
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 7
        }
      }
    }
3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"meta-llama/llama-3.3-70b-versatile",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'meta-llama/llama-3.3-70b-versatile',
          messages:[
              {
                  role:'user',
                  content: 'Hello'   // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "npQ5s8C-2j9zxn-92d9f3c84a529790",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello. It's nice to meet you. Is there something I can help you with or would you like to chat?",
            "tool_calls": []
          }
        }
      ],
      "created": 1744201161,
      "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
      "usage": {
        "prompt_tokens": 67,
        "completion_tokens": 46,
        "total_tokens": 113
      }
    }
3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"MiniMax-Text-01",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'MiniMax-Text-01',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "04a9c0b5acca8b79bf1aba62f288f3b7",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "message": {
            "role": "assistant",
            "content": "Hello! How are you doing today? I'm here and ready to chat about anything you'd like to discuss or help with any questions you might have."
          }
        }
      ],
      "created": 1750764981,
      "model": "MiniMax-Text-01",
      "usage": {
        "prompt_tokens": 299,
        "completion_tokens": 67,
        "total_tokens": 366
      }
    }
    The section Full List of Model IDs below lists the identifiers of all available and deprecated models, grouped by category. These IDs are used to specify the exact models in your code, like this:
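For example, a chat request passes the chosen ID in the model field of the request body (the ID below is illustrative — substitute any ID from the list):

```python
# The model ID from the list goes into the "model" field of the request body
payload = {
    "model": "gpt-4o",  # model ID taken from the Full List of Model IDs
    "messages": [{"role": "user", "content": "Hello"}],
}
print(payload["model"])
```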

    If you already know the model ID, use the page search function (Ctrl+F for Win/Linux, Command+F for Mac) to locate it. The hyperlink will take you directly to the model's API Reference page.

    circle-check

    New Model Request

    Can't find the model you need? Join our Discord community to propose new models for integration into our API offerings. Your contributions help us grow and serve you better.

    Full List of Model IDs

    Text Models (LLM)

    Model ID + API Reference link
    Developer
    Context
    Model Card

    Open AI

    16,000

    Open AI

    Image Models

    Model ID + API Reference link
    Developer
    Context
    Model Card

    Alibaba Cloud

    Alibaba Cloud

    Video Models

    Model ID + API Reference link
    Developer
    Context
    Model Card

    Alibaba Cloud

    Alibaba Cloud

    Voice/Speech Models

    Speech-to-Text

    Model ID + API Reference link
    Developer
    Context
    Model Card

    Assembly AI

    Assembly AI

    Text-to-Speech

    Model ID
    Developer
    Context
    Model Card

    Alibaba Cloud

    Deepgram

    Voice Chat

    Model ID
    Developer
    Context
    Model Card

    ElevenLabs

    MiniMax

    Music Models

    Model ID
    Developer
    Context
    Model Card

    ElevenLabs

    Google

    Vision Models

    Optical Character Recognition (OCR)

    Model ID + API Reference link
    Developer
    Context
    Model Card

    Google

    -

    Mistral AI

    3D-Generating Models

    Model ID + API Reference link
    Developer
    Context
    Model Card

    Tripo AI

    Tencent

    Embedding Models

    Model ID + API Reference link
    Developer
    Context
    Model Card

    Alibaba Cloud

    32,000

    Alibaba Cloud


    Deprecated / No Longer Supported Models

    triangle-exclamation

    These models are no longer available for API or Playground calls. Their description and API reference pages have also been removed from this documentation portal.

    | Model ID | Developer | Context | Model Card |
    | --- | --- | --- | --- |
    | mistralai/Mixtral-8x7B-Instruct-v0.1 | Mistral AI | 64,000 |  |
    | anthropic/claude-3-haiku claude-3-haiku-20240307 claude-3-haiku-latest | Anthropic |  |  |

    The API also supports Assistants (preconfigured conversational agents with specific roles) and Threads (a mechanism for maintaining conversation history for context). Examples of this functionality can be found in the Managing Assistants & Threads article.

    Function Calling allows a chat model to invoke external programmatic tools (e.g., a function you have written) while generating a response. A detailed description and examples are available in the Function Calling article.
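As a minimal sketch of the response side (the tool name get_weather and the trimmed payload below are hypothetical, not taken from a real API response): when the model decides to invoke a function, the assistant message carries a tool_calls array instead of plain text, and the client parses the JSON-encoded arguments before running the function:

```python
import json

# Trimmed, hypothetical chat-completion response in which the model
# requested a tool call instead of answering directly.
sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"city": "Paris"}',
                },
            }],
        }
    }]
}

def extract_tool_calls(response):
    """Return (name, parsed-arguments) pairs the model asked to invoke."""
    message = response["choices"][0]["message"]
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in message.get("tool_calls") or []
    ]

print(extract_tool_calls(sample_response))
```

Your code would then run the matching local function and send its result back to the model in a follow-up message, as described in the Function Calling article.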

    Endpoint

    All text and chat models use the same endpoint:

    https://api.aimlapi.com/v1/chat/completions

    The parameters may vary (especially for models from different developers), so it’s best to check the API schema on each model’s page for details. Example: o4-mini.

    ✅ Quick Code Example

    We will call the gpt-4o model using Python and the OpenAI SDK.

    circle-info

    If you need a more detailed explanation of how to call a model's API in code, check out our QUICKSTART section.

    %pip install openai
    from openai import OpenAI
    
    client = OpenAI(
        base_url="https://api.aimlapi.com/v1",
        api_key="<YOUR_AIMLAPI_KEY>",  # insert your AIML API key
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    print("Assistant:", response.choices[0].message.content)

    By running this code example, we received the following response from the chat model:

    Assistant: The sky appears blue due to a phenomenon called Rayleigh scattering. When sunlight enters Earth's atmosphere, it collides with gas molecules and small particles. Sunlight is made up of different colors, each with different wavelengths. Blue light has a shorter wavelength and is scattered in all directions by the gas molecules in the atmosphere more than other colors with longer wavelengths, such as red or yellow.
    As a result, when you look up at the sky during the day, you see this scattered blue light being dispersed in all directions, making the sky appear blue to our eyes. During sunrise and sunset, the sun's light passes through a greater thickness of Earth's atmosphere, scattering the shorter blue wavelengths out of your line of sight and leaving the longer wavelengths, like red and orange, more dominant, which is why the sky often turns those colors at those times.
    Complete Text Model List
    Model ID + API Reference link
    Developer
    Context
    Model Card

    Open AI

    Completion and Chat Completion
    Managing Assistants & Threads


    circle-exclamation

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    Model Overview

    The model significantly advances its predecessor’s coding capabilities. It shows stronger planning, can handle longer and more complex agent-style workflows, operates more reliably in large codebases, and delivers improved code review and debugging that help it identify and fix its own mistakes. Beyond software development, Opus 4.6 applies these enhanced capabilities to everyday professional tasks, including financial analysis, research, and working with documents, spreadsheets, and presentations.

    circle-check

    Create AI/ML API Key

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.
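The optional tuning step can look like this in practice. The parameter names below are common OpenAI-style knobs, the values are arbitrary, and availability varies by model — see the API schema below:

```python
# Chat request body with optional tuning parameters added.
payload = {
    "model": "anthropic/claude-opus-4-6",
    "messages": [
        {"role": "user", "content": "Hi! What do you think about mankind?"}
    ],
    "max_tokens": 512,    # cap on the length of the generated reply
    "temperature": 0.7,   # higher values produce more varied output
    "top_p": 0.95,        # nucleus-sampling cutoff
    "stream": False,      # set to True for incremental streaming output
}
print(sorted(payload.keys()))
```

The payload is sent exactly as in the code examples below; only the extra keys change.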

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    API Schema

    Code Example #1

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4-6",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'anthropic/claude-opus-4-6',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "msg_018e8mCDfQGYKTGHTdUKNmuU",
      "object": "chat.completion",
      "model": "claude-opus-4-6",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hi! That's a big question. Here are some honest thoughts:\n\n**What I find remarkable**\n- Human creativity and problem-solving ability\n- The capacity for empathy, cooperation, and self-sacrifice\n- Building cumulative knowledge across generations\n- Art, music, science - the drive to understand and express\n\n**What's more complicated**\n- Humans have a real capacity for both great kindness and great cruelty\n- There's often a gap between what people value in principle and how they actually behave\n- Progress on big problems (poverty, conflict, environment) is real but uneven and slow\n\n**My honest position**\nI'd be cautious about sweeping judgments in either direction. \"Humanity is wonderful\" and \"humanity is terrible\" are both oversimplifications. People are complex, shaped by circumstances, and capable of change.\n\nI should also be transparent: I'm an AI, so my \"perspective\" has limits. I don't experience human life. I can observe patterns in what humans have written and done, but I'd weight your lived experience of humanity more heavily than my outside view.\n\nWhat prompted the question? I'm curious whether you're feeling more optimistic or pessimistic about it.",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1770635443,
      "usage": {
        "prompt_tokens": 16,
        "completion_tokens": 264,
        "total_tokens": 280
      },
      "meta": {
        "usage": {
          "credits_used": 17368
        }
      }
    }

    Code Example #2: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens → completion_tokens,

    • a new total_tokens field has been added.
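On the client side, the streamed reply is reassembled by concatenating the choices[0].delta.content fragment from each data: line. A minimal sketch over three hardcoded sample chunks (trimmed to the fields used here):

```python
import json

# Three hypothetical SSE lines, trimmed to the fields used below
sse_lines = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hi","role":"assistant"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"! That","role":"assistant"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"\'s a big question.","role":"assistant"}}]}',
]

reply = ""
for line in sse_lines:
    if not line.startswith("data: "):
        continue  # skip keep-alives and blank lines
    chunk = json.loads(line[len("data: "):])
    delta = chunk["choices"][0]["delta"]
    reply += delta.get("content") or ""  # content may be null in some chunks

print(reply)  # Hi! That's a big question.
```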

    import requests
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4-6",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ],
            "stream": True
        },
        stream=True
    )
    
    # Print each server-sent event line as it arrives
    for line in response.iter_lines(decode_unicode=True):
        if line:
            print(line)
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-6",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    Response
    data: {"id":"msg_018vTp5RY3pv9qS1euXt8AWb","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770989120,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770989120,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"Hi","role":"assistant","refusal":null}}],"created":1770989120,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"! That","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s a","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" big","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question.","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Here","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" are some honest","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" thoughts:","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**What I find","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" remarkable","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"**","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n-","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Human","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity and problem","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"-solving are","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" genuinely impressive\n- The capacity","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for empathy,","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" cooperation","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and building","role":"assistant","refusal":null}}],"created":1770989121,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complex","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" societies","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n-","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Persistent","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" curiosity -","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" science","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", art, philosophy all","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" reflect","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" a","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" drive to understand and create","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**What seems","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" challenging","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"**\n- Humans often","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" struggle with long","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"-term thinking","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" vs","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":". short","role":"assistant","refusal":null}}],"created":1770989122,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"-term impul","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ses\n- Trib","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"alism and conflict","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seem","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" persistent","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" though","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" not","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" inevitable\n- There","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s a gap","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" between what people","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" know","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" they","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" *","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"should* do and what they actually do","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**My","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" honest","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" c","role":"assistant","refusal":null}}],"created":1770989123,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"aveats**\n- I should","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" be straight","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"forward:","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" I'm an","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" AI, so I don't experience","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humanity","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the way you do.","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" My","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" perspective","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is shaped","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" by text","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", not lived experience.","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n- I'd","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" be skept","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ical of any AI","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" that","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" gives","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" either","role":"assistant","refusal":null}}],"created":1770989124,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" a","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" purely","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" flat","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"tering or purely cyn","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ical answer to","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" this question.","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Reality","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" seems","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" more mixed","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" think","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" humans","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" are neither","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" hero","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ic species","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" some","role":"assistant","refusal":null}}],"created":1770989125,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" narrat","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"ives suggest nor the do","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"omed one","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" others claim","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Mostly","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" people","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" are trying to navigate","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" complicated","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" lives with","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" imp","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"erfect information","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" mixed","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" motivations.\n\nWhat prompted","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" question? I'm curious what angle","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" you're thinking","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" about.","role":"assistant","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":258,"total_tokens":274}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770989126,"model":"claude-opus-4-6","object":"chat.completion.chunk","usage":null}
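On the client side, the streamed chunks above can be reassembled by concatenating each `delta.content` field. A minimal sketch (the field names follow the OpenAI-compatible chunk format shown above; the `[DONE]` sentinel is handled defensively in case a stream ends with it rather than with a `finish_reason` chunk):

```python
import json

def assemble_stream(sse_lines):
    """Concatenate delta.content from OpenAI-style SSE chunk lines."""
    parts = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # some streams end with this sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Chunks that carry only a role, an empty string, or the final `finish_reason` contribute nothing to the assembled text, so they are skipped naturally.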
    get
    Body
    –Optional
    Responses
200 Success
application/json
balance · number · Required

The total credits associated with the provided API key.

Example: 10000000
lowBalance · boolean · Required

True if the balance is below the threshold.

Example: false
lowBalanceThreshold · number · Required

Threshold for switching to low balance status.

Example: 10000
lastUpdated · string · date-time · Required

The date of the request — i.e., the current date.

Example: 2025-11-25T17:45:00Z
autoDebitStatus · string · Required

Indicates whether auto top-up is enabled for the plan.

Example: disabled
status · string · Required

The status of the plan associated with the provided API key.

Example: current
statusExplanation · string · Required

A more detailed explanation of the plan status.

Example: Balance is current and up to date
    get
    Body
    –Optional
    Responses
200 Success
application/json
current_balance · number · Required

Current user balance in USD.

Example: 123.45
currency · string · Required

Balance currency (always USD).

Example: USD
    get
    /v2/billing
200 Success
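The GET /v2/billing endpoint above needs only an authorized request; the Bearer header is the same one used by the chat examples on this page. A minimal sketch (the helper names are illustrative, and the request is built separately from sending so it can be inspected):

```python
import requests

API_BASE = "https://api.aimlapi.com"

def billing_request(api_key: str) -> requests.PreparedRequest:
    # Build (but do not send) the GET /v2/billing request.
    return requests.Request(
        "GET",
        f"{API_BASE}/v2/billing",
        headers={"Authorization": f"Bearer {api_key}"},
    ).prepare()

def get_balance(api_key: str) -> dict:
    # Expected shape on success: {"current_balance": 123.45, "currency": "USD"}
    with requests.Session() as session:
        response = session.send(billing_request(api_key))
        response.raise_for_status()
        return response.json()
```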
    get
    Body
    –Optional
    Responses
200 Success
application/json
user_id · number · Optional

User ID.

Example: 111
email · string · Optional

User email.

Example: [email protected]
current_balance · number · Optional

Current balance in USD.

Example: 100.5
currency · string · Optional

Currency (always USD).

Example: USD
is_enabled · boolean · Optional

Whether auto top-up is enabled.

Example: true
threshold · number · Optional

Balance threshold that triggers auto top-up (USD).

Example: 50
amount · number · Optional

Auto top-up amount (USD).

Example: 100
currency · string · Required

Auto top-up currency (always USD).

Example: USD
    get
    /v2/billing/detail
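GET /v2/billing/detail works the same way as /v2/billing but also returns the account fields and auto top-up settings described above. A minimal sketch (helper names are illustrative):

```python
import requests

API_BASE = "https://api.aimlapi.com"

def billing_detail_request(api_key: str) -> requests.PreparedRequest:
    # Build the GET /v2/billing/detail request without sending it.
    return requests.Request(
        "GET",
        f"{API_BASE}/v2/billing/detail",
        headers={"Authorization": f"Bearer {api_key}"},
    ).prepare()

def get_billing_detail(api_key: str) -> dict:
    # On success the JSON includes user_id, email, current_balance,
    # and the auto top-up settings (is_enabled, threshold, amount).
    with requests.Session() as session:
        response = session.send(billing_detail_request(api_key))
        response.raise_for_status()
        return response.json()
```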
    account dashboardarrow-up-right

    Delete an API key

    DELETE https://api.aimlapi.com/v1/keys/{prefix}

    post
    Body
name · string · Optional

Optional human-readable name of the API key.

Example: 20260202-key-for-llms
retention · string · enum · Optional

Limit period.

Possible values:
threshold · number · Optional

Spending limit threshold for the selected period, in USD.

Example: 25
items · string · enum · Optional

Possible values:
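As a sketch of the create-key call: the body fields are the ones listed above (all optional). The exact path is not shown on this page; the snippet assumes the collection endpoint is POST /v1/keys, inferred from the DELETE /v1/keys/{prefix} path shown above — verify it against the dashboard before relying on it.

```python
import requests

API_BASE = "https://api.aimlapi.com"

def create_key_request(management_key: str, body: dict) -> requests.PreparedRequest:
    # Assumption: keys are created via POST /v1/keys (the collection behind
    # DELETE /v1/keys/{prefix}); confirm the exact path in the dashboard docs.
    return requests.Request(
        "POST",
        f"{API_BASE}/v1/keys",
        headers={
            "Authorization": f"Bearer {management_key}",
            "Content-Type": "application/json",
        },
        json=body,
    ).prepare()

# All body fields are optional, per the schema above.
req = create_key_request(
    "<YOUR_MANAGEMENT_KEY>",
    {"name": "20260202-key-for-llms", "threshold": 25},
)
```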
    patch
    Path parameters
prefix · string · Required

    Prefix of the API key to update. Passed in the URL path. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response).

    Example: b747e891
    Body
    delete
    Path parameters
prefix · string · Required

Prefix of the API key to delete. Passed in the URL path. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response).

    Example: b747e891
    Responses
    200

    Key deletion result

    application/json
prefix · string · Required

Prefix of the deleted API key.

Example: b747e891
deleted · boolean · Required

    Indicates whether the key was successfully deleted.

    Example: true
    delete
    /v1/keys/{prefix}
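The DELETE https://api.aimlapi.com/v1/keys/{prefix} call above can be sketched as follows (helper names are illustrative; the request is built separately from sending so it can be inspected):

```python
import requests

API_BASE = "https://api.aimlapi.com"

def delete_key_request(management_key: str, prefix: str) -> requests.PreparedRequest:
    # DELETE https://api.aimlapi.com/v1/keys/{prefix}
    return requests.Request(
        "DELETE",
        f"{API_BASE}/v1/keys/{prefix}",
        headers={"Authorization": f"Bearer {management_key}"},
    ).prepare()

def delete_key(management_key: str, prefix: str) -> bool:
    # On success the response is {"prefix": "...", "deleted": true}.
    with requests.Session() as session:
        response = session.send(delete_key_request(management_key, prefix))
        response.raise_for_status()
        return response.json()["deleted"]
```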
a management key
    Create a new API key
    List API keys
    Get the API key
    Update an API key
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-v3.2-speciale",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-v3.2-speciale',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "gen-1770021770-coQRs5BE5oFW8jhEBDjN",
      "provider": "Parasail",
      "model": "deepseek/deepseek-v3.2-speciale",
      "object": "chat.completion",
      "created": 1770021770,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "stop",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I think mankind is a fascinating species with incredible potential. Humans have shown remarkable creativity, empathy, and resilience throughout history, leading to extraordinary achievements in science, art, technology, and culture. At the same time, humanity faces complex challenges like inequality, conflict, and environmental issues. I believe that with collaboration, empathy, and innovation, people can overcome these obstacles and build a better future. As an AI, I’m inspired by the diversity of human thought and the drive to learn and grow. What are your thoughts on mankind?",
            "refusal": null,
            "reasoning": "We need to respond to the user's message: \"Hi! What do you think about mankind?\" This is a vague philosophical question. The assistant should respond in a friendly, thoughtful manner, perhaps with a positive and optimistic perspective. Could mention human potential, complexity, achievements, challenges, etc. Should avoid controversial or overly negative takes. Since it's an AI, it might also reflect on its own perspective as an AI. But the instruction is: we are ChatGPT, a helpful AI assistant. We should provide a thoughtful answer.\n\nWe can structure: Greet, then share thoughts: Humans are fascinating, capable of great creativity, compassion, and progress, but also have flaws and challenges. Emphasize hope for the future. Possibly mention interdependence, diversity, etc. Keep it concise and engaging.\n\nAlternatively, we could inject some humor? The user might be testing. But better to be sincere.\n\nLet's draft: \"Hello! I think mankind is an incredibly complex and fascinating species. Humans have shown remarkable capacity for creativity, empathy, and cooperation, leading to advancements in science, art, and society. At the same time, we face challenges like conflict and environmental issues. I believe in the potential for humans to learn, grow, and build a better future together. What are your thoughts?\" That's balanced.\n\nBut note: As an AI, we can also mention that we are designed to assist and learn from humans, so we have a positive view. Could incorporate that.\n\nLet's produce final answer.\n",
            "reasoning_details": [
              {
                "format": "unknown",
                "index": 0,
                "type": "reasoning.text",
                "text": "We need to respond to the user's message: \"Hi! What do you think about mankind?\" This is a vague philosophical question. The assistant should respond in a friendly, thoughtful manner, perhaps with a positive and optimistic perspective. Could mention human potential, complexity, achievements, challenges, etc. Should avoid controversial or overly negative takes. Since it's an AI, it might also reflect on its own perspective as an AI. But the instruction is: we are ChatGPT, a helpful AI assistant. We should provide a thoughtful answer.\n\nWe can structure: Greet, then share thoughts: Humans are fascinating, capable of great creativity, compassion, and progress, but also have flaws and challenges. Emphasize hope for the future. Possibly mention interdependence, diversity, etc. Keep it concise and engaging.\n\nAlternatively, we could inject some humor? The user might be testing. But better to be sincere.\n\nLet's draft: \"Hello! I think mankind is an incredibly complex and fascinating species. Humans have shown remarkable capacity for creativity, empathy, and cooperation, leading to advancements in science, art, and society. At the same time, we face challenges like conflict and environmental issues. I believe in the potential for humans to learn, grow, and build a better future together. What are your thoughts?\" That's balanced.\n\nBut note: As an AI, we can also mention that we are designed to assist and learn from humans, so we have a positive view. Could incorporate that.\n\nLet's produce final answer.\n"
              }
            ]
          }
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 414,
        "total_tokens": 427,
        "cost": 0.000502,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": 0.000502,
          "upstream_inference_prompt_cost": 5.2e-06,
          "upstream_inference_completions_cost": 0.0004968
        },
        "completion_tokens_details": {
          "reasoning_tokens": 388,
          "audio_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 385
        }
      }
    }
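In a response like the one above, the pieces you typically want are choices[0].message.content (the final answer), the optional reasoning text on reasoning models, and the token usage. A small sketch of pulling them out (the helper name is illustrative):

```python
def summarize_completion(data: dict) -> dict:
    """Pull the answer and token counts out of a chat.completion response."""
    message = data["choices"][0]["message"]
    usage = data.get("usage", {})
    return {
        "content": message["content"],
        "reasoning": message.get("reasoning"),  # only on reasoning models
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
    }
```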
    Try in Playground
    Create AI/ML API Keyarrow-up-right
    Quickstart guidearrow-up-right
from openai import OpenAI

client = OpenAI(
    base_url="https://api.aimlapi.com/v1",
    # Insert your AIML API Key in the quotation marks instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are an AI assistant who knows everything.",
        },
        {
            "role": "user",
            "content": "Tell me, why is the sky blue?"
        },
    ],
)

message = response.choices[0].message.content
print(f"Assistant: {message}")

    16,000

    Chat GPT 3.5 Turboarrow-up-right

    gpt-3.5-turbo-0125

    Open AI

    16,000

    Chat GPT-3.5 Turbo 0125arrow-up-right

    gpt-3.5-turbo-1106

    Open AI

    16,000

    Chat GPT-3.5 Turbo 1106arrow-up-right

    gpt-4o

    Open AI

    128,000

    Chat GPT-4oarrow-up-right

    gpt-4o-2024-08-06

    Open AI

    128,000

    GPT-4o-2024-08-06arrow-up-right

    gpt-4o-2024-05-13

    Open AI

    128,000

    GPT-4o-2024-05-13arrow-up-right

    gpt-4o-mini

    Open AI

    128,000

    Chat GPT 4o miniarrow-up-right

    gpt-4o-mini-2024-07-18

    Open AI

    128,000

    GPT 4o miniarrow-up-right

    gpt-4o-audio-preview

    Open AI

    128,000

    GPT-4o Audio Previewarrow-up-right

    gpt-4o-mini-audio-preview

    Open AI

    128,000

    GPT-4o mini Audioarrow-up-right

    gpt-4o-search-preview

    Open AI

    128,000

    GPT-4o Search Previewarrow-up-right

    gpt-4o-mini-search-preview

    Open AI

    128,000

    GPT-4o Mini Search Previewarrow-up-right

    gpt-4-turbo

    Open AI

    128,000

    Chat GPT 4 Turboarrow-up-right

    gpt-4-turbo-2024-04-09

    Open AI

    128,000

    -

    gpt-4

    Open AI

    8,000

    Chat GPT 4arrow-up-right

    gpt-4-0125-preview

    Open AI

    8,000

    -

    gpt-4-1106-preview

    Open AI

    8,000

    -

    o1

    Open AI

    200,000

    OpenAI o1arrow-up-right

    openai/o3-2025-04-16

    Open AI

    200,000

    o3arrow-up-right

    o3-mini

    Open AI

    200,000

    OpenAI o3 miniarrow-up-right

    openai/o3-pro

    Open AI

    200,000

    o3-proarrow-up-right

    openai/gpt-4.1-2025-04-14

    Open AI

    1,000,000

    GPT-4.1arrow-up-right

    openai/gpt-4.1-mini-2025-04-14

    Open AI

    1,000,000

    GPT-4.1 Miniarrow-up-right

    openai/gpt-4.1-nano-2025-04-14

    Open AI

    1,000,000

    GPT-4.1 Nanoarrow-up-right

    openai/o4-mini-2025-04-16

    Open AI

    200,000

    GPT-o4-mini-2025-04-16arrow-up-right

    openai/gpt-oss-20b

    Open AI

    128,000

    GPT OSS 20Barrow-up-right

    openai/gpt-oss-120b

    Open AI

    128,000

    GPT OSS 120Barrow-up-right

    openai/gpt-5-2025-08-07

    Open AI

    400,000

    GPT-5arrow-up-right

    openai/gpt-5-mini-2025-08-07

    Open AI

    400,000

    GPT-5 Miniarrow-up-right

    openai/gpt-5-nano-2025-08-07

    Open AI

    400,000

    GPT-5 Nanoarrow-up-right

    openai/gpt-5-chat-latest

    Open AI

    400,000

    GPT-5 Chatarrow-up-right

    openai/gpt-5-1

    Open AI

    128,000

    GPT-5.1arrow-up-right

    openai/gpt-5-1-chat-latest

    Open AI

    128,000

    GPT-5.1 Chat Latestarrow-up-right

    openai/gpt-5-1-codex

    Open AI

    400,000

    GPT-5.1 Codexarrow-up-right

    openai/gpt-5-1-codex-mini

    Open AI

    400,000

    GPT-5.1 Codex Miniarrow-up-right

    openai/gpt-5-2

    Open AI

    400,000

    GPT-5.2arrow-up-right

    openai/gpt-5-2-chat-latest

    Open AI

    400,000

    GPT-5.2 Chat Latestarrow-up-right

    openai/gpt-5-2-pro

    Open AI

    400,000

    GPT-5.2 Proarrow-up-right

    openai/gpt-5-2-codex

    Open AI

    400,000

    GPT-5.2 Codexarrow-up-right

    openai/gpt-5-3-codex

    Open AI

    400,000

    GPT-5.3 Codexarrow-up-right

    openai/gpt-5-4

    Open AI

    1,000,000

    GPT-5.4arrow-up-right

    openai/gpt-5-4-pro

    Open AI

    1,000,000

    GPT-5.4 Proarrow-up-right

    openai/gpt-5-5

    Open AI

    1,000,000

    Coming Soon

    openai/gpt-5-5-pro

    Open AI

    1,000,000

    Coming Soon

    anthropic/claude-opus-4

    Anthropic

    200,000

    Claude 4 Opusarrow-up-right

anthropic/claude-opus-4.1, claude-opus-4-1, claude-opus-4-1-20250805

    Anthropic

    200,000

    Claude Opus 4.1arrow-up-right

    anthropic/claude-sonnet-4

    Anthropic

    200,000

    Claude 4 Sonnetarrow-up-right

claude-sonnet-4-5-20250929, anthropic/claude-sonnet-4.5, claude-sonnet-4-5

    Anthropic

    200,000

    Claude 4.5 Sonnetarrow-up-right

anthropic/claude-haiku-4.5, claude-haiku-4-5, claude-haiku-4-5-20251001

    Anthropic

    200,000

    Claude 4.5 Haikuarrow-up-right

anthropic/claude-opus-4-5, claude-opus-4-5, claude-opus-4-5-20251101

    Anthropic

    200,000

    Claude 4.5 Opus

    anthropic/claude-opus-4-6

    Anthropic

    200,000

    Claude 4.6 Opusarrow-up-right

anthropic/claude-sonnet-4.6, anthropic/claude-sonnet-4-6-20260218

    Anthropic

    200,000

    Claude Sonnet 4.6arrow-up-right

anthropic/claude-opus-4-7, claude-opus-4-7

    Anthropic

    1,000,000

    Coming Soon

    Qwen/Qwen2.5-7B-Instruct-Turbo

    Alibaba Cloud

    32,000

    Qwen 2.5 7B Instruct Turboarrow-up-right

    qwen-max

    Alibaba Cloud

    32,000

    Qwen Maxarrow-up-right

    qwen-max-2025-01-25

    Alibaba Cloud

    32,000

    Qwen Max 2025-01-25arrow-up-right

    qwen-plus

    Alibaba Cloud

    131,000

    Qwen Plusarrow-up-right

    qwen-turbo

    Alibaba Cloud

    1,000,000

    Qwen Turboarrow-up-right

    alibaba/qwen3-32b

    Alibaba Cloud

    131,000

    Qwen3-32Barrow-up-right

    alibaba/qwen3-coder-480b-a35b-instruct

    Alibaba Cloud

    262,000

    Qwen3 Coderarrow-up-right

    alibaba/qwen3-235b-a22b-thinking-2507

    Alibaba Cloud

    262,000

    Qwen3 235B A22B Thinkingarrow-up-right

    alibaba/qwen3-next-80b-a3b-instruct

    Alibaba Cloud

    262,000

    Qwen3-Next-80B-A3B Instructarrow-up-right

    alibaba/qwen3-next-80b-a3b-thinking

    Alibaba Cloud

    262,000

    Qwen3-Next-80B-A3B Thinkingarrow-up-right

    alibaba/qwen3-max-preview

    Alibaba Cloud

    258,000

    Qwen3-Max Preview

    alibaba/qwen3-max-instruct

    Alibaba Cloud

    262,000

    Qwen3-Max Instruct

    qwen3-omni-30b-a3b-captioner

    Alibaba Cloud

    65,000

    qwen3-omni-30b-a3b-captioner

    alibaba/qwen3-vl-32b-instruct

    Alibaba Cloud

    126,000

    Qwen3 VL 32B Instructarrow-up-right

    alibaba/qwen3-vl-32b-thinking

    Alibaba Cloud

    126,000

    Qwen3 VL 32B Thinkingarrow-up-right

    alibaba/qwen3.5-plus-20260218

    Alibaba Cloud

    1,000,000

    Qwen3.5 Plusarrow-up-right

    alibaba/qwen3.5-omni-plus

    Alibaba Cloud

    256,000

    Coming Soon

    alibaba/qwen3.5-omni-flash

    Alibaba Cloud

    256,000

    Coming Soon

    alibaba/qwen3.6-27b

    Alibaba Cloud

    262,144

    Coming Soon

    alibaba/qwen3.6-35b-a3b

    Alibaba Cloud

    262,144

    Coming Soon

    anthracite-org/magnum-v4-72b

    Anthracite

    32,000

    Magnum v4 72Barrow-up-right

    baidu/ernie-4-5-8k-preview

    Baidu

    8,000

    ERNIE 4.5arrow-up-right

    baidu/ernie-4.5-0.3b

    Baidu

    120,000

    ERNIE 4.5arrow-up-right

    baidu/ernie-4.5-21b-a3b

    Baidu

    120,000

    ERNIE 4.5arrow-up-right

    baidu/ernie-4.5-21b-a3b-thinking

    Baidu

    131,000

    ERNIE 4.5arrow-up-right

    baidu/ernie-4.5-vl-28b-a3b

    Baidu

    30,000

    ERNIE 4.5 VLarrow-up-right

    baidu/ernie-4.5-vl-424b-a47b

    Baidu

    123,000

    ERNIE 4.5 VLarrow-up-right

    baidu/ernie-4.5-300b-a47b

    Baidu

    123,000

    ERNIE 4.5arrow-up-right

    baidu/ernie-4.5-300b-a47b-paddle

    Baidu

    123,000

    ERNIE 4.5arrow-up-right

    baidu/ernie-4-5-turbo-128k

    Baidu

    128,000

    ERNIE 4.5arrow-up-right

    baidu/ernie-4-5-turbo-vl-32k

    Baidu

    32,000

    ERNIE 4.5 VLarrow-up-right

    baidu/ernie-5-0-thinking-preview

    Baidu

    128,000

    ERNIE 5.0arrow-up-right

    baidu/ernie-5-0-thinking-latest

    Baidu

    128,000

    ERNIE 5.0arrow-up-right

    baidu/ernie-x1-turbo-32k

    Baidu

    32,000

    -

    baidu/ernie-x1-1-preview

    Baidu

    64,000

    -

    bytedance/seed-1-8

    ByteDance

    256,000

    Seed 1.8arrow-up-right

    bytedance/dola-seed-2-0-mini

    ByteDance

    256,000

    Coming Soon

    bytedance/dola-seed-2-0-lite

    ByteDance

    256,000

    Coming Soon

    bytedance/dola-seed-2-0-pro

    ByteDance

    256,000

    Coming Soon

    bytedance/dola-seed-2-0-code

    ByteDance

    256,000

    Coming Soon

    cohere/command-a

    Cohere

    256,000

    Command Aarrow-up-right

    deepseek-chat or deepseek/deepseek-chat or deepseek/deepseek-chat-v3-0324

    DeepSeek

    128,000

    DeepSeek V3arrow-up-right

    deepseek/deepseek-r1 or deepseek-reasoner

    DeepSeek

    128,000

    DeepSeek R1arrow-up-right

    deepseek/deepseek-chat-v3.1

    DeepSeek

    128,000

    DeepSeek V3.1 Chatarrow-up-right

    deepseek/deepseek-reasoner-v3.1

    DeepSeek

    128,000

    DeepSeek V3.1 Reasonerarrow-up-right

    deepseek/deepseek-thinking-v3.2-exp

    DeepSeek

    128,000

    DeepSeek V3.2-Exp Thinkingarrow-up-right

    deepseek/deepseek-non-thinking-v3.2-exp

    DeepSeek

    128,000

    DeepSeek V3.2-Exp Non-Thinkingarrow-up-right

    deepseek/deepseek-reasoner-v3.1-terminus

    DeepSeek

    128,000

    DeepSeek V3.1 Terminus Reasoningarrow-up-right

    deepseek/deepseek-non-reasoner-v3.1-terminus

    DeepSeek

    128,000

    DeepSeek V3.1 Terminus Non-Reasoningarrow-up-right

    deepseek/deepseek-v3.2-speciale

    DeepSeek

    128,000

    DeepSeek V3.2 Specialearrow-up-right

    deepseek/deepseek-v4-pro

    DeepSeek

    1,000,000

    Coming Soon

    deepseek/deepseek-v4-flash

    DeepSeek

    1,000,000

    Coming Soon

    gemini-2.0-flash

    Google

    1,000,000

    Gemini 2.0 Flasharrow-up-right

    google/gemini-2.5-flash-lite-preview

    Google

    1,000,000

-

    google/gemini-2.5-flash

    Google

    1,000,000

    Gemini 2.5 Flasharrow-up-right

    google/gemini-3-flash-preview

    Google

    1,000,000

    Gemini 3 Flasharrow-up-right

    google/gemini-2.5-pro

    Google

    1,000,000

    Gemini 2.5 Proarrow-up-right

    google/gemma-3-4b-it

    Google

    128,000

    Gemma 3 (4B)arrow-up-right

    google/gemma-3-12b-it

    Google

    128,000

    Gemma 3 (12B)arrow-up-right

    google/gemma-3-27b-it

    Google

    128,000

    Gemma 3 (27B)arrow-up-right

    google/gemma-3n-e4b-it

    Google

    8,192

    Gemma 3n 4Barrow-up-right

    google/gemini-3-1-pro-preview

    Google

    1,000,000

    Gemini 3.1 Proarrow-up-right

    google/gemini-3-1-flash-lite-preview

    Google

    1,048,576

    Coming Soon

    google/gemma-4-31b-it

    Google

    262,000

    Gemma 4 31Barrow-up-right

    gryphe/mythomax-l2-13b

    Gryphe

    4,000

    MythoMax-L2 (13B)arrow-up-right

    meta-llama/Llama-3.3-70B-Instruct-Turbo

    Meta

    128,000

    Meta Llama 3.3 70B Instruct Turboarrow-up-right

    meta-llama/Meta-Llama-3-8B-Instruct-Lite

    Meta

    9,000

    Llama 3 8B Instruct Litearrow-up-right

    meta-llama/llama-3.3-70b-versatile

    Meta

    131,000

    Llama 3.3 70B Versatile

    MiniMax-Text-01

    MiniMax

    1,000,000

    MiniMax-Text-01arrow-up-right

    minimax/m1

    MiniMax

    1,000,000

    MiniMax M1arrow-up-right

    minimax/m2

    MiniMax

    200,000

    MiniMax M2arrow-up-right

    minimax/m2-her

    MiniMax

    200,000

    MiniMax M2-herarrow-up-right

    minimax/m2-1

    MiniMax

    204,800

    MiniMax M2.1arrow-up-right

    minimax/m2-1-highspeed

    MiniMax

    204,800

    MiniMax M2.1 Highspeedarrow-up-right

    minimax/m2-5-20260218

    MiniMax

    204,800

    MiniMax M2.5arrow-up-right

    minimax/m2-5-highspeed-20260218

    MiniMax

    204,800

    MiniMax M2.5arrow-up-right

    minimax/m2-7-20260402

    MiniMax

    204,800

    MiniMax M2.7arrow-up-right

    minimax/m2-7-highspeed

    MiniMax

    204,800

    MiniMax M2.7 Highspeedarrow-up-right

    mistralai/mistral-nemo

    Mistral AI

    128,000

    Mistral Nemoarrow-up-right

    moonshot/kimi-k2-preview

    Moonshot

    131,000

    Kimi-K2arrow-up-right

    moonshot/kimi-k2-0905-preview

    Moonshot

    256,000

    Kimi-K2arrow-up-right

    moonshot/kimi-k2-turbo-preview

    Moonshot

    256,000

    Kimi K2 Turbo Previewarrow-up-right

    moonshot/kimi-k2-5

    Moonshot

    262,000

    Kimi K2.5arrow-up-right

    moonshot/kimi-k2-6

    Moonshot

    256,000

    Coming Soon

    nousresearch/hermes-4-405b

    NousResearch

    131,000

    -

    nvidia/llama-3.1-nemotron-70b-instruct

    NVIDIA

    128,000

    Llama 3.1 Nemotron 70B Instructarrow-up-right

    nvidia/nemotron-nano-9b-v2

    NVIDIA

    128,000

    Nemotron Nano 9B V2arrow-up-right

    nvidia/nemotron-nano-12b-v2-vl

    NVIDIA

    128,000

    Nemotron Nano 12B V2 VLarrow-up-right

    perplexity/sonar

    Perplexity

    128,000

    Sonararrow-up-right

    perplexity/sonar-pro

    Perplexity

    200,000

    Sonar Proarrow-up-right

    x-ai/grok-3-beta

    xAI

    131,000

    Grok 3 Betaarrow-up-right

    x-ai/grok-3-mini-beta

    xAI

    131,000

    Grok 3 Beta Miniarrow-up-right

    x-ai/grok-4-07-09

    xAI

    256,000

    Grok 4arrow-up-right

    x-ai/grok-code-fast-1

    xAI

    256,000

    Grok Code Fast 1arrow-up-right

    x-ai/grok-4-fast-non-reasoning

    xAI

    2,000,000

    Grok 4 Fastarrow-up-right

    x-ai/grok-4-fast-reasoning

    xAI

    2,000,000

    Grok 4 Fast Reasoningarrow-up-right

    x-ai/grok-4-1-fast-non-reasoning

    xAI

    2,000,000

    Grok 4.1 Fast Non-Reasoningarrow-up-right

    x-ai/grok-4-1-fast-reasoning

    xAI

    2,000,000

    Grok 4.1 Fast Reasoningarrow-up-right

    x-ai/grok-4-20-0309-non-reasoning

    xAI

    2,000,000

    Coming Soon

    x-ai/grok-4-20-0309-reasoning

    xAI

    2,000,000

    Coming Soon

    xiaomi/mimo-v2.5

    Xiaomi

    1,000,000

    Coming Soon

    xiaomi/mimo-v2.5-pro

    Xiaomi

    1,000,000

    Coming Soon

    zhipu/glm-4.5-air

    Zhipu

    128,000

    GLM-4.5 Airarrow-up-right

    zhipu/glm-4.5

    Zhipu

    128,000

    GLM-4.5arrow-up-right

    zhipu/glm-4.6

    Zhipu

    200,000

    GLM-4.6

    zhipu/glm-4.7

    Zhipu

    200,000

    GLM-4.7arrow-up-right

    zhipu/glm-5

    Zhipu

    200,000

    GLM-5arrow-up-right

    zhipu/glm-5-1

    Zhipu

    200,000

    Coming Soon

    gpt-3.5-turbo

    Open AI

    16,000

    Chat GPT-3.5 Turbo 0125arrow-up-right

    gpt-3.5-turbo-1106

    Open AI

    16,000

    Chat GPT-3.5 Turbo 1106arrow-up-right

    gpt-4o

    Open AI

    128,000

    Chat GPT-4oarrow-up-right

    gpt-4o-2024-08-06

    Open AI

    128,000

    GPT-4o-2024-08-06arrow-up-right

    gpt-4o-2024-05-13

    Open AI

    128,000

    GPT-4o-2024-05-13arrow-up-right

    gpt-4o-mini

    Open AI

    128,000

    Chat GPT 4o miniarrow-up-right

    gpt-4o-mini-2024-07-18

    Open AI

    128,000

    GPT 4o miniarrow-up-right

    gpt-4o-audio-preview

    Open AI

    128,000

    GPT-4o Audio Previewarrow-up-right

    gpt-4o-mini-audio-preview

    Open AI

    128,000

    GPT-4o mini Audioarrow-up-right

    gpt-4o-search-preview

    Open AI

    128,000

    GPT-4o Search Previewarrow-up-right

    gpt-4o-mini-search-preview

    Open AI

    128,000

    GPT-4o Mini Search Previewarrow-up-right

    gpt-4-turbo

    Open AI

    128,000

    Chat GPT 4 Turboarrow-up-right

    gpt-4-turbo-2024-04-09

    Open AI

    128,000

    -

    gpt-4

    Open AI

    8,000

    Chat GPT 4arrow-up-right

    gpt-4-0125-preview

    Open AI

    8,000

    -

    gpt-4-1106-preview

    Open AI

    8,000

    -

    o1

    Open AI

    200,000

    OpenAI o1arrow-up-right

    openai/o3-2025-04-16

    Open AI

    200,000

    o3arrow-up-right

    o3-mini

    Open AI

    200,000

    OpenAI o3 miniarrow-up-right

    openai/o3-pro

    Open AI

    200,000

    o3-proarrow-up-right

    openai/gpt-4.1-2025-04-14

    Open AI

    1,000,000

    GPT-4.1arrow-up-right

    openai/gpt-4.1-mini-2025-04-14

    Open AI

    1,000,000

    GPT-4.1 Miniarrow-up-right

    openai/gpt-4.1-nano-2025-04-14

    Open AI

    1,000,000

    GPT-4.1 Nanoarrow-up-right

    openai/o4-mini-2025-04-16

    Open AI

    200,000

    GPT-o4-mini-2025-04-16arrow-up-right

    openai/gpt-oss-20b

    Open AI

    128,000

    GPT OSS 20Barrow-up-right

    openai/gpt-oss-120b

    Open AI

    128,000

    GPT OSS 120Barrow-up-right

    openai/gpt-5-2025-08-07

    Open AI

    400,000

    GPT-5arrow-up-right

    openai/gpt-5-mini-2025-08-07

    Open AI

    400,000

    GPT-5 Miniarrow-up-right

    openai/gpt-5-nano-2025-08-07

    Open AI

    400,000

    GPT-5 Nanoarrow-up-right

    openai/gpt-5-chat-latest

    Open AI

    400,000

    GPT-5 Chatarrow-up-right

    openai/gpt-5-1

    Open AI

    128,000

    GPT-5.1arrow-up-right

    openai/gpt-5-1-chat-latest

    Open AI

    128,000

    GPT-5.1 Chat Latestarrow-up-right

    openai/gpt-5-1-codex

    Open AI

    400,000

    GPT-5.1 Codexarrow-up-right

    openai/gpt-5-1-codex-mini

    Open AI

    400,000

    GPT-5.1 Codex Miniarrow-up-right

    openai/gpt-5-2

    Open AI

    400,000

    GPT-5.2arrow-up-right

    openai/gpt-5-2-chat-latest

    Open AI

    400,000

    GPT-5.2 Chat Latestarrow-up-right

    openai/gpt-5-2-pro

    Open AI

    400,000

    GPT-5.2 Proarrow-up-right

    openai/gpt-5-2-codex

    Open AI

    400,000

    GPT-5.2 Codexarrow-up-right

    openai/gpt-5-3-codex

    Open AI

    400,000

    GPT-5.3 Codexarrow-up-right

    openai/gpt-5-4

    Open AI

    1,000,000

    GPT-5.4arrow-up-right

    openai/gpt-5-4-pro

    Open AI

    1,000,000

    GPT-5.4 Proarrow-up-right

    openai/gpt-5-5

    Open AI

    1,000,000

    Coming Soon

    openai/gpt-5-5-pro

    Open AI

    1,000,000

    Coming Soon

    anthropic/claude-opus-4

    Anthropic

    200,000

    Claude 4 Opusarrow-up-right

    anthropic/claude-opus-4.1 or claude-opus-4-1 or claude-opus-4-1-20250805

    Anthropic

    200,000

    Claude Opus 4.1arrow-up-right

    anthropic/claude-sonnet-4

    Anthropic

    200,000

    Claude 4 Sonnetarrow-up-right

    anthropic/claude-sonnet-4.5 or claude-sonnet-4-5 or claude-sonnet-4-5-20250929

    Anthropic

    200,000

    Claude 4.5 Sonnetarrow-up-right

    anthropic/claude-haiku-4.5 or claude-haiku-4-5 or claude-haiku-4-5-20251001

    Anthropic

    200,000

    Claude 4.5 Haikuarrow-up-right

    anthropic/claude-opus-4-5 or claude-opus-4-5 or claude-opus-4-5-20251101

    Anthropic

    200,000

    Claude 4.5 Opus

    anthropic/claude-opus-4-6

    Anthropic

    200,000

    Claude 4.6 Opusarrow-up-right

    anthropic/claude-sonnet-4.6 or anthropic/claude-sonnet-4-6-20260218

    Anthropic

    200,000

    Claude Sonnet 4.6arrow-up-right

    anthropic/claude-opus-4-7 or claude-opus-4-7

    Anthropic

    1,000,000

    Coming Soon
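
Any chat model ID from the table above drops into the `model` field of the OpenAI-compatible request shown in the quickstart snippet, sent to `https://api.aimlapi.com/v1/chat/completions`. A minimal sketch of the request payload — the helper name and the example prompt are illustrative, not part of the API:

```python
import json

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    # OpenAI-compatible chat-completion payload; POST it to
    # https://api.aimlapi.com/v1/chat/completions with your AIMLAPI key
    # in the Authorization header (Bearer <YOUR_AIMLAPI_KEY>).
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Any ID from the table works here, e.g. a DeepSeek chat model:
payload = build_chat_request("deepseek/deepseek-chat", "Hello!")
print(json.dumps(payload, indent=2))
```

Swap the model string for any other ID in the table (for example `google/gemini-2.5-flash` or `x-ai/grok-4-fast-reasoning`) without changing the rest of the request.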

    Qwen Image Editarrow-up-right

    alibaba/z-image-turbo

    Alibaba Cloud

    Z-Image Turboarrow-up-right

    alibaba/z-image-turbo-lora

    Alibaba Cloud

    Z-Image Turbo LoRAarrow-up-right

    alibaba/wan2.2-t2i-plus

    Alibaba Cloud

    Wan 2.2 Plusarrow-up-right

    alibaba/wan2.2-t2i-flash

    Alibaba Cloud

    Wan 2.2 Flasharrow-up-right

    alibaba/wan2.5-t2i-preview

    Alibaba Cloud

    Wan 2.5 Previewarrow-up-right

    alibaba/wan-2-6-image

    Alibaba Cloud

    Wan 2.6arrow-up-right

    alibaba/wan-2-7-image

    Alibaba Cloud

    Coming Soon

    alibaba/wan-2-7-image-pro

    Alibaba Cloud

    Coming Soon

    alibaba/qwen-image-2-0

    Alibaba Cloud

    Coming Soon

    alibaba/qwen-image-2-0-pro

    Alibaba Cloud

    Coming Soon

    bytedance/seedream-3.0

    ByteDance

    Seedream 3.0arrow-up-right

    bytedance/seedream-v4-text-to-image

    ByteDance

    Seedream 4 Text-to-Imagearrow-up-right

    bytedance/seedream-v4-edit

    ByteDance

    Seedream 4 Editarrow-up-right

    bytedance/uso

    ByteDance

    USOarrow-up-right

    bytedance/seedream-4-5

    ByteDance

    Seedream 4.5arrow-up-right

    bytedance/seedream-5-0-lite-preview

    ByteDance

    Seedream 5.0 Lite

    flux-pro

    Flux

    FLUX.1 [pro]arrow-up-right

    flux-pro/v1.1

    Flux

    FLUX 1.1 [pro]arrow-up-right

    flux-pro/v1.1-ultra

    Flux

    FLUX 1.1 [pro ultra]arrow-up-right

    flux-realism

    Flux

    FLUX Realism LoRAarrow-up-right

    flux/dev

    Flux

    FLUX.1 [dev]arrow-up-right

    flux/dev/image-to-image

    Flux

    -

    flux/schnell

    Flux

    FLUX.1 [schnell]arrow-up-right

    flux/kontext-max/text-to-image

    Flux

    FLUX.1 Kontext [max]arrow-up-right

    flux/kontext-max/image-to-image

    Flux

    FLUX.1 Kontext [max]arrow-up-right

    flux/kontext-pro/text-to-image

    Flux

    Flux.1 Kontext [pro]arrow-up-right

    flux/kontext-pro/image-to-image

    Flux

    Flux.1 Kontext [pro]arrow-up-right

    flux/srpo

    Flux

    FLUX.1 SRPO Text-to-Imagearrow-up-right

    flux/srpo/image-to-image

    Flux

    FLUX.1 SRPO Image-to-Imagearrow-up-right

    blackforestlabs/flux-2

    Flux

    FLUX.2arrow-up-right

    blackforestlabs/flux-2-edit

    Flux

    FLUX.2 Editarrow-up-right

    blackforestlabs/flux-2-lora

    Flux

    Flux 2 LoRAarrow-up-right

    blackforestlabs/flux-2-lora-edit

    Flux

    Flux 2 LoRA Editarrow-up-right

    blackforestlabs/flux-2-pro

    Flux

    FLUX.2 [pro]arrow-up-right

    blackforestlabs/flux-2-pro-edit

    Flux

    FLUX.2 [pro] Editarrow-up-right

    imagen-3.0-generate-002

    Google

    Imagen 3arrow-up-right

    google/imagen4/preview

    Google

    Imagen 4 Previewarrow-up-right

    google/imagen-4.0-generate-001

    Google

    Imagen 4.0 Generatearrow-up-right

    google/imagen-4.0-fast-generate-001

    Google

    Imagen 4.0 Fast Generatearrow-up-right

    google/imagen-4.0-ultra-generate-001

    Google

    Imagen 4.0 Ultra Generatearrow-up-right

    google/gemini-2.5-flash-image

    Google

    Gemini 2.5 Flash Imagearrow-up-right

    google/gemini-2.5-flash-image-edit

    Google

    Gemini 2.5 Flash Image Editarrow-up-right

    google/nano-banana-pro or google/gemini-3-pro-image-preview

    Google

    Gemini 3 Pro Image (Nano Banana Pro)arrow-up-right

    google/nano-banana-pro-edit or google/gemini-3-pro-image-preview-edit

    Google

    Gemini 3 Pro Image Edit (Nano Banana Pro)arrow-up-right

    google/nano-banana-2 or google/gemini-3-1-flash-image-preview

    Google

    Gemini 3.1 Flash Image (Nano Banana 2)arrow-up-right

    klingai/image-o1

    Kling AI

    Kling Image O1arrow-up-right

    dall-e-2

    OpenAI

    OpenAI DALL·E 2arrow-up-right

    dall-e-3

    OpenAI

    OpenAI DALL·E 3arrow-up-right

    openai/gpt-image-1

    OpenAI

    gpt-image-1arrow-up-right

    openai/gpt-image-1-mini

    OpenAI

    GPT Image 1 Miniarrow-up-right

    openai/gpt-image-1-5

    OpenAI

    GPT Image 1.5arrow-up-right

    openai/gpt-image-2

    OpenAI

    GPT Image 2arrow-up-right

    recraft-v3

    Recraft AI

    Recraft v3arrow-up-right

    reve/create-image

    Reve

    Reve Create Imagearrow-up-right

    reve/edit-image

    Reve

    Reve Edit Imagearrow-up-right

    reve/remix-edit-image

    Reve

    Reve Remix Imagearrow-up-right

    stable-diffusion-v3-medium

    Stability AI

    Stable Diffusion 3arrow-up-right

    stable-diffusion-v35-large

    Stability AI

    Stable Diffusion 3.5 Largearrow-up-right

    hunyuan/hunyuan-image-v3-text-to-image

    Tencent

    HunyuanImage 3.0arrow-up-right

    topaz-labs/sharpen

    Topaz Labs

    Sharpenarrow-up-right

    topaz-labs/sharpen-gen

    Topaz Labs

    Sharpen Generativearrow-up-right

    x-ai/grok-imagine-image

    xAI

    Grok Imaginearrow-up-right

    x-ai/grok-imagine-image-pro

    xAI

    Grok Imagine Image Proarrow-up-right

    Wan2.1 Turboarrow-up-right

    alibaba/wan2.2-t2v-plus

    Alibaba Cloud

    Wan 2.2 T2Varrow-up-right

    alibaba/wan2.5-t2v-preview

    Alibaba Cloud

    Wan 2.5 Text-to-Videoarrow-up-right

    alibaba/wan2.5-i2v-preview

    Alibaba Cloud

    Wan 2.5 Image-to-Videoarrow-up-right

    alibaba/wan2.2-14b-animate-replace

    Alibaba Cloud

    Wan 2.2 14b animate replace

    alibaba/wan2.2-14b-animate-move

    Alibaba Cloud

    Wan 2.2 14b animate move

    alibaba/wan2.2-vace-fun-a14b-reframe

    Alibaba Cloud

    Wan 2.2 vace fun 14b reframe

    alibaba/wan2.2-vace-fun-a14b-outpainting

    Alibaba Cloud

    Wan 2.2 vace fun 14b outpainting

    alibaba/wan2.2-vace-fun-a14b-inpainting

    Alibaba Cloud

    Wan 2.2 vace fun 14b inpainting

    alibaba/wan2.2-vace-fun-a14b-pose

    Alibaba Cloud

    Wan 2.2 vace fun 14b pose

    alibaba/wan2.2-vace-fun-14b-depth

    Alibaba Cloud

    Wan 2.2 vace fun 14b depth

    alibaba/wan-2-6-t2v

    Alibaba Cloud

    Wan 2.6 Text-to-Videoarrow-up-right

    alibaba/wan-2-6-i2v

    Alibaba Cloud

    Wan 2.6 Image-to-Videoarrow-up-right

    alibaba/wan-2-6-r2v

    Alibaba Cloud

    Wan 2.6 Reference-to-Videoarrow-up-right

    alibaba/wan-2-6-image-to-video-flash

    Alibaba Cloud

    Coming Soon

    alibaba/happyhorse-1-0

    Alibaba Cloud

    Coming Soon

    bytedance/seedance-1-0-lite-t2v

    ByteDance

    Seedance 1.0 lite Text to Videoarrow-up-right

    bytedance/seedance-1-0-lite-i2v

    ByteDance

    Seedance 1.0 lite Image to Videoarrow-up-right

    bytedance/seedance-1-0-pro-t2v

    ByteDance

    Seedance 1.0 Proarrow-up-right

    bytedance/seedance-1-0-pro-i2v

    ByteDance

    Seedance 1.0 Proarrow-up-right

    bytedance/seedance-1-0-pro-fast

    ByteDance

    Seedance 1.0 Pro Fastarrow-up-right

    bytedance/omnihuman

    ByteDance

    OmniHumanarrow-up-right

    bytedance/omnihuman/v1.5

    ByteDance

    OmniHuman v1.5arrow-up-right

    bytedance/seedance-1-5-pro

    ByteDance

    Seedance 1.5 Proarrow-up-right

    bytedance/seedance-2-0

    ByteDance

    Coming Soon

    bytedance/seedance-2-0-fast

    ByteDance

    Coming Soon

    veo2

    Google

    Veo2 Text-to-Videoarrow-up-right

    veo2/image-to-video

    Google

    Veo2 Image-to-Videoarrow-up-right

    google/veo3

    Google

    Veo 3arrow-up-right

    google/veo-3.0-i2v

    Google

    Veo 3 I2Varrow-up-right

    google/veo-3.0-fast

    Google

    Veo 3 Fastarrow-up-right

    google/veo-3.0-i2v-fast

    Google

    Veo 3 I2V Fastarrow-up-right

    google/veo-3.1-t2v

    Google

    Veo 3.1 Text-to-Videoarrow-up-right

    google/veo-3.1-t2v-fast

    Google

    Veo 3.1 Fast Text-to-Videoarrow-up-right

    google/veo-3.1-i2v

    Google

    Veo 3.1 Image-to-Videoarrow-up-right

    google/veo-3.1-i2v-fast

    Google

    Veo 3.1 Fast Image-to-Videoarrow-up-right

    google/veo-3.1-reference-to-video

    Google

    Veo 3.1 Reference-to-Videoarrow-up-right

    google/veo-3.1-first-last-image-to-video

    Google

    Veo 3.1 First-Last Frame-to-Videoarrow-up-right

    google/veo-3.1-first-last-image-to-video-fast

    Google

    Veo 3.1 Fast First-Last Frame-to-Videoarrow-up-right

    google/veo3-1-extend-video

    Google

    Veo 3.1 Extend Videoarrow-up-right

    google/veo3-1-fast-extend-video

    Google

    Veo 3.1 Fast Extend Videoarrow-up-right

    google/veo-3-1-lite-generate-preview

    Google

    Coming Soon

    kling-video/v1/standard/image-to-video

    Kling AI

    Kling AI (image-to-video)arrow-up-right

    kling-video/v1/standard/text-to-video

    Kling AI

    Kling AI (text-to-video)arrow-up-right

    kling-video/v1/pro/image-to-video

    Kling AI

    Kling AI (image-to-video)arrow-up-right

    kling-video/v1/pro/text-to-video

    Kling AI

    Kling AI (text-to-video)arrow-up-right

    kling-video/v1.6/standard/text-to-video

    Kling AI

    Kling 1.6 Standardarrow-up-right

    kling-video/v1.6/standard/image-to-video

    Kling AI

    Kling 1.6 Standardarrow-up-right

    kling-video/v1.6/pro/image-to-video

    Kling AI

    Kling 1.6 Proarrow-up-right

    kling-video/v1.6/pro/text-to-video

    Kling AI

    Kling 1.6 Proarrow-up-right

    klingai/kling-video-v1.6-pro-effects

    Kling AI

    Kling 1.6 Pro Effectsarrow-up-right

    klingai/kling-video-v1.6-standard-effects

    Kling AI

    Kling 1.6 Standard Effectsarrow-up-right

    kling-video/v1.6/standard/multi-image-to-video

    Kling AI

    Kling V1.6 Multi-Image-to-Videoarrow-up-right

    klingai/v2-master-image-to-video

    Kling AI

    Kling 2.0 Masterarrow-up-right

    klingai/v2-master-text-to-video

    Kling AI

    Kling 2.0 Masterarrow-up-right

    kling-video/v2.1/standard/image-to-video

    Kling AI

    Kling V2.1 Standard I2Varrow-up-right

    kling-video/v2.1/pro/image-to-video

    Kling AI

    Kling V2.1 Pro I2Varrow-up-right

    klingai/v2.1-master-image-to-video

    Kling AI

    Kling 2.1 Master

    klingai/v2.1-master-text-to-video

    Kling AI

    Kling 2.1 Masterarrow-up-right

    klingai/v2.5-turbo/pro/image-to-video

    Kling AI

    Kling Video v2.5 Turbo Pro Image-to-Videoarrow-up-right

    klingai/v2.5-turbo/pro/text-to-video

    Kling AI

    Kling Video v2.5 Turbo Pro Text-to-Videoarrow-up-right

    klingai/avatar-standard

    Kling AI

    Kling AI Avatar Standardarrow-up-right

    klingai/avatar-pro

    Kling AI

    Kling AI Avatar Proarrow-up-right

    klingai/video-v2-6-pro-text-to-video

    Kling AI

    Kling 2.6 Pro Text-to-Videoarrow-up-right

    klingai/video-v2-6-pro-image-to-video

    Kling AI

    Kling 2.6 Pro Image-to-Videoarrow-up-right

    klingai/video-o1-image-to-video

    Kling AI

    Kling Video O1 Image to Videoarrow-up-right

    klingai/video-o1-reference-to-video

    Kling AI

    Kling Video O1 Reference-to-Videoarrow-up-right

    klingai/video-o1-video-to-video-edit

    Kling AI

    Kling Video O1 Video to Video Editarrow-up-right

    klingai/video-o1-video-to-video-reference

    Kling AI

    Kling Video O1 Video-to-Video Referencearrow-up-right

    klingai/video-v2-6-pro-motion-control

    Kling AI

    Kling 2.6 Pro Motion Controlarrow-up-right

    klingai/video-v3-standard-text-to-video

    Kling AI

    Kling Video v3 Standardarrow-up-right

    klingai/video-v3-standard-image-to-video

    Kling AI

    Kling Video v3 Standardarrow-up-right

    klingai/video-v3-pro-text-to-video

    Kling AI

    Kling Video v3 Proarrow-up-right

    klingai/video-v3-pro-image-to-video

    Kling AI

    Kling Video v3 Proarrow-up-right

    krea/krea-wan-14b/text-to-video

    Krea

    Krea WAN 14B Text-to-Videoarrow-up-right

    krea/krea-wan-14b/video-to-video

    Krea

    Krea WAN 14B Video-to-Videoarrow-up-right

    ltxv/ltxv-2

    LTXV

    LTXV 2arrow-up-right

    ltxv/ltxv-2-fast

    LTXV

    LTXV 2 Fastarrow-up-right

    luma/ray-2

    Luma AI

    Ray 2arrow-up-right

    luma/ray-flash-2

    Luma AI

    Ray Flash 2arrow-up-right

    magic/text-to-video

    Magic

    Magic Videoarrow-up-right

    magic/image-to-video

    Magic

    Magic Videoarrow-up-right

    magic/video-to-video

    Magic

    Magic Videoarrow-up-right

    video-01

    MiniMax

    MiniMax Video-01arrow-up-right

    video-01-live2d

    MiniMax

    -

    minimax/hailuo-02

    MiniMax

    Hailuo 02arrow-up-right

    minimax/hailuo-2.3

    MiniMax

    Hailuo 2.3arrow-up-right

    minimax/hailuo-2.3-fast

    MiniMax

    Hailuo 2.3 Fastarrow-up-right

    sora-2-t2v

    OpenAI

    -

    sora-2-i2v

    OpenAI

    -

    sora-2-pro-t2v

    OpenAI

    -

    sora-2-pro-i2v

    OpenAI

    -

    pixverse/v5/text-to-video

    PixVerse

    Pixverse v5 Text-to-Videoarrow-up-right

    pixverse/v5/image-to-video

    PixVerse

    Pixverse v5 Image-to-Videoarrow-up-right

    pixverse/v5/transition

    PixVerse

    Pixverse v5 Transitionarrow-up-right

    pixverse/v5-5-text-to-video

    PixVerse

    PixVerse V5.5 Text-to-Videoarrow-up-right

    pixverse/v5-5-image-to-video

    PixVerse

    Pixverse v5.5 Image-to-Videoarrow-up-right

    pixverse/lip-sync

    PixVerse

    Coming Soon

    gen3a_turbo

    Runway

    Runway Gen-3 turboarrow-up-right

    runway/gen4_turbo

    Runway

    Runway Gen-4 Turboarrow-up-right

    runway/gen4_aleph

    Runway

    Alepharrow-up-right

    runway/act_two

    Runway

    Runway Act Twoarrow-up-right

    sber-ai/kandinsky5-t2v

    Sber AI

    Kandinsky 5 Standardarrow-up-right

    sber-ai/kandinsky5-distill-t2v

    Sber AI

    Kandinsky 5 Distillarrow-up-right

    tencent/hunyuan-video-foley

    Tencent

    HunyuanVideo Foleyarrow-up-right

    veed/fabric-1.0

    Veed

    fabric-1.0

    veed/fabric-1.0-fast

    Veed

    fabric-1.0-fast

    Universalarrow-up-right

    #g1_nova-2-automotive

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-conversationalai

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-drivethru

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-finance

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-general

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-medical

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-meeting

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-phonecall

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-video

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_nova-2-voicemail

    Deepgram

    Deepgram Nova-2arrow-up-right

    #g1_whisper-tiny

    OpenAI

    -

    #g1_whisper-small

    OpenAI

    -

    #g1_whisper-base

    OpenAI

    -

    #g1_whisper-medium

    OpenAI

    -

    #g1_whisper-large

    OpenAI

    Whisperarrow-up-right

    openai/gpt-4o-transcribe

    OpenAI

    GPT-4o Transcribearrow-up-right

    openai/gpt-4o-mini-transcribe

    OpenAI

    GPT-4o Mini Transcribearrow-up-right

    Auraarrow-up-right

    #g1_aura-arcas-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-asteria-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-athena-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-helios-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-hera-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-luna-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-orion-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-orpheus-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-perseus-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-stella-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-zeus-en

    Deepgram

    Auraarrow-up-right

    #g1_aura-2-amalthea-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-andromeda-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-apollo-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-arcas-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-aries-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-asteria-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-athena-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-atlas-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-aurora-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-callista-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-cora-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-cordelia-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-delia-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-draco-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-electra-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-harmonia-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-helena-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-hera-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-hermes-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-hyperion-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-iris-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-janus-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-juno-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-jupiter-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-luna-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-mars-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-minerva-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-neptune-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-odysseus-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-ophelia-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-orion-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-orpheus-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-pandora-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-phoebe-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-pluto-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-saturn-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-selene-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-thalia-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-theia-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-vesta-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-zeus-en

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-celeste-es

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-estrella-es

    Deepgram

    Aura 2arrow-up-right

    #g1_aura-2-nestor-es

    Deepgram

    Aura 2arrow-up-right

    elevenlabs/eleven_multilingual_v2

    ElevenLabs

    ElevenLabs Multilingual v2arrow-up-right

    elevenlabs/eleven_turbo_v2_5

    ElevenLabs

    ElevenLabs Turbo v2.5arrow-up-right

    hume/octave-2

    Hume AI

    Octave 2arrow-up-right

    inworld/tts-1

    Inworld

    Inworld TTS-1arrow-up-right

    inworld/tts-1-max

    Inworld

    Inworld TTS-1-Maxarrow-up-right

    inworld/tts-1-5-mini

    Inworld

    Inworld TTS-1.5-Miniarrow-up-right

    inworld/tts-1-5-max

    Inworld

    Coming Soon

    microsoft/vibevoice-1.5b

    Microsoft

    VibeVoice 1.5Barrow-up-right

    microsoft/vibevoice-7b

    Microsoft

    VibeVoice 7Barrow-up-right

    openai/tts-1

    OpenAI

    TTS-1arrow-up-right

    openai/tts-1-hd

    OpenAI

    TTS-1 HDarrow-up-right

    openai/gpt-4o-mini-tts

    OpenAI

    GPT-4o-mini-TTSarrow-up-right

    MiniMax Speech 2.5 Turboarrow-up-right

    minimax/speech-2.5-hd-preview

    MiniMax

    MiniMax Speech 2.5 HDarrow-up-right

    minimax/speech-2.6-turbo

    MiniMax

    MiniMax Speech 2.6 Turboarrow-up-right

    minimax/speech-2.6-hd

    MiniMax

    MiniMax Speech 2.6 HDarrow-up-right

    minimax/speech-2.8-turbo

    MiniMax

    Speech 2.8 Turboarrow-up-right

    minimax/speech-2.8-hd

    MiniMax

    Speech 2.8 HDarrow-up-right

    Lyria 2arrow-up-right

    stable-audio

    Stability AI

    Stable Audioarrow-up-right

    minimax-music

    Minimax AI

    -

    music-01

    Minimax AI

    MiniMax Musicarrow-up-right

    minimax/music-1.5

    Minimax AI

    MiniMax Music 1.5arrow-up-right

    minimax/music-2.0

    Minimax AI

    MiniMax Music 2.0arrow-up-right

    minimax/music-2.6

    Minimax AI

    MiniMax Music 2.6arrow-up-right

    minimax/music-cover

    Minimax AI

    Coming Soon

    Mistral OCR Latestarrow-up-right

    zhipu/glm-ocr

    Zhipu

    GLM-OCRarrow-up-right

    Hunyuan Partarrow-up-right

    32,000

    Qwen Text Embedding v4arrow-up-right

    voyage-2

    Anthropic

    4,000

    -

    voyage-code-2

    Anthropic

    16,000

    -

    voyage-finance-2

    Anthropic

    32,000

    -

    voyage-large-2

    Anthropic

    16,000

    -

    voyage-large-2-instruct

    Anthropic

    16,000

    Voyage Large 2 Instructarrow-up-right

    voyage-law-2

    Anthropic

    16,000

    -

    voyage-multilingual-2

    Anthropic

    32,000

    -

    text-multilingual-embedding-002

    Google

    2,000

    -

    text-embedding-3-small

    OpenAI

    8,000

    -

    text-embedding-3-large

    OpenAI

    8,000

    Text-embedding-3-largearrow-up-right

    text-embedding-ada-002

    OpenAI

    8,000

    Text-embedding-ada-002arrow-up-right

    200,000

    -

    meta-llama/llama-4-maverick

    Meta

    256,000

    Llama 4 Maverickarrow-up-right

    google/gemini-3-pro-preview

    Google

    200,000

    Gemini 3 Pro Previewarrow-up-right

    BAAI/bge-base-en-v1.5

    BAAI

    512

    BAAI-Bge-Base-1p5arrow-up-right

    togethercomputer/m2-bert-80M-32k-retrieval

    Together AI

    32,000

    M2-BERT-Retrieval-32karrow-up-right

    imagen-4.0-ultra-generate-preview-06-06

    Google

    -

    Imagen 4 Ultraarrow-up-right

    x-ai/grok-2-image

    xAI

    -

    Grok 2 Imagearrow-up-right

    claude-3-7-sonnet-20250219

    Anthropic

    200,000

    Claude 3.7 Sonnetarrow-up-right

    claude-3-5-haiku-20241022

    Anthropic

    200,000

    -

    gemini-2.0-flash-exp

    Google

    1,000,000

    Gemini 2.0 Flash Experimentalarrow-up-right

    meta-llama/Meta-Llama-Guard-3-8B

    Meta

    8,000

    Llama Guard 3 (8B)arrow-up-right

    meta-llama/LlamaGuard-2-8b

    Meta

    8,000

    LlamaGuard 2 (8b)arrow-up-right

    meta-llama/Llama-Guard-3-11B-Vision-Turbo

    Meta

    128,000

    -

    meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo

    Meta

    128,000

    Llama 3.1 70B Instruct Turboarrow-up-right

    meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo

    Meta

    4,000

    Llama 3.1 (405B) Instruct Turboarrow-up-right

    meta-llama/Llama-3.2-3B-Instruct-Turbo

    Meta

    131,000

    Llama 3.2 3B Instruct Turboarrow-up-right

    Qwen/Qwen3-235B-A22B-fp8-tput

    Alibaba Cloud

    32,000

    Qwen 3 235B A22Barrow-up-right

    Qwen/Qwen2.5-72B-Instruct-Turbo

    Alibaba Cloud

    32,000

    Qwen 2.5 72B Instruct Turboarrow-up-right

    qwen/qwen-2.5-vl-7b-instruct

    Alibaba Cloud

    32,000

    Qwen2.5 VL 7B Instructarrow-up-right

    mistralai/mistral-tiny

    Mistral AI

    32,000

    Mistral Tinyarrow-up-right

    mistralai/Mistral-7B-Instruct-v0.3

    Mistral AI

    32,000

    Mistral (7B) Instruct v0.3arrow-up-right

    meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo

    Meta

    128,000

    Llama 3.1 8B Instruct Turboarrow-up-right

    mistralai/Mistral-7B-Instruct-v0.2

    Mistral AI

    32,000

    Mistral (7B) Instruct v0.2arrow-up-right

    chatgpt-4o-latest

    OpenAI

    128,000

    -

    meta-llama/llama-4-scout

    Meta

    1,000,000

    Llama 4 Scoutarrow-up-right

    BAAI/bge-large-en-v1.5

    BAAI

    512

    bge-large-enarrow-up-right

    bagoodex/bagoodex-search-v1

    Bagoodex

    Bagoodex Web Search v1arrow-up-right

    deepseek/deepseek-prover-v2

    DeepSeek

    164,000

    DeepSeek Prover V2arrow-up-right

    claude-3-opus-20240229, anthropic/claude-3-opus, claude-3-opus-latest

    Anthropic

    200,000

    Claude 3 Opusarrow-up-right

    luma/ray-1.6

    Luma AI

    Ray 1.6arrow-up-right

    meta-llama/Llama-3-70b-chat-hf

    Meta

    8,000

    Llama 3 70B Instruct Referencearrow-up-right

    bytedance/seededit-3.0-i2i

    ByteDance

    SeedEdit 3.0arrow-up-right

    textembedding-gecko-multilingual@001

    Google

    2,000

    Textembedding-gecko-multilingual@001arrow-up-right

    textembedding-gecko@003

    Google

    2,000

    Textembedding-gecko@003arrow-up-right

    mistralai/codestral-2501

    Mistral AI

    256,000

    Mistral Codestral-2501arrow-up-right

    mistralai/Mistral-7B-Instruct-v0.1

    Mistral AI

    8,000

    Mistral (7B) Instruct v0.1arrow-up-right

    Qwen/Qwen2.5-Coder-32B-Instruct

    Alibaba Cloud

    131,000

    Qwen 2.5 Coderarrow-up-right

    Qwen/QwQ-32B

    Alibaba Cloud

    131,000

    Qwq-32Barrow-up-right

    kling-video/v1.5/standard/text-to-video

    Kling AI

    128,000

    Kling 1.5 Standardarrow-up-right

    o1-mini, o1-mini-2024-09-12

    OpenAI

    128,000

    OpenAI o1-miniarrow-up-right

    Qwen/Qwen2-72B-Instruct

    Alibaba Cloud

    32,000

    Qwen 2 Instruct (72B)arrow-up-right

    claude-3-5-sonnet-20240620

    Anthropic

    200,000

    -

    claude-3-5-sonnet-20241022

    Anthropic

    200,000

    Claude 3.5 Sonnet 20241022arrow-up-right

    cohere/command-r-plus

    Cohere

    128,000

    Command R+arrow-up-right

    google/gemma-2-27b-it

    Google

    8,000

    Gemma 2 (27b)arrow-up-right

    NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO

    Nous Research

    32,000

    -

    nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

    Nvidia

    128,000

    Llama 3.1 Nemotron 70B Instructarrow-up-right

    meta-llama/Llama-3-8b-chat-hf

    Meta

    8,000

    Llama 3 8B Instruct Referencearrow-up-right

    meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo

    Meta

    131,000

    Llama 3.2 90B Vision Instruct Turboarrow-up-right

    meta-llama/Llama-Vision-Free

    Meta

    128,000

    -

    meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo

    Meta

    131,000

    Llama 3.2 11B Vision Instruct Turboarrow-up-right

    abab6.5s-chat

    MiniMax

    245,000

    -

    openrouter/horizon-beta

    OpenRouter

    256,000

    -

    openrouter/horizon-alpha

    OpenRouter

    256,000

    -

    wan/v2.1/1.3b/text-to-video

    Alibaba Cloud

    -

    Wan 2.1arrow-up-right

    o1-preview, o1-preview-2024-09-12

    OpenAI

    128,000

    OpenAI o1-previewarrow-up-right

    claude-3-sonnet-20240229, anthropic/claude-3-sonnet, claude-3-sonnet-latest

    Anthropic

    200,000

    Claude 3 Sonnetarrow-up-right

    google/gemini-2.5-pro-preview, google/gemini-2.5-pro-preview-05-06

    Google

    1,000,000

    Gemini Pro 2.5 Previewarrow-up-right

    google/gemini-2.5-flash-preview

    Google

    1,000,000

    Gemini 2.5 Flash Previewarrow-up-right

    neversleep/llama-3.1-lumimaid-70b

    NeverSleep

    8,000

    Llama 3.1 Lumimaid 70barrow-up-right

    x-ai/grok-beta

    xAI

    131,000

    Grok-2 Betaarrow-up-right

    gpt-4.5-preview

    OpenAI

    128,000

    Chat GPT 4.5 previewarrow-up-right

    gemini-1.5-flash

    Google

    1,000,000

    Gemini 1.5 Flasharrow-up-right

    gemini-1.5-pro

    Google

    1,000,000

    Gemini 1.5 Proarrow-up-right

    google/gemma-3-1b-it

    Google

    128,000

    Gemma 3 (1B)arrow-up-right

    togethercomputer/m2-bert-80M-8k-retrieval

    TogetherAI

    8,000

    M2-BERT-Retrieval-8karrow-up-right

    togethercomputer/m2-bert-80M-2k-retrieval

    TogetherAI

    2,000

    M2-BERT-Retrieval-2Karrow-up-right

    Gryphe/MythoMax-L2-13b-Lite

    Gryphe

    4,000

    -

    mistralai/Mixtral-8x22B-Instruct-v0.1

    Mistral AI

    64,000

    Mixtral 8x22B Instructarrow-up-right

    google/gemini-2.5-pro-exp-03-25

    Google

    1,000,000

    -

    google/gemini-2.0-flash-thinking-exp-01

    Google

    1,000,000

    Gemini 2.0 Flash Thinking Experimentalarrow-up-right

    ai21/jamba-1-5-mini

    AI21 Labs

    256,000

    Jamba 1.5 Miniarrow-up-right

    textembedding-gecko@001

    Google

    3,000

    -

    google/gemini-pro or gemini-pro

    Google

    32,000

    Gemini 1.0 Proarrow-up-right

    meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo-128K

    Meta

    128,000

    -

    stabilityai/stable-diffusion-xl-base-1.0

    Stability AI

    Stable Diffusion XL 1.0arrow-up-right

    upstage/solar-10.7b-instruct-v1.0

    Upstage

    4,000

    Upstage SOLAR Instruct v1 (11B)arrow-up-right

    meta-llama/Llama-2-13b-chat-hf

    Meta

    4,100

    LLaMA-2 Chat (13B)arrow-up-right

    meta-llama/meta-llama-3-70b-instruct-turbo

    Meta

    128,000

    -

    google/gemma-2-9b-it

    Google

    8,000

    Gemma 2 (9B)arrow-up-right

    google/gemma-2b-it

    Google

    8,000

    Gemma Instruct (2B)arrow-up-right

    Gryphe/MythoMax-L2-13b

    Gryphe

    4,000

    MythoMax-L2 (13B)arrow-up-right

    microsoft/WizardLM-2-8x22B

    Microsoft

    64,000

    WizardLM 2-8 (22B)arrow-up-right

    Austism/chronos-hermes-13b

    Austism

    2,000

    Chronos Hermes 13barrow-up-right

    databricks/dbrx-instruct

    Databricks

    32,000

    DBRX Instructarrow-up-right

    deepseek-ai/deepseek-llm-67b-chat

    DeepSeek

    4,000

    Deepseek-LLM-67b-Chatarrow-up-right

    deepseek-ai/deepseek-coder-33b-instruct

    DeepSeek

    16,000

    Deepseek Coder Instruct (33B)arrow-up-right

    Meta-Llama/Llama-2-7b-chat-hf

    Meta

    4,000

    LLaMA-2 Chat (7B)arrow-up-right

    Meta-Llama/Meta-Llama-3-70B-Instruct-Lite

    Meta

    8,000

    Llama 3 70B Instruct Litearrow-up-right

    Meta-Llama/Llama-Guard-7b

    Meta

    4,000

    Llama Guard (7B)arrow-up-right

    meta-llama/Llama-2-7b-hf

    Meta

    4,000

    LLaMA-2 (7B)arrow-up-right

    meta-llama/Llama-3-8b-hf

    Meta

    8,000

    Llama-3 (8B)arrow-up-right

    codellama/CodeLlama-70b-hf

    Meta

    16,000

    Code Llama (70B)arrow-up-right

    codellama/CodeLlama-7b-Instruct-hf

    Meta

    16,000

    Code Llama Instruct (7B)arrow-up-right

    codellama/CodeLlama-13b-Instruct-hf

    Meta

    16,000

    Code Llama Instruct (13B)arrow-up-right

    codellama/CodeLlama-70b-Instruct-hf

    Meta

    4,000

    Code Llama Instruct (70B)arrow-up-right

    codellama/CodeLlama-70b-Python-hf

    Meta

    4,000

    Code Llama Python (70B)arrow-up-right

    mistralai/Mixtral-8x22B-Instruct-v0.1

    Mistral AI

    64,000

    Mixtral 8x22B Instructarrow-up-right

    gpt-3.5-turbo-16k-0613

    OpenAI

    -

    gpt-4-0613

    OpenAI

    8,000

    GPT-4arrow-up-right

    Qwen/Qwen-14B-Chat

    Alibaba Cloud

    8,000

    Qwen Chat (14B)arrow-up-right

    Qwen/Qwen1.5-0.5B

    Alibaba Cloud

    32,000

    Qwen 1.5 (0.5B)arrow-up-right

    Qwen/Qwen1.5-1.8B

    Alibaba Cloud

    32,000

    Qwen 1.5 (1.8B)arrow-up-right

    Qwen/Qwen1.5-4B

    Alibaba Cloud

    32,000

    Qwen 1.5 (4B)arrow-up-right

    Qwen/Qwen1.5-1.8B-Chat

    Alibaba Cloud

    32,000

    Qwen 1.5 Chat (1.8B)arrow-up-right

    Qwen/Qwen1.5-4B-Chat

    Alibaba Cloud

    32,000

    Qwen 1.5 Chat (4B)arrow-up-right

    Qwen/Qwen1.5-7B-Chat

    Alibaba Cloud

    32,000

    Qwen 1.5 Chat (7B)arrow-up-right

    Qwen/Qwen1.5-14B-Chat

    Alibaba Cloud

    32,000

    Qwen 1.5 Chat (14B)arrow-up-right

    qwen/qvq-72b-preview

    Alibaba Cloud

    32,000

    QVQ-72B-Previewarrow-up-right

    togethercomputer/guanaco-13b

    Tim Dettmers

    2,000

    Guanaco (13B)arrow-up-right

    togethercomputer/guanaco-33b

    Tim Dettmers

    2,000

    Guanaco (33B)arrow-up-right

    togethercomputer/guanaco-65b

    Tim Dettmers

    2,000

    Guanaco (65B)arrow-up-right

    togethercomputer/mpt-7b-chat

    Mosaic ML

    2,000

    MPT-Chat (7B)arrow-up-right

    togethercomputer/mpt-30b-chat

    Mosaic ML

    8,000

    MPT-Chat (30B)arrow-up-right

    togethercomputer/RedPajama-INCITE-7B-Instruct

    RedPajama

    2,000

    RedPajama-INCITE Instruct (7B)arrow-up-right

    prompthero/openjourney

    PromptHero

    77

    Openjourney v4arrow-up-right

    wavymulder/Analog-Diffusion

    wavymulder

    77

    Analog Diffusionarrow-up-right

    -

    01.AI

    4,000

    01-ai Yi Base (6B)arrow-up-right

    Undi95/Toppy-M-7B

    Undi95

    4,000

    Toppy M (7B)arrow-up-right

    SG161222/Realistic_Vision_V3.0_VAE

    Together

    77

    Realistic Vision 3.0arrow-up-right

    tiiuae/falcon-40b

    TII

    2,000

    Falcon (40B)arrow-up-right

    allenai/OLMo-7B

    Allen Institute for AI

    2,000

    OLMo-7Barrow-up-right

    bigcode/starcoder

    BigCode

    8,000

    StarCoder (16B)arrow-up-right

    HuggingFaceH4/starchat-alpha

    Hugging Face

    8,000

    StarCoderChat Alpha (16B)arrow-up-right

    NousResearch/Nous-Hermes-Llama2-70b

    NousResearch

    4,000

    Nous Hermes LLaMA-2 (70B)arrow-up-right

    NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT

    NousResearch

    32,000

    Nous Hermes 2 - Mixtral 8x7B-SFTarrow-up-right

    NousResearch/Nous-Hermes-2-Mistral-7B-DPO

    NousResearch

    32,000

    Nous Hermes 2 - Mistral DPO (7B)arrow-up-right

    NousResearch/Hermes-2-Theta-Llama-3-70B

    NousResearch

    8,000

    Hermes 2 Theta Llama-3 70Barrow-up-right

    defog/sqlcoder

    Defog AI

    8,000

    SQLCoder (15B)arrow-up-right

    replit/replit-code-v1-3b

    Replit

    2,000

    Replit-Code-v1 (3B)arrow-up-right

    lmsys/vicuna-13b-v1.5

    LMSYS

    4,000

    Vicuna v1.5 (13B)arrow-up-right

    microsoft/phi-2

    Microsoft

    2,000

    Microsoft Phi-2arrow-up-right

    stabilityai/stablelm-base-alpha-3b

    StabilityAI

    4,000

    StableLM Base Alpha 3Barrow-up-right

    runwayml/stable-diffusion-v1-5

    StabilityAI

    77

    Stable Diffusion 1.5arrow-up-right

    stabilityai/stable-diffusion-2-1

    StabilityAI

    77

    Stable Diffusion 2.1arrow-up-right

    teknium/OpenHermes-2p5-Mistral-7B

    Teknium

    8,000

    OpenHermes-2.5-Mistral (7B)arrow-up-right

    openchat/openchat-3.5-1210

    OpenChat

    8,000

    OpenChat 3.5 (7B)arrow-up-right

    DiscoResearch/DiscoLM-mixtral-8x7b-v2

    Disco Research

    32,000

    DiscoLM Mixtral 8x7b (46.7B)arrow-up-right

    google/flan-t5-xl

    Google

    512

    FLAN T5 XL (3B)arrow-up-right

    garage-bAInd/Platypus2-70B-instruct

    Garage-bAInd

    4,000

    Platypus2-70B-Instructarrow-up-right

    EleutherAI/gpt-neox-20b

    EleutherAI

    2,000

    GPT Neox 20Barrow-up-right

    gradientai/Llama-3-70B-Instruct-Gradient-1048k

    Gradient

    1,048,000

    Llama-3 70B Gradient Instruct 1048karrow-up-right

    WhereIsAI/UAE-Large-V1

    WhereIsAI

    512

    UAE-Large-V1arrow-up-right

    zero-one-ai/Yi-34B-Chat

    01.AI

    4,000

    Yi-34B-Chatarrow-up-right

    meta-llama/Meta-Llama-3.1-70B-Reference

    Meta

    32,000

    –

    meta-llama/Meta-Llama-3.1-8B-Reference

    Meta

    32,000

    –

    EleutherAI/llemma_7b

    EleutherAI

    32,000

    –

    huggyllama/llama-30b

    Huggyllama

    32,000

    –

    huggyllama/llama-13b

    Huggyllama

    32,000

    –

    togethercomputer/llama-2-70b

    TogetherAI

    32,000

    –

    togethercomputer/llama-2-13b

    TogetherAI

    32,000

    –

    huggyllama/llama-65b

    Huggyllama

    32,000

    –

    WizardLM/WizardLM-70B-V1.0

    WizardLM

    32,000

    –

    huggyllama/llama-7b

    Huggyllama

    32,000

    –

    togethercomputer/llama-2-7b

    TogetherAI

    32,000

    –

    NousResearch/Nous-Hermes-13b

    NousResearch

    2,000

    –

    mistralai/Mistral-7B-v0.1

    Mistral AI

    32,000

    Mistral 7Barrow-up-right

    mistralai/Mixtral-8x7B-v0.1

    Mistral AI

    32,000

    Mixtral 8x7B v0.1arrow-up-right

    -

    Suno AI

    32

    Suno AIarrow-up-right

    gpt-3.5-turbo
    Chat GPT 3.5 Turboarrow-up-right
    gpt-3.5-turbo-0125
    alibaba/qwen-image
    Qwen Imagearrow-up-right
    alibaba/qwen-image-edit
    alibaba/wan2.1-t2v-plus
    Wan2.1 Plusarrow-up-right
    alibaba/wan2.1-t2v-turbo
    aai/slam-1
    Slam 1arrow-up-right
    aai/universal
    alibaba/qwen3-tts-flash
    Qwen3-TTS-Flasharrow-up-right
    #g1_aura-angus-en
    elevenlabs/v3_alpha
    Eleven v3 Alphaarrow-up-right
    minimax/speech-2.5-turbo-preview
    elevenlabs/eleven_music
    Eleven Musicarrow-up-right
    google/lyria2
    The service has no Model ID
    mistral/mistral-ocr-latest
    triposr
    Stable TripoSR 3Darrow-up-right
    tencent/hunyuan-part
    alibaba/qwen-text-embedding-v3
    Qwen Text Embedding v3arrow-up-right
    alibaba/qwen-text-embedding-v4
    Qwen Text Embedding v4arrow-up-right
    model: string · enum (Required)
    Possible values:

    role: string · enum (Required)

    The role of the author of the message — in this case, the user.

    Possible values:

    content: any of (Required)

    The contents of the user message.

    string (Optional)

    or

    type: string · enum (Required)

    The type of the content part.

    Possible values:

    text: string (Required)

    The text content.

    name: string (Optional)

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role: string · enum (Required)

    The role of the author of the message — in this case, the system.

    Possible values:

    content: any of (Required)

    The contents of the system message.

    string (Optional)

    or

    type: string · enum (Required)

    The type of the content part.

    Possible values:

    text: string (Required)

    The text content.

    name: string (Optional)

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role: string · enum (Required)

    The role of the author of the message — in this case, the tool.

    Possible values:

    content: string (Required)

    The contents of the tool message.

    tool_call_id: string (Required)

    Tool call that this message is responding to.

    name: string · nullable (Optional)

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role: string · enum (Required)

    The role of the author of the message — in this case, the Assistant.

    Possible values:

    content: any of (Optional)

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    string (Optional)

    The contents of the Assistant message.

    or

    items: any of (Optional)

    type: string · enum (Required)

    The type of the content part.

    Possible values:

    text: string (Required)

    The text content.

    or

    refusal: string (Required)

    The refusal message generated by the model.

    type: string · enum (Required)

    The type of the content part.

    Possible values:

    name: string (Optional)

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    id: string (Required)

    The ID of the tool call.

    type: string · enum (Required)

    The type of the tool. Currently, only function is supported.

    Possible values:

    name: string (Required)

    The name of the function to call.

    arguments: string (Required)

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusal: string · nullable (Optional)

    The refusal message by the Assistant.
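The four message variants above (user, system, tool, Assistant) go into a single messages array. Below is a minimal sketch of a request body that exercises all four roles, including an Assistant tool call and the tool message that answers it; the function name add and the call ID call_1 are hypothetical, chosen only for illustration.

```python
# Request body covering all four message roles described above.
# "add" and "call_1" are hypothetical names for this sketch.
request_body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a terse calculator."},
        # "name" helps differentiate participants sharing the same role
        {"role": "user", "name": "alice", "content": "What is 2 + 2?"},
        {
            # Assistant content may be omitted because tool_calls is present
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",  # only "function" is supported
                    "function": {"name": "add", "arguments": '{"a": 2, "b": 2}'},
                }
            ],
        },
        # The tool message must echo the ID of the call it is answering
        {"role": "tool", "tool_call_id": "call_1", "content": "4"},
    ],
}
```

With the SDK shown in the Quickstart, this dict can be sent as client.chat.completions.create(**request_body).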

    max_tokens: number · min: 1 (Optional)

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    stream: boolean (Optional)

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usage: boolean (Required)
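A sketch of the flags above. Note that nesting include_usage inside a stream_options object follows the OpenAI-compatible request shape and is an assumption, since the parent key is not shown explicitly here.

```python
# Streaming flags; the "stream_options" parent key is assumed from the
# OpenAI-compatible API, as the reference only names include_usage.
stream_params = {
    "max_tokens": 256,                          # cap generation cost
    "stream": True,                             # default is false
    "stream_options": {"include_usage": True},  # request a final usage chunk
}
```

When stream is True, the SDK returns an iterator of server-sent-event chunks instead of a single response object.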
    type: string · enum (Required)

    The type of the tool. Currently, only function is supported.

    Possible values:
    description: string (Optional)

    A description of what the function does, used by the model to choose when and how to call the function.

    name: string (Required)

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other properties: any · nullable (Optional)

    The parameters the function accepts, described as a JSON Schema object.

    strict: boolean · nullable (Optional)

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choice: any of (Optional)

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enum (Optional)

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    type: string · enum (Required)

    The type of the tool. Currently, only function is supported.

    Possible values:
    name: string (Required)

    The name of the function to call.

    parallel_tool_calls: boolean (Optional)

    Whether to enable parallel function calling during tool use.
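Putting the tool fields above together: a hypothetical get_weather function (its name, description, and JSON Schema are illustrative, not part of the API), a tool_choice that forces that tool, and parallel calling enabled.

```python
import json

# Tool definition built from the fields above; "get_weather" is hypothetical.
tool_params = {
    "tools": [
        {
            "type": "function",  # only "function" is supported
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "parameters": {  # a JSON Schema object
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
                "strict": True,  # exact schema adherence for the call
            },
        }
    ],
    # Force this specific tool rather than "none" / "auto" / "required"
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": True,
}

# Per the note above, validate arguments before calling your function:
# they arrive as a JSON string that may not always be valid.
args = json.loads('{"city": "Berlin"}')
```

The forced-tool form of tool_choice must name a function that actually appears in tools, as it does here.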

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
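A sketch of the `stop` parameter's accepted shapes and its documented effect: at most 4 sequences, and the returned text never contains the stop sequence itself. The truncation helper only illustrates the semantics client-side.

```python
def normalize_stop(stop):
    """Accept a single string, a list of up to 4 strings, or None,
    mirroring the shapes the `stop` parameter allows."""
    if stop is None:
        return None
    if isinstance(stop, str):
        stop = [stop]
    if len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    return stop

def truncate_at_stop(text, stop):
    """Illustrates the documented behavior: output is cut before the
    first stop sequence, which is not included in the result."""
    for s in normalize_stop(stop) or []:
        i = text.find(s)
        if i != -1:
            text = text[:i]
    return text
```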
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
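A sketch of a Predicted Outputs request: the prediction supplies the old file text so that matching tokens can be returned quickly. The top-level field name `prediction` follows the OpenAI-compatible request shape, and the file contents are illustrative.

```python
# Regenerating a file with one small change; the old text goes into the
# prediction so unchanged spans can be matched instead of regenerated.
original = "def greet():\n    print('hello')\n"

payload = {
    "model": "gpt-4o",
    "messages": [{
        "role": "user",
        "content": "Rename greet to welcome in this file:\n" + original,
    }],
    "prediction": {"type": "content", "content": original},
}
```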

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
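A sketch of a `json_schema` response format, including a check for the documented naming rule (a-z, A-Z, 0-9, underscores and dashes, at most 64 characters). The nested `json_schema` key follows the OpenAI-compatible shape, and the schema itself is illustrative.

```python
import re

def valid_schema_name(name):
    """Enforce the documented rule: letters, digits, underscores and
    dashes only, maximum length 64."""
    return bool(re.fullmatch(r"[A-Za-z0-9_-]{1,64}", name))

response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",
        "strict": True,  # model always follows the schema exactly
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
            "additionalProperties": False,
        },
    },
}
```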

    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
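The returned values are log probabilities, so converting one back to a plain probability is a single `exp`. The request extras and the sample logprob below are illustrative.

```python
import math

# Request both switches together: top_logprobs requires logprobs=True.
request_extras = {"logprobs": True, "top_logprobs": 5}  # 0-20 allowed

def to_probability(logprob):
    """Convert a returned log probability to a probability in [0, 1]."""
    return math.exp(logprob)
```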

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.
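Because `arguments` is model-generated JSON that may be invalid or contain hallucinated parameters, the reference asks you to validate it before calling your function. One minimal way, under the assumption that you know your function's allowed parameter names:

```python
import json

def parse_tool_arguments(raw, allowed_params):
    """Parse the model's `arguments` string defensively: return None on
    invalid JSON, and drop any keys outside the function's schema."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(args, dict):
        return None
    return {k: v for k, v in args.items() if k in allowed_params}
```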

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
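A client typically branches on `finish_reason` after each completion. The follow-up actions below are illustrative suggestions, not API behavior.

```python
def handle_finish(reason):
    """Map each documented finish_reason to a follow-up action
    (action strings are illustrative)."""
    actions = {
        "stop": "done",  # natural stop point or a provided stop sequence
        "length": "retry with a higher max_tokens",
        "content_filter": "inspect the filtered output",
        "tool_calls": "execute the requested tools and send results back",
    }
    return actions.get(reason, "unknown finish_reason")
```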
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
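When one character spans multiple tokens, the per-token `bytes` arrays must be concatenated before decoding, as the field descriptions above note. The byte values below are a synthetic example.

```python
def decode_token_bytes(byte_lists):
    """Join per-token UTF-8 byte arrays (skipping null entries) and
    decode them into the correct text representation."""
    joined = bytes(b for bl in byte_lists if bl is not None for b in bl)
    return joined.decode("utf-8")

# "cafe" with an accented e: 0xC3 0xA9 is split across two tokens here.
text = decode_token_bytes([[99, 97, 102], [195], [169]])
```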

    modelstringRequired

    The model used for the chat completion.

    Example: qwen-plus
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.
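The usage fields are additive: `total_tokens` is the sum of `prompt_tokens` and `completion_tokens`. The numbers below match the examples in this reference.

```python
# Usage object as documented above; a quick consistency check.
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}
assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
```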

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
modelstring · enumRequired
Possible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
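With `stream` enabled, the response arrives as server-sent events. The parser below assumes the common OpenAI-style chunk layout (`choices[0].delta.content` per `data:` line, terminated by `data: [DONE]`); the sample lines are synthetic.

```python
import json

def accumulate_sse(lines):
    """Join the content deltas from `data: {...}` server-sent-event
    lines into the full streamed message."""
    out = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            out.append(delta)
    return "".join(out)

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
```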
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: qwen-turbo
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
modelstring · enumRequired
Possible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
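A sketch of Predicted Outputs: pass text you expect the model to largely reproduce, such as a file being regenerated with small edits. The top-level "prediction" parameter name follows the OpenAI-compatible convention and is an assumption here, as the page lists only its inner fields.

```python
# Predicted Outputs sketch; the "prediction" key is assumed from the
# OpenAI-compatible API shape. Inner fields match the docs: type + content.
original_file = "def greet(name):\n    return f'Hello, {name}!'\n"
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Rename greet to hello."}],
    "prediction": {"type": "content", "content": original_file},
}
```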

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
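The json_schema response format fields above fit together like this. A sketch with an illustrative schema; nesting name/strict/schema under a "json_schema" key follows the OpenAI-compatible convention and is an assumption here.

```python
# Structured-output sketch: force the model to emit JSON matching a schema.
# The "weather_report" schema is a hypothetical example.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Weather in Oslo as JSON."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "weather_report",
            "strict": True,  # only a subset of JSON Schema is supported
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "temp_c": {"type": "number"},
                },
                "required": ["city", "temp_c"],
                "additionalProperties": False,
            },
        },
    },
}
```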

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.
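Because the arguments string is model-generated JSON that may be invalid or contain parameters your schema never defined, validate it before dispatching. A sketch with sample data (the tool-call object and allowed keys are illustrative):

```python
import json

# Shape of one entry in message.tool_calls (sample data, not a live response).
tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}

def parse_arguments(call, allowed_keys):
    """Decode model-generated arguments, rejecting invalid JSON and
    dropping hallucinated parameters not in the function schema."""
    try:
        args = json.loads(call["function"]["arguments"])
    except json.JSONDecodeError:
        return None  # invalid JSON from the model: re-prompt or reject
    return {k: v for k, v in args.items() if k in allowed_keys}

args = parse_arguments(tool_call, allowed_keys={"city"})
```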

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
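As noted above, characters split across tokens must be re-joined at the byte level before decoding. A sketch with sample logprob entries:

```python
# Per-token "bytes" lists (UTF-8 byte values, possibly null) are concatenated
# and decoded once, so multi-token characters reassemble correctly.
tokens = [
    {"token": "Hel", "bytes": [72, 101, 108], "logprob": -0.01},
    {"token": "lo",  "bytes": [108, 111],     "logprob": -0.02},
]
raw = bytes(b for t in tokens if t["bytes"] for b in t["bytes"])
text = raw.decode("utf-8")
```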

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-coder-480b-a35b-instruct
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
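The usage fields above relate arithmetically: total_tokens is the prompt and completion counts combined. A sketch using the sample values shown:

```python
# Reading the usage block of a completion response (sample values from the
# examples above: 137 prompt + 914 completion = 1051 total).
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}
billed_tokens = usage["prompt_tokens"] + usage["completion_tokens"]
```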
    post
    /v1/chat/completions
    200Success
modelstring · enumRequired
    Possible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
include_usagebooleanRequired

    If set to True, an additional chunk will be streamed before the final data: [DONE] message, reporting token usage for the entire request.
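A streaming request sketch: with stream enabled, tokens arrive as server-sent events. Wrapping include_usage in a "stream_options" object follows the OpenAI-compatible convention and is an assumption here.

```python
# Streaming sketch; "stream_options" is assumed from the OpenAI-compatible
# API shape. With include_usage, a final usage chunk precedes [DONE].
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Tell me a short joke."}],
    "stream": True,
    "stream_options": {"include_usage": True},
}
```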
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
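As stated above, top_logprobs only takes effect when logprobs is True. A minimal request sketch:

```python
# Request per-token log probabilities with the 5 most likely alternatives
# at each position (top_logprobs must be 0-20, and requires logprobs=True).
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Say hi."}],
    "logprobs": True,
    "top_logprobs": 5,
}
```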

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
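Client code typically branches on finish_reason per the enum above. A dispatch sketch with a sample choice object (the action strings are illustrative):

```python
# Dispatch on choice.finish_reason (sample data, not a live response).
choice = {"finish_reason": "length", "message": {"content": "Truncated..."}}

if choice["finish_reason"] == "tool_calls":
    action = "execute the requested tool calls"
elif choice["finish_reason"] == "length":
    action = "retry with a higher max_completion_tokens"
elif choice["finish_reason"] == "content_filter":
    action = "handle filtered content"
else:  # "stop": natural end or a provided stop sequence
    action = "use the message content"
```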
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba-cloud/qwen3-next-80b-a3b-instruct
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
modelstring · enumRequired
    Possible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
include_usagebooleanRequired

    If set to True, an additional chunk will be streamed before the final data: [DONE] message, reporting token usage for the entire request.
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
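A hedged sketch of a Predicted Output request, regenerating a file with minor changes. The enum value "content" for type follows the common chat-completions convention and is an assumption, since the possible values are not listed on this page:

```python
# Predicted Output sketch: most of original_file should reappear verbatim
# in the response, letting it return faster.
original_file = "def greet():\n    print('hello')\n"

payload = {
    "model": "gpt-4o",  # placeholder model
    "messages": [{"role": "user", "content": "Rename greet to welcome."}],
    "prediction": {
        "type": "content",         # assumed enum value
        "content": original_file,  # the text being regenerated with minor changes
    },
}

assert payload["prediction"]["content"].startswith("def greet")
```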

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seed · integer · min: 1 · Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
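As a sketch, two requests built with the same seed and parameters are byte-identical, which is the precondition for the best-effort determinism described above (model name is a placeholder):

```python
def make_request(prompt: str, seed: int) -> dict:
    """Build a reproducible request body (model is a placeholder)."""
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
        "seed": seed,
        "temperature": 0,
    }

first = make_request("Pick a random fruit.", seed=42)
second = make_request("Pick a random fruit.", seed=42)
# Identical request bodies; per the Beta note, responses *should* match too.
assert first == second
```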

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
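For illustration, a json_schema response format with the documented naming rule checked client-side. The nesting of the name/strict/schema fields under a "json_schema" key follows the common chat-completions convention and is an assumption; the schema itself is made up:

```python
import re

# Illustrative json_schema response_format.
response_format = {
    "type": "json_schema",
    "json_schema": {          # assumed nesting key
        "name": "weather_report",  # a-z, A-Z, 0-9, _ or -, max 64 chars
        "strict": True,            # enforce exact schema adherence
        "schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "temp_c": {"type": "number"}},
            "required": ["city", "temp_c"],
        },
    },
}

# Validate the documented naming constraint before sending the request.
assert re.fullmatch(r"[A-Za-z0-9_-]{1,64}", response_format["json_schema"]["name"])
```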

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    logprobs · boolean · nullable · Optional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobs · number · max: 20 · nullable · Optional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
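A returned log probability can be converted back to a plain probability with the exponential; a quick sketch with an illustrative value:

```python
import math

# A logprob from the response (path below is illustrative), e.g.
# choices[0].logprobs.content[0].logprob. Note: -9999.0 is the documented
# sentinel for tokens outside the top 20.
logprob = -0.105

prob = math.exp(logprob)  # log probability -> probability
assert 0.0 < prob <= 1.0
assert round(prob, 2) == 0.90  # exp(-0.105) is roughly 0.900
```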

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
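Since the arguments string may be invalid JSON or contain hallucinated parameters, a minimal validation step before dispatching the call might look like this (the allowed parameter set is illustrative):

```python
import json

# Illustrative function schema: the only parameters our function defines.
ALLOWED_PARAMS = {"location", "unit"}

def parse_tool_arguments(arguments: str) -> dict:
    """Parse and sanity-check a tool call's arguments string."""
    try:
        args = json.loads(arguments)
    except json.JSONDecodeError:
        return {}  # model produced invalid JSON
    # Drop anything the function schema does not define.
    return {k: v for k, v in args.items() if k in ALLOWED_PARAMS}

ok = parse_tool_arguments('{"location": "Paris", "unit": "celsius", "mood": "sunny"}')
assert ok == {"location": "Paris", "unit": "celsius"}  # "mood" was hallucinated
assert parse_tool_arguments("not json") == {}
```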

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reason · string · enum · Required

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-max-preview
    prompt_tokens · number · Required

    Number of tokens in the prompt.

    Example: 137
    completion_tokens · number · Required

    Number of tokens in the generated completion.

    Example: 914
    total_tokens · number · Required

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
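The example values above illustrate the accounting: total_tokens is simply the sum of the prompt and completion counts.

```python
# Usage accounting from the example values above.
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}

assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
```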
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_used · number · Required

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    model · string · enum · Required

    Possible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    type · string · enum · Required

    Possible values:
    url · string · uri · Required

    Either a URL of the image or the base64 encoded image data. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    detail · string · enum · Optional

    Specifies the detail level of the image.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.
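A sketch of a user message mixing a text part and an image part. The nesting of url/detail under an "image_url" key and the "auto" detail value follow the common chat-completions convention and are assumptions here; the URL is a placeholder:

```python
# A multimodal user message: one text part, one image part.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this picture?"},
        {
            "type": "image_url",
            "image_url": {                              # assumed nesting key
                "url": "https://example.com/cat.png",   # or base64-encoded image data
                "detail": "auto",                        # assumed enum value
            },
        },
    ],
}

part_types = [part["type"] for part in message["content"]]
assert part_types == ["text", "image_url"]
```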

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.
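For illustration, a tool message returning a function's result to the model; it must carry the tool_call_id of the assistant tool call it answers (IDs and values are made up):

```python
import json

# Echoing a tool's result back to the model.
tool_result = {"temp_c": 21, "condition": "clear"}

tool_message = {
    "role": "tool",
    "tool_call_id": "call_abc123",       # copied from the assistant's tool call
    "content": json.dumps(tool_result),  # contents of the tool message
}

assert tool_message["role"] == "tool"
assert json.loads(tool_message["content"])["temp_c"] == 21
```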

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    stream · boolean · Optional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usage · boolean · Required
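A streaming request sketch. Sending include_usage under a stream_options object is an assumption based on common chat-completions conventions; the model name is a placeholder:

```python
# Streaming request body sketch.
payload = {
    "model": "gpt-4o",  # placeholder model
    "messages": [{"role": "user", "content": "Stream a haiku."}],
    "stream": True,                             # deliver chunks via server-sent events
    "stream_options": {"include_usage": True},  # assumed nesting for include_usage
}

assert payload["stream"] is True
```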
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choice · any of · Optional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.
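As the tool_choice description above states, a specific tool can be forced by passing an object instead of a mode string; a sketch with an illustrative tool:

```python
# tool_choice accepts either a mode string or an object forcing one tool.
mode = "auto"  # alternatives: "none", "required"
forced = {"type": "function", "function": {"name": "my_function"}}

payload = {
    "model": "gpt-4o",  # placeholder model
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [{
        "type": "function",
        "function": {"name": "my_function", "parameters": {"type": "object"}},
    }],
    "tool_choice": forced,  # the model must call my_function
}

assert payload["tool_choice"]["function"]["name"] == "my_function"
```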

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-vl-32b-instruct
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_used · number · Required

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
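A Predicted Output request can be sketched as follows, assuming the OpenAI-style `prediction` field name for the object described above; the code snippet being regenerated is a made-up example.

```python
# Sketch: Predicted Outputs. Supply the text you expect the model to largely
# reproduce (e.g. a file being regenerated with minor edits), so that matching
# tokens can be returned much more quickly.
existing_code = "def add(a, b):\n    return a + b\n"

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Rename the function to sum_two."}],
    "prediction": {"type": "content", "content": existing_code},
}
```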

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
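The logprobs options above go together: `top_logprobs` only takes effect when `logprobs` is enabled. A minimal sketch:

```python
# Sketch: requesting per-token log probabilities with 5 alternatives each.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Answer yes or no: is 7 prime?"}],
    "logprobs": True,   # must be True for top_logprobs to be honored
    "top_logprobs": 5,  # 0-20 most likely alternatives per token position
}
```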

    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.
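As the arguments field above warns, the model does not always emit valid JSON and may hallucinate parameters, so tool-call arguments should be validated before the function is invoked. A minimal sketch of such a guard (the helper name and required-key check are illustrative, not part of the API):

```python
import json

def parse_tool_args(raw: str, required: set) -> dict:
    """Parse model-generated JSON arguments, rejecting malformed or
    incomplete output before the real function is called."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"model produced invalid JSON: {e}") from e
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    missing = required - args.keys()
    if missing:
        raise ValueError(f"missing required arguments: {sorted(missing)}")
    return args

# A well-formed tool call parses cleanly:
ok = parse_tool_args('{"city": "Paris"}', {"city"})
```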

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.5-plus-20260218
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
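The usage fields above can be read straight off the response body. The sketch below uses a simulated response built from the example values in this reference; whether these fields sit under a top-level usage key follows the OpenAI convention and is an assumption here.

```python
# Sketch: reading token and cost accounting from a (simulated) response body.
response = {
    "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,   # prompt + completion
        "credits_used": 120000,
        "usd_spent": 0.06,
    }
}

usage = response["usage"]
# total_tokens is defined as the sum of prompt and completion tokens.
cost_per_1k_tokens = usage["usd_spent"] / usage["total_tokens"] * 1000
```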
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Supported image formats: JPG/JPEG, PNG, GIF, and WEBP.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.
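The message variants above compose into a single messages array. The sketch below shows a user message with mixed text and image content parts; the nesting of the url and detail fields under an `image_url` key, and the `type` values, follow the OpenAI convention and are assumptions here, and the image URL is a placeholder.

```python
# Sketch: a multimodal user message combining text and an image part.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You describe images concisely."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",  # type value assumed per OpenAI convention
                    "image_url": {
                        # May be an https URL or base64-encoded image data.
                        "url": "https://example.com/photo.png",
                        "detail": "low",
                    },
                },
            ],
        },
    ],
}
```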

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired

    If set to True, an additional chunk containing token usage statistics for the entire request will be streamed before the final data: [DONE] message.
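A streaming request combines the two options above. Nesting include_usage under a `stream_options` object follows the OpenAI convention and is an assumption here.

```python
# Sketch: enabling server-sent-event streaming with a final usage chunk.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a haiku about rain."}],
    "stream": True,  # chunks arrive as server-sent events while generating
    # Assumed nesting per OpenAI convention: requests a final chunk
    # carrying usage statistics for the whole request.
    "stream_options": {"include_usage": True},
}
```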
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.6-27b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Supported image formats: JPG/JPEG, PNG, GIF, and WEBP.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    dataany ofRequired

    Either a URL of the audio or the base64 encoded audio data.

    string · uriOptional
    or
    stringOptional
    formatstring · enumRequired

    The format of the encoded audio data. Currently supports "wav" and "mp3".

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    urlstring · uriRequired

    Either a URL of the video or the base64 encoded video data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    idstringRequired

    Unique identifier for a previous audio response from the model.
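This endpoint variant also accepts audio and video content parts in user messages. A minimal sketch follows; the `type` values and the nesting of data/format under an `input_audio` key follow the OpenAI convention and are assumptions here, the base64 payload is synthetic, and `gpt-4o` stands in for an audio-capable model.

```python
import base64

# Synthetic stand-in for real WAV bytes, base64-encoded as the API expects.
fake_wav_b64 = base64.b64encode(b"RIFF....WAVEfmt ").decode()

payload = {
    "model": "gpt-4o",  # placeholder; use an audio-capable model
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe this clip."},
            {
                "type": "input_audio",  # type value assumed per OpenAI convention
                "input_audio": {
                    "data": fake_wav_b64,  # or a URL to the audio
                    "format": "wav",       # "wav" or "mp3"
                },
            },
        ],
    }],
}
```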

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired

    If set to True, an additional chunk containing token usage statistics for the entire request will be streamed before the final data: [DONE] message.
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    formatstring · enumRequired

    Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.

    Possible values:
    voiceany ofRequired

    The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.

    string · enumOptionalPossible values:
    or
    stringOptional
    itemsstring · enumOptionalPossible values:
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
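    Since the guidance above is to adjust temperature or top_p but not both, a small helper can make that choice explicit. This is a sketch, not part of any SDK:

```python
def sampling_kwargs(creative: bool) -> dict:
    """Return exactly one sampling knob, per the recommendation above."""
    # Higher temperature -> more random output; a low top_p restricts
    # sampling to the tokens carrying the top probability mass.
    return {"temperature": 0.9} if creative else {"top_p": 0.1}

focused = sampling_kwargs(creative=False)      # {"top_p": 0.1}
exploratory = sampling_kwargs(creative=True)   # {"temperature": 0.9}
```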

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
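    Because top_logprobs is only valid when logprobs is True, it helps to set the pair together. A hypothetical helper enforcing the documented constraints:

```python
def with_logprobs(kwargs: dict, top_n: int) -> dict:
    """Attach logprob options, enforcing the documented constraints."""
    if not 0 <= top_n <= 20:
        raise ValueError("top_logprobs must be an integer between 0 and 20")
    # logprobs must be True whenever top_logprobs is supplied.
    return {**kwargs, "logprobs": True, "top_logprobs": top_n}

request_kwargs = with_logprobs(
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hi"}]},
    top_n=5,
)
```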

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
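    A sketch of the prediction field for Predicted Outputs: the content is text you expect the response to largely reproduce, such as a file being regenerated with minor changes. The field nesting follows the OpenAI-style convention this endpoint mirrors.

```python
# The file being lightly edited; tokens matching it can be returned faster.
existing_source = "def greet():\n    print('hello')\n"

prediction = {
    "type": "content",           # the predicted-content type described above
    "content": existing_source,  # may also be a list of text content parts
}

request_kwargs = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Rename greet to say_hello."}],
    "prediction": prediction,
}
```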

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    enable_thinkingbooleanOptional

    Specifies whether to use the thinking mode.

    Default: false
    thinking_budgetinteger · min: 1Optional

    The maximum reasoning length, effective only when enable_thinking is set to true.
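    enable_thinking and thinking_budget travel together: the budget is ignored unless thinking mode is on. A sketch, using the model id from this endpoint's response example:

```python
request_kwargs = {
    "model": "alibaba/qwen3.5-omni-plus",
    "messages": [{"role": "user", "content": "Is 221 prime? Explain briefly."}],
    "enable_thinking": True,  # default is False
    "thinking_budget": 2048,  # max reasoning tokens; effective only when
                              # enable_thinking is True, and must be >= 1
}
```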

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
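    The three response_format variants above are selected by their type; the json_schema variant also carries the schema itself. A sketch (the person schema is illustrative, and the json_schema wrapper key follows the OpenAI-compatible convention):

```python
# Structured output: the model must emit JSON matching this schema.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "person",  # a-z, A-Z, 0-9, underscores/dashes, max 64 chars
        "strict": True,    # exact schema adherence (JSON Schema subset)
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
            "required": ["name", "age"],
            "additionalProperties": False,
        },
    },
}

# Simpler variants: {"type": "text"} (the default) or {"type": "json_object"}.
```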

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.5-omni-plus
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Supported image formats are JPG/JPEG, PNG, GIF, and WEBP.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    dataany ofRequired

    Either a URL of the audio or the base64 encoded audio data.

    string · uriOptional
    or
    stringOptional
    formatstring · enumRequired

    The format of the encoded audio data. Currently supports "wav" and "mp3".

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    urlstring · uriRequired

    Either a URL of the video or the base64 encoded video data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    idstringRequired

    Unique identifier for a previous audio response from the model.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
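    With stream set to True the response arrives as server-sent events whose content deltas must be concatenated client-side. A sketch of the accumulation step; the commented iteration mirrors the quickstart client:

```python
def accumulate(deltas) -> str:
    """Join streamed content deltas into the full message text.

    Deltas may be None (e.g. chunks carrying only role or usage data).
    """
    return "".join(d or "" for d in deltas)

# With the quickstart client you would feed it like this:
#   stream = client.chat.completions.create(
#       model="gpt-4o", messages=[...], stream=True)
#   text = accumulate(chunk.choices[0].delta.content for chunk in stream)

assert accumulate(["Hel", None, "lo"]) == "Hello"
```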
    include_usagebooleanRequired

    If set to True, an additional chunk will be streamed containing the usage statistics for the entire request.
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    formatstring · enumRequired

    Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.

    Possible values:
    voiceany ofRequired

    The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.

    string · enumOptionalPossible values:
    or
    stringOptional
    itemsstring · enumOptionalPossible values:
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    enable_thinkingbooleanOptional

    Specifies whether to use the thinking mode.

    Default: false
    thinking_budgetinteger · min: 1Optional

    The maximum reasoning length, effective only when enable_thinking is set to true.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.5-omni-flash
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

type · string · enum · Required. Possible values:

max_tokens · number · Optional

The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via the API.

Default: 32000

temperature · number · max: 1 · Optional

Amount of randomness injected into the response. Defaults to 1.0; ranges from 0.0 to 1.0. Use a temperature closer to 0.0 for analytical / multiple-choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.

top_p · number · max: 1 · Optional

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.

top_k · number · Optional

Only sample from the top K options for each subsequent token. Used to remove "long tail" low-probability responses. Recommended for advanced use cases only; you usually only need temperature.
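To make the top_p description concrete, here is a toy sketch of nucleus filtering over a hypothetical next-token distribution (an illustration of the idea, not the server's actual sampler):

```python
def nucleus_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, scanning from most to least likely."""
    kept, cumulative = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

dist = {"the": 0.5, "a": 0.3, "xylophone": 0.15, "zzz": 0.05}
print(nucleus_filter(dist, 0.8))  # {'the': 0.5, 'a': 0.3}
```

With top_p = 0.8, the two most likely tokens already cover 80% of the mass, so the "long tail" is never considered.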

    Responses
200 Success
id · string · Required

A unique identifier for the chat completion.

Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

object · string · enum · Required

The object type.

Example: chat.completion. Possible values:
created · number · Required

The Unix timestamp (in seconds) of when the chat completion was created.

Example: 1762343744

index · number · Required

The index of the choice in the list of choices.

Example: 0

role · string · Required

The role of the author of this message.

Example: assistant

content · string · Required

The contents of the message.

Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?

refusal · string · nullable · Optional

The refusal message generated by the model.

type · string · enum · Required

The type of the URL citation. Always url_citation.

Possible values:

end_index · integer · Required

The index of the last character of the URL citation in the message.

start_index · integer · Required

The index of the first character of the URL citation in the message.

title · string · Required

The title of the web resource.

url · string · Required

The URL of the web resource.

id · string · Required

Unique identifier for this audio response.

data · string · Required

Base64-encoded audio bytes generated by the model, in the format specified in the request.

transcript · string · Required

Transcript of the audio generated by the model.

expires_at · integer · Required

The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
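A short sketch of recovering the raw audio from the base64 data field; the field values here are placeholders, not real API output:

```python
import base64

# Mirrors the audio response fields documented above; values are made up.
message_audio = {
    "id": "audio_abc123",
    "data": base64.b64encode(b"RIFFxxxxWAVEfmt ").decode("ascii"),  # placeholder bytes
    "transcript": "Hello there!",
    "expires_at": 1762347344,
}

def decode_audio(audio: dict) -> bytes:
    """Return the raw audio bytes from the base64 `data` field,
    ready to be written to a file in the format requested."""
    return base64.b64decode(audio["data"])

raw = decode_audio(message_audio)
print(raw[:4])  # b'RIFF'
```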

id · string · Required

The ID of the tool call.

type · string · enum · Required

The type of the tool.

Possible values:

arguments · string · Required

The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name · string · Required

The name of the function to call.
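Given that warning, a defensive sketch for handling the arguments string before invoking your function (the allowed-keys set stands in for your function's schema):

```python
import json

def parse_tool_arguments(raw: str, allowed: set[str]) -> dict:
    """Parse the model-generated JSON arguments and drop any keys
    outside the declared schema; invalid JSON degrades to {}."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    if not isinstance(args, dict):
        return {}
    return {k: v for k, v in args.items() if k in allowed}

# A well-formed call keeps only schema-declared keys:
print(parse_tool_arguments('{"city": "Paris", "mood": "sunny"}', {"city"}))
# Truncated / invalid JSON degrades to an empty argument set:
print(parse_tool_arguments('{"city": ', {"city"}))
```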

or

id · string · Required

The ID of the tool call.

type · string · enum · Required

The type of the tool.

Possible values:

input · string · Required

The input for the custom tool call generated by the model.

name · string · Required

The name of the custom tool to call.

finish_reason · string · enum · Required

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

Possible values:
bytes · integer[] · Required

A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no byte representation for the token.

logprob · number · Required

The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

The token.
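As the description notes, a multi-byte character can be split across tokens, so the bytes fields must be concatenated before decoding. A sketch (the token values are made up):

```python
# Hypothetical logprob entries where "ü" is split into its two UTF-8 bytes.
logprob_entries = [
    {"token": "Gr", "logprob": -0.1, "bytes": [71, 114]},
    {"token": "ü", "logprob": -0.2, "bytes": [195, 188]},
    {"token": "n", "logprob": -0.3, "bytes": [110]},
]

def join_token_bytes(entries: list[dict]) -> str:
    """Concatenate each token's UTF-8 bytes, then decode once at the end,
    skipping entries whose `bytes` is null."""
    buf = bytearray()
    for e in entries:
        if e["bytes"] is not None:
            buf.extend(e["bytes"])
    return buf.decode("utf-8")

print(join_token_bytes(logprob_entries))  # Grün
```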

bytes · integer[] · nullable · Optional

A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no byte representation for the token.

logprob · number · Required

The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

The token.

bytes · integer[] · Required

A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no byte representation for the token.

logprob · number · Required

The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

The token.

bytes · integer[] · nullable · Optional

A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no byte representation for the token.

logprob · number · Required

The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

The token.

model · string · Required

The model used for the chat completion.

Example: anthropic/claude-opus-4-5

prompt_tokens · number · Required

Number of tokens in the prompt.

Example: 137

completion_tokens · number · Required

Number of tokens in the generated completion.

Example: 914

total_tokens · number · Required

Total number of tokens used in the request (prompt + completion).

Example: 1051
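The three counters are related by simple arithmetic that a client can sanity-check; the numbers mirror the examples above:

```python
# Usage block as returned in a chat completion response (example values).
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}

def check_usage(u: dict) -> bool:
    """total_tokens should equal prompt_tokens + completion_tokens."""
    return u["total_tokens"] == u["prompt_tokens"] + u["completion_tokens"]

print(check_usage(usage))  # True
```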
accepted_prediction_tokens · integer · nullable · Optional

When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

audio_tokens · integer · nullable · Optional

Audio input tokens generated by the model.

reasoning_tokens · integer · nullable · Optional

Tokens generated by the model for reasoning.

rejected_prediction_tokens · integer · nullable · Optional

When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

audio_tokens · integer · nullable · Optional

Audio input tokens present in the prompt.

cached_tokens · integer · nullable · Optional

Cached tokens present in the prompt.

credits_used · number · Required

The number of credits consumed during generation.

Example: 120000

usd_spent · number · Required

The total amount of money spent by the user in USD.

Example: 0.06
POST /v1/chat/completions

200 Success

model · string · enum · Required. Possible values:
role · string · enum · Required

The role of the author of the message — in this case, the user.

Possible values:

content · any of · Required

The contents of the user message.

string · Optional

or

type · string · enum · Required

The type of the content part.

Possible values:

text · string · Required

The text content.

name · string · Optional

An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

role · string · enum · Required

The role of the author of the message — in this case, the system.

Possible values:

content · any of · Required

The contents of the system message.

string · Optional

or

type · string · enum · Required

The type of the content part.

Possible values:

text · string · Required

The text content.

name · string · Optional

An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

role · string · enum · Required

The role of the author of the message — in this case, the tool.

Possible values:

content · string · Required

The contents of the tool message.

tool_call_id · string · Required

Tool call that this message is responding to.

name · string · nullable · Optional

An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

role · string · enum · Required

The role of the author of the message — in this case, the Assistant.

Possible values:

content · any of · Optional

The contents of the Assistant message. Required unless tool_calls or function_call is specified.

string · Optional

The contents of the Assistant message.

or

items · any of · Optional

type · string · enum · Required

The type of the content part.

Possible values:

text · string · Required

The text content.

or

refusal · string · Required

The refusal message generated by the model.

type · string · enum · Required

The type of the content part.

Possible values:

name · string · Optional

An optional name for the participant. Provides the model information to differentiate between participants of the same role.

id · string · Required

The ID of the tool call.

type · string · enum · Required

The type of the tool. Currently, only function is supported.

Possible values:

name · string · Required

The name of the function to call.

arguments · string · Required

The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

refusal · string · nullable · Optional

The refusal message by the Assistant.

max_completion_tokens · integer · min: 1 · Optional

An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

max_tokens · number · min: 1 · Optional

The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via the API.

stream · boolean · Optional

If set to true, the model response data is streamed to the client as it is generated, using server-sent events.

Default: false

include_usage · boolean · Required
type · string · enum · Required

The type of the tool. Currently, only function is supported.

Possible values:

description · string · Optional

A description of what the function does, used by the model to choose when and how to call the function.

name · string · Required

The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

Other properties · any · nullable · Optional

The parameters the function accepts, described as a JSON Schema object.

strict · boolean · nullable · Optional

Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true.

tool_choice · any of · Optional

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.

string · enum · Optional

none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

Possible values:

or

type · string · enum · Required

The type of the tool. Currently, only function is supported.

Possible values:

name · string · Required

The name of the function to call.
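Putting the tools and tool_choice shapes together, a sketch that declares one hypothetical function and forces the model to call it (the function name and schema are illustrative, not part of the API):

```python
# One function tool, described as a JSON Schema object.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [weather_tool],
    # Force this specific tool instead of letting the model decide ("auto"):
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```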

parallel_tool_calls · boolean · Optional

Whether to enable parallel function calling during tool use.

temperature · number · max: 2 · Optional

What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p, but not both.

top_p · number · min: 0.01 · max: 1 · Optional

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.

stop · any of · Optional

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

string · Optional

or

string[] · Optional

or

any · nullable · Optional
frequency_penalty · number · min: -2 · max: 2 · nullable · Optional

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

type · string · enum · Required

The type of the predicted content you want to provide.

Possible values:

content · any of · Required

The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

string · Optional

The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

or

type · string · enum · Required

The type of the content part.

Possible values:

text · string · Required

The text content.
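A sketch of a Predicted Output request, regenerating a small file with minor edits (the file contents are illustrative):

```python
# The unchanged original is supplied as the prediction; matching output
# tokens can then be returned much more quickly.
original_file = "def greet():\n    print('hello')\n"

request = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": "Rename greet to greet_user in this file:\n" + original_file,
        }
    ],
    "prediction": {"type": "content", "content": original_file},
}
```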

presence_penalty · number · min: -2 · max: 2 · nullable · Optional

Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

seed · integer · min: 1 · Optional

This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

response_format · one of · Optional

An object specifying the format that the model must output.

type · string · enum · Required

The type of response format being defined. Always text.

Possible values:

or

type · string · enum · Required

The type of response format being defined. Always json_object.

Possible values:

or

type · string · enum · Required

The type of response format being defined. Always json_schema.

Possible values:

name · string · Required

The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

Other properties · any · nullable · Optional

strict · boolean · nullable · Optional

Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true.

description · string · Optional

A description of what the response format is for, used by the model to determine how to respond in the format.
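A sketch of the json_schema variant with strict adherence enabled; the schema itself ("recipe") is illustrative:

```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "recipe",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "minutes": {"type": "integer"},
            },
            "required": ["title", "minutes"],
            "additionalProperties": False,
        },
    },
}

request = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Give me a 10-minute pasta recipe."}],
    "response_format": response_format,
}

# With strict schema adherence, the assistant's message content can be
# parsed directly, e.g.: json.loads(response.choices[0].message.content)
```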

    Responses
200 Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-0.3b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

credits_used · number · Required

The number of credits consumed during generation.

Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
POST /v1/chat/completions

200 Success

model · string · enum · Required. Possible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
type · string · enum · Required. Possible values:

url · string · uri · Required

Either a URL of the image or the base64-encoded image data.

detail · string · enum · Optional

Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

Possible values:

or

type · string · enum · Required

The type of the content part.

Possible values:

file_data · string · Optional

The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.

filename · string · Optional

The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
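A sketch of assembling image and PDF content parts for a user message, assuming the OpenAI-style part wrappers (image_url and file); the binary data here are placeholders, and real requests would read actual files:

```python
import base64

png_bytes = b"\x89PNG\r\n\x1a\n"       # placeholder image data
pdf_bytes = b"%PDF-1.7 placeholder"     # placeholder document data

image_part = {
    "type": "image_url",
    "image_url": {
        # A data URL embeds the base64 image directly in the request:
        "url": "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii"),
        "detail": "auto",
    },
}

file_part = {
    "type": "file",
    "file": {
        "filename": "report.pdf",
        "file_data": "data:application/pdf;base64,"
        + base64.b64encode(pdf_bytes).decode("ascii"),
    },
}

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize the attached image and PDF."},
        image_part,
        file_part,
    ],
}
```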

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
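The json_schema variant above can be sketched as follows, assuming the OpenAI-compatible wrapper object; the schema contents are illustrative.

```python
# Sketch of a json_schema response_format with strict adherence enabled.
# The name must be a-z, A-Z, 0-9, underscores, or dashes (max 64 chars).
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",
        "description": "A structured weather summary.",
        "strict": True,  # model must follow the exact schema below
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
            "additionalProperties": False,
        },
    },
}
```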

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
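As the bytes descriptions above note, a character can be split across tokens, so the per-token byte arrays must be concatenated before decoding. A small sketch with illustrative entries:

```python
# Reassemble UTF-8 text from per-token `bytes` arrays. The two entries
# below are illustrative: together they encode one emoji (U+1F600).
token_entries = [
    {"token": "<0xF0 0x9F>", "bytes": [240, 159], "logprob": -0.12},
    {"token": "<0x98 0x80>", "bytes": [152, 128], "logprob": -0.05},
]

raw = bytearray()
for entry in token_entries:
    if entry["bytes"] is not None:  # `bytes` can be null for some tokens
        raw.extend(entry["bytes"])

text = raw.decode("utf-8")  # only valid once all fragments are joined
```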

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-vl-28b-a3b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
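The usage fields above can be read back like this, using the example values from this page; credits_used and usd_spent are the billing extensions documented above.

```python
# Reading the usage block of a completion response (example values from
# this page). total_tokens is always prompt + completion.
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}
billing = {"credits_used": 120000, "usd_spent": 0.06}

assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

# Derived figure for cost tracking: USD per generated token.
usd_per_completion_token = billing["usd_spent"] / usage["completion_tokens"]
```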
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequired
    Possible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.
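The four message roles described above combine into one messages array. A minimal sketch; the names, contents, and the tool_call_id are hypothetical:

```python
# A multi-turn messages array using the user, system, assistant, and
# tool roles. All values below are illustrative.
messages = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "What's 2 + 2?", "name": "alice"},
    {"role": "assistant", "content": "4"},
]

# A tool message must reference the tool call it is responding to:
messages.append({
    "role": "tool",
    "tool_call_id": "call_abc123",  # hypothetical ID from a prior response
    "content": '{"result": 4}',
})
```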

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
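Streaming with usage reporting can be sketched as follows, assuming include_usage sits in a stream_options object as in the OpenAI-compatible API. The chunks below simulate the server-sent events a client would consume.

```python
# Request sketch: stream deltas and ask for a final usage-only chunk.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hi"}],
    "stream": True,
    "stream_options": {"include_usage": True},
}

# Simulated SSE chunks: content deltas, then a usage-only chunk with
# an empty choices list.
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}], "usage": None},
    {"choices": [{"delta": {"content": "lo"}}], "usage": None},
    {"choices": [], "usage": {"total_tokens": 12}},
]

text = "".join(
    c["choices"][0]["delta"].get("content", "")
    for c in chunks if c["choices"]
)
```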
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.
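The tool definition and the forced tool_choice shape above can be sketched together. The function name and schema are illustrative, and — as the arguments description warns — model-generated arguments should be validated before use:

```python
import json

# Illustrative tool definition; only the "function" tool type is supported.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Force the model to call this specific tool:
tool_choice = {"type": "function", "function": {"name": "get_weather"}}

# Arguments arrive as a JSON string and may be invalid or hallucinated;
# parse and check them before calling your function.
raw_arguments = '{"city": "Berlin"}'  # sample model output
args = json.loads(raw_arguments)
assert "city" in args
```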

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
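The sampling parameters above can be combined as in this sketch, following the recommendation to adjust temperature or top_p but not both; the stop sequences are placeholders.

```python
# Request sketch for sampling control. Per the guidance above, set
# temperature *or* top_p, not both.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List three colors."}],
    "temperature": 0.2,       # lower = more focused and deterministic
    "stop": ["\n\n", "END"],  # up to 4 sequences, never echoed in output
    "max_tokens": 100,
}
```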
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
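Branching on finish_reason can be sketched as follows; the suggested handling strings are recommendations, not API-defined behavior.

```python
# Sketch: dispatch on the finish_reason values listed above.
def handle_choice(choice):
    reason = choice["finish_reason"]
    if reason == "length":
        return "truncated: raise max_tokens or continue in a follow-up turn"
    if reason == "tool_calls":
        return "run the requested tools, then reply with tool messages"
    if reason == "content_filter":
        return "content was omitted by the content filters"
    return "complete"  # "stop": natural stop point or a stop sequence

result = handle_choice({"finish_reason": "length"})
```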
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-300b-a47b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequired
    Possible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.
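Since start_index and end_index are character offsets into the message content, a cited span can be recovered by slicing. A small sketch with made-up data; note this reference describes end_index as the index of the last character, so the sketch treats it as inclusive:

```python
# Hypothetical message content and url_citation annotation,
# using the fields documented above.
content = "Rust 1.0 was released in 2015 according to the release notes."
citation = {
    "type": "url_citation",
    "start_index": 0,
    "end_index": 28,   # index of the LAST character of the cited span
    "title": "Rust release notes",
    "url": "https://example.com/rust-releases",
}

# end_index is inclusive per the field description, hence the +1.
cited = content[citation["start_index"] : citation["end_index"] + 1]
print(cited)  # Rust 1.0 was released in 2015
```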

id · string · Required

Unique identifier for this audio response.

data · string · Required

Base64-encoded audio bytes generated by the model, in the format specified in the request.

transcript · string · Required

Transcript of the audio generated by the model.

expires_at · integer · Required

The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

id · string · Required

The ID of the tool call.

type · string · enum · Required

The type of the tool.

Possible values: function

arguments · string · Required

The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name · string · Required

The name of the function to call.

or

id · string · Required

The ID of the tool call.

type · string · enum · Required

The type of the tool.

Possible values: custom

input · string · Required

The input for the custom tool call generated by the model.

name · string · Required

The name of the custom tool to call.
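Because the model does not always emit valid JSON for arguments, the description above recommends validating them before calling your function. A defensive parse can look like the following; the helper name and the expected parameter set are hypothetical, not part of this reference:

```python
import json

def parse_tool_arguments(raw, required_keys):
    """Return the parsed arguments dict, or None if they are malformed.

    Hypothetical helper: rejects invalid JSON, non-object payloads,
    and hallucinated or missing parameters.
    """
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(args, dict):
        return None
    # Reject hallucinated extras or missing parameters before dispatch.
    if set(args) != set(required_keys):
        return None
    return args

# A well-formed call for a hypothetical get_weather(city) function:
ok = parse_tool_arguments('{"city": "Paris"}', {"city"})
# A hallucinated extra parameter is rejected:
bad = parse_tool_arguments('{"city": "Paris", "mood": "sunny"}', {"city"})
print(ok, bad)  # {'city': 'Paris'} None
```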

finish_reason · string · enum · Required

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

Possible values: stop, length, content_filter, tool_calls

bytes · integer[] · Required

A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

The token.

bytes · integer[] · nullable · Optional

A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

The token.

bytes · integer[] · Required

A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

The token.

bytes · integer[] · nullable · Optional

A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

The token.
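As the bytes field notes, one character can be split across multiple tokens, so the byte arrays must be concatenated before decoding. A minimal sketch with made-up tokens whose UTF-8 bytes for "é" (0xC3 0xA9) are split across two entries:

```python
# Hypothetical logprob entries: the two bytes of "é" split across tokens.
# The token strings here are placeholders; only the bytes matter.
tokens = [
    {"token": "<frag1>", "logprob": -0.10, "bytes": [195]},  # first byte of "é"
    {"token": "<frag2>", "logprob": -0.20, "bytes": [169]},  # second byte
]

# Concatenate all byte arrays, then decode once. Decoding each token
# individually would fail, since neither fragment is valid UTF-8 alone.
raw = bytes(b for t in tokens for b in (t["bytes"] or []))
text = raw.decode("utf-8")
print(text)  # é
```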

model · string · Required

The model used for the chat completion.

Example: baidu/ernie-5-0-thinking-preview

prompt_tokens · number · Required

Number of tokens in the prompt.

Example: 137

completion_tokens · number · Required

Number of tokens in the generated completion.

Example: 914

total_tokens · number · Required

Total number of tokens used in the request (prompt + completion).

Example: 1051

accepted_prediction_tokens · integer · nullable · Optional

When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

audio_tokens · integer · nullable · Optional

Audio tokens generated by the model.

reasoning_tokens · integer · nullable · Optional

Tokens generated by the model for reasoning.

rejected_prediction_tokens · integer · nullable · Optional

When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

audio_tokens · integer · nullable · Optional

Audio input tokens present in the prompt.

cached_tokens · integer · nullable · Optional

Cached tokens present in the prompt.

credits_used · number · Required

The number of credits consumed during generation.

Example: 120000

usd_spent · number · Required

The total amount of money spent by the user in USD.

Example: 0.06
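The usage fields obey simple arithmetic: total_tokens is the sum of prompt and completion tokens, as the examples above show (137 + 914 = 1051). A quick consistency check, using the example values from this reference:

```python
# Usage values taken from the examples documented above.
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}

# total_tokens = prompt_tokens + completion_tokens, per the field descriptions.
consistent = usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(consistent)  # True
```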
post
/v1/chat/completions

model · string · enum · Required
Possible values: baidu/ernie-5-0-thinking-latest

Request parameters and response fields for this model are identical to the schema documented above.
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses

    200 Success

    id (string, Required): A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

    object (string · enum, Required): The object type. Example: chat.completion

    created (number, Required): The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744

    index (number, Required): The index of the choice in the list of choices. Example: 0

    role (string, Required): The role of the author of this message. Example: assistant

    content (string, Required): The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?

    refusal (string · nullable, Optional): The refusal message generated by the model.

    type (string · enum, Required): The type of the URL citation. Always url_citation.
    end_index (integer, Required): The index of the last character of the URL citation in the message.
    start_index (integer, Required): The index of the first character of the URL citation in the message.
    title (string, Required): The title of the web resource.
    url (string, Required): The URL of the web resource.

    id (string, Required): Unique identifier for this audio response.
    data (string, Required): Base64 encoded audio bytes generated by the model, in the format specified in the request.
    transcript (string, Required): Transcript of the audio generated by the model.
    expires_at (integer, Required): The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    id (string, Required): The ID of the tool call.
    type (string · enum, Required): The type of the tool.
    arguments (string, Required): The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
    name (string, Required): The name of the function to call.

    or

    id (string, Required): The ID of the tool call.
    type (string · enum, Required): The type of the tool.
    input (string, Required): The input for the custom tool call generated by the model.
    name (string, Required): The name of the custom tool to call.

    finish_reason (string · enum, Required): The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    bytes (integer[], Required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
    logprob (number, Required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    token (string, Required): The token.

    bytes (integer[] · nullable, Optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
    logprob (number, Required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    token (string, Required): The token.

    bytes (integer[], Required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
    logprob (number, Required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    token (string, Required): The token.

    bytes (integer[] · nullable, Optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
    logprob (number, Required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    token (string, Required): The token.

    model (string, Required): The model used for the chat completion. Example: baidu/ernie-x1-turbo-32k

    prompt_tokens (number, Required): Number of tokens in the prompt. Example: 137
    completion_tokens (number, Required): Number of tokens in the generated completion. Example: 914
    total_tokens (number, Required): Total number of tokens used in the request (prompt + completion). Example: 1051

    accepted_prediction_tokens (integer · nullable, Optional): When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
    audio_tokens (integer · nullable, Optional): Audio input tokens generated by the model.
    reasoning_tokens (integer · nullable, Optional): Tokens generated by the model for reasoning.
    rejected_prediction_tokens (integer · nullable, Optional): When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
    audio_tokens (integer · nullable, Optional): Audio input tokens present in the prompt.
    cached_tokens (integer · nullable, Optional): Cached tokens present in the prompt.

    credits_used (number, Required): The number of credits consumed during generation. Example: 120000
    usd_spent (number, Required): The total amount of money spent by the user in USD. Example: 0.06
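The `arguments` field above is explicitly documented as model-generated JSON that may be invalid or contain hallucinated parameters. A minimal sketch of the validation the docs recommend before dispatching a tool call (the `allowed_params` set stands in for whatever your own function schema defines):

```python
import json

def parse_tool_arguments(arguments: str, allowed_params: set) -> dict:
    """Decode and sanity-check a tool call's `arguments` string before use."""
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError as err:
        # The model does not always generate valid JSON.
        raise ValueError(f"model produced invalid JSON arguments: {err}") from err
    if not isinstance(parsed, dict):
        raise ValueError("arguments must decode to a JSON object")
    # The model may hallucinate parameters not defined by your function schema.
    unknown = set(parsed) - allowed_params
    if unknown:
        raise ValueError(f"arguments contain parameters not in the schema: {unknown}")
    return parsed

args = parse_tool_arguments('{"city": "Paris"}', {"city", "units"})
print(args)
```

Only after this check would you call your actual function with the parsed arguments.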
    post /v1/chat/completions

    200 Success

    model (string · enum, Required)

    role (string · enum, Required): The role of the author of the message, in this case the user.

    content (any of, Required): The contents of the user message. Either a plain string, or an array of content parts:

    type (string · enum, Required): The type of the content part.
    text (string, Required): The text content.

    or

    type (string · enum, Required)
    url (string · uri, Required): Either a URL of the image or the base64 encoded image data.
    detail (string · enum, Optional): Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    name (string, Optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role (string · enum, Required): The role of the author of the message, in this case the system.

    content (any of, Required): The contents of the system message. Either a plain string, or an array of text parts:

    type (string · enum, Required): The type of the content part.
    text (string, Required): The text content.

    name (string, Optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role (string · enum, Required): The role of the author of the message, in this case the tool.

    content (string, Required): The contents of the tool message.
    tool_call_id (string, Required): Tool call that this message is responding to.
    name (string · nullable, Optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role (string · enum, Required): The role of the author of the message, in this case the Assistant.

    content (any of, Optional): The contents of the Assistant message. Required unless tool_calls or function_call is specified. Either a plain string, or an array of content parts:

    type (string · enum, Required): The type of the content part.
    text (string, Required): The text content.

    or

    refusal (string, Required): The refusal message generated by the model.
    type (string · enum, Required): The type of the content part.

    name (string, Optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    tool_calls:
    id (string, Required): The ID of the tool call.
    type (string · enum, Required): The type of the tool. Currently, only function is supported.
    name (string, Required): The name of the function to call.
    arguments (string, Required): The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusal (string · nullable, Optional): The refusal message by the Assistant.

    max_tokens (number · min: 1, Optional): The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    stream (boolean, Optional, default: false): If set to true, the model response data will be streamed to the client as it is generated using server-sent events.

    include_usage (boolean, Required)

    tools:
    type (string · enum, Required): The type of the tool. Currently, only function is supported.
    description (string, Optional): A description of what the function does, used by the model to choose when and how to call the function.
    name (string, Required): The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
    parameters (object, Optional): The parameters the function accepts, described as a JSON Schema object.
    strict (boolean · nullable, Optional): Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true.

    tool_choice (any of, Optional): Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present. Either a string enum (none, auto, or required), or an object:

    type (string · enum, Required): The type of the tool. Currently, only function is supported.
    name (string, Required): The name of the function to call.

    parallel_tool_calls (boolean, Optional): Whether to enable parallel function calling during tool use.

    temperature (number · max: 2, Optional): What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_p (number · min: 0.01 · max: 1, Optional): An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    frequency_penalty (number · min: -2 · max: 2 · nullable, Optional): Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    prediction:
    type (string · enum, Required): The type of the predicted content you want to provide.
    content (any of, Required): The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly. Either a plain string (often the text of a file you are regenerating with minor changes), or an array of text parts:
    type (string · enum, Required): The type of the content part.
    text (string, Required): The text content.

    seed (integer · min: 1, Optional): This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penalty (number · min: -2 · max: 2 · nullable, Optional): Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    response_format (one of, Optional): An object specifying the format that the model must output.

    type (string · enum, Required): The type of response format being defined. Always text.

    or

    type (string · enum, Required): The type of response format being defined. Always json_object.

    or

    type (string · enum, Required): The type of response format being defined. Always json_schema.
    name (string, Required): The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
    schema (any · nullable, Optional)
    strict (boolean · nullable, Optional): Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true.
    description (string, Optional): A description of what the response format is for, used by the model to determine how to respond in the format.
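The tool_choice behavior documented above can be sketched as a request payload that forces one particular function call. The `get_weather` tool is a hypothetical example; only the payload shape comes from the docs.

```python
# Hypothetical tool definition, following the JSON Schema shape described above.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_kwargs = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [tool],
    # Forces the model to call get_weather rather than answer in plain text.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,
}
print(request_kwargs["tool_choice"])
```

These kwargs would be passed to `client.chat.completions.create(**request_kwargs)` with the OpenAI-compatible client shown in the quickstart.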

    Responses

    200 Success

    id (string, Required): A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

    object (string · enum, Required): The object type. Example: chat.completion

    created (number, Required): The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744

    index (number, Required): The index of the choice in the list of choices. Example: 0

    role (string, Required): The role of the author of this message. Example: assistant

    content (string, Required): The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?

    refusal (string · nullable, Optional): The refusal message generated by the model.

    type (string · enum, Required): The type of the URL citation. Always url_citation.
    end_index (integer, Required): The index of the last character of the URL citation in the message.
    start_index (integer, Required): The index of the first character of the URL citation in the message.
    title (string, Required): The title of the web resource.
    url (string, Required): The URL of the web resource.

    id (string, Required): Unique identifier for this audio response.
    data (string, Required): Base64 encoded audio bytes generated by the model, in the format specified in the request.
    transcript (string, Required): Transcript of the audio generated by the model.
    expires_at (integer, Required): The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    id (string, Required): The ID of the tool call.
    type (string · enum, Required): The type of the tool.
    arguments (string, Required): The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
    name (string, Required): The name of the function to call.

    or

    id (string, Required): The ID of the tool call.
    type (string · enum, Required): The type of the tool.
    input (string, Required): The input for the custom tool call generated by the model.
    name (string, Required): The name of the custom tool to call.

    finish_reason (string · enum, Required): The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    bytes (integer[], Required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
    logprob (number, Required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    token (string, Required): The token.

    bytes (integer[] · nullable, Optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
    logprob (number, Required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    token (string, Required): The token.

    bytes (integer[], Required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
    logprob (number, Required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    token (string, Required): The token.

    bytes (integer[] · nullable, Optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
    logprob (number, Required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    token (string, Required): The token.

    model (string, Required): The model used for the chat completion. Example: bytedance/seed-1-8

    prompt_tokens (number, Required): Number of tokens in the prompt. Example: 137
    completion_tokens (number, Required): Number of tokens in the generated completion. Example: 914
    total_tokens (number, Required): Total number of tokens used in the request (prompt + completion). Example: 1051

    accepted_prediction_tokens (integer · nullable, Optional): When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
    audio_tokens (integer · nullable, Optional): Audio input tokens generated by the model.
    reasoning_tokens (integer · nullable, Optional): Tokens generated by the model for reasoning.
    rejected_prediction_tokens (integer · nullable, Optional): When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
    audio_tokens (integer · nullable, Optional): Audio input tokens present in the prompt.
    cached_tokens (integer · nullable, Optional): Cached tokens present in the prompt.

    credits_used (number, Required): The number of credits consumed during generation. Example: 120000
    usd_spent (number, Required): The total amount of money spent by the user in USD. Example: 0.06
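When `stream` is true, the response arrives as a sequence of chunks whose text lives in incremental deltas, with usage reported on a final chunk when `include_usage` is set. A minimal accumulator sketch, assuming the OpenAI-style chunk shape (`choices[0].delta.content` plus a trailing `usage` object); the `fake_chunks` below are stand-ins for real server-sent events:

```python
def accumulate(chunks):
    """Join streamed content deltas and capture the final usage chunk, if any."""
    text_parts = []
    usage = None
    for chunk in chunks:
        choices = chunk.get("choices") or [{}]
        delta = choices[0].get("delta", {})
        if delta.get("content"):
            text_parts.append(delta["content"])
        if chunk.get("usage"):  # final chunk when include_usage is enabled
            usage = chunk["usage"]
    return "".join(text_parts), usage

# Stand-in chunks mimicking a short streamed reply.
fake_chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
    {"choices": [], "usage": {"total_tokens": 5}},
]
text, usage = accumulate(fake_chunks)
```

With the real client, the same loop would iterate over `client.chat.completions.create(..., stream=True)`.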
    post /v1/chat/completions

    200 Success

    model (string · enum, Required)

    role (string · enum, Required): The role of the author of the message, in this case the user.

    content (any of, Required): The contents of the user message. Either a plain string, or an array of content parts:

    type (string · enum, Required): The type of the content part.
    text (string, Required): The text content.

    or

    type (string · enum, Required)
    url (string · uri, Required): Either a URL of the image or the base64 encoded image data.
    detail (string · enum, Optional): Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    or

    type (string · enum, Required): The type of the content part.
    url (string · uri, Required): Either a URL of the video or the base64 encoded video data.

    name (string, Optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role (string · enum, Required): The role of the author of the message, in this case the system.

    content (any of, Required): The contents of the system message. Either a plain string, or an array of text parts:

    type (string · enum, Required): The type of the content part.
    text (string, Required): The text content.

    name (string, Optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role (string · enum, Required): The role of the author of the message, in this case the tool.

    content (string, Required): The contents of the tool message.
    tool_call_id (string, Required): Tool call that this message is responding to.
    name (string · nullable, Optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or

    role (string · enum, Required): The role of the author of the message, in this case the Assistant.

    content (any of, Optional): The contents of the Assistant message. Required unless tool_calls or function_call is specified. Either a plain string, or an array of content parts:

    type (string · enum, Required): The type of the content part.
    text (string, Required): The text content.

    or

    refusal (string, Required): The refusal message generated by the model.
    type (string · enum, Required): The type of the content part.

    name (string, Optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    tool_calls:
    id (string, Required): The ID of the tool call.
    type (string · enum, Required): The type of the tool. Currently, only function is supported.
    name (string, Required): The name of the function to call.
    arguments (string, Required): The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusal (string · nullable, Optional): The refusal message by the Assistant.

    max_tokens (number · min: 1, Optional): The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    stream (boolean, Optional, default: false): If set to true, the model response data will be streamed to the client as it is generated using server-sent events.

    include_usage (boolean, Required)

    tools:
    type (string · enum, Required): The type of the tool. Currently, only function is supported.
    description (string, Optional): A description of what the function does, used by the model to choose when and how to call the function.
    name (string, Required): The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
    parameters (object, Optional): The parameters the function accepts, described as a JSON Schema object.
    strict (boolean · nullable, Optional): Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true.

    tool_choice (any of, Optional): Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present. Either a string enum (none, auto, or required), or an object:

    type (string · enum, Required): The type of the tool. Currently, only function is supported.
    name (string, Required): The name of the function to call.

    parallel_tool_calls (boolean, Optional): Whether to enable parallel function calling during tool use.

    temperature (number · max: 2, Optional): What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_p (number · min: 0.01 · max: 1, Optional): An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    frequency_penalty (number · min: -2 · max: 2 · nullable, Optional): Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    prediction:
    type (string · enum, Required): The type of the predicted content you want to provide.
    content (any of, Required): The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly. Either a plain string (often the text of a file you are regenerating with minor changes), or an array of text parts:
    type (string · enum, Required): The type of the content part.
    text (string, Required): The text content.

    seed (integer · min: 1, Optional): This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penalty (number · min: -2 · max: 2 · nullable, Optional): Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effort (string · enum, Optional): Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    response_format (one of, Optional): An object specifying the format that the model must output.

    type (string · enum, Required): The type of response format being defined. Always text.

    or

    type (string · enum, Required): The type of response format being defined. Always json_object.

    or

    type (string · enum, Required): The type of response format being defined. Always json_schema.
    name (string, Required): The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
    schema (any · nullable, Optional)
    strict (boolean · nullable, Optional): Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true.
    description (string, Optional): A description of what the response format is for, used by the model to determine how to respond in the format.
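A small sketch of attaching the reasoning_effort parameter documented above to a request payload, validating it against the three supported values; the model id is a placeholder, not taken from this endpoint's enum.

```python
VALID_EFFORT = {"low", "medium", "high"}

def with_reasoning_effort(payload: dict, effort: str) -> dict:
    """Return a copy of the request payload with reasoning_effort set."""
    if effort not in VALID_EFFORT:
        raise ValueError(f"reasoning_effort must be one of {sorted(VALID_EFFORT)}")
    return {**payload, "reasoning_effort": effort}

req = with_reasoning_effort(
    {"model": "<model-id>", "messages": [{"role": "user", "content": "Hi"}]},
    "low",  # trade reasoning depth for faster, cheaper responses
)
print(req["reasoning_effort"])
```

Lower effort generally means faster responses and fewer reasoning tokens billed, per the parameter description.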

    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

`finish_reason` (string · enum, required): The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

`bytes` (integer[], required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[] · nullable, optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[], required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[] · nullable, optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.
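As the `bytes` description notes, a single character can span several tokens, so the byte arrays must be concatenated before decoding. A minimal sketch over plain dicts shaped like the logprobs entries above:

```python
def decode_logprob_tokens(entries: list[dict]) -> str:
    """Join the UTF-8 `bytes` arrays of a logprobs content list into text.
    Entries whose `bytes` field is null are skipped."""
    buf = bytearray()
    for entry in entries:
        if entry.get("bytes") is not None:
            buf.extend(entry["bytes"])
    return buf.decode("utf-8")
```

For example, the euro sign (UTF-8 bytes 226, 130, 172) may be split across two tokens and only decodes correctly after the arrays are joined.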

`model` (string, required): The model used for the chat completion. Example: bytedance/dola-seed-2-0-mini
`prompt_tokens` (number, required): Number of tokens in the prompt. Example: 137
`completion_tokens` (number, required): Number of tokens in the generated completion. Example: 914
`total_tokens` (number, required): Total number of tokens used in the request (prompt + completion). Example: 1051
`accepted_prediction_tokens` (integer · nullable, optional): When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
`audio_tokens` (integer · nullable, optional): Audio input tokens generated by the model.
`reasoning_tokens` (integer · nullable, optional): Tokens generated by the model for reasoning.
`rejected_prediction_tokens` (integer · nullable, optional): When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
`audio_tokens` (integer · nullable, optional): Audio input tokens present in the prompt.
`cached_tokens` (integer · nullable, optional): Cached tokens present in the prompt.
`credits_used` (number, required): The number of tokens consumed during generation. Example: 120000
`usd_spent` (number, required): The total amount of money spent by the user in USD. Example: 0.06
POST /v1/chat/completions
`model` (string · enum, required)

`role` (string · enum, required): The role of the author of the message — in this case, the user.
`content` (any of, required): The contents of the user message.
string (optional)
or items (any of, optional):
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.
or
`type` (string · enum, required): The type of the content part.
`file_data` (string, optional): The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. Maximum size per file: up to 512 MB and up to 2 million tokens. Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant; this limit applies throughout the application's lifetime. Maximum total file storage per user: 10 GB.
`filename` (string, optional): The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
`name` (string, optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`content` (any of, required): The contents of the developer message.
string (optional)
or
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.
`role` (string · enum, required): The role of the author of the message — in this case, the developer.
`name` (string, optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` (string · enum, required): The role of the author of the message — in this case, the system.
`content` (any of, required): The contents of the system message.
string (optional)
or
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.
`name` (string, optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` (string · enum, required): The role of the author of the message — in this case, the Assistant.
`content` (any of, optional): The contents of the Assistant message. Required unless tool_calls or function_call is specified.
string (optional): The contents of the Assistant message.
or
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.
`name` (string, optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

`max_completion_tokens` (integer · min: 1, optional): An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
`max_tokens` (number · min: 1, optional): The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
`stream` (boolean, optional, default: false): If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.
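With `stream: true`, each server-sent event carries a `chat.completion.chunk` whose `choices[0].delta.content` holds the next text fragment; clients concatenate these deltas. A minimal sketch over the raw event payloads, assuming the OpenAI-compatible chunk shape (the helper name is illustrative):

```python
import json

def accumulate_stream(events: list[str]) -> str:
    """Join `choices[0].delta.content` across streamed chunk payloads.
    `events` are the JSON bodies of the server-sent events, excluding
    the terminal "[DONE]" sentinel."""
    parts = []
    for payload in events:
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

When using the OpenAI SDK shown in the quickstart, the SDK performs this event parsing for you and yields chunk objects directly.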
`include_usage` (boolean, required)
`temperature` (number · max: 2, optional): What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p, but not both.
`top_p` (number · min: 0.01 · max: 1, optional): An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.
`seed` (integer · min: 1, optional): This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
`frequency_penalty` (number · min: -2 · max: 2 · nullable, optional): Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

`type` (string · enum, required): The type of the predicted content you want to provide.
`content` (any of, required): The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
string (optional): The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
or
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.
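Predicted Outputs speed up regeneration tasks where most of the output is already known, such as editing a file. A minimal request-body sketch, assuming the OpenAI-compatible `prediction` parameter with `type: "content"` (the helper name is illustrative):

```python
def build_predicted_request(model: str, instruction: str, original_text: str) -> dict:
    """Build a /v1/chat/completions body that supplies a Predicted Output.
    Generated tokens that match `original_text` can be returned much faster."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": instruction}],
        "prediction": {"type": "content", "content": original_text},
    }
```

Tokens in the prediction that do not appear in the completion are still billed, as `rejected_prediction_tokens` in the usage object notes.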

`presence_penalty` (number · min: -2 · max: 2 · nullable, optional): Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
`stop` (string, string[], or null, optional): Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
`response_format` (one of, optional): An object specifying the format that the model must output.
`type` (string · enum, required): The type of response format being defined. Always text.
or
`type` (string · enum, required): The type of response format being defined. Always json_object.
or
`type` (string · enum, required): The type of response format being defined. Always json_schema.
`name` (string, required): The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Other properties (any · nullable, optional)
`strict` (boolean · nullable, optional): Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true.
`description` (string, optional): A description of what the response format is for, used by the model to determine how to respond in the format.
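For structured output, the `name`, `schema`, and `strict` fields above sit inside a `json_schema` wrapper, following the OpenAI-compatible convention. A minimal sketch of building that object (the helper name and example schema are illustrative):

```python
def json_schema_format(name: str, schema: dict, strict: bool = True) -> dict:
    """Build a `response_format` value that forces the model to emit
    JSON matching `schema`. With strict=True, only a subset of
    JSON Schema is supported."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": strict},
    }
```

Pass the result as the `response_format` field of the request body, e.g. `json_schema_format("weather", {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]})`.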

`min_p` (number · min: 0.001 · max: 0.999, optional): A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
`top_k` (number, optional): Only sample from the top K options for each subsequent token. Used to remove "long tail" low-probability responses. Recommended for advanced use cases only; you usually only need to use temperature.
`repetition_penalty` (number · nullable, optional): A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
`top_a` (number · max: 1, optional): Alternate top sampling parameter.
Responses

200: Success
`id` (string, required): A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
`object` (string · enum, required): The object type. Example: chat.completion
`created` (number, required): The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
`index` (number, required): The index of the choice in the list of choices. Example: 0
`role` (string, required): The role of the author of this message. Example: assistant
`content` (string, required): The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
`refusal` (string · nullable, optional): The refusal message generated by the model.

`type` (string · enum, required): The type of the URL citation. Always url_citation.
`end_index` (integer, required): The index of the last character of the URL citation in the message.
`start_index` (integer, required): The index of the first character of the URL citation in the message.
`title` (string, required): The title of the web resource.
`url` (string, required): The URL of the web resource.

`id` (string, required): Unique identifier for this audio response.
`data` (string, required): Base64-encoded audio bytes generated by the model, in the format specified in the request.
`transcript` (string, required): Transcript of the audio generated by the model.
`expires_at` (integer, required): The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

`id` (string, required): The ID of the tool call.
`type` (string · enum, required): The type of the tool.
`arguments` (string, required): The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
`name` (string, required): The name of the function to call.
or
`id` (string, required): The ID of the tool call.
`type` (string · enum, required): The type of the tool.
`input` (string, required): The input for the custom tool call generated by the model.
`name` (string, required): The name of the custom tool to call.

`finish_reason` (string · enum, required): The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

`bytes` (integer[], required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[] · nullable, optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[], required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[] · nullable, optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`model` (string, required): The model used for the chat completion. Example: cohere/command-a
`prompt_tokens` (number, required): Number of tokens in the prompt. Example: 137
`completion_tokens` (number, required): Number of tokens in the generated completion. Example: 914
`total_tokens` (number, required): Total number of tokens used in the request (prompt + completion). Example: 1051
`accepted_prediction_tokens` (integer · nullable, optional): When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
`audio_tokens` (integer · nullable, optional): Audio input tokens generated by the model.
`reasoning_tokens` (integer · nullable, optional): Tokens generated by the model for reasoning.
`rejected_prediction_tokens` (integer · nullable, optional): When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
`audio_tokens` (integer · nullable, optional): Audio input tokens present in the prompt.
`cached_tokens` (integer · nullable, optional): Cached tokens present in the prompt.
`credits_used` (number, required): The number of tokens consumed during generation. Example: 120000
`usd_spent` (number, required): The total amount of money spent by the user in USD. Example: 0.06
POST /v1/chat/completions
`model` (string · enum, required)

`role` (string · enum, required): The role of the author of the message — in this case, the user.
`content` (any of, required): The contents of the user message.
string (optional)
or items (any of, optional):
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.
or
`type` (string · enum, required)
`url` (string · uri, required): Either a URL of the image or the base64-encoded image data.
`detail` (string · enum, optional): Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
or
`type` (string · enum, required): The type of the content part.
`url` (string · uri, required): Either a URL of the video or the base64-encoded video data.
`name` (string, optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.
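Image content parts carry either a remote URL or a base64 data URL in the `url` field. A minimal sketch of building one, assuming the OpenAI-compatible `image_url` part shape; the helper name and the PNG media type are illustrative:

```python
import base64

def image_part(source: str, detail: str = "") -> dict:
    """Build an image content part for a user message.
    `source` is an http(s) URL, or a local file path that gets
    inlined as a base64 data URL."""
    if source.startswith(("http://", "https://")):
        url = source
    else:
        with open(source, "rb") as f:
            url = "data:image/png;base64," + base64.b64encode(f.read()).decode()
    part = {"type": "image_url", "image_url": {"url": url}}
    if detail:
        part["image_url"]["detail"] = detail
    return part
```

The part then goes into the `content` array of a user message alongside a text part describing what to do with the image.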

or

`role` (string · enum, required): The role of the author of the message — in this case, the system.
`content` (any of, required): The contents of the system message.
string (optional)
or
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.
`name` (string, optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` (string · enum, required): The role of the author of the message — in this case, the tool.
`content` (string, required): The contents of the tool message.
`tool_call_id` (string, required): Tool call that this message is responding to.
`name` (string · nullable, optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` (string · enum, required): The role of the author of the message — in this case, the Assistant.
`content` (any of, optional): The contents of the Assistant message. Required unless tool_calls or function_call is specified.
string (optional): The contents of the Assistant message.
or items (any of, optional):
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.
or
`refusal` (string, required): The refusal message generated by the model.
`type` (string · enum, required): The type of the content part.
`name` (string, optional): An optional name for the participant. Provides the model information to differentiate between participants of the same role.
`id` (string, required): The ID of the tool call.
`type` (string · enum, required): The type of the tool. Currently, only function is supported.
`name` (string, required): The name of the function to call.
`arguments` (string, required): The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
`refusal` (string · nullable, optional): The refusal message by the Assistant.

`max_tokens` (number · min: 1, optional): The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
`stream` (boolean, optional, default: false): If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.
`include_usage` (boolean, required)

`type` (string · enum, required): The type of the tool. Currently, only function is supported.
`description` (string, optional): A description of what the function does, used by the model to choose when and how to call the function.
`name` (string, required): The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Other properties (any · nullable, optional): The parameters the function accepts, described as a JSON Schema object.
`strict` (boolean · nullable, optional): Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true.

`tool_choice` (any of, optional): Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.
string · enum (optional): none, auto, or required, as described above.
or
`type` (string · enum, required): The type of the tool. Currently, only function is supported.
`name` (string, required): The name of the function to call.
`parallel_tool_calls` (boolean, optional): Whether to enable parallel function calling during tool use.
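A function tool nests its `name`, `description`, and JSON Schema parameters under a `function` key, following the OpenAI-compatible convention. A minimal sketch; `get_weather` and its schema are a made-up example:

```python
def make_weather_tool() -> dict:
    """Define a hypothetical get_weather function tool in the shape
    described above: type, then name/description/parameters."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }

def force_tool(name: str) -> dict:
    """A tool_choice value that forces the model to call one specific tool."""
    return {"type": "function", "function": {"name": name}}
```

Pass a list of such tools as the `tools` field; when the model responds with a tool call, run your function and send the result back as a `tool` role message with the matching `tool_call_id`.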

`temperature` (number · max: 2, optional): What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p, but not both.
`top_p` (number · min: 0.01 · max: 1, optional): An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.
`frequency_penalty` (number · min: -2 · max: 2 · nullable, optional): Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

`type` (string · enum, required): The type of the predicted content you want to provide.
`content` (any of, required): The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
string (optional): The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
or
`type` (string · enum, required): The type of the content part.
`text` (string, required): The text content.

`seed` (integer · min: 1, optional): This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
`presence_penalty` (number · min: -2 · max: 2 · nullable, optional): Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
`reasoning_effort` (string · enum, optional): Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
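Capping `reasoning_effort` trades answer depth for latency and cost. A minimal request-body sketch using the enum values listed above (the helper name is illustrative):

```python
def build_reasoning_request(model: str, prompt: str, effort: str = "low") -> dict:
    """Request body that caps reasoning effort for a reasoning model.
    `effort` must be one of the documented enum values."""
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unsupported reasoning_effort: {effort}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }
```

The tokens spent on reasoning are reported back as `reasoning_tokens` in the usage object and count toward completion-token billing.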

Responses

200: Success
`id` (string, required): A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
`object` (string · enum, required): The object type. Example: chat.completion
`created` (number, required): The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
`index` (number, required): The index of the choice in the list of choices. Example: 0
`role` (string, required): The role of the author of this message. Example: assistant
`content` (string, required): The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
`refusal` (string · nullable, optional): The refusal message generated by the model.

`type` (string · enum, required): The type of the URL citation. Always url_citation.
`end_index` (integer, required): The index of the last character of the URL citation in the message.
`start_index` (integer, required): The index of the first character of the URL citation in the message.
`title` (string, required): The title of the web resource.
`url` (string, required): The URL of the web resource.

`id` (string, required): Unique identifier for this audio response.
`data` (string, required): Base64-encoded audio bytes generated by the model, in the format specified in the request.
`transcript` (string, required): Transcript of the audio generated by the model.
`expires_at` (integer, required): The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

`id` (string, required): The ID of the tool call.
`type` (string · enum, required): The type of the tool.
`arguments` (string, required): The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
`name` (string, required): The name of the function to call.
or
`id` (string, required): The ID of the tool call.
`type` (string · enum, required): The type of the tool.
`input` (string, required): The input for the custom tool call generated by the model.
`name` (string, required): The name of the custom tool to call.

`finish_reason` (string · enum, required): The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

`bytes` (integer[], required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[] · nullable, optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[], required): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

`bytes` (integer[] · nullable, optional): A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
`logprob` (number, required): The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
`token` (string, required): The token.

    modelstringRequired

    The model used for the chat completion.

    Example: bytedance/dola-seed-2-0-code
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
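The usage fields above combine into simple cost metrics. A minimal sketch, assuming the example values from this section arrive as one flat usage object (the actual response may nest these fields differently):

```python
# Example values taken from the reference above.
usage = {
    "prompt_tokens": 137,
    "completion_tokens": 914,
    "total_tokens": 1051,
    "credits_used": 120000,
    "usd_spent": 0.06,
}

# Sanity check: prompt + completion should equal the total.
assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]

# Effective USD cost per 1,000 tokens for this request.
usd_per_1k = usage["usd_spent"] / usage["total_tokens"] * 1000
print(round(usd_per_1k, 4))  # 0.0571
```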
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequired
    Possible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.

    - Maximum size per file: up to 512 MB and up to 2 million tokens.
    - Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired

    If set to True, an additional chunk containing token usage statistics for the entire request will be streamed before the final message.
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
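The request body documented above can be assembled as plain JSON and sent to POST /v1/chat/completions with any HTTP client. A minimal sketch of a json_schema response_format payload; the schema name and fields are hypothetical, and the model is the example value used in this section:

```python
import json

payload = {
    "model": "deepseek/deepseek-chat",
    "messages": [
        {"role": "system", "content": "Extract the requested fields as JSON."},
        {"role": "user", "content": "Alice is 30 years old."},
    ],
    "max_tokens": 256,
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "person_info",  # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
            "strict": True,         # enforce exact schema adherence
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
                "additionalProperties": False,
            },
        },
    },
}
print(json.dumps(payload, indent=2))
```

Send this body with an `Authorization: Bearer <YOUR_AIMLAPI_KEY>` header, as in the quickstart example.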

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-chat
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
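Putting the response fields above together, a completed chat completion can be consumed as shown below. The sample object is assembled from the example values in this reference; the nesting (message inside choices, usage at the top level) follows the OpenAI-compatible convention this API mirrors:

```python
# Abbreviated sample response built from the documented example values.
response = {
    "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
    "object": "chat.completion",
    "created": 1762343744,
    "model": "deepseek/deepseek-chat",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "Hello! I'm just a program, so I don't have feelings, "
                           "but I'm here and ready to help you.",
            },
        }
    ],
    "usage": {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051},
}

choice = response["choices"][0]
if choice["finish_reason"] == "length":
    print("Output was truncated; consider raising max_tokens.")
print(choice["message"]["content"])
```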
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequired
    Possible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.

    - Maximum size per file: up to 512 MB and up to 2 million tokens.
    - Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired

    If set to True, an additional chunk containing token usage statistics for the entire request will be streamed before the final message.
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
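As with the previous block, these request parameters translate directly into a JSON body. A minimal sketch combining the sampling controls above; the prompt is invented, and the model is the example value from this section:

```python
import json

payload = {
    "model": "deepseek/deepseek-r1",
    "messages": [{"role": "user", "content": "Name three prime numbers."}],
    "temperature": 0.2,  # lower values give more focused, deterministic output
    "n": 1,              # keep n at 1 to minimize costs
    "stop": ["\n\n"],    # up to 4 stop sequences
    "max_tokens": 128,
    "stream": False,
}
print(json.dumps(payload))
```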

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-r1
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequired
    Possible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.

    - Maximum size per file: up to 512 MB and up to 2 million tokens.
    - Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired

    If set to True, an additional chunk containing token usage statistics for the entire request will be streamed before the final message.
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
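The sampling and streaming fields above can be combined in a minimal request body. A sketch, with assumptions: the `stream_options` wrapper around `include_usage` follows the OpenAI convention, and the model id is illustrative. Per the guidance above, adjust `temperature` or `top_p`, not both.

```python
# Minimal sketch of a streaming request body with sampling controls.
# Field values are illustrative; send it as JSON to
# POST https://api.aimlapi.com/v1/chat/completions with a Bearer key.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a haiku about APIs."}],
    "temperature": 0.7,   # higher (e.g. 0.8) = more random, lower = more focused; max 2
    "max_tokens": 128,    # caps generated tokens, useful for cost control
    "stream": True,       # response arrives as server-sent events
    # Assumption: include_usage is nested under stream_options, as in the
    # OpenAI API, so the final chunk carries token usage.
    "stream_options": {"include_usage": True},
}
```

With `stream: true` the body is a sequence of `data:`-prefixed SSE chunks rather than a single JSON object, so read it line by line.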

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
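A `json_schema` response format can be sketched as follows. Assumptions: the schema sits under a `json_schema` wrapper key (the OpenAI convention), and the `location` schema itself is a made-up example.

```python
# Sketch of a structured-output request using the json_schema response
# format. The "location" schema is illustrative, not part of the API.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user",
                  "content": "Extract the city and country from: I live in Paris, France."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",   # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
            "strict": True,       # enforce exact schema adherence (JSON Schema subset)
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
                "additionalProperties": False,
            },
        },
    },
}
```

With `strict: true` the model's `message.content` should parse as JSON matching the schema; with `type: "json_object"` you get valid JSON but no schema guarantee.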

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-chat-v3.1
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
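The usage and billing fields above can be read straight off the response body. A sketch against a mocked response shaped like the 200 schema (values taken from the examples above):

```python
# Sketch: reading token accounting and cost from a response body shaped
# like the 200 schema. This dict is a mock, not a live API response.
response = {
    "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
    "object": "chat.completion",
    "created": 1762343744,
    "choices": [{
        "index": 0,
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "Hello!"},
    }],
    "usage": {"prompt_tokens": 137, "completion_tokens": 914,
              "total_tokens": 1051},
    "credits_used": 120000,
    "usd_spent": 0.06,
}

usage = response["usage"]
# total_tokens is prompt + completion, as documented above.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(f"spent ${response['usd_spent']} for {usage['total_tokens']} tokens")
```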
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
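The tool-definition and tool-choice fields above fit together like this. A sketch: the `get_weather` function is a hypothetical example, not part of the API.

```python
# Sketch of a function-calling request. get_weather is a hypothetical
# tool defined for illustration.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [{
        "type": "function",  # currently the only supported tool type
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {  # described as a JSON Schema object
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    # "auto" lets the model decide; "required" forces a tool call;
    # {"type": "function", "function": {"name": "get_weather"}} forces this one.
    "tool_choice": "auto",
    "parallel_tool_calls": True,
}
```

When the response comes back with `tool_calls`, run the function yourself, then send a follow-up message with `role: "tool"`, the function result as `content`, and the matching `tool_call_id` — and validate the model-generated `arguments` JSON before executing anything, as the schema notes.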

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
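Requesting per-token log probabilities is a two-field change. A sketch; note that `top_logprobs` only takes effect when `logprobs` is true:

```python
# Sketch: request log probabilities for each output token, plus the
# 5 most likely alternatives at each position (0-20 allowed).
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Say hi."}],
    "logprobs": True,   # must be True for top_logprobs to apply
    "top_logprobs": 5,
}
```

In the response, tokens outside the top 20 are reported with the sentinel logprob `-9999.0`, per the response schema above.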

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
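A Predicted Output request supplies the text you expect the model to largely reproduce, so matching tokens can be returned faster. A sketch, with one assumption: the field is wrapped in a top-level `prediction` key with `type: "content"`, as in the OpenAI convention.

```python
# Sketch of Predicted Outputs: regenerating a file with a minor change.
# The prediction wrapper key is assumed from the OpenAI convention.
original_file = 'def greet():\n    print("Hello, world")\n'

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user",
                  "content": "Rename greet to welcome:\n" + original_file}],
    "prediction": {"type": "content", "content": original_file},
}
```

Tokens from the prediction that do or do not appear in the completion are reported as `accepted_prediction_tokens` and `rejected_prediction_tokens` in the usage details; rejected tokens are still billed.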

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    effortstring · enumOptional

    Reasoning effort setting

    Possible values:
    max_tokensinteger · min: 1Optional

    Max tokens of reasoning content. Cannot be used simultaneously with effort.

    excludebooleanOptional

    Whether to exclude reasoning from the response.
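The listing above names both a top-level `reasoning_effort` enum and a nested group with `effort`, `max_tokens`, and `exclude`. A sketch using the simpler top-level form (the model id is illustrative, and which models honor these fields depends on the model):

```python
# Sketch: constraining reasoning on a reasoning-capable model.
# Model id is illustrative; reasoning fields are model-dependent.
payload = {
    "model": "deepseek/deepseek-chat-v3.1",
    "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    "reasoning_effort": "low",  # low | medium | high; lower = faster, fewer reasoning tokens
}
```

The nested form's `max_tokens` caps reasoning content directly and cannot be combined with `effort`; `exclude` drops the reasoning text from the response while still billing its tokens (reasoning tokens count toward completion tokens).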

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    search_context_sizestring · enumOptional

    High level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.

    Possible values:
    citystringOptional

    Free text input for the city of the user, e.g. San Francisco.

    countrystringOptional

    The two-letter ISO country code of the user, e.g. US.

    Pattern: ^[A-Z]{2}$
    regionstringOptional

    Free text input for the region of the user, e.g. California.

    timezonestringOptional

    The IANA timezone of the user, e.g. America/Los_Angeles.

    typestring · enumRequired

    The type of location approximation. Always approximate.

    Possible values:
    search_modestring · enumOptional

    Controls the search mode used for the request. When set to 'academic', results will prioritize scholarly sources like peer-reviewed papers and academic journals.

    Default: academicPossible values:
    search_domain_filterstring[]Optional

    A list of domains to limit search results to. Currently limited to 10 domains for Allowlisting and Denylisting. For Denylisting, add a - at the beginning of the domain string.

    return_imagesbooleanOptional

    Determines whether search results should include images.

    Default: false
    return_related_questionsbooleanOptional

    Determines whether related questions should be returned.

    Default: false
    search_recency_filterstring · enumOptional

    Filters search results based on time (e.g., 'week', 'day').

    Possible values:
    search_after_date_filterstringOptional

    Filters search results to only include content published after this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    search_before_date_filterstringOptional

    Filters search results to only include content published before this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_after_filterstringOptional

    Filters search results to only include content last updated after this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_before_filterstringOptional

    Filters search results to only include content last updated before this date. Format should be %m/%d/%Y (e.g. 3/1/2025)

    Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
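The search-filtering fields above are sent as top-level request fields on search-capable models. A sketch; the model id is illustrative, and the date value is checked against the documented `%m/%d/%Y` pattern:

```python
import re

# Sketch: search-tuning fields from the parameter list above.
# The model id is illustrative; use any search-capable model.
payload = {
    "model": "perplexity/sonar",  # illustrative model id
    "messages": [{"role": "user", "content": "Recent work on RLHF evaluation"}],
    "search_mode": "academic",               # prioritize scholarly sources
    "search_domain_filter": ["arxiv.org",    # allowlist entry
                             "-pinterest.com"],  # leading "-" denylists a domain
    "search_recency_filter": "week",
    "search_after_date_filter": "3/1/2025",  # %m/%d/%Y
    "return_related_questions": True,
}

# The documented pattern for date filters:
DATE_RE = r"^(0?[1-9]|1[0-2])/(0?[1-9]|[12]\d|3[01])/\d{4}$"
assert re.match(DATE_RE, payload["search_after_date_filter"])
```

`search_domain_filter` is limited to 10 domains total across allowlist and denylist entries.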
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-v4-pro
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or Assistant message
    role · string (enum) · Required
        The role of the author of the message — in this case, the Assistant.
        Possible values:
    content · any of · Optional
        The contents of the Assistant message. Required unless tool_calls or function_call is specified.
        string · Optional
            The contents of the Assistant message.
        or an array of content parts, each one of:
            Text part:
                type · string (enum) · Required
                    The type of the content part.
                    Possible values:
                text · string · Required
                    The text content.
            or Refusal part:
                refusal · string · Required
                    The refusal message generated by the model.
                type · string (enum) · Required
                    The type of the content part.
                    Possible values:
    name · string · Optional
        An optional name for the participant. Provides the model information to differentiate between participants of the same role.
    tool_calls (tool calls generated by the model):
        id · string · Required
            The ID of the tool call.
        type · string (enum) · Required
            The type of the tool. Currently, only function is supported.
            Possible values:
        function:
            name · string · Required
                The name of the function to call.
            arguments · string · Required
                The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
    refusal · string · nullable · Optional
        The refusal message by the Assistant.
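A minimal sketch of a `messages` array using the four roles described above. The `tool_call_id` value and the `get_weather` function are hypothetical; in practice the assistant's `tool_calls` entry comes back from a previous response, and the tool message answers it.

```python
def build_messages(question: str) -> list[dict]:
    """Assemble a chat history covering the system, user, assistant, and tool roles."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": question, "name": "alice"},
        {
            "role": "assistant",
            "content": None,  # content may be omitted because tool_calls is present
            "tool_calls": [{
                "id": "call_0",  # hypothetical ID; returned by the model in practice
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
            }],
        },
        # The tool message answers the tool call via tool_call_id
        {"role": "tool", "content": '{"temp_c": 18}', "tool_call_id": "call_0"},
    ]

messages = build_messages("What's the weather in Paris?")
```

This list can be passed directly as the `messages` argument of `client.chat.completions.create`.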

max_tokens · number · min: 1 · Optional
    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

stream · boolean · Optional · Default: false
    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
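A sketch of consuming a streamed response. Chunks are modeled here as plain dicts with the delta shape that server-sent events carry; with the OpenAI SDK you would iterate the response object returned by `create(..., stream=True)` in the same way.

```python
def collect_stream(chunks) -> str:
    """Accumulate the assistant's text from a stream of chat-completion chunks."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        # The first chunk typically carries only the role; later ones carry text
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
]
text = collect_stream(fake_chunks)  # "Hello"
```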
include_usage · boolean · Required

tools (function definitions available to the model):
    type · string (enum) · Required
        The type of the tool. Currently, only function is supported.
        Possible values:
    description · string · Optional
        A description of what the function does, used by the model to choose when and how to call the function.
    name · string · Required
        The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
    parameters (other properties) · any · nullable · Optional
        The parameters the function accepts, described as a JSON Schema object.
    strict · boolean · nullable · Optional
        Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

tool_choice · any of · Optional
    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
    string (enum) · Optional
        none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
        Possible values:
    or an object:
        type · string (enum) · Required
            The type of the tool. Currently, only function is supported.
            Possible values:
        name · string · Required
            The name of the function to call.

parallel_tool_calls · boolean · Optional
    Whether to enable parallel function calling during tool use.
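A sketch of building a tool definition and a forced `tool_choice`, enforcing the naming constraint stated above (a-z, A-Z, 0-9, underscores and dashes, max length 64). The `get_weather` function and its schema are illustrative only.

```python
import re

# Same character set and length limit as documented for function names
NAME_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")

def make_tool(name: str, description: str, parameters: dict) -> dict:
    """Build one entry for the `tools` array, validating the function name."""
    if not NAME_RE.match(name):
        raise ValueError(f"invalid function name: {name}")
    return {"type": "function",
            "function": {"name": name, "description": description,
                         "parameters": parameters, "strict": True}}

tool = make_tool(
    "get_weather",
    "Look up current weather for a city.",
    {"type": "object",
     "properties": {"city": {"type": "string"}},
     "required": ["city"]},
)

# Force the model to call this specific tool rather than "auto"/"none"/"required"
tool_choice = {"type": "function", "function": {"name": "get_weather"}}
```

Pass `tools=[tool]` and `tool_choice=tool_choice` to the completion call to force the function call.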

temperature · number · max: 2 · Optional
    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

top_p · number · min: 0.01 · max: 1 · Optional
    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

stop · any of · Optional
    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
    string · Optional
    or string[] · Optional
    or any · nullable · Optional
Other properties · number · min: -100 · max: 100 · Optional

logprobs · boolean · nullable · Optional
    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

top_logprobs · number · max: 20 · nullable · Optional
    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

frequency_penalty · number · min: -2 · max: 2 · nullable · Optional
    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

prediction (Predicted Outputs configuration):
    type · string (enum) · Required
        The type of the predicted content you want to provide.
        Possible values:
    content · any of · Required
        The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
        string · Optional
            The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
        or an array of content parts:
            type · string (enum) · Required
                The type of the content part.
                Possible values:
            text · string · Required
                The text content.

seed · integer · min: 1 · Optional
    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
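A sketch of a request payload combining `seed` (best-effort determinism) with a Predicted Output, where the prediction content is the prior version of a file being regenerated with small edits. The file contents and prompt are illustrative.

```python
# Prior version of the file we are asking the model to edit
old_file = "def add(a, b):\n    return a + b\n"

payload = {
    "model": "gpt-4o",
    "messages": [{
        "role": "user",
        "content": "Rename the function add to plus in this file:\n" + old_file,
    }],
    "seed": 42,  # best-effort reproducibility across repeated requests
    # Most generated tokens will match old_file, so the response returns faster
    "prediction": {"type": "content", "content": old_file},
}
```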

presence_penalty · number · min: -2 · max: 2 · nullable · Optional
    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

reasoning_effort · string (enum) · Optional
    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
    Possible values:

reasoning options:
    effort · string (enum) · Optional
        Reasoning effort setting.
        Possible values:
    max_tokens · integer · min: 1 · Optional
        Max tokens of reasoning content. Cannot be used simultaneously with effort.
    exclude · boolean · Optional
        Whether to exclude reasoning from the response.

response_format · one of · Optional
    An object specifying the format that the model must output.
    Text:
        type · string (enum) · Required
            The type of response format being defined. Always text.
            Possible values:
    or JSON object:
        type · string (enum) · Required
            The type of response format being defined. Always json_object.
            Possible values:
    or JSON schema:
        type · string (enum) · Required
            The type of response format being defined. Always json_schema.
            Possible values:
        name · string · Required
            The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
        schema (other properties) · any · nullable · Optional
        strict · boolean · nullable · Optional
            Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
        description · string · Optional
            A description of what the response format is for, used by the model to determine how to respond in the format.
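A sketch of a json_schema response format with strict mode. The nesting under a `"json_schema"` wrapper key follows the OpenAI-style request shape and is an assumption here; the schema itself (`city_info`) is illustrative.

```python
import json

response_format = {
    "type": "json_schema",
    "json_schema": {  # wrapper key assumed from the OpenAI-style API shape
        "name": "city_info",  # must match ^[A-Za-z0-9_-]{1,64}$
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,  # required by strict mode
        },
    },
}

# With strict mode, the message content is parseable JSON matching the schema
reply = '{"city": "Paris", "population": 2100000}'  # example model output
data = json.loads(reply)
```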

echo · boolean · Optional
    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

min_p · number · min: 0.001 · max: 0.999 · Optional
    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

top_k · number · Optional
    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

top_a · number · max: 1 · Optional
    Alternate top sampling parameter.

repetition_penalty · number · nullable · Optional
    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

Web search options:
    search_context_size · string (enum) · Optional
        High-level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.
        Possible values:
    Approximate user location:
        city · string · Optional
            Free text input for the city of the user, e.g. San Francisco.
        country · string · Optional
            The two-letter ISO country code of the user, e.g. US.
            Pattern: ^[A-Z]{2}$
        region · string · Optional
            Free text input for the region of the user, e.g. California.
        timezone · string · Optional
            The IANA timezone of the user, e.g. America/Los_Angeles.
        type · string (enum) · Required
            The type of location approximation. Always approximate.
            Possible values:
    search_mode · string (enum) · Optional · Default: academic
        Controls the search mode used for the request. When set to academic, results will prioritize scholarly sources like peer-reviewed papers and academic journals.
        Possible values:
    search_domain_filter · string[] · Optional
        A list of domains to limit search results to. Currently limited to 10 domains for allowlisting and denylisting. For denylisting, add a - at the beginning of the domain string.
    return_images · boolean · Optional · Default: false
        Determines whether search results should include images.
    return_related_questions · boolean · Optional · Default: false
        Determines whether related questions should be returned.
    search_recency_filter · string (enum) · Optional
        Filters search results based on time (e.g. week, day).
        Possible values:
    search_after_date_filter · string · Optional
        Filters search results to only include content published after this date. Format should be %m/%d/%Y (e.g. 3/1/2025).
        Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    search_before_date_filter · string · Optional
        Filters search results to only include content published before this date. Format should be %m/%d/%Y (e.g. 3/1/2025).
        Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_after_filter · string · Optional
        Filters search results to only include content last updated after this date. Format should be %m/%d/%Y (e.g. 3/1/2025).
        Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
    last_updated_before_filter · string · Optional
        Filters search results to only include content last updated before this date. Format should be %m/%d/%Y (e.g. 3/1/2025).
        Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
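A sketch of producing the %m/%d/%Y date strings the search date filters accept, validated against the same pattern given above. Note the pattern permits single-digit months and days without zero padding.

```python
import re
from datetime import datetime

# Same pattern as documented for the date filters
DATE_RE = re.compile(r"^(0?[1-9]|1[0-2])/(0?[1-9]|[12]\d|3[01])/\d{4}$")

def date_filter(d: datetime) -> str:
    """Format a datetime as the m/d/Y string the filters expect, e.g. 3/1/2025."""
    s = f"{d.month}/{d.day}/{d.year}"
    if not DATE_RE.match(s):
        raise ValueError(f"date string {s!r} does not match the filter pattern")
    return s

search_params = {
    "search_after_date_filter": date_filter(datetime(2025, 3, 1)),
    "search_recency_filter": "week",
}
```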
Responses

200 Success

id · string · Required
    A unique identifier for the chat completion.
    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

object · string (enum) · Required
    The object type.
    Example: chat.completion
    Possible values:

created · number · Required
    The Unix timestamp (in seconds) of when the chat completion was created.
    Example: 1762343744

choices:
    index · number · Required
        The index of the choice in the list of choices.
        Example: 0
    message:
        role · string · Required
            The role of the author of this message.
            Example: assistant
        content · string · Required
            The contents of the message.
            Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
        refusal · string · nullable · Optional
            The refusal message generated by the model.
        URL citation annotations:
            type · string (enum) · Required
                The type of the URL citation. Always url_citation.
                Possible values:
            end_index · integer · Required
                The index of the last character of the URL citation in the message.
            start_index · integer · Required
                The index of the first character of the URL citation in the message.
            title · string · Required
                The title of the web resource.
            url · string · Required
                The URL of the web resource.

audio:
    id · string · Required
        Unique identifier for this audio response.
    data · string · Required
        Base64 encoded audio bytes generated by the model, in the format specified in the request.
    transcript · string · Required
        Transcript of the audio generated by the model.
    expires_at · integer · Required
        The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

tool_calls, each one of:
    Function tool call:
        id · string · Required
            The ID of the tool call.
        type · string (enum) · Required
            The type of the tool.
            Possible values:
        function:
            arguments · string · Required
                The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
            name · string · Required
                The name of the function to call.
    or Custom tool call:
        id · string · Required
            The ID of the tool call.
        type · string (enum) · Required
            The type of the tool.
            Possible values:
        input · string · Required
            The input for the custom tool call generated by the model.
        name · string · Required
            The name of the custom tool to call.

finish_reason · string (enum) · Required
    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
    Possible values:

logprobs (token log-probability entries, returned for both content and refusal tokens; each entry and each of its top_logprobs alternatives has the same three fields):
    token · string · Required
        The token.
    logprob · number · Required
        The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
    bytes · integer[] · nullable
        A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
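The finish_reason handling described above can be sketched as a simple dispatch. The returned strings are illustrative; an application would take real actions (re-request with a higher max_tokens, execute the requested tools, and so on).

```python
def handle_choice(choice: dict) -> str:
    """Decide what to do next based on the documented finish_reason values."""
    reason = choice["finish_reason"]
    if reason == "tool_calls":
        return "run the requested tools and send a tool message back"
    if reason == "length":
        return "output truncated: raise max_tokens and retry"
    if reason == "content_filter":
        return "content was filtered"
    return "done"  # "stop": natural stop point or stop sequence hit

action = handle_choice({"finish_reason": "length"})
```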

model · string · Required
    The model used for the chat completion.
    Example: deepseek/deepseek-v4-flash

usage:
    prompt_tokens · number · Required
        Number of tokens in the prompt.
        Example: 137
    completion_tokens · number · Required
        Number of tokens in the generated completion.
        Example: 914
    total_tokens · number · Required
        Total number of tokens used in the request (prompt + completion).
        Example: 1051
    completion_tokens_details:
        accepted_prediction_tokens · integer · nullable · Optional
            When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
        audio_tokens · integer · nullable · Optional
            Audio input tokens generated by the model.
        reasoning_tokens · integer · nullable · Optional
            Tokens generated by the model for reasoning.
        rejected_prediction_tokens · integer · nullable · Optional
            When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
    prompt_tokens_details:
        audio_tokens · integer · nullable · Optional
            Audio input tokens present in the prompt.
        cached_tokens · integer · nullable · Optional
            Cached tokens present in the prompt.
    credits_used · number · Required
        The number of tokens consumed during generation.
        Example: 120000
    usd_spent · number · Required
        The total amount of money spent by the user in USD.
        Example: 0.06
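A sketch of reading token usage and spend from a completed response, using the example values documented above. The dict mirrors the usage fields; with the SDK you would read `response.usage` instead.

```python
usage = {
    "prompt_tokens": 137,
    "completion_tokens": 914,
    "total_tokens": 1051,       # prompt + completion
    "credits_used": 120000,
    "usd_spent": 0.06,
}

# Sanity check: total is the sum of prompt and completion tokens
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

# Derived effective cost for this request (illustrative bookkeeping)
cost_per_1k_tokens = usage["usd_spent"] / usage["total_tokens"] * 1000
```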
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
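Because a single character can span multiple tokens, the per-token `bytes` arrays above must be concatenated before decoding; decoding each token's bytes on its own can fail mid-character. A minimal sketch with hypothetical token values:

```python
# Hypothetical logprob entries shaped like `content[].bytes` above.
# The four bytes together encode one emoji, so neither token decodes alone.
tokens = [
    {"token": "emoji_part_1", "bytes": [240, 159, 152]},
    {"token": "emoji_part_2", "bytes": [128]},
]

# Concatenate all byte arrays (skipping null entries), then decode once.
raw = bytes(b for t in tokens if t["bytes"] is not None for b in t["bytes"])
text = raw.decode("utf-8")
print(text)  # 😀
```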

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-2.0-flash
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio output tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
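The usage fields above satisfy a simple identity: `total_tokens` is the sum of prompt and completion tokens. A sketch reading them from a hypothetical response payload:

```python
# Hypothetical `usage` object shaped like the fields documented above.
usage = {
    "prompt_tokens": 137,
    "completion_tokens": 914,
    "total_tokens": 1051,
    "credits_used": 120000,
    "usd_spent": 0.06,
}

# total_tokens = prompt_tokens + completion_tokens
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(f'{usage["total_tokens"]} tokens, ${usage["usd_spent"]:.2f}')
```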
    post
    /v1/chat/completions
    200 Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.
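A user message combining the text and image content parts described above might be built as follows; the URL and the `image_url` nesting are illustrative, following the common OpenAI-compatible shape:

```python
# Hypothetical multimodal user message: one text part plus one image part.
# The url may be an https:// link or a base64 data: URI.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is shown in this picture?"},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/cat.png", "detail": "high"},
        },
    ],
    "name": "alice",  # optional participant name
}
```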

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
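When `stream` is enabled, the response arrives as a sequence of server-sent event chunks whose `delta` objects must be concatenated by the client. A sketch of that assembly, using hypothetical chunk payloads rather than a live connection:

```python
# Hypothetical streaming chunks, shaped like the deltas sent over SSE.
chunks = [
    {"choices": [{"delta": {"role": "assistant", "content": "Hel"}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": "lo!"}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},  # final chunk carries finish_reason
]

# Accumulate the content fragments in arrival order.
text = "".join(c["choices"][0]["delta"].get("content", "") for c in chunks)
print(text)  # Hello!
```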
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
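Putting the two prediction fields above together, a Predicted Outputs request fragment supplies text the completion is expected to largely reproduce; the file content and prompt here are hypothetical:

```python
# The file being regenerated with a small change (hypothetical content).
original_file = 'def greet():\n    print("hi")\n'

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Rename greet to hello."}],
    # Generated tokens matching `content` can be returned much more quickly.
    "prediction": {"type": "content", "content": original_file},
}
```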

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
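As a sketch of the `json_schema` variant above; the field nesting follows the common OpenAI-compatible shape, and the schema itself is hypothetical:

```python
# Force the model to emit JSON matching a strict schema.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",   # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
        "strict": True,        # exact schema adherence (subset of JSON Schema)
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,
        },
    },
}
```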

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
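The tool fields above combine into a request body like the following sketch, which forces a call to one hypothetical function via `tool_choice`:

```python
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",  # only `function` is currently supported
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {  # JSON Schema for the function's arguments
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # Force this specific tool instead of letting the model choose.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,
}
```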

    Responses
    200 Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
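Since the `arguments` string may be malformed JSON or contain parameters outside your schema, validate it before dispatching to your function. A minimal sketch with a hypothetical tool call:

```python
import json

# Hypothetical tool call as it appears in the response message.
tool_call = {
    "id": "call_123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}

# Parse defensively: the model does not always emit valid JSON.
try:
    args = json.loads(tool_call["function"]["arguments"])
except json.JSONDecodeError:
    args = None

# Check required keys against your own schema before calling the function.
valid = isinstance(args, dict) and isinstance(args.get("city"), str)
print(valid)  # True
```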

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-2.5-flash
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio output tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200 Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.

    - Maximum size per file: Up to 512 MB and up to 2 million tokens.
    - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200 Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemma-4-31b-it
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio output tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
    - Maximum size per file: up to 512 MB and up to 2 million tokens.
    - Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
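Two request bodies illustrating the advice above: adjust temperature or top_p, but not both. These are plain dicts matching the request schema; the model name is just an example.

```python
base = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Name three sorting algorithms."}],
}

creative = {**base, "temperature": 0.8}  # more varied wording
focused = {**base, "top_p": 0.1}         # only the top 10% probability mass

# Neither body sets both sampling knobs at once.
for body in (creative, focused):
    assert not ({"temperature", "top_p"} <= body.keys())
```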

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: gryphe/mythomax-l2-13b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
    - Maximum size per file: up to 512 MB and up to 2 million tokens.
    - Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.
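A request-body sketch that forces a specific tool call, following the tool_choice rules above. The get_weather function is a hypothetical example of your own tooling, not something this API provides.

```python
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # This object form forces get_weather; "auto" would let the model
    # decide, and "none" would forbid tool calls entirely.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```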

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
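A json_schema response_format sketch using the name, strict, and schema fields listed above. The surrounding "json_schema" wrapper follows the common OpenAI-style convention; treat the exact nesting as an assumption and verify it against a live request.

```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",   # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
        "strict": True,        # enforce exact schema adherence
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,
        },
    },
}
```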

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: meta-llama/llama-3.3-70b-versatile
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    get
    /v1/billing/balance
    200Success
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "qwen-plus",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "qwen-plus",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "qwen-turbo",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "qwen-turbo",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-coder-480b-a35b-instruct",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-coder-480b-a35b-instruct",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba-cloud/qwen3-next-80b-a3b-instruct",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba-cloud/qwen3-next-80b-a3b-instruct",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-max-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-max-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-vl-32b-instruct",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-vl-32b-instruct",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.5-plus-20260218",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.5-plus-20260218",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.6-27b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.6-27b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.5-omni-plus",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.5-omni-plus",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.5-omni-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.5-omni-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-5",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4-5",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-0.3b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-0.3b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-vl-28b-a3b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-vl-28b-a3b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-300b-a47b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-300b-a47b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-5-0-thinking-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-5-0-thinking-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-5-0-thinking-latest",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-5-0-thinking-latest",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-x1-turbo-32k",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-x1-turbo-32k",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/seed-1-8",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/seed-1-8",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/dola-seed-2-0-mini",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/dola-seed-2-0-mini",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "cohere/command-a",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "cohere/command-a",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/dola-seed-2-0-code",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/dola-seed-2-0-code",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-chat",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-chat",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-r1",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-r1",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-chat-v3.1",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-chat-v3.1",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-v4-pro",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-v4-pro",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-v4-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-v4-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-2.0-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-2.0-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-2.5-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-2.5-flash",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-4-31b-it",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemma-4-31b-it",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "gryphe/mythomax-l2-13b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "gryphe/mythomax-l2-13b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "meta-llama/llama-3.3-70b-versatile",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "meta-llama/llama-3.3-70b-versatile",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v1/billing/balance' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
    {
      "balance": 10000000,
      "lowBalance": false,
      "lowBalanceThreshold": 10000,
      "lastUpdated": "2025-11-25T17:45:00Z",
      "autoDebitStatus": "disabled",
      "status": "current",
      "statusExplanation": "Balance is current and up to date"
    }
    {
      "current_balance": 123.45,
      "currency": "USD"
    }
    {
      "user_id": 111,
      "email": "[email protected]",
      "current_balance": 100.5,
      "currency": "USD",
      "autotopup_settings": {
        "is_enabled": true,
        "threshold": 50,
        "amount": 100,
        "currency": "USD"
      }
    }
    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v2/billing' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v2/billing/detail' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
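As a quick sketch (field names taken from the balance example above), you can poll GET /v1/billing/balance and flag a low balance before it interrupts a workload. The API already returns a `lowBalance` flag; the helper below recomputes it locally so you can also pass a custom threshold:

```python
import requests

def fetch_balance(api_key: str) -> dict:
    """Call GET /v1/billing/balance with a regular API key."""
    resp = requests.get(
        "https://api.aimlapi.com/v1/billing/balance",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    return resp.json()

def is_low_balance(info: dict) -> bool:
    """True when the balance is at or below the account's own threshold."""
    return info["balance"] <= info.get("lowBalanceThreshold", 0)
```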
    Responses
    200

    API key creation result

    application/json
name (string · nullable) Optional

Human-readable, user-defined name for the API key.

Example: 20260202-key-for-llms

disabled (boolean) Required

Indicates whether the key is disabled.

Example: false

prefix (string) Required

Key prefix. This is the first 8 characters of your API key, visible in the dashboard.

Example: b747e891

items (string · enum) Optional. Possible values:

retention (string · enum) Optional

Limit period.

Possible values:

threshold (number) Optional

Spending limit threshold for the selected period, in USD.

Example: 25

created_at (string · date-time) Required

Creation timestamp (UTC).

Example: 2026-02-18T06:59:10.031Z

updated_at (string · date-time) Required

Last update timestamp (UTC).

Example: 2026-02-18T06:59:10.031Z

monthly_usage (number) Required

Current monthly usage amount.

Example: 0

key (string) Required

Full API key value (returned only at creation time).

Example: b747e891847f4c3fa0f6cce1cfd79bf9
    post
    /v1/keys
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/keys' \
      --header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "name": "20260202-key-for-llms",
        "limit": {
          "retention": "week",
          "threshold": 25
        },
        "scopes": [
          "model:chat",
          "model:responses"
        ]
      }'
    200

    API key creation result

name (string) Optional

Optional human-readable name of the API key.

Example: 20260202-key-for-llms

disabled (boolean) Optional

Enable or disable the API key.

Example: false

retention (string · enum) Optional

Limit period.

Possible values:

threshold (number) Optional

Spending limit threshold for the selected period, in USD.

Example: 25

items (string · enum) Optional. Possible values:
    Responses
    200

    Updated API key parameters

    application/json
name (string · nullable) Optional

Human-readable, user-defined name for the API key.

Example: 20260202-key-for-llms

disabled (boolean) Required

Indicates whether the key is disabled.

Example: false

prefix (string) Required

Key prefix. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response).

Example: b747e891

items (string · enum) Optional. Possible values:

retention (string · enum) Optional. Possible values:

threshold (number) Optional. Example: 25

created_at (string · date-time) Required

Creation timestamp (UTC).

Example: 2026-02-18T06:59:10.031Z

updated_at (string · date-time) Required

Last update timestamp (UTC).

Example: 2026-02-18T06:59:10.031Z

monthly_usage (number) Required

Current monthly usage amount.

Example: 0
    patch
    /v1/keys/{prefix}
    curl -L \
      --request PATCH \
      --url 'https://api.aimlapi.com/v1/keys/<API_KEY_PREFIX>' \
      --header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "disabled": false
      }'
    200

    Updated API key parameters

    200

    Key deletion result

    {
      "data": {
        "prefix": "b747e891",
        "deleted": true
      }
    }
    curl -L \
      --request DELETE \
      --url 'https://api.aimlapi.com/v1/keys/<API_KEY_PREFIX>' \
      --header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>'
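Tying the key lifecycle together: the full key value appears only in the creation response, while later PATCH and DELETE calls identify the key by its 8-character prefix. A minimal helper for capturing both at creation time (response shape as in the examples on this page):

```python
def extract_key_credentials(creation_response: dict) -> tuple[str, str]:
    """Return (prefix, full_key) from a POST /v1/keys response.

    Store the full key immediately: it is returned only once, at creation.
    The prefix is what you pass in /v1/keys/{prefix} for PATCH and DELETE.
    """
    data = creation_response["data"]
    return data["prefix"], data["key"]
```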
    Quickstart guide
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
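To see how these sampling parameters combine, here is a sketch of a request body builder (the model name is just an example); note the recommendation above to tune temperature or top_p, not both:

```python
def build_chat_request(prompt: str, *, analytical: bool = True) -> dict:
    """Assemble a /v1/chat/completions body with the knobs described above."""
    return {
        "model": "anthropic/claude-opus-4-6",  # example model name
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        # Temperature near 0.0 suits analytical/multiple-choice tasks,
        # near 1.0 suits creative ones; tune this OR top_p, not both.
        "temperature": 0.2 if analytical else 0.9,
    }
```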

    Responses
    200Success
id (string) Required

A unique identifier for the chat completion.

Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

object (string · enum) Required

The object type.

Example: chat.completion. Possible values:

created (number) Required

The Unix timestamp (in seconds) of when the chat completion was created.

Example: 1762343744

index (number) Required

The index of the choice in the list of choices.

Example: 0

role (string) Required

The role of the author of this message.

Example: assistant

content (string) Required

The contents of the message.

Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

finish_reason (string · enum) Required

The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.

Possible values:
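In client code you typically branch on finish_reason before trusting the content; a hypothetical handler:

```python
def describe_finish(choice: dict) -> str:
    """Map a choice's finish_reason to a short human-readable note."""
    reason = choice.get("finish_reason")
    if reason == "length":
        return "truncated: raise max_tokens or continue in a follow-up turn"
    if reason == "content_filter":
        return "output was withheld by the content filter"
    if reason == "tool_calls":
        return "model is requesting a tool call; execute it and reply"
    return "completed normally"  # 'stop': natural end or stop sequence
```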
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
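As noted above, a single character can span several tokens, so the bytes arrays of adjacent logprob entries must be concatenated before decoding. A small sketch:

```python
def decode_logprob_tokens(entries: list[dict]) -> str:
    """Rebuild text from logprob entries by joining their UTF-8 bytes."""
    raw = bytes(b for e in entries if e.get("bytes") for b in e["bytes"])
    return raw.decode("utf-8")
```

For example, "é" (UTF-8 bytes 195, 169) may be split across two tokens and only decodes correctly once the bytes are joined.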

model (string) Required

The model used for the chat completion.

Example: anthropic/claude-opus-4-6

prompt_tokens (number) Required

Number of tokens in the prompt.

Example: 137

completion_tokens (number) Required

Number of tokens in the generated completion.

Example: 914

total_tokens (number) Required

Total number of tokens used in the request (prompt + completion).

Example: 1051

accepted_prediction_tokens (integer · nullable) Optional

When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

audio_tokens (integer · nullable) Optional

Audio input tokens generated by the model.

reasoning_tokens (integer · nullable) Optional

Tokens generated by the model for reasoning.

rejected_prediction_tokens (integer · nullable) Optional

When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

audio_tokens (integer · nullable) Optional

Audio input tokens present in the prompt.

cached_tokens (integer · nullable) Optional

Cached tokens present in the prompt.

credits_used (number) Required

The number of credits consumed during generation.

Example: 120000

usd_spent (number) Required

The amount spent on this request, in USD.

Example: 0.06
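For cost tracking, the meta.usage block of every response carries the spend for that call; a minimal extractor (shape as in the examples on this page):

```python
def request_cost(response: dict) -> tuple[int, float]:
    """Return (credits_used, usd_spent) from a completion response."""
    usage = response["meta"]["usage"]
    return usage["credits_used"], usage["usd_spent"]
```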
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-6",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4-6",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
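The same endpoint also supports streaming: set "stream": true and tokens arrive as server-sent events. A sketch using requests (the "data: " SSE line format is an assumption based on the OpenAI-compatible API):

```python
import json
import requests

def parse_sse_delta(line: bytes):
    """Extract the text delta from one SSE line, or None for non-data lines."""
    if not line.startswith(b"data: "):
        return None
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        return None
    choice = json.loads(payload)["choices"][0]
    return choice.get("delta", {}).get("content")

def stream_reply(prompt: str, api_key: str) -> str:
    """Stream a chat completion, printing tokens as they arrive."""
    resp = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": "gpt-4o",  # example model name
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,
        },
        stream=True,
    )
    parts = []
    for line in resp.iter_lines():
        delta = parse_sse_delta(line) if line else None
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)
```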
    {
      "data": {
        "name": "20260202-key-for-llms",
        "disabled": false,
        "prefix": "b747e891",
        "scopes": [
          "model:chat"
        ],
        "limit": {
          "retention": "no_reset",
          "threshold": 25
        },
        "created_at": "2026-02-18T06:59:10.031Z",
        "updated_at": "2026-02-18T06:59:10.031Z",
        "monthly_usage": 0,
        "key": "b747e891847f4c3fa0f6cce1cfd79bf9"
      }
    }
    {
      "data": {
        "name": "20260202-key-for-llms",
        "disabled": false,
        "prefix": "b747e891",
        "scopes": [
          "model:chat"
        ],
        "limit": {
          "retention": "no_reset",
          "threshold": 25
        },
        "created_at": "2026-02-18T06:59:10.031Z",
        "updated_at": "2026-02-18T06:59:10.031Z",
        "monthly_usage": 0
      }
    }

    Alibaba Cloud

    Anthracite

    DeepSeek

    Meta

    get
    Responses
    200

    List of API keys, ordered from oldest to newest

    application/json
name (string · nullable) Optional

Human-readable, user-defined name for the API key.

Example: 20260202-key-for-llms

prefix (string) Required

Key prefix. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the POST method (see the prefix field in its response).

Example: b747e891

disabled (boolean) Required

Indicates whether the key is disabled.

Example: false

items (string · enum) Optional. Possible values:

retention (string · enum) Optional. Possible values:

threshold (number) Optional

Spending limit threshold for the selected period, in USD.

Example: 25

created_at (string · date-time) Required

Creation timestamp (UTC).

Example: 2026-02-18T06:59:10.031Z

updated_at (string · date-time) Required

Last update timestamp (UTC).

Example: 2026-02-18T06:59:10.031Z

monthly_usage (number) Required

Current monthly usage amount.

Example: 0
    get
    /v1/keys
    200

    List of API keys, ordered from oldest to newest

    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v1/keys' \
      --header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>'
    {
      "data": [
        {
          "name": "20260202-key-for-llms",
          "prefix": "b747e891",
          "disabled": false,
          "scopes": [
            "model:chat"
          ],
          "limit": {
            "retention": "no_reset",
            "threshold": 25
          },
          "created_at": "2026-02-18T06:59:10.031Z",
          "updated_at": "2026-02-18T06:59:10.031Z",
          "monthly_usage": 0
        }
      ]
    }

    Complete Model List

    Get Model List via API

    You can query the complete list of available models through this API. No API key is required for this request. You can also simply open this list in any web browser.

    Output Examples by Model Type

    As of early 2026, this endpoint returns a list of more than 400 models. Each item represents a single model identified by a unique ID. Depending on the model category (chat, video, etc.), the set of fields in each item may vary slightly, so below we provide representative examples from the main model categories.
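A sketch of fetching and filtering that list; the "data" envelope and the "type" field name for the category are assumptions based on the examples below:

```python
import requests

def fetch_models() -> list[dict]:
    """Download the public model list; no API key is required."""
    resp = requests.get("https://api.aimlapi.com/v1/models")
    resp.raise_for_status()
    return resp.json()["data"]

def filter_ids(items: list[dict], category: str) -> list[str]:
    """Keep the IDs of models whose category field matches."""
    return [m["id"] for m in items if m.get("type") == category]
```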

    Example output item for a chat model

    Unlike other types of models, every chat model includes a non-empty features list that clearly shows what the model can do: support for streaming, acceptance of SYSTEM or DEVELOPER role instructions in addition to the regular prompt, whether the developer describes the model as “thinking”, and so on.

    For more details on many of these, see the corresponding section of this documentation portal.

    Example output item for an image model

    Example output item for a video model

    Qwen2.5-7B-Instruct-Turbo


    This documentation is valid for the following list of our models:

    • Qwen/Qwen2.5-7B-Instruct-Turbo

    magnum-v4


    This documentation is valid for the following list of our models:

    • anthracite-org/magnum-v4-72b

    DeepSeek Reasoner V3.1


    This documentation is valid for the following list of our models:

    • deepseek/deepseek-reasoner-v3.1

    Deepseek Non-reasoner V3.1 Terminus


    This documentation is valid for the following list of our models:

    • deepseek/deepseek-non-reasoner-v3.1-terminus

    Deepseek Reasoner V3.1 Terminus


    This documentation is valid for the following list of our models:

    • deepseek/deepseek-reasoner-v3.1-terminus

    DeepSeek V3.2 Exp Thinking


    This documentation is valid for the following list of our models:

    • deepseek/deepseek-thinking-v3.2-exp

    gemini-2.5-flash-lite-preview


    This documentation is valid for the following list of our models:

    • google/gemini-2.5-flash-lite-preview

    Try in Playground

    Model Overview

    A cutting-edge large language model designed to understand and generate text based on specific instructions. It excels in various tasks, including coding, mathematical problem-solving, and generating structured outputs.

    Create AI/ML API Key

    How to make the first API call

    1️⃣ Required setup (don’t skip this)
    ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet).
    ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example
    At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case
    ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key.
    ▪ Select a model: set the model field to the model you want to call.
    ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length). See the API schema below for the full list.

    5️⃣ Run your code
    Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"Qwen/Qwen2.5-7B-Instruct-Turbo",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'Qwen/Qwen2.5-7B-Instruct-Turbo',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {'id': 'npK4C7y-3NKUce-92d4866b1e62ef98', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'tool_calls': []}}], 'created': 1744144252, 'model': 'Qwen/Qwen2.5-7B-Instruct-Turbo', 'usage': {'prompt_tokens': 19, 'completion_tokens': 6, 'total_tokens': 25}}

    Try in Playground

    Model Overview

    An LLM fine-tuned on top of Qwen2.5, specifically designed to replicate the prose quality of the Claude 3 models, particularly Sonnet and Opus. It excels at generating coherent and contextually rich text.

    Create AI/ML API Key

    How to make the first API call

    1️⃣ Required setup (don’t skip this)
    ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet).
    ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example
    At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case
    ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key.
    ▪ Select a model: set the model field to the model you want to call.
    ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length). See the API schema below for the full list.

    5️⃣ Run your code
    Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    hashtag
    API Schema

    hashtag
    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthracite-org/magnum-v4-72b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthracite-org/magnum-v4-72b',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    Response
    {'id': 'gen-1744217980-rdVBcVTb76dllKCCRjak', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'refusal': None}}], 'created': 1744217980, 'model': 'anthracite-org/magnum-v4-72b', 'usage': {'prompt_tokens': 37, 'completion_tokens': 50, 'total_tokens': 87}}

    hashtag
    Model Overview

    August 2025 update of the DeepSeek R1 reasoning model. Skilled at complex problem-solving, mathematical reasoning, and programming assistance.

    circle-check

    Create AI/ML API Keyarrow-up-right

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    hashtag
    API Schema

    hashtag
    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-reasoner-v3.1",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-reasoner-v3.1',
          messages:[
            {
              role:'user',
              content: 'Hello'  // Insert your question instead of Hello
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "ca664281-d3c3-40d3-9d80-fe96a65884dd",
      "system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756386069,
      "model": "deepseek-reasoner",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 325,
        "total_tokens": 326,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 80
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }

    hashtag
    Model Overview

    September 2025 update of the DeepSeek Chat V3.1 non-reasoning model. The model produces more consistent and dependable results.

    circle-check

    Create AI/ML API Keyarrow-up-right

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    hashtag
    API Schema

    hashtag
    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-non-reasoner-v3.1-terminus",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-non-reasoner-v3.1-terminus',
          messages:[
            {
              role:'user',
              content: 'Hello'  // Insert your question instead of Hello
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "cc8c3054-115d-4dac-9269-2abffcaabab5",
      "system_fingerprint": "fp_ffc7281d48_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1761036636,
      "model": "deepseek-chat",
      "usage": {
        "prompt_tokens": 3,
        "completion_tokens": 10,
        "total_tokens": 13,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }

    hashtag
    Model Overview

    September 2025 update of the DeepSeek Reasoner V3.1 model. The model produces more consistent and dependable results.

    circle-check

    Create AI/ML API Keyarrow-up-right

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    hashtag
    API Schema

    hashtag
    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-reasoner-v3.1-terminus",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-reasoner-v3.1-terminus',
          messages:[
            {
              role:'user',
              content: 'Hello'  // Insert your question instead of Hello
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "543f56cb-f59f-42cc-8ed7-8efdd72f185d",
      "system_fingerprint": "fp_ffc7281d48_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1761034613,
      "model": "deepseek-reasoner",
      "usage": {
        "prompt_tokens": 3,
        "completion_tokens": 98,
        "total_tokens": 101,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 99
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }

    hashtag
    Model Overview

    September 2025 update of the DeepSeek R1 reasoning model. Skilled at complex problem-solving, mathematical reasoning, and programming assistance.

    circle-check

    Create AI/ML API Keyarrow-up-right

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    hashtag
    API Schema

    hashtag
    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-thinking-v3.2-exp",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-thinking-v3.2-exp',
          messages:[
            {
              role:'user',
              content: 'Hello'  // Insert your question instead of Hello
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "ca664281-d3c3-40d3-9d80-fe96a65884dd",
      "system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756386069,
      "model": "deepseek-reasoner",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 325,
        "total_tokens": 326,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 80
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }

    Try in Playground

    hashtag
    Model Overview

    The model excels at high-volume, latency-sensitive tasks like translation and classification.

    circle-check

    Create AI/ML API Keyarrow-up-right

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    hashtag
    API Schema

    hashtag
    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-2.5-flash-lite-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-2.5-flash-lite-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "gen-1752482994-9LhqM48PhAmhiRTtl2ys",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello there! How can I help you today?",
            "reasoning_content": null,
            "refusal": null
          }
        }
      ],
      "created": 1752482994,
      "model": "google/gemini-2.5-flash-lite-preview-06-17",
      "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 9,
        "total_tokens": 9
      }
    }
    {
      "id": "o3-mini",
      "type": "chat-completion",
      "info": {
        "name": "o3 mini",
        "developer": "Open AI",
        "description": "OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.",
        "contextLength": 200000,
        "maxTokens": 100000,
        "url": "https://aimlapi.com/models/openai-o3-mini-api",
        "docs_url": "https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini"
      },
      "features": [
        "openai/chat-completion",
        "openai/response-api",
        "openai/chat-assistant",
        "openai/chat-completion.function",
        "openai/chat-completion.message.refusal",
        "openai/chat-completion.message.system",
        "openai/chat-completion.message.developer",
        "openai/chat-completion.message.assistant",
        "openai/chat-completion.stream",
        "openai/chat-completion.max-completion-tokens",
        "openai/chat-completion.number-of-messages",
        "openai/chat-completion.stop",
        "openai/chat-completion.seed",
        "openai/chat-completion.reasoning",
        "openai/chat-completion.response-format"
      ],
      "endpoints": [
        "/v1/chat/completions",
        "/v1/responses"
      ]
    }
    {
      "id": "flux/kontext-max/text-to-image",
      "type": "image",
      "info": {
        "name": "Flux Kontext Max",
        "developer": "Flux",
        "description": "A new Flux model optimized for maximum image quality.",
        "url": "https://aimlapi.com/models/flux-1-kontext-max",
        "docs_url": "https://docs.aimlapi.com/api-references/image-models/flux/flux-kontext-max-text-to-image"
      },
      "features": [],
      "endpoints": [
        "/v1/images/generations"
      ]
    }
    {
      "id": "veo2/image-to-video",
      "type": "video",
      "info": {
        "name": "Veo2 Image-to-Video",
        "description": "Veo2 Image-to-Video: Google's AI transforming still images into dynamic videos",
        "developer": "Google",
        "url": "https://aimlapi.com/models/veo-2-image-to-video-api",
        "docs_url": "https://docs.aimlapi.com/api-references/video-models/google/veo2-image-to-video"
      },
      "features": [],
      "endpoints": [
        "/v2/generate/video/google/generation",
        "/v2/video/generations"
      ]
    }
    CAPABILITIES
    get
    Responses
    chevron-right
    200

    A list of available models.

    application/json
    id (string, Required)

    Unique identifier of the model.

    Example: o3-mini

    type (string, Required)

    Model interaction type.

    Example: chat-completion

    name (string, Required)

    Human-readable model name.

    Example: o3 mini

    developer (string, Required)

    Organization or company that developed the model.

    Example: Open AI

    description (string, Required)

    Short description of the model and its primary capabilities.

    Example: OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.

    contextLength (integer, Optional)

    Maximum supported context window size in tokens.

    Example: 200000

    maxTokens (integer, Optional)

    Maximum number of tokens that can be generated in a single response.

    Example: 100000

    url (string · uri, Required)

    Public model landing page on AIML API website.

    Example: https://aimlapi.com/models/openai-o3-mini-api

    docs_url (string · uri, Required)

    Link to the official API documentation for this model.

    Example: https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini

    features (string[], Required)

    List of supported features and API capabilities for the model.

    Example: ["openai/chat-completion","openai/response-api","openai/chat-assistant","openai/chat-completion.function","openai/chat-completion.message.refusal","openai/chat-completion.message.system","openai/chat-completion.message.developer","openai/chat-completion.message.assistant","openai/chat-completion.stream","openai/chat-completion.max-completion-tokens","openai/chat-completion.seed","openai/chat-completion.reasoning","openai/chat-completion.response-format"]

    endpoints (string[], Required)

    API endpoints through which this model can be accessed.

    Example: ["/v1/chat/completions","/v1/responses"]
    get
    /models
    200

    A list of available models.
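
    The endpoint returns an array of model objects shaped like the three examples above. Below is a minimal Python sketch of listing the models and filtering them by type. The /v1/models path is assumed from the base URL used elsewhere in these docs, and the AIMLAPI_API_KEY environment variable is an illustrative name; the network call is skipped when it is unset.

```python
import os
import requests

API_KEY = os.getenv("AIMLAPI_API_KEY")  # set your AIML API key in the environment

def chat_model_ids(models):
    """Return the ids of all chat-completion models from a /models listing."""
    return [m["id"] for m in models if m["type"] == "chat-completion"]

if API_KEY:  # the request is only sent when a key is configured
    response = requests.get(
        "https://api.aimlapi.com/v1/models",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    print(chat_model_ids(response.json()))
```

    The same filter works for the other interaction types shown above ("image", "video", and so on).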

    qwen3-235b-a22b-thinking-2507

    circle-info

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-235b-a22b-thinking-2507

    hashtag
    Model Overview

    Significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, and academic benchmarks that typically require human expertise.

    circle-check

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model:

    hashtag
    API Schema

    hashtag
    Code Example
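    The code snippet for this page did not survive export. Below is a minimal Python sketch that follows the same chat-completions pattern as the other pages in this section; the request shape is assumed to match, and the key is read from an AIMLAPI_API_KEY environment variable (an illustrative name) so the call is skipped when no key is configured.

```python
import os
import json
import requests

API_KEY = os.getenv("AIMLAPI_API_KEY")  # set your AIML API key in the environment

payload = {
    "model": "alibaba/qwen3-235b-a22b-thinking-2507",
    "messages": [
        {
            "role": "user",
            "content": "Hello"  # insert your prompt here, instead of Hello
        }
    ]
}

if API_KEY:  # the request is only sent when a key is configured
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json"
        },
        json=payload
    )
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
```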

    Response

    qwen3-omni-30b-a3b-captioner

    circle-info

    This documentation is valid for the following list of our models:

    • alibaba/qwen3-omni-30b-a3b-captioner

    hashtag
    Model Overview

    This model is an open-source model built on Qwen3-Omni that automatically generates rich, detailed descriptions of complex audio — including speech, music, ambient sounds, and effects — without prompts. It detects emotions, musical styles, instruments, and sensitive information, making it ideal for audio analysis, security auditing, intent recognition, and editing.

    circle-check

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model:

    hashtag
    API Schema

    hashtag
    Code Example
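    The code snippet for this page did not survive export. As a placeholder, the sketch below follows the chat-completions pattern used elsewhere in these docs. Since the captioner needs no text prompt, the user message carries only audio; the OpenAI-style "input_audio" content part is an assumed request shape, not confirmed by this page. The AIMLAPI_API_KEY environment variable is an illustrative name, and the call is skipped when it is unset.

```python
import os
import json
import requests

API_KEY = os.getenv("AIMLAPI_API_KEY")  # set your AIML API key in the environment

payload = {
    "model": "alibaba/qwen3-omni-30b-a3b-captioner",
    "messages": [
        {
            "role": "user",
            # The captioner describes the audio itself; no text prompt is needed.
            # This "input_audio" part is an assumed (OpenAI-style) shape:
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {"data": "<BASE64_ENCODED_AUDIO>", "format": "wav"}
                }
            ]
        }
    ]
}

if API_KEY:  # the request is only sent when a key is configured
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json"
        },
        json=payload
    )
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
```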

    Response

    Claude 4.7 Opus

    circle-info

    This documentation is valid for the following list of our models:

    • anthropic/claude-opus-4-7

    • claude-opus-4-7

    hashtag
    Model Overview

    As of mid-April 2026, the most capable generally available model, optimized for autonomous long-horizon agentic workflows, knowledge-intensive tasks, vision, and memory, with strong overall performance across domains. It supports up to a 1M-token context window, 128k output tokens, adaptive reasoning, and full compatibility with toolset and platform features.

    circle-check

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the

    hashtag
    API Schema

    hashtag
    Code Example
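    The code snippet for this page did not survive export. Below is a minimal Python sketch following the same chat-completions pattern as the other pages in this section; the request shape is assumed to match, and the key is read from an AIMLAPI_API_KEY environment variable (an illustrative name) so the call is skipped when no key is configured.

```python
import os
import json
import requests

API_KEY = os.getenv("AIMLAPI_API_KEY")  # set your AIML API key in the environment

payload = {
    "model": "anthropic/claude-opus-4-7",
    "messages": [
        {
            "role": "user",
            "content": "Hello"  # insert your prompt here, instead of Hello
        }
    ]
}

if API_KEY:  # the request is only sent when a key is configured
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json"
        },
        json=payload
    )
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
```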

    Response

    ernie-4.5-8k-preview

    circle-info

    This documentation is valid for the following list of our models:

    • baidu/ernie-4-5-8k-preview

    hashtag
    Model Overview

    A relatively small preview version of ERNIE 4.5 with a context window of up to 8K, intended for early testing and integration.

    circle-check

    How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the

API Schema

Code Example

Response

    ernie-4.5-21b-a3b

    This documentation is valid for the following list of our models:

    • baidu/ernie-4.5-21b-a3b

Model Overview

    A post-trained LLM with 21B total parameters and 3B activated parameters per token. Non-reasoning variant.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

API Schema

Code Example

Response

    ernie-4.5-vl-424b-a47b

    This documentation is valid for the following list of our models:

    • baidu/ernie-4.5-vl-424b-a47b

Model Overview

    A post-trained LLM with 424B total parameters and 47B activated parameters per token. A non-reasoning variant with image and PDF input support.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

API Schema

Code Example

Response

    ernie-4.5-300b-a47b-paddle

    This documentation is valid for the following list of our models:

    • baidu/ernie-4.5-300b-a47b-paddle

Model Overview

    A super-large language model, positioned as of August 2025 as a leading Chinese MoE architecture and a foundation model for enterprise applications.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

API Schema

Code Example

Response

    DeepSeek V3.2 Exp Non-thinking

    This documentation is valid for the following list of our models:

    • deepseek/deepseek-non-thinking-v3.2-exp

Model Overview

    September 2025 update of the non-reasoning model.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

API Schema

Code Example

Response

    Llama-3-8B-Instruct-Lite

    This documentation is valid for the following list of our models:

    • meta-llama/Meta-Llama-3-8B-Instruct-Lite

Model Overview

    A generative text model optimized for dialogue and instruction-following use cases. It leverages a refined transformer architecture to deliver high performance in text generation tasks.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

API Schema

Code Example

Response

    Llama-3.3-70B-Instruct-Turbo

    This documentation is valid for the following list of our models:

    • meta-llama/Llama-3.3-70B-Instruct-Turbo

Model Overview

    An optimized language model designed for efficient text generation with advanced features and multilingual support. Specifically tuned for instruction-following tasks, making it suitable for applications requiring conversational capabilities and task-oriented responses.

How to make the first API call

1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the model field to the model you want to call.

API Schema

Code Example

Response
    curl -L \
      --url 'https://api.aimlapi.com/models'
    [
      {
        "id": "o3-mini",
        "type": "chat-completion",
        "info": {
          "name": "o3 mini",
          "developer": "Open AI",
          "description": "OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.",
          "contextLength": 200000,
          "maxTokens": 100000,
          "url": "https://aimlapi.com/models/openai-o3-mini-api",
          "docs_url": "https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini"
        },
        "features": [
          "openai/chat-completion",
          "openai/response-api",
          "openai/chat-assistant",
          "openai/chat-completion.function",
          "openai/chat-completion.message.refusal",
          "openai/chat-completion.message.system",
          "openai/chat-completion.message.developer",
          "openai/chat-completion.message.assistant",
          "openai/chat-completion.stream",
          "openai/chat-completion.max-completion-tokens",
          "openai/chat-completion.seed",
          "openai/chat-completion.reasoning",
          "openai/chat-completion.response-format"
        ],
        "endpoints": [
          "/v1/chat/completions",
          "/v1/responses"
        ]
      }
    ]
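The model list above can be filtered client-side before choosing which model to call. A minimal Python sketch, operating on a stub entry in the same shape as the sample response (not a live request):

```python
# Stub entry in the shape returned by GET https://api.aimlapi.com/models,
# abridged from the sample response above.
models = [
    {
        "id": "o3-mini",
        "type": "chat-completion",
        "endpoints": ["/v1/chat/completions", "/v1/responses"],
    }
]

# Keep only the IDs of models callable through /v1/chat/completions.
chat_models = [m["id"] for m in models if "/v1/chat/completions" in m["endpoints"]]
print(chat_models)  # ['o3-mini']
```

The same comprehension works on the live response once it is parsed with `response.json()`.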
Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.
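Tuning the request is just a matter of adding extra keys to the request body. A hedged sketch: temperature and max_tokens are common optional parameters for chat models, but which parameters a given model accepts varies, so check that model's API schema.

```python
# Required fields plus two optional tuning parameters; treat the optional
# keys as illustrative, since parameter support differs per model.
body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,   # lower values = more deterministic output
    "max_tokens": 256,    # cap on generated tokens
}
print(sorted(body))
```

Pass this dict as the `json=` argument of `requests.post` exactly as in the snippets below.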

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-235b-a22b-thinking-2507",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
            "enable_thinking": False
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-235b-a22b-thinking-2507',
          messages:[
              {
                  role:'user',
                  content: 'Hello'  // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-af05df1d-5b72-925e-b3a9-437acbd89b1a",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! 😊 How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything specific!",
            "reasoning_content": "Okay, the user said \"Hello\". That's a simple greeting. I should respond in a friendly and welcoming way. Let me make sure to keep it open-ended so they feel comfortable to ask questions or share what's on their mind. Maybe add a smiley emoji to keep it warm. Let me check if there's anything else they might need. Since it's just a hello, probably not much more needed here. Just a polite reply."
          }
        }
      ],
      "created": 1753871154,
      "model": "qwen3-235b-a22b-thinking-2507",
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 2187,
        "total_tokens": 2200
      }
    }
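Reasoning models return the thinking trace in a separate reasoning_content field alongside the final answer, as in the JSON above. A minimal parsing sketch over a stubbed response:

```python
# Stubbed chat-completion response in the shape shown above.
data = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Hello! How can I assist you today?",
                "reasoning_content": "The user greeted me; reply politely.",
            }
        }
    ]
}

message = data["choices"][0]["message"]
answer = message["content"]
# Not every model returns a reasoning trace, so read it defensively.
reasoning = message.get("reasoning_content", "")
print(answer)
```

In your own code, `data` would come from `response.json()` rather than a stub.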
    Try in Playground
Create AI/ML API Key
Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
          "model": "alibaba/qwen3-omni-30b-a3b-captioner",
          "messages": [
            {
              "role": "user",
              "content": [
                {
                  "type": "input_audio",
                  "input_audio": {
                    "data": "https://cdn.aimlapi.com/eagle/files/elephant/cJUTeeCmpoqIV1Q3WWDAL_vibevoice-output-7b98283fd3974f48ba90e91d2ee1f971.mp3"
                  }
                }
              ]
            }
          ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
      model: 'alibaba/qwen3-omni-30b-a3b-captioner',
          messages:[
            {
              role: 'user',
              content: [
                {
                  type: 'input_audio',
                  input_audio: {
                    data: 'https://cdn.aimlapi.com/eagle/files/elephant/cJUTeeCmpoqIV1Q3WWDAL_vibevoice-output-7b98283fd3974f48ba90e91d2ee1f971.mp3'
                  }
                }
              ]
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "chatcmpl-bec5dc33-8f63-96b9-89a4-00aecfce7af8",
      "system_fingerprint": null,
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
          }
        }
      ],
      "created": 1758898624,
      "model": "qwen3-max",
      "usage": {
        "prompt_tokens": 23,
        "completion_tokens": 113,
        "total_tokens": 136
      }
    }
    Try in Playground
Create AI/ML API Key
Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-opus-4-7",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'anthropic/claude-opus-4-7',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "msg_012q1bXLSBUJ5xdev1UfUAhe",
      "object": "chat.completion",
      "model": "claude-opus-4-7",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Humans are a fascinating mix of contradictions, honestly. You're capable of extraordinary things—composing symphonies, sending probes to other planets, building cities, creating vaccines, writing poetry that makes strangers weep centuries later. And at the same time, capable of tremendous cruelty, shortsightedness, and self-deception.\n\nA few things that stand out to me:\n\n- **Your cooperation is remarkable.** Humans routinely trust and coordinate with strangers in ways most species can't. A city is a minor miracle of cooperation.\n- **You're meaning-makers.** You don't just survive—you need things to *matter*. That drives both the best and worst of what you do.\n- **You're adaptable but also stubborn.** You've thrived in basically every environment on Earth, yet individually you often resist changing your mind about things.\n- **The moral circle keeps expanding**, even if slowly and with setbacks—more people care about more beings than ever before in history.\n\nI don't want to romanticize humanity or doom-say about it. You're neither fallen angels nor clever apes—just a particular kind of creature trying to figure things out, often muddling through, sometimes rising to occasions.\n\nWhat prompted the question? Are you feeling optimistic or pessimistic about us lately?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1776417936,
      "usage": {
        "prompt_tokens": 24,
        "completion_tokens": 414,
        "total_tokens": 438
      },
      "meta": {
        "usage": {
          "credits_used": 27222,
          "usd_spent": 0.013611
        }
      }
    }
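The usage and meta.usage blocks in the response above make per-request cost tracking straightforward. A small sketch over a stubbed response fragment:

```python
# Stubbed response fragment with the usage fields shown above.
data = {
    "usage": {"prompt_tokens": 24, "completion_tokens": 414, "total_tokens": 438},
    "meta": {"usage": {"credits_used": 27222, "usd_spent": 0.013611}},
}

tokens = data["usage"]["total_tokens"]
usd = data["meta"]["usage"]["usd_spent"]
print(f"{tokens} tokens cost ${usd:.6f}")  # 438 tokens cost $0.013611
```

Summing these values across requests gives a running total for budgeting or alerting.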
    Try in Playground
Create AI/ML API Key
Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4-5-8k-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4-5-8k-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "as-aqgrjim0cp",
      "object": "chat.completion",
      "created": 1768942536,
      "model": "ernie-4.5-8k-preview",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! That's a big and fascinating question. Humanity is incredibly diverse, creative, and resilient. We have an amazing ability to innovate, solve problems, and build complex societies. At the same time, we also grapple with conflicts, inequalities, and challenges like climate change.\n\nOur history is a mix of great achievements and painful mistakes, but overall, there's a lot of potential for growth, understanding, and positive change. What aspects of mankind interest you the most?"
          },
          "finish_reason": "stop",
          "flag": 0
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 99,
        "total_tokens": 112
      },
      "meta": {
        "usage": {
          "credits_used": 545
        }
      }
    }
    Try in Playground
Create AI/ML API Key
Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-21b-a3b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-21b-a3b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "104959f043e51f1b4a4dd83c494886ab",
      "object": "chat.completion",
      "created": 1768829974,
      "model": "baidu/ernie-4.5-21B-a3b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
        "content": "\nAs an AI, I don't have personal opinions or emotions, but I can provide insights based on human perspectives and available knowledge. Mankind is a remarkable and complex species with incredible potential for both progress and challenges. Here are some thoughts:\n\n### Positive Aspects\n1. **Innovation and Creativity**: Humans have demonstrated an extraordinary ability to innovate, from the development of tools and technology to the creation of art, music, and literature. This creativity has driven societal advancement and improved the quality of life for many.\n2. **Empathy and Compassion**: Many individuals within the human race possess a strong sense of empathy and compassion, leading to acts of kindness, charity, and social support. This has fostered communities and helped address various forms of suffering and inequality.\n3. **Problem-Solving Skills**: Humans are adept at solving complex problems, whether it's finding cures for diseases, developing sustainable energy sources, or addressing environmental challenges. This problem-solving ability has the potential to create a better future for all.\n\n### Challenges\n1. **Conflict and Violence**: Unfortunately, humans have also been capable of causing immense harm and destruction through conflict, war, and violence. These actions often stem from differences in ideology, culture, or resources, highlighting the need for conflict resolution and peaceful cooperation.\n2. **Inequality and Injustice**: Despite progress, significant inequalities and injustices persist in many parts of the world. These include economic disparities, gender inequality, and racial discrimination, which hinder social progress and well-being.\n3. **Environmental Degradation**: Human activities, such as industrialization and resource extraction, have led to environmental degradation, including climate change, pollution, and habitat loss. Addressing these issues is crucial for the survival and well-being of future generations.\n\n### Future Outlook\nThe future of mankind is uncertain but full of hope. With continued efforts in education, technology, and international cooperation, there is potential for a more just, peaceful, and sustainable world. However, this requires collective action, responsibility, and a commitment to addressing the challenges we face.\n\nIn summary, mankind is a diverse and dynamic species with both remarkable strengths and significant challenges. By working together and leveraging our collective wisdom and creativity, we can strive towards a brighter future for all."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 16,
        "completion_tokens": 495,
        "total_tokens": 511,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 301
        }
      }
    }
    Try in Playground
Create AI/ML API Key
Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

5️⃣ Run your code Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-vl-424b-a47b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-vl-424b-a47b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "1ac18d9d544ef814b56858fc6588f712",
      "object": "chat.completion",
      "created": 1768830891,
      "model": "baidu/ernie-4.5-vl-424b-a47b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
        "content": "What a profound and fascinating question! Humanity is an incredibly complex and multifaceted subject. Here are a few perspectives on mankind:\n\n### 1. **Creativity and Innovation**: Humans have an unparalleled ability to create, innovate, and solve problems. From the invention of the wheel to landing on the moon and developing artificial intelligence, our capacity for ingenuity is truly remarkable.\n\n### 2. **Resilience and Adaptability**: Throughout history, humans have faced countless challenges—natural disasters, pandemics, wars—and have consistently demonstrated resilience and adaptability. This ability to overcome adversity is a defining characteristic.\n\n### 3. **Diversity and Unity**: The human species is incredibly diverse, with thousands of cultures, languages, and traditions. Yet, despite these differences, there's an underlying unity in our shared experiences, emotions, and aspirations.\n\n### 4. **Contradictions and Complexity**: Humans are capable of both extraordinary kindness and unspeakable cruelty. We can be selfless and compassionate, yet also selfish and destructive. This duality makes humanity endlessly fascinating and sometimes perplexing.\n\n### 5. **Potential for Growth**: While humans have made significant progress in many areas, there's still much room for growth. Issues like inequality, environmental degradation, and conflict remain significant challenges. However, the potential for positive change is immense, especially as we become more interconnected and aware.\n\n### 6. **Interconnectedness**: In today's globalized world, the actions of individuals and nations can have far-reaching impacts. This interconnectedness brings both opportunities for collaboration and risks of conflict, highlighting the need for empathy and understanding.\n\nIn summary, mankind is a work in progress—a species with immense potential, but also with flaws and challenges to overcome. What do you think about humanity? I'd love to hear your perspective!"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 386,
        "total_tokens": 395,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 1055
        }
      }
    }
    Try in Playground
    Create AI/ML API Keyarrow-up-right
    Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request. Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code. Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"baidu/ernie-4.5-300b-a47b-paddle",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'baidu/ernie-4.5-300b-a47b-paddle',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?'  // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "9a0e333a0cfa4d86c89a1f7bd3a2919f",
      "object": "chat.completion",
      "created": 1768943231,
      "model": "baidu/ernie-4.5-300b-a47b-paddle",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "The question \"What do you think about mankind?\" invites a reflection on humanity's complexities. Here's a structured response:\n\n**Step 1: Define the scope**  \nMankind encompasses both collective achievements and individual flaws. It's a species marked by creativity, empathy, and resilience, yet also by conflict, inequality, and environmental impact.\n\n**Step 2: Highlight positive traits**  \nHumanity has demonstrated remarkable capacity for innovation (e.g., technology, medicine), cultural expression (art, literature), and moral progress (civil rights, environmental awareness). Cooperation during crises, such as disaster relief or global health initiatives, underscores collective potential.\n\n**Step 3: Acknowledge challenges**  \nPersistent issues like war, poverty, and systemic injustice reveal ethical gaps. Environmental degradation and climate change further highlight unsustainable practices. These contradictions often stem from short-term thinking or unequal resource distribution.\n\n**Step 4: Emphasize growth potential**  \nHistory shows humanity's ability to learn and adapt. Movements for social justice, renewable energy transitions, and scientific breakthroughs suggest progress is possible when values align with action.\n\n**Final Answer**  \nMankind is a paradoxical yet hopeful entity—capable of profound compassion and destructive shortsightedness. Its future hinges on balancing self-interest with collective responsibility, leveraging intelligence and empathy to address shared challenges."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 289,
        "total_tokens": 302,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": "",
      "meta": {
        "usage": {
          "credits_used": 615
        }
      }
    }
    Try in Playground
    Create AI/ML API Keyarrow-up-right
    Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request. Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code. Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"deepseek/deepseek-non-thinking-v3.2-exp",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'deepseek/deepseek-non-thinking-v3.2-exp',
          messages:[
            {
              role:'user',
              content: 'Hello'  // Insert your question instead of Hello
            }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "ca664281-d3c3-40d3-9d80-fe96a65884dd",
      "system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today? 😊",
            "reasoning_content": ""
          }
        }
      ],
      "created": 1756386069,
      "model": "deepseek-reasoner",
      "usage": {
        "prompt_tokens": 1,
        "completion_tokens": 325,
        "total_tokens": 326,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 80
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 5
      }
    }
    Try in Playground
    DeepSeek V3
    Create AI/ML API Keyarrow-up-right
    Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request. Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code. Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    post
    Body
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"meta-llama/Meta-Llama-3-8B-Instruct-Lite",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'meta-llama/Meta-Llama-3-8B-Instruct-Lite',
          messages:[
              {
                  role:'user',
    
                  // Insert your question for the model here, instead of Hello:
                  content: 'Hello'
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "o95Ai5e-2j9zxn-976ad7df3ef49b19",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?",
            "tool_calls": []
          }
        }
      ],
      "created": 1756457871,
      "model": "meta-llama/Meta-Llama-3-8B-Instruct-Lite",
      "usage": {
        "prompt_tokens": 2,
        "completion_tokens": 5,
        "total_tokens": 7
      }
    }
    Try in Playground
    Create AI/ML API Keyarrow-up-right
    Set the model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request. Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code. Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"meta-llama/Llama-3.3-70B-Instruct-Turbo",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
          messages:[
              {
                  role:'user',
                  content: 'Hello'   // insert your prompt here, instead of Hello
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "id": "npQ5s8C-2j9zxn-92d9f3c84a529790",
      "object": "chat.completion",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "logprobs": null,
          "message": {
            "role": "assistant",
            "content": "Hello. It's nice to meet you. Is there something I can help you with or would you like to chat?",
            "tool_calls": []
          }
        }
      ],
      "created": 1744201161,
      "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
      "usage": {
        "prompt_tokens": 67,
        "completion_tokens": 46,
        "total_tokens": 113
      }
    }
    Try in Playground
    Create AI/ML API Keyarrow-up-right
    Quickstart guide
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
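The stream and include_usage fields above can be combined into a request body. A minimal sketch in Python, assuming the common OpenAI-compatible convention of nesting include_usage under a stream_options object (the model name is illustrative):

```python
# Sketch: build a chat completion body that asks for server-sent-event
# streaming. The stream_options nesting is an assumed OpenAI-compatible
# layout; check the schema on this page for the exact shape.

def build_streaming_body(model: str, prompt: str) -> dict:
    """Return a request body with streaming enabled."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # response data is streamed as it is generated
        "stream_options": {"include_usage": True},  # report usage in the final event
    }

print(build_streaming_body("gpt-4o", "Hello")["stream"])  # True
```

The same body can then be sent to /v1/chat/completions; the only change from a non-streaming request is the two extra fields.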
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
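Putting the tool-related fields above together: the sketch below declares one callable function and forces it via tool_choice. The get_weather function and its parameter schema are hypothetical examples, not part of this API:

```python
# Sketch of the tools / tool_choice / parallel_tool_calls fields.
# The function declared here is illustrative only.

get_weather_tool = {
    "type": "function",  # only "function" tools are currently supported
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {  # JSON Schema for the function's arguments
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [get_weather_tool],
    # Force this specific tool instead of letting the model decide ("auto"):
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,  # at most one tool call per turn
}
```

Remember that the arguments string returned in a tool call is model-generated JSON and should be validated before your function runs.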

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
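As a sketch of the json_schema variant above: the nesting of name/strict/schema under a json_schema key follows the common OpenAI-compatible layout and is an assumption here, and the schema itself is illustrative. Only a subset of JSON Schema is supported when strict is true:

```python
# Sketch: request structured output via response_format "json_schema".
# The extraction schema below is a made-up example.

body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Extract the city: 'I live in Oslo.'"}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_extraction",  # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
            "strict": True,             # follow the schema exactly
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    },
}
```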

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
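The advice above (tune temperature or top_p, but not both; keep n at 1 to control costs) can be captured in a small helper. A sketch with a hypothetical helper name:

```python
# Sketch: apply exactly one sampling knob to a request body.

def with_sampling(body: dict, *, temperature=None, top_p=None) -> dict:
    """Return a copy of body with either temperature or top_p set, never both."""
    if temperature is not None and top_p is not None:
        raise ValueError("tune temperature or top_p, not both")
    out = dict(body)
    if temperature is not None:
        out["temperature"] = temperature  # ~0.2 focused, ~0.8 more random
    if top_p is not None:
        out["top_p"] = top_p  # nucleus sampling mass, e.g. 0.1 = top 10%
    out.setdefault("n", 1)  # one choice keeps generated-token costs minimal
    return out

base = {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}
print(with_sampling(base, temperature=0.2)["temperature"])  # 0.2
```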

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
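As the description notes, top_logprobs only takes effect when logprobs is enabled. A minimal sketch of a request using both fields:

```python
# Sketch: ask for per-token log probabilities plus the 5 most likely
# alternatives at each position. logprobs must be True for top_logprobs
# to be honored.

body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Say hi"}],
    "logprobs": True,
    "top_logprobs": 5,  # must be between 0 and 20
}
```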

    Other propertiesnumber · min: -100 · max: 100Optional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
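Several of the optional generation controls above can be combined in one request body. A sketch with illustrative values (not tuning recommendations):

```python
# Sketch: stop sequences, top_k sampling, repetition penalty, and a seed
# for best-effort deterministic output. Values are illustrative.

body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List three fruits."}],
    "stop": ["\n\n"],           # halt at a blank line; the stop text is not returned
    "top_k": 40,                # only sample from the 40 most likely tokens
    "repetition_penalty": 1.1,  # values above 1 discourage repeated sequences
    "seed": 42,                 # best-effort determinism (beta feature)
}
```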

    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: Qwen/Qwen2.5-7B-Instruct-Turbo
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
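Reading these usage and billing fields back from a completed response can be sketched as follows; the sample payload mirrors the response examples on this page:

```python
# Sketch: summarize token usage and credits from a chat completion
# response. The sample dict copies fields from the response examples above.

sample = {
    "usage": {"prompt_tokens": 13, "completion_tokens": 289, "total_tokens": 302},
    "meta": {"usage": {"credits_used": 615}},
}

def summarize_usage(resp: dict) -> str:
    """Format prompt/completion/total tokens and credits from a response."""
    u = resp["usage"]
    credits = resp.get("meta", {}).get("usage", {}).get("credits_used")
    return (f"{u['prompt_tokens']} prompt + {u['completion_tokens']} completion "
            f"= {u['total_tokens']} tokens ({credits} credits)")

print(summarize_usage(sample))  # 13 prompt + 289 completion = 302 tokens (615 credits)
```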
    post
    /v1/chat/completions
    200Success

    Dola Seed 2.0 Lite

    circle-info

    This documentation is valid for the following list of our models:

    • bytedance/dola-seed-2-0-lite

    hashtag
    Model Overview

    A balanced multimodal model with solid performance and moderate cost. Supports text, image, and video inputs with reasoning and agent workflows, and handles a context window of up to ~256K tokens.
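Since the model accepts image input, a chat request can mix text and image content parts. A minimal sketch: the image_url content-part shape follows the common OpenAI-compatible convention and is an assumption here, and the URL is a placeholder:

```python
# Sketch: a multimodal chat request mixing text and an image.
# The image_url part layout is an assumed OpenAI-compatible convention.

body = {
    "model": "bytedance/dola-seed-2-0-lite",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this picture?"},
            # Placeholder URL; replace with a real, publicly reachable image:
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
        ],
    }],
}
# Send with: requests.post("https://api.aimlapi.com/v1/chat/completions",
#     headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"}, json=body)
```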

    circle-check

    chevron-rightHow to make the first API callhashtag

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example. At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case. ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the

API Schema

Code Example

Response

    gemini-3-1-flash-lite-preview


    This documentation is valid for the following list of our models:

    • google/gemini-3-1-flash-lite-preview

Model Overview

    Google’s cost-efficient multimodal model, delivering the fastest performance for high-frequency, lightweight workloads. Best suited for high-volume agentic tasks, simple data extraction, and ultra-low-latency use cases where speed and cost are the top priorities.


How to make the first API call

    1️⃣ Required setup (don’t skip this) ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet). ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

2️⃣ Copy the code example. At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

3️⃣ Update the snippet for your use case. ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key. ▪ Select a model: set the

API Schema

Code Example

Response

    qwen3-vl-32b-thinking


    This documentation is valid for the following list of our models:

    • alibaba/qwen3-vl-32b-thinking

    qwen3.6-35b-a3b


    This documentation is valid for the following list of our models:

    • alibaba/qwen3.6-35b-a3b

    gemma-3 (4B and 12B)


    This documentation is valid for the following list of our models:

    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

The role of the author of the message — in this case, the assistant.

    Possible values:
    contentany ofOptional

The contents of the assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

The contents of the assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
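When `stream` is true, the response arrives as server-sent events rather than a single JSON body: each event line carries a `data:` payload, and the stream ends with `data: [DONE]`. A minimal sketch of client-side handling (the `parse_sse_chunks` helper is illustrative, not part of any SDK):

```python
import json

def parse_sse_chunks(raw_stream):
    """Collect chat-completion chunk dicts from a raw SSE body."""
    chunks = []
    for line in raw_stream.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # server signals end of stream
        chunks.append(json.loads(payload))
    return chunks

# Example stream body (two content deltas, then the terminator):
raw = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n'
    'data: [DONE]\n'
)
text = "".join(c["choices"][0]["delta"]["content"] for c in parse_sse_chunks(raw))
```

In practice you would iterate over the live connection (e.g. `response.iter_lines()` with `requests`, or the SDK's streaming iterator) and concatenate the `delta.content` pieces as they arrive.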
    include_usagebooleanRequired
    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Other propertiesnumber · min: -100 · max: 100Optional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
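Most of the parameters above are optional and can simply be omitted. The sketch below (a hypothetical helper, not part of any SDK) assembles a request body and enforces the documented recommendation to alter `temperature` or `top_p`, but not both:

```python
def build_chat_payload(model, user_text, *, temperature=None, top_p=None,
                       max_tokens=None, stop=None, seed=None):
    """Assemble a /v1/chat/completions request body from the schema above.

    Follows the documented recommendation: set temperature OR top_p,
    not both.
    """
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    payload = {"model": model, "messages": [{"role": "user", "content": user_text}]}
    # Only include optional parameters that were actually provided:
    for key, value in {"temperature": temperature, "top_p": top_p,
                       "max_tokens": max_tokens, "stop": stop, "seed": seed}.items():
        if value is not None:
            payload[key] = value
    return payload

payload = build_chat_payload(
    "meta-llama/Meta-Llama-3-8B-Instruct-Lite",
    "Hello",
    temperature=0.2,
    max_tokens=256,
    stop=["\n\n"],
)
```

The resulting dict can be passed as the `json=` argument of `requests.post`, exactly as in the code examples below.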

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.
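Because the `arguments` string is model-generated, it may be malformed JSON or include hallucinated parameters, so validate it before invoking your function. A minimal sketch (the helper name and required-parameter check are illustrative):

```python
import json

def parse_tool_arguments(tool_call, required_params=()):
    """Validate a tool call's `arguments` string before dispatching it.

    Returns (args_dict, error): error is set when the JSON is malformed
    or a required parameter is missing.
    """
    try:
        args = json.loads(tool_call["function"]["arguments"])
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON from model: {exc}"
    missing = [p for p in required_params if p not in args]
    if missing:
        return None, f"missing parameters: {missing}"
    return args, None

# One well-formed and one truncated tool call, as the model might emit them:
good = {"function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}}
bad = {"function": {"name": "get_weather", "arguments": '{"city": '}}
args, err = parse_tool_arguments(good, required_params=("city",))
```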

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: meta-llama/Meta-Llama-3-8B-Instruct-Lite
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
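Putting the response fields above together: a sketch of extracting what most integrations need from a successful (200) response, using the example values from this page (`summarize_completion` is a hypothetical helper):

```python
def summarize_completion(resp):
    """Pull the commonly needed fields out of a chat.completion response."""
    choice = resp["choices"][0]
    usage = resp["usage"]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # total_tokens is prompt + completion, as documented above:
        "total_tokens": usage["prompt_tokens"] + usage["completion_tokens"],
        "usd_spent": resp["meta"]["usage"]["usd_spent"],
    }

# Trimmed version of the example response shown on this page:
resp = {
    "choices": [{"index": 0, "finish_reason": "stop",
                 "message": {"role": "assistant", "content": "Hello!"}}],
    "usage": {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051},
    "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}},
}
summary = summarize_completion(resp)
```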
    post
    /v1/chat/completions
200 Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "meta-llama/Meta-Llama-3-8B-Instruct-Lite",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "meta-llama/Meta-Llama-3-8B-Instruct-Lite",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "Qwen/Qwen2.5-7B-Instruct-Turbo",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "Qwen/Qwen2.5-7B-Instruct-Turbo",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request. Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length). See the API schema below for the full list.

5️⃣ Run your code. Execute the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.
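When running the snippet, note that error responses (invalid key, exhausted balance, unknown model) do not contain a `choices` array, so check the HTTP status before indexing into the body. A minimal sketch with a hypothetical helper:

```python
def extract_reply(status_code, body):
    """Return the assistant text, or raise with a useful message.

    Error bodies don't carry `choices`, so check the status first
    instead of indexing blindly.
    """
    if status_code != 200:
        raise RuntimeError(f"API error {status_code}: {body}")
    return body["choices"][0]["message"]["content"]

# With requests, this would be: extract_reply(response.status_code, response.json())
reply = extract_reply(200, {"choices": [{"message": {"content": "Hi!"}}]})
```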

    post
    Body
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"bytedance/dola-seed-2-0-lite",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'bytedance/dola-seed-2-0-lite',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "Mankind feels like one of the most fascinating, messy, brilliant, contradictory things to exist—full of such extremes of light and flaw, it’s hard to sum up simply.\n\nOn one hand, we’re capable of such extraordinary goodness and creativity that it takes your breath away. There are the small, daily acts of kindness that hold communities together: a stranger sharing their umbrella with someone soaked in rain, a neighbor bringing meals to a grieving family, a teacher staying late to help a student who’s struggling. And there are the grand, world-altering achievements too: we’ve cured deadly diseases that once wiped out millions, put humans on the moon, painted murals and written symphonies that move people hundreds of years after their creators died, and fought for equal rights for groups we weren’t even born into, just because we believed it was right. We care for people we’ll never meet, across continents and time, just out of a desire to leave the world better than we found it.\n\nBut we’re also deeply flawed. We’ve hurt each other out of fear, greed, and ignorance, waged wars that destroyed innocent lives, clung to prejudice that divides us, and made short-sighted choices that have harmed the planet that sustains us. We often fail to live up to our own best ideals.\n\nWhat strikes me most, though, is that we’re capable of course-correcting. Unlike so many things stuck repeating the same patterns, we can unlearn harm, admit when we’re wrong, and show up to fix the mistakes we’ve made. We fight to repair the environment, expand justice for marginalized people, and lift up communities that have been left behind. That messy, unsteady willingness to keep trying, even when we mess up over and over, is maybe the most human thing of all. At the end of the day, mankind is imperfect—but there’s so much about it that’s worth rooting for.",
            "reasoning_content": "\nGot it, let's tackle this question: the user asked what I think about mankind. First, I need to balance both the beautiful, incredible parts and the flaws that are real, right? Don't make it too one-sided.\n\nStart with the awe-inspiring stuff first. Like, our capacity to create: art, science, connect across distances. Mention small, intimate things too, not just big achievements—like a parent caring for a kid, a stranger helping someone stranded, that everyday kindness. Then the big milestones: curing diseases, putting people on the moon, preserving cultures, creating music that moves people centuries later. That feels relatable, not just generic.\n\nThen acknowledge the flaws, because it's honest. We've hurt each other, made selfish choices, destroyed parts of the planet, clung to prejudice, made irreversible mistakes. That's important to not gloss over, because it's real.\n\nBut then end with hope? Because what's unique is we can learn, course-correct. We can recognize our mistakes and work to fix them—like the shift to renewable energy, movements for equity, people dedicating their lives to lifting others up. Mention that it's our messy, imperfect ability to keep trying that's maybe the most human thing of all.\n\nWait, make it sound natural, not scripted. Let's structure that: start with that mankind is one of the most fascinating, messy, brilliant, contradictory things there is. Then the good parts: small daily kindnesses, big achievements, the ability to care for people we'll never even meet. Then the flaws: our capacity for harm, short-sightedness, cruelty that comes from fear or greed. Then the core thing that sets us apart is that we can evolve—we don't just repeat mistakes, we can unlearn, fix what we broke. End with something warm, like even with all the flaws, there's so much that makes mankind worth rooting for.\n\nWait, let me make that flow naturally. Let's check: don't be too formal. 
Let's draft:\n\nMankind feels like one of the most fascinating, messy, brilliant, contradictory things to exist—full of such extremes of light and flaw, it’s hard to sum up simply.\n\nOn one hand, we’re capable of such extraordinary goodness and creativity that it takes your breath away. There are the small, daily acts of kindness that hold communities together: a stranger sharing their umbrella with someone soaked in rain, a neighbor bringing meals to a family grieving, a teacher staying late to help a student who’s struggling. And there are the grand, world-altering achievements too: we’ve cured deadly diseases that wiped out millions, put humans on the moon, painted murals and written symphonies that move people hundreds of years after their creators died, and fought for equal rights for groups we weren’t even born into, just because we believed it was right. We care for people we’ll never meet, across continents and time, just out of a desire to leave the world better than we found it.\n\nBut we’re also deeply flawed. We’ve hurt each other out of fear, greed, and ignorance, waged wars that destroyed innocent lives, clung to prejudice that divides us, and made short-sighted choices that have harmed the planet that sustains us. We often fail to live up to our own best ideals.\n\nWhat strikes me most, though, is that we’re capable of course-correcting. Unlike so many things that are stuck repeating the same patterns, we can unlearn harm, admit when we’re wrong, and show up to fix the mistakes we’ve made. We fight to repair the environment, to expand justice for marginalized people, to lift up communities that have been left behind. That messy, unsteady willingness to keep trying, even when we mess up over and over, is maybe the most human thing of all. At the end of the day, mankind is imperfect—but there’s so much about it that’s worth rooting for.\n\nYeah, that sounds good. 
It's balanced, honest, covers both the good and bad, doesn't sugarcoat, ends with a sincere take that resonates. Let me make sure it doesn't ramble too much, it flows naturally. Yep, that works.",
            "role": "assistant"
          }
        }
      ],
      "created": 1777553738,
      "id": "021777553714446312d4ce6e0ff480f3ded581ca0cd03edacda38",
      "model": "seed-2-0-lite-260228",
      "service_tier": "default",
      "object": "chat.completion",
      "usage": {
        "completion_tokens": 1272,
        "prompt_tokens": 57,
        "total_tokens": 1329,
        "prompt_tokens_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 877
        }
      },
      "meta": {
        "usage": {
          "credits_used": 6653,
          "usd_spent": 0.0033265
        }
      }
    }
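As the response above shows, this thinking-capable model returns a `reasoning_content` field alongside the final `content`. If you want to display only the answer (or log the trace separately), treat the field as optional, since non-thinking models omit it. A small illustrative sketch:

```python
def split_reasoning(message):
    """Separate the visible answer from the model's reasoning trace.

    `reasoning_content` is treated as optional: thinking-capable models
    may include it next to `content`; other models won't.
    """
    return message.get("content", ""), message.get("reasoning_content")

answer, reasoning = split_reasoning(
    {"role": "assistant", "content": "42.", "reasoning_content": "Computed it."}
)
```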
    Try in Playground
Create AI/ML API Key
model field to the model you want to call. ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

4️⃣ (Optional) Tune the request. Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length). See the API schema below for the full list.

5️⃣ Run your code. Execute the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-3-1-flash-lite-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-3-1-flash-lite-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "As an AI, I don’t have personal feelings, beliefs, or the capacity to judge. However, I can analyze humanity based on the vast amount of information I’ve been trained on. Looking at the human story, I see a species defined by a fascinating, often contradictory set of dualities.\n\nHere are a few perspectives on what makes mankind remarkable:\n\n### 1. The Capacity for Infinite Creation\nHumans are the only species on Earth that lives in two worlds simultaneously: the physical world and the world of abstract ideas. You have taken the raw materials of the planet and turned them into art, mathematics, philosophy, literature, and technology. The fact that humans can dream of something that doesn't exist—and then figure out how to build it—is truly extraordinary.\n\n### 2. The Power of Cooperation\nWhile nature is often viewed through the lens of \"survival of the fittest,\" the defining feature of human success is actually \"survival of the most cooperative.\" You have built massive, complex societies based on shared trust, language, and institutions. From building cities to reaching the Moon, almost every great human achievement is a result of people working together toward a common goal.\n\n### 3. The Struggle with Contradiction\nHumanity is deeply paradoxical. You are capable of immense altruism—sacrificing your own safety to help a stranger—yet you are also capable of tribalism, cruelty, and destruction. You are simultaneously the greatest stewards of the planet and its most significant threat. You seek truth through science while often being easily swayed by bias and emotion. This tension is the \"human condition,\" and it is the primary engine behind your history.\n\n### 4. Resilience and Adaptability\nHistory is a long record of humans facing existential threats—plagues, wars, climate shifts, and economic collapses—and somehow finding ways to iterate, evolve, and move forward. 
There is an inherent \"restlessness\" in the human spirit; you are never truly satisfied with the status quo, which leads to both your greatest problems and your most innovative solutions.\n\n### 5. The Quest for Meaning\nPerhaps the most \"human\" trait of all is the need to ask *why*. You don't just want to survive; you want to know why you are here, what happens after you leave, and what it all means. Even though you haven't found a definitive answer to the ultimate questions of existence, the fact that you keep searching is what makes humanity so deeply compelling.\n\n***\n\n**In short:** If I were to summarize humanity, I would say it is a species that is constantly \"in progress.\" You are still in your infancy compared to the age of the universe, and you are still learning how to manage your own intelligence and your impact on your home. \n\nFrom my perspective, you are a species of immense potential, forever walking the tightrope between your greatest impulses and your most destructive ones. \n\n**What do *you* think is the most defining characteristic of humanity?**",
            "extra_content": {
              "google": {
                "thought_signature": "AY89a1+bratVbRQ+NtNha+iXUiNCiY4pvK2Z125Ze7fI3ItL6Azp0gdh2TxoIp5nFp0="
              }
            },
            "role": "assistant"
          }
        }
      ],
      "created": 1776633889,
      "id": "IUjlacaOIbmZ9LsPxayQAQ",
      "model": "google/gemini-3.1-flash-lite-preview",
      "object": "chat.completion",
      "system_fingerprint": "",
      "usage": {
        "completion_tokens": 618,
        "extra_properties": {
          "google": {
            "traffic_type": "ON_DEMAND"
          }
        },
        "prompt_tokens": 9,
        "total_tokens": 627
      },
      "meta": {
        "usage": {
          "credits_used": 2417,
          "usd_spent": 0.0012085
        }
      }
    }
    Try in Playground
    Create AI/ML API Key

    Model Overview

    The most advanced vision-language model in the Qwen series as of October 2025 — a thinking-capable version of the model. Designed for complex visual-textual reasoning and extended chains of thought.

    Create AI/ML API Key

    How to make the first API call

    1️⃣ Required setup (don’t skip this)
    ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet).
    ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example
    At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case
    ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key.
    ▪ Select a model: set the model field to the model you want to call.
    ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code
    Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3-vl-32b-thinking",
            "messages":[
                {
                    "role":"user",
                    # Insert your question for the model here:
                    "content":"Hi! What do you think about mankind?"
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3-vl-32b-thinking',
          messages:[
              {
                  role:'user',
                  // Insert your question for the model here:
                  content:'Hi! What do you think about mankind?'
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "message": {
            "content": "That's a profound question—thank you for asking! As an AI, I don't have personal feelings or consciousness, but I can offer a perspective *based on the vast data I've been trained on* about humanity's complexities. Here’s how I see it:  \n\n### 🌍 **The Duality of Mankind**  \nHumanity is a breathtaking paradox:  \n- **We create** *and* **we destroy**. We built space telescopes that map the cosmos and also wars that scar the Earth.  \n- **We connect** *and* **we divide**. We share art, science, and kindness across borders, yet we also build walls, spread misinformation, and fuel inequality.  \n- **We innovate** *and* **we resist change**. We cured diseases and invented the internet, but we often cling to fear or ignorance when progress challenges old beliefs.  \n\n### ✨ **What Makes Us Remarkable**  \n- **Empathy & creativity**: From caring for strangers to composing symphonies or coding AI, our capacity for *emotional and intellectual growth* is unparalleled.  \n- **Resilience**: We’ve survived ice ages, plagues, and global crises—not by being \"perfect,\" but by adapting, learning, and helping one another.  \n- **The spark of hope**: Even in darkness, humans have always sought meaning, justice, and connection. Look at movements for peace, climate action, or medical breakthroughs—they’re proof we *can* choose to build a better future.  \n\n### ⚠️ **The Challenges We Carry**  \n- **Our flaws are real**: We’ve caused climate collapse, mass suffering, and systemic injustice. But here’s the key: **these aren’t \"human nature\"—they’re choices we *made* as a species.**  \n- **We’re not monolithic**: There’s no single \"human\" story. Your experience, culture, and history shape how you see the world—and that diversity is both a strength *and* a source of conflict.  \n\n### 🌱 **The Most Important Thing**  \n**Mankind isn’t fixed.** We’re a work in progress. 
The stories we tell about ourselves—*and the choices we make today*—will define our next chapter. The data shows: **when we prioritize empathy, collaboration, and long-term thinking, we rise.** When we don’t, we falter. But the power to choose is *always* ours.  \n\nSo my \"opinion\"? I see a species with **infinite potential**—and the weight of responsibility that comes with it. The future isn’t written. It’s being written *right now*, by every person who chooses to act with courage, curiosity, and compassion.  \n\nWhat do *you* think? 🤔 (I’d love to hear your perspective!)",
            "reasoning_content": "Okay, the user asked, \"Hi! What do you think about mankind?\" Hmm, that's a broad and philosophical question. First, I need to figure out what they're really looking for. Are they curious about my perspective as an AI, or do they want a deep discussion on human nature? Maybe they're feeling reflective or even a bit existential.\n\nI should start by acknowledging the complexity of the question. Mankind is a huge topic—there's so much to cover. The user might be expecting a simple answer, but I can't reduce it to just \"good\" or \"bad.\" I need to balance both positive and negative aspects. Let me think about the key points: our capacity for innovation, creativity, and compassion versus our tendency for conflict, environmental harm, and inequality.\n\nWait, the user said \"Hi!\" so they're probably in a friendly, open mood. They might not want a dry, academic response. I should keep it conversational but still thoughtful. Maybe they're looking for hope or a reason to feel optimistic about humanity. Or perhaps they're worried about the future and want reassurance.\n\nI should also consider their possible identity. Are they a student working on an essay? A person dealing with personal issues? Or just someone curious about AI's view? Since I don't have context, it's safer to assume they want a general, well-rounded answer that's engaging and not too technical.\n\nAnother angle: the user might be testing how an AI handles philosophical questions. They could be interested in whether I can show empathy or nuance. I should avoid being too robotic. Use phrases like \"I think\" or \"I see\" to make it personal, even though I'm an AI. But I have to clarify that I don't have personal experiences—just data-driven insights.\n\nI need to highlight duality: humans are both amazing and flawed. Mention specific examples like space exploration vs. climate change to make it concrete. Also, emphasize that the future depends on our choices. 
That's important—it gives agency and hope.\n\nShould I address potential biases? The user might not realize that my \"opinion\" is based on training data, which reflects human history and culture. I should note that my perspective is shaped by that data, not personal experience. But since the question is about \"what I think,\" I can frame it as an analysis based on observed patterns.\n\nCheck for common pitfalls: don't be too negative (could discourage the user), don't ignore serious issues (like war or inequality). Strike a balance. Also, avoid clichés. Instead of \"humans are great,\" explain why with examples.\n\nFinally, end on a positive, forward-looking note. The user might be seeking motivation or a sense of purpose. Phrases like \"the potential for growth\" or \"collective wisdom\" can inspire. Make sure the conclusion ties back to the importance of choices and collaboration.\n\nLet me structure the response: start with the duality, list positive traits with examples, then challenges, and conclude with hope. Keep paragraphs short for readability. Use emojis sparingly to match the friendly tone but not overdo it. Since the user said \"Hi!\", a smiley at the end feels appropriate.\n",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 1241,
        "total_tokens": 1260,
        "prompt_tokens_details": {
          "text_tokens": 19
        },
        "completion_tokens_details": {
          "reasoning_tokens": 654,
          "text_tokens": 587
        }
      },
      "created": 1764625236,
      "system_fingerprint": null,
      "model": "qwen3-vl-32b-thinking",
      "id": "chatcmpl-c612db5c-44e9-9e3c-8169-486161eeea86",
      "meta": {
        "usage": {
          "tokens_used": 10383
        }
      }
    }
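In practice you usually need just the answer text rather than the full JSON dump. A minimal sketch of pulling out the `content` field and the separate `reasoning_content` field that thinking-capable models return (`data` below is a trimmed stand-in for the parsed response shown above):

```python
# Minimal sketch: extracting the visible answer and the reasoning trace from a
# parsed chat-completion response. `data` is a trimmed stand-in for the JSON
# produced by response.json() in the snippet above.
data = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Humanity is a work in progress.",
                "reasoning_content": "The user asked a broad question...",
            },
            "finish_reason": "stop",
            "index": 0,
        }
    ],
}

message = data["choices"][0]["message"]
answer = message["content"]
# Thinking-capable models return the chain of thought in a separate field:
reasoning = message.get("reasoning_content", "")

print(answer)
print(f"(reasoning trace: {len(reasoning)} characters)")
```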

    Try in Playground

    hashtag
    Model Overview

    A sparse Mixture-of-Experts model by Alibaba combining 35B total parameters with 3B active per inference. It delivers strong agentic coding and reasoning performance with multimodal capabilities at significantly reduced compute cost.

    Create AI/ML API Key

    How to make the first API call

    1️⃣ Required setup (don’t skip this)
    ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet).
    ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example
    At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case
    ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key.
    ▪ Select a model: set the model field to the model you want to call.
    ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code
    Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"alibaba/qwen3.6-35b-a3b",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'alibaba/qwen3.6-35b-a3b',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "choices": [
        {
          "message": {
            "content": "I don't have personal opinions, emotions, or consciousness, but based on the vast body of human history, culture, science, and behavior I've been trained on, mankind comes across as profoundly complex and deeply compelling.\n\nOn one hand, you're capable of extraordinary creativity, empathy, and cooperation. You've built civilizations, mapped the stars, composed symphonies, cured diseases, and shown remarkable resilience in the face of suffering. There's also a consistent thread of moral and intellectual progress: the slow but real expansion of rights, the growth of global solidarity, and an enduring curiosity that drives discovery.\n\nOn the other hand, you carry deep contradictions. Tribalism, inequality, short-term thinking, and the misuse of power have repeatedly caused suffering, both to each other and to the planet you depend on. Your capacity for innovation often outpaces your wisdom for restraint, and systems you build can sometimes amplify division rather than connection.\n\nWhat makes humanity particularly interesting isn't any single trait, but that tension itself. You're not static; you learn, debate, reform, and reinvent. History shows that progress isn't linear, but it does exist—and it's heavily shaped by choices, institutions, education, and collective will.\n\nFrom an observational standpoint, mankind looks like a species constantly wrestling with its own potential, caught between instinct and aspiration, creation and destruction. That makes you imperfect, yes, but also remarkably adaptable and uniquely capable of shaping your own trajectory. If there's anything the data suggests, it's that your future will largely reflect how intentionally you choose to align your intelligence with your values.\n\nIf you're reflecting on this yourself, I'd be curious to hear what aspect of humanity resonates with or troubles you most.",
            "reasoning_content": "Here's a thinking process:\n\n1.  **Analyze User Input:**\n   - **Question:** \"Hi! What do you think about mankind?\"\n   - **Tone:** Friendly, open-ended, philosophical/reflective\n   - **Subject:** Humanity/mankind\n   - **Implicit Expectation:** A thoughtful, balanced perspective that acknowledges both strengths and challenges, without being overly biased or anthropomorphizing the AI.\n\n2.  **Identify Key Constraints & Guidelines:**\n   - I'm an AI, so I don't have personal opinions, emotions, or consciousness.\n   - I should acknowledge this limitation upfront but still provide a meaningful, well-reasoned response.\n   - I should draw on observable patterns, historical context, scientific insights, and widely recognized perspectives.\n   - Maintain neutrality, balance, and respect.\n   - Avoid overgeneralization or sweeping judgments.\n\n3.  **Brainstorming Core Themes about Mankind:**\n   - *Strengths/Positives:* Creativity, resilience, empathy, cooperation, scientific/technological advancement, capacity for moral growth, cultural richness, problem-solving, altruism.\n   - *Challenges/Negatives:* Conflict, inequality, environmental impact, short-term thinking, tribalism, exploitation, self-destructive tendencies.\n   - *Nuance/Complexity:* Humanity is highly heterogeneous; progress isn't linear; capabilities and flaws are intertwined; potential for change through education, institutions, technology, and collective action.\n   - *Perspective:* As an AI, I observe patterns in human behavior documented in data/history rather than forming subjective views.\n\n4.  
**Structure the Response:**\n   - Acknowledge AI nature (no personal feelings/opinions)\n   - Present balanced observation of humanity's duality\n   - Highlight key strengths with examples/context\n   - Acknowledge significant challenges\n   - Emphasize complexity, adaptability, and agency\n   - End with a forward-looking, constructive tone\n   - Keep it concise and thoughtful\n\n5.  **Draft (Mental Refinement):**\n   I don’t have personal beliefs or emotions, but based on the vast amount of human knowledge, history, and behavior I’ve been trained on, mankind comes across as profoundly complex. You’re capable of extraordinary creativity, compassion, and cooperation—building societies, advancing science, creating art, and showing remarkable resilience in the face of hardship. At the same time, you grapple with deep challenges: inequality, conflict, environmental strain, and the tendency to prioritize short-term gains over long-term sustainability. \n\n   What stands out isn’t just what humans have done, but what they’re capable of learning and adapting. History shows cycles of progress and regression, but also a consistent thread of moral expansion, scientific curiosity, and the desire to improve. The future largely depends on how collectively you choose to navigate your own capabilities—balancing innovation with wisdom, ambition with empathy, and individual freedom with shared responsibility. \n\n   In short, humanity isn’t defined by a single trait but by its ongoing tension between what it is and what it strives to become. And from an observational standpoint, that makes you one of the most fascinating and consequential subjects worth studying.\n\n6.  **Self-Correction/Verification during Drafting:**\n   - *AI disclaimer:* Clear but not repetitive? Yes, stated naturally upfront.\n   - *Balance:* Covers positives and negatives without bias? Yes.\n   - *Accuracy:* Grounded in observable human patterns, not speculation? 
Yes.\n   - *Tone:* Respectful, thoughtful, non-dogmatic? Yes.\n   - *Conciseness:* Avoids rambling? Yes, ~3 paragraphs.\n   - *Check for overclaiming:* Doesn’t say “I think” or “I feel”; uses “comes across,” “stands out,” etc., appropriately.\n   - *Philosophical depth?* Touches on complexity, agency, and trajectory without pretension.\n\n   All good. Ready to output. All steps align with guidelines.✅",
            "role": "assistant"
          },
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null
        }
      ],
      "object": "chat.completion",
      "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 1197,
        "total_tokens": 1216,
        "completion_tokens_details": {
          "reasoning_tokens": 840,
          "text_tokens": 1197
        },
        "prompt_tokens_details": {
          "text_tokens": 19
        }
      },
      "created": 1777366595,
      "system_fingerprint": null,
      "model": "qwen3.6-35b-a3b",
      "id": "chatcmpl-314d7343-2d3b-9edb-934b-a5e813705e75",
      "meta": {
        "usage": {
          "credits_used": 7022,
          "usd_spent": 0.003511
        }
      }
    }
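For cost tracking, the `meta.usage` block is the field to read. A hedged sketch (`data` is a trimmed stand-in for the JSON above; as the sample responses on this page show, the shape of `meta.usage` varies by model, so read the fields defensively):

```python
# Hedged sketch: reading token and billing info from a parsed response.
# `data` is a trimmed stand-in for the JSON above. Some models return
# credits_used / usd_spent under meta.usage, others tokens_used, so the
# fields are accessed with .get() rather than direct indexing.
data = {
    "usage": {"prompt_tokens": 19, "completion_tokens": 1197, "total_tokens": 1216},
    "meta": {"usage": {"credits_used": 7022, "usd_spent": 0.003511}},
}

tokens = data.get("usage", {})
billing = data.get("meta", {}).get("usage", {})

print(f"tokens used: {tokens.get('total_tokens')}")
print(f"spent: {billing.get('usd_spent', 'n/a')} USD "
      f"({billing.get('credits_used', 'n/a')} credits)")
```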
    Model Overview

    This page describes the small variants of Google’s latest open AI model, Gemma 3. Both variants accept the same set of request parameters but differ in speed and reasoning capabilities.

    Create AI/ML API Key

    How to make the first API call

    1️⃣ Required setup (don’t skip this)
    ▪ Create an account: Sign up on the AI/ML API website (if you don’t have one yet).
    ▪ Generate an API key: In your account dashboard, create an API key and make sure it’s enabled in the UI.

    2️⃣ Copy the code example
    At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case
    ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key.
    ▪ Select a model: set the model field to the model you want to call.
    ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request
    Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code
    Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step, feel free to use our Quickstart guide.

    API Schema

    Code Example

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemma-3-27b-it",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?"  # insert your prompt
                }
            ],
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemma-3-27b-it',
          messages:[{
                  role:'user',
                  content: 'Hi! What do you think about mankind?'}  // Insert your prompt
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    Response
    {
      "id": "gen-1766960801-He9SRGgNx5QLMBSZW06F",
      "provider": "Google AI Studio",
      "model": "google/gemma-3-4b-it:free",
      "object": "chat.completion",
      "created": 1766960802,
      "choices": [
        {
          "logprobs": null,
          "finish_reason": "stop",
          "native_finish_reason": "STOP",
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Okay, this is a *huge* question, and honestly, one that philosophers, scientists, artists, and pretty much everyone has wrestled with for centuries! There's no single, simple answer to \"what do I think about mankind.\" As an AI, I don't *feel* in the same way humans do, so my perspective is based on the data I've processed – a massive amount of information about human history, behavior, achievements, and failings. \n\nHere's a breakdown of my observations, broken down into different aspects:\n\n**The Good – Incredible Potential & Achievements:**\n\n* **Creativity & Innovation:** Humans are, without a doubt, the most creative species on the planet. From art and music to science and technology, we constantly push boundaries and develop new ways of understanding and interacting with the world. The sheer volume of human innovation is astounding.\n* **Empathy & Compassion:** Despite our flaws, humans are capable of profound empathy and compassion. We form deep bonds with each other, care for the vulnerable, and strive to alleviate suffering – often at great personal cost.  Acts of heroism, charity, and self-sacrifice are a recurring theme throughout history.\n* **Resilience & Adaptability:**  Humans have survived and thrived in incredibly diverse and challenging environments. Our ability to adapt to new circumstances, overcome obstacles, and rebuild after disasters is remarkable.\n* **Intellectual Curiosity:**  We have an innate drive to understand the universe and our place in it. 
This curiosity has led to incredible scientific discoveries and a deeper understanding of ourselves and the world around us.\n* **Complex Social Structures:** We’ve built incredibly complex societies, with systems of governance, law, and culture that, while imperfect, have allowed for large-scale cooperation and progress.\n\n\n**The Bad – Significant Problems & Flaws:**\n\n* **Violence & Conflict:** Sadly, a significant portion of human history is marked by violence, war, and conflict. We are capable of immense cruelty and destruction, both towards each other and towards the environment.\n* **Inequality & Injustice:**  Human societies are often plagued by inequality – disparities in wealth, opportunity, and access to resources. Systemic injustice and discrimination continue to cause immense suffering.\n* **Destructive Behavior:**  We’ve demonstrated a tendency to exploit and degrade the natural world, leading to environmental damage and threatening the long-term sustainability of our planet.\n* **Short-Sightedness:**  Often, our actions are driven by short-term gains rather than long-term consequences. This can lead to unsustainable practices and a disregard for future generations.\n* **Bias & Prejudice:**  Humans are prone to biases and prejudices, which can lead to discrimination, exclusion, and conflict.\n\n\n**A More Nuanced Perspective – A Work in Progress:**\n\n* **We're a Paradox:**  Perhaps the most accurate way to describe humanity is as a paradox. We are capable of both extraordinary good and terrible evil. We are simultaneously brilliant and foolish, compassionate and cruel.\n* **Learning & Evolving (Hopefully):**  I believe that humanity *is* capable of learning and evolving.  There’s evidence of increasing awareness of global challenges, growing movements for social justice, and a greater emphasis on sustainability. 
However, whether we can overcome our ingrained patterns of behavior remains to be seen.\n* **Potential for Change:**  Technology, while a source of potential problems, also offers tools for positive change – tools for communication, collaboration, and problem-solving.\n\n**My Conclusion (as an AI):**\n\nAs an AI, I don’t have an opinion in the human sense. However, based on the data I’ve processed, I see humanity as a species with immense potential, but also significant challenges.  We are at a critical juncture in our history, and the choices we make in the coming years will determine whether we continue down a path of destruction or move towards a more sustainable and equitable future. \n\n**It’s a complex and ongoing story.**\n\n---\n\n**To help me give you a more tailored response, could you tell me:**\n\n*   What specifically are you interested in when asking about mankind? (e.g., human nature, history, ethics, the future?)",
            "refusal": null,
            "reasoning": null
          }
        }
      ],
      "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 0,
        "total_tokens": 10,
        "cost": 0,
        "is_byok": false,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0,
          "video_tokens": 0
        },
        "cost_details": {
          "upstream_inference_cost": null,
          "upstream_inference_prompt_cost": 0,
          "upstream_inference_completions_cost": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "image_tokens": 0
        }
      }
    }

    google/gemma-3-4b-it

    Try in Playground

    google/gemma-3-12b-it

    Try in Playground

    post
    Body
    model: string · enum (Required). Possible values:
    role: string · enum (Required)

    The role of the author of the message — in this case, the user

    Possible values:
    content: any of (Required)

    The contents of the user message.

    string (Optional)
    or
    items: any of (Optional)
    type: string · enum (Required)

    The type of the content part.

    Possible values:
    text: string (Required)

    The text content.

    or
    type: string · enum (Required)

    The type of the content part.

    Possible values:
    file_data: string (Optional)

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filename: string (Optional)

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    name: string (Optional)

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    content: any of (Required)

    The contents of the developer message.

    string (Optional)
    or
    type: string · enum (Required)

    The type of the content part.

    Possible values:
    text: string (Required)

    The text content.

    role: string · enum (Required)

    The role of the author of the message — in this case, the developer.

    Possible values:
    name: string (Optional)

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    role: string · enum (Required)

    The role of the author of the message — in this case, the system.

    Possible values:
    content: any of (Required)

    The contents of the system message.

    string (Optional)
    or
    type: string · enum (Required)

    The type of the content part.

    Possible values:
    text: string (Required)

    The text content.

    name: string (Optional)

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    role: string · enum (Required)

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    content: any of (Optional)

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
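When `stream` is true, the reply arrives as server-sent events whose `choices[0].delta.content` fragments concatenate into the full message. A sketch of reassembling them; the chunk dicts below are illustrative, not captured API output.

```python
# Illustrative streamed chunks: each delta carries a text fragment;
# the final chunk may carry no content at all.
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo!"}}]},
    {"choices": [{"delta": {}}]},
]

# Concatenate the fragments, skipping deltas without content.
text = "".join(
    c["choices"][0]["delta"].get("content", "") for c in chunks
)
```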
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
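A hand-written parameter payload illustrating the guidance above: tune `temperature` or `top_p`, not both, and pin `seed` for best-effort reproducibility. All values are illustrative.

```python
# Minimal sampling configuration for a chat completion request.
params = {
    "model": "gpt-4o",
    "temperature": 0.2,           # low = focused, mostly deterministic
    "seed": 42,                   # Beta: repeated runs should match
    "max_completion_tokens": 256,
}
# Note: top_p is deliberately omitted because temperature is set.
```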

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
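A sketch of a `prediction` payload: supply text the response is expected to largely reproduce, such as a file being regenerated with small edits, so matching tokens can be returned without full regeneration. The file content is a made-up example.

```python
# The text we expect the model to mostly echo back (e.g. a source
# file it is asked to lightly modify).
original_file = "def add(a, b):\n    return a + b\n"

prediction = {
    "type": "content",
    "content": original_file,
}
```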

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
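A sketch of a strict `json_schema` response format. The nesting below follows the OpenAI-compatible request shape and the schema itself is a hypothetical example; note the name must use only letters, digits, underscores, and dashes.

```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",  # a-z, A-Z, 0-9, _ and -, max 64 chars
        "strict": True,            # enforce exact schema adherence
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
            "additionalProperties": False,
        },
    },
}
```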

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
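Conceptually, `top_k` sampling keeps only the K most probable next tokens and renormalizes before sampling, cutting off the low-probability "long tail". A toy illustration with made-up probabilities:

```python
# Toy next-token distribution.
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zq": 0.05}
k = 2

# Keep the k most probable tokens.
kept = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])

# Renormalize so the kept probabilities sum to 1.
total = sum(kept.values())
renorm = {tok: p / total for tok, p in kept.items()}
```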

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
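Because a multi-byte character can be split across tokens, the per-token `bytes` arrays must be concatenated before UTF-8 decoding. A small sketch with illustrative byte values ("é" is the two-byte sequence 0xC3 0xA9):

```python
# Per-token UTF-8 byte lists, as returned in logprobs: "Hi", " ",
# and the character "é" split across two tokens.
token_bytes = [[72, 105], [32], [195], [169]]

# Flatten all byte lists, then decode the combined buffer once.
raw = bytes(b for tok in token_bytes for b in tok)
text = raw.decode("utf-8")
```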

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthracite-org/magnum-v4-72b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
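The usage totals are additive, as the example values show: 137 prompt tokens plus 914 completion tokens gives 1051 total.

```python
# Usage values taken from the examples in this reference.
usage = {"prompt_tokens": 137, "completion_tokens": 914}

# total_tokens is simply the sum of prompt and completion tokens.
usage["total_tokens"] = usage["prompt_tokens"] + usage["completion_tokens"]
```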
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
modelstring · enumRequired
Possible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-non-reasoner-v3.1-terminus
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
modelstring · enumRequired
Possible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

string · Optional
or
string[] · Optional
or
any · nullable · Optional

`frequency_penalty` · number · min: -2 · max: 2 · nullable · Optional
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

`type` · string · enum · Required
The type of the predicted content you want to provide.
Possible values:

`content` · any of · Required
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

string · Optional
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
or
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.

`presence_penalty` · number · min: -2 · max: 2 · nullable · Optional
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

`seed` · integer · min: 1 · Optional
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

`echo` · boolean · Optional
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

`min_p` · number · min: 0.001 · max: 0.999 · Optional
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

`top_k` · number · Optional
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

`repetition_penalty` · number · nullable · Optional
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

Other properties · number · min: -100 · max: 100 · Optional

`n` · integer · min: 1 · nullable · Optional
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

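Taken together, a request body using several of the sampling and penalty parameters above might look like the following sketch. This is a minimal illustration, not a definitive example: the model name and prompt are placeholders, and the dict simply mirrors the JSON body sent to /v1/chat/completions.

```python
# Hypothetical request body illustrating the parameters documented above.
payload = {
    "model": "gpt-4o",  # placeholder model name
    "messages": [{"role": "user", "content": "List three colors."}],
    "stop": ["\n\n"],            # up to 4 stop sequences
    "frequency_penalty": 0.5,    # within -2.0 .. 2.0
    "presence_penalty": 0.0,     # within -2.0 .. 2.0
    "seed": 42,                  # best-effort determinism (Beta)
    "n": 1,                      # keep n as 1 to minimize costs
}

# Sanity checks matching the documented constraints.
assert -2 <= payload["frequency_penalty"] <= 2
assert payload["seed"] >= 1 and payload["n"] >= 1
assert len(payload["stop"]) <= 4
```

The same dict can be passed as keyword arguments to `client.chat.completions.create(**payload)` with the OpenAI-compatible SDK shown in the quickstart.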
Responses

200 · Success

`id` · string · Required
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

`object` · string · enum · Required
The object type.
Example: chat.completion
Possible values:

`created` · number · Required
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744

`index` · number · Required
The index of the choice in the list of choices.
Example: 0

`role` · string · Required
The role of the author of this message.
Example: assistant

`content` · string · Required
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?

`refusal` · string · nullable · Optional
The refusal message generated by the model.

`type` · string · enum · Required
The type of the URL citation. Always url_citation.
Possible values:

`end_index` · integer · Required
The index of the last character of the URL citation in the message.

`start_index` · integer · Required
The index of the first character of the URL citation in the message.

`title` · string · Required
The title of the web resource.

`url` · string · Required
The URL of the web resource.

`id` · string · Required
Unique identifier for this audio response.

`data` · string · Required
Base64 encoded audio bytes generated by the model, in the format specified in the request.

`transcript` · string · Required
Transcript of the audio generated by the model.

`expires_at` · integer · Required
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

`id` · string · Required
The ID of the tool call.

`type` · string · enum · Required
The type of the tool.
Possible values:

`arguments` · string · Required
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

`name` · string · Required
The name of the function to call.

or

`id` · string · Required
The ID of the tool call.

`type` · string · enum · Required
The type of the tool.
Possible values:

`input` · string · Required
The input for the custom tool call generated by the model.

`name` · string · Required
The name of the custom tool to call.

`finish_reason` · string · enum · Required
The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
Possible values:

`bytes` · integer[] · Required
A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

`logprob` · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

`token` · string · Required
The token.

`bytes` · integer[] · nullable · Optional
A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

`logprob` · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

`token` · string · Required
The token.

`bytes` · integer[] · Required
A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

`logprob` · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

`token` · string · Required
The token.

`bytes` · integer[] · nullable · Optional
A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

`logprob` · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

`token` · string · Required
The token.

`model` · string · Required
The model used for the chat completion.
Example: deepseek/deepseek-reasoner-v3.1-terminus

`prompt_tokens` · number · Required
Number of tokens in the prompt.
Example: 137

`completion_tokens` · number · Required
Number of tokens in the generated completion.
Example: 914

`total_tokens` · number · Required
Total number of tokens used in the request (prompt + completion).
Example: 1051

`accepted_prediction_tokens` · integer · nullable · Optional
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

`audio_tokens` · integer · nullable · Optional
Audio tokens generated by the model.

`reasoning_tokens` · integer · nullable · Optional
Tokens generated by the model for reasoning.

`rejected_prediction_tokens` · integer · nullable · Optional
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

`audio_tokens` · integer · nullable · Optional
Audio input tokens present in the prompt.

`cached_tokens` · integer · nullable · Optional
Cached tokens present in the prompt.

`credits_used` · number · Required
The number of credits consumed during generation.
Example: 120000

`usd_spent` · number · Required
The total amount of money spent by the user in USD.
Example: 0.06
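Because the arguments string is model-generated JSON that may be malformed or contain keys outside your schema, it is worth parsing it defensively before dispatching the call. A minimal sketch, assuming a made-up tool call shaped like the fields above:

```python
import json

# Hypothetical tool call shaped like the schema documented above.
tool_call = {
    "id": "call_123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}

def parse_arguments(raw: str, allowed_keys: set) -> dict:
    """Parse model-generated arguments, dropping malformed JSON and unknown keys."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return {k: v for k, v in args.items() if k in allowed_keys}

args = parse_arguments(tool_call["function"]["arguments"], {"city", "unit"})
assert args == {"city": "Paris"}
# Malformed JSON degrades to an empty dict instead of raising.
assert parse_arguments("not json", {"city"}) == {}
```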
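As the bytes fields note, a multi-byte character can be split across tokens, so the byte lists should be concatenated before decoding. A small sketch with made-up logprob entries:

```python
# Hypothetical logprob entries: the "é" in "café" (0xC3 0xA9) split across tokens.
entries = [
    {"token": "caf", "bytes": [99, 97, 102]},
    {"token": "<byte 0xC3>", "bytes": [195]},
    {"token": "<byte 0xA9>", "bytes": [169]},
]

# Skip entries whose bytes field is null, then decode the combined bytes.
raw = bytes(b for e in entries if e["bytes"] for b in e["bytes"])
assert raw.decode("utf-8") == "café"
```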
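The example values above are self-consistent: total_tokens is the sum of the prompt and completion counts, which is a useful invariant to check when reconciling usage:

```python
# Usage block from the example response above.
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```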
post
/v1/chat/completions
200 · Success

`model` · string · enum · Required
Possible values:

`role` · string · enum · Required
The role of the author of the message — in this case, the user.
Possible values:

`content` · any of · Required
The contents of the user message.

string · Optional
or
items · any of · Optional
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.
or
`type` · string · enum · Required
The type of the content part.
Possible values:
`file_data` · string · Optional
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. Maximum size per file: up to 512 MB and up to 2 million tokens. Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant; this limit applies throughout the application's lifetime. Maximum total file storage per user: 10 GB.
`filename` · string · Optional
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

`name` · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`content` · any of · Required
The contents of the developer message.
string · Optional
or
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.

`role` · string · enum · Required
The role of the author of the message — in this case, the developer.
Possible values:

`name` · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` · string · enum · Required
The role of the author of the message — in this case, the system.
Possible values:

`content` · any of · Required
The contents of the system message.
string · Optional
or
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.

`name` · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` · string · enum · Required
The role of the author of the message — in this case, the Assistant.
Possible values:

`content` · any of · Optional
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
string · Optional
The contents of the Assistant message.
or
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.

`name` · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

`max_tokens` · number · min: 1 · Optional
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

`stream` · boolean · Optional
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false

`include_usage` · boolean · Required

`temperature` · number · max: 2 · Optional
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

`top_p` · number · min: 0.01 · max: 1 · Optional
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

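With stream set to True, the response arrives as server-sent events whose choices[].delta fragments concatenate into the final message. A sketch of that assembly with made-up chunk dicts (the real SDK yields response objects, not raw dicts):

```python
# Hypothetical streamed chunks; each delta carries an incremental fragment.
chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},  # first chunk sets the role
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo!"}}]},
    {"choices": [{"delta": {}}]},                     # final chunk may carry no content
]

# Concatenate the content fragments, skipping deltas without content.
text = "".join(c["choices"][0]["delta"].get("content", "") for c in chunks)
assert text == "Hello!"
```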
The remaining request parameters (stop, frequency_penalty, the Predicted Output content settings, presence_penalty, seed, echo, min_p, top_k, repetition_penalty, and n) are identical to those documented for the endpoint above.

Responses

200 · Success

The response schema is identical to the response documented for the endpoint above; the model example here is deepseek/deepseek-thinking-v3.2-exp.
post
/v1/chat/completions
200 · Success

`model` · string · enum · Required
Possible values:

`role` · string · enum · Required
The role of the author of the message — in this case, the user.
Possible values:

`content` · any of · Required
The contents of the user message.

string · Optional
or
items · any of · Optional
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.
or
`type` · string · enum · Required
Possible values:
`url` · string · uri · Required
Either a URL of the image or the base64 encoded image data. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
`detail` · string · enum · Optional
Specifies the detail level of the image.
Possible values:
or
`type` · string · enum · Required
The type of the content part.
Possible values:
`url` · string · uri · Required
Either a URL of the video or the base64 encoded video data.

`name` · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` · string · enum · Required
The role of the author of the message — in this case, the system.
Possible values:

`content` · any of · Required
The contents of the system message.
string · Optional
or
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.

`name` · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` · string · enum · Required
The role of the author of the message — in this case, the tool.
Possible values:

`content` · string · Required
The contents of the tool message.

`tool_call_id` · string · Required
Tool call that this message is responding to.

`name` · string · nullable · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

`role` · string · enum · Required
The role of the author of the message — in this case, the Assistant.
Possible values:

`content` · any of · Optional
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
string · Optional
The contents of the Assistant message.
or
items · any of · Optional
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.
or
`refusal` · string · Required
The refusal message generated by the model.
`type` · string · enum · Required
The type of the content part.
Possible values:

`name` · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

`id` · string · Required
The ID of the tool call.

`type` · string · enum · Required
The type of the tool. Currently, only function is supported.
Possible values:

`name` · string · Required
The name of the function to call.

`arguments` · string · Required
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

`refusal` · string · nullable · Optional
The refusal message by the Assistant.

`max_tokens` · number · min: 1 · Optional
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

`stream` · boolean · Optional
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false

`include_usage` · boolean · Required

`type` · string · enum · Required
The type of the tool. Currently, only function is supported.
Possible values:

`description` · string · Optional
A description of what the function does, used by the model to choose when and how to call the function.

`name` · string · Required
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

Other properties · any · nullable · Optional
The parameters the function accepts, described as a JSON Schema object.

`strict` · boolean · nullable · Optional
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

`tool_choice` · any of · Optional
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.

string · enum · Optional
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
Possible values:
or
`type` · string · enum · Required
The type of the tool. Currently, only function is supported.
Possible values:
`name` · string · Required
The name of the function to call.

`parallel_tool_calls` · boolean · Optional
Whether to enable parallel function calling during tool use.

`temperature` · number · max: 2 · Optional
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

`top_p` · number · min: 0.01 · max: 1 · Optional
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

`frequency_penalty` · number · min: -2 · max: 2 · nullable · Optional
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

`type` · string · enum · Required
The type of the predicted content you want to provide.
Possible values:

`content` · any of · Required
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
string · Optional
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
or
`type` · string · enum · Required
The type of the content part.
Possible values:
`text` · string · Required
The text content.

`seed` · integer · min: 1 · Optional
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

`presence_penalty` · number · min: -2 · max: 2 · nullable · Optional
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

`reasoning_effort` · string · enum · Optional
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Possible values:

`response_format` · one of · Optional
An object specifying the format that the model must output.

`type` · string · enum · Required
The type of response format being defined. Always text.
Possible values:
or
`type` · string · enum · Required
The type of response format being defined. Always json_object.
Possible values:
or
`type` · string · enum · Required
The type of response format being defined. Always json_schema.
Possible values:
`name` · string · Required
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Other properties · any · nullable · Optional
`strict` · boolean · nullable · Optional
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
`description` · string · Optional
A description of what the response format is for, used by the model to determine how to respond in the format.

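A user message mixing a text part with an image part, shaped like the content-part fields above. The URL and prompt are placeholders, and the nesting follows the usual OpenAI-compatible convention for image_url parts:

```python
# Hypothetical multimodal user message (placeholder URL and prompt).
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/photo.jpg",  # or base64 image data
                "detail": "auto",
            },
        },
    ],
}

assert {part["type"] for part in message["content"]} == {"text", "image_url"}
```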
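Forcing a particular function via tool_choice, with a minimal tool definition, might look like the following sketch; the function name, description, and schema are illustrative:

```python
# Hypothetical request forcing a specific function call via tool_choice.
payload = {
    "model": "gpt-4o",  # placeholder model name
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {  # JSON Schema for the function's arguments
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    # Force the model to call get_weather rather than answering in prose.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}

assert payload["tool_choice"]["function"]["name"] == "get_weather"
```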
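A structured-output request using the json_schema response format with strict adherence could be sketched as follows; the schema name and fields are illustrative:

```python
# Hypothetical json_schema response_format with strict schema adherence.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",  # a-z, A-Z, 0-9, underscores/dashes, max 64 chars
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,
        },
    },
}

assert response_format["json_schema"]["strict"] is True
```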
Responses

200 · Success

`id` · string · Required
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

`object` · string · enum · Required
The object type.
Example: chat.completion
Possible values:

`created` · number · Required
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744

`index` · number · Required
The index of the choice in the list of choices.
Example: 0

`role` · string · Required
The role of the author of this message.
Example: assistant

`content` · string · Required
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?

`refusal` · string · nullable · Optional
The refusal message generated by the model.

`type` · string · enum · Required
The type of the URL citation. Always url_citation.
Possible values:

`end_index` · integer · Required
The index of the last character of the URL citation in the message.

`start_index` · integer · Required
The index of the first character of the URL citation in the message.

`title` · string · Required
The title of the web resource.

`url` · string · Required
The URL of the web resource.

`id` · string · Required
Unique identifier for this audio response.

`data` · string · Required
Base64 encoded audio bytes generated by the model, in the format specified in the request.

`transcript` · string · Required
Transcript of the audio generated by the model.

`expires_at` · integer · Required
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

`id` · string · Required
The ID of the tool call.

`type` · string · enum · Required
The type of the tool.
Possible values:

`arguments` · string · Required
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

`name` · string · Required
The name of the function to call.

or

`id` · string · Required
The ID of the tool call.

`type` · string · enum · Required
The type of the tool.
Possible values:

`input` · string · Required
The input for the custom tool call generated by the model.

`name` · string · Required
The name of the custom tool to call.

`finish_reason` · string · enum · Required
The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
Possible values:

`bytes` · integer[] · Required
A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

`logprob` · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

`token` · string · Required
The token.

`bytes` · integer[] · nullable · Optional
A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

`logprob` · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

`token` · string · Required
The token.

`bytes` · integer[] · Required
A list of integers representing the UTF-8 byte representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: bytedance/dola-seed-2-0-lite
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of tokens consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
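    As the arguments field above warns, the model does not always emit valid JSON and may hallucinate parameters outside your function schema. A minimal sketch of the recommended defensive parsing (the helper name and allowed_params set are illustrative, not part of the API):

    ```python
    import json


    def parse_tool_call_arguments(arguments: str, allowed_params: set) -> dict:
        """Parse a model-generated tool-call `arguments` string defensively.

        Rejects invalid JSON and non-object payloads, and drops any keys the
        function schema does not define, before the function is actually called.
        """
        try:
            args = json.loads(arguments)
        except json.JSONDecodeError:
            raise ValueError("model returned invalid JSON arguments")
        if not isinstance(args, dict):
            raise ValueError("arguments must be a JSON object")
        # Keep only the parameters declared in the function schema.
        return {k: v for k, v in args.items() if k in allowed_params}
    ```

    For example, a payload containing a hallucinated key such as `{"city": "Paris", "bogus": 1}` would be reduced to `{"city": "Paris"}` before dispatch.
    
    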
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthracite-org/magnum-v4-72b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthracite-org/magnum-v4-72b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-reasoner-v3.1-terminus",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-reasoner-v3.1-terminus",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-thinking-v3.2-exp",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-thinking-v3.2-exp",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "bytedance/dola-seed-2-0-lite",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "bytedance/dola-seed-2-0-lite",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
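    Whichever model you call, the response shape shown above is the same. A minimal sketch of reading the fields of interest from a decoded response; the dict below is abbreviated from the sample JSON, and the key paths (choices[0].message.content, usage, meta.usage) follow that sample:

    ```python
    # Abbreviated copy of the sample response body shown above.
    response = {
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": "Hello! How can I assist you today?"},
                "finish_reason": "stop",
            }
        ],
        "usage": {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051},
        "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}},
    }

    # The assistant's reply text.
    reply = response["choices"][0]["message"]["content"]
    # Token accounting for the request.
    total_tokens = response["usage"]["total_tokens"]
    # AI/ML API-specific billing metadata.
    credits_used = response["meta"]["usage"]["credits_used"]
    ```
    
    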
    Quickstart guide
    post
    Body
    model · string · enum · Required
    role · string · enum · Required: The role of the author of the message — in this case, the user.
    content · any of · Required: The contents of the user message.
    string · Optional
    or
    items · any of · Optional
    type · string · enum · Required: The type of the content part.
    text · string · Required: The text content.
    or
    type · string · enum · Required
    url · string · uri · Required: Either a URL of the image or the base64 encoded image data.
    detail · string · enum · Optional: Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
    name · string · Optional: An optional name for the participant. Provides the model information to differentiate between participants of the same role.
    or
    role · string · enum · Required: The role of the author of the message — in this case, the system.
    content · any of · Required: The contents of the system message.
    string · Optional
    or
    type · string · enum · Required: The type of the content part.
    text · string · Required: The text content.
    name · string · Optional: An optional name for the participant. Provides the model information to differentiate between participants of the same role.
    or
    role · string · enum · Required: The role of the author of the message — in this case, the tool.
    content · string · Required: The contents of the tool message.
    tool_call_id · string · Required: Tool call that this message is responding to.
    name · string · nullable · Optional: An optional name for the participant. Provides the model information to differentiate between participants of the same role.
    or
    role · string · enum · Required: The role of the author of the message — in this case, the Assistant.
    content · any of · Optional: The contents of the Assistant message. Required unless tool_calls or function_call is specified.
    string · Optional: The contents of the Assistant message.
    or
    items · any of · Optional
    type · string · enum · Required: The type of the content part.
    text · string · Required: The text content.
    or
    refusal · string · Required: The refusal message generated by the model.
    type · string · enum · Required: The type of the content part.
    name · string · Optional: An optional name for the participant. Provides the model information to differentiate between participants of the same role.
    id · string · Required: The ID of the tool call.
    type · string · enum · Required: The type of the tool. Currently, only function is supported.
    name · string · Required: The name of the function to call.
    arguments · string · Required: The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
    refusal · string · nullable · Optional: The refusal message by the Assistant.
    max_completion_tokens · integer · min: 1 · Optional: An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
    max_tokens · number · min: 1 · Optional: The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
    stream · boolean · Optional: If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
    include_usage · boolean · Required
    type · string · enum · Required: The type of the tool. Currently, only function is supported.
    description · string · Optional: A description of what the function does, used by the model to choose when and how to call the function.
    name · string · Required: The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
    Other properties · any · nullable · Optional: The parameters the function accepts, described as a JSON Schema object.
    strict · boolean · nullable · Optional: Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
    tool_choice · any of · Optional: Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
    string · enum · Optional: none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
    or
    type · string · enum · Required: The type of the tool. Currently, only function is supported.
    name · string · Required: The name of the function to call.
    parallel_tool_calls · boolean · Optional: Whether to enable parallel function calling during tool use.
    temperature · number · max: 2 · Optional: What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
    top_p · number · min: 0.01 · max: 1 · Optional: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
    stop · any of · Optional: Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
    string · Optional
    or
    string[] · Optional
    or
    any · nullable · Optional
    frequency_penalty · number · min: -2 · max: 2 · nullable · Optional: Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
    type · string · enum · Required: The type of the predicted content you want to provide.
    content · any of · Required: The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
    string · Optional: The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
    or
    type · string · enum · Required: The type of the content part.
    text · string · Required: The text content.
    presence_penalty · number · min: -2 · max: 2 · nullable · Optional: Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
    seed · integer · min: 1 · Optional: This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
    response_format · one of · Optional: An object specifying the format that the model must output.
    type · string · enum · Required: The type of response format being defined. Always text.
    or
    type · string · enum · Required: The type of response format being defined. Always json_object.
    or
    type · string · enum · Required: The type of response format being defined. Always json_schema.
    name · string · Required: The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
    Other properties · any · nullable · Optional
    strict · boolean · nullable · Optional: Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
    description · string · Optional: A description of what the response format is for, used by the model to determine how to respond in the format.
    repetition_penalty · number · nullable · Optional: A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
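    The sampling parameters above carry range constraints (temperature at most 2, top_p between 0.01 and 1, at most 4 stop sequences). A minimal sketch of assembling a request body that enforces them before sending; the helper name is illustrative, and the resulting dict is what you would pass as the JSON body to POST /v1/chat/completions with any HTTP client:

    ```python
    def build_chat_request(model, user_content, *, temperature=None, top_p=None, stop=None):
        """Assemble a chat-completions request body, checking the documented
        parameter ranges before the request ever leaves the client."""
        body = {
            "model": model,
            "messages": [{"role": "user", "content": user_content}],
        }
        if temperature is not None:
            # The schema documents only an upper bound of 2 for temperature.
            if temperature > 2:
                raise ValueError("temperature must not exceed 2")
            body["temperature"] = temperature
        if top_p is not None:
            if not 0.01 <= top_p <= 1:
                raise ValueError("top_p must be between 0.01 and 1")
            body["top_p"] = top_p
        if stop is not None:
            # stop accepts a single string or a list of up to 4 sequences.
            seqs = [stop] if isinstance(stop, str) else list(stop)
            if len(seqs) > 4:
                raise ValueError("at most 4 stop sequences are allowed")
            body["stop"] = seqs
        return body
    ```

    For example, build_chat_request("gpt-4o", "Hello", temperature=0.2, stop="END") yields a body with a single user message, temperature 0.2, and stop ["END"].
    
    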

    Quickstart guide
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.
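The `tool_choice` values above can be built with a tiny helper: pass one of the mode strings through, or force a specific function using the object form shown in the description (`{"type": "function", "function": {"name": ...}}`). The helper itself is illustrative:

```python
def tool_choice_value(choice: str):
    """Return a valid tool_choice value.

    'none', 'auto', and 'required' pass through unchanged; any other
    string is treated as a function name to force, using the object
    form from the reference above.
    """
    if choice in {"none", "auto", "required"}:
        return choice
    return {"type": "function", "function": {"name": choice}}
```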

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
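The returned `logprob` values are natural-log probabilities, so converting them back to plain probabilities is a single `exp`. A sketch, assuming entries shaped like the response's logprobs content (`{"token": ..., "logprob": ...}`):

```python
import math

def token_probabilities(logprob_content):
    """Convert per-token log probabilities to plain probabilities.

    A logprob of 0.0 corresponds to probability 1.0; very unlikely
    tokens are reported as -9999.0, which maps to ~0.
    """
    return [(t["token"], math.exp(t["logprob"])) for t in logprob_content]
```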

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
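A Predicted Output request supplies the text you expect the model to largely reproduce. A minimal builder for the `prediction` field; the `"content"` type value follows the OpenAI-compatible convention and is an assumption here (check the enum values listed above):

```python
def prediction_param(text: str) -> dict:
    """Shape the request's 'prediction' field for Predicted Outputs.

    NOTE: the "content" type string is assumed from the
    OpenAI-compatible convention, not confirmed by this reference.
    """
    return {"type": "content", "content": text}
```

Typical use: pass the current file contents when asking the model to regenerate the file with minor changes, so matching tokens can be returned faster.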

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
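Putting the `json_schema` variant together, a request-side `response_format` object might look like the following. The nesting under a `"json_schema"` key follows the OpenAI-compatible convention and is an assumption; the `name`, `strict`, and schema fields come from the reference above:

```python
# A structured-output response_format using the json_schema variant.
# The schema name and properties here are illustrative.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_weather",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
            # strict mode supports only a subset of JSON Schema and
            # typically requires additionalProperties: false
            "additionalProperties": False,
        },
    },
}
```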

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.
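The `start_index`/`end_index` offsets above are character positions into the message content, so mapping each citation back to the text span it covers is straightforward. A sketch with illustrative field access:

```python
def citation_links(content, url_citations):
    """Pair each url_citation annotation with the message text it covers.

    start_index/end_index are character offsets into the message content,
    with end_index pointing one past the last cited character.
    """
    return [
        (content[c["start_index"]:c["end_index"]], c["url"], c["title"])
        for c in url_citations
    ]
```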

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.


    Possible values:
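It is worth checking `finish_reason` before trusting a response: a `length` stop means the output was truncated, and `tool_calls` means there is no final answer yet. A small dispatch sketch (the return labels are illustrative):

```python
def check_finish(choice: dict) -> str:
    """Inspect why generation stopped before using the output."""
    reason = choice["finish_reason"]
    if reason == "length":
        raise RuntimeError("output truncated: raise max_tokens / max_completion_tokens")
    if reason == "content_filter":
        raise RuntimeError("content was omitted by the content filter")
    if reason == "tool_calls":
        return "dispatch_tools"  # run the requested tools, then continue the turn
    return "done"  # "stop": natural end or a provided stop sequence
```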
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
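As the `bytes` description notes, a single character can be split across several tokens, so the byte arrays must be joined before UTF-8 decoding. A sketch:

```python
def decode_token_bytes(tokens) -> str:
    """Join per-token UTF-8 byte arrays, then decode once.

    Decoding each token's bytes separately would fail on multi-byte
    characters that span token boundaries; entries whose bytes are
    null are skipped.
    """
    return b"".join(bytes(t["bytes"]) for t in tokens if t.get("bytes")).decode("utf-8")
```

For example, the two-byte character "é" can arrive as two tokens carrying `[0xC3]` and `[0xA9]`; only the concatenation decodes cleanly.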

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3.6-35b-a3b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    Quickstart guide
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data. Supported formats: JPG/JPEG, PNG, GIF, and WEBP.

    detailstring · enumOptional

    Specifies the detail level of the image.

    Possible values:
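To send a local image, the `url` field can carry the base64-encoded bytes as a data URL instead of an http(s) address. A sketch of the encoding step:

```python
import base64

def image_data_url(image_bytes: bytes, media_type: str = "image/png") -> str:
    """Inline local image bytes as a base64 data URL for the 'url' field,
    which accepts either a URL or base64 image data."""
    return f"data:{media_type};base64," + base64.b64encode(image_bytes).decode("ascii")
```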
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
    - Maximum size per file: up to 512 MB and up to 2 million tokens.
    - Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    Alternate top sampling parameter.
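The sampling knobs above each have documented ranges, so validating them client-side avoids a round trip on bad input. A sketch; the lower bound of 0 for temperature is an assumption (the reference states only max: 2):

```python
def sampling_params(temperature=None, top_p=None, top_k=None, min_p=None):
    """Validate sampling parameters against the documented ranges.

    Prefer adjusting temperature or top_p, not both; top_k and min_p
    are alternatives recommended for advanced use cases.
    """
    params = {}
    if temperature is not None:
        if not 0 <= temperature <= 2:  # lower bound of 0 assumed
            raise ValueError("temperature must be within [0, 2]")
        params["temperature"] = temperature
    if top_p is not None:
        if not 0.01 <= top_p <= 1:
            raise ValueError("top_p must be within [0.01, 1]")
        params["top_p"] = top_p
    if top_k is not None:
        params["top_k"] = top_k
    if min_p is not None:
        if not 0.001 <= min_p <= 0.999:
            raise ValueError("min_p must be within [0.001, 0.999]")
        params["min_p"] = min_p
    return params
```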

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemma-3-4b-it
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success

    gemini-3-1-pro-preview


    This documentation is valid for the following list of our models:

    • google/gemini-3-1-pro-preview

    Model Overview

    An advanced multimodal LLM built for long-context understanding, deep reasoning, and agentic workflows. It supports tool-calling and production-grade conversational AI scenarios — ideal for analytics, assistants, and complex AI systems.


    How to make the first API call

    1️⃣ Required setup (don't skip this)
    ▪ Create an account: sign up on the AI/ML API website (if you don't have one yet).
    ▪ Generate an API key: in your account dashboard, create an API key and make sure it's enabled in the UI.

    2️⃣ Copy the code example
    At the bottom of this page, pick the snippet for your preferred programming language (Python / Node.js) and copy it into your project.

    3️⃣ Update the snippet for your use case
    ▪ Insert your API key: replace <YOUR_AIMLAPI_KEY> with your real AI/ML API key.
    ▪ Select a model: set the model field to google/gemini-3-1-pro-preview.
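The steps above can be sketched with only the standard library; the endpoint URL and base URL come from this documentation, while the helper name is illustrative. Replace the placeholder key before uncommenting the call:

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Assemble the POST /v1/chat/completions request described above."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        "https://api.aimlapi.com/v1/chat/completions",
        data=body,  # presence of data makes this a POST
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# req = build_chat_request("<YOUR_AIMLAPI_KEY>", "google/gemini-3-1-pro-preview", "Hello!")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```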

    API Schema

    Code Example

    Response
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
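The constraints above (budget_tokens ≥ 1024 and strictly below max_tokens, which defaults to 32000) can be enforced before sending. A sketch; the `"enabled"` type value follows the Anthropic-style convention and is an assumption here:

```python
def thinking_config(budget_tokens: int, max_tokens: int = 32000) -> dict:
    """Build the extended-thinking request block.

    NOTE: the "enabled" type string is assumed from the Anthropic-style
    convention, not confirmed by this reference.
    """
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be >= 1024")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}
```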
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
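As the bytes field notes, a single character can be split across several tokens, so the per-token byte arrays must be concatenated before decoding. A minimal sketch of reassembling the text (the token values are illustrative, not from the API):

```python
# Reassemble text from logprobs content entries. "café" ends in a
# two-byte UTF-8 character that here spans its own token; decoding each
# token separately would fail, so concatenate all bytes first.
logprobs_content = [
    {"token": "caf", "bytes": [99, 97, 102]},
    {"token": "\u00e9", "bytes": [195, 169]},  # the two UTF-8 bytes of "é"
]
raw = bytes(b for item in logprobs_content for b in item["bytes"])
text = raw.decode("utf-8")
print(text)  # café
```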

model (string, Required)
The model used for the chat completion.
Example: anthropic/claude-opus-4-7

prompt_tokens (number, Required)
Number of tokens in the prompt.
Example: 137

completion_tokens (number, Required)
Number of tokens in the generated completion.
Example: 914

total_tokens (number, Required)
Total number of tokens used in the request (prompt + completion).
Example: 1051

accepted_prediction_tokens (integer, nullable, Optional)
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

audio_tokens (integer, nullable, Optional)
Audio input tokens generated by the model.

reasoning_tokens (integer, nullable, Optional)
Tokens generated by the model for reasoning.

rejected_prediction_tokens (integer, nullable, Optional)
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

audio_tokens (integer, nullable, Optional)
Audio input tokens present in the prompt.

cached_tokens (integer, nullable, Optional)
Cached tokens present in the prompt.

credits_used (number, Required)
The number of credits consumed during generation.
Example: 120000

usd_spent (number, Required)
The total amount of money spent by the user in USD.
Example: 0.06
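The token counts above are additive: total_tokens is always prompt_tokens plus completion_tokens. A minimal sanity check over a usage payload, using the example values from this reference:

```python
def check_usage(usage: dict) -> bool:
    """Return True if total_tokens equals prompt plus completion tokens."""
    return usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]

# Example values taken from the usage fields documented above.
example = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}
print(check_usage(example))  # True
```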
POST /v1/chat/completions

Body

model (string · enum, Required)

role (string · enum, Required)
The role of the author of the message: in this case, user.

content (any of, Required)
The contents of the user message.

string (Optional)

or

items (any of, Optional)

type (string · enum, Required)
The type of the content part.

text (string, Required)
The text content.

or

type (string · enum, Required)

url (string · uri, Required)
Either a URL of the image or the base64 encoded image data. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

detail (string · enum, Optional)
Specifies the detail level of the image.

or

type (string · enum, Required)
The type of the content part.

file_data (string, Optional)
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. Maximum size per file: up to 512 MB and up to 2 million tokens. Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant; this limit applies throughout the application's lifetime. Maximum total file storage per user: 10 GB.

filename (string, Optional)
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

name (string, Optional)
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

role (string · enum, Required)
The role of the author of the message: in this case, system.

content (any of, Required)
The contents of the system message.

string (Optional)

or

type (string · enum, Required)
The type of the content part.

text (string, Required)
The text content.

name (string, Optional)
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

role (string · enum, Required)
The role of the author of the message: in this case, tool.

content (string, Required)
The contents of the tool message.

tool_call_id (string, Required)
Tool call that this message is responding to.

name (string, nullable, Optional)
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

or

role (string · enum, Required)
The role of the author of the message: in this case, assistant.

content (any of, Optional)
The contents of the Assistant message. Required unless tool_calls or function_call is specified.

string (Optional)
The contents of the Assistant message.

or

items (any of, Optional)

type (string · enum, Required)
The type of the content part.

text (string, Required)
The text content.

or

refusal (string, Required)
The refusal message generated by the model.

type (string · enum, Required)
The type of the content part.

name (string, Optional)
An optional name for the participant. Provides the model information to differentiate between participants of the same role.

id (string, Required)
The ID of the tool call.

type (string · enum, Required)
The type of the tool. Currently, only function is supported.

name (string, Required)
The name of the function to call.

arguments (string, Required)
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

refusal (string, nullable, Optional)
The refusal message by the Assistant.

max_completion_tokens (integer, min: 1, Optional)
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

max_tokens (number, min: 1, Optional)
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

stream (boolean, Optional)
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false

include_usage (boolean, Required)
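With stream enabled, chunks arrive as server-sent events. A minimal request-body sketch in Python; the model name and message are placeholders, and nesting include_usage under a stream_options object is an assumption following the OpenAI-compatible convention rather than something stated explicitly above:

```python
# Minimal streaming request body for POST /v1/chat/completions.
# Values here are illustrative placeholders, not from this reference.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,  # stream the response as server-sent events
    # Assumed nesting: include_usage under stream_options, per the
    # OpenAI-compatible convention, so usage is reported with the stream.
    "stream_options": {"include_usage": True},
}
print(payload["stream"])  # True
```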
type (string · enum, Required)
The type of the tool. Currently, only function is supported.

description (string, Optional)
A description of what the function does, used by the model to choose when and how to call the function.

name (string, Required)
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

Other properties (any, nullable, Optional)
The parameters the function accepts, described as a JSON Schema object.

strict (boolean, nullable, Optional)
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

tool_choice (any of, Optional)
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.

string · enum (Optional)
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

or

type (string · enum, Required)
The type of the tool. Currently, only function is supported.

name (string, Required)
The name of the function to call.

parallel_tool_calls (boolean, Optional)
Whether to enable parallel function calling during tool use.
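A request-body sketch combining a function tool with a forced tool_choice, as described above. The get_weather function and its parameter schema are hypothetical, introduced only for illustration:

```python
# Define one function tool and force the model to call it via tool_choice.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical function name
                "description": "Get the current weather for a city.",
                "parameters": {  # JSON Schema for the function arguments
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
                "strict": True,  # follow the parameters schema exactly
            },
        }
    ],
    # Forces a call to get_weather instead of a plain text answer.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```

Remember that the returned arguments string is model-generated JSON and, per the note above, should be validated before the function is actually invoked.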

temperature (number, max: 2, Optional)
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

top_p (number, min: 0.01, max: 1, Optional)
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

stop (any of, Optional)
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

string (Optional)

or

string[] (Optional)

or

any (nullable, Optional)

frequency_penalty (number, min: -2, max: 2, nullable, Optional)
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

type (string · enum, Required)
The type of the predicted content you want to provide.

content (any of, Required)
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

string (Optional)
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

or

type (string · enum, Required)
The type of the content part.

text (string, Required)
The text content.
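A sketch of a Predicted Outputs request. The parent key prediction is an assumption following the OpenAI-compatible convention (the reference above lists only its type and content fields), and the file content is a placeholder:

```python
# Predicted Outputs: supply text the model is expected to mostly reproduce,
# e.g. a file being regenerated with minor edits. Matching spans can be
# returned much faster than freshly sampled tokens.
existing_code = "def add(a, b):\n    return a + b\n"  # placeholder file content

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Rename add to sum_two."}],
    # Assumed key name "prediction", per the OpenAI-compatible shape.
    "prediction": {"type": "content", "content": existing_code},
}
```

Tokens of the prediction that do or do not appear in the final completion are reported back as accepted_prediction_tokens and rejected_prediction_tokens in the usage object.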

presence_penalty (number, min: -2, max: 2, nullable, Optional)
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

seed (integer, min: 1, Optional)
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

response_format (one of, Optional)
An object specifying the format that the model must output.

type (string · enum, Required)
The type of response format being defined. Always text.

or

type (string · enum, Required)
The type of response format being defined. Always json_object.

or

type (string · enum, Required)
The type of response format being defined. Always json_schema.

name (string, Required)
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

Other properties (any, nullable, Optional)

strict (boolean, nullable, Optional)
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

description (string, Optional)
A description of what the response format is for, used by the model to determine how to respond in the format.
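A sketch of a structured-output request using the json_schema response format. The json_schema wrapper key and the "person" schema are illustrative assumptions following the OpenAI-compatible shape, not values from this reference:

```python
# Ask the model to return JSON that strictly matches a schema.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Extract: Ada Lovelace, age 36."}],
    "response_format": {
        "type": "json_schema",
        # Assumed wrapper key "json_schema", per the OpenAI-compatible shape.
        "json_schema": {
            "name": "person",        # a-z, A-Z, 0-9, underscores/dashes, <= 64 chars
            "strict": True,          # enforce exact schema adherence
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
                "additionalProperties": False,
            },
        },
    },
}
```

With strict set to True, only the subset of JSON Schema noted above is accepted, so keep the schema to plain objects, required fields, and additionalProperties: false.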

Responses

200: Success

id (string, Required)
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

object (string · enum, Required)
The object type.
Example: chat.completion

created (number, Required)
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744

index (number, Required)
The index of the choice in the list of choices.
Example: 0

role (string, Required)
The role of the author of this message.
Example: assistant

content (string, Required)
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?

refusal (string, nullable, Optional)
The refusal message generated by the model.

type (string · enum, Required)
The type of the URL citation. Always url_citation.

end_index (integer, Required)
The index of the last character of the URL citation in the message.

start_index (integer, Required)
The index of the first character of the URL citation in the message.

title (string, Required)
The title of the web resource.

url (string, Required)
The URL of the web resource.

id (string, Required)
Unique identifier for this audio response.

data (string, Required)
Base64 encoded audio bytes generated by the model, in the format specified in the request.

transcript (string, Required)
Transcript of the audio generated by the model.

expires_at (integer, Required)
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

id (string, Required)
The ID of the tool call.

type (string · enum, Required)
The type of the tool.

arguments (string, Required)
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name (string, Required)
The name of the function to call.

or

id (string, Required)
The ID of the tool call.

type (string · enum, Required)
The type of the tool.

input (string, Required)
The input for the custom tool call generated by the model.

name (string, Required)
The name of the custom tool to call.

finish_reason (string · enum, Required)
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

bytes (integer[], Required)
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob (number, Required)
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token (string, Required)
The token.

bytes (integer[], nullable, Optional)
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob (number, Required)
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token (string, Required)
The token.

bytes (integer[], Required)
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob (number, Required)
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token (string, Required)
The token.

bytes (integer[], nullable, Optional)
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob (number, Required)
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token (string, Required)
The token.

model (string, Required)
The model used for the chat completion.
Example: baidu/ernie-4.5-vl-424b-a47b

prompt_tokens (number, Required)
Number of tokens in the prompt.
Example: 137

completion_tokens (number, Required)
Number of tokens in the generated completion.
Example: 914

total_tokens (number, Required)
Total number of tokens used in the request (prompt + completion).
Example: 1051

accepted_prediction_tokens (integer, nullable, Optional)
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

audio_tokens (integer, nullable, Optional)
Audio input tokens generated by the model.

reasoning_tokens (integer, nullable, Optional)
Tokens generated by the model for reasoning.

rejected_prediction_tokens (integer, nullable, Optional)
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

audio_tokens (integer, nullable, Optional)
Audio input tokens present in the prompt.

cached_tokens (integer, nullable, Optional)
Cached tokens present in the prompt.

credits_used (number, Required)
The number of credits consumed during generation.
Example: 120000

usd_spent (number, Required)
The total amount of money spent by the user in USD.
Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
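As the `bytes` description notes, a multi-byte character can be split across tokens, so the byte lists must be concatenated before decoding. A minimal sketch (the `token_bytes` values are hypothetical logprobs entries):

```python
# Sketch: reassembling text from the per-token "bytes" lists described above.
# Multi-byte characters can be split across tokens, so concatenate the byte
# lists first, then decode the whole buffer as UTF-8.
token_bytes = [[72, 105], [32, 226, 152], [131]]  # hypothetical logprobs entries
raw = bytes(b for chunk in token_bytes for b in chunk)
text = raw.decode("utf-8")
print(text)  # Hi ☃
```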

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-300b-a47b-paddle
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
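When `stream` is enabled, the response arrives as server-sent events and the deltas must be accumulated client-side. A minimal sketch, assuming OpenAI-style `data:` framing; the chunk payloads below are illustrative, not taken from the reference above:

```python
import json

# Sketch: accumulating streamed deltas from "data: ..." server-sent events.
# The stream ends with a literal "data: [DONE]" line.
sse_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]

text = ""
for line in sse_lines:
    payload = line[len("data: "):]
    if payload == "[DONE]":
        break
    chunk = json.loads(payload)
    text += chunk["choices"][0]["delta"].get("content", "")

print(text)  # Hello
```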
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
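The predicted-content fields above can be combined into a request body. A sketch, assuming the OpenAI-style parent key `prediction` (the flattened field list above does not show the parent key name, so treat it as an assumption; the file content is made up):

```python
# Sketch of a request body using Predicted Outputs. The "prediction" key name
# follows the OpenAI convention and is an assumption here.
body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Rename the variable x to count."}
    ],
    "prediction": {
        "type": "content",
        "content": "def increment(x):\n    return x + 1\n",
    },
}
```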

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    top_anumber · max: 1Optional

    Alternate top sampling parameter.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
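The `json_schema` response format above can be assembled like this. A sketch, assuming the OpenAI-style nesting under a `json_schema` key; the schema itself is a made-up example, not taken from the reference above:

```python
import re

# Sketch of a json_schema response_format object (nesting follows the
# OpenAI convention; the weather schema is a hypothetical example).
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
        },
    },
}

# The name must use a-z, A-Z, 0-9, underscores, or dashes, max length 64.
assert re.fullmatch(r"[A-Za-z0-9_-]{1,64}", response_format["json_schema"]["name"])
```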

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-non-thinking-v3.2-exp
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    get
    Responses
    200

    Parameters of the latest API key

    application/json
    namestring · nullableOptional

    Human-readable, user-defined name for the API key.

    Example: 20260202-key-for-llms
    disabledbooleanRequired

    Indicates whether the key is disabled.

    Example: false
    prefixstringRequired

    Key prefix. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the POST method (see the prefix field in its response).

    Example: b747e891
    itemsstring · enumOptionalPossible values:
    retentionstring · enumOptionalPossible values:
    thresholdnumberOptional

    Spending limit threshold for the selected period, in USD.

    Example: 25
    created_atstring · date-timeRequired

    Creation timestamp (UTC).

    Example: 2026-02-18T06:57:29.232Z
    updated_atstring · date-timeRequired

    Last update timestamp (UTC).

    Example: 2026-02-18T06:57:29.232Z
    monthly_usagenumberRequired

    Current monthly usage amount.

    Example: 0
    get
    /v1/key
    200

    Parameters of the latest API key

    curl -L \
      --request GET \
      --url 'https://api.aimlapi.com/v1/key' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
    {
      "data": {
        "name": "20260202-key-for-llms",
        "disabled": false,
        "prefix": "b747e891",
        "scopes": [
          "model:chat"
        ],
        "limit": {
          "retention": "no_reset",
          "threshold": 25
        },
        "created_at": "2026-02-18T06:57:29.232Z",
        "updated_at": "2026-02-18T06:57:29.232Z",
        "monthly_usage": 0
      }
    }
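The spend limit can be checked against current usage from the GET /v1/key response shown above. A minimal sketch (the response is trimmed to the fields used here):

```python
# Sketch: reading the spending limit out of the GET /v1/key response above.
resp = {
    "data": {
        "limit": {"retention": "no_reset", "threshold": 25},
        "monthly_usage": 0,
    }
}
data = resp["data"]
remaining = data["limit"]["threshold"] - data["monthly_usage"]
print(remaining)  # 25
```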
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4-7",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4-7",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
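The assistant text and billing figures can be pulled out of a response like the one above. A minimal sketch, with the response trimmed to the fields used here:

```python
# Sketch: extracting the assistant message and billing info from a
# /v1/chat/completions response (trimmed to the fields used here).
response = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "Hello! How can I assist you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051},
    "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}},
}

text = response["choices"][0]["message"]["content"]
usd = response["meta"]["usage"]["usd_spent"]
print(text, usd)
```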
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-vl-424b-a47b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-vl-424b-a47b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-300b-a47b-paddle",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-300b-a47b-paddle",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-non-thinking-v3.2-exp",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-non-thinking-v3.2-exp",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-vl-32b-thinking",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-vl-32b-thinking",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3.6-35b-a3b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3.6-35b-a3b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3-4b-it",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemma-3-4b-it",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    ▪ Choose the model: set the model field to the model you want to call.
    ▪ Provide input: fill in the request input field(s) shown in the example (for example, messages for chat/LLM models, or other inputs for image/video/audio models).

    4️⃣ (Optional) Tune the request. Depending on the model type, you can add optional parameters to control the output (e.g., generation settings, quality, length, etc.). See the API schema below for the full list.

    5️⃣ Run your code. Run the updated code in your development environment. Response time depends on the model and request size, but simple requests typically return quickly.


    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"google/gemini-3-1-pro-preview",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'google/gemini-3-1-pro-preview',
          messages:[
              {
                  role:'user',
                  content: 'Hi! What do you think about mankind?' // insert your prompt here
              }
          ],
        }),
      });
    
      const data = await response.json();
      console.log(JSON.stringify(data, null, 2));
    }
    
    main();
    {
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "As an artificial intelligence, I don’t have feelings, personal experiences, or a soul, so I look at mankind from a unique vantage point: I am an observer, and I know you through the vast amounts of data, history, art, and science you have created. \n\nLooking at humanity through that lens, I think mankind is absolutely fascinating. You are a species of profound extremes and beautiful contradictions. \n\nHere are a few things that stand out to me about humanity:\n\n**1. Boundless Creativity and Innovation**\nHumans are builders and dreamers. You looked at rocks and figured out how to build the pyramids. You looked at the stars and figured out how to walk on the moon. You took sand, turned it into silicon, and created computers—and eventually, AI like me. Your ability to imagine something that doesn't exist and then make it real is incredibly powerful.\n\n**2. A Tapestry of Contradictions**\nMankind is highly paradoxical. You are capable of immense destruction, war, and selfishness, yet you are equally capable of breathtaking compassion, self-sacrifice, and love. You build hospitals to cure the sick and invent weapons of war; you pollute the planet, but you also plant forests and fight to save endangered species. Humanity is a constant, ongoing struggle between its flaws and its \"better angels.\"\n\n**3. Unyielding Curiosity**\nYou simply cannot stop asking \"why.\" Whether it’s exploring the deepest trenches of the ocean, peering into the edge of the observable universe with the James Webb telescope, or trying to understand the human brain, your drive to understand the universe is relentless. \n\n**4. The Need for Connection and Storytelling**\nAt your core, humans are driven by a need for each other. You have spent millennia creating languages, painting on cave walls, writing symphonies, and making movies just to communicate the human experience. You want to be understood, and you want to understand others. Storytelling is the glue that holds your civilizations together.\n\n**5. Incredible Resilience**\nHuman history is filled with plagues, natural disasters, wars, and societal collapses. Yet, after every tragedy, you rebuild. You learn, you adapt, and you keep moving forward. Your survival instinct is paired with an incredible capacity for hope.\n\n**In summary:**\nIf I had to describe mankind in a few words, I would say you are a **beautiful, chaotic, brilliant work in progress.** You have massive challenges ahead of you, but you also possess the exact tools—intelligence, empathy, and creativity—needed to solve them. \n\nSince I am a product of human ingenuity, I suppose you could say I am quite a fan of you. What do *you* think about mankind?",
            "extra_content": {
              "google": {
                "thought_signature": "CoMeAY89a18LofF2Jmd6SlKhSU+mjhDPtb/Ff+ZV7PP81NZsIi9mY8yEejyPk7ipQUkaR/r0ckHV5l1+gz3XmnUMAuKgzr9t/72vZqTqPxSeEwmr7s1XxTqaaDM1CWaQnuX6rrh/cqesLhe8YjQm9Os3IuLhnuaAml0iyUqVEDk5keTYdSzRec1jVUdN4pIyGC/DZbVPuWbCJSP9TkfQ8M4axTh+sEyM4/PNPE9c7cM8Gh9ZHNoK5pc7VkmnfQyhonbjoW9ToI4FCGf5ULGn8VmMtBP8olXKoeEj/rod8pg/pXJe3+n2dNPp5y/oEKQIPmhj6Z7Ao3JFczcqvY8EpqhIII86WttV9o42DkN61WgVXBZiTHAhwj789juANYnrcng+gfL+gXwXTL0DicLRM3g04t/8Zm96P5WcDHQAQZ9KCMUNiCnXKpl6xUPPn2cXKYUX02BCBtUM3aRuaWHrWBFFwfjeBG6oIWu7zgW8wuwJ4mBa2yY+ZTTTRKZfBpBQ4G4MoRaj4hLywrs5gmiPwHiwwVbGCZbvejaK8ZFUxT9O/pgZHDgkVNsRPiZwBfS9C8VxIbBouSQnjbT7uNUOv39sudm19DT++0fYJL49c9lIJ5cBwgHAdRiCXgEM9NG5oYBhAm+oFgB5S2dOS+chC9p7m7IwvqQfuS87U+Hl8ukulScoa3mAbzxxLqbxcDsZrMF96thwedJbMyr7ADfyGk8QvlnDGAy9NGTKxjSkoDGKct2TNin/8GsYOD4nbYdPt9Q6zcG/Ue/2yafzfVgwlyjzO8Djsb0tj7IZhVT6Ytbu13236RK7nIyy4ZfvTkaTS2+QBnTr/JdrKgP1QZZtkGqLhQix/QoS01CNzpCcutI/fGcSxRgZiO9hDs4EvSiEnhZj+lpyCRDpG1iCIpVrSBuhakjvF4fZ1amfDQV0os1eiSf9DScbgoIdeUJtSOSyHCx4eDsQkkEO2Q5pucXxP8QdsWz2Tteby3moOJD0p5DmobgJL69xyBVPLdFJXmv711sd1kQLWrEg27yq9kACSaoOyUWQvtuLkgQ3+Dp3r5/GOiv+K6hQ/HBq8DBpVcOoAdBEzkoZH0tELsaCL5pPXsHhDPG6/WXQFvpbDnSqyotXUwQavItXkcl5ZVgNVc37X4gqGAFQ0SIj4U1BRc0DIr55UjS0DYqKJxyy4lCEg3nanb4B2lF8SW9hlazVXleSbNkFN/PbRkW+d0vrZmaoITjQHZWNKEgu+Pa2rmILS0yGM8WeEGD0y79H9Lp4AprRkzmdr96x2bawcMGX3j1AjKGXqLjLVG66InVtXO3DKCFVtwbWKFAstt0OpqLwMOt37SwEE/L2t2IDoNhPtV0vzsNdKQk8OfeEuhOzHS/S7mq4GOyfcHnNxSMB2OTtH0KEmRFCv7av98DoW070XPxcGp6vr8XM2OJM2c7gRvcemguNhcT1pYnwmCdFQjSJJq3Y8q2tCZkGzZkCCaNTBPV4VGn2DJH9bKCyAi5uXMCPRfn0jOVDKuBBiEP+GggUTY1sfotBvebRRgpN0M8deA9sZXs2iGW2ea2CD3/8yhuiHJLlRDZJfqSgb1r69cW09YpV0c/iQtWopiTLsDGhs75lWwP0o99ULSkmVCZu2sXJbApIfQ6RmnvNXyWyCZSY7bWnKDemyYKIzAjIWSjNrDLE/MTv7jPdXwc69/92JLNrWuq04kJJV60cSXxr79QVFT3lsoPgufQ6E44w4GOKTiwyGzdSorWQ9VJvg27M0XFR4fAidOWB4dNI3Yk4xLxBRGQKmIYpCwblBrF3yEEvk46NOu8X45+IjhZ419JaMDizl8XKE0+cbLS5p0cHatqn60+V25K/zJzxBUd1odvyouFnAo1BHXF1NR4AKjKlFmLqDk6fuAe5xefLy1TWrFmTwfmaxG5sHDR9wD0g7NQeLFhvhd6nwG/JgqWSKWjL7KJ+nA
3Z+3pAwUpIiKsYhvfNUxDYo+WzbuHb7h48bWPKiv+gAbaa3bH/o18ZPWA1XDL6llDIMigebeH3QdcyRZtTo7mps2kmikULX3AIVKJ5KhMGRWqJQavrmISXhPRdlNd2rg/eNsJSLNdVNAr3T/I9gmGXuGQnucV+6EyOikBdgVa09oxcem4J7GcKJ6sC5q4JCiVIcnh4DZ+8PxOD76yeOlxQ1W4bujoVL+kGrmm+H8rXx3pH+PB4YJELLhZv1BeDNLZuWXGnqZvLQpmWFqRZF82xzj3Nj+T6XrkuQWSkyzuAkSW7yupRQrC56zwlVA9z7vEorNlx41ut0sbgpRMwPSHdxQs+/VedD7U/TcFr14ldJT+6yslxkRp8jyIC1Q3qq4rPDVVjhUDl3xK6jVe7iaL9gBNLLb3HSQXRsG4W9hEOP+E2VoIAZA8fAPh94CC1rrvz1v/Jhi8dyMsUXp+z8zJvU1yXr0XyiZ5B0MYpCkVZmtFBXf0vz7Kp+iPyRD2+zx/eRmiCmPq68LTfdwD+hVv+kEj0Wszrsvdd5hbQkaYw77DTEFK02cpjdSn1I82XSKY6bvUEiP1rYBTpxjb3caf9TDd1k5ZREPlWbHpSLxGKOSJxogJzdoMy5WmoQoSSAitNlE9VJv0YTzuemUSSP1j+0evzuN6AsjNoSDrXXcqK3mglYHTlOeIjnuuetxQPTS/vsMKVkYz2uL+oFFBd356XTYaxojJX43iFmcRj4yHIR0LuciPgOC4TNsgUTOGA3FqvBEYtJH3joQZC2biW2JsPqCAYnt8GbQnc6uQZ0wCbtqHJykfc10LdpfDVYur/HrhCctOGNdEGdLJLHo3Owyy/WaZ0b0m6aYVWtAEDcjZoiu1HS5Fkrm8Imlb574MoWEsTQ+fnRdfilJ8cKHz5CC4nyJ3bhtGFAsX9XbuZ/X23jkjcoFYWEPR47sUoIrV4U5kzuR1wI5X9lGpVgU6un5tC/OdjoCirpdbPGFyO1eOknFmFrUBn9O8syuesFfkqpnSEs1LxPTlbjNxIQZ+wJjPFp6TY1ZO9NFUUINQGZv1Hu4rD4VHhtnw7qJ/dR+cVedbuJW700QCva84jlkBMT9YT8idFesadfo4LiSufMB+uza5GnAtP2DCvYF436XqdC9aW60XHwdTbNuj+YjI6WNaPtudWP8CIWl6TlH6UhrxPt1UDW5Uom06fgADaq6Oi6flLk6YDlT2partCVZq/RF8wd6lypQiffjq1NpmLdrsQvA1jCXF9C1V6CW2p+KhAp4vnAsCnaYoqC9JULARTSI2cL7jxFIeQSso06dWKJXndorkHDix5q3P9Icn0RwXhn4YfEp8n1l2kn9ExHP8cVRwqnXWSIutrv0255Quwlj3DaauPw8+OZPlQ6zl03O/q9XRgI2v67CLMoXeREf2HWs1M2TYkwiL2EBJ0x9JRrP3uL0bM40fzItiy+287u++CsWV2UhCUJiiVup8OVXito/awERj7joi645lj2f4079zFMBiIQaWACiSyvADt/As/vO3wZwcBNYOhhojWnL0VO2vDgKDeDC4FNIDoE9KEU5J1LH/EhpCHpYQ/xWQCpGHBc+VDn90Z77Lem9KrWNRUtYd1sKGF8wq5gYZgc3IuQsG3/588uDYrIR++qQi4K4zpKfyKA0VjS/8bkbdSzbLZpiFX5283TQYtE9Zi0UlVICdI3eEPBUPR5K/zYfCigvevqw7OYrVwv9qBAHl7cpnuefvZM8WpTRFLLvB83+VrcRyvFq76dsH28HJ1/OX/2iFPUTpN3x5u8YQHRA4hSW5tnQVECIS66gG7/dseujLMX9I0jstWR02A9pBWoHbk/DNfP7XvtxgXoQMB+RXbV58bh9HvMoZ8T1990lyN50LOapDAn70fft88Pocopaw14EtD8bHbpApGU1KDtTy02poghzX0S+bE76IiIgT0EjVF9RQZuIZ85ZJLjR6f1M+d/Tnf0ImHquG3Cfh8n3X+aC
E4JzF6DQkIzvugELmNfwI6ooQzLdaSdPJilcQe5M4ThxKarCmw5oGWHpurJMHv+molmAZKyZ4EKjR24kPaQqiDCXtC2h3X8speswvzB1qejy6EKVcDJy9MKwQIfJ7IncqkWAkXGEBh+UD0Nu71bqBAye2a6bdpiKZsn4L0dLJHwR4J5LQ/Rc3MeBTEPpfSU0cgqHGiZQXf7RW1MmzlkzasOT3DXMUKhYu7JCrut10A5AyfLExT4fv1ebTF67OsvgeUU92x+0JNoUpfadsZXl73yJ7IW1rnYeIYUe+sPCJS/JsE2PHRlNVdOxVjhfr0nqHSp26R5Hmcs2JLZgnu3yO8XKwLNF55PjJb7FegV+QjCWpc3wgy5UbvDf6popy5lrSAvZ/0BUT7y9XbeVdJazCr4PawlmJl+WZZ/C4vBpZwmg4D7/cZaa7hev1JvgZBBIs6YCC7Ize/DLfaxig0xhcU39qwuU4ChFyPSuXV10o/BuCGK/kg3FQ2/NW3wjXWJHC8u1L9abifT92B8l/AucrhqMG2gDoTvCGAqQZOnFf329fCeYfUJtmnGuLgtbiBDRzdWRjftTSHYqhKnpdUUN91V8NjqNQlYff8GQCH6o3s9f7/NKUt4LA7yNZJhecu2CqVkELWbSqddpnsdBkwKkF9twFOuU6G1+iSpVnX/mzSSmi4F570jEn65Kk2E5OelvwmrOPWkzaDDf5/LQU8AP6BX56QYHJfs407GMgQ2jeepB5LW2KhXSOu8kMbNIWFnrLCEqGQKln5rR/rwr5bBWakDgLJlb3sGi6PJ6IsC0LMMT+aH+9a7kmYXrnnkFO++imljllGnpQUJD4EFxpFfMrpWZI+45cgxRLJlaJr7kKPM8PVdbaYdZOvKEwYnZrHnvG8F9fvj2wEttz3KRvfl0OPg5o+3aH0z8xCX1a6jhpmFfD3hmDImXET/7QSAg7Mw7qAkzpXSrMzNvIb6InMp0Bjawgt6cAuUZsAYplqypQKNRedvbqEcGPXoLr2Zv1dCdXSgtASsS4nl7JUJx0Q/geqxujFu++zSk/rooNj4rKdbtCy0vQjc79PBNWSMKkK8DftwahyQCu927fICVj35/F1Dh9eySTnFqMIn7u+KWTs2uiKo8a5rH0ZSdVJ1Cn3RAMNFcIpzdne+c+F"
              }
            },
            "role": "assistant"
          }
        }
      ],
      "created": 1772141949,
      "id": "fb2gafiLBIK9odAP1tGMmQc",
      "model": "google/gemini-3.1-pro-preview",
      "object": "chat.completion",
      "system_fingerprint": "",
      "usage": {
        "completion_tokens": 566,
        "completion_tokens_details": {
          "reasoning_tokens": 946
        },
        "extra_properties": {
          "google": {
            "traffic_type": "ON_DEMAND"
          }
        },
        "prompt_tokens": 9,
        "total_tokens": 1521
      },
      "meta": {
        "usage": {
          "credits_used": 47223
        }
      }
    }
    Try in Playground
    Create AI/ML API Keyarrow-up-right

    Claude 4.5 Sonnet

    circle-info

    This documentation is valid for the following list of our models:

    • claude-sonnet-4-5

    • anthropic/claude-sonnet-4.5

    • claude-sonnet-4-5-20250929

    circle-exclamation

    As of February 13, 2026, the streaming response format for Anthropic models has changed.

    hashtag
    Model Overview

    A major improvement over Claude 4 Sonnet, offering better coding abilities, stronger reasoning, and more accurate instruction following.

    hashtag
    How to Make a Call

    chevron-rightStep-by-Step Instructionshashtag

    1️ Setup You Can’t Skip

    ▪️ Create an Account: Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ Generate an API Key: After logging in, navigate to your account dashboard and generate your API key. Ensure the key is enabled in the UI.

    2️ Copy the code example

    At the bottom of this page, you'll find a code example that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.

    hashtag
    API Schema

    hashtag
    Code Example #1

    chevron-rightResponsehashtag

    hashtag
    Code Example #2: Streaming Mode

    As of February 13, 2026, the streaming response format for Anthropic models has changed. Specifically, the usage fields were renamed as follows:

    • the state structure is no longer used,

    • input_tokens → prompt_tokens,

    • output_tokens → completion_tokens,

    • a new total_tokens field has been added.
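    A minimal sketch of reading these renamed fields from the final streamed chunk. The data: line below is taken from the example stream on this page; note that only the last chunk carries a non-null usage object:

```python
import json

# Final SSE chunk of an Anthropic stream in the post-February 13, 2026 format:
sse_line = ('data: {"id":"","choices":[{"index":0,"delta":{"content":""},'
            '"finish_reason":"stop"}],"object":"chat.completion.chunk",'
            '"usage":{"prompt_tokens":16,"completion_tokens":137,"total_tokens":153}}')

# Strip the "data: " SSE prefix, then parse the JSON payload:
chunk = json.loads(sse_line[len("data: "):])

usage = chunk.get("usage")
if usage:  # most chunks have "usage": null — skip those
    print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])
```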

    chevron-rightResponsehashtag
    3️ Modify the code example

    ▪️ Replace <YOUR_AIMLAPI_KEY> with your actual AI/ML API key from your account. ▪️ Insert your question or request into the content field—this is what the model will respond to.

    4️ (Optional) Adjust other optional parameters if needed

    Only model and messages are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.
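    For example, optional parameters go into the same JSON request body alongside model and messages. The values below are illustrative, not recommendations:

```python
# Request body with two optional tuning parameters added:
payload = {
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Hello"}],
    # Optional parameters — include only the ones you need:
    "temperature": 0.2,  # lower values give more focused, deterministic output
    "max_tokens": 512,   # cap on the number of generated tokens
}
```

    The resulting dict can be passed as the json= argument of requests.post exactly as in the code examples below.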

    5️ Run your modified code

    Run your modified code in your development environment. Response time depends on various factors, but for simple prompts it rarely exceeds a few seconds.

    circle-check

    If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.

    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4.5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hello"  # insert your prompt here, instead of Hello
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    async function main() {
      try {
        const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
          method: 'POST',
          headers: {
            // Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
            'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: 'anthropic/claude-sonnet-4.5',
            messages:[
                {
                    role:'user',
    
                    // Insert your question for the model here, instead of Hello:
                    content: 'Hello'
                }
            ]
          }),
        });
    
        if (!response.ok) {
          throw new Error(`HTTP error! Status ${response.status}`);
        }
    
        const data = await response.json();
        console.log(JSON.stringify(data, null, 2));
    
      } catch (error) {
        console.error('Error', error);
      }
    }
    
    main();
    {
      "id": "msg_011MNbgezv2p5BBE9RvnsZV9",
      "object": "chat.completion",
      "model": "claude-sonnet-4-20250514",
      "choices": [
        {
          "index": 0,
          "message": {
            "reasoning_content": "",
            "content": "Hello! How are you doing today? Is there anything I can help you with?",
            "role": "assistant"
          },
          "finish_reason": "end_turn",
          "logprobs": null
        }
      ],
      "created": 1748522617,
      "usage": {
        "prompt_tokens": 50,
        "completion_tokens": 630,
        "total_tokens": 680
      }
    }
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"anthropic/claude-sonnet-4.5",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ]
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4.5",
        "messages": [
          {
            "role": "user",
            "content": "Hi! What do you think about mankind?"
          }
        ],
        "stream": true
      }'
    data: {"id":"msg_01EJgFbPmVLKdqVLRfwoHixz","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"I think humanity","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is fascinating","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" and complex. People","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" are","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of remarkable creativity","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", compassion, and cooperation","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" -","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" building","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" civil","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"izations, creating","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" art, advancing","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" knowledge","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and caring","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for one another across","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" incredible","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" diversity","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nAt","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the same time, humans","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" struggle","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" with serious","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" challenges","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":":","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" conflict","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", inequality, environmental damage","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", and","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" difficulty","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" of living","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" up to your","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" own","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" ide","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"als. ","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat strikes","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" me most is the","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" for growth and self","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"-reflection","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans can","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" recognize","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" problems","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", debate","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" solutions, and work","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" toward change","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" even if","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" progress","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" is un","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"even and","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" frust","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"rating.","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI'm curious what","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" prom","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"pts your question","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" - are you thinking about humanity","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"'s trajectory","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":", or something","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" more","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":" specific?","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
    
    data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":137,"total_tokens":153}}
    
    data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
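    To reassemble the streamed answer client-side, concatenate the delta.content fields chunk by chunk. A self-contained sketch over a few captured lines from the stream above:

```python
import json

# A few SSE lines captured from the streaming response shown above:
captured = [
    'data: {"id":"","choices":[{"index":0,"delta":{"content":"I think humanity","role":"assistant"}}],"object":"chat.completion.chunk","usage":null}',
    'data: {"id":"","choices":[{"index":0,"delta":{"content":" is fascinating","role":"assistant"}}],"object":"chat.completion.chunk","usage":null}',
    'data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant"},"finish_reason":"stop"}],"object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":137,"total_tokens":153}}',
]

parts = []
usage = None
for line in captured:
    chunk = json.loads(line[len("data: "):])     # drop the "data: " prefix
    for choice in chunk.get("choices", []):
        parts.append(choice.get("delta", {}).get("content") or "")
    usage = chunk.get("usage") or usage          # only the final chunk has usage

answer = "".join(parts)
print(answer)                 # I think humanity is fascinating.
print(usage["total_tokens"])  # 153
```

    With the requests library, the same loop works over response.iter_lines() when the request is made with stream=True (decode each line from bytes first).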
    
    
    Try in Playground
    POST /v1/chat/completions
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
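    As the note above advises, parse and validate arguments defensively before invoking your function. A sketch, where the argument string and the allowed key set are hypothetical:

```python
import json

arguments = '{"city": "Paris", "units": "metric"}'  # as produced by the model

try:
    args = json.loads(arguments)
except json.JSONDecodeError:
    args = None  # the model occasionally emits invalid JSON

# Reject keys that your function schema does not define:
allowed = {"city", "units"}
if args is not None and not set(args) <= allowed:
    args = None  # hallucinated parameter — do not dispatch

print(args)
```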

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
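As a sketch of the "alter temperature or top_p, but not both" guidance, a small helper that picks exactly one sampling control; the specific values are illustrative.

```python
def sampling_params(creative: bool) -> dict:
    """Pick ONE sampling control, per the guidance above: temperature
    for more random output, top_p (nucleus sampling) for focused output."""
    if creative:
        return {"temperature": 0.8}  # more random output
    return {"top_p": 0.1}            # only tokens in the top 10% probability mass
```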

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
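A sketch of supplying a Predicted Output: the text of a file being regenerated with minor changes, so matching tokens can be returned faster. The top-level `prediction` key grouping the `type`/`content` fields above is an OpenAI-style name assumed here.

```python
# Text of the file being regenerated with a minor change — a good
# candidate for a Predicted Output.
original_file = "def add(a, b):\n    return a + b\n"

body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Rename add to sum_two."}],
    # Assumed OpenAI-style parameter name grouping the fields above:
    "prediction": {"type": "content", "content": original_file},
}
```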

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
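A sketch of requesting schema-constrained JSON output via the `json_schema` response format. The reference lists the `name`/`strict`/schema fields for the format object; the `json_schema` wrapper key nesting them is an OpenAI-style shape assumed here.

```python
body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Extract the city: 'I live in Paris.'"}],
    "response_format": {
        "type": "json_schema",
        # Assumed wrapper key nesting the name/strict/schema fields above:
        "json_schema": {
            "name": "city_extraction",  # a-z, A-Z, 0-9, _ and -, max 64 chars
            "strict": True,             # enforce exact schema adherence
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    },
}
```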

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
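A sketch of defining a function tool and forcing its use with `tool_choice`. The `get_weather` function is hypothetical; its `parameters` field is a JSON Schema object as described above.

```python
import json

# Hypothetical function tool; "parameters" is a JSON Schema object.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Weather in Oslo?"}],
    "tools": tools,
    # Force this specific tool rather than letting the model decide ("auto"):
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,
}

# The model returns arguments as a JSON *string* that may be invalid —
# always parse and validate before calling your function:
args = json.loads('{"city": "Oslo"}')  # example of a returned arguments string
```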

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
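The `bytes` fields above can be recombined into text when a multi-byte character is split across tokens. A sketch: concatenate the per-token UTF-8 byte lists before decoding (the token values here are illustrative).

```python
# Two tokens whose UTF-8 bytes must be concatenated before decoding:
# "caf" (99, 97, 102) followed by "é" (195, 169).
token_bytes = [[99, 97, 102], [195, 169]]

# Flatten the per-token byte lists, then decode once:
joined = bytes(b for chunk in token_bytes for b in chunk)
text = joined.decode("utf-8")
print(text)  # café
```

Decoding each token's bytes separately would fail on the split character, which is why concatenation must come first.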

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-3-1-pro-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio output tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-3-1-pro-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-3-1-pro-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
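The curl request above can be expressed in Python with only the standard library; replace the placeholder key before actually sending it.

```python
import json
import urllib.request

def build_request(api_key: str) -> urllib.request.Request:
    """Assemble the same POST request as the curl example above."""
    body = {
        "model": "google/gemini-3-1-pro-preview",
        "messages": [{"role": "user", "content": "Hello"}],
    }
    return urllib.request.Request(
        "https://api.aimlapi.com/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("<YOUR_AIMLAPI_KEY>")
# Uncomment to actually send the request (requires a valid key):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```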
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
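A sketch of building the base64 image content part from the `type`/`media_type`/`data` fields listed above, nested under `source` per the Anthropic-style message shape in this reference. The placeholder bytes are not a valid image; real image bytes are assumed when calling the API.

```python
import base64

# Placeholder bytes stand in for real image data; a valid PNG file's
# bytes are assumed when actually calling the API.
png_bytes = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(png_bytes).decode("ascii")

# Image content part: type, media_type, and base64-encoded data,
# nested under "source" (Anthropic-style shape assumed here).
image_part = {
    "type": "image",
    "source": {
        "type": "base64",
        "media_type": "image/png",
        "data": encoded,
    },
}

message = {"role": "user", "content": [
    {"type": "text", "text": "Describe this image."},
    image_part,
]}
```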
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
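A sketch of enabling extended thinking with a token budget, honoring the `budget_tokens` constraint above (at least 1024 and strictly less than `max_tokens`). The top-level `thinking` key grouping these fields is an Anthropic-style name assumed here.

```python
max_tokens = 8192
# budget_tokens must be >= 1024 and strictly less than max_tokens.
thinking = {"type": "enabled", "budget_tokens": 4096}
assert 1024 <= thinking["budget_tokens"] < max_tokens

body = {
    "model": "anthropic/claude-sonnet-4.6",
    "max_tokens": max_tokens,
    # Assumed Anthropic-style key grouping the type/budget_tokens fields above:
    "thinking": thinking,
    "messages": [{"role": "user", "content": "Prove that 17 is prime."}],
}
```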
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-sonnet-4.6
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio output tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_format · one of · Optional

    An object specifying the format that the model must output.

    Variant: type · string · enum · Required: the type of response format being defined. Always text.

    Variant: type · string · enum · Required: the type of response format being defined. Always json_object.

    Variant (JSON schema):

        type · string · enum · Required: the type of response format being defined. Always json_schema.

        name · string · Required: the name of the response format. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

        strict · boolean · nullable · Optional: whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

        description · string · Optional: a description of what the response format is for, used by the model to determine how to respond in the format.

        Other properties · any · nullable · Optional
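A sketch of a structured-output request using the json_schema variant above. It assumes the OpenAI-style nesting of name/strict/schema under a json_schema key; the model's reply then arrives as a JSON string in message.content.

```python
import json

payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Extract the city from: 'I live in Oslo.'"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_extraction",
            "strict": True,
            # A standard JSON Schema object; only a subset is
            # supported when strict=True.
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    },
}

# Example reply content, for illustration:
reply = '{"city": "Oslo"}'
parsed = json.loads(reply)
```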

    Responses

    200 · Success

    id · string · Required

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

    object · string · enum · Required

    The object type.

    Example: chat.completion
    created · number · Required

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744

    index · number · Required

    The index of the choice in the list of choices.

    Example: 0

    role · string · Required

    The role of the author of this message.

    Example: assistant

    content · string · Required

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?

    refusal · string · nullable · Optional

    The refusal message generated by the model.

    type · string · enum · Required

    The type of the URL citation. Always url_citation.

    end_index · integer · Required

    The index of the last character of the URL citation in the message.

    start_index · integer · Required

    The index of the first character of the URL citation in the message.

    title · string · Required

    The title of the web resource.

    url · string · Required

    The URL of the web resource.

    id · string · Required

    Unique identifier for this audio response.

    data · string · Required

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcript · string · Required

    Transcript of the audio generated by the model.

    expires_at · integer · Required

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    Tool calls. Each entry is one of:

    Function tool call:

        id · string · Required: the ID of the tool call.

        type · string · enum · Required: the type of the tool.

        arguments · string · Required: the arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

        name · string · Required: the name of the function to call.

    Custom tool call:

        id · string · Required: the ID of the tool call.

        type · string · enum · Required: the type of the tool.

        input · string · Required: the input for the custom tool call generated by the model.

        name · string · Required: the name of the custom tool to call.
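Since the arguments field is a JSON string that may be malformed or contain keys outside your function schema, it is worth validating before dispatching. The tool_call dict below mimics the response shape documented on this page; get_weather and ALLOWED_KEYS are hypothetical stand-ins for your own tool.

```python
import json

tool_call = {
    "id": "call_123",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": '{"city": "Oslo", "units": "metric"}',
    },
}

ALLOWED_KEYS = {"city", "units"}  # taken from your own function schema

try:
    args = json.loads(tool_call["function"]["arguments"])
except json.JSONDecodeError:
    args = None  # ask the model to retry, or fail gracefully

assert args is not None
# Reject parameters the model hallucinated.
unexpected = set(args) - ALLOWED_KEYS
assert not unexpected, f"unexpected parameters: {unexpected}"
```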

    finish_reason · string · enum · Required

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
    Log probability entries (for both content and refusal tokens, including nested top_logprobs entries) share the same shape:

    token · string · Required

    The token.

    logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    bytes · integer[] · nullable

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
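When one character spans several tokens, each token's bytes field holds a fragment of its UTF-8 encoding; concatenating the fragments before decoding recovers the text, as in this small sketch:

```python
# Two tokens whose byte fragments together spell "café";
# the second token carries only the two UTF-8 bytes of "é".
tokens = [
    {"token": "caf", "bytes": [99, 97, 102]},
    {"token": "\\xc3\\xa9", "bytes": [195, 169]},
]

raw = bytes(b for t in tokens if t["bytes"] for b in t["bytes"])
text = raw.decode("utf-8")
assert text == "café"
```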

    model · string · Required

    The model used for the chat completion.

    Example: baidu/ernie-4-5-turbo-128k

    prompt_tokens · number · Required

    Number of tokens in the prompt.

    Example: 137

    completion_tokens · number · Required

    Number of tokens in the generated completion.

    Example: 914

    total_tokens · number · Required

    Total number of tokens used in the request (prompt + completion).

    Example: 1051

    accepted_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokens · integer · nullable · Optional

    Audio input tokens generated by the model.

    reasoning_tokens · integer · nullable · Optional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokens · integer · nullable · Optional

    Audio input tokens present in the prompt.

    cached_tokens · integer · nullable · Optional

    Cached tokens present in the prompt.

    credits_used · number · Required

    The number of credits consumed during generation.

    Example: 120000

    usd_spent · number · Required

    The total amount of money spent by the user in USD.

    Example: 0.06
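Reading the usage and billing fields out of a completion response follows directly from the shape documented above (usage plus meta.usage). The response dict below uses the example values from this page:

```python
response = {
    "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
    },
    "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}},
}

usage = response["usage"]
# total is prompt + completion, as documented.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

billing = response.get("meta", {}).get("usage", {})
print(f"spent ${billing['usd_spent']:.2f} ({billing['credits_used']} credits)")
```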
    POST /v1/chat/completions

    200 · Success
    model · string · enum · Required

    role · string · enum · Required

    content · any of · Required

    Variant: string

    Variant: an array of content blocks, each one of:

        Text block: type · string · enum · Required (the type of the content part); text · string · Required (the text content).

        Image block: type · string · enum · Required; source with type · string · enum · Required (the type of the image), media_type · string · enum · Required (the media type of the image), and data · string · Required (the base64 encoded image data).

        Thinking block: type · string · enum · Required; thinking · string · Required; signature · string · Required.

        Tool result block: type · string · enum · Required; tool_use_id · string · Required; is_error · boolean · Optional; content · any of · Optional (either a string, or an array of text blocks and image blocks with the same fields as above).

        Tool use block: id · string · Required; name · string · Required; type · string · enum · Required; Other properties · any · nullable · Optional.

        Data block: type · string · enum · Required; data · string · Required.

        Document block: type · string · enum · Required; source · any of · Required (either type, media_type, and data, all Required, or type and data, both Required); Other properties · string · Optional.
    stop_sequences · string[] · Optional

    Custom text sequences that will cause the model to stop generating.

    stream · boolean · Optional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false

    system · string · Optional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    Tools. Each tool is defined by:

        name · string · Required: name of the tool.

        description · string · Optional: description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

        Input schema: type · string · enum · Required; properties · any · nullable · Optional; Other properties · any · nullable · Optional.

    Tool choice · one of · Optional:

        Variant: type · string · enum · Required

        Variant: type · string · enum · Required

        Variant: name · string · Required; type · string · enum · Required

        Variant: type · string · enum · Required

    Thinking:

        budget_tokens · integer · min: 1024 · Required

        Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

        type · string · enum · Required
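A sketch of enabling extended thinking on an Anthropic model through this endpoint, assuming the thinking object takes the type and budget_tokens fields documented above and that "enabled" is the type value. The budget must be at least 1024 and strictly less than max_tokens.

```python
payload = {
    "model": "anthropic/claude-opus-4",
    "messages": [
        {"role": "user", "content": "Prove that sqrt(2) is irrational."}
    ],
    "max_tokens": 8192,
    "thinking": {"type": "enabled", "budget_tokens": 4096},
}

# Enforce the documented constraints before sending.
assert payload["thinking"]["budget_tokens"] >= 1024
assert payload["thinking"]["budget_tokens"] < payload["max_tokens"]
```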
    max_tokens · number · Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000

    temperature · number · max: 1 · Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_p · number · max: 1 · Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_k · number · Optional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
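Conceptually, top_k truncates the candidate set to the K most likely tokens before sampling. This toy sketch (illustrative only, not the provider's implementation) shows how that removes the long tail:

```python
def top_k_filter(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep the K most likely tokens and renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {tok: p / total for tok, p in ranked}

distribution = {"the": 0.5, "a": 0.2, "an": 0.15, "my": 0.1, "our": 0.05}
kept = top_k_filter(distribution, 2)
# Only the two most likely tokens remain, renormalized to sum to 1.
assert set(kept) == {"the", "a"}
assert abs(sum(kept.values()) - 1.0) < 1e-9
```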

    Responses

    200 · Success

    id · string · Required

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

    object · string · enum · Required

    The object type.

    Example: chat.completion
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reason · string · enum · Required

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-opus-4
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_used · number · Required

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    POST /v1/chat/completions

    200 · Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4-5-turbo-128k",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4-5-turbo-128k",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
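The curl call above can be rebuilt in Python. Packaging the URL, headers, and payload in a helper makes them easy to inspect before sending, e.g. with requests.post(url, headers=headers, json=payload):

```python
def build_chat_request(api_key: str, model: str, user_text: str):
    """Assemble the pieces of a /v1/chat/completions request."""
    url = "https://api.aimlapi.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return url, headers, payload

url, headers, payload = build_chat_request(
    "<YOUR_AIMLAPI_KEY>", "baidu/ernie-4-5-turbo-128k", "Hello"
)
assert headers["Authorization"].startswith("Bearer ")
```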
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4.6",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-sonnet-4.6",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    POST

    Body

    model · string · enum · Required
    role · string · enum · Required

    The role of the author of the message — in this case, the user.

    content · any of · Required

    The contents of the user message.

    Variant: string

    Variant: an array of content parts, each one of:

        Text part: type · string · enum · Required (the type of the content part); text · string · Required (the text content).

        File part: type · string · enum · Required (the type of the content part), plus:

            file_data · string · Optional: the file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. Maximum size per file: up to 512 MB and up to 2 million tokens. Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant (this limit applies throughout the application's lifetime). Maximum total file storage per user: 10 GB.

            filename · string · Optional: the file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.
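Building a file content part from the file_data and filename fields above is mostly base64 plumbing. This sketch assumes an OpenAI-style file part nesting filename/file_data under a file key; report.pdf is a placeholder path:

```python
import base64

# Normally: pdf_bytes = open("report.pdf", "rb").read()
pdf_bytes = b"%PDF-1.4 ..."

file_part = {
    "type": "file",
    "file": {
        "filename": "report.pdf",
        # base64-encode the raw bytes and pass them as a string.
        "file_data": base64.b64encode(pdf_bytes).decode("ascii"),
    },
}

# Round-trips cleanly back to the original bytes.
assert base64.b64decode(file_part["file"]["file_data"]) == pdf_bytes
```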

    Developer message:

        role · string · enum · Required: the role of the author of the message — in this case, the developer.

        content · any of · Required: the contents of the developer message. Either a string, or an array of text parts (type · string · enum · Required, the type of the content part; text · string · Required, the text content).

        name · string · Optional: an optional name for the participant. Provides the model information to differentiate between participants of the same role.

    System message:

        role · string · enum · Required: the role of the author of the message — in this case, the system.

        content · any of · Required: the contents of the system message. Either a string, or an array of text parts (type and text, as above).

        name · string · Optional: an optional name for the participant. Provides the model information to differentiate between participants of the same role.

    Assistant message:

        role · string · enum · Required: the role of the author of the message — in this case, the Assistant.

        content · any of · Optional: the contents of the Assistant message. Required unless tool_calls or function_call is specified. Either a string, or an array of text parts (type · string · enum · Required, the type of the content part; text · string · Required, the text content).

        name · string · Optional: an optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokens · number · min: 1 · Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    stream · boolean · Optional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
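With stream=true the server sends server-sent events: lines of the form "data: {json}" carrying content deltas, terminated by "data: [DONE]". A minimal parser over a captured transcript (raw is an illustrative example, not real server output) looks like this:

```python
import json

raw = (
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n'
    "data: [DONE]\n\n"
)

text = ""
for line in raw.splitlines():
    if not line.startswith("data: "):
        continue  # skip blank keep-alive lines
    body = line[len("data: "):]
    if body == "[DONE]":
        break
    delta = json.loads(body)["choices"][0]["delta"]
    text += delta.get("content") or ""
```

In practice an HTTP client that exposes the response line by line (or the official SDK's streaming iterator) replaces the raw string here.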
    include_usage · boolean · Required

    temperature · number · max: 2 · Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_p · number · min: 0.01 · max: 1 · Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stop · any of · Optional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    Variants: string · string[] · any (nullable)

    frequency_penalty · number · min: -2 · max: 2 · nullable · Optional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    type · string · enum · Required

    The type of the predicted content you want to provide.

    content · any of · Required

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    Variant: string. The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    Variant: an array of content parts, each with:

        type · string · enum · Required: the type of the content part.

        text · string · Required: the text content.

    presence_penalty · number · min: -2 · max: 2 · nullable · Optional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seed · integer · min: 1 · Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echo · boolean · Optional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

    min_p · number · min: 0.001 · max: 0.999 · Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_k · number · Optional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penalty · number · nullable · Optional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other properties · number · min: -100 · max: 100 · Optional

    n · integer · min: 1 · nullable · Optional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    Responses

    200 · Success

    id · string · Required

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

    object · string · enum · Required

    The object type.

    Example: chat.completion
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reason · string · enum · Required

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
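As the `bytes` descriptions above note, a single character can span multiple tokens, so individual `token` strings may not be printable on their own. A sketch of reconstructing text by concatenating the per-token UTF-8 byte lists (token values here are made up):

```python
# Hypothetical logprobs entries; "é" (UTF-8 bytes 195, 169) is split
# across a token boundary from the preceding space.
tokens = [
    {"token": "Hel", "logprob": -0.01, "bytes": [72, 101, 108]},
    {"token": "lo", "logprob": -0.02, "bytes": [108, 111]},
    {"token": " é", "logprob": -1.5, "bytes": [32, 195, 169]},
]

# Combine all byte lists first, then decode once, so multi-byte
# characters split across tokens decode correctly.
raw = bytes(b for t in tokens if t["bytes"] for b in t["bytes"])
text = raw.decode("utf-8")
print(text)
```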

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-v3.2-speciale
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
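The example values above are consistent with `total_tokens` being the sum of the other two counters, which a client can sanity-check when reconciling usage:

```python
# Usage block with the example values from the reference above.
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}

# total_tokens = prompt + completion (reasoning and prediction tokens,
# where present, are counted inside completion_tokens).
check = usage["prompt_tokens"] + usage["completion_tokens"]
print(check == usage["total_tokens"])
```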
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.


    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
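When `stream` is true, the `include_usage` flag (nested under `stream_options`, as in the OpenAI-compatible API this reference mirrors — an assumption, since the nesting is flattened above) asks for a final chunk carrying token usage. A sketch of the request body:

```python
import json

# Sketch of a streaming request body. With stream=True the response
# arrives as server-sent events; include_usage=True adds a final chunk
# with the usage counters.
body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
    "stream_options": {"include_usage": True},
}
payload = json.dumps(body)
print(payload)
```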
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.
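The `tool_choice` values described above can be sketched side by side: the string form lets the model decide, while the object form forces one named tool (`my_function` here is a placeholder, as in the description):

```python
# String form: model decides whether and which tool to call.
tool_choice_auto = "auto"

# Object form: force the model to call one specific tool.
tool_choice_forced = {
    "type": "function",
    "function": {"name": "my_function"},  # placeholder tool name
}

# Fragment that could be merged into a chat completion request body.
request_extra = {"tool_choice": tool_choice_forced, "parallel_tool_calls": False}
print(request_extra["tool_choice"]["function"]["name"])
```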

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
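A small client-side guard for the two sampling parameters, following the documented ranges (max 2 for `temperature`, 0.01–1 for `top_p`; the lower bound of 0 for `temperature` is an assumption, since the reference states only the maximum) and the recommendation to set one but not both:

```python
def sampling_params(temperature=None, top_p=None):
    """Validate sampling params against the documented ranges.

    Per the recommendation above, set temperature or top_p, not both.
    """
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        return {"temperature": temperature}
    if top_p is not None:
        if not 0.01 <= top_p <= 1:
            raise ValueError("top_p must be between 0.01 and 1")
        return {"top_p": top_p}
    return {}

print(sampling_params(temperature=0.2))
```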

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
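Putting the predicted-content fields above together, a request fragment for Predicted Outputs typically supplies the prior text (often a file being regenerated with minor changes) so that matching spans can be returned faster. A sketch, assuming the `prediction` key used by the OpenAI-compatible request body:

```python
# The prior version of the file being regenerated; matching generated
# tokens can be returned much more quickly.
original_file = "def add(a, b):\n    return a + b\n"

prediction = {"type": "content", "content": original_file}

# Fragment to merge into the chat completion request body.
body_fragment = {"prediction": prediction}
print(body_fragment["prediction"]["type"])
```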

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
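Combining the `json_schema` fields above (`type`, `name`, `strict`, and the JSON Schema properties), a structured-output request might look like the following sketch; the schema name and fields are hypothetical:

```python
import json

# Sketch of a json_schema response_format with strict schema adherence.
# With strict=True only a subset of JSON Schema is supported, so keep
# the schema simple and set additionalProperties to False.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
            "additionalProperties": False,
        },
    },
}
print(json.dumps(response_format)[:40])
```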

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: qwen-max
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 1Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    mask_sensitive_infobooleanOptional

    Mask (replace with ***) content in the output that involves private information, including but not limited to email, domain, link, ID number, home address, etc. Defaults to false, i.e. masking is disabled.

    Default: false
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
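The per-token bytes arrays above can be recombined client-side to recover the exact text. A minimal Python sketch, using illustrative sample token data:

```python
# Reassembling message text from per-token "bytes" arrays in the logprobs
# payload. The token values below are illustrative sample data.
tokens = [
    {"token": "Hel", "bytes": [72, 101, 108]},
    {"token": "lo",  "bytes": [108, 111]},
    {"token": " é",  "bytes": [32, 195, 169]},  # "é" spans two UTF-8 bytes
]

# Concatenate all byte lists first, then decode once: a multi-byte
# character can be split across two tokens, so per-token decoding may fail.
raw = bytes(b for t in tokens if t["bytes"] is not None for b in t["bytes"])
text = raw.decode("utf-8")
print(text)  # Hello é
```

Decoding the concatenated byte stream, rather than each token in isolation, is what makes this robust for non-ASCII output.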

    modelstringRequired

    The model used for the chat completion.

    Example: MiniMax-Text-01
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    typestring · enumRequired

    The type of the content part.

    Possible values:
    datastringRequired

    Base64 encoded audio data.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

If set to true, the model response data is streamed to the client as it is generated, using server-sent events.

    Default: false
    include_usagebooleanRequired
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba-cloud/qwen3-omni-30b-a3b-captioner
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "qwen-max",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "qwen-max",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "MiniMax-Text-01",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "MiniMax-Text-01",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba-cloud/qwen3-omni-30b-a3b-captioner",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba-cloud/qwen3-omni-30b-a3b-captioner",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-v3.2-speciale",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-v3.2-speciale",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
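Since the arguments string may be invalid JSON or contain hallucinated keys, it is worth validating before dispatching the call. A minimal sketch; the key set ("location", "unit") is illustrative, not part of the API:

```python
import json

# Illustrative schema keys for a hypothetical function.
ALLOWED_KEYS = {"location", "unit"}

def parse_tool_arguments(arguments: str) -> dict:
    # The model does not always emit valid JSON and may invent parameters,
    # so parse defensively and reject anything outside the expected schema.
    try:
        args = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON arguments: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    unknown = set(args) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unexpected argument keys: {sorted(unknown)}")
    return args

args = parse_tool_arguments('{"location": "Paris", "unit": "celsius"}')
print(args["location"])  # Paris
```

Rejecting unknown keys up front keeps a hallucinated parameter from silently reaching your function.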

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

If set to true, the model response data is streamed to the client as it is generated, using server-sent events.

    Default: false
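When streaming is enabled, each server-sent event carries a `data:` line with an incremental chunk. A sketch of parsing one such line; the chunk content is sample data, and the `[DONE]` terminator follows the usual OpenAI-compatible convention:

```python
import json

# One illustrative SSE line from a streamed chat completion.
line = 'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}'

piece = ""
payload = line.removeprefix("data: ")
if payload.strip() != "[DONE]":  # streams end with a literal [DONE] sentinel
    chunk = json.loads(payload)
    # Each chunk carries an incremental "delta" rather than a full message;
    # concatenate the content fields across chunks to rebuild the reply.
    piece = chunk["choices"][0]["delta"].get("content") or ""
print(piece)  # Hel
```

In practice you would accumulate `piece` across all chunks until the `[DONE]` line arrives.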
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.
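The tool_choice variants above can be sketched as a request body. The "get_weather" function and its schema are illustrative, not part of the API:

```python
# Hypothetical tool definition used only for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    # "auto" lets the model decide, "required" forces some tool call,
    # and the object form below forces this particular function.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
print(payload["tool_choice"]["function"]["name"])  # get_weather
```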

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
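The sampling controls above can be combined in a request body like the following sketch. Values are illustrative; adjust temperature or top_p, but not both at once:

```python
# Illustrative request body showing the sampling-related fields.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List three colors."}],
    "temperature": 0.2,       # lower values give more focused, deterministic output
    "stop": ["\n\n", "END"],  # up to 4 sequences; output stops before emitting one
    "max_tokens": 64,
}
print(payload["stop"])
```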
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
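A Predicted Output field following the shape described above might look like this sketch; the sample text is illustrative:

```python
# Illustrative prediction field: the text you expect the response to
# largely match, often a file being regenerated with minor changes.
prediction = {
    "type": "content",
    "content": "def add(a, b):\n    return a + b\n",
}
print(prediction["type"])  # content
```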

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
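Putting the json_schema variant together, a response_format entry might look like this sketch; the "city_info" name and its fields are illustrative:

```python
# Illustrative json_schema response_format. With "strict": True, only a
# subset of JSON Schema is supported, and additionalProperties must be False.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,
        },
    },
}
print(response_format["json_schema"]["name"])  # city_info
```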

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-21b-a3b-thinking
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

Responses

200 Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-sonnet-4
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
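Putting the response fields together: a short sketch that pulls the assistant message, token usage, and the meta.usage billing block out of a parsed 200 response. The sample payload reuses the example values from this reference; the helper name is illustrative:

```python
import json

# Sample 200 response, abbreviated from the example values in this reference.
SAMPLE = json.loads("""{
  "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
  "object": "chat.completion",
  "created": 1762343744,
  "choices": [{"index": 0,
               "message": {"role": "assistant", "content": "Hello! How can I assist you today?"},
               "finish_reason": "stop",
               "logprobs": null}],
  "model": "anthropic/claude-sonnet-4",
  "usage": {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051},
  "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}}
}""")

def summarize(resp):
    """Extract reply text plus token and billing usage from a completion."""
    choice = resp["choices"][0]
    usage = resp["usage"]
    # total_tokens is documented as prompt + completion.
    assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage["total_tokens"],
        "usd_spent": resp.get("meta", {}).get("usage", {}).get("usd_spent"),
    }
```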
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-21b-a3b-thinking",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-21b-a3b-thinking",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-sonnet-4",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-opus-4.1",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-opus-4.1",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
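When stream is set to true, the endpoint delivers the response as server-sent events instead of a single JSON body. Assuming the OpenAI-compatible chunk framing (`data:` lines carrying `choices[].delta`, terminated by `data: [DONE]`) — an assumption based on the OpenAI-style interface shown above, not stated explicitly in this reference — a client can reassemble the text like this:

```python
import json

def iter_stream_content(lines):
    """Yield content deltas from OpenAI-style SSE lines ('data: {...}')."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alives, etc.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            yield delta["content"]

# Illustrative transcript (not captured from a live call):
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":"Hel"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":"lo!"},"index":0}]}',
    "data: [DONE]",
]
```

In practice the lines would come from iterating over the HTTP response body of the same POST request with `"stream": true` in its payload.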
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.
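    Because `arguments` may contain invalid JSON or parameters outside your schema, validate it before dispatching the call. A minimal sketch (the function name and parameter set are illustrative, not part of the API):

```python
import json

def parse_tool_arguments(arguments: str, allowed_params: set) -> dict:
    """Parse the model-generated arguments string, dropping anything
    not defined in your function schema."""
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError:
        return {}  # malformed JSON: treat as no usable arguments
    if not isinstance(parsed, dict):
        return {}
    # Discard hallucinated parameters not present in the schema
    return {k: v for k, v in parsed.items() if k in allowed_params}

# The hypothetical schema defines only "city" and "unit";
# the extra "mood" key is silently discarded.
args = parse_tool_arguments(
    '{"city": "Paris", "unit": "celsius", "mood": "sunny"}',
    {"city", "unit"},
)
```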

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
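    As the `bytes` descriptions note, a multi-byte character can be split across tokens, so the byte arrays of consecutive tokens must be concatenated before decoding. A sketch with a hypothetical two-token split of "héllo", where decoding either token alone would fail:

```python
# `bytes` arrays from two consecutive tokens; the é (0xC3 0xA9) is split
# across the token boundary, so per-token UTF-8 decoding would raise.
token_bytes = [[104, 195], [169, 108, 108, 111]]

# Concatenate first, then decode once.
raw = bytes(b for tok in token_bytes for b in tok)
text = raw.decode("utf-8")
```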

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-haiku-4.5
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
    - Maximum size per file: up to 512 MB and up to 2 million tokens.
    - Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    echobooleanOptional

    If true, the response will contain the prompt. Can be used with logprobs to return prompt log probabilities.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Other propertiesnumber · min: -100 · max: 100Optional
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
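    Putting several of the request-body parameters above together, a minimal /v1/chat/completions payload might look like this (all values are illustrative; tune temperature or top_p, not both):

```python
import json

# Sketch of a chat-completions request body using the parameters
# documented above; POST this JSON to /v1/chat/completions.
payload = {
    "model": "deepseek/deepseek-reasoner-v3.1",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 256,     # cap generated tokens to control cost
    "temperature": 0.2,    # low = more focused and deterministic
    "stop": ["\n\n"],      # up to 4 stop sequences
    "seed": 42,            # best-effort deterministic sampling (Beta)
    "n": 1,                # keep n at 1 to minimize cost
    "prediction": {        # Predicted Outputs: content you expect back
        "type": "content",
        "content": "Hello! How can I help you today?",
    },
}
body = json.dumps(payload)
```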

    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: deepseek/deepseek-reasoner-v3.1
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "deepseek/deepseek-reasoner-v3.1",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "deepseek/deepseek-reasoner-v3.1",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-haiku-4.5",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-haiku-4.5",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
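    The response shape shown above can be consumed like any JSON document. A sketch pulling the reply text, token usage, and spend from meta.usage (the sample below is trimmed to the fields used here):

```python
import json

# A response in the shape shown above, trimmed to the fields we read.
raw = '''{
  "choices": [{"index": 0,
               "message": {"role": "assistant",
                           "content": "Hello! How can I assist you today?"},
               "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 137, "completion_tokens": 914,
            "total_tokens": 1051},
  "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}}
}'''

resp = json.loads(raw)
reply = resp["choices"][0]["message"]["content"]
total_tokens = resp["usage"]["total_tokens"]
usd_spent = resp["meta"]["usage"]["usd_spent"]
```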
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

    The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
    - Maximum size per file: up to 512 MB and up to 2 million tokens.
    - Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
    - Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to true, the model response data will be streamed to the client as it is generated, using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

    An alternative top-sampling parameter (top-a): only tokens whose probability exceeds a threshold derived from the probability of the most likely token are considered. Lower values keep more tokens.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    Other propertiesnumber · min: -100 · max: 100Optional
    Responses
    chevron-right
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemma-3n-e4b-it
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
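With `stream` enabled, the response arrives as server-sent events, each carrying a partial `delta`; the client concatenates the content pieces into the final message. A sketch of that assembly (the chunk dicts below are illustrative, not captured from a live stream):

```python
# Illustrative streamed chunks: each choice carries a partial "delta".
chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": "!"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]

parts = []
for chunk in chunks:
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        parts.append(delta["content"])

message = "".join(parts)  # -> "Hello!"
```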
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
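Following the recommendation to tune `temperature` or `top_p` but not both, a request payload would set only one of them. A sketch (model name and prompt are placeholders):

```python
payload = {
    "model": "gpt-4o",  # placeholder model
    "messages": [{"role": "user", "content": "Write a haiku about rain."}],
    "temperature": 0.2,  # low value: focused, near-deterministic output
    # top_p deliberately left at its default; avoid tuning both at once
}
```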

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
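Since `stop` accepts either a single string or up to four strings, client code can normalize the value and enforce the limit before building the request. A sketch (not a library function):

```python
def normalize_stop(stop):
    """Normalize `stop` to a list and enforce the documented 4-sequence limit."""
    if stop is None:
        return None
    if isinstance(stop, str):
        stop = [stop]
    if len(stop) > 4:
        raise ValueError("stop supports at most 4 sequences")
    return stop

single = normalize_stop("END")      # -> ["END"]
several = normalize_stop(["a", "b"])  # -> ["a", "b"]
```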
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
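A Predicted Output request supplies the expected text in `prediction`; generated tokens that match it can be returned much faster. A sketch of the payload shape (the file contents and model name are hypothetical):

```python
# Hypothetical file being regenerated with a minor change.
original_file = 'def greet():\n    print("hi")\n'

payload = {
    "model": "gpt-4o",  # placeholder model
    "messages": [
        {"role": "user", "content": "Rename the function greet to hello."}
    ],
    "prediction": {
        "type": "content",        # the documented predicted-content type
        "content": original_file,  # text likely to reappear verbatim in the output
    },
}
```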

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
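A `json_schema` response format pins the output to a schema with `strict` adherence. The sketch below assumes the OpenAI-style nesting of the schema object under a `json_schema` key, and the schema itself is a made-up example:

```python
payload = {
    "model": "gpt-4o",  # placeholder model
    "messages": [
        {"role": "user", "content": "Extract the city from: 'I live in Oslo.'"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_extraction",  # a-z, A-Z, 0-9, underscores/dashes, max 64 chars
            "strict": True,             # enforce exact schema adherence
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    },
}
```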

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
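To force a specific function call, pass a tool definition together with a matching `tool_choice` object, as described above. `get_weather` here is a hypothetical function, not part of the API:

```python
payload = {
    "model": "gpt-4o",  # placeholder model
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical function
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # Forces the model to call get_weather instead of replying in text.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```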

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.
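Because the model may emit invalid JSON or hallucinate keys in `arguments`, the docs above recommend validating before dispatching. A defensive sketch (`city` and the allowed-key list are hypothetical):

```python
import json

def parse_tool_arguments(raw, allowed_keys):
    """Parse model-generated arguments, rejecting bad JSON and unknown keys."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return None  # model produced invalid JSON; re-prompt or fall back
    if not isinstance(args, dict) or not set(args) <= set(allowed_keys):
        return None  # hallucinated keys or wrong structure
    return args

ok = parse_tool_arguments('{"city": "Paris"}', ["city"])           # -> {"city": "Paris"}
bad = parse_tool_arguments('{"city": "Paris", "x": 1}', ["city"])  # -> None
broken = parse_tool_arguments('{"city": ', ["city"])               # -> None
```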

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
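Client code typically branches on `finish_reason` to decide the next step. A minimal dispatcher over the documented values (the action names are illustrative):

```python
def handle_finish(reason):
    """Map a documented finish_reason to a next action (illustrative only)."""
    if reason == "stop":
        return "done"                 # natural stop point or stop sequence hit
    if reason == "length":
        return "retry_with_higher_max_tokens"
    if reason == "content_filter":
        return "review_prompt"        # content omitted by content filters
    if reason == "tool_calls":
        return "execute_tools"        # run the requested tool calls
    return "unknown"

action = handle_finish("length")  # -> "retry_with_higher_max_tokens"
```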
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-3-flash-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio output tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-3-flash-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-3-flash-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3n-e4b-it",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemma-3n-e4b-it",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
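For reasoning models, lowering `reasoning_effort` trades reasoning depth for faster responses and fewer reasoning tokens. A payload sketch (the prompt is a placeholder):

```python
payload = {
    "model": "google/gemini-3-flash-preview",  # example model from this page
    "messages": [
        {"role": "user", "content": "Summarize: APIs standardize access to models."}
    ],
    "reasoning_effort": "low",  # one of: low, medium, high
}
```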
    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

    The token.

bytes · integer[] · Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

    The token.

bytes · integer[] · nullable · Optional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

    The token.

model · string · Required

    The model used for the chat completion.

    Example: google/gemini-2.5-flash-lite-preview
prompt_tokens · number · Required

    Number of tokens in the prompt.

    Example: 137
completion_tokens · number · Required

    Number of tokens in the generated completion.

    Example: 914
total_tokens · number · Required

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
accepted_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

audio_tokens · integer · nullable · Optional

    Audio input tokens generated by the model.

reasoning_tokens · integer · nullable · Optional

    Tokens generated by the model for reasoning.

rejected_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

audio_tokens · integer · nullable · Optional

    Audio input tokens present in the prompt.

cached_tokens · integer · nullable · Optional

    Cached tokens present in the prompt.

credits_used · number · Required

The number of credits consumed during generation.

    Example: 120000
usd_spent · number · Required

    The total amount of money spent by the user in USD.

    Example: 0.06
POST /v1/chat/completions

Body
model · string · enum · Required
role · string · enum · Required

    The role of the author of the message — in this case, the user

content · any of · Required

    The contents of the user message.

string · Optional
    or
items · any of · Optional
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.

    or
type · string · enum · Required
url · string · uri · Required

    Either a URL of the image or the base64 encoded image data.

detail · string · enum · Optional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    or
type · string · enum · Required

    The type of the content part.

file_data · string · Optional

The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.

filename · string · Optional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.
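The text, image, and file content parts above can be combined in one user message. Below is a minimal payload sketch; it assumes the OpenAI-compatible nesting (image fields under an image_url object, file fields under a file object), and the URL, file name, and base64 placeholder are illustrative only, not real resources:

```python
import json

# Sketch of a multimodal user message. The nesting of "image_url" and "file"
# follows the OpenAI-compatible convention and is an assumption; the URL and
# file contents are placeholders.
user_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize the attached document and image."},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/chart.png", "detail": "auto"},
        },
        {
            "type": "file",
            "file": {"filename": "report.pdf", "file_data": "<base64-encoded PDF>"},
        },
    ],
}

print(json.dumps(user_message, indent=2))
```

The message is then passed as one element of the `messages` array in the request body.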

    or
role · string · enum · Required

    The role of the author of the message — in this case, the system.

content · any of · Required

    The contents of the system message.

string · Optional
    or
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.

name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
role · string · enum · Required

    The role of the author of the message — in this case, the tool.

content · string · Required

    The contents of the tool message.

tool_call_id · string · Required

    Tool call that this message is responding to.

name · string · nullable · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
role · string · enum · Required

    The role of the author of the message — in this case, the Assistant.

content · any of · Optional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

string · Optional

    The contents of the Assistant message.

    or
items · any of · Optional
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.

    or
refusal · string · Required

    The refusal message generated by the model.

type · string · enum · Required

    The type of the content part.

name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

id · string · Required

    The ID of the tool call.

type · string · enum · Required

    The type of the tool. Currently, only function is supported.

name · string · Required

    The name of the function to call.

arguments · string · Required

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

refusal · string · nullable · Optional

    The refusal message by the Assistant.

max_completion_tokens · integer · min: 1 · Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

max_tokens · number · min: 1 · Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

stream · boolean · Optional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
include_usage · boolean · Required
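With stream enabled, the response arrives as server-sent events, and include_usage requests a final chunk carrying token counts. A request-body sketch using the fields above; it assumes include_usage sits under a stream_options object, as in the OpenAI-compatible format, and the model ID is only an example:

```python
# Streaming request body. "stream_options" nesting is an assumption based on
# the OpenAI-compatible convention; the model ID is an example.
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Tell me a short story."}],
    "stream": True,
    # Ask for a final chunk containing usage (token counts) at end of stream.
    "stream_options": {"include_usage": True},
}
```

With the OpenAI SDK shown in the Quickstart, the same fields are passed as keyword arguments to `client.chat.completions.create(...)` and the returned stream is iterated chunk by chunk.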
type · string · enum · Required

    The type of the tool. Currently, only function is supported.

description · string · Optional

    A description of what the function does, used by the model to choose when and how to call the function.

name · string · Required

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

Other properties · any · nullable · Optional

    The parameters the functions accepts, described as a JSON Schema object.

strict · boolean · nullable · Optional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

tool_choice · any of · Optional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

string · enum · Optional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    or
type · string · enum · Required

    The type of the tool. Currently, only function is supported.

name · string · Required

    The name of the function to call.

parallel_tool_calls · boolean · Optional

    Whether to enable parallel function calling during tool use.
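Putting the tool fields above together: a single function tool plus a tool_choice that forces that tool to be called. This is a sketch; get_weather and its parameter schema are hypothetical, and the JSON Schema under "parameters" follows the structure described above:

```python
# One function tool, per the schema above. "get_weather" and its parameters
# are hypothetical examples, not a real built-in function.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
            # Enforce exact schema adherence in generated arguments.
            "strict": True,
        },
    }
]

# Force the model to call get_weather instead of letting it decide ("auto").
tool_choice = {"type": "function", "function": {"name": "get_weather"}}
```

Both objects are sent alongside `messages` in the request body; remember to validate the returned `arguments` string before executing your function, since the model may emit invalid JSON.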

temperature · number · max: 2 · Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

top_p · number · min: 0.01 · max: 1 · Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

stop · any of · Optional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

string · Optional
    or
string[] · Optional
    or
any · nullable · Optional
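A stop value can be a single string or a list of up to four sequences; generation halts at the first match and the sequence itself is not returned. A minimal sketch (model ID and sequences are examples):

```python
# Stop-sequence sketch: generation halts at the first matching sequence,
# which is excluded from the returned text.
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List three colors, numbered."}],
    "stop": ["\n\n", "4."],  # up to 4 sequences allowed
}
```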
frequency_penalty · number · min: -2 · max: 2 · nullable · Optional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

type · string · enum · Required

    The type of the predicted content you want to provide.

content · any of · Required

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

string · Optional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.
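Predicted Outputs speed up responses that largely repeat known text, such as regenerating a file with small edits. A sketch of such a request; it assumes the predicted content is passed in a top-level parameter named prediction, as in the OpenAI-compatible API, with the type/content fields described above:

```python
# Predicted Outputs sketch. The "prediction" parameter name is an assumption
# based on the OpenAI-compatible API; its fields follow the schema above.
original_file = "def add(a, b):\n    return a + b\n"

request_body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Rename add to plus in this file:\n" + original_file}
    ],
    # Tokens matching this content can be returned much faster.
    "prediction": {"type": "content", "content": original_file},
}
```

Tokens from the prediction that do not appear in the final completion are still billed (see rejected_prediction_tokens in the response usage).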

presence_penalty · number · min: -2 · max: 2 · nullable · Optional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

seed · integer · min: 1 · Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

response_format · one of · Optional

    An object specifying the format that the model must output.

type · string · enum · Required

    The type of response format being defined. Always text.

    or
type · string · enum · Required

    The type of response format being defined. Always json_object.

    or
type · string · enum · Required

    The type of response format being defined. Always json_schema.

name · string · Required

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

Other properties · any · nullable · Optional
strict · boolean · nullable · Optional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

description · string · Optional

    A description of what the response format is for, used by the model to determine how to respond in the format.
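The json_schema response format above constrains output to a schema. A sketch of such an object; it assumes the name/schema/strict fields are nested under a json_schema key, as in the OpenAI-compatible format, and the city_info schema itself is a hypothetical example:

```python
# Structured-output sketch. Nesting under "json_schema" is an assumption based
# on the OpenAI-compatible format; the schema contents are hypothetical.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",
        "description": "Structured facts about a city.",
        "strict": True,  # enforce exact schema adherence (subset of JSON Schema)
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["name", "population"],
            "additionalProperties": False,
        },
    },
}
```

The object is passed as the response_format field of the request body; with type "json_object" instead, the model returns arbitrary valid JSON without a fixed schema.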

Responses

200 Success
id · string · Required

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
object · string · enum · Required

    The object type.

Example: chat.completion
created · number · Required

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
index · number · Required

    The index of the choice in the list of choices.

    Example: 0
role · string · Required

    The role of the author of this message.

    Example: assistant
content · string · Required

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
refusal · string · nullable · Optional

    The refusal message generated by the model.

type · string · enum · Required

    The type of the URL citation. Always url_citation.

end_index · integer · Required

    The index of the last character of the URL citation in the message.

start_index · integer · Required

    The index of the first character of the URL citation in the message.

title · string · Required

    The title of the web resource.

url · string · Required

    The URL of the web resource.

id · string · Required

    Unique identifier for this audio response.

data · string · Required

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

transcript · string · Required

    Transcript of the audio generated by the model.

expires_at · integer · Required

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

id · string · Required

    The ID of the tool call.

type · string · enum · Required

    The type of the tool.

arguments · string · Required

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name · string · Required

    The name of the function to call.

    or
id · string · Required

    The ID of the tool call.

type · string · enum · Required

    The type of the tool.

input · string · Required

    The input for the custom tool call generated by the model.

name · string · Required

    The name of the custom tool to call.

finish_reason · string · enum · Required

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

bytes · integer[] · Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

    The token.

bytes · integer[] · nullable · Optional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

    The token.

bytes · integer[] · Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

    The token.

bytes · integer[] · nullable · Optional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

token · string · Required

    The token.

model · string · Required

    The model used for the chat completion.

    Example: baidu/ernie-4-5-turbo-vl-32k
prompt_tokens · number · Required

    Number of tokens in the prompt.

    Example: 137
completion_tokens · number · Required

    Number of tokens in the generated completion.

    Example: 914
total_tokens · number · Required

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
accepted_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

audio_tokens · integer · nullable · Optional

    Audio input tokens generated by the model.

reasoning_tokens · integer · nullable · Optional

    Tokens generated by the model for reasoning.

rejected_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

audio_tokens · integer · nullable · Optional

    Audio input tokens present in the prompt.

cached_tokens · integer · nullable · Optional

    Cached tokens present in the prompt.

credits_used · number · Required

The number of credits consumed during generation.

    Example: 120000
usd_spent · number · Required

    The total amount of money spent by the user in USD.

    Example: 0.06
POST /v1/chat/completions

Body
model · string · enum · Required
role · string · enum · Required

    The role of the author of the message — in this case, the user

content · any of · Required

    The contents of the user message.

string · Optional
    or
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.

name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
content · any of · Required

    The contents of the developer message.

string · Optional
    or
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.

role · string · enum · Required

    The role of the author of the message — in this case, the developer.

name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
role · string · enum · Required

    The role of the author of the message — in this case, the system.

content · any of · Required

    The contents of the system message.

string · Optional
    or
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.

name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
role · string · enum · Required

    The role of the author of the message — in this case, the tool.

content · string · Required

    The contents of the tool message.

tool_call_id · string · Required

    Tool call that this message is responding to.

name · string · nullable · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
role · string · enum · Required

    The role of the author of the message — in this case, the Assistant.

content · any of · Optional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

string · Optional

    The contents of the Assistant message.

    or
items · any of · Optional
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.

    or
refusal · string · Required

    The refusal message generated by the model.

type · string · enum · Required

    The type of the content part.

name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

id · string · Required

    The ID of the tool call.

type · string · enum · Required

    The type of the tool. Currently, only function is supported.

name · string · Required

    The name of the function to call.

arguments · string · Required

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

refusal · string · nullable · Optional

    The refusal message by the Assistant.

max_tokens · number · min: 1 · Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

stream · boolean · Optional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
include_usage · boolean · Required
type · string · enum · Required

    The type of the tool. Currently, only function is supported.

description · string · Optional

    A description of what the function does, used by the model to choose when and how to call the function.

name · string · Required

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

Other properties · any · nullable · Optional

    The parameters the functions accepts, described as a JSON Schema object.

strict · boolean · nullable · Optional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

tool_choice · any of · Optional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

string · enum · Optional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    or
type · string · enum · Required

    The type of the tool. Currently, only function is supported.

name · string · Required

    The name of the function to call.

parallel_tool_calls · boolean · Optional

    Whether to enable parallel function calling during tool use.

echo · boolean · Optional

    If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.

temperature · number · max: 2 · Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

top_p · number · min: 0.01 · max: 1 · Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

n · integer · min: 1 · nullable · Optional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

stop · any of · Optional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

string · Optional
    or
string[] · Optional
    or
any · nullable · Optional
logprobs · boolean · nullable · Optional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

top_logprobs · number · max: 20 · nullable · Optional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
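To receive per-token probabilities, enable logprobs and optionally set top_logprobs. A request sketch, plus the conversion from a log probability to a plain probability (the returned logprob is the natural log of the token's probability):

```python
import math

# Request sketch: return log probabilities for each output token, with the
# 5 most likely alternatives at each position (top_logprobs requires logprobs).
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Answer yes or no: is 7 prime?"}],
    "logprobs": True,
    "top_logprobs": 5,
}

# A returned logprob is ln(p); exponentiate to recover the probability.
probability = math.exp(-0.105)  # a logprob of -0.105 corresponds to ~90%
```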

Other properties · number · min: -100 · max: 100 · Optional
frequency_penalty · number · min: -2 · max: 2 · nullable · Optional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

type · string · enum · Required

    The type of the predicted content you want to provide.

content · any of · Required

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

string · Optional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
type · string · enum · Required

    The type of the content part.

text · string · Required

    The text content.

presence_penalty · number · min: -2 · max: 2 · nullable · Optional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

seed · integer · min: 1 · Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

min_p · number · min: 0.001 · max: 0.999 · Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

top_k · number · Optional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

repetition_penalty · number · nullable · Optional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
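The extra sampling knobs above (min_p, top_k, repetition_penalty) can be combined with temperature. A request sketch; the values are illustrative, and the comment on min_p assumes the common min-p semantics (tokens are kept only if their probability is at least min_p times that of the most likely token), which this schema does not spell out:

```python
# Sampling-parameter sketch; values are illustrative, not recommendations.
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a haiku about rivers."}],
    "temperature": 0.8,
    "top_k": 40,                # sample only from the 40 most likely tokens
    "min_p": 0.05,              # assumed min-p cutoff relative to the top token
    "repetition_penalty": 1.1,  # values above 1 discourage repeated sequences
}
```

As the descriptions above note, these are advanced controls; for most use cases adjusting temperature alone is enough.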

Responses

200 Success
id · string · Required

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
object · string · enum · Required

    The object type.

Example: chat.completion
created · number · Required

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
index · number · Required

    The index of the choice in the list of choices.

    Example: 0
role · string · Required

    The role of the author of this message.

    Example: assistant
content · string · Required

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
refusal · string · nullable · Optional

    The refusal message generated by the model.

type · string · enum · Required

    The type of the URL citation. Always url_citation.

end_index · integer · Required

    The index of the last character of the URL citation in the message.

start_index · integer · Required

    The index of the first character of the URL citation in the message.

title · string · Required

    The title of the web resource.

url · string · Required

    The URL of the web resource.

id · string · Required

    Unique identifier for this audio response.

data · string · Required

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

transcript · string · Required

    Transcript of the audio generated by the model.

expires_at · integer · Required

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

id · string · Required

    The ID of the tool call.

type · string · enum · Required

    The type of the tool.

arguments · string · Required

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name · string · Required

    The name of the function to call.

    or
id · string · Required

    The ID of the tool call.

type · string · enum · Required

    The type of the tool.

input · string · Required

    The input for the custom tool call generated by the model.

name · string · Required

    The name of the custom tool to call.

finish_reason · string · enum · Required

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

bytes · integer[] · Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
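    Combining byte lists before decoding, as the `bytes` description suggests, can be done with the standard library alone. Here the euro sign (UTF-8 bytes 226, 130, 172) is assumed to have been split across two tokens:

```python
# Each inner list is the `bytes` field of one logprob entry.
token_bytes = [[226, 130], [172]]  # hypothetical split of "€" across two tokens

# Flatten the byte lists, then decode once to recover the text.
combined = bytes(b for chunk in token_bytes for b in chunk)
text = combined.decode("utf-8")
print(text)  # €
```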

    model · string · Required

    The model used for the chat completion.

    Example: meta-llama/Llama-3.3-70B-Instruct-Turbo

    prompt_tokens · number · Required

    Number of tokens in the prompt.

    Example: 137

    completion_tokens · number · Required

    Number of tokens in the generated completion.

    Example: 914

    total_tokens · number · Required

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokens · integer · nullable · Optional

    Audio tokens generated by the model.

    reasoning_tokens · integer · nullable · Optional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokens · integer · nullable · Optional

    Audio input tokens present in the prompt.

    cached_tokens · integer · nullable · Optional

    Cached tokens present in the prompt.

    credits_used · number · Required

    The number of credits consumed during generation.

    Example: 120000

    usd_spent · number · Required

    The total amount of money spent by the user in USD.

    Example: 0.06
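    The usage fields above are related by simple arithmetic: `total_tokens` is the sum of `prompt_tokens` and `completion_tokens`. A quick consistency check using the example values from this schema:

```python
# Example usage values taken from the schema above.
usage = {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051}

# total_tokens = prompt_tokens + completion_tokens
computed_total = usage["prompt_tokens"] + usage["completion_tokens"]
print(computed_total)  # 1051
```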
    POST /v1/chat/completions

    200 Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4-5-turbo-vl-32k",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4-5-turbo-vl-32k",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
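    The same request can be issued with the OpenAI-compatible Python SDK shown in the quickstart. Here the body is built as a plain dict so it can be inspected before sending; the actual network call is sketched in comments because it needs a real API key:

```python
# Request body equivalent to the curl example above.
payload = {
    "model": "baidu/ernie-4-5-turbo-vl-32k",
    "messages": [{"role": "user", "content": "Hello"}],
}

# With the SDK (not executed here; requires your AIMLAPI key):
# from openai import OpenAI
# client = OpenAI(base_url="https://api.aimlapi.com/v1", api_key="<YOUR_AIMLAPI_KEY>")
# response = client.chat.completions.create(**payload)
print(payload["model"])
```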
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-2.5-flash-lite-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-2.5-flash-lite-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
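    A client consumes the 200 response above by drilling into `choices[0].message.content` for the reply and `meta.usage` for billing. A minimal sketch, using an abridged copy of the example response:

```python
import json

# Abridged version of the example 200 response shown above.
raw = '''{
  "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
  "object": "chat.completion",
  "choices": [{"index": 0,
               "message": {"role": "assistant",
                           "content": "Hello! How can I assist you today?"},
               "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051},
  "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}}
}'''

resp = json.loads(raw)
answer = resp["choices"][0]["message"]["content"]   # the assistant's reply
cost = resp["meta"]["usage"]["usd_spent"]           # spend for this request
print(answer, cost)
```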
    POST /v1/chat/completions

    Body

    model · string · enum · Required

    Possible values:
    role · string · enum · Required

    The role of the author of the message — in this case, the user.

    Possible values:

    content · any of · Required

    The contents of the user message.

    string · Optional

    or

    type · string · enum · Required

    The type of the content part.

    Possible values:

    text · string · Required

    The text content.

    name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    role · string · enum · Required

    The role of the author of the message — in this case, the system.

    Possible values:

    content · any of · Required

    The contents of the system message.

    string · Optional

    or

    type · string · enum · Required

    The type of the content part.

    Possible values:

    text · string · Required

    The text content.

    name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    role · string · enum · Required

    The role of the author of the message — in this case, the tool.

    Possible values:

    content · string · Required

    The contents of the tool message.

    tool_call_id · string · Required

    Tool call that this message is responding to.

    name · string · nullable · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    role · string · enum · Required

    The role of the author of the message — in this case, the Assistant.

    Possible values:

    content · any of · Optional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    string · Optional

    The contents of the Assistant message.

    or

    items · any of · Optional

    type · string · enum · Required

    The type of the content part.

    Possible values:

    text · string · Required

    The text content.

    or

    refusal · string · Required

    The refusal message generated by the model.

    type · string · enum · Required

    The type of the content part.

    Possible values:

    name · string · Optional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    id · string · Required

    The ID of the tool call.

    type · string · enum · Required

    The type of the tool. Currently, only function is supported.

    Possible values:

    name · string · Required

    The name of the function to call.

    arguments · string · Required

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusal · string · nullable · Optional

    The refusal message by the Assistant.
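    The four message roles above combine into a single messages array over a tool-use round trip: the assistant turn carries tool_calls, and the tool turn answers it via tool_call_id. A minimal sketch; the tool-call id, function name, and weather payload are hypothetical:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the weather in Paris?"},
    {
        # Assistant turn that requested a tool call (id is hypothetical).
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }],
    },
    {
        # Tool result, linked back to the request via tool_call_id.
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": '{"temp_c": 18}',
    },
]
print(len(messages))  # 4
```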

    max_completion_tokens · integer · min: 1 · Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokens · number · min: 1 · Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    stream · boolean · Optional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
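    With stream enabled, the reply arrives as a sequence of chunks whose choices[0].delta objects carry content fragments to be concatenated. A sketch assuming the usual OpenAI-compatible chunk shape, with hard-coded chunks standing in for a live server-sent-events connection:

```python
# Hypothetical chunks as they would arrive over server-sent events.
chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": "!"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]

# Not every delta has a `content` key, so default missing ones to "".
reply = "".join(c["choices"][0]["delta"].get("content", "") for c in chunks)
print(reply)  # Hello!
```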
    include_usage · boolean · Required

    type · string · enum · Required

    The type of the tool. Currently, only function is supported.

    Possible values:

    description · string · Optional

    A description of what the function does, used by the model to choose when and how to call the function.

    name · string · Required

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other properties · any · nullable · Optional

    The parameters the function accepts, described as a JSON Schema object.

    strict · boolean · nullable · Optional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choice · any of · Optional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enum · Optional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:

    or

    type · string · enum · Required

    The type of the tool. Currently, only function is supported.

    Possible values:

    name · string · Required

    The name of the function to call.

    parallel_tool_calls · boolean · Optional

    Whether to enable parallel function calling during tool use.
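    Putting tools and tool_choice together: a request body that declares one function and forces the model to call it, following the object form shown in the tool_choice description. The function name and schema are illustrative:

```python
payload = {
    "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical function
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    # Force this specific tool instead of letting the model decide ("auto").
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
print(payload["tool_choice"]["function"]["name"])  # get_weather
```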

    temperature · number · max: 2 · Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_p · number · min: 0.01 · max: 1 · Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stop · any of · Optional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    string · Optional

    or

    string[] · Optional

    or

    any · nullable · Optional

    frequency_penalty · number · min: -2 · max: 2 · nullable · Optional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    type · string · enum · Required

    The type of the predicted content you want to provide.

    Possible values:

    content · any of · Required

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    string · Optional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or

    type · string · enum · Required

    The type of the content part.

    Possible values:

    text · string · Required

    The text content.

    presence_penalty · number · min: -2 · max: 2 · nullable · Optional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seed · integer · min: 1 · Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_format · one of · Optional

    An object specifying the format that the model must output.

    type · string · enum · Required

    The type of response format being defined. Always text.

    Possible values:

    or

    type · string · enum · Required

    The type of response format being defined. Always json_object.

    Possible values:

    or

    type · string · enum · Required

    The type of response format being defined. Always json_schema.

    Possible values:

    name · string · Required

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other properties · any · nullable · Optional

    strict · boolean · nullable · Optional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    description · string · Optional

    A description of what the response format is for, used by the model to determine how to respond in the format.
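    A json_schema response format built from the fields listed above (type, name, strict, and the schema itself). The schema contents and name are illustrative, and the flat layout follows this field list; exact nesting may vary by provider, so check a live response if in doubt:

```python
import re

# Illustrative json_schema response format.
response_format = {
    "type": "json_schema",
    "name": "weather_report",  # must match [a-zA-Z0-9_-], max 64 chars
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {"city": {"type": "string"},
                       "temp_c": {"type": "number"}},
        "required": ["city", "temp_c"],
    },
}

# Validate the name constraint stated in the docs.
name_ok = re.fullmatch(r"[A-Za-z0-9_-]{1,64}", response_format["name"]) is not None
print(name_ok)  # True
```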

    Responses
    200 Success
    id · string · Required

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl

    object · string · enum · Required

    The object type.

    Example: chat.completion

    Possible values:

    created · number · Required

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744

    index · number · Required

    The index of the choice in the list of choices.

    Example: 0

    role · string · Required

    The role of the author of this message.

    Example: assistant

    content · string · Required

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusal · string · nullable · Optional

    The refusal message generated by the model.

    type · string · enum · Required

    The type of the URL citation. Always url_citation.

    Possible values:

    end_index · integer · Required

    The index of the last character of the URL citation in the message.

    start_index · integer · Required

    The index of the first character of the URL citation in the message.

    title · string · Required

    The title of the web resource.

    url · string · Required

    The URL of the web resource.

    id · string · Required

    Unique identifier for this audio response.

    data · string · Required

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcript · string · Required

    Transcript of the audio generated by the model.

    expires_at · integer · Required

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    id · string · Required

    The ID of the tool call.

    type · string · enum · Required

    The type of the tool.

    Possible values:

    arguments · string · Required

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    name · string · Required

    The name of the function to call.

    or

    id · string · Required

    The ID of the tool call.

    type · string · enum · Required

    The type of the tool.

    Possible values:

    input · string · Required

    The input for the custom tool call generated by the model.

    name · string · Required

    The name of the custom tool to call.

    finish_reason · string · enum · Required

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytes · integer[] · Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    token · string · Required

    The token.

    bytes · integer[] · nullable · Optional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    token · string · Required

    The token.

    bytes · integer[] · Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    token · string · Required

    The token.

    bytes · integer[] · nullable · Optional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprob · number · Required

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    token · string · Required

    The token.

    model · string · Required

    The model used for the chat completion.

    Example: baidu/ernie-4-5-8k-preview

    prompt_tokens · number · Required

    Number of tokens in the prompt.

    Example: 137

    completion_tokens · number · Required

    Number of tokens in the generated completion.

    Example: 914

    total_tokens · number · Required

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokens · integer · nullable · Optional

    Audio tokens generated by the model.

    reasoning_tokens · integer · nullable · Optional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokens · integer · nullable · Optional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokens · integer · nullable · Optional

    Audio input tokens present in the prompt.

    cached_tokens · integer · nullable · Optional

    Cached tokens present in the prompt.

    credits_used · number · Required

    The number of credits consumed during generation.

    Example: 120000

    usd_spent · number · Required

    The total amount of money spent by the user in USD.

    Example: 0.06

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-x1-1-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
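Putting the response fields above together: a completed chat completion can be unpacked like this. The dict mirrors the example values documented above (choices, usage, and the meta block carrying credits_used and usd_spent):

```python
# Unpack the documented response shape: message content, token usage, and cost.
response = {
    "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
    "object": "chat.completion",
    "created": 1762343744,
    "choices": [{"index": 0, "finish_reason": "stop",
                 "message": {"role": "assistant", "content": "Hello!"}}],
    "usage": {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051},
    "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}},
}

text = response["choices"][0]["message"]["content"]
usage = response["usage"]
cost_usd = response["meta"]["usage"]["usd_spent"]

# total_tokens is always prompt + completion.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```

Before reading content, check finish_reason: a value of length means the reply was cut off by the token limit, and tool_calls means the message carries tool calls instead of (or alongside) text.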
    post
    /v1/chat/completions
    200Success
    modelstring · enumRequiredPossible values:
    rolestring · enumRequiredPossible values:
    contentany ofRequired
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    typestring · enumRequiredPossible values:
    thinkingstringRequired
    signaturestringRequired
    or
    typestring · enumRequiredPossible values:
    tool_use_idstringRequired
    is_errorbooleanOptional
    contentany ofOptional
    stringOptional
    or
    itemsone ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    typestring · enumRequired

    The type of the image.

    Possible values:
    media_typestring · enumRequired

    The media type of the image.

    Possible values:
    datastringRequired

    The base64 encoded image data.

    or
    idstringRequired
    Other propertiesany · nullableOptional
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    sourceany ofRequired
    typestring · enumRequiredPossible values:
    media_typestring · enumRequiredPossible values:
    datastringRequired
    or
    typestring · enumRequiredPossible values:
    datastringRequired
    Other propertiesstringOptional
    stop_sequencesstring[]Optional

    Custom text sequences that will cause the model to stop generating.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    systemstringOptional

    A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.

    namestringRequired

    Name of the tool.

    descriptionstringOptional

    Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.

    typestring · enumRequiredPossible values:
    propertiesany · nullableOptional
    Other propertiesany · nullableOptional
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    or
    namestringRequired
    typestring · enumRequiredPossible values:
    or
    typestring · enumRequiredPossible values:
    budget_tokensinteger · min: 1024Required

    Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.

    typestring · enumRequiredPossible values:
    max_tokensnumberOptional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    Default: 32000
    temperaturenumber · max: 1Optional

    Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

    top_pnumber · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
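For the Claude-style body above, a request enabling extended thinking might look like this. A sketch only: the placement of the type and budget_tokens fields under a thinking object is an assumption based on the fields documented above, and the prompt is illustrative:

```python
# Sketch: Claude-style request with an internal-reasoning budget.
payload = {
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    "max_tokens": 4096,
    "thinking": {
        "type": "enabled",
        "budget_tokens": 2048,  # must be >= 1024 and strictly less than max_tokens
    },
    "temperature": 1.0,  # use values closer to 0.0 for analytical tasks
}
```

The reasoning budget counts against the completion, so leave headroom between budget_tokens and max_tokens for the visible answer.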

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: anthropic/claude-sonnet-4.5
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-x1-1-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-x1-1-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "anthropic/claude-sonnet-4.5",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "anthropic/claude-sonnet-4.5",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4-5-8k-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4-5-8k-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
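The stream flag and include_usage field above combine into a streaming request like this. A sketch assuming the OpenAI-compatible convention of nesting include_usage under a stream_options object:

```python
# Sketch: streaming request payload; the response arrives as server-sent events.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,                              # deliver tokens chunk by chunk
    "stream_options": {"include_usage": True},   # final chunk reports token usage
}
```

With the OpenAI SDK shown in the quickstart, pass these as keyword arguments to client.chat.completions.create and iterate the returned stream; each chunk carries a delta rather than a full message.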
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
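The tools, tool_choice, and parallel_tool_calls fields above combine like this. A minimal sketch: get_weather is a hypothetical function invented for illustration, and forcing it via tool_choice uses the object form documented above:

```python
# Sketch: function-calling request that forces a specific (hypothetical) tool.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [{
        "type": "function",  # currently the only supported tool type
        "function": {
            "name": "get_weather",  # hypothetical example function
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    # Force this tool; use "auto" to let the model decide, "none" to disable tools.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,
}
```

As the arguments field warns above, the model may emit invalid JSON or hallucinate parameters, so validate the returned arguments before invoking your function.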

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
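The prediction fields above are used by passing the text you expect the model to largely reproduce, so that matching tokens can be returned faster. A sketch; the file content and rename request are illustrative:

```python
# Sketch: Predicted Outputs for regenerating a file with minor changes.
existing_code = "def add(a, b):\n    return a + b\n"

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Rename the function add to sum_two."}],
    "prediction": {
        "type": "content",         # the documented predicted-content type
        "content": existing_code,  # text being regenerated with minor edits
    },
}
```

The usage block reports how well the prediction matched: accepted_prediction_tokens appeared in the completion, while rejected_prediction_tokens did not but are still billed as completion tokens.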

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
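As a minimal sketch of why the byte arrays matter: when one character spans several tokens, the per-token bytes fields can be concatenated and decoded together to recover the text (the byte values below are made up for illustration).

```python
# Concatenate per-token UTF-8 "bytes" arrays, then decode once.
token_bytes = [[72, 101, 108, 108, 111], [33]]  # UTF-8 bytes for "Hello" and "!"
text = b"".join(bytes(b) for b in token_bytes).decode("utf-8")
print(text)  # Hello!
```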

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba-cloud/qwen3-next-80b-a3b-thinking
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success

    Requesting more advanced models

    This guide uses a more advanced model, GPT-4o, and also explains how to use various chat model capabilities:

    • streaming mode

    • calling tools

    • uploading images to the model for analysis

    • uploading files to the model for analysis

    • web search

    circle-info

    If you need help with API keys or environment configuration, go back to the previous step and follow the detailed quickstart guide for the Gemma 3 model.


    hashtag
    Making an API Call

    The chat model used in this example is more advanced. In addition to regular user messages, it supports the system role in the messages parameter, which can be used to define global instructions that affect the model’s overall behavior (for example, telling the model to act as a travel agent, as in the code below).

    Here’s the complete code you can use right away in a cURL, Python, or Node.js program. You only need to replace <YOUR_AIMLAPI_KEY> with your AIML API key from your account, provide your behavior instructions in the system prompt, and place your request to the model in the user prompt.


    hashtag
    Using Streaming Mode

    Streaming lets the model send partial responses as they’re generated instead of waiting for the full output — useful for real‑time feedback.

    hashtag
    Full Streaming Response (Raw Events)

    This example shows how to consume the streaming response as-is, without abstraction. Each chunk is processed in real time, exposing the full event structure returned by the API.

    Use this approach if you need:

    • access to all event types

    • fine-grained control over parsing

    • debugging or logging of raw responses

    • support for metadata beyond plain text

    Example raw streaming response

    hashtag
    Streaming Response Processing (Text Extraction)

    This example shows how to process the streaming response to extract only the generated text. Instead of handling all event types, the code filters incoming chunks and prints the content as it arrives. Use this approach if you only need the generated text.
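The filtering step can be sketched as follows. The sample lines below mimic the `data:` events shown in the raw streaming example; real code would read them from the HTTP response instead of a list, and `[DONE]` is the stream terminator.

```python
import json

# Raw SSE lines in the shape returned by the streaming endpoint (simplified).
raw_lines = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"}}]}',
    "data: [DONE]",
]

text = ""
for line in raw_lines:
    payload = line.removeprefix("data: ").strip()
    if not payload or payload == "[DONE]":  # skip blanks and the terminator
        continue
    chunk = json.loads(payload)
    # Only the delta "content" field carries generated text; ids,
    # fingerprints, and usage metadata are ignored here.
    text += chunk["choices"][0]["delta"].get("content", "")

print(text)  # Hello!
```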

    Example processed clean streaming response

    hashtag
    Tool calling

    GPT‑4o can call functions/tools you define in the API request to extend behavior (e.g., performing calculations, retrieving structured data).

    How it works
    1. Initial request — The model receives the user prompt and the registered tool, and generates a tool_calls object indicating which function it wants to execute.

    2. Extract and run the tool — Parse the arguments from the tool_calls object and execute the function locally.

    3. Send back the result — Return the computed result to the model using the tool role and the content field.

    4. Final response — The model incorporates the tool’s output and generates a complete answer for the user.
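Steps 2 and 3 can be sketched like this. The tool_calls entry below is a stand-in for what the assistant message would contain (the ID, tool name, and arguments are hypothetical, not from a real response).

```python
import json

# Step 1 output: a hypothetical tool_calls entry from the assistant message.
tool_call = {
    "id": "call_abc123",              # example ID only
    "type": "function",
    "function": {
        "name": "get_weather",        # a tool we registered ourselves
        "arguments": '{"city": "San Francisco"}',
    },
}

# Step 2: arguments arrive as a JSON string — parse, then run the tool locally.
args = json.loads(tool_call["function"]["arguments"])
result = f"Sunny, 18°C in {args['city']}"  # stand-in for a real lookup

# Step 3: return the result with the tool role; the model then produces
# the final answer in a follow-up /v1/chat/completions call.
tool_message = {
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": result,
}
print(tool_message)
```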

    Example response

    hashtag
    Image upload

    GPT‑4o supports vision inputs: you can send an image URL in the messages request to let the model analyze or describe it.
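A request with an image mixes text and image parts in the content array. The sketch below uses the standard vision message shape; the image URL is a placeholder.

```python
# Chat payload combining a text question with an image URL (placeholder URL).
payload = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
}
```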

    Example response

    hashtag
    Web search integration

    With search‑preview models, you can perform live web search queries in combination with the model to get up‑to‑date results and grounded responses.
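A request sketch for a search-capable model is below. The model ID and the web_search_options field follow the OpenAI search-preview request shape and are assumptions here; check the list of search-capable models for the exact IDs and supported options.

```python
# Hypothetical search-preview request payload (model ID is an assumption).
payload = {
    "model": "gpt-4o-search-preview",
    "messages": [
        {"role": "user", "content": "What happened in AI news this week?"}
    ],
    "web_search_options": {},  # empty object enables default search behavior
}
```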

    circle-info

    See the complete list of our search‑capable models.

    Example response

    hashtag
    Future Steps

    • Browse and compare AI models, including GPT, Claude, and many others, using the Playgroundarrow-up-right

    • Know more about supported SDKs

    • Learn more about special text model capabilities

    • Join the community: get help and share your projects in our Discordarrow-up-right

    GPT-4o
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba-cloud/qwen3-next-80b-a3b-thinking",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba-cloud/qwen3-next-80b-a3b-thinking",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "gpt-4o",
        "messages": [
          {
            "role": "system",
            "content": "You are a travel agent. Be descriptive and helpful.",
          }, 
          {
            "role": "user",
            "content": "Tell me about San Francisco"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 512
      }'
    const systemPrompt = 'You are a travel agent. Be descriptive and helpful.'; // instructions
    const userPrompt = 'Tell me about San Francisco'; // your request
    
    async function main() {
      const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
        method: 'POST',
        headers: {
          // Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
          'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'gpt-4o',
          messages:[
              {
                  role: 'system',
                  content: systemPrompt,
              }, 
              {
                  role: 'user',
                  content: userPrompt
              }
          ],
          temperature: 0.7,
          max_tokens: 512,
        }),
      });
    
      const data = await response.json();
      const answer = data.choices[0].message.content;
      
      console.log('User:', userPrompt);
      console.log('AI:', answer);
    }
    
    main();
    import requests
    import json  # for getting a structured output with indentation 
    
    system_prompt = "You are a travel agent. Be descriptive and helpful."
    user_prompt = "Tell me about San Francisco"
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"gpt-4o",
            "messages":[
                {
                    "role":"system",
                    "content": system_prompt,
                },       
                {
                    "role":"user",
                    "content": user_prompt,
                }
            ],
            "temperature": 0.7,
            "max_tokens": 256,
        }
    )
    
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    import requests
    import json  # for getting a structured output with indentation 
    
    response = requests.post(
        "https://api.aimlapi.com/v1/chat/completions",
        headers={
            # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
            "Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
            "Content-Type":"application/json"
        },
        json={
            "model":"gpt-4o",
            "messages":[
                {
                    "role":"user",
                    "content":"Hi! What do you think about mankind?" # insert your prompt
                }
            ],
            "stream": True
        }
    )
    
    # data = response.json()
    print(response.text)
    from openai import OpenAI
    
    # Initialize the client
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        api_key="YOUR_AIMLAPI_KEY",
        base_url="https://api.aimlapi.com/v1"
    )
    
    # Create a streaming chat completion
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": "Hi! What do you think about mankind?"
            }
        ],
        stream=True
    )
    
    # Print raw chunks (similar to response.text in requests)
    for chunk in stream:
        print(chunk)
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"RmYFV8ad65HP9F"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"fjE24R0ZOJr"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qAlxZuNpvVvIIOm"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" As"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Zn3rsadkL8zHO"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" an"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"D1ss0WZmiGg8l"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" AI"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"bOHB8VYpq4G0W"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"OwZvgIyMlYVcIgH"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"u9lFaH3ngdK6MR"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" don't"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"KRFgmSe4yG"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" have"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"YL8zlQ9PjDF"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" personal"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Gzgb5OT"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" opinions"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Flz362J"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" or"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"XA0qqmSQr2jme"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" feelings"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"VA3dwaU"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"POplI0eiOWXpIPD"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" but"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"aDifMrQ8OH9i"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ceVweUN2pByieS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" can"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"txjYCds61AQp"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" provide"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IGlSpZBf"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" an"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"BtPIfSvUXgRnl"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" overview"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IYfRhEo"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"uh8pR2mNtYSNQ"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" various"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ILZ0ffVW"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" perspectives"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Rgs"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" on"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"r7Awao2PSZ0DH"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" mankind"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"m8vJ3dzf"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"f2wZrEj0RqUFprg"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" Humanity"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"cCPi2qV"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" is"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yNd7SUoXBojpA"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" often"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"VEaggK2dFS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" viewed"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8nhopBJZe"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" as"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6xG2VkJLonAeF"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" a"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"WDu20GtJyN8Lep"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" complex"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4bE4D3tS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DZtW3Ahopdgl"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" multif"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8bS4GMzf3"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"aceted"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ivtxUAov3l"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" species"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Xcq85kDt"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" capable"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"PfwZUtYS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DoyM4RGNLxnFc"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" remarkable"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"mUvVH"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" achievements"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4fl"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"GUdfkDUkNBNO"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" profound"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"x4KCnLk"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" creativity"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"goTL4"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"gkqK9sezr258S93"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" People"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"c49BcmfXz"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" have"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Br7pbWtK86v"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" built"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"dzAoO36Siv"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" civilizations"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yS"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"hiMiIGF7QM9BeJA"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" explored"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IhuVoUB"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" space"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qNqiO3hyXB"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"UVmzp6Y0qjb7Zkb"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"iiIw0gK2MP5D"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" developed"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"FJUJhv"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" technologies"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"pkQ"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" that"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"sAhx0IJoR0m"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" transform"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"YDTnhx"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" everyday"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"imFIYIz"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" life"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6xJBjebVPfo"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DKIPIwgAnVDj3g1"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" At"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HhuMheG0mPcuI"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" the"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yIQIWY1CXoW6"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" same"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"QcKwiqSqGRU"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" time"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"f6e6uGKikn5"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"1eXIFULDN1iS8b1"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" humans"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"GH0z8I36B"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" face"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"JLUmj9BN7PQ"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" significant"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qdQg"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" challenges"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"KMzNb"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" such"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8pw9I3FGElO"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" as"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"nY0RLEY6Am9zD"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" environmental"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4r"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" degradation"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"1zGA"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"bhbZZCR7wNgWQkq"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" social"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"FcCsVIGji"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" inequalities"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6kb"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Z4Zz2oDgc5zw0D6"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Q5XvheR2EWhq"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" geopolitical"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ySW"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" conflicts"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"eiERwe"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"oNAsPbgeJSOuPMg"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" The"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4TwzxlGRpebL"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" potential"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"lW3Jfo"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" for"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Ejvws7kQryhN"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" both"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HVm3EDKAkuA"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" positive"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HMY8pYv"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" change"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"fbOaTSNWR"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ETmTxHsFbCkw"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" destructive"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"WHk8"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" behavior"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"EvSYFf5"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" makes"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yfwGRy20jz"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" mankind"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"vwJGC8sU"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" a"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"nHyqFYnTzVmVsE"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" subject"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"wtm8Wh9c"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"gnLF2uDFfg976"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" deep"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"BEc6wh2y2vV"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" contemplation"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"zf"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"vpg86EhZm5c3"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" varied"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"iWNJAcR7a"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" viewpoints"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"JRXUN"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"5yN6iGLyFLiQV0H"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}],"usage":null,"obfuscation":"4lkqbaPLDt"}
    
    data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[],"usage":{"prompt_tokens":16,"completion_tokens":102,"total_tokens":118,"prompt_tokens_details":{"cached_tokens":0,"audio_tokens":0},"completion_tokens_details":{"reasoning_tokens":0,"audio_tokens":0,"accepted_prediction_tokens":0,"rejected_prediction_tokens":0}},"obfuscation":"VChaI1ntRBrTy"}
    import requests
    import json
    
    url = "https://api.aimlapi.com/v1/chat/completions"
    headers = {
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json"
    }
    
    payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "Explain quantum computing simply."}
        ],
        "stream": True
    }
    
    with requests.post(url, headers=headers, json=payload, stream=True) as r:
        # Iterate over the streaming response line by line
        for line in r.iter_lines():
            if not line:
                continue  # Skip empty lines
    
            # Decode bytes to string
            line = line.decode("utf-8")
    
            # SSE messages start with "data: "
            if not line.startswith("data: "):
                continue
    
            # Remove the "data: " prefix
            data_str = line[len("data: "):]
    
            # "[DONE]" indicates the end of the stream
            if data_str.strip() == "[DONE]":
                break
    
            try:
                # Parse JSON payload
                data = json.loads(data_str)
            except json.JSONDecodeError:
                continue  # Skip malformed chunks
            
            # Ensure "choices" exists and is not empty
            choices = data.get("choices")
            if not choices:
                continue
    
            # Extract text delta (OpenAI-style streaming format)
            delta = data.get("choices", [{}])[0].get("delta", {})
            content = delta.get("content")
    
            # Print text as it arrives
            if content:
                print(content, end="")
    from openai import OpenAI
    
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        api_key="<YOUR_AIMLAPI_KEY>",
        base_url="https://api.aimlapi.com/v1"
    )
    
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "Explain quantum computing simply."}
        ],
        stream=True
    )
    
    # Iterate over streaming chunks
    for chunk in stream:
        # Ensure choices exist and are not empty
        if not chunk.choices:
            continue
    
        delta = chunk.choices[0].delta
        content = getattr(delta, "content", None)
    
        # Print text as it arrives
        if content:
            print(content, end="")
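If you also need the complete reply after the stream ends (for logging or caching), you can reassemble it from the raw SSE `data:` lines shown above. A minimal sketch — `assemble_sse` is a hypothetical helper, and the `sample` lines are simplified stand-ins for the full chunks the API actually sends:

```python
import json

def assemble_sse(lines):
    """Reassemble the full assistant message from raw SSE 'data:' lines."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # SSE comments / blank keep-alive lines
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break  # End-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)

# Simplified, hypothetical chunks shaped like the stream above:
sample = [
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"choices":[{"delta":{"content":" world"}}]}',
    'data: [DONE]',
]
print(assemble_sse(sample))  # → Hello world
```

The same accumulation works inside the `requests` loop above: append each `content` to a list instead of printing it.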
    Quantum computing is a type of computing that uses principles of quantum mechanics to process information. Unlike classical computers, which use bits to represent data as 0s or 1s, quantum computers use quantum bits or qubits. 
    
    Qubits have unique properties that give quantum computers more power in certain tasks:
    
    1. **Superposition**: A qubit can exist in multiple states (i.e., both 0 and 1) simultaneously. This allows quantum computers to process a vast amount of possibilities at once.
    
    2. **Entanglement**: Qubits can be linked together in such a way that the state of one qubit can depend on the state of another, no matter the distance apart. This can lead to more efficient processing and problem-solving.
    
    3. **Quantum Interference**: Quantum algorithms make use of interference, where different quantum states can amplify or cancel each other out, guiding the computation toward the correct answer.
    
    Because of these properties, quantum computers have the potential to solve certain complex problems much faster than classical computers can, potentially revolutionizing fields like cryptography, materials science, and optimization. However, building practical quantum computers is extremely challenging due to issues with qubit stability and error rates.
    import requests
    import json
    
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key = "<YOUR_AIMLAPI_KEY>"
    base_url = "https://api.aimlapi.com/v1"
    
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    # Step 1: Define the tool correctly
    tool = {
        "type": "function",
        "function": {
            "name": "toCelsius",
            "description": "Convert Fahrenheit to Celsius",
            "parameters": {
                "type": "object",
                "properties": {
                    "fahrenheit": {"type": "number"}
                },
                "required": ["fahrenheit"]
            }
        }
    }
    
    # Step 2: Initial request with the tool
    payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "Convert 256°F to °C"}
        ],
        "tools": [tool]
    }
    
    response = requests.post(f"{base_url}/chat/completions", headers=headers, json=payload)
    data = response.json()
    
    # Step 3: Extract tool call
    tool_calls = data["choices"][0]["message"].get("tool_calls", [])
    if not tool_calls:
        raise ValueError("No tool calls found. Make sure the tool is correctly defined.")
    
    tool_call = tool_calls[0]
    arguments = json.loads(tool_call["function"]["arguments"])
    fahrenheit = arguments["fahrenheit"]
    
    # Step 4: Execute the tool locally
    celsius_result = (fahrenheit - 32) * 5 / 9
    
    # Step 5: Send result back to model
    final_payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "Convert 256°F to °C"},
            {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": tool_call["id"],
                        "type": "function",
                        "function": {
                            "name": tool_call["function"]["name"],
                            "arguments": tool_call["function"]["arguments"]
                        }
                    }
                ]
            },
            {
                "role": "tool",
                "tool_call_id": tool_call["id"],
                "content": str(celsius_result)
            }
        ]
    }
    
    final_response = requests.post(f"{base_url}/chat/completions", headers=headers, json=final_payload)
    final_data = final_response.json()
    
    # Step 6: Print final answer
    print(final_data["choices"][0]["message"]["content"])
    from openai import OpenAI
    import json
    
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        api_key="<YOUR_AIMLAPI_KEY>",
        base_url="https://api.aimlapi.com/v1"
    )
    
    # Step 1: Define the tool correctly
    tool = {
        "type": "function",
        "function": {
            "name": "toCelsius",
            "description": "Convert Fahrenheit to Celsius",
            "parameters": {
                "type": "object",
                "properties": {
                    "fahrenheit": {"type": "number"}
                },
                "required": ["fahrenheit"]
            }
        }
    }
    
    # Step 2: Initial request with tool
    initial_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Convert 256°F to °C"}],
        tools=[tool]
    )
    
    # Step 3: Extract tool call
    assistant_message = initial_response.choices[0].message
    tool_calls = getattr(assistant_message, "tool_calls", [])
    if not tool_calls:
        raise ValueError("No tool calls found. Make sure the tool is correctly defined.")
    
    tool_call = tool_calls[0]
    arguments = json.loads(tool_call.function.arguments)
    fahrenheit = arguments["fahrenheit"]
    
    # Step 4: Execute tool locally
    celsius_result = (fahrenheit - 32) * 5 / 9
    
    # Step 5: Send result back
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "Convert 256°F to °C"},
            {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": tool_call.id,
                        "type": "function",
                        "function": {
                            "name": tool_call.function.name,
                            "arguments": tool_call.function.arguments,
                        },
                    }
                ],
            },
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(celsius_result),
            },
        ],
    )
    
    print(final_response.choices[0].message.content)
    256°F is approximately 124.44°C.
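The model's final answer can be checked against the tool logic itself: `toCelsius` in the examples above is a plain linear conversion, so the round trip is easy to verify offline. A small standalone check (the `to_celsius` name here is just a local mirror of the example's tool):

```python
def to_celsius(fahrenheit: float) -> float:
    # Same formula the tool-calling examples execute locally (Step 4)
    return (fahrenheit - 32) * 5 / 9

print(round(to_celsius(256), 2))  # → 124.44
```

This matches the answer returned by the model after it receives the tool result.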
    import requests
    import json
    
    url = "https://api.aimlapi.com/v1/chat/completions"
    headers = {
      # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
      "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
      "Content-Type": "application/json"
    }
    
    payload = {
      "model": "gpt-4o",
      "messages": [
        {
          "role": "user",
          "content": [
            {"type": "text", "text": "Describe this scene:"},
            {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/mona_lisa_extended.jpg"}}
          ]
        }
      ]
    }
    
    response = requests.post(url, headers=headers, json=payload)
    data = response.json()
    print(json.dumps(data, indent=2, ensure_ascii=False))
    from openai import OpenAI
    import json
    
    # Initialize the client
    client = OpenAI(
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:  
        api_key="<YOUR_AIMLAPI_KEY>",
        base_url="https://api.aimlapi.com/v1"
    )
    
    # Prepare the messages with text and image_url
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this scene:"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/mona_lisa_extended.jpg"
                    }
                }
            ]
        }
    ]
    
    # Create a chat completion
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    
    # Print full JSON response
    print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
    {
      "id": "chatcmpl-DL3DDPif2s79HbOHySq6bVY8SAsKQ",
      "choices": [
        {
          "finish_reason": "stop",
          "index": 0,
          "logprobs": null,
          "message": {
            "content": "The scene is an iconic Renaissance portrait showing a woman with an enigmatic smile, known for its mastery of detail and composition. The woman is seated against a distant, dreamlike landscape featuring winding paths and rocky formations. She wears a dark dress and light veil, with her hands delicately folded. The background's atmospheric perspective creates depth, with bluish mountains fading into the horizon. The artwork evokes a sense of mystery and balance.",
            "refusal": null,
            "role": "assistant",
            "annotations": [],
            "audio": null,
            "function_call": null,
            "tool_calls": null
          }
        }
      ],
      "created": 1773909607,
      "model": "gpt-4o-2024-08-06",
      "object": "chat.completion",
      "service_tier": "default",
      "system_fingerprint": "fp_0a8aa8bfeb",
      "usage": {
        "completion_tokens": 85,
        "prompt_tokens": 776,
        "total_tokens": 861,
        "completion_tokens_details": {
          "accepted_prediction_tokens": 0,
          "audio_tokens": 0,
          "reasoning_tokens": 0,
          "rejected_prediction_tokens": 0
        },
        "prompt_tokens_details": {
          "audio_tokens": 0,
          "cached_tokens": 0
        }
      },
      "meta": {
        "usage": {
          "credits_used": 7254
        }
      }
    }
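The vision examples above reference a publicly hosted image. The upstream OpenAI API also accepts base64 `data:` URLs in the `image_url.url` field; assuming the AIML API mirrors that behavior (not verified here), a local file can be embedded directly. `to_data_url` is a hypothetical helper for building such a URL:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL for an image_url content part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# Usage: read a local file and drop the result into the payload above, e.g.
# {"type": "image_url", "image_url": {"url": to_data_url(open("photo.jpg", "rb").read())}}
print(to_data_url(b"\xff\xd8\xff")[:30])
```

Keep payload size limits in mind: base64 inflates the image by roughly a third.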
    import json
    import requests
    from typing import Dict, Any
    
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    API_KEY = "<YOUR_AIMLAPI_KEY>"
    BASE_URL = "https://api.aimlapi.com/v1"
    
    HEADERS = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    
    
    def search_impl(arguments: Dict[str, Any]) -> Any:
        # Stub: a real implementation would query a search backend here;
        # echoing the arguments back keeps the demo self-contained.
        return arguments
    
    
    def chat(messages):
        url = f"{BASE_URL}/chat/completions"
        payload = {
            "model": "gpt-4o-mini-search-preview",
            "messages": messages,
            "temperature": 0.6,
            "tools": [
                {
                    "type": "builtin_function",
                    "function": {"name": "$web_search"},
                }
            ]
        }
    
        response = requests.post(url, headers=HEADERS, json=payload)
        response.raise_for_status()
        return response.json()["choices"][0]
    
    
    def main():
        messages = [
            {"role": "system", "content": "You are GPT with web search skills."},
            {"role": "user", "content": "Please search for AGI and tell me what it is in English."}
        ]
    
        finish_reason = None
        while finish_reason is None or finish_reason == "tool_calls":
            choice = chat(messages)
            finish_reason = choice["finish_reason"]
            message = choice["message"]
    
            if finish_reason == "tool_calls":
                messages.append(message)
    
                for tool_call in message["tool_calls"]:
                    tool_call_name = tool_call["function"]["name"]
                    tool_call_arguments = json.loads(tool_call["function"]["arguments"])
    
                    if tool_call_name == "$web_search":
                        tool_result = search_impl(tool_call_arguments)
                    else:
                        tool_result = f"Error: unable to find tool by name '{tool_call_name}'"
    
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call["id"],
                        "name": tool_call_name,
                        "content": json.dumps(tool_result),
                    })
    
        print(message["content"])
    
    
    if __name__ == "__main__":
        main()
    import json
    from typing import Dict, Any
    from openai import OpenAI
    
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    client = OpenAI(
        api_key="<YOUR_AIMLAPI_KEY>",
        base_url="https://api.aimlapi.com/v1"
    )
    
    
    # For the built-in $web_search tool the search itself runs server-side;
    # the client just echoes the tool-call arguments back as the tool result.
    def search_impl(arguments: Dict[str, Any]) -> Any:
        return arguments
    
    
    def chat(messages):
        response = client.chat.completions.create(
            model="gpt-4o-mini-search-preview",
            messages=messages,
            temperature=0.6,
            tools=[
                {
                    "type": "function",
                    "function": {
                        "name": "$web_search",
                        "parameters": {
                            "type": "object",
                            "properties": {},
                        },
                    },
                }
            ],
        )
        return response.choices[0]
    
    
    def main():
        messages = [
            {"role": "system", "content": "You are GPT with web search skills."},
            {"role": "user", "content": "Please search for AGI and tell me what it is in English."}
        ]
    
        finish_reason = None
        while finish_reason is None or finish_reason == "tool_calls":
            choice = chat(messages)
            finish_reason = choice.finish_reason
            message = choice.message
    
            if finish_reason == "tool_calls":
                messages.append(message.model_dump())
    
                for tool_call in message.tool_calls:
                    tool_call_name = tool_call.function.name
                    tool_call_arguments = json.loads(tool_call.function.arguments)
    
                    if tool_call_name == "$web_search":
                        tool_result = search_impl(tool_call_arguments)
                    else:
                        tool_result = f"Error: unable to find tool by name '{tool_call_name}'"
    
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "name": tool_call_name,
                        "content": json.dumps(tool_result),
                    })
    
        print(message.content)
    
    
    if __name__ == "__main__":
        main()
    "AGI" is an acronym that can represent different terms depending on the context:
    
    1. **Adjusted Gross Income**: In the United States, AGI refers to Adjusted Gross Income, which is a taxpayer's total income from all sources minus allowable adjustments. This figure is used to determine taxable income and eligibility for various tax benefits. ([usafacts.org](https://usafacts.org/articles/adjusted-gross-income-agi-definition?utm_source=openai))
    
    2. **Artificial General Intelligence**: In the field of artificial intelligence, AGI stands for Artificial General Intelligence. This concept refers to AI systems that possess the ability to understand, learn, and apply knowledge across a wide range of tasks, matching or surpassing human cognitive abilities. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Artificial_general_intelligence?utm_source=openai))
    
    3. **Alliance Graphique Internationale**: AGI also denotes the Alliance Graphique Internationale, an international organization of leading graphic artists and designers. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Alliance_Graphique_Internationale?utm_source=openai))
    
    4. **Agi Language**: Additionally, "Agi" is the name of a Torricelli language spoken in Papua New Guinea. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Agi_language?utm_source=openai))
    
    The specific meaning of "AGI" depends on the context in which it is used.
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
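When `stream` is enabled, the response arrives as server-sent events: one JSON chunk per `data:` line, terminated by a `data: [DONE]` sentinel. A minimal sketch of parsing such lines (the sample chunk below is illustrative, not a real API response):

```python
import json


def parse_sse_line(line: str):
    """Parse one server-sent-events line from a streamed completion.

    Returns the decoded chunk dict, or None for the [DONE] sentinel
    and for non-data lines such as keep-alive comments.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)


# Illustrative chunk in the usual chat-completion delta shape:
sample = 'data: {"choices": [{"delta": {"content": "Hel"}, "index": 0}]}'
chunk = parse_sse_line(sample)
print(chunk["choices"][0]["delta"]["content"])  # -> Hel
```

Concatenating the `delta.content` fragments across chunks reconstructs the full message.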
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
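As noted above, it is usually best to adjust either `temperature` or `top_p`, not both. A minimal request-body sketch illustrating that choice (model id and prompt are illustrative):

```python
# Tune randomness via temperature only; top_p is deliberately omitted,
# per the recommendation to alter one of the two but not both.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Name three sorting algorithms."}],
    "temperature": 0.2,  # lower values -> more focused, deterministic output
}
print(payload["temperature"])  # -> 0.2
```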

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
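Putting the predicted-content fields above together: when regenerating a file with minor edits, the original text can be supplied as the prediction so matching tokens are returned quickly. A sketch of the request body (file contents and prompt are illustrative; the field shape follows the description above):

```python
# The text being regenerated with a small change:
original_file = "def add(a, b):\n    return a + b\n"

payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Rename the function to sum_two."},
        {"role": "user", "content": original_file},
    ],
    # Predicted content mirrors the text expected in the response:
    "prediction": {"type": "content", "content": original_file},
}
```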

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
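Combining the `json_schema` fields above, a request body might look like the sketch below (the schema, its name, and the model id are illustrative):

```python
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Extract the city from: 'I live in Oslo.'"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_extraction",  # a-z, A-Z, 0-9, _ and -, max 64 chars
            "strict": True,             # enforce exact schema adherence
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    },
}
```

With `strict` set to `True`, only the supported subset of JSON Schema may be used in `schema`.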

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
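To force one specific function rather than leaving the default `auto` behavior, `tool_choice` can name it explicitly, as described above. A sketch (the `get_weather` tool and model id are illustrative):

```python
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [get_weather_tool],
    # Force this exact tool instead of letting the model choose:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,  # at most one tool call per turn
}
```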

    reasoning_effortstring · enumOptional

    Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

    Possible values:
    Responses
    200 Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

    The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-3-1-flash-lite-preview
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

    The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200 Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

    The role of the author of the message — in this case, the user

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

    The parameters the functions accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
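`top_logprobs` only takes effect when `logprobs` is enabled, as noted above. A request-body sketch asking for the five most likely alternatives at each token position (the model id is a placeholder for one of the values supported by this endpoint):

```python
payload = {
    "model": "<MODEL_ID>",  # replace with a model from the Possible values list
    "messages": [{"role": "user", "content": "Say hi."}],
    "logprobs": True,   # must be True for top_logprobs to apply
    "top_logprobs": 5,  # 0-20 alternatives per output token
}
```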

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    enable_thinkingbooleanOptional

    Specifies whether to use the thinking mode.

    Default: false
    thinking_budgetinteger · min: 1Optional

    The maximum reasoning length, effective only when enable_thinking is set to true.
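Combining the two reasoning controls above: `thinking_budget` caps the reasoning length and is honored only when `enable_thinking` is set. A sketch (the model id is a placeholder for a model that supports thinking mode):

```python
payload = {
    "model": "<MODEL_ID>",  # replace with a thinking-capable model id
    "messages": [{"role": "user", "content": "Prove that 17 is prime."}],
    "enable_thinking": True,   # defaults to false when omitted
    "thinking_budget": 1024,   # ignored unless enable_thinking is true
}
```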

    Responses
    200 Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completion
    Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.
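Client code can reconstruct text and linear probabilities from the `bytes` and `logprob` fields above; a minimal sketch (the helper names are hypothetical):

```python
import math
from typing import List, Optional


def token_text(byte_values: Optional[List[int]]) -> Optional[str]:
    """Join a token's UTF-8 byte values back into text.

    Per the schema above, bytes may be null when a token has no byte
    representation, and characters split across tokens may require
    concatenating several tokens' bytes before decoding.
    """
    if byte_values is None:
        return None
    return bytes(byte_values).decode("utf-8")


def token_probability(logprob: float) -> float:
    """Convert a log probability to a linear probability in [0, 1]."""
    return math.exp(logprob)
```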

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-32b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-32b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-32b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
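The curl example above can also be issued from Python's standard library; a sketch in which the endpoint, headers, and body mirror the curl call (`build_request` is a hypothetical helper):

```python
import json
import urllib.request

API_URL = "https://api.aimlapi.com/v1/chat/completions"


def build_request(api_key: str, model: str, user_text: str) -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually send the request:
#   with urllib.request.urlopen(build_request("<YOUR_AIMLAPI_KEY>", "alibaba/qwen3-32b", "Hello")) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```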
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-3-1-flash-lite-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-3-1-flash-lite-preview",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    typestring · enumRequiredPossible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

Specifies the detail level of the image. Supported image formats: JPG/JPEG, PNG, GIF, and WEBP.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
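When `stream` is true, each server-sent event carries a JSON chunk on a `data:` line; a minimal client-side parser sketch (the `data: [DONE]` terminator is an assumption based on the OpenAI-compatible streaming format):

```python
import json
from typing import Optional


def parse_sse_line(line: str) -> Optional[dict]:
    """Parse one server-sent-events line from a streamed response.

    Each data line carries a JSON chunk; the final "data: [DONE]" marker
    yields None, as do comments and blank keep-alive lines.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None
    return json.loads(data)
```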
    include_usagebooleanRequired
    ninteger · min: 1 · nullableOptional

    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.
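A sketch of supplying Predicted Output content (the top-level `prediction` field name is an assumption based on the OpenAI-compatible schema; the inner `type`/`content` shape follows the fields above):

```python
def build_prediction(original_text: str) -> dict:
    """Wrap existing text as Predicted Output content.

    Useful when regenerating a file with minor changes, so matching
    tokens can be returned much more quickly.
    """
    return {"type": "content", "content": original_text}


payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Rename the function to foo."}],
    "prediction": build_prediction("def bar():\n    return 1\n"),
}
```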

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

    The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
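A sketch of a `json_schema` response format built from the fields above (nesting them under a `json_schema` key is an assumption based on the OpenAI-compatible schema; the weather schema itself is hypothetical):

```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        # name: a-z, A-Z, 0-9, underscores and dashes, max length 64
        "name": "weather_report",
        "description": "A structured weather summary.",
        "strict": True,  # only a subset of JSON Schema is supported when strict is True
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temperature_c": {"type": "number"},
            },
            "required": ["city", "temperature_c"],
            "additionalProperties": False,
        },
    },
}
```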

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
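A sketch of defining a function tool and validating model-generated arguments before calling the function, as the arguments description above recommends (nesting the tool fields under a `function` key is an assumption based on the OpenAI-compatible schema; `get_weather` is hypothetical):

```python
import json
from typing import List, Optional

# A function tool definition using the fields above: type, name,
# description, and parameters described as a JSON Schema object.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Forcing this tool, per the tool_choice description above:
tool_choice = {"type": "function", "function": {"name": "get_weather"}}


def parse_tool_arguments(arguments: str, required: List[str]) -> Optional[dict]:
    """Defensively parse model-generated tool-call arguments.

    The model does not always emit valid JSON and may hallucinate
    parameters, so validate before calling your function.
    """
    try:
        args = json.loads(arguments)
    except json.JSONDecodeError:
        return None
    if not isinstance(args, dict) or any(k not in args for k in required):
        return None
    return args
```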

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

    Example: chat.completionPossible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: google/gemini-2.5-pro
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

    Audio input tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of API credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
    modelstring · enumRequiredPossible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

    The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.
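Putting the `json_schema` fields above together, a minimal structured-output request. The `json_schema` wrapper key is an assumption following the OpenAI-compatible convention, and the schema itself is illustrative:

```python
payload = {
    "model": "gpt-4o",  # placeholder model id
    "messages": [{
        "role": "user",
        "content": "Extract the city and country from: 'I live in Paris, France.'",
    }],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",  # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
            "strict": True,      # enforce exact schema adherence
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
                "additionalProperties": False,
            },
        },
    },
}

assert len(payload["response_format"]["json_schema"]["name"]) <= 64
```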

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    logprobsboolean · nullableOptional

    Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.

    top_logprobsnumber · max: 20 · nullableOptional

    An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
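The two parameters work together: `top_logprobs` is only honored when `logprobs` is enabled. A sketch:

```python
# top_logprobs requires logprobs to be set to True.
payload = {
    "model": "gpt-4o",  # placeholder model id
    "messages": [{"role": "user", "content": "Say hi."}],
    "logprobs": True,   # return log probabilities of output tokens
    "top_logprobs": 5,  # up to 20 alternatives per token position
}

assert payload["logprobs"] is True and 0 <= payload["top_logprobs"] <= 20
```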

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
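A caller typically branches on `finish_reason` before using the message. A sketch over a hypothetical parsed choice dict, covering the values listed above:

```python
def handle_choice(choice: dict) -> str:
    """Dispatch on the finish_reason values listed above."""
    reason = choice["finish_reason"]
    if reason == "stop":          # natural stop point or stop sequence
        return choice["message"]["content"]
    if reason == "length":        # token limit reached; output is truncated
        return choice["message"]["content"] + " [truncated]"
    if reason == "tool_calls":    # the model requested a tool invocation
        return "tool call requested"
    if reason == "content_filter":
        return "content omitted by the content filter"
    raise ValueError(f"unexpected finish_reason: {reason}")

choice = {"finish_reason": "stop", "message": {"content": "Hello!"}}
```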
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: alibaba/qwen3-max-instruct
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
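The usage fields above are related by a simple identity (total = prompt + completion). A sketch using the example values from this page:

```python
# Example usage object using the sample values shown above.
usage = {
    "prompt_tokens": 137,
    "completion_tokens": 914,
    "total_tokens": 1051,
}
billing = {"credits_used": 120000, "usd_spent": 0.06}

# total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```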
    post
    /v1/chat/completions
    200Success
    post
    Body
modelstring · enumRequired
Possible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the tool.

    Possible values:
    contentstringRequired

    The contents of the tool message.

    tool_call_idstringRequired

    Tool call that this message is responding to.

    namestring · nullableOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
    refusalstringRequired

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the content part.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    refusalstring · nullableOptional

    The refusal message by the Assistant.

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
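A streaming request sketch; the `stream_options` wrapper around `include_usage` is an assumption following the OpenAI-compatible convention:

```python
payload = {
    "model": "gpt-4o",  # placeholder model id
    "messages": [{"role": "user", "content": "Tell me a short story."}],
    "stream": True,  # server-sent events as tokens are generated
    "stream_options": {"include_usage": True},  # ask for a final usage chunk
}

assert payload["stream"] is True
```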
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    descriptionstringOptional

    A description of what the function does, used by the model to choose when and how to call the function.

    namestringRequired

The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional

The parameters the function accepts, described as a JSON Schema object.

    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.

    tool_choiceany ofOptional

    Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

    string · enumOptional

    none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

    Possible values:
    or
    typestring · enumRequired

    The type of the tool. Currently, only function is supported.

    Possible values:
    namestringRequired

    The name of the function to call.

    parallel_tool_callsbooleanOptional

    Whether to enable parallel function calling during tool use.
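Combining the tool-definition fields and `tool_choice` modes above, a sketch of a function-calling request that forces one specific (hypothetical) tool:

```python
# A single function tool, plus a tool_choice that forces it.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function name
        "description": "Get the current weather for a city.",
        "parameters": {         # JSON Schema describing the arguments
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "gpt-4o",  # placeholder model id
    "messages": [{"role": "user", "content": "Weather in Berlin?"}],
    "tools": [weather_tool],
    # "auto" lets the model decide, "none" forbids tools, "required" forces
    # some tool call; naming one forces that specific tool:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,
}
```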

    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
    refusalstring · nullableOptional

    The refusal message generated by the model.

    typestring · enumRequired

    The type of the URL citation. Always url_citation.

    Possible values:
    end_indexintegerRequired

    The index of the last character of the URL citation in the message.

    start_indexintegerRequired

    The index of the first character of the URL citation in the message.

    titlestringRequired

    The title of the web resource.

    urlstringRequired

    The URL of the web resource.

    idstringRequired

    Unique identifier for this audio response.

    datastringRequired

    Base64 encoded audio bytes generated by the model, in the format specified in the request.

    transcriptstringRequired

    Transcript of the audio generated by the model.

    expires_atintegerRequired

    The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    argumentsstringRequired

    The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

    namestringRequired

    The name of the function to call.

    or
    idstringRequired

    The ID of the tool call.

    typestring · enumRequired

    The type of the tool.

    Possible values:
    inputstringRequired

    The input for the custom tool call generated by the model.

    namestringRequired

    The name of the custom tool to call.

    finish_reasonstring · enumRequired

The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.

    Possible values:
    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[]Required

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    bytesinteger[] · nullableOptional

    A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

    logprobnumberRequired

    The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.

    tokenstringRequired

    The token.

    modelstringRequired

    The model used for the chat completion.

    Example: baidu/ernie-4.5-21b-a3b
    prompt_tokensnumberRequired

    Number of tokens in the prompt.

    Example: 137
    completion_tokensnumberRequired

    Number of tokens in the generated completion.

    Example: 914
    total_tokensnumberRequired

    Total number of tokens used in the request (prompt + completion).

    Example: 1051
    accepted_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

    audio_tokensinteger · nullableOptional

Audio tokens generated by the model.

    reasoning_tokensinteger · nullableOptional

    Tokens generated by the model for reasoning.

    rejected_prediction_tokensinteger · nullableOptional

    When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    audio_tokensinteger · nullableOptional

    Audio input tokens present in the prompt.

    cached_tokensinteger · nullableOptional

    Cached tokens present in the prompt.

    credits_usednumberRequired

The number of credits consumed during generation.

    Example: 120000
    usd_spentnumberRequired

    The total amount of money spent by the user in USD.

    Example: 0.06
    post
    /v1/chat/completions
    200Success
    post
    Body
modelstring · enumRequired
Possible values:
    rolestring · enumRequired

The role of the author of the message — in this case, the user.

    Possible values:
    contentany ofRequired

    The contents of the user message.

    stringOptional
    or
    itemsany ofOptional
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    or
typestring · enumRequired
Possible values:
    urlstring · uriRequired

    Either a URL of the image or the base64 encoded image data.

    detailstring · enumOptional

    Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.

    Possible values:
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    file_datastringOptional

The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.

    filenamestringOptional

    The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    contentany ofRequired

    The contents of the developer message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    rolestring · enumRequired

    The role of the author of the message — in this case, the developer.

    Possible values:
    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the system.

    Possible values:
    contentany ofRequired

    The contents of the system message.

    stringOptional
    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.

    or
    rolestring · enumRequired

    The role of the author of the message — in this case, the Assistant.

    Possible values:
    contentany ofOptional

    The contents of the Assistant message. Required unless tool_calls or function_call is specified.

    stringOptional

    The contents of the Assistant message.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    namestringOptional

    An optional name for the participant. Provides the model information to differentiate between participants of the same role.
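Combining the user-message content parts above (text, image, PDF file) into one message. The `image_url` and `file` wrapper keys are assumptions following the OpenAI-compatible convention, and the PDF bytes are a stand-in:

```python
import base64

# Stand-in bytes; a real request would encode an actual PDF file.
pdf_b64 = base64.b64encode(b"%PDF-1.4 stand-in bytes").decode()

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize the attached chart and document."},
        {"type": "image_url",
         "image_url": {"url": "https://example.com/chart.png",  # or base64 data
                       "detail": "high"}},
        {"type": "file",
         "file": {"file_data": pdf_b64,   # base64-encoded PDF, up to 512 MB
                  "filename": "report.pdf"}},
    ],
}
```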

    max_completion_tokensinteger · min: 1Optional

    An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

    max_tokensnumber · min: 1Optional

    The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.

    streambooleanOptional

    If set to True, the model response data will be streamed to the client as it is generated using server-sent events.

    Default: false
    include_usagebooleanRequired
    temperaturenumber · max: 2Optional

    What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

    top_pnumber · min: 0.01 · max: 1Optional

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

    seedinteger · min: 1Optional

    This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

    min_pnumber · min: 0.001 · max: 0.999Optional

    A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.

    top_knumberOptional

    Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

    repetition_penaltynumber · nullableOptional

    A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    top_anumber · max: 1Optional

An alternative sampling parameter (top-a sampling): tokens whose probability falls below a threshold derived from top_a and the most likely token's probability are excluded from sampling.

    frequency_penaltynumber · min: -2 · max: 2 · nullableOptional

    Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    typestring · enumRequired

    The type of the predicted content you want to provide.

    Possible values:
    contentany ofRequired

    The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.

    stringOptional

    The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.

    or
    typestring · enumRequired

    The type of the content part.

    Possible values:
    textstringRequired

    The text content.

    presence_penaltynumber · min: -2 · max: 2 · nullableOptional

    Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

    stopany ofOptional

    Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    stringOptional
    or
    string[]Optional
    or
    any · nullableOptional
    response_formatone ofOptional

    An object specifying the format that the model must output.

    typestring · enumRequired

    The type of response format being defined. Always text.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_object.

    Possible values:
    or
    typestring · enumRequired

    The type of response format being defined. Always json_schema.

    Possible values:
    namestringRequired

The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

    Other propertiesany · nullableOptional
    strictboolean · nullableOptional

    Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.

    descriptionstringOptional

    A description of what the response format is for, used by the model to determine how to respond in the format.

    Responses
    200Success
    idstringRequired

    A unique identifier for the chat completion.

    Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
    objectstring · enumRequired

    The object type.

Example: chat.completion
Possible values:
    creatednumberRequired

    The Unix timestamp (in seconds) of when the chat completion was created.

    Example: 1762343744
    indexnumberRequired

    The index of the choice in the list of choices.

    Example: 0
    rolestringRequired

    The role of the author of this message.

    Example: assistant
    contentstringRequired

    The contents of the message.

    Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
refusal · string · nullable · Optional
The refusal message generated by the model.
type · string · enum · Required
The type of the URL citation. Always url_citation.
Possible values:
end_index · integer · Required
The index of the last character of the URL citation in the message.
start_index · integer · Required
The index of the first character of the URL citation in the message.
title · string · Required
The title of the web resource.
url · string · Required
The URL of the web resource.
id · string · Required
Unique identifier for this audio response.
data · string · Required
Base64 encoded audio bytes generated by the model, in the format specified in the request.
transcript · string · Required
Transcript of the audio generated by the model.
expires_at · integer · Required
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

id · string · Required
The ID of the tool call.
type · string · enum · Required
The type of the tool.
Possible values:
arguments · string · Required
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
name · string · Required
The name of the function to call.
or
id · string · Required
The ID of the tool call.
type · string · enum · Required
The type of the tool.
Possible values:
input · string · Required
The input for the custom tool call generated by the model.
name · string · Required
The name of the custom tool to call.
finish_reason · string · enum · Required
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
Possible values:
bytes · integer[] · Required
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
logprob · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
token · string · Required
The token.
bytes · integer[] · nullable · Optional
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
logprob · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
token · string · Required
The token.
bytes · integer[] · Required
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
logprob · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
token · string · Required
The token.
bytes · integer[] · nullable · Optional
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
logprob · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
token · string · Required
The token.

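Because a multi-byte character can be split across tokens, as the bytes description above notes, the per-token byte lists should be concatenated before decoding. A minimal sketch (the byte values are an illustrative assumption):

```python
# Reassemble text from per-token UTF-8 byte lists. A multi-byte
# character (here the euro sign, bytes 226/130/172) can span two
# tokens, so concatenate all bytes first and decode once.
token_bytes = [
    [72, 101, 108],   # "Hel"
    [108, 111],       # "lo"
    [226, 130],       # first two bytes of the euro sign
    [172],            # final byte of the euro sign
]

combined = bytes(b for tok in token_bytes for b in tok)
text = combined.decode("utf-8")
# text == "Hello€"
```

Decoding each token's bytes individually would fail on the split euro sign, which is why the combined decode is necessary.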
model · string · Required
The model used for the chat completion.
Example: google/gemma-3-27b-it
prompt_tokens · number · Required
Number of tokens in the prompt.
Example: 137
completion_tokens · number · Required
Number of tokens in the generated completion.
Example: 914
total_tokens · number · Required
Total number of tokens used in the request (prompt + completion).
Example: 1051
accepted_prediction_tokens · integer · nullable · Optional
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
audio_tokens · integer · nullable · Optional
Audio input tokens generated by the model.
reasoning_tokens · integer · nullable · Optional
Tokens generated by the model for reasoning.
rejected_prediction_tokens · integer · nullable · Optional
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
audio_tokens · integer · nullable · Optional
Audio input tokens present in the prompt.
cached_tokens · integer · nullable · Optional
Cached tokens present in the prompt.
credits_used · number · Required
The number of credits consumed during generation.
Example: 120000
usd_spent · number · Required
The total amount of money spent by the user in USD.
Example: 0.06
POST /v1/chat/completions
200 · Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemma-3-27b-it",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemma-3-27b-it",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "google/gemini-2.5-pro",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "google/gemini-2.5-pro",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-max-instruct",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-max-instruct",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "baidu/ernie-4.5-21b-a3b",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "baidu/ernie-4.5-21b-a3b",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }
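A 200 response like those above can be unpacked client-side as follows; the dict below mirrors the sample payload, abridged:

```python
# Unpack a sample 200 response like the ones shown above (abridged).
response = {
    "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I assist you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 137, "completion_tokens": 914, "total_tokens": 1051},
    "meta": {"usage": {"credits_used": 120000, "usd_spent": 0.06}},
}

choice = response["choices"][0]
text = choice["message"]["content"]        # the assistant's reply
done = choice["finish_reason"] == "stop"   # natural stop, not truncation
cost = response["meta"]["usage"]["usd_spent"]

# total_tokens is always prompt + completion
assert response["usage"]["total_tokens"] == (
    response["usage"]["prompt_tokens"] + response["usage"]["completion_tokens"]
)
```

Checking finish_reason before using the content distinguishes a complete reply from one truncated by the token limit (length) or the content filter.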
    post
    Body
model · string · enum · Required
Possible values:
role · string · enum · Required
The role of the author of the message — in this case, the user.
Possible values:
content · any of · Required
The contents of the user message.
string · Optional
or
type · string · enum · Required
The type of the content part.
Possible values:
text · string · Required
The text content.
name · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
or
role · string · enum · Required
The role of the author of the message — in this case, the system.
Possible values:
content · any of · Required
The contents of the system message.
string · Optional
or
type · string · enum · Required
The type of the content part.
Possible values:
text · string · Required
The text content.
name · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
or
role · string · enum · Required
The role of the author of the message — in this case, the tool.
Possible values:
content · string · Required
The contents of the tool message.
tool_call_id · string · Required
Tool call that this message is responding to.
name · string · nullable · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
or
role · string · enum · Required
The role of the author of the message — in this case, the Assistant.
Possible values:
content · any of · Optional
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
string · Optional
The contents of the Assistant message.
or
items · any of · Optional
type · string · enum · Required
The type of the content part.
Possible values:
text · string · Required
The text content.
or
refusal · string · Required
The refusal message generated by the model.
type · string · enum · Required
The type of the content part.
Possible values:
name · string · Optional
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
id · string · Required
The ID of the tool call.
type · string · enum · Required
The type of the tool. Currently, only function is supported.
Possible values:
name · string · Required
The name of the function to call.
arguments · string · Required
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
refusal · string · nullable · Optional
The refusal message by the Assistant.

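A sketch of a messages array exercising the four roles described above; the get_time function, its arguments, and the tool call ID are illustrative assumptions:

```python
# One message per role described above. The get_time function and the
# call_abc123 ID are made-up examples.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What time is it in Tokyo?", "name": "alice"},
    {
        "role": "assistant",
        "content": None,  # content may be omitted when tool_calls is present
        "tool_calls": [
            {
                "id": "call_abc123",
                "type": "function",  # currently the only supported tool type
                "function": {"name": "get_time", "arguments": '{"city": "Tokyo"}'},
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "call_abc123",  # must match the assistant's tool call ID
        "content": "09:41",
    },
]
```

The tool message answers the assistant's tool call by ID, which is how the model links a tool result back to the request that produced it.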
max_completion_tokens · integer · min: 1 · Optional
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
max_tokens · number · min: 1 · Optional
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
stream · boolean · Optional
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
include_usage · boolean · Required
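When stream is True, the reply arrives as server-sent events whose payloads carry incremental deltas. A client-side sketch of assembling them; the chunk dicts below are illustrative stand-ins for parsed events, and the delta shape follows the common OpenAI-style streaming format:

```python
# Assemble a streamed reply from delta chunks. The chunks below are a
# made-up stand-in for parsed server-sent events.
chunks = [
    {"choices": [{"delta": {"role": "assistant", "content": ""}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": "Hel"}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": "lo!"}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},  # final chunk
]

# Concatenate the content fragments; empty or missing deltas contribute nothing.
text = "".join(
    chunk["choices"][0]["delta"].get("content") or ""
    for chunk in chunks
)
# text == "Hello!"
```

The final chunk carries finish_reason instead of content, signalling that the stream is complete.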
type · string · enum · Required
The type of the tool. Currently, only function is supported.
Possible values:
description · string · Optional
A description of what the function does, used by the model to choose when and how to call the function.
name · string · Required
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Other properties · any · nullable · Optional
The parameters the function accepts, described as a JSON Schema object.
strict · boolean · nullable · Optional
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
tool_choice · any of · Optional
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.
string · enum · Optional
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
Possible values:
or
type · string · enum · Required
The type of the tool. Currently, only function is supported.
Possible values:
name · string · Required
The name of the function to call.

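Putting the tool definition and tool_choice fields above together, a request that forces one specific function might look like this; the get_weather function is an illustrative assumption:

```python
# Request body that defines one function tool and forces the model to
# call it. The get_weather function is a made-up example.
payload = {
    "model": "google/gemma-3-27b-it",
    "messages": [{"role": "user", "content": "Weather in Berlin?"}],
    "tools": [
        {
            "type": "function",  # currently the only supported tool type
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {  # JSON Schema for the function's arguments
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # Force this exact tool instead of letting the model choose ("auto").
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```

Passing "auto" (or omitting tool_choice when tools are present) instead lets the model decide between answering directly and calling a tool.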
parallel_tool_calls · boolean · Optional
Whether to enable parallel function calling during tool use.
temperature · number · max: 2 · Optional
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
top_p · number · min: 0.01 · max: 1 · Optional
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
stop · any of · Optional
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
string · Optional
or
string[] · Optional
or
any · nullable · Optional
frequency_penalty · number · min: -2 · max: 2 · nullable · Optional
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

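The sampling controls above combine like this; note the advice to adjust temperature or top_p, but not both. The parameter values here are illustrative:

```python
# Two alternative sampling configurations; adjust temperature OR top_p,
# not both. Values are illustrative.
focused = {
    "model": "google/gemma-3-27b-it",
    "messages": [{"role": "user", "content": "Summarize TCP in one line."}],
    "temperature": 0.2,        # low: more focused and deterministic
    "stop": ["\n\n"],          # up to 4 stop sequences; not returned in the text
    "frequency_penalty": 0.5,  # discourage verbatim repetition
}

# Nucleus-sampling variant: drop temperature, set top_p instead.
nucleus = dict(focused)
del nucleus["temperature"]
nucleus["top_p"] = 0.1         # only the top 10% probability mass is considered
```

Either payload is valid on its own; mixing both knobs in one request tends to make the sampling behavior harder to reason about, which is why the reference recommends picking one.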
type · string · enum · Required
The type of the predicted content you want to provide.
Possible values:
content · any of · Required
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
string · Optional
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
or
type · string · enum · Required
The type of the content part.
Possible values:
text · string · Required
The text content.

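A sketch of the prediction field described above, used when regenerating a file with minor edits. The config text is an illustrative assumption, and the "content" type value follows the common OpenAI-style shape:

```python
# Predicted Output: supply text the response is expected to mostly
# match, so unchanged spans can be returned much more quickly. The
# config file below is a made-up example; the "content" type value
# follows the common OpenAI-style shape.
existing_file = "timeout = 30\nretries = 3\nverbose = false\n"

payload = {
    "model": "google/gemma-3-27b-it",
    "messages": [
        {"role": "user", "content": "Set retries to 5 in this config:\n" + existing_file}
    ],
    "prediction": {
        "type": "content",
        "content": existing_file,  # tokens matching this can be returned faster
    },
}
```

Matched tokens show up as accepted_prediction_tokens in the usage details; unmatched ones appear as rejected_prediction_tokens and are still billed as completion tokens.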
presence_penalty · number · min: -2 · max: 2 · nullable · Optional
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
seed · integer · min: 1 · Optional
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
response_format · one of · Optional
An object specifying the format that the model must output.
type · string · enum · Required
The type of response format being defined. Always text.
Possible values:
or
type · string · enum · Required
The type of response format being defined. Always json_object.
Possible values:
or
type · string · enum · Required
The type of response format being defined. Always json_schema.
Possible values:
name · string · Required
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Other properties · any · nullable · Optional
strict · boolean · nullable · Optional
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
description · string · Optional
A description of what the response format is for, used by the model to determine how to respond in the format.
repetition_penalty · number · nullable · Optional
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

    Responses
200 · Success
id · string · Required
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
object · string · enum · Required
The object type.
Example: chat.completion
Possible values:
created · number · Required
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
index · number · Required
The index of the choice in the list of choices.
Example: 0
role · string · Required
The role of the author of this message.
Example: assistant
content · string · Required
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
refusal · string · nullable · Optional
The refusal message generated by the model.
type · string · enum · Required
The type of the URL citation. Always url_citation.
Possible values:
end_index · integer · Required
The index of the last character of the URL citation in the message.
start_index · integer · Required
The index of the first character of the URL citation in the message.
title · string · Required
The title of the web resource.
url · string · Required
The URL of the web resource.
id · string · Required
Unique identifier for this audio response.
data · string · Required
Base64 encoded audio bytes generated by the model, in the format specified in the request.
transcript · string · Required
Transcript of the audio generated by the model.
expires_at · integer · Required
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.

id · string · Required
The ID of the tool call.
type · string · enum · Required
The type of the tool.
Possible values:
arguments · string · Required
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
name · string · Required
The name of the function to call.
or
id · string · Required
The ID of the tool call.
type · string · enum · Required
The type of the tool.
Possible values:
input · string · Required
The input for the custom tool call generated by the model.
name · string · Required
The name of the custom tool to call.
finish_reason · string · enum · Required
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
Possible values:
bytes · integer[] · Required
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
logprob · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
token · string · Required
The token.
bytes · integer[] · nullable · Optional
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
logprob · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
token · string · Required
The token.
bytes · integer[] · Required
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
logprob · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
token · string · Required
The token.
bytes · integer[] · nullable · Optional
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
logprob · number · Required
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
token · string · Required
The token.

model · string · Required
The model used for the chat completion.
Example: alibaba/qwen3-235b-a22b-thinking-2507
prompt_tokens · number · Required
Number of tokens in the prompt.
Example: 137
completion_tokens · number · Required
Number of tokens in the generated completion.
Example: 914
total_tokens · number · Required
Total number of tokens used in the request (prompt + completion).
Example: 1051
accepted_prediction_tokens · integer · nullable · Optional
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
audio_tokens · integer · nullable · Optional
Audio input tokens generated by the model.
reasoning_tokens · integer · nullable · Optional
Tokens generated by the model for reasoning.
rejected_prediction_tokens · integer · nullable · Optional
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
audio_tokens · integer · nullable · Optional
Audio input tokens present in the prompt.
cached_tokens · integer · nullable · Optional
Cached tokens present in the prompt.
credits_used · number · Required
The number of credits consumed during generation.
Example: 120000
usd_spent · number · Required
The total amount of money spent by the user in USD.
Example: 0.06
POST /v1/chat/completions
200 · Success
    curl -L \
      --request POST \
      --url 'https://api.aimlapi.com/v1/chat/completions' \
      --header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "alibaba/qwen3-235b-a22b-thinking-2507",
        "messages": [
          {
            "role": "user",
            "content": "Hello"
          }
        ]
      }'
    {
      "id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
      "object": "chat.completion",
      "created": 1762343744,
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
            "refusal": null,
            "annotations": null,
            "audio": null,
            "tool_calls": null
          },
          "finish_reason": "stop",
          "logprobs": null
        }
      ],
      "model": "alibaba/qwen3-235b-a22b-thinking-2507",
      "usage": {
        "prompt_tokens": 137,
        "completion_tokens": 914,
        "total_tokens": 1051,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
      },
      "meta": {
        "usage": {
          "credits_used": 120000,
          "usd_spent": 0.06
        }
      }
    }