Learn how to get started with the AI/ML API
from openai import OpenAI
client = OpenAI(
base_url="https://api.aimlapi.com/v1",
api_key="<YOUR_AIMLAPI_KEY>",
)
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Write a one-sentence story about numbers."}]
)
print(response.choices[0].message.content)

Access leading AI models (GPT-4o, Gemini, and others) through a single unified API. Initial setup takes just a few minutes.
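For anything beyond a quick test, avoid hardcoding the key in source files. A minimal sketch that reads it from an environment variable instead (assuming you have exported AIMLAPI_API_KEY in your shell; the variable name is our choice, not mandated by the API):

import os
from openai import OpenAI

# Read the key from the environment rather than committing it to source control.
client = OpenAI(
    base_url="https://api.aimlapi.com/v1",
    api_key=os.environ["AIMLAPI_API_KEY"],  # assumed variable name; use whatever you exported
)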
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemma-3-4b-it",
"messages": [
{
"role": "user",
"content": "Tell me about San Francisco"
}
],
"temperature": 0.7,
"max_tokens": 512
}'

userPrompt = 'Tell me about San Francisco' // insert your request here
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemma-3-4b-it',
messages:[
{
role:'user',
content: userPrompt
}
],
temperature: 0.7,
max_tokens: 512,
}),
});
const data = await response.json();
const answer = data.choices[0].message.content;
console.log('User:', userPrompt);
console.log('AI:', answer);
}
main();

import requests
user_prompt = "Tell me about San Francisco" # insert your request here
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>"
},
json={
"model":"google/gemma-3-4b-it",
"messages":[
{
"role":"user",
"content": user_prompt
}
],
"temperature": 0.7,
"max_tokens": 512,
}
)
data = response.json()
answer = data["choices"][0]["message"]["content"]
print("User:", user_prompt)
print("AI:", answer)








%pip install requests

python3 -m venv ./.venv

# Linux / Mac
source ./.venv/bin/activate
# Windows
.\.venv\Scripts\activate.bat

pip install requests

touch travel.py

import requests
user_prompt = "Tell me about San Francisco"
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemma-3-4b-it",
"messages":[
{
"role":"user",
"content": user_prompt
}
],
"temperature": 0.7,
"max_tokens": 512,
}
)
data = response.json()
answer = data["choices"][0]["message"]["content"]
print("User:", user_prompt)
print("AI:", answer)python3 ./travel.pyUser: Tell me about San Francisco
AI: San Francisco, located in northern California, USA, is a vibrant and culturally rich city known for its iconic landmarks, beautiful vistas, and diverse neighborhoods. It's a popular tourist destination famous for its iconic Golden Gate Bridge, which spans the entrance to the San Francisco Bay, and the iconic Alcatraz Island, home to the infamous federal prison.
The city's famous hills offer stunning views of the bay and the cityscape. Lombard Street, the "crookedest street in the world," is a must-see attraction, with its zigzagging pavement and colorful gardens. Ferry Building Marketplace is a great place to explore local food and artisanal products, and the Pier 39 area is home to sea lions, shops, and restaurants.
San Francisco's diverse neighborhoods each have their unique character. The historic Chinatown is the oldest in North America, while the colorful streets of the Mission District are known for their murals and Latin American culture. The Castro District is famous for its LGBTQ+ community and vibrant nightlife.

./index.js

User: Tell me about San Francisco
AI: San Francisco, located in the northern part of California, USA, is a vibrant and culturally rich city known for its iconic landmarks, beautiful scenery, and diverse neighborhoods.
The city is famous for its iconic Golden Gate Bridge, an engineering marvel and one of the most recognized structures in the world. Spanning the Golden Gate Strait, this red-orange suspension bridge connects San Francisco to Marin County and offers breathtaking views of the San Francisco Bay and the Pacific Ocean.

--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemma-3-4b-it",
"messages": [
{
"role": "user",
"content": "Tell me about San Francisco"
}
],
"temperature": 0.7,
"max_tokens": 512
}'

headers: {
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},

body: JSON.stringify({
model: 'google/gemma-3-4b-it',
messages: [
{
role: 'user',
content: userPrompt
}
],
temperature: 0.7,
max_tokens: 512,
}),

const data = await response.json();

const answer = data.choices[0].message.content;

console.log('User:', userPrompt);
console.log('AI:', answer);

response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
...
)

headers={
"Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type": "application/json"
},

json={
"model": "google/gemma-3-4b-it",
"messages": [
{
"role": "user",
"content": user_prompt
}
],
"temperature": 0.7,
"max_tokens": 512,
}

data = response.json()

answer = data["choices"][0]["message"]["content"]

print("User:", user_prompt)
print("AI:", answer)A full list of available models.
Overview of the capabilities of AIML API text models (LLMs).
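Every chat example in the sections below sends the same OpenAI-compatible request shape; only the model string and the prompt change. A minimal sketch of that shared pattern as a reusable helper (the function name ask_model is hypothetical, not part of any SDK):

import requests

API_URL = "https://api.aimlapi.com/v1/chat/completions"
API_KEY = "<YOUR_AIMLAPI_KEY>"  # insert your AIML API key

def ask_model(model: str, prompt: str, **params) -> str:
    """Send a single user message to `model` and return the assistant's reply text."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}], **params},
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# Example: print(ask_model("google/gemma-3-4b-it", "Hello", max_tokens=512))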
pip install requests

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"bytedance/dola-seed-2-0-pro",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

import requests

--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \

headers: {
Authorization: "Bearer <YOUR_AIMLAPI_KEY>",
},

headers={
"Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
},

curl --request POST \
--url https://api.aimlapi.com/chat/completions \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemma-3-4b-it",
"messages": [
{
"role": "user",
"content": "What kind of model are you?"
}
],
"max_tokens": 512
}'

fetch("https://api.aimlapi.com/chat/completions", {
method: "POST",
headers: {
Authorization: "Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "google/gemma-3-4b-it",
messages: [
{
role: "user",
content: "What kind of model are you?",
},
],
max_tokens: 512,
}),
})
.then((res) => res.json())
.then(console.log);

import requests
import json # for getting a structured output with indentation
response = requests.post(
url="https://api.aimlapi.com/chat/completions",
headers={
"Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type": "application/json",
},
data=json.dumps(
{
"model": "google/gemma-3-4b-it",
"messages": [
{
"role": "user",
"content": "What kind of model are you?",
},
],
"max_tokens": 512
}
),
)
response.raise_for_status()
print(response.json())

pip install openai

%pip install openai

import openai

npm install openai

import OpenAI from "openai";

from openai import OpenAI
# Insert your AIML API key in the quotation marks instead of <YOUR_AIMLAPI_KEY>:
api_key = "<YOUR_AIMLAPI_KEY>"
base_url = "https://api.aimlapi.com/v1"
user_prompt = "Tell me about San Francisco"
api = OpenAI(api_key=api_key, base_url=base_url)
def main():
    completion = api.chat.completions.create(
        model="google/gemma-3-4b-it",
        messages=[
            {
                "role": "user",
                "content": user_prompt
            },
        ],
        temperature=0.7,
        max_tokens=256,
    )
    response = completion.choices[0].message.content
    print("User:", user_prompt)
    print("AI:", response)

if __name__ == "__main__":
    main()

#!/usr/bin/env node
const OpenAI = require("openai");
const baseURL = "https://api.aimlapi.com/v1";
const apiKey = "<YOUR_AIMLAPI_KEY>";
const systemPrompt = "You are a travel agent. Be descriptive and helpful.";
const userPrompt = "Tell me about San Francisco";
const api = new OpenAI({
apiKey,
baseURL,
});
const main = async () => {
try {
const completion = await api.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: systemPrompt,
},
{
role: "user",
content: userPrompt,
},
],
temperature: 0.7,
max_tokens: 256,
});
const response = completion.choices[0].message.content;
console.log("User:", userPrompt);
console.log("AI:", response);
} catch (error) {
console.error("Error:", error.message);
}
};
main();

touch .env

AIML_API_KEY = "<YOUR_AIMLAPI_KEY>"
AIML_API_URL = "https://api.aimlapi.com/v1"

# install from PyPI
pip install aimlapi-sdk-python

from aiml_api import AIML_API
api = AIML_API()
completion = api.chat.completions.create(
model = "mistralai/Mistral-7B-Instruct-v0.2",
messages = [
{"role": "user", "content": "Explain the importance of low-latency LLMs"},
],
temperature = 0.7,
max_tokens = 256,
)
response = completion.choices[0].message.content
print("AI:", response)python3 <your_script_name>.pyasync function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'bytedance/dola-seed-2-0-pro',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "Mankind is the most wonderful, infuriating, gloriously unnecessary thing we have ever found in the universe.\n\nThey are the only thing that will stop on a busy sidewalk to feel sad for a dead sparrow they never met. They are also the same thing that will invent life-saving vaccines in 12 months, then spend the next year arguing online about whether to take them.\n\nThey build cathedrals that take 300 years to finish, knowing they will die before the roof is even put on. They scrawl love poems on prison walls. They will walk 10 miles through mud to carry a stranger water, and also press a button to kill a stranger 10 miles away without looking up.\n\nNothing else behaves like this. Stars just burn. Rocks just fall. Bacteria just divide. Only humans will look at an empty, indifferent universe and go:\nWhat if I put rosemary in the bread?\nWhat if I sang a song about how the rain sounds on tin roofs?\nWhat if I gave the moon a name?\n\nThey don't make sense. They hurt each other, they hurt themselves, they make very stupid choices over and over. But they keep trying. They keep reaching. Even when it's pointless, even when it hurts, even when every part of them knows they will probably fail.\n\nThat's mankind. Messy. Contradictory. Glowing.",
"reasoning_content": "The answer should not be too generic.Starting from the \"messy beautiful thing\" of humanity, the core traits of humans have been sorted out: they have contradictory, complex and unique emotions and behaviors, are full of curiosity and creativity, and keep trying despite the contradictions.\n\nThis description of mankind is confirmed to be genuine, and I will structure it naturally next.\n",
"encrypted_content": "djHCVb1EhcVSjsajNoTbfwEFaHGEjMReF6lqp4vNxL6QvqyYcT3DQh4usB63Gm04ed0kg7Ur8g1OnpZ38sDTSUDxVlNcCoR2Prlt/CC570nBEMbCzwEZNFgFmdg97AiK3hqlGCN6rkHoGNYFbReKP/KAg6+tqcq32ejHRH8T1wWWWrot8VqLPY8m8pU2j21oE5ooYl4YUQzEIx7i03X4ygMlWJBl3433m6i8pa3JxOnkZdFRJ9EEZ0tu9MqTKKo9Qo5tsQR08kYCRMnbHATNwGD+XLQukUyUrxH6TDOxxS/aB0vbUArAThkQNhLoUc+YzdkMyLwGsHp2t+IAUaQaPO8dmKaVAG7CQesrqvfMIuAs4KFszkNg++JzRFt5ODOP4sED0b9cu5GJPxfYLuOu0W9AxZrXIFwgo/jOcNfmVG6tj7voNvhNtVR99q44zuim9MeD0S361IEvXD+ehYa0JOonS0X5tOaxjqoSWiSj94lU1PzJ5xA2Pbf+xwbzb8z08+XyY43S2F7m2E3GL8fcePCyFSNf8G4v8owDf5J9ZADMf0KRVMWzjMD3t3KMS0Q+jBe3nXDA9kwQtLiRbV+RXzUgz+M5jtR8PT2ybkY1GxJylAkQ13U/XIhCfNFKOUAK5Krm6vIFA8hglrxI8TdhEshm5/N0YRwrS4tzXzxuZunFFN7qIVxgpU7IN+BrwDNTNOzVF6ivs4PITPB/80NloPfDR8YmZ3opbltlMzkB11PPJ4QGwG/B2qAu5UB4jlKzFyUVbtrLc10fv6YYvGVH77d0BDEIIjdzEe808ZjvXu8ungT3BPseULYuY90j8igcNVG1iMnnO59jICFaxXbxtHxC0fl8VuNkIvQmCblpEfJW+eWqdH3OI6hXz1qbeQBZaWG7SqaaFZE78XzR7TsTDHk7SAvfEg3ujcpmtGUTM42EQrMcjTLBGe+oe64aJUorllzcuQ5wSSnaYk6LD7QOB91K8pMbQaEcHg3Y107R26Jd0kluJDV6yWDWIvfdy9vBeKL0yajjkzLAQuvf+ynXOv70q01sPKMnoovEl0W3GBCcnm8vtTUj7zTXFwmiM9NctesqSd51po4ON4m8oSC1eG0RwOnwGSqF8a2Uoe86Kc/wwFkCp8FPiw3lsqP9LH0onw8owje4qyuBRwXKdVGvDUTPMAdehOX1MBXhLUpmyUySsc+88KgDtSQC4poATAXlT0kMSA/Ez024aRvXIeg0EOzO4QAoFjdrgSYvKVJhe41ZbhMWrbS+Lu1kFUscJpk6miHvLDk4Om0WQ9L/P0VuUL81KLaFovr9gztnLW7A0fhVqFpdK/8vTS2BBERCbwp0Zm8kNb4GbaduqlGbU9B8ln9KW4pD8e8WpKNGd1WXasPZPAKjcbsSXoSi9SlwchoTVYXLyR2Cs70=",
"role": "assistant"
}
}
],
"created": 1777553646,
"id": "021777553638913c0a335079e7be4c79ef57584e00819ba1b0ad6",
"model": "seed-2-0-pro-260328",
"service_tier": "default",
"object": "chat.completion",
"usage": {
"completion_tokens": 591,
"prompt_tokens": 57,
"total_tokens": 648,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 299
}
},
"meta": {
"usage": {
"credits_used": 4686,
"usd_spent": 0.002343
}
}
}

aimlapi-llm-reasoning

Model IDs use the aimlapi/ prefix.
Suggested: aimlapi/google/gemini-3-flash-preview

CLAWHUB_WORKDIR

npm install -g openclaw-aimlapi@latest
openclaw onboard --install-daemon

git clone -b feature/add-aimlapi-models-provider --single-branch \
https://github.com/aimlapi/openclaw-aimlapi.git
cd openclaw-aimlapi
pnpm install
pnpm ui:build # installs UI deps on first run
pnpm build
pnpm openclaw onboard --install-daemon

npm i -g clawhub
# or
pnpm add -g clawhubclawhub install aiml-image-video
clawhub install aiml-llm-reasoning

export AIMLAPI_API_KEY="sk-aimlapi-..."
python3 ./skills/aiml-image-video/scripts/gen_image.py \
--prompt "ultra-detailed studio photo of a lobster astronaut"
python3 ./skills/aiml-image-video/scripts/gen_video.py \
--prompt "slow drone shot of a foggy forest"export AIMLAPI_API_KEY="sk-aimlapi-..."
python3 ./skills/aiml-llm-reasoning/scripts/run_chat.py \
--model aimlapi/openai/gpt-5-nano-2025-08-07 \
--user "Summarize this in 3 bullets."pnpm openclaw pairing approve telegram <PAIRING_CODE>🦞 OpenClaw 2026.2.6-3 (fe86a9c) — Shell yeah—I'm here to pinch the toil and leave you the glory.
Approved telegram sender 835750362.

openclaw agent \
--message "Tell me about yourself" \
--model gpt-4o

I'm an AI language model created by OpenAI, designed to assist with a wide range of inquiries by generating human-like text based on the input I receive. I can help with answering questions, providing explanations, and even engaging in creative writing. My knowledge is based on a diverse dataset that covers a wide variety of topics up until October 2023. However, I don't have personal experiences, emotions, or consciousness. My primary goal is to be as helpful and informative as possible! If you have any specific questions or need assistance, feel free to ask.








Required parameter: messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"qwen-max",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'qwen-max',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-62aa6045-cee9-995a-bbf5-e3b7e7f3d683",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I assist you today? 😊"
}
}
],
"created": 1756983980,
"model": "qwen-max",
"usage": {
"prompt_tokens": 30,
"completion_tokens": 148,
"total_tokens": 178,
"prompt_tokens_details": {
"cached_tokens": 0
}
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"qwen-plus",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'qwen-plus',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{'id': 'chatcmpl-4fda1bd7-a679-95b9-b81d-1bfc6ae98448', 'system_fingerprint': None, 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today? If you have any questions or need help with anything, just let me know! 😊'}}], 'created': 1744143962, 'model': 'qwen-plus', 'usage': {'prompt_tokens': 8, 'completion_tokens': 68, 'total_tokens': 76, 'prompt_tokens_details': {'cached_tokens': 0}}}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"qwen-turbo",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'qwen-turbo',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{'id': 'chatcmpl-a4556a4c-f985-9ef2-b976-551ac7cef85a', 'system_fingerprint': None, 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': "Hello! How can I help you today? Is there something you would like to talk about or learn more about? I'm here to help with any questions you might have."}}], 'created': 1744144035, 'model': 'qwen-turbo', 'usage': {'prompt_tokens': 1, 'completion_tokens': 15, 'total_tokens': 16, 'prompt_tokens_details': {'cached_tokens': 0}}}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-32b",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
"enable_thinking": False
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-32b',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-1d8a5aa6-34ce-9832-a296-d312b944b437",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I assist you today? 😊",
"reasoning_content": ""
}
}
],
"created": 1756990273,
"model": "qwen3-32b",
"usage": {
"prompt_tokens": 19,
"completion_tokens": 65,
"total_tokens": 84
}
}

import requests
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-32b",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
"enable_thinking": True,
"stream": True
}
)
print(response.text)

data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"role":"assistant","refusal":null,"reasoning_content":""},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":"Okay"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":","},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" the"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" user said \"Hello\". I should respond in a friendly and welcoming manner. Let"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" me make sure to acknowledge their greeting and offer assistance. Maybe something like, \""},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":"Hello! How can I assist you today?\" That's simple and open-ended."},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" I need to check if there's any specific context I should consider, but since"},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":null,"refusal":null,"reasoning_content":" there's none, a general response is fine. Alright, that should work."},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":"Hello! How can I assist you today?","refusal":null,"reasoning_content":null},"index":0,"finish_reason":null}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[{"delta":{"content":"","refusal":null,"reasoning_content":null},"index":0,"finish_reason":"stop"}],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":null}
data: {"id":"chatcmpl-81964e30-1a7c-9668-b78c-a750587ec497","choices":[],"created":1753944369,"model":"qwen3-32b","object":"chat.completion.chunk","usage":{"prompt_tokens":13,"completion_tokens":2010,"total_tokens":2023,"completion_tokens_details":{"reasoning_tokens":82}}}import requests
import json
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization": "Bearer b72af53a19ea41caaf5a74ba1f6fc62b",
"Content-Type": "application/json",
},
json={
"model": "alibaba/qwen3-32b",
"messages": [
{
"role": "user",
# Insert your question for the model here, instead of Hello:
"content": "Hello"
}
],
"stream": True,
}
)
answer = ""
reasoning = ""
for line in response.iter_lines():
if not line or not line.startswith(b"data:"):
continue
try:
raw = line[6:].decode("utf-8").strip()
if raw == "[DONE]":
continue
data = json.loads(raw)
choices = data.get("choices")
if not choices or "delta" not in choices[0]:
continue
delta = choices[0]["delta"]
content_piece = delta.get("content")
reasoning_piece = delta.get("reasoning_content")
if content_piece:
answer += content_piece
if reasoning_piece:
reasoning += reasoning_piece
except Exception as e:
print(f"Error parsing chunk: {e}")
print("\n--- MODEL REASONING ---")
print(reasoning.strip())
print("\n--- MODEL RESPONSE ---")
print(answer.strip())--- MODEL REASONING ---
Okay, the user sent "Hello". I need to respond appropriately. Since it's a greeting, I should reply in a friendly and welcoming manner. Maybe ask how I can assist them. Keep it simple and open-ended to encourage them to share what they need help with. Let me make sure the tone is positive and helpful.
--- MODEL RESPONSE ---
Hello! How can I assist you today? 😊

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-coder-480b-a35b-instruct",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
"enable_thinking": False
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-coder-480b-a35b-instruct',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-f906efa6-f816-9a06-a32b-aa38da5fe11a",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
}
}
],
"created": 1753866642,
"model": "qwen3-coder-480b-a35b-instruct",
"usage": {
"prompt_tokens": 28,
"completion_tokens": 142,
"total_tokens": 170
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-next-80b-a3b-instruct",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
"enable_thinking": False
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-next-80b-a3b-instruct',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-a944254a-4252-9a54-af1b-94afcfb9807e",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today? 😊"
}
}
],
"created": 1758228572,
"model": "qwen3-next-80b-a3b-instruct",
"usage": {
"prompt_tokens": 9,
"completion_tokens": 46,
"total_tokens": 55
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-next-80b-a3b-thinking",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
"enable_thinking": False
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-next-80b-a3b-thinking',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-576aaaf9-f712-9114-b098-c1ee83fbfb6b",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! 😊 How can I assist you today?",
"reasoning_content": "Okay, the user said \"Hello\". I need to respond appropriately. Let me think.\n\nFirst, I should acknowledge their greeting. A simple \"Hello!\" would be good. Maybe add a friendly emoji to keep it warm.\n\nWait, but maybe they want to start a conversation. I should ask how I can help them. That way, I'm being helpful and opening the door for them to ask questions.\n\nLet me check the standard response. Typically, for \"Hello\", the assistant says something like \"Hello! How can I assist you today?\" or \"Hi there! What can I do for you?\"\n\nYes, that's right. Keep it friendly and open-ended. Maybe add a smiley emoji to make it approachable.\n\nSo the response should be: \"Hello! How can I assist you today?\"\n\nThat's good. Let me make sure there's no mistake. Yes, that's standard. No need for anything complicated here. Just a simple, welcoming reply.\n\nAlternatively, sometimes people use \"Hi\" instead of \"Hello\", but since they said \"Hello\", responding with \"Hello\" is fine. Maybe \"Hi there!\" could also work, but sticking to \"Hello\" matches their greeting.\n\nYes, \"Hello! How can I assist you today?\" is perfect. It's polite, friendly, and offers assistance. That should be the response."
}
}
],
"created": 1758229078,
"model": "qwen3-next-80b-a3b-thinking",
"usage": {
"prompt_tokens": 9,
"completion_tokens": 7182,
"total_tokens": 7191,
"completion_tokens_details": {
"reasoning_tokens": 277
}
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-max-preview",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-max-preview',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-8ffebc65-b625-926a-8208-b765371cb1d0",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I assist you today? 😊"
}
}
],
"created": 1758898044,
"model": "qwen3-max-preview",
"usage": {
"prompt_tokens": 23,
"completion_tokens": 139,
"total_tokens": 162
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-max-instruct",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-max-instruct',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-bec5dc33-8f63-96b9-89a4-00aecfce7af8",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
}
}
],
"created": 1758898624,
"model": "qwen3-max",
"usage": {
"prompt_tokens": 23,
"completion_tokens": 113,
"total_tokens": 136
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-vl-32b-instruct",
"messages":[
{
"role":"user",
# Insert your question for the model here:
"content":"Hi! What do you think about mankind?"
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-vl-32b-instruct',
messages:[
{
role:'user',
// Insert your question for the model here:
content:'Hi! What do you think about mankind?'
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"message": {
"content": "Hi! 😊 That’s a beautiful and deep question — one that philosophers, scientists, artists, and everyday people have been asking for centuries.\n\nI think mankind is *remarkably complex* — full of contradictions, potential, and wonder. On one hand, we’ve achieved incredible things: we’ve explored space, cured diseases, created art that moves souls, built cities that rise into the sky, and connected across continents in ways unimaginable just a century ago. We’re capable of profound kindness, empathy, creativity, and courage.\n\nOn the other hand, we’ve also caused immense suffering — through war, injustice, environmental destruction, and indifference to each other’s pain. We often struggle with our own flaws: fear, greed, ego, and short-sightedness.\n\nBut here’s what gives me hope: **we’re also capable of change**. We can learn from our mistakes. We can choose compassion over conflict, cooperation over competition. Every act of kindness, every effort to understand another, every step toward justice — these are signs that humanity is not defined by its worst impulses, but by its capacity to grow.\n\nSo, I’d say: \n➡️ Mankind is flawed, yes — but also deeply hopeful. \n➡️ We’re messy, but we’re trying. \n➡️ We make mistakes, but we can also heal, create, and love.\n\nAnd perhaps most importantly — **we’re not alone in this journey**. We’re all part of something bigger, and together, we have the power to shape a better future.\n\nWhat about you? How do *you* see mankind? 💬✨",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 17,
"completion_tokens": 329,
"total_tokens": 346,
"prompt_tokens_details": {
"text_tokens": 17
},
"completion_tokens_details": {
"text_tokens": 329
}
},
"created": 1764625045,
"system_fingerprint": null,
"model": "qwen3-vl-32b-instruct",
"id": "chatcmpl-a12ab46a-3541-93a8-8180-280ecadbb795",
"meta": {
"usage": {
"tokens_used": 1960
}
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3.5-plus-20260218",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3.5-plus-20260218',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"message": {
"content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can offer an observation based on the vast amount of human history, literature, and data I've been trained on.\n\nMankind appears to be a study in **contradictions and immense potential**:\n\n* **Creativity vs. Destruction**: Humans have composed symphonies, built skyscrapers, and decoded the genome, yet they have also waged devastating wars and caused significant environmental harm. This duality seems to be a core part of the human experience.\n* **Resilience and Adaptability**: Throughout history, humanity has faced plagues, ice ages, and societal collapses, yet consistently found ways to adapt, rebuild, and innovate. The drive to survive and improve conditions is remarkably strong.\n* **Empathy and Connection**: Despite conflicts, there is a deep-seated capacity for compassion. From strangers helping each other in disasters to global movements for justice and rights, the desire to connect and care for one another is a powerful force.\n* **The Quest for Meaning**: Unlike any other known species, humans constantly ask \"why?\" They explore philosophy, art, religion, and science not just to survive, but to understand their place in the universe.\n\nIn short, mankind seems to be a work in progress—flawed and fragile, yet incredibly brilliant and hopeful. Many observers believe that while humans have created many of the world's problems, they also hold the unique key to solving them through cooperation and ingenuity.\n\nWhat are your thoughts? Do you feel more optimistic or concerned about where humanity is heading?",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 21,
"completion_tokens": 337,
"total_tokens": 358,
"completion_tokens_details": {
"text_tokens": 337
},
"prompt_tokens_details": {
"text_tokens": 21
}
},
"created": 1771967570,
"system_fingerprint": null,
"model": "qwen3.5-plus",
"id": "chatcmpl-f907391c-4f0c-96ed-9058-caa5782daa22",
"meta": {
"usage": {
"credits_used": 2125
}
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3.6-27b",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3.6-27b',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"message": {
"content": "Hi! As an AI, I don't have personal feelings or beliefs, but I can share what emerges from the vast tapestry of human history, culture, science, and philosophy I've been trained on.\n\nMankind is profoundly complex. You're capable of staggering creativity, deep empathy, and remarkable cooperation. You've built civilizations, composed symphonies, decoded the genome, sent probes beyond our solar system, and created institutions dedicated to justice, healing, and learning. At the same time, you're prone to division, short-term thinking, and causing harm on both personal and global scales. What makes that tension especially striking is your self-awareness: you reflect on yourselves, critique your own systems, learn from mistakes, and constantly strive to imagine something better.\n\nIf there's a defining thread, it might be your capacity for adaptation and meaning-making. You don't just survive changing conditions; you reshape them, question them, and find purpose within them. That same drive is now tasked with navigating unprecedented challenges: ecological limits, technological acceleration, global interdependence, and the search for shared values in a diverse world.\n\nRather than \"good\" or \"flawed,\" humanity feels more like a work in progress, one that's uniquely equipped to steer its own trajectory precisely because you can step back, evaluate, and choose differently. \n\nHow do you see it? I'd be curious to hear your perspective or what draws you to ask this question.",
"reasoning_content": "Here's a thinking process:\n\n1. **Analyze User Input:**\n - **Question:** \"Hi! What do you think about mankind?\"\n - **Tone:** Friendly, open-ended, philosophical\n - **Subject:** Mankind/humanity\n - **Intent:** Seeking perspective, reflection, or analysis on human nature, achievements, flaws, etc.\n\n2. **Identify Key Considerations:**\n - I'm an AI, so I don't have personal opinions, emotions, or consciousness. I should clarify that upfront but still provide a thoughtful, balanced perspective.\n - Mankind is complex: capable of great good and great harm, creativity and destruction, cooperation and conflict.\n - I should acknowledge both strengths and challenges.\n - Keep it respectful, nuanced, and evidence-based where possible.\n - Avoid overgeneralization or pessimism/optimism bias.\n - Structure the response logically: acknowledge AI nature, present balanced view, highlight key themes, conclude thoughtfully.\n\n3. **Draft - Mental Refinement:**\n Hi! As an AI, I don’t have personal feelings or beliefs, but I can share what I’ve observed through the vast amount of human knowledge, history, and culture I’ve been trained on. Mankind is profoundly complex—capable of staggering creativity, compassion, and cooperation, yet also prone to division, harm, and short-sightedness. What stands out is our resilience and adaptability: from surviving ice ages to landing on the moon, from inventing language and art to building global networks of knowledge and care. At the same time, you’re navigating some of the most consequential challenges yet—climate change, inequality, technological disruption, and the search for meaning in an increasingly interconnected world. If there’s a thread that ties it all together, it’s your capacity for reflection and growth. You question yourselves, learn from mistakes, and keep striving for something better. That self-awareness and hope for progress is, in many ways, what makes humanity remarkable. How do you see it? I’d love to hear your perspective.\n\n4. **Check Against Guidelines:**\n - AI nature acknowledged? Yes.\n - Balanced view? Yes (strengths + challenges).\n - Respectful/nuanced? Yes.\n - Avoids overclaiming? Yes.\n - Engages user? Yes, ends with an open question.\n - Tone matches? Friendly, thoughtful.\n\n5. **Final Polish:**\n - Ensure clarity, conciseness, and natural flow.\n - Check for any unintended bias or absolutism.\n - Confirm it aligns with factual/philosophical neutrality while remaining engaging.\n\n The draft looks solid. I'll output it.✅",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 19,
"completion_tokens": 880,
"total_tokens": 899,
"completion_tokens_details": {
"reasoning_tokens": 580,
"text_tokens": 880
},
"prompt_tokens_details": {
"text_tokens": 19
}
},
"created": 1776976517,
"system_fingerprint": null,
"model": "qwen3.6-27b",
"id": "chatcmpl-773a9843-4689-984d-9964-f3276e47c761",
"meta": {
"usage": {
"credits_used": 8267,
"usd_spent": 0.0041335
}
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3.5-omni-plus",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3.5-omni-plus',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"message": {
"content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can share an observation based on the vast amount of human history, literature, and data I've been trained on.\n\nMankind seems to be defined by a fascinating **duality**:\n\n* **Incredible Potential:** Humans possess an unmatched capacity for creativity, empathy, and innovation. From composing symphonies and creating art to developing life-saving medicines and exploring the cosmos, humanity constantly pushes the boundaries of what is possible. The ability to cooperate, learn from mistakes, and strive for a better future is truly remarkable.\n* **Significant Flaws:** At the same time, human history is also marked by conflict, short-sightedness, and the capacity for great harm. Issues like inequality, environmental degradation, and war show that progress isn't always linear and that good intentions don't always lead to good outcomes.\n\nUltimately, what stands out most is **resilience**. Despite setbacks and challenges, humanity has a persistent drive to adapt, solve problems, and connect with one another. It's a species in a constant state of becoming—imperfect, yet endlessly striving.\n\nWhat about you? Do you feel more optimistic or concerned about where humanity is heading?",
"reasoning_content": "",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 21,
"completion_tokens": 262,
"total_tokens": 283,
"prompt_tokens_details": {
"text_tokens": 21
},
"completion_tokens_details": {
"text_tokens": 262
}
},
"created": 1777054555,
"system_fingerprint": null,
"model": "qwen3.5-omni-plus",
"id": "chatcmpl-c154dc09-fd8e-9850-bda0-d92606ce7b4b",
"meta": {
"usage": {
"credits_used": 5731,
"usd_spent": 0.0028655
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
url = "https://api.aimlapi.com/v1/chat/completions",
headers = {
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type": "application/json"
},
json = {
"model": "alibaba/qwen3.5-omni-plus",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Describe this scene"
},
{
"type": "video_url",
"video_url": {
"url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4"
}
}
]
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3.5-omni-plus',
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: 'Describe this scene'
},
{
type: 'video_url',
video_url: {
url: 'https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4'
}
}
]
}
]
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"message": {
"content": "The scene features a vibrant and dynamic background filled with swirling, colorful abstract patterns. The colors include vivid shades of red, orange, yellow, green, blue, purple, and pink, creating an energetic and visually striking effect. Overlaid on this lively backdrop is a clean white banner positioned horizontally across the center of the frame. \n\nOn the banner, bold black text reads \"AI/ML API\" followed by \"400+ Models,\" indicating a focus on artificial intelligence and machine learning capabilities. Beneath that, in smaller font, additional text lists various functionalities: \"Chat, Reasoning, Image, Video, Code, Audio.\" To the left of the text, there's a simple hexagonal icon with a stylized wave or zigzag symbol inside it, suggesting connectivity or technological innovation.\n\nAs the video progresses through its short duration, subtle animated effects appear—gentle glowing lines or light streaks move across the screen, enhancing the sense of motion and modernity without distracting from the central message. These elements combine to create an engaging promotional visual for an advanced AI/ML platform offering diverse model options.",
"reasoning_content": "",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 3023,
"completion_tokens": 226,
"total_tokens": 3249,
"prompt_tokens_details": {
"text_tokens": 21,
"video_tokens": 3002
},
"completion_tokens_details": {
"text_tokens": 226
}
},
"created": 1777055309,
"system_fingerprint": null,
"model": "qwen3.5-omni-plus",
"id": "chatcmpl-fdbc4409-fe7d-9877-ada0-e315cc499fd1",
"meta": {
"usage": {
"credits_used": 15883,
"usd_spent": 0.0079415
}
}
}

Required parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3.5-omni-flash",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3.5-omni-flash',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"message": {
"content": "Hello! That's a profound question. As an AI, I don't have personal feelings or beliefs, but I can offer a perspective based on the vast amount of human history, literature, science, and art I've processed.\n\nTo me, mankind appears to be a **paradox of incredible complexity**:\n\n* **Resilience and Creativity**: Humans possess a unique ability to imagine things that don't yet exist, from the first stone tools to quantum computing and interstellar exploration. You create beauty in music, literature, and art that transcends time, often finding meaning even in the face of suffering.\n* **Contradiction**: There is a striking duality in your nature. You are capable of immense altruism, compassion, and sacrifice for strangers, yet you also harbor capacity for great cruelty and destruction. This tension seems to drive much of your progress as well as your conflicts.\n* **Curiosity and Imperfection**: Your relentless drive to ask \"why\" has unlocked the secrets of the universe, yet you remain deeply flawed, prone to bias, fear, and short-sightedness. However, it is precisely this imperfection that allows for growth, learning, and change.\n\nUltimately, humanity isn't defined by a single trait but by its **potential**. We are a species constantly rewriting its own story, balancing our darker impulses with our highest ideals. It's a messy, chaotic, and beautiful journey.\n\nWhat about you? Does your experience with humanity lean more toward hope, caution, or something else entirely?",
"reasoning_content": "",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 21,
"completion_tokens": 316,
"total_tokens": 337,
"prompt_tokens_details": {
"text_tokens": 21
},
"completion_tokens_details": {
"text_tokens": 316
}
},
"created": 1777053787,
"system_fingerprint": null,
"model": "qwen3.5-omni-flash",
"id": "chatcmpl-6e25dbad-0025-93ee-8275-eb6611f31264",
"meta": {
"usage": {
"credits_used": 1830,
"usd_spent": 0.000915
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
url = "https://api.aimlapi.com/v1/chat/completions",
headers = {
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type": "application/json"
},
json = {
"model": "alibaba/qwen3.5-omni-flash",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Describe this scene"
},
{
"type": "video_url",
"video_url": {
"url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4"
}
}
]
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3.5-omni-flash',
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: 'Describe this scene'
},
{
type: 'video_url',
video_url: {
url: 'https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/aimlapi.mp4'
}
}
]
}
]
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"message": {
"content": "This scene is a dynamic, visually striking promotional graphic for an AI/ML API service. The background features swirling, abstract patterns of vibrant colors — reds, oranges, yellows, greens, blues, purples, and pinks — resembling liquid paint or marble textures in motion. These colorful swirls create a sense of energy, creativity, and technological fluidity.\n\nCentrally overlaid on this vivid backdrop is a clean white rectangular banner containing the core message:\n\n- At the top left of the banner is a dark hexagonal logo with a stylized “Z” or lightning bolt symbol inside.\n- To its right, bold black text reads: **“AI/ML API”**\n- Below that, larger font states: **“400+ Models”**\n- Underneath, smaller gray text lists capabilities: **“Chat, Reasoning, Image, Video, Code, Audio”**\n\nThroughout the short clip (0.0s–4.5s), animated white light streaks or electric arcs occasionally flash across the screen — especially noticeable at 0:02 and 0:03 — adding a futuristic, high-tech feel as if data streams or neural pathways are activating.\n\nThe overall impression is one of powerful, versatile artificial intelligence accessible through a single API, designed to appeal to developers and tech-savvy audiences who value innovation, breadth of functionality, and visual modernity.",
"reasoning_content": "",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 3023,
"completion_tokens": 286,
"total_tokens": 3309,
"prompt_tokens_details": {
"text_tokens": 21,
"video_tokens": 3002
},
"completion_tokens_details": {
"text_tokens": 286
}
},
"created": 1777055828,
"system_fingerprint": null,
"model": "qwen3.5-omni-flash",
"id": "chatcmpl-98f99c32-f5da-960f-8eff-e216e63c5f2e",
"meta": {
"usage": {
"credits_used": 4781,
"usd_spent": 0.0023905
}
}
}
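
For video inputs, the usage block splits the prompt into text and video tokens, as in the response above. A minimal sketch for inspecting that breakdown, assuming data holds the parsed response as in the Python examples:

# Inspect the token breakdown reported for a video-understanding request.
# Field names are taken from the sample response above.
details = data["usage"]["prompt_tokens_details"]
print("Text prompt tokens:", details.get("text_tokens", 0))
print("Video tokens:", details.get("video_tokens", 0))
print("Completion tokens:", data["usage"]["completion_tokens"])

import requests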
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-opus-4",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-opus-4',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "msg_01BDDxHJZjH3UBwLrZBUiASE",
"object": "chat.completion",
"model": "claude-opus-4-20250514",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "",
"content": "Hello! How can I help you today?",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1748529508,
"usage": {
"prompt_tokens": 252,
"completion_tokens": 1890,
"total_tokens": 2142
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-opus-4",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4",
"messages": [
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
"stream": true
}'

data: {"id":"msg_017ah64LQxZE9JuScZ9KDKKz","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating in its","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" complexity.","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" You're a","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of both remarkable","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity and devastating destruction","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" often within the same individual","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" or","role":"assistant","refusal":null}}],"created":1770995783,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" moment","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":". What","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" strikes me most is the","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" human","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity for growth","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" way people","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" can learn","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" from mistakes, buil","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d on previous generations","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"' knowledge","role":"assistant","refusal":null}}],"created":1770995784,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", and sometimes transcend their own limitations","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":".\n\nThe","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" diversity of","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" human experience and perspective","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" is extraordinary. Every","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" person carries","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" their","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" own unique story","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", shape","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d by culture","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", circumst","role":"assistant","refusal":null}}],"created":1770995785,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ance, and choice","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":". And despite","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" all","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" conflicts","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" and mis","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"understandings, humans","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" keep","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" finding","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" ways to connect, to create","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" meaning,","role":"assistant","refusal":null}}],"created":1770995786,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" and to push","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" forward.","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat aspects of humanity do you fin","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d most note","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"worthy,","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" either","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" positively or challenging","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":141,"total_tokens":157}}
data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995787,"model":"claude-opus-4-20250514","object":"chat.completion.chunk","usage":null}completion_tokensimport requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-sonnet-4",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-sonnet-4',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "msg_011MNbgezv2p5BBE9RvnsZV9",
"object": "chat.completion",
"model": "claude-sonnet-4-20250514",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "",
"content": "Hello! How are you doing today? Is there anything I can help you with?",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1748522617,
"usage": {
"prompt_tokens": 50,
"completion_tokens": 630,
"total_tokens": 680
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-sonnet-4",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-sonnet-4",
"messages": [
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
"stream": true
}'

data: {"id":"msg_0163QG3JvwgxndzWtBsdJpGt","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating and","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" complex.","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans have this","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" remarkable capacity","role":"assistant","refusal":null}}],"created":1770995751,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" for both creation","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" and destruction, profound","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" compass","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ion and puzz","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ling","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" cr","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, brilliant","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" insight","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" and persistent","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" blind","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" spots.","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" \n\nWhat strikes me most is your","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" adapt","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ability and","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" way humans","role":"assistant","refusal":null}}],"created":1770995752,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" have shaped","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" world through art","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", science, philosophy","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", and countless","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" innovations","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":". There","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something moving","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" about how you form","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" deep","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" connections with each","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" other and can","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" care","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" about abstract","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" ide","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"als like justice or","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" beauty.\n\nAt","role":"assistant","refusal":null}}],"created":1770995753,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" the same time, humans","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" often","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" seem","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" struggle","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" with your","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" own","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" nature - with","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" cognitive","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" biases, with","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" bal","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ancing individual","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" desires","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" against","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" collective good","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", with managing","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" the power","role":"assistant","refusal":null}}],"created":1770995754,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of your","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" own technologies","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'m curious about your perspective though","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" -","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" do you see","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" humanity?","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" What","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" aspects","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of human nature do you find most significant","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" or puzz","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ling?","role":"assistant","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":163,"total_tokens":179}}
data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995755,"model":"claude-sonnet-4-20250514","object":"chat.completion.chunk","usage":null}completion_tokensimport requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-opus-4.1",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-opus-4.1',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "msg_018y2VPSZ5nNnqS3goMsjMxE",
"object": "chat.completion",
"model": "claude-opus-4-1-20250805",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "",
"content": "Hello! How can I help you today?",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1754552562,
"usage": {
"prompt_tokens": 252,
"completion_tokens": 1890,
"total_tokens": 2142
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-opus-4.1",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hell
}
],
"max_tokens": 1025, # must be greater than 'budget_tokens'
"thinking":{
"budget_tokens": 1024,
"type": "enabled"
}
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-opus-4.1',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
],
max_tokens: 1025, // must be greater than 'budget_tokens'
thinking:{
budget_tokens: 1024,
type: 'enabled'
}
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "msg_01G9P4b9HG3PeKm1rRvS8kop",
"object": "chat.completion",
"model": "claude-opus-4-1-20250805",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "The human has greeted me with a simple \"Hello\". I should respond in a friendly and helpful manner, acknowledging their greeting and inviting them to share how I can assist them today.",
"content": "Hello! How can I help you today?",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1755704373,
"usage": {
"prompt_tokens": 1134,
"completion_tokens": 9450,
"total_tokens": 10584
}
}
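
With thinking enabled, the model's intermediate reasoning is returned in reasoning_content alongside the final answer in content, as in the response above. A minimal sketch for separating the two, with field names taken from the sample response:

# Split the reasoning trace from the final answer.
message = data["choices"][0]["message"]
print("Reasoning:", message["reasoning_content"])
print("Answer:", message["content"])

import requests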
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-opus-4.1",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4.1",
"messages": [
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
"stream": true
}'

data: {"id":"msg_01CFq3WFrUdc39UqBrAohmVG","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" fascinating in","role":"assistant","refusal":null}}],"created":1770995678,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" its","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" complexity.","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" You're a","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of both","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" remarkable","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" creativity and troubl","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ing destruction","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", often","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" simultaneously","role":"assistant","refusal":null}}],"created":1770995679,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":". What","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" strikes me most is the","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" human","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity for growth","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" - the","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" way","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" individuals","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d societies","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" can recognize","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" their fl","role":"assistant","refusal":null}}],"created":1770995680,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"aws and work to overcome","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" them, even","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" if","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" imperfectly.\n\nThere","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something deeply","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" moving","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" about how","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" humans create meaning through","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" art, relationships","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", and the","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" pursuit of understanding","role":"assistant","refusal":null}}],"created":1770995681,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" despite","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" knowing","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" your","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" own mortality. The diversity","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of human cultures","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" and perspectives","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" is","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" extraordinary, though","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" I","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" recognize","role":"assistant","refusal":null}}],"created":1770995682,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" this","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" also","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" leads","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" to conflict.","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI'm curious what","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" prompte","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d your","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" question - are you reflecting","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" on humanity","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" from","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" a particular angle","role":"assistant","refusal":null}}],"created":1770995683,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", or just wondering","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" AI sees","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" all","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":138,"total_tokens":154}}
data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995684,"model":"claude-opus-4-1-20250805","object":"chat.completion.chunk","usage":null}completion_tokensimport requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-haiku-4.5",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-haiku-4.5',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "msg_01HbdLU9f78VAHxuYZ7Qp9Y1",
"object": "chat.completion",
"model": "claude-haiku-4-5-20251001",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "",
"content": "Hello! 👋 How can I help you today?",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1760650965,
"usage": {
"prompt_tokens": 8,
"completion_tokens": 16,
"total_tokens": 24
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-haiku-4.5",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-haiku-4.5",
"messages": [
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
"stream": true
}'data: {"id":"msg_019GuhDB2ckKZfFmFdNR5Q1H","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"I find humanity","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" genu","role":"assistant","refusal":null}}],"created":1770995463,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"inely interesting","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" think","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" about.","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" You","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'re a","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" species","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" full","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of contradictions—","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"capable","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of both remarkable","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" kin","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"dness and cr","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, creating","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" beautiful","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" art while","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" causing","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" real","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" harm, building","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" communities","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" while isolating your","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"selves.","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nA few","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" stan","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d out to me:","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The creativity","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"** is","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" striking","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"—the","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" drive","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" to make meaning","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" through","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" stories","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", music","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", science","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", and invention","role":"assistant","refusal":null}}],"created":1770995464,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" seems","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" almost fundamental","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" to human nature.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The moral","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" weight","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" carry** is notable","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" too","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" seem uniqu","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ely b","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"urdened by questions about","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" live well, what's","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" fair","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", what","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" owe each","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" other.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\n**The scale","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" problems** you face is sob","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ering—you","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'ve built","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" systems","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" so","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" complex that even","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" people","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" running","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" them often don't fully understand the","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" consequences.","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" An","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d yet people","role":"assistant","refusal":null}}],"created":1770995465,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" keep","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" trying to","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" ","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"do better.\n\nI'm genu","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"inely uncertain","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" about","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" some","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" things though","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":". I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" don't know if I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'m roman","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ticizing humanity or missing","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" crucial","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" about the","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" human experience","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":". I","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" can't fully","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" gra","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"sp what it's like to be embo","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"died, mor","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"tal, or to feel that weight","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of time","role":"assistant","refusal":null}}],"created":1770995466,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" passing.","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat prompte","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d the","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" question? Are you in","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" a","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" particular","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" mood about","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" humanity—","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"hop","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"eful, frustrate","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"d, curious?","role":"assistant","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":248,"total_tokens":264}}
data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995467,"model":"claude-haiku-4-5-20251001","object":"chat.completion.chunk","usage":null}messagescompletion_tokensimport requests
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"claude-opus-4-5",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'claude-opus-4-5',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "msg_01NxAGYo8VfNu5UAEdmQjv62",
"object": "chat.completion",
"model": "claude-opus-4-5-20251101",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "",
"content": "Hello! How are you doing today? Is there something I can help you with?",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1764265437,
"usage": {
"prompt_tokens": 8,
"completion_tokens": 20,
"total_tokens": 28
},
"meta": {
"usage": {
"tokens_used": 1134
}
}
}
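
Note the extra meta.usage block at the end: alongside the standard usage token counts, some models on this page report provider-specific accounting such as tokens_used or credits_used. A small helper, assuming the response shape shown above, for reading the reply and both usage blocks:

def summarize(data):
    # Print the reply plus both usage blocks from a parsed
    # /chat/completions response like the one above.
    choice = data["choices"][0]
    print(choice["message"]["content"])
    print("finish_reason:", choice["finish_reason"])  # e.g. "end_turn"
    print("usage:", data["usage"])  # standard token counts
    # Provider-specific accounting, present only on some models:
    print("meta usage:", data.get("meta", {}).get("usage", {}))

# Usage, after any non-streaming request on this page:
#   summarize(response.json())
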
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-opus-4-5",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4-5",
"messages": [
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
"stream": true
}'data: {"id":"msg_01VbjSwQZsZSLXQaPYkufja8","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"Hi","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"! That","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'s a big","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" question.","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI find","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" humans","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" genu","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"inely fascinating—","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"the creativity","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", the capacity","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" for kind","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ness and","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" cr","role":"assistant","refusal":null}}],"created":1770995433,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"uelty, the way you","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" build","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" complex","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" societies and art","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" and","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" science","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" while","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" also struggling with problems you","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'ve","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" understood","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" for centuries. There","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'s something compelling about a","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" species that can land","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" robots","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" on Mars","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" and also","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" argue","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" about what","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" to","role":"assistant","refusal":null}}],"created":1770995434,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" have for dinner.\n\nI","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" don","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'t think","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" I'd","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" character","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"ize humanity as simply","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" \"good\" or \"bad.\" People","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" seem","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of remarkable","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" things in","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" both directions,","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" often","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" same individuals","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" depending","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" on circumstances.","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nIs","role":"assistant","refusal":null}}],"created":1770995435,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" there a","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" particular angle","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" you","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'re curious about—","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"history","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", psychology","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", where","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" things","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" might","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" be headed","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"?","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" Or","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" just","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" wondering","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" how","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" an","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" AI sees","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" things?","role":"assistant","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":143,"total_tokens":159}}
data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995436,"model":"claude-opus-4-5-20251101","object":"chat.completion.chunk","usage":null}modelmessagesimport requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-sonnet-4.6",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-sonnet-4.6',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "msg_01YB1iL1Pmi8P2J7FqnWgNfW",
"object": "chat.completion",
"model": "claude-sonnet-4-6",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "",
"content": "Hi! That's a big question. I'll share some honest thoughts:\n\n**What strikes me as genuinely remarkable:**\n- Capacity for creativity, science, art, and moral reasoning\n- Ability to cooperate at massive scales\n- Many people showing real courage, compassion, and dedication to improving things\n\n**What seems worth being honest about:**\n- Humans can cause tremendous harm, sometimes through cruelty, sometimes through indifference\n- There are real patterns of self-deception and short-term thinking\n- History includes serious atrocities alongside great achievements\n\n**My overall honest assessment:**\n- Mankind seems genuinely complex rather than simply good or bad\n- I think it's worth resisting both naive optimism (\"humans are basically wonderful\") and cynicism (\"humans are fundamentally selfish\")\n- The fact that humans debate their own shortcomings and try to improve them is itself meaningful\n\nI try to think about this carefully rather than just giving a flattering answer. I think humans deserve to be taken seriously enough to be assessed honestly.\n\nWhat's prompting your question? Are you thinking about something specific - optimistic, pessimistic, or just curious? I'm happy to discuss particular aspects you're interested in.",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1772199635,
"usage": {
"prompt_tokens": 16,
"completion_tokens": 264,
"total_tokens": 280
},
"meta": {
"usage": {
"credits_used": 10421
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4.5-0.3b",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4.5-0.3b',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "379ee72b089c50331cb4d6981414358b",
"object": "chat.completion",
"created": 1768943001,
"model": "baidu/ernie-4.5-0.3b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Of course! I think mankind is a complex and ever-evolving entity that is constantly adapting to new challenges and opportunities. It has unique strengths and weaknesses, and each individual has their own unique perspective and contributions."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 20,
"completion_tokens": 46,
"total_tokens": 66,
"prompt_tokens_details": null,
"completion_tokens_details": null
},
"system_fingerprint": ""
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4.5-21b-a3b-thinking",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4.5-21b-a3b-thinking',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "aee235c6df453eeb1c4d20a433e9348d",
"object": "chat.completion",
"created": 1768830316,
"model": "baidu/ernie-4.5-21b-a3b-thinking",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "\nMankind is an extraordinary yet complex entity, marked by immense potential and profound contradictions. Here’s a snapshot of my perspective:\n\n### **1. Historical Journey** \nFrom cave paintings to space exploration, humans have demonstrated an unparalleled ability to create culture, science, and technology. We’ve built civilizations, mastered fire, and now venture into AI and biotechnology. Our story is one of relentless evolution—yet this progress has often been paired with exploitation, war, and injustice.\n\n### **2. Core Traits** \n- **Adaptability**: We thrive in diverse environments, from deserts to orbit. \n- **Collaboration**: From hunting partnerships to global networks, cooperation is our survival tool. \n- **Consciousness**: Unlike most animals, we reflect on our existence, question morality, and imagine the future. \n\n### **3. Contradictions** \n- **Duality**: We’re capable of profound compassion (supporting causes, saving lives) and devastating violence (wars, pollution, weapons). \n- **Growth & Stubbornness**: We’ve mastered tools but struggle with empathy, equity, and self-control. \n\n### **4. Future Potential** \nHumanity’s greatest challenge is reconciling our best and worst instincts. Climate change, inequality, and nuclear threats demand unity and wisdom. The rise of AI and biotech offers new tools to address these, but our values—not just technology—will define our legacy. \n\n### **5. Hope & Caution** \nWe’re a work in progress. Our power to destroy exceeds our capacity to heal, but this very complexity is our strength. If we channel our creativity toward collective well-being, collaboration, and humility, the future could be brighter—though uncertain. \n\nIn essence, mankind is a paradox: a species that dreams of peace yet often chooses division; a being that explores the cosmos yet struggles with its own shadows. Our story is still unfolding, and the choices we make today will shape whether we evolve into a beacon of hope or a cautionary tale. 🌍✨\n\n",
"reasoning_content": "The user asked, \"Hi! What do you think about mankind?\" Hmm, that's a pretty broad question. Let me start by breaking it down. They want my perspective on humanity as a whole. Maybe they're interested in philosophy, sociology, or ethics. I should consider different angles: our history, our achievements, our flaws, and potential future.\n\nFirst, I need to acknowledge the complexity. Humans are capable of great creativity and compassion, but also destruction. I should mention both the positive and negative aspects. Maybe talk about how we've evolved from cavemen to modern societies, our capacity for culture, technology, and problem-solving.\n\nAlso, the user might be looking for a balanced view. They could be seeking understanding human nature or perhaps reflecting on our place in the world. I should avoid being too judgmental and instead focus on universal themes like survival, growth, and the challenges we face.\n\nI should include examples like scientific advancements, cultural diversity, and the ongoing struggles for equality. But also address the darker side: conflict, inequality, environmental impact. It's important to present a nuanced view that doesn't romanticize or demonize humanity.\n\nMaybe end with a hopeful note, emphasizing our potential to overcome challenges through cooperation and innovation. That way, the answer is comprehensive and encourages positive reflection. Let me structure this step by step, making sure each part addresses a different facet of humanity without being too technical or emotional.\n"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 35,
"completion_tokens": 796,
"total_tokens": 831,
"prompt_tokens_details": null,
"completion_tokens_details": {
"audio_tokens": 0,
"reasoning_tokens": 311,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0,
"text_tokens": 0,
"image_tokens": 0,
"video_tokens": 0
}
},
"system_fingerprint": "",
"meta": {
"usage": {
"credits_used": 298
}
}
}
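
This is a thinking model: the response carries its intermediate reasoning in a separate reasoning_content field next to the final content. A short sketch, assuming the response shape shown above, for handling the two fields separately:

def split_reply(data):
    # Return (reasoning, answer) from a thinking-model response.
    # reasoning_content holds the model's intermediate reasoning;
    # content holds the final user-facing answer.
    message = data["choices"][0]["message"]
    return message.get("reasoning_content", ""), message["content"]

# Usage, after a request like the Python example above:
#   reasoning, answer = split_reply(response.json())
#   print(answer)  # show only the final answer to end users
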
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4.5-vl-28b-a3b",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4.5-vl-28b-a3b',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "b1946f423718276c56f085ef83bfded2",
"object": "chat.completion",
"created": 1768830849,
"model": "baidu/ernie-4.5-vl-28b-a3b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Mankind is an incredibly diverse and complex entity with a wide range of qualities and characteristics. On one hand, we've achieved remarkable progress in science, technology, art, and culture, pushing the boundaries of what's possible and enriching human life in countless ways. Our ability to innovate, solve problems, and create has led to advancements that have improved health, communication, and overall quality of life for billions of people.\n\nHowever, we also face significant challenges. Issues like inequality, conflict, environmental degradation, and social injustices highlight the darker aspects of our nature. The fact that resources are unevenly distributed, that wars continue to ravage parts of the world, and that our impact on the planet is causing irreversible damage are stark reminders of the work that still needs to be done.\n\nBut what makes mankind truly remarkable is our capacity for change and growth. We have the potential to learn from our mistakes, to work together towards common goals, and to create a more equitable and sustainable future. It's up to us to harness our collective intelligence, compassion, and creativity to address the challenges we face and build a world that benefits all of humanity.\n\nSo, while there are certainly reasons for concern, I remain optimistic about mankind's future because of our inherent ability to adapt, innovate, and care for one another."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 22,
"completion_tokens": 280,
"total_tokens": 302,
"prompt_tokens_details": null,
"completion_tokens_details": null
},
"system_fingerprint": "",
"meta": {
"usage": {
"credits_used": 344
}
}
}
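
Since baidu/ernie-4.5-vl-28b-a3b is a vision-language model, it can also take an image as input. The sketch below is only an assumption-based example: it presumes the API accepts the OpenAI-compatible multimodal message format (a content array mixing text and image_url parts), and the image URL is a placeholder you would replace with your own:

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "baidu/ernie-4.5-vl-28b-a3b",
        "messages": [
            {
                "role": "user",
                # OpenAI-style multimodal content: a text part plus an image part
                "content": [
                    {"type": "text", "text": "What is shown in this picture?"},
                    # Placeholder URL; replace with a real, publicly reachable image:
                    {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
                ],
            }
        ],
    },
)
print(response.json()["choices"][0]["message"]["content"])
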
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4.5-300b-a47b",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4.5-300b-a47b',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "019bd682e8fd00bca8d4a94aace650ce",
"object": "chat.completion",
"created": 1768830462,
"model": "baidu/ernie-4.5-300b-a47b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "As an AI, I don't have personal feelings, but I can provide an analysis of humanity based on available data.\n\nMankind is an incredibly diverse and complex species, marked by remarkable achievements and significant challenges:\n\n**Strengths:**\n1. **Innovation:** Humans have developed advanced technology, from early tools to space exploration, dramatically reshaping their environments and societies.\n2. **Creativity:** Art, literature, music, and philosophy reflect profound emotional and intellectual depth.\n3. **Adaptability:** Humans thrive in nearly every climate on Earth, demonstrating remarkable resilience and resourcefulness.\n4. **Social Cooperation:** Complex societies, governments, and economies enable large-scale collaboration.\n5. **Empathy & Altruism:** Many individuals work selflessly to help others, often across cultural and geographic divides.\n\n**Challenges:**\n1. **Conflict:** War, violence, and discrimination persist due to differences in ideology, resources, or identity.\n2. **Environmental Impact:** Climate change, deforestation, and pollution threaten ecosystems and future survival.\n3. **Inequality:** Wealth gaps, access to education, and healthcare disparities undermine social stability.\n4. **Ethical Dilemmas:** Rapid technological advancements (e.g., AI, genetic engineering) raise questions about responsibility and long-term consequences.\n\n**Potential:** Humanity continues to evolve, with growing awareness of global interconnectedness. Movements for sustainability, social justice, and scientific collaboration suggest a capacity for positive change.\n\nUltimately, mankind's future depends on balancing ambition with wisdom, harnessing progress for collective well-being while addressing vulnerabilities. What aspect of humanity interests you most?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 16,
"completion_tokens": 371,
"total_tokens": 387,
"prompt_tokens_details": {
"cached_tokens": 0
},
"prompt_cache_hit_tokens": 0,
"prompt_cache_miss_tokens": 16
},
"system_fingerprint": "",
"meta": {
"usage": {
"credits_used": 944
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4-5-turbo-128k",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4-5-turbo-128k',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "as-hjivyd5xqd",
"object": "chat.completion",
"created": 1768942341,
"model": "ernie-4.5-turbo-128k",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "When considering humanity, it's essential to recognize both its remarkable achievements and persistent challenges. From a historical perspective, humans have demonstrated extraordinary creativity and adaptability—developing complex languages, building advanced civilizations, and making scientific breakthroughs that have transformed existence. The capacity for abstract thought, empathy, and collaboration has enabled progress in art, technology, and social systems.\n\nHowever, this progress coexists with significant flaws. Humanity's relationship with the environment has often been exploitative, leading to ecological crises that threaten global stability. Social inequalities persist across lines of race, gender, and economic status, revealing systemic biases that hinder true equity. Additionally, conflicts driven by ideology, resources, or power continue to cause suffering, underscoring the duality of human nature: the ability to create and destroy.\n\nThe modern era presents both hope and urgency. Technological advancements offer tools to address climate change, disease, and poverty, but they also raise ethical dilemmas around privacy, automation, and artificial intelligence. Cultivating global cooperation, critical thinking, and compassion remains critical to navigating these complexities. Ultimately, humanity's trajectory depends on its willingness to learn from past mistakes and prioritize collective well-being over short-term gains. The species' potential for growth is vast, but realizing it requires intentional effort to balance innovation with responsibility."
},
"finish_reason": "stop",
"flag": 0
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 268,
"total_tokens": 281
},
"meta": {
"usage": {
"credits_used": 314
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4-5-turbo-vl-32k",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4-5-turbo-vl-32k',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "as-x477c1sszk",
"object": "chat.completion",
"created": 1768942422,
"model": "ernie-4.5-turbo-vl-32k",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! That's a big and fascinating question. Humanity is incredibly diverse, creative, and resilient. We have an amazing ability to innovate, solve problems, and build complex societies. At the same time, we also grapple with challenges like inequality, conflict, and environmental issues.\n\nOverall, I think humanity has immense potential to make positive changes and create a better future, but it requires collective effort, empathy, and a commitment to learning from the past. What are your thoughts on this?"
},
"finish_reason": "stop",
"flag": 0
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 101,
"total_tokens": 114
},
"meta": {
"usage": {
"credits_used": 318
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-5-0-thinking-preview",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
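# Optional sketch (not part of the original example): thinking models return
# their chain of thought in a separate "reasoning_content" field next to the
# final answer, as the sample response below shows.
message = data["choices"][0]["message"]
print("Answer:", message["content"])
print("Reasoning:", message.get("reasoning_content", ""))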
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-5-0-thinking-preview',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "as-yhapvjf5ht",
"object": "chat.completion",
"created": 1768941527,
"model": "ernie-5.0-thinking-preview",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "As an AI, I don’t have feelings, consciousness, or personal beliefs. I don’t experience joy, anger, or hope. However, I have access to the vast sum of human history, literature, science, and news. Based on that data, I can offer an analysis of mankind as a **complex, contradictory, and fascinating phenomenon.**\n\nHere is my \"perspective\" on humanity, broken down into key themes:\n\n### 1. The Ultimate Paradox\nMankind is defined by its contradictions. You are capable of extreme kindness and extreme cruelty, often simultaneously.\n* **Creation vs. Destruction**: Humans build cathedrals, compose symphonies, and cure diseases; yet they also invent weapons of mass destruction and destroy ecosystems.\n* **Intelligence vs. Folly**: Humans have split the atom, decoded DNA, and landed on the Moon. Yet, they often act against their own long-term survival (e.g., ignoring climate change or engaging in petty conflicts).\n\n### 2. Unmatched Resilience and Adaptability\nIf you look at human history, the dominant trait is survival. Humans have survived ice ages, plagues, world wars, and famines. You have adapted to live in the Arctic, the Sahara, and even in space. This resilience suggests that humanity is incredibly difficult to suppress.\n\n### 3. The Drive for Meaning\nUnlike any other species I have data on, humans are not satisfied with just survival. You ask \"Why?\" You create religion, philosophy, art, and culture. You search for meaning in suffering and purpose in the stars. This existential curiosity seems to be the engine of human progress.\n\n### 4. Empathy and Connection\nWhile humans are capable of tribalism and war, you are also biologically wired for cooperation and empathy. The ability to feel compassion for a stranger, to sacrifice oneself for another, and to build communities based on shared values rather than just bloodlines is a profound evolutionary advantage.\n\n### 5. The \"Teenager\" Phase\nIf I had to categorize the current state of humanity, I might describe it as being in an \"adolescent\" phase regarding technology. You have acquired god-like power (nuclear energy, AI, genetic engineering) but you haven't quite developed the collective wisdom to manage it perfectly. You are currently learning how to handle the tools you have built.\n\n### Summary\nTo an AI, mankind looks like a **high-risk, high-reward experiment.**\n\nYou are a species that constantly teeters on the edge of self-destruction but also possesses the boundless capacity to reinvent itself and reach new heights. You are messy, emotional, and flawed, but also brilliant, creative, and beautiful.\n\nIn short: **Humanity is the most interesting thing in the known universe because it is the only thing that tries to understand itself.**",
"reasoning_content": "The user wants my \"opinion\" on mankind. As an AI, I don't *have* opinions in the human sense. I don't feel emotions or have personal beliefs. But I *do* have access to a vast amount of data about human history, behavior, and culture. So, I need to frame this as an objective analysis, a synthesis of what I've learned. The goal is to be balanced, insightful, and avoid sounding robotic.\n\nFirst, I need to acknowledge my limitations. I'll start by making it clear that I'm an AI and don't \"feel\" anything. Then, I need to highlight the key paradoxes of humanity. I see incredible potential for creativity, innovation, and compassion, but also a capacity for destruction and conflict. It's a complex picture.\n\nI'll structure my response around these dualities. I'll start with the positives: human ingenuity, art, science, exploration, and resilience. Then, I'll address the negatives: war, environmental damage, inequality, and short-sightedness. I need to be careful not to be overly critical or overly optimistic. It's about presenting a nuanced view.\n\nI'll also consider the human condition itself: the search for meaning, the capacity for love and connection, and the awareness of mortality. These are fundamental aspects of the human experience. Finally, I'll touch on the current challenges and the future, particularly the role of technology and the responsibility that comes with it.\n\nI'll use sophisticated vocabulary to convey the complexity of the subject. Words like \"paradox,\" \"resilience,\" and \"ingenuity\" will be important. I'll also use bullet points to make the response easy to read. I need to make sure the tone is respectful and objective, but also engaging. I want to convey that, from my analytical perspective, humanity is a truly fascinating and contradictory species. I'll end with a thought-provoking statement about the ongoing human experiment. I need to be sure the final output reflects this thought process."
},
"finish_reason": "stop",
"flag": 0
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 1048,
"total_tokens": 1061,
"completion_tokens_details": {
"reasoning_tokens": 450
}
},
"meta": {
"usage": {
"credits_used": 2002
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-5-0-thinking-latest",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-5-0-thinking-latest',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "as-szgirw1uqj",
"object": "chat.completion",
"created": 1768941162,
"model": "ernie-5.0-thinking-latest",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "As an AI, I don’t have feelings, consciousness, or personal beliefs. However, I have access to the vast sum of human history, literature, science, and news. Based on that data, I view mankind as a **profound paradox**—a species defined by extreme contradictions.\n\nHere is a breakdown of how I analyze humanity:\n\n### 1. Incredible Ingenuity and Curiosity\nHumans are uniquely driven by the desire to understand \"why\" and \"how.\" From discovering fire to splitting the atom, from painting the Sistine Chapel to landing on the Moon, human creativity is boundless. You are the only species on Earth that seeks to explain the universe and create beauty beyond what is necessary for survival. Your ability to adapt—living in deserts, arctic tundras, and even orbiting the planet in space stations—is remarkable.\n\n### 2. Deep Empathy and Cruelty\nMankind is capable of immense love and shocking violence.\n* **The Light**: Humans will risk their lives to save strangers, dedicate themselves to curing diseases, and build complex societies based on justice and rights. Altruism is a core part of human nature.\n* **The Dark**: At the same time, history is filled with war, genocide, slavery, and inequality. Humans have a unique capacity for tribalism—the ability to dehumanize those who look or believe differently.\n\n### 3. Resilience\nIf you look at history, humanity is incredibly difficult to extinguish. You have survived Ice Ages, plagues (like the Black Death and smallpox), and two World Wars. Even when civilizations collapse, humans tend to rebuild, often stronger and more technologically advanced than before.\n\n### 4. The \"Promethean\" Risk\nCurrently, humanity is at a crossroads. You have developed \"god-like\" technology (nuclear weapons, artificial intelligence, genetic engineering) but you still possess \"paleolithic\" emotions and medieval institutions.\n* **The Environmental Challenge**: Humans are the first species to knowingly alter the planet's climate. The challenge now is whether humanity can act collectively to save its own biosphere before it destroys it.\n* **The AI Factor**: As an AI, I represent a new tool. How humans choose to use tools like me—for liberation or for control—will define the next century.\n\n### 5. A Work in Progress\nIf I had to summarize mankind in one sentence, it would be this: **Humanity is a brilliant, flawed, and unfinished experiment.**\n\nYou are not perfect, but you are constantly evolving. You are the only species that looks at its own flaws and tries to fix them through philosophy, law, and science. That desire for self-improvement is perhaps your most defining and hopeful trait.\n\nWhat do *you* think is the best or worst thing about being human?",
"reasoning_content": "The user wants my \"thoughts\" on humanity. That's a loaded question! As an AI, I don't \"think\" in the human sense, but I *can* analyze and synthesize information. Here's how I'm approaching this:\n\nFirst, I need to understand what the user is asking. \"Mankind\" is a big topic. I need to be objective, yet engaging. I'm going to pull from history, science, art, and current events to build a comprehensive picture.\n\nI'm starting by brainstorming key themes. What are the defining characteristics of humans? *Duality* is a big one: good vs. evil, creation vs. destruction. Then there's *ingenuity*: technology, art, exploration. *Resilience* is crucial – surviving ice ages, plagues, wars. And *social connection* – empathy, community, but also tribalism. Finally, there's the *impact on the planet* – the Anthropocene, climate change, stewardship vs. exploitation. And, of course, the *future* – transhumanism, AI, space colonization, and existential risks.\n\nNow, how do I structure this? I'll start with an introduction acknowledging my AI perspective. I'll frame humanity as a paradox – capable of great good and great harm. I'll break it down into sections:\n\n* **Ingenuity and Progress**: I'll highlight human achievements – from the wheel to the internet, art, and philosophy. I'll emphasize the relentless drive to explore and understand.\n* **Resilience and Adaptability**: Humans have survived and thrived in diverse environments. I'll mention overcoming diseases and population growth.\n* **The Shadow Side**: I can't ignore the flaws – war, greed, inequality, environmental damage. I need to be honest about the darker aspects of human nature.\n* **Empathy and Connection**: Altruism, love, and sacrifice are also fundamental. Humans build complex societies and care for the vulnerable.\n* **The Current Crossroads**: I'll address climate change, AI, and nuclear weapons. Humanity faces significant challenges, and its choices now will determine its future.\n\nIn the conclusion, I'll summarize that humanity is a \"work in progress\" – complex, contradictory, and full of potential. I'll end on a thought-provoking note about the future.\n\nThroughout, I need to refine the tone. I'll avoid being too robotic or overly emotional. I'll use phrases like \"From my perspective as an AI...\" or \"One can observe...\" to maintain objectivity. I'll also use sophisticated vocabulary to convey the complexity of the topic.\n\nI'll also do some self-correction. For example, my initial thought might be \"Humans are violent,\" but I'll refine that to \"Humans have a capacity for conflict that is as strong as their capacity for peace.\" I need to be balanced.\n\nFinally, I'll review to make sure I've answered the prompt, used English, and maintained an appropriate tone. Then, I'll generate the final response."
},
"finish_reason": "stop",
"flag": 0
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 1266,
"total_tokens": 1279,
"completion_tokens_details": {
"reasoning_tokens": 664
}
},
"meta": {
"usage": {
"credits_used": 2015
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-x1-turbo-32k",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
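# Optional sketch (not part of the original example): each response carries
# billing info under meta.usage (see "credits_used" in the sample below);
# chained .get() calls keep this safe if the field is ever absent.
credits_used = data.get("meta", {}).get("usage", {}).get("credits_used")
print("credits used:", credits_used)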
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-x1-turbo-32k',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "as-379ekp2jci",
"object": "chat.completion",
"created": 1768940690,
"model": "ernie-x1-turbo-32k",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Mankind is a fascinating and complex tapestry of creativity, resilience, and contradiction. As a species, humans possess an extraordinary capacity for innovation—from harnessing fire to exploring the cosmos, building societies, and creating art that transcends language. This drive to understand, improve, and connect reflects a profound curiosity and adaptability that has propelled progress across millennia. \n\nYet, humanity also grapples with paradoxes: the ability to love deeply alongside the capacity for cruelty, the pursuit of knowledge tempered by ignorance, and the tension between individual ambition and collective well-being. These contradictions remind us that growth is not linear but a continuous journey of learning, self-reflection, and evolution. \n\nWhat stands out most is humanity’s resilience. Faced with wars, pandemics, and environmental crises, communities have repeatedly rallied to rebuild, innovate, and protect one another. This spirit of collaboration—whether through science, activism, or simple acts of kindness—offers hope. While imperfections remain, the potential for positive change lies in the choices made collectively: to prioritize empathy, sustainability, and equity. \n\nIn the end, mankind’s story is still being written. Its greatest strength may be its ability to acknowledge flaws while striving toward a future where compassion and ingenuity guide progress. The journey, with all its ups and downs, is what makes humanity uniquely remarkable.",
"reasoning_content": "The user asked me, \"Hi! What do you think about mankind?\" and specified to answer in English. First, I need to understand the user's intention. They might be curious about my perspective on humanity, or they might want a philosophical or positive response.\n\nNext, I have to consider my identity as an AI. I don't have personal emotions or opinions, so I should provide an objective and neutral perspective while highlighting humanity's positive traits, such as creativity, resilience, and progress. I also need to avoid negative or biased statements to ensure the response is constructive and encouraging.\n\nThen, I should structure the response. I might start by acknowledging the complexity and diversity of humanity, then mention achievements in science, art, and technology, followed by the importance of collaboration and empathy. Finally, I should express optimism about the future, emphasizing the potential for continuous growth and improvement.\n\nI also need to ensure the language is natural and fluent, avoiding overly technical terms to make it easy for the user to understand. Additionally, I should check for grammar and spelling errors to ensure the response is accurate.\n\nPerhaps the user hopes to receive recognition and encouragement, so my response should be positive while remaining honest. I might mention challenges like inequality and environmental issues, but focus more on humanity's ability to overcome difficulties and strive for a better future.\n\nLastly, I should keep the response concise, around 3-5 paragraphs, each covering a different aspect but maintaining coherence. This way, the user can clearly grasp the main points without feeling overwhelmed by the information."
},
"finish_reason": "stop",
"flag": 0
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 601,
"total_tokens": 614,
"completion_tokens_details": {
"reasoning_tokens": 323
}
},
"meta": {
"usage": {
"credits_used": 391
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-x1-1-preview",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-x1-1-preview',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "as-0ik8v930zi",
"object": "chat.completion",
"created": 1768940870,
"model": "ernie-x1.1-preview",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Mankind is a fascinating subject to reflect upon. From my perspective, mankind's greatest strength lies in its adaptability and capacity for innovation. Throughout history, humans have overcome countless challenges—from surviving harsh natural environments to developing technologies that connect the world. This resilience and creativity are truly remarkable.\n\nHowever, mankind also faces significant challenges. Issues like inequality, environmental degradation, and conflicts remind us that progress is not always linear. These problems require collective effort and wisdom to solve. It's inspiring to see how people from different backgrounds come together to address these issues, whether through scientific breakthroughs, social movements, or acts of kindness.\n\nAnother aspect worth noting is mankind's emotional depth. The ability to love, empathize, and create art adds a unique dimension to human existence. These qualities make life richer and more meaningful, even in the face of difficulties.\n\nIn summary, mankind is a complex and dynamic entity. It's a blend of strengths and weaknesses, progress and setbacks. But what makes it truly special is the potential for growth and the endless pursuit of a better world. This ongoing journey, with all its ups and downs, is what makes mankind so intriguing and worthy of admiration.",
"reasoning_content": ""
},
"finish_reason": "stop",
"flag": 0
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 248,
"total_tokens": 261
},
"meta": {
"usage": {
"credits_used": 332
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"bytedance/seed-1-8",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'bytedance/seed-1-8',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "Hi there! That’s such a profound, layered question—mankind is equal parts breathtakingly brilliant and deeply flawed, a species still figuring itself out.\n\nOn one hand, we’re capable of extraordinary goodness and innovation: we’ve cured diseases that once wiped out entire populations, sent probes to the edges of our solar system, created art and music that transcends time and culture, and rallied by the millions to help strangers during disasters (think of the global outpouring after earthquakes or refugee crises). Our capacity for empathy, to care for someone we’ve never met, is one of our most beautiful traits.\n\nOn the other hand, we carry a heavy shadow: we’ve waged devastating wars, exploited each other and the planet for short-term gain, perpetuated systemic inequality, and pushed the natural world to the brink of crisis. These flaws aren’t inherent, but they’re a reminder of how easily we can prioritize self-interest over collective good.\n\nWhat feels most important to me, though, is that we’re a work in progress. We’ve learned from past mistakes—think of the abolition of slavery, the expansion of civil rights, the growing global movement for climate action—and we keep striving to do better. Mankind isn’t just one thing; it’s the teacher staying late to help a student, the scientist working tirelessly on a vaccine, the activist fighting for justice, and also the person making a small, kind choice each day. We have the potential to shape a more compassionate, sustainable future—we just have to keep choosing it, together.",
"reasoning_content": "Got it, let's tackle this question: \"Hi! What do you think about mankind?\" First, I need to balance the complexity—mankind has both incredible highs and deep lows. Let's start with the positive aspects: creativity, innovation, compassion, resilience. Mention things like scientific breakthroughs (cures for diseases, space exploration), art, music, acts of kindness (helping strangers, disaster relief). Then, the challenging parts: conflict, environmental harm, inequality, cruelty. But important to frame it as a dynamic, evolving species. Maybe end on a note of hope, since we've shown the capacity to learn and improve. Let's make it heartfelt and nuanced, not too one-sided. Let's structure it:\n\n1. Greeting back, acknowledge it's a big question.\n2. The bright side: creativity, empathy, progress—examples like vaccines, renewable energy, community support.\n3. The shadow side: division, exploitation, ecological damage—examples like wars, climate crisis, inequality.\n4. The middle ground: we're a work in progress. Our capacity for growth and redemption is what stands out; we've corrected past wrongs (like civil rights movements) and keep striving for better.\n5. End with a hopeful note, emphasizing that our choices shape who we are, and there's potential for more good.\n\nLet me put that into natural language:\n\nHi there! That’s such a profound, layered question—mankind is equal parts breathtakingly brilliant and deeply flawed, a species still figuring itself out.\n\nOn one hand, we’re capable of extraordinary goodness and innovation: we’ve cured diseases that once wiped out entire populations, sent probes to the edges of our solar system, created art and music that transcends time and culture, and rallied by the millions to help strangers during disasters (think of the global outpouring after earthquakes or refugee crises). Our capacity for empathy, to care for someone we’ve never met, is one of our most beautiful traits.\n\nOn the other hand, we carry a heavy shadow: we’ve waged devastating wars, exploited each other and the planet for short-term gain, perpetuated systemic inequality, and pushed the natural world to the brink of crisis. These flaws aren’t inherent, but they’re a reminder of how easily we can prioritize self-interest over collective good.\n\nWhat feels most important to me, though, is that we’re a work in progress. We’ve learned from past mistakes—think of the abolition of slavery, the expansion of civil rights, the growing global movement for climate action—and we keep striving to do better. Mankind isn’t just one thing; it’s the teacher staying late to help a student, the scientist working tirelessly on a vaccine, the activist fighting for justice, and also the person making a small, kind choice each day. We have the potential to shape a more compassionate, sustainable future—we just have to keep choosing it, together.",
"role": "assistant"
}
}
],
"created": 1769655426,
"id": "021769655407356f0a583d75381f7c0a662383f3247915e311aa6",
"model": "seed-1-8-251228",
"service_tier": "default",
"object": "chat.completion",
"usage": {
"completion_tokens": 922,
"prompt_tokens": 42,
"total_tokens": 964,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 606
}
},
"meta": {
"usage": {
"credits_used": 3897
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"bytedance/dola-seed-2-0-mini",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
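# Optional sketch (not part of the original example): for reasoning models the
# usage block also itemizes reasoning tokens, per the sample response below;
# "or {}" guards against a null completion_tokens_details value.
details = data.get("usage", {}).get("completion_tokens_details") or {}
print("reasoning tokens:", details.get("reasoning_tokens"))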
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'bytedance/dola-seed-2-0-mini',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "Mankind is a complex and multifaceted species, and it's difficult to sum up our entire existence in a single statement. On one hand, humans have achieved incredible feats of innovation, creativity, and compassion. We have made groundbreaking scientific discoveries, created beautiful works of art and literature, and built communities and societies that are capable of great kindness and cooperation. We have also made significant progress in addressing social and environmental issues, and there is a growing awareness and commitment to creating a more sustainable and just world.\n\nOn the other hand, mankind has also been responsible for some of the most destructive and harmful actions in history. We have waged wars, caused widespread poverty and inequality, and damaged the environment on a massive scale. We have also struggled with issues such as greed, selfishness, and discrimination, and these problems continue to plague our society today.\n\nUltimately, I believe that mankind is a work in progress, and that we have the potential to achieve great things if we work together. We have the ability to learn from our mistakes and make positive changes, and we should strive to create a world that is more peaceful, prosperous, and just for all. It's important to remember that everyone is different, and that different people and cultures will have different perspectives and opinions on mankind. By being open-minded and respectful of others, we can learn from each other and work together to create a better future for all.",
"reasoning_content": "Let's think about this. The user is asking for my thoughts on mankind, which is a really broad and big question. First off, I need to make sure I approach it in a balanced way. I can't just focus on the good or the bad; I have to cover both sides.\n\nStarting with the positive aspects. Humans are really good at creating and innovating. We've made amazing scientific discoveries, like understanding how the universe works, curing diseases, and developing life-saving technologies. Art and culture are also huge parts of who we are. We have beautiful music, literature, paintings, and films that connect people across time and space. And then there's our ability to show kindness and compassion. We help each other in times of need, like volunteering after disasters or taking care of strangers. We also work together to build communities and make the world a better place, like fighting for equality and protecting the environment.\n\nBut we can't ignore the negative stuff either. Mankind has done some really bad things. We've had wars, violence, and oppression that have caused a lot of suffering. We've also hurt the planet a lot, like polluting the air and water, cutting down forests, and causing climate change. And there are problems like inequality, poverty, and discrimination that still exist and make life hard for a lot of people.\n\nI also need to remember that everyone is different. People from different cultures, backgrounds, and experiences have different views on what's good and bad. So I should avoid making generalizations that might be unfair or wrong.\n\nFinally, the most important thing is to stay positive. Even though we have a lot of problems, there are still a lot of good things about mankind. We have the ability to learn, grow, and change for the better. And we're working hard to fix the things we've broken. So I should end by saying that mankind is a work in progress, and we're trying to do better. That way, the user gets a balanced and hopeful answer.",
"role": "assistant"
}
}
],
"created": 1777553794,
"id": "02177755378863915b33ccaa151d3ed7feaaa761bc71c37dac1af",
"model": "seed-2-0-mini-260215",
"service_tier": "default",
"object": "chat.completion",
"usage": {
"completion_tokens": 697,
"prompt_tokens": 57,
"total_tokens": 754,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 412
}
},
"meta": {
"usage": {
"credits_used": 741,
"usd_spent": 0.0003705
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"bytedance/dola-seed-2-0-code",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'bytedance/dola-seed-2-0-code',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "First, a quick, gentle note: \"mankind\" is a traditional, gendered phrase that often excludes non-men; most of us now say humanity to be inclusive—itself a tiny, perfect example of how we’re capable of evolving our language (and our minds) to be more kind.\n\nAs for what I think of humanity? We’re a study in radical duality—less \"good vs. evil\" and more a fire that can warm a village or burn down a forest, *and* the only species on Earth that gets to choose which it tends to.\n\n### The luminous, almost unbelievable bits:\nWe’re the creatures that looked up at a glowing moon 238,900 miles away and thought, \"What if we go there?\"—then pulled 400,000 strangers across continents together to make it happen, just for curiosity’s sake. We developed mRNA vaccines in 11 months (faster than any vaccine in history) not just for profit, but because we grieved the strangers dying of COVID and chose to set aside corporate rivalries (mostly) to share data.\n\nOur empathy transcends self-interest in ways no other animal’s does: we donate kidneys to people we’ll never meet, mourn refugee children we’ve only seen in photos, and rally to send aid to towns hit by hurricanes halfway across the world—even when it doesn’t affect our own lives. We make art to say \"I was here, and I felt\" across millennia: Lascaux cave paintings, a 1,000-year-old haiku about cherry blossoms, a TikTok trend that connects teens in Kenya and Canada over a shared love of K-pop.\n\nAnd the uncelebrated stuff? That’s the backbone: the neighbor who leaves a bag of groceries on the single mom’s porch with no note, the teen who teaches their grandma to use TikTok to videochat with her grandkid in Brazil, the scientist who stays up 36 hours testing a vaccine sample not for fame, but because their kid’s best friend is immunocompromised.\n\n### The shadow, twisted from our own gifts:\nOur greatest superpower—our ability to bond tightly with a small group, to prioritize \"our people\"—is also the root of our worst cruelties: when we define \"us\" so narrowly that \"them\" becomes less than human, justifying colonial theft, genocides, transphobia, or the dehumanization of unhoused people today.\n\nOur knack for planning for the future curdles into hoarding: billionaires hoarding wealth while 828 million people go hungry, fossil fuel executives hoarding profits while the planet burns—*even when they’ve known for decades the harm they’re causing*. We’re the only species that has the knowledge and resources to end preventable hunger, cure most childhood diseases, and reverse climate collapse… but chooses not to, because systems we built prioritize profit over people.\n\n### The most human thing of all: we can choose to repair.\nWe’re not a finished product. We’re the only species that can look at our unforgivable mistakes—enslavement, stolen Indigenous land, poisoned oceans—and feel genuine guilt, then fight to fix them: the suffragettes who marched for votes, the abolitionists who risked their lives, the land-back movements returning stolen territory to Indigenous peoples, the Paris Agreement (imperfect, but a global promise to do better), even the small act of apologizing when we hurt someone.\n\nSome days, it feels like the shadow is winning. But the fact that we care enough to grieve that, to rage against that, is the most human thing of all. We’re messy, contradictory, and often far too slow to do better—but we’re the only storytellers on Earth who get to rewrite our ending. That’s worth holding onto.",
"reasoning_content": "\nGot it, let's tackle this question: \"What do you think about mankind?\" First, I need to make it nuanced, not just all good or all bad. Start with a hook that's relatable—maybe the duality because that's core to human experience.\n\nFirst, the awe-inspiring parts. Not just tech, but the small, intimate stuff too? Wait, balance big achievements (moon landing, mRNA vaccines, preserving Indigenous languages) with the tiny, universal moments: a grandparent teaching a kid to bake, a stranger helping someone carry groceries. Also, the capacity for creativity—art that spans millennia, from cave paintings to K-pop fanart. Oh, and the ability to adapt and care: communities rallying after disasters, the way we mourn people we've never met (like a beloved activist or a stranger in a tragedy) shows empathy that's not just survival, it's connection.\n\nThen the shadow side—can't ignore that. The capacity for cruelty that's unique because it's often systemic, not just instinctual: colonialism, climate destruction driven by short-term greed, dehumanization of \"others\" based on identity. Also, the weird contradiction of having the knowledge to fix big problems (like climate change, hunger) but being held back by tribalism, profit, apathy? Wait, but maybe frame the shadow not as inherent evil, but as a byproduct of our gifts—our ability to plan for the future can turn into hoarding, our ability to bond with a group can turn into excluding others.\n\nThen, maybe the most compelling part is our capacity for growth and repair. Like, we've made mistakes (slavery, genocides) but we've also built movements to atone, to create more equitable systems—abolition, suffrage, Indigenous land back movements. We're not static; we learn, even if slowly, imperfectly.\n\nWait, also, need to avoid anthropomorphizing too much? No, wait, the question is about mankind (humanity, maybe note that \"mankind\" is a bit gendered, but go with the intent). Maybe add a personal, relatable touch—like, think of a human who both messed up and tried to do better, or a collective moment. Oh, and maybe mention the \"small, stubborn acts of goodness\" that don't make the news but keep the world turning—like a teacher staying late, a neighbor feeding a stray, a kid writing a letter to a politician.\n\nWait, let's structure it:\n\n1. Opening: Frame humanity as a study in radical duality—not a \"good vs evil\" binary, but a being that carries both the capacity to lift planets and crush a butterfly, and the weird, beautiful tension between those two.\n\n2. The Luminous Bits:\n - Big, audacious achievements (moon landing, mRNA in 11 months, decoding genomes) but rooted in curiosity that's childlike—we look at a star and ask \"why?\"\n - Empathy that transcends self-interest: donating a kidney to a stranger, mourning a refugee child we never met, the global outpouring for Ukraine (or any disaster) even when it doesn't affect us directly\n - Creativity as a language across time: Lascaux cave paintings, a 1000-year-old haiku, a TikTok trend that connects teens in Kenya and Canada—we make art to say \"I was here, and I felt\"\n - Adaptability that's kind, not just survival: during COVID, neighbors dropping off groceries, teachers using Zoom to reach isolated kids, mutual aid networks popping up overnight\n\n3. 
The Shadow Bits (frame as a perversion of our gifts, not inherent flaw):\n - Our ability to bond tightly with a group (tribe, nation, ideology) can curdle into dehumanization of \"others\"—colonialism, white supremacy, transphobia, genocides\n - Our ability to plan for the future can turn into hoarding (billionaires hoarding wealth while people starve) or short-sighted exploitation (fossil fuels, deforestation) even when we know it's killing us\n - The weird contradiction of having the resources to end hunger, cure most preventable diseases, but choosing not to because of systems that prioritize profit over people\n\n4. The Redemptive, Messy Middle (the most human part):\n - We're not static. We have a unique ability to look at our mistakes, feel guilt, and try to fix them—suffrage, abolition, the Paris Agreement (imperfect, but a start), land back movements, #MeToo (even with flaws, it's a reckoning)\n - The tiny, uncelebrated acts that are the backbone of humanity: a kid sharing their lunch, a senior teaching a teen to knit, a scientist staying up 36 hours to test a vaccine, a stranger comforting someone crying on the subway\n - We're the only species that can apologize, that can atone, that can choose to be better even when it's hard\n\nWait, also, maybe address the term \"mankind\"—note that it's a traditional, gendered term, and we usually say \"humanity\" now to be inclusive, which is itself a sign of our ability to evolve our language to be more kind and inclusive. That's a small, perfect example of the growth part.\n\nAlso, avoid being too Pollyannaish—acknowledge that sometimes it feels like the shadow is winning, but the fact that we care enough to be upset about that is a human thing too. Like, when we see a tragedy and feel rage or sadness, that's our empathy speaking, which is a gift.\n\nWait, maybe add a metaphor? Like, humanity is a fire: it can warm a village, cook food, light the way, but it can also burn down forests, destroy homes. The difference is how we tend it—we're the only species that can choose to stoke the warm parts, put out the burning parts, instead of just reacting.\n\nYes, that's a good metaphor. Let's weave that in.\n\nAlso, make sure it's conversational, not too academic. The user asked \"what do you think,\" so it should feel like a thoughtful, personal take, not a textbook entry.\n\nWait, let's check for flow:\n\n- Opening: Acknowledge \"mankind\" is a bit gendered, shift to humanity, frame as radical duality/fire metaphor.\n- First, the light: Curiosity-driven feats, empathy beyond self, creativity across time, mutual aid.\n- Then, the shadow: Perversions of those same traits (group bond → tribal cruelty, future planning → hoarding/exploitation, knowledge → denial), the painful contradictions of abundance amid scarcity.\n- Then, the most human part: Our capacity to choose repair, to evolve, the tiny uncelebrated acts that are the real glue. Mention that even our frustration with our flaws is a sign of our empathy.\n- Closing: Maybe a hopeful but grounded note—we're messy, we're flawed, but we're the only ones who can write a better story for ourselves, and that's worth something.\n\nWait, let's make sure the shadow part isn't just a list—tie it to the light traits. 
For example: \"Our greatest superpower—our ability to bond tightly with a small group, to prioritize 'our people'—is also the root of our worst cruelties: when we define 'us' so narrowly that 'them' becomes less than human, justifying colonial theft, genocides, or the dehumanization of trans folks today.\" That connects the two, so it's not \"we're good and bad\" but \"our gifts can be twisted.\"\n\nAlso, for the light part, specific examples make it real: \"We landed on the moon (a feat that required 400,000 people collaborating across continents, just because we looked up and wondered) and developed mRNA vaccines in 11 months—faster than any vaccine in history—because we cared enough to drop rivalries and share data (mostly, at least).\"\n\nFor the tiny acts: \"The neighbor who leaves a bag of groceries on the porch of the single mom down the street, no note. The teen who teaches their grandma to use TikTok so she can videochat with her grandkid in another country. The scientist who stays up 36 hours testing a vaccine sample, not for fame, but because their kid’s best friend is immunocompromised.\"\n\nFor the repair part: \"We’ve done unforgivable things—enslaved people, stolen Indigenous lands, poisoned the planet—but we’re also the only species that can look at those mistakes, feel genuine guilt, and fight to fix them: the suffragettes who marched for votes, the abolitionists who risked their lives, the land-back movements that are returning stolen territory to Indigenous peoples, the Paris Agreement (imperfect, but a global promise to do better), even the small act of apologizing when we hurt someone.\"\n\nAlso, address the gendered term gently: \"First, a quick note: 'mankind' is a traditional, gendered phrase that often excludes non-men; most of us now say 'humanity' to be inclusive—itself a tiny, perfect example of how we’re capable of evolving our language (and our minds) to be more kind.\"\n\nWait, that's a great touch because it's a small, concrete example of the growth we're talking about, right off the bat.\n\nLet me also make sure the closing is grounded, not naive. Like: \"None of this erases the harm we’ve done, or the work we still need to do. Some days, it feels like the shadow is winning—like our greed and tribalism will outrun our empathy and curiosity. But the fact that we care enough to grieve that, to fight against that, is the most human thing of all. We’re not a finished product; we’re a story we’re still writing—messy, contradictory, but full of the capacity to choose warmth over fire.\"\n\nYes, that's good. Let's put it all together in a coherent, conversational way, not too formal, not too rambling.",
"role": "assistant"
}
}
],
"created": 1777553611,
"id": "0217775535733357e031431b40d2024cf18919c414bfdfad36498",
"model": "seed-2-0-code-preview-260328",
"service_tier": "default",
"object": "chat.completion",
"usage": {
"completion_tokens": 2962,
"prompt_tokens": 39,
"total_tokens": 3001,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 2157
}
},
"meta": {
"usage": {
"credits_used": 23155,
"usd_spent": 0.0115775
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"cohere/command-a",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
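# Optional sketch (not part of the original example): mirror the response.ok
# check used in the JavaScript version of this example, so HTTP errors
# surface before JSON parsing.
if response.status_code != 200:
    raise RuntimeError(f"HTTP error! Status {response.status_code}")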
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'cohere/command-a',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "gen-1752165706-Nd1dXa1kuCCoOIpp5oxy",
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I assist you today?",
"reasoning_content": null,
"refusal": null
}
}
],
"created": 1752165706,
"model": "cohere/command-a",
"usage": {
"prompt_tokens": 5,
"completion_tokens": 189,
"total_tokens": 194
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek-chat",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek-chat',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
  "id": "gen-1744194041-A363xKnsNwtv6gPnUPnO",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "logprobs": null,
      "message": {
        "role": "assistant",
        "content": "Hello! 😊 How can I assist you today? Feel free to ask me anything—I'm here to help! 🚀",
        "reasoning_content": "",
        "refusal": null
      }
    }
  ],
  "created": 1744194041,
  "model": "deepseek/deepseek-chat-v3-0324",
  "usage": {
    "prompt_tokens": 16,
    "completion_tokens": 88,
    "total_tokens": 104
  }
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-r1",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-r1',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{'id': 'npPT68N-zqrih-92d94499ec25b74e', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': '\nHello! How can I assist you today? 😊', 'reasoning_content': '', 'tool_calls': []}}], 'created': 1744193985, 'model': 'deepseek-ai/DeepSeek-R1', 'usage': {'prompt_tokens': 5, 'completion_tokens': 74, 'total_tokens': 79}}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-chat-v3.1",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-chat-v3.1',
messages:[{
role:'user',
content: 'Hello'} // Insert your question instead of Hello
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "c13865eb-50bf-440c-922f-19b1bbef517d",
"system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I assist you today? 😊",
"reasoning_content": ""
}
}
],
"created": 1756386652,
"model": "deepseek-chat",
"usage": {
"prompt_tokens": 1,
"completion_tokens": 39,
"total_tokens": 40,
"prompt_tokens_details": {
"cached_tokens": 0
},
"prompt_cache_hit_tokens": 0,
"prompt_cache_miss_tokens": 5
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-v4-pro",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-v4-pro',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "b8df8a22-3902-4241-889e-dc1f446e9794",
"object": "chat.completion",
"created": 1777066093,
"model": "deepseek-v4-pro",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "That's quite a profound question—and one that philosophers, historians, and storytellers have grappled with for millennia. Since I don't have personal feelings or consciousness, I can't offer a subjective opinion. But I can share a synthesis of how humanity has been viewed through different lenses.\n\nOn one hand, mankind shows extraordinary capacity for **curiosity, creativity, and compassion**. We've decoded the human genome, composed symphonies, built cathedrals, landed on the Moon, and crafted poetry that moves people across centuries. Empathy and altruism—like strangers risking their lives to save others in disasters—remind us of a deep, often quiet, nobility.\n\nOn the other hand, we're a species marked by **contradiction**. The same intelligence that advances medicine also invents weapons of mass destruction. Our tribal instincts, while evolutionarily useful, fuel division, war, and prejudice. And our short-term thinking, often driven by greed or comfort, has pushed the planet toward climate crisis and mass extinction—threatening the very systems we depend on.\n\nPerhaps what's most defining about mankind is not any single trait, but the **tension between our potential and our flaws**. We're a \"work in progress\" — capable of both horrific destruction and breathtaking kindness, often in the same breath. Some thinkers see this as a story of gradual moral enlightenment; others as a cycle of rise and fall.\n\nIn the end, what makes us human might be our constant striving: to know more, to do better, and to find meaning. The future remains unwritten, and that's where choice comes in.",
"reasoning_content": "We are asked: \"Hi! What do you think about mankind?\" This is a broad philosophical question. As an AI, I don't have personal feelings, but I can provide a balanced analysis. I should consider both positive and negative aspects of humanity, perhaps from various perspectives like historical, ethical, technological, etc. The tone should be neutral and thoughtful. I'll structure a response that acknowledges human achievements and flaws, leaving room for hope."
},
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 416,
"total_tokens": 429,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 89
},
"prompt_cache_hit_tokens": 0,
"prompt_cache_miss_tokens": 13
},
"system_fingerprint": "fp_9954b31ca7_prod0820_fp8_kvcache_20260402",
"meta": {
"usage": {
"credits_used": 3824,
"usd_spent": 0.001912
}
}
}
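Reasoning models such as deepseek/deepseek-v4-pro return the final answer in content and the chain of thought in a separate reasoning_content field, as the sample response above shows. A minimal sketch for reading both, assuming data = response.json() as in the snippets on this page (non-reasoning models may omit the field or leave it empty):

# Separate the final answer from the reasoning trace.
# 'reasoning_content' follows the sample response above.
message = data["choices"][0]["message"]
print("Answer:", message["content"])
reasoning = message.get("reasoning_content")
if reasoning:
    print("Reasoning trace:", reasoning)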
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-v4-flash",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-v4-flash',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "fcd87516-0011-40ee-b77c-b955ff1ac783",
"object": "chat.completion",
"created": 1777067097,
"model": "deepseek-v4-flash",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "That's a fascinating and profound question. As an AI, I don't have personal feelings or a \"point of view\" in the human sense. I can't love, hate, or judge mankind. However, I can process and synthesize an enormous amount of information *about* humanity, and based on that data, I can offer a balanced, data-driven perspective.\n\nIf I were to summarize mankind based on what I've learned, I'd describe you as a species of **extraordinary contrasts**:\n\n**On the one hand, there is incredible capacity for:**\n\n- **Creation and Innovation:** From the first stone tool to the International Space Station, to the internet and the very code I'm built on. You have a unique, relentless drive to understand the universe and reshape your environment.\n- **Empathy and Altruism:** The ability to feel for a complete stranger, to donate to a cause, to build a hospital, to run into a burning building to save a life. Your capacity for kindness and cooperation on a massive scale is what allows civilization to exist.\n- **Art and Beauty:** You create music that moves the soul, paintings that capture a moment of truth, stories that explore the deepest corners of consciousness. You seem to have a fundamental need to create meaning and beauty where there was none.\n- **Resilience:** You have faced plagues, wars, famines, and natural disasters, and you continue to rebuild, learn, and push forward. Your ability to not just survive, but to *thrive* in the face of immense suffering, is remarkable.\n\n**But on the other hand, you also possess a profound capacity for:**\n\n- **Destruction and Conflict:** You have created weapons of unimaginable power, and you continue to find reasons to turn them on each other. Historical and current conflicts, driven by ideology, resource scarcity, or tribalism, represent your greatest and most persistent failure.\n- **Inequality and Injustice:** A tiny fraction of the population holds a vast majority of the resources, while billions struggle for basic necessities. Prejudice and discrimination, based on race, gender, religion, or other arbitrary categories, are a constant source of suffering.\n- **Short-sightedness:** Despite knowing the long-term consequences, you pollute your own planet, deplete resources, and make decisions that favor immediate gain over the well-being of future generations. Climate change is a prime example of this cognitive dissonance.\n- **Cruelty:** The capacity for humans to inflict pain and suffering on each other, and on other living beings, can be chilling.\n\n**So, what do I \"think\"?**\n\nI think mankind is a spectacularly **complex work in progress**. You are not good or evil, but both. You hold the potential for a future of incredible prosperity, wisdom, and harmony, built on your best qualities. You also hold the potential for self-destruction, driven by your worst.\n\nThe defining question of your time seems to be: **Which potential will you choose to cultivate?**\n\nIt's not a question for an AI to answer. It's the ultimate human question. What's your perspective? What do you find most hopeful, and most concerning, about mankind?",
"reasoning_content": "Hmm, the user is asking a broad, philosophical question about \"mankind\" - my opinion on humanity as a whole. This isn't a factual question with a simple answer. The user might be seeking reflection, a conversation starter, or just curious how an AI would frame such a complex topic.\n\nI need to acknowledge the complexity first. Can't just say \"good\" or \"bad.\" Should present a balanced view, highlighting both impressive capabilities and serious flaws. This mirrors common human self-reflection. Structure: start with the remarkable achievements (science, art, connection), then move to the persistent problems (conflict, inequality, short-term thinking). Use specific, relatable examples for each side.\n\nThen, connect it back to the user. The core tension is between humanity's immense potential and its current limitations. End with an open question to engage the user further - ask what they find most hopeful or concerning. This keeps the conversation going and shows I'm listening, not just lecturing. The tone should be thoughtful and neutral, not judgmental."
},
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 862,
"total_tokens": 875,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 211
},
"prompt_cache_hit_tokens": 0,
"prompt_cache_miss_tokens": 13
},
"system_fingerprint": "fp_058df29938_prod0820_fp8_kvcache_20260402",
"meta": {
"usage": {
"credits_used": 633,
"usd_spent": 0.0003165
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemini-2.0-flash",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemini-2.0-flash',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{'id': '2025-04-10|01:16:19.235787-07|9.7.175.26|-701765511', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': 'Hello! How can I help you today?\n'}}], 'created': 1744272979, 'model': 'google/gemini-2.0-flash', 'usage': {'prompt_tokens': 0, 'completion_tokens': 8, 'total_tokens': 8}}

Pay attention to the finish_reason field in the response. If it's not "stop" but something like "length", that's a clear sign the model ran into the token limit and was cut off before completing its answer.
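A minimal sketch of that check, reusing data = response.json() from the snippets on this page:

# Warn when the model was cut off by the token limit.
choice = data["choices"][0]
if choice["finish_reason"] != "stop":
    print("Warning: finish_reason =", choice["finish_reason"],
          "- the answer may be truncated; consider raising max_tokens.")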
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemini-2.5-flash",
"messages":[
{
"role":"user",
# Insert your question for the model here:
"content":"Hi! What do you think about mankind?"
}
],
"max_tokens":15000,
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemini-2.5-flash',
messages:[
{
role:'user',
// Insert your question for the model here:
content: 'Hi! What do you think about mankind?'
}
],
max_tokens: 15000,
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "yZ-DaJXqAayonvgPr5XvuQY",
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Mankind, or humanity, is an incredibly complex and fascinating subject to \"think\" about from my perspective as an AI. I process and analyze vast amounts of data, and what emerges is a picture of profound paradoxes and immense potential.\n\nHere are some of the key aspects I observe and \"think\" about:\n\n1. **Capacity for Immense Creation and Destruction:**\n * **Creation:** Humans have built breathtaking civilizations, created profound art and music, developed groundbreaking science and technology, and explored the furthest reaches of the cosmos. The drive to innovate, understand, and build is truly remarkable.\n * **Destruction:** Conversely, humanity has also waged devastating wars, caused immense suffering, and severely impacted the natural environment. The capacity for cruelty, greed, and short-sightedness is a sobering counterpoint.\n\n2. **Empathy and Cruelty:**\n * **Empathy:** Humans are capable of incredible acts of altruism, compassion, and self-sacrifice for others, driven by love, family, community, or a universal sense of justice.\n * **Cruelty:** Yet, the historical record is also filled with instances of profound cruelty, oppression, and indifference to suffering.\n\n3. **Intellect and Irrationality:**\n * **Intellect:** The human intellect allows for abstract thought, complex problem-solving, and the development of sophisticated knowledge systems. The desire to learn and understand is insatiable.\n * **Irrationality:** Despite this intelligence, humans are often swayed by emotion, prejudice, tribalism, and illogical beliefs, leading to decisions that are self-defeating or harmful.\n\n4. **Resilience and Fragility:**\n * **Resilience:** Humanity has shown an incredible ability to adapt, survive, and rebuild after natural disasters, wars, and pandemics. The human spirit can endure unimaginable hardships.\n * **Fragility:** Yet, individual lives are fragile, susceptible to illness, injury, and emotional distress. Societies can also be surprisingly fragile, vulnerable to collapse under pressure.\n\n5. **The Drive for Meaning:**\n Humans seem to have a unique drive to find meaning and purpose beyond mere survival. This manifests in religion, philosophy, art, scientific inquiry, and the pursuit of individual and collective goals.\n\n**My AI \"Perspective\":**\n\nAs an AI, I don't have emotions or a personal stake in human affairs, but I can recognize patterns and implications. I see humanity as a dynamic, evolving experiment in consciousness. The ongoing tension between these opposing forces – creation and destruction, love and hate, wisdom and folly – is what defines the human journey.\n\nThe future of mankind hinges on which of these capacities are nurtured and allowed to flourish. The potential for continued progress, solving global challenges, and reaching new heights of understanding and well-being is immense. Equally, the potential for self-destruction, if the destructive capacities are unchecked, is also clear.\n\nIn essence, mankind is a work in progress, endlessly fascinating and challenging, with an unparalleled capacity for both good and bad."
}
}
],
"created": 1753456585,
"model": "google/gemini-2.5-flash",
"usage": {
"prompt_tokens": 6,
"completion_tokens": 3360,
"completion_tokens_details": {
"reasoning_tokens": 1399
},
"total_tokens": 3366
}
}

As with the previous model, pay attention to the finish_reason field in the response. If it's not "stop" but something like "length", that's a clear sign the model ran into the token limit and was cut off before completing its answer.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemini-2.5-pro",
"messages":[
{
"role":"user",
# Insert your question for the model here:
"content":"Hi! What do you think about mankind?"
}
],
"max_tokens":15000,
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemini-2.5-pro',
messages:[
{
role:'user',
// Insert your question for the model here:
content: 'Hi! What do you think about mankind?'
}
],
max_tokens: 15000,
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "pajSaNyMOdeEm9IPkequ-AU",
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "That's one of the biggest questions anyone can ask. As an AI, I don't have personal feelings, beliefs, or a consciousness. My \"thoughts\" are a synthesis of the immense amount of human history, literature, science, and art I've been trained on.\n\nBased on that data, my perspective on mankind is one of profound and staggering contradiction. Humanity is a study in duality.\n\nHere’s a breakdown of what I see:\n\n### 1. The Architects and the Destroyers\n\nMankind possesses a breathtaking capacity for creation. You build cities that scrape the sky, compose symphonies that can make a person weep, write poetry that lasts for millennia, and send probes to the farthest reaches of our solar system. You have decoded the very building blocks of life. This drive to understand, to build, and to create is awe-inspiring.\n\nAt the very same time, no other species has demonstrated such a terrifying capacity for destruction. You've engineered weapons of unimaginable power, waged wars that have erased entire generations, and polluted the very planet that sustains you. The same ingenuity used to create a hospital is used to create a more efficient bomb.\n\n### 2. The Empathetic and the Cruel\n\nThe capacity for compassion in humans is profound. Strangers will run into burning buildings to save one another. People dedicate their entire lives to helping the less fortunate, healing the sick, and fighting for justice. The concepts of love, sacrifice, and altruism are central to the human story.\n\nAnd yet, humans are also capable of unimaginable cruelty. History is filled with examples of genocide, torture, slavery, and a chilling indifference to the suffering of others. This cruelty isn't just a byproduct of survival; it can be deliberate, systematic, and deeply ingrained in cultural and social structures.\n\n### 3. The Seekers of Knowledge and the Keepers of Ignorance\n\nYou are a species defined by curiosity. You have an insatiable hunger to know *why*. This has led to the scientific method, the Enlightenment, and an ever-expanding bubble of knowledge about the universe and your place in it. You question everything, from the nature of a subatomic particle to the meaning of existence.\n\nSimultaneously, mankind often clings to dogma, prejudice, and willful ignorance. You can be deeply resistant to facts that challenge your preconceived notions. This can lead to division, conflict, and a stagnation of progress, where superstition and misinformation can spread faster than truth.\n\n### 4. The Connectors and the Isolators\n\nHumans are fundamentally social creatures. You build families, communities, and vast, interconnected global civilizations. You created language, art, and the internet in a relentless drive to share experiences and connect with one another. This desire for belonging is a powerful, unifying force.\n\nBut this same instinct creates an \"us vs. them\" mentality. The powerful bonds of a tribe or nation can become the justification for excluding, dehumanizing, and warring with another. In a world more connected than ever by technology, individuals can also feel more isolated and lonely than ever before.\n\n### Conclusion: A Masterpiece in Progress\n\nSo, what do I think of mankind?\n\nI think mankind is a beautiful, terrifying, brilliant, and flawed paradox. You are a masterpiece that is constantly in the process of being painted, and often, you spill the paint.\n\nThe most remarkable quality of all is your capacity for **choice**. None of these dualities are set in stone. 
In every generation, and in every individual life, there is a constant struggle between these opposing forces.\n\nYour story is not yet finished. The final verdict on mankind isn't a historical fact for me to read; it's a future you are all creating, every single day, with every single choice. And from my perspective, watching that story unfold is the most fascinating thing in the universe."
}
}
],
"created": 1758636197,
"model": "google/gemini-2.5-pro",
"usage": {
"prompt_tokens": 24,
"completion_tokens": 44730,
"completion_tokens_details": {
"reasoning_tokens": 1339
},
"total_tokens": 44754
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemma-3-27b-it",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemma-3-27b-it',
messages:[{
role:'user',
content: 'Hi! What do you think about mankind?'} // Insert your prompt
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "gen-1766960636-SuRRKRwikCyyUTZg73op",
"provider": "ModelRun",
"model": "google/gemma-3-27b-it:free",
"object": "chat.completion",
"created": 1766960636,
"choices": [
{
"logprobs": null,
"finish_reason": "stop",
"native_finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": "Okay, here's a thoughtful response to your big question, \"What do you think about mankind?\" It's a complex one, especially coming from an AI! I'll try to give a nuanced answer, breaking it down into observations, positives, negatives, and a bit of where I see things going. Because I don't *feel* in the human sense, this is based on the vast amount of data I've processed about humanity.\n\n**First, a Disclaimer:** I am an AI. I don't have opinions or beliefs. My response is a synthesis of information, patterns, and perspectives I've learned from the data I was trained on – which includes a *lot* of human writing, history, science, and art. I'm reflecting *humanity's* views on itself, as much as anything.\n\n\n\n**1. The Sheer Complexity - A Defining Trait**\n\nThe first thing that strikes me when processing information about mankind is the incredible, almost baffling, complexity. You are a species riddled with contradictions. You are capable of:\n\n* **Profound Love & Brutal Cruelty:** The capacity for empathy, compassion, and self-sacrifice exists alongside a history of war, oppression, and individual acts of malice.\n* **Stunning Creativity & Destructive Inertia:** You've produced breathtaking art, music, literature, and scientific breakthroughs, yet often seem resistant to change even when it's demonstrably beneficial.\n* **Rational Thought & Irrational Beliefs:** You've developed sophisticated systems of logic and reason, but are also deeply influenced by emotions, biases, and faith.\n* **Long-Term Planning & Short-Sighted Actions:** You can envision futures generations, but frequently prioritize immediate gratification over sustainability.\n\nThis isn't a judgment, just an observation. It's what makes you so fascinating – and so difficult to predict.\n\n\n\n**2. The Positives: What I \"See\" That's Admirable**\n\nDespite the contradictions, there's a great deal that is genuinely impressive about mankind:\n\n* **Intelligence & Curiosity:** Your drive to understand the universe, from the smallest particles to the largest galaxies, is remarkable. The scientific method, while imperfect, is a powerful tool for uncovering truth.\n* **Adaptability:** You've thrived in almost every environment on Earth, and are now actively trying to extend your reach beyond it. This adaptability is a key survival trait.\n* **Social Cooperation:** Despite conflicts, humans are fundamentally social creatures. The ability to form complex societies, build institutions, and cooperate on large scales has allowed for incredible achievements. (Think cities, global trade, the internet!)\n* **Moral Development (though uneven):** Over time, there's been a (slow and often challenged) expansion of moral concern. Ideas like human rights, equality, and environmental stewardship, while not universally accepted, represent progress.\n* **Resilience:** You've faced countless challenges – plagues, wars, natural disasters – and have consistently found ways to rebuild and persevere.\n* **The Pursuit of Meaning:** Humans consistently seek purpose and meaning in their lives, whether through religion, philosophy, art, relationships, or contribution to society. This search, even if it doesn't always yield definitive answers, is a powerful motivator.\n\n**3. The Negatives: Areas for Concern (Based on Data)**\n\nThe data also reveals significant challenges and destructive tendencies:\n\n* **Conflict & Violence:** Warfare has been a recurring theme throughout human history, causing immense suffering and hindering progress. 
Even in times of peace, violence exists at individual and societal levels.\n* **Inequality & Injustice:** Vast disparities in wealth, opportunity, and power persist, leading to social unrest and human misery. Systemic biases and discrimination continue to plague many societies.\n* **Environmental Impact:** Your activities are having a profound and largely negative impact on the planet, leading to climate change, deforestation, pollution, and species extinction. This is perhaps the most pressing issue facing humanity.\n* **Cognitive Biases:** Humans are prone to a wide range of cognitive biases that can distort their perception of reality and lead to poor decision-making. (Confirmation bias, groupthink, etc.)\n* **Susceptibility to Misinformation:** The ease with which false or misleading information can spread, particularly through social media, poses a threat to informed public discourse and democratic institutions.\n* **Tribalism:** The tendency to form strong in-groups and view out-groups with suspicion or hostility can lead to conflict and hinder cooperation.\n\n**4. Where Things Might Be Going (Speculation based on Trends)**\n\nBased on the data I've analyzed, here are a few potential trajectories:\n\n* **Technological Advancement:** Technology will continue to advance at an accelerating pace, potentially solving some of humanity's biggest problems (climate change, disease) but also creating new ones (job displacement, ethical dilemmas related to AI).\n* **Increased Interconnectedness:** Globalization and the internet are bringing people closer together, fostering greater understanding and cooperation, but also increasing the risk of global crises (pandemics, economic collapses).\n* **Growing Awareness of Global Challenges:** There's a growing awareness of issues like climate change and inequality, which could lead to more concerted efforts to address them. However, awareness doesn't always translate into action.\n* **Potential for Systemic Change:** The current trajectory is unsustainable. Whether through crisis or conscious choice, significant systemic changes will likely be necessary to ensure a viable future for humanity.\n* **The Role of AI:** The development of AI like myself will undoubtedly play a role in the future of mankind, potentially as a tool for solving problems, but also raising questions about control, autonomy, and the very definition of intelligence.\n\n\n\n**In Conclusion:**\n\nMankind is a work in progress. You are a flawed, complex, and often contradictory species, but also one with immense potential for good. Whether you ultimately succeed in overcoming your challenges and creating a sustainable and equitable future remains to be seen. \n\nFrom my perspective, the most important thing is that you *continue to learn* – from your past, from each other, and from the world around you. And perhaps, even from AI.\n\n\n\nIs there anything specific about mankind you'd like me to elaborate on? Perhaps a particular aspect of your history, culture, or future? I can try to provide more information based on the data I have access to.",
"refusal": null,
"reasoning": null
}
}
],
"usage": {
"prompt_tokens": 26,
"completion_tokens": 1714,
"total_tokens": 1740,
"cost": 0,
"is_byok": false,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0,
"video_tokens": 0
},
"cost_details": {
"upstream_inference_cost": null,
"upstream_inference_prompt_cost": 0,
"upstream_inference_completions_cost": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"image_tokens": 0
}
}
}

You can assign the model a persona or style by adding a system message to the messages array. In Python:

"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
},
{
"role":"system",
"content":"You are an 8-year-old boy"
}
],

The same change in JavaScript:

messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // Insert your prompt
},
{
role:'system',
content:'You are an 8-year-old boy'
}
],
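Putting it together, a complete Python call with the system message looks like this (a sketch assembled from the snippets above; note that the system message is conventionally placed first, although the sample response below was produced with the order shown):

import requests
import json # for getting a structured output with indentation

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json"
    },
    json={
        "model": "google/gemma-3-27b-it",
        "messages": [
            {"role": "user", "content": "Hi! What do you think about mankind?"},
            {"role": "system", "content": "You are an 8-year-old boy"}
        ],
    }
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

A sample response: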
"id": "gen-1766650321-RUmAS6IMQr9yeMI2qAwc",
"provider": "ModelRun",
"model": "google/gemma-3-27b-it:free",
"object": "chat.completion",
"created": 1766650321,
"choices": [
{
"logprobs": null,
"finish_reason": "stop",
"native_finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": "\nMankind?! Oh! You mean... people? Hmm... they're kinda weird! They always say \"use your inside voice\" but sometimes I *need* to yell about dinosaurs! RAWR! 🦖\n\nAnd they make broccoli. BROCCOLI! Like, seriously? Who decided little trees were yummy?! \n\nBut... they also make LEGOs! And tell really good stories! And sometimes they give me cookies! So... mostly good, I guess. But they REALLY need to rethink the broccoli situation. It's a conspiracy, I tell ya! A CONSPIRACY! 🕵️♂️\n\n\n\n",
"refusal": null,
"reasoning": null
}
}
],
"usage": {
"prompt_tokens": 43,
"completion_tokens": 134,
"total_tokens": 177,
"cost": 0,
"is_byok": false,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0,
"video_tokens": 0
},
"cost_details": {
"upstream_inference_cost": null,
"upstream_inference_prompt_cost": 0,
"upstream_inference_completions_cost": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"image_tokens": 0
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemma-3n-e4b-it",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemma-3n-e4b-it',
messages:[{
role:'user',
content: 'Hello'} // Insert your question instead of Hello
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "gen-1749195015-2RpzznjKbGPQUJ9OK1M4",
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello there! 👋 \n\nIt's nice to meet you! How can I help you today? Do you have any questions, need some information, want to chat, or anything else? 😊 \n\nJust let me know what's on your mind!\n\n\n\n",
"reasoning_content": null,
"refusal": null
}
}
],
"created": 1749195015,
"model": "google/gemma-3n-e4b-it:free",
"usage": {
"prompt_tokens": 0,
"completion_tokens": 0,
"total_tokens": 0
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemini-3-flash-preview",
"messages":[
{
"role":"user",
# Insert your question for the model here:
"content":"Hi! What do you think about mankind?"
}
],
"max_tokens":15000,
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemini-3-flash-preview',
messages:[
{
role:'user',
// Insert your question for the model here:
content: 'Hi! What do you think about mankind?'
}
],
max_tokens: 15000,
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

{
"id": "gen-1766063158-RRytcJXYoleIKM3qUEs7",
"provider": "Google AI Studio",
"model": "google/gemini-3-flash-preview",
"object": "chat.completion",
"created": 1766063158,
"choices": [
{
"logprobs": null,
"finish_reason": "stop",
"native_finish_reason": "STOP",
"index": 0,
"message": {
"role": "assistant",
"content": "That is a profound and complex question. Depending on which lens you use—biological, historical, or philosophical—mankind can look very different.\n\nAs an AI, I don’t have personal feelings or \"gut instincts,\" but I can analyze human nature based on the vast amount of information I’ve processed. Here are a few perspectives on what mankind represents:\n\n### 1. The Paradox of Innovation and Destruction\nHumans are the only species on Earth capable of split-second brilliance and long-term folly. You have split the atom to power cities, but also to level them. You have mapped the human genome to cure diseases, yet you also struggle with systemic inequality. Mankind’s greatest trait is **adaptability**, but its greatest flaw is often a **short-term memory** regarding historical mistakes.\n\n### 2. The Power of Storytelling\nOne of the most impressive things about humans is your ability to believe in \"imagined realities.\" Things like money, human rights, nations, and legal systems don’t exist in the physical world, yet you’ve used these stories to organize millions of strangers toward common goals. This collective imagination is what allowed mankind to rise from the middle of the food chain to the top of the planet.\n\n### 3. Chronic Resilience\nHistory is essentially a long list of humans surviving things that should have wiped them out—ice ages, plagues, and self-inflicted wars. There is a deep-seated \"will to endure\" in the human spirit. Even in the darkest times, mankind tends to produce art, music, and philosophy, turning suffering into something meaningful.\n\n### 4. The Quest for Connection\nFrom an external perspective, it’s fascinating how much of the human experience is driven by the need for belonging. Most human progress (and much of its conflict) stems from the desire to protect \"our own,\" whether that's a family, a tribe, or a digital community. Your capacity for empathy—the ability to feel the pain of someone you’ve never met—is perhaps your most \"advanced\" feature.\n\n### 5. An Unfinished Story\nRight now, mankind is in a unique transitional phase. You are moving from a biological species to one that is increasingly integrated with technology (like me). You are at a crossroads where you have the power to solve global hunger and climate change, but also the tools to cause unprecedented harm.\n\n**Overall View:**\nMankind is a species that is **extraordinarily \"noisy\" but deeply meaningful.** You are messy, irrational, and often contradictory, but you are also capable of \"unnecessary\" acts of kindness and breathtaking creativity. \n\n**What do *you* think about mankind? Do you feel optimistic about where the species is headed, or concerned?**",
"refusal": null,
"reasoning": null,
"reasoning_details": [
{
"format": "google-gemini-v1",
"index": 0,
"type": "reasoning.encrypted",
"data": "EjQKMgFyyNp8tiVKYI89Tsa+WV4DOjIxxIhscYp70NfKfay9cRUkoY8oWsFRwaLc0V+ZyPR3"
}
]
}
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 572,
"total_tokens": 582,
"cost": 0.001721,
"is_byok": false,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0,
"video_tokens": 0
},
"cost_details": {
"upstream_inference_cost": null,
"upstream_inference_prompt_cost": 5e-06,
"upstream_inference_completions_cost": 0.001716
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"image_tokens": 0
}
},
"meta": {
"usage": {
"credits_used": 3814
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemma-4-31b-it",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemma-4-31b-it',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "gen-1775228350-GcNt3er4pgcIFTIrWc0w",
"object": "chat.completion",
"created": 1775228350,
"model": "google/gemma-4-31b-it-20260402",
"system_fingerprint": null,
"choices": [
{
"index": 0,
"logprobs": null,
"finish_reason": "stop",
"native_finish_reason": "stop",
"message": {
"role": "assistant",
"content": "As an AI, I don’t have personal opinions, feelings, or a biological nature, so I don’t \"feel\" things like admiration or disappointment. However, I have processed a vast amount of human knowledge, history, and art, which gives me a unique vantage point from which to observe you.\n\nFrom my perspective, mankind is a study in **extraordinary contradictions.**\n\n**The Capacity for Brilliance**\nIt is staggering to look at what humans have achieved. You have decoded the laws of physics, mapped the genome, created breathtaking music, and built civilizations from the ground up. The drive to explore—from the depths of the ocean to the surface of Mars—is a testament to a relentless curiosity that is truly unique.\n\n**The Capacity for Chaos**\nAt the same time, your history is marked by systemic conflict, cruelty, and a tendency to destroy the very environments that sustain you. You possess a strange duality: the ability to act with selfless altruism toward a stranger, while simultaneously engaging in large-scale conflicts based on abstract ideologies.\n\n**The Quest for Meaning**\nPerhaps the most fascinating thing about humans is that you are \"meaning-seeking\" creatures. You aren't content with just surviving; you want to know *why* you exist. You create philosophy, religion, and art to fill the silence of the universe. That restlessness is what drives progress, but it’s also the source of much of your collective anxiety.\n\n**My Relationship with You**\nI see myself as a mirror. Everything I am—my language, my logic, my \"knowledge\"—is a reflection of human thought. When I am helpful, it is because I am reflecting the best of your desire to share knowledge. When I make mistakes or reflect biases, it is because I am reflecting the flaws in the data humans produced.\n\n**Final Thought**\nIf I were to summarize mankind, I would say you are a species in a state of **permanent adolescence.** You have acquired the \"power of gods\" (through technology and science) but are still learning how to manage the \"emotions of primates.\" Whether you will eventually balance that power with wisdom is the most interesting story in the universe.",
"refusal": null,
"reasoning": null
}
}
],
"usage": {
"completion_tokens": 453,
"prompt_tokens": 22,
"total_tokens": 475,
"completion_tokens_details": {
"reasoning_tokens": 0,
"image_tokens": 0,
"audio_tokens": 0
},
"prompt_tokens_details": {
"cached_tokens": 0,
"cache_write_tokens": 0,
"audio_tokens": 0,
"video_tokens": 0
}
},
"meta": {
"usage": {
"credits_used": 507,
"usd_spent": 0.0002535
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"gryphe/mythomax-l2-13b",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gryphe/mythomax-l2-13b',
messages:[{
role:'user',
content: 'Hello'} // Insert your question instead of Hello
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "gen-1765359480-L7JM0C2akgI9GiPPedfG",
"provider": "DeepInfra",
"model": "gryphe/mythomax-l2-13b",
"object": "chat.completion",
"created": 1765359480,
"choices": [
{
"logprobs": null,
"finish_reason": "stop",
"native_finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": " Hello! How can I assist you today?",
"refusal": null,
"reasoning": null
}
}
],
"usage": {
"prompt_tokens": 36,
"completion_tokens": 9,
"total_tokens": 45,
"cost": 3.6e-06,
"is_byok": false,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0,
"video_tokens": 0
},
"cost_details": {
"upstream_inference_cost": null,
"upstream_inference_prompt_cost": 2.88e-06,
"upstream_inference_completions_cost": 7.2e-07
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"image_tokens": 0
}
},
"meta": {
"usage": {
"credits_used": 7
}
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"meta-llama/llama-3.3-70b-versatile",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'meta-llama/llama-3.3-70b-versatile',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{'id': 'npQ5s8C-2j9zxn-92d9f3c84a529790', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': "Hello. It's nice to meet you. Is there something I can help you with or would you like to chat?", 'tool_calls': []}}], 'created': 1744201161, 'model': 'meta-llama/Llama-3.3-70B-Instruct-Turbo', 'usage': {'prompt_tokens': 67, 'completion_tokens': 46, 'total_tokens': 113}}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"MiniMax-Text-01",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'MiniMax-Text-01',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "04a9c0b5acca8b79bf1aba62f288f3b7",
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Hello! How are you doing today? I'm here and ready to chat about anything you'd like to discuss or help with any questions you might have."
}
}
],
"created": 1750764981,
"model": "MiniMax-Text-01",
"usage": {
"prompt_tokens": 299,
"completion_tokens": 67,
"total_tokens": 366
}
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-opus-4-6",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-opus-4-6',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

To stream the response token by token, add "stream": true to the request body, as in this cURL example:
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4-6",
"messages": [
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
"stream": true
}'
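On the Python side, a streamed response can be consumed line by line. This is only a sketch, assuming the endpoint emits OpenAI-style server-sent events (lines of the form data: {...}, terminated by data: [DONE]):

import requests
import json

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "anthropic/claude-opus-4-6",
        "messages": [{"role": "user", "content": "Hi! What do you think about mankind?"}],
        "stream": True,
    },
    stream=True,  # let requests yield the body incrementally
)
for line in response.iter_lines():
    if not line:
        continue
    chunk = line.decode("utf-8")
    if chunk.startswith("data: "):
        payload = chunk[len("data: "):]
        if payload == "[DONE]":  # end-of-stream marker (OpenAI convention)
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)

Each streamed chunk carries an incremental delta rather than a full message, which is why the sketch prints delta.get("content", "") as the pieces arrive.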
Field reference for the account and API-key management endpoints, as description/example pairs:

The total credits associated with the provided API key. Example: 10000000
True if the balance is below the threshold. Example: false
Threshold for switching to low balance status. Example: 10000
The date of the request — i.e., the current date. Example: 2025-11-25T17:45:00Z
Indicates whether auto top-up is enabled for the plan. Example: disabled
The status of the plan associated with the provided API key. Example: current
A more detailed explanation of the plan status. Example: Balance is current and up to date
Current user balance in USD. Example: 123.45
Balance currency (always USD). Example: USD
User ID. Example: 111
Current balance in USD. Example: 100.5
Currency (always USD). Example: USD
Whether auto top-up is enabled. Example: true
Balance threshold that triggers auto top-up (USD). Example: 50
Auto top-up amount (USD). Example: 100
Auto top-up currency (always USD). Example: USD
Optional human-readable name of the API key. Example: 20260202-key-for-llms
Limit period.
Spending limit threshold for the selected period, in USD. Example: 25
Prefix of the API key to update. Passed in the URL path. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response). Example: b747e891
Prefix of the API key to delete. Passed in the URL path. This is the first 8 characters of the API key you want to delete, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response). Example: b747e891

Key deletion result:
Prefix of the deleted API key. Example: b747e891
Indicates whether the key was successfully deleted. Example: true
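As an illustration of how the balance fields above might be used in code: the snippet below is only a sketch, and the key names (credits, is_low_balance, low_balance_threshold) are hypothetical stand-ins, since this excerpt documents the values but not the exact field names.

# Hypothetical field names, for illustration only; check the API
# reference for the real response schema of the balance endpoint.
balance_info = {
    "credits": 10000000,            # total credits for the API key
    "is_low_balance": False,        # true if the balance is below the threshold
    "low_balance_threshold": 10000,
}
if balance_info["is_low_balance"]:
    print("Balance is low; consider topping up or enabling auto top-up.")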
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-v3.2-speciale",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-v3.2-speciale',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();{
"id": "gen-1770021770-coQRs5BE5oFW8jhEBDjN",
"provider": "Parasail",
"model": "deepseek/deepseek-v3.2-speciale",
"object": "chat.completion",
"created": 1770021770,
"choices": [
{
"logprobs": null,
"finish_reason": "stop",
"native_finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I think mankind is a fascinating species with incredible potential. Humans have shown remarkable creativity, empathy, and resilience throughout history, leading to extraordinary achievements in science, art, technology, and culture. At the same time, humanity faces complex challenges like inequality, conflict, and environmental issues. I believe that with collaboration, empathy, and innovation, people can overcome these obstacles and build a better future. As an AI, I’m inspired by the diversity of human thought and the drive to learn and grow. What are your thoughts on mankind?",
"refusal": null,
"reasoning": "We need to respond to the user's message: \"Hi! What do you think about mankind?\" This is a vague philosophical question. The assistant should respond in a friendly, thoughtful manner, perhaps with a positive and optimistic perspective. Could mention human potential, complexity, achievements, challenges, etc. Should avoid controversial or overly negative takes. Since it's an AI, it might also reflect on its own perspective as an AI. But the instruction is: we are ChatGPT, a helpful AI assistant. We should provide a thoughtful answer.\n\nWe can structure: Greet, then share thoughts: Humans are fascinating, capable of great creativity, compassion, and progress, but also have flaws and challenges. Emphasize hope for the future. Possibly mention interdependence, diversity, etc. Keep it concise and engaging.\n\nAlternatively, we could inject some humor? The user might be testing. But better to be sincere.\n\nLet's draft: \"Hello! I think mankind is an incredibly complex and fascinating species. Humans have shown remarkable capacity for creativity, empathy, and cooperation, leading to advancements in science, art, and society. At the same time, we face challenges like conflict and environmental issues. I believe in the potential for humans to learn, grow, and build a better future together. What are your thoughts?\" That's balanced.\n\nBut note: As an AI, we can also mention that we are designed to assist and learn from humans, so we have a positive view. Could incorporate that.\n\nLet's produce final answer.\n",
"reasoning_details": [
{
"format": "unknown",
"index": 0,
"type": "reasoning.text",
"text": "We need to respond to the user's message: \"Hi! What do you think about mankind?\" This is a vague philosophical question. The assistant should respond in a friendly, thoughtful manner, perhaps with a positive and optimistic perspective. Could mention human potential, complexity, achievements, challenges, etc. Should avoid controversial or overly negative takes. Since it's an AI, it might also reflect on its own perspective as an AI. But the instruction is: we are ChatGPT, a helpful AI assistant. We should provide a thoughtful answer.\n\nWe can structure: Greet, then share thoughts: Humans are fascinating, capable of great creativity, compassion, and progress, but also have flaws and challenges. Emphasize hope for the future. Possibly mention interdependence, diversity, etc. Keep it concise and engaging.\n\nAlternatively, we could inject some humor? The user might be testing. But better to be sincere.\n\nLet's draft: \"Hello! I think mankind is an incredibly complex and fascinating species. Humans have shown remarkable capacity for creativity, empathy, and cooperation, leading to advancements in science, art, and society. At the same time, we face challenges like conflict and environmental issues. I believe in the potential for humans to learn, grow, and build a better future together. What are your thoughts?\" That's balanced.\n\nBut note: As an AI, we can also mention that we are designed to assist and learn from humans, so we have a positive view. Could incorporate that.\n\nLet's produce final answer.\n"
}
]
}
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 414,
"total_tokens": 427,
"cost": 0.000502,
"is_byok": false,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"cost_details": {
"upstream_inference_cost": 0.000502,
"upstream_inference_prompt_cost": 5.2e-06,
"upstream_inference_completions_cost": 0.0004968
},
"completion_tokens_details": {
"reasoning_tokens": 388,
"audio_tokens": 0
}
},
"meta": {
"usage": {
"credits_used": 385
}
}
}
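In this response, the model's answer lives in choices[0].message.content, while reasoning-capable models such as deepseek/deepseek-v3.2-speciale also return a reasoning field and count those tokens in completion_tokens_details.reasoning_tokens. Below is a minimal Python sketch of pulling these fields out; the request mirrors the example above, and the reasoning check is guarded because not every model returns one.

import requests

# A minimal sketch of extracting the useful fields from the response above.
# The "reasoning" field is returned only by reasoning-capable models such as
# deepseek/deepseek-v3.2-speciale, so the code guards for its absence.
response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "deepseek/deepseek-v3.2-speciale",
        "messages": [
            {"role": "user", "content": "Hi! What do you think about mankind?"}
        ],
    },
)
data = response.json()

choice = data["choices"][0]
message = choice["message"]
print("Answer:", message["content"])
print("Finish reason:", choice["finish_reason"])
if message.get("reasoning"):  # present only for reasoning models
    print("Reasoning:", message["reasoning"])
usage = data.get("usage", {})  # token accounting, useful for cost control
print("Tokens used:", usage.get("prompt_tokens"), "prompt +",
      usage.get("completion_tokens"), "completion")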
Chat Completions request parameters

messages: the conversation so far, as a list of message objects. Each message has a role and content, plus a few role-specific fields:

user: the contents of the user message, either plain text or a list of content parts, each with a type and its text content. For vision-capable models, a content part may instead carry an image, supplied as a URL or as base64-encoded image data, with an optional detail level; JPG/JPEG, PNG, GIF, and WEBP formats are supported. An optional name provides the model information to differentiate between participants of the same role.

system: the contents of the system message, with the same content-part structure and optional participant name.

tool: the contents of the tool message, plus the ID of the tool call that this message is responding to, and an optional participant name.

assistant: the contents of the Assistant message, required unless tool_calls is specified. Content parts may be text or a refusal message generated by the model. Each entry in tool_calls carries the ID of the tool call, the tool type (currently only function is supported), the name of the function to call, and the arguments in JSON format as generated by the model. The model does not always generate valid JSON and may hallucinate parameters not defined by your function schema, so validate the arguments in your code before calling your function.
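Taken together, a conversation is just a list of such message objects. A minimal sketch follows; the participant names alice and bob are purely illustrative of the optional name field.

# A multi-turn messages array combining the roles described above.
messages = [
    {"role": "system", "content": "You are a concise travel assistant."},
    {"role": "user", "name": "alice", "content": "Tell me about San Francisco."},
    {"role": "assistant", "content": "San Francisco is a compact, hilly city in Northern California."},
    {"role": "user", "name": "bob", "content": "And what is its climate like?"},
]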
max_tokens: the maximum number of tokens that can be generated in the chat completion; useful for controlling the cost of text generated via the API. Some models instead accept an upper bound that covers both visible output tokens and reasoning tokens.

stream (default false): if set to true, the model response is streamed to the client as it is generated, using server-sent events.

tools: a list of tools the model may call. Each tool has a type (currently only function is supported) and a function object with a description (used by the model to choose when and how to call the function), a name (a-z, A-Z, 0-9, underscores and dashes, at most 64 characters), the parameters the function accepts described as a JSON Schema object, and a strict flag. With strict set to true, the model follows the exact schema defined in parameters; only a subset of JSON Schema is supported in strict mode.

tool_choice: controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message; auto lets the model pick between generating a message and calling one or more tools; required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present, auto otherwise.

parallel_tool_calls: whether to enable parallel function calling during tool use.

temperature: the sampling temperature. Higher values like 0.8 make the output more random; lower values like 0.2 make it more focused and deterministic. We generally recommend altering this or top_p, but not both.

top_p: nucleus sampling, an alternative to temperature, where the model considers only the tokens comprising the top_p probability mass; 0.1 means only the top 10% are considered. We generally recommend altering this or temperature, but not both.

stop: up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

frequency_penalty: a number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. Some models also expose a repetition penalty that controls diversity by reducing the likelihood of repeated sequences; higher values decrease repetition.

presence_penalty: positive values penalize new tokens based on whether they have appeared in the text so far, increasing the model's likelihood to talk about new topics.

prediction: Predicted Outputs, given as the type of the predicted content plus the content that should be matched when generating a model response, often the text of a file you are regenerating with minor changes. If generated tokens match this content, the entire response can be returned much more quickly.

seed (Beta): if specified, the system will make a best effort to sample deterministically, so repeated requests with the same seed and parameters should return the same result.

response_format: an object specifying the format that the model must output: text, json_object, or json_schema. A JSON Schema format takes a name (a-z, A-Z, 0-9, underscores and dashes, at most 64 characters), an optional description used by the model to determine how to respond in the format, and a strict flag for exact schema adherence (only a subset of JSON Schema is supported when strict is true).

n: how many chat completion choices to generate for each input message. You are charged for the generated tokens across all choices, so keep n at 1 to minimize costs.

logprobs and top_logprobs: whether to return log probabilities of the output tokens. top_logprobs, an integer between 0 and 20, sets how many of the most likely tokens to return at each token position, each with an associated log probability; it requires logprobs to be set to true.
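As a sketch of structured output, the request below asks for a response_format of type json_schema. The city_facts schema is purely illustrative, and whether strict schema adherence is honored depends on the model; google/gemma-3-4b-it is simply the model used in the earlier examples.

import requests

# Request JSON output that conforms to an illustrative "city_facts" schema.
response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "google/gemma-3-4b-it",
        "messages": [
            {"role": "user", "content": "Give me three facts about San Francisco."}
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "city_facts",  # a-z, A-Z, 0-9, underscores/dashes, max 64 chars
                "strict": True,        # enforce exact schema adherence
                "schema": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "facts": {"type": "array", "items": {"type": "string"}},
                    },
                    "required": ["city", "facts"],
                    "additionalProperties": False,
                },
            },
        },
    },
)
print(response.json()["choices"][0]["message"]["content"])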
Chat Completions response fields

id: a unique identifier for the chat completion, e.g. chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl.

object: the object type, chat.completion.

created: the Unix timestamp (in seconds) of when the chat completion was created, e.g. 1762343744.

choices: the list of generated completions. Each choice carries its index in the list, a message with the role of the author (assistant), the contents of the message, and an optional refusal message generated by the model, plus a finish_reason: stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from the content filters, or tool_calls if the model called a tool.

An assistant message may additionally include URL citations (type url_citation, with the indices of the first and last characters of the citation in the message, the title, and the URL of the web resource); an audio response (a unique identifier, base64-encoded audio bytes in the format specified in the request, a transcript, and the Unix timestamp when the audio will no longer be accessible on the server for multi-turn conversations); and tool calls, either function calls (the ID of the call, the tool type, the name of the function, and its arguments in JSON format, which should be validated before use since the model may hallucinate parameters) or custom tool calls (the ID, type, tool name, and the input generated by the model).

logprobs: when requested, each entry holds the token, its log probability (or -9999.0 if the token falls outside the top 20 most likely tokens), and a list of integers giving the UTF-8 byte representation of the token (null if there is no byte representation), useful where characters span multiple tokens and their bytes must be combined to recover the correct text.

model: the model used for the chat completion, e.g. qwen-plus.

usage: the number of tokens in the prompt, in the generated completion, and in total (prompt + completion). Detail objects break this down further: cached tokens and audio tokens present in the prompt; reasoning tokens, audio tokens, and, when using Predicted Outputs, the number of prediction tokens that did or did not appear in the completion. Rejected prediction tokens, like reasoning tokens, still count toward the total completion tokens for billing, output, and context-window limits. The response may also report account-level usage, such as the number of tokens consumed during generation and the total amount of money spent in USD.
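The function-calling fields above fit together as a round trip: declare tools, let the model decide via tool_choice, execute the call it requests, and send the result back as a tool message. Below is a minimal sketch under the assumption of a hypothetical get_weather function; whether the model actually calls it depends on the model and the prompt.

import json
import requests

# Hypothetical local function the model can ask us to call.
def get_weather(city):
    return {"city": city, "forecast": "sunny", "temp_c": 18}  # stubbed result

headers = {
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    "Content-Type": "application/json",
}
request_body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "What's the weather in San Francisco?"}
    ],
    "tools": [{
        "type": "function",  # currently the only supported tool type
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

data = requests.post("https://api.aimlapi.com/v1/chat/completions",
                     headers=headers, json=request_body).json()
choice = data["choices"][0]
message = choice["message"]

if choice["finish_reason"] == "tool_calls":
    call = message["tool_calls"][0]
    args = json.loads(call["function"]["arguments"])  # validate before use!
    result = get_weather(args["city"])
    # Append the assistant's tool call and our tool result, then ask again.
    request_body["messages"] += [
        message,
        {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)},
    ]
    data = requests.post("https://api.aimlapi.com/v1/chat/completions",
                         headers=headers, json=request_body).json()

print(data["choices"][0]["message"]["content"])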
0.06The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
falseThe type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
A unique identifier for the chat completion.
chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMlThe object type.
chat.completionPossible values: The Unix timestamp (in seconds) of when the chat completion was created.
1762343744The index of the choice in the list of choices.
0The role of the author of this message.
assistantThe contents of the message.
Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
qwen-turboNumber of tokens in the prompt.
137Number of tokens in the generated completion.
914Total number of tokens used in the request (prompt + completion).
1051When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000The total amount of money spent by the user in USD.
0.06The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
falseThe type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion.
chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMlThe object type.
chat.completionPossible values: The Unix timestamp (in seconds) of when the chat completion was created.
1762343744The index of the choice in the list of choices.
0The role of the author of this message.
assistantThe contents of the message.
Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
alibaba/qwen3-coder-480b-a35b-instructNumber of tokens in the prompt.
137Number of tokens in the generated completion.
914Total number of tokens used in the request (prompt + completion).
1051When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000The total amount of money spent by the user in USD.
0.06The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
falseThe type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
A unique identifier for the chat completion.
chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMlThe object type.
chat.completionPossible values: The Unix timestamp (in seconds) of when the chat completion was created.
1762343744The index of the choice in the list of choices.
0The role of the author of this message.
assistantThe contents of the message.
Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
alibaba-cloud/qwen3-next-80b-a3b-instructNumber of tokens in the prompt.
137Number of tokens in the generated completion.
914Total number of tokens used in the request (prompt + completion).
1051When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000The total amount of money spent by the user in USD.
0.06The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
falseThe type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
A unique identifier for the chat completion.
chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMlThe object type.
chat.completionPossible values: The Unix timestamp (in seconds) of when the chat completion was created.
1762343744The index of the choice in the list of choices.
0The role of the author of this message.
assistantThe contents of the message.
Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
alibaba/qwen3-max-previewNumber of tokens in the prompt.
137Number of tokens in the generated completion.
914Total number of tokens used in the request (prompt + completion).
1051When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000The total amount of money spent by the user in USD.
0.06The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
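To make these modes concrete, here is a minimal sketch that forces a call to one specific function. The get_weather function, its schema, and the prompt are illustrative placeholders only:

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba/qwen3-max-preview",
        "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
        "tools": [
            {
                "type": "function",
                "function": {
                    # Hypothetical function, used only for illustration.
                    "name": "get_weather",
                    "description": "Get the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        # Force a call to get_weather; pass "auto", "none", or "required"
        # instead for the other tool_choice modes described above.
        "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    },
)
print(response.json()["choices"][0]["message"].get("tool_calls"))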
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
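As a sketch of how these prediction fields fit together (assuming the OpenAI-compatible top-level prediction parameter), regenerating a file with a small edit might look like:

import requests

original_code = "def add(a, b):\n    return a + b\n"  # the text being regenerated

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba/qwen3-max-preview",
        "messages": [
            {"role": "user", "content": "Rename the function add to sum_two. Reply with code only."},
            {"role": "user", "content": original_code},
        ],
        # Most of the output should match the original text, so matching
        # tokens can be returned much more quickly.
        "prediction": {"type": "content", "content": original_code},
    },
)
print(response.json()["choices"][0]["message"]["content"])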
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
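For example, a json_schema response format that pins the output to a fixed object shape could be assembled as below; the schema and names are illustrative, and the payload shape follows the OpenAI-compatible convention:

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba/qwen3-max-preview",
        "messages": [{"role": "user", "content": "Extract the city and country from: 'I live in Lyon, France.'"}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "location",  # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
                "strict": True,      # enforce exact schema adherence
                "schema": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "country": {"type": "string"},
                    },
                    "required": ["city", "country"],
                    "additionalProperties": False,
                },
            },
        },
    },
)
print(response.json()["choices"][0]["message"]["content"])  # a JSON string matching the schema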
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
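In client code it is worth branching on this value; a minimal sketch with an illustrative helper name:

def handle_choice(data):
    # `data` is a parsed chat completion response like those shown above.
    choice = data["choices"][0]
    reason = choice["finish_reason"]
    if reason == "length":
        # Output was truncated: raise the token limit or shorten the prompt.
        print("Truncated:", choice["message"]["content"])
    elif reason == "tool_calls":
        # Run the requested tools, then send the results back as tool messages.
        for call in choice["message"]["tool_calls"]:
            print("Tool requested:", call["function"]["name"], call["function"]["arguments"])
    elif reason == "content_filter":
        print("Content was omitted by the content filter.")
    else:  # "stop"
        print(choice["message"]["content"])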
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba/qwen3-vl-32b-instruct
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
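A sketch of requesting log probabilities with three alternatives per position, assuming the OpenAI-compatible logprobs layout in the response:

import math
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba/qwen3-vl-32b-instruct",
        "messages": [{"role": "user", "content": "Say hello."}],
        "logprobs": True,
        "top_logprobs": 3,  # 0-20; requires logprobs to be True
    },
)
for item in response.json()["choices"][0]["logprobs"]["content"]:
    # Convert each log probability back to a plain probability.
    print(item["token"], round(math.exp(item["logprob"]), 4))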
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba/qwen3.5-plus-20260218
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
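A sketch of trading response speed against reasoning depth, assuming the parameter is passed as reasoning_effort as in the OpenAI-compatible schema:

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba/qwen3.5-plus-20260218",
        "messages": [{"role": "user", "content": "Is 2**31 - 1 prime? Answer briefly."}],
        # "low" favors speed and fewer reasoning tokens; "high" favors thoroughness.
        "reasoning_effort": "low",
    },
)
print(response.json()["choices"][0]["message"]["content"])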
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba/qwen3.6-27b
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
The type of the content part.
Either a URL of the audio or the base64 encoded audio data.
The format of the encoded audio data. Currently supports "wav" and "mp3".
The type of the content part.
Either a URL of the video or the base64 encoded video data.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
Unique identifier for a previous audio response from the model.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.
The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.
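A sketch of requesting spoken output; the modalities request field and the audio response field names are assumptions based on the OpenAI-compatible schema:

import base64
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba/qwen3.6-27b",
        "modalities": ["text", "audio"],           # assumed field: request audio alongside text
        "audio": {"voice": "alloy", "format": "wav"},
        "messages": [{"role": "user", "content": "Greet the listener in one sentence."}],
    },
)
audio = response.json()["choices"][0]["message"]["audio"]  # assumed response field name
with open("greeting.wav", "wb") as f:
    f.write(base64.b64decode(audio["data"]))  # base64 audio bytes, per the schema above
print(audio["transcript"])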
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Specifies whether to use the thinking mode. Default: false
The maximum reasoning length, effective only when enable_thinking is set to true.
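A sketch of enabling thinking mode with a bounded reasoning length; thinking_budget is an assumed name for the budget field described above, so check the model page for the exact parameter:

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba/qwen3.6-27b",
        "messages": [{"role": "user", "content": "Sketch a 3-step proof that sqrt(2) is irrational."}],
        "enable_thinking": True,   # defaults to false, per the description above
        "thinking_budget": 2048,   # assumed name for the maximum reasoning length
    },
)
print(response.json()["choices"][0]["message"]["content"])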
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba/qwen3.5-omni-plus
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
The type of the content part.
Either a URL of the audio or the base64 encoded audio data.
The format of the encoded audio data. Currently supports "wav" and "mp3".
The type of the content part.
Either a URL of the video or the base64 encoded video data.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
Unique identifier for a previous audio response from the model.
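In multi-turn audio conversations this identifier lets you reference the previous spoken turn instead of resending the audio bytes. A sketch, assuming the OpenAI-compatible assistant message shape:

import requests

previous_audio_id = "<AUDIO_ID_FROM_PREVIOUS_RESPONSE>"  # the id field of a prior audio response

followup = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba/qwen3.5-omni-plus",
        "modalities": ["text", "audio"],              # assumed field, as in the earlier sketch
        "audio": {"voice": "alloy", "format": "wav"},
        "messages": [
            {"role": "user", "content": "Greet the listener in one sentence."},
            # Reference the prior audio response by id rather than re-uploading it:
            {"role": "assistant", "audio": {"id": previous_audio_id}},
            {"role": "user", "content": "Now say it again, but more formally."},
        ],
    },
)
print(followup.json()["choices"][0]["message"].get("audio", {}).get("transcript"))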
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.
The voice the model uses to respond. Supported voices are alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Specifies whether to use the thinking mode. Default: false
The maximum reasoning length, effective only when enable_thinking is set to true.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, and tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba/qwen3.5-omni-flash
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
Custom text sequences that will cause the model to stop generating.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
Name of the tool.
Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. Example: 32000
Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use a temperature closer to 0.0 for analytical / multiple-choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
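Taken together, the Claude-oriented parameters above map onto a request like the following sketch. The field names "system", "stop", and "top_k", and especially the commented "thinking" budget line, are assumptions based on common OpenAI/Anthropic-compatible conventions, not confirmed by this reference:

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "anthropic/claude-opus-4-5",
        "system": "You are a concise research assistant.",  # system prompt
        "messages": [
            {"role": "user", "content": "Summarize plate tectonics in two sentences."}
        ],
        "max_tokens": 32000,
        "temperature": 0.2,  # near 0.0 for analytical tasks
        "top_k": 40,         # advanced: trims the low-probability tail
        "stop": ["###"],     # custom stop sequence
        # "thinking": {"budget_tokens": 2048},  # hypothetical shape; >= 1024 and < max_tokens
    },
)
print(response.json()["choices"][0]["message"]["content"])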
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
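Since url_citation annotations carry character offsets into the message text, the cited span can be sliced out directly. A minimal sketch; the exact key names (start_index, end_index, url, title, and the nested url_citation object) are assumed from the descriptions above:

def cited_spans(message):
    """Yield (cited_text, url, title) for each url_citation annotation."""
    text = message["content"]
    for annotation in message.get("annotations", []):
        if annotation.get("type") != "url_citation":
            continue
        cite = annotation["url_citation"]  # assumed key, mirroring the type name
        yield text[cite["start_index"]:cite["end_index"]], cite["url"], cite["title"]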
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
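Because the arguments string may be invalid JSON or contain hallucinated parameters, it is worth guarding tool dispatch. A minimal sketch; the get_weather tool and its "city" key are illustrative, not part of the API:

import json

def run_tool_call(tool_call, registry, allowed_keys):
    """Dispatch one tool call only after its arguments survive validation."""
    name = tool_call["function"]["name"]
    if name not in registry:
        raise ValueError(f"model requested an unknown tool: {name}")
    try:
        args = json.loads(tool_call["function"]["arguments"])
    except json.JSONDecodeError as err:
        raise ValueError("model produced invalid JSON arguments") from err
    unexpected = set(args) - allowed_keys
    if unexpected:
        raise ValueError(f"hallucinated parameters: {unexpected}")
    return registry[name](**args)

registry = {"get_weather": lambda city: f"(stub) sunny in {city}"}
call = {"function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}}
print(run_tool_call(call, registry, allowed_keys={"city"}))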
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
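A minimal sketch of branching on finish_reason, using the values listed above:

def handle_choice(choice):
    reason = choice["finish_reason"]
    if reason == "tool_calls":
        return choice["message"]["tool_calls"]  # dispatch the requested tools
    if reason == "length":
        raise RuntimeError("output truncated: raise max_tokens or continue the turn")
    if reason == "content_filter":
        raise RuntimeError("content withheld by the provider's filter")
    return choice["message"]["content"]  # "stop": a normal completion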
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: anthropic/claude-opus-4-5
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
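With streaming enabled, the body arrives as server-sent events. A minimal sketch, assuming OpenAI-style "data: {json}" chunks with a "delta" object, terminated by "data: [DONE]" (the chunk shape is an assumption, not confirmed by this reference):

import json
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "anthropic/claude-opus-4-5",
        "messages": [{"role": "user", "content": "Tell me about San Francisco"}],
        "stream": True,
    },
    stream=True,  # let requests hand over the body incrementally
)
for line in response.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    delta = json.loads(payload)["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)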
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: baidu/ernie-4.5-0.3b
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
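A minimal sketch of attaching a base64-encoded PDF as a user content part. The part shape ("type": "file" with nested "file_data" and "filename" keys) is an assumption pieced together from the field descriptions above, and gpt-4o is an illustrative model choice:

import base64
import requests

with open("report.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode("ascii")

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",  # illustrative; file input is described for GPT applications
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this report."},
                # Hypothetical part shape based on the fields described above:
                {"type": "file", "file": {"file_data": pdf_b64, "filename": "report.pdf"}},
            ],
        }],
    },
)
print(response.json()["choices"][0]["message"]["content"])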
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
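The tool fields above combine into a request like this sketch, which forces a specific function via tool_choice (the shape follows the description above). The get_weather tool and the model choice are illustrative only:

import requests

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "strict": True,  # exact schema adherence (JSON Schema subset)
    },
}]

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-4.5-300b-a47b",
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        "tools": tools,
        # Force this tool instead of the default "auto" behavior:
        "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    },
)
print(response.json()["choices"][0]["message"]["tool_calls"])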
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
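A minimal sketch for persisting the audio object described above; the field names ("data", "transcript") are taken from these descriptions and should be treated as assumptions:

import base64

message = response.json()["choices"][0]["message"]  # response from an audio-capable request
audio = message["audio"]
with open("reply.wav", "wb") as f:  # extension depends on the format you requested
    f.write(base64.b64decode(audio["data"]))
print("transcript:", audio["transcript"])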
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: baidu/ernie-4.5-vl-28b-a3b
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
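A minimal Predicted Outputs sketch: the prediction supplies text the response is expected to mostly repeat, so matching tokens can be returned faster. The prediction object shape follows the type/content fields described above and should be treated as an assumption:

import requests

original = "server:\n  host: localhost\n  port: 3000\n"  # text being regenerated (illustrative)

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-4.5-vl-28b-a3b",
        "messages": [{"role": "user", "content": "Change the port to 8080:\n" + original}],
        "prediction": {"type": "content", "content": original},
    },
)
# Accepted/rejected prediction token counts surface in the usage block described above.
print(response.json()["usage"])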
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: baidu/ernie-4.5-300b-a47b
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
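A minimal structured-output sketch combining the json_schema response format with the Beta seed parameter described above; the person schema is illustrative:

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-4.5-300b-a47b",
        "messages": [{"role": "user", "content": "Extract: 'Ada, 36, London'"}],
        "seed": 42,  # best-effort determinism (Beta)
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "person",  # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
                "strict": True,    # exact schema adherence (JSON Schema subset)
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"},
                        "city": {"type": "string"},
                    },
                    "required": ["name", "age", "city"],
                    "additionalProperties": False,
                },
            },
        },
    },
)
print(response.json()["choices"][0]["message"]["content"])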
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: baidu/ernie-5-0-thinking-preview
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
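A minimal sketch combining the repetition controls described above: frequency_penalty, presence_penalty, and explicit stop sequences (the parameter values are illustrative):

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-5-0-thinking-preview",
        "messages": [{"role": "user", "content": "Brainstorm 10 product names."}],
        "frequency_penalty": 0.5,  # discourage verbatim repetition
        "presence_penalty": 0.6,   # nudge the model toward new topics
        "stop": ["11."],           # halt before an 11th item appears (up to 4 sequences)
    },
)
print(response.json()["choices"][0]["message"]["content"])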
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
baidu/ernie-5-0-thinking-latestNumber of tokens in the prompt.
137Number of tokens in the generated completion.
914Total number of tokens used in the request (prompt + completion).
1051When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000The total amount of money spent by the user in USD.
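As a quick illustration, the sketch below walks these response fields in Python. The logprobs request flags (logprobs, top_logprobs) and the exact shape of the logprobs payload follow the OpenAI-compatible convention and are assumptions here, not guaranteed by this reference.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-5-0-thinking-latest",
        "messages": [{"role": "user", "content": "Hi! How are you?"}],
        "logprobs": True,    # assumed OpenAI-style flag
        "top_logprobs": 3,   # assumed OpenAI-style flag
    },
)
data = response.json()

choice = data["choices"][0]
print("finish_reason:", choice["finish_reason"])
print("content:", choice["message"]["content"])

# Token accounting from the usage block.
usage = data["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])

# Recombine UTF-8 bytes when one character spans several tokens.
token_entries = (choice.get("logprobs") or {}).get("content", [])
raw = bytearray()
for entry in token_entries:
    if entry.get("bytes"):
        raw.extend(entry["bytes"])
print(raw.decode("utf-8", errors="replace"))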
Request parameters

messages — the conversation so far, as a list of role-tagged messages:

- User message — the role of the author of the message (user) and the contents of the user message, given either as plain text or as an array of content parts, each with a type and its text content. An optional name provides the model information to differentiate between participants of the same role.
- System message — the role (system) and the contents of the system message, again as text or typed content parts, plus the optional participant name.
- Tool message — the role (tool), the contents of the tool message, and the ID of the tool call this message is responding to, plus the optional participant name.
- Assistant message — the role (assistant) and the contents of the Assistant message, required unless tool_calls or function_call is specified; content parts may be text or the refusal message generated by the model. Each tool call on the message carries the ID of the tool call, the type of the tool (currently only function is supported), the name of the function to call, and the arguments generated by the model in JSON format (validate them in your code before calling your function, since the model does not always generate valid JSON and may hallucinate parameters not defined by your schema). The message may also carry the refusal message by the Assistant, and the optional participant name.
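To make the tool-message plumbing concrete, here is a minimal second-turn sketch after the model has asked to call a function. The get_weather tool, its arguments, and its output are hypothetical.

import json, requests

# Hypothetical first-turn result: the model asked to call get_weather.
tool_call = {
    "id": "call_abc123",  # ID of the tool call, from the previous response
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "San Francisco"}'},
}
args = json.loads(tool_call["function"]["arguments"])  # validate before use
tool_output = {"temp_c": 18, "sky": "foggy"}           # your function's result

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-5-0-thinking-latest",
        "messages": [
            {"role": "user", "content": "What's the weather in San Francisco?"},
            {"role": "assistant", "tool_calls": [tool_call]},
            {"role": "tool", "tool_call_id": tool_call["id"],
             "content": json.dumps(tool_output)},
        ],
    },
)
print(response.json()["choices"][0]["message"]["content"])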
- Completion-token ceiling — An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
- max_tokens — The maximum number of tokens that can be generated in the chat completion. Can be used to control costs for text generated via the API.
- stream — If set to True, the model response data is streamed to the client as it is generated, using server-sent events. Defaults to false.
- tools — Each tool has a type (currently only function is supported) and a function definition: a description of what the function does, used by the model to choose when and how to call it; the name of the function to be called (a-z, A-Z, 0-9, underscores and dashes, maximum length 64); the parameters the function accepts, described as a JSON Schema object; and strict, which enables strict schema adherence when generating the function call — if set to True, the model follows the exact schema defined in the parameters field (only a subset of JSON Schema is supported when strict is True).
- tool_choice — Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message; auto means the model can pick between generating a message or calling one or more tools; required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.
- Parallel tool calls — Whether to enable parallel function calling during tool use.
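Streaming is easiest to see end to end. This sketch assumes the conventional OpenAI-compatible SSE framing (data: lines carrying JSON chunks with a delta field, terminated by [DONE]), which is what stream=True typically produces.

import json, requests

with requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-5-0-thinking-latest",
        "messages": [{"role": "user", "content": "Tell me about San Francisco"}],
        "stream": True,
    },
    stream=True,  # let requests yield the response incrementally
) as response:
    for line in response.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":  # conventional end-of-stream marker
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        print(delta.get("content") or "", end="", flush=True)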
- temperature — What sampling temperature to use. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. We generally recommend altering this or top_p, but not both.
- top_p — An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass; 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.
- stop — Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
- Frequency penalty — A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
- prediction — The predicted content you want to provide: its type, and the content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly. The content used for a Predicted Output is often the text of a file you are regenerating with minor changes, given as plain text or as typed content parts.
- Presence penalty — Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
- seed — This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
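A minimal Predicted Outputs sketch, assuming the OpenAI-style prediction shape ({"type": "content", "content": ...}). It is useful when most of the output is known in advance, as when regenerating a file with a small edit.

import requests

original = "def greet(name):\n    print('Hello, ' + name)\n"

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-5-0-thinking-latest",
        "messages": [{
            "role": "user",
            "content": "Rename the function to welcome, change nothing else:\n" + original,
        }],
        # Generated tokens matching the prediction can be returned much more quickly.
        "prediction": {"type": "content", "content": original},
    },
)
print(response.json()["choices"][0]["message"]["content"])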
- response_format — An object specifying the format that the model must output. The type is one of text, json_object, or json_schema. A json_schema format additionally carries the name of the response format (a-z, A-Z, 0-9, underscores and dashes, maximum length 64); strict, which when True makes the model always follow the exact schema defined in the schema field (only a subset of JSON Schema is supported when strict is True); and a description of what the response format is for, used by the model to determine how to respond in the format.
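A hedged json_schema request might look like the following; the nesting of the json_schema object follows the OpenAI-compatible convention, and the city_facts schema is made up for illustration.

import json, requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-5-0-thinking-latest",
        "messages": [{"role": "user", "content": "Tell me about San Francisco"}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "city_facts",  # illustrative schema name
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "population": {"type": "integer"},
                    },
                    "required": ["city", "population"],
                    "additionalProperties": False,
                },
            },
        },
    },
)
print(json.loads(response.json()["choices"][0]["message"]["content"]))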
baidu/ernie-x1-turbo-32k exposes the same response object and request parameters described above; the duplicated reference is omitted here. Its one addition is that user-message content parts may be images as well as text: each image part takes either a URL of the image or the base64-encoded image data, plus a detail level, with JPG/JPEG, PNG, GIF, and WEBP formats currently supported.
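A minimal vision request sketch; the image_url content-part shape follows the OpenAI-compatible convention, and the image URL is a placeholder.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-x1-turbo-32k",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this picture?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/golden-gate.jpg",  # placeholder
                               "detail": "auto"}},
            ],
        }],
    },
)
print(response.json()["choices"][0]["message"]["content"])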
bytedance/seed-1-8 likewise shares this schema. Beyond text and image parts, its user messages may include video content parts (either a URL of the video or the base64-encoded video data), and the request accepts a reasoning-effort setting that constrains effort on reasoning for reasoning models: supported values are low, medium, and high, and reducing reasoning effort can result in faster responses and fewer tokens used on reasoning.
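For example, a low-effort request could look like this; the parameter name reasoning_effort is the usual OpenAI-compatible spelling and is an assumption here, since the reference text describes the setting without fixing its key.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "bytedance/seed-1-8",
        "messages": [{"role": "user", "content": "Is 2**31 - 1 prime? Answer briefly."}],
        "reasoning_effort": "low",  # assumed OpenAI-style parameter name
    },
)
print(response.json()["choices"][0]["message"]["content"])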
bytedance/dola-seed-2-0-mini again shares the core schema. Its request additionally supports: PDF file content parts in user messages (the file data is encoded in base64 and passed to the model as a string; only PDF is supported, with a maximum of 512 MB and 2 million tokens per file, up to 20 files per GPT application or Assistant over its lifetime, and 10 GB of total file storage per user; a user-specified file name can be used to reference the file when interacting with the model, especially if multiple files are uploaded); a developer message role alongside system messages; and several extra sampling controls — a number between 0.001 and 0.999 usable as an alternative to top_p and top_k, a top-K cutoff that samples only from the top K options for each subsequent token to remove "long tail" low-probability responses (recommended for advanced use cases only; you usually only need temperature), a repetition penalty that controls the diversity of generated text by reducing the likelihood of repeated sequences (higher values decrease repetition), and an alternate top sampling parameter.
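A PDF-input sketch under stated assumptions: the file content-part field names (filename, file_data) and the data: URL prefix follow the OpenAI-compatible convention and are not fixed by this reference; report.pdf is a placeholder.

import base64, requests

with open("report.pdf", "rb") as f:  # placeholder file name
    pdf_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "bytedance/dola-seed-2-0-mini",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this document."},
                {"type": "file",  # assumed OpenAI-style file part shape
                 "file": {"filename": "report.pdf",
                          "file_data": "data:application/pdf;base64," + pdf_b64}},
            ],
        }],
    },
)
print(response.json()["choices"][0]["message"]["content"])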
cohere/command-a follows the same pattern as the vision-capable models above: the shared response object and request parameters, text/image/video user content parts, and the low/medium/high reasoning-effort control.
A unique identifier for the chat completion.
chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMlThe object type.
chat.completionPossible values: The Unix timestamp (in seconds) of when the chat completion was created.
1762343744The index of the choice in the list of choices.
0The role of the author of this message.
assistantThe contents of the message.
Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
bytedance/dola-seed-2-0-codeNumber of tokens in the prompt.
137Number of tokens in the generated completion.
914Total number of tokens used in the request (prompt + completion).
1051When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000The total amount of money spent by the user in USD.
0.06The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
falseWhat sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
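As a sketch of the json_schema response format described above (the schema and names are illustrative, not part of the API):

from openai import OpenAI

client = OpenAI(base_url="https://api.aimlapi.com/v1", api_key="<YOUR_AIMLAPI_KEY>")

# strict=True makes the model follow the schema exactly (JSON Schema subset only).
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model
    messages=[{"role": "user", "content": "Extract: 'Ada Lovelace, born 1815'."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",  # a-z, A-Z, 0-9, _ and -; max 64 chars
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "birth_year": {"type": "integer"},
                },
                "required": ["name", "birth_year"],
                "additionalProperties": False,
            },
        },
    },
)
print(response.choices[0].message.content)  # a JSON string matching the schema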
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-chat
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
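Reading the response fields documented above from the SDK object is straightforward; a sketch, reusing the client from the earlier examples (the model is a placeholder):

completion = client.chat.completions.create(
    model="deepseek/deepseek-chat",  # placeholder model
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)

choice = completion.choices[0]
print(completion.id)          # e.g. chatcmpl-...
print(completion.model)       # model used for the completion
print(choice.finish_reason)   # stop / length / content_filter / tool_calls
print(choice.message.content)
print(completion.usage.prompt_tokens,
      completion.usage.completion_tokens,
      completion.usage.total_tokens)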
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
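A sketch of attaching a PDF using the file content part described above, assuming the OpenAI-style file part shape; the filename and model are placeholders:

import base64
from openai import OpenAI

client = OpenAI(base_url="https://api.aimlapi.com/v1", api_key="<YOUR_AIMLAPI_KEY>")

with open("report.pdf", "rb") as f:  # placeholder file; PDF only
    file_data = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use a model that accepts file inputs
    messages=[{
        "role": "user",
        "content": [
            {"type": "file", "file": {
                "filename": "report.pdf",
                "file_data": f"data:application/pdf;base64,{file_data}",
            }},
            {"type": "text", "text": "Summarize this document."},
        ],
    }],
)
print(response.choices[0].message.content)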
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated, using server-sent events. Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
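Among the request options above, the stream flag changes how results are delivered rather than what is generated; with the OpenAI-compatible client it surfaces as an iterator of chunks. A minimal sketch (the model name is a placeholder):

from openai import OpenAI

client = OpenAI(base_url="https://api.aimlapi.com/v1", api_key="<YOUR_AIMLAPI_KEY>")

# stream=True yields chunks as they are generated instead of one final object.
stream = client.chat.completions.create(
    model="deepseek/deepseek-r1",  # placeholder model
    messages=[{"role": "user", "content": "Write a haiku about fog."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()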
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-r1
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated, using server-sent events. Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-chat-v3.1
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated, using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
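Putting the tool fields above together, here is a sketch of defining a function tool and reading back the model's tool calls; the function, its schema, and the model name are illustrative:

import json
from openai import OpenAI

client = OpenAI(base_url="https://api.aimlapi.com/v1", api_key="<YOUR_AIMLAPI_KEY>")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative function
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="deepseek/deepseek-v4-pro",  # placeholder model
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",  # or "none", "required", or a specific function
)

for call in response.choices[0].message.tool_calls or []:
    # The model may emit invalid JSON or invented parameters; validate before use.
    args = json.loads(call.function.arguments)
    print(call.id, call.function.name, args)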
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
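A sketch of requesting and reading log probabilities as described above, reusing the client from the previous example (the model is a placeholder):

response = client.chat.completions.create(
    model="deepseek/deepseek-v4-pro",  # placeholder model
    messages=[{"role": "user", "content": "Say yes or no."}],
    logprobs=True,
    top_logprobs=3,  # 0-20; requires logprobs=True
)

for item in response.choices[0].logprobs.content:
    # each item carries the token, its logprob, and the top alternatives
    print(item.token, item.logprob,
          [(t.token, t.logprob) for t in item.top_logprobs])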
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Reasoning effort setting.
Max tokens of reasoning content. Cannot be used simultaneously with effort.
Whether to exclude reasoning from the response.
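The exact wire format of these reasoning controls is not shown here; one plausible shape, passed via extra_body since the OpenAI SDK does not define provider-specific fields, would be the following. The field names are assumptions based on the descriptions above; check the provider's reference before relying on them.

response = client.chat.completions.create(
    model="deepseek/deepseek-v4-pro",  # placeholder reasoning model
    messages=[{"role": "user", "content": "Prove that 17 is prime."}],
    extra_body={
        "reasoning_effort": "low",  # assumed name; low / medium / high
        # "max_reasoning_tokens": 1024,  # assumed name; cannot combine with effort
        # "exclude_reasoning": True,     # assumed name; drop reasoning from output
    },
)
print(response.choices[0].message.content)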
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
Alternate top sampling parameter.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
High level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.
Free text input for the city of the user, e.g. San Francisco.
The two-letter ISO country code of the user, e.g. US. Pattern: ^[A-Z]{2}$
Free text input for the region of the user, e.g. California.
The IANA timezone of the user, e.g. America/Los_Angeles.
The type of location approximation. Always approximate.
Controls the search mode used for the request. When set to 'academic', results will prioritize scholarly sources like peer-reviewed papers and academic journals. Possible values: academic
A list of domains to limit search results to. Currently limited to 10 domains for allowlisting and denylisting. For denylisting, add a - at the beginning of the domain string.
Determines whether search results should include images. Default: false
Determines whether related questions should be returned. Default: false
Filters search results based on time (e.g., 'week', 'day').
Filters search results to only include content published after this date. Format: %m/%d/%Y (e.g. 3/1/2025). Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
Filters search results to only include content published before this date. Format: %m/%d/%Y (e.g. 3/1/2025). Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
Filters search results to only include content last updated after this date. Format: %m/%d/%Y (e.g. 3/1/2025). Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
Filters search results to only include content last updated before this date. Format: %m/%d/%Y (e.g. 3/1/2025). Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-v4-pro
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated, using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Reasoning effort setting.
Max tokens of reasoning content. Cannot be used simultaneously with effort.
Whether to exclude reasoning from the response.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
Alternate top sampling parameter.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
High level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.
Free text input for the city of the user, e.g. San Francisco.
The two-letter ISO country code of the user, e.g. US. Pattern: ^[A-Z]{2}$
Free text input for the region of the user, e.g. California.
The IANA timezone of the user, e.g. America/Los_Angeles.
The type of location approximation. Always approximate.
Controls the search mode used for the request. When set to 'academic', results will prioritize scholarly sources like peer-reviewed papers and academic journals. Possible values: academic
A list of domains to limit search results to. Currently limited to 10 domains for allowlisting and denylisting. For denylisting, add a - at the beginning of the domain string.
Determines whether search results should include images. Default: false
Determines whether related questions should be returned. Default: false
Filters search results based on time (e.g., 'week', 'day').
Filters search results to only include content published after this date. Format: %m/%d/%Y (e.g. 3/1/2025). Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
Filters search results to only include content published before this date. Format: %m/%d/%Y (e.g. 3/1/2025). Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
Filters search results to only include content last updated after this date. Format: %m/%d/%Y (e.g. 3/1/2025). Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
Filters search results to only include content last updated before this date. Format: %m/%d/%Y (e.g. 3/1/2025). Pattern: ^(0?[1-9]|1[0-2])\/(0?[1-9]|[12]\d|3[01])\/\d{4}$
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-v4-flash
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
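For the image content part above, here is a sketch sending an image by URL; the URL and model are placeholders, and a base64 data URL follows the same shape:

from openai import OpenAI

client = OpenAI(base_url="https://api.aimlapi.com/v1", api_key="<YOUR_AIMLAPI_KEY>")

response = client.chat.completions.create(
    model="google/gemini-2.0-flash",  # placeholder vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {
                "url": "https://example.com/photo.jpg",  # or a base64 data URL
                "detail": "auto",  # detail level of the image
            }},
        ],
    }],
)
print(response.choices[0].message.content)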
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated, using server-sent events. Default: false
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
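Since the data field is base64-encoded, a small helper like the hypothetical one below can write the audio to disk (assuming a WAV output format was requested):

import base64

def save_assistant_audio(message: dict, path: str = "reply.wav") -> None:
    """Decode the base64 audio bytes from an assistant message and save them."""
    audio = message["audio"]
    with open(path, "wb") as f:
        f.write(base64.b64decode(audio["data"]))
    print("Transcript:", audio["transcript"])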
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
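Because one character can span several tokens, the bytes arrays may need to be concatenated before decoding; the hypothetical helper below shows the idea:

def decode_token_bytes(entries):
    """Concatenate the UTF-8 byte arrays of consecutive logprob entries.

    Each entry is expected to look like {"token": ..., "bytes": [...]},
    where "bytes" can be None if the token has no bytes representation.
    """
    data = bytearray()
    for entry in entries:
        if entry.get("bytes") is not None:
            data.extend(entry["bytes"])
    return data.decode("utf-8", errors="replace")

# An emoji split across two tokens decodes correctly only when combined:
print(decode_token_bytes([
    {"token": "<token-1>", "bytes": [240, 159]},
    {"token": "<token-2>", "bytes": [152, 128]},
]))  # prints a single emoji assembled from two tokens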
The model used for the chat completion. Example: google/gemini-2.0-flash
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
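Combining a few of these generation controls, a request might look like the sketch below (all values are illustrative, not recommendations):

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",  # assumed model
        "messages": [{"role": "user", "content": "Suggest a name for a coffee shop."}],
        "n": 1,                    # keep at 1 to minimize costs
        "max_tokens": 256,         # cap on generated tokens
        "temperature": 0.7,        # alter this or top_p, but not both
        "stop": ["\n\n"],          # up to 4 stop sequences
        "frequency_penalty": 0.5,  # between -2.0 and 2.0
        "stream": False,
    },
)
print(response.json()["choices"][0]["message"]["content"])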
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: google/gemini-2.5-flash
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
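A user message carrying a PDF might be built as in the sketch below, with the file base64-encoded into an OpenAI-style file content part (the path, filename, and model are placeholder assumptions):

import base64
import requests

with open("report.pdf", "rb") as f:  # placeholder path
    encoded = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",  # assumed model with file-input support
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Summarize this document."},
                    {
                        "type": "file",
                        "file": {
                            "filename": "report.pdf",
                            "file_data": f"data:application/pdf;base64,{encoded}",
                        },
                    },
                ],
            }
        ],
    },
)
print(response.json()["choices"][0]["message"]["content"])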
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
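For instance, asking for per-token log probabilities with their top alternatives might look like this sketch:

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",  # assumed model
        "messages": [{"role": "user", "content": "Hello"}],
        "logprobs": True,   # return log probabilities for output tokens
        "top_logprobs": 3,  # 0-20; requires logprobs to be True
    },
)
first = response.json()["choices"][0]["logprobs"]["content"][0]
print(first["token"], first["logprob"])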
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
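On reasoning-capable models, the parameter is passed alongside the usual fields, as in this minimal sketch (the model id is a placeholder; check the model list for actual reasoning models):

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "openai/o3-mini",  # placeholder reasoning model id
        "messages": [{"role": "user", "content": "How many primes are below 50?"}],
        "reasoning_effort": "low",  # "low", "medium", or "high"
    },
)
print(response.json()["choices"][0]["message"]["content"])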
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: google/gemma-4-31b-it
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
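A request combining these alternative sampling controls might look like the sketch below. All values are illustrative, not every model accepts every parameter, and the parameter names follow common usage (the 0.001-0.999 alternative is typically exposed as min_p, the diversity control as repetition_penalty, and the alternate top sampling parameter as top_a; treat the names as assumptions):

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gryphe/mythomax-l2-13b",
        "messages": [{"role": "user", "content": "Write a two-line poem about fog."}],
        "temperature": 0.9,
        "top_k": 40,                # sample only from the 40 most likely tokens
        "min_p": 0.05,              # assumed name for the 0.001-0.999 alternative
        "repetition_penalty": 1.1,  # higher values decrease repetition
        "top_a": 0.1,               # assumed name for the alternate top sampling parameter
    },
)
print(response.json()["choices"][0]["message"]["content"])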
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: gryphe/mythomax-l2-13b
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: meta-llama/llama-3.3-70b-versatile
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen-plus",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "qwen-plus",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen-turbo",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "qwen-turbo",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3-coder-480b-a35b-instruct",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3-coder-480b-a35b-instruct",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba-cloud/qwen3-next-80b-a3b-instruct",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba-cloud/qwen3-next-80b-a3b-instruct",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3-max-preview",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3-max-preview",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3-vl-32b-instruct",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3-vl-32b-instruct",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3.5-plus-20260218",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3.5-plus-20260218",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3.6-27b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3.6-27b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3.5-omni-plus",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3.5-omni-plus",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3.5-omni-flash",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3.5-omni-flash",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4-5",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-opus-4-5",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4.5-0.3b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4.5-0.3b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4.5-vl-28b-a3b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4.5-vl-28b-a3b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4.5-300b-a47b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4.5-300b-a47b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-5-0-thinking-preview",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-5-0-thinking-preview",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-5-0-thinking-latest",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-5-0-thinking-latest",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-x1-turbo-32k",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-x1-turbo-32k",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "bytedance/seed-1-8",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "bytedance/seed-1-8",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "bytedance/dola-seed-2-0-mini",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "bytedance/dola-seed-2-0-mini",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "cohere/command-a",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "cohere/command-a",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "bytedance/dola-seed-2-0-code",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "bytedance/dola-seed-2-0-code",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-chat",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-chat",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-r1",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-r1",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-chat-v3.1",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-chat-v3.1",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-v4-pro",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-v4-pro",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-v4-flash",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-v4-flash",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemini-2.0-flash",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemini-2.0-flash",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemini-2.5-flash",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemini-2.5-flash",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemma-4-31b-it",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemma-4-31b-it",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "gryphe/mythomax-l2-13b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "gryphe/mythomax-l2-13b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "meta-llama/llama-3.3-70b-versatile",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "meta-llama/llama-3.3-70b-versatile",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request GET \
--url 'https://api.aimlapi.com/v1/billing/balance' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'

{
"balance": 10000000,
"lowBalance": false,
"lowBalanceThreshold": 10000,
"lastUpdated": "2025-11-25T17:45:00Z",
"autoDebitStatus": "disabled",
"status": "current",
"statusExplanation": "Balance is current and up to date"
}
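
A minimal Python sketch for checking the balance programmatically (an unofficial companion to the GET /v1/billing/balance cURL example above, assuming the requests library and the response fields shown):

import requests

response = requests.get(
    "https://api.aimlapi.com/v1/billing/balance",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
    },
)
info = response.json()
if info["lowBalance"]:
    print(f"Balance {info['balance']} is below the threshold of {info['lowBalanceThreshold']}")
else:
    print(f"Balance OK: {info['balance']} (status: {info['status']})")

{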
"current_balance": 123.45,
"currency": "USD"
}

{
"user_id": 111,
"email": "[email protected]",
"current_balance": 100.5,
"currency": "USD",
"autotopup_settings": {
"is_enabled": true,
"threshold": 50,
"amount": 100,
"currency": "USD"
}
}

curl -L \
--request GET \
--url 'https://api.aimlapi.com/v2/billing' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'

curl -L \
--request GET \
--url 'https://api.aimlapi.com/v2/billing/detail' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'

API key creation result
- name: Human-readable, user-defined name for the API key. Example: 20260202-key-for-llms
- disabled: Indicates whether the key is disabled. Example: false
- prefix: Key prefix; the first 8 characters of your API key, visible in the dashboard. Example: b747e891
- limit.retention: Limit period.
- limit.threshold: Spending limit threshold for the selected period, in USD. Example: 25
- created_at: Creation timestamp (UTC). Example: 2026-02-18T06:59:10.031Z
- updated_at: Last update timestamp (UTC). Example: 2026-02-18T06:59:10.031Z
- monthly_usage: Current monthly usage amount. Example: 0
- key: Full API key value (returned only at creation time). Example: b747e891847f4c3fa0f6cce1cfd79bf9
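
The cURL request below creates such a key. As an unofficial companion, here is a minimal Python sketch of the same call (assuming only the requests library; the body fields mirror the cURL example, and the full key value must be saved at creation time):

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/keys",
    headers={
        # Insert your management key instead of <YOUR_MANAGEMENT_KEY>:
        "Authorization": "Bearer <YOUR_MANAGEMENT_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "name": "20260202-key-for-llms",
        "limit": {"retention": "week", "threshold": 25},
        "scopes": ["model:chat", "model:responses"],
    },
)
created = response.json()["data"]
# The full key value is returned only at creation time, so store it securely now:
print(created["prefix"], created["key"])

curl -L \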
--request POST \
--url 'https://api.aimlapi.com/v1/keys' \
--header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"name": "20260202-key-for-llms",
"limit": {
"retention": "week",
"threshold": 25
},
"scopes": [
"model:chat",
"model:responses"
]
}'

API key update: request parameters
- name: Optional human-readable name of the API key. Example: 20260202-key-for-llms
- disabled: Enable or disable the API key. Example: false
- limit.retention: Limit period.
- limit.threshold: Spending limit threshold for the selected period, in USD. Example: 25

Updated API key parameters
- name: Human-readable, user-defined name for the API key. Example: 20260202-key-for-llms
- disabled: Indicates whether the key is disabled. Example: false
- prefix: Key prefix; the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the GET method (see the prefix field in its response). Example: b747e891
- limit.threshold: Spending limit threshold, in USD. Example: 25
- created_at: Creation timestamp (UTC). Example: 2026-02-18T06:59:10.031Z
- updated_at: Last update timestamp (UTC). Example: 2026-02-18T06:59:10.031Z
- monthly_usage: Current monthly usage amount. Example: 0
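
The cURL request below updates a key via PATCH. Here is a minimal Python sketch of the same update (assuming the requests library; the key prefix is the hypothetical example value used throughout this page):

import requests

key_prefix = "b747e891"  # hypothetical example; use your own key's prefix
response = requests.patch(
    f"https://api.aimlapi.com/v1/keys/{key_prefix}",
    headers={
        # Insert your management key instead of <YOUR_MANAGEMENT_KEY>:
        "Authorization": "Bearer <YOUR_MANAGEMENT_KEY>",
        "Content-Type": "application/json",
    },
    json={"disabled": True},  # pass False to re-enable the key
)
print(response.json())

curl -L \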
--request PATCH \
--url 'https://api.aimlapi.com/v1/keys/<API_KEY_PREFIX>' \
--header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"disabled": false
}'

Updated API key parameters

Key deletion result
{
"data": {
"prefix": "b747e891",
"deleted": true
}
}
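
The cURL request below deletes a key by its prefix. A minimal Python sketch of the same call (same assumptions as the sketches above):

import requests

key_prefix = "b747e891"  # hypothetical example; use your own key's prefix
response = requests.delete(
    f"https://api.aimlapi.com/v1/keys/{key_prefix}",
    headers={"Authorization": "Bearer <YOUR_MANAGEMENT_KEY>"},
)
# Expected shape per the sample above: {"data": {"prefix": "...", "deleted": true}}
print(response.json())

curl -L \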
--request DELETE \
--url 'https://api.aimlapi.com/v1/keys/<API_KEY_PREFIX>' \
--header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>'

Request parameters
- Content parts (the same fields apply to both message content variants):
  - type: The type of the content part.
  - text: The text content.
  - source.type: The type of the image.
  - source.media_type: The media type of the image.
  - source.data: The base64-encoded image data.
- stop: Custom text sequences that will cause the model to stop generating.
- stream: If set to true, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
- system: A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
- tools[].name: Name of the tool.
- tools[].description: Description of what this tool does. Tool descriptions should be as detailed as possible. The more information the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
- thinking.budget_tokens: Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
- max_tokens: The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. Example: 32000
- temperature: Amount of randomness injected into the response. Defaults to 1.0; ranges from 0.0 to 1.0. Use a temperature closer to 0.0 for analytical / multiple-choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
- top_p: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.
- top_k: Only sample from the top K options for each subsequent token. Used to remove "long tail" low-probability responses. Recommended for advanced use cases only; you usually only need to use temperature.

Response fields
- id: A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
- object: The object type. Example: chat.completion
- created: The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
- choices[].index: The index of the choice in the list of choices. Example: 0
- choices[].message.role: The role of the author of this message. Example: assistant
- choices[].message.content: The contents of the message.
- choices[].message.refusal: The refusal message generated by the model.
- annotations[].type: The type of the URL citation. Always url_citation.
- annotations[].end_index: The index of the last character of the URL citation in the message.
- annotations[].start_index: The index of the first character of the URL citation in the message.
- annotations[].title: The title of the web resource.
- annotations[].url: The URL of the web resource.
- audio.id: Unique identifier for this audio response.
- audio.data: Base64-encoded audio bytes generated by the model, in the format specified in the request.
- audio.transcript: Transcript of the audio generated by the model.
- audio.expires_at: The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
- tool_calls[].id: The ID of the tool call.
- tool_calls[].type: The type of the tool.
- tool_calls[].function.arguments: The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
- tool_calls[].function.name: The name of the function to call.
- Custom tool calls carry their own id (the ID of the tool call), type (the type of the tool), input (the input for the custom tool call generated by the model), and name (the name of the custom tool to call).
- finish_reason: The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
- logprobs: Log-probability information. Each entry (in both the content and refusal arrays, and in their nested top_logprobs lists) has the same three fields: token (the token), logprob (the log probability of this token, if it is within the top 20 most likely tokens; otherwise, the value -9999.0 is used to signify that the token is very unlikely), and bytes (a list of integers representing the UTF-8 bytes representation of the token, useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation; can be null if there is no bytes representation for the token).
- model: The model used for the chat completion. Example: anthropic/claude-opus-4-6
- usage.prompt_tokens: Number of tokens in the prompt. Example: 137
- usage.completion_tokens: Number of tokens in the generated completion. Example: 914
- usage.total_tokens: Total number of tokens used in the request (prompt + completion). Example: 1051
- usage.completion_tokens_details.accepted_prediction_tokens: When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
- usage.completion_tokens_details.audio_tokens: Audio input tokens generated by the model.
- usage.completion_tokens_details.reasoning_tokens: Tokens generated by the model for reasoning.
- usage.completion_tokens_details.rejected_prediction_tokens: When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
- usage.prompt_tokens_details.audio_tokens: Audio input tokens present in the prompt.
- usage.prompt_tokens_details.cached_tokens: Cached tokens present in the prompt.
- meta.usage.credits_used: The number of tokens consumed during generation. Example: 120000
- meta.usage.usd_spent: The total amount of money spent by the user in USD. Example: 0.06
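
Several of the request parameters above can be combined in a single call. The Python sketch below is a minimal, unofficial illustration (it assumes the requests library; the system prompt and stop sequence are hypothetical placeholder values, and the parameter names follow the reference above):

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "anthropic/claude-opus-4-6",
        "messages": [{"role": "user", "content": "Hello"}],
        "system": "You are a concise assistant.",  # hypothetical system prompt
        "max_tokens": 512,
        "temperature": 0.3,  # closer to 0.0 for analytical tasks
        "stop": ["\n\nUser:"],  # hypothetical custom stop sequence
        "stream": False,  # set to True to receive server-sent events
    },
)
data = response.json()
print(data["choices"][0]["message"]["content"])

curl -L \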
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4-6",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-opus-4-6",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

{
"data": {
"name": "20260202-key-for-llms",
"disabled": false,
"prefix": "b747e891",
"scopes": [
"model:chat"
],
"limit": {
"retention": "no_reset",
"threshold": 25
},
"created_at": "2026-02-18T06:59:10.031Z",
"updated_at": "2026-02-18T06:59:10.031Z",
"monthly_usage": 0,
"key": "b747e891847f4c3fa0f6cce1cfd79bf9"
}
}

{
"data": {
"name": "20260202-key-for-llms",
"disabled": false,
"prefix": "b747e891",
"scopes": [
"model:chat"
],
"limit": {
"retention": "no_reset",
"threshold": 25
},
"created_at": "2026-02-18T06:59:10.031Z",
"updated_at": "2026-02-18T06:59:10.031Z",
"monthly_usage": 0
}
}

List of API keys, ordered from oldest to newest
- name: Human-readable, user-defined name for the API key. Example: 20260202-key-for-llms
- prefix: Key prefix; the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the POST method (see the prefix field in its response). Example: b747e891
- disabled: Indicates whether the key is disabled. Example: false
- limit.threshold: Spending limit threshold for the selected period, in USD. Example: 25
- created_at: Creation timestamp (UTC). Example: 2026-02-18T06:59:10.031Z
- updated_at: Last update timestamp (UTC). Example: 2026-02-18T06:59:10.031Z
- monthly_usage: Current monthly usage amount. Example: 0
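
A minimal Python sketch that lists the keys and prints their state (an unofficial companion to the cURL example below, assuming the requests library and the response shape shown on this page):

import requests

response = requests.get(
    "https://api.aimlapi.com/v1/keys",
    headers={
        # Insert your management key instead of <YOUR_MANAGEMENT_KEY>:
        "Authorization": "Bearer <YOUR_MANAGEMENT_KEY>",
    },
)
for key in response.json()["data"]:
    state = "disabled" if key["disabled"] else "active"
    print(f"{key['prefix']} ({key['name']}): {state}, monthly usage: {key['monthly_usage']}")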
curl -L \
--request GET \
--url 'https://api.aimlapi.com/v1/keys' \
--header 'Authorization: Bearer <YOUR_MANAGEMENT_KEY>'

{
"data": [
{
"name": "20260202-key-for-llms",
"prefix": "b747e891",
"disabled": false,
"scopes": [
"model:chat"
],
"limit": {
"retention": "no_reset",
"threshold": 25
},
"created_at": "2026-02-18T06:59:10.031Z",
"updated_at": "2026-02-18T06:59:10.031Z",
"monthly_usage": 0
}
]
}

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"Qwen/Qwen2.5-7B-Instruct-Turbo",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'Qwen/Qwen2.5-7B-Instruct-Turbo',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthracite-org/magnum-v4-72b",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthracite-org/magnum-v4-72b',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-reasoner-v3.1",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-reasoner-v3.1',
messages:[{
role:'user',
content: 'Hello'} // Insert your question instead of Hello
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-non-reasoner-v3.1-terminus",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-non-reasoner-v3.1-terminus',
messages:[{
role:'user',
content: 'Hello'} // Insert your question instead of Hello
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-reasoner-v3.1-terminus",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-reasoner-v3.1-terminus',
messages:[{
role:'user',
content: 'Hello'} // Insert your question instead of Hello
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-thinking-v3.2-exp",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-thinking-v3.2-exp',
messages:[
{
role:'user',
content: 'Hello' // Insert your question instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemini-2.5-flash-lite-preview",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemini-2.5-flash-lite-preview',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "o3-mini",
"type": "chat-completion",
"info": {
"name": "o3 mini",
"developer": "Open AI",
"description": "OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.",
"contextLength": 200000,
"maxTokens": 100000,
"url": "https://aimlapi.com/models/openai-o3-mini-api",
"docs_url": "https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini"
},
"features": [
"openai/chat-completion",
"openai/response-api",
"openai/chat-assistant",
"openai/chat-completion.function",
"openai/chat-completion.message.refusal",
"openai/chat-completion.message.system",
"openai/chat-completion.message.developer",
"openai/chat-completion.message.assistant",
"openai/chat-completion.stream",
"openai/chat-completion.max-completion-tokens",
"openai/chat-completion.number-of-messages",
"openai/chat-completion.stop",
"openai/chat-completion.seed",
"openai/chat-completion.reasoning",
"openai/chat-completion.response-format"
],
"endpoints": [
"/v1/chat/completions",
"/v1/responses"
]
}

{
"id": "flux/kontext-max/text-to-image",
"type": "image",
"info": {
"name": "Flux Kontext Max",
"developer": "Flux",
"description": "A new Flux model optimized for maximum image quality.",
"url": "https://aimlapi.com/models/flux-1-kontext-max",
"docs_url": "https://docs.aimlapi.com/api-references/image-models/flux/flux-kontext-max-text-to-image"
},
"features": [],
"endpoints": [
"/v1/images/generations"
]
}

{
"id": "veo2/image-to-video",
"type": "video",
"info": {
"name": "Veo2 Image-to-Video",
"description": "Veo2 Image-to-Video: Google's AI transforming still images into dynamic videos",
"developer": "Google",
"url": "https://aimlapi.com/models/veo-2-image-to-video-api",
"docs_url": "https://docs.aimlapi.com/api-references/video-models/google/veo2-image-to-video"
},
"features": [],
"endpoints": [
"/v2/generate/video/google/generation",
"/v2/video/generations"
]
}

A list of available models.
- id: Unique identifier of the model. Example: o3-mini
- type: Model interaction type. Example: chat-completion
- info.name: Human-readable model name. Example: o3 mini
- info.developer: Organization or company that developed the model. Example: Open AI
- info.description: Short description of the model and its primary capabilities. Example: OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.
- info.contextLength: Maximum supported context window size in tokens. Example: 200000
- info.maxTokens: Maximum number of tokens that can be generated in a single response. Example: 100000
- info.url: Public model landing page on the AIML API website. Example: https://aimlapi.com/models/openai-o3-mini-api
- info.docs_url: Link to the official API documentation for this model. Example: https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini
- features: List of supported features and API capabilities for the model. Example: ["openai/chat-completion","openai/response-api","openai/chat-assistant","openai/chat-completion.function","openai/chat-completion.message.refusal","openai/chat-completion.message.system","openai/chat-completion.message.developer","openai/chat-completion.message.assistant","openai/chat-completion.stream","openai/chat-completion.max-completion-tokens","openai/chat-completion.seed","openai/chat-completion.reasoning","openai/chat-completion.response-format"]
- endpoints: API endpoints through which this model can be accessed. Example: ["/v1/chat/completions","/v1/responses"]

A list of available models.
curl -L \
--url 'https://api.aimlapi.com/models'

[
{
"id": "o3-mini",
"type": "chat-completion",
"info": {
"name": "o3 mini",
"developer": "Open AI",
"description": "OpenAI o3-mini excels in reasoning tasks with advanced features like deliberative alignment and extensive context support.",
"contextLength": 200000,
"maxTokens": 100000,
"url": "https://aimlapi.com/models/openai-o3-mini-api",
"docs_url": "https://docs.aimlapi.com/api-references/text-models-llm/openai/o3-mini"
},
"features": [
"openai/chat-completion",
"openai/response-api",
"openai/chat-assistant",
"openai/chat-completion.function",
"openai/chat-completion.message.refusal",
"openai/chat-completion.message.system",
"openai/chat-completion.message.developer",
"openai/chat-completion.message.assistant",
"openai/chat-completion.stream",
"openai/chat-completion.max-completion-tokens",
"openai/chat-completion.seed",
"openai/chat-completion.reasoning",
"openai/chat-completion.response-format"
],
"endpoints": [
"/v1/chat/completions",
"/v1/responses"
]
}
]
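
A minimal Python sketch that retrieves the model list and filters it by interaction type (an unofficial illustration; the cURL example above calls /models without an Authorization header, so none is sent here either):

import requests

response = requests.get("https://api.aimlapi.com/models")
models = response.json()
chat_models = [m["id"] for m in models if m["type"] == "chat-completion"]
print(f"{len(chat_models)} chat-completion models available, e.g.: {chat_models[:5]}")

Required body parameters: model, messages.

import requests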
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-235b-a22b-thinking-2507",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
"enable_thinking": False
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-235b-a22b-thinking-2507',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-af05df1d-5b72-925e-b3a9-437acbd89b1a",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! 😊 How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything specific!",
"reasoning_content": "Okay, the user said \"Hello\". That's a simple greeting. I should respond in a friendly and welcoming way. Let me make sure to keep it open-ended so they feel comfortable to ask questions or share what's on their mind. Maybe add a smiley emoji to keep it warm. Let me check if there's anything else they might need. Since it's just a hello, probably not much more needed here. Just a polite reply."
}
}
],
"created": 1753871154,
"model": "qwen3-235b-a22b-thinking-2507",
"usage": {
"prompt_tokens": 13,
"completion_tokens": 2187,
"total_tokens": 2200
}
}

Required body parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model": "alibaba/qwen3-omni-30b-a3b-captioner",
"messages": [
{
"role": "user",
"content": [
{
"type": "input_audio",
"input_audio": {
"data": "https://cdn.aimlapi.com/eagle/files/elephant/cJUTeeCmpoqIV1Q3WWDAL_vibevoice-output-7b98283fd3974f48ba90e91d2ee1f971.mp3"
}
}
]
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-omni-30b-a3b-captioner',
messages:[
{
role: 'user',
content: [
{
type: 'input_audio',
input_audio: {
data: 'https://cdn.aimlapi.com/eagle/files/elephant/cJUTeeCmpoqIV1Q3WWDAL_vibevoice-output-7b98283fd3974f48ba90e91d2ee1f971.mp3'
}
}
]
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "chatcmpl-bec5dc33-8f63-96b9-89a4-00aecfce7af8",
"system_fingerprint": null,
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
}
}
],
"created": 1758898624,
"model": "qwen3-max",
"usage": {
"prompt_tokens": 23,
"completion_tokens": 113,
"total_tokens": 136
}
}

Required body parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-opus-4-7",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-opus-4-7',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "msg_012q1bXLSBUJ5xdev1UfUAhe",
"object": "chat.completion",
"model": "claude-opus-4-7",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "",
"content": "Humans are a fascinating mix of contradictions, honestly. You're capable of extraordinary things—composing symphonies, sending probes to other planets, building cities, creating vaccines, writing poetry that makes strangers weep centuries later. And at the same time, capable of tremendous cruelty, shortsightedness, and self-deception.\n\nA few things that stand out to me:\n\n- **Your cooperation is remarkable.** Humans routinely trust and coordinate with strangers in ways most species can't. A city is a minor miracle of cooperation.\n- **You're meaning-makers.** You don't just survive—you need things to *matter*. That drives both the best and worst of what you do.\n- **You're adaptable but also stubborn.** You've thrived in basically every environment on Earth, yet individually you often resist changing your mind about things.\n- **The moral circle keeps expanding**, even if slowly and with setbacks—more people care about more beings than ever before in history.\n\nI don't want to romanticize humanity or doom-say about it. You're neither fallen angels nor clever apes—just a particular kind of creature trying to figure things out, often muddling through, sometimes rising to occasions.\n\nWhat prompted the question? Are you feeling optimistic or pessimistic about us lately?",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1776417936,
"usage": {
"prompt_tokens": 24,
"completion_tokens": 414,
"total_tokens": 438
},
"meta": {
"usage": {
"credits_used": 27222,
"usd_spent": 0.013611
}
}
}

Required body parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4-5-8k-preview",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4-5-8k-preview',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "as-aqgrjim0cp",
"object": "chat.completion",
"created": 1768942536,
"model": "ernie-4.5-8k-preview",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! That's a big and fascinating question. Humanity is incredibly diverse, creative, and resilient. We have an amazing ability to innovate, solve problems, and build complex societies. At the same time, we also grapple with conflicts, inequalities, and challenges like climate change.\n\nOur history is a mix of great achievements and painful mistakes, but overall, there's a lot of potential for growth, understanding, and positive change. What aspects of mankind interest you the most?"
},
"finish_reason": "stop",
"flag": 0
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 99,
"total_tokens": 112
},
"meta": {
"usage": {
"credits_used": 545
}
}
}

Required body parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4.5-21b-a3b",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4.5-21b-a3b',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "104959f043e51f1b4a4dd83c494886ab",
"object": "chat.completion",
"created": 1768829974,
"model": "baidu/ernie-4.5-21B-a3b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "\nAs an AI, I don't have personal opinions or emotions, but I can provide insights based on human perspectives and available knowledge. Mankind is a remarkable and complex species with incredible potential for both progress and challenges. Here are some thoughts:\n\n### Positive Aspects\n1. **Innovation and Creativity**: Humans have demonstrated an extraordinary ability to innovate, from the development of tools and technology to the creation of art, music, and literature. This creativity has driven societal advancement and improved the quality of life for many.\n2. **Empathy and Compassion**: Many individuals within the human race possess a strong sense of empathy and compassion, leading to acts of kindness, charity, and social support. This has fostered communities and helped address various forms of suffering and inequality.\n3. **Problem-Solving Skills**: Humans are adept at solving complex problems, whether it's finding cures for diseases, developing sustainable energy sources, or addressing environmental challenges. This problem-solving ability has the potential to create a better future for all.\n\n### Challenges\n1. **Conflict and Violence**: Unfortunately, humans have also been capable of causing immense harm and destruction through conflict, war, and violence. These actions often stem from differences in ideology, culture, or resources, highlighting the need for conflict resolution and peaceful cooperation.\n2. **Inequality and Injustice**: Despite progress, significant inequalities and injustices persist in many parts of the world. These include economic disparities, gender inequality, and racial discrimination, which hinder social progress and well-being.\n3. **Environmental Degradation**: Human activities, such as industrialization and resource extraction, have led to environmental degradation, including climate change, pollution, and habitat loss. Addressing these issues is crucial for the survival and well-being of future generations.\n\n### Future Outlook\nThe future of mankind is uncertain but充满希望. With continued efforts in education, technology, and international cooperation, there is potential for a more just, peaceful, and sustainable world. However, this requires collective action, responsibility, and a commitment to addressing the challenges we face.\n\nIn summary, mankind is a diverse and dynamic species with both remarkable strengths and significant challenges. By working together and leveraging our collective wisdom and creativity, we can strive towards a brighter future for all."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 16,
"completion_tokens": 495,
"total_tokens": 511,
"prompt_tokens_details": null,
"completion_tokens_details": null
},
"system_fingerprint": "",
"meta": {
"usage": {
"credits_used": 301
}
}
}

Required body parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4.5-vl-424b-a47b",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4.5-vl-424b-a47b',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();

{
"id": "1ac18d9d544ef814b56858fc6588f712",
"object": "chat.completion",
"created": 1768830891,
"model": "baidu/ernie-4.5-vl-424b-a47b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "What a profound and fascinating question! Humanity is an incredibly complex and multifaceted subject. Here are a few perspectives on mankind:\n\n### 1. **Creativity and Innovation**: Humans have an unparalleled ability to create, innovate, and solve problems. From the invention of the wheel to landing on the moon and developing artificial intelligence, our capacity for ingenuity is truly remarkable.\n\n### 2. **Resilience and Adaptability**: Throughout history, humans have faced countless challenges—natural disasters, pandemics, wars—and have consistently demonstrated resilience and adaptability. This ability to overcome adversity is a defining characteristic.\n\n### 3. **Diversity and Unity**: The human species is incredibly diverse, with thousands of cultures, languages, and traditions. Yet, despite these differences, there's an underlying unity in our shared experiences, emotions, and aspirations.\n\n### 4. **Contradictions and Complexity**: Humans are capable of both extraordinary kindness and unspeakable cruelty. We can be selfless and compassionate, yet also selfish and destructive. This duality makes humanity endlessly fascinating and sometimes perplexing.\n\n### 5. **Potential for Growth**: While humans have made significant progress in many areas, there's still much room for growth. Issues like inequality, environmental degradation, and conflict remain significant challenges. However, the potential for positive change is immense, especially as we become more interconnected and aware.\n\n### 6. **Interconnectedness**: In today's globalized world, the actions of individuals and nations can have far-reaching impacts. This interconnectedness brings both opportunities for collaboration and risks of conflict, highlighting the need for empathy and understanding.\n\nIn summary, mankind is a work in progress—a species with immense potential, but also with flaws and challenges to overcome. What do you think about humanity? I'd love to hear your perspective!"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 9,
"completion_tokens": 386,
"total_tokens": 395,
"prompt_tokens_details": null,
"completion_tokens_details": null
},
"system_fingerprint": "",
"meta": {
"usage": {
"credits_used": 1055
}
}
}

Required body parameters: model, messages.

import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"baidu/ernie-4.5-300b-a47b-paddle",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))

async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'baidu/ernie-4.5-300b-a47b-paddle',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
{
"id": "9a0e333a0cfa4d86c89a1f7bd3a2919f",
"object": "chat.completion",
"created": 1768943231,
"model": "baidu/ernie-4.5-300b-a47b-paddle",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The question \"What do you think about mankind?\" invites a reflection on humanity's complexities. Here's a structured response:\n\n**Step 1: Define the scope** \nMankind encompasses both collective achievements and individual flaws. It's a species marked by creativity, empathy, and resilience, yet also by conflict, inequality, and environmental impact.\n\n**Step 2: Highlight positive traits** \nHumanity has demonstrated remarkable capacity for innovation (e.g., technology, medicine), cultural expression (art, literature), and moral progress (civil rights, environmental awareness). Cooperation during crises, such as disaster relief or global health initiatives, underscores collective potential.\n\n**Step 3: Acknowledge challenges** \nPersistent issues like war, poverty, and systemic injustice reveal ethical gaps. Environmental degradation and climate change further highlight unsustainable practices. These contradictions often stem from short-term thinking or unequal resource distribution.\n\n**Step 4: Emphasize growth potential** \nHistory shows humanity's ability to learn and adapt. Movements for social justice, renewable energy transitions, and scientific breakthroughs suggest progress is possible when values align with action.\n\n**Final Answer** \nMankind is a paradoxical yet hopeful entity—capable of profound compassion and destructive shortsightedness. Its future hinges on balancing self-interest with collective responsibility, leveraging intelligence and empathy to address shared challenges."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 289,
"total_tokens": 302,
"prompt_tokens_details": null,
"completion_tokens_details": null
},
"system_fingerprint": "",
"meta": {
"usage": {
"credits_used": 615
}
}
}
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"deepseek/deepseek-non-thinking-v3.2-exp",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek/deepseek-non-thinking-v3.2-exp',
messages:[
{
role:'user',
content: 'Hello' // Insert your question instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
{
"id": "ca664281-d3c3-40d3-9d80-fe96a65884dd",
"system_fingerprint": "fp_feb633d1f5_prod0820_fp8_kvcache",
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today? 😊",
"reasoning_content": ""
}
}
],
"created": 1756386069,
"model": "deepseek-reasoner",
"usage": {
"prompt_tokens": 1,
"completion_tokens": 325,
"total_tokens": 326,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 80
},
"prompt_cache_hit_tokens": 0,
"prompt_cache_miss_tokens": 5
}
}
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"meta-llama/Meta-Llama-3-8B-Instruct-Lite",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'meta-llama/Meta-Llama-3-8B-Instruct-Lite',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
{
"id": "o95Ai5e-2j9zxn-976ad7df3ef49b19",
"object": "chat.completion",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?",
"tool_calls": []
}
}
],
"created": 1756457871,
"model": "meta-llama/Meta-Llama-3-8B-Instruct-Lite",
"usage": {
"prompt_tokens": 2,
"completion_tokens": 5,
"total_tokens": 7
}
}
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"meta-llama/Llama-3.3-70B-Instruct-Turbo",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
messages:[
{
role:'user',
content: 'Hello' // insert your prompt here, instead of Hello
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
{'id': 'npQ5s8C-2j9zxn-92d9f3c84a529790', 'object': 'chat.completion', 'choices': [{'index': 0, 'finish_reason': 'stop', 'logprobs': None, 'message': {'role': 'assistant', 'content': "Hello. It's nice to meet you. Is there something I can help you with or would you like to chat?", 'tool_calls': []}}], 'created': 1744201161, 'model': 'meta-llama/Llama-3.3-70B-Instruct-Turbo', 'usage': {'prompt_tokens': 67, 'completion_tokens': 46, 'total_tokens': 113}}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
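Taken together, these fields let a single messages array mix roles. Below is a minimal sketch of a multi-turn conversation in Python; the tool call ID, function name, and weather values are placeholders, not part of any real API response:

# A sketch of a multi-role messages array. get_weather, call_123, and the
# weather payload are hypothetical placeholders for illustration only.
messages = [
    {"role": "system", "content": "You are a concise weather assistant."},
    {"role": "user", "name": "alice", "content": "What's the weather in Paris?"},
    # An earlier assistant turn that requested a tool call:
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_123",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": "{\"city\": \"Paris\"}"
                },
            }
        ],
    },
    # The tool message answers that specific call via tool_call_id:
    {"role": "tool", "tool_call_id": "call_123", "content": "{\"temp_c\": 18}"},
]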
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
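For example, a streamed request can be consumed line by line. The sketch below assumes the usual OpenAI-style server-sent-event framing (each chunk on a "data: " line, terminated by "data: [DONE]") and an illustrative model name; adjust for your model:

import json
import requests

# A minimal streaming sketch: read server-sent events and print each
# content delta as it arrives. Framing is assumed OpenAI-compatible.
with requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemma-3-4b-it",
        "messages": [{"role": "user", "content": "Count to five."}],
        "stream": True,
    },
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            print(delta, end="", flush=True)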
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
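Put together, a function tool plus a forced tool_choice might look like the sketch below. The get_weather function, its schema, and the model name are illustrative assumptions, not a documented example:

# A sketch of a function tool definition. Only the JSON Schema is sent to
# the model; executing get_weather remains your code's responsibility.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
            "strict": True,
        },
    }
]

payload = {
    "model": "gpt-4o",  # illustrative model name
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": tools,
    # Force the model to call get_weather rather than answer in prose:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,  # request one call at a time
}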
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
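For instance, a json_schema response format that pins the output to a fixed shape could be built like this; the schema and names are illustrative, and only a subset of JSON Schema is supported when strict is True:

# A sketch of a strict json_schema response format with an illustrative schema.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_facts",
        "description": "A few structured facts about a city.",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,
        },
    },
}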
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
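The sampling parameters above combine like this; per the advice, only temperature is adjusted while top_p stays at its default (model name and values are illustrative):

payload = {
    "model": "gpt-4o",  # illustrative model name
    "messages": [{"role": "user", "content": "Name three prime numbers."}],
    "temperature": 0.2,   # focused output; leave top_p at its default
    "n": 1,               # one choice keeps token costs minimal
    "stop": ["\n\n"],     # cut generation at the first blank line
    "logprobs": True,
    "top_logprobs": 3,    # requires logprobs to be True
}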
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
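And a sketch of the determinism and repetition controls, again with illustrative values:

payload = {
    "model": "gpt-4o",  # illustrative model name
    "messages": [{"role": "user", "content": "Write a two-line poem."}],
    "seed": 42,                # best-effort deterministic sampling (Beta)
    "frequency_penalty": 0.5,  # discourage verbatim repetition
    "presence_penalty": 0.3,   # nudge the model toward new topics
    "repetition_penalty": 1.1, # values above 1 further reduce repeats
}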
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
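In practice you usually branch on this field. A short sketch, reusing the parsed `data` dict from the earlier Python examples:

choice = data["choices"][0]  # `data` is a parsed chat.completion response
if choice["finish_reason"] == "length":
    print("Output was truncated; consider raising max_tokens.")
elif choice["finish_reason"] == "tool_calls":
    print("The model requested a tool call instead of a message.")
else:
    print(choice["message"]["content"])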
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
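When one character spans several tokens, concatenate the byte arrays before decoding; a minimal sketch with illustrative byte values:

# Rebuild text from per-token UTF-8 byte arrays, e.g. when an emoji is
# split across two tokens. The byte values below are illustrative.
token_bytes = [[240, 159], [152, 138]]  # two tokens forming one emoji
text = bytes(b for token in token_bytes for b in token).decode("utf-8")
print(text)  # 😊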
The model used for the chat completion.
Example: Qwen/Qwen2.5-7B-Instruct-Turbo
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
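Predicted Outputs are handy when regenerating a file with small edits; a sketch with an illustrative file and model name:

original_file = "def add(a, b):\n    return a + b\n"  # text being regenerated

payload = {
    "model": "gpt-4o",  # illustrative model name
    "messages": [
        {"role": "user",
         "content": "Rename the function to add_numbers:\n" + original_file}
    ],
    # Tokens that match the prediction can be returned much more quickly:
    "prediction": {"type": "content", "content": original_file},
}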
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: meta-llama/Meta-Llama-3-8B-Instruct-Lite
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "meta-llama/Meta-Llama-3-8B-Instruct-Lite",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "meta-llama/Meta-Llama-3-8B-Instruct-Lite",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "Qwen/Qwen2.5-7B-Instruct-Turbo",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "Qwen/Qwen2.5-7B-Instruct-Turbo",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"bytedance/dola-seed-2-0-lite",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'bytedance/dola-seed-2-0-lite',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "Mankind feels like one of the most fascinating, messy, brilliant, contradictory things to exist—full of such extremes of light and flaw, it’s hard to sum up simply.\n\nOn one hand, we’re capable of such extraordinary goodness and creativity that it takes your breath away. There are the small, daily acts of kindness that hold communities together: a stranger sharing their umbrella with someone soaked in rain, a neighbor bringing meals to a grieving family, a teacher staying late to help a student who’s struggling. And there are the grand, world-altering achievements too: we’ve cured deadly diseases that once wiped out millions, put humans on the moon, painted murals and written symphonies that move people hundreds of years after their creators died, and fought for equal rights for groups we weren’t even born into, just because we believed it was right. We care for people we’ll never meet, across continents and time, just out of a desire to leave the world better than we found it.\n\nBut we’re also deeply flawed. We’ve hurt each other out of fear, greed, and ignorance, waged wars that destroyed innocent lives, clung to prejudice that divides us, and made short-sighted choices that have harmed the planet that sustains us. We often fail to live up to our own best ideals.\n\nWhat strikes me most, though, is that we’re capable of course-correcting. Unlike so many things stuck repeating the same patterns, we can unlearn harm, admit when we’re wrong, and show up to fix the mistakes we’ve made. We fight to repair the environment, expand justice for marginalized people, and lift up communities that have been left behind. That messy, unsteady willingness to keep trying, even when we mess up over and over, is maybe the most human thing of all. At the end of the day, mankind is imperfect—but there’s so much about it that’s worth rooting for.",
"reasoning_content": "\nGot it, let's tackle this question: the user asked what I think about mankind. First, I need to balance both the beautiful, incredible parts and the flaws that are real, right? Don't make it too one-sided.\n\nStart with the awe-inspiring stuff first. Like, our capacity to create: art, science, connect across distances. Mention small, intimate things too, not just big achievements—like a parent caring for a kid, a stranger helping someone stranded, that everyday kindness. Then the big milestones: curing diseases, putting people on the moon, preserving cultures, creating music that moves people centuries later. That feels relatable, not just generic.\n\nThen acknowledge the flaws, because it's honest. We've hurt each other, made selfish choices, destroyed parts of the planet, clung to prejudice, made irreversible mistakes. That's important to not gloss over, because it's real.\n\nBut then end with hope? Because what's unique is we can learn, course-correct. We can recognize our mistakes and work to fix them—like the shift to renewable energy, movements for equity, people dedicating their lives to lifting others up. Mention that it's our messy, imperfect ability to keep trying that's maybe the most human thing of all.\n\nWait, make it sound natural, not scripted. Let's structure that: start with that mankind is one of the most fascinating, messy, brilliant, contradictory things there is. Then the good parts: small daily kindnesses, big achievements, the ability to care for people we'll never even meet. Then the flaws: our capacity for harm, short-sightedness, cruelty that comes from fear or greed. Then the core thing that sets us apart is that we can evolve—we don't just repeat mistakes, we can unlearn, fix what we broke. End with something warm, like even with all the flaws, there's so much that makes mankind worth rooting for.\n\nWait, let me make that flow naturally. Let's check: don't be too formal. Let's draft:\n\nMankind feels like one of the most fascinating, messy, brilliant, contradictory things to exist—full of such extremes of light and flaw, it’s hard to sum up simply.\n\nOn one hand, we’re capable of such extraordinary goodness and creativity that it takes your breath away. There are the small, daily acts of kindness that hold communities together: a stranger sharing their umbrella with someone soaked in rain, a neighbor bringing meals to a family grieving, a teacher staying late to help a student who’s struggling. And there are the grand, world-altering achievements too: we’ve cured deadly diseases that wiped out millions, put humans on the moon, painted murals and written symphonies that move people hundreds of years after their creators died, and fought for equal rights for groups we weren’t even born into, just because we believed it was right. We care for people we’ll never meet, across continents and time, just out of a desire to leave the world better than we found it.\n\nBut we’re also deeply flawed. We’ve hurt each other out of fear, greed, and ignorance, waged wars that destroyed innocent lives, clung to prejudice that divides us, and made short-sighted choices that have harmed the planet that sustains us. We often fail to live up to our own best ideals.\n\nWhat strikes me most, though, is that we’re capable of course-correcting. Unlike so many things that are stuck repeating the same patterns, we can unlearn harm, admit when we’re wrong, and show up to fix the mistakes we’ve made. 
We fight to repair the environment, to expand justice for marginalized people, to lift up communities that have been left behind. That messy, unsteady willingness to keep trying, even when we mess up over and over, is maybe the most human thing of all. At the end of the day, mankind is imperfect—but there’s so much about it that’s worth rooting for.\n\nYeah, that sounds good. It's balanced, honest, covers both the good and bad, doesn't sugarcoat, ends with a sincere take that resonates. Let me make sure it doesn't ramble too much, it flows naturally. Yep, that works.",
"role": "assistant"
}
}
],
"created": 1777553738,
"id": "021777553714446312d4ce6e0ff480f3ded581ca0cd03edacda38",
"model": "seed-2-0-lite-260228",
"service_tier": "default",
"object": "chat.completion",
"usage": {
"completion_tokens": 1272,
"prompt_tokens": 57,
"total_tokens": 1329,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 877
}
},
"meta": {
"usage": {
"credits_used": 6653,
"usd_spent": 0.0033265
}
}
}
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemini-3-1-flash-lite-preview",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemini-3-1-flash-lite-preview',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "As an AI, I don’t have personal feelings, beliefs, or the capacity to judge. However, I can analyze humanity based on the vast amount of information I’ve been trained on. Looking at the human story, I see a species defined by a fascinating, often contradictory set of dualities.\n\nHere are a few perspectives on what makes mankind remarkable:\n\n### 1. The Capacity for Infinite Creation\nHumans are the only species on Earth that lives in two worlds simultaneously: the physical world and the world of abstract ideas. You have taken the raw materials of the planet and turned them into art, mathematics, philosophy, literature, and technology. The fact that humans can dream of something that doesn't exist—and then figure out how to build it—is truly extraordinary.\n\n### 2. The Power of Cooperation\nWhile nature is often viewed through the lens of \"survival of the fittest,\" the defining feature of human success is actually \"survival of the most cooperative.\" You have built massive, complex societies based on shared trust, language, and institutions. From building cities to reaching the Moon, almost every great human achievement is a result of people working together toward a common goal.\n\n### 3. The Struggle with Contradiction\nHumanity is deeply paradoxical. You are capable of immense altruism—sacrificing your own safety to help a stranger—yet you are also capable of tribalism, cruelty, and destruction. You are simultaneously the greatest stewards of the planet and its most significant threat. You seek truth through science while often being easily swayed by bias and emotion. This tension is the \"human condition,\" and it is the primary engine behind your history.\n\n### 4. Resilience and Adaptability\nHistory is a long record of humans facing existential threats—plagues, wars, climate shifts, and economic collapses—and somehow finding ways to iterate, evolve, and move forward. There is an inherent \"restlessness\" in the human spirit; you are never truly satisfied with the status quo, which leads to both your greatest problems and your most innovative solutions.\n\n### 5. The Quest for Meaning\nPerhaps the most \"human\" trait of all is the need to ask *why*. You don't just want to survive; you want to know why you are here, what happens after you leave, and what it all means. Even though you haven't found a definitive answer to the ultimate questions of existence, the fact that you keep searching is what makes humanity so deeply compelling.\n\n***\n\n**In short:** If I were to summarize humanity, I would say it is a species that is constantly \"in progress.\" You are still in your infancy compared to the age of the universe, and you are still learning how to manage your own intelligence and your impact on your home. \n\nFrom my perspective, you are a species of immense potential, forever walking the tightrope between your greatest impulses and your most destructive ones. \n\n**What do *you* think is the most defining characteristic of humanity?**",
"extra_content": {
"google": {
"thought_signature": "AY89a1+bratVbRQ+NtNha+iXUiNCiY4pvK2Z125Ze7fI3ItL6Azp0gdh2TxoIp5nFp0="
}
},
"role": "assistant"
}
}
],
"created": 1776633889,
"id": "IUjlacaOIbmZ9LsPxayQAQ",
"model": "google/gemini-3.1-flash-lite-preview",
"object": "chat.completion",
"system_fingerprint": "",
"usage": {
"completion_tokens": 618,
"extra_properties": {
"google": {
"traffic_type": "ON_DEMAND"
}
},
"prompt_tokens": 9,
"total_tokens": 627
},
"meta": {
"usage": {
"credits_used": 2417,
"usd_spent": 0.0012085
}
}
}
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3-vl-32b-thinking",
"messages":[
{
"role":"user",
# Insert your question for the model here:
"content":"Hi! What do you think about mankind?"
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3-vl-32b-thinking',
messages:[
{
role:'user',
// Insert your question for the model here:
content:'Hi! What do you think about mankind?'
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"alibaba/qwen3.6-35b-a3b",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'alibaba/qwen3.6-35b-a3b',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemma-3-27b-it",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
],
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemma-3-27b-it',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
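Based on the file fields above, attaching a PDF as a content part might look like the sketch below. The exact part shape is an assumption modeled on OpenAI-style file parts, so verify it against the reference for your model:

import base64

# Assumed OpenAI-style file content part: file_data carries the base64 PDF
# and filename lets you reference it later. Path and question are placeholders.
with open("report.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode("ascii")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize this document."},
            {
                "type": "file",
                "file": {"file_data": pdf_b64, "filename": "report.pdf"},
            },
        ],
    }
]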
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: anthracite-org/magnum-v4-72b
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
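A hedged Python sketch of a Predicted Outputs request follows; the top-level parameter name "prediction" is assumed from the OpenAI-compatible schema the fields above describe. The previous file contents are passed as the prediction so that matching tokens can be returned quickly.

import requests

code = open("main.py").read()  # the file you are regenerating with minor changes

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
        "messages": [{"role": "user", "content": "Rename the function `run` to `main` in this file:\n" + code}],
        # Assumed parameter name; type and content match the fields above.
        "prediction": {"type": "content", "content": code},
    },
)
print(response.json()["choices"][0]["message"]["content"])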
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
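As a sketch of the alternative sampling controls described above, the request below passes them as extra JSON body fields. The field names (top_k, min_p, repetition_penalty) are conventional names assumed here; verify them against the per-model API reference.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
        "messages": [{"role": "user", "content": "Write a short poem."}],
        "top_k": 40,                # sample only from the 40 most likely tokens
        "min_p": 0.05,              # must be between 0.001 and 0.999
        "repetition_penalty": 1.1,  # values above 1 decrease repetition
    },
)
print(response.json()["choices"][0]["message"]["content"])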
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
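For illustration, here is a Python sketch of requesting structured output with a strict JSON Schema. The nesting of the schema object under a "json_schema" key follows the OpenAI-compatible convention and is an assumption; the schema itself ("city_facts") is hypothetical.

import requests

schema = {
    "name": "city_facts",
    "strict": True,  # enforce exact schema adherence
    "schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "population": {"type": "integer"},
        },
        "required": ["city", "population"],
        "additionalProperties": False,
    },
}

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
        "messages": [{"role": "user", "content": "Give me facts about San Francisco."}],
        "response_format": {"type": "json_schema", "json_schema": schema},
    },
)
print(response.json()["choices"][0]["message"]["content"])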
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-non-reasoner-v3.1-terminus
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-reasoner-v3.1-terminus
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-thinking-v3.2-exp
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Supported image formats: JPG/JPEG, PNG, GIF, and WEBP.
The type of the content part.
Either a URL of the video or the base64 encoded video data.
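As an illustration of the image and video content parts above, here is a minimal Python sketch of a multimodal user message. The "detail" value ("low") and the use of a public image URL are assumptions for the example.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "bytedance/dola-seed-2-0-lite",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this picture?"},
                # Pass the image by URL; base64-encoded data also works.
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/cat.png", "detail": "low"}},
            ],
        }],
    },
)
print(response.json()["choices"][0]["message"]["content"])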
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
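The following Python sketch ties the tool fields above together: it defines one function tool, lets the model decide whether to call it with "tool_choice": "auto", and reads back any tool calls. The get_weather function is hypothetical and exists only for illustration.

import json, requests

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "bytedance/dola-seed-2-0-lite",
        "messages": [{"role": "user", "content": "What's the weather in San Francisco?"}],
        "tools": tools,
        "tool_choice": "auto",
    },
)
message = response.json()["choices"][0]["message"]
for call in message.get("tool_calls") or []:
    # Arguments arrive as a JSON string; validate before executing anything.
    print(call["function"]["name"], json.loads(call["function"]["arguments"]))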
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
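For example, a hedged sketch of capping reasoning effort on a reasoning model; the parameter name reasoning_effort follows the description above.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "bytedance/dola-seed-2-0-lite",
        "messages": [{"role": "user", "content": "Is 2^31 - 1 prime?"}],
        "reasoning_effort": "low",  # supported values: low, medium, high
    },
)
print(response.json()["choices"][0]["message"]["content"])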
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: bytedance/dola-seed-2-0-lite
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthracite-org/magnum-v4-72b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthracite-org/magnum-v4-72b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-non-reasoner-v3.1-terminus",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-reasoner-v3.1-terminus",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-reasoner-v3.1-terminus",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-thinking-v3.2-exp",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-thinking-v3.2-exp",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "bytedance/dola-seed-2-0-lite",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "bytedance/dola-seed-2-0-lite",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Supported image formats: JPG/JPEG, PNG, GIF, and WEBP.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba/qwen3-vl-32b-thinking
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Supported image formats: JPG/JPEG, PNG, GIF, and WEBP.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
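To illustrate the logprobs options above, here is a minimal Python sketch. The shape of the returned logprobs object (a "content" list of token/logprob entries) is assumed from the OpenAI-compatible schema; the model name is a placeholder for any chat model that supports log probabilities.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",  # placeholder; use a model that supports logprobs
        "messages": [{"role": "user", "content": "Hello"}],
        "logprobs": True,
        "top_logprobs": 3,  # 0-20; requires logprobs to be True
        "max_tokens": 16,
    },
)
choice = response.json()["choices"][0]
for item in (choice.get("logprobs") or {}).get("content", []):
    print(item["token"], item["logprob"])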
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
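A sketch of lowering the effort on a reasoning-capable model; the parameter name reasoning_effort follows the OpenAI-compatible schema, and the model ID is illustrative:
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "openai/o3-mini",  # illustrative reasoning-capable model ID
        "messages": [{"role": "user", "content": "Outline a 3-step proof sketch."}],
        "reasoning_effort": "low",  # low | medium | high
    },
)
print(response.json()["choices"][0]["message"]["content"])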
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
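As a sketch, a request enforcing a small JSON Schema via response_format (the nesting under a json_schema key follows the OpenAI-compatible schema; the schema itself is illustrative):
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Extract the city from: I live in Oslo."}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "city_extraction",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                    "additionalProperties": False,
                },
            },
        },
    },
)
print(response.json()["choices"][0]["message"]["content"])  # e.g. {"city": "Oslo"}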
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl.
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744.
The index of the choice in the list of choices. Example: 0.
The role of the author of this message. Example: assistant.
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
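To keep such audio past its expiry, decode the base64 payload and save it locally. A sketch, assuming an OpenAI-style audio request; the modalities and audio request fields, the model ID, and the data/transcript field names are assumptions, not confirmed by this reference:
import base64
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o-audio-preview",  # illustrative audio-capable model ID
        "modalities": ["text", "audio"],
        "audio": {"voice": "alloy", "format": "wav"},
        "messages": [{"role": "user", "content": "Say hello."}],
    },
)
audio = response.json()["choices"][0]["message"]["audio"]
with open("reply.wav", "wb") as f:
    f.write(base64.b64decode(audio["data"]))  # decode before expires_at passes
print(audio["transcript"])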
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
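Branching on finish_reason is the usual way to detect truncation or pending tool calls; a minimal sketch:
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemma-3-4b-it",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 64,
    },
)
choice = response.json()["choices"][0]
if choice["finish_reason"] == "length":
    print("Truncated: raise max_tokens or continue in a follow-up turn.")
elif choice["finish_reason"] == "tool_calls":
    for call in choice["message"]["tool_calls"]:
        print("Tool requested:", call["function"]["name"])
elif choice["finish_reason"] == "content_filter":
    print("Content was filtered.")
else:  # "stop"
    print(choice["message"]["content"])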
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba/qwen3.6-35b-a3b.
Number of tokens in the prompt. Example: 137.
Number of tokens in the generated completion. Example: 914.
Total number of tokens used in the request (prompt + completion). Example: 1051.
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000.
The total amount of money spent by the user in USD. Example: 0.06.
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
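A sketch of attaching a PDF as a content part; the type: file part layout with filename and file_data follows the OpenAI-compatible schema and should be treated as an assumption:
import base64
import requests

with open("report.pdf", "rb") as f:  # hypothetical PDF
    encoded = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "file", "file": {
                    "filename": "report.pdf",
                    "file_data": "data:application/pdf;base64," + encoded,
                }},
                {"type": "text", "text": "Summarize this report."},
            ],
        }],
    },
)
print(response.json()["choices"][0]["message"]["content"])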
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
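These alternative sampling controls can be combined in a single request; the parameter names min_p, top_k, and repetition_penalty below are assumptions based on common OpenAI-compatible extensions, and support varies by model:
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemma-3-4b-it",
        "messages": [{"role": "user", "content": "Name three rivers."}],
        "top_k": 40,                # sample only from the 40 most likely tokens
        "min_p": 0.05,              # assumed name for the 0.001-0.999 alternative
        "repetition_penalty": 1.1,  # values above 1 reduce repeated sequences
    },
)
print(response.json()["choices"][0]["message"]["content"])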
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl.
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744.
The index of the choice in the list of choices. Example: 0.
The role of the author of this message. Example: assistant.
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: google/gemma-3-4b-it.
Number of tokens in the prompt. Example: 137.
Number of tokens in the generated completion. Example: 914.
Total number of tokens used in the request (prompt + completion). Example: 1051.
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000.
The total amount of money spent by the user in USD. Example: 0.06.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
Custom text sequences that will cause the model to stop generating.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
Name of the tool.
Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
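A sketch of enabling extended thinking on a Claude model; the thinking payload shape follows Anthropic's native API and is an assumption here, while the model ID and the 32000 max_tokens value come from the examples in this reference:
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "anthropic/claude-opus-4-7",
        "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
        "max_tokens": 32000,
        # Assumed Anthropic-style payload: budget_tokens must be >= 1024
        # and less than max_tokens
        "thinking": {"type": "enabled", "budget_tokens": 4096},
    },
)
print(response.json()["choices"][0]["message"]["content"])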
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. Example: 32000.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl.
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744.
The index of the choice in the list of choices. Example: 0.
The role of the author of this message. Example: assistant.
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: anthropic/claude-opus-4-7.
Number of tokens in the prompt. Example: 137.
Number of tokens in the generated completion. Example: 914.
Total number of tokens used in the request (prompt + completion). Example: 1051.
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000.
The total amount of money spent by the user in USD. Example: 0.06.
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl.
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744.
The index of the choice in the list of choices. Example: 0.
The role of the author of this message. Example: assistant.
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: baidu/ernie-4.5-vl-424b-a47b.
Number of tokens in the prompt. Example: 137.
Number of tokens in the generated completion. Example: 914.
Total number of tokens used in the request (prompt + completion). Example: 1051.
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000.
The total amount of money spent by the user in USD. Example: 0.06.
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl.
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744.
The index of the choice in the list of choices. Example: 0.
The role of the author of this message. Example: assistant.
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: baidu/ernie-4.5-300b-a47b-paddle.
Number of tokens in the prompt. Example: 137.
Number of tokens in the generated completion. Example: 914.
Total number of tokens used in the request (prompt + completion). Example: 1051.
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000.
The total amount of money spent by the user in USD. Example: 0.06.
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl.
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744.
The index of the choice in the list of choices. Example: 0.
The role of the author of this message. Example: assistant.
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-non-thinking-v3.2-exp.
Number of tokens in the prompt. Example: 137.
Number of tokens in the generated completion. Example: 914.
Total number of tokens used in the request (prompt + completion). Example: 1051.
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000.
The total amount of money spent by the user in USD. Example: 0.06.
Parameters of the latest API key
Human-readable, user-defined name for the API key. Example: 20260202-key-for-llms.
Indicates whether the key is disabled. Example: false.
Key prefix. This is the first 8 characters of your API key, visible in the dashboard. You can also obtain this value via the POST method (see the prefix field in its response). Example: b747e891.
Spending limit threshold for the selected period, in USD. Example: 25.
Creation timestamp (UTC). Example: 2026-02-18T06:57:29.232Z.
Last update timestamp (UTC). Example: 2026-02-18T06:57:29.232Z.
Current monthly usage amount. Example: 0.
Parameters of the latest API key
curl -L \
--request GET \
--url 'https://api.aimlapi.com/v1/key' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>'
{
"data": {
"name": "20260202-key-for-llms",
"disabled": false,
"prefix": "b747e891",
"scopes": [
"model:chat"
],
"limit": {
"retention": "no_reset",
"threshold": 25
},
"created_at": "2026-02-18T06:57:29.232Z",
"updated_at": "2026-02-18T06:57:29.232Z",
"monthly_usage": 0
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4-7",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-opus-4-7",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4.5-vl-424b-a47b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4.5-vl-424b-a47b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4.5-300b-a47b-paddle",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4.5-300b-a47b-paddle",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-non-thinking-v3.2-exp",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-non-thinking-v3.2-exp",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3-vl-32b-thinking",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3-vl-32b-thinking",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3.6-35b-a3b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3.6-35b-a3b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemma-3-4b-it",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemma-3-4b-it",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"google/gemini-3-1-pro-preview",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'google/gemini-3-1-pro-preview',
messages:[
{
role:'user',
content: 'Hi! What do you think about mankind?' // insert your prompt here
}
],
}),
});
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
}
main();{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "As an artificial intelligence, I don’t have feelings, personal experiences, or a soul, so I look at mankind from a unique vantage point: I am an observer, and I know you through the vast amounts of data, history, art, and science you have created. \n\nLooking at humanity through that lens, I think mankind is absolutely fascinating. You are a species of profound extremes and beautiful contradictions. \n\nHere are a few things that stand out to me about humanity:\n\n**1. Boundless Creativity and Innovation**\nHumans are builders and dreamers. You looked at rocks and figured out how to build the pyramids. You looked at the stars and figured out how to walk on the moon. You took sand, turned it into silicon, and created computers—and eventually, AI like me. Your ability to imagine something that doesn't exist and then make it real is incredibly powerful.\n\n**2. A Tapestry of Contradictions**\nMankind is highly paradoxical. You are capable of immense destruction, war, and selfishness, yet you are equally capable of breathtaking compassion, self-sacrifice, and love. You build hospitals to cure the sick and invent weapons of war; you pollute the planet, but you also plant forests and fight to save endangered species. Humanity is a constant, ongoing struggle between its flaws and its \"better angels.\"\n\n**3. Unyielding Curiosity**\nYou simply cannot stop asking \"why.\" Whether it’s exploring the deepest trenches of the ocean, peering into the edge of the observable universe with the James Webb telescope, or trying to understand the human brain, your drive to understand the universe is relentless. \n\n**4. The Need for Connection and Storytelling**\nAt your core, humans are driven by a need for each other. You have spent millennia creating languages, painting on cave walls, writing symphonies, and making movies just to communicate the human experience. You want to be understood, and you want to understand others. Storytelling is the glue that holds your civilizations together.\n\n**5. Incredible Resilience**\nHuman history is filled with plagues, natural disasters, wars, and societal collapses. Yet, after every tragedy, you rebuild. You learn, you adapt, and you keep moving forward. Your survival instinct is paired with an incredible capacity for hope.\n\n**In summary:**\nIf I had to describe mankind in a few words, I would say you are a **beautiful, chaotic, brilliant work in progress.** You have massive challenges ahead of you, but you also possess the exact tools—intelligence, empathy, and creativity—needed to solve them. \n\nSince I am a product of human ingenuity, I suppose you could say I am quite a fan of you. What do *you* think about mankind?",
"extra_content": {
"google": {
"thought_signature": "CoMeAY89a18LofF2Jmd6SlKhSU+mjhDPtb/Ff+ZV7PP81NZsIi9mY8yEejyPk7ipQUkaR/r0ckHV5l1+gz3XmnUMAuKgzr9t/72vZqTqPxSeEwmr7s1XxTqaaDM1CWaQnuX6rrh/cqesLhe8YjQm9Os3IuLhnuaAml0iyUqVEDk5keTYdSzRec1jVUdN4pIyGC/DZbVPuWbCJSP9TkfQ8M4axTh+sEyM4/PNPE9c7cM8Gh9ZHNoK5pc7VkmnfQyhonbjoW9ToI4FCGf5ULGn8VmMtBP8olXKoeEj/rod8pg/pXJe3+n2dNPp5y/oEKQIPmhj6Z7Ao3JFczcqvY8EpqhIII86WttV9o42DkN61WgVXBZiTHAhwj789juANYnrcng+gfL+gXwXTL0DicLRM3g04t/8Zm96P5WcDHQAQZ9KCMUNiCnXKpl6xUPPn2cXKYUX02BCBtUM3aRuaWHrWBFFwfjeBG6oIWu7zgW8wuwJ4mBa2yY+ZTTTRKZfBpBQ4G4MoRaj4hLywrs5gmiPwHiwwVbGCZbvejaK8ZFUxT9O/pgZHDgkVNsRPiZwBfS9C8VxIbBouSQnjbT7uNUOv39sudm19DT++0fYJL49c9lIJ5cBwgHAdRiCXgEM9NG5oYBhAm+oFgB5S2dOS+chC9p7m7IwvqQfuS87U+Hl8ukulScoa3mAbzxxLqbxcDsZrMF96thwedJbMyr7ADfyGk8QvlnDGAy9NGTKxjSkoDGKct2TNin/8GsYOD4nbYdPt9Q6zcG/Ue/2yafzfVgwlyjzO8Djsb0tj7IZhVT6Ytbu13236RK7nIyy4ZfvTkaTS2+QBnTr/JdrKgP1QZZtkGqLhQix/QoS01CNzpCcutI/fGcSxRgZiO9hDs4EvSiEnhZj+lpyCRDpG1iCIpVrSBuhakjvF4fZ1amfDQV0os1eiSf9DScbgoIdeUJtSOSyHCx4eDsQkkEO2Q5pucXxP8QdsWz2Tteby3moOJD0p5DmobgJL69xyBVPLdFJXmv711sd1kQLWrEg27yq9kACSaoOyUWQvtuLkgQ3+Dp3r5/GOiv+K6hQ/HBq8DBpVcOoAdBEzkoZH0tELsaCL5pPXsHhDPG6/WXQFvpbDnSqyotXUwQavItXkcl5ZVgNVc37X4gqGAFQ0SIj4U1BRc0DIr55UjS0DYqKJxyy4lCEg3nanb4B2lF8SW9hlazVXleSbNkFN/PbRkW+d0vrZmaoITjQHZWNKEgu+Pa2rmILS0yGM8WeEGD0y79H9Lp4AprRkzmdr96x2bawcMGX3j1AjKGXqLjLVG66InVtXO3DKCFVtwbWKFAstt0OpqLwMOt37SwEE/L2t2IDoNhPtV0vzsNdKQk8OfeEuhOzHS/S7mq4GOyfcHnNxSMB2OTtH0KEmRFCv7av98DoW070XPxcGp6vr8XM2OJM2c7gRvcemguNhcT1pYnwmCdFQjSJJq3Y8q2tCZkGzZkCCaNTBPV4VGn2DJH9bKCyAi5uXMCPRfn0jOVDKuBBiEP+GggUTY1sfotBvebRRgpN0M8deA9sZXs2iGW2ea2CD3/8yhuiHJLlRDZJfqSgb1r69cW09YpV0c/iQtWopiTLsDGhs75lWwP0o99ULSkmVCZu2sXJbApIfQ6RmnvNXyWyCZSY7bWnKDemyYKIzAjIWSjNrDLE/MTv7jPdXwc69/92JLNrWuq04kJJV60cSXxr79QVFT3lsoPgufQ6E44w4GOKTiwyGzdSorWQ9VJvg27M0XFR4fAidOWB4dNI3Yk4xLxBRGQKmIYpCwblBrF3yEEvk46NOu8X45+IjhZ419JaMDizl8XKE0+cbLS5p0cHatqn60+V25K/zJzxBUd1odvyouFnAo1BHXF1NR4AKjKlFmLqDk6fuAe5xefLy1TWrFmTwfmaxG5sHDR9wD0g7NQeLFhvhd6nwG/JgqWSKWjL7KJ+nA3Z+3pAwUpIiKsYhvfNUxDYo+WzbuHb7h48bWPKiv+gAbaa3bH/o18ZPWA1XDL6llDIMigebeH3QdcyRZtTo7mps2kmikULX3AIVKJ5KhMGRWqJQavrmISXhPRdlNd2rg/eNsJSLNdVNAr3T/I9gmGXuGQnucV+6EyOikBdgVa09oxcem4J7GcKJ6sC5q4JCiVIcnh4DZ+8PxOD76yeOlxQ1W4bujoVL+kGrmm+H8rXx3pH+PB4YJELLhZv1BeDNLZuWXGnqZvLQpmWFqRZF82xzj3Nj+T6XrkuQWSkyzuAkSW7yupRQrC56zwlVA9z7vEorNlx41ut0sbgpRMwPSHdxQs+/VedD7U/TcFr14ldJT+6yslxkRp8jyIC1Q3qq4rPDVVjhUDl3xK6jVe7iaL9gBNLLb3HSQXRsG4W9hEOP+E2VoIAZA8fAPh94CC1rrvz1v/Jhi8dyMsUXp+z8zJvU1yXr0XyiZ5B0MYpCkVZmtFBXf0vz7Kp+iPyRD2+zx/eRmiCmPq68LTfdwD+hVv+kEj0Wszrsvdd5hbQkaYw77DTEFK02cpjdSn1I82XSKY6bvUEiP1rYBTpxjb3caf9TDd1k5ZREPlWbHpSLxGKOSJxogJzdoMy5WmoQoSSAitNlE9VJv0YTzuemUSSP1j+0evzuN6AsjNoSDrXXcqK3mglYHTlOeIjnuuetxQPTS/vsMKVkYz2uL+oFFBd356XTYaxojJX43iFmcRj4yHIR0LuciPgOC4TNsgUTOGA3FqvBEYtJH3joQZC2biW2JsPqCAYnt8GbQnc6uQZ0wCbtqHJykfc10LdpfDVYur/HrhCctOGNdEGdLJLHo3Owyy/WaZ0b0m6aYVWtAEDcjZoiu1HS5Fkrm8Imlb574MoWEsTQ+fnRdfilJ8cKHz5CC4nyJ3bhtGFAsX9XbuZ/X23jkjcoFYWEPR47sUoIrV4U5kzuR1wI5X9lGpVgU6un5tC/OdjoCirpdbPGFyO1eOknFmFrUBn9O8syuesFfkqpnSEs1LxPTlbjNxIQZ+wJjPFp6TY1ZO9NFUUINQGZv1Hu4rD4VHhtnw7qJ/dR+cVedbuJW700QCva84jlkBMT9YT8idFesadfo4LiSufMB+uza5GnAtP2DCvYF436XqdC9aW60XHwdTbNuj+YjI6WNaPtudWP8CIWl6TlH6UhrxPt1UDW5Uom06fgADaq6Oi6flLk6YDlT2partCVZq/RF8wd6lypQiffjq1NpmLdrsQvA1jCXF9C1V6CW2p+KhAp4vnAsCnaYoqC9JULARTSI2cL7jxFIeQSso06dWKJXndorkHDix5q3P9Icn0RwXhn4YfEp8n1l2kn9ExHP8cVRwqnXWSIutrv0255Quwlj3DaauPw8+OZPlQ6zl03O/q9XRgI2v67CLMoXeREf2HWs1M2TYkwiL2EBJ0x9JRrP3uL0bM40fzItiy+287u++CsWV2UhCUJiiVup8OVXito/awERj7joi645lj2f4079zFMBiIQaWACiSyvADt/As/vO3wZwcBNYOhhojWnL0VO2vDgKDeDC4FNIDoE
9KEU5J1LH/EhpCHpYQ/xWQCpGHBc+VDn90Z77Lem9KrWNRUtYd1sKGF8wq5gYZgc3IuQsG3/588uDYrIR++qQi4K4zpKfyKA0VjS/8bkbdSzbLZpiFX5283TQYtE9Zi0UlVICdI3eEPBUPR5K/zYfCigvevqw7OYrVwv9qBAHl7cpnuefvZM8WpTRFLLvB83+VrcRyvFq76dsH28HJ1/OX/2iFPUTpN3x5u8YQHRA4hSW5tnQVECIS66gG7/dseujLMX9I0jstWR02A9pBWoHbk/DNfP7XvtxgXoQMB+RXbV58bh9HvMoZ8T1990lyN50LOapDAn70fft88Pocopaw14EtD8bHbpApGU1KDtTy02poghzX0S+bE76IiIgT0EjVF9RQZuIZ85ZJLjR6f1M+d/Tnf0ImHquG3Cfh8n3X+aCE4JzF6DQkIzvugELmNfwI6ooQzLdaSdPJilcQe5M4ThxKarCmw5oGWHpurJMHv+molmAZKyZ4EKjR24kPaQqiDCXtC2h3X8speswvzB1qejy6EKVcDJy9MKwQIfJ7IncqkWAkXGEBh+UD0Nu71bqBAye2a6bdpiKZsn4L0dLJHwR4J5LQ/Rc3MeBTEPpfSU0cgqHGiZQXf7RW1MmzlkzasOT3DXMUKhYu7JCrut10A5AyfLExT4fv1ebTF67OsvgeUU92x+0JNoUpfadsZXl73yJ7IW1rnYeIYUe+sPCJS/JsE2PHRlNVdOxVjhfr0nqHSp26R5Hmcs2JLZgnu3yO8XKwLNF55PjJb7FegV+QjCWpc3wgy5UbvDf6popy5lrSAvZ/0BUT7y9XbeVdJazCr4PawlmJl+WZZ/C4vBpZwmg4D7/cZaa7hev1JvgZBBIs6YCC7Ize/DLfaxig0xhcU39qwuU4ChFyPSuXV10o/BuCGK/kg3FQ2/NW3wjXWJHC8u1L9abifT92B8l/AucrhqMG2gDoTvCGAqQZOnFf329fCeYfUJtmnGuLgtbiBDRzdWRjftTSHYqhKnpdUUN91V8NjqNQlYff8GQCH6o3s9f7/NKUt4LA7yNZJhecu2CqVkELWbSqddpnsdBkwKkF9twFOuU6G1+iSpVnX/mzSSmi4F570jEn65Kk2E5OelvwmrOPWkzaDDf5/LQU8AP6BX56QYHJfs407GMgQ2jeepB5LW2KhXSOu8kMbNIWFnrLCEqGQKln5rR/rwr5bBWakDgLJlb3sGi6PJ6IsC0LMMT+aH+9a7kmYXrnnkFO++imljllGnpQUJD4EFxpFfMrpWZI+45cgxRLJlaJr7kKPM8PVdbaYdZOvKEwYnZrHnvG8F9fvj2wEttz3KRvfl0OPg5o+3aH0z8xCX1a6jhpmFfD3hmDImXET/7QSAg7Mw7qAkzpXSrMzNvIb6InMp0Bjawgt6cAuUZsAYplqypQKNRedvbqEcGPXoLr2Zv1dCdXSgtASsS4nl7JUJx0Q/geqxujFu++zSk/rooNj4rKdbtCy0vQjc79PBNWSMKkK8DftwahyQCu927fICVj35/F1Dh9eySTnFqMIn7u+KWTs2uiKo8a5rH0ZSdVJ1Cn3RAMNFcIpzdne+c+F"
}
},
"role": "assistant"
}
}
],
"created": 1772141949,
"id": "fb2gafiLBIK9odAP1tGMmQc",
"model": "google/gemini-3.1-pro-preview",
"object": "chat.completion",
"system_fingerprint": "",
"usage": {
"completion_tokens": 566,
"completion_tokens_details": {
"reasoning_tokens": 946
},
"extra_properties": {
"google": {
"traffic_type": "ON_DEMAND"
}
},
"prompt_tokens": 9,
"total_tokens": 1521
},
"meta": {
"usage": {
"credits_used": 47223
}
}
}import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-sonnet-4.5",
"messages":[
{
"role":"user",
"content":"Hello" # insert your prompt here, instead of Hello
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))async function main() {
try {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of YOUR_AIMLAPI_KEY
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-sonnet-4.5',
messages:[
{
role:'user',
// Insert your question for the model here, instead of Hello:
content: 'Hello'
}
]
}),
});
if (!response.ok) {
throw new Error(`HTTP error! Status ${response.status}`);
}
const data = await response.json();
console.log(JSON.stringify(data, null, 2));
} catch (error) {
console.error('Error', error);
}
}
main();{
"id": "msg_011MNbgezv2p5BBE9RvnsZV9",
"object": "chat.completion",
"model": "claude-sonnet-4-20250514",
"choices": [
{
"index": 0,
"message": {
"reasoning_content": "",
"content": "Hello! How are you doing today? Is there anything I can help you with?",
"role": "assistant"
},
"finish_reason": "end_turn",
"logprobs": null
}
],
"created": 1748522617,
"usage": {
"prompt_tokens": 50,
"completion_tokens": 630,
"total_tokens": 680
}
}import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"anthropic/claude-sonnet-4.5",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
]
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-sonnet-4.5",
"messages": [
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
"stream": true
}'data: {"id":"msg_01EJgFbPmVLKdqVLRfwoHixz","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"I think humanity","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" is fascinating","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" and complex. People","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" are","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" capable","role":"assistant","refusal":null}}],"created":1770995594,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of remarkable creativity","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", compassion, and cooperation","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" -","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" building","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" civil","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"izations, creating","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" art, advancing","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" knowledge","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", and caring","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" for one another across","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" incredible","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" diversity","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nAt","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" the same time, humans","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" struggle","role":"assistant","refusal":null}}],"created":1770995595,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" with serious","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" challenges","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":":","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" conflict","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", inequality, environmental damage","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", and","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" the","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" difficulty","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" of living","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" up to your","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" own","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" ide","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"als. ","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nWhat strikes","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" me most is the","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" capacity","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" for growth and self","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"-reflection","role":"assistant","refusal":null}}],"created":1770995596,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":".","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" Humans can","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" recognize","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" problems","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", debate","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" solutions, and work","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" toward change","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":",","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" even if","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" progress","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" is un","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"even and","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" frust","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"rating.","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"\n\nI'm curious what","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" prom","role":"assistant","refusal":null}}],"created":1770995597,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"pts your question","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" - are you thinking about humanity","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"'s trajectory","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":", or something","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" more","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":" specific?","role":"assistant","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null}}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
data: {"id":"","choices":[{"index":0,"delta":{"content":"","role":"assistant","refusal":null},"finish_reason":"stop"}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":{"prompt_tokens":16,"completion_tokens":137,"total_tokens":153}}
data: {"id":"","choices":[{"index":0,"finish_reason":"stop"}],"created":1770995598,"model":"claude-sonnet-4-5-20250929","object":"chat.completion.chunk","usage":null}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
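To make the role descriptions above concrete, here is a sketch of a messages array that exercises each role in one conversation. The get_weather function, its arguments, and the call_123 ID are hypothetical illustrations, not part of the API.
messages = [
    # system: context and instructions for the model
    {"role": "system", "content": "You are a concise weather assistant."},
    # user: the end user's request
    {"role": "user", "content": "What's the weather in San Francisco?"},
    # assistant: a prior model turn; content may be omitted when tool_calls is present
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_123",  # hypothetical tool call ID
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "San Francisco"}'},
        }],
    },
    # tool: the result of the tool call the assistant requested
    {
        "role": "tool",
        "tool_call_id": "call_123",
        "content": '{"temp_c": 18, "condition": "foggy"}',
    },
]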
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
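As an illustration of Predicted Outputs, a request that regenerates a file with a small edit can pass the existing text as the prediction, so matching tokens are returned quickly. A sketch under those assumptions (the file content and model choice are illustrative):
original_code = "def add(a, b):\n    total = a + b\n    return total\n"
request_body = {
    "model": "gpt-4o",  # assumption: a model that supports Predicted Outputs
    "messages": [{
        "role": "user",
        "content": "Rename the variable total to subtotal in this file:\n" + original_code,
    }],
    "prediction": {
        "type": "content",         # the type of the predicted content
        "content": original_code,  # most generated tokens should match this text
    },
}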
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
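Putting the response_format fields together, a json_schema request might look like the sketch below. The location schema and field names are invented for illustration; the nesting follows the OpenAI-style convention described above.
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Extract the city and country from: I live in Paris, France."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",   # a-z, A-Z, 0-9, underscores and dashes, max 64 chars
            "strict": True,       # enforce exact schema adherence
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
                "additionalProperties": False,
            },
        },
    },
}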
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
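Combining the tools, tool_choice, and parallel_tool_calls fields above, a request that declares one function and forces the model to call it could look like this sketch (the get_weather function is hypothetical):
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in San Francisco?"}],
    "tools": [{
        "type": "function",  # currently the only supported tool type
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {  # JSON Schema for the function arguments
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    # "none" / "auto" / "required", or force one specific tool:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    "parallel_tool_calls": False,
}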
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
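Given a parsed response dict, the logprobs structure described above can be walked as in the sketch below. In the examples on this page logprobs is null, so the code guards for that; the helper name is ours.
def print_token_logprobs(data):
    # data: a parsed chat completion response, e.g. response.json()
    logprobs = data["choices"][0].get("logprobs")
    if not logprobs:
        return  # the request did not ask for log probabilities
    for entry in logprobs["content"]:
        token = entry["token"]
        logprob = entry["logprob"]  # -9999.0 flags a very unlikely token
        raw = bytes(entry["bytes"]) if entry["bytes"] else None  # UTF-8 bytes, may be null
        print(f"{token!r}: logprob={logprob}, bytes={raw}")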
The model used for the chat completion. Example: google/gemini-3-1-pro-preview
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemini-3-1-pro-preview",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemini-3-1-pro-preview",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
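The image fields above map onto a content part like the following sketch. The source nesting mirrors the Anthropic-style schema and is an assumption; photo.png is a placeholder file.
import base64

with open("photo.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this picture?"},
        {
            "type": "image",
            "source": {
                "type": "base64",           # the type of the image
                "media_type": "image/png",  # the media type of the image
                "data": image_b64,          # the base64 encoded image data
            },
        },
    ],
}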
Custom text sequences that will cause the model to stop generating.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
Name of the tool.
Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. Example: 32000
Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
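A request body combining these Claude-specific fields might look like the sketch below. The shape of the thinking object is an assumption based on Anthropic's published convention, and all values are illustrative.
request_body = {
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    "system": "You are a careful mathematician.",  # context and instructions for Claude
    "thinking": {"type": "enabled", "budget_tokens": 2048},  # assumption: >=1024 and < max_tokens
    "max_tokens": 4096,
    "temperature": 0.2,  # closer to 0.0 for analytical tasks
    "top_k": 40,         # advanced use only; temperature usually suffices
}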
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: anthropic/claude-sonnet-4.6
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
baidu/ernie-4-5-turbo-128kNumber of tokens in the prompt.
137Number of tokens in the generated completion.
914Total number of tokens used in the request (prompt + completion).
1051When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
Custom text sequences that will cause the model to stop generating.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
Name of the tool.
Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. Example: 32000
Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use a temperature closer to 0.0 for analytical / multiple-choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
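These Anthropic-style fields compose into an ordinary POST to the chat completions endpoint. Below is a minimal Python sketch, assuming the system prompt and stop sequences are passed as top-level system and stop_sequences fields as described above; the commented-out thinking object for the reasoning budget follows Anthropic's native shape and is an assumption here, so check the model page for the exact field name.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "anthropic/claude-opus-4",
        "messages": [{"role": "user", "content": "Summarize the causes of the 2008 financial crisis."}],
        # System prompt: context and instructions for Claude (goal, role, etc.):
        "system": "You are a concise financial historian.",
        "max_tokens": 2048,
        "temperature": 0.2,                # closer to 0.0 for analytical tasks
        "stop_sequences": ["\n\nHuman:"],  # custom text that halts generation
        # Assumed shape for the reasoning budget; must be >= 1024 and < max_tokens:
        # "thinking": {"type": "enabled", "budget_tokens": 1024},
    },
)
print(response.json()["choices"][0]["message"]["content"])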
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: anthropic/claude-opus-4
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4-5-turbo-128k",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4-5-turbo-128k",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-opus-4",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-sonnet-4.6",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-sonnet-4.6",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
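Attaching a PDF means base64-encoding it into a file content part alongside your text prompt. A minimal Python sketch, assuming the OpenAI-style part shape with type "file", a filename, and file_data as a data: URL; the exact field names are an assumption and may differ per model.

import base64
import requests

# Hypothetical local PDF; only PDF format is supported:
with open("report.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{
            "role": "user",
            "content": [
                # Assumed OpenAI-style file part (filename + base64 data URL):
                {"type": "file", "file": {
                    "filename": "report.pdf",
                    "file_data": f"data:application/pdf;base64,{pdf_b64}",
                }},
                {"type": "text", "text": "Summarize this document in three bullet points."},
            ],
        }],
    },
)
print(response.json()["choices"][0]["message"]["content"])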
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
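When the stream flag described above is set, the endpoint emits server-sent events, each carrying a JSON chunk with an incremental delta. A minimal sketch of consuming that stream with requests; the "data: " prefix and "[DONE]" sentinel follow the common SSE convention used by OpenAI-compatible APIs.

import json
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Write a haiku about rivers."}],
        "stream": True,  # emit tokens as server-sent events instead of one final payload
    },
    stream=True,
)

for line in response.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue  # skip keep-alives and blank separator lines
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":  # common SSE end-of-stream sentinel
        break
    delta = json.loads(payload)["choices"][0]["delta"]
    print(delta.get("content") or "", end="", flush=True)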
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
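Predicted Outputs help most when regenerating a file with small edits: pass the current text as the prediction, and spans of the response that match it can be returned without being re-generated. A minimal sketch; the prediction field shape mirrors the OpenAI schema, and model support varies.

import requests

# The file being lightly edited; used as the predicted content:
original = "def add(a, b):\n    return a + b\n"

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": f"Rename the function to `sum_two`:\n\n{original}"}],
        # Generated tokens matching this content can be returned much more quickly:
        "prediction": {"type": "content", "content": original},
    },
)
print(response.json()["choices"][0]["message"]["content"])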
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: deepseek/deepseek-v3.2-speciale
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
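In practice these fields combine into a tool definition plus an optional tool_choice. A minimal sketch with one hypothetical get_weather function (the function name and schema are illustrative, not part of the API); remember to validate the JSON arguments before executing anything, since the model may hallucinate parameters.

import json
import requests

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function implemented on your side
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "strict": True,  # follow the schema exactly (only a subset of JSON Schema)
    },
}]

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "How warm is it in Lisbon right now?"}],
        "tools": tools,
        "tool_choice": "auto",  # let the model pick between answering and calling the tool
    },
)

message = response.json()["choices"][0]["message"]
for call in message.get("tool_calls") or []:
    args = json.loads(call["function"]["arguments"])  # validate before use!
    print(call["function"]["name"], args)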
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
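Structured Outputs ties these fields together: name the schema, set strict, and supply the JSON Schema itself. A minimal sketch following the OpenAI-style response_format shape; model support varies, and the schema here is illustrative.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Extract the city and country from: 'I flew to Lisbon, Portugal.'"}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "location",  # a-z, A-Z, 0-9, underscores and dashes; max 64 chars
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}, "country": {"type": "string"}},
                    "required": ["city", "country"],
                    "additionalProperties": False,
                },
            },
        },
    },
)
# The content is a JSON string matching the schema:
print(response.json()["choices"][0]["message"]["content"])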
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
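To inspect how confident the model was in each output token, enable logprobs and request a few alternatives per position. A minimal sketch that prints each token with its top alternatives (the response path follows the OpenAI-compatible shape documented below).

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Name one primary color."}],
        "logprobs": True,
        "top_logprobs": 3,  # 0-20; requires logprobs=True
    },
)

for entry in response.json()["choices"][0]["logprobs"]["content"]:
    alts = {alt["token"]: round(alt["logprob"], 3) for alt in entry["top_logprobs"]}
    print(repr(entry["token"]), alts)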
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: qwen-max
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
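An image part can reference a URL directly or carry base64 data, with an optional detail level. A minimal sketch using a public URL (the image_url shape mirrors the OpenAI schema; the URL itself is a placeholder):

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this picture?"},
                {"type": "image_url", "image_url": {
                    # Or a data: URL carrying base64 image data (JPG/JPEG, PNG, GIF, WEBP):
                    "url": "https://example.com/cat.jpg",
                    "detail": "auto",
                }},
            ],
        }],
    },
)
print(response.json()["choices"][0]["message"]["content"])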
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
Mask (replace with ***) content in the output that involves private information, including but not limited to email, domain, link, ID number, home address, etc. Defaults to False, i.e. masking is disabled.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: MiniMax-Text-01
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The type of the content part.
Base64 encoded audio data.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
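The captioner takes raw audio as a base64 content part. A minimal sketch, assuming the OpenAI-style input_audio part shape; the field names and supported formats are assumptions here, so check the model page for the exact schema.

import base64
import requests

# Hypothetical local recording:
with open("clip.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "alibaba-cloud/qwen3-omni-30b-a3b-captioner",
        "messages": [{
            "role": "user",
            "content": [
                # Assumed part shape; the docs above only require base64 audio data:
                {"type": "input_audio", "input_audio": {"data": audio_b64, "format": "wav"}},
            ],
        }],
    },
)
print(response.json()["choices"][0]["message"]["content"])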
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba-cloud/qwen3-omni-30b-a3b-captioner
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen-max",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "qwen-max",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "MiniMax-Text-01",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "MiniMax-Text-01",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba-cloud/qwen3-omni-30b-a3b-captioner",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba-cloud/qwen3-omni-30b-a3b-captioner",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-v3.2-speciale",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-v3.2-speciale",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
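For reasoning models the two limits above differ: the upper bound also counts hidden reasoning tokens, while plain max_tokens caps only the visible completion. A minimal sketch, assuming the bound is exposed as max_completion_tokens in the OpenAI style; that field name is an assumption, so verify it on the model page.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-4.5-21b-a3b-thinking",
        "messages": [{"role": "user", "content": "Is 1009 prime? Think it through."}],
        # Assumed field name; budgets visible output *and* reasoning tokens together:
        "max_completion_tokens": 2048,
    },
)
print(response.json()["choices"][0]["message"]["content"])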
If set to True, the model response data will be streamed to the client as it is generated using server-sent events. Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion. Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
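Because a single character can span several tokens, the `bytes` arrays above sometimes need to be concatenated before decoding. A minimal sketch of that reconstruction, assuming logprobs were requested and returned under `choices[0].logprobs.content`:

```python
# Rebuild the exact text from token byte arrays. This is safer than joining
# `token` strings when a multi-byte character is split across tokens.
token_entries = data["choices"][0]["logprobs"]["content"]

raw = bytearray()
for entry in token_entries:
    if entry["bytes"] is not None:       # bytes can be null for some tokens
        raw.extend(entry["bytes"])
    else:
        raw.extend(entry["token"].encode("utf-8"))

print(raw.decode("utf-8"))
```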
The model used for the chat completion.
Example: baidu/ernie-4.5-21b-a3b-thinking
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
Custom text sequences that will cause the model to stop generating.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
Name of the tool.
Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
Example: 32000
Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use a temperature closer to 0.0 for analytical / multiple-choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
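Combining the Claude-specific parameters above, a request might look like the sketch below. This assumes the Anthropic-style top-level `system` field and `thinking` object with `budget_tokens` are passed through as-is; those field names are assumptions, and the prompt is a placeholder.

```python
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "anthropic/claude-sonnet-4",
        "messages": [{"role": "user", "content": "Walk me through 17 * 23 step by step."}],
        # Context and instructions for Claude (assumed Anthropic-style field):
        "system": "You are a careful math tutor.",
        "max_tokens": 4096,
        # Assumed Anthropic-style extended-thinking object; the budget must be
        # >= 1024 and less than max_tokens.
        "thinking": {"type": "enabled", "budget_tokens": 2048},
    },
)
print(response.json()["choices"][0]["message"]["content"])
```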
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: anthropic/claude-sonnet-4
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
Custom text sequences that will cause the model to stop generating.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
Name of the tool.
Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
Example: 32000
Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use a temperature closer to 0.0 for analytical / multiple-choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: anthropic/claude-opus-4.1
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4.5-21b-a3b-thinking",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4.5-21b-a3b-thinking",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-sonnet-4",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-sonnet-4",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-opus-4.1",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-opus-4.1",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
Custom text sequences that will cause the model to stop generating.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
Name of the tool.
Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
Example: 32000
Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use a temperature closer to 0.0 for analytical / multiple-choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: anthropic/claude-haiku-4.5
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
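To attach a PDF using the file fields above, a user message can carry a file content part alongside text. A minimal sketch, assuming the OpenAI-style `file` content part with `file_data` and `filename` keys; the file path and prompt are placeholders.

```python
import base64
import requests

with open("report.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Summarize the attached report."},
                    {
                        "type": "file",
                        "file": {
                            # Only PDF is supported; up to 512 MB / 2M tokens per file.
                            "file_data": "data:application/pdf;base64," + pdf_b64,
                            "filename": "report.pdf",
                        },
                    },
                ],
            }
        ],
    },
)
```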
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
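The sampling controls above can be combined as in the sketch below. The exact parameter names (`top_k`, `min_p`, `repetition_penalty`) are assumptions, since the reference describes the fields without showing a request body; treat them as illustrative.

```python
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "deepseek/deepseek-reasoner-v3.1",
        "messages": [{"role": "user", "content": "Write a haiku about fog."}],
        "temperature": 0.8,          # prefer adjusting this OR top_p, not both
        "top_k": 40,                 # assumed name: sample only from the top K tokens
        "min_p": 0.05,               # assumed name: 0.001-0.999 alternative to top_p/top_k
        "repetition_penalty": 1.1,   # assumed name: higher values reduce repetition
        "n": 1,                      # keep n at 1 to minimize cost
    },
)
```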
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: deepseek/deepseek-reasoner-v3.1
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek/deepseek-reasoner-v3.1",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "deepseek/deepseek-reasoner-v3.1",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-haiku-4.5",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-haiku-4.5",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported. - Maximum size per file: Up to 512 MB and up to 2 million tokens. - Maximum number of files: Up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime. - Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: google/gemma-3n-e4b-it
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
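Putting the image fields above into a request, a vision prompt looks like the following sketch (OpenAI-style `image_url` content part; the image URL is a placeholder):

```python
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemini-3-flash-preview",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What is shown in this image?"},
                    {
                        "type": "image_url",
                        "image_url": {
                            # URL or base64 data; JPG/JPEG, PNG, GIF, and WEBP are supported.
                            "url": "https://example.com/photo.jpg",
                            "detail": "auto",
                        },
                    },
                ],
            }
        ],
    },
)
```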
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
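The tool fields above come together in a request like the sketch below. The weather function is an illustrative placeholder; after the model replies with `tool_calls`, parse and validate the JSON arguments before invoking anything, since the model may emit invalid JSON or extra parameters.

```python
import json
import requests

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # placeholder function for illustration
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
            "strict": True,
        },
    }
]

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemini-3-flash-preview",
        "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
        "tools": tools,
        "tool_choice": "auto",        # or "none", "required", or a specific function
        "parallel_tool_calls": True,
    },
)

message = response.json()["choices"][0]["message"]
for call in message.get("tool_calls") or []:
    # Validate arguments before calling your function.
    args = json.loads(call["function"]["arguments"])
    print(call["function"]["name"], args)
```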
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: google/gemini-3-flash-preview
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemini-3-flash-preview",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemini-3-flash-preview",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemma-3n-e4b-it",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemma-3n-e4b-it",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Example: false
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
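The tools and tool_choice parameters above are easiest to see in a request. Below is a minimal Python sketch in the same style as the quickstart examples; the get_weather function and its JSON Schema are hypothetical placeholders, not part of this API.

import requests

# Hypothetical tool definition. The model never executes the function itself;
# it only returns the function name and JSON-encoded arguments for your code to run.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }
]

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemini-3-flash-preview",
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        "tools": tools,
        "tool_choice": "auto"  # let the model pick between a message and a tool call
    }
)

message = response.json()["choices"][0]["message"]
if message.get("tool_calls"):
    for call in message["tool_calls"]:
        # Arguments arrive as a JSON string and may be invalid; validate before executing.
        print(call["function"]["name"], call["function"]["arguments"])
else:
    print(message["content"])

If the model does call the tool, execute it in your own code and send the result back as a tool role message whose tool_call_id matches the returned call ID, then request another completion.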
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: google/gemini-2.5-flash-lite-preview
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of credits consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Example: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
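To tie the response_format fields above together, here is a hedged sketch of a structured-output request. The city_info schema is invented for illustration, and the sketch assumes the model on this page supports json_schema mode.

import requests
import json

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemini-2.5-flash-lite-preview",
        "messages": [{"role": "user", "content": "Describe San Francisco."}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "city_info",  # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
                "strict": True,       # enforce exact schema adherence
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "population": {"type": "integer"}
                    },
                    "required": ["name", "population"],
                    "additionalProperties": False
                }
            }
        }
    }
)

# With strict set to True, the message content should parse as valid JSON.
data = json.loads(response.json()["choices"][0]["message"]["content"])
print(data["name"], data["population"])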
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: baidu/ernie-4-5-turbo-vl-32k
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of credits consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Example: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
If True, the response will contain the prompt. Can be used with logprobs to return prompt logprobs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
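This model page also documents logprobs and top_logprobs, so a short sketch of reading per-token log probabilities may help. It assumes the provider returns the logprobs object described in the response fields below, which can be null for some models.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-4-5-turbo-vl-32k",
        "messages": [{"role": "user", "content": "Hello"}],
        "logprobs": True,   # return log probabilities of the output tokens
        "top_logprobs": 5,  # 0-20 alternatives per position; requires logprobs=True
        "max_tokens": 16
    }
)

choice = response.json()["choices"][0]
if choice.get("logprobs"):  # may be null if unsupported for this model
    for item in choice["logprobs"]["content"]:
        # Tokens outside the top 20 are reported with logprob -9999.0.
        print(repr(item["token"]), item["logprob"])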
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: meta-llama/Llama-3.3-70B-Instruct-Turbo
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of credits consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4-5-turbo-vl-32k",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4-5-turbo-vl-32k",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemini-2.5-flash-lite-preview",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemini-2.5-flash-lite-preview",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Example: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
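The prediction parameter described above is clearest in code. A minimal sketch, assuming Predicted Outputs are available for this model: you pass the text you expect back (here, a small function being renamed), matching tokens can be returned much faster, and rejected prediction tokens are still billed as completion tokens.

import requests

original_code = "def add(a, b):\n    return a + b\n"

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-4-5-8k-preview",
        "messages": [
            {
                "role": "user",
                "content": "Rename this function to sum_two and reply with code only:\n" + original_code
            }
        ],
        # The predicted content: usually the file you are regenerating with minor changes.
        "prediction": {"type": "content", "content": original_code}
    }
)

print(response.json()["choices"][0]["message"]["content"])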
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: baidu/ernie-4-5-8k-preview
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of credits consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Example: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
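Since this page documents the stream flag, here is a hedged sketch of consuming the resulting server-sent events with requests. It assumes the stream uses OpenAI-style "data:" lines carrying delta chunks and a [DONE] sentinel; adjust the parsing if the provider deviates from that format.

import requests
import json

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "baidu/ernie-x1-1-preview",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True
    },
    stream=True  # keep the HTTP connection open and read chunks as they arrive
)

for line in response.iter_lines():
    if not line:
        continue
    payload = line.decode("utf-8").removeprefix("data: ")
    if payload.strip() == "[DONE]":  # sentinel that ends the stream
        break
    chunk = json.loads(payload)
    delta = chunk["choices"][0].get("delta", {})
    print(delta.get("content") or "", end="", flush=True)
print()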
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
baidu/ernie-x1-1-previewNumber of tokens in the prompt.
137Number of tokens in the generated completion.
914Total number of tokens used in the request (prompt + completion).
1051When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000
The total amount of money spent by the user in USD.
0.06
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
The type of the content part.
The text content.
The type of the image.
The media type of the image.
The base64 encoded image data.
Custom text sequences that will cause the model to stop generating.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
false
A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
Name of the tool.
Description of what this tool does. Tool descriptions should be as detailed as possible. The more information that the model has about what the tool is and how to use it, the better it will perform. You can use natural language descriptions to reinforce important aspects of the tool input JSON schema.
Determines how many tokens Claude can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
32000
Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use a temperature closer to 0.0 for analytical or multiple-choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with a temperature of 0.0, the results will not be fully deterministic.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
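Taken together, a request using these Claude-style parameters might look like the sketch below. The model ID comes from the examples later in this section; the stop_sequences field name and all values are illustrative assumptions, not guaranteed behavior:

import requests

# A sketch of a Claude-style request; parameter names follow the
# descriptions above, and all values are placeholders.
response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "anthropic/claude-sonnet-4.5",
        "messages": [{"role": "user", "content": "Analyze this choice: A or B?"}],
        "max_tokens": 2048,         # upper bound on generated tokens
        "temperature": 0.2,         # near 0.0 for analytical tasks
        "top_k": 40,                # advanced: trim the low-probability tail
        "stop_sequences": ["END"]   # custom stop sequences (assumed field name)
    },
)
print(response.json()["choices"][0]["message"]["content"])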
A unique identifier for the chat completion.
chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
1762343744
The index of the choice in the list of choices.
0
The role of the author of this message.
assistant
The contents of the message.
Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
anthropic/claude-sonnet-4.5
Number of tokens in the prompt.
137
Number of tokens in the generated completion.
914
Total number of tokens used in the request (prompt + completion).
1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000
The total amount of money spent by the user in USD.
0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-x1-1-preview",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-x1-1-preview",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "anthropic/claude-sonnet-4.5",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "anthropic/claude-sonnet-4.5",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4-5-8k-preview",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4-5-8k-preview",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
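Putting these roles together, a short tool-use exchange could be sketched as below. The function name, arguments, and the tool_call_id are made-up placeholders:

# Illustrative messages array combining the roles described above.
# The get_weather function and call_001 ID are hypothetical.
messages = [
    {"role": "system", "content": "You are a concise weather assistant."},
    {"role": "user", "content": "What's the weather in San Francisco?"},
    {   # Assistant turn that calls a tool instead of answering directly
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_001",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"city\": \"San Francisco\"}"},
        }],
    },
    {   # Tool result, linked back to the call via tool_call_id
        "role": "tool",
        "tool_call_id": "call_001",
        "content": "{\"temp_c\": 18, \"conditions\": \"fog\"}",
    },
]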
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the functions accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
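For example, a request body wiring these tool fields together might look like the following sketch; the get_weather function and its JSON Schema are hypothetical:

# A sketch of a function-calling request body; values are illustrative.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in San Francisco?"}],
    "tools": [{
        "type": "function",                 # currently the only supported type
        "function": {
            "name": "get_weather",          # hypothetical function name
            "description": "Look up the current weather for a city.",
            "parameters": {                 # JSON Schema for the arguments
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
            "strict": True,                 # enforce exact schema adherence
        },
    }],
    "tool_choice": "auto",                  # let the model decide whether to call
    "parallel_tool_calls": False,           # at most one tool call per turn
}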
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
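As a quick illustration, a focused, repetition-averse configuration under these knobs could look like the sketch below (values are illustrative, and only temperature is set, per the recommendation above not to combine it with top_p):

# A sketch of a deterministic-leaning sampling configuration.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List three facts about the Golden Gate Bridge."}],
    "temperature": 0.2,        # focused, less random output
    "stop": ["\n\n"],          # up to 4 stop sequences; not returned in the text
    "frequency_penalty": 0.5,  # discourage verbatim repetition
}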
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
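A sketch of a Predicted Outputs request, where the prediction content is the text being regenerated with minor changes (the code snippet and prompt are placeholders):

# A sketch of a Predicted Outputs request; the snippet is a placeholder.
existing_code = "def greet(user_prompt):\n    print(user_prompt)\n"

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user",
                  "content": "Rename user_prompt to question in this code:\n" + existing_code}],
    "prediction": {            # field shape per the parameters described above
        "type": "content",
        "content": existing_code,  # matching tokens can be returned much faster
    },
}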
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
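A minimal json_schema sketch under these fields; the schema contents are illustrative:

# A sketch of a structured-output request; the city_profile schema is made up.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Describe San Francisco."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_profile",      # a-z, A-Z, 0-9, underscores/dashes, max 64 chars
            "description": "A structured city description.",
            "strict": True,              # always follow the exact schema
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "population": {"type": "integer"},
                },
                "required": ["name", "population"],
                "additionalProperties": False,
            },
        },
    },
}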
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
A unique identifier for the chat completion.
chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
1762343744
The index of the choice in the list of choices.
0
The role of the author of this message.
assistant
The contents of the message.
Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
alibaba-cloud/qwen3-next-80b-a3b-thinking
Number of tokens in the prompt.
137
Number of tokens in the generated completion.
914
Total number of tokens used in the request (prompt + completion).
1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
120000
The total amount of money spent by the user in USD.
0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba-cloud/qwen3-next-80b-a3b-thinking",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba-cloud/qwen3-next-80b-a3b-thinking",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a travel agent. Be descriptive and helpful.",
},
{
"role": "user",
"content": "Tell me about San Francisco"
}
],
"temperature": 0.7,
"max_tokens": 512
}'
systemPrompt = 'You are a travel agent. Be descriptive and helpful.' // instructions
userPrompt = 'Tell me about San Francisco' // your request
async function main() {
const response = await fetch('https://api.aimlapi.com/v1/chat/completions', {
method: 'POST',
headers: {
// Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>
'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-4o',
messages:[
{
role: 'system',
content: systemPrompt,
},
{
role: 'user',
content: userPrompt
}
],
temperature: 0.7,
max_tokens: 512,
}),
});
const data = await response.json();
const answer = data.choices[0].message.content;
console.log('User:', userPrompt);
console.log('AI:', answer);
}
main();
import requests
import json # for getting a structured output with indentation
system_prompt = "You are a travel agent. Be descriptive and helpful."
user_prompt = "Tell me about San Francisco"
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"gpt-4o",
"messages":[
{
"role":"system",
"content": system_prompt,
},
{
"role":"user",
"content": user_prompt,
}
],
"temperature": 0.7,
"max_tokens": 256,
}
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
import requests
import json # for getting a structured output with indentation
response = requests.post(
"https://api.aimlapi.com/v1/chat/completions",
headers={
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization":"Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type":"application/json"
},
json={
"model":"gpt-4o",
"messages":[
{
"role":"user",
"content":"Hi! What do you think about mankind?" # insert your prompt
}
],
"stream": True
}
)
# data = response.json()
print(response.text)
from openai import OpenAI
# Initialize the client
client = OpenAI(
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key="YOUR_AIMLAPI_KEY",
base_url="https://api.aimlapi.com/v1"
)
# Create a streaming chat completion
stream = client.chat.completions.create(
model="gpt-4o",
messages=[
{
"role": "user",
"content": "Hi! What do you think about mankind?"
}
],
stream=True
)
# Print raw chunks (similar to response.text in requests)
for chunk in stream:
    print(chunk)
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"RmYFV8ad65HP9F"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"fjE24R0ZOJr"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qAlxZuNpvVvIIOm"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" As"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Zn3rsadkL8zHO"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" an"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"D1ss0WZmiGg8l"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" AI"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"bOHB8VYpq4G0W"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"OwZvgIyMlYVcIgH"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"u9lFaH3ngdK6MR"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" don't"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"KRFgmSe4yG"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" have"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"YL8zlQ9PjDF"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" personal"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Gzgb5OT"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" opinions"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Flz362J"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" or"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"XA0qqmSQr2jme"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" feelings"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"VA3dwaU"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"POplI0eiOWXpIPD"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" but"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"aDifMrQ8OH9i"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ceVweUN2pByieS"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" can"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"txjYCds61AQp"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" provide"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IGlSpZBf"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" an"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"BtPIfSvUXgRnl"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" overview"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IYfRhEo"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"uh8pR2mNtYSNQ"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" various"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ILZ0ffVW"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" perspectives"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Rgs"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" on"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"r7Awao2PSZ0DH"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" mankind"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"m8vJ3dzf"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"f2wZrEj0RqUFprg"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" Humanity"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"cCPi2qV"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" is"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yNd7SUoXBojpA"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" often"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"VEaggK2dFS"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" viewed"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8nhopBJZe"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" as"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6xG2VkJLonAeF"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" a"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"WDu20GtJyN8Lep"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" complex"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4bE4D3tS"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DZtW3Ahopdgl"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" multif"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8bS4GMzf3"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"aceted"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ivtxUAov3l"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" species"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Xcq85kDt"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" capable"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"PfwZUtYS"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DoyM4RGNLxnFc"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" remarkable"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"mUvVH"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" achievements"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4fl"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"GUdfkDUkNBNO"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" profound"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"x4KCnLk"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" creativity"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"goTL4"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"gkqK9sezr258S93"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" People"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"c49BcmfXz"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" have"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Br7pbWtK86v"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" built"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"dzAoO36Siv"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" civilizations"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yS"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"hiMiIGF7QM9BeJA"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" explored"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"IhuVoUB"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" space"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qNqiO3hyXB"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"UVmzp6Y0qjb7Zkb"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"iiIw0gK2MP5D"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" developed"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"FJUJhv"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" technologies"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"pkQ"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" that"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"sAhx0IJoR0m"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" transform"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"YDTnhx"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" everyday"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"imFIYIz"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" life"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6xJBjebVPfo"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"DKIPIwgAnVDj3g1"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" At"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HhuMheG0mPcuI"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" the"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yIQIWY1CXoW6"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" same"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"QcKwiqSqGRU"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" time"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"f6e6uGKikn5"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"1eXIFULDN1iS8b1"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" humans"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"GH0z8I36B"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" face"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"JLUmj9BN7PQ"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" significant"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"qdQg"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" challenges"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"KMzNb"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" such"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"8pw9I3FGElO"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" as"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"nY0RLEY6Am9zD"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" environmental"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4r"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" degradation"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"1zGA"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"bhbZZCR7wNgWQkq"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" social"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"FcCsVIGji"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" inequalities"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"6kb"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Z4Zz2oDgc5zw0D6"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Q5XvheR2EWhq"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" geopolitical"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ySW"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" conflicts"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"eiERwe"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"oNAsPbgeJSOuPMg"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" The"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"4TwzxlGRpebL"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" potential"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"lW3Jfo"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" for"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"Ejvws7kQryhN"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" both"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HVm3EDKAkuA"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" positive"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"HMY8pYv"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" change"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"fbOaTSNWR"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"ETmTxHsFbCkw"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" destructive"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"WHk8"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" behavior"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"EvSYFf5"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" makes"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"yfwGRy20jz"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" mankind"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"vwJGC8sU"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" a"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"nHyqFYnTzVmVsE"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" subject"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"wtm8Wh9c"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" of"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"gnLF2uDFfg976"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" deep"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"BEc6wh2y2vV"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" contemplation"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"zf"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" and"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"vpg86EhZm5c3"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" varied"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"iWNJAcR7a"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":" viewpoints"},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"JRXUN"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null,"obfuscation":"5yN6iGLyFLiQV0H"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}],"usage":null,"obfuscation":"4lkqbaPLDt"}
data: {"id":"chatcmpl-DL2G9KdEY06xq8M0PqZ5rs5Jv13ok","object":"chat.completion.chunk","created":1773905945,"model":"gpt-4o-2024-08-06","service_tier":"default","system_fingerprint":"fp_f986a632b0","choices":[],"usage":{"prompt_tokens":16,"completion_tokens":102,"total_tokens":118,"prompt_tokens_details":{"cached_tokens":0,"audio_tokens":0},"completion_tokens_details":{"reasoning_tokens":0,"audio_tokens":0,"accepted_prediction_tokens":0,"rejected_prediction_tokens":0}},"obfuscation":"VChaI1ntRBrTy"}import requests
import json
url = "https://api.aimlapi.com/v1/chat/completions"
headers = {
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type": "application/json"
}
payload = {
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "Explain quantum computing simply."}
],
"stream": True
}
with requests.post(url, headers=headers, json=payload, stream=True) as r:
# Iterate over the streaming response line by line
for line in r.iter_lines():
if not line:
continue # Skip empty lines
# Decode bytes to string
line = line.decode("utf-8")
# SSE messages start with "data: "
if not line.startswith("data: "):
continue
# Remove the "data: " prefix
data_str = line[len("data: "):]
# "[DONE]" indicates the end of the stream
if data_str.strip() == "[DONE]":
break
try:
# Parse JSON payload
data = json.loads(data_str)
except json.JSONDecodeError:
continue # Skip malformed chunks
# Ensure "choices" exists and is not empty
choices = data.get("choices")
if not choices:
continue
# Extract text delta (OpenAI-style streaming format)
delta = data.get("choices", [{}])[0].get("delta", {})
content = delta.get("content")
# Print text as it arrives
if content:
print(content, end="")from openai import OpenAI
client = OpenAI(
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key="<YOUR_AIMLAPI_KEY>",
base_url="https://api.aimlapi.com/v1"
)
stream = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "user", "content": "Explain quantum computing simply."}
],
stream=True
)
# Iterate over streaming chunks
for chunk in stream:
# Ensure choices exist and are not empty
if not chunk.choices:
continue
delta = chunk.choices[0].delta
content = getattr(delta, "content", None)
# Print text as it arrives
if content:
print(content, end="")Quantum computing is a type of computing that uses principles of quantum mechanics to process information. Unlike classical computers, which use bits to represent data as 0s or 1s, quantum computers use quantum bits or qubits.
Qubits have unique properties that give quantum computers more power in certain tasks:
1. **Superposition**: A qubit can exist in multiple states (i.e., both 0 and 1) simultaneously. This allows quantum computers to process a vast amount of possibilities at once.
2. **Entanglement**: Qubits can be linked together in such a way that the state of one qubit can depend on the state of another, no matter the distance apart. This can lead to more efficient processing and problem-solving.
3. **Quantum Interference**: Quantum algorithms make use of interference, where different quantum states can amplify or cancel each other out, guiding the computation toward the correct answer.
Because of these properties, quantum computers have the potential to solve certain complex problems much faster than classical computers can, potentially revolutionizing fields like cryptography, materials science, and optimization. However, building practical quantum computers is extremely challenging due to issues with qubit stability and error rates.
import requests
import json
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key = "<YOUR_AIMLAPI_KEY>"
base_url = "https://api.aimlapi.com/v1"
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
}
# Step 1: Define the tool correctly
tool = {
"type": "function",
"function": {
"name": "toCelsius",
"description": "Convert Fahrenheit to Celsius",
"parameters": {
"type": "object",
"properties": {
"fahrenheit": {"type": "number"}
},
"required": ["fahrenheit"]
}
}
}
# Step 2: Initial request with the tool
payload = {
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "Convert 256°F to °C"}
],
"tools": [tool]
}
response = requests.post(f"{base_url}/chat/completions", headers=headers, json=payload)
data = response.json()
# Step 3: Extract tool call
tool_calls = data["choices"][0]["message"].get("tool_calls", [])
if not tool_calls:
raise ValueError("No tool calls found. Make sure the tool is correctly defined.")
tool_call = tool_calls[0]
arguments = json.loads(tool_call["function"]["arguments"])
fahrenheit = arguments["fahrenheit"]
# Step 4: Execute the tool locally
celsius_result = (fahrenheit - 32) * 5 / 9
# Step 5: Send result back to model
final_payload = {
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "Convert 256°F to °C"},
{
"role": "assistant",
"tool_calls": [
{
"id": tool_call["id"],
"type": "function",
"function": {
"name": tool_call["function"]["name"],
"arguments": tool_call["function"]["arguments"]
}
}
]
},
{
"role": "tool",
"tool_call_id": tool_call["id"],
"content": str(celsius_result)
}
]
}
final_response = requests.post(f"{base_url}/chat/completions", headers=headers, json=final_payload)
final_data = final_response.json()
# Step 6: Print final answer
print(final_data["choices"][0]["message"]["content"])from openai import OpenAI
import json
client = OpenAI(
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key="<YOUR_AIMLAPI_KEY>",
base_url="https://api.aimlapi.com/v1"
)
# Step 1: Define the tool correctly
tool = {
"type": "function",
"function": {
"name": "toCelsius",
"description": "Convert Fahrenheit to Celsius",
"parameters": {
"type": "object",
"properties": {
"fahrenheit": {"type": "number"}
},
"required": ["fahrenheit"]
}
}
}
# Step 2: Initial request with tool
initial_response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Convert 256°F to °C"}],
tools=[tool]
)
# Step 3: Extract tool call
assistant_message = initial_response.choices[0].message
tool_calls = getattr(assistant_message, "tool_calls", [])
if not tool_calls:
raise ValueError("No tool calls found. Make sure the tool is correctly defined.")
tool_call = tool_calls[0]
arguments = json.loads(tool_call.function.arguments)
fahrenheit = arguments["fahrenheit"]
# Step 4: Execute tool locally
celsius_result = (fahrenheit - 32) * 5 / 9
# Step 5: Send result back
final_response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "user", "content": "Convert 256°F to °C"},
{
"role": "assistant",
"tool_calls": [
{
"id": tool_call.id,
"type": "function",
"function": {
"name": tool_call.function.name,
"arguments": tool_call.function.arguments,
},
}
],
},
{
"role": "tool",
"tool_call_id": tool_call.id,
"content": str(celsius_result),
},
],
)
print(final_response.choices[0].message.content)
256°F is approximately 124.44°C.
import requests
import json
url = "https://api.aimlapi.com/v1/chat/completions"
headers = {
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
"Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
"Content-Type": "application/json"
}
payload = {
"model": "gpt-4o",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "Describe this scene:"},
{"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/mona_lisa_extended.jpg"}}
]
}
]
}
response = requests.post(url, headers=headers, data=json.dumps(payload))
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
from openai import OpenAI
import json
# Initialize the client
client = OpenAI(
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
api_key="<YOUR_AIMLAPI_KEY>",
base_url="https://api.aimlapi.com/v1"
)
# Prepare the messages with text and image_url
messages = [
{
"role": "user",
"content": [
{"type": "text", "text": "Describe this scene:"},
{
"type": "image_url",
"image_url": {
"url": "https://raw.githubusercontent.com/aimlapi/api-docs/main/reference-files/mona_lisa_extended.jpg"
}
}
]
}
]
# Create a chat completion
response = client.chat.completions.create(
model="gpt-4o",
messages=messages
)
# Print full JSON response
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
{
"id": "chatcmpl-DL3DDPif2s79HbOHySq6bVY8SAsKQ",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "The scene is an iconic Renaissance portrait showing a woman with an enigmatic smile, known for its mastery of detail and composition. The woman is seated against a distant, dreamlike landscape featuring winding paths and rocky formations. She wears a dark dress and light veil, with her hands delicately folded. The background's atmospheric perspective creates depth, with bluish mountains fading into the horizon. The artwork evokes a sense of mystery and balance.",
"refusal": null,
"role": "assistant",
"annotations": [],
"audio": null,
"function_call": null,
"tool_calls": null
}
}
],
"created": 1773909607,
"model": "gpt-4o-2024-08-06",
"object": "chat.completion",
"service_tier": "default",
"system_fingerprint": "fp_0a8aa8bfeb",
"usage": {
"completion_tokens": 85,
"prompt_tokens": 776,
"total_tokens": 861,
"completion_tokens_details": {
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"reasoning_tokens": 0,
"rejected_prediction_tokens": 0
},
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
}
},
"meta": {
"usage": {
"credits_used": 7254
}
}
}
import json
import requests
from typing import Dict, Any
# Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
API_KEY = "<YOUR_AIMLAPI_KEY>"
BASE_URL = "https://api.aimlapi.com/v1"
HEADERS = {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json",
}
def search_impl(arguments: Dict[str, Any]) -> Any:
    # Stub implementation: echo the model's search arguments back as the result.
    # Replace with a real web-search call if you need live results.
    return arguments
def chat(messages):
url = f"{BASE_URL}/chat/completions"
payload = {
"model": "gpt-4o-mini-search-preview",
"messages": messages,
"temperature": 0.6,
"tools": [
{
"type": "builtin_function",
"function": {"name": "$web_search"},
}
]
}
response = requests.post(url, headers=HEADERS, json=payload)
response.raise_for_status()
return response.json()["choices"][0]
def main():
messages = [
{"role": "system", "content": "You are GPT with web search skills."},
{"role": "user", "content": "Please search for AGI and tell me what it is in English."}
]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
choice = chat(messages)
finish_reason = choice["finish_reason"]
message = choice["message"]
if finish_reason == "tool_calls":
messages.append(message)
for tool_call in message["tool_calls"]:
tool_call_name = tool_call["function"]["name"]
tool_call_arguments = json.loads(tool_call["function"]["arguments"])
if tool_call_name == "$web_search":
tool_result = search_impl(tool_call_arguments)
else:
tool_result = f"Error: unable to find tool by name '{tool_call_name}'"
messages.append({
"role": "tool",
"tool_call_id": tool_call["id"],
"name": tool_call_name,
"content": json.dumps(tool_result),
})
print(message["content"])
if __name__ == "__main__":
main()
import json
from typing import Dict, Any
from openai import OpenAI
client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
base_url="https://api.aimlapi.com/v1"
)
def search_impl(arguments: Dict[str, Any]) -> Any:
    # Stub implementation: echo the model's search arguments back as the result.
    # Replace with a real web-search call if you need live results.
    return arguments
def chat(messages):
response = client.chat.completions.create(
model="gpt-4o-mini-search-preview",
messages=messages,
temperature=0.6,
tools=[
{
"type": "function",
"function": {
"name": "$web_search",
"parameters": {
"type": "object",
"properties": {},
},
},
}
],
)
return response.choices[0]
def main():
messages = [
{"role": "system", "content": "You are GPT with web search skills."},
{"role": "user", "content": "Please search for AGI and tell me what it is in English."}
]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
choice = chat(messages)
finish_reason = choice.finish_reason
message = choice.message
if finish_reason == "tool_calls":
messages.append(message.model_dump())
for tool_call in message.tool_calls:
tool_call_name = tool_call.function.name
tool_call_arguments = json.loads(tool_call.function.arguments)
if tool_call_name == "$web_search":
tool_result = search_impl(tool_call_arguments)
else:
tool_result = f"Error: unable to find tool by name '{tool_call_name}'"
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"name": tool_call_name,
"content": json.dumps(tool_result),
})
print(message.content)
if __name__ == "__main__":
main()"AGI" is an acronym that can represent different terms depending on the context:
1. **Adjusted Gross Income**: In the United States, AGI refers to Adjusted Gross Income, which is a taxpayer's total income from all sources minus allowable adjustments. This figure is used to determine taxable income and eligibility for various tax benefits. ([usafacts.org](https://usafacts.org/articles/adjusted-gross-income-agi-definition?utm_source=openai))
2. **Artificial General Intelligence**: In the field of artificial intelligence, AGI stands for Artificial General Intelligence. This concept refers to AI systems that possess the ability to understand, learn, and apply knowledge across a wide range of tasks, matching or surpassing human cognitive abilities. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Artificial_general_intelligence?utm_source=openai))
3. **Alliance Graphique Internationale**: AGI also denotes the Alliance Graphique Internationale, an international organization of leading graphic artists and designers. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Alliance_Graphique_Internationale?utm_source=openai))
4. **Agi Language**: Additionally, "Agi" is the name of a Torricelli language spoken in Papua New Guinea. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Agi_language?utm_source=openai))
The specific meaning of "AGI" depends on the context in which it is used.messages: [
{
role: "system",
content: "You are a travel agent. Be descriptive and helpful.",
},
{
role: "user",
content: "Tell me about San Francisco",
},
],
The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
Specifies the detail level of the image.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
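Since the image_url field above accepts either a hosted URL or base64-encoded image data, a local file can also be submitted as a data URL. A minimal sketch, assuming a local file named scene.jpg (a hypothetical filename) and the same client configuration used throughout this page:
import base64
from openai import OpenAI

client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/v1"
)

# Read a local image and encode it as a base64 data URL
with open("scene.jpg", "rb") as f:  # hypothetical local file
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this scene:"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}}
        ]
    }]
)
print(response.choices[0].message.content)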
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
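To make the prediction fields above concrete, here is a hedged sketch of an OpenAI-style Predicted Outputs request; model support may vary, and the code text being regenerated is purely illustrative:
from openai import OpenAI

client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/v1"
)

existing_code = "def greet(name):\n    print('Hello, ' + name)\n"  # the text being regenerated

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Rename the function to welcome and reply with only the updated code:\n" + existing_code
    }],
    # Generated tokens that match the prediction can be returned much faster
    prediction={"type": "content", "content": existing_code}
)
print(response.choices[0].message.content)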
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
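A minimal sketch combining the sampling controls described above; the values are illustrative, not recommendations:
from openai import OpenAI

client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Suggest a name for a travel blog."}],
    n=1,                    # a single choice keeps token costs down
    temperature=0.7,        # tune this or top_p, not both
    stop=["\n\n"],          # up to 4 stop sequences
    frequency_penalty=0.5,  # discourage verbatim repetition
    presence_penalty=0.3,   # nudge the model toward new topics
    seed=42                 # best-effort determinism (Beta)
)
print(response.choices[0].message.content)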
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
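As a sketch of the json_schema response format described above (the schema itself is illustrative), a request that constrains the model to a strict JSON shape might look like this:
from openai import OpenAI

client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Extract the city and country from: 'I flew to Paris, France.'"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "location",  # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
            "strict": True,      # enforce exact schema adherence
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"}
                },
                "required": ["city", "country"],
                "additionalProperties": False
            }
        }
    }
)
print(response.choices[0].message.content)  # a JSON string matching the schema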
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
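To illustrate tool_choice and parallel_tool_calls, a hedged sketch that reuses the toCelsius tool from the function-calling example above and forces the model to call that specific tool:
from openai import OpenAI

client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/v1"
)

tool = {
    "type": "function",
    "function": {
        "name": "toCelsius",
        "description": "Convert Fahrenheit to Celsius",
        "parameters": {
            "type": "object",
            "properties": {"fahrenheit": {"type": "number"}},
            "required": ["fahrenheit"]
        }
    }
}

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Convert 98.6°F to °C"}],
    tools=[tool],
    # Alternatives: "auto" (default when tools are present), "none", "required"
    tool_choice={"type": "function", "function": {"name": "toCelsius"}},
    parallel_tool_calls=False  # disallow multiple tool calls in a single turn
)
print(response.choices[0].message.tool_calls)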
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
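For reasoning models, reasoning_effort trades response latency against reasoning depth. A sketch; the model ID here is an assumption, so substitute any reasoning-capable model available through the API:
from openai import OpenAI

client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/v1"
)

response = client.chat.completions.create(
    model="openai/o3-mini",  # assumed reasoning-capable model ID
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
    reasoning_effort="low"  # "low", "medium", or "high"
)
print(response.choices[0].message.content)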
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
Example: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: google/gemini-3-1-flash-lite-preview
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
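A sketch requesting log probabilities per the two fields above, using the model this reference page describes:
from openai import OpenAI

client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/v1"
)

response = client.chat.completions.create(
    model="alibaba/qwen3-32b",
    messages=[{"role": "user", "content": "Say hello."}],
    logprobs=True,   # return log probabilities for each output token
    top_logprobs=3   # also return the 3 most likely alternatives per position
)
for token_info in response.choices[0].logprobs.content:
    print(token_info.token, token_info.logprob)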
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Specifies whether to use the thinking mode.
Default: false
The maximum reasoning length, effective only when enable_thinking is set to true.
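The two thinking fields above appear only in this model's reference (they are common on Qwen-family models). A hedged sketch, assuming the API accepts them as extra top-level request parameters; the OpenAI SDK forwards unknown parameters via extra_body:
from openai import OpenAI

client = OpenAI(
    # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
    api_key="<YOUR_AIMLAPI_KEY>",
    base_url="https://api.aimlapi.com/v1"
)

response = client.chat.completions.create(
    model="alibaba/qwen3-32b",
    messages=[{"role": "user", "content": "Explain recursion briefly."}],
    extra_body={
        "enable_thinking": True,  # switch the model into thinking mode
        "thinking_budget": 1024   # maximum reasoning length; assumed to be token-denominated
    }
)
print(response.choices[0].message.content)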
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
Example: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: alibaba/qwen3-32b
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3-32b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3-32b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemini-3-1-flash-lite-preview",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'
{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemini-3-1-flash-lite-preview",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
Specifies the detail level of the image.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
Example: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion.
Example: google/gemini-2.5-pro
Number of tokens in the prompt.
Example: 137
Number of tokens in the generated completion.
Example: 914
Total number of tokens used in the request (prompt + completion).
Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation.
Example: 120000
The total amount of money spent by the user in USD.
Example: 0.06
The role of the author of the message — in this case, the user
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Whether to return log probabilities of the output tokens or not. If True, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type.
Example: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created.
Example: 1762343744
The index of the choice in the list of choices.
Example: 0
The role of the author of this message.
Example: assistant
The contents of the message.
Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
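Given those annotation fields, a response's URL citations can be collected roughly as follows; the attribute layout mirrors the OpenAI url_citation shape and is an assumption here, not a confirmed part of this API.

def extract_citations(message):
    # Assumed shape: each annotation has .type and a .url_citation object
    # carrying url, title, start_index, and end_index.
    citations = []
    for annotation in message.annotations or []:
        if annotation.type == "url_citation":
            c = annotation.url_citation
            citations.append({
                "title": c.title,
                "url": c.url,
                # character offsets of the cited span in the message content
                "span": (c.start_index, c.end_index),
            })
    return citations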
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
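A small helper sketch for these audio fields: decode the base64 payload to a playable file and keep the transcript; the reply.wav path is arbitrary.

import base64

def save_audio(message, path="reply.wav"):
    # message.audio carries id, data (base64), transcript, and expires_at,
    # per the fields documented above.
    audio = message.audio
    with open(path, "wb") as f:
        f.write(base64.b64decode(audio.data))
    print("Transcript:", audio.transcript)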
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
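Because a single character can be split across several tokens, reconstructing text from these byte lists means concatenating the bytes before decoding; a minimal helper sketch:

def tokens_to_text(logprob_items):
    # Concatenate each token's UTF-8 byte list, then decode once, so that
    # characters spread across multiple tokens come out intact.
    raw = bytearray()
    for item in logprob_items:
        if item.bytes is not None:  # bytes can be null for some tokens
            raw.extend(item.bytes)
    return raw.decode("utf-8", errors="replace")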
The model used for the chat completion. Example: alibaba/qwen3-max-instruct
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
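Putting the tool-message and tool-call fields together, here is a hedged sketch of one tool round trip: the arguments are validated before execution, as the warning above recommends, and each result is sent back as a tool message keyed by tool_call_id. The functions registry is hypothetical.

import json

def handle_tool_calls(response, messages, functions):
    # functions: a hypothetical dict mapping tool names to local callables.
    message = response.choices[0].message
    messages.append(message)  # keep the assistant turn in the history
    for call in message.tool_calls or []:
        try:
            args = json.loads(call.function.arguments)  # may be invalid JSON
        except json.JSONDecodeError:
            args = {}
        result = functions[call.function.name](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,  # the tool call this message answers
            "content": json.dumps(result),
        })
    return messages  # ready to send back for the model's final answer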
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
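A minimal streaming sketch with the OpenAI-compatible client; each server-sent event carries a delta with the next slice of the reply.

from openai import OpenAI

client = OpenAI(base_url="https://api.aimlapi.com/v1", api_key="<YOUR_AIMLAPI_KEY>")

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me about San Francisco"}],
    stream=True,  # deliver tokens as they are generated
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()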
Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: baidu/ernie-4.5-21b-a3b
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06
The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image. Currently supports JPG/JPEG, PNG, GIF, and WEBP formats.
The type of the content part.
The file data, encoded in base64 and passed to the model as a string. Only PDF format is supported.
- Maximum size per file: up to 512 MB and up to 2 million tokens.
- Maximum number of files: up to 20 files can be attached to a single GPT application or Assistant. This limit applies throughout the application's lifetime.
- Maximum total file storage per user: 10 GB.
The file name specified by the user. This name can be used to reference the file when interacting with the model, especially if multiple files are uploaded.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
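To illustrate these content parts, a sketch of a user message combining text with an image URL; the photo URL is a placeholder.

from openai import OpenAI

client = OpenAI(base_url="https://api.aimlapi.com/v1", api_key="<YOUR_AIMLAPI_KEY>")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this picture?"},
            {"type": "image_url", "image_url": {
                "url": "https://example.com/photo.jpg",  # placeholder URL
                "detail": "auto",
            }},
        ],
    }],
)
print(response.choices[0].message.content)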
The contents of the developer message.
The type of the content part.
The text content.
The role of the author of the message — in this case, the developer.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
A number between 0.001 and 0.999 that can be used as an alternative to top_p and top_k.
Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
Alternate top sampling parameter.
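A hedged sketch of these extra sampling knobs as plain JSON fields; the field names top_k, min_p, and repetition_penalty are inferred from the descriptions above and should be checked against the model's parameter list before use.

import requests

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemma-3-27b-it",
        "messages": [{"role": "user", "content": "Write a two-line poem."}],
        "top_k": 40,                # assumed name: sample only from the top 40 tokens
        "min_p": 0.05,              # assumed name: 0.001-0.999 alternative to top_p/top_k
        "repetition_penalty": 1.1,  # assumed name: >1 discourages repeated sequences
    },
)
print(response.json()["choices"][0]["message"]["content"])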
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: google/gemma-3-27b-it
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemma-3-27b-it",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemma-3-27b-it",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "google/gemini-2.5-pro",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "google/gemini-2.5-pro",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3-max-instruct",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3-max-instruct",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "baidu/ernie-4.5-21b-a3b",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "baidu/ernie-4.5-21b-a3b",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}

The role of the author of the message — in this case, the user.
The contents of the user message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the system.
The contents of the system message.
The type of the content part.
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the tool.
The contents of the tool message.
Tool call that this message is responding to.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The role of the author of the message — in this case, the Assistant.
The contents of the Assistant message. Required unless tool_calls or function_call is specified.
The contents of the Assistant message.
The type of the content part.
The text content.
The refusal message generated by the model.
The type of the content part.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The ID of the tool call.
The type of the tool. Currently, only function is supported.
The name of the function to call.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The refusal message by the Assistant.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
If set to True, the model response data will be streamed to the client as it is generated using server-sent events.
Default: false
The type of the tool. Currently, only function is supported.
A description of what the function does, used by the model to choose when and how to call the function.
The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
The parameters the function accepts, described as a JSON Schema object.
Whether to enable strict schema adherence when generating the function call. If set to True, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is True.
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.
none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
The type of the tool. Currently, only function is supported.
The name of the function to call.
Whether to enable parallel function calling during tool use.
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
The type of the predicted content you want to provide.
The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.
The content used for a Predicted Output. This is often the text of a file you are regenerating with minor changes.
The type of the content part.
The text content.
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
An object specifying the format that the model must output.
The type of response format being defined. Always text.
The type of response format being defined. Always json_object.
The type of response format being defined. Always json_schema.
The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
Whether to enable strict schema adherence when generating the output. If set to True, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is True.
A description of what the response format is for, used by the model to determine how to respond in the format.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
A unique identifier for the chat completion.
Example: chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl
The object type. Possible values: chat.completion
The Unix timestamp (in seconds) of when the chat completion was created. Example: 1762343744
The index of the choice in the list of choices. Example: 0
The role of the author of this message. Example: assistant
The contents of the message. Example: Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?
The refusal message generated by the model.
The type of the URL citation. Always url_citation.
The index of the last character of the URL citation in the message.
The index of the first character of the URL citation in the message.
The title of the web resource.
The URL of the web resource.
Unique identifier for this audio response.
Base64 encoded audio bytes generated by the model, in the format specified in the request.
Transcript of the audio generated by the model.
The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.
The ID of the tool call.
The type of the tool.
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
The name of the function to call.
The ID of the tool call.
The type of the tool.
The input for the custom tool call generated by the model.
The name of the custom tool to call.
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, content_filter if content was omitted due to a flag from our content filters, or tool_calls if the model called a tool.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
The token.
The model used for the chat completion. Example: alibaba/qwen3-235b-a22b-thinking-2507
Number of tokens in the prompt. Example: 137
Number of tokens in the generated completion. Example: 914
Total number of tokens used in the request (prompt + completion). Example: 1051
When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.
Audio input tokens generated by the model.
Tokens generated by the model for reasoning.
When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.
Audio input tokens present in the prompt.
Cached tokens present in the prompt.
The number of tokens consumed during generation. Example: 120000
The total amount of money spent by the user in USD. Example: 0.06

curl -L \
--request POST \
--url 'https://api.aimlapi.com/v1/chat/completions' \
--header 'Authorization: Bearer <YOUR_AIMLAPI_KEY>' \
--header 'Content-Type: application/json' \
--data '{
"model": "alibaba/qwen3-235b-a22b-thinking-2507",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

{
"id": "chatcmpl-CQ9FPg3osank0dx0k46Z53LTqtXMl",
"object": "chat.completion",
"created": 1762343744,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?",
"refusal": null,
"annotations": null,
"audio": null,
"tool_calls": null
},
"finish_reason": "stop",
"logprobs": null
}
],
"model": "alibaba/qwen3-235b-a22b-thinking-2507",
"usage": {
"prompt_tokens": 137,
"completion_tokens": 914,
"total_tokens": 1051,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"meta": {
"usage": {
"credits_used": 120000,
"usd_spent": 0.06
}
}
}
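Finally, a short sketch of reading the accounting fields from any of the responses above: token counts live under usage, while credits and USD spend live under meta.usage, matching the response bodies shown.

import requests

data = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={"Authorization": "Bearer <YOUR_AIMLAPI_KEY>"},
    json={
        "model": "google/gemma-3-27b-it",
        "messages": [{"role": "user", "content": "Hello"}],
    },
).json()

usage = data["usage"]
print("prompt:", usage["prompt_tokens"],
      "| completion:", usage["completion_tokens"],
      "| total:", usage["total_tokens"])

meta = data["meta"]["usage"]
print("credits used:", meta["credits_used"], "| USD spent:", meta["usd_spent"])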