Assistant API
Assistants are AI-driven entities with assigned roles and instructions, allowing them to process messages, use tools, and maintain context within threads for structured and interactive responses. One Assistant can be used across multiple Threads and users.
This page provides API schemas for the following methods:
After each API schema, you'll find a short example demonstrating how to correctly call the described method in code using the OpenAI SDK.
Note that the method names in the API schema and the SDK often differ. Accordingly, when calling these methods via the REST API, you should use the names from the API schema, while for calls through the OpenAI SDK, use the names from the examples.
Create an Assistant with a model and instructions.
https://api.aimlapi.com/assistants
https://api.aimlapi.com/assistants
https://api.aimlapi.com/assistants/{assistantId}
https://api.aimlapi.com/assistants/{assistantId}
https://api.aimlapi.com/assistants/{assistantId}
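As a rough illustration of the REST side, the sketch below prepares (but does not send) a request that creates an Assistant with a model and instructions. The host and paths come from the endpoint list above; the POST verb, the Bearer-token `Authorization` header, the JSON body, and the helper names `build_create_request` and `assistant_url` are assumptions made for this example, not something this page confirms.

```python
import json
import urllib.request

BASE_URL = "https://api.aimlapi.com"  # host taken from the endpoint list above


def build_create_request(api_key: str, model: str, instructions: str) -> urllib.request.Request:
    """Prepare, without sending, a request that creates an Assistant.

    Assumptions (not confirmed by this page): POST verb, Bearer-token
    authentication, and a JSON body with "model" and "instructions" fields.
    """
    body = json.dumps({"model": model, "instructions": instructions}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/assistants",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def assistant_url(assistant_id: str) -> str:
    """Fill in the {assistantId} path template for a single Assistant."""
    return f"{BASE_URL}/assistants/{assistant_id}"
```

Passing the prepared object to `urllib.request.urlopen` would actually send it; building it separately makes the URL, headers, and body easy to inspect first.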
A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
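The cursor logic described above can be sketched as a small generator that walks forward through a list using the after cursor. The `fetch_page` callable is a hypothetical stand-in for whatever performs the real list request, and the sketch assumes each returned object carries an `id` field and that a page shorter than the limit marks the end of the list.

```python
from typing import Callable, Dict, Iterator, List, Optional

Page = List[Dict[str, str]]


def iter_all(
    fetch_page: Callable[[int, str, Optional[str]], Page],
    limit: int = 20,
    order: str = "desc",
) -> Iterator[Dict[str, str]]:
    """Yield every object by following the `after` cursor page by page.

    `fetch_page(limit, order, after)` is a hypothetical callable that returns
    one page of objects; each object is assumed to have an "id" key.
    """
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")

    after: Optional[str] = None
    while True:
        page = fetch_page(limit, order, after)
        yield from page
        # Assumption: a short (or empty) page means there is nothing left.
        if len(page) < limit:
            return
        # The ID of the last object becomes the cursor for the next page.
        after = page[-1]["id"]
```

Injecting `fetch_page` keeps the paging logic independent of the HTTP client, so it can be exercised against an in-memory list before being pointed at the real endpoint.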
ID of the model to use. Possible values: gpt-4o, gpt-4o-2024-08-06, gpt-4o-2024-05-13, gpt-4o-mini, gpt-4o-mini-2024-07-18, chatgpt-4o-latest, gpt-4-turbo, gpt-4-turbo-2024-04-09, gpt-4, gpt-4-0125-preview, gpt-4-1106-preview, gpt-3.5-turbo, gpt-3.5-turbo-0125, gpt-3.5-turbo-1106, o1-preview, o1-preview-2024-09-12, o1-mini, o1-mini-2024-09-12, o3-mini, gpt-4.5-preview.
The description of the assistant. The maximum length is 512 characters.
The system instructions that the assistant uses. The maximum length is 256,000 characters.
Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
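The metadata limits above are easy to check on the client before a request is sent. The following is a minimal sketch; the function name `validate_metadata` is made up for this example, and the server may apply further rules that are not described here.

```python
from typing import Dict

MAX_PAIRS = 16          # at most 16 key-value pairs
MAX_KEY_LENGTH = 64     # keys: strings of up to 64 characters
MAX_VALUE_LENGTH = 512  # values: strings of up to 512 characters


def validate_metadata(metadata: Dict[str, str]) -> Dict[str, str]:
    """Return the metadata unchanged if it satisfies the documented limits."""
    if len(metadata) > MAX_PAIRS:
        raise ValueError(f"metadata may contain at most {MAX_PAIRS} pairs")
    for key, value in metadata.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("metadata keys and values must be strings")
        if len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"key {key[:20]!r}... exceeds {MAX_KEY_LENGTH} characters")
        if len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"value for key {key!r} exceeds {MAX_VALUE_LENGTH} characters")
    return metadata
```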
The name of the assistant. The maximum length is 256 characters.
Constrains effort on reasoning for reasoning models. Possible values: low, medium, high.
Specifies the format that the model must output.
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
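To tie the parameter limits together, here is a minimal sketch of a helper that assembles a request body and rejects values outside the documented ranges. The field names (`name`, `description`, `instructions`, `tools`, `temperature`, `top_p`, `reasoning_effort`) follow the OpenAI Assistants API naming and are assumed to apply here; `build_assistant_payload` itself is a hypothetical helper, not part of any SDK.

```python
from typing import Any, Dict, List, Optional

# Maximum lengths stated in the parameter descriptions above.
TEXT_LIMITS = {"name": 256, "description": 512, "instructions": 256_000}
MAX_TOOLS = 128
REASONING_EFFORTS = ("low", "medium", "high")


def build_assistant_payload(
    model: str,
    *,
    name: Optional[str] = None,
    description: Optional[str] = None,
    instructions: Optional[str] = None,
    tools: Optional[List[Dict[str, Any]]] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    reasoning_effort: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a request body, checking the documented limits first."""
    payload: Dict[str, Any] = {"model": model}

    texts = {"name": name, "description": description, "instructions": instructions}
    for field, value in texts.items():
        if value is None:
            continue
        if len(value) > TEXT_LIMITS[field]:
            raise ValueError(f"{field} exceeds {TEXT_LIMITS[field]} characters")
        payload[field] = value

    if tools is not None:
        if len(tools) > MAX_TOOLS:
            raise ValueError(f"at most {MAX_TOOLS} tools are allowed")
        payload["tools"] = tools

    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature

    # The page recommends altering temperature or top_p, not both;
    # this sketch passes top_p through without enforcing that advice.
    if top_p is not None:
        payload["top_p"] = top_p

    if reasoning_effort is not None:
        if reasoning_effort not in REASONING_EFFORTS:
            raise ValueError(f"reasoning_effort must be one of {REASONING_EFFORTS}")
        payload["reasoning_effort"] = reasoning_effort

    return payload
```

The resulting dictionary can be serialized with `json.dumps` for a REST call or, assuming the SDK accepts the same field names, unpacked as keyword arguments to the corresponding SDK method.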