Concepts
API
API stands for Application Programming Interface. In the context of AI/ML, an API serves as a "handle" that enables you to integrate and utilize any Machine Learning model within your application. Our API supports communication via HTTP requests and is fully backward-compatible with OpenAI’s API. This means you can refer to OpenAI’s documentation for making calls to our API. However, be sure to change the base URL to direct your requests to our servers and select the desired model from our offerings.
API Key
An API Key is a credential that grants you access to our API from within your code. It is a sensitive string of characters that should be kept confidential. Do not share your API key with anyone else, as it could be misused without your knowledge.
You can find your API key on the account page.
Base URL
The Base URL is the first part of the URL (including the protocol, domain, and pathname) that determines the server responsible for handling your request. It’s crucial to configure the correct Base URL in your application, especially if you are using SDKs from OpenAI, Azure, or other providers. By default, these SDKs are set to point to their servers, which are not compatible with our API keys and do not support many of the models we offer.
Our base URL also supports versioning, so you can use the following as well:
https://api.aimlapi.com
https://api.aimlapi.com/v1
Usually, you pass the base URL as a field in the SDK constructor. In some cases, you can set the BASE_URL environment variable instead, and it will work. If you want to use the OpenAI SDK, follow the setting up article and take a closer look at how to use it with the AI/ML API.
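As an illustration of pointing a request at the correct base URL, here is a minimal sketch using only Python's standard library (no SDK). The model name and API key are placeholders, and the /chat/completions path follows the OpenAI-compatible convention described above; swap in a real model from our offerings and your own key.

```python
import json
from urllib import request

BASE_URL = "https://api.aimlapi.com/v1"  # our server, not OpenAI's default
API_KEY = "<YOUR_API_KEY>"               # placeholder: keep your real key secret

payload = {
    "model": "gpt-4o",  # placeholder model name: pick one from our offerings
    "messages": [{"role": "user", "content": "Hello!"}],
}

# Build (but don't yet send) an OpenAI-compatible chat completion request.
req = request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# To actually send it:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```

With the official OpenAI SDK, the same redirection is achieved by passing the base URL (together with your key) to the client constructor instead of relying on its default servers.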
Base64
Base64 is a way to encode binary data, such as files or images, into text format, making it safe to include in places like URLs or JSON requests.
In the context of working with AI models, this means that if a model expects a parameter like file_data or image_url, you can encode your local file or image as a Base64 string, pass it as the value for that parameter, and in most cases the model will successfully receive and process your file. You’ll need to import the base64 library to handle file encoding. Below is a code example showing a real model call.
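As a sketch of that flow, the snippet below encodes a local image file and embeds it as a data URL in an image_url-style field. The payload shape follows the common OpenAI-style vision-message convention; the model name and the placeholder file bytes are illustrative, so check the API Schema of your chosen model for the exact parameter names.

```python
import base64

# Placeholder bytes standing in for a real image, written to disk so the
# sketch runs end-to-end; in practice you would already have photo.png.
image_bytes = b"\x89PNG\r\n\x1a\n"
with open("photo.png", "wb") as f:
    f.write(image_bytes)

# Read the binary file and encode it as Base64 text.
with open("photo.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# Many models accept Base64 content as a data URL inside an image_url field.
payload = {
    "model": "example/vision-model",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{encoded}"}},
            ],
        }
    ],
}
```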
Deprecation
Deprecation is the process where a provider marks a model, parameter, or feature as outdated and no longer recommended for use. Deprecated items may remain available for some time but are likely to be removed or unsupported in the future.
Deprecation can apply to an entire model (see our list of deprecated/no longer supported models) or to individual parameters. For example, in a recent update to the video model v1.6-pro/image-to-video by Kling AI, the aspect_ratio parameter was deprecated: the model now automatically determines the aspect ratio based on the properties of the provided reference image, and explicit aspect_ratio input is no longer required.
Users are encouraged to monitor deprecation notices carefully and update their integrations accordingly. We notify our users about such changes in our email newsletters.
Endpoint
A specific URL where an API can be accessed to perform an operation (e.g., generate a response, upload a file).
Fine-tuned model
A fine-tuned model is a base AI model that has been further trained on additional, specific data to specialize it for certain tasks or behaviors.
For example, an "11B Llama 3.2 model fine-tuned for content safety" means that the original Llama 3.2 model (with 11 billion parameters) has received extra training using datasets focused on safe and appropriate content generation.
Multimodal Model
A model that can process and generate different types of data (text, images, audio) in a single interaction.
Prompt
The input given to a model to generate a response.
The parameter used to pass a prompt is most often called simply prompt.
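For instance, an image-generation request body often carries little more than the model name and the prompt itself (the model name here is a placeholder; parameter names may differ per model):

```python
import json

# A minimal request body where the prompt is passed via the `prompt` field.
payload = {
    "model": "example/image-model",  # placeholder model name
    "prompt": "A watercolor painting of a lighthouse at dawn",
}
body = json.dumps(payload)  # this JSON string is what gets sent to the API
```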
But there can be other variations. For example, the messages structure used in chat models passes the prompt within the content subfield. Depending on the value of the role parameter, this prompt will be interpreted either as a user message (role: user) or as a model instruction (role: system or role: assistant).
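A sketch of that messages structure, where the role decides how each piece of text is interpreted:

```python
# The prompt travels inside `content`; `role` controls its interpretation.
messages = [
    {"role": "system", "content": "You are a concise assistant."},     # model instruction
    {"role": "user", "content": "Summarize Base64 in one sentence."},  # user message
]
```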
There are also special parameters that allow you to refine prompts, control how strongly the model should follow them, or adjust the strictness of their interpretation.
- prompt_optimizer or enhance_prompt: The model will automatically optimize the incoming prompt to improve the video generation quality if necessary. For more precise control, this parameter can be set to False, and the model will follow the instructions more strictly.
- negative_prompt: A description of elements to avoid in the generated video/image/etc.
- cfg_scale or guidance_scale: The Classifier-Free Guidance (CFG) scale is a measure of how closely you want the model to stick to your prompt.
- strength: Determines how much the prompt influences the generated image.
Which of these parameters are supported by a specific model can be found in the API Schema section on that model's page.
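As a hedged sketch, such refinement parameters simply ride along in the request body next to the prompt (names and values here are illustrative; consult the model's API Schema for what it actually supports):

```python
payload = {
    "model": "example/image-model",            # placeholder model name
    "prompt": "A castle on a cliff, oil painting",
    "negative_prompt": "blurry, low quality",  # elements to avoid in the output
    "guidance_scale": 7.5,                     # how closely to follow the prompt
}
```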
Terminal
If you are not a developer, you might know it only as a "black window for hackers." However, the terminal is a very old and useful way to communicate with a computer: an app inside your operating system that lets you run programs by typing text commands. Depending on the operating system, you can open the terminal in many ways. Here are the basic ways that usually work:
- On Windows: Press Win + R, type cmd, and hit Enter.
- On Mac: Press Command + Space, search for Terminal, then hit Enter.
- On Linux: You are probably already familiar with it. On Ubuntu with a GUI, for example, you can press Ctrl + Alt + T, or search for Terminal in the Activities overview and hit Enter.
Token
A chunk of text (word, part of a word, or symbol) that text models use for processing inputs and outputs. The cost of using a text model is calculated based on the number of tokens processed. Both the text documents you send and the conversation history (in the case of interacting with an Assistant) are tokenized (split into tokens) and included in the cost calculation.
You can limit the model’s output using the max_completion_tokens parameter (the fully equivalent, deprecated max_tokens parameter is still supported for now).
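For example, capping a chat completion at 100 output tokens might look like this (the model name is a placeholder):

```python
payload = {
    "model": "gpt-4o",  # placeholder: any chat model from our offerings
    "messages": [{"role": "user", "content": "Explain tokens briefly."}],
    "max_completion_tokens": 100,  # upper bound on generated output tokens
}
```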