Llama-4-scout
Model Overview
A 17 billion active parameter model with 16 experts, is the best multimodal model in the world in its class and is more powerful than all previous generation Llama models. Additionally, the model offers an industry-leading context window of 10M and delivers better results than Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 on a wide range of common benchmarks.
How to Make a Call
Setup You Can’t Skip
▪️ Create an Account: Visit the AI/ML API website and create an account (if you don’t have one yet). ▪️ Generate an API Key: After logging in, navigate to your account dashboard and generate your API key. Ensure that key is enabled on UI.
Copy the code example
At the bottom of this page, you'll find a code example that shows how to structure the request. Choose the code snippet in your preferred programming language and copy it into your development environment.
(Optional) Adjust other optional parameters if needed
Only model
and messages
are required parameters for this model (and we’ve already filled them in for you in the example), but you can include optional parameters if needed to adjust the model’s behavior. Below, you can find the corresponding API schema, which lists all available parameters along with notes on how to use them.
If you need a more detailed walkthrough for setting up your development environment and making a request step by step — feel free to use our Quickstart guide.
API Schema
Creates a chat completion using a language model, allowing interactive conversation by predicting the next response based on the given chat history. This is useful for AI-driven dialogue systems and virtual assistants.
512
false
No content
Code Example (Python)
Last updated
Was this helpful?