
# Mistral OCR 4

{% columns %}
{% column width="66.66666666666666%" %}
{% hint style="info" %}
This documentation applies to the following model:

* `mistral/mistral-ocr-4-0`
  {% endhint %}
  {% endcolumn %}

{% column width="33.33333333333334%" %} <a href="https://aimlapi.com/app/mistral/mistral-ocr-4-0" class="button primary">Try in Playground</a>
{% endcolumn %}
{% endcolumns %}

## Model Overview

Mistral OCR 4 is Mistral's most advanced document extraction and understanding model. It adds native paragraph-level bounding box extraction and structural block labels on top of high-fidelity text, table, and image extraction from PDFs and images, and it is fully backward compatible with Mistral OCR 3.

{% hint style="success" %}
[Create AI/ML API Key](https://aimlapi.com/app/keys)
{% endhint %}

<details>

<summary>How to make the first API call</summary>

**1️⃣ Required setup (don’t skip this)**\
▪ **Create an account:** Sign up on the AI/ML API website (if you don’t have one yet).\
▪ **Generate an API key:** In your account dashboard, create an API key and make sure it’s **enabled** in the UI.

**2️⃣ Copy the code example**\
In the Code Example section at the bottom of this page, pick the snippet for your preferred programming language (Python / JavaScript) and copy it into your project.

**3️⃣ Update the snippet for your use case**\
▪ **Insert your API key:** replace `<YOUR_AIMLAPI_KEY>` with your real AI/ML API key.\
▪ **Select a model:** set the `model` field to the model you want to call.\
▪ **Provide input:** fill in the request input field(s) shown in the example.

**4️⃣ (Optional) Tune the request**\
See the API schema below for optional request parameters such as `pages`, `include_image_base64`, and the annotation formats.

**5️⃣ Run your code**\
Run the updated code in your development environment.

{% hint style="success" %}
For a detailed walkthrough, use our [Quickstart guide](https://docs.aimlapi.com/quickstart/setting-up).
{% endhint %}

</details>
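The setup steps above can be sketched as a small helper that keeps the API key out of the source file. This is a minimal sketch, not an official client: the environment variable name `AIMLAPI_KEY` and the helper names `build_headers` and `build_request` are assumptions made for illustration; only the endpoint, the headers, and the request fields come from this page.

```python
import os

API_URL = "https://api.aimlapi.com/v1/ocr"
MODEL = "mistral/mistral-ocr-4-0"


def build_headers(api_key=None):
    """Return the request headers, reading the key from the environment if not passed."""
    # AIMLAPI_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("AIMLAPI_KEY")
    if not key:
        raise ValueError("Set AIMLAPI_KEY or pass api_key explicitly")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


def build_request(document_url, api_key=None):
    """Assemble keyword arguments that can be passed to requests.post."""
    return {
        "url": API_URL,
        "headers": build_headers(api_key),
        "json": {
            "model": MODEL,
            "document": {"type": "document_url", "document_url": document_url},
        },
    }


# Usage (performs a network call, so it is left commented out):
# import requests
# response = requests.post(**build_request("https://example.com/sample.pdf"), timeout=60)
# response.raise_for_status()
# print(response.json())
```

Separating request assembly from the network call makes it possible to inspect the payload before sending it.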

## API Schema

## POST /v1/ocr


```json
{"openapi":"3.0.0","info":{"title":"AIML API","version":"1.0.0"},"servers":[{"url":"https://api.aimlapi.com"}],"paths":{"/v1/ocr":{"post":{"operationId":"_v1_ocr","requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","properties":{"model":{"type":"string","enum":["mistral/mistral-ocr-4-0"]},"document":{"oneOf":[{"type":"object","properties":{"type":{"type":"string","enum":["document_url"],"description":"Type of document."},"document_url":{"type":"string","format":"uri","description":"Document URL."}},"required":["type","document_url"]},{"type":"object","properties":{"type":{"type":"string","enum":["image_url"],"description":"Type of document."},"image_url":{"type":"string","format":"uri","description":"Image URL."}},"required":["type","image_url"]}],"description":"Document to run OCR on"},"pages":{"anyOf":[{"type":"string"},{"type":"array","items":{"type":"integer"}},{"nullable":true}],"description":"Specific pages you want to process"},"include_image_base64":{"type":"boolean","nullable":true,"description":"Include base64 images in response"},"image_limit":{"type":"integer","nullable":true,"description":"Max images to extract"},"image_min_size":{"type":"integer","nullable":true,"description":"Minimum height and width of image to extract"},"bbox_annotation_format":{"type":"object","nullable":true,"properties":{"type":{"type":"string","enum":["json_schema"]},"json_schema":{"type":"object","properties":{"name":{"type":"string"},"schema":{"type":"object","additionalProperties":{"nullable":true}},"description":{"type":"string","nullable":true},"strict":{"type":"boolean","nullable":true}},"required":["name","schema"]}},"required":["type","json_schema"],"description":"JSON schema to structure the annotation of each extracted bounding box (figures, charts, images). Using any annotation format switches the request to the annotated-page rate."},"document_annotation_format":{"type":"object","nullable":true,"properties":{"type":{"type":"string","enum":["json_schema"]},"json_schema":{"type":"object","properties":{"name":{"type":"string"},"schema":{"type":"object","additionalProperties":{"nullable":true}},"description":{"type":"string","nullable":true},"strict":{"type":"boolean","nullable":true}},"required":["name","schema"]}},"required":["type","json_schema"],"description":"JSON schema to extract structured data from the whole document. Using any annotation format switches the request to the annotated-page rate."},"document_annotation_prompt":{"type":"string","nullable":true,"description":"Optional high-level prompt to guide and instruct how the document is annotated."}},"required":["model","document"],"title":"mistral/mistral-ocr-4-0"}}}},"responses":{"200":{"content":{"application/json":{"schema":{"type":"object","properties":{"pages":{"type":"array","items":{"type":"object","properties":{"index":{"type":"integer","description":"The page index in a PDF document starting from 0"},"markdown":{"type":"string","description":"The markdown string response of the page"},"images":{"type":"array","items":{"type":"object","properties":{"id":{"type":"string","description":"Image ID for extracted image in a page"},"top_left_x":{"type":"integer","nullable":true,"description":"X coordinate of top-left corner of the extracted image"},"top_left_y":{"type":"integer","nullable":true,"description":"Y coordinate of top-left corner of the extracted image"},"bottom_right_x":{"type":"integer","nullable":true,"description":"X coordinate of bottom-right corner of the extracted image"},"bottom_right_y":{"type":"integer","nullable":true,"description":"Y coordinate of bottom-right corner of the extracted image"},"image_base64":{"type":"string","nullable":true,"format":"uri","description":"Base64 string of the extracted image"}},"required":["id","top_left_x","top_left_y","bottom_right_x","bottom_right_y"]},"description":"List of all extracted images in the page"},"dimensions":{"type":"object","nullable":true,"properties":{"dpi":{"type":"integer","description":"Dots per inch of the page-image."},"height":{"type":"integer","description":"Height of the image in pixels."},"width":{"type":"integer","description":"Width of the image in pixels."}},"required":["dpi","height","width"],"description":"The dimensions of the PDF page's screenshot image"}},"required":["index","markdown","images","dimensions"]},"description":"List of OCR info for pages"},"model":{"type":"string","description":"The model used to generate the OCR."},"document_annotation":{"type":"string","nullable":true,"description":"Structured annotation of the whole document as a JSON string, returned when document_annotation_format is provided."},"usage_info":{"type":"object","properties":{"pages_processed":{"type":"integer","description":"Number of pages processed"},"doc_size_bytes":{"type":"integer","nullable":true,"description":"Document size in bytes"}},"required":["pages_processed","doc_size_bytes"],"description":"Usage info for the OCR request."},"meta":{"type":"object","nullable":true,"properties":{"usage":{"type":"object","nullable":true,"properties":{"credits_used":{"type":"number","description":"The number of tokens consumed during generation."},"usd_spent":{"type":"number","description":"The total amount of money spent by the user in USD."}},"required":["credits_used","usd_spent"]}},"description":"Additional details about the generation."}},"required":["pages","model","usage_info"]}}},"description":"Successful response."}}}}}}
```
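As a rough illustration of how the request schema above maps to a payload, the sketch below builds the `document` object for either a PDF or an image URL and attaches only the optional fields that are actually set. The helper names (`document_ref`, `annotation_format`, `build_ocr_payload`) and the example invoice schema are hypothetical. The schema also allows `pages` as a string, but its format is not described on this page, so only the integer-list form is used here; whether request page numbers are zero-based like the response `index` is an assumption.

```python
import json

MODEL = "mistral/mistral-ocr-4-0"


def document_ref(url, is_image=False):
    """Build the `document` object: a type plus the URL field of the same name."""
    kind = "image_url" if is_image else "document_url"
    return {"type": kind, kind: url}


def annotation_format(name, schema, description=None, strict=None):
    """Wrap a JSON schema in the `json_schema` envelope used by the annotation fields."""
    json_schema = {"name": name, "schema": schema}
    if description is not None:
        json_schema["description"] = description
    if strict is not None:
        json_schema["strict"] = strict
    return {"type": "json_schema", "json_schema": json_schema}


def build_ocr_payload(url, is_image=False, **options):
    """Return a request body, omitting optional fields whose value is None."""
    allowed = {
        "pages", "include_image_base64", "image_limit", "image_min_size",
        "bbox_annotation_format", "document_annotation_format",
        "document_annotation_prompt",
    }
    unknown = set(options) - allowed
    if unknown:
        raise TypeError(f"Unknown OCR options: {sorted(unknown)}")
    payload = {"model": MODEL, "document": document_ref(url, is_image)}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


# Example: first two pages of a PDF, with a hypothetical invoice schema.
invoice_schema = {
    "type": "object",
    "properties": {"invoice_number": {"type": "string"}},
}
payload = build_ocr_payload(
    "https://example.com/invoice.pdf",
    pages=[0, 1],  # assumed zero-based, matching the response `index`
    include_image_base64=True,
    document_annotation_format=annotation_format("invoice", invoice_schema),
)
print(json.dumps(payload, indent=2))
```

Per the schema descriptions, setting either annotation format switches the request to the annotated-page rate, so it is worth leaving those fields out when plain Markdown output is enough.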

## Code Example

{% tabs %}
{% tab title="Python" %}
{% code overflow="wrap" %}

```python
import requests

response = requests.post(
    "https://api.aimlapi.com/v1/ocr",
    headers={
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
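    json={
        "model": "mistral/mistral-ocr-4-0",
        # `document` is an object: a type plus the matching URL field
        "document": {"type": "document_url", "document_url": "<DOCUMENT_URL>"},
    },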
)

print(response.json())
```

{% endcode %}
{% endtab %}

{% tab title="JavaScript" %}
{% code overflow="wrap" %}

```javascript
const response = await fetch('https://api.aimlapi.com/v1/ocr', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer <YOUR_AIMLAPI_KEY>',
    'Content-Type': 'application/json',
  },
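  body: JSON.stringify({
    model: 'mistral/mistral-ocr-4-0',
    // `document` is an object: a type plus the matching URL field
    document: { type: 'document_url', document_url: '<DOCUMENT_URL>' },
  }),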
});

console.log(await response.json());
```

{% endcode %}
{% endtab %}
{% endtabs %}

<details>

<summary>Response</summary>

{% code overflow="wrap" %}

```json
{
  "pages": [
    {
      "index": 0,
      "markdown": "<markdown>",
      "images": [
        {
          "id": "<id>",
          "top_left_x": 0,
          "top_left_y": 0,
          "bottom_right_x": 0,
          "bottom_right_y": 0,
          "image_base64": "<image_base64>"
        }
      ],
      "dimensions": {
        "dpi": 0,
        "height": 0,
        "width": 0
      }
    }
  ],
  "model": "<model>",
  "document_annotation": "<document_annotation>",
  "usage_info": {
    "pages_processed": 0,
    "doc_size_bytes": 0
  },
  "meta": {
    "usage": {
      "credits_used": 120000,
      "usd_spent": 0.06
    }
  }
}
```

{% endcode %}

</details>
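The response shape above can be consumed along these lines. This is a sketch under stated assumptions rather than a reference implementation: the function names are hypothetical, and because the schema marks `image_base64` with a `uri` format, the decoder accepts both a `data:` URI and a bare base64 string instead of assuming one of them.

```python
import base64
import json


def join_markdown(response):
    """Concatenate page Markdown in page order, separated by blank lines."""
    pages = sorted(response["pages"], key=lambda page: page["index"])
    return "\n\n".join(page["markdown"] for page in pages)


def decode_image(image_base64):
    """Decode an extracted image given as a data URI or as bare base64."""
    if image_base64.startswith("data:") and "," in image_base64:
        # Drop the "data:<mime>;base64," prefix and keep the encoded part.
        image_base64 = image_base64.split(",", 1)[1]
    return base64.b64decode(image_base64)


def extract_images(response):
    """Map image IDs to raw bytes for every image that carries base64 data."""
    images = {}
    for page in response["pages"]:
        for image in page["images"]:
            data = image.get("image_base64")  # nullable, and optional in the schema
            if data:
                images[image["id"]] = decode_image(data)
    return images


def parse_annotation(response):
    """Parse `document_annotation`, which the schema describes as a JSON string."""
    raw = response.get("document_annotation")
    return json.loads(raw) if raw else None


# A hand-written sample that follows the response schema (not real API output).
sample = {
    "pages": [
        {"index": 1, "markdown": "second", "images": [], "dimensions": None},
        {
            "index": 0,
            "markdown": "first",
            "images": [{
                "id": "img-0", "top_left_x": 0, "top_left_y": 0,
                "bottom_right_x": 10, "bottom_right_y": 10,
                "image_base64": "data:image/png;base64,aGk=",
            }],
            "dimensions": {"dpi": 200, "height": 100, "width": 100},
        },
    ],
    "model": "mistral/mistral-ocr-4-0",
    "document_annotation": "{\"title\": \"x\"}",
    "usage_info": {"pages_processed": 2, "doc_size_bytes": None},
}

print(join_markdown(sample))
print(extract_images(sample))
print(parse_annotation(sample))
```

Image data is only present when the request sets `include_image_base64`, so `extract_images` returns an empty dictionary otherwise.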

