# Thinking / Reasoning

## Overview

Some text models support advanced reasoning mode, enabling them to perform multi-step problem solving, draw inferences, and follow complex instructions. This makes them well-suited for tasks like code generation, data analysis, and answering questions that require understanding context or logic.

{% hint style="warning" %}
Complex tasks can take a reasoning model noticeably longer to answer. In such cases, consider using streaming mode so you receive the answer incrementally as it is generated, rather than waiting for the full response.
{% endhint %}
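
As a minimal sketch of the streaming option mentioned above (the endpoint URL in the comment is a placeholder, not a verified value), enabling streaming in an OpenAI-compatible chat-completion request is typically just a matter of setting `stream: true` in the payload:

```python
import json

def build_streaming_request(model: str, prompt: str) -> dict:
    """Build a chat-completion payload that asks for an incremental (streamed) reply."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # deliver tokens as they are generated
    }

payload = build_streaming_request(
    "anthropic/claude-sonnet-4.5",
    "Design a database schema for a library system.",
)
# Send with any HTTP client that supports chunked responses, e.g.:
#   requests.post("https://api.example.com/v1/chat/completions",
#                 json=payload, stream=True)
print(json.dumps(payload, indent=2))
```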

## Models That Support Thinking / Reasoning Mode

### Anthropic

Claude models expose a dedicated `thinking` parameter that provides transparency into the model's step-by-step reasoning before it produces its final answer.

Supported models:

* [anthropic/claude-opus-4](/api-references/text-models-llm/anthropic/claude-4-opus.md)
* [anthropic/claude-sonnet-4](/api-references/text-models-llm/anthropic/claude-4-sonnet.md)
* [anthropic/claude-opus-4.1](/api-references/text-models-llm/anthropic/claude-opus-4.1.md)
* [anthropic/claude-sonnet-4.5](/api-references/text-models-llm/anthropic/claude-4-5-sonnet.md)
* [anthropic/claude-opus-4-5](/api-references/text-models-llm/anthropic/claude-4.5-opus.md)
* [anthropic/claude-opus-4-6](/api-references/text-models-llm/anthropic/claude-4.6-opus.md)
* [anthropic/claude-sonnet-4.6](/api-references/text-models-llm/anthropic/claude-4.6-sonnet.md)
* [anthropic/claude-opus-4-7](/api-references/text-models-llm/anthropic/claude-4.7-opus.md)
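
As a sketch of how the `thinking` parameter is shaped in Anthropic's upstream Messages API (the exact pass-through behavior of this gateway is an assumption here, so treat the payload as illustrative): the parameter takes a `type` and a `budget_tokens` value, and `max_tokens` must exceed that budget, since the reasoning tokens are carved out of the completion allowance.

```python
def build_thinking_request(model: str, prompt: str, budget_tokens: int = 1024) -> dict:
    """Build a Claude chat request with extended thinking enabled.

    max_tokens must be larger than budget_tokens, because the thinking
    budget is spent from the same completion allowance.
    """
    return {
        "model": model,
        "max_tokens": budget_tokens + 1024,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_thinking_request("anthropic/claude-sonnet-4.5", "Plan a three-step refactor.")
```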

### Google

Google does not provide parameters for explicitly controlling a model's reasoning activity during invocation. The reasoning still happens internally, however, and you can see how many tokens it consumed by checking the `reasoning_tokens` field in the response's `usage` section.

<details>

<summary>Example of the "usage" section in a Gemini model response</summary>

```json
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 3050,
    "completion_tokens_details": {
      "reasoning_tokens": 1097
    },
    "total_tokens": 3056
  }
```

</details>
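
To read the reasoning-token count out of a response like the sample above, access the `usage` block defensively, since not every model populates `completion_tokens_details`. A minimal sketch over a plain response dict:

```python
# Truncated example response mirroring the "usage" section shown above.
response = {
    "usage": {
        "prompt_tokens": 6,
        "completion_tokens": 3050,
        "completion_tokens_details": {"reasoning_tokens": 1097},
        "total_tokens": 3056,
    }
}

# Not every model fills completion_tokens_details, so fall back to 0.
details = response["usage"].get("completion_tokens_details") or {}
reasoning_tokens = details.get("reasoning_tokens", 0)
print(reasoning_tokens)  # 1097
```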

Supported models:

* [google/gemini-2.5-flash-lite-preview](/api-references/text-models-llm/google/gemini-2.5-flash-lite-preview.md)
* [google/gemini-2.5-flash](/api-references/text-models-llm/google/gemini-2.5-flash.md)
* [google/gemini-2.5-pro](/api-references/text-models-llm/google/gemini-2.5-pro.md)
* [google/gemini-3-1-pro-preview](/api-references/text-models-llm/google/gemini-3-1-pro-preview.md)
* [google/gemini-3-1-flash-lite-preview](/api-references/text-models-llm/google/gemini-3-1-flash-lite-preview.md)

### OpenAI and other vendors

The standard way to control reasoning behavior in OpenAI models—and in some models from other providers—is through the `reasoning_effort` parameter, which tells the model how much internal reasoning it should perform before responding to the prompt.

Accepted values are `low`, `medium`, and `high`. Lower levels prioritize speed and efficiency, while higher levels provide deeper reasoning at the cost of increased token usage and latency. The default is `medium`, offering a balance between performance and quality.
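The request shape can be sketched as follows (endpoint and authentication are omitted; the helper below is illustrative, not part of any SDK):

```python
VALID_EFFORTS = {"low", "medium", "high"}

def build_reasoning_request(model: str, prompt: str, effort: str = "medium") -> dict:
    """Build a chat-completion payload with an explicit reasoning effort level."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {sorted(VALID_EFFORTS)}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }

req = build_reasoning_request("openai/o3-2025-04-16", "Why is the sky blue?", effort="high")
```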

Supported models:

* [o1](/api-references/text-models-llm/openai/o1.md)
* [o3-mini](/api-references/text-models-llm/openai/o3-mini.md)
* [openai/gpt-4.1-mini-2025-04-14](/api-references/text-models-llm/openai/gpt-4.1-mini.md)
* [openai/gpt-4.1-nano-2025-04-14](/api-references/text-models-llm/openai/gpt-4.1-nano.md)
* [openai/o3-2025-04-16](/api-references/text-models-llm/openai/o3.md)
* [openai/o4-mini-2025-04-16](/api-references/text-models-llm/openai/o4-mini.md)
* [openai/gpt-oss-20b](/api-references/text-models-llm/openai/gpt-oss-20b.md)
* [openai/gpt-oss-120b](/api-references/text-models-llm/openai/gpt-oss-120b.md)
* [openai/gpt-5-2025-08-07](/api-references/text-models-llm/openai/gpt-5.md)
* [openai/gpt-5-mini-2025-08-07](/api-references/text-models-llm/openai/gpt-5-mini.md)
* [openai/gpt-5-nano-2025-08-07](/api-references/text-models-llm/openai/gpt-5-nano.md)
* [openai/gpt-5-1](/api-references/text-models-llm/openai/gpt-5-1.md)
* [openai/gpt-5-2](/api-references/text-models-llm/openai/gpt-5.2.md)
* [openai/gpt-5-4](/api-references/text-models-llm/openai/gpt-5-4.md)
* [openai/gpt-5-4-pro](/api-references/text-models-llm/openai/gpt-5-4-pro.md)
* [openai/gpt-5-5](/api-references/text-models-llm/openai/gpt-5-5.md)
* [openai/gpt-5-5-pro](/api-references/text-models-llm/openai/gpt-5-5-pro.md)

***

* [alibaba/qwen3-32b](/api-references/text-models-llm/alibaba-cloud/qwen3-32b.md)
* [alibaba/qwen3-coder-480b-a35b-instruct](/api-references/text-models-llm/alibaba-cloud/qwen3-coder-480b-a35b-instruct.md)
* [alibaba/qwen3-235b-a22b-thinking-2507](/api-references/text-models-llm/alibaba-cloud/qwen3-235b-a22b-thinking-2507.md)
* [alibaba/qwen3-next-80b-a3b-thinking](/api-references/text-models-llm/alibaba-cloud/qwen3-next-80b-a3b-thinking.md)
* [alibaba/qwen3-vl-32b-thinking](/api-references/text-models-llm/alibaba-cloud/qwen3-vl-32b-thinking.md)
* [alibaba/qwen3.5-plus-20260218](/api-references/text-models-llm/alibaba-cloud/qwen3.5-plus.md)
* [alibaba/qwen3.5-omni-plus](/api-references/text-models-llm/alibaba-cloud/qwen3.5-omni-plus.md)
* [alibaba/qwen3.5-omni-flash](/api-references/text-models-llm/alibaba-cloud/qwen3.5-omni-flash.md)
* [alibaba/qwen3.5-flash](/api-references/text-models-llm/alibaba-cloud/qwen3.5-flash.md)
* [alibaba/qwen3.6-27b](/api-references/text-models-llm/alibaba-cloud/qwen3.6-27b.md)
* [alibaba/qwen3.6-35b-a3b](/api-references/text-models-llm/alibaba-cloud/qwen3.6-35b-a3b.md)

***

* [baidu/ernie-4.5-21b-a3b-thinking](/api-references/text-models-llm/baidu/ernie-4.5-21b-a3b-thinking.md)
* [baidu/ernie-5-0-thinking-preview](/api-references/text-models-llm/baidu/ernie-5.0-thinking-preview.md)
* [baidu/ernie-5-0-thinking-latest](/api-references/text-models-llm/baidu/ernie-5.0-thinking-latest.md)

***

* [bytedance/dola-seed-2-0-mini](/api-references/text-models-llm/bytedance/dola-seed-2.0-mini.md)
* [bytedance/dola-seed-2-0-lite](/api-references/text-models-llm/bytedance/dola-seed-2.0-lite.md)
* [bytedance/dola-seed-2-0-pro](/api-references/text-models-llm/bytedance/dola-seed-2.0-pro.md)
* [bytedance/dola-seed-2-0-code](/api-references/text-models-llm/bytedance/dola-seed-2.0-code.md)

***

* [deepseek/deepseek-v3.2-speciale](/api-references/text-models-llm/deepseek/deepseek-v3.2-speciale.md)
* [deepseek/deepseek-v4-pro](/api-references/text-models-llm/deepseek/deepseek-v4-pro.md)
* [deepseek/deepseek-v4-flash](/api-references/text-models-llm/deepseek/deepseek-v4-flash.md)

***

* [minimax/m2](/api-references/text-models-llm/minimax/m2.md)
* [minimax/m2-1](/api-references/text-models-llm/minimax/m2-1.md)
* [minimax/m2-1-highspeed](/api-references/text-models-llm/minimax/m2.1-highspeed.md)
* [minimax/m2-5-20260218](/api-references/text-models-llm/minimax/m2-5.md)
* [minimax/m2-5-highspeed-20260218](/api-references/text-models-llm/minimax/m2-5-highspeed.md)
* [minimax/m2-7-20260402](/api-references/text-models-llm/minimax/m2-7.md)
* [minimax/m2-7-highspeed](/api-references/text-models-llm/minimax/m2.7-highspeed.md)

***

* [moonshot/kimi-k2-5](/api-references/text-models-llm/moonshot/kimi-k2-5.md)
* [moonshot/kimi-k2-6](/api-references/text-models-llm/moonshot/kimi-k2-6.md)

***

* [nvidia/nemotron-nano-9b-v2](/api-references/text-models-llm/nvidia/nemotron-nano-9b-v2.md)
* [nvidia/nemotron-nano-12b-v2-vl](/api-references/text-models-llm/nvidia/nemotron-nano-12b-v2-vl.md)
* [nvidia/nemotron-3-nano-30b-a3b](/api-references/text-models-llm/nvidia/nemotron-3-nano-30b-a3b.md)
* [nvidia/nemotron-3-nano-omni-30b-a3b-reasoning](/api-references/text-models-llm/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning.md)
* [nvidia/nemotron-3-super-120b-a12b](/api-references/text-models-llm/nvidia/nemotron-3-super-120b-a12b.md)

***

* [x-ai/grok-3-mini-beta](/api-references/text-models-llm/xai/grok-3-mini-beta.md)
* [x-ai/grok-4-07-09](/api-references/text-models-llm/xai/grok-4.md)
* [x-ai/grok-code-fast-1](/api-references/text-models-llm/xai/grok-code-fast-1.md)
* [x-ai/grok-4-fast-reasoning](/api-references/text-models-llm/xai/grok-4-fast-reasoning.md)
* [x-ai/grok-4-1-fast-reasoning](/api-references/text-models-llm/xai/grok-4-1-fast-reasoning.md)
* [x-ai/grok-4-20-0309-reasoning](/api-references/text-models-llm/xai/grok-4-20-reasoning.md)

***

* [xiaomi/mimo-v2.5](/api-references/text-models-llm/xiaomi/mimo-v2.5.md)
* [xiaomi/mimo-v2.5-pro](/api-references/text-models-llm/xiaomi/mimo-v2.5-pro.md)

***

* [zhipu/glm-4.5-air](/api-references/text-models-llm/zhipu/glm-4.5-air.md)
* [zhipu/glm-4.5](/api-references/text-models-llm/zhipu/glm-4.5.md)
* [zhipu/glm-4.7](/api-references/text-models-llm/zhipu/glm-4.7.md)
* [zhipu/glm-5](/api-references/text-models-llm/zhipu/glm-5.md)
* [zhipu/glm-5-1](/api-references/text-models-llm/zhipu/glm-5.1.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.aimlapi.com/capabilities/thinking-reasoning.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
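
Since the question is passed as a query-string value, it should be URL-encoded. A small sketch of building such a request URL (the question text here is just an example):

```python
from urllib.parse import urlencode

PAGE_URL = "https://docs.aimlapi.com/capabilities/thinking-reasoning.md"

def build_ask_url(question: str) -> str:
    """URL-encode a natural-language question into the `ask` query parameter."""
    return f"{PAGE_URL}?{urlencode({'ask': question})}"

url = build_ask_url("Which Claude models support the thinking parameter?")
print(url)
```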
