> For the complete documentation index, see [llms.txt](https://docs.aimlapi.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.aimlapi.com/api-references/text-models-llm/google.md).

# Google

Google develops multiple families of AI models focused on reasoning, coding, multimodal understanding, and long-context interaction across text, images, audio, and video. As part of the broader Google ecosystem, the company integrates frontier AI capabilities into developer platforms, productivity tools, search systems, and enterprise infrastructure.

The currently supported model families include:

* **Gemini models** — Google’s flagship multimodal and reasoning-oriented models designed for production AI workloads. The Gemini lineup includes:
  * **Flash models** optimized for low latency and cost efficiency.
  * **Pro models** intended for balanced general-purpose production use.
  * Advanced reasoning-capable variants focused on coding, analytical tasks, tool usage, and agentic workflows.\
    Gemini models place strong emphasis on multimodal processing, long context windows, native tool integration, and real-time interaction capabilities.
* **Gemma models** — lightweight open-weight models designed for efficient deployment, customization, and research workflows. Gemma models are optimized for smaller-scale inference environments while still supporting strong reasoning, coding, and conversational capabilities across a wide range of applications.

All Google models in this developer are accessed through the standard `/v1/chat/completions` endpoint, providing a unified OpenAI-compatible integration layer across the entire model catalog.

***

Supported capabilities vary depending on the specific model, with different models offering different combinations of the features listed below.

* **Text completions**: Build conversational systems and advanced text-processing pipelines.
* **Function Calling**: Utilize tools, APIs, and structured workflows.
* **Stream mode**: Receive partial responses incrementally as tokens are generated.
* **Batch Processing**: Execute multiple independent requests within a single API call.
* **Vision Tasks**: Process and analyze images and visual inputs.
* **Audio Tasks**: Transcribe, generate, and process audio content.
* **Video Tasks**: Analyze and reason over video inputs.

***

Other model categories from this provider are available as well.

* [Image](/api-references/image-models/google.md)
* [Video](/api-references/video-models/google.md)
* [Music](/api-references/vision-models/ocr-optical-character-recognition/google.md)
* [Vision(OCR)](/api-references/music-models/google.md)
* [Embedding](/api-references/embedding-models/google.md)


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://docs.aimlapi.com/api-references/text-models-llm/google.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.