Google OCR
Model Overview
This API extracts text from images using optical character recognition (OCR).
Set up your API Key
If you don’t have an API key for the AI/ML API yet, feel free to use our Quickstart guide.
API Schema
Performs optical character recognition (OCR) to extract text from images, enabling text-based analysis, data extraction, and automation workflows from visual data.
Authorizations
Body
model · enum · Optional · Possible values:
document · any of · Required
The document file to be processed by the OCR model.
string · uri · Optional
string · Optional
mimeType · string · enum · Optional · Possible values:
The MIME type of the document.
pages · any of · Optional
The specific pages you want to process.
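The body fields above can be assembled programmatically. The following is a minimal sketch, not an official client: `build_ocr_body` is a hypothetical helper, the default model name and the `{"type": "start", "start": N}` shape for `pages` are copied from the example request on this page, and guessing `mimeType` from the URL extension is an assumption about what callers may find convenient. The accepted values for `model` and `mimeType` are not listed here, so the helper does not validate them.

```python
import mimetypes
from urllib.parse import urlparse


def build_ocr_body(document_url, model="google/gc-document-ai",
                   mime_type=None, start_page=None):
    """Assemble a JSON-serializable body for POST /v1/ocr (hypothetical helper)."""
    if not document_url:
        # `document` is the only field marked Required in the schema.
        raise ValueError("document is required")

    body = {"model": model, "document": document_url}

    # Assumption: when no MIME type is given, guess it from the file
    # extension in the URL path; omit the field if nothing can be guessed.
    if mime_type is None:
        mime_type, _ = mimetypes.guess_type(urlparse(document_url).path)
    if mime_type is not None:
        body["mimeType"] = mime_type

    # Assumption: this `pages` shape mirrors the example request; the other
    # accepted forms are not documented on this page.
    if start_page is not None:
        body["pages"] = {"type": "start", "start": start_page}

    return body
```

Calling `build_ocr_body("https://example.com/report.pdf", start_page=1)` yields a dictionary with `model`, `document`, `mimeType`, and `pages` keys that can be serialized with `json.dumps`.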
Responses
201
Successfully processed document with OCR
application/json
POST /v1/ocr HTTP/1.1
Host: api.aimlapi.com
Authorization: Bearer <YOUR_AIMLAPI_KEY>
Content-Type: application/json
Accept: */*
Content-Length: 130
{
  "model": "google/gc-document-ai",
  "document": "https://example.com",
  "mimeType": "application/pdf",
  "pages": {
    "type": "start",
    "start": 1
  }
}
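The raw HTTP request above could be sent from Python roughly as follows. This is a sketch using only the standard library; `make_ocr_request` and `run_ocr` are hypothetical names, and the `https://` scheme is an assumption, since the example only shows the `Host` header. Building the request is kept separate from sending it so that the headers and payload can be inspected without a network call.

```python
import json
import urllib.request

# Assumption: the endpoint is served over HTTPS at the host shown above.
API_URL = "https://api.aimlapi.com/v1/ocr"


def make_ocr_request(api_key, body):
    """Build (but do not send) the POST request for the OCR endpoint."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def run_ocr(api_key, body, timeout=60):
    """Send the request and return the decoded JSON response."""
    request = make_ocr_request(api_key, body)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call would be `run_ocr(api_key, {"model": "google/gc-document-ai", "document": "https://example.com"})`, with the key read from wherever your application stores secrets.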
201
Successfully processed document with OCR
{
  "pages": [
    {
      "index": 1,
      "markdown": "text",
      "images": [
        {
          "id": "text",
          "top_left_x": 1,
          "top_left_y": 1,
          "bottom_right_x": 1,
          "bottom_right_y": 1,
          "image_base64": "https://example.com"
        }
      ],
      "dimensions": {
        "dpi": 1,
        "height": 1,
        "width": 1
      }
    }
  ],
  "model": "mistral-ocr-latest",
  "usage_info": {
    "pages_processed": 1,
    "doc_size_bytes": 1
  }
}
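Once the response has been decoded into a dictionary, the recognized text can be pulled out of the `pages` array. The sketch below assumes the response has the shape shown in the example above; `extract_markdown` and `summarize_response` are hypothetical helper names, and joining pages with a blank line is only one possible choice.

```python
def extract_markdown(response):
    """Concatenate the `markdown` field of every page, ordered by `index`."""
    pages = sorted(response.get("pages", []), key=lambda p: p.get("index", 0))
    return "\n\n".join(page.get("markdown", "") for page in pages)


def summarize_response(response):
    """Collect usage figures and the number of embedded images."""
    usage = response.get("usage_info", {})
    return {
        "pages_processed": usage.get("pages_processed"),
        "doc_size_bytes": usage.get("doc_size_bytes"),
        "image_count": sum(
            len(page.get("images", [])) for page in response.get("pages", [])
        ),
    }
```

Using `.get` with defaults keeps the helpers from raising if a field is absent, which may be useful because this page does not state which response fields are always present.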