glm-ocr


This documentation applies to the following model:

  • zhipu/glm-ocr

Model Overview

A lightweight, production-ready OCR model that returns recognized content in Markdown format. It delivers high recognition accuracy even on complex page layouts, varied fonts, and mixed text-image documents.

Maximum file size: 50 MB. Maximum number of pages: 100.
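The limits above can be checked before a document is submitted. The following is a minimal sketch, not part of the API: the helper name `within_limits` is hypothetical, and "50 MB" is read conservatively as 50,000,000 bytes, which is an assumption because the source does not say whether decimal or binary megabytes are meant.

```python
# Documented limits for glm-ocr.
MAX_FILE_BYTES = 50 * 1000 * 1000  # assumption: "50 MB" taken as decimal megabytes
MAX_PAGES = 100


def within_limits(size_bytes: int, page_count: int) -> bool:
    """Return True if a document fits both the size and the page limit.

    Hypothetical helper for a client-side pre-flight check; the service
    itself remains the authority on what it accepts.
    """
    size_ok = 0 < size_bytes <= MAX_FILE_BYTES
    pages_ok = 0 < page_count <= MAX_PAGES
    return size_ok and pages_ok
```

For a local file, `os.path.getsize(path)` gives the byte count to pass in; the page count has to come from whatever PDF tooling is already in use.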


Set Up Your API Key

If you don’t have an API key for the AI/ML API yet, follow our Quickstart guide.

How to Make a Call

Step-by-Step Instructions

  • Copy the code from one of the examples below, depending on whether you want to process an image or a PDF.

  • Replace <YOUR_AIMLAPI_KEY> with your AIML API key from your personal account.

  • Replace the URL of the document or image with the one you need.

  • If you need to use different parameters, refer to the API schema below for valid values and operational logic.

  • Save the modified code as a Python file and run it in an IDE or via the console.

API Schema

POST /v1/ocr

Authorizations

  • Authorization (string, required): Bearer key.

Body

  • model (enum, optional)

  • document (one of, required): Document to run OCR.

  • pages (any of: string, integer[], or any nullable value; optional): Specific pages to process, e.g. "3", "0-2", [0, 3, 4].

  • include_image_base64 (boolean, nullable, optional): Include base64 images in response.

  • image_limit (integer, nullable, optional): Max images to extract.

  • image_min_size (integer, nullable, optional): Minimum height and width of image to extract.

  • return_crop_images (boolean, nullable, optional): Whether to return screenshot information.

  • need_layout_visualization (boolean, nullable, optional): Whether to return detailed layout image result information.

Responses

  • 201 (application/json): Successfully processed document with OCR. Response body: any (optional).
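To illustrate how the body fields fit together, here is a minimal sketch of a request-body builder. The function name `build_ocr_body` is hypothetical, and `document` is passed through as an opaque dictionary because its exact "one of" shape is not spelled out in the schema above. Optional fields left as `None` are simply omitted.

```python
from typing import Optional, Sequence, Union

# "pages" accepts a string such as "3" or "0-2", or a list of integers.
Pages = Union[str, Sequence[int], None]


def build_ocr_body(
    document: dict,
    model: Optional[str] = None,
    pages: Pages = None,
    include_image_base64: Optional[bool] = None,
    image_limit: Optional[int] = None,
    image_min_size: Optional[int] = None,
    return_crop_images: Optional[bool] = None,
    need_layout_visualization: Optional[bool] = None,
) -> dict:
    """Assemble a JSON-serializable body for POST /v1/ocr (hypothetical helper)."""
    if pages is not None and not isinstance(pages, str):
        pages = list(pages)  # normalize tuples and other sequences to a JSON array

    optional_fields = {
        "model": model,
        "pages": pages,
        "include_image_base64": include_image_base64,
        "image_limit": image_limit,
        "image_min_size": image_min_size,
        "return_crop_images": return_crop_images,
        "need_layout_visualization": need_layout_visualization,
    }
    body = {"document": document}  # the only required field
    # Skip unset options so the request carries only what the caller chose.
    body.update({k: v for k, v in optional_fields.items() if v is not None})
    return body
```

Omitting unset fields rather than sending explicit nulls keeps the request small; since most fields are documented as nullable, sending `null` would presumably also be accepted.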

Example #1: Text Recognition From an Image

This example passes a photo of a short handwritten text to the model via URL.
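A request of this kind might look like the sketch below, which uses only the Python standard library. Several details are assumptions rather than documented facts: the base URL `https://api.aimlapi.com`, the `{"type": "image_url", "image_url": ...}` shape of `document`, and the helper names `build_image_request` and `send`. The image URL in the usage note is a placeholder.

```python
import json
import urllib.request

# Assumed base URL combined with the documented /v1/ocr path.
API_URL = "https://api.aimlapi.com/v1/ocr"


def build_image_request(api_key: str, image_url: str) -> urllib.request.Request:
    """Build a POST request asking the model to read an image by URL."""
    body = {
        "model": "zhipu/glm-ocr",
        # Assumed document shape; check the API schema for the exact fields.
        "document": {"type": "image_url", "image_url": image_url},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `send(build_image_request("<YOUR_AIMLAPI_KEY>", "https://example.com/handwritten-note.jpg"))` would return the parsed JSON response, which can then be printed or inspected.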


Example #2: Process a PDF File

This example processes a PDF file from the internet using the model described above.
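As a rough sketch, a PDF request could combine a document URL with the `pages` parameter from the schema. The base URL, the `{"type": "document_url", "document_url": ...}` shape of `document`, and the helper name `build_pdf_request` are assumptions, and the PDF URL in the usage note is a placeholder.

```python
import json
import urllib.request
from typing import Optional, Sequence, Union

# Assumed base URL combined with the documented /v1/ocr path.
API_URL = "https://api.aimlapi.com/v1/ocr"


def build_pdf_request(
    api_key: str,
    pdf_url: str,
    pages: Union[str, Sequence[int], None] = None,
) -> urllib.request.Request:
    """Build a POST request asking the model to process a PDF by URL."""
    body = {
        "model": "zhipu/glm-ocr",
        # Assumed document shape; check the API schema for the exact fields.
        "document": {"type": "document_url", "document_url": pdf_url},
    }
    if pages is not None:
        # Accepts "3", "0-2", or a list such as [0, 3, 4], per the schema.
        body["pages"] = pages if isinstance(pages, str) else list(pages)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen`, for example with `build_pdf_request("<YOUR_AIMLAPI_KEY>", "https://example.com/report.pdf", pages="0-2")`, would submit the request. Limiting `pages` is one way to stay within the 100-page maximum for long documents.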

