MGP-STR Base

Developed by alibaba-damo
MGP-STR is a pure vision-based scene text recognition model that achieves efficient OCR through multi-granularity prediction.
Downloads 4,981
Release Time: 11/23/2022

Model Overview

This model performs optical character recognition (OCR) on text images. It uses a Vision Transformer (ViT) backbone and a purpose-built A^3 (Adaptive Addressing and Aggregation) module to predict text at three granularities: character, subword, and word.
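As a rough illustration of how the model might be called, the sketch below uses the `MgpstrProcessor` and `MgpstrForSceneTextRecognition` classes from the Hugging Face `transformers` library. The function name `recognize_text` and the model identifier `alibaba-damo/mgp-str-base` are assumptions inferred from the developer and model names on this page, not details stated here.

```python
def recognize_text(image_path, model_id="alibaba-damo/mgp-str-base"):
    """Return the text recognized in a single cropped text image.

    Assumptions: `transformers`, `torch`, and `Pillow` are installed, and
    `model_id` (a guess based on this page) resolves on the Hugging Face Hub.
    """
    # Imported lazily so that defining this function has no heavy dependencies.
    from PIL import Image
    from transformers import MgpstrForSceneTextRecognition, MgpstrProcessor

    processor = MgpstrProcessor.from_pretrained(model_id)
    model = MgpstrForSceneTextRecognition.from_pretrained(model_id)

    # The processor resizes and normalizes the image for the ViT backbone.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # The model returns logits for the character, subword, and word-level heads.
    outputs = model(pixel_values)

    # batch_decode fuses the per-granularity predictions into final strings.
    return processor.batch_decode(outputs.logits)["generated_text"][0]
```

Calling `recognize_text("sign.png")` on a tightly cropped word image should return a single string. The first call downloads the model weights, so it needs network access.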

Model Features

Multi-Granularity Prediction
Predicts at the character, subword, and word levels simultaneously and merges the results with a fusion strategy.
Pure Vision Architecture
Does not rely on a language model; text is recognized from visual features alone.
A^3 Module
A purpose-built attention module that selects and integrates meaningful token combinations.
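This page does not describe the fusion strategy in detail. The following is a minimal sketch, assuming each granularity head's confidence is the product of its per-token probabilities and the highest-confidence prediction is kept. The function name `fuse_predictions`, the granularity keys, and the sample probabilities are all hypothetical and chosen for illustration.

```python
from math import prod


def fuse_predictions(candidates):
    """Pick the most confident prediction across granularities.

    `candidates` maps a granularity name to (text, per-token probabilities).
    Assumption: confidence is the product of the token probabilities.
    """
    best = None
    for name, (text, token_probs) in candidates.items():
        # A head that produced no tokens gets zero confidence.
        score = prod(token_probs) if token_probs else 0.0
        if best is None or score > best[2]:
            best = (name, text, score)
    return best


# Hypothetical outputs of the three heads for one image.
candidates = {
    "char": ("hause", [0.9, 0.6, 0.9, 0.9, 0.9]),
    "subword": ("house", [0.95, 0.9]),
    "word": ("horse", [0.7]),
}
name, text, score = fuse_predictions(candidates)
print(name, text)  # prints "subword house"
```

Multiplying probabilities penalizes longer token sequences, so in this sketch a head that covers the word in fewer, more confident tokens tends to win.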

Model Capabilities

Image-to-Text Conversion
Scene Text Recognition
Optical Character Recognition (OCR)

Use Cases

Document Digitization
Scanned Document Recognition
Converts scanned document images into editable text
High-precision recognition of printed text
Scene Text Recognition
Street View Text Recognition
Recognizes text in photos such as street signs and storefronts
Handles text in varied fonts and against varied backgrounds