Finetuned-ViT Image-Text Classifier Open-Source Model - Precise Identification of Image Text and Text Types

Finetuned ViT Image Text Classifier

Developed by ernie-ai

An image classification model based on the ViT architecture, designed to identify whether an image contains text and, if it does, which script the text uses (Latin, Chinese, or Arabic).

Image Classification

Transformers

Open Source License: Apache-2.0

#Multilingual Text Recognition #Document Image Classification #High-Accuracy ViT

Downloads 45

Release Time: 2/8/2023

Model Overview

This model is a fine-tuned image classifier based on google/vit-base-patch16-224-in21k, specifically designed for document text classification tasks. It can identify text types (Latin, Chinese, Arabic) and non-text images.
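A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub and loads through the Transformers `image-classification` pipeline. The repository ID is a placeholder rather than a confirmed identifier, the label strings returned depend on the checkpoint's own configuration, and `top_prediction` and `classify_image` are illustrative helper names.

```python
from typing import Dict, List, Tuple

# Assumption: placeholder repository ID; replace with the model's actual Hub ID.
MODEL_ID = "<namespace>/<model-name>"


def top_prediction(predictions: List[Dict]) -> Tuple[str, float]:
    """Return the (label, score) pair with the highest score.

    `predictions` has the shape the image-classification pipeline returns:
    a list of {"label": str, "score": float} dicts.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], float(best["score"])


def classify_image(image_path: str, model_id: str = MODEL_ID) -> Tuple[str, float]:
    """Load the classifier and return the top prediction for one image."""
    # Imported lazily so top_prediction works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image_path))
```

Calling `classify_image("scan.png")` with a real repository ID would return the most likely class and its score. For batch work, building the pipeline once and reusing it avoids reloading the weights on every call.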

Model Features

High-Accuracy Text Classification

Reaches 90.3% accuracy on the evaluation set (0.9030 at step 300 of training), effectively distinguishing between the supported text types.

ViT-Based Architecture

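Built on the Vision Transformer (ViT) architecture. The google/vit-base-patch16-224-in21k backbone splits each 224×224 input image into 16×16 patches and was pretrained on ImageNet-21k before fine-tuning.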

Multi-Category Recognition

Assigns each image to one of four classes: Latin text, Chinese text, Arabic text, or no text.

Model Capabilities

Image Classification

Text Type Recognition

Document Image Analysis

Use Cases

Document Processing

Multilingual Document Classification

Automatically classify scanned documents by the script of the text they contain.

Accurately distinguishes between Latin, Chinese, and Arabic documents.

Image Content Filtering

Select or exclude images in a collection according to the script of the text they contain.
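As a rough sketch of this filtering step, the helper below keeps only the images whose predicted class is in a wanted set. The classifier is passed in as a plain callable so that any model wrapper can be plugged in. The function name `filter_by_script` and the label strings used in the usage note are hypothetical.

```python
from typing import Callable, Iterable, List, Set


def filter_by_script(
    paths: Iterable[str],
    classify: Callable[[str], str],
    wanted: Set[str],
) -> List[str]:
    """Return the paths whose predicted label is in `wanted`, in input order.

    `classify` maps an image path to a single label string; it stands in
    for whatever wrapper is placed around the model.
    """
    return [path for path in paths if classify(path) in wanted]
```

For example, `filter_by_script(paths, classify, {"arabic"})` would keep only images labeled as Arabic, provided `"arabic"` matches a label the checkpoint actually emits.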

OCR Preprocessing

OCR Language Identification

Identify the text type in documents before OCR processing.

Improves the accuracy of subsequent OCR processing.
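One way to use the prediction ahead of OCR is to map the predicted class to an OCR language setting and to skip OCR entirely for non-text images. The sketch below assumes Tesseract-style language codes (`eng`, `chi_sim`, `ara`) and hypothetical label strings. Mapping Latin script to `eng` is a simplification, since Latin script covers many languages.

```python
from typing import Optional

# Assumption: label strings are hypothetical; check the checkpoint's
# id2label mapping for the real names.
# Values are Tesseract language codes, chosen here for illustration.
OCR_LANG_BY_LABEL = {
    "latin": "eng",
    "chinese": "chi_sim",
    "arabic": "ara",
}


def ocr_language_for(label: str) -> Optional[str]:
    """Return an OCR language code for a predicted label.

    Returns None for any label without a mapping (for example a
    non-text class), which the caller can treat as "skip OCR".
    """
    return OCR_LANG_BY_LABEL.get(label.strip().lower())
```

Returning `None` for unmapped labels keeps the routing decision in one place: the caller runs OCR only when a language code comes back.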

Training Results

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.2719	2.08	100	0.4120	0.8657
0.1027	4.17	200	0.3907	0.8881
0.0723	6.25	300	0.3107	0.9030
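The logged rows also allow a quick sanity check. Dividing each step count by its epoch value gives the approximate number of optimizer steps per epoch, and the best checkpoint can be read off by accuracy. The snippet below only restates the logged numbers; the training-set size cannot be recovered from them without the batch size, which is not given here.

```python
# Rows copied from the training log:
# (training_loss, epoch, step, validation_loss, accuracy)
rows = [
    (0.2719, 2.08, 100, 0.4120, 0.8657),
    (0.1027, 4.17, 200, 0.3907, 0.8881),
    (0.0723, 6.25, 300, 0.3107, 0.9030),
]

# Steps per epoch implied by each row (step / epoch, rounded).
steps_per_epoch = [round(step / epoch) for _, epoch, step, _, _ in rows]
print(steps_per_epoch)  # [48, 48, 48]

# Checkpoint with the highest logged accuracy.
best = max(rows, key=lambda row: row[4])
print(best[2], best[4])  # 300 0.903
```

All three rows agree on roughly 48 steps per epoch, and the last logged checkpoint (step 300) has both the highest accuracy and the lowest validation loss.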
