EraX-VL-2B-V1.5 i1 GGUF
EraX-VL-2B-V1.5 is a multimodal foundation model supporting Vietnamese, English, and Chinese, with capabilities for image-to-text and image-text-to-text conversion.
Downloads 467
Release Time: 12/29/2024
Model Overview
This is a multimodal vision-language model focused on image-to-text and image-text-to-text tasks. It is particularly suited to optical character recognition (OCR) and to document processing in fields such as insurance.
Model Features
Multilingual Support
Supports text processing in three languages: Vietnamese, English, and Chinese
Multimodal Capability
Accepts image and text inputs and produces text output, covering both image-to-text and image-text-to-text tasks
Diverse Quantized Versions
Offers multiple quantized versions to accommodate different hardware and performance needs
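To make the trade-off between quantized versions concrete, the following minimal sketch estimates file size as parameters times bits per weight, then picks the highest-precision version that fits a memory budget. It rests on assumptions not stated on this page: the nominal 2 billion parameter count implied by "2B", and the quantization types listed (Q4_0, Q5_0, Q8_0, F16), which are standard llama.cpp formats but may not match the files actually published for this model. The estimate ignores file metadata, any separate vision projector file, and runtime memory such as the KV cache.

```python
from typing import Optional

# Assumption: nominal parameter count taken from the "2B" in the model name.
NOMINAL_PARAMS = 2_000_000_000

# Bits per weight. The Q*_0 formats store blocks of 32 weights plus one
# fp16 scale, e.g. Q8_0 = (32 * 8 + 16) / 32 = 8.5 bits per weight.
# Assumption: these quant types are illustrative, not a list of published files.
BITS_PER_WEIGHT = {
    "Q4_0": 4.5,
    "Q5_0": 5.5,
    "Q8_0": 8.5,
    "F16": 16.0,
}


def estimated_size_gb(quant: str, params: int = NOMINAL_PARAMS) -> float:
    """Approximate weight size in GB (10^9 bytes), weights only."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 1e9


def pick_quant(budget_gb: float, params: int = NOMINAL_PARAMS) -> Optional[str]:
    """Return the highest-precision quant whose estimate fits the budget, else None."""
    fitting = [q for q in BITS_PER_WEIGHT if estimated_size_gb(q, params) <= budget_gb]
    return max(fitting, key=BITS_PER_WEIGHT.get) if fitting else None


if __name__ == "__main__":
    for name in BITS_PER_WEIGHT:
        print(f"{name}: ~{estimated_size_gb(name):.3f} GB")
    print("Best fit for a 1.5 GB budget:", pick_quant(1.5))
```

Lower-bit versions reduce memory use at some cost in output quality, so in practice the choice would be checked against the actual file sizes listed for the model and against accuracy on representative documents.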
Model Capabilities
Image-to-text
Image-text-to-text
Multilingual processing
Optical Character Recognition (OCR)
Use Cases
Insurance Industry
Insurance Document Processing
Automatically identifies and processes text information in insurance documents
Document Digitization
Document OCR
Converts text in scanned documents or images into editable text