EraX-VL-2B-V1.5 Q4_K_M GGUF
A multimodal visual question answering model supporting Vietnamese, English, and Chinese. It is a GGUF conversion of erax-ai/EraX-VL-2B-V1.5.
Release Date: 1/2/2025
Model Overview
This visual question answering (VQA) model takes an image and a text prompt as input and generates a text answer. It is particularly suited to insurance document processing and optical character recognition (OCR).
Model Features
Multilingual Support
Supports visual question answering tasks in three languages: Vietnamese, English, and Chinese.
GGUF Format Optimization
Converted to GGUF format for efficient inference with tools such as llama.cpp.
Multimodal Capability
Capable of processing both image and text inputs for cross-modal understanding.
Industry Application Optimization
Specifically optimized for applications such as insurance and OCR.
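As a rough illustration of how the GGUF and multimodal features above fit together, the sketch below assembles a command line for one of llama.cpp's multimodal command-line tools. It rests on several assumptions not stated on this page: the binary name (`llama-mtmd-cli` here; older llama.cpp builds shipped differently named multimodal binaries), all file names, and the availability of a separate vision projector (`mmproj`) GGUF file alongside the quantized model. The helper `build_vqa_command` is hypothetical and only builds the argument list; it does not run anything.

```python
import shlex


def build_vqa_command(model_path, mmproj_path, image_path, question,
                      binary="llama-mtmd-cli"):
    """Build an argument list for a llama.cpp multimodal CLI call.

    All paths and the binary name are placeholders (assumptions);
    adjust them to match the local llama.cpp build and downloaded files.
    """
    return [
        binary,
        "-m", str(model_path),        # quantized language model (GGUF)
        "--mmproj", str(mmproj_path),  # vision projector (GGUF), assumed available
        "--image", str(image_path),    # input image, e.g. a scanned document
        "-p", question,                # text prompt / question about the image
    ]


if __name__ == "__main__":
    cmd = build_vqa_command(
        "erax-vl-2b-v1.5-q4_k_m.gguf",   # hypothetical file name
        "mmproj-erax-vl-2b-v1.5.gguf",   # hypothetical file name
        "insurance_form.png",            # hypothetical input image
        "What is the policy number on this form?",
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

Printing the command first, rather than executing it, makes it easy to check the paths and binary name before passing the list to something like `subprocess.run`.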
Model Capabilities
Visual Question Answering
Image Understanding
Multilingual Processing
Text Generation
Use Cases
Insurance
Insurance Document Processing
Automatically identify and analyze information in insurance documents.
Healthcare
Prescription Recognition
Recognize text and content in medical prescriptions.