
Eagle X5 13B Chat

Developed by NVEagle
Eagle is a family of vision-centric, high-resolution multimodal large language models. It supports input resolutions above 1K and performs strongly on tasks such as optical character recognition (OCR) and document understanding.
Downloads 1,748
Release Time: 8/23/2024

Model Overview

This model strengthens the perception ability of multimodal large language models by combining multiple vision encoders with different input resolutions. It uses a 'CLIP+X' fusion method based on channel concatenation, which merges vision experts that differ in architecture and in the knowledge they were trained on.
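As a rough illustration of channel-based fusion (channel splicing, i.e. concatenation along the feature dimension), the sketch below joins the per-token features of two encoders into one wider feature vector per token. It is a minimal sketch, not Eagle's actual implementation: the token count, the channel widths, the random arrays standing in for encoder outputs, and the final linear projection are all assumptions made for illustration, as is the function name `fuse_by_channel_concat`.

```python
import numpy as np


def fuse_by_channel_concat(feature_maps):
    """Concatenate per-token features from several encoders along the channel axis.

    Each array has shape (num_tokens, channels). The encoders must agree on
    num_tokens; how they are spatially aligned is outside this sketch.
    """
    token_counts = {f.shape[0] for f in feature_maps}
    if len(token_counts) != 1:
        raise ValueError("all encoders must yield the same number of tokens")
    return np.concatenate(feature_maps, axis=-1)


rng = np.random.default_rng(0)
num_tokens = 16  # hypothetical number of visual tokens

# Random stand-ins for encoder outputs (hypothetical channel widths).
clip_feats = rng.standard_normal((num_tokens, 8))    # the "CLIP" branch
expert_feats = rng.standard_normal((num_tokens, 4))  # the "X" branch, e.g. an OCR expert

fused = fuse_by_channel_concat([clip_feats, expert_feats])

# Hypothetical linear projection into the language model's embedding width.
projection = rng.standard_normal((fused.shape[1], 6))
llm_tokens = fused @ projection

print(fused.shape, llm_tokens.shape)
```

Concatenating along the channel axis keeps the number of visual tokens fixed no matter how many experts are added; only the per-token width grows, and the projection absorbs that growth.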

Model Features

Multimodal fusion
Uses the 'CLIP+X' fusion method, based on channel concatenation, to combine vision experts with different architectures (ViT and convolutional networks) and different knowledge (detection, segmentation, OCR, and self-supervised learning).
High-resolution support
Supports input resolutions above 1K and performs strongly on resolution-sensitive tasks.

Model Capabilities

Image understanding
Text generation
Optical character recognition
Document understanding

Use Cases

Document processing
Document content understanding
Parses and interprets the content and structure of high-resolution documents.
Performs strongly on high-resolution document understanding tasks.
Image analysis
Complex scene understanding
Analyzes high-resolution images that contain fine detail.
Maintains high accuracy in detail-rich scenes.