Granite Vision 3.3 2b Embedding

Developed by ibm-granite
An efficient embedding model built on granite-vision-3.3-2b, designed for multimodal document retrieval and capable of processing documents containing tables, charts, infographics, and complex layouts.
Downloads 205
Release Time: 6/3/2025

Model Overview

This model generates ColBERT-style multi-vector page representations without OCR-based text extraction, simplifying and accelerating the RAG pipeline.
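
To make the "ColBERT-style multi-vector" idea concrete, here is a minimal sketch of late-interaction (MaxSim) scoring in plain Python. It assumes the query and each page have already been encoded into lists of vectors. The toy 2-dimensional vectors and the helper names (`normalize`, `maxsim_score`, `rank_pages`) are illustrative only and are not part of the model's API.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vectors, page_vectors):
    """Late-interaction score: for each query vector, take the best cosine
    match among the page vectors, then sum those maxima."""
    q = [normalize(v) for v in query_vectors]
    p = [normalize(v) for v in page_vectors]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, pv)) for pv in p)
    return total


def rank_pages(query_vectors, pages):
    """Return (page_name, score) pairs sorted from best to worst match."""
    scores = {name: maxsim_score(query_vectors, vecs) for name, vecs in pages.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): two query vectors and two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "page_b": [[1.0, 0.0], [-1.0, 0.0]],
}
ranking = rank_pages(query, pages)
print(ranking)
```

In this toy case `page_a` ranks first because both query vectors find a close match among its vectors, while `page_b` matches only one of them. In a real pipeline the vectors would come from the embedding model rather than being written by hand.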

Model Features

Multimodal document processing
Capable of processing documents containing tables, charts, infographics, and complex layouts
ColBERT-style representation
Generates ColBERT-style multi-vector representations of pages to improve retrieval efficiency
No OCR requirement
No OCR-based text extraction is required, simplifying the RAG pipeline
Efficient retrieval
Optimized for accelerating multimodal document retrieval

Model Capabilities

Multimodal document embedding
Image-text similarity calculation
Complex layout document processing
Cross-modal retrieval

Use Cases

Document retrieval
Financial report retrieval
Retrieve relevant information from financial reports containing tables and charts
NDCG@5 reaches 70 on the FinReport dataset
Technical document retrieval
Retrieve specific information from technical reports and slides
NDCG@5 reaches 84 and 93 on the TechReport and TechSlides datasets respectively
Cross-modal search
Image-text matching
Calculate the similarity between an image and a text description
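
The NDCG@5 figures quoted in the use cases above measure how well the top five retrieved pages are ordered. The following is a minimal sketch of the metric, assuming the common linear-gain formulation (relevance divided by log2 of rank + 1); the source does not state which variant or evaluation tooling was used, and the scores there are presumably reported on a 0 to 100 scale. The function names are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k, all_relevances=None):
    """NDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.

    ranked_relevances: relevance labels in the order the system returned them.
    all_relevances: labels of every relevant candidate, if some were not
    retrieved; defaults to the retrieved list itself.
    """
    pool = all_relevances if all_relevances is not None else ranked_relevances
    ideal_dcg = dcg_at_k(sorted(pool, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant items at all
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Hypothetical query: the single relevant page was returned at rank 2.
score = ndcg_at_k([0, 1, 0, 0, 0], k=5)
print(round(score * 100, 1))
```

A benchmark score is typically the mean of this value over all queries in the dataset.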