A

Aya Vision 8b

Developed by CohereLabs
Aya Vision 8B is an open-weight 8-billion-parameter multilingual vision-language model supporting visual and language tasks in 23 languages.
Downloads 29.94k
Release Time : 3/2/2025

Model Overview

A multilingual model optimized for various vision-language applications, including OCR, image captioning, visual reasoning, summarization, Q&A, coding, and more.

Model Features

Multilingual support
Supports visual and language task processing in 23 languages
Efficient visual processing
Uses 169 visual tokens to encode 364x364 pixel image patches, supporting up to 2197 image tokens
Long context support
Supports context lengths of up to 16K
Open weights
Provides an open-weight 8-billion-parameter version for research use

Model Capabilities

Image text recognition (OCR)
Image caption generation
Visual reasoning
Multilingual text generation
Image Q&A
Multimodal summarization

Use Cases

Multilingual applications
Multilingual image captioning
Generate descriptive text for images in different languages
Supports accurate descriptions in 23 languages
Cross-language visual Q&A
Ask questions about image content in different languages
Accurately understands and responds in the corresponding language
Document processing
Multilingual OCR
Recognize multilingual text in images
High-precision recognition of text in 23 languages
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025AIbase