S

Stockmark 2 VL 100B Beta

Developed by stockmark
Stockmark-2-VL-100B-beta is a Japanese-specific vision-language model with 100 billion parameters, equipped with chain-of-thought (CoT) reasoning ability and can be used for document reading and comprehension.
Downloads 184
Release Time : 5/27/2025

Model Overview

This model is optimized for Japanese scenarios, combining image and text information to achieve richer interactions, suitable for tasks such as Japanese document reading and comprehension.

Model Features

Japanese optimization
Designed specifically for Japanese scenarios and optimized for tasks such as Japanese document reading and comprehension
Chain-of-thought reasoning
Equipped with CoT reasoning ability to improve the logic of document understanding and answering
Multimodal processing
Combining image and text information to achieve richer interactions
High-performance visual encoder
Using google/siglip2-so400m-patch14-384 as the visual encoder, with better multilingual performance

Model Capabilities

Document reading and comprehension
Visual question answering
Combined analysis of image and text
Multimodal reasoning

Use Cases

Business analysis
Business slide analysis
Understand the content of complex Japanese business slide images and answer questions
Scored 4.2 in the BusinessSlideVQA benchmark test, outperforming GPT-4o
Data visualization
Chart understanding
Analyze Japanese chart images and answer related questions
Achieved an accuracy of 0.88 in the JChartQA benchmark test
Document processing
Japanese document understanding
Read and understand the content of Japanese documents and answer questions
Scored 3.5 in the JDocQA benchmark test
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025AIbase