
Mengzi-Oscar Base

Developed by Langboat
A Chinese multimodal pretraining model built on the Oscar framework, initialized from the Mengzi-Bert base model and trained on 3.7 million Chinese image-text pairs.
Downloads: 20
Release Time: 3/2/2022

Model Overview

The Mengzi-Oscar model is a Chinese-oriented multimodal pretraining model that handles joint image-text understanding tasks, making it suitable for scenarios such as image-text matching and visual question answering.
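As a minimal usage sketch, loading the tokenizer for the text half of an image-text pair might look like the following. The Hugging Face Hub id "Langboat/mengzi-oscar-base" and the use of the standard BERT tokenizer are assumptions; the visual half (detector region features) is normally handled by the Oscar training and inference scripts rather than a dedicated transformers class.

# Hypothetical loading sketch; the hub id and tokenizer choice are assumptions.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("Langboat/mengzi-oscar-base")

# Encode a Chinese caption ("a cat is sitting on the sofa") for the text stream.
inputs = tokenizer("一只猫坐在沙发上", return_tensors="pt")
print(inputs["input_ids"].shape)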

Model Features

Multimodal Pretraining
Processes image and text information jointly to achieve cross-modal understanding (see the input-layout sketch after this list)
Chinese Optimization
Specifically optimized for Chinese scenarios, using Mengzi-Bert as the base model
Large-scale Training Data
Trained on 3.7 million Chinese image-text pairs, covering a wide range of scenarios
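To make the multimodal pretraining concrete, here is a rough, hypothetical sketch of the Oscar-style input layout the description implies: caption tokens and detected object tags share the text embedding space, while detector region features are projected into that space and appended. The dimensions and the projection layer below are illustrative assumptions, not the released implementation.

import torch
import torch.nn as nn

hidden_size = 768          # Mengzi-Bert base hidden width
num_regions = 36           # assumed number of detected regions per image
region_feat_dim = 2048     # assumed detector feature size

caption_emb = torch.randn(1, 20, hidden_size)   # embedded caption tokens
tag_emb = torch.randn(1, 5, hidden_size)        # embedded object-tag tokens
region_feats = torch.randn(1, num_regions, region_feat_dim)

# Project visual region features into the text embedding space.
visual_proj = nn.Linear(region_feat_dim, hidden_size)
region_emb = visual_proj(region_feats)

# One joint sequence [caption ; tags ; regions] is fed to the transformer encoder.
joint_input = torch.cat([caption_emb, tag_emb, region_emb], dim=1)
print(joint_input.shape)   # torch.Size([1, 61, 768])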

Model Capabilities

Image-text matching
Visual question answering
Cross-modal understanding
Chinese multimodal task processing

Use Cases

Intelligent Customer Service
Image-based Customer Service Q&A
Answer related questions based on images provided by users
Content Moderation
Image-Text Consistency Review
Check whether the image content matches its description text (a minimal scoring sketch follows)
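As a hedged illustration of the consistency-review flow (not the released fine-tuning code), the matched/unmatched decision can be read from a binary head over the encoder's pooled output. The head, the stand-in pooled vector, and the 0.5 threshold below are assumptions made for the sketch.

import torch
import torch.nn as nn

hidden_size = 768
pooled = torch.randn(1, hidden_size)      # stand-in for the encoder's pooled [CLS] output

match_head = nn.Linear(hidden_size, 1)    # binary image-text matching head (untrained here)
score = torch.sigmoid(match_head(pooled))

is_consistent = bool(score.item() > 0.5)  # flag pairs whose caption fits the image
print(round(score.item(), 3), is_consistent)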