
360VL 70B

Developed by qihoo360
360VL is an open-source large multimodal model built on the Llama 3 language model, with strong image understanding and bilingual (Chinese and English) text support.
Downloads 103
Release Time: 5/16/2024

Model Overview

360VL is the industry's first open-source large multimodal model based on Llama 3 70B. It features a globally aware multi-branch projector architecture and supports multi-round image-text dialogue and fine-grained image parsing.

Model Features

Multi-Round Image-Text Dialogue
Supports text and images as input and generates text output, enabling multi-round visual Q&A with a single image.
Bilingual Text Support
Supports Chinese and English dialogues, including text recognition in images.
Powerful Image Understanding
Excels at analyzing visual content, efficiently completing tasks such as image information extraction, organization, and summarization.
Fine-Grained Image Parsing
Supports image understanding at a higher input resolution of 672×672.
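
To make the 672×672 figure concrete, the sketch below computes how an arbitrary image could be scaled and padded to fit a 672×672 square without distorting its aspect ratio. This is an illustrative assumption only: the listing does not describe how 360VL actually preprocesses images, and the function name `fit_to_square` is hypothetical.

```python
TARGET = 672  # input resolution stated in the model features

def fit_to_square(width: int, height: int, target: int = TARGET):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving
    resize of a width x height image into a target x target canvas.

    Hypothetical helper: letterboxing is an assumed strategy, not the
    documented 360VL preprocessing.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale so that the longer side exactly matches the target size.
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Center the resized image; the remaining border would be padding.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top

if __name__ == "__main__":
    # A 1344x672 landscape image is halved and padded top and bottom.
    print(fit_to_square(1344, 672))
```

Letterboxing keeps text and small objects undistorted, which matters for tasks such as reading text in images; a plain stretch to 672×672 would be simpler but would change the aspect ratio.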

Model Capabilities

Visual Question Answering
Image Content Analysis
Chinese-English Text Generation
Image Information Extraction
Multi-Round Dialogue

Use Cases

Visual Question Answering
Image Content Q&A
Users upload an image and ask questions about it, and the model answers based on the image content.
Accurately identifies objects, scenes, and text information in images.
Image Analysis
Image Information Extraction
Extracts key information from images and summarizes it.
Efficiently completes the extraction and organization of image information.
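
The use cases above rely on multi-round Q&A over a single image. A minimal sketch of how a client might keep that conversation state is shown below. All names here (`ImageDialogue`, `ask`, `echo_model`) are hypothetical, and `echo_model` is only a placeholder standing in for a real model call; this is not the 360VL API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (role, text), where role is "user" or "assistant"

@dataclass
class ImageDialogue:
    """Conversation state for several question rounds about one image."""
    image_path: str
    turns: List[Turn] = field(default_factory=list)

    def ask(self, question: str,
            answer_fn: Callable[[str, List[Turn]], str]) -> str:
        # Record the question, then pass the image and the full history
        # so the answer can depend on earlier rounds.
        self.turns.append(("user", question))
        answer = answer_fn(self.image_path, list(self.turns))
        self.turns.append(("assistant", answer))
        return answer

def echo_model(image_path: str, turns: List[Turn]) -> str:
    """Placeholder for a real multimodal model call (assumption)."""
    n_user = sum(1 for role, _ in turns if role == "user")
    return f"[answer {n_user} about {image_path}]"

if __name__ == "__main__":
    chat = ImageDialogue("receipt.png")
    print(chat.ask("Which store issued this receipt?", echo_model))
    print(chat.ask("What is the total amount?", echo_model))
```

Passing the whole history on every call is the simplest design: the second question can refer back to the first answer without the caller tracking anything beyond the `ImageDialogue` object.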