Qwen3 4B Base

Developed by Qwen
Qwen3-4B-Base is the latest-generation 4-billion-parameter base language model in the Qwen series, pre-trained on 36 trillion tokens of multilingual data and supporting a 32k-token context length.
Downloads: 50.84k
Release Time: 4/28/2025

Model Overview

Qwen3-4B-Base is a causal language model focused on general language understanding and generation, suited to scenarios such as text generation and code completion.
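The model card does not include usage code, so the following is a minimal, hedged sketch of how a causal language model like this one is typically loaded and used for text generation with Hugging Face Transformers. The repository id "Qwen/Qwen3-4B-Base" and the sampling parameters are assumptions for illustration.

```python
# Minimal sketch (not from the model card): load the base model and generate
# a continuation with Hugging Face Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B-Base"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # place weights on available GPU(s)/CPU
)

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling settings below are illustrative, not recommended defaults.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

As a base (non-instruct) model, it is used by plain text continuation rather than chat-style prompting.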

Model Features

Large-scale multilingual pre-training
Pre-trained on 36 trillion tokens of data covering 119 languages, with language coverage three times that of the previous generation.
Three-stage training optimization
Adopts a three-stage pre-training paradigm: general language modeling → specialized capability enhancement → long-context training.
Long-context support
Supports processing ultra-long contexts of up to 32k tokens.
Efficient attention mechanism
Uses a Grouped Query Attention (GQA) architecture with 32 query heads and 8 key-value heads, so each key-value head is shared by a group of 4 query heads (see the sketch after this list).
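To make the GQA feature concrete, here is a small illustrative sketch of how 32 query heads can share 8 key-value heads by repeating each key-value head across its group. The tensor sizes (sequence length, head dimension) are assumptions for the example and do not reflect the model's actual configuration.

```python
# Illustrative Grouped Query Attention (GQA) sketch: 32 query heads share
# 8 key/value heads, i.e. each KV head serves a group of 4 query heads.
import torch
import torch.nn.functional as F

batch, seq_len, head_dim = 1, 16, 128      # assumed sizes for illustration
n_q_heads, n_kv_heads = 32, 8
group_size = n_q_heads // n_kv_heads        # 4 query heads per KV head

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Repeat each KV head so it is shared by its group of query heads.
k = k.repeat_interleave(group_size, dim=1)  # -> (batch, 32, seq, head_dim)
v = v.repeat_interleave(group_size, dim=1)

attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(attn.shape)  # torch.Size([1, 32, 16, 128])
```

The practical benefit is a smaller key-value cache: only 8 KV heads need to be stored per layer, which reduces memory use during long-context inference.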

Model Capabilities

Text generation
Multilingual understanding
Code completion
Logical reasoning
Long-text processing

Use Cases

Natural Language Processing
Multilingual text generation
Generates coherent text in multiple languages.
Supports fluent generation across the 119 languages covered in pre-training.
Technical document processing
Handles technical documents and code in STEM fields.
Optimized for code and STEM-related data.
Development Assistance
Code completion
Assists programmers in writing and completing code (a minimal sketch follows this list).
Benefits from an increased proportion of code-related data in pre-training.
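Below is a hedged sketch of the code-completion use case with a base model: the model simply continues a partial function definition. It reuses the `model` and `tokenizer` objects from the loading sketch earlier; the prompt is an illustrative example, not from the model card.

```python
# Plain-continuation code completion with the base model (illustrative).
code_prompt = (
    "def fibonacci(n):\n"
    "    \"\"\"Return the n-th Fibonacci number.\"\"\"\n"
)
inputs = tokenizer(code_prompt, return_tensors="pt").to(model.device)

# Greedy decoding tends to be a reasonable default for code completion.
outputs = model.generate(**inputs, max_new_tokens=80, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```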