Qwen3 0.6B Base

Developed by unsloth
Qwen3-0.6B-Base is part of Qwen3, the latest generation of large language models in the Tongyi Qianwen (Qwen) series, which offers a range of dense and Mixture of Experts (MoE) models.
Downloads 10.84k
Release Time: 4/28/2025

Model Overview

Qwen3-0.6B-Base is a pre-trained large language model focused on language modeling and general knowledge acquisition, supporting multiple languages and tasks.

Model Features

Expanded high-quality pre-training corpus
Pre-trained on 36 trillion tokens across 119 languages, three times the language coverage of Qwen2.5, with a richer mix of high-quality data.
Improvements in training techniques and model architecture
Qwen3 adopts a global-batch load balancing loss for its MoE models and QK layer normalization for all models, improving training stability and overall performance.
Three-stage pre-training
The first stage focuses on language modeling and general knowledge acquisition; the second stage improves reasoning ability; the third stage enhances long-context understanding.
Hyperparameter tuning guided by scaling laws
Based on comprehensive scaling law studies, key hyperparameters are tuned systematically to achieve better training dynamics and final performance.
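
To illustrate the QK layer normalization listed among the features above, here is a minimal sketch of the general idea: queries and keys are normalized before the dot product, so attention logits stay bounded regardless of activation magnitude. It assumes RMS normalization applied per head with the learned gain omitted. The function names (rms_norm, attention_weights) and the toy vectors are hypothetical, and this is not Qwen3's actual implementation.

```python
import math

def rms_norm(vec, eps=1e-6):
    """Scale a vector to unit root-mean-square (learned gain omitted in this sketch)."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys, qk_norm=True):
    """Attention weights of one query over several keys for a single head."""
    if qk_norm:
        # Normalize queries and keys before taking dot products.
        query = rms_norm(query)
        keys = [rms_norm(k) for k in keys]
    scale = 1.0 / math.sqrt(len(query))
    logits = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(logits)

# Toy activations with large magnitudes (made-up values for illustration).
query = [40.0, -30.0, 20.0, 10.0]
keys = [
    [35.0, -25.0, 15.0, 5.0],
    [-10.0, 20.0, 30.0, -40.0],
    [5.0, 5.0, 5.0, 5.0],
]

plain = attention_weights(query, keys, qk_norm=False)  # softmax saturates on one key
normed = attention_weights(query, keys, qk_norm=True)  # weights stay spread out
```

Without normalization, the logits grow with the activation magnitudes and the softmax collapses onto a single key. With normalization, each scaled logit is limited to roughly the range [-sqrt(d), sqrt(d)] for head dimension d, which is one plausible reason the technique helps training stability.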

Model Capabilities

Text generation
Language modeling
Multilingual support
Long context understanding
Logical reasoning
Coding support

Use Cases

Natural language processing
Text generation
Generate coherent and contextually relevant text.
High-quality natural language generation
Multilingual translation
Support translation tasks in multiple languages.
Extensive language coverage
Coding and STEM
Code generation
Generate code snippets or complete programming tasks.
Improve coding efficiency
Logical reasoning
Solve logical reasoning problems in STEM fields.
Enhanced reasoning ability
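
As a rough sketch of how the use cases above might be tried in practice: a base model is not instruction-tuned, so a few-shot prompt that the model simply continues is usually more suitable than a chat-style request. The example assumes the Hugging Face transformers library (with PyTorch) is installed in a version that supports Qwen3, and that the weights are published under the repository name "unsloth/Qwen3-0.6B-Base". Both are assumptions not stated on this page, and the helper names few_shot_prompt and complete are hypothetical.

```python
MODEL_ID = "unsloth/Qwen3-0.6B-Base"  # assumed repository name, not confirmed by this page

def few_shot_prompt(examples, query, source="English", target="French"):
    """Build a translation prompt that ends where the model should continue."""
    blocks = [f"{source}: {src}\n{target}: {tgt}" for src, tgt in examples]
    blocks.append(f"{source}: {query}\n{target}:")
    return "\n\n".join(blocks)

def complete(prompt, max_new_tokens=32, model_id=MODEL_ID):
    """Return only the text the model generates after the prompt."""
    # Imported lazily so the helpers above work without the library installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

prompt = few_shot_prompt(
    [("Good morning.", "Bonjour."), ("Thank you.", "Merci.")],
    "See you tomorrow.",
)
# Downloads the model on first use:
# print(complete(prompt, max_new_tokens=16))
```

The same pattern applies to code generation: supply a function signature and docstring as the prompt and let the model continue from there.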