Qwen3 1.7B Base

Developed by unsloth
Qwen3-1.7B-Base belongs to Qwen3, the latest generation of large language models in the Tongyi (Qwen) series, which offers a range of dense and mixture-of-experts (MoE) models. Compared with the previous generation, Qwen3 brings significant improvements in training data, model architecture, and optimization techniques.
Downloads 7,444
Release Time: 4/28/2025

Model Overview

Qwen3-1.7B-Base is a large language model with 1.7 billion parameters, focusing on language modeling and general knowledge acquisition. It supports long context understanding and multilingual processing.

Model Features

Expanded high-quality pre-training corpus
Pre-trained on 36 trillion tokens covering 119 languages, three times the language coverage of the previous generation. The corpus includes high-quality data from domains such as coding, STEM, reasoning, and books.
Three-stage pre-training
The first stage focuses on language modeling, the second stage improves reasoning ability, and the third stage extends the context length to 32k tokens, enhancing the ability to understand long texts.
Optimized training techniques
Techniques such as a global-batch load balancing loss (relevant to the MoE models) and QK layer normalization are used to improve training stability and overall performance.
Hyperparameter adjustment based on scaling laws
Through comprehensive research on scaling laws, key parameters such as the learning rate scheduler and batch size are systematically adjusted to optimize training dynamics and final performance.
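
To make the QK layer normalization mentioned above more concrete, here is a minimal sketch in plain Python. It assumes an RMS-style normalization with no learned gain, applied to a single query and key vector of one attention head; the function names and example vectors are hypothetical, and the actual Qwen3 implementation may differ in detail.

```python
import math

def rms_norm(vec, eps=1e-6):
    """Scale vec so its root-mean-square is approximately 1 (no learned gain)."""
    mean_sq = sum(x * x for x in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [x * scale for x in vec]

def attention_logit(q, k, use_qk_norm=True):
    """Scaled dot-product logit for one query/key pair of a single head."""
    if use_qk_norm:
        # Assumption: normalize query and key before the dot product.
        q, k = rms_norm(q), rms_norm(k)
    dot = sum(a * b for a, b in zip(q, k))
    return dot / math.sqrt(len(q))

# Illustrative vectors with deliberately large magnitudes.
q = [100.0, -200.0, 300.0, -400.0]
k = [150.0, -250.0, 350.0, -450.0]

raw_logit = attention_logit(q, k, use_qk_norm=False)
normed_logit = attention_logit(q, k, use_qk_norm=True)
```

Without normalization the logit grows with the magnitude of the activations; with it, the logit is bounded by the square root of the head dimension, which is one intuition for why this kind of normalization can help stability.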

Model Capabilities

Text generation
Multilingual processing
Long context understanding
Logical reasoning
STEM problem solving
Code generation
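
As a rough illustration of text generation with a base (non-chat) model, the sketch below builds a few-shot completion prompt and passes it to the Hugging Face transformers library. The repository id "unsloth/Qwen3-1.7B-Base", the helper names, and the prompt layout are assumptions made for this example, not details confirmed by this page; a recent transformers release with Qwen3 support is also assumed.

```python
def build_few_shot_prompt(examples, query):
    """Format (input, output) pairs plus a new input as a completion prompt."""
    lines = []
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

def generate(prompt, model_id="unsloth/Qwen3-1.7B-Base", max_new_tokens=64):
    """Continue the prompt with a causal LM (model_id is an assumed repo name)."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# English-to-French pairs used purely as an illustrative few-shot pattern.
prompt = build_few_shot_prompt([("cat", "chat"), ("dog", "chien")], "house")
# print(generate(prompt))  # downloads the model weights on first use
```

Because a base model continues text rather than following instructions, a pattern-style prompt like this tends to be a more natural fit than a chat-style request.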

Use Cases

Natural language processing
Multilingual text generation: generates coherent text in multiple languages, with fluent generation supported across 119 languages.
Long document understanding: processes long documents of up to 32k tokens and captures long-distance dependencies.
Education
STEM problem solving: answers questions in science, technology, engineering, and mathematics, drawing on the high-quality STEM data in the pre-training corpus.
Programming
Code generation and completion: generates code or completes code snippets from natural language descriptions, drawing on the large amount of coding data in the pre-training corpus.
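
For documents longer than the 32k-token context mentioned above, one common workaround is to split the token sequence into overlapping windows. The following is a minimal sketch under stated assumptions: "32k" is taken to mean 32,768 tokens, the reserve and overlap sizes are arbitrary illustrative values, and the integer list stands in for token ids that a real tokenizer would produce.

```python
def split_into_windows(tokens, max_tokens=32_768, reserve=1_024, overlap=256):
    """Split a token list into overlapping windows that fit the context budget.

    reserve leaves room for generated tokens; overlap repeats the tail of one
    window at the head of the next so context is not cut abruptly.
    """
    window = max_tokens - reserve
    if window <= overlap:
        raise ValueError("window must be larger than overlap")
    step = window - overlap
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reaches the end of the document
        start += step
    return windows

# Stand-in for the token ids of a 70,000-token document.
tokens = list(range(70_000))
windows = split_into_windows(tokens)
```

Each window can then be processed separately and the results combined; how best to combine them depends on the task and is outside the scope of this sketch.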