# MoE architecture optimization
| Model | Publisher | License | Tags | Downloads | Likes | Description |
|---|---|---|---|---|---|---|
| Qwen3 14B Drama | float-trip | Apache-2.0 | Large Language Model, Transformers | 167 | 1 | Qwen3-14B-Base is the latest generation of the Tongyi series of large language models, offering a comprehensive range of dense and mixture-of-experts (MoE) models with significant improvements in training data, model architecture, and optimization techniques. |
| Qwen3 14B Base | unsloth | Apache-2.0 | Large Language Model, Transformers | 4,693 | 1 | Qwen3-14B-Base is the latest generation of the Tongyi series of large language models, providing a comprehensive set of dense and mixture-of-experts (MoE) models with significant improvements in training data, model architecture, and optimization techniques. |
| Qwen3 8B Base Bnb 4bit | unsloth | Apache-2.0 | Large Language Model, Transformers | 1,406 | 1 | Qwen3-8B-Base is the latest generation of the Qwen series of large language models. Pre-trained on 36 trillion tokens of multilingual data, it refines the model architecture and training techniques to deliver efficient, accurate language interaction. |
| Qwen3 8B Base Unsloth Bnb 4bit | unsloth | Apache-2.0 | Large Language Model, Transformers | 6,214 | 1 | Qwen3-8B-Base is the latest generation of the Tongyi series of large language models, offering a comprehensive set of dense and mixture-of-experts (MoE) models built on significant improvements in training data, model architecture, and optimization techniques. |
| Qwen3 1.7B Base | unsloth | Apache-2.0 | Large Language Model, Transformers | 7,444 | 2 | Qwen3-1.7B-Base is the latest generation of the Tongyi series of large language models, offering a range of dense and mixture-of-experts (MoE) models with significant improvements in training data, model architecture, and optimization techniques. |
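The "Bnb 4bit" entries above are pre-quantized bitsandbytes checkpoints intended to be loaded through the Transformers library. Below is a minimal sketch of loading one of them; the repo id `unsloth/Qwen3-8B-Base-bnb-4bit` is assumed from the listing (the exact hub path is not given here), and it presumes `bitsandbytes` and `accelerate` are installed alongside `transformers` on a CUDA machine.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for the "Qwen3 8B Base Bnb 4bit" entry; adjust to the actual hub path.
MODEL_ID = "unsloth/Qwen3-8B-Base-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# The 4-bit quantization config is embedded in the checkpoint, so no extra
# quantization arguments are needed; device_map="auto" places layers on the
# available GPU(s) via accelerate.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

prompt = "Mixture-of-experts models route each token to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The non-quantized base checkpoints in the table load the same way, only with a different repo id and correspondingly higher memory requirements.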