# MoE architecture optimization
| Model | Publisher | License | Tags | Downloads | Likes | Description |
|---|---|---|---|---|---|---|
| Qwen3 14B Drama | float-trip | Apache-2.0 | Large Language Model, Transformers | 167 | 1 | Qwen3-14B-Base is the latest generation of the Tongyi series of large language models, offering a comprehensive range of dense and mixture-of-experts (MoE) models with significant improvements in training data, model architecture, and optimization techniques. |
| Qwen3 14B Base | unsloth | Apache-2.0 | Large Language Model, Transformers | 4,693 | 1 | Qwen3-14B-Base is the latest generation of the Tongyi series of large language models, providing a comprehensive set of dense and mixture-of-experts (MoE) models with significant improvements in training data, model architecture, and optimization techniques. |
| Qwen3 8B Base Bnb 4bit | unsloth | Apache-2.0 | Large Language Model, Transformers | 1,406 | 1 | Qwen3-8B-Base is the latest generation of the Qwen series of large language models. Pre-trained on 36 trillion tokens of multilingual data, it refines the model architecture and training techniques to deliver efficient, accurate language interaction. |
| Qwen3 8B Base Unsloth Bnb 4bit | unsloth | Apache-2.0 | Large Language Model, Transformers | 6,214 | 1 | Qwen3-8B-Base is the latest generation of the Tongyi series of large language models, offering a comprehensive set of dense and mixture-of-experts (MoE) models built on significant improvements in training data, model architecture, and optimization techniques. |
| Qwen3 1.7B Base | unsloth | Apache-2.0 | Large Language Model, Transformers | 7,444 | 2 | Qwen3-1.7B-Base is the latest generation of the Tongyi series of large language models, offering a range of dense and mixture-of-experts (MoE) models with significant improvements in training data, model architecture, and optimization techniques. |
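The "Bnb 4bit" entries above are pre-quantized bitsandbytes checkpoints intended to be loaded through the Transformers library. Below is a minimal sketch of loading one of them; the repo id `unsloth/Qwen3-8B-Base-bnb-4bit` is assumed from the listing (the exact hub path is not given here), and it presumes `bitsandbytes` and `accelerate` are installed alongside `transformers` on a CUDA machine.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for the "Qwen3 8B Base Bnb 4bit" entry; adjust to the actual hub path.
MODEL_ID = "unsloth/Qwen3-8B-Base-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# The 4-bit quantization config is embedded in the checkpoint, so no extra
# quantization arguments are needed; device_map="auto" places layers on the
# available GPU(s) via accelerate.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

prompt = "Mixture-of-experts models route each token to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The non-quantized base checkpoints in the table load the same way, only with a different repo id and correspondingly higher memory requirements.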