MiniPLM-Qwen-200M
A 200M-parameter model based on the Qwen architecture, pretrained from scratch using the MiniPLM knowledge distillation framework
Downloads: 203
Release Date: 10/17/2024
Model Overview
MiniPLM-Qwen-200M is a lightweight language model trained with knowledge distillation, using Qwen1.5-1.8B as the teacher model. It offers efficient performance and good scalability.
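A minimal usage sketch with Hugging Face transformers. The repository id MiniLLM/MiniPLM-Qwen-200M and the generation settings are assumptions, not details confirmed by this page; check the model repository for the exact identifier.

```python
# Minimal text-generation sketch for MiniPLM-Qwen-200M.
# NOTE: the repo id below is an assumption; verify it on the model page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniLLM/MiniPLM-Qwen-200M"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a prompt and sample a short continuation.
inputs = tokenizer("The MiniPLM framework distills knowledge by", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```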
Model Features
Knowledge Distillation Training
Utilizes the MiniPLM knowledge distillation framework to learn from the Qwen1.5-1.8B teacher model, achieving efficient knowledge transfer
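The page does not spell out the training objective, so the sketch below shows a generic temperature-scaled KL distillation loss between teacher and student next-token distributions; it illustrates standard logit-level distillation, not necessarily the exact MiniPLM loss.

```python
# Generic teacher-student distillation loss for causal LMs.
# This is the standard KL formulation, NOT a verified MiniPLM objective.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Temperature-scaled KL(teacher || student) averaged over token positions."""
    t = temperature
    vocab = student_logits.size(-1)
    student_log_probs = F.log_softmax(student_logits.reshape(-1, vocab) / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits.reshape(-1, vocab) / t, dim=-1)
    # The t**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy check with random logits: (batch=2, seq=8, vocab=100).
s = torch.randn(2, 8, 100)
t = torch.randn(2, 8, 100)
print(distillation_loss(s, t).item())
```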
Diverse Sampling Optimization
Employs a pretraining corpus optimized with diverse sampling to enhance training efficiency and model performance
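The sampling criterion is not described on this page. One plausible sketch, assuming documents are ranked by how much more likely the teacher model finds them than a small reference model, follows; the score_document and select_corpus helpers are hypothetical illustrations, not code from the MiniPLM release.

```python
# Hedged sketch of corpus selection driven by teacher/reference score gaps.
# Both helper functions are hypothetical, for illustration only.
import torch

@torch.no_grad()
def score_document(model, tokenizer, text):
    """Average per-token log-likelihood of `text` under a causal LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return -loss.item()

def select_corpus(teacher, reference, tokenizer, docs, keep_ratio=0.5):
    """Keep documents the teacher scores far higher than the reference model."""
    diffs = [
        score_document(teacher, tokenizer, d) - score_document(reference, tokenizer, d)
        for d in docs
    ]
    k = int(len(docs) * keep_ratio)
    ranked = sorted(range(len(docs)), key=lambda i: diffs[i], reverse=True)
    return [docs[i] for i in ranked[:k]]
```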
High Computational Efficiency
Outperforms conventional pretraining methods under the same computational budget, with good scalability
Model Capabilities
Text Generation
Language Understanding
Use Cases
Natural Language Processing
Text Generation Applications
Can be used to generate coherent and meaningful text content
Language Model Research
Serves as a research benchmark for lightweight language models