
MiniPLM-Qwen-200M

Developed by MiniLLM
A 200M-parameter model based on the Qwen architecture, pretrained from scratch using the MiniPLM knowledge distillation framework
Release Time: 10/17/2024

Model Overview

MiniPLM-Qwen-200M is a lightweight language model pretrained from scratch with the MiniPLM knowledge distillation framework, using Qwen1.5-1.8B as the teacher model. It targets efficient pretraining under a fixed compute budget and scales well across model sizes.
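
For background, the classic token-level distillation loss below is a generic sketch of how a student model can learn from a teacher's output distribution. It is illustrative only and is not claimed to be MiniPLM's actual objective; the MiniPLM paper reportedly performs teacher inference offline and uses the teacher to refine the pretraining corpus rather than distilling logits at every training step.

```python
# Generic token-level knowledge distillation loss (Hinton et al.),
# shown only to illustrate learning from a teacher's distribution;
# NOT claimed to be MiniPLM's exact objective.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student
    next-token distributions, averaged over the batch."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean reduction plus T^2 scaling follows the standard recipe
    return F.kl_div(s, t, reduction="batchmean") * temperature**2
```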

Model Features

Knowledge Distillation Training
Utilizes the MiniPLM knowledge distillation framework to learn from the Qwen1.5-1.8B teacher model, achieving efficient knowledge transfer
Diverse Sampling Optimization
Trains on a pretraining corpus refined by teacher-guided diverse sampling, improving training efficiency and model performance (see the sketch after this list)
High Computational Efficiency
Outperforms conventional pretraining methods under the same computational budget, with good scalability
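
The exact sampling procedure is defined by the MiniPLM framework; the sketch below is a hypothetical illustration of one way teacher-guided corpus selection can be implemented, scoring each candidate document by the gap between the teacher's and a small reference model's log-likelihood and keeping the highest-scoring fraction. All function and parameter names here are illustrative, not MiniPLM's API.

```python
# Hypothetical sketch of teacher-guided corpus selection; NOT the
# official MiniPLM implementation. Documents that the large teacher
# finds much more likely than a small reference model are assumed to
# carry knowledge worth keeping in the pretraining corpus.
import torch

@torch.no_grad()
def sequence_log_prob(model, input_ids):
    """Average per-token log-probability of a tokenized document
    (input_ids: tensor of shape [1, seq_len])."""
    logits = model(input_ids).logits[:, :-1, :]   # predictions for next tokens
    targets = input_ids[:, 1:]
    log_probs = torch.log_softmax(logits, dim=-1)
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_lp.mean().item()

def select_corpus(docs, teacher, reference, keep_ratio=0.5):
    """Keep the documents with the largest teacher-reference log-prob gap."""
    scored = [(sequence_log_prob(teacher, d) - sequence_log_prob(reference, d), d)
              for d in docs]
    scored.sort(key=lambda x: x[0], reverse=True)
    return [d for _, d in scored[: int(len(scored) * keep_ratio)]]
```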

Model Capabilities

Text Generation
Language Understanding
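
As a quick usage illustration, the following loads the model with Hugging Face transformers and samples a short continuation. The repository id "MiniLLM/MiniPLM-Qwen-200M" is an assumption inferred from the developer and model names above.

```python
# Minimal sketch of loading and sampling from the model with
# Hugging Face transformers; the repo id is assumed, not confirmed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniLLM/MiniPLM-Qwen-200M"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50,
                         do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```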

Use Cases

Natural Language Processing
Text Generation Applications
Can be used to generate coherent and meaningful text content
Language Model Research
Serves as a research benchmark for lightweight language models