
Multilingual MiniLM L12 H384

Developed by Microsoft
MiniLM is a compact, efficient pre-trained language model that compresses large Transformer models through deep self-attention distillation, supporting multilingual understanding and generation tasks.
Downloads: 28.51k
Release date: 3/2/2022

Model Overview

MiniLM is a lightweight multilingual model based on the Transformer architecture. Knowledge distillation allows it to retain most of the original large model's performance while greatly reducing the parameter count, making it suitable for cross-lingual text classification, question answering, and other tasks.

Model Features

Efficient knowledge distillation
Compresses the original Transformer model through deep self-attention distillation while retaining its core language understanding capabilities.
Multilingual support
Supports cross-lingual transfer learning across 16 languages and uses the same tokenizer as XLM-R (see the loading sketch below).
Lightweight architecture
Only 12 Transformer layers with a hidden size of 384, giving a much smaller parameter count than comparable multilingual models.
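As a rough illustration (the Hugging Face checkpoint id microsoft/Multilingual-MiniLM-L12-H384 and the pairing of XLM-R's tokenizer with a BERT-style encoder are assumptions here, not details confirmed by this page), loading the model and encoding a sentence might look like this:

```python
# Minimal loading sketch; checkpoint id and tokenizer/model pairing are assumptions.
from transformers import XLMRobertaTokenizer, BertModel

checkpoint = "microsoft/Multilingual-MiniLM-L12-H384"        # assumed checkpoint id
tokenizer = XLMRobertaTokenizer.from_pretrained(checkpoint)  # XLM-R tokenizer, shared multilingual vocabulary
model = BertModel.from_pretrained(checkpoint)                # 12-layer encoder, hidden size 384

# Encode a sentence in any supported language and inspect the contextual embeddings.
inputs = tokenizer("Bonjour tout le monde", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 384)
```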

Model Capabilities

Cross-lingual text classification
Cross-lingual question answering
Natural language inference
Multilingual text understanding

Use Cases

Cross-lingual text classification
XNLI cross-lingual natural language inference
Transfer a model trained on English data to 15 other languages for textual entailment judgment.
Achieves an average accuracy of 71.1% on the XNLI benchmark, outperforming mBERT models of similar size.
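A minimal sketch of this transfer setup, assuming the same checkpoint id as above; the three-label NLI head created here is randomly initialized and would first be fine-tuned on English data (e.g. MNLI/XNLI) before being applied to other languages:

```python
# Sketch of cross-lingual NLI transfer: fine-tune on English, then run inference
# directly on other languages. Checkpoint id is an assumption.
import torch
from transformers import XLMRobertaTokenizer, BertForSequenceClassification

checkpoint = "microsoft/Multilingual-MiniLM-L12-H384"
tokenizer = XLMRobertaTokenizer.from_pretrained(checkpoint)
model = BertForSequenceClassification.from_pretrained(checkpoint, num_labels=3)  # entailment / neutral / contradiction

# After English fine-tuning, the same weights can score premise-hypothesis pairs in, say, Spanish:
premise = "El gato duerme en el sofá."
hypothesis = "Un animal está descansando."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # probabilities over the three NLI labels
```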
Question answering systems
MLQA cross-lingual question answering
Transfer a QA model trained on English data to other languages.
Achieves an F1 score of 63.2% on the MLQA benchmark, approaching the performance of the larger XLM-R Base model.
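A comparable sketch for extractive QA, again assuming the checkpoint id above; the span-prediction head shown here is untrained until fine-tuned on English SQuAD-style data, after which it can answer questions in other languages:

```python
# Sketch of cross-lingual extractive QA. Checkpoint id is an assumption; the QA
# head requires fine-tuning before the predicted span is meaningful.
import torch
from transformers import XLMRobertaTokenizer, BertForQuestionAnswering

checkpoint = "microsoft/Multilingual-MiniLM-L12-H384"
tokenizer = XLMRobertaTokenizer.from_pretrained(checkpoint)
model = BertForQuestionAnswering.from_pretrained(checkpoint)

question = "¿Dónde se encuentra la Torre Eiffel?"
context = "La Torre Eiffel se encuentra en París, Francia."
inputs = tokenizer(question, context, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# Pick the most likely answer span from the start/end logits.
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax() + 1
print(tokenizer.decode(inputs["input_ids"][0][start:end]))
```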