
XtremeDistil L6 H256 Uncased

Developed by Microsoft
XtremeDistilTransformers is a task-agnostic distilled Transformer model that uses task transfer learning to train small general-purpose models applicable to arbitrary tasks and languages.
Downloads: 3,816
Release date: 3/2/2022

Model Overview

This model combines multi-task distillation techniques, featuring a 6-layer network with 384-dimensional hidden layers and 22 million parameters, and achieves a 5.3x speedup over BERT-base.
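As a sketch of how the checkpoint is used in practice, the snippet below loads it through the Hugging Face `transformers` library (assumed installed, along with `torch`); the repository id `microsoft/xtremedistil-l6-h256-uncased` follows the model's published name.

```python
# Sketch: encoding text with the distilled checkpoint via Hugging Face
# transformers. Assumes `transformers` and `torch` are installed and the
# checkpoint can be fetched from the Hub.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "microsoft/xtremedistil-l6-h256-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("XtremeDistil is a compact Transformer.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One hidden vector per input token, width equal to the model's hidden size
print(outputs.last_hidden_state.shape)
```

Because the distillation is task-agnostic, the same encoder can then be fine-tuned with any of the standard `AutoModelFor...` task heads.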

Model Features

Task-Agnostic Distillation
Trained with task transfer learning, so it can be applied to any downstream task or language.
Efficient Compression
Achieves a 5.3x speedup over BERT-base with roughly 80% fewer parameters.
Multi-Task Distillation
Incorporates advanced distillation methods from both XtremeDistil and MiniLM papers.
High Performance
Performs strongly on benchmarks such as GLUE and SQuAD-v2, approaching the performance of the original large models.
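The compression figure above is easy to sanity-check with back-of-the-envelope arithmetic, taking BERT-base at its commonly cited ~110 million parameters against the 22 million stated here:

```python
# Back-of-the-envelope check of the "80% fewer parameters" claim.
# Assumed sizes: BERT-base ~110M parameters, this distilled model ~22M.
bert_base_params = 110_000_000
distilled_params = 22_000_000

reduction = 1 - distilled_params / bert_base_params
print(f"{reduction:.0%} fewer parameters")  # prints "80% fewer parameters"
```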

Model Capabilities

Text Classification
Question Answering Systems
Natural Language Understanding
Semantic Similarity Calculation

Use Cases

Natural Language Processing
Text Classification
Can be used for sentiment analysis, topic classification, and other tasks.
Achieves 92.3% accuracy on SST-2 sentiment analysis task.
Question Answering
Suitable for open-domain question answering tasks.
Achieves 76.6 F1 score on SQuAD-v2 question answering task.
Semantic Similarity
Can be used to determine the semantic similarity between two texts.
Achieves 91.0% accuracy on QQP semantic similarity task.
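For the semantic-similarity use case, a common recipe is to mean-pool the encoder's token embeddings into one sentence vector and compare vectors with cosine similarity. The sketch below shows that scoring step in plain NumPy; the random arrays are stand-ins for the model's `last_hidden_state` output, which in real use would come from the checkpoint.

```python
# Sketch: scoring semantic similarity from mean-pooled token embeddings.
# The random arrays below stand in for the Transformer's per-token output;
# shapes are (seq_len, hidden_size).
import numpy as np

def mean_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Average token vectors (seq_len, hidden) into one sentence vector."""
    return token_embeddings.mean(axis=0)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two sentence vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for two encoded sentences
rng = np.random.default_rng(0)
sent_a = mean_pool(rng.normal(size=(5, 8)))
sent_b = mean_pool(rng.normal(size=(5, 8)))

print(cosine_similarity(sent_a, sent_b))
print(cosine_similarity(sent_a, sent_a))  # identical texts score ~1.0
```

Thresholding this score (e.g. for QQP-style duplicate detection) is then a task-specific choice made during fine-tuning or validation.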