🚀 MiniLM-L6-mnli
This is a model for text classification and zero-shot classification. It is based on the MiniLM-L6 architecture and was trained on the MultiNLI dataset; it is fast, but slightly less accurate than larger models.
🚀 Quick Start
This model can be used for text classification and zero-shot classification tasks; a runnable example is given under 💻 Usage Examples below.
✨ Key Features
- Supported tasks: text classification and zero-shot classification.
- Base model: Microsoft's MiniLM-L6. It is fast, but slightly less accurate than larger models.
📦 Installation
The original documentation gives no specific installation steps; installing the standard Hugging Face libraries is sufficient, e.g. `pip install transformers torch`.
💻 Usage Examples
Basic usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "MoritzLaurer/MiniLM-L6-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)

premise = "I liked the movie"
hypothesis = "The movie was good."

# Tokenize the premise/hypothesis pair and run it through the model
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
output = model(**inputs)

# Convert logits to class probabilities (in percent)
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
print(prediction)
```
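The card also advertises zero-shot classification. Below is a minimal sketch using the transformers zero-shot-classification pipeline, which reuses the NLI head by scoring each candidate label as a hypothesis; the input text and candidate labels here are illustrative, not from the source:

```python
from transformers import pipeline

# Zero-shot classification: each candidate label is turned into a hypothesis
# and scored against the input text via the model's entailment logit.
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/MiniLM-L6-mnli")

text = "The new phone has an excellent camera but poor battery life."  # illustrative input
candidate_labels = ["technology", "sports", "politics"]  # illustrative labels

result = classifier(text, candidate_labels)
print(result["labels"], result["scores"])
```

The pipeline returns the candidate labels ranked by their entailment-based scores.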
📚 Documentation
Training data
The model was trained on the MultiNLI dataset.
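For reference, MultiNLI can be inspected through the datasets library. This is a sketch for exploration only, since the card does not describe the exact preprocessing used for training:

```python
from datasets import load_dataset

# MultiNLI premise/hypothesis pairs; labels: 0=entailment, 1=neutral, 2=contradiction
multi_nli = load_dataset("multi_nli", split="train")
example = multi_nli[0]
print(example["premise"], "|", example["hypothesis"], "|", example["label"])
```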
Training procedure
MiniLM-L6-mnli was trained with the Hugging Face Trainer using the following hyperparameters:
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    num_train_epochs=5,              # total number of training epochs
    learning_rate=2e-05,
    per_device_train_batch_size=32,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    warmup_ratio=0.1,                # fraction of training steps used for learning rate warmup
    weight_decay=0.06,               # strength of weight decay
    fp16=True                        # mixed precision training
)
```
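For context, these arguments would typically be handed to a Trainer roughly as follows. `model`, `train_dataset`, and `eval_dataset` are hypothetical stand-ins for the base model and tokenized MultiNLI splits, since the card does not include the actual training script:

```python
from transformers import Trainer

# Sketch only: `model`, `train_dataset`, and `eval_dataset` are hypothetical
# stand-ins; the card does not publish the actual training code.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()
```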
Evaluation results
The model was evaluated on the matched test set of MultiNLI and achieves an accuracy of 0.814.
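The card does not include the evaluation script. A hedged sketch of how such a figure could be checked against the publicly available matched split (the labeled MultiNLI test set is not distributed, so validation_matched is used here):

```python
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "MoritzLaurer/MiniLM-L6-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device).eval()

# Assumes the model's label order matches the dataset's:
# 0=entailment, 1=neutral, 2=contradiction
dataset = load_dataset("multi_nli", split="validation_matched")

correct = 0
for start in range(0, len(dataset), 32):
    batch = dataset[start:start + 32]  # a dict of lists
    inputs = tokenizer(batch["premise"], batch["hypothesis"],
                       truncation=True, padding=True, return_tensors="pt").to(device)
    with torch.no_grad():
        logits = model(**inputs).logits
    preds = logits.argmax(dim=-1).cpu()
    correct += (preds == torch.tensor(batch["label"])).sum().item()

print(f"accuracy: {correct / len(dataset):.3f}")
```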
🔧 Technical Details
- The model is based on Microsoft's MiniLM-L6 architecture and was trained on the MultiNLI dataset.
- Training used the Hugging Face Trainer with the hyperparameters listed above.
📄 License
The documentation does not specify a license.
Citation
If you want to cite this model, please cite the original MiniLM paper and the corresponding NLI dataset, and include a link to this model on the Hugging Face hub.
| Attribute | Details |
|-----------|---------|
| Model type | Model for text classification and zero-shot classification |
| Training data | MultiNLI |
⚠️ Important Note
Please refer to the original MiniLM paper and the literature on the respective NLI datasets for potential biases.