🚀 German Zero-Shot Classification Model
This German zero-shot classification model is a fine-tuned version of deepset/gbert-large on the German split of the facebook/xnli dataset. It provides an efficient way to classify German text without task-specific training data.
🚀 Quick Start
The model achieves the following results on the evaluation set:
- Loss: 0.4592
- Accuracy: 0.8486
✨ Features
- Zero-Shot Classification: Classifies text into arbitrary candidate labels without task-specific training; see the sketch below for how this reduces to NLI entailment.
- German Language Support: Fine-tuned specifically on German text, making it suitable for German-speaking applications.
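Under the hood, zero-shot classification with an XNLI fine-tuned model works by recasting each candidate label as a natural language inference (NLI) hypothesis and scoring whether the input text entails it. A minimal sketch of that reduction using the standard transformers API; the hypothesis wording is an illustrative assumption, not part of this card:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "kaixkhazaki/german-zeroshot"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "Können Sie mir die Schritte zur Konfiguration eines VPN auf einem Linux-Server erklären?"
hypothesis = "Dieses Beispiel ist IT-Support."  # one candidate label phrased as an NLI hypothesis

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Probabilities over the NLI classes; check model.config.id2label to
# confirm which index corresponds to entailment for this checkpoint.
print(logits.softmax(dim=-1))
```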
📦 Installation
The model can be used with the transformers library. Install it with:

```bash
pip install transformers
```
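The usage examples below also import torch for device selection, so install PyTorch as well if it is not already present:

```bash
pip install torch
```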
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import pipeline

# Load the model and tokenizer into a zero-shot classification pipeline,
# using the first GPU if one is available.
pipe = pipeline(
    "zero-shot-classification",
    model="kaixkhazaki/german-zeroshot",
    tokenizer="kaixkhazaki/german-zeroshot",
    device=0 if torch.cuda.is_available() else -1,
)

sequence = "Können Sie mir die Schritte zur Konfiguration eines VPN auf einem Linux-Server erklären?"
candidate_labels = [
    "Technische Dokumentation",
    "IT-Support",
    "Netzwerkadministration",
    "Linux-Konfiguration",
    "VPN-Setup",
]

print(pipe(sequence, candidate_labels))
```
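The pipeline returns a dictionary containing the input sequence plus the candidate labels and their scores, sorted from most to least likely, so the top prediction can be read off directly:

```python
result = pipe(sequence, candidate_labels)

# Labels come back sorted by descending score; index 0 is the best match.
print(f"{result['labels'][0]} ({result['scores'][0]:.4f})")
```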
Advanced Usage
```python
# The same pipeline can be reused with any label set.
sequence = "Wie lautet die Garantiezeit für dieses Produkt?"
candidate_labels = [
    "Garantiebedingungen",
    "Kundendienst",
    "Produktdetails",
    "Reklamation",
    "Kaufberatung",
]

print(pipe(sequence, candidate_labels))
```
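By default the pipeline assumes exactly one label applies and normalizes the scores across all candidates. If several labels may apply at once, pass multi_label=True; a German hypothesis_template can also be supplied, since the pipeline's default is the English "This example is {}.". The template wording below is an illustrative assumption:

```python
result = pipe(
    sequence,
    candidate_labels,
    multi_label=True,  # score each label independently instead of normalizing over all
    hypothesis_template="In diesem Text geht es um {}.",  # illustrative German template
)
print(result)
```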
📚 Documentation
Training and Evaluation Data
The model is fine-tuned on the German split of the facebook/xnli dataset. More detailed information about the data preparation has not yet been provided.
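For reference, the German configuration of XNLI can be inspected directly with the datasets library; a minimal sketch:

```python
from datasets import load_dataset

# XNLI provides premise/hypothesis pairs labeled
# 0 = entailment, 1 = neutral, 2 = contradiction.
xnli_de = load_dataset("facebook/xnli", "de")
print(xnli_de)
print(xnli_de["train"][0])
```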
Training Procedure
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 32
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
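As a hedged sketch, these hyperparameters correspond roughly to the following TrainingArguments; output_dir is an assumption, and the actual training script is not part of this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="german-zeroshot",  # assumed name; not taken from the original run
    learning_rate=5e-05,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=3,
)
```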
Training Results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 0.6429 | 0.1630 | 1000 | 0.5203 | 0.8004 | 0.8006 | 0.8009 | 0.8004 |
| 0.5715 | 0.3259 | 2000 | 0.5209 | 0.7964 | 0.7968 | 0.8005 | 0.7964 |
| 0.5897 | 0.4889 | 3000 | 0.5435 | 0.7924 | 0.7940 | 0.8039 | 0.7924 |
| 0.5701 | 0.6519 | 4000 | 0.5242 | 0.7880 | 0.7884 | 0.8078 | 0.7880 |
| 0.5238 | 0.8149 | 5000 | 0.4816 | 0.8233 | 0.8226 | 0.8263 | 0.8233 |
| 0.5285 | 0.9778 | 6000 | 0.4483 | 0.8265 | 0.8273 | 0.8303 | 0.8265 |
| 0.4302 | 1.1408 | 7000 | 0.4751 | 0.8209 | 0.8214 | 0.8277 | 0.8209 |
| 0.4163 | 1.3038 | 8000 | 0.4560 | 0.8285 | 0.8289 | 0.8344 | 0.8285 |
| 0.3942 | 1.4668 | 9000 | 0.4330 | 0.8414 | 0.8422 | 0.8454 | 0.8414 |
| 0.3875 | 1.6297 | 10000 | 0.4171 | 0.8430 | 0.8432 | 0.8455 | 0.8430 |
| 0.3639 | 1.7927 | 11000 | 0.4194 | 0.8442 | 0.8447 | 0.8487 | 0.8442 |
| 0.3768 | 1.9557 | 12000 | 0.4215 | 0.8474 | 0.8477 | 0.8492 | 0.8474 |
| 0.2443 | 2.1186 | 13000 | 0.4750 | 0.8390 | 0.8398 | 0.8452 | 0.8390 |
| 0.2404 | 2.2816 | 14000 | 0.4592 | 0.8486 | 0.8487 | 0.8505 | 0.8486 |
| 0.2154 | 2.4446 | 15000 | 0.4914 | 0.8418 | 0.8424 | 0.8466 | 0.8418 |
| 0.2157 | 2.6076 | 16000 | 0.4804 | 0.8454 | 0.8458 | 0.8488 | 0.8454 |
| 0.2249 | 2.7705 | 17000 | 0.4809 | 0.8466 | 0.8471 | 0.8507 | 0.8466 |
| 0.2204 | 2.9335 | 18000 | 0.4777 | 0.8466 | 0.8470 | 0.8502 | 0.8466 |
Framework versions
- Transformers 4.48.0.dev0
- Pytorch 2.4.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0
📄 License
This model is released under the MIT license.