Fine-tuning Intent Classification Model
This project fine-tunes a pre-trained model for intent classification using the klue/roberta-small model and the 3i4k dataset. It covers the fine-tuning process, usage examples, evaluation metrics, and citation information.
🚀 Quick Start
Prerequisites
- Ensure you have Python and the necessary libraries installed.
- You can install the required libraries with `pip install transformers keras`.
Fine-tuning Steps
- Prepare the pre-trained model: use the klue/roberta-small model from KLUE-benchmark.
- Prepare the dataset: use the 3i4k dataset from the 3i4k repository.
- Set the training parameters: adjust them to your needs (see the training sketch after this list; the exact values used are listed under Documentation below).
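The exact training script is not included in this card. As a rough illustration only, the sketch below shows one way the fine-tuning could be run with Keras, using the parameters listed in the Documentation section (Adam with lr 5e-5, batch size 32, up to 10 epochs with early stopping, min_delta 0.01). `load_3i4k` is a hypothetical data-loading helper, and the use of TensorFlow/Keras is an assumption based on the Keras optimizer named in the training parameters.
```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

PRETRAINED_MODEL = "klue/roberta-small"
NUM_LABELS = 7  # see "Label info" in the Documentation section

# Hypothetical helper: returns (list of sentences, list of integer labels).
train_texts, train_labels = load_3i4k("train")
val_texts, val_labels = load_3i4k("validation")

tokenizer = AutoTokenizer.from_pretrained(PRETRAINED_MODEL)
# klue/roberta-small is distributed as a PyTorch checkpoint, so convert it for TensorFlow.
model = TFAutoModelForSequenceClassification.from_pretrained(
    PRETRAINED_MODEL, num_labels=NUM_LABELS, from_pt=True
)

# Tokenize each split and wrap it as a tf.data pipeline with batch_size=32.
train_enc = tokenizer(train_texts, truncation=True, padding=True, return_tensors="tf")
val_enc = tokenizer(val_texts, truncation=True, padding=True, return_tensors="tf")
train_ds = tf.data.Dataset.from_tensor_slices((dict(train_enc), train_labels)).shuffle(50_000).batch(32)
val_ds = tf.data.Dataset.from_tensor_slices((dict(val_enc), val_labels)).batch(32)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy("accuracy")],
)

# Up to 10 epochs; early stopping with min_delta=0.01 ended training after 3.
# The patience value is an assumption; the card only specifies min_delta.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", min_delta=0.01, patience=1, restore_best_weights=True
)
model.fit(train_ds, validation_data=val_ds, epochs=10, callbacks=[early_stopping])

model.save_pretrained("./klue-roberta-small-3i4k-intent-classification")
tokenizer.save_pretrained("./klue-roberta-small-3i4k-intent-classification")
```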
✨ Features
- Fine-tuning: fine-tunes a pre-trained model for intent classification.
- Usage example: provides a Python code example for using the fine-tuned model.
- Evaluation: presents evaluation metrics to measure the model's performance.
📦 Installation
To use the fine-tuned model, install the transformers library:
```bash
pip install transformers
```
💻 Usage Examples
Basic Usage
```python
from transformers import RobertaTokenizerFast, RobertaForSequenceClassification, TextClassificationPipeline

HUGGINGFACE_MODEL_PATH = "bespin-global/klue-roberta-small-3i4k-intent-classification"

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
loaded_tokenizer = RobertaTokenizerFast.from_pretrained(HUGGINGFACE_MODEL_PATH)
loaded_model = RobertaForSequenceClassification.from_pretrained(HUGGINGFACE_MODEL_PATH)

# Build a classification pipeline that returns a score for every label.
text_classifier = TextClassificationPipeline(
    tokenizer=loaded_tokenizer,
    model=loaded_model,
    return_all_scores=True,
)

text = "your text"
preds_list = text_classifier(text)

# preds_list[0] holds one {label, score} dict per class; pick the highest-scoring one.
best_pred = max(preds_list[0], key=lambda pred: pred["score"])
print(f"Label of Best Intention: {best_pred['label']}")
print(f"Score of Best Intention: {best_pred['score']}")
```
📚 Documentation
Fine-tuning Details
- Pre-trained model: klue/roberta-small
- Dataset for fine-tuning: 3i4k
  - Train: 46,863
  - Validation: 8,271 (15% of Train)
  - Test: 6,121
- Label info (see the mapping snippet after the training parameters):
  - 0: fragment
  - 1: statement
  - 2: question
  - 3: command
  - 4: rhetorical question
  - 5: rhetorical command
  - 6: intonation-dependent utterance
- Parameters of training:
```
{
  "epochs": 3 (set to 10, but training stopped early),
  "batch_size": 32,
  "optimizer_class": "<class 'keras.optimizer_v2.adam.Adam'>",
  "optimizer_params": {
    "lr": 5e-05
  },
  "min_delta": 0.01
}
```
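For convenience, the label ids above can be written as an explicit Python mapping. This snippet is illustrative rather than taken from the training code; storing such a mapping in the model config via `id2label`/`label2id` is what lets the pipeline in the usage example return readable intent names instead of generic "LABEL_0"-style strings:
```python
# Mapping between label ids and intent names, as listed under "Label info" above.
id2label = {
    0: "fragment",
    1: "statement",
    2: "question",
    3: "command",
    4: "rhetorical question",
    5: "rhetorical command",
    6: "intonation-dependent utterance",
}
label2id = {name: idx for idx, name in id2label.items()}

# Passing id2label/label2id to from_pretrained(...) stores the mapping in the
# model config, so TextClassificationPipeline reports these names directly.
```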
Evaluation Metrics
|                                | precision | recall | f1-score | support |
|--------------------------------|-----------|--------|----------|---------|
| command                        | 0.89      | 0.92   | 0.90     | 1296    |
| fragment                       | 0.98      | 0.96   | 0.97     | 600     |
| intonation-dependent utterance | 0.71      | 0.69   | 0.70     | 327     |
| question                       | 0.95      | 0.97   | 0.96     | 1786    |
| rhetorical command             | 0.87      | 0.64   | 0.74     | 108     |
| rhetorical question            | 0.61      | 0.63   | 0.62     | 174     |
| statement                      | 0.91      | 0.89   | 0.90     | 1830    |
| accuracy                       |           |        | 0.90     | 6121    |
| macro avg                      | 0.85      | 0.81   | 0.83     | 6121    |
| weighted avg                   | 0.90      | 0.90   | 0.90     | 6121    |
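The table above follows the layout of scikit-learn's `classification_report`. Purely as an illustration (not the authors' evaluation script), metrics in this format could be reproduced on the test split by reusing the `text_classifier` pipeline from the usage example; `test_texts` and `test_labels` are hypothetical lists holding the 6,121 test sentences and their intent names:
```python
from sklearn.metrics import classification_report

# test_texts / test_labels are hypothetical: the 3i4k test sentences and their intent names.
pred_labels = [
    max(preds, key=lambda pred: pred["score"])["label"]
    for preds in text_classifier(test_texts)
]
print(classification_report(test_labels, pred_labels, digits=2))
```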
📄 License
This project is licensed under the CC BY-NC 4.0 license.
Citing & Authors
Jaehyeong at Bespin Global