Fine-tuning Intent Classification Model
This project fine-tunes a pre-trained model for intent classification using the klue/roberta-small model and the 3i4k dataset. It covers the fine-tuning process, usage examples, evaluation metrics, and citation information.
🚀 Quick Start
Prerequisites
- Ensure you have Python and the necessary libraries installed.
- You can install the required libraries with `pip install transformers keras`.
Fine-tuning Steps
- Prepare the pre-trained model: use the klue/roberta-small model from KLUE-benchmark.
- Prepare the dataset: use the 3i4k dataset from the 3i4k repository.
- Set the training parameters: adjust them to your needs (see the training sketch after this list; the exact values used are listed under Documentation below).
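The exact training script is not included in this card. As a rough illustration only, the sketch below shows one way the fine-tuning could be run with Keras, using the parameters listed in the Documentation section (Adam with lr 5e-5, batch size 32, up to 10 epochs with early stopping, min_delta 0.01). `load_3i4k` is a hypothetical data-loading helper, and the use of TensorFlow/Keras is an assumption based on the Keras optimizer named in the training parameters.
```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

PRETRAINED_MODEL = "klue/roberta-small"
NUM_LABELS = 7  # see "Label info" in the Documentation section

# Hypothetical helper: returns (list of sentences, list of integer labels).
train_texts, train_labels = load_3i4k("train")
val_texts, val_labels = load_3i4k("validation")

tokenizer = AutoTokenizer.from_pretrained(PRETRAINED_MODEL)
# klue/roberta-small is distributed as a PyTorch checkpoint, so convert it for TensorFlow.
model = TFAutoModelForSequenceClassification.from_pretrained(
    PRETRAINED_MODEL, num_labels=NUM_LABELS, from_pt=True
)

# Tokenize each split and wrap it as a tf.data pipeline with batch_size=32.
train_enc = tokenizer(train_texts, truncation=True, padding=True, return_tensors="tf")
val_enc = tokenizer(val_texts, truncation=True, padding=True, return_tensors="tf")
train_ds = tf.data.Dataset.from_tensor_slices((dict(train_enc), train_labels)).shuffle(50_000).batch(32)
val_ds = tf.data.Dataset.from_tensor_slices((dict(val_enc), val_labels)).batch(32)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy("accuracy")],
)

# Up to 10 epochs; early stopping with min_delta=0.01 ended training after 3.
# The patience value is an assumption; the card only specifies min_delta.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", min_delta=0.01, patience=1, restore_best_weights=True
)
model.fit(train_ds, validation_data=val_ds, epochs=10, callbacks=[early_stopping])

model.save_pretrained("./klue-roberta-small-3i4k-intent-classification")
tokenizer.save_pretrained("./klue-roberta-small-3i4k-intent-classification")
```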
✨ Features
- Fine-tuning: fine-tunes a pre-trained model for intent classification.
- Usage example: provides a Python code example for using the fine-tuned model.
- Evaluation: presents evaluation metrics to measure the model's performance.
📦 Installation
To use the fine-tuned model, install the transformers library:
```bash
pip install transformers
```
💻 Usage Examples
Basic Usage
```python
from transformers import RobertaTokenizerFast, RobertaForSequenceClassification, TextClassificationPipeline

HUGGINGFACE_MODEL_PATH = "bespin-global/klue-roberta-small-3i4k-intent-classification"

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
loaded_tokenizer = RobertaTokenizerFast.from_pretrained(HUGGINGFACE_MODEL_PATH)
loaded_model = RobertaForSequenceClassification.from_pretrained(HUGGINGFACE_MODEL_PATH)

# Build a classification pipeline that returns a score for every label.
text_classifier = TextClassificationPipeline(
    tokenizer=loaded_tokenizer,
    model=loaded_model,
    return_all_scores=True,
)

text = "your text"
preds_list = text_classifier(text)

# preds_list[0] holds one {label, score} dict per class; pick the highest-scoring one.
best_pred = max(preds_list[0], key=lambda pred: pred["score"])
print(f"Label of Best Intention: {best_pred['label']}")
print(f"Score of Best Intention: {best_pred['score']}")
```
📚 Documentation
Fine-tuning Details
- Pre-trained model: klue/roberta-small
- Dataset for fine-tuning: 3i4k
  - Train: 46,863
  - Validation: 8,271 (15% of Train)
  - Test: 6,121
- Label info (see the mapping snippet after the training parameters):
  - 0: fragment
  - 1: statement
  - 2: question
  - 3: command
  - 4: rhetorical question
  - 5: rhetorical command
  - 6: intonation-dependent utterance
- Parameters of training:
```
{
  "epochs": 3 (set to 10, but training stopped early),
  "batch_size": 32,
  "optimizer_class": "<class 'keras.optimizer_v2.adam.Adam'>",
  "optimizer_params": {
    "lr": 5e-05
  },
  "min_delta": 0.01
}
```
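For convenience, the label ids above can be written as an explicit Python mapping. This snippet is illustrative rather than taken from the training code; storing such a mapping in the model config via `id2label`/`label2id` is what lets the pipeline in the usage example return readable intent names instead of generic "LABEL_0"-style strings:
```python
# Mapping between label ids and intent names, as listed under "Label info" above.
id2label = {
    0: "fragment",
    1: "statement",
    2: "question",
    3: "command",
    4: "rhetorical question",
    5: "rhetorical command",
    6: "intonation-dependent utterance",
}
label2id = {name: idx for idx, name in id2label.items()}

# Passing id2label/label2id to from_pretrained(...) stores the mapping in the
# model config, so TextClassificationPipeline reports these names directly.
```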
Evaluation Metrics
|                                | precision | recall | f1-score | support |
|--------------------------------|-----------|--------|----------|---------|
| command                        | 0.89      | 0.92   | 0.90     | 1296    |
| fragment                       | 0.98      | 0.96   | 0.97     | 600     |
| intonation-dependent utterance | 0.71      | 0.69   | 0.70     | 327     |
| question                       | 0.95      | 0.97   | 0.96     | 1786    |
| rhetorical command             | 0.87      | 0.64   | 0.74     | 108     |
| rhetorical question            | 0.61      | 0.63   | 0.62     | 174     |
| statement                      | 0.91      | 0.89   | 0.90     | 1830    |
| accuracy                       |           |        | 0.90     | 6121    |
| macro avg                      | 0.85      | 0.81   | 0.83     | 6121    |
| weighted avg                   | 0.90      | 0.90   | 0.90     | 6121    |
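The table above follows the layout of scikit-learn's `classification_report`. Purely as an illustration (not the authors' evaluation script), metrics in this format could be reproduced on the test split by reusing the `text_classifier` pipeline from the usage example; `test_texts` and `test_labels` are hypothetical lists holding the 6,121 test sentences and their intent names:
```python
from sklearn.metrics import classification_report

# test_texts / test_labels are hypothetical: the 3i4k test sentences and their intent names.
pred_labels = [
    max(preds, key=lambda pred: pred["score"])["label"]
    for preds in text_classifier(test_texts)
]
print(classification_report(test_labels, pred_labels, digits=2))
```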
📄 License
This project is licensed under the CC BY-NC 4.0 license.
Citing & Authors
Jaehyeong at Bespin Global