🚀 DialoGPT-medium-spanish-chitchat
This is a transformer-decoder GPT-2 model adapted for single-turn dialogue tasks in Spanish, offering professional-style conversations.
🚀 Quick Start
The model was fine-tuned from Microsoft's 345M-parameter DialoGPT-medium checkpoint with the CLM (Causal Language Modelling) objective, adapting it to the single-turn dialogue task in Spanish.
✨ Features
- Adapted for single-turn dialogue in Spanish.
- Fine-tuned from the 345M-parameter DialoGPT-medium model.
📦 Installation
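The usage examples below rely on PyTorch and the Hugging Face transformers library, which can be installed with pip:

pip install torch transformers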
💻 Usage Examples
Basic Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHAT_TURNS = 5
MAX_LENGTH = 1000

model = AutoModelForCausalLM.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
tokenizer = AutoTokenizer.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')

# Run on GPU if one is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

for i in range(CHAT_TURNS):
    user_input = input(f"Step - {i} >> user prompt -> ")
    with torch.no_grad():
        # Append the EOS token, which DialoGPT uses to mark the end of a turn
        user_inputs_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")
        user_inputs_ids = user_inputs_ids.to(device)
        chat_history = model.generate(user_inputs_ids, max_length=MAX_LENGTH, pad_token_id=tokenizer.eos_token_id)
        # Decode only the tokens generated after the user prompt
        step_model_answer = tokenizer.decode(chat_history[:, user_inputs_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"Step - {i} >> DialoGPT-spanish model answer -> {step_model_answer}")
Advanced Usage
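The snippet below is a minimal sketch of sampling-based decoding with this model. The do_sample, top_k, top_p and temperature values are illustrative defaults, not settings prescribed by the model authors.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
tokenizer = AutoTokenizer.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')

prompt = "¿Qué tal estás?"
input_ids = tokenizer.encode(prompt + tokenizer.eos_token, return_tensors="pt")

with torch.no_grad():
    # Sampling-based decoding; the parameter values are illustrative
    output = model.generate(
        input_ids,
        max_length=1000,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        temperature=0.8,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens
answer = tokenizer.decode(output[:, input_ids.shape[-1]:][0], skip_special_tokens=True)
print(answer)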
📚 Documentation
Examples
Single-turn generation examples from an interactive environment
| Role | Response |
|------|----------|
| User | Hola, buenos dias |
| Bot | Buenos días |
| User | ¿Qué tal estás? |
| Bot | Fenomenal, gracias. |
| User | ¡Tengo hambre! |
| Bot | Quizá picar algo te ayude. |
| User | Estoy triste |
| Bot | Siento escuchar eso. |
| User | ¿Tienes personalidad? |
| Bot | Soy un bot. |
| User | ¡Cuenta un chiste! |
| Bot | En realidad no tengo tanta gracia. |
| User | Jajajajajajajaja |
| Bot | ¡Me alegro de que te rías! |
Fine-tuning hyperparameters

| Property | Details |
|----------|---------|
| Validation partition | 20% |
| Training batch size | 8 |
| Learning rate | 5e-4 |
| Max training epochs | 20 |
| Warmup training steps | 6% |
| Weight decay | 0.01 |
| Optimiser (beta1, beta2, epsilon) | AdamW (0.9, 0.999, 1e-08) |
| Monitoring metric (delta, patience) | Validation loss (0.1, 3) |
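For reference, here is a minimal sketch of how these hyperparameters map onto a Hugging Face TrainingArguments configuration. This illustrates the table above; it is not the authors' actual training script, and the output path and Trainer wiring are assumptions.

from transformers import TrainingArguments, EarlyStoppingCallback

# Illustrative mapping of the table above onto transformers' TrainingArguments;
# not the authors' original training script
training_args = TrainingArguments(
    output_dir="dialogpt-spanish-finetune",  # hypothetical output path
    per_device_train_batch_size=8,           # Training batch size
    learning_rate=5e-4,                      # Learning rate
    num_train_epochs=20,                     # Max training epochs
    warmup_ratio=0.06,                       # Warmup training steps (6%)
    weight_decay=0.01,                       # Weight decay
    adam_beta1=0.9,                          # AdamW beta1
    adam_beta2=0.999,                        # AdamW beta2
    adam_epsilon=1e-8,                       # AdamW epsilon
    evaluation_strategy="epoch",             # evaluate on the 20% validation split
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",       # monitoring metric: validation loss
)

# Early stopping with delta 0.1 and patience 3, matching the table
early_stopping = EarlyStoppingCallback(
    early_stopping_patience=3,
    early_stopping_threshold=0.1,
)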
Fine-tuning on a different dataset or style

If you want to fine-tune your own dialogue model, we recommend starting from the DialoGPT model. You can check the original GitHub repository.
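As a starting point, the base checkpoint can be loaded as follows (a sketch; microsoft/DialoGPT-medium is Microsoft's published checkpoint on the Hugging Face Hub):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load Microsoft's base DialoGPT-medium checkpoint as the fine-tuning starting point
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")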
🔧 Technical Details
- This model uses the original English-based tokenizer from the GPT-2 paper. Spanish tokenization was not considered, but Spanish is close enough to English in grammatical structure for the tokenizer to encode its text, and this overlap may help the model transfer knowledge from English to Spanish. Moreover, the BPE (Byte Pair Encoding) implementation of the GPT-2 tokenizer can assign a representation to every Unicode string (see the snippet after this list).
From the GPT - 2 paper:
Since our approach can assign a probability to any Unicode string, this allows us to evaluate our LMs on any dataset regardless of pre-processing, tokenization, or vocab size.
- This model is intended to be used only for single-turn chitchat conversations in Spanish.
- The model's generation capabilities are limited to the scope of the fine-tuning dataset.
- The model generates short answers in a professional style, providing general conversational context in Spanish.
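As a quick illustration of the byte-level BPE property mentioned above, the tokenizer round-trips arbitrary Unicode text, including accented Spanish characters, without producing unknown tokens (a minimal sketch):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ITG/DialoGPT-medium-spanish-chitchat")

# Byte-level BPE assigns a representation to every Unicode string,
# so accented Spanish text encodes and decodes without <unk> tokens
ids = tokenizer.encode("¿Qué tal estás?")
print(ids)
print(tokenizer.decode(ids))  # -> ¿Qué tal estás?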
📄 License
This model is licensed under cc-by-nc-nd-4.0.