Llama 3 8B - Dutch
A conversational model based on Llama 3 8B, enabling natural and efficient communication in Dutch.
🚀 Quick Start
This model is a QLoRA and ORPO fine-tuned version of meta-llama/Meta-Llama-3-8B, trained on the synthetic feedback dataset BramVanroy/ultra_feedback_dutch.
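A minimal usage sketch (not taken from the original card), assuming the model is published as ReBatch/Llama-3-8B-dutch (the repo id used in the evaluation table below) and ships with the standard Llama 3 chat template for 🤗 Transformers:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ReBatch/Llama-3-8B-dutch"  # repo id as listed in the evaluation table

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A simple Dutch chat turn
messages = [
    {"role": "user", "content": "Kun je in het kort uitleggen wat machine learning is?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```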
✨ Features
- Dutch Chatting: Specifically designed for conversations in Dutch, providing a smooth chatting experience.
- Fine-Tuned: Refined with ORPO on a synthetic feedback dataset.
📚 Documentation
Model description
This is a Dutch chat model, built on Llama 3 8B and further aligned with ORPO on the feedback dataset BramVanroy/ultra_feedback_dutch.
Intended uses & limitations
⚠️ Important Note
Although the model has been aligned with gpt-4-turbo output, which has strong content filters, it could still generate incorrect, misleading, and potentially even offensive content. Use at your own risk.
Training procedure
The model was trained in bfloat16 with QLoRA and flash attention 2 on a single H100 80GB SXM5 GPU for around 24 hours on RunPod.
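For illustration only (the training script is not part of this card), a typical way to load the base model for this kind of QLoRA plus flash attention 2 setup with 🤗 Transformers and bitsandbytes looks roughly like this:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Standard QLoRA setup: 4-bit NF4 quantization with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    device_map="auto",
)
```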
Evaluation Results
The model was evaluated using ScandEval.
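As a hedged illustration of such a run (the exact command is not given in the card, and the scandeval API may differ between versions), a Dutch-language benchmark via the Python package could look roughly like this:

```python
from scandeval import Benchmarker

# Benchmark the fine-tuned model on the Dutch ScandEval tasks;
# keyword arguments are assumptions and may vary across scandeval releases.
benchmark = Benchmarker(language="nl")
benchmark("ReBatch/Llama-3-8B-dutch")
```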
The model showed mixed results across the benchmarks: slight improvements on some and lower scores on others. Note that it was trained on only 200,000 samples for a single epoch; we are curious whether its performance could be improved by training on more data or for additional epochs.
| Property | Details |
|----------|---------|
| Model Type | A QLoRA and ORPO fine-tuned version of meta-llama/Meta-Llama-3-8B |
| Training Data | BramVanroy/ultra_feedback_dutch |
| Model | conll_nl | dutch_social | scala_nl | squad_nl | wiki_lingua_nl | mmlu_nl | hellaswag_nl |
|-------|----------|--------------|----------|----------|----------------|---------|--------------|
| meta-llama/Meta-Llama-3-8B-Instruct | 68.72 | 14.67 | 32.91 | 45.36 | 67.62 | 36.18 | 33.91 |
| ReBatch/Llama-3-8B-dutch | 58.85 | 11.14 | 15.58 | 59.96 | 64.51 | 36.27 | 28.34 |
| meta-llama/Meta-Llama-3-8B | 62.26 | 10.45 | 30.3 | 62.99 | 65.17 | 36.38 | 28.33 |
Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 8e-06
- train_batch_size: 2
- eval_batch_size: 2
- num_devices: 1
- gradient_accumulation_steps: 4
- optimizer: paged_adamw_8bit
- lr_scheduler_type: linear
- warmup_steps: 10
- num_epochs: 1.0
- r: 16
- lora_alpha: 32
- lora_dropout: 0.05
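As a hedged sketch, these hyperparameters could map onto a PEFT LoraConfig and a TRL ORPOConfig roughly as follows; the output directory, dataset split, and trainer keyword names are assumptions (TRL versions differ) rather than details from the card:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
# (the 4-bit QLoRA loading shown in the training procedure sketch would be used here)

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

args = ORPOConfig(
    output_dir="llama-3-8b-dutch-orpo",  # hypothetical output directory
    learning_rate=8e-06,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=1.0,
    bf16=True,
)

# Split name and column layout of the dataset are assumed here.
dataset = load_dataset("BramVanroy/ultra_feedback_dutch", split="train")

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```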
📄 License
The model uses the llama3 license.