# 🚀 Maestrale chat beta ༄

Maestrale chat beta ༄ is a language model tailored for Italian, with enhanced truthfulness, math, and reasoning capabilities. It uses the ChatML prompt format and is suitable for a range of applications.
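For reference, ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` markers. `tokenizer.apply_chat_template` (used in the examples below) renders this automatically, so the sketch here is only illustrative:

```
<|im_start|>system
Sei un assistente utile.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```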
## 📦 Model Information

| Property | Details |
|---|---|
| Model Name | maestrale-chat-v0.4-beta |
| Language | Italian |
| License | cc-by-nc-4.0 |
| Tags | sft, it, mistral, chatml, axolotl |
| Prompt Template | ChatML |
## ✨ Features

- Language Model: Mistral-7B for Italian, with continued pre-training on a curated, large-scale, high-quality Italian corpus, merged with Occiglot.
- Fine-Tuning: SFT performed on 1.7M conversations/instructions for 2 epochs.
- DPO: aligned with DPO on multiple datasets.
- v0.4 Enhancements: agent capabilities, improved truthfulness, improved math and reasoning, Mermaid mindmaps, more Latin translations, poems, and more.
## 📊 Scores

| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| hellaswag_it | 1 | none | 0 | acc | 0.5270 | ± | 0.0052 |
| | | none | 0 | acc_norm | 0.7037 | ± | 0.0048 |
| arc_it | 1 | none | 0 | acc | 0.1771 | ± | 0.0112 |
| | | none | 0 | acc_norm | 0.5218 | ± | 0.0146 |
| m_mmlu_it | 0 | none | 5 | acc | 0.5623 | ± | 0.0043 |
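The column layout matches the output of EleutherAI's lm-evaluation-harness. As a hedged sketch (assuming `lm-eval` is installed and the Italian task names from the table exist under these names in your harness version), the scores can be reproduced along these lines:

```python
# Sketch: reproducing the 0-shot rows with lm-evaluation-harness.
# Assumes `pip install lm-eval` and that hellaswag_it / arc_it / m_mmlu_it
# are available under these names in your installed version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mii-llm/maestrale-chat-v0.4-beta",
    tasks=["hellaswag_it", "arc_it"],  # 0-shot rows in the table
    num_fewshot=0,
)
print(results["results"])

# m_mmlu_it was evaluated 5-shot; run it separately with num_fewshot=5.
```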
## 💻 Usage Examples

### Basic Usage

```python
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    GenerationConfig,
    TextStreamer
)
import torch

tokenizer = AutoTokenizer.from_pretrained("mii-llm/maestrale-chat-v0.4-beta")
# 8-bit loading requires bitsandbytes; device_map="auto" requires accelerate.
model = AutoModelForCausalLM.from_pretrained(
    "mii-llm/maestrale-chat-v0.4-beta",
    load_in_8bit=True,
    device_map="auto"
)

gen = GenerationConfig(
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2,
    top_k=50,
    top_p=0.95,
    max_new_tokens=500,
    pad_token_id=tokenizer.eos_token_id,
    # Stop at the ChatML end-of-turn token instead of the base EOS token.
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>")
)

streamer = TextStreamer(tokenizer, skip_prompt=True)

messages = [
    {"role": "system", "content": "Sei un assistente utile."},  # "You are a helpful assistant."
    {"role": "user", "content": "{prompt}"}
]

with torch.no_grad():
    # Render the ChatML prompt, leaving the assistant turn open for the model.
    temp = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(temp, return_tensors="pt").to("cuda")
    _ = model.generate(
        **inputs,
        streamer=streamer,
        generation_config=gen
    )
```
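The same objects can carry a multi-turn conversation. A minimal sketch, reusing `tokenizer`, `model`, `gen`, and `messages` from above (the follow-up question is an invented example): decode the reply, append it as an assistant turn, then generate again.

```python
# Minimal multi-turn sketch, continuing from the Basic Usage example.
out = model.generate(**inputs, generation_config=gen)
reply = tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)

messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Puoi approfondire?"})  # "Can you elaborate?" (invented)

temp = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(temp, return_tensors="pt").to("cuda")
_ = model.generate(**inputs, streamer=streamer, generation_config=gen)
```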
### Advanced Usage

Each of the following examples only changes `messages`; generation then proceeds exactly as in Basic Usage.

#### Mindmaps

```python
messages = [
    # "Provide a Mermaid mindmap on the input topic."
    {"role": "system", "content": "Fornisci una mindmap Mermaid sull'argomento in input."},
    {"role": "user", "content": "Argomento: [argomento]"}  # "Topic: [topic]"
]
```
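For reference, Mermaid's mindmap syntax (shown here as a generic illustration, not actual model output) looks like this:

```mermaid
mindmap
  root((Argomento))
    Sottotema 1
      Dettaglio
    Sottotema 2
```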
#### SQL

````python
schema = "[db schema]"

messages = [
    # "You are a SQL assistant; your task is to convert the user's question
    # into SQL that is valid against the provided database schema."
    {"role": "system", "content": f"Sei un assistente SQL e il tuo compito è convertire la domanda dell'utente in codice SQL valido rispetto allo schema del database fornito.\n\nSchema:\n```sql\n{schema}\n```"},
    {"role": "user", "content": "Conta il numero di X prodotti dall'azienda Y"}  # "Count the number of X produced by company Y"
]
````
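The schema and question above are placeholders. As a purely hypothetical illustration (table and column names invented), they could be filled like this before building `messages`:

```python
# Hypothetical values for the placeholders above (invented for illustration).
schema = """CREATE TABLE prodotti (
    id INTEGER PRIMARY KEY,
    nome TEXT,
    azienda TEXT
);"""
question = "Conta il numero di prodotti dell'azienda ACME"  # invented example question
```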
#### Article from index

```python
messages = [
    {"role": "system", "content": "Sei un assistente utile."},
    # "Write an article from the title and the table of contents."
    {"role": "user", "content": (
        "Scrivi un articolo a partire dal titolo e dall'indice dei contenuti.\n\n"
        "Titolo: [titolo]\n\n"
        "Indice:\n\n"
        "1. Introduzione\n"
        "2. [heading]\n"
        "..."
    )}
]
```
## ⚠️ Intended uses & limitations

This is a beta version: it is quite safe, and it can refuse to answer toxic questions.

## 📄 License
This model is licensed under cc-by-nc-4.0.