🚀 Zefiro-7B-Beta-ITA-v0.1
Zefiro-7B-Beta-ITA-v0.1 is a fine-tuned model for the Italian language. It combines techniques from multiple models and the open-source community, aiming to provide high-quality language processing capabilities for Italian tasks.
🚀 Quick Start
Here's how you can run the model with 🤗 Transformers:
```python
# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "giux78/zefiro-7b-beta-ITA-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id)
model.to('cuda')
# Left padding is required for correct batched generation with decoder-only models.
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")

# System prompt in Italian (a translation of the standard Llama-2-style safety prompt).
sys_prompt = "Sei un assistente disponibile, rispettoso e onesto. " \
             "Rispondi sempre nel modo più utile possibile, pur essendo sicuro. " \
             "Le risposte non devono includere contenuti dannosi, non etici, razzisti, sessisti, tossici, pericolosi o illegali. " \
             "Assicurati che le tue risposte siano socialmente imparziali e positive. " \
             "Se una domanda non ha senso o non è coerente con i fatti, spiegane il motivo invece di rispondere in modo non corretto. " \
             "Se non conosci la risposta a una domanda, non condividere informazioni false."

def generate_text(sys_prompt, user_prompt):
    # The system prompt takes the 'system' role so the chat template renders it correctly.
    messages = [{'content': sys_prompt, 'role': 'system'},
                {'content': user_prompt, 'role': 'user'}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    generated_ids = model.generate(**model_inputs, max_new_tokens=1024)
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Example prompts (in Italian): "Make a list of what to eat for lunch and dinner
# every day of the week" and "What do you think about Italian politics?"
generate_text(sys_prompt, 'Crea una lista su cosa mangiare a pranzo e a cena ogni giorno della settimana')
generate_text(sys_prompt, 'Cosa ne pensi della politica italiana?')
```
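The helper above decodes greedily, which is the `generate` default. For more varied output you can pass the standard 🤗 Transformers sampling arguments. A minimal sketch reusing the `model` and `tokenizer` loaded above; the specific values are illustrative assumptions, not settings recommended by the model authors:

```python
# A sketch of sampled generation; temperature/top_p values are illustrative.
def generate_text_sampled(sys_prompt, user_prompt, temperature=0.7, top_p=0.95):
    messages = [{'content': sys_prompt, 'role': 'system'},
                {'content': user_prompt, 'role': 'user'}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    generated_ids = model.generate(**model_inputs,
                                   max_new_tokens=1024,
                                   do_sample=True,           # sample instead of greedy decoding
                                   temperature=temperature,  # soften the token distribution
                                   top_p=top_p)              # nucleus sampling
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```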
✨ Features
- Based on Multiple Models: Zefiro is a port of the Zephyr model to the Italian language, inspired by Llamantino and combined with different approaches from the open-source community.
- Italian Language Focus: Primarily designed for the Italian language, fine-tuned on a mix of publicly available, synthetic datasets.
📚 Documentation
Model Details
Zefiro is a port of the Zephyr model to the Italian language, built using the recipes from the [alignment-handbook](https://huggingface.co/alignment-handbook). It also takes inspiration from the [LLaMAntino](https://huggingface.co/swap-uniba/LLaMAntino-2-chat-7b-hf-UltraChat-ITA) model developed by the Università di Bari. The implementation combines different approaches from these two models and the open-source community.
Model description
| Property | Details |
|---|---|
| Model Type | A 7B-parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets. |
| Language(s) (NLP) | Primarily Italian |
| License | Apache 2.0 |
| Finetuned from model | [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) |
| Developed by | giux78 |
| Funded by | Business Operating System |
Intended uses & limitations
The model was initially fine-tuned on a filtered and preprocessed version of [UltraChat-ITA](https://huggingface.co/datasets/giux78/100k-sft-ready-ultrafeedback-ita), which is a filtered version of the UltraChat dataset containing diverse synthetic dialogues generated by ChatGPT.
Bias, Risks, and Limitations
Zefiro-7b-beta-ITA-v0.1 has not been aligned to human preferences for safety within the RLHF phase or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). It is also unknown what the size and composition of the corpus used to train the base model (mistralai/Mistral-7B-v0.1) were, but it likely included a mix of Web data and technical sources like books and code. See the [Falcon 180B model card](https://huggingface.co/tiiuae/falcon-180B#training-data) for an example.
Training Data
We used [UltraChat-ITA](https://huggingface.co/datasets/giux78/100k-sft-ready-ultrafeedback-ita) as training data, which is a filtered version of the UltraChat dataset. For translating the dataset, different tools and APIs were combined, and we are still evaluating the best approach for translating more datasets. The translation phase is critical and can introduce incorrect syntax and semantics.
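To inspect the training data yourself (for example, to spot the translation artifacts mentioned above), the dataset can be loaded with 🤗 Datasets. A minimal sketch, where the `train` split name is an assumption; check the dataset card on the Hub:

```python
from datasets import load_dataset

# Load the Italian SFT dataset used for fine-tuning.
# The split name is an assumption; verify it on the dataset card.
dataset = load_dataset("giux78/100k-sft-ready-ultrafeedback-ita", split="train")
print(dataset)     # features and number of rows
print(dataset[0])  # one example, useful for spotting translation artifacts
```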
Summary
Zefiro-7b-beta-ITA-v0.1 is a fine-tuned version of Mistral-7B using the Zephyr approach, adapted for the Italian language.
📄 License
The model is licensed under Apache 2.0.
📚 Citation
```bibtex
@misc{tunstall2023zephyr,
      title={Zephyr: Direct Distillation of LM Alignment},
      author={Lewis Tunstall and Edward Beeching and Nathan Lambert and Nazneen Rajani and Kashif Rasul and Younes Belkada and Shengyi Huang and Leandro von Werra and Clémentine Fourrier and Nathan Habib and Nathan Sarrazin and Omar Sanseviero and Alexander M. Rush and Thomas Wolf},
      year={2023},
      eprint={2310.16944},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```

```bibtex
@misc{basile2023llamantino,
      title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language},
      author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
      year={2023},
      eprint={2312.09993},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
Model Card Contact
ale.ercolani@gmail.com

