LLaMAntino-2-7b-ITA
LLaMAntino-2-7b-ITA is an Italian-adapted Large Language Model (LLM) based on LLaMA 2. It aims to provide Italian NLP researchers with a base model for natural language generation tasks.
Quick Start
Below is an example of how to use the LLaMAntino-2-7b-ITA model:
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swap-uniba/LLaMAntino-2-7b-hf-ITA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Scrivi qui un possibile prompt"  # "Write a possible prompt here"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
outputs = model.generate(input_ids=input_ids)

# Decode only the newly generated tokens (everything after the prompt)
print(tokenizer.batch_decode(
    outputs.detach().cpu().numpy()[:, input_ids.shape[1]:],
    skip_special_tokens=True,
)[0])
```
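The slicing in the decode step strips the prompt tokens from the generated sequence, since `generate` returns the prompt followed by the new tokens. A toy illustration with made-up token ids (not real LLaMA vocabulary entries):

```python
import numpy as np

# Hypothetical token ids, only to illustrate the slicing logic:
# the output sequence echoes the prompt tokens, then the generated ones.
prompt_ids = np.array([[101, 7, 42, 9]])              # shape (1, 4)
outputs = np.array([[101, 7, 42, 9, 500, 501, 502]])  # prompt + 3 new tokens

# Same expression as in the snippet above: keep only columns after the prompt
generated = outputs[:, prompt_ids.shape[1]:]
print(generated.tolist())  # [[500, 501, 502]]
```

Without this slice, the printed text would repeat the prompt before the model's continuation.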
Advanced Usage
If you encounter issues when loading the model, you can try loading it in 8-bit quantized form:

```python
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
```
⚠️ Important Note
The model loading strategy above requires the bitsandbytes and accelerate libraries.
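For context, 8-bit loading stores the model's linear-layer weights as 8-bit integers plus a floating-point scale, roughly quartering memory use versus float32. A minimal numpy sketch of absmax int8 quantization (a simplification of what bitsandbytes actually does, which works block-wise and handles outliers separately):

```python
import numpy as np

# Toy absmax int8 quantization: map floats into [-127, 127] with one scale.
weights = np.array([0.6, -1.2, 0.3, 2.0], dtype=np.float32)

scale = np.abs(weights).max() / 127.0        # largest magnitude maps to 127
quantized = np.round(weights / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

print(quantized)                             # int8 values, 1 byte each
print(np.abs(weights - dequantized).max())   # small reconstruction error
```

The quantized weights are dequantized on the fly during the forward pass, trading a little precision for a much smaller memory footprint.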
✨ Features
LLaMAntino-2-7b is an Italian-adapted LLaMA 2 Large Language Model (LLM). It is designed to provide Italian NLP researchers with a base model for natural language generation tasks.
📦 Installation
The installation process mainly involves installing the necessary Python libraries. You can use pip to install them:

```shell
pip install transformers bitsandbytes accelerate
```
Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Type | LLaMA 2 |
| Language(s) (NLP) | Italian |
| License | Llama 2 Community License |
| Finetuned from model | meta-llama/Llama-2-7b-hf |
| Developed by | Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, Giovanni Semeraro |
| Funded by | PNRR project FAIR - Future AI Research |
| Compute infrastructure | Leonardo supercomputer |
Training Details
The model was trained using QLoRA on the medium subset of the clean_mc4_it dataset. If you are interested in more details regarding the training procedure, the code is available at the following link:
- Repository: https://github.com/swapUniba/LLaMAntino
⚠️ Important Note
The code has not been released yet. We apologize for the delay; it will be available as soon as possible!
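QLoRA keeps the quantized base weights frozen and trains only small low-rank adapter matrices, which is what makes fine-tuning a 7B model feasible on limited hardware. A toy numpy sketch of the low-rank update (hypothetical sizes, far smaller than the 7B model's actual layers):

```python
import numpy as np

d, r = 1024, 8                          # hidden size and adapter rank (toy values)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))         # frozen base weight (quantized in real QLoRA)
A = rng.standard_normal((r, d)) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))                    # initialized to zero: adapter starts as a no-op

W_adapted = W + B @ A                   # effective weight after merging the adapter

trainable = A.size + B.size             # 2 * d * r adapter parameters
total = W.size                          # d * d base parameters
print(trainable, total, trainable / total)
```

With these toy sizes the adapter holds under 2% of the layer's parameters; only `A` and `B` receive gradient updates during training.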
License
Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
Citation
If you use this model in your research, please cite the following:
```bibtex
@misc{basile2023llamantino,
      title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language},
      author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
      year={2023},
      eprint={2312.09993},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```