LLaMAntino-2-7b-ITA
LLaMAntino-2-7b-ITA is an Italian-adapted Large Language Model (LLM) based on LLaMA 2. It aims to provide Italian NLP researchers with a base model for natural language generation tasks.
Quick Start
Below is an example of how to use the LLaMAntino-2-7b-ITA model:
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swap-uniba/LLaMAntino-2-7b-hf-ITA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Scrivi qui un possibile prompt"  # "Write a possible prompt here"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
outputs = model.generate(input_ids=input_ids)

# Decode only the newly generated tokens (everything after the prompt)
print(tokenizer.batch_decode(
    outputs.detach().cpu().numpy()[:, input_ids.shape[1]:],
    skip_special_tokens=True,
)[0])
```
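The slicing in the decode step strips the prompt tokens from the generated sequence, since `generate` returns the prompt followed by the new tokens. A toy illustration with made-up token ids (not real LLaMA vocabulary entries):

```python
import numpy as np

# Hypothetical token ids, only to illustrate the slicing logic:
# the output sequence echoes the prompt tokens, then the generated ones.
prompt_ids = np.array([[101, 7, 42, 9]])              # shape (1, 4)
outputs = np.array([[101, 7, 42, 9, 500, 501, 502]])  # prompt + 3 new tokens

# Same expression as in the snippet above: keep only columns after the prompt
generated = outputs[:, prompt_ids.shape[1]:]
print(generated.tolist())  # [[500, 501, 502]]
```

Without this slice, the printed text would repeat the prompt before the model's continuation.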
Advanced Usage
If you encounter issues when loading the model, you can try loading it in 8-bit quantized form:

```python
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
```
⚠️ Important Note
The model loading strategy above requires the bitsandbytes and accelerate libraries.
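For context, 8-bit loading stores the model's linear-layer weights as 8-bit integers plus a floating-point scale, roughly quartering memory use versus float32. A minimal numpy sketch of absmax int8 quantization (a simplification of what bitsandbytes actually does, which works block-wise and handles outliers separately):

```python
import numpy as np

# Toy absmax int8 quantization: map floats into [-127, 127] with one scale.
weights = np.array([0.6, -1.2, 0.3, 2.0], dtype=np.float32)

scale = np.abs(weights).max() / 127.0        # largest magnitude maps to 127
quantized = np.round(weights / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

print(quantized)                             # int8 values, 1 byte each
print(np.abs(weights - dequantized).max())   # small reconstruction error
```

The quantized weights are dequantized on the fly during the forward pass, trading a little precision for a much smaller memory footprint.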
✨ Features
LLaMAntino-2-7b is an Italian-adapted LLaMA 2 Large Language Model (LLM). It is designed to provide Italian NLP researchers with a base model for natural language generation tasks.
📦 Installation
The installation process mainly involves installing the necessary Python libraries. You can use pip to install them:

```shell
pip install transformers bitsandbytes accelerate
```
Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Type | LLaMA 2 |
| Language(s) (NLP) | Italian |
| License | Llama 2 Community License |
| Finetuned from model | meta-llama/Llama-2-7b-hf |
| Developed by | Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, Giovanni Semeraro |
| Funded by | PNRR project FAIR - Future AI Research |
| Compute infrastructure | Leonardo supercomputer |
Training Details
The model was trained using QLoRA on the medium subset of the clean_mc4_it dataset. If you are interested in more details regarding the training procedure, the code is available at the following link:
- Repository: https://github.com/swapUniba/LLaMAntino
⚠️ Important Note
The code has not been released yet. We apologize for the delay; it will be available as soon as possible!
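QLoRA keeps the quantized base weights frozen and trains only small low-rank adapter matrices, which is what makes fine-tuning a 7B model feasible on limited hardware. A toy numpy sketch of the low-rank update (hypothetical sizes, far smaller than the 7B model's actual layers):

```python
import numpy as np

d, r = 1024, 8                          # hidden size and adapter rank (toy values)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))         # frozen base weight (quantized in real QLoRA)
A = rng.standard_normal((r, d)) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))                    # initialized to zero: adapter starts as a no-op

W_adapted = W + B @ A                   # effective weight after merging the adapter

trainable = A.size + B.size             # 2 * d * r adapter parameters
total = W.size                          # d * d base parameters
print(trainable, total, trainable / total)
```

With these toy sizes the adapter holds under 2% of the layer's parameters; only `A` and `B` receive gradient updates during training.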
License
Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
Citation
If you use this model in your research, please cite the following:
```bibtex
@misc{basile2023llamantino,
      title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language},
      author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
      year={2023},
      eprint={2312.09993},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```