LLaMAntino-3-ANITA-8B-Inst-DPO-ITA: An Open-Source Large Language Model Optimized for Italian NLP Tasks, with Multilingual Support

Llamantino 3 ANITA 8B Inst DPO ITA

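Developed by swap-uniba

LLaMAntino-3-ANITA is a multilingual (English + Italian) large language model built on Meta Llama 3 and optimized for Italian NLP tasks.

Large language model

Transformers

Multilingual support #Italian AI assistant #Multilingual instruction tuning #DPO alignment

Downloads: 6,401

Released: 4/29/2024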

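Model overview

The model is an instruction-tuned version of Meta-Llama-3-8b-instruct, aligned with human preferences using DPO and suited to Italian-specific tasks.

Model features

Multilingual support

Italian-language processing is specifically optimized while English capability is retained

Instruction tuning

Aligned with human preferences using supervised fine-tuning (SFT) and DPO

Efficient inference

Supports 4-bit quantization, which lowers hardware requirements

Long-context handling

Supports a context length of 8K (8192) tokens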

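Model capabilities

Italian text generation

English text generation

Instruction understanding and execution

Question answering

Dialogue systems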

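Use cases

Education

Italian learning assistant

Helps students learn and practice Italian

Provides accurate explanations and examples in Italian

Customer service

Italian customer-service chatbot

Handles inquiries from Italian-speaking customers

Provides natural, fluent responses in Italian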

🚀 LLaMAntino-3-ANITA-8B-Inst-DPO-ITA

LLaMAntino-3-ANITA-8B-Inst-DPO-ITA is a multilingual model fine-tuned from Meta-Llama-3-8B-Instruct. It supports English and Italian and aims to give Italian NLP research a better model.

“Built with Meta Llama 3”.

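🚀 Quick Start

LLaMAntino-3-ANITA-8B-Inst-DPO-ITA is a member of the LLaMAntino family of large language models. It is an instruction-tuned version of Meta-Llama-3-8b-instruct (itself a fine-tuned LLaMA 3 model). This release is intended as a multilingual model (English + Italian) that can be further fine-tuned on Italian-specific tasks.

The ANITA project (Advanced Natural-based interaction for the ITAlian language) aims to give Italian NLP researchers an improved model for Italian-language use cases.

Live demo: https://chat.llamantino.it/ (reachable only from Italian network connections).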

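✨ Key Features

Multilingual support: handles English and Italian.
Fine-tuning: instruction-tuned from Meta-Llama-3-8B-Instruct, using QLoRA 4-bit for supervised fine-tuning (SFT) and then DPO on a preference dataset for alignment.
Multiple ways to use it: run it directly or call it through the transformers library.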
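To make the DPO alignment step mentioned above more concrete, here is a minimal sketch of the standard per-pair DPO loss in plain Python. It is not the authors' training code: the function name `dpo_loss`, the default `beta=0.1`, and the sample log-probabilities are assumptions chosen only for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio)))."""
    # How much more (in log space) the policy likes each answer than the frozen reference does
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) is equal to log(1 + exp(-x))
    return math.log1p(math.exp(-logits))

# Hypothetical sequence log-probabilities (illustration only)
no_preference = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy identical to reference
prefers_chosen = dpo_loss(-5.0, -12.0, -10.0, -12.0)  # policy raised the chosen answer
print(no_preference, prefers_chosen)
```

The loss falls as the policy assigns relatively more probability to the preferred answer than the reference model does, which is the behaviour preference alignment is after.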

📦 Installation

Using the `transformers` library

First, install transformers and its related dependencies with pip:

pip install -U transformers trl peft accelerate bitsandbytes
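After installing, it can help to confirm which versions ended up in the environment. The sketch below uses only the standard library; the helper name `installed_versions` is made up for this example, and the package list simply mirrors the pip command above.

```python
from importlib import metadata

# Package names taken from the pip command above
REQUIRED = ["transformers", "trl", "peft", "accelerate", "bitsandbytes"]

def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```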

💻 Usage Examples

Basic usage

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)

base_model = "swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# System prompt (named system_prompt to avoid shadowing Python's built-in sys module)
system_prompt = "Sei un assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA " \
    "(Advanced Natural-based interaction for the ITAlian language)." \
    " Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Chi è Carlo Magno?"}
]

# Method 1: apply the chat template, then call generate() directly
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
# Move the input tensors to the model's device
inputs = inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
# The decoded string contains the prompt followed by the generated answer
results = tokenizer.batch_decode(outputs)[0]
print(results)

# Method 2: use the text-generation pipeline, which accepts the chat messages directly
import transformers
pipe = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,  # return only the newly generated text, without the prompt
    task='text-generation',
    max_new_tokens=512,  # maximum number of tokens to generate
    temperature=0.6,  # lower values give more deterministic answers
    do_sample=True,
    top_p=0.9,
)

sequences = pipe(messages)
for seq in sequences:
    print(seq['generated_text'])
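Method 1 decodes the whole output sequence, so the printed text includes the prompt as well as the answer. A small sketch of keeping only the newly generated tokens follows; it uses toy token ids rather than real model output, and the helper name `new_token_ids` is hypothetical.

```python
def new_token_ids(output_ids, prompt_length):
    """Return only the tokens generated after the prompt."""
    return output_ids[prompt_length:]

# Toy token ids standing in for real model output (assumption for illustration)
prompt_ids = [101, 102, 103]
output_ids = prompt_ids + [7, 8, 9]

generated = new_token_ids(output_ids, len(prompt_ids))
print(generated)  # [7, 8, 9]
```

With the objects from Method 1, the same idea would look like `tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)`.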

Advanced usage

Load the model with 4-bit quantization to reduce resource requirements:

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

base_model = "swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# System prompt (named system_prompt to avoid shadowing Python's built-in sys module)
system_prompt = "Sei un assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA " \
    "(Advanced Natural-based interaction for the ITAlian language)." \
    " Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Chi è Carlo Magno?"}
]

# Method 1: apply the chat template, then call generate() directly
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
# Move the input tensors to the model's device
inputs = inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
# The decoded string contains the prompt followed by the generated answer
results = tokenizer.batch_decode(outputs)[0]
print(results)

# Method 2: use the text-generation pipeline, which accepts the chat messages directly
import transformers
pipe = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,  # return only the newly generated text, without the prompt
    task='text-generation',
    max_new_tokens=512,  # maximum number of tokens to generate
    temperature=0.6,  # lower values give more deterministic answers
    do_sample=True,
    top_p=0.9,
)

sequences = pipe(messages)
for seq in sequences:
    print(seq['generated_text'])
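As a rough sense of why 4-bit loading reduces resource needs, the back-of-the-envelope sketch below compares weight storage at 16 and 4 bits per parameter. The parameter count is an assumption read off the "8B" in the model name, and the estimate covers weights only: it ignores quantization metadata, layers that stay unquantized, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3

NUM_PARAMS = 8e9  # assumption: roughly 8 billion parameters, per the "8B" in the model name

bf16 = weight_memory_gib(NUM_PARAMS, 16)  # torch.bfloat16, as in the basic example
nf4 = weight_memory_gib(NUM_PARAMS, 4)    # 4-bit weights, as in the advanced example
print(f"bfloat16: {bf16:.1f} GiB, 4-bit: {nf4:.1f} GiB")
```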

📚 Documentation

Model details

Last updated: May 10, 2024

GitHub repository

Model	HF link	GGUF link	EXL2 link
swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA	Link	Link	Link

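Specifications

Property	Details
Model developers	Dr. Marco Polignano, SWAP Research Group, University of Bari Aldo Moro, Italy
Variations	Supervised fine-tuning (SFT) with QLoRA 4-bit on instruction-based datasets, followed by DPO alignment on the mlabonne/orpo-dpo-mix-40k dataset to match human preferences for helpfulness and safety.
Input	Text only.
Language	Multilingual (English + Italian)
Output	Text and code only.
Model architecture	Llama 3 architecture
Context length	8K (8192)
Library used	Unsloth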
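Because the context length is fixed at 8192 tokens, the prompt and the generated output have to share that budget. A minimal sketch of clamping `max_new_tokens` accordingly is shown below; the helper name `max_new_tokens_budget` is hypothetical, and in practice the prompt token count would come from the tokenizer.

```python
CONTEXT_LENGTH = 8192  # from the specifications table

def max_new_tokens_budget(prompt_tokens, requested_new_tokens, context_length=CONTEXT_LENGTH):
    """Clamp the requested generation length so prompt + output fits in the context window."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(requested_new_tokens, context_length - prompt_tokens)

print(max_new_tokens_budget(100, 512))   # 512
print(max_new_tokens_budget(8000, 512))  # 192
```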

Prompt template

<|start_header_id|>system<|end_header_id|>

{ SYS Prompt }<|eot_id|><|start_header_id|>user<|end_header_id|>

{ USER Prompt }<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{ ASSIST Prompt }<|eot_id|>
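The template above can be reproduced with a few lines of string formatting, which is handy for inspecting what the model actually receives. This is only a sketch that follows the template as printed here; the function name `build_prompt` is hypothetical, and the tokenizer's own chat template may add further tokens (for example a beginning-of-text token), so `tokenizer.apply_chat_template` remains the safer choice for real use.

```python
def build_prompt(messages, add_generation_prompt=True):
    """Render chat messages following the prompt template shown above."""
    parts = []
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its answer
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

example = build_prompt([
    {"role": "system", "content": "Sei un assistente AI."},
    {"role": "user", "content": "Chi è Carlo Magno?"},
])
print(example)
```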

Evaluation

Open LLM Leaderboard: evaluated with lm-evaluation-benchmark-harness for the Open Italian LLMs Leaderboard:

   lm_eval --model hf --model_args pretrained=HUGGINGFACE_MODEL_ID  --tasks hellaswag_it,arc_it  --device cuda:0 --batch_size auto:2
   lm_eval --model hf --model_args pretrained=HUGGINGFACE_MODEL_ID  --tasks m_mmlu_it --num_fewshot 5  --device cuda:0 --batch_size auto:2

Metric	Value
Average	0.6160
Arc_IT	0.5714
Hellaswag_IT	0.7093
MMLU_IT	0.5672
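The reported average appears to be the unweighted mean of the three task scores. That is an inference from the numbers in the table rather than something the source states, and it can be checked quickly:

```python
# Scores copied from the table above
scores = {"Arc_IT": 0.5714, "Hellaswag_IT": 0.7093, "MMLU_IT": 0.5672}

average = sum(scores.values()) / len(scores)
print(f"{average:.4f}")  # 0.6160
```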

Unsloth

![Unsloth](https://raw.githubusercontent.com/unslothai/unsloth/main/images/made%20with%20unsloth.png)

Unsloth is a great tool that helped us develop the model easily and at a lower cost than expected.

Citation

@misc{polignano2024advanced,
      title={Advanced Natural-based interaction for the ITAlian language: LLaMAntino-3-ANITA}, 
      author={Marco Polignano and Pierpaolo Basile and Giovanni Semeraro},
      year={2024},
      eprint={2405.07101},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{basile2023llamantino,
      title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language}, 
      author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
      year={2023},
      eprint={2312.09993},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}

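Acknowledgments

We acknowledge the support of the PNRR project FAIR - Future AI Research (PE00000013), under the NRRP MUR program funded by NextGenerationEU. The models were built on the Leonardo supercomputer with the support of CINECA - Italian Super Computing Resource Allocation, class C project IscrC_Pro_MRS (HP10CQO70G).

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Average	75.12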
AI2 Reasoning Challenge (25-Shot)	74.57
HellaSwag (10-Shot)	92.75
MMLU (5-Shot)	66.85
TruthfulQA (0-shot)	75.93
Winogrande (5-shot)	82.00
GSM8k (5-shot)	58.61