EEVE-Korean-Instruct-10.8B-v1.0 Open-Source Large Language Model - Korean Vocabulary Extension with DPO Fine-Tuning for Dialogue


EEVE Korean Instruct 10.8B V1.0

Developed by yanolja

A large language model built on a Korean vocabulary-extended version of SOLAR-10.7B-v1.0 and fine-tuned with DPO

Large Language Model

Transformers

License: Apache-2.0 #KoreanInstructionTuning #MultiTurnDialogueGeneration #DPOFineTuning

Downloads: 19.39k

Release date: 2/22/2024

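Model Overview

This is an instruction-following large language model optimized for Korean. It is fine-tuned from yanolja/EEVE-Korean-10.8B-v1.0 and gives helpful, detailed, and polite answers to user questions.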

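Model Features

Korean optimization

The vocabulary is extended and optimized specifically for Korean, which improves Korean language processing.

Instruction following

DPO fine-tuning helps the model understand and follow user instructions more reliably.

High-quality answers

Answers are helpful, detailed, and polite, which suits conversational interaction.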

Model Capabilities

Korean text generation

Question answering

Instruction understanding and execution

Multi-turn dialogue

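Use Cases

Intelligent assistant

Korean question-answering system

Answers user questions about Korean culture, history, geography, and related topics

As the example shows, the model answers a question about Korea's capital and adds further information

Education

Korean learning aid

Helps learners of Korean understand the language and culture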

🚀 EEVE-Korean-Instruct-10.8B-v1.0

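EEVE-Korean-Instruct-10.8B-v1.0 is a large language model fine-tuned from yanolja/EEVE-Korean-10.8B-v1.0. It can be used for natural language processing tasks and gives users detailed and polite answers.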

🚀 Quick Start

This model is a fine-tuned version of yanolja/EEVE-Korean-10.8B-v1.0, which is itself a Korean vocabulary-extended version of upstage/SOLAR-10.7B-v1.0. Specifically, we applied Direct Preference Optimization (DPO) using Axolotl.

For more details, please refer to our technical report: Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models.
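
For readers unfamiliar with DPO, the per-example objective can be sketched in a few lines of plain Python. This is a generic illustration of the standard DPO loss, not the Axolotl configuration used to train this model; the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for the example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed sequence log-probabilities.

    Each *_logp argument is log p(response | prompt) under either the policy
    being trained or the frozen reference model. beta is a hypothetical
    default that scales the preference margin.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The policy prefers the chosen answer more strongly than the reference does,
# so the loss falls below log(2), the value at zero margin.
print(dpo_loss(-1.0, -5.0, -2.0, -2.0) < math.log(2))
```

The loss shrinks as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model, which is the preference signal a dataset such as UltraFeedback provides.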

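✨ Key Features

Community

If you are passionate about large language models and want to exchange knowledge and insights, we warmly invite you to join our Discord server. Please note that the server primarily uses Korean. The LLM field evolves rapidly, and without active sharing our collective knowledge quickly becomes outdated. Let's collaborate and make a greater impact together! Join us here: Discord link.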

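Team

Research	Engineering	Product Management	UX Design
Myeongho Jeong	Geon Kim	Bokyung Huh	Eunsu Choi
Seungduk Kim	Rifqi Alfi
Seungtaek Choi	Sanghoon Han
	Suhyun Kang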


💻 Usage Example

Basic usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("yanolja/EEVE-Korean-Instruct-10.8B-v1.0")
tokenizer = AutoTokenizer.from_pretrained("yanolja/EEVE-Korean-Instruct-10.8B-v1.0")

# Prompt template the model expects; {prompt} is replaced with the user's message.
prompt_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: {prompt}\nAssistant:\n"
text = '한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주'
model_inputs = tokenizer(prompt_template.format(prompt=text), return_tensors='pt')

# Generate up to 256 new tokens, then decode the full sequence (prompt plus answer).
outputs = model.generate(**model_inputs, max_new_tokens=256)
output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(output_text)

Example output

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: 한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.

(A) 경성
(B) 부산
(C) 평양
(D) 서울
(E) 전주
Assistant:
(D) 서울이 한국의 수도입니다. 서울은 나라의 북동부에 위치해 있으며, 정치, 경제, 문화의 중심지입니다. 약 1,000만 명이 넘는 인구를 가진 세계에서 가장 큰 도시 중 하나입니다. 서울은 높은 빌딩, 현대적인 인프라, 활기 문화 장면으로 유명합니다. 또한, 많은 역사적 명소와 박물관이 있어 방문객들에게 풍부한 문화 체험을 제공합니다.
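
Because the decoded text echoes the whole prompt, an application usually wants only the part after the `Assistant:` line. A minimal sketch of that post-processing step follows; the helper name `extract_answer` and the choice to split on the first `Assistant:` marker are assumptions for illustration, not part of the model's official tooling.

```python
# Marker that separates the prompt from the generated answer in the template.
ASSISTANT_MARKER = "\nAssistant:\n"


def extract_answer(decoded: str) -> str:
    """Return the text after the first Assistant marker.

    Hypothetical helper: if the marker is missing, the input is returned
    unchanged (apart from surrounding whitespace).
    """
    _, found, answer = decoded.partition(ASSISTANT_MARKER)
    return answer.strip() if found else decoded.strip()


decoded = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
    "Human: 한국의 수도는 어디인가요?\n"
    "Assistant:\n"
    "(D) 서울이 한국의 수도입니다."
)
print(extract_answer(decoded))
```

Splitting on the first marker assumes the user's message does not itself contain an `Assistant:` line; slicing the generated token IDs past the prompt length before decoding is an alternative that avoids that assumption.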

📚 Detailed Documentation

Prompt Template

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {prompt}
Assistant:

Training Data

Korean-translated version of Open-Orca/SlimOrca-Dedup
Korean-translated version of argilla/ultrafeedback-binarized-preferences-cleaned
No other datasets were used

Evaluation Results

Open LLM Leaderboard Evaluation Results. Detailed results can be found here.

Metric	Value
Average	66.48
AI2 Reasoning Challenge (25-shot)	64.85
HellaSwag (10-shot)	83.04
MMLU (5-shot)	64.23
TruthfulQA (0-shot)	54.09
Winogrande (5-shot)	81.93
GSM8k (5-shot)	50.72
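
The reported average is the unweighted mean of the six benchmark scores, which a short check confirms. The dictionary name `scores` is chosen for this example only; the values are copied from the table.

```python
# Benchmark scores copied from the evaluation table.
scores = {
    "AI2 Reasoning Challenge (25-shot)": 64.85,
    "HellaSwag (10-shot)": 83.04,
    "MMLU (5-shot)": 64.23,
    "TruthfulQA (0-shot)": 54.09,
    "Winogrande (5-shot)": 81.93,
    "GSM8k (5-shot)": 50.72,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 66.48
```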


📄 License

This model is released under the Apache-2.0 license.

📖 Citation

@misc{kim2024efficient,
      title={Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models}, 
      author={Seungduk Kim and Seungtaek Choi and Myeongho Jeong},
      year={2024},
      eprint={2402.14714},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{cui2023ultrafeedback,
      title={UltraFeedback: Boosting Language Models with High-quality Feedback}, 
      author={Ganqu Cui and Lifan Yuan and Ning Ding and Guanming Yao and Wei Zhu and Yuan Ni and Guotong Xie and Zhiyuan Liu and Maosong Sun},
      year={2023},
      eprint={2310.01377},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{SlimOrcaDedup,
  title = {SlimOrca Dedup: A Deduplicated Subset of SlimOrca},
  author = {Wing Lian and Guan Wang and Bleys Goodson and Eugene Pentland and Austin Cook and Chanvichet Vong and "Teknium" and Nathan Hoos},
  year = {2023},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup/}
}

@misc{mukherjee2023orca,
      title={Orca: Progressive Learning from Complex Explanation Traces of GPT-4}, 
      author={Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
      year={2023},
      eprint={2306.02707},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}