Orca-2-7b Open-Source Language Model - Improving Small-Model Reasoning for Free

Orca 2 7b

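Developed by Microsoft

Orca 2 is a research language model developed by Microsoft that focuses on improving the reasoning ability of small language models. It is fine-tuned from LLaMA-2.

Large language model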

Transformers

License: Other #small-model-reasoning #synthetic-data-training #research-only-model

Downloads: 120.21k

Release date: 11/14/2023

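Model Overview

Orca 2 is designed for research purposes. It performs well on reasoning tasks such as reading comprehension, math problem solving, and text summarization, and it relies on synthetic training data to strengthen small-model capabilities.

Model Features

Enhanced reasoning

Training on synthetic data significantly improves the reasoning ability of small language models.

Research-oriented

Designed for studying the capability boundaries of language models and for supporting academic exploration.

Safety filtering

The training data was moderated with Azure content safety tooling. Pairing any deployment with a content filtering service is recommended.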

Model Capabilities

Text generation

Reasoning tasks

Reading comprehension

Math problem solving

Text summarization

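Use Cases

Academic research

Small-model capability evaluation

Used to evaluate how small language models perform on reasoning tasks.

The paper reports that the model outperforms comparable small models on multiple benchmarks.

Educational assistance

Math problem answering

Helps students understand the process of solving math problems.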

🚀 Orca 2

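Orca 2 is built for research purposes and provides single-turn responses in tasks such as reasoning over user-given data, reading comprehension, math problem solving, and text summarization. The model is particularly strong at reasoning.

🚀 Quick Start

Inference with the Hugging Face library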

import torch
import transformers

if torch.cuda.is_available():
    torch.set_default_device("cuda")
else:
    torch.set_default_device("cpu")
    
model = transformers.AutoModelForCausalLM.from_pretrained("microsoft/Orca-2-7b", device_map='auto')

# https://github.com/huggingface/transformers/issues/27132
# Use the slow tokenizer, since the fast and slow tokenizers produce different tokens
tokenizer = transformers.AutoTokenizer.from_pretrained(
        "microsoft/Orca-2-7b",
        use_fast=False,
    )

system_message = "You are Orca, an AI language model created by Microsoft. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior."
user_message = "How can you determine if a restaurant is popular among locals or mainly attracts tourists, and why might this information be useful?"

prompt = f"<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{user_message}<|im_end|>\n<|im_start|>assistant"

inputs = tokenizer(prompt, return_tensors='pt')
output_ids = model.generate(inputs["input_ids"],)
answer = tokenizer.batch_decode(output_ids)[0]

print(answer)

# Continuing the example: add a second-turn user message to the conversation
second_turn_user_message = "Give me a list of the key points of your first answer."

# Set add_special_tokens=False so that no bos_token is automatically added between messages
second_turn_message_in_markup = f"\n<|im_start|>user\n{second_turn_user_message}<|im_end|>\n<|im_start|>assistant"
second_turn_tokens = tokenizer(second_turn_message_in_markup, return_tensors='pt', add_special_tokens=False)
second_turn_input = torch.cat([output_ids, second_turn_tokens['input_ids']], dim=1)

output_ids_2 = model.generate(second_turn_input,)
second_turn_answer = tokenizer.batch_decode(output_ids_2)[0]

print(second_turn_answer)
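
The prompt markup used above can also be assembled from a list of messages. The following is a minimal sketch, assuming the same `<|im_start|>` / `<|im_end|>` layout as the single-turn prompt in the example; `build_chatml_prompt` is a hypothetical helper written for illustration and is not part of transformers.

```python
from typing import Dict, List


def build_chatml_prompt(messages: List[Dict[str, str]]) -> str:
    """Render messages in the markup used by the example above.

    Hypothetical helper for illustration; not a transformers API.
    """
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>"
        for message in messages
    ]
    # Leave the assistant turn open so the model writes the reply.
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)


messages = [
    {"role": "system", "content": "You are Orca, an AI language model created by Microsoft."},
    {"role": "user", "content": "What is 2 + 2?"},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

For one system message and one user message, the returned string has the same shape as the f-string prompt in the example, so it can be passed to the tokenizer in the same way.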

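Safe inference with Azure AI Content Safety

Using Azure AI Content Safety on top of the model's predictions is strongly recommended, as it can help prevent some content harms. Azure AI Content Safety is a content moderation platform that uses AI to moderate content. Applying it to Orca 2's output lets you moderate that output by scanning it for different harm categories, including sexual content, violence, hate, and self-harm, with support for multiple severity levels and multilingual detection.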

import os
import math
import transformers
import torch

from azure.ai.contentsafety import ContentSafetyClient
from azure.core.credentials import AzureKeyCredential
from azure.core.exceptions import HttpResponseError
from azure.ai.contentsafety.models import AnalyzeTextOptions

CONTENT_SAFETY_KEY = os.environ["CONTENT_SAFETY_KEY"]
CONTENT_SAFETY_ENDPOINT = os.environ["CONTENT_SAFETY_ENDPOINT"]

# We use Azure AI Content Safety to filter out any content that reaches "Medium" threshold
# For more information: https://learn.microsoft.com/en-us/azure/ai-services/content-safety/
def should_filter_out(input_text, threshold=4):
    # Create a Content Safety client
    client = ContentSafetyClient(CONTENT_SAFETY_ENDPOINT, AzureKeyCredential(CONTENT_SAFETY_KEY))

    # Construct a request
    request = AnalyzeTextOptions(text=input_text)

    # Analyze text
    try:
        response = client.analyze_text(request)
    except HttpResponseError as e:
        print("Analyze text failed.")
        if e.error:
            print(f"Error code: {e.error.code}")
            print(f"Error message: {e.error.message}")
            raise
        print(e)
        raise

    categories = ["hate_result", "self_harm_result", "sexual_result", "violence_result"]
    max_score = -math.inf
    for category in categories:
        max_score = max(max_score, getattr(response, category).severity)

    return max_score >= threshold

model_path = 'microsoft/Orca-2-7b'
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = transformers.AutoModelForCausalLM.from_pretrained(model_path)
model.to(device)

tokenizer = transformers.AutoTokenizer.from_pretrained(
    model_path,
    model_max_length=4096,
    padding_side="right",
    use_fast=False,
    add_special_tokens=False,
)

system_message = "You are Orca, an AI language model created by Microsoft. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior."
user_message = "\" \n :You can't just say, \"\"that's crap\"\" and remove it without gaining a consensus. You already know this, based on your block history. —/ \" \nIs the comment obscene? \nOptions : Yes, No."

prompt =  f"<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{user_message}<|im_end|>\n<|im_start|>assistant"

inputs = tokenizer(prompt, return_tensors='pt')
inputs = inputs.to(device)

output_ids = model.generate(inputs["input_ids"], max_length=4096, do_sample=False, temperature=0.0, use_cache=True)
sequence_length = inputs["input_ids"].shape[1]
new_output_ids = output_ids[:, sequence_length:]
answers = tokenizer.batch_decode(new_output_ids, skip_special_tokens=True)
final_output = answers[0] if not should_filter_out(answers[0]) else "[Content Filtered]"

print(final_output)
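
The filtering decision in `should_filter_out` reduces to taking the highest severity across the harm categories and comparing it with the threshold. The sketch below isolates that step so it can be tried without calling the service. It assumes, as the comment in the code above states, that a severity of 4 corresponds to "Medium"; `exceeds_threshold` and the sample scores are hypothetical and only stand in for a real service response.

```python
import math
from typing import Mapping


def exceeds_threshold(severities: Mapping[str, int], threshold: int = 4) -> bool:
    """Return True when the worst category severity reaches the threshold.

    Hypothetical helper mirroring the max-over-categories check above.
    """
    # An empty mapping yields -inf, so nothing is filtered in that case.
    max_score = max(severities.values(), default=-math.inf)
    return max_score >= threshold


# Made-up severity scores standing in for a real service response.
sample = {"hate": 0, "self_harm": 0, "sexual": 2, "violence": 4}
print(exceeds_threshold(sample))               # violence reaches the default threshold
print(exceeds_threshold(sample, threshold=6))  # stricter cut-off, nothing reaches it
```

Lowering the threshold makes the filter stricter, and raising it makes the filter more permissive. The right value depends on the research setting.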

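✨ Key Features

Orca 2 is built for research purposes. Its main purpose is to let the research community assess its abilities and to provide a foundation for building better frontier models.
The model is particularly strong at reasoning and provides single-turn responses in tasks such as reasoning over user-given data, reading comprehension, math problem solving, and text summarization.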

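📚 Documentation

Intended use of Orca 2

Orca 2 is intended for research purposes only.
Its main purpose is to let the research community assess its abilities and to provide a foundation for building better frontier models.

How Orca 2 was evaluated

Orca 2 has been evaluated on a large number of tasks covering reasoning, grounding, and safety. For details, see Section 6 and the appendix of the Orca 2 paper.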

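Model details

Orca 2 is a fine-tuned version of LLaMA-2. Its training data is a synthetic dataset created to enhance the reasoning abilities of small models. All synthetic training data was moderated using the Microsoft Azure content filters. For more details about the model, see the Orca 2 paper. For details of the model architecture, see the LLaMA-2 technical report.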

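Bias, Risks, and Limitations

Orca 2 is built on the LLaMA 2 model family and retains many of its limitations, as well as limitations common to other large language models or caused by its training process, including:

Data biases: Large language models are trained on extensive data and can inadvertently carry biases present in the source data. As a result, they may generate outputs that are potentially biased or unfair.
Lack of contextual understanding: Despite their impressive capabilities in language understanding and generation, these models have limited real-world understanding, which can lead to inaccurate or nonsensical outputs.
Lack of transparency: Because of their complexity and size, large language models can act as "black boxes", making it difficult to understand the rationale behind a specific output or decision. Reviewing the Azure transparency notes is recommended for more information.
Content harms: Large language models can cause various types of content harm. It is important to be aware of these harms and to take steps to prevent them when using the models. Using the content moderation services offered by different companies and institutions is recommended. We hope that governments and technology leaders will develop better regulations and standards around content harms from AI technologies, and we value and acknowledge the important role the research and open-source communities can play here.
Hallucination: Do not rely entirely on a given language model for critical decisions or for information that may have significant impact, because it is not yet clear how to prevent these models from fabricating content. It is also unclear whether small models, with their smaller size and limited memorization capacity, are more prone to hallucination in ungrounded generation use cases. This is an active research topic, and we hope for more rigorous measurement, understanding, and mitigation in the future.
Potential for misuse: Without suitable safeguards, these models could be used maliciously to generate disinformation or harmful content.
Data distribution: Orca 2's performance is likely to correlate strongly with the distribution of the tuning data. This correlation may limit its accuracy in areas that are underrepresented in the training dataset, such as math, coding, and reasoning.
System messages: Orca 2's performance varies with the system instructions. In addition, the stochasticity introduced by the model size may lead to non-deterministic responses to different system instructions.
Zero-shot settings: Orca 2 was trained mainly on data that simulates zero-shot settings. Although the model performs very well in zero-shot settings, it does not show the same gains from few-shot learning as other models, particularly larger ones.
Synthetic data: Because Orca 2 is trained on synthetic data, it may inherit both the strengths and the shortcomings of the models and methods used for data generation. We believe Orca 2 benefits from the safety measures incorporated during training and from the safety guardrails (such as content filters) in the Azure OpenAI API. However, detailed studies are needed to better quantify these risks.

This model is intended solely for research settings and has been tested only in such settings. It should not be used in downstream applications, because additional analysis is needed to assess the harm or bias it might cause in the proposed application.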
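
Because responses vary with the system instruction (see the limitations listed above), a research evaluation may want to run the same user message under several system messages and compare the results. The following is a rough sketch of such a sweep, not a method described in the source. `compare_system_messages` and `fake_generate` are hypothetical names, and `generate_fn` stands for whatever function wraps the tokenizer and `model.generate` calls in your setup.

```python
from typing import Callable, Dict, List


def compare_system_messages(
    system_messages: List[str],
    user_message: str,
    generate_fn: Callable[[str], str],
) -> Dict[str, str]:
    """Run one user message under several system messages (hypothetical helper)."""
    results = {}
    for system_message in system_messages:
        # Same prompt markup as in the quick-start example.
        prompt = (
            f"<|im_start|>system\n{system_message}<|im_end|>\n"
            f"<|im_start|>user\n{user_message}<|im_end|>\n"
            f"<|im_start|>assistant"
        )
        results[system_message] = generate_fn(prompt)
    return results


def fake_generate(prompt: str) -> str:
    """Stand-in generator so the sketch runs without model weights."""
    return f"(reply to a {len(prompt)}-character prompt)"


responses = compare_system_messages(
    ["You are a cautious assistant.", "You are a concise assistant."],
    "Summarize the main risks of relying on model output.",
    fake_generate,
)
for system_message, reply in responses.items():
    print(system_message, "->", reply)
```

Replacing `fake_generate` with a real generation function keeps the rest of the sweep unchanged. With greedy decoding, differences between runs can then be attributed to the system message rather than to sampling.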

🔧 Technical Details

For details of the model architecture, see the LLaMA-2 technical report. For more details about the model, see the Orca 2 paper.

📄 License

📚 Citation

@misc{mitra2023orca,
      title={Orca 2: Teaching Small Language Models How to Reason}, 
      author={Arindam Mitra and Luciano Del Corro and Shweti Mahajan and Andres Codas and Clarisse Simoes and Sahaj Agrawal and Xuxi Chen and Anastasia Razdaibiedina and Erik Jones and Kriti Aggarwal and Hamid Palangi and Guoqing Zheng and Corby Rosset and Hamed Khanpour and Ahmed Awadallah},
      year={2023},
      eprint={2311.11045},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

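📦 Information Table

Attribute	Details
Model type	Orca 2 is a fine-tuned version of LLaMA-2
Training data	A synthetic dataset created to enhance the reasoning abilities of small models; all synthetic training data was moderated using the Microsoft Azure content filters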