Deepseek Coder 6.7B Instruct: Open-Source AI Programming Assistant for Computer Science Questions

Deepseek Coder 6.7B Instruct AWQ

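Quantized by TheBloke

Deepseek Coder 6.7B Instruct is an AI assistant model focused on programming tasks, developed by DeepSeek. It is designed to answer computer science questions and declines non-technical ones.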

Large language model

Transformers

License: other #programming-assistant #code-generation #computer-science-only

Downloads: 248

Released: 11/5/2023

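Model overview

A 6.7-billion-parameter language model specialized for programming, built on DeepSeek's architecture and optimized for code generation, code explanation, and debugging.

Model features

Programming-focused

Optimized specifically for programming tasks and handles code-related questions efficiently.

Safety restrictions

The model declines politically sensitive questions, security and privacy issues, and other non-technical questions.

Efficient inference

AWQ quantization allows the model to run efficiently on consumer-grade hardware.

Long context support

Supports a context window of up to 16384 tokens.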

Model capabilities

Code generation

Code explanation

Answering programming questions

Code debugging

Algorithm implementation

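Use cases

Software development

Code completion

Generates code snippets from the surrounding context.

Improves development efficiency and reduces repetitive coding.

Code review

Analyzes code and suggests improvements.

Helps developers improve code quality.

Learning to program

Explaining programming concepts

Explains complex programming concepts and algorithms.

Helps beginners understand programming fundamentals.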

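🚀 Deepseek Coder 6.7B Instruct - AWQ

This project provides an AWQ-quantized version of the Deepseek Coder 6.7B Instruct model. The model was developed by the DeepSeek team and provides programming assistance for computer science questions.

🚀 Quick start

Model information

Property	Details
Model creator	DeepSeek
Original model	Deepseek Coder 6.7B Instruct
Model type	deepseek
Quantized by	TheBloke
License	See the License section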

Prompt template

You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.
### Instruction:
{prompt}
### Response:
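
The template above can be filled in with a small helper. This is a minimal sketch: the names `SYSTEM_PROMPT` and `build_prompt` are hypothetical and not part of any library; only the template text itself comes from this page.

```python
# Hypothetical helper: wraps a user instruction in the prompt template shown above.
SYSTEM_PROMPT = (
    "You are an AI programming assistant, utilizing the Deepseek Coder model, "
    "developed by Deepseek Company, and you only answer questions related to "
    "computer science. For politically sensitive questions, security and privacy "
    "issues, and other non-computer science questions, you will refuse to answer."
)


def build_prompt(instruction: str) -> str:
    """Return the full prompt: system text, instruction block, empty response block."""
    return f"{SYSTEM_PROMPT}\n### Instruction:\n{instruction}\n### Response:\n"


if __name__ == "__main__":
    print(build_prompt("Write a function that reverses a string."))
```

The returned string ends with the `### Response:` line, so the model's completion starts directly after it.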

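✨ Key features

AWQ quantization

AWQ is an efficient, accurate, and very fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared with GPTQ, it offers faster Transformers-based inference, with quality equal to or better than the most commonly used GPTQ settings.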
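
To give a feel for what "4-bit quantization with a group size" means, here is a rough sketch of plain group-wise round-to-nearest quantization in pure Python. It is not AWQ itself: AWQ additionally rescales weights based on activation statistics before quantizing, and that step is omitted here. All function names are hypothetical.

```python
import math


def quantize_group(weights, bits=4):
    """Asymmetric round-to-nearest quantization of one group of floats."""
    qmax = (1 << bits) - 1                  # 15 for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0         # fall back to 1.0 for a constant group
    zero = round(-lo / scale)               # integer zero point
    q = [max(0, min(qmax, round(w / scale) + zero)) for w in weights]
    return q, scale, zero


def dequantize_group(q, scale, zero):
    """Map the stored integers back to approximate float weights."""
    return [(v - zero) * scale for v in q]


def quantize(weights, group_size=128, bits=4):
    """Quantize a flat weight list group by group; each group has its own scale."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    out = []
    for q, scale, zero in groups:
        out.extend(dequantize_group(q, scale, zero))
    return out


if __name__ == "__main__":
    w = [0.1 * math.sin(i) for i in range(256)]
    restored = dequantize(quantize(w))
    print("max abs error:", max(abs(a - b) for a, b in zip(w, restored)))
```

Each group stores small integers plus one scale and one zero point, which is why a smaller group size (for example 32 instead of 128) tends to be more accurate but produces a larger file.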

Supported platforms

Text Generation Webui - using Loader: AutoAWQ
vLLM - Llama and Mistral models only
Hugging Face Text Generation Inference (TGI)
AutoAWQ - for use from Python code

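📦 Installation guide

Using the model in text-generation-webui

Make sure you are using the latest version of text-generation-webui. Using the text-generation-webui one-click installer is strongly recommended unless you are sure you know how to install manually.

1. Click the Model tab.
2. Under Download custom model or LoRA, enter TheBloke/deepseek-coder-6.7B-instruct-AWQ.
3. Click Download.
4. The model will start downloading. Once it has finished, it will say "Done".
5. In the top left, click the refresh icon next to Model.
6. In the Model dropdown, choose the model you just downloaded: deepseek-coder-6.7B-instruct-AWQ.
7. Select Loader: AutoAWQ.
8. Click Load. The model will load and is then ready for use.
9. If you want custom settings, set them, click Save settings for this model, then click Reload the Model in the top right.
10. Once you are ready, click the Text Generation tab and enter a prompt to get started.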

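Multi-user inference server: vLLM

See the vLLM documentation for installation and usage instructions.

Make sure you are using vLLM version 0.2 or later.
When using vLLM as a server, pass the --quantization awq parameter.

For example:

python3 -m vllm.entrypoints.api_server --model TheBloke/deepseek-coder-6.7B-instruct-AWQ --quantization awq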
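
Once the server is running it can be queried over HTTP. The following is a minimal sketch using only the standard library. It assumes the demo api_server listens on localhost:8000 and exposes a POST /generate endpoint that returns a JSON object with a "text" list; these details are assumptions and should be checked against your installed vLLM version. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed default address of the vLLM demo API server; adjust to your setup.
API_URL = "http://localhost:8000/generate"


def build_payload(prompt, max_tokens=256, temperature=0.7, top_p=0.95):
    """Build the JSON body: the prompt plus sampling parameters."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate(prompt, url=API_URL, timeout=60):
    """POST the prompt to the server and return the list of generated texts."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["text"]  # assumed response shape


if __name__ == "__main__":
    # Requires a running server; the prompt should already follow the prompt template.
    print(generate("### Instruction:\nWrite hello world in C.\n### Response:\n"))
```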

Multi-user inference server: Hugging Face Text Generation Inference (TGI)

Use TGI version 1.1.0 or later. The official Docker container is: ghcr.io/huggingface/text-generation-inference:1.1.0

Example Docker parameters:

--model-id TheBloke/deepseek-coder-6.7B-instruct-AWQ --port 3000 --quantize awq --max-input-length 3696 --max-total-tokens 4096 --max-batch-prefill-tokens 4096
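
With these parameters the prompt and the generated tokens share one budget: a prompt may use at most 3696 tokens, and prompt plus output may not exceed 4096. A small sketch of that arithmetic follows; the function name is hypothetical and the limits are simply the values from the example parameters above.

```python
# Limits taken from the example Docker parameters above.
MAX_INPUT_LENGTH = 3696
MAX_TOTAL_TOKENS = 4096


def max_new_tokens_allowed(prompt_tokens: int) -> int:
    """Return how many tokens can still be generated for a prompt of this size."""
    if prompt_tokens > MAX_INPUT_LENGTH:
        raise ValueError(
            f"prompt has {prompt_tokens} tokens, limit is {MAX_INPUT_LENGTH}"
        )
    return MAX_TOTAL_TOKENS - prompt_tokens


if __name__ == "__main__":
    # A maximum-length prompt leaves 4096 - 3696 = 400 tokens for the answer.
    print(max_new_tokens_allowed(3696))
```

Raising --max-total-tokens gives more room for output, at the cost of more GPU memory per request.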

Example Python code for interfacing with TGI (requires huggingface-hub 0.17.0 or later):

pip3 install huggingface-hub

from huggingface_hub import InferenceClient

endpoint_url = "https://your-endpoint-url-here"

prompt = "Tell me about AI"
prompt_template=f'''You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.
### Instruction:
{prompt}
### Response:
'''

client = InferenceClient(endpoint_url)
# Send the full templated prompt, not the bare question
response = client.text_generation(prompt_template,
                                  max_new_tokens=128,
                                  do_sample=True,
                                  temperature=0.7,
                                  top_p=0.95,
                                  top_k=40,
                                  repetition_penalty=1.1)

print("Model output: ", response)

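Inference from Python code using AutoAWQ

Install the AutoAWQ package

Requires AutoAWQ 0.1.1 or later.

pip3 install autoawq

If you have problems installing AutoAWQ from the pre-built wheels, install it from source instead: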

pip3 uninstall -y autoawq
git clone https://github.com/casper-hansen/AutoAWQ
cd AutoAWQ
pip3 install .

💻 Usage examples

Basic usage

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_name_or_path = "TheBloke/deepseek-coder-6.7B-instruct-AWQ"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=False)
# Load model
model = AutoAWQForCausalLM.from_quantized(model_name_or_path, fuse_layers=True,
                                          trust_remote_code=False, safetensors=True)

prompt = "Tell me about AI"
prompt_template=f'''You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.
### Instruction:
{prompt}
### Response:
'''

print("*** Running model.generate:")

token_input = tokenizer(
    prompt_template,
    return_tensors='pt'
).input_ids.cuda()

# Generate output
generation_output = model.generate(
    token_input,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    max_new_tokens=512
)

# Get the tokens from the output, decode them, print them
token_output = generation_output[0]
text_output = tokenizer.decode(token_output)
print("LLM output: ", text_output)

"""
# Inference should be possible with transformers pipeline as well in future
# But currently this is not yet supported by AutoAWQ (correct as of September 25th 2023)
from transformers import pipeline

print("*** Pipeline:")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1
)

print(pipe(prompt_template)[0]['generated_text'])
"""
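
The AutoAWQ example above decodes the whole generated sequence, so the printed text includes the prompt as well as the answer. A minimal sketch for keeping only the answer is shown below. It assumes the decoded text contains the `### Response:` marker from the prompt template and may end with the literal `<|EOT|>` token; the function name is hypothetical.

```python
RESPONSE_MARKER = "### Response:"
EOT_TOKEN = "<|EOT|>"  # end-of-turn token; assumed to appear literally when decoded


def extract_response(decoded: str) -> str:
    """Return only the text after the first response marker, without the EOT token."""
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    answer = tail if marker else decoded  # no marker found: keep the text unchanged
    return answer.replace(EOT_TOKEN, "").strip()


if __name__ == "__main__":
    sample = "### Instruction:\nSay hi\n### Response:\nHello!\n<|EOT|>"
    print(extract_response(sample))
```

Alternatively, slicing the output tokens by the input length before decoding, as the DeepSeek example further down this page does, avoids string matching altogether.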

Advanced usage

When using vLLM from Python, generation can be tuned through the sampling parameters:

from vllm import LLM, SamplingParams

prompts = [
    "Tell me about AI",
    "Write a story about llamas",
    "What is 291 - 150?",
    "How much wood would a woodchuck chuck if a woodchuck could chuck wood?",
]
# Plain string (not an f-string) so that .format() below can fill in {prompt}
prompt_template='''You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.
### Instruction:
{prompt}
### Response:
'''

prompts = [prompt_template.format(prompt=prompt) for prompt in prompts]

sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="TheBloke/deepseek-coder-6.7B-instruct-AWQ", quantization="awq", dtype="auto")

outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

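📚 Detailed documentation

Model details

deepseek-coder-6.7b-instruct is a 6.7B-parameter model initialized from deepseek-coder-6.7b-base and fine-tuned on 2B tokens of instruction data.

Home page: DeepSeek
Repository: deepseek-ai/deepseek-coder
Chat with DeepSeek Coder: DeepSeek-Coder

Usage example (original, unquantized model)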

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True).cuda()
messages=[
    { 'role': 'user', 'content': "write a quick sort algorithm in python."}
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
# 32021 is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=32021)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))

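License

This code repository is licensed under the MIT License. Use of the DeepSeek Coder models is subject to the Model License. DeepSeek Coder supports commercial use. See LICENSE-MODEL for more details.

Contact

If you have any questions, please raise an issue or contact us at agi_code@deepseek.com.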

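🔧 Technical details

Provided files and AWQ parameters

For the first release of AWQ models, only 128g models are provided. A 32g model may be added if there is interest and once perplexity and evaluation comparisons have been done, but at this time 32g models are still not fully tested with AutoAWQ and vLLM.

Models are released as sharded safetensors files.

Branch	Bits	Group size	AWQ dataset	Sequence length	Size
main	4	128	Evol Instruct Code	16384	3.89 GB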
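
As a back-of-the-envelope check on the file size in the table, the sketch below estimates the storage needed for 6.7 billion weights at 4 bits with a group size of 128. It assumes, purely for illustration, one 16-bit scale and one 4-bit zero point per group and that every parameter is quantized; the real storage layout may differ, and the function name is hypothetical.

```python
def estimate_quantized_bytes(n_params, bits=4, group_size=128,
                             scale_bits=16, zero_bits=4):
    """Rough size estimate: packed weights plus per-group scale and zero point."""
    weight_bits = n_params * bits
    n_groups = n_params / group_size
    overhead_bits = n_groups * (scale_bits + zero_bits)  # assumed per-group metadata
    return (weight_bits + overhead_bits) / 8


if __name__ == "__main__":
    size_gb = estimate_quantized_bytes(6.7e9) / 1e9
    print(f"estimated size: {size_gb:.2f} GB")
```

The estimate comes out somewhat below the listed 3.89 GB. A plausible reason is that some tensors, such as the embeddings, are usually kept in 16-bit precision rather than quantized, but that is not confirmed by this page.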

Compatibility

The provided files are tested to work with:

text-generation-webui using Loader: AutoAWQ
vLLM version 0.2.0 and later
Hugging Face Text Generation Inference (TGI) version 1.1.0 and later
AutoAWQ version 0.1.1 and later

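📄 License

This project follows the DeepSeek Coder license; see LICENSE for details.

Discord

For further support, and for discussion of these models and AI in general, join us at: TheBloke AI's Discord server

Thanks, and how to contribute

Thanks to the chirper.ai team! Thanks to Clay from gpus.llm-utils.org!

Many people have asked whether they can contribute. I enjoy providing models and helping people, and would love to spend even more time doing it, as well as expanding into new projects such as fine-tuning and training.

If you are able and willing to contribute, it will be most gratefully received and will help me keep providing more models and start work on new AI projects.

Donors will get priority support on all AI/LLM/model questions and requests, access to a private Discord room, and other benefits.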

Patreon: https://patreon.com/TheBlokeAI
Ko-Fi: https://ko-fi.com/TheBlokeAI

Special thanks to: Aemon Algiz.

Patreon special mentions: Brandon Frisco, LangChain4j, Spiking Neurons AB, transmissions 11, Joseph William Delisle, Nitin Borwankar, Willem Michiel, Michael Dempsey, vamX, Jeffrey Morgan, zynix, jjj, Omer Bin Jawed, Sean Connelly, jinyuan sun, Jeromy Smith, Shadi, Pawan Osman, Chadd, Elijah Stavena, Illia Dulskyi, Sebastain Graf, Stephen Murray, terasurfer, Edmond Seymore, Celu Ramasamy, Mandus, Alex, biorpg, Ajan Kanaga, Clay Pascal, Raven Klaugh, 阿明, K, ya boyyy, usrbinkat, Alicia Loh, John Villwock, ReadyPlayerEmma, Chris Smitley, Cap'n Zoog, fincy, GodLy, S_X, sidney chen, Cory Kujawski, OG, Mano Prime, AzureBlack, Pieter, Kalila, Spencer Kim, Tom X Nguyen, Stanislav Ovsiannikov, Michael Levine, Andrey, Trailburnt, Vadim, Enrico Ros, Talal Aujan, Brandon Phillips, Jack West, Eugene Pentland, Michael Davis, Will Dee, webtim, Jonathan Leane, Alps Aficionado, Rooh Singh, Tiffany J. Kim, theTransient, Luke @flexchar, Elle, Caitlyn Gatomon, Ari Malik, subjectnull, Johann-Peter Hartmann, Trenton Dambrowitz, Imad Khwaja, Asp the Wyvern, Emad Mostaque, Rainer Wilmers, Alexandros Triantafyllidis, Nicholas, Pedro Madruga, SuperWojo, Harry Royden McLaughlin, James Bentley, Olakabola, David Ziegler, Ai Maven, Jeff Scroggin, Nikolai Manek, Deo Leter, Matthew Berman, Fen Risland, Ken Nordquist, Manuel Alberto Morcote, Luke Pendergrass, TL, Fred von Graf, Randy H, Dan Guido, NimbleBox.ai, Vitor Caleffi, Gabriel Tamborski, knownsqashed, Lone Striker, Erik Bjäreholt, John Detwiler, Leonard Tan, Iucharbius

Thank you to all my generous patrons and donors! And thank you again to a16z for their generous grant.