NanoLM-1B-Instruct-v1.1 open-source language model - free, multi-domain English text generation

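NanoLM 1B Instruct v1.1

Developed by Mxode

NanoLM-1B-Instruct-v1.1 is a small, instruction-tuned language model with 1 billion parameters that supports English text generation across multiple domains.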

Large language model

Safetensors

License: GPL-3.0 #small-model-optimization #multi-domain-instructions #english-text-generation

Downloads: 24

Released: 9/7/2024

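Model overview

This model is part of the NanoLM series, which focuses on exploring the potential of small models. It is instruction-tuned and suited to English text generation in domains including chemistry, biology, finance, law, music, art, programming, climate, and medicine.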

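Model features

Multi-domain applicability

Supports text generation in specialized domains including chemistry, biology, finance, law, music, art, programming, climate, and medicine

Efficient inference

As a small model with 1 billion parameters, it is suited to deployment in resource-constrained environments

Instruction tuning

Instruction-tuned to better understand and follow user instructions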

Model capabilities

Text generation

Multi-domain knowledge question answering

Instruction following

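Use cases

Specialized-domain assistance

Research writing assistance

Helps researchers draft research reports or paper abstracts in fields such as chemistry and biology

Legal document generation

Assists in drafting basic legal documents or contract clauses

Educational applications

Programming teaching assistant

Explains programming concepts to students or generates sample code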

🚀 NanoLM-1B-Instruct-v1.1

NanoLM-1B-Instruct-v1.1 is one of a series of models built to explore the potential of small models. It currently supports English only.

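🚀 Quick start

To explore the potential of small models, I have tried building a series of them, which can be found in NanoLM Collections.

This is NanoLM-1B-Instruct-v1.1. The model currently supports English only.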

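✨ Key features

This model focuses on the small-model regime, exploring what is achievable at a limited parameter count. It covers text generation in chemistry, biology, finance, law, music, art, code, climate, medicine, and other domains.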

📚 Detailed documentation

Model details

Nano LMs	Non-embedding params	Architecture	Layers	Dimension	Heads	Sequence length
25M	15M	MistralForCausalLM	12	312	12	2K
70M	42M	LlamaForCausalLM	12	576	9	2K
0.3B	180M	Qwen2ForCausalLM	12	896	14	4K
1B	840M	Qwen2ForCausalLM	18	1536	12	4K
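
As a rough aid for reading the table, the sketch below derives two quantities from it: the per-head dimension (dimension divided by heads) and a lower bound on weight memory in bfloat16, the dtype used when loading the model in the usage example. The `NANO_LMS` dictionary and both helper names are illustrative and not part of any NanoLM tooling. The memory figure assumes 2 bytes per parameter and covers only the non-embedding parameters listed above, so embeddings, activations, and the KV cache come on top of it.

```python
# Values copied from the table above:
# (non-embedding params, hidden dimension, attention heads).
NANO_LMS = {
    "25M": (15_000_000, 312, 12),
    "70M": (42_000_000, 576, 9),
    "0.3B": (180_000_000, 896, 14),
    "1B": (840_000_000, 1536, 12),
}

BYTES_PER_BF16_PARAM = 2  # bfloat16 stores each parameter in 16 bits


def head_dim(dim: int, heads: int) -> int:
    """Per-head dimension, assuming the hidden dimension is split evenly across heads."""
    if dim % heads != 0:
        raise ValueError("dimension is not divisible by the number of heads")
    return dim // heads


def bf16_weight_megabytes(non_embedding_params: int) -> float:
    """Lower bound on weight memory in MB (10**6 bytes), non-embedding parameters only."""
    return non_embedding_params * BYTES_PER_BF16_PARAM / 1e6


for name, (params, dim, heads) in NANO_LMS.items():
    print(
        f"{name}: head_dim={head_dim(dim, heads)}, "
        f"bf16 weights >= {bf16_weight_megabytes(params):.0f} MB"
    )
```

For the 1B row this gives a per-head dimension of 128 and at least 1680 MB of non-embedding weights in bfloat16.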

💻 Usage examples

Basic usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = 'Mxode/NanoLM-1B-Instruct-v1.1'

model = AutoModelForCausalLM.from_pretrained(model_path).to('cuda:0', torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_path)


def get_response(prompt: str, **kwargs):
    generation_args = dict(
        max_new_tokens = kwargs.pop("max_new_tokens", 512),
        do_sample = kwargs.pop("do_sample", True),
        temperature = kwargs.pop("temperature", 0.7),
        top_p = kwargs.pop("top_p", 0.8),
        top_k = kwargs.pop("top_k", 40),
        **kwargs
    )

    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
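    # Tokenize the templated prompt and move the tensors to the model's device.
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Pass the attention mask explicitly so generate() does not have to infer it.
    generated_ids = model.generate(
        model_inputs.input_ids,
        attention_mask=model_inputs.attention_mask,
        **generation_args
    )
    # Drop the prompt tokens so only the newly generated tokens are decoded.
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]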

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response


prompt = "Calculate (4 - 1)^(9 - 5)"
print(get_response(prompt, do_sample=False))

"""
The expression (4 - 1)^(9 - 5) can be simplified as follows:

(4 - 1) = 3

So the expression becomes 3^(9 - 5)

3^(9 - 5) = 3^4

3^4 = 81

Therefore, (4 - 1)^(9 - 5) = 81.
"""
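
Because a model of this size can make arithmetic mistakes, it may be worth checking numeric answers independently. The sketch below recomputes the expression from the example prompt in plain Python and compares it with the last number in the final line of the sample output shown above. `sample_response` is copied from that output, and `final_number` is a hypothetical helper, not part of the model or of `transformers`.

```python
import re


def final_number(text: str) -> int:
    """Return the last integer that appears in the text."""
    numbers = re.findall(r"-?\d+", text)
    if not numbers:
        raise ValueError("no integer found in text")
    return int(numbers[-1])


# Recompute the prompt's expression directly: (4 - 1)^(9 - 5) = 3^4.
expected = (4 - 1) ** (9 - 5)

# Final line of the sample output above.
sample_response = "Therefore, (4 - 1)^(9 - 5) = 81."

print(expected, final_number(sample_response) == expected)
```

Taking the last integer is a simplification that works when the response ends by stating its answer, as the sample does. Responses phrased differently would need a more careful parser.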