NanoLM-1B-Instruct-v2: An Open-Source 1B Model Fine-Tuned on Over 4 Million Instruction Data Points


Developed by Mxode

NanoLM-1B-Instruct-v2 is a 1B-parameter model fine-tuned on over 4 million high-quality instruction data points, built to explore the potential of small models.

Large Language Model

Safetensors

Language: English
License: GPL-3.0
Tags: Small Model Fine-tuning, High-quality Instructions, 4K Long Context


Release Date: 9/7/2024

Model Overview

This is a 1B-parameter instruction fine-tuned model, part of the NanoLM series, designed to explore the performance of small models in various tasks.

Model Features

Efficient Small Model

With only 1B parameters, it scores 44.1 on GSM8K in the Metrics table, ahead of Qwen1.5-1.8B (33.6) and Mistral-7B-v0.1 (37.83), though behind Qwen2-1.5B (55.8)

High-quality Instruction Fine-tuning

Fine-tuned on over 4 million high-quality instruction data points

Long Context Support

Supports a context window of up to 4K tokens

Model Capabilities

Text Generation

Mathematical Calculation

Logical Reasoning

Instruction Following

Use Cases

Educational Assistance

Math Problem Solving

Solving various mathematical calculation and reasoning problems

Achieves 44.1% accuracy on the GSM8K math test set

General Assistant

Daily Q&A

Answering various user questions and requests

🚀 NanoLM-1B-Instruct-v2

NanoLM-1B-Instruct-v2 is a model fine-tuned on over 4 million high-quality instruction data points, aiming to explore the potential of small models.

🚀 Quick Start

In order to explore the potential of small models, I have attempted to build a series of them, which are available in the NanoLM Collections.

This is NanoLM-1B-Instruct-v2, fine-tuned on over 4 million high-quality instruction data points.

The model currently supports English only.

✨ Features

Fine-tuned on over 4 million high-quality instruction data points.
Part of the NanoLM series exploring small-model potential.

📚 Documentation

Model Details

Property	Details
Model Type	Part of the NanoLM series; this model is NanoLM-1B-Instruct-v2.
Training Data	Fine-tuned on over 4 million high-quality instruction data points from the dataset Mxode/Magpie-Pro-10K-GPT4o-mini.

Nano LMs	Non-emb Params	Arch	Layers	Dim	Heads	Seq Len
25M	15M	MistralForCausalLM	12	312	12	2K
70M	42M	LlamaForCausalLM	12	576	9	2K
0.3B	180M	Qwen2ForCausalLM	12	896	14	4K
1B	840M	Qwen2ForCausalLM	18	1536	12	4K
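
As a quick reading aid for the table above, the per-head width of each model can be derived from the Dim and Heads columns. This is a minimal sketch that assumes the head dimension equals Dim divided by Heads, which is the usual default for these architectures but is not confirmed by the source (a model config can override it). The names `NANO_LMS` and `head_dim` are made up for illustration.

```python
# Dim and Heads values copied from the Nano LMs table above.
NANO_LMS = {
    "25M": {"dim": 312, "heads": 12},
    "70M": {"dim": 576, "heads": 9},
    "0.3B": {"dim": 896, "heads": 14},
    "1B": {"dim": 1536, "heads": 12},
}


def head_dim(dim: int, heads: int) -> int:
    """Per-head width, ASSUMING head_dim = dim / heads (not stated in the source)."""
    if dim % heads != 0:
        raise ValueError(f"dim={dim} is not divisible by heads={heads}")
    return dim // heads


head_dims = {name: head_dim(**cfg) for name, cfg in NANO_LMS.items()}

for name, width in head_dims.items():
    print(f"{name}: {width} dimensions per attention head")
```

Under that assumption, the 1B model uses the widest heads of the series.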

Metrics

Benchmark	NanoLM-1B-Instruct-v2	Tinyllama-1.1B	Gemma-2B	Qwen1.5-1.8B	Qwen2-1.5B	Qwen1.5-4B	Mistral-7B-v0.1	Mistral-7B-v0.3	Qwen1.5-7B
GSM8K	44.1	2.3	17.7	33.6	55.8	52.2	37.83	34.5	53.5
MATH	14.8	0.7	11.8	10.1	21.7	10.0	8.48	-	20.3
BBH	0.42	0.30	0.35	0.35	0.36	0.41	0.44	0.45	0.46
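
To see where NanoLM-1B-Instruct-v2 falls among the compared models, the table can be ranked per benchmark. The sketch below only re-sorts the numbers reported above and adds no new results. The helper names `rank` and `position` are illustrative, and the missing MATH score for Mistral-7B-v0.3 is represented as `None` and skipped.

```python
# Scores copied from the Metrics table above; None marks the missing entry ("-").
SCORES = {
    "NanoLM-1B-Instruct-v2": {"GSM8K": 44.1, "MATH": 14.8, "BBH": 0.42},
    "Tinyllama-1.1B": {"GSM8K": 2.3, "MATH": 0.7, "BBH": 0.30},
    "Gemma-2B": {"GSM8K": 17.7, "MATH": 11.8, "BBH": 0.35},
    "Qwen1.5-1.8B": {"GSM8K": 33.6, "MATH": 10.1, "BBH": 0.35},
    "Qwen2-1.5B": {"GSM8K": 55.8, "MATH": 21.7, "BBH": 0.36},
    "Qwen1.5-4B": {"GSM8K": 52.2, "MATH": 10.0, "BBH": 0.41},
    "Mistral-7B-v0.1": {"GSM8K": 37.83, "MATH": 8.48, "BBH": 0.44},
    "Mistral-7B-v0.3": {"GSM8K": 34.5, "MATH": None, "BBH": 0.45},
    "Qwen1.5-7B": {"GSM8K": 53.5, "MATH": 20.3, "BBH": 0.46},
}


def rank(benchmark: str) -> list[tuple[str, float]]:
    """Models with a reported score on `benchmark`, best first."""
    scored = [
        (name, scores[benchmark])
        for name, scores in SCORES.items()
        if scores[benchmark] is not None
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def position(model: str, benchmark: str) -> tuple[int, int]:
    """1-based rank of `model` and the number of models with a score."""
    names = [name for name, _ in rank(benchmark)]
    return names.index(model) + 1, len(names)


for bench in ("GSM8K", "MATH", "BBH"):
    place, total = position("NanoLM-1B-Instruct-v2", bench)
    print(f"{bench}: rank {place} of {total}")
```

On these numbers the model places fourth of nine on GSM8K and BBH and third of eight on MATH, with Qwen2-1.5B and Qwen1.5-7B ahead of it on both math benchmarks.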

💻 Usage Examples

Basic Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = 'Mxode/NanoLM-1B-Instruct-v2'

model = AutoModelForCausalLM.from_pretrained(model_path).to('cuda:0', torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_path)


def get_response(prompt: str, **kwargs):
    generation_args = dict(
        max_new_tokens = kwargs.pop("max_new_tokens", 512),
        do_sample = kwargs.pop("do_sample", True),
        temperature = kwargs.pop("temperature", 0.7),
        top_p = kwargs.pop("top_p", 0.8),
        top_k = kwargs.pop("top_k", 40),
        **kwargs
    )

    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Pass the full tokenizer output so the attention mask is forwarded along with input_ids.
    generated_ids = model.generate(**model_inputs, **generation_args)
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response


prompt = "Calculate (99 - 1) * (3 + 4)"
print(get_response(prompt, do_sample=False))

"""
To calculate \((99 - 1) * (3 + 4)\), follow the order of operations, also known as PEMDAS (Parentheses, Exponents, Multiplication and Division, and Addition and Subtraction).

First, solve the expressions inside the parentheses:

1. \(99 - 1 = 98\)
2. \(3 + 4 = 7\)

Now, multiply the results:

\(98 * 7 = 686\)

So, \((99 - 1) * (3 + 4) = 686\).
"""

📄 License

This project is licensed under the GPL-3.0 license.
