Cotype-Nano
Cotype-Nano is a lightweight LLM designed to perform tasks with minimal resources. It is optimized for fast and efficient interaction with users and provides high performance even under resource-constrained conditions.
Quick Start
Inference with vLLM
python3 -m vllm.entrypoints.openai.api_server --model MTSAIR/Cotype-Nano --port 8000
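Once the server is up, you can sanity-check it by querying the OpenAI-compatible /v1/models endpoint before wiring up a client. This is a minimal sketch, assuming the default host and port from the command above and that the requests package is available:

```python
import requests

# Ask the running vLLM server which models it is serving.
# Assumes it was started on localhost:8000 as shown above.
resp = requests.get("http://localhost:8000/v1/models", timeout=10)
resp.raise_for_status()
print(resp.json())  # should list "MTSAIR/Cotype-Nano"
```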
Recommended generation parameters and system prompt
import openai

openai.api_key = 'xxx'
endpoint = 'http://localhost:8000/v1'
model = 'MTSAIR/Cotype-Nano'
openai.api_base = endpoint

# Possible system prompt:
# {"role": "system", "content": "You are an AI assistant. You are given a task: generate a detailed and comprehensive answer."},

response = openai.ChatCompletion.create(
    model=model,
    temperature=0.4,  # 0.0 is also allowed
    frequency_penalty=0.0,
    max_tokens=2048,
    top_p=0.8,  # 0.1 is also allowed
    messages=[
        {"role": "user", "content": "Tell me how to train the meta-llama/Llama-3.2-1B model using the transformers library?"}
    ]
)

answer = response["choices"][0]["message"]["content"]
print(answer)
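The snippet above targets the pre-1.0 openai Python package (module-level api_base and ChatCompletion). If you have openai>=1.0 installed, a client object is used instead; the sketch below keeps the same endpoint and recommended generation parameters:

```python
from openai import OpenAI

# Same local vLLM endpoint as above; vLLM does not validate the API key by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="xxx")

response = client.chat.completions.create(
    model="MTSAIR/Cotype-Nano",
    temperature=0.4,
    frequency_penalty=0.0,
    top_p=0.8,
    max_tokens=2048,
    messages=[
        {"role": "user", "content": "Tell me how to train the meta-llama/Llama-3.2-1B model using the transformers library?"}
    ],
)
print(response.choices[0].message.content)
```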
Inference with Hugging Face
from transformers import pipeline

pipe = pipeline("text-generation", model="MTSAIR/Cotype-Nano", device="cuda")

messages = [
    {"role": "system", "content": "You are an AI assistant. You are given a task: generate a detailed and comprehensive answer."},
    {"role": "user", "content": "Tell me about AI"},
]

res = pipe(messages, max_length=1024)
print(res[0]['generated_text'][-1]['content'])
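If you need finer control over decoding than the pipeline exposes, the same chat can be run through the tokenizer's chat template and model.generate. This is a minimal sketch, assuming a CUDA device and reusing the generation parameters recommended above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MTSAIR/Cotype-Nano"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="cuda")

messages = [
    {"role": "system", "content": "You are an AI assistant. You are given a task: generate a detailed and comprehensive answer."},
    {"role": "user", "content": "Tell me about AI"},
]

# Build the prompt with the model's chat template, then sample a reply.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=1024, do_sample=True, temperature=0.4, top_p=0.8)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```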
Example request
Request: Write a simple server with FastAPI
Cotype-Nano response:
FastAPI is a framework that simplifies the creation of APIs in Python, allowing you to quickly build high-performance web applications. To use FastAPI, you need to follow these steps:

- Install FastAPI and Uvicorn.
- Create an app instance.
- Define API endpoints.
- Run the application.

Step 1: Install FastAPI and Uvicorn

To install FastAPI and Uvicorn, you can use the following command:

pip install fastapi uvicorn

Step 2: Create an app instance

Create an app instance using the following code:

from fastapi import FastAPI

app = FastAPI()

Step 3: Define API endpoints

Define an API endpoint, for example, a GET endpoint:

@app.get("/")
async def read_root():
    return {"message": "Hello, FastAPI!"}

Step 4: Run the application

Run the application using Uvicorn:

uvicorn main:app --reload

Here, main is the name of the Python file and app is the FastAPI instance. The --reload option enables auto-reload during development.

After running the application, you can access it at http://127.0.0.1:8000. When you make a GET request to /, you will receive the following response:

{ "message": "Hello, FastAPI!" }
If you want to create more complex endpoints, you can define different routes and request methods according to your needs.
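For instance, a more complex route could accept a JSON body validated by a Pydantic model; the sketch below is purely illustrative (the Item model and the /items/ path are invented for the example):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Illustrative request schema; FastAPI validates incoming JSON against it.
class Item(BaseModel):
    name: str
    price: float

# A POST endpoint that accepts an Item and echoes it back.
@app.post("/items/")
async def create_item(item: Item):
    return {"name": item.name, "price": item.price}
```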
Technical Details
Training process
The model was trained in two stages. In the first stage, MLP layers were trained on mathematics and code. In the second stage, the entire model was trained on internal and open synthetic instructional datasets.
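A rough sketch of this two-stage recipe is shown below: stage one freezes everything except the MLP (feed-forward) layers, stage two unfreezes the full model. The module naming (feed-forward blocks called "mlp", as in Qwen2-style architectures) and the use of plain transformers are assumptions made for illustration, not the actual training setup.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("MTSAIR/Cotype-Nano")

# Stage 1: make only the MLP layers trainable before fine-tuning on math and code data.
# Assumes feed-forward blocks are named "mlp", as in Qwen2-style models.
for name, param in model.named_parameters():
    param.requires_grad = ".mlp." in name

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Stage 1 trainable share: {trainable / total:.1%}")

# Stage 2: unfreeze the whole model before fine-tuning on instruction data.
for param in model.parameters():
    param.requires_grad = True
```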
ru-llm-arena: 30.2 (local measurement)
| Property | Details |
|---|---|
| Model Type | Cotype-Nano |
| Score | 30.2 |
| 95% CI | +2.2 / -1.3 |
| Avg. #Tokens | 542 |
| Model | Score | 95% CI | Avg. #Tokens |
|---|---|---|---|
| Cotype-Nano | 30.2 | +2.2 / -1.3 | 542 |
| vikhr-it-5.3-fp16-32k | 27.8 | +1.5 / -2.1 | 519.71 |
| vikhr-it-5.3-fp16 | 22.73 | +1.8 / -1.7 | 523.45 |
| Cotype-Nano-4bit | 22.5 | +2.1 / -1.4 | 582 |
| kolibri-vikhr-mistral-0427 | 22.41 | +1.6 / -1.9 | 489.89 |
| snorkel-mistral-pairrm-dpo | 22.41 | +1.7 / -1.6 | 773.8 |
| storm-7b | 20.62 | +1.4 / -1.6 | 419.32 |
| neural-chat-7b-v3-3 | 19.04 | +1.8 / -1.5 | 927.21 |
| Vikhrmodels-Vikhr-Llama-3.2-1B-instruct | 19.04 | +1.2 / -1.5 | 958.63 |
| gigachat_lite | 17.2 | +1.5 / -1.5 | 276.81 |
| Vikhrmodels-Vikhr-Qwen-2.5-0.5b-Instruct | 16.5 | +1.5 / -1.7 | 583.5 |
| Qwen-Qwen2.5-1.5B-Instruct | 16.46 | +1.3 / -1.3 | 483.67 |
| Vikhrmodels-vikhr-qwen-1.5b-it | 13.19 | +1.3 / -1.1 | 2495.38 |
| meta-llama-Llama-3.2-1B-Instruct | 4.04 | +0.6 / -0.8 | 1240.53 |
| Qwen-Qwen2.5-0.5B-Instruct | 4.02 | +0.7 / -0.8 | 829.87 |
License
The model is under the Apache 2.0 license.

