Minueza 32M Chat
Minueza-32M-Chat is a chat model with 32 million parameters, based on Felladrin/Minueza-32M-Base and trained with supervised fine-tuning (SFT) and direct preference optimization (DPO).
Downloads 77
Release Time: 2/25/2024
Model Overview
This is a small yet efficient chat model suitable for various dialogue scenarios, capable of providing helpful responses and suggestions.
Model Features
Compact and Efficient
With only 32 million parameters, it delivers usable dialogue capabilities through careful two-stage training (SFT followed by DPO).
Multi-dataset Training
Fine-tuned on a mix of high-quality instruction datasets, including Dolly, WebGLM-QA, and Capybara.
Direct Preference Optimization
Further aligned with DPO to improve the quality of its responses.
Model Capabilities
Text Generation
Dialogue Interaction
Q&A System
Creative Writing
Career Counseling
Health Advice
Use Cases
Dialogue Systems
Career Counseling
Provides career development advice and guidance to users
Offers personalized career suggestions based on user skills and interests
Knowledge Q&A
Health Advice
Answers questions about healthy lifestyles
Provides common-sense health improvement suggestions
Creative Generation
Game Setting Creation
Generates fantasy game settings based on user requests
Creates imaginative game worlds and characters
---
language:
- en
license: apache-2.0
datasets:
- databricks/databricks-dolly-15k
- Felladrin/ChatML-databricks-dolly-15k
- euclaise/reddit-instruct-curated
- Felladrin/ChatML-reddit-instruct-curated
- THUDM/webglm-qa
- Felladrin/ChatML-WebGLM-QA
- starfishmedical/webGPT_x_dolly
- Felladrin/ChatML-webGPT_x_dolly
- LDJnr/Capybara
- Felladrin/ChatML-Capybara
- Open-Orca/SlimOrca-Dedup
- Felladrin/ChatML-SlimOrca-Dedup
- HuggingFaceH4/ultrachat_200k
- Felladrin/ChatML-ultrachat_200k
- nvidia/HelpSteer
- Felladrin/ChatML-HelpSteer
- sablo/oasst2_curated
- Felladrin/ChatML-oasst2_curated
- CohereForAI/aya_dataset
- Felladrin/ChatML-aya_dataset
- argilla/distilabel-capybara-dpo-7k-binarized
- Felladrin/ChatML-distilabel-capybara-dpo-7k-binarized
- argilla/distilabel-intel-orca-dpo-pairs
- Felladrin/ChatML-distilabel-intel-orca-dpo-pairs
- argilla/ultrafeedback-binarized-preferences
- Felladrin/ChatML-ultrafeedback-binarized-preferences
- sablo/oasst2_dpo_pairs_en
- Felladrin/ChatML-oasst2_dpo_pairs_en
- NeuralNovel/Neural-DPO
- Felladrin/ChatML-Neural-DPO
base_model: Felladrin/Minueza-32M-Base
pipeline_tag: text-generation
widget:
- messages:
  - role: system
    content: You are a career counselor. The user will provide you with an individual looking for guidance in their professional life, and your task is to assist them in determining what careers they are most suited for based on their skills, interests, and experience. You should also conduct research into the various options available, explain the job market trends in different industries, and advice on which qualifications would be beneficial for pursuing particular fields.
  - role: user
    content: Heya!
  - role: assistant
    content: Hi! How may I help you?
  - role: user
    content: I am interested in developing a career in software engineering. What would you recommend me to do?
- messages:
  - role: system
    content: You are a highly knowledgeable assistant. Help the user as much as you can.
  - role: user
    content: How can I become a healthier person?
- messages:
  - role: system
    content: You are a helpful assistant who gives creative responses.
  - role: user
    content: Write the specs of a game about mages in a fantasy world.
- messages:
  - role: system
    content: You are a helpful assistant who answers user's questions with details.
  - role: user
    content: Tell me about the pros and cons of social media.
- messages:
  - role: system
    content: You are a helpful assistant who answers user's questions with details and curiosity.
  - role: user
    content: What are some potential applications for quantum computing?
inference:
  parameters:
    max_new_tokens: 250
    do_sample: true
    temperature: 0.65
    top_p: 0.55
    top_k: 35
    repetition_penalty: 1.176
model-index:
- name: Minueza-32M-Chat
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 20.39
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 26.54
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 25.75
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 47.27
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 50.99
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 0.0
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Felladrin/Minueza-32M-Chat
      name: Open LLM Leaderboard
---
Minueza-32M-Chat: A chat model with 32 million parameters
- Base model: Felladrin/Minueza-32M-Base
- Datasets used during SFT:
- [ChatML] databricks/databricks-dolly-15k
- [ChatML] euclaise/reddit-instruct-curated
- [ChatML] THUDM/webglm-qa
- [ChatML] starfishmedical/webGPT_x_dolly
- [ChatML] LDJnr/Capybara
- [ChatML] Open-Orca/SlimOrca-Dedup
- [ChatML] HuggingFaceH4/ultrachat_200k
- [ChatML] nvidia/HelpSteer
- [ChatML] sablo/oasst2_curated
- [ChatML] CohereForAI/aya_dataset
- Datasets used during DPO:
- [ChatML] argilla/distilabel-capybara-dpo-7k-binarized
- [ChatML] argilla/distilabel-intel-orca-dpo-pairs
- [ChatML] argilla/ultrafeedback-binarized-preferences
- [ChatML] sablo/oasst2_dpo_pairs_en
- [ChatML] NeuralNovel/Neural-DPO
- License: Apache License 2.0
- Availability in other ML formats:
Recommended Prompt Format
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
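If the tokenizer's chat template is not available in your runtime, the same prompt can be assembled by hand. The snippet below is a minimal sketch of the template above; the helper name build_chatml_prompt is illustrative and not part of the model's API.

# Minimal sketch: assembling the ChatML prompt by hand.
# The helper name is illustrative, not part of any library API.
def build_chatml_prompt(system_message: str, user_message: str) -> str:
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(build_chatml_prompt(
    "You are a helpful assistant.",
    "What are some potential applications for quantum computing?",
))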
Recommended Inference Parameters
do_sample: true
temperature: 0.65
top_p: 0.55
top_k: 35
repetition_penalty: 1.176
Usage Example
from transformers import pipeline
generate = pipeline("text-generation", "Felladrin/Minueza-32M-Chat")
messages = [
{
"role": "system",
"content": "You are a helpful assistant who answers the user's questions with details and curiosity.",
},
{
"role": "user",
"content": "What are some potential applications for quantum computing?",
},
]
prompt = generate.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
output = generate(
prompt,
max_new_tokens=256,
do_sample=True,
temperature=0.65,
top_k=35,
top_p=0.55,
repetition_penalty=1.176,
)
print(output[0]["generated_text"])
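If you need more control than the pipeline helper offers, the same generation can be done by loading the model and tokenizer directly. The sketch below assumes only the standard transformers AutoModelForCausalLM/AutoTokenizer APIs and reuses the recommended sampling parameters.

# Sketch: generation without the pipeline helper.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Felladrin/Minueza-32M-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant who answers the user's questions with details and curiosity."},
    {"role": "user", "content": "What are some potential applications for quantum computing?"},
]

# Build the ChatML prompt from the tokenizer's chat template.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.65,
    top_k=35,
    top_p=0.55,
    repetition_penalty=1.176,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))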
How it was trained
This model was trained with SFT Trainer and DPO Trainer, in several sessions, using the following settings:
For Supervised Fine-Tuning:
Hyperparameter | Value |
---|---|
learning_rate | 2e-5 |
total_train_batch_size | 24 |
max_seq_length | 2048 |
weight_decay | 0 |
warmup_ratio | 0.02 |
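The card does not include the training scripts, so the following is only a hedged sketch of how these SFT hyperparameters could be wired up with trl's SFTTrainer. The dataset choice, the 4 x 6 split of the total batch size of 24, the assumption that the ChatML dataset exposes a text column, and the trl version (one in which SFTTrainer still accepts max_seq_length and dataset_text_field directly) are all assumptions, not the author's actual setup.

# Hedged SFT sketch (illustrative, not the original training script).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

base_model = "Felladrin/Minueza-32M-Base"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# One of the ChatML-formatted SFT datasets; assumed to expose a "text" column.
train_dataset = load_dataset("Felladrin/ChatML-databricks-dolly-15k", split="train")

training_args = TrainingArguments(
    output_dir="minueza-32m-sft",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=6,  # 4 x 6 = total_train_batch_size of 24
    weight_decay=0.0,
    warmup_ratio=0.02,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=2048,
)
trainer.train()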
For Direct Preference Optimization:
Hyperparameter | Value |
---|---|
learning_rate | 7.5e-7 |
total_train_batch_size | 6 |
max_length | 2048 |
max_prompt_length | 1536 |
max_steps | 200 |
weight_decay | 0 |
warmup_ratio | 0.02 |
beta | 0.1 |
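Likewise, here is a hedged sketch of the DPO stage using trl's DPOTrainer with the hyperparameters above. The checkpoint path, the preference dataset, the 2 x 3 split of the total batch size of 6, and the trl version (one in which DPOTrainer accepts beta, max_length, and max_prompt_length directly) are assumptions for illustration only.

# Hedged DPO sketch (illustrative, not the original training script).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

# Placeholder path for the checkpoint produced by the SFT stage above.
sft_checkpoint = "path/to/minueza-32m-sft-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(sft_checkpoint)
model = AutoModelForCausalLM.from_pretrained(sft_checkpoint)

# One of the ChatML-formatted preference datasets; assumed to expose
# prompt/chosen/rejected columns.
train_dataset = load_dataset("Felladrin/ChatML-distilabel-capybara-dpo-7k-binarized", split="train")

training_args = TrainingArguments(
    output_dir="minueza-32m-dpo",
    learning_rate=7.5e-7,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=3,  # 2 x 3 = total_train_batch_size of 6
    max_steps=200,
    weight_decay=0.0,
    warmup_ratio=0.02,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # trl builds a frozen reference copy when None is passed
    args=training_args,
    beta=0.1,
    max_length=2048,
    max_prompt_length=1536,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()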
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 28.49 |
AI2 Reasoning Challenge (25-Shot) | 20.39 |
HellaSwag (10-Shot) | 26.54 |
MMLU (5-Shot) | 25.75 |
TruthfulQA (0-shot) | 47.27 |
Winogrande (5-shot) | 50.99 |
GSM8k (5-shot) | 0.00 |