🚀 CausalLM 14B-DPO-alpha - GGUF
This repository contains GGUF format model files for CausalLM's 14B-DPO-alpha, offering text generation capabilities with support for multiple languages.
🚀 Quick Start
This README provides all the necessary information about the CausalLM 14B-DPO-alpha model in GGUF format, including details about the GGUF format, prompt templates, licensing, and original model card information.
✨ Features
- Multi-language Support: Supports both English and Chinese.
- DPO Training: An optimized version trained with DPO, potentially achieving better performance.
- GGUF Format: Compatible with a wide range of clients and libraries.
📦 Installation
No specific installation steps are provided in the original README.
💻 Usage Examples
Basic Usage
The chat prompt follows the ChatML template:

```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
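In code, the template can be filled in with plain string formatting. This is a minimal sketch; the function name is illustrative, not from the model card:

```python
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Assemble a ChatML-style prompt, leaving the assistant turn
    open so the model completes it."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

text = build_chatml_prompt("You are a helpful assistant.", "Hello!")
print(text)
```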
📚 Documentation
About GGUF
GGUF is a new format introduced by the llama.cpp team on August 21st, 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
Here is an incomplete list of clients and libraries that are known to support GGUF:
- llama.cpp. The source project for GGUF. Offers a CLI and a server option.
- text-generation-webui, the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
- KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for storytelling.
- LM Studio, an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration.
- LoLLMS Web UI, a great web UI with many interesting and unique features, including a full model library for easy model selection.
- Faraday.dev, an attractive and easy-to-use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
- ctransformers, a Python library with GPU accel, LangChain support, and an OpenAI-compatible AI server.
- llama-cpp-python, a Python library with GPU accel, LangChain support, and an OpenAI-compatible API server.
- candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use.
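As an illustration of using a GGUF file with one of the libraries above, here is a hedged llama-cpp-python sketch. The model file name is a placeholder, and `chat_format="chatml"` matches the prompt template shown under Usage Examples:

```python
def make_messages(system_message: str, prompt: str) -> list[dict]:
    """Build the OpenAI-style message list that llama-cpp-python expects."""
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": prompt},
    ]

def chat(model_path: str, system_message: str, prompt: str) -> str:
    """Load a GGUF file and run one ChatML-formatted chat turn.
    Requires `pip install llama-cpp-python` and a downloaded .gguf file."""
    from llama_cpp import Llama  # imported lazily; only needed when actually running a model

    llm = Llama(model_path=model_path, n_ctx=4096, chat_format="chatml")
    out = llm.create_chat_completion(messages=make_messages(system_message, prompt))
    return out["choices"][0]["message"]["content"]

# Example (the file path is a placeholder, not a real file name from this repo):
# reply = chat("causallm-14b-dpo-alpha.Q4_K_M.gguf", "You are helpful.", "Hi!")
```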
Original model card: CausalLM 14B-DPO-alpha
For details, please refer to the version without DPO training: CausalLM/14B.
| Model | MT-Bench |
| --- | --- |
| GPT-4 | 8.99 |
| GPT-3.5-Turbo | 7.94 |
| Zephyr-7b-β (Overfitting) | 7.34 |
| Zephyr-7b-α | 6.88 |
| CausalLM/14B-DPO-α | 7.618868 |
| CausalLM/7B-DPO-α | 7.038125 |
It should be noted that this is not a version that continues training on CausalLM/14B & 7B, but rather an optimized version that has undergone DPO training concurrently on a previous training branch, and some detailed parameters may have changed. You will still need to download the full model.
The beta branch will be released soon. It employs some aggressive approaches that may be detrimental in certain tasks, in order to achieve better alignment with human preferences, aiming to meet or exceed the GPT-3.5 benchmarks. Stay tuned.
Disclaimer: Please note that the model was trained on unfiltered internet data. Since we do not have the capacity to vet all of it, there may be a substantial amount of objectionable content, pornography, violence, and offensive language present that we are unable to remove. You will therefore still need to perform your own safety checks on the model and filter keywords in its output. Due to computational resource constraints, we are presently unable to implement RLHF for the model's ethics and safety, nor to train on SFT samples that refuse to answer certain questions for restrictive fine-tuning.
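The disclaimer above suggests filtering keywords in the model's output. As a minimal sketch (the blocklist and function are illustrative, not part of the model; a real deployment would layer a proper safety classifier on top):

```python
import re

def filter_output(text: str, blocklist: list[str], mask: str = "***") -> str:
    """Replace any blocklisted keyword (case-insensitive, whole word) with a mask."""
    for word in blocklist:
        text = re.sub(rf"(?i)\b{re.escape(word)}\b", mask, text)
    return text

print(filter_output("This contains a BadWord here.", ["badword"]))
# -> "This contains a *** here."
```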
🔧 Technical Details
Datasets
The model was trained on the following datasets:
- JosephusCheung/GuanacoDataset
- Open-Orca/OpenOrca
- stingning/ultrachat
- meta-math/MetaMathQA
- liuhaotian/LLaVA-Instruct-150K
- jondurbin/airoboros-3.1
- WizardLM/WizardLM_evol_instruct_V2_196k
- RyokoAI/ShareGPT52K
- RyokoAI/Fandom23K
- milashkaarshif/MoeGirlPedia_wikitext_raw_archive
- wikipedia
- wiki_lingua
- fnlp/moss-003-sft-data
- garage-bAInd/Open-Platypus
- LDJnr/Puffin
- openbmb/llava_zh
- BAAI/COIG
- TigerResearch/tigerbot-zhihu-zh-10k
- liwu/MNBVC
- teknium/openhermes
- openbmb/UltraFeedback
- lmsys/lmsys-chat-1m
Model Information
| Property | Details |
| --- | --- |
| Model Type | CausalLM 14B-DPO-alpha in GGUF format |
| Training Data | Multiple datasets as listed above |
| Pipeline Tag | text-generation |
| Tags | llama, llama2, qwen, causallm |
| Model Creator | CausalLM |
| Original Model | [CausalLM 14B-DPO-alpha](https://huggingface.co/CausalLM/14B-DPO-alpha) |
📄 License
The license for the original model is listed as "wtfpl", but subject to the "Meta Llama 2 License Terms".