CausalLM-7B-DPO-alpha-GGUF Open-Source Large Model - Supports Efficient Generation of Chinese and English Texts

CausalLM 7B DPO Alpha GGUF

Developed by tastypear

A 7B-parameter large language model based on the Llama 2 architecture, optimized through DPO training, supporting Chinese and English text generation.

Large Language Model, Supports Multiple Languages #Multi-turn Dialogue Optimization #Bilingual Support (Chinese-English) #Human Preference Alignment

Downloads 367

Release Time: 11/19/2023

Model Overview

This is a DPO-optimized 7B-parameter large language model based on the Llama 2 architecture, supporting Chinese and English text generation tasks. The model was trained on multiple datasets including Guanaco and OpenOrca, aiming to provide text generation capabilities more aligned with human preferences.

Model Features

DPO Optimization

The model underwent Direct Preference Optimization (DPO) training, enabling it to generate text more aligned with human preferences

Multi-dataset Training

Trained on the 20 datasets listed under Datasets below, including Guanaco, OpenOrca, and UltraChat.

Bilingual Support

Supports both English and Chinese text generation tasks

GGUF Quantization Format

Provides multiple quantized versions in GGUF format for easier deployment on different hardware

Model Capabilities

Text Generation

Dialogue Systems

QA Systems

Content Creation

Use Cases

Dialogue Systems

Intelligent Assistant

Can be used to build intelligent conversational assistants

Scored 7.038 on the MT-Bench benchmark

Content Creation

Text Generation

Can be used to generate various types of textual content

🚀 CausalLM 7B-DPO-alpha - GGUF

This repository provides a quantized version of the CausalLM 7B-DPO-alpha model in GGUF format, enabling efficient text generation.

🚀 Quick Start

I created these quantized files following TheBloke's publishing format, based on the recommendation in TheBloke/CausalLM-7B-GGUF.

✨ Features

Model Information

Model creator: CausalLM
Original model: CausalLM 7B-DPO-alpha
Model type: llama
Pipeline tag: text-generation
Prompt template:

<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant

Supported languages: en, zh
License: wtfpl
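
The prompt template above can be filled in programmatically. The following is a minimal sketch, not code from the model authors: the function name `build_prompt` is hypothetical, and the trailing newline after `<|im_start|>assistant` is an assumption based on how the template is laid out.

```python
def build_prompt(system_message: str, prompt: str) -> str:
    """Fill the model card's prompt template with a system message and a user prompt.

    The returned string ends with the opening of the assistant turn, so the
    model's completion becomes the assistant's reply.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"  # assumption: a newline follows the role name
    )


# Example: print a filled-in template for a single-turn request.
print(build_prompt("You are a helpful assistant.", "Translate 'hello' into Chinese."))
```

Because each turn is closed by `<|im_end|>`, that token is a natural stop sequence to pass to whichever client runs the model.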

Datasets

The model is trained on the following datasets:

JosephusCheung/GuanacoDataset
Open-Orca/OpenOrca
stingning/ultrachat
meta-math/MetaMathQA
liuhaotian/LLaVA-Instruct-150K
jondurbin/airoboros-3.1
WizardLM/WizardLM_evol_instruct_V2_196k
RyokoAI/ShareGPT52K
RyokoAI/Fandom23K
milashkaarshif/MoeGirlPedia_wikitext_raw_archive
wikipedia
wiki_lingua
fnlp/moss-003-sft-data
garage-bAInd/Open-Platypus
LDJnr/Puffin
openbmb/llava_zh
BAAI/COIG
TigerResearch/tigerbot-zhihu-zh-10k
liwu/MNBVC
teknium/openhermes

Compatibility

These quantised GGUFv2 files are compatible with llama.cpp from August 27th, 2023 onwards, as of commit d0cee0d. They are also compatible with many third-party UIs and libraries.
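
To confirm which GGUF version a downloaded file uses before loading it, the file header can be inspected directly. This is a small sketch assuming the standard little-endian GGUF header, which starts with the 4-byte magic `GGUF` followed by a 32-bit version number; the function name `read_gguf_version` is hypothetical.

```python
import struct


def read_gguf_version(path: str) -> int:
    """Return the version number stored in a GGUF file header.

    Assumes a little-endian file: 4 magic bytes ("GGUF") followed by an
    unsigned 32-bit version field.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        raise ValueError("file is too short to contain a GGUF header")
    magic, version = struct.unpack("<4sI", header)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file (magic bytes were {magic!r})")
    return version
```

For the files in this repository, `read_gguf_version("causallm_7b.Q4_K_M.gguf")` would be expected to return 2, given that they are described as GGUFv2.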

Provided Files

Name	Quant method	Bits	Size
causallm_7b.Q4_K_M.gguf	Q4_K_M	4	4.77 GB
causallm_7b.Q5_K_S.gguf	Q5_K_S	5	5.40 GB
causallm_7b.Q5_K_M.gguf	Q5_K_M	5	5.53 GB

📚 Documentation

About GGUF

GGUF is a new format introduced by the llama.cpp team on August 21st, 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.

Here is an incomplete list of clients and libraries that are known to support GGUF:

llama.cpp. The source project for GGUF. Offers a CLI and a server option.
[text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for storytelling.
LM Studio, an easy-to-use and powerful local GUI for Windows and macOS (Apple Silicon), with GPU acceleration.
[LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with many interesting and unique features, including a full model library for easy model selection.
Faraday.dev, an attractive and easy-to-use character-based chat GUI for Windows and macOS (both Apple Silicon and Intel), with GPU acceleration.
ctransformers, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
[llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use.
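
As an illustration of using one of the libraries listed above, the sketch below loads a GGUF file with llama-cpp-python and generates a completion. It is not taken from the original model card: the `generate` function is hypothetical, and the context size and CPU-only setting are assumed defaults to adjust for your hardware.

```python
def generate(model_path: str, prompt: str, max_tokens: int = 256) -> str:
    """Run one completion against a local GGUF file using llama-cpp-python.

    `prompt` is expected to be already formatted with the model's prompt
    template (the <|im_start|> / <|im_end|> layout).
    """
    # Imported lazily so this function can be defined without the package installed.
    from llama_cpp import Llama

    llm = Llama(
        model_path=model_path,
        n_ctx=2048,       # assumption: context window chosen for illustration
        n_gpu_layers=0,   # assumption: CPU only; raise to offload layers to a GPU
    )
    # Stop at the end-of-turn token so the output contains only the assistant reply.
    result = llm(prompt, max_tokens=max_tokens, stop=["<|im_end|>"])
    return result["choices"][0]["text"]
```

A call might look like `generate("causallm_7b.Q4_K_M.gguf", formatted_prompt)`, where `formatted_prompt` is a string that follows the prompt template and ends with the opening of the assistant turn.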

Licensing

The creator of the source model has listed its license as wtfpl, and this quantization has therefore used that same license.

As this model is based on Llama 2, it is also subject to the Meta Llama 2 license terms, and the license files for that are additionally included. It should therefore be considered as being claimed to be licensed under both licenses. I contacted Hugging Face for clarification on dual licensing but they do not yet have an official position. Should this change, or should Meta provide any feedback on this situation, I will update this section accordingly.

In the meantime, any questions regarding licensing, and in particular how these two licenses might interact, should be directed to the original model repository: CausalLM's CausalLM 7B-DPO-alpha.

Explanation of Quantisation Methods


The new methods available are:

GGML_TYPE_Q4_K - "type-1" 4-bit quantization in super-blocks containing 8 blocks, each block having 32 weights. Scales and mins are quantized with 6 bits. This ends up using 4.5 bpw.
GGML_TYPE_Q5_K - "type-1" 5-bit quantization. Same super-block structure as GGML_TYPE_Q4_K, resulting in 5.5 bpw.

Refer to the Provided Files table above to see what files use which methods, and how.
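
The bits-per-weight (bpw) figures quoted above can be reproduced with a short calculation. The sketch below uses the block sizes from the descriptions and additionally assumes that each super-block stores one 16-bit scale and one 16-bit min of its own; that detail is not stated in the text and is labeled as an assumption in the code.

```python
WEIGHTS_PER_BLOCK = 32       # from the description above
BLOCKS_PER_SUPER_BLOCK = 8   # from the description above
SCALE_BITS = 6               # per-block scale, from the description above
MIN_BITS = 6                 # per-block min, from the description above
SUPER_BLOCK_META_BITS = 32   # assumption: one fp16 scale plus one fp16 min per super-block


def bits_per_weight(quant_bits: int) -> float:
    """Average storage cost per weight for a "type-1" k-quant super-block."""
    weights = WEIGHTS_PER_BLOCK * BLOCKS_PER_SUPER_BLOCK                 # 256 weights
    weight_bits = quant_bits * weights                                   # quantized values
    block_meta_bits = BLOCKS_PER_SUPER_BLOCK * (SCALE_BITS + MIN_BITS)   # 96 bits
    return (weight_bits + block_meta_bits + SUPER_BLOCK_META_BITS) / weights


print(bits_per_weight(4))  # 4.5, matching GGML_TYPE_Q4_K
print(bits_per_weight(5))  # 5.5, matching GGML_TYPE_Q5_K
```

For the 4-bit case this is (1024 + 96 + 32) / 256 = 4.5, so the per-block and per-super-block metadata adds half a bit per weight on top of the nominal bit width.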

Original Model Card: CausalLM's CausalLM 7B-DPO-alpha

For details, please refer to the version without DPO training: CausalLM/7B.

Model	MT-Bench
GPT-4	8.99
GPT-3.5-Turbo	7.94
Zephyr-7b-β (Overfitting)	7.34
Zephyr-7b-α	6.88
CausalLM/14B-DPO-α	7.618868
CausalLM/7B-DPO-α	7.038125

It should be noted that this is not a version that continues training on CausalLM/14B & 7B, but rather an optimized version that has undergone DPO training concurrently on a previous training branch, and some detailed parameters may have changed. You will still need to download the full model.

The beta branch will soon be released, employing some aggressive approaches that might be detrimental in certain tasks, in order to achieve better alignment with human preferences, aiming to meet or exceed the GPT-3.5 benchmarks. Stay tuned.

Disclaimer: Please note that the model was trained on unfiltered internet data. Since we do not have the capacity to vet all of it, there may be a substantial amount of objectionable content, pornography, violence, and offensive language present that we are unable to remove. Therefore, you will still need to complete your own checks on the model's safety and filter keywords in the output. Due to computational resource constraints, we are presently unable to implement RLHF for the model's ethics and safety, or to train on SFT samples that refuse to answer certain questions for restrictive fine-tuning.
