
Excalibur-7b-DPO

Developed by InferenceIllusionist
Excalibur-7b-DPO is a 7B-parameter language model based on the Excalibur-7b foundation model, fine-tuned with Direct Preference Optimization (DPO) to improve dialogue quality and performance in visual application scenarios.
Downloads: 22
Release date: 3/28/2024

Model Overview

This model was fine-tuned with DPO on the Intel/orca_dpo_pairs dataset to improve the response quality of the base model, particularly in visual application scenarios. After fine-tuning, the model produces more conversational and comprehensive responses and shows gains across multiple benchmarks.

Model Features

DPO fine-tuning optimization
Fine-tuned with Direct Preference Optimization (DPO), significantly improving dialogue quality and response comprehensiveness; a training sketch follows this list.
Enhanced visual applications
Optimized for visual application scenarios, with support for image understanding and description.
Multi-format support
Supports both ChatML and Alpaca prompt formats, making it adaptable to various application scenarios; an example follows this list.
Quantized versions available
Weighted and static quantized versions are offered to meet different hardware requirements.
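
A fine-tune like this can be reproduced in spirit with the TRL library. The sketch below is a minimal, hypothetical setup assuming recent TRL and datasets APIs; the base-model ID, hyperparameters, and column mapping are illustrative and not the author's actual training configuration.

```python
# Hypothetical DPO fine-tuning sketch using TRL on Intel/orca_dpo_pairs.
# Model ID and hyperparameters are assumptions for illustration only.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model = "InferenceIllusionist/Excalibur-7b"  # assumed base-model ID
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Intel/orca_dpo_pairs provides "system", "question", "chosen", "rejected".
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")

def to_dpo_format(row):
    # DPOTrainer expects "prompt", "chosen", and "rejected" columns.
    return {
        "prompt": row["system"] + "\n" + row["question"],
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    }

dataset = dataset.map(to_dpo_format, remove_columns=dataset.column_names)

training_args = DPOConfig(
    output_dir="excalibur-7b-dpo",
    beta=0.1,  # strength of the KL penalty keeping the policy near the base model
    per_device_train_batch_size=2,
    learning_rate=5e-7,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```

DPO trains directly on preference pairs (chosen vs. rejected responses), so no separate reward model is needed; beta controls how far the fine-tuned policy may drift from the base model.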
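The two supported prompt formats differ only in how the instruction is wrapped. Below is a minimal sketch assuming one of the GGUF quants is run through llama-cpp-python; the quant filename is hypothetical.

```python
# Hypothetical sketch: querying a GGUF quant of Excalibur-7b-DPO with
# llama-cpp-python using the ChatML and Alpaca prompt formats.
from llama_cpp import Llama

llm = Llama(model_path="Excalibur-7b-DPO.Q4_K_M.gguf", n_ctx=4096)

question = "Explain Direct Preference Optimization in one paragraph."

# ChatML format
chatml_prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    f"<|im_start|>user\n{question}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# Alpaca format
alpaca_prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{question}\n\n### Response:\n"
)

for prompt in (chatml_prompt, alpaca_prompt):
    out = llm(prompt, max_tokens=256, stop=["<|im_end|>", "### Instruction:"])
    print(out["choices"][0]["text"].strip())
```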

Model Capabilities

Text generation
Visual scene understanding
Multi-turn dialogue
Knowledge Q&A
Reasoning tasks

Use Cases

Visual applications
Image caption generation
Generates detailed descriptions from input images
Requires an additional mmproj file (see the sketch after this list)
Dialogue systems
Intelligent assistant
Builds more natural and fluent conversational assistants
Significant improvement in dialogue quality after fine-tuning
Educational applications
Knowledge Q&A
Answers various knowledge-based questions
Performs well on benchmarks like the AI2 Reasoning Challenge
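
For the image-captioning use case, the mmproj file is a vision projector loaded alongside the text model. A minimal sketch with llama-cpp-python follows; the file names and the choice of the LLaVA-1.5 chat handler are assumptions rather than documented specifics.

```python
# Hypothetical sketch: image captioning with a GGUF quant plus an mmproj
# (vision projector) file via llama-cpp-python. File names and the chat
# handler choice are illustrative assumptions.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")
llm = Llama(
    model_path="Excalibur-7b-DPO.Q4_K_M.gguf",
    chat_handler=chat_handler,
    n_ctx=4096,
)

response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": "file:///path/to/image.png"}},
                {"type": "text", "text": "Describe this image in detail."},
            ],
        }
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```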