magnum-v2-123b Open-source Language Model - Free to Replicate the Prose Writing Quality of Claude 3 Series

Magnum V2 123b

Developed by anthracite-org

This is a model fine-tuned based on Mistral-Large-Instruct-2407, aiming to replicate the prose quality of the Claude 3 series models (especially Sonnet and Opus).

Large Language Model

Transformers

Supports Multiple Languages

Open Source License: Other

#Multilingual text generation #Claude style replication #Low learning rate sensitivity

Downloads 284

Release Time: 8/17/2024

Model Overview

This model focuses on text generation tasks and supports multiple languages, including English, French, German, Spanish, Italian, Portuguese, Russian, Chinese, and Japanese.

Model Features

Multilingual support

Supports text generation tasks in multiple languages

Claude-style prose quality

Aims to replicate the prose quality of the Claude 3 series models

Optimized learning rate

Learning rate chosen to account for the learning rate sensitivity of Mistral Large models

Model Capabilities

Text generation

Multilingual support

Use Cases

Text creation

Prose creation

Generate prose in a Claude-like style

Prose quality close to that of the Claude 3 series models

🚀 Magnum-v2-123b Model Introduction

This is the sixth model in a series aiming to replicate the prose quality of Claude 3 models, offering multilingual text generation capabilities.

🚀 Quick Start

This model is fine-tuned on top of Mistral-Large-Instruct-2407. It supports multiple languages, including English, French, German, Spanish, Italian, Portuguese, Russian, Chinese, and Japanese.
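As a rough sketch of how a model like this is typically loaded with the transformers library named in this card, the helper below may be a useful starting point. The repository id `anthracite-org/magnum-v2-123b` is an assumption inferred from the developer and model names on this page, and the function names `load_model` and `generate_text` are hypothetical. A 123B-parameter model needs a multi-GPU machine; `device_map="auto"` additionally assumes the accelerate package is installed.

```python
# Assumed repository id, inferred from the developer and model name on this page.
MODEL_ID = "anthracite-org/magnum-v2-123b"


def load_model(model_id=MODEL_ID):
    """Load tokenizer and model; imports are deferred because they are heavy."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the dtype stored in the checkpoint
        device_map="auto",    # shard across available devices (needs accelerate)
    )
    return tokenizer, model


def generate_text(tokenizer, model, prompt, max_new_tokens=256):
    """Generate a continuation for a prompt already in the Mistral instruct format."""
    # The prompt is expected to carry its own <s> token, so none is added here.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    inputs = inputs.to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Sampling settings are left at the library defaults here; the card does not recommend specific values.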

✨ Features

Designed to replicate the prose quality of Claude 3 models (Sonnet and Opus).
Supports multiple languages for text generation.
Instruct-tuned with the Mistral formatting.

💻 Usage Examples

Basic Usage

The model has been instruct-tuned with the Mistral formatting. A typical input looks like this:

<s>[INST] SYSTEM MESSAGE\nUSER MESSAGE[/INST] ASSISTANT MESSAGE</s>[INST] USER MESSAGE[/INST]

We also provide SillyTavern presets for Context and Instruct respectively. The Mistral preset included in SillyTavern seems to be misconfigured by default, so we recommend using these as a replacement.
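For readers assembling prompts by hand, the template above can be sketched as a small helper. This is a minimal illustration, not the authors' tooling: the function name `build_mistral_prompt` and its parameters are hypothetical, and it assumes the `\n` in the template is a literal newline between the system and first user message.

```python
def build_mistral_prompt(system, turns, bos="<s>", eos="</s>"):
    """Build a prompt in the Mistral instruct layout shown above.

    turns is a list of (user_message, assistant_message) pairs; use None as
    the assistant message of the last pair to leave the prompt open for
    generation.
    """
    parts = [bos]
    for index, (user, assistant) in enumerate(turns):
        # The system message is folded into the first user turn only.
        content = f"{system}\n{user}" if index == 0 and system else user
        parts.append(f"[INST] {content}[/INST]")
        if assistant is not None:
            # Completed assistant turns are closed with the end-of-sequence token.
            parts.append(f" {assistant}{eos}")
    return "".join(parts)


prompt = build_mistral_prompt(
    "SYSTEM MESSAGE",
    [("USER MESSAGE", "ASSISTANT MESSAGE"), ("USER MESSAGE", None)],
)
print(prompt)
```

If the tokenizer adds a beginning-of-sequence token on its own, pass `bos=""` to avoid a duplicate.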

📚 Documentation

Credits

This model has been a team effort, and the credits go to all members of Anthracite.

Training

The training was done for 1.5 epochs. We used 8x AMD Instinct™ MI300X Accelerators for the full-parameter fine-tuning of the model.

In addition, we noticed that Mistral Large models seemed much more sensitive to learning rate adjustments than other models.

We hypothesize that this is primarily due to the particularly narrow, low-variance weight distributions typical of Mistral-derived models, regardless of their scale.

In the end, because another full 2-epoch run at an even lower learning rate would have cost about $600, we settled on our third attempt: a learning rate of 2e-6 with an effective batch size of 64. We chose to publish the 1.5-epoch checkpoint after manually testing and comparing it.
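The card states only the effective batch size (64) and the number of accelerators (8). As a hedged illustration of how those two numbers relate, the sketch below assumes plain data parallelism across all 8 devices, so that effective batch size = devices × per-device micro-batch × gradient accumulation steps. The actual micro-batch and accumulation values used in training are not given in the source, and the function names are hypothetical.

```python
def effective_batch_size(num_devices, micro_batch, grad_accum_steps):
    """Samples contributing to one optimizer step under data parallelism."""
    return num_devices * micro_batch * grad_accum_steps


def candidate_splits(target, num_devices):
    """List (micro_batch, grad_accum_steps) pairs that reach the target size."""
    if target % num_devices != 0:
        return []
    per_device = target // num_devices
    return [
        (micro, per_device // micro)
        for micro in range(1, per_device + 1)
        if per_device % micro == 0
    ]


# Assumption: all 8 accelerators act as data-parallel workers.
splits = candidate_splits(64, 8)
for micro, accum in splits:
    print(f"micro_batch={micro}, grad_accum_steps={accum}")
```

Any of these splits yields the same effective batch size; which one is practical depends on how much memory each device has left after holding the model and optimizer state.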

We also noticed a correlation between the size of the second-epoch loss drop and the magnitude of the learning rate, which suggests that 4e-6 leads to more catastrophic forgetting.

[axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

Safety

...

📄 License

License Name: mrl
License Link: https://mistral.ai/licenses/MRL-0.1.md

📋 Information Table

Property	Details
Model Type	Text Generation
Base Model	Mistral-Large-Instruct-2407
Training Datasets	[Doctor-Shotgun/C2-Stheno](https://huggingface.co/datasets/Doctor-Shotgun/C2-Stheno), [anthracite-org/kalo-opus-instruct-22k-no-refusal](https://huggingface.co/datasets/anthracite-org/kalo-opus-instruct-22k-no-refusal), [anthracite-org/nopm_claude_writing_fixed](https://huggingface.co/datasets/anthracite-org/nopm_claude_writing_fixed)
Library Name	transformers
Supported Languages	English, French, German, Spanish, Italian, Portuguese, Russian, Chinese, Japanese
License Name	mrl
License Link	https://mistral.ai/licenses/MRL-0.1.md
