🚀 LAION LeoLM: Linguistically Enhanced Open Language Model
LAION LeoLM is the first open and commercially available German Foundation Language Model built on Llama-2 and Mistral. It extends the base models' capabilities to German through continued pretraining on a large German-language corpus. With compute support from HessianAI's supercomputer, we are releasing three models with 8k context length, aiming to boost German open-source and commercial LLM research.
🚀 Quick Start
LAION LeoLM brings new opportunities to German LLM research. Built on Llama-2 and Mistral and trained on a large German-language corpus, it is released as three foundation models with 8k context length. Read our blog post or paper (preprint coming soon) for more details.
✨ Features
- German-Focused: Specifically trained to handle German language tasks effectively.
- Multiple Model Versions: Three foundation models released at different scales and under different licenses.
- 8k Context Length: Capable of handling longer texts and more complex conversations.
📦 Installation
Install Direct Dependencies
First, install the direct dependencies:
pip install transformers torch sentencepiece
Install Dependencies for Faster Inference
If you want faster inference using flash-attention 2, install these dependencies:
pip install packaging ninja
pip install flash-attn --no-build-isolation
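To confirm the build succeeded before loading a model, a quick import check helps (a minimal sketch; flash_attn is the package name the wheel installs under):
# Sanity check that flash-attn compiled and imports cleanly.
import flash_attn
print(flash_attn.__version__)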
💻 Usage Examples
Basic Usage
from transformers import pipeline
import torch

# System prompt (German): "This is a conversation between an intelligent, helpful
# AI assistant and a user. The assistant gives detailed, helpful and honest answers."
system_prompt = """Dies ist eine Unterhaltung zwischen einem intelligenten, hilfsbereitem KI-Assistenten und einem Nutzer.
Der Assistent gibt ausführliche, hilfreiche und ehrliche Antworten."""
# ChatML template the chat finetune expects.
prompt_format = "<|im_start|>system\n{system_prompt}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"
# User prompt (German): "Explain to me what the bicycle-path situation in Hamburg is like."
prompt = "Erkläre mir wie die Fahrradwegesituation in Hamburg ist."
# Load the chat model on GPU in float16 with flash-attention 2 enabled.
generator = pipeline(model="LeoLM/leo-mistral-hessianai-7b-chat", device="cuda", torch_dtype=torch.float16, use_flash_attention_2=True)
# Sample a response; max_length=8192 matches the model's 8k context window.
print(generator(prompt_format.format(system_prompt=system_prompt, prompt=prompt), do_sample=True, top_p=0.95, max_length=8192))
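The pipeline echoes the full prompt plus the continuation in a single "generated_text" field. A minimal sketch for extracting just the assistant's reply (the slicing and <|im_end|> handling are illustrative, not part of the released code):
full_prompt = prompt_format.format(system_prompt=system_prompt, prompt=prompt)
outputs = generator(full_prompt, do_sample=True, top_p=0.95, max_length=8192)
# Strip the echoed prompt, then cut at the end-of-turn token if the model emitted one.
reply = outputs[0]["generated_text"][len(full_prompt):]
print(reply.split("<|im_end|>")[0].strip())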
Advanced Usage
Advanced usage mainly involves adjusting generation parameters for different application scenarios. For example, you can change the max_length parameter according to the expected output length, and adjust the top_p and do_sample parameters to control the randomness and diversity of the output. Note that max_length counts prompt plus completion and must stay within the 8k context window.
from transformers import pipeline
import torch

system_prompt = """Dies ist eine Unterhaltung zwischen einem intelligenten, hilfsbereitem KI-Assistenten und einem Nutzer.
Der Assistent gibt ausführliche, hilfreiche und ehrliche Antworten."""
prompt_format = "<|im_start|>system\n{system_prompt}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"
# User prompt (German): "Write a detailed article about the cultural attractions in Berlin."
prompt = "Schreibe einen ausführlichen Artikel über die kulturellen Attraktionen in Berlin."
generator = pipeline(model="LeoLM/leo-mistral-hessianai-7b-chat", device="cuda", torch_dtype=torch.float16, use_flash_attention_2=True)
# A higher top_p (0.99) admits more low-probability tokens for a more varied article;
# max_length is capped at 8192 because the model's context window is 8k tokens.
print(generator(prompt_format.format(system_prompt=system_prompt, prompt=prompt), do_sample=True, top_p=0.99, max_length=8192))
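If you only want to bound the completion rather than prompt plus completion, transformers also accepts max_new_tokens (a sketch; the value 2048 is illustrative):
# Bound only the generated continuation, regardless of prompt length.
print(generator(prompt_format.format(system_prompt=system_prompt, prompt=prompt), do_sample=True, top_p=0.99, max_new_tokens=2048))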
📚 Documentation
LeoLM Chat
LeoLM/leo-mistral-hessianai-7b-chat
is a German chat model based on LeoLM/leo-mistral-hessianai-7b
and finetuned on German instruction datasets. It performs well on writing, explanation, and discussion tasks but has some difficulties with math and advanced reasoning. Here are the MT - Bench - DE scores:
{
"first_turn": 6.1,
"second_turn": 4.7,
"categories": {
"writing": 6.8,
"roleplay": 6.35,
"reasoning": 3.3,
"math": 2.75,
"coding": 4.4,
"extraction": 4.5,
"stem": 6.85,
"humanities": 8.25
},
"average": 5.4
}
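As a quick sanity check, the reported average equals the mean of the eight category scores (a minimal sketch using only the numbers above):
scores = {"writing": 6.8, "roleplay": 6.35, "reasoning": 3.3, "math": 2.75,
          "coding": 4.4, "extraction": 4.5, "stem": 6.85, "humanities": 8.25}
# (6.8 + 6.35 + 3.3 + 2.75 + 4.4 + 4.5 + 6.85 + 8.25) / 8 = 43.2 / 8 = 5.4
print(sum(scores.values()) / len(scores))  # ≈ 5.4
# The turn scores agree: (6.1 + 4.7) / 2 = 5.4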
Model Details
Prompting / Prompt Template
The prompt dialogue template follows the ChatML format:
"""
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""
The model input can contain multiple conversation turns between the user and the assistant, for example:
<|im_start|>user
{prompt 1}<|im_end|>
<|im_start|>assistant
{reply 1}<|im_end|>
<|im_start|>user
{prompt 2}<|im_end|>
<|im_start|>assistant
(...)
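For programmatic use, a small helper can assemble such multi-turn prompts (a minimal sketch; build_chatml_prompt and the (user, assistant) pair structure are illustrative, not part of the released code):
def build_chatml_prompt(system_message, turns):
    """Build a ChatML prompt from a system message and (user, assistant) turns.

    turns is a list of (user_msg, assistant_msg) pairs; pass assistant_msg=None
    for the final turn so the prompt ends with an open assistant header.
    """
    parts = [f"<|im_start|>system\n{system_message}<|im_end|>"]
    for user_msg, assistant_msg in turns:
        parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>")
        if assistant_msg is None:
            parts.append("<|im_start|>assistant\n")
        else:
            parts.append(f"<|im_start|>assistant\n{assistant_msg}<|im_end|>")
    return "\n".join(parts)

# Example: a second user turn awaiting the model's reply.
prompt = build_chatml_prompt(system_prompt,
                             [("Hallo!", "Hallo! Wie kann ich dir helfen?"),
                              ("Erzähl mir etwas über Hamburg.", None)])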
🔧 Technical Details
Finetuning Details
Hyperparameter | Value
--- | ---
Num epochs | 4
Examples per epoch | 131214
Global batch size | 256
Learning rate | 1e-5
Warmup steps | 100
LR scheduler | Cosine
Adam betas | (0.9, 0.95)
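These values imply roughly 2,050 optimizer steps overall (a back-of-the-envelope sketch; the exact count depends on how the trainer handles the final partial batch of each epoch):
examples_per_epoch = 131214
global_batch_size = 256
num_epochs = 4
steps_per_epoch = examples_per_epoch // global_batch_size  # 512 full batches
print(steps_per_epoch * num_epochs)  # 2048 steps, so the 100 warmup steps cover ~5%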
Dataset Details
## Stats for 'Subset of OpenAssistant/OASST-DE' (3534 samples (100.0%))
-----------------
Accepted: 3534/3534 (100.0%)
Accepted tokens: 2259302
Skipped: 0 (0.0%)
Min tokens per sample: 29
Max tokens per sample: 2484
Avg tokens per sample: 639.3044708545557
-----------------
## Stats for 'Subset of FreedomIntelligence/evol-instruct-deutsch' (57841 samples (100.0%))
-----------------
Accepted: 57841/57841 (100.0%)
Accepted tokens: 42958192
Skipped: 0 (0.0%)
Min tokens per sample: 33
Max tokens per sample: 5507
Avg tokens per sample: 742.6944900675991
-----------------
## Stats for 'Subset of FreedomIntelligence/alpaca-gpt4-deutsch' (48969 samples (100.0%))
-----------------
Accepted: 48969/48969 (100.0%)
Accepted tokens: 13372005
Skipped: 0 (0.0%)
Min tokens per sample: 19
Max tokens per sample: 1359
Avg tokens per sample: 273.07082031489307
-----------------
## Stats for 'Subset of LeoLM/OpenSchnabeltier' (21314 samples (100.0%))
-----------------
Accepted: 21314/21314 (100.0%)
Accepted tokens: 8134690
Skipped: 0 (0.0%)
Min tokens per sample: 25
Max tokens per sample: 1202
Avg tokens per sample: 381.65947264708643
-----------------
## Stats for 'Subset of LeoLM/German_Poems' (490 samples (100.0%))
-----------------
Accepted: 490/490 (100.0%)
Accepted tokens: 618642
Skipped: 0 (0.0%)
Min tokens per sample: 747
Max tokens per sample: 1678
Avg tokens per sample: 1262.534693877551
-----------------
## Stats for 'Subset of LeoLM/German_Songs' (392 samples (100.0%))
-----------------
Accepted: 392/392 (100.0%)
Accepted tokens: 187897
Skipped: 0 (0.0%)
Min tokens per sample: 231
Max tokens per sample: 826
Avg tokens per sample: 479.3290816326531
-----------------
## Stats for 'total' (132540 samples (100.0%))
-----------------
Accepted: 132540/132540 (100.0%)
Accepted tokens: 67530728
Skipped: 0 (0.0%)
Min tokens per sample: 19
Max tokens per sample: 5507
Avg tokens per sample: 509.51205673758864
-----------------
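The 'total' row is consistent with the per-dataset figures above (a quick check using only those numbers):
samples = [3534, 57841, 48969, 21314, 490, 392]
tokens = [2259302, 42958192, 13372005, 8134690, 618642, 187897]
print(sum(samples))                # 132540 samples
print(sum(tokens))                 # 67530728 tokens
print(sum(tokens) / sum(samples))  # ≈ 509.51 avg tokens per sample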
📄 License
The model LeoLM/leo-mistral-hessianai-7b is under the Apache 2.0 license, while LeoLM/leo-hessianai-7b and LeoLM/leo-hessianai-13b are under the Llama-2 community license.
⚠️ Important Note
As with all LLMs, the potential outputs of LeoLM/leo-mistral-hessianai-7b-chat cannot be predicted in advance. The model may produce inaccurate, biased, or otherwise objectionable responses. Developers should perform safety testing and tuning tailored to their specific applications before deployment.
Please see Meta's Responsible Use Guide.