🚀 Vikhr-YandexGPT-5-Lite-8B-it
An instructional model based on YandexGPT-5-Lite-8B-pretrain, trained on the Russian-language datasets GrandMaster-PRO-MAX and Grounded-RAG-RU-v2 using SFT.
Training
Vikhr-YandexGPT-5-Lite-8B-it was created using the SFT (Supervised Fine-Tuning) method.
Instructional SFT Part
For the SFT stage of training, we prepared a large (150k instructions) synthetic instruction dataset, Vikhrmodels/GrandMaster-PRO-MAX. Its distinguishing feature is built-in CoT (Chain-of-Thought), which we collected using a modified prompt for gpt-4-turbo. Details can be found in the dataset card.
In addition, to implement RAG grounding, we prepared another synthetic dataset, Vikhrmodels/Grounded-RAG-RU-v2 (50k dialogues). Its collection pipeline is too complex to describe briefly; you can read more about it in its dataset card. Both datasets are published on the Hugging Face Hub and can be inspected as shown below.
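As a quick orientation to the training data, both datasets can be loaded with the Hugging Face datasets library (a minimal sketch; the split name is an assumption and may differ from the actual dataset configuration):

```python
from datasets import load_dataset

# Instructional SFT data with built-in CoT (~150k instructions)
grandmaster = load_dataset("Vikhrmodels/GrandMaster-PRO-MAX", split="train")

# Synthetic RAG-grounding dialogues (~50k dialogues)
grounded_rag = load_dataset("Vikhrmodels/Grounded-RAG-RU-v2", split="train")

print(grandmaster[0])
```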
💻 Usage Examples
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Vikhrmodels/Vikhr-YandexGPT-5-Lite-8B-it"

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# "Write a brief description of the film Back to the Future."
input_text = "Напиши краткое описание фильма Назад в будущее."

messages = [
    {"role": "user", "content": input_text},
]

input_ids = tokenizer.apply_chat_template(
    messages, truncation=True, add_generation_prompt=True, return_tensors="pt"
)

output = model.generate(
    input_ids,
    max_length=1512,
    do_sample=True,  # sampling must be enabled for temperature to take effect
    temperature=0.7,
)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
Model Response
"Back to the Future" is an American science fiction film released in 1985. The film was directed by Robert Zemeckis, and the screenplay was written by Bob Gale. The main roles were played by Michael J. Fox, Christopher Lloyd, and Lea Thompson.
The film tells the story of Marty McFly, an ordinary teenager from 1985, who accidentally travels back to 1955 thanks to the invention of his friend, the scientist Dr. Emmett Brown. Marty finds himself in the past, where he must help Dr. Brown, who was young and naive at the time, invent the time machine.
During his adventures, Marty meets the young Dr. Brown and his family, and he also falls in love with a girl who will become his mother in the future. Marty must not only correct the mistakes of the past but also prevent a catastrophe that could change the future.
The film won numerous awards and became a cult classic, spawning two sequels and many memes and quotes that are still popular today.
Advanced Usage - Working with RAG
The documents role takes a list of dictionaries describing the documents' content, serialized with json.dumps(array, ensure_ascii=False) (see the example below). Document content can be provided in 3 different formats: Markdown, HTML, or plain text. The content of each document can be a text chunk of up to 4k characters.
```
[
    {
        "doc_id": (0..5),
        "title": "(null or str)",
        "content": "(html or markdown or plain text)"
    }
]
```
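To make this contract concrete, here is a minimal sketch of preparing the documents payload; the serialize_documents helper is hypothetical (not part of the model's API) and simply enforces the documented constraints before serializing:

```python
import json

def serialize_documents(documents: list[dict]) -> str:
    """Hypothetical helper: validate the documented constraints, then serialize."""
    for doc in documents:
        assert 0 <= doc["doc_id"] <= 5, "doc_id is expected in the 0..5 range"
        assert doc["title"] is None or isinstance(doc["title"], str)
        # Each document's content is limited to a chunk of up to ~4k characters
        assert len(doc["content"]) <= 4000, "content chunk exceeds the 4k-character limit"
    return json.dumps(documents, ensure_ascii=False)
```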
Example of Correct Usage with OpenAI-like API
Start the vLLM server:

```bash
vllm serve --dtype half --max-model-len 32000 -tp 1 Vikhrmodels/Vikhr-YandexGPT-5-Lite-8B-it --api-key token-abc123
```
```python
import json
from openai import OpenAI

# Assumes the vLLM server started above is reachable on the default local port
llm_client = OpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")
llm_model = "Vikhrmodels/Vikhr-YandexGPT-5-Lite-8B-it"

GROUNDED_SYSTEM_PROMPT = "Your task is to answer the user's questions using only the information from the provided documents. Give two answers to each question: one with a list of relevant document identifiers and the second with the answer to the question itself, using documents with these identifiers."

documents = [
    {
        "doc_id": 0,
        "title": "Global Warming: Glaciers",
        "content": "Over the past 50 years, the volume of glaciers worldwide has decreased by 30%"
    },
    {
        "doc_id": 1,
        "title": "Global Warming: Sea Level",
        "content": "The global sea level has risen by 20 cm since 1880 and continues to rise at a rate of 3.3 mm per year"
    }
]

sample_history = [
    {'role': 'system', 'content': GROUNDED_SYSTEM_PROMPT},
    {'role': 'documents', 'content': json.dumps(documents, ensure_ascii=False)},
    {'role': 'user', 'content': 'Global Warming'}
]

# First call: the model returns the IDs of the documents relevant to the question
relevant_indexes = llm_client.chat.completions.create(
    model=llm_model,
    messages=sample_history,
    temperature=0.0,
    max_tokens=2048
).choices[0].message.content

print('Using documents: ' + relevant_indexes + '\n----')

# Second call: append the first answer to the history to get the grounded answer
final_answer = llm_client.chat.completions.create(
    model=llm_model,
    messages=sample_history + [{'role': 'assistant', 'content': relevant_indexes}],
    temperature=0.3,
    max_tokens=2048
).choices[0].message.content

print(final_answer)
```
The response after running the code will look something like this:
Using documents: {"relevant_doc_ids": [0, 1]}
Global warming is the long-term increase in the average temperature of the Earth's atmosphere and oceans. This phenomenon has numerous consequences for our planet, and two key aspects can be highlighted based on the available data:
- Glacier Volume Reduction: Over the past 50 years, there has been a significant reduction in the volume of glaciers worldwide. According to the data, the volume of glaciers has decreased by 30%. This may be related to the melting of glaciers due to rising temperatures, which is one of the signs of global warming.
- Sea Level Rise: The global sea level is also rising, which is related to the melting of glaciers and ice sheets, as well as the expansion of water due to rising temperatures. Since 1880, the sea level has risen by 20 centimeters, and this process continues, with an annual increase of 3.3 millimeters.
These changes have serious consequences for ecosystems, the climate, and human society. The melting of glaciers leads to a rise in sea levels, which can cause flooding of coastal areas and islands, as well as changes in water resources and climate patterns.
Using the first model response, relevant_indexes (JSON), you can determine whether the model found any information in the documents. The model is trained to return an empty array when there is none; in that case, when generating the second response, it will answer that it could not find the information in the knowledge base.
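For example, a minimal sketch of this check (assuming relevant_indexes holds the raw JSON string returned by the first call above):

```python
import json

parsed = json.loads(relevant_indexes)  # e.g. {"relevant_doc_ids": [0, 1]}
if not parsed["relevant_doc_ids"]:
    # Empty array: skip the second call or expect a "not found" style answer
    print("The model found no relevant information in the documents.")
```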
📚 Documentation
Nuances and Limitations
⚠️ Important Note
- The model has a low level of response safety and is oriented toward executing instructions correctly and completely. Keep this in mind when using it, and test it yourself. This can be partially mitigated by system prompts and by additional instructions about the importance of safety in the user's prompt.
- System prompts are not intended for persona descriptions. We recommend using them to specify the response style (such as "answer only in JSON format"). In addition, it is advisable to write them in English, as that is how they appeared in the dataset; using English in system prompts does not affect the language of the response.
- The RAG mode requires the GROUNDED_SYSTEM_PROMPT system prompt described in the Working with RAG section. Sometimes the model may add general information from its own knowledge to the answer in addition to what is in the documents.
- It is better to use the model with a low temperature (0.1-0.5) together with top_k sampling (30-50), as in the sketch below; occasional generation defects were observed at a temperature of 1.0.
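A minimal sketch of the recommended sampling settings (the specific values are illustrative picks from the ranges above; model, tokenizer, and input_ids are set up as in the Basic Usage example):

```python
output = model.generate(
    input_ids,
    max_new_tokens=1024,  # illustrative output budget
    do_sample=True,
    temperature=0.3,      # within the recommended 0.1-0.5 range
    top_k=40,             # within the recommended 30-50 range
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```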
Authors
```bibtex
@inproceedings{nikolich2024vikhr,
  title={Vikhr: Advancing Open-Source Bilingual Instruction-Following Large Language Models for Russian and English},
  author={Aleksandr Nikolich and Konstantin Korolev and Sergei Bratchikov and Nikolay Kompanets and Igor Kiselev and Artem Shelmanov},
  booktitle={Proceedings of the 4th Workshop on Multilingual Representation Learning (MRL) @ EMNLP-2024},
  year={2024},
  publisher={Association for Computational Linguistics},
  url={https://arxiv.org/pdf/2405.13929}
}
```
📄 License
The model is distributed under the yandexgpt-5-lite-8b-pretrain license.
| Property | Details |
|----------|---------|
| Library Name | transformers |
| Model Name | Vikhrmodels/Vikhr-YandexGPT-5-Lite-8B-it |
| Datasets | Vikhrmodels/GrandMaster-PRO-MAX, Vikhrmodels/Grounded-RAG-RU-v2 |
| Base Model | yandex/YandexGPT-5-Lite-8B-pretrain |
| Language | ru, en |
| License | other |
| License Name | yandexgpt-5-lite-8b-pretrain |
| License Link | LICENSE |