Russian Adaptation of Qwen2.5 and T-lite-it-1.0
This project is an adaptation of the Qwen2.5 and T-lite-it-1.0 models to the Russian language, aiming to improve both the generation speed and the quality of Russian text.
Quick Start
You can try out the model in the deployed Space (select the model in the parameters at the bottom):
https://huggingface.co/spaces/RefalMachine/RuadaptQwen2.5
Features
- GGUF Version: Currently in progress! The current version is v1.
- Russian Adaptation: Adapted the T-lite-it-1.0 model to the Russian language. Replaced the tokenizer, then performed continued pretraining on a Russian corpus, and applied the LEP (Learned Embedding Propagation) technique.
- Improved Generation Speed: Thanks to the new tokenizer (an extended tiktoken cl100k with a unigram tokenizer of 48k tokens), the generation speed* of Russian texts has increased by up to 60% compared to the original T-lite-it-1.0 model.
- *Generation speed refers to the number of Russian characters/words per second on the same text sequences.
Installation
No specific installation steps are provided in the original document.
Usage Examples
No code examples are provided in the original document; a minimal illustrative sketch is given below.
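The following is a minimal usage sketch with the transformers library, not taken from the original card. The repository id used below is an assumption; substitute the actual model id from the Hugging Face Hub.

```python
# Minimal usage sketch (assumed setup, not from the original card).
# The model id below is hypothetical; replace it with the published one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RefalMachine/RuadaptQwen2.5-7B-Lite-Beta"  # hypothetical id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Расскажи кратко о Москве."}]  # "Tell me briefly about Moscow."
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```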
Documentation
Model Description
This is the GGUF version! Work in progress!!! The current version is v1.
It is an adaptation of the T-lite-it-1.0 model to the Russian language. The tokenizer was replaced, continued pretraining was then carried out on a Russian corpus, and the LEP (Learned Embedding Propagation) technique was applied.
Thanks to the new tokenizer (an extended tiktoken cl100k with a unigram tokenizer of 48k tokens), the generation speed* of Russian texts has increased by up to 60% compared to the original T-lite-it-1.0 model.
*Generation speed refers to the number of Russian characters/words per second on the same text sequences.
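A rough way to see where the speed-up comes from is to compare how many tokens each tokenizer needs for the same Russian text: fewer tokens per character means more characters produced per decoding step. Below is a small sketch of such a comparison; the Ruadapt repository id is an assumption, not taken from the card.

```python
# Sketch: compare token counts of the original and adapted tokenizers on the
# same Russian text. Fewer tokens per character -> more text per generated
# token, which is what the speed-up claim measures.
# The adapted repository id below is hypothetical.
from transformers import AutoTokenizer

text = "Обработка естественного языка позволяет компьютерам понимать человеческую речь."

original = AutoTokenizer.from_pretrained("t-tech/T-lite-it-1.0")
adapted = AutoTokenizer.from_pretrained("RefalMachine/RuadaptQwen2.5-7B-Lite-Beta")  # hypothetical

for name, tok in [("original", original), ("adapted", adapted)]:
    n_tokens = len(tok(text)["input_ids"])
    print(f"{name}: {n_tokens} tokens, {len(text) / n_tokens:.2f} characters per token")
```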
Tokenization


Metrics and Quality Assessment
The model was evaluated on Ru-Arena-General, Shlepa, MERA, and llmtf_open.
Results on Ru-Arena-General
Measurements were made using the official leaderboard code (https://github.com/VikhrModels/ru_llm_arena), but with repetition_penalty = 1.1.
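For reference, repetition_penalty is a standard generation parameter in transformers; a minimal sketch of how the same value could be set locally is shown below (the leaderboard itself uses its own harness, linked above).

```python
from transformers import GenerationConfig

# Sketch only: reproduce the repetition_penalty = 1.1 setting used in the
# evaluation when generating locally with transformers.
gen_config = GenerationConfig(repetition_penalty=1.1, max_new_tokens=512)
# output_ids = model.generate(input_ids, generation_config=gen_config)
```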

Results on Shlepa

Results on MERA

Results on llmtf_open
TODO
How to cite:
Tikhomirov M., Chernyshev D. Facilitating Large Language Model Russian Adaptation with Learned Embedding Propagation. Journal of Language and Education, 2024, Vol. 10, No. 4, pp. 130-145.
Tikhomirov M., Chernyshev D. Impact of Tokenization on LLaMa Russian Adaptation. 2023 Ivannikov Ispras Open Conference (ISPRAS), IEEE, 2023, pp. 163-168.
Warning
The model's responses do not reflect the opinions of the authors; they merely reproduce the knowledge obtained from the data at all stages of training (pretraining, tokenizer replacement, instruction tuning, answer quality calibration). The model was derived from a third-party pretrained model, and the current authors are not responsible for its pretraining. No additional steps were taken to change the "opinions" embedded in the LLM when creating this version of the model. Use with caution.
License
This project is licensed under the Apache-2.0 license.
| Property | Details |
|----------|---------|
| Datasets | IlyaGusev/saiga_scored, IlyaGusev/saiga_preferences, dichspace/darulm |
| Language | ru |
| Pipeline Tag | text-generation |
| License | apache-2.0 |
| Base Model | Qwen/Qwen2.5-7B, t-tech/T-lite-it-1.0 |