# QVikhr-3-1.7B-Instruction-noreasoning
This is an instruction-tuned model based on Qwen/Qwen3-1.7B, trained on the Russian-language GrandMaster-2 dataset. It is designed for high-efficiency text processing in Russian and English, delivering precise responses and fast task execution.
## Quick Start

The model is ready to use for high-efficiency text processing in Russian and English. Refer to the sample code below to get started.
## Features

- Quantized variants (see the sketch below for running a GGUF build)
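If GGUF quantizations of this model are published, they can be run with llama.cpp-based tooling. Below is a minimal sketch using llama-cpp-python; the repository name and quantization level are assumptions for illustration, not artifacts confirmed by this card:

```python
# Hypothetical sketch: assumes a GGUF quantization of this model exists on the Hub.
# The repo_id and filename below are placeholders, not confirmed artifacts.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Vikhrmodels/QVikhr-3-1.7B-Instruction-noreasoning-GGUF",  # assumed repo name
    filename="*Q4_K_M.gguf",  # assumed quantization level (glob pattern)
    n_ctx=4096,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.3,  # recommended temperature from this card
)
print(out["choices"][0]["message"]["content"])
```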
## Usage Examples
### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Vikhrmodels/QVikhr-3-1.7B-Instruction-noreasoning"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

input_text = "Write a brief description of the Transformers library."  # example prompt; substitute your own

messages = [
    {"role": "user", "content": input_text},
]

input_ids = tokenizer.apply_chat_template(
    messages, truncation=True, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(
    input_ids,
    max_length=1512,
    do_sample=True,   # sampling must be enabled for temperature/top-k/top-p to take effect
    temperature=0.3,  # recommended generation temperature (see Important Note below)
    num_return_sequences=1,
    no_repeat_ngram_size=2,
    top_k=50,
    top_p=0.95,
)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
Model response:

```
<think>

</think>

The response in the original text is in an unreadable format and appears to use a non-standard encoding. Please check and provide the correct text if you need a more accurate presentation.
```
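Because the model is trained without reasoning, it emits an empty `<think></think>` block before the answer, as shown above. If you want only the final answer, a small post-processing helper (a sketch, not part of the model's API) can strip it:

```python
import re

def strip_think(text: str) -> str:
    """Remove the (empty) <think>...</think> block that the model emits."""
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL)

print(strip_think(generated_text))  # reuses generated_text from the example above
```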
## Documentation
### Description
QVikhr-3-1.7B-Instruction-noreasoning is a robust language model trained on the GrandMaster-2 dataset. It excels at following instructions, generating contextual responses, and analyzing text in Russian. The model is optimized for instructional tasks and textual data processing, and is suitable for professional use as well as integration into user-facing applications and services.
### Training
QVikhr-3-1.7B-Instruction-noreasoning was trained with supervised fine-tuning (SFT) as a full fine-tune (FFT) of all model parameters, using the synthetic GrandMaster-2 dataset.
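For illustration only, the sketch below shows what a full-parameter SFT run on top of the base model could look like with the TRL library. The dataset id, data format, and hyperparameters are assumptions, not the authors' actual training recipe:

```python
# Hypothetical SFT sketch with TRL; not the authors' actual setup.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed dataset id and conversational "messages" format.
dataset = load_dataset("Vikhrmodels/GrandMaster-2", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B",  # base model named in this card
    train_dataset=dataset,
    # No peft_config, so all parameters are updated (a full fine-tune).
    args=SFTConfig(output_dir="qvikhr-3-1.7b-sft"),
)
trainer.train()
```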
## Ru Arena General
| Model | Score | 95% CI | Avg. #Tokens |
|---|---|---|---|
| Vikhrmodels-QVikhr-3-1.7B-Instruction-noreasoning | 59.2 | (-2.1, 1.8) | 1053 |
| noreasoning-Qwen3-1.7B | 51.9 | (-1.9, 1.5) | 999 |
| Qwen3-1.7B | 49.7 | (-1.8, 1.9) | 1918 |
## License

The model is released under the Apache 2.0 license.
## Authors
## How to Cite
```bibtex
@inproceedings{nikolich2024vikhr,
  title={Vikhr: Advancing Open-Source Bilingual Instruction-Following Large Language Models for Russian and English},
  author={Aleksandr Nikolich and Konstantin Korolev and Sergei Bratchikov and Nikolay Kompanets and Igor Kiselev and Artem Shelmanov},
  booktitle={Proceedings of the 4th Workshop on Multilingual Representation Learning (MRL) @ EMNLP-2024},
  year={2024},
  publisher={Association for Computational Linguistics},
  url={https://arxiv.org/pdf/2405.13929}
}

@misc{qwen3technicalreport,
  title={Qwen3 Technical Report},
  author={Qwen Team},
  year={2025},
  eprint={2505.09388},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.09388}
}
```
## Important Note
The recommended generation temperature is 0.3.
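For example, with the transformers text-generation pipeline the recommended temperature can be applied like this (a sketch; the prompt and max_new_tokens value are arbitrary choices):

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="Vikhrmodels/QVikhr-3-1.7B-Instruction-noreasoning")
out = pipe(
    [{"role": "user", "content": "Hello!"}],
    max_new_tokens=256,  # arbitrary cap for the example
    do_sample=True,
    temperature=0.3,     # recommended setting
)
print(out[0]["generated_text"][-1]["content"])  # last message is the assistant reply
```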