🚀 keyT5. Base (small) version
keyT5 is a model for text-to-keywords conversion. It supports the Russian language and is available in several pre-trained versions for extracting keywords from text.
🚀 Quick Start
Installation
pip install transformers sentencepiece
Usage
The following code shows how to use the model to extract keywords from text. The `generate` function returns a list of keywords. Only consecutive duplicates are collapsed, so the same keyword can still appear more than once.

from itertools import groupby

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Identifier as given in the source; note that it names the large checkpoint
model_name = "0x7194633/keyt5-large"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

def generate(text, **kwargs):
    # Tokenize the input text into PyTorch tensors
    inputs = tokenizer(text, return_tensors='pt')
    # Run beam search without tracking gradients
    with torch.no_grad():
        hypotheses = model.generate(**inputs, num_beams=5, **kwargs)
    # Decode the best hypothesis into a ';'-separated keyword string
    s = tokenizer.decode(hypotheses[0], skip_special_tokens=True)
    # Normalize separators, lowercase, split on ';' and drop the last piece
    s = s.replace('; ', ';').replace(' ;', ';').lower().split(';')[:-1]
    # Collapse consecutive duplicates only; non-adjacent repeats remain
    s = [el for el, _ in groupby(s)]
    return s

article = """Reuters сообщил об отмене 3,6 тыс. авиарейсов из-за «омикрона» и погоды
Наибольшее число отмен авиарейсов 2 января пришлось на американские авиакомпании
SkyWest и Southwest, у каждой — более 400 отмененных рейсов. При этом среди
отмененных 2 января авиарейсов — более 2,1 тыс. рейсов в США. Также свыше 6400
рейсов были задержаны."""

print(generate(article, top_p=1.0, max_length=64))
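Because `groupby` only merges adjacent repeats, the list returned by `generate` can still contain the same keyword more than once. If each keyword should appear at most once, an order-preserving de-duplication step can be applied afterwards. The following is a minimal sketch under that assumption; the helper name `unique_keywords` and the sample list are hypothetical and are not part of the model's own code.

```python
def unique_keywords(keywords):
    """Return the keywords with all duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for kw in keywords:
        kw = kw.strip()  # ignore stray whitespace around a keyword
        if kw and kw not in seen:  # skip empty strings and repeats
            seen.add(kw)
            result.append(kw)
    return result


# Hypothetical list shaped like the output of generate()
sample = ["авиарейсы", "омикрон", "авиарейсы", "сша"]
print(unique_keywords(sample))  # ['авиарейсы', 'омикрон', 'сша']
```

In practice this would be called as `unique_keywords(generate(article, max_length=64))`.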
📚 Documentation
Inference Parameters
| Property | Details |
| top_p | 0.9 |
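For readers unfamiliar with `top_p`: in nucleus (top-p) sampling, the next token is drawn only from the smallest set of most probable tokens whose cumulative probability reaches `top_p`. The sketch below illustrates that selection rule on a toy distribution. It is an illustration only, not the `transformers` implementation, and the function name `nucleus` and the probabilities are made up for the example. Note also that in `transformers`, `top_p` affects generation only when sampling is enabled (`do_sample=True`); the source does not state how the value 0.9 is applied.

```python
def nucleus(probs, top_p=0.9):
    """Return indices of the most probable tokens whose cumulative
    probability first reaches top_p (highest probability first)."""
    # Token indices sorted by decreasing probability
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= top_p:  # stop once the nucleus covers top_p of the mass
            break
    return kept


# Toy next-token distribution over four tokens (made-up values)
probs = [0.5, 0.3, 0.15, 0.05]
print(nucleus(probs, top_p=0.9))  # [0, 1, 2]
```

With `top_p=0.9`, the least likely token (probability 0.05) is excluded from sampling, while the remaining three cover 95% of the probability mass.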
Inference Examples
- Topic: Coronavirus
- Text: "In Russia, a new strain of the coronavirus 'Omicron' may appear, which could lead to a rise in the incidence rate in January, said Sergey Voznesensky, an associate professor at the Department of Infectious Diseases of the Russian University of Peoples' Friendship. He noted that the 'Delta' variant caused more fatal cases than Omicron, and it was against the backdrop of 'Delta' that the maximum mortality rate was recorded."
- Topic: UK
- Text: "The British press reported that Admiral Tony Radakin, the head of the British Defense Staff, was shown simulated activity during a visit to a hangar with heavy weapons. According to the order, military personnel were to run to the vehicles, open all hatches and doors, leaf through the operation manual, and inspect the machines as if a functional test were being conducted to ensure the proper operation of the equipment."
- Topic: Technologies
- Text: "To play music, simply press the keys on the keyboard. Each key corresponds to a specific sample - there are maracas and futuristic sounds reminiscent of blaster shots. From this variety, you can form your own patterns and watch the visualization with animated geometric figures. Interestingly, pressing the spacebar can completely change the appearance, colors on the screen, and the sound of the samples."
🛠️ Training
If you want to learn more about model training, you can go to the training notebook:

📄 License
This project is licensed under the MIT license.
🔗 Links
