USER2 Small

Developed by deepvk
USER2 is a next-generation Russian universal sentence encoder, specifically designed to support long-context sentence representations of up to 8,192 tokens.
Release Time: 2/19/2025

Model Overview

USER2 is built on the RuModernBERT encoder and fine-tuned for retrieval and semantic tasks. It supports Matryoshka Representation Learning (MRL), which allows embeddings to be truncated to smaller dimensions with minimal quality loss.

Model Features

Long-context support
Supports long-context sentence representations of up to 8,192 tokens
Matryoshka Representation Learning (MRL)
Allows reducing embedding dimensions with minimal quality loss; supported dimensions: [32, 64, 128, 256, 384]
Efficient small model
A compact model with only 34 million parameters, reducing computational resource requirements while maintaining performance
Task prefix optimization
Adding a task prefix to the input (e.g., classification, clustering, search_query) tunes the embedding for the corresponding scenario
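
The MRL and task-prefix features above can be illustrated with a small sketch. It assumes the encoder has already returned a full-size embedding as a plain list of floats, and it assumes the prefix is written as "<task>: " in front of the text; that exact format is not stated on this page, so check the model's own documentation before relying on it. The helper names truncate_embedding and with_task_prefix are hypothetical and not part of any library.

```python
import math

# Dimensions listed in the model features above.
MRL_DIMS = (32, 64, 128, 256, 384)

# Task names mentioned on this page; the real model may define more.
KNOWN_TASKS = ("classification", "clustering", "search_query")


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize.

    With MRL-trained models the leading components carry most of the
    information, so a prefix of the vector is itself a usable embedding.
    """
    if dim not in MRL_DIMS:
        raise ValueError(f"dim must be one of {MRL_DIMS}, got {dim}")
    if dim > len(vec):
        raise ValueError("dim is larger than the embedding itself")
    return l2_normalize(vec[:dim])


def with_task_prefix(text, task):
    """Prepend a task prefix (ASSUMED format: '<task>: <text>')."""
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task: {task}")
    return f"{task}: {text}"


# Stand-in for a real 384-dimensional model output.
full = [float(i % 7 - 3) for i in range(384)]
small = truncate_embedding(full, 64)
print(len(small))
print(with_task_prefix("Как работает поиск?", "search_query"))
```

Re-normalizing after truncation keeps cosine and dot-product scores comparable across dimensions, which matters when smaller vectors are stored in an index that expects unit-length inputs.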

Model Capabilities

Text embedding generation
Sentence similarity calculation
Semantic retrieval
Text clustering
Classification tasks
Re-ranking tasks
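
Several of the capabilities above (sentence similarity, semantic retrieval, re-ranking) reduce to comparing embedding vectors. The following is a minimal sketch of that comparison using cosine similarity; the vectors here are toy values standing in for real model outputs, and rank_documents is a hypothetical helper, not an API of the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs, most similar document first."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 2-dimensional "embeddings" for illustration only.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
for index, score in rank_documents(query, docs):
    print(index, round(score, 3))
```

In a real pipeline the query and documents would each be encoded once, and the same ranking step would serve both first-stage retrieval and re-ranking of a candidate list.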

Use Cases

Information retrieval
Document retrieval
Used in long-document retrieval systems, with inputs of up to 8,192 tokens
Achieved nDCG@10 of 51.69 on the MLDR-rus benchmark
Semantic analysis
Sentence similarity calculation
Calculates semantic similarity between two sentences or text segments
Scored 72.25 on the MTEB-rus semantic similarity task
Text classification
Multi-label classification
Suitable for scenarios where a single text can carry several labels at once
Scored 33.56 on the MTEB-rus multi-label classification task
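
For readers unfamiliar with the retrieval metric quoted above, here is a rough sketch of how nDCG@10 is commonly computed. It assumes the linear-gain variant of DCG and assumes that `relevances` holds the graded relevance labels of all judged documents in the order the system ranked them; the benchmark's own evaluation code may differ in details. Scores such as 51.69 are this 0 to 1 value multiplied by 100.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 2)  # rank is 0-based, so position 1 -> log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k=10):
    """DCG of the actual ranking divided by DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0.0:
        return 0.0
    return dcg_at_k(relevances, k) / ideal


# The single relevant document was ranked second instead of first.
print(round(ndcg_at_k([0, 1, 0, 0]), 4))
```

A perfect ranking yields 1.0, and pushing relevant documents further down the list lowers the score logarithmically.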