Ruri V3 Pt 30m

Developed by cl-nagoya
Ruri is a Japanese universal text embedding model based on ModernBERT-Ja, offering versions with different parameter scales suitable for various text processing tasks.
Downloads 250
Release Time: 3/20/2025

Model Overview

Ruri is a Japanese universal text embedding model used primarily for sentence similarity calculation and feature extraction. It is based on the ModernBERT-Ja architecture and distinguishes the type of each input text by a prefix prepended to it.

Model Features

Multiple Parameter Scale Versions
Offers model versions ranging from 30M to 310M parameters to meet different computational resource needs.
1+3 Prefix Scheme
Uses special prefixes to differentiate text types: empty string for semantic encoding, 'トピック:' for classification/clustering, '検索クエリ:' for search queries, and '検索文書:' for documents to be retrieved.
High Performance
Achieves average scores between 74.51 and 77.24 on the JMTEB benchmark, depending on the parameter scale of the version.
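
The prefix scheme described above can be sketched as a small helper that prepends the right prefix before text is passed to the model. This is a minimal illustration, not the model's official API: the names PREFIXES and add_prefix are hypothetical, and the prefix strings are copied exactly as listed on this page. Whether a space should follow the colon is not stated here, so check the model's own documentation before relying on it.

```python
# Prefix strings copied from the "1+3 Prefix Scheme" description.
# ASSUMPTION: no separator is added after the colon; the page does not
# say whether one is expected.
PREFIXES = {
    "semantic": "",            # general semantic encoding (empty prefix)
    "topic": "トピック:",        # classification / clustering
    "query": "検索クエリ:",      # search queries
    "document": "検索文書:",     # documents to be retrieved
}


def add_prefix(text: str, kind: str = "semantic") -> str:
    """Return `text` with the prefix for the given text type prepended."""
    if kind not in PREFIXES:
        raise ValueError(f"unknown text type: {kind!r}")
    return PREFIXES[kind] + text


# Example: prepare one query and two candidate documents for encoding.
inputs = [
    add_prefix("瑠璃色はどんな色?", "query"),
    add_prefix("瑠璃色は紫みを帯びた濃い青。", "document"),
    add_prefix("ワシやタカは猛禽類に分類される。", "document"),
]
```

The resulting strings are what would be handed to the embedding model; the encoding call itself is omitted because this page does not describe it.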

Model Capabilities

Sentence Similarity Calculation
Text Feature Extraction
Semantic Encoding
Classification/Clustering Encoding
Search Query Encoding
Document Retrieval Encoding

Use Cases

Information Retrieval
Document Retrieval
Use '検索クエリ:' and '検索文書:' prefixes to encode queries and documents for efficient retrieval.
Text Analysis
Topic Classification
Use the 'トピック:' prefix to encode text for topic classification.
Semantic Similarity Calculation
Compare embedding vectors of different texts to calculate semantic similarity.
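
As a rough sketch of the retrieval and similarity use cases above, the snippet below compares vectors with cosine similarity and ranks documents against a query. The function names cosine_similarity and rank_documents are hypothetical, and the three-dimensional vectors are toy stand-ins for real embeddings, which would come from encoding prefixed text with the model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> list[tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for a query embedding and three document embeddings.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]
ranking = rank_documents(query_vec, doc_vecs)
```

Cosine similarity ignores vector length, so the second document ranks first even though its magnitude differs from the query's.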