Ruri Base
Ruri is a general-purpose text embedding model for Japanese, aimed at sentence similarity and feature extraction tasks.
Downloads 523.56k
Release Time: 8/28/2024
Model Overview
Ruri is a Japanese text embedding model based on the BERT architecture, used primarily to calculate sentence similarity and extract text features. For better performance, prepend one dedicated prefix to query texts and another to passage texts before encoding.
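The prefix step can be sketched in Python as follows. This is a minimal sketch, not the model authors' code: the prefix strings "クエリ: " and "文章: ", the model identifier "cl-nagoya/ruri-base", and the helper names are assumptions that do not appear in the text above, so check them against the model's own documentation before relying on them. The encoding helper uses the sentence-transformers library and is only imported when called.

```python
from typing import List, Sequence

# Assumed prefix strings; the text above does not spell them out.
QUERY_PREFIX = "クエリ: "
PASSAGE_PREFIX = "文章: "


def with_prefix(texts: Sequence[str], prefix: str) -> List[str]:
    """Prepend `prefix` to each text unless it is already there."""
    return [t if t.startswith(prefix) else prefix + t for t in texts]


def prepare_inputs(queries: Sequence[str], passages: Sequence[str]) -> List[str]:
    """Return prefixed queries followed by prefixed passages."""
    return with_prefix(queries, QUERY_PREFIX) + with_prefix(passages, PASSAGE_PREFIX)


def encode_with_ruri(texts: Sequence[str], model_name: str = "cl-nagoya/ruri-base"):
    """Encode texts with sentence-transformers (hypothetical helper).

    The import is deferred so the prefix helpers work without the library.
    The default model name is an assumption.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(list(texts))


inputs = prepare_inputs(
    ["瑠璃色はどんな色？"],
    ["瑠璃色は紫みを帯びた濃い青。"],
)
print(inputs)
```

Passing the prefixed list to encode_with_ruri would return one embedding per input, in the same order.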
Model Features
Japanese Optimization
Optimized specifically for Japanese text and Japanese-language tasks
Long Text Support
Accepts input sequences of up to 512 tokens; longer texts need to be truncated or split before encoding
High Performance
Outperforms other Japanese embedding models on the JMTEB benchmark
Prefix Enhancement
Adding the query or passage prefix to each input improves similarity calculation
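To illustrate the 512-token limit listed among the features, here is a rough sketch of splitting an already tokenized text into windows that fit. It assumes a BERT-style tokenizer that adds two special tokens per input (hence the default reserved=2); the function name and parameters are hypothetical, and any prefix tokens would also have to fit within the budget.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[int],
    max_len: int = 512,
    reserved: int = 2,
    overlap: int = 0,
) -> List[List[int]]:
    """Split token ids into windows of at most `max_len - reserved` tokens.

    `reserved` leaves room for special tokens (assumed: two, as in BERT).
    `overlap` repeats that many tokens between consecutive windows.
    """
    budget = max_len - reserved
    if budget <= 0:
        raise ValueError("reserved must be smaller than max_len")
    if not 0 <= overlap < budget:
        raise ValueError("overlap must be in [0, max_len - reserved)")

    step = budget - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + budget]))
        if start + budget >= len(tokens):
            break  # the current window already reaches the end
    return chunks


token_ids = list(range(1200))  # stand-in for real tokenizer output
chunks = chunk_tokens(token_ids)
print([len(c) for c in chunks])
```

Each window can then be embedded separately; how to combine the resulting vectors (for example, averaging them) is a design choice the source does not prescribe.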
Model Capabilities
Sentence Similarity Calculation
Text Feature Extraction
Semantic Search
Text Clustering
Information Retrieval
Use Cases
Information Retrieval
Q&A System
Answers questions by ranking candidate answers according to their embedding similarity to the query
Achieved a score of 69.82 on JMTEB retrieval tasks
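The similarity-based ranking behind such a Q&A system could look roughly like the sketch below. It uses small hand-written vectors in place of real embeddings, and the function names are illustrative; in practice the vectors would come from the model, and cosine similarity is assumed here as the similarity measure.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_answers(
    query_vec: Sequence[float],
    answer_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return up to `top_k` (answer index, similarity) pairs, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(answer_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for query and answer embeddings.
query = [1.0, 0.0, 0.0]
answers = [
    [0.0, 1.0, 0.0],  # unrelated to the query
    [0.9, 0.1, 0.0],  # closest to the query
    [0.5, 0.5, 0.0],  # partially related
]
ranking = rank_answers(query, answers)
print([index for index, _ in ranking])
```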
Text Analysis
Text Clustering
Automatically groups similar texts together
Achieved a score of 54.16 on JMTEB clustering tasks
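As a rough illustration of grouping similar texts, the sketch below applies a simple greedy threshold clustering to embedding vectors. This is a stand-in technique chosen for brevity, not the procedure behind the JMTEB clustering score; the threshold value, the toy vectors, and the function names are assumptions.

```python
import math
from typing import List, Sequence


def _normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def greedy_cluster(
    vectors: Sequence[Sequence[float]], threshold: float = 0.8
) -> List[List[int]]:
    """Group vector indices by cosine similarity to each cluster's first member.

    A vector joins the most similar existing cluster if that similarity is
    at least `threshold`; otherwise it starts a new cluster.
    """
    unit = [_normalize(v) for v in vectors]
    seeds: List[List[float]] = []      # first (unit) vector of each cluster
    clusters: List[List[int]] = []     # member indices of each cluster
    for i, v in enumerate(unit):
        best, best_sim = -1, threshold
        for c, seed in enumerate(seeds):
            sim = sum(x * y for x, y in zip(v, seed))  # cosine of unit vectors
            if sim >= best_sim:
                best, best_sim = c, sim
        if best < 0:
            seeds.append(v)
            clusters.append([i])
        else:
            clusters[best].append(i)
    return clusters


# Toy 2-dimensional vectors standing in for text embeddings.
embeddings = [[1.0, 0.0], [0.95, 0.05], [0.0, 1.0], [0.1, 0.9]]
groups = greedy_cluster(embeddings)
print(groups)
```

The result depends on input order and on the threshold, so for larger collections a standard algorithm such as k-means may be a better fit.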