JaColBERT

Developed by bclavie
JaColBERT is the first Japanese-specific document retrieval model based on ColBERT, featuring strong out-of-domain generalization capabilities.
Downloads 859
Release Time: 12/25/2023

Model Overview

JaColBERT is the first Japanese-specific document retrieval model based on ColBERT. It represents each document as a set of embedding vectors rather than a single vector, which gives it strong retrieval performance and out-of-domain generalization at a low computational cost.

Model Features

Strong Out-of-Domain Generalization
Even when evaluated on datasets outside its training domain, JaColBERT outperforms previously common Japanese document retrieval models and approaches the performance of multilingual models.
Efficient Training
Trained on only 10 million triplets from a single dataset, requiring far less data than dense embedding models.
High Computational Efficiency
By representing documents as sets of embedding vectors, it achieves strong retrieval performance at a much lower computational cost than cross-encoders.
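To make the "sets of embedding vectors" idea concrete, here is a minimal sketch of ColBERT-style late-interaction (MaxSim) scoring. It assumes JaColBERT scores documents the way ColBERT models generally do. The function names and the tiny 2-dimensional vectors are illustrative only and are not taken from the model or its library.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query vector, take its best match
    among the document vectors, then sum those maxima."""
    q = [l2_normalize(v) for v in query_vecs]
    d = [l2_normalize(v) for v in doc_vecs]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


def rank_documents(query_vecs, docs):
    """Return document ids sorted by descending MaxSim score.
    `docs` maps a (hypothetical) document id to its list of vectors."""
    scores = {doc_id: maxsim_score(query_vecs, vecs) for doc_id, vecs in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: real models use one vector per token, with many more dimensions.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query vectors
    "doc_b": [[1.0, 0.0]],              # matches only the first
}
ranking = rank_documents(query, docs)
```

Because document vectors can be computed and indexed ahead of time, only the lightweight max-and-sum step runs per query, which is where the cost advantage over cross-encoders comes from.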

Model Capabilities

Japanese Document Retrieval
Sentence Similarity Calculation
Semantic Search

Use Cases

Information Retrieval
Question Answering System
Used to build Japanese question-answering systems, quickly retrieving relevant documents to answer questions.
Achieved R@1 of 0.906 on the JSQuAD dataset
Document Search
Used for semantic search of Japanese documents, improving search relevance.
Performed strongly on the MIRACL and MrTyDi datasets
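For readers unfamiliar with the R@1 figure quoted above, the following sketch shows one common way to compute Recall@k for retrieval. It assumes R@k means the fraction of queries for which at least one relevant document appears in the top k results. The page does not state the exact evaluation protocol, and the function name and sample ids are hypothetical.

```python
def recall_at_k(ranked_ids_per_query, relevant_ids_per_query, k=1):
    """Fraction of queries with at least one relevant document in the top k.

    ranked_ids_per_query: list of ranked document-id lists, one per query.
    relevant_ids_per_query: list of sets of relevant document ids, one per query.
    """
    hits = 0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_ids_per_query):
        if any(doc_id in relevant for doc_id in ranked[:k]):
            hits += 1
    return hits / len(ranked_ids_per_query)


# Made-up example with two queries, not real evaluation data.
ranked = [["d1", "d2"], ["d3", "d4"]]
relevant = [{"d1"}, {"d4"}]
r_at_1 = recall_at_k(ranked, relevant, k=1)
r_at_2 = recall_at_k(ranked, relevant, k=2)
```

Under this definition, an R@1 of 0.906 would mean the top-ranked document was relevant for about 90.6% of the evaluated questions.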