
Multilingual ModernBert Base Preview

Developed by makiart
A multilingual BERT model developed by the Algomatic team, supporting mask-filling tasks with an 8,192-token context length and a 151,680-token vocabulary.
Release Time: 2/10/2025

Model Overview

This is a multilingual BERT model designed primarily for mask-filling tasks. It supports multiple languages and offers extended context processing, making it suitable for text understanding and generation tasks.
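For orientation, here is a minimal sketch of running the model through the Hugging Face transformers fill-mask pipeline. The model id makiart/multilingual-ModernBert-base-preview is an assumption inferred from the developer name above, and the mask token is read from the tokenizer rather than hardcoded, since it depends on the custom Qwen2.5-based vocabulary.

```python
# A minimal sketch, assuming the model is published on the Hugging Face Hub
# under the hypothetical id makiart/multilingual-ModernBert-base-preview.
from transformers import pipeline

fill = pipeline("fill-mask", model="makiart/multilingual-ModernBert-base-preview")

# Read the mask token from the tokenizer instead of hardcoding it.
mask = fill.tokenizer.mask_token
text = f"Pinning our hopes on the unreliable notion of our potential is the root of all our{mask}."

# Each prediction is a dict with score, token, token_str, and sequence keys,
# matching the example results shown in the Use Cases section below.
for pred in fill(text, top_k=3):
    print(pred["score"], pred["token_str"], pred["sequence"])
```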

Model Features

Long Context Support
Supports an 8,192-token context length, making it well suited to long-text processing tasks.
Multilingual Capability
Supports multiple languages including Korean, English, Chinese, and Japanese.
Efficient Inference
Supports FlashAttention for more efficient inference on compatible GPUs (see the loading sketch after this list).
Custom Tokenizer
Based on the Qwen2.5 tokenizer with a 151,680-token vocabulary, optimized for recognizing code indentation.
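The FlashAttention support mentioned above is, in standard transformers usage, enabled at load time via the attn_implementation argument. The sketch below assumes the same hypothetical model id, an installed flash-attn package, and a FlashAttention-capable GPU.

```python
# A minimal sketch of enabling FlashAttention at load time; the model id is
# hypothetical and the flash-attn package must be installed separately.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "makiart/multilingual-ModernBert-base-preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,              # half precision is typical with FlashAttention
    attn_implementation="flash_attention_2", # raises an error if flash-attn is unavailable
).to("cuda")
```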

Model Capabilities

Mask Filling
Multilingual Text Understanding
Long Text Processing

Use Cases

Text Understanding and Generation
Korean Text Filling
Fills missing parts in Korean sentences.
Example result: {'score': 0.248046875, 'token': 128956, 'token_str': ' 하는', 'sequence': '우리의 대부분의 고뇌는 가능했을 또 다른 인생을 하는 데서 시작된다.'}
English Text Filling
Fills missing parts in English sentences.
Example result: {'score': 0.20703125, 'token': 5322, 'token_str': ' problems', 'sequence': 'Pinning our hopes on the unreliable notion of our potential is the root of all our problems.'}
Chinese Text Filling
Fills missing parts in Chinese sentences.
Example result: {'score': 0.177734375, 'token': 99392, 'token_str': '知道', 'sequence': '我们必须知道,我们只能成为此时此地的那个自己,而无法成为其他任何人。'}
Japanese Text Filling
Fills missing parts in Japanese sentences.
Example result: {'score': 0.11865234375, 'token': 142732, 'token_str': 'ケーキ', 'sequence': '大きなケーキを一人で切り分けて食べるというのは孤独の極地ですからね'}
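The four example results above are in the standard output format of the transformers fill-mask pipeline (score, token id, token string, completed sequence). The sketch below shows one hedged way to reproduce them: each input is the example sequence with the predicted token swapped back to the tokenizer's mask token, and the model id remains an assumption.

```python
# A minimal sketch reproducing the multilingual examples above; the model id
# is hypothetical, and the masked inputs are rebuilt from the example sequences.
from transformers import pipeline

fill = pipeline("fill-mask", model="makiart/multilingual-ModernBert-base-preview")
mask = fill.tokenizer.mask_token

sentences = [
    f"우리의 대부분의 고뇌는 가능했을 또 다른 인생을{mask} 데서 시작된다.",  # Korean
    f"Pinning our hopes on the unreliable notion of our potential is the root of all our{mask}.",  # English
    f"我们必须{mask},我们只能成为此时此地的那个自己,而无法成为其他任何人。",  # Chinese
    f"大きな{mask}を一人で切り分けて食べるというのは孤独の極地ですからね",  # Japanese
]

for text in sentences:
    top = fill(text, top_k=1)[0]  # keep only the best prediction
    print(f"{top['score']:.4f} {top['token_str']!r} -> {top['sequence']}")
```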