Language Detection

Developed by alexneakameni
BERT-based multilingual detection model supporting text classification for 200 languages
Downloads 1,210
Release Time: 2/13/2025

Model Overview

This is a BERT-based language detection model designed for fast, accurate identification of the language a text is written in. Trained on a dataset of 121 million sentences covering 200 languages, it achieves high accuracy and recall.

Model Features

Multilingual Support
Supports detection for 200 languages, including major European, Asian, and African languages
High Accuracy
Achieves 0.9733 accuracy and 0.9733 F1 score on test sets
Data Augmentation
Employs multiple text augmentation strategies to enhance model robustness, including number removal and word order shuffling
Efficient Architecture
Lightweight BERT-based architecture with 4 Transformer layers, optimized for fast inference
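
The two augmentation strategies named above, number removal and word order shuffling, can be sketched in a few lines of Python. This is only an illustration of the idea, not the model's actual training code; the function names `remove_numbers` and `shuffle_words` and the optional seed are assumptions made for the example.

```python
import random
import re
from typing import Optional


def remove_numbers(text: str) -> str:
    """Drop digit runs, then collapse the whitespace left behind.

    In Python 3, \\d on str patterns also matches non-ASCII digits
    (for example Arabic-Indic numerals), which suits multilingual text.
    """
    without_digits = re.sub(r"\d+", "", text)
    return " ".join(without_digits.split())


def shuffle_words(text: str, seed: Optional[int] = None) -> str:
    """Return the same whitespace-separated words in a random order."""
    words = text.split()
    # A private Random instance keeps the global RNG state untouched.
    random.Random(seed).shuffle(words)
    return " ".join(words)
```

Both transforms leave the vocabulary of a sentence intact, so the language label stays valid while the model sees less dependence on digits and word position.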

Model Capabilities

Text Language Identification
Multilingual Text Classification
Short Text Language Detection
Long Text Language Detection

Use Cases

Content Management
Multilingual Content Classification
Automatically identifies the language of user-generated content
97.33% accuracy
Translation Systems
Pre-translation Language Detection
Automatically detects input text language before translation
Identifies 200 languages
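
A pre-translation detection step might look like the sketch below. The detector is passed in as a plain callable so the routing logic stays independent of any particular model; `route_for_translation`, `toy_detector`, the 0.5 confidence threshold, and the label strings `"fra"` and `"eng"` are all hypothetical. In practice the callable could wrap a text-classification pipeline loaded with this model, but the model identifier and its label format are not given here, so they are left out.

```python
from typing import Callable, Optional, Tuple

# A detector maps text to (language_label, confidence in [0, 1]).
Detector = Callable[[str], Tuple[str, float]]


def route_for_translation(
    text: str,
    detect: Detector,
    target_lang: str,
    min_confidence: float = 0.5,  # assumed threshold, tune per application
) -> Optional[str]:
    """Return the source language to translate from, or None to skip."""
    if not text.strip():
        return None  # nothing to translate
    lang, score = detect(text)
    if score < min_confidence:
        return None  # detection too uncertain to act on
    if lang == target_lang:
        return None  # already in the target language
    return lang


def toy_detector(text: str) -> Tuple[str, float]:
    """Stand-in for a real model: treats 'bonjour' as a French marker."""
    if "bonjour" in text.lower():
        return ("fra", 0.99)
    return ("eng", 0.60)


if __name__ == "__main__":
    print(route_for_translation("Bonjour le monde", toy_detector, "eng"))
```

Returning `None` for empty, low-confidence, or already-translated input lets the caller skip the translation call without special-casing each situation.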