MuRIL Large Cased

Developed by Google
A multilingual pre-trained model for Indian languages, based on the BERT large architecture. It covers 17 Indian languages and their transliterated counterparts.
Downloads 6,307
Release Time: 3/2/2022

Model Overview

MuRIL (Multilingual Representations for Indian Languages) is a multilingual representation model optimized for Indian languages. It improves performance on low-resource languages by adding translated and transliterated data to its training corpus, and it is suited to NLP tasks in Indian languages.

Model Features

Multilingual Transliteration Optimization
Trains on pairs of original-script text and its transliterated form, addressing the transliterated writing that is common in India
Low-Resource Language Enhancement
Upsamples the training data with an exponent of 0.3 so that low-resource languages are better represented, improving model performance on them
Parallel Data Training
Trains jointly on translated data (produced with Google NMT) and transliterated data (produced with IndicTrans)
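The exponent-based upsampling mentioned above can be illustrated with a small sketch. It assumes the common reading of this technique, in which each language is sampled with probability proportional to its corpus size raised to the power 0.3; the page itself does not spell out the formula. The function name and the corpus sizes below are hypothetical and chosen only for illustration.

```python
def smoothed_sampling_probs(sizes, alpha=0.3):
    """Return sampling probabilities proportional to size ** alpha.

    Assumption: this mirrors the usual exponent-smoothing scheme;
    it is not taken from the MuRIL training code.
    """
    weights = {lang: n ** alpha for lang, n in sizes.items()}
    total = sum(weights.values())
    return {lang: w / total for lang, w in weights.items()}


# Hypothetical corpus sizes (number of sentences), for illustration only.
corpus_sizes = {"hi": 1_000_000, "ta": 100_000, "as": 1_000}

# Probabilities when sampling in direct proportion to corpus size.
raw_total = sum(corpus_sizes.values())
raw_probs = {lang: n / raw_total for lang, n in corpus_sizes.items()}

# Probabilities after smoothing with exponent 0.3.
smoothed_probs = smoothed_sampling_probs(corpus_sizes, alpha=0.3)

for lang in corpus_sizes:
    print(f"{lang}: raw={raw_probs[lang]:.4f} smoothed={smoothed_probs[lang]:.4f}")
```

With an exponent below 1, the smallest corpus receives a larger share of samples than its raw proportion and the largest corpus a smaller one, while the ordering of languages by size is preserved.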

Model Capabilities

Multilingual Text Understanding
Cross-Lingual Transliteration Processing
Named Entity Recognition
Text Classification
Question Answering

Use Cases

Government Services
Multilingual Policy Document Analysis
Processes government documents in different Indian languages
Achieves an F1 score of 77.7% on the PANX task
Education
Cross-Language Educational Resource Processing
Automatically processes educational materials in different Indian languages
Improves F1 score by 3% on the TyDiQA task