Xlmindic Base Uniscript Soham

Developed by ibraheemmoosa
This is a multilingual model based on the ALBERT architecture, optimized for Indo-Aryan languages and designed to process text transliterated into ISO-15919.
Downloads 117
Release Time: 3/2/2022

Model Overview

This fine-tuned model is designed primarily for Indian language text transliterated into ISO-15919, and it supports a range of natural language processing tasks.
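
To illustrate why ISO-15919 input lets one model handle several scripts, the sketch below maps a few Devanagari and Bengali characters to the same romanized form. It is a toy subset written for illustration only: the function name `to_iso15919` and the small tables are assumptions, not the model's actual preprocessing, and a real pipeline would need a full transliteration tool covering the whole standard.

```python
# Toy ISO-15919 transliteration covering a handful of characters.
# Assumption: this is an illustrative subset, not the model's real preprocessing.

CONSONANTS = {
    "\u0928": "n", "\u092e": "m", "\u0915": "k",  # Devanagari na, ma, ka
    "\u09a8": "n", "\u09ae": "m", "\u0995": "k",  # Bengali na, ma, ka
}
VOWEL_SIGNS = {
    "\u093e": "ā", "\u093f": "i",  # Devanagari vowel signs aa, i
    "\u09be": "ā", "\u09bf": "i",  # Bengali vowel signs aa, i
}
VIRAMAS = {"\u094d", "\u09cd"}  # Devanagari and Bengali virama (suppresses the inherent vowel)


def to_iso15919(text: str) -> str:
    """Romanize text using the toy tables above; unknown characters pass through."""
    out = []
    pending_a = False  # True when the last consonant still carries its inherent "a"
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])  # an explicit vowel sign replaces the inherent "a"
            pending_a = False
        elif ch in VIRAMAS:
            pending_a = False  # a virama removes the inherent "a"
        else:
            if pending_a:
                out.append("a")
            pending_a = False
            out.append(ch)
    if pending_a:
        out.append("a")
    return "".join(out)


hindi_word = "\u0928\u093e\u092e"    # "name" written in Devanagari
bengali_word = "\u09a8\u09be\u09ae"  # the same word written in Bengali script
print(to_iso15919(hindi_word), to_iso15919(bengali_word))
```

Both words come out as the same string, "nāma", which is the property that lets a single vocabulary serve languages written in different scripts.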

Model Features

ISO-15919 Transliteration Support
The model accepts text transliterated into ISO-15919, so Indian languages written in different scripts can be processed in a single unified representation.
Multilingual Capability
Supports 14 Indo-Aryan languages and learns cross-lingual representations shared across them.
Efficient Architecture
Built on the ALBERT architecture, whose cross-layer parameter sharing keeps the model lightweight and efficient.
Strong IndicGLUE Benchmark Performance
Outperforms baseline models such as mBERT and XLM-R on multiple Indian language processing tasks.
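
The effect of ALBERT-style parameter sharing can be seen with some rough arithmetic. The sketch below counts the weights in a standard Transformer encoder layer and compares a stack with independent layers against one that reuses a single layer's weights. The hidden size of 768 and depth of 12 are illustrative assumptions, not figures taken from this model, and embeddings are left out of the count.

```python
# Rough parameter count for a Transformer encoder stack, with and without
# cross-layer weight sharing. Assumption: sizes below are illustrative only.

def encoder_layer_params(hidden: int, ffn_mult: int = 4) -> int:
    """Count weights and biases in one standard encoder layer."""
    inner = ffn_mult * hidden
    attention = 4 * (hidden * hidden + hidden)            # Q, K, V and output projections
    feed_forward = (hidden * inner + inner) + (inner * hidden + hidden)
    layer_norms = 2 * (2 * hidden)                        # scale and shift for two layer norms
    return attention + feed_forward + layer_norms


def encoder_params(hidden: int, layers: int, share_layers: bool) -> int:
    """Total encoder parameters; shared stacks store one layer's weights only."""
    per_layer = encoder_layer_params(hidden)
    return per_layer if share_layers else layers * per_layer


unshared = encoder_params(hidden=768, layers=12, share_layers=False)
shared = encoder_params(hidden=768, layers=12, share_layers=True)
print(f"independent layers: {unshared:,}")
print(f"shared layers:      {shared:,}")
print(f"reduction factor:   {unshared // shared}x")
```

Sharing reduces the stored encoder weights by a factor equal to the number of layers, although the compute per forward pass is unchanged because every layer is still executed.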

Model Capabilities

Text classification
Named entity recognition
Masked language modeling
Cross-lingual text processing
Indian language understanding

Use Cases

News Classification
Bengali News Classification
Genre classification of Bengali news articles
Achieved 93.89% accuracy on the Soham dataset
Hindi News Classification
Classification of BBC Hindi news articles
Achieved 79.14% accuracy
Language Understanding
Cross-lingual Text Processing
Processing multiple Indian language texts transcribed in ISO-15919 format
Excellent performance on IndicGLUE benchmarks