Tamillion

Developed by monsoon-nlp
A pre-trained Tamil language model based on the ELECTRA framework; the second version was trained on TPUs with a larger corpus
Downloads 58
Release date: 3/2/2022

Model Overview

A pre-trained language model specifically designed for Tamil, supporting natural language processing tasks such as text classification and sentiment analysis

Model Features

TPU Training Optimization
The second version was trained on TPUs and performs better than the GPU-trained V1
Expanded Corpus
Trained on 11GB of IndicCorp text and 482MB of Wikipedia data, providing broader coverage
Outperforms mBERT
Achieves 75.1% accuracy on Tamil news classification, about 22 percentage points above mBERT's 53%

Model Capabilities

Tamil text understanding
News classification
Sentiment analysis
Classic text topic classification
Question answering system adaptation

Use Cases

Text Classification
News Classification
Classify Tamil news content
75.1% accuracy, compared with 53% for mBERT
Classic Text Classification
Topic classification for the classic text 'Thirukkural'
Achieves accuracy comparable to mBERT
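
The classification use cases above can be sketched as follows. This is a minimal sketch, not the author's training setup: it assumes the model is published on the Hugging Face Hub as monsoon-nlp/tamillion and that the transformers library is installed. The helper names build_classifier and accuracy are hypothetical, and num_labels depends on your dataset.

```python
from typing import Sequence

# Assumed Hugging Face Hub id for Tamillion; verify it before use.
MODEL_ID = "monsoon-nlp/tamillion"


def build_classifier(num_labels: int, model_id: str = MODEL_ID):
    """Load the tokenizer and the encoder with a new classification head."""
    # Imported lazily so the accuracy helper below works without transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(predicted) != len(gold) or len(gold) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

The classification head created this way is randomly initialized, so the model needs fine-tuning on labeled Tamil data before its accuracy is meaningful.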
Sentiment Analysis
Movie Review Analysis
Analyze sentiment tendencies in Tamil movie reviews
Root mean square error of 0.626, better than mBERT's 0.657
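
For reference, root mean square error (lower is better) can be computed as in the sketch below. The function name rmse and the sample values in the usage note are illustrative assumptions, not figures from the evaluation.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between predicted and actual ratings."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared differences, then the square root.
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, rmse([3.0, 4.0], [3.0, 2.0]) averages the squared errors 0 and 4 to get 2, then takes the square root.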
Question Answering Systems
Tamil Question Answering
Build a Tamil question answering system through fine-tuning
See the existing Hindi and Bengali implementations for reference