BERT Base Finnish Uncased v1

Developed by TurkuNLP
FinBERT is a Finnish pre-trained language model based on Google's BERT architecture, trained on over 3 billion Finnish word tokens and suitable for various Finnish NLP tasks.
Downloads 1,964
Release Time: 3/2/2022

Model Overview

FinBERT is a BERT model pre-trained specifically for Finnish. After fine-tuning, it achieves state-of-the-art performance on tasks such as document classification, named entity recognition, and part-of-speech tagging.
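
As a rough illustration of the fine-tuning workflow described above, the sketch below loads the model with a classification head using the Hugging Face transformers library. The Hub identifier `TurkuNLP/bert-base-finnish-uncased-v1` and the helper names `load_for_classification` and `classify` are assumptions made for this example, not details stated on this page. The classification head is randomly initialized, so predictions are only meaningful after fine-tuning on labeled data.

```python
# Assumed Hugging Face Hub identifier for this model (not stated on this page).
MODEL_ID = "TurkuNLP/bert-base-finnish-uncased-v1"


def load_for_classification(num_labels, model_id=MODEL_ID):
    """Load the tokenizer and the encoder with a fresh classification head.

    Hypothetical helper: the head weights are random until fine-tuned.
    """
    # Imported lazily so the module can be read without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model


def classify(texts, tokenizer, model):
    """Return the highest-scoring class index for each input text."""
    import torch

    # Pad to the longest text in the batch and truncate to the model's limit.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    model.eval()
    with torch.no_grad():
        logits = model(**batch).logits
    return logits.argmax(dim=-1).tolist()
```

Calling `load_for_classification(num_labels=10)` downloads the weights on first use. The returned model can then be fine-tuned with any standard PyTorch training loop before `classify` is applied to new Finnish text.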

Model Features

Specialized Finnish Vocabulary
Custom vocabulary of 50,000 word pieces, with far better coverage of Finnish than multilingual BERT
Large-Scale Finnish Training
Trained on over 3 billion word tokens (24 billion characters) of Finnish text, far more text than Finnish Wikipedia provides
Multi-Domain Applicability
Training data includes news, online discussions, and web-crawled content, adaptable to various text types
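
To make the vocabulary point concrete, here is a toy sketch of greedy longest-match-first splitting, the strategy used by BERT-style WordPiece tokenizers. Both vocabularies are invented for illustration and are not the actual FinBERT or multilingual BERT vocabularies. The sketch only shows why a vocabulary with Finnish stems and suffixes tends to split an inflected word into fewer pieces.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one lowercase word into word pieces, longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabularies, not taken from any real model.
finnish_vocab = {"talo", "##ssa", "##ni"}
generic_vocab = {"ta", "##lo", "##ss", "##a", "##n", "##i"}

word = "talossani"  # Finnish for "in my house"
print(wordpiece_tokenize(word, finnish_vocab))  # ['talo', '##ssa', '##ni']
print(wordpiece_tokenize(word, generic_vocab))  # ['ta', '##lo', '##ss', '##a', '##n', '##i']
```

With the Finnish-oriented toy vocabulary the word splits into a stem and two suffixes. The generic one breaks it into six short fragments that carry little meaning on their own.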

Model Capabilities

Finnish Text Understanding
Document Classification
Named Entity Recognition
Part-of-Speech Tagging
Transfer Learning

Use Cases

News Classification
Yle News Classification
Classifying news articles from the Finnish Broadcasting Company (Yle)
Outperforms multilingual BERT across different training set sizes
Social Media Analysis
Ylilauta Forum Classification
Classifying content from Finnish online forums
Significantly outperforms baseline models
Information Extraction
Named Entity Recognition
Identifying entities such as person names and locations in Finnish text
Achieves 92.40% accuracy on the FiNER corpus