BERT Base Finnish Cased V1

Developed by TurkuNLP
FinBERT is a Finnish version of Google's BERT model, built specifically for Finnish natural language processing. Pre-trained on large-scale Finnish corpora, it outperforms multilingual BERT on several Finnish NLP tasks.
Downloads 10.30k
Release date: 3/2/2022

Model Overview

This is a Finnish pre-trained language model based on the BERT architecture that can be fine-tuned for a range of Finnish NLP tasks. It uses a custom vocabulary for broader coverage of Finnish words and was trained on diverse corpora, including news and online forums.
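
As a rough sketch of what a fine-tuning setup might look like, the snippet below loads the model with the Hugging Face `transformers` library and attaches a fresh classification head. It assumes the checkpoint is published on the Hugging Face Hub as `TurkuNLP/bert-base-finnish-cased-v1` and that `transformers` and PyTorch are installed. The helper name `load_for_classification` and the label count are illustrative and do not come from the original listing.

```python
MODEL_ID = "TurkuNLP/bert-base-finnish-cased-v1"  # assumed Hub identifier


def load_for_classification(num_labels: int, model_id: str = MODEL_ID):
    """Return (tokenizer, model) with an untrained classification head.

    Hypothetical helper for illustration; downloads weights on first use.
    """
    if num_labels < 2:
        raise ValueError("num_labels must be at least 2")
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model


# Example (requires network access plus the transformers and torch packages):
#   tokenizer, model = load_for_classification(num_labels=10)
#   batch = tokenizer(["Suomen kesä on lyhyt mutta valoisa."], return_tensors="pt")
#   logits = model(**batch).logits  # one row of 10 scores; head is untrained
```

The classification head is randomly initialized, so the model needs fine-tuning on labeled data before its predictions mean anything.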

Model Features

Customized vocabulary
Includes 50,000 Finnish-optimized word pieces, significantly improving vocabulary coverage compared to multilingual BERT
Large-scale pre-training
Trained on 3 billion Finnish tokens (24 billion characters), 30 times the size of Finnish Wikipedia
Domain adaptability
Training data covers news, online discussions, and web-crawled content, supporting diverse application scenarios
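
The corpus figures above imply a couple of back-of-the-envelope numbers. The sketch below only rearranges the values quoted in the list (3 billion tokens, 24 billion characters, a 30x ratio to Finnish Wikipedia). It assumes the 30x ratio applies to token counts, and the derived values are rough estimates rather than figures reported by the model authors.

```python
TOKENS = 3_000_000_000       # training tokens quoted above
CHARACTERS = 24_000_000_000  # training characters quoted above
WIKI_RATIO = 30              # corpus size relative to Finnish Wikipedia

# Average number of characters covered by one token in the training data.
chars_per_token = CHARACTERS / TOKENS

# Implied size of Finnish Wikipedia, assuming the ratio is measured in tokens.
wiki_tokens_estimate = TOKENS // WIKI_RATIO

print(f"{chars_per_token:.1f} characters per token")             # 8.0
print(f"~{wiki_tokens_estimate:,} tokens in Finnish Wikipedia")  # ~100,000,000
```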

Model Capabilities

Text classification
Named entity recognition
Part-of-speech tagging
Semantic understanding

Use Cases

News analysis
News topic classification
Automatic classification of Yle news articles
Outperforms multilingual BERT across a range of training set sizes
Social media analysis
Forum content classification
Classification of Ylilauta online discussion content
Significantly higher accuracy than FastText baseline models
Information extraction
Named entity recognition
Identifying Finnish personal names, locations, and other entities from text
Achieved 92.4% accuracy on the FiNER corpus
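
NER models fine-tuned from a BERT encoder typically emit one tag per token, and those tags then have to be merged into entity spans. The following sketch shows that merging step for BIO-style tags. The function name `bio_to_spans`, the tag names (`PER`, `LOC`), and the sample sentence are illustrative assumptions and are not taken from the FiNER annotation scheme.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge per-token BIO tags into (entity_text, entity_type) pairs."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type = ""
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and current and label == current_type:
            current.append(token)  # continue the open entity
            continue
        if current:  # any other tag closes the open entity
            spans.append((" ".join(current), current_type))
            current, current_type = [], ""
        if prefix in ("B", "I") and label:
            # Start a new entity; a stray I- tag is treated like B-.
            current, current_type = [token], label
    if current:
        spans.append((" ".join(current), current_type))
    return spans


tokens = ["Matti", "Meikäläinen", "vieraili", "Turussa", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
print(bio_to_spans(tokens, tags))
# [('Matti Meikäläinen', 'PER'), ('Turussa', 'LOC')]
```

In practice the tags would come from a token-classification head, and subword pieces would need to be mapped back to whole words before this step.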