Bioformer 8L

Developed by bioformers
A lightweight BERT model designed for biomedical text mining that runs three times faster than BERT-base while performing comparably to or better than BioBERT and PubMedBERT
Release Time: 3/2/2022

Model Overview

Bioformer-8L is a lightweight BERT model pre-trained from scratch on a biomedical corpus. It uses a biomedical-specific vocabulary and is suited to a range of biomedical text mining tasks

Model Features

Biomedical Specialization
Pre-trained exclusively on a biomedical corpus (PubMed abstracts and PMC full-text articles) with a biomedical-specific vocabulary
Efficient & Lightweight
42.8M parameters; runs three times faster than BERT-base while maintaining high performance on downstream tasks
Whole-word Masking Strategy
Uses whole-word masking during pre-training, with a 15% masking rate
Specialized Vocabulary Coverage
The vocabulary was trained on biomedical literature and contains 32,768 tokens, including biomedical-specific symbols
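
As a rough illustration of the whole-word masking strategy listed above, the sketch below groups WordPiece tokens into words and masks entire words until roughly 15% of the tokens are covered. It is a simplified sketch, not the actual pre-training code: the function names, the example tokens, and their subword split are hypothetical, and every selected token is replaced with [MASK] rather than following the mixed replacement scheme of the original BERT recipe.

```python
import random
from typing import List


def group_whole_words(tokens: List[str]) -> List[List[int]]:
    """Group token indices so that '##' continuation pieces stay with their word."""
    groups: List[List[int]] = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)  # continuation piece: same word as previous token
        else:
            groups.append([i])  # first piece of a new word
    return groups


def whole_word_mask(
    tokens: List[str],
    rate: float = 0.15,
    seed: int = 0,
    mask_token: str = "[MASK]",
) -> List[str]:
    """Mask whole words until about `rate` of the tokens are masked."""
    rng = random.Random(seed)
    groups = group_whole_words(tokens)
    rng.shuffle(groups)

    budget = max(1, round(len(tokens) * rate))  # number of tokens to mask
    masked = list(tokens)
    n_masked = 0
    for group in groups:
        if n_masked >= budget:
            break
        # Skip a word that would overshoot the budget, unless nothing is masked yet.
        if n_masked > 0 and n_masked + len(group) > budget:
            continue
        for i in group:
            masked[i] = mask_token
        n_masked += len(group)
    return masked


# Hypothetical WordPiece split, for illustration only.
example = ["the", "patient", "has", "hyper", "##glyc", "##emia", "and", "neuro", "##pathy"]
print(whole_word_mask(example))
```

The point of grouping first is that a word such as the hypothetical "hyper ##glyc ##emia" is either masked in full or left intact, so the model cannot recover a masked piece from its visible neighbours within the same word.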

Model Capabilities

Biomedical Text Understanding
Masked Language Modeling
Biomedical Entity Recognition
Biomedical Text Classification
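
The masked language modeling capability listed above could be exercised along the following lines. This is a minimal sketch assuming the Hugging Face transformers library is installed and that the model is published on the Hugging Face Hub under the id "bioformers/bioformer-8L"; that id and the helper names are assumptions, not details stated on this page.

```python
from typing import Dict, List

# Assumption: Hugging Face Hub id for this model (not stated on this page).
MODEL_ID = "bioformers/bioformer-8L"


def top_tokens(predictions: List[Dict], k: int = 3) -> List[str]:
    """Return the k highest-scoring token strings from fill-mask output."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [p["token_str"].strip() for p in ranked[:k]]


def fill_mask(text: str, k: int = 3) -> List[str]:
    """Predict the [MASK] token in `text`; downloads the model on first use."""
    # Imported lazily so top_tokens stays usable without transformers installed.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=MODEL_ID)
    return top_tokens(unmasker(text, top_k=k), k)
```

A call such as fill_mask("[MASK] refers to a group of diseases that affect how the body uses blood sugar") would return the model's top candidate tokens for the masked position. The sentence is only an illustrative prompt, and the input must contain exactly one [MASK] for this helper to work as written.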

Use Cases

Biomedical Research
Disease Concept Recognition
Identifying disease-related concepts in biomedical texts
Correctly predicts medical concepts such as 'diabetes' in masked-token filling examples
Literature Classification
Multi-label topic classification for biomedical literature
Achieved top performance in the BioCreative VII COVID-19 classification challenge
Clinical Text Processing
Clinical Record Analysis
Analyzing key medical information in clinical records
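
For the multi-label topic classification use case listed above, a fine-tuned classifier typically emits one score per topic, and each topic is accepted or rejected independently. The sketch below shows that decision step only. It assumes a hypothetical fine-tuned model that outputs one logit per label; the label names, logit values, and the 0.5 threshold are illustrative and are not taken from this page.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def select_topics(logits: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep every label whose independent probability reaches the threshold."""
    return [label for label, z in logits.items() if sigmoid(z) >= threshold]


# Hypothetical per-label logits for one article.
article_logits = {"Treatment": 2.0, "Diagnosis": -1.0, "Prevention": 0.3}
print(select_topics(article_logits))
```

Unlike single-label classification, no softmax is applied across labels, so an article can receive several topics at once or none at all.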