
ProtBert

Developed by Rostlab
A protein language model based on the BERT architecture, pre-trained with self-supervised learning to capture biophysical properties of protein sequences
Downloads: 276.10k
Release Date: 3/2/2022

Model Overview

ProtBert is pre-trained on protein sequences with the masked language modeling (MLM) objective. Through this self-supervised training it learns biophysical properties of protein sequences, and it can be used to extract protein features or be fine-tuned for downstream tasks.
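
The checkpoint can be loaded with the Hugging Face transformers library. Below is a minimal feature-extraction sketch, assuming the model is published as Rostlab/prot_bert on the Hugging Face Hub and that transformers and torch are installed; the input sequence is an arbitrary example, written with amino acids separated by spaces as the tokenizer expects.

```python
import torch
from transformers import BertModel, BertTokenizer

# ProtBert's tokenizer expects amino acids separated by spaces;
# uppercase must be preserved (do_lower_case=False).
tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertModel.from_pretrained("Rostlab/prot_bert")
model.eval()

sequence = "M K T A Y I A K Q R Q I S F V K S H F S R Q L E E R"  # arbitrary example

inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Per-residue embeddings, shape (1, sequence_length, 1024)
residue_embeddings = outputs.last_hidden_state
# One crude whole-protein embedding: average over residue positions
protein_embedding = residue_embeddings.mean(dim=1)
```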

Model Features

Protein-specific pre-training
Optimized specifically for protein sequences, treating each sequence as an independent document
Biophysical property capture
The model's embeddings capture important biophysical properties that shape a protein's spatial conformation
Large-scale training data
Pre-trained on 217 million protein sequences from UniRef100

Model Capabilities

Protein sequence feature extraction
Protein sequence masked-token prediction (see the sketch after this list)
Fine-tuning for protein structure-related tasks
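
The masked-token capability can be exercised directly with the fill-mask pipeline. A minimal sketch, again assuming the Rostlab/prot_bert checkpoint; the masked sequence is an arbitrary example.

```python
from transformers import pipeline

# Fill-mask pipeline over the pre-trained checkpoint;
# [MASK] marks the residue to predict.
unmasker = pipeline("fill-mask", model="Rostlab/prot_bert")
predictions = unmasker("D L I P T S S K L V V [MASK] D T S L Q V K K A F F A L V T")

# Each prediction carries the proposed amino acid and its probability
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```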

Use Cases

Protein structure prediction
Secondary structure prediction
Predicting a protein's 3-state or 8-state secondary structure from its sequence; a fine-tuning sketch follows below
Achieved 75% accuracy (3-state) on CASP12
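
One plausible setup for this task, offered as a sketch rather than the authors' published pipeline, is per-residue token classification on top of ProtBert; the 3-state label set (helix, strand, coil) and num_labels=3 are illustrative assumptions.

```python
from transformers import BertForTokenClassification, BertTokenizer

# Per-residue 3-state head (H = helix, E = strand, C = coil);
# the label set and head are illustrative, not the published pipeline.
tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertForTokenClassification.from_pretrained("Rostlab/prot_bert", num_labels=3)

inputs = tokenizer("M K T A Y I A K Q R", return_tensors="pt")
logits = model(**inputs).logits  # shape (1, seq_len, 3): one state per token
# Fine-tune with per-residue labels, masking out the [CLS]/[SEP]
# special-token positions (label id -100) so the loss ignores them.
```
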
Protein function analysis
Subcellular localization prediction
Predicting the subcellular compartment in which a protein resides; see the sketch below
Achieved 79% accuracy on the DeepLoc dataset
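
Since localization is a whole-sequence property, a natural sketch is a sequence-classification head over ProtBert's pooled output; num_labels=10 mirrors the DeepLoc compartments, but the head below is freshly initialized and purely illustrative.

```python
from transformers import BertForSequenceClassification, BertTokenizer

# Single-label sequence classification; num_labels=10 matches the
# DeepLoc compartments, but the head must be fine-tuned on labeled
# data before its outputs are meaningful.
tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertForSequenceClassification.from_pretrained("Rostlab/prot_bert", num_labels=10)

inputs = tokenizer("M K T A Y I A K Q R", return_tensors="pt")
logits = model(**inputs).logits  # shape (1, 10): one score per compartment
```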