
ProtBert-BFD

Developed by Rostlab
A BERT-based protein language model, pre-trained by self-supervised learning on 2.1 billion protein sequences, whose embeddings capture biophysical features
Downloads 30.60k
Release Time: 3/2/2022

Model Overview

This model is pre-trained with a masked language modeling objective on massive protein sequence corpora. Its embeddings capture key biophysical properties that determine protein conformation, and it supports protein feature extraction as well as fine-tuning on downstream tasks. A minimal masked-prediction example follows.
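The sketch below queries the published Rostlab/prot_bert_bfd checkpoint through the Hugging Face transformers fill-mask pipeline; the input sequence is illustrative.

```python
from transformers import BertForMaskedLM, BertTokenizer, pipeline

# Load the published checkpoint; do_lower_case=False keeps amino-acid tokens intact.
tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert_bfd", do_lower_case=False)
model = BertForMaskedLM.from_pretrained("Rostlab/prot_bert_bfd")
unmasker = pipeline("fill-mask", model=model, tokenizer=tokenizer)

# ProtBert expects uppercase amino acids separated by spaces;
# [MASK] marks the residue to predict.
sequence = "D L I P T S S K L V V [MASK] D T S L Q V K K A F F A L V T"
for prediction in unmasker(sequence, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```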

Model Features

Large-scale pre-training
Pre-trained on the BFD dataset of 2.1 billion protein sequences to learn deep sequence representations
Biophysical property capture
Model embeddings automatically capture key biophysical properties that determine protein conformation
Dual sequence processing
Supports two sequence-length modes (512 and 2048) to accommodate protein analyses at different scales

Model Capabilities

Protein sequence feature extraction (sketched in code after this list)
Protein masked amino acid prediction
Protein downstream task fine-tuning
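A minimal feature-extraction sketch, assuming the Hugging Face transformers and PyTorch APIs; the input sequence is illustrative, and rare or ambiguous residues (U, Z, O, B) are mapped to X, as the ProtBert vocabulary expects.

```python
import re
import torch
from transformers import BertModel, BertTokenizer

# Load the published checkpoint; do_lower_case=False keeps amino-acid tokens intact.
tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert_bfd", do_lower_case=False)
model = BertModel.from_pretrained("Rostlab/prot_bert_bfd")
model.eval()

# Illustrative sequence; map rare/ambiguous residues (U, Z, O, B) to X.
sequence = re.sub(r"[UZOB]", "X", "A E T C Z A O")

inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 1024-dimensional embedding per token, including [CLS] and [SEP].
per_residue_embeddings = outputs.last_hidden_state
print(per_residue_embeddings.shape)  # torch.Size([1, 9, 1024])
```

Per-protein embeddings can then be obtained by averaging the per-residue vectors, a common choice when feeding downstream classifiers.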

Use Cases

Protein structure prediction
Secondary structure prediction
Predict 3-state or 8-state secondary structures of proteins
Achieved 76% accuracy (3-state) on the CASP12 dataset
Protein function analysis
Subcellular localization prediction
Predict protein localization within cells
Achieved 78% accuracy on the DeepLoc dataset
Membrane protein recognition
Identify whether a protein is a membrane protein
Achieved 91% accuracy on the DeepLoc dataset
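For downstream tasks such as the membrane-protein classification above, the encoder can be fine-tuned with a freshly initialized classification head. A hedged sketch, assuming the Hugging Face transformers API; num_labels, the example sequence, and the label are illustrative, not part of the checkpoint.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert_bfd", do_lower_case=False)
# num_labels=2 is an assumption for a binary task (e.g. membrane vs. soluble);
# the classification head is newly initialized and must be trained.
model = BertForSequenceClassification.from_pretrained("Rostlab/prot_bert_bfd", num_labels=2)

sequence = "M K T A Y I A K Q R"  # illustrative input, space-separated residues
inputs = tokenizer(sequence, return_tensors="pt", truncation=True, max_length=512)
labels = torch.tensor([1])  # illustrative label

outputs = model(**inputs, labels=labels)
outputs.loss.backward()  # an optimizer step would follow in a real training loop
print(float(outputs.loss))
```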