
ProtT5-XL-BFD

Developed by Rostlab
ProtT5-XL-BFD is a protein language model built on the T5 architecture and pre-trained with self-supervision on 2.1 billion protein sequences. It is intended for protein feature extraction and for fine-tuning on downstream tasks.
Downloads 605
Release Time: 3/2/2022

Model Overview

This model is pre-trained on a large corpus of protein sequences with a masked language modeling objective. The resulting representations capture biophysical properties of proteins, which makes them suitable as input for protein structure prediction and functional analysis.

Model Features

Large-scale pre-training
Pre-trained on the BFD dataset containing 2.1 billion protein sequences, covering a wide range of protein diversity.
Self-supervised learning
No manual labeling required; learns from raw protein sequences through masked language modeling objectives.
Biophysical property capture
The extracted features reflect biophysical properties that govern protein shape.
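
To make the masked language modeling objective above concrete, here is a minimal sketch of how a training example could be built from a raw sequence: a fraction of residues is hidden and the model is asked to recover them. The 15% mask rate, the `<mask>` placeholder, and the helper name `mask_residues` are assumptions for illustration; the actual pre-training pipeline and its special tokens are not described on this page.

```python
import random

MASK = "<mask>"  # hypothetical placeholder; the real tokenizer defines its own special tokens


def mask_residues(sequence, mask_rate=0.15, seed=0):
    """Hide a random subset of residues.

    Returns (corrupted, targets): the residue list with masked positions
    replaced by MASK, and a dict mapping each masked position to the
    original residue the model would have to predict.
    """
    residues = list(sequence)
    if not residues:
        return [], {}
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = max(1, round(len(residues) * mask_rate))
    positions = sorted(rng.sample(range(len(residues)), n_mask))
    targets = {i: residues[i] for i in positions}
    corrupted = [MASK if i in targets else aa for i, aa in enumerate(residues)]
    return corrupted, targets


sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
corrupted, targets = mask_residues(sequence)
print(" ".join(corrupted))
print(targets)
```

No labels are involved: the targets come from the sequence itself, which is why the approach scales to billions of unannotated sequences.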

Model Capabilities

Protein sequence feature extraction
Protein structure prediction
Protein functional analysis
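
Feature extraction can be sketched as follows. This is an illustrative outline, not the authors' reference code. It assumes the weights are published on the Hugging Face Hub under the identifier `Rostlab/prot_t5_xl_bfd`, that `torch`, `transformers`, and `sentencepiece` are installed, and that input follows the usual ProtT5 convention of uppercase, space-separated residues with the rare residues U, Z, O, and B mapped to X. The helper names `prepare_sequence` and `embed` are hypothetical.

```python
import re


def prepare_sequence(sequence):
    """Uppercase, map rare residues (U, Z, O, B) to X, and space-separate residues."""
    cleaned = re.sub(r"[UZOB]", "X", sequence.strip().upper())
    return " ".join(cleaned)


def embed(sequences, model_name="Rostlab/prot_t5_xl_bfd"):
    """Return one per-residue embedding tensor per sequence (encoder only).

    model_name is an assumed Hub identifier; calling this downloads the weights.
    """
    # Heavy imports stay local so prepare_sequence works without them.
    import torch
    from transformers import T5EncoderModel, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name, do_lower_case=False)
    model = T5EncoderModel.from_pretrained(model_name).eval()

    batch = tokenizer([prepare_sequence(s) for s in sequences],
                      padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(input_ids=batch["input_ids"],
                       attention_mask=batch["attention_mask"]).last_hidden_state

    # Keep only real tokens and drop the trailing end-of-sequence token.
    lengths = batch["attention_mask"].sum(dim=1)
    return [hidden[i, : int(n) - 1] for i, n in enumerate(lengths)]


print(prepare_sequence("mkUaz"))
```

Only the encoder is loaded here because feature extraction does not need the decoder. Calling `embed(["MKTAYIAK"])` would return a list with one tensor whose first dimension is the sequence length.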

Use Cases

Bioinformatics
Protein secondary structure prediction
Used to predict the secondary structure of proteins (3-state or 8-state classification).
Achieved 77% accuracy (3-state) on the CASP12 dataset
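
For reference, per-residue accuracy of the kind reported above is the fraction of residues whose predicted state matches the observed one. The sketch below computes it for both 8-state and 3-state labels on a toy example. The 8-to-3 reduction (H, G, I to helix; E, B to strand; everything else to coil) is one common convention and an assumption here, since the page does not say which mapping was used. The label strings are made up for illustration.

```python
# One common DSSP 8-state to 3-state reduction (assumed; other groupings exist).
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",            # helices
    "E": "E", "B": "E",                      # strands
    "S": "C", "T": "C", "C": "C", "-": "C",  # everything else is coil
}


def to_three_state(labels):
    """Collapse an 8-state label string into 3 states (H, E, C)."""
    return "".join(EIGHT_TO_THREE[s] for s in labels)


def per_residue_accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    if not observed:
        raise ValueError("label strings must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


# Toy labels, not taken from CASP12.
observed_8 = "CHHHHGGTEEEEBSSC"
predicted_8 = "CHHHHHHCEEEECCCC"

q8 = per_residue_accuracy(predicted_8, observed_8)
q3 = per_residue_accuracy(to_three_state(predicted_8), to_three_state(observed_8))
print(f"8-state accuracy: {q8:.4f}")
print(f"3-state accuracy: {q3:.4f}")
```

The 3-state score is at least as high as the 8-state score for the same prediction, because merging classes can only turn mismatches into matches.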
Subcellular localization prediction
Predicts the localization of proteins within cells.
Achieved 77% accuracy on the DeepLoc dataset
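
Localization is a per-protein label, while the model produces one feature vector per residue. One common way to bridge the two, shown here as a sketch, is to average the per-residue vectors into a single fixed-size vector that a downstream classifier can consume. Mean pooling is an assumption; this page does not state how the reported predictor aggregates features. The toy vectors and the helper name `mean_pool` are illustrative.

```python
def mean_pool(residue_embeddings):
    """Average per-residue vectors (length L, dimension D) into one vector of dimension D."""
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    length = len(residue_embeddings)
    # zip(*rows) iterates over columns, i.e. one embedding dimension at a time.
    return [sum(column) / length for column in zip(*residue_embeddings)]


# Toy example: a 3-residue protein with 2-dimensional embeddings.
toy_embeddings = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
protein_vector = mean_pool(toy_embeddings)
print(protein_vector)
```

Because the pooled vector has the same size for every protein regardless of sequence length, it can be fed directly to a standard classifier.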