ProtGPT2

Developed by nferruz
ProtGPT2 is a protein language model based on the GPT-2 architecture. It generates novel protein sequences that retain key features of natural proteins.
Downloads 17.99k
Release Time: 3/7/2022

Model Overview

ProtGPT2 is a language model trained on protein sequences and intended for novel protein design and engineering. The sequences it generates explore uncharted regions of protein space while preserving key characteristics of natural proteins: amino acid propensities, secondary structure content, and globularity.
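As a rough illustration of what "preserving amino acid propensities" could mean in practice, the sketch below compares the residue composition of two sequences. It is not the evaluation used by the ProtGPT2 authors; the function names and the choice of total variation distance are assumptions made for this example.

```python
from collections import Counter
from typing import Dict

# One-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequence: str) -> Dict[str, float]:
    """Return the relative frequency of each standard amino acid in `sequence`."""
    # Non-standard characters (gaps, X, whitespace) are ignored.
    residues = [r for r in sequence.upper() if r in STANDARD_AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    total = len(residues)
    return {aa: counts.get(aa, 0) / total for aa in STANDARD_AMINO_ACIDS}


def total_variation_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Half the L1 distance between two frequency tables (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(p[aa] - q[aa]) for aa in STANDARD_AMINO_ACIDS)
```

A small distance between the composition of a generated set and a reference set of natural proteins would be consistent with the claim above, though composition alone says nothing about secondary structure or globularity.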

Model Features

Protein Sequence Generation: Generates novel protein sequences that explore uncharted regions of protein space.
Preservation of Natural Features: Generated sequences retain key features of natural proteins, such as amino acid propensities, secondary structure content, and globularity.
Self-supervised Training: Trained on unlabeled sequences with a causal language modeling objective, in which the model learns to predict the next token in a sequence from the tokens that precede it.
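The training objective described above can be sketched in a few lines. This is a toy illustration, not ProtGPT2's training code: it assumes one token per amino acid (the real model uses a learned subword vocabulary), and `uniform_model` is a placeholder standing in for a neural network.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def next_token_examples(tokens: Sequence[str]) -> List[Tuple[List[str], str]]:
    """Pair every non-empty prefix with the token that follows it."""
    return [(list(tokens[:i]), tokens[i]) for i in range(1, len(tokens))]


def causal_lm_loss(
    predict: Callable[[List[str]], Dict[str, float]],
    tokens: Sequence[str],
) -> float:
    """Mean negative log-likelihood of each token given only the tokens before it."""
    examples = next_token_examples(tokens)
    if not examples:
        raise ValueError("need at least two tokens")
    nll = 0.0
    for context, target in examples:
        probs = predict(context)  # the model never sees tokens after `context`
        nll -= math.log(probs[target])
    return nll / len(examples)


def uniform_model(context: List[str]) -> Dict[str, float]:
    """Placeholder model: ignores the context and weights all amino acids equally."""
    return {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
```

No labels are needed because the sequence itself supplies the targets, which is what makes the setup self-supervised. A model with no knowledge scores ln(20) per token here; training lowers that value by assigning more probability to the residues that actually follow.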

Model Capabilities

Protein sequence generation
Protein design
Protein engineering

Use Cases

Protein Design
Zero-shot Generation of Novel Proteins: Generate novel protein sequences starting from methionine (M). Generated sequences retain key features of natural proteins.
Fine-tuning Based on User Sequences: Fine-tune the model on user-provided sequences to generate specific types of proteins. Generated sequences align more closely with user requirements.
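The zero-shot use case above might look like the following sketch with the Hugging Face `transformers` library. It assumes the model is published on the Hugging Face Hub as `nferruz/ProtGPT2`; the sampling values (`top_k`, `repetition_penalty`, `eos_token_id`) are illustrative settings not stated on this page, and `generate_sequences` and `clean_sequence` are hypothetical helper names.

```python
from typing import List


def clean_sequence(generated_text: str) -> str:
    """Strip the <|endoftext|> marker and all whitespace from raw model output."""
    text = generated_text.replace("<|endoftext|>", "")
    return "".join(text.split())


def generate_sequences(
    prompt: str = "M",          # start from methionine, as in the use case above
    num_sequences: int = 5,
    max_length: int = 100,      # counted in tokens; a token may span several residues
) -> List[str]:
    """Sample protein sequences that continue `prompt` (downloads the model on first use)."""
    # Imported lazily so clean_sequence works without transformers installed.
    from transformers import pipeline

    # Assumed Hub identifier for the model.
    generator = pipeline("text-generation", model="nferruz/ProtGPT2")
    outputs = generator(
        prompt,
        max_length=max_length,
        do_sample=True,                      # sample instead of greedy decoding
        top_k=950,                           # assumed sampling setting
        repetition_penalty=1.2,              # assumed sampling setting
        num_return_sequences=num_sequences,
        eos_token_id=0,                      # assumed end-of-sequence token id
    )
    return [clean_sequence(o["generated_text"]) for o in outputs]
```

Calling `generate_sequences()` returns a list of plain amino acid strings. The prompt is a parameter, so a longer fragment could be passed in place of "M". Fine-tuning on user sequences is a separate training step and is not covered by this sketch.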