GP-MoLFormer-Uniq

Developed by ibm-research
GP-MoLFormer is a chemical language model pretrained on 650 million to 1.1 billion molecular SMILES string representations from ZINC and PubChem, focusing on molecular generation tasks.
Downloads: 122
Released: April 30, 2025

Model Overview

GP-MoLFormer is a large-scale autoregressive chemical language model for molecular generation tasks, employing a decoder-only Transformer architecture with linear attention and rotary position encoding.

Model Features

Large-scale Pretraining
Pretrained on 650 million to 1.1 billion molecular SMILES strings from ZINC and PubChem
Unique Molecular Training
This "Uniq" variant is pretrained on the deduplicated set of unique molecules from both datasets
Versatile Molecular Generation
Supports unconditional de novo molecular generation, scaffold completion/modification, and molecular optimization
Efficient Architecture
Transformer architecture with linear attention and rotary position encoding for improved computational efficiency
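The efficiency gain from linear attention comes from reassociating the attention product: instead of forming the full n×n attention matrix, a kernel feature map lets keys and values be summarized once and reused for every query. Below is a minimal NumPy sketch of that idea; the `elu + 1` feature map and the tiny dimensions are illustrative assumptions, not GP-MoLFormer's actual implementation details.

```python
import numpy as np

def elu_plus_one(x):
    # Positive feature map commonly used in linear attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """O(n·d²) attention: phi(Q) @ (phi(K)^T V), normalized per query."""
    phi_q, phi_k = elu_plus_one(Q), elu_plus_one(K)
    kv = phi_k.T @ V               # (d, d_v) summary of all keys/values
    z = phi_q @ phi_k.sum(axis=0)  # per-query normalizer
    return (phi_q @ kv) / z[:, None]

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))

# Same result computed the quadratic O(n²·d) way, showing the
# associativity trick gives an identical output.
phi_q, phi_k = elu_plus_one(Q), elu_plus_one(K)
A = phi_q @ phi_k.T
quadratic = (A @ V) / A.sum(axis=1, keepdims=True)

assert np.allclose(linear_attention(Q, K, V), quadratic)
```

The quadratic form materializes the n×n matrix `A`; the linear form never does, which is what makes generation over long SMILES sequences cheaper.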

Model Capabilities

Unconditional Molecular Generation
Scaffold-constrained Molecular Modification
Molecular Property Optimization
SMILES String Completion
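All of these capabilities operate on SMILES token sequences. The model ships with its own tokenizer, but for intuition, a common regex-based SMILES tokenizer from the chemistry-NLP literature (an assumption here, not GP-MoLFormer's exact vocabulary) keeps multi-character tokens like `Cl`, `Br`, and bracket atoms intact:

```python
import re

# Widely used SMILES tokenization pattern: bracket atoms (e.g. [nH+]),
# two-letter halogens, organic-subset atoms, bonds, branches, ring digits.
SMILES_TOKEN_RE = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFIbcnosp]"
    r"|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%\d{2}|\d)"
)

def tokenize_smiles(smiles: str) -> list[str]:
    tokens = SMILES_TOKEN_RE.findall(smiles)
    # Round-trip check: every character must belong to some token.
    assert "".join(tokens) == smiles, "unrecognized characters in SMILES"
    return tokens

print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
```

Treating `Cl` or `[nH+]` as single tokens keeps the sequence chemically meaningful, which matters for completion and scaffold-constrained generation.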

Use Cases

Drug Discovery
De Novo Molecular Design
Generate entirely new potential drug molecular structures
Can produce valid, unique, and moderately novel molecules
Molecular Optimization
Optimize molecular properties through fine-tuning or pair-tuning
Can adjust molecular distributions to be more drug-like
Chemical Research
Scaffold Modification
Generate variant molecules based on given molecular scaffolds
Maintains core structures while exploring chemical space