Gpt2 Zinc 87m
Developed by entropy
An autoregressive language model based on the GPT-2 architecture, designed to generate drug-like molecules and to produce embeddings from SMILES strings.
Downloads 404
Release Date: 5/11/2023
Model Overview
This model was trained on approximately 480 million SMILES strings from the ZINC database. It is suited to molecular generation tasks in chemistry and drug discovery.
Model Features
Large-scale molecular training data
Trained on approximately 480 million SMILES strings from the ZINC database
High generation quality
Generates molecules with high uniqueness and validity across different temperature settings
Embedding representation capability
Produces embedding vectors from SMILES strings that can be used as molecular representations
Optimized training
Trained for 175,000 iterations with a batch size of 3072, achieving a validation loss of approximately 0.615
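To put the training figures above in context, the following is a rough back-of-the-envelope sketch. It rests on two assumptions that the page does not confirm: that the batch size counts SMILES strings per iteration, and that the validation loss is a mean per-token cross-entropy measured in nats.

```python
import math

# Figures quoted on this page.
iterations = 175_000
batch_size = 3_072          # ASSUMPTION: counts SMILES strings per iteration
dataset_size = 480_000_000  # approximate size of the training set
val_loss = 0.615            # ASSUMPTION: mean per-token cross-entropy in nats

# Total training examples processed, and how many passes over the data that is.
samples_seen = iterations * batch_size
epochs = samples_seen / dataset_size

# Perplexity is the exponential of the cross-entropy loss.
perplexity = math.exp(val_loss)

print(f"samples seen: {samples_seen:,}")
print(f"approximate passes over the data: {epochs:.2f}")
print(f"per-token perplexity: {perplexity:.2f}")
```

Under those assumptions the model saw about 537.6 million strings, a little over one pass through the data, and the loss corresponds to a per-token perplexity of roughly 1.85.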
Model Capabilities
Molecular generation
SMILES string embedding representation
Drug-like compound design
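Molecular generation with an autoregressive model proceeds one token at a time, and the temperature setting mentioned on this page controls how sharply the next-token distribution is peaked. The sketch below illustrates generic temperature sampling using only the standard library. It is not the model's own code: the function names and the toy logits are hypothetical and chosen for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(toy_logits, 0.5)     # sharper distribution
neutral = softmax_with_temperature(toy_logits, 1.0)  # unscaled distribution
hot = softmax_with_temperature(toy_logits, 2.0)      # flatter distribution
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures spread it out, which generally trades validity for diversity.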
Use Cases
Drug discovery
Virtual compound library generation
Generate large numbers of potential drug candidate molecules
At temperature 1.0, 99.9% of generated molecules are unique and 99.9% are valid
Molecular representation learning
Convert SMILES strings into embedding vectors for downstream tasks
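One common way to turn per-token hidden states into a single fixed-length vector is masked mean pooling, which averages the token vectors while ignoring padding positions. The page does not say which pooling this model's embeddings use, so the sketch below is an assumption, written with plain Python lists and a hypothetical helper name. In practice the same operation would run on the model's output tensors.

```python
def mean_pool(hidden_states, attention_mask):
    """Average the token vectors whose mask entry is 1.

    hidden_states:  list of token vectors, shape (seq_len, dim)
    attention_mask: list of 0/1 flags, shape (seq_len,), 0 marks padding
    """
    if len(hidden_states) != len(attention_mask):
        raise ValueError("hidden_states and attention_mask differ in length")
    kept = [vec for vec, flag in zip(hidden_states, attention_mask) if flag]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example: two real tokens followed by one padding position.
toy_states = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
toy_mask = [1, 1, 0]
embedding = mean_pool(toy_states, toy_mask)
print(embedding)  # [2.0, 3.0]
```

The padding vector is excluded from the average, so molecules of different lengths yield embeddings of the same dimension that can be passed to a downstream model.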
Chemical research
Chemical space exploration
Generate novel molecular structures to explore chemical space
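The uniqueness and validity percentages quoted above can be computed in several ways, and the page does not state which definition was used. The sketch below follows one common convention: validity is the share of generated strings that pass a validity check, and uniqueness is the share of distinct strings among the valid ones. The validator here is a deliberately simple stand-in that only checks balanced parentheses. It is not a chemistry check, and a real evaluation would parse each string with a cheminformatics toolkit such as RDKit.

```python
def validity_and_uniqueness(smiles_list, is_valid):
    """Return (validity, uniqueness) for a batch of generated strings."""
    if not smiles_list:
        raise ValueError("smiles_list is empty")
    valid = [s for s in smiles_list if is_valid(s)]
    validity = len(valid) / len(smiles_list)
    # Uniqueness is measured among valid strings only (one common convention).
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0
    return validity, uniqueness


def balanced_parentheses(smiles):
    """STAND-IN validator: checks branch parentheses only, not chemistry."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and len(smiles) > 0


# Hypothetical generated batch: one duplicate and one unbalanced string.
generated = ["CCO", "CCO", "CC(=O)O", "C(C", "c1ccccc1"]
validity, uniqueness = validity_and_uniqueness(generated, balanced_parentheses)
print(validity, uniqueness)  # 0.8 0.75
```

Comparing raw strings undercounts duplicates, because different SMILES strings can describe the same molecule. Canonicalizing each string before the comparison gives a stricter uniqueness figure.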