MoLFormer XL Both 10pct

Developed by ibm-research
MoLFormer is a family of chemical language models pre-trained on SMILES strings for up to 1.1 billion molecules from ZINC and PubChem. This version was trained on a 10% sample of each dataset.
Downloads 171.96k
Release Time: 10/20/2023

Model Overview

A chemical language model based on a linear attention Transformer architecture. It is used primarily for molecular feature extraction and property prediction.

Model Features

Efficient Attention Mechanism
Uses a linear attention Transformer architecture, so the cost of attention grows linearly rather than quadratically with sequence length.
Dual Dataset Pretraining
Trained on both the ZINC15 and PubChem datasets, covering a broader chemical space than either dataset alone.
Molecular Representation Learning
Captures relationships between molecular structure and properties through self-supervised learning.
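
To make the efficiency claim concrete, here is a minimal sketch of kernelized linear attention in plain Python. It is an illustration only: the elu(x) + 1 feature map and the function names are assumptions chosen for simplicity, not MoLFormer's actual implementation, which this page does not describe. The point is that keys and values are summarized once and then reused by every query, so no n x n attention matrix is built.

```python
import math


def feature_map(vector):
    """Positive feature map elu(x) + 1, applied elementwise (an assumed choice)."""
    return [x + 1.0 if x > 0 else math.exp(x) for x in vector]


def linear_attention(queries, keys, values):
    """Kernelized attention in O(n * d * d_v) time for n tokens."""
    phi_q = [feature_map(q) for q in queries]
    phi_k = [feature_map(k) for k in keys]
    d, d_v = len(phi_k[0]), len(values[0])

    # Summarize all keys and values once:
    # kv[a][b] = sum over tokens j of phi_k[j][a] * values[j][b]
    kv = [[0.0] * d_v for _ in range(d)]
    k_sum = [0.0] * d
    for pk, v in zip(phi_k, values):
        for a in range(d):
            k_sum[a] += pk[a]
            for b in range(d_v):
                kv[a][b] += pk[a] * v[b]

    # Each query reuses the summaries, so no n x n matrix is ever built.
    outputs = []
    for pq in phi_q:
        norm = sum(pq[a] * k_sum[a] for a in range(d))
        outputs.append(
            [sum(pq[a] * kv[a][b] for a in range(d)) / norm for b in range(d_v)]
        )
    return outputs


def quadratic_attention(queries, keys, values):
    """Same kernel computed with explicit pairwise weights, for comparison."""
    phi_q = [feature_map(q) for q in queries]
    phi_k = [feature_map(k) for k in keys]
    outputs = []
    for pq in phi_q:
        weights = [sum(x * y for x, y in zip(pq, pk)) for pk in phi_k]
        total = sum(weights)
        outputs.append(
            [sum(w * v[b] for w, v in zip(weights, values)) / total
             for b in range(len(values[0]))]
        )
    return outputs
```

Both functions compute the same result up to floating-point error. Only the order of the multiplications differs, and that reordering is what removes the quadratic term.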

Model Capabilities

Molecular Feature Extraction
Molecular Property Prediction
Molecular Similarity Calculation
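
The first and third capabilities can be combined as in the sketch below: extract one embedding per SMILES string, then compare embeddings with cosine similarity. Several details are assumptions not stated on this page: the Hub repository id in `MODEL_ID`, the need for `trust_remote_code=True`, and the use of `pooler_output` as the molecule-level embedding. The helper names `embed` and `cosine_similarity` are hypothetical.

```python
import math

# Assumed Hugging Face Hub repository id; verify it before use.
MODEL_ID = "ibm/MoLFormer-XL-both-10pct"


def embed(smiles_list):
    """Return one embedding (a list of floats) per SMILES string."""
    # Heavy imports are deferred so cosine_similarity works without them.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # trust_remote_code=True is assumed to be required for this custom model.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)
    model.eval()

    inputs = tokenizer(smiles_list, padding=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Assumption: pooler_output holds the pooled, molecule-level embedding.
    return outputs.pooler_output.tolist()


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

A typical call would be `emb = embed(["CCO", "CCN"])` followed by `cosine_similarity(emb[0], emb[1])`. Loading the model requires network access and the `torch` and `transformers` packages. The similarity helper has no such dependency.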

Use Cases

Drug Discovery
Solubility Prediction
Predicts water solubility of compounds.
RMSE of 0.3295 on the ESOL dataset.
Toxicity Prediction
Evaluates compound toxicity.
AUROC of 84.5% on the Tox21 dataset.
Materials Science
Quantum Chemical Property Prediction
Predicts quantum mechanical properties of molecules.
MAE of 1.7754 on the QM9 dataset.
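
For readers unfamiliar with the three metrics quoted above, the following sketch shows how each is computed from predictions. It is a plain-Python illustration with hypothetical function names and toy inputs. It does not reproduce the reported benchmark numbers, which depend on the datasets, splits, and fine-tuning setup.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for illustration but slow for large datasets.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Lower is better for RMSE and MAE, while higher is better for AUROC. `auroc` returns a value between 0 and 1, so a score reported as 84.5 corresponds to 0.845 on that scale.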