
materials.smi-ted

Developed by ibm-research
A chemical language foundation model from IBM that supports tasks such as molecular representation conversion and quantum property prediction.
Downloads 20.65k
Release Time: 7/25/2024

Model Overview

SMI-TED is a large encoder-decoder chemical foundation model based on SMILES. It was pre-trained on 91 million molecular samples and supports complex tasks such as molecular representation conversion and quantum property prediction.

Model Features

Multimodal Molecular Representation
Supports various molecular representations including SMILES strings, SELFIES encoding, and 3D atomic coordinates
Large-scale Pretraining
Pre-trained on 91 million molecular samples (4 billion tokens) from PubChem
Dual Training Strategy
Combines masked language modeling and encoder-decoder strategies to optimize model performance
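
To make the masked language modeling idea concrete, here is a minimal sketch of how a SMILES string could be tokenized and partially masked before being fed to a model. The tokenization regex is a commonly used pattern for SMILES and the 15% mask rate follows general masked-language-model convention. Both are assumptions for illustration, as are the names `tokenize` and `mask_tokens`; SMI-TED's actual tokenizer, vocabulary, and masking schedule may differ.

```python
import random
import re

# Assumption: a widely used regex for splitting SMILES into atom/bond tokens.
# SMI-TED's own tokenizer and vocabulary may differ.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\*|%[0-9]{2}|[0-9])"
)


def tokenize(smiles):
    """Split a SMILES string into tokens, failing loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize: {smiles!r}")
    return tokens


def mask_tokens(tokens, mask_rate=0.15, mask_token="<mask>", seed=0):
    """Replace a random subset of tokens with mask_token.

    Returns the masked sequence and a dict {position: original token},
    which is what a masked-language-model objective would try to recover.
    The 15% default rate is an assumption, not a documented SMI-TED setting.
    """
    rng = random.Random(seed)
    # Mask at least one token when possible, never more than the sequence holds.
    n_masked = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    positions = sorted(rng.sample(range(len(tokens)), n_masked))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        masked[pos] = mask_token
    return masked, targets


tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
masked, targets = mask_tokens(tokens)
print(masked)
print(targets)
```

During pre-training, the model would see the masked sequence and be scored on how well it predicts the entries of `targets`. The decoder half of an encoder-decoder setup would additionally be trained to reconstruct the full original string.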

Model Capabilities

Molecular representation conversion
Quantum property prediction
SMILES encoding and decoding
Molecular feature extraction
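
As an illustration of how extracted molecular features are typically consumed, the sketch below runs an embedding-based similarity lookup. `placeholder_encode` is a hypothetical stand-in that hashes the SMILES string so the example runs without model weights. It is not SMI-TED and its similarities carry no chemical meaning. In practice the `encode` argument would be replaced by a real encoder that returns learned embeddings.

```python
import hashlib
import math


def placeholder_encode(smiles, dim=8):
    """Hypothetical stand-in for a learned encoder.

    Maps a SMILES string to a fixed-length vector by hashing it.
    This is NOT SMI-TED; a real encoder would return learned embeddings.
    """
    digest = hashlib.sha256(smiles.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query, library, encode=placeholder_encode):
    """Return the library molecule whose embedding is closest to the query's."""
    query_vec = encode(query)
    return max(library, key=lambda s: cosine_similarity(query_vec, encode(s)))


library = ["CCO", "c1ccccc1", "CC(=O)O"]  # ethanol, benzene, acetic acid
vector = placeholder_encode("CCO")
match = nearest("CCO", library)
print(len(vector), match)
```

The same pattern extends to property prediction: the fixed-length vectors become the input features of a downstream regressor or classifier.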

Use Cases

Material Discovery
Novel Molecule Design
Generates potential new compounds through molecular representation learning
Drug Development
Molecular Property Prediction
Predicts quantum chemical properties of drug candidates
Demonstrated excellent performance on MoleculeNet benchmark tests