PlantCaduceus L20

Developed by kuleshov-group
PlantCaduceus is a DNA language model pre-trained on 16 angiosperm genomes. It builds on the Caduceus and Mamba architectures and learns evolutionary conservation and DNA sequence syntax through a masked language modeling objective.
Downloads 8,967
Release Time: 5/19/2024

Model Overview

PlantCaduceus is a DNA language model specifically designed for processing and analyzing plant genome sequences, capable of learning evolutionary conservation and DNA sequence syntax.

Model Features

Multi-Species Genome Pre-training
Pre-trained on 16 angiosperm genomes, covering 160 million years of evolutionary history.
Multiple Parameter Scales
Offers models ranging from 20 million to 225 million parameters to accommodate different computational needs.
Evolutionary Conservation Learning
Capable of learning evolutionary conservation and syntax rules in DNA sequences.

Model Capabilities

DNA sequence analysis
Genome masked language modeling
Evolutionary conservation prediction
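
Masked language modeling hides a base and asks the model to predict it from the surrounding sequence. The sketch below only illustrates how such an input might be prepared. The `mask_position` helper and the `[MASK]` placeholder string are assumptions made for illustration; the model's actual tokenizer may use a different mask token and vocabulary.

```python
from typing import List, Tuple

# Assumption: placeholder string; check the real tokenizer for its mask token.
MASK_TOKEN = "[MASK]"


def mask_position(sequence: str, index: int) -> Tuple[List[str], str]:
    """Split a DNA sequence into single-base tokens and mask one position.

    Returns the token list with the chosen base replaced by MASK_TOKEN,
    together with the base that was hidden.
    """
    if not 0 <= index < len(sequence):
        raise IndexError("index is outside the sequence")
    tokens = list(sequence.upper())
    hidden = tokens[index]
    tokens[index] = MASK_TOKEN
    return tokens, hidden


# Hide the fourth base (index 3) of a short example sequence.
tokens, hidden = mask_position("ACGTTGCA", 3)
```

A masked language model would then be asked for a probability distribution over A, C, G, and T at the masked position.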

Use Cases

Genome Research
DNA Sequence Scoring
Use the model to score DNA sequences zero-shot, without task-specific fine-tuning.
Evolutionary Conservation Analysis
Analyze conserved regions in DNA sequences across different species.
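
This page does not say how zero-shot scores are computed. One common approach with masked DNA language models is a log-likelihood ratio: mask the position of interest, read the predicted probabilities for the four bases, and compare the alternate base with the reference base. The sketch below assumes that approach. The `zero_shot_score` function and the probability values are hypothetical, not model output.

```python
import math
from typing import Dict


def zero_shot_score(probs: Dict[str, float], ref: str, alt: str) -> float:
    """Log-likelihood ratio log(p_alt / p_ref) at one masked position.

    A strongly negative score means the model considers the alternate base
    much less likely than the reference base in this context, which is
    often read as a sign of a conserved position.
    """
    return math.log(probs[alt]) - math.log(probs[ref])


# Hypothetical probabilities at a masked position (illustrative values only).
probs = {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05}

# Score an A-to-G substitution at this position.
score = zero_shot_score(probs, ref="A", alt="G")
```

With these made-up probabilities the score is negative, because the model assigns far more probability to the reference base than to the alternate.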