# DUO: A Text Generation Model

DUO is a pre-trained model for masked language modeling, offering high-quality text generation capabilities.
## Quick Start
To use the pre-trained model for masked language modeling, use the following snippet:

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('gpt2')
model = AutoModelForMaskedLM.from_pretrained('s-sahoo/duo-distilled')
```
For a hands-on example, check out this [Colab notebook](https://colab.research.google.com/drive/1Sf7R-dqdR6gq-H8nyZ9E3ZkyvqMTqcwq?usp=sharing).
For more information and implementation details, visit our GitHub repository: [DUO](https://github.com/s-sahoo/duo)
## Features

- Text Generation: Ideal for masked language modeling tasks.
- Context Length: The model has a context length of 1024 tokens.
- Model Size: Similar in size to GPT-2 Medium, with approximately 130 million non-embedding parameters.
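Inputs longer than the 1024-token context length must be truncated or split before being passed to the model. A minimal sketch of chunking a token-id sequence to fit the context window (`chunk_token_ids` is a hypothetical helper for illustration, not part of the DUO codebase):

```python
def chunk_token_ids(token_ids, context_length=1024):
    # Split a long token-id sequence into consecutive chunks,
    # each no longer than the model's context length.
    return [token_ids[i:i + context_length]
            for i in range(0, len(token_ids), context_length)]

ids = list(range(2500))  # stand-in for real token ids from a tokenizer
chunks = chunk_token_ids(ids)
print([len(c) for c in chunks])  # → [1024, 1024, 452]
```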
## Documentation

### Model Details

The model has a context length of 1024 and is similar in size to GPT-2 Medium, with approximately 130 million non-embedding parameters. It was trained for 1M steps on the OpenWebText corpus.

For more details, please see our paper: The Diffusion Duality.

Project page: https://s-sahoo.com/duo
## License

This project is licensed under the Apache 2.0 license.
## Citation

Please cite our work using the BibTeX below:

```bibtex
@inproceedings{
sahoo2025the,
title={The Diffusion Duality},
author={Subham Sekhar Sahoo and Justin Deschenaux and Aaron Gokaslan and Guanghan Wang and Justin T Chiu and Volodymyr Kuleshov},
booktitle={ICLR 2025 Workshop on Deep Generative Model in Machine Learning: Theory, Principle and Efficacy},
year={2025},
url={https://openreview.net/forum?id=CB0Ub2yXjC}
}
```
## Model Card Contact
Subham Sekhar Sahoo (ssahoo@cs.cornell.edu)