
Doge 20M Chinese

Developed by wubingheng
The Doge model uses a dynamic masked attention mechanism for sequence transformation, and can use either multi-layer perceptrons or a cross-domain mixture of experts for state transformation.
Downloads: 65
Release Time: 4/11/2025

Model Overview

The Doge model is a Chinese text-generation model built on a dynamic masked attention mechanism, which supports switching between different state-transition mechanisms at training time and at inference time.
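The train/inference switch described above can be illustrated with a generic toy sketch (NumPy; not the actual Doge implementation, whose dynamic mask and state-space formulation are more involved): parallel causal self-attention computed over the whole sequence at training time yields the same outputs as a sequential, cache-based pass at inference time, which is what makes the switch lossless in this simplified setting.

```python
import numpy as np

# Toy dimensions and random projections (illustrative only).
rng = np.random.default_rng(0)
T, d = 5, 4
Q = rng.normal(size=(T, d))
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def train_mode_attention(Q, K, V):
    # Training: full T x T score matrix with a causal mask, computed in parallel.
    scores = Q @ K.T / np.sqrt(d)
    causal = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(causal, -np.inf, scores)
    return softmax(scores) @ V

def infer_mode_attention(Q, K, V):
    # Inference: one token at a time, attending only to the cached
    # states 0..t -- a recurrent, state-space-style pass over the
    # same projections.
    out = np.zeros_like(Q)
    for t in range(T):
        s = Q[t] @ K[:t + 1].T / np.sqrt(d)
        out[t] = softmax(s) @ V[:t + 1]
    return out

# Both modes produce identical outputs in this toy setting.
assert np.allclose(train_mode_attention(Q, K, V), infer_mode_attention(Q, K, V))
```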

Model Features

Dynamic Masked Attention Mechanism
Lets the Transformer use self-attention during training and switch to a state-space mechanism during inference.
Cross-domain Mixture of Experts
Lets the experts directly inherit the weights of a multi-layer perceptron for further training.
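The weight-inheritance idea in the second feature can be sketched as follows (a minimal NumPy illustration with hypothetical names, not the Doge code): each expert is initialized by copying the weights of an already-trained dense MLP, so at the moment of the switch the mixture computes exactly what the MLP computed, and further training only refines it.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_hidden, n_experts = 8, 16, 4

# Stand-in for a dense MLP trained first (weights are random placeholders).
W_in = rng.normal(size=(d_model, d_hidden))
W_out = rng.normal(size=(d_hidden, d_model))

def mlp(x):
    # Plain two-layer ReLU MLP.
    return np.maximum(x @ W_in, 0.0) @ W_out

# Each expert inherits the dense MLP's weights at initialization.
experts = [(W_in.copy(), W_out.copy()) for _ in range(n_experts)]

def expert_forward(x, i):
    Wi, Wo = experts[i]
    return np.maximum(x @ Wi, 0.0) @ Wo

x = rng.normal(size=(3, d_model))
# At initialization, every expert reproduces the dense MLP's output,
# so swapping the MLP for the mixture is a no-op before further training.
assert all(np.allclose(mlp(x), expert_forward(x, i)) for i in range(n_experts))
```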

Model Capabilities

Chinese Text Generation

Use Cases

Text Generation
Dialogue Generation
Used for generating natural language dialogues