
Doge 20M Instruct

Developed by SmallDoge
Doge 20M is a small language model built on a dynamic masked attention mechanism, supporting instruction following and Q&A tasks.
Downloads 5,010
Release Time: 12/14/2024

Model Overview

Doge employs dynamic masked attention for sequence transformation and can use either multi-layer perceptrons or a cross-domain mixture of experts for state transitions. The model was supervised fine-tuned (SFT) on the SmolTalk dataset, then trained with direct preference optimization (DPO) on the UltraFeedback Binarized dataset.
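For reference, here is a minimal inference sketch using Hugging Face transformers. It assumes the checkpoint is published under the repo id SmallDoge/Doge-20M-Instruct and that the custom Doge architecture is loaded via trust_remote_code=True; both are assumptions to verify against the model card.

```python
# Minimal sketch: load the instruct model and answer one question.
# Assumes the repo id "SmallDoge/Doge-20M-Instruct" and that the custom
# architecture requires trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "SmallDoge/Doge-20M-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

# Build a chat-formatted prompt from a single user turn.
conversation = [{"role": "user", "content": "Hi, how are you today?"}]
inputs = tokenizer.apply_chat_template(
    conversation, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```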

Model Features

Dynamic Masked Attention Mechanism
Allows the Transformer to use self-attention during training and a state-space formulation during inference (a toy sketch follows below)
Cross-domain Mixture of Experts
Can directly inherit weights from the multi-layer perceptrons for further training
Efficient Inference
Reaches 142 tokens/sec on an 11th-gen Intel i7 CPU
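To make the dynamic-mask idea concrete, the toy PyTorch sketch below computes the attention mask from the input itself rather than using a fixed pattern. The gating projection dt_proj and the sigmoid/log gating formula are illustrative assumptions, not Doge's actual implementation.

```python
# Toy sketch of dynamic masked attention: an input-dependent gate is added
# to the attention logits on top of the usual causal mask. Illustrative only.
import torch
import torch.nn.functional as F

def dynamic_masked_attention(q, k, v, dt_proj):
    # q, k, v: (batch, seq, dim); dt_proj: torch.nn.Linear(dim, 1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5  # (batch, seq, seq)
    # Input-dependent "keep" gate per key position; log makes it additive.
    gate = torch.sigmoid(dt_proj(v)).squeeze(-1)           # (batch, seq)
    dyn_mask = torch.log(gate + 1e-9).unsqueeze(1)         # (batch, 1, seq)
    # Standard causal mask so a token cannot attend to future positions.
    causal = torch.triu(
        torch.full(scores.shape[-2:], float("-inf")), diagonal=1
    )
    attn = F.softmax(scores + dyn_mask + causal, dim=-1)
    return attn @ v

b, s, d = 2, 8, 16
q = k = v = torch.randn(b, s, d)
out = dynamic_masked_attention(q, k, v, torch.nn.Linear(d, 1))
print(out.shape)  # torch.Size([2, 8, 16])
```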

Model Capabilities

Instruction Following
Question Answering
Text Generation

Use Cases

Dialogue Systems
Daily Conversations
Building chatbots for everyday conversation (see the chat-loop sketch after this list)
Q&A Systems
Knowledge Q&A
Answering a wide range of user questions
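As a sketch of the chatbot use case, the loop below keeps the conversation history and re-applies the chat template each turn. The repo id and trust_remote_code usage are the same assumptions as in the earlier example.

```python
# Toy multi-turn chat loop; repo id is an assumption, as above.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "SmallDoge/Doge-20M-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

history = []
while True:
    user = input("you> ")
    if not user:  # empty input ends the chat
        break
    history.append({"role": "user", "content": user})
    ids = tokenizer.apply_chat_template(
        history, add_generation_prompt=True, return_tensors="pt"
    )
    out = model.generate(ids, max_new_tokens=128)
    reply = tokenizer.decode(out[0][ids.shape[-1]:], skip_special_tokens=True)
    history.append({"role": "assistant", "content": reply})
    print("bot>", reply)
```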