# Self-Distillation Training

## SeerAttention QwQ-32B AttnGates
License: Apache-2.0
An attention-gating (AttnGates) weight adapter for the QwQ-32B model that accelerates long-context computation through dynamic block-level sparsity.
Tags: Large Language Model, Transformers
By SeerAttention · 35 downloads · 3 likes
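The block-level sparsity idea can be sketched roughly as follows: a per-block gate score decides which key/value blocks a query attends to, so gated-out blocks cost no compute. The gate values, block size, and threshold below are illustrative assumptions for the sketch, not the model's actual learned gates.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]

def block_sparse_attention(q, keys, vals, gates, block=2, thresh=0.5):
    # Keep only the key/value blocks whose gate score passes the
    # threshold (in AttnGates the gates come from learned adapters;
    # here they are plain inputs to the sketch).
    kept_k, kept_v = [], []
    for b, g in enumerate(gates):
        if g >= thresh:
            kept_k += keys[b * block:(b + 1) * block]
            kept_v += vals[b * block:(b + 1) * block]
    if not kept_k:  # every block gated out
        return [0.0] * len(vals[0])
    d = math.sqrt(len(q))
    w = softmax([sum(qi * ki for qi, ki in zip(q, k)) / d for k in kept_k])
    # Weighted sum over only the surviving value vectors.
    return [sum(wi * v[j] for wi, v in zip(w, kept_v))
            for j in range(len(kept_v[0]))]
```

Only the kept blocks enter the softmax, so cost scales with the number of surviving blocks rather than the full context length, which is where the long-context speedup comes from.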
## SPLADE CoCondenser SelfDistil
A SPLADE model for passage retrieval that improves retrieval effectiveness through sparse latent document expansion and knowledge-distillation techniques.
Tags: Text Embedding, Transformers, English
By naver · 16.11k downloads · 10 likes
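Sparse latent expansion in SPLADE maps each token's masked-language-model logits to non-negative term weights via log(1 + ReLU(x)) and max-pools over token positions; terms with zero weight drop out, leaving a sparse vocabulary-sized vector. A minimal sketch with toy logits (the real model produces these from a BERT MLM head):

```python
import math

def splade_weights(token_logits):
    """token_logits: one row of vocabulary logits per input token.
    Returns a sparse {term_index: weight} map via log(1 + ReLU(x))
    followed by max-pooling over the token positions."""
    vocab_size = len(token_logits[0])
    rep = {}
    for j in range(vocab_size):
        w = max(math.log1p(max(row[j], 0.0)) for row in token_logits)
        if w > 0.0:  # zero-weight terms stay out of the sparse vector
            rep[j] = w
    return rep
```

Scoring is then a sparse dot product between query and document representations, which is what lets SPLADE vectors be served from an inverted index like classical bag-of-words retrieval.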
## Trans-Encoder Bi-SimCSE RoBERTa-Large
An unsupervised sentence encoder based on RoBERTa-large, trained with self-distillation and mutual-distillation techniques, suited to sentence-similarity tasks.
Tags: Text Embedding, Transformers
By cambridgeltl · 17 downloads · 0 likes
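The distillation signal used in such training can be sketched as a soft-label loss: the student matches the teacher's temperature-softened distribution over similarity scores, and in mutual distillation the two encoders alternate teacher and student roles. The KL form and temperature below are generic distillation assumptions, not the exact Trans-Encoder recipe.

```python
import math

def soft_targets(scores, tau):
    # Temperature-softened softmax over a batch of similarity scores.
    m = max(s / tau for s in scores)
    es = [math.exp(s / tau - m) for s in scores]
    z = sum(es)
    return [e / z for e in es]

def distill_loss(student_scores, teacher_scores, tau=2.0):
    """KL(teacher || student) on the softened score distributions;
    the teacher's scores act as soft labels for the student."""
    p = soft_targets(teacher_scores, tau)
    q = soft_targets(student_scores, tau)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The loss is zero when the student already reproduces the teacher's ranking distribution and grows as the two diverge, so each alternation nudges one encoder toward the other's current scores.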
© 2025 AIbase