Ultralong Thinking
Developed by mergekit-community
An 8B-parameter language model merged using the SLERP method, combining the strengths of the DeepSeek-R1 and Nemotron-8B models
Downloads: 69
Release Time: 4/17/2025
Model Overview
This is a pre-trained language model merged with the mergekit tool, using Spherical Linear Interpolation (SLERP) to fuse the DeepSeek-R1 and Nemotron-8B models and combine their complementary strengths
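For reference, below is a minimal NumPy sketch of the SLERP operation applied per weight tensor: the interpolation angle is computed from normalized copies of the two tensors, and the originals are blended along that arc. This illustrates the math only; it is not mergekit's exact implementation.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    Illustrative sketch only -- mergekit's real implementation adds more
    edge-case handling, but the core math is the same.
    """
    v0_flat, v1_flat = v0.ravel(), v1.ravel()
    # Normalize copies to measure the angle between the two tensors.
    v0_n = v0_flat / (np.linalg.norm(v0_flat) + eps)
    v1_n = v1_flat / (np.linalg.norm(v1_flat) + eps)
    dot = np.clip(v0_n @ v1_n, -1.0, 1.0)
    omega = np.arccos(dot)
    if abs(np.sin(omega)) < eps:
        # Nearly colinear tensors: fall back to plain linear interpolation.
        return ((1.0 - t) * v0_flat + t * v1_flat).reshape(v0.shape)
    so = np.sin(omega)
    out = (np.sin((1.0 - t) * omega) / so) * v0_flat \
        + (np.sin(t * omega) / so) * v1_flat
    return out.reshape(v0.shape)
```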
Model Features
Model fusion advantages
Combines DeepSeek-R1's distilled knowledge with Nemotron-8B's ultra-long context processing capability
V-shaped mixing strategy
The SLERP interpolation factor varies across layer depth in a V-shaped curve, so the input/output layers lean toward one parent model while the middle layers lean toward the other (see the sketch after this list)
Long-context support
Inherits the Nemotron parent's ultra-long context window of up to 4M tokens
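A minimal sketch of what a V-shaped per-layer schedule could look like, reusing the slerp() helper from the sketch above. The layer count, tensor shapes, and exact t values are assumptions for illustration; only the V shape itself is described by this merge.

```python
import numpy as np

# Hypothetical V-shaped schedule over 32 transformer layers: t == 0 keeps the
# first parent's weights, t == 1 keeps the second parent's. Endpoint layers
# favor one parent, middle layers favor the other.
num_layers = 32
half = np.linspace(1.0, 0.0, num_layers // 2)
t_schedule = np.concatenate([half, half[::-1]])  # 1 -> 0 -> 1 across depth

# Dummy stand-ins for corresponding layer tensors from the two parent models.
model_a_layers = [np.random.randn(16, 16) for _ in range(num_layers)]
model_b_layers = [np.random.randn(16, 16) for _ in range(num_layers)]

merged_layers = [
    slerp(t, layer_a, layer_b)  # slerp() as defined in the sketch above
    for t, (layer_a, layer_b)
    in zip(t_schedule, zip(model_a_layers, model_b_layers))
]
```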
Model Capabilities
Text generation
Instruction following
Long-context understanding
Multi-turn dialogue
Use Cases
Dialogue systems
Intelligent assistant
Building intelligent assistants capable of handling complex multi-turn dialogues
Can draw on context up to 4M tokens long (a usage sketch follows this section)
Content generation
Long-form writing
Assisting in creating long articles or technical documents
Maintains long-distance contextual consistency
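A minimal multi-turn dialogue sketch using the Hugging Face transformers API, assuming the merge is published as a standard causal LM with a chat template. The repo id below is a placeholder, not this model's confirmed path.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the model's actual Hugging Face path.
model_id = "mergekit-community/ultralong-thinking"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Multi-turn dialogue: earlier turns stay in the context window, which the
# Nemotron parent extends to millions of tokens.
messages = [
    {"role": "user", "content": "Summarize the attached design doc."},
    {"role": "assistant", "content": "It proposes a three-stage pipeline..."},
    {"role": "user", "content": "What risks did stage two carry?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```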