
Ultralong Thinking

Developed by mergekit-community
An 8B-parameter language model merged with the SLERP method, combining the strengths of the DeepSeek-R1 and Nemotron-8B models
Downloads: 69
Release Time: 4/17/2025

Model Overview

This is a pre-trained language model produced with the mergekit tool. It uses Spherical Linear Interpolation (SLERP) to fuse the DeepSeek-R1 and Nemotron-8B models, with the aim of combining their complementary strengths.
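For intuition, the following is a minimal Python/NumPy sketch of SLERP between two weight tensors. It is an illustrative reimplementation, not mergekit's actual code; the function name and epsilon handling are choices of this sketch.

import numpy as np

def slerp(w_a: np.ndarray, w_b: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns w_a, t=1 returns w_b; intermediate values follow the
    great-circle arc between the normalized tensors rather than a
    straight line.
    """
    a = w_a.ravel().astype(np.float64)
    b = w_b.ravel().astype(np.float64)
    a_norm = a / (np.linalg.norm(a) + eps)
    b_norm = b / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_norm, b_norm), -1.0, 1.0)
    omega = np.arccos(dot)          # angle between the two tensors
    if omega < eps:                 # nearly parallel: fall back to plain lerp
        return (1 - t) * w_a + t * w_b
    so = np.sin(omega)
    out = (np.sin((1 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b
    return out.reshape(w_a.shape).astype(w_a.dtype)

Compared with plain linear interpolation, following the arc between the two tensors tends to better preserve weight magnitudes, which is the usual motivation for SLERP merges.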

Model Features

Model fusion advantages
Combines DeepSeek-R1's distilled knowledge with Nemotron-8B's ultra-long context processing capability
V-shaped mixing strategy
The interpolation factor follows a V-shaped curve across layer depth, so the input and output layers weight one source model more heavily while the middle layers favor the other (see the config sketch after this list)
Long-context support
Inherits the Nemotron model's ultra-long context window of up to 4M tokens
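The card does not publish the merge recipe, so the following is a hedged sketch of what a mergekit SLERP config with a V-shaped curve could look like, written as a small Python script that emits the YAML and calls mergekit's command-line entry point. The source checkpoints, layer count, and t values are all assumptions for illustration.

import subprocess, textwrap

# Hypothetical reconstruction of a SLERP merge config; the actual sources,
# layer count, and t-curve used for this model are assumptions.
config = textwrap.dedent("""\
    merge_method: slerp
    base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
    slices:
      - sources:
          - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
            layer_range: [0, 32]
          - model: nvidia/Llama-3.1-8B-UltraLong-4M-Instruct
            layer_range: [0, 32]
    parameters:
      t:
        # V-shaped curve: endpoint layers stay close to the first model,
        # middle layers lean toward the second
        - filter: self_attn
          value: [0.0, 0.5, 1.0, 0.5, 0.0]
        - filter: mlp
          value: [0.0, 0.5, 1.0, 0.5, 0.0]
        - value: 0.5   # default for remaining tensors
    dtype: bfloat16
    """)

with open("merge-config.yaml", "w") as f:
    f.write(config)

# mergekit's CLI entry point (pip install mergekit)
subprocess.run(["mergekit-yaml", "merge-config.yaml", "./merged-model"], check=True)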

Model Capabilities

Text generation
Instruction following
Long-context understanding
Multi-turn dialogue

Use Cases

Dialogue systems
Intelligent assistant
Building intelligent assistants capable of handling complex multi-turn dialogues
Can draw on context of up to 4M tokens (a usage sketch follows this list)
Content generation
Long-form writing
Assisting in creating long articles or technical documents
Maintains long-distance contextual consistency
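As a concrete starting point, here is a minimal multi-turn chat sketch using the Hugging Face transformers library. The repository id is a guess based on the card's title and publisher and may differ; actually using anything near the 4M-token context would additionally require suitable hardware and an efficient attention implementation.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mergekit-community/Ultralong-Thinking"  # hypothetical repo id, may differ
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

def chat(messages):
    # Build the prompt with the model's chat template and generate a reply.
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=512)
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

messages = [{"role": "user", "content": "Outline a three-step plan for summarizing a 500-page report."}]
reply = chat(messages)
print(reply)

# Multi-turn: append the reply and continue; earlier turns remain in context.
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Expand on step 2."})
print(chat(messages))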