Thinkless 1.5B Warmup

Developed by Vinnnf
Thinkless is a learnable framework that lets large language models choose adaptively between short-form and long-chain reasoning, based on task complexity and the model's own capabilities.
Downloads 966
Release Time: 5/16/2025

Model Overview

The framework is trained with reinforcement learning and uses two control tokens: <short> triggers a concise response, while <think> triggers detailed reasoning. Its core method is the Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, which decomposes the learning objective of hybrid reasoning into a control token loss and a response loss.
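
The decomposition described above can be illustrated with a minimal sketch. It assumes the control token is the first generated token of each response and that per-token losses are already available. The function name `decoupled_loss`, the weight `alpha`, and the mean-over-response normalization are hypothetical choices for illustration, not the published DeGRPO formulation.

```python
from typing import Sequence, Tuple


def decoupled_loss(token_losses: Sequence[float], alpha: float) -> Tuple[float, float, float]:
    """Split per-token losses into a control-token term and a response term.

    Assumption: token_losses[0] belongs to the control token (<short> or
    <think>) and the remaining entries belong to the response tokens.
    `alpha` is a hypothetical weight on the control-token term.
    """
    if len(token_losses) < 2:
        raise ValueError("need one control-token loss and at least one response-token loss")

    control_term = alpha * token_losses[0]

    # Average over response tokens so that long responses do not dominate
    # the signal coming from the single control token.
    response = token_losses[1:]
    response_term = sum(response) / len(response)

    return control_term, response_term, control_term + response_term


control, response, total = decoupled_loss([2.0, 1.0, 3.0], alpha=0.5)
print(control, response, total)
```

Keeping the two terms separate makes it possible to tune how strongly mode selection is updated relative to response quality, which is the motivation the overview gives for decoupling them.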

Model Features

Adaptive Reasoning
Automatically selects between short reasoning and long-chain reasoning based on task complexity
Decoupled Group Relative Policy Optimization
Uses the DeGRPO algorithm to decompose the learning objective into a control token loss and a response loss
Efficient Reasoning
Reduces the use of long-chain reasoning by 50%-90% in benchmark tests, significantly lowering computational costs
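
As a rough illustration of how the share of long-chain reasoning could be measured, the sketch below classifies decoded outputs by their leading control token. It assumes the control tokens appear as the literal strings <short> and <think> at the start of the decoded text; the helper names `reasoning_mode` and `long_chain_rate` are hypothetical, and the sample outputs are made up for the demonstration.

```python
from typing import Iterable


def reasoning_mode(text: str) -> str:
    """Return 'think', 'short', or 'unknown' based on the leading control token."""
    stripped = text.lstrip()
    if stripped.startswith("<think>"):
        return "think"
    if stripped.startswith("<short>"):
        return "short"
    return "unknown"


def long_chain_rate(outputs: Iterable[str]) -> float:
    """Fraction of outputs with a recognized control token that used <think>."""
    modes = [reasoning_mode(o) for o in outputs]
    known = [m for m in modes if m != "unknown"]
    if not known:
        return 0.0
    return known.count("think") / len(known)


# Made-up sample outputs, for demonstration only.
samples = [
    "<think>Let x be the unknown ...",
    "<short>The answer is 4.",
    "<short>12",
    "no control token here",
]
print(long_chain_rate(samples))
```

Outputs without a recognized control token are excluded from the denominator here; counting them as a separate category would be an equally reasonable choice.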

Model Capabilities

Adaptive Text Generation
Mathematical Reasoning
Question Answering

Use Cases

Education
Mathematical Problem Solving
Solves mathematical problems in areas such as algebra and arithmetic
Performs well on benchmarks like Minerva Algebra, MATH-500, and GSM8K
Research
Reasoning Mode Research
Investigates the adaptive reasoning capabilities of large models
Validates that the model effectively learns when to use long-chain reasoning