
Thinkless 1.5B RL DeepScaleR

Developed by Vinnnf
Thinkless is a large language model trained via reinforcement learning that adaptively selects between short-form and long-chain reasoning modes, significantly reducing inference compute costs.
Downloads 197
Release Date: May 16, 2025

Model Overview

The Thinkless framework trains large language models with the Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, enabling them to adaptively choose between short-form reasoning (marked with a <short> tag) and long-chain reasoning (marked with a <think> tag) based on problem complexity.
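As an illustration of the control-tag mechanism described above, a caller can route a completion by inspecting its leading tag. The helper below is a minimal sketch, not part of any released API; it only assumes the <short> and <think> tag names mentioned in the overview:

```python
def reasoning_mode(completion: str) -> str:
    """Return which reasoning mode a Thinkless-style completion selected,
    based on the leading control tag (<short> or <think>)."""
    stripped = completion.lstrip()
    if stripped.startswith("<short>"):
        return "short"
    if stripped.startswith("<think>"):
        return "long"
    return "unknown"

# An easy problem answered tersely vs. a hard one answered with a chain.
print(reasoning_mode("<short> 2 + 2 = 4"))                  # short
print(reasoning_mode("<think> Let x be the unknown ... "))  # long
```

In practice the tag would be emitted by the model itself at the start of generation; downstream systems can use it to budget decoding length.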

Model Features

Adaptive Reasoning Mode
The model can automatically choose between short or long-chain reasoning modes based on problem complexity, reducing long-chain reasoning usage by 50%-90%.
Decoupled Group Relative Policy Optimization
Employs the DeGRPO algorithm to decompose the mixed reasoning objective into a control-tag loss and a response loss, avoiding the collapse issues of the original GRPO.
Efficient Inference
Significantly reduces inference computational costs while maintaining accuracy, suitable for applications requiring rapid responses.
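The decoupling idea behind DeGRPO can be sketched as follows. This is an illustrative toy, not the paper's published equations: the separate averaging of the two terms and the alpha weight are assumptions made for exposition. The point is that the single control-tag token gets its own loss term instead of being drowned out by the much longer response:

```python
import numpy as np

def degrpo_loss(control_logp, response_logp, advantages, alpha=1.0):
    """Illustrative decoupled policy-gradient objective (assumed form).

    control_logp  : (N,) log-prob of each sample's mode tag (<short>/<think>)
    response_logp : (N,) mean log-prob over each sample's response tokens
    advantages    : (N,) group-relative advantages
    alpha         : assumed hyperparameter balancing the two terms
    """
    # Each term is averaged separately, so the one-token control loss
    # carries weight comparable to the many-token response loss.
    control_term = -np.mean(advantages * control_logp)
    response_term = -np.mean(advantages * response_logp)
    return alpha * control_term + response_term

loss = degrpo_loss(
    control_logp=np.array([-1.0, -1.0]),
    response_logp=np.array([-2.0, -2.0]),
    advantages=np.array([1.0, 1.0]),
)
print(loss)  # 3.0
```

By contrast, a coupled objective that averages over all tokens at once would weight the mode tag by roughly 1/sequence_length, which is the imbalance the decoupling is described as fixing.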

Model Capabilities

Mathematical Reasoning
Question Answering
Adaptive Reasoning Mode Selection

Use Cases

Education
Mathematical Problem Solving
Answers mathematical problems ranging from simple to complex, automatically selecting short or detailed reasoning modes based on difficulty.
Performs well on benchmarks such as Minerva Algebra, MATH-500, and GSM8K
Intelligent Assistant
Efficient Q&A System
Builds an intelligent Q&A system that adaptively adjusts response length based on problem complexity.
Reduces unnecessary long-chain reasoning, improving response speed