Thinkedit Deepseek Llama3 8b

Developed by cesun
ThinkEdit is a lightweight weight-editing method that identifies and modifies a small number of attention heads to mitigate the overly short reasoning chains produced by reasoning models, thereby improving reasoning accuracy.
Downloads 55
Release Time: 3/11/2025

Model Overview

This model addresses the problem of overly short Chain-of-Thought (CoT) reasoning generated by large language models on reasoning tasks. Using an interpretable weight-editing approach, it modifies only about 0.1% of the parameters while significantly improving performance on tasks such as mathematical reasoning.
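
As a rough sanity check on the 0.1% figure, the arithmetic can be sketched as follows. The architecture numbers are the publicly documented Llama 3 8B configuration and are not stated on this page. The 2% head share is the approximate figure given under Model Features. The assumption that the edit touches only each selected head's slice of the attention output projection is ours and may not match the actual method exactly.

```python
# Back-of-the-envelope estimate of the edited parameter fraction.
# Assumed (not from this page): Llama 3 8B has 32 layers, 32 attention
# heads per layer, hidden size 4096, and roughly 8.03e9 parameters.
num_layers = 32
heads_per_layer = 32
hidden_size = 4096
head_dim = hidden_size // heads_per_layer  # 128
total_params = 8.03e9

total_heads = num_layers * heads_per_layer      # 1024 heads in total
edited_heads = round(0.02 * total_heads)        # about 2% of heads

# Assumption: only the per-head slice of the output projection is edited,
# i.e. a (head_dim x hidden_size) block per selected head.
params_per_head_slice = head_dim * hidden_size
edited_params = edited_heads * params_per_head_slice
fraction = edited_params / total_params

print(f"edited heads: {edited_heads} of {total_heads}")
print(f"edited parameters: {edited_params:,} ({fraction:.2%} of total)")
```

Under these assumptions the estimate lands close to the stated order of magnitude of 0.1%.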

Model Features

Lightweight Weight Editing
Identifies and edits only the roughly 2% of attention heads associated with brief reasoning, which amounts to about 0.1% of total parameters, for efficient optimization.
Interpretable Editing
Locates and removes specific directions causing brief reasoning by analyzing the activation patterns of attention heads.
Performance Improvement
Significantly improves accuracy on multiple mathematical reasoning benchmarks, particularly in cases where the model would otherwise reason too briefly.
Reasoning Length Optimization
Effectively increases the length of reasoning steps generated by the model, providing more detailed problem-solving processes.
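
The idea of removing a direction from an attention head's weights can be illustrated with a minimal NumPy sketch. It assumes the removal is an orthogonal projection applied to one head's slice of the output projection, which is one common way to implement such an edit and is not confirmed by this page. The function name `remove_direction`, the toy shapes, and the random `direction` vector are hypothetical. In practice the direction would be derived from the head activation analysis described above.

```python
import numpy as np


def remove_direction(w_head: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Project a single direction out of one head's output weights.

    w_head:    (head_dim, hidden_size) slice, row-vector convention,
               so the head's contribution is `h @ w_head`.
    direction: (hidden_size,) vector to remove (hypothetical here).
    """
    u = direction / np.linalg.norm(direction)
    # W' = W (I - u u^T): the head can no longer write along u.
    return w_head - np.outer(w_head @ u, u)


# Toy example with small, made-up shapes and random values.
rng = np.random.default_rng(0)
w_head = rng.normal(size=(8, 16))
direction = rng.normal(size=16)

w_edited = remove_direction(w_head, direction)

# Any head output now has zero component along the removed direction.
h = rng.normal(size=8)
u = direction / np.linalg.norm(direction)
print(np.isclose((h @ w_edited) @ u, 0.0))
```

Because only the component along the chosen direction is removed, the head's output in every orthogonal direction is unchanged, which is what keeps an edit of this kind small and targeted.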

Model Capabilities

Mathematical Problem Solving
Complex Reasoning Task Handling
Generating Detailed Chain-of-Thought
Educational Applications

Use Cases

Education
Step-by-Step Math Problem Solving
Provides students with detailed steps to solve mathematical problems
Improves accuracy by 6.71% on the GSM8K math problem set
Exam Question Analysis
Generates detailed explanations for standardized exam questions
Improves accuracy by 0.07% on the MMLU elementary math test
Research
Model Interpretability Research
Investigates the relationship between attention heads and reasoning behavior
Identifies the roughly 2% of attention heads that are critical to brief reasoning