DriveLMM-o1
DriveLMM-o1 is a large multimodal model fine-tuned for autonomous driving. It is based on the InternVL2.5-8B architecture, adapted with LoRA, and performs step-by-step reasoning over stitched multi-view images.
Downloads 233
Release Time: 3/11/2025
Model Overview
DriveLMM-o1 is a large multimodal model designed for reasoning in autonomous driving. It integrates multi-view images for panoramic scene understanding and generates detailed intermediate reasoning steps that explain its decisions.
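The stitching of multi-view images mentioned above can be illustrated with a minimal sketch. It assumes six equally sized camera views arranged in a grid with three columns; the view count, the grid layout, and the helper name stitch_views are hypothetical and are not taken from the DriveLMM-o1 code. Images are plain nested lists so that the example runs without any imaging library.

```python
from typing import List

Image = List[List[int]]  # one grayscale image as rows of pixel values


def stitch_views(views: List[Image], cols: int) -> Image:
    """Tile equally sized views row-major into a grid with `cols` columns."""
    if not views or cols <= 0 or len(views) % cols != 0:
        raise ValueError("need a non-empty list of views that fills whole rows")
    height, width = len(views[0]), len(views[0][0])
    for view in views:
        if len(view) != height or any(len(row) != width for row in view):
            raise ValueError("all views must share the same height and width")

    stitched: Image = []
    for start in range(0, len(views), cols):
        row_of_views = views[start:start + cols]
        for y in range(height):
            # Concatenate pixel row y of every view in this grid row.
            stitched.append([px for view in row_of_views for px in view[y]])
    return stitched


# Six hypothetical 2x2 camera views, each filled with its own index (assumption).
views = [[[i, i], [i, i]] for i in range(6)]
panorama = stitch_views(views, cols=3)
```

Here panorama is a single 4x6 image: the first three views fill the top half and the last three fill the bottom half, so the model receives one combined input instead of six separate ones.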
Model Features
Multimodal Fusion
Integrates multi-view images for panoramic scene understanding
Chain-of-Thought Reasoning
Generates detailed intermediate reasoning steps to explain decision-making processes
Efficient Adaptation
Employs dynamic image patching and LoRA fine-tuning to process high-resolution inputs with few additional parameters
Performance Breakthrough
Achieves significant improvements in final answer accuracy and overall reasoning scores compared to existing open-source models
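The two ideas behind the efficient adaptation feature can be sketched roughly as follows. The first function is a simplified illustration of dynamic image patching: it picks a tile grid whose aspect ratio is closest to the input image. The second counts the parameters added by one LoRA adapter. The tile size of 448, the limit of 12 tiles, the tie-breaking rule, the layer dimensions, and both function names are assumptions made for illustration, not values or code confirmed by the model's authors.

```python
from typing import Tuple


def choose_tile_grid(width: int, height: int,
                     tile: int = 448, max_tiles: int = 12) -> Tuple[int, int]:
    """Return (cols, rows) of the tile grid best matching the image shape."""
    target_ratio = width / height
    ideal_tiles = (width * height) / (tile * tile)
    candidates = [
        (cols, rows)
        for cols in range(1, max_tiles + 1)
        for rows in range(1, max_tiles + 1)
        if cols * rows <= max_tiles
    ]
    # Closest aspect ratio first; ties go to the tile count nearest the image area.
    return min(
        candidates,
        key=lambda g: (abs(target_ratio - g[0] / g[1]),
                       abs(g[0] * g[1] - ideal_tiles)),
    )


def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


grid = choose_tile_grid(1344, 896)          # 3:2 image
added = lora_extra_params(4096, 4096, 16)   # hypothetical layer size and rank
full = 4096 * 4096                          # parameters in the frozen weight matrix
fraction = added / full
```

Under these assumed numbers, the adapter adds under 1% of the parameters of the weight matrix it adapts, which is the sense in which LoRA keeps the additional parameter count small.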
Model Capabilities
Multi-view Image Processing
Autonomous Driving Decision Reasoning
Scene Perception and Object Understanding
Risk Assessment
Traffic Rule Compliance Analysis
Use Cases
Autonomous Driving
Risk Assessment
Analyzes potential risks in the driving environment through multi-view images
Risk assessment accuracy reaches 73.01%
Traffic Rule Compliance
Evaluates whether driving behavior complies with traffic rules
Traffic rule compliance assessment score reaches 81.56%
Scene Perception and Object Understanding
Identifies and understands various objects and scenes in the driving environment
Scene perception and object understanding accuracy reaches 75.39%