Flower Calvin Abc
Developed by mbreuss
FlowerVLA is a pre-trained vision-language-action model for robotic manipulation tasks, trained on the CALVIN ABC dataset, utilizing an efficient flow matching architecture with approximately 1 billion parameters.
Downloads 20
Release Time: 3/16/2025
Model Overview
FlowerVLA is an efficient vision-language-action flow policy model specifically designed for robotic manipulation tasks, combining multimodal vision-language encoding and a novel Transformer architecture.
Model Features
Efficient Multimodal Encoding
Uses half of the Florence-2 model as its multimodal vision-language encoder, which keeps vision-language fusion efficient.
Flow Matching Architecture
Uses a novel Transformer-based flow matching architecture to generate actions.
Lightweight Design
At only about 1 billion parameters, it provides an efficient, versatile vision-language-action policy suitable for real-time robotic operation.
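As a rough illustration of how a flow matching policy generates an action, the sketch below starts from Gaussian noise and integrates a velocity field from t = 0 to t = 1 with a few Euler steps. It is not FlowerVLA's code: `toy_velocity`, `sample_action`, the 7-dimensional action, and the step count are all hypothetical, and the closed-form velocity stands in for the learned Transformer, which would also be conditioned on the image and language instruction.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity network.

    For a straight-line path between noise and a single target action,
    the ideal velocity at time t points from x to the target, scaled by
    the remaining time (1 - t).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_action(velocity_fn, dim, num_steps=4, seed=0):
    """Generate an action by Euler integration of the velocity field."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from pure noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) is never zero
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Assumed 7-dimensional action (e.g. end-effector delta pose plus gripper).
target = [0.1, -0.2, 0.3, 0.0, 0.0, 0.0, 1.0]
action = sample_action(lambda x, t: toy_velocity(x, t, target), dim=len(target))
```

Because the toy velocity field is exact, the final Euler step lands on `target`; a learned model would only approximate this, and the number of integration steps trades accuracy against inference latency.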
Model Capabilities
Vision-Language-Action Fusion
Execution of Robotic Manipulation Tasks
Multimodal Input Processing
Action Space Prediction
Use Cases
Robotics
CALVIN ABC Challenge
Performing complex robotic manipulation tasks in the CALVIN ABC challenge
Currently ranked first, with an average completed sequence length of 4.54 (out of 5 chained tasks)
Object Grasping
Grasping specific objects based on language instructions
High success rate
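The CALVIN figure of 4.54 above is an average over evaluation rollouts, where each rollout chains five language instructions and counts how many are completed in a row before the first failure. A minimal sketch of that computation follows; the function names and the example rollouts are hypothetical and are not taken from the CALVIN evaluation code.

```python
def completed_length(results):
    """Count consecutive successes from the start of one instruction chain."""
    count = 0
    for succeeded in results:
        if not succeeded:
            break  # the chain ends at the first failed task
        count += 1
    return count


def average_sequence_length(rollouts):
    """Average the completed length over all evaluation rollouts."""
    return sum(completed_length(r) for r in rollouts) / len(rollouts)


# Made-up rollouts, each a chain of five tasks (True = task succeeded).
rollouts = [
    [True, True, True, True, True],      # completed length 5
    [True, True, True, False, False],    # completed length 3
    [True, True, True, True, False],     # completed length 4
    [False, True, True, True, True],     # completed length 0
]
print(average_sequence_length(rollouts))  # prints 3.0
```

Under this metric a score of 5.0 would mean every five-task chain was completed in full.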