R1-AQA

Developed by mispeech
R1-AQA is an audio question answering model based on Qwen2-Audio-7B-Instruct and optimized with the Group Relative Policy Optimization (GRPO) algorithm. It achieves state-of-the-art performance on the MMAU benchmark.
Downloads 791
Release Time: 3/13/2025

Model Overview

R1-AQA is designed specifically for Audio Question Answering (AQA) tasks. It is optimized via reinforcement learning and reaches high performance with a small training set of 38k samples.

Model Features

Reinforcement Learning Optimization
Optimized with the Group Relative Policy Optimization (GRPO) reinforcement learning algorithm, which improves accuracy on audio question answering.
Data-Efficient Training
With only 38k training samples, the model surpasses supervised fine-tuning, which demonstrates the advantage of reinforcement learning on small datasets.
High-Performance Audio QA
Achieves state-of-the-art performance on the MMAU benchmark, outperforming multiple large-scale models.
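
The core idea behind GRPO can be illustrated with a short sketch: several answers are sampled for the same question, each is scored, and each score is normalized against the mean and standard deviation of its own group. The code below is only an illustration of that normalization step, not the R1-AQA training code. The function names, the exact-match reward, the sampled answers, and the use of the population standard deviation are all assumptions made for this example.

```python
from statistics import mean, pstdev


def accuracy_reward(predicted: str, correct: str) -> float:
    """Hypothetical rule-based reward: 1.0 for an exact (case-insensitive) match."""
    return 1.0 if predicted.strip().lower() == correct.strip().lower() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the mean and std of its own group.

    Assumption: population standard deviation; eps avoids division by zero
    when every sample in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of answers sampled for one audio question.
sampled = ["Male", "Female", "male", "Child"]
rewards = [accuracy_reward(s, "male") for s in sampled]
advantages = group_relative_advantages(rewards)
# Correct answers get a positive advantage (roughly +1 here),
# incorrect answers a negative one (roughly -1).
```

Because the baseline comes from the group itself, this scheme needs no separate value model. When every sampled answer earns the same reward, all advantages are zero and that group contributes no learning signal.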

Model Capabilities

Audio Question Answering
Audio Content Understanding
Multiple-choice Question Answering
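
For multiple-choice audio QA, a caller typically formats the question and options into a text prompt and then parses the chosen option from the model's reply. The sketch below shows one way to do that. The prompt wording, the <answer> tag convention, and the helper names are assumptions for illustration and may not match the format R1-AQA expects; the model call itself is omitted.

```python
import re
from typing import Optional


def build_prompt(question: str, options: list[str]) -> str:
    """Format a multiple-choice question (hypothetical prompt template)."""
    return (
        f"{question} Please choose the answer from the following options: "
        f"{options}. Output the final answer in <answer> </answer>."
    )


def extract_answer(response: str) -> Optional[str]:
    """Return the text inside the first <answer>...</answer> pair, or None."""
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    return match.group(1).strip() if match else None


prompt = build_prompt("What is the gender of the speaker?", ["Male", "Female"])
# A made-up model reply, used only to exercise the parser.
choice = extract_answer("The voice is low-pitched. <answer> Male </answer>")
```

Returning None for replies without tags lets the caller count malformed outputs as incorrect rather than guessing at an answer.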

Use Cases

Smart Assistants
Audio Content Analysis: Analyzes audio content and answers related questions, such as identifying speaker gender. Reported result: 69.76% accuracy on the MMAU benchmark.
Education
Audio Learning Assistance: Helps students understand audio teaching materials and answers their questions.