Qwen-Audio-nf4

Developed by Ostixe360
Qwen-Audio-nf4 is a quantized version of Qwen-Audio that accepts audio and text as input and produces text as output.
Downloads: 134
Release Time: 4/25/2024

Model Overview

Qwen-Audio-nf4 is an NF4 (4-bit) quantized version of Qwen-Audio, Alibaba Cloud's large-scale audio-language model. It accepts diverse audio (human speech, natural sounds, music, and singing) together with text as input, and produces text as output.
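
As a rough illustration of the audio-plus-text input described above, the following sketch builds an English transcription query and runs it through the model. It is not taken from this page: the repository id `Ostixe360/Qwen-Audio-nf4` is an assumption inferred from the developer and model name, and the tokenizer calls (`process_audio`, the `audio_info` argument) follow the upstream Qwen-Audio example and are assumed to carry over unchanged to the quantized checkpoint. Loading NF4 weights typically also requires the `bitsandbytes` package and a CUDA-capable GPU.

```python
# Task tags for English speech transcription, as used in the upstream
# Qwen-Audio example (assumed to apply to the quantized checkpoint too).
EN_ASR_TAGS = (
    "<|startoftranscription|><|en|><|transcribe|>"
    "<|en|><|notimestamps|><|wo_itn|>"
)


def build_asr_query(audio_path: str) -> str:
    """Wrap an audio path or URL in <audio> tags and append the task tags."""
    return f"<audio>{audio_path}</audio>{EN_ASR_TAGS}"


def transcribe(audio_path: str, repo_id: str = "Ostixe360/Qwen-Audio-nf4") -> str:
    """Run one transcription request. repo_id is an assumed Hub id."""
    # Imported lazily so build_asr_query works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code is needed because the audio handling lives in
    # custom code shipped with the checkpoint.
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, device_map="auto", trust_remote_code=True
    ).eval()

    query = build_asr_query(audio_path)
    audio_info = tokenizer.process_audio(query)  # loads and featurizes the audio
    inputs = tokenizer(query, return_tensors="pt", audio_info=audio_info)
    inputs = inputs.to(model.device)

    output_ids = model.generate(**inputs, audio_info=audio_info)
    return tokenizer.decode(
        output_ids.cpu()[0], skip_special_tokens=False, audio_info=audio_info
    )
```

Calling `transcribe("sample.flac")` (a hypothetical local file) downloads the checkpoint on first use. The returned string contains the original query followed by the generated text, since special tokens are not stripped.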

Model Features

Multi-type Audio Support: Supports processing various audio types including human voice, natural sounds, music, and songs.
Multitask Learning Framework: Adopts a multitask training framework supporting over 30 different audio tasks.
No Fine-tuning Required: Achieves leading performance on multiple benchmark tasks without task-specific fine-tuning.
Multi-turn Dialogue Support: Supports multi-turn audio and text dialogues, including scenarios like sound understanding and music appreciation.

Model Capabilities

Audio-to-text conversion
Multilingual audio understanding
Music analysis
Sound reasoning
Multi-turn audio-text dialogue
Speech-editing tool usage

Use Cases

Speech Recognition
Speech Transcription: Convert spoken language into text. Achieves state-of-the-art (SOTA) results on the Aishell1 test set.

Environmental Sound Analysis
Natural Sound Recognition: Identify types of natural sounds in the environment. Achieves SOTA results on the CochlScene test set.

Music Understanding
Music Description Generation: Generate descriptive text based on music. Achieves SOTA results on the ClothoAQA test set.