wav2vec2-large-960h-lv60-self_MIDIARIES_72H_FT Open Source Speech Recognition Model

Wav2vec2 Large 960h Lv60 Self MIDIARIES 72H FT

Developed by caurdy

A speech recognition model fine-tuned on 72 hours of MI diary data, based on Facebook's pre-trained wav2vec2-large-960h-lv60-self model

Speech Recognition

Transformers

#Speech recognition optimization #Medical diary transcription #Fine-tuning error reduction

Downloads 20

Release Time: 4/21/2022

Model Overview

This model is fine-tuned for medical interview scenarios, which lowers its word error rate on medical dialogue (figures are listed under Model Features)

Model Features

Medical scenario optimization

Fine-tuned on 72 hours of MI diary data, which makes it particularly suited to medical dialogue scenarios

Performance improvement

On a 20-minute MI diary test set, the word error rate (WER) decreased from 13% to 9.7%
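To make the word error rate figures concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The function names and the example sentences are illustrative assumptions and are not taken from the model's evaluation code or test set; only the 13% and 9.7% figures come from this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error rate removed by fine-tuning."""
    return (before - after) / before


# Made-up sentences for illustration only: one substituted word out of six.
example_wer = word_error_rate(
    "i went to the store today",
    "i went to a store today",
)

# Figures reported on this page: 13% before fine-tuning, 9.7% after.
reported_gain = relative_reduction(0.13, 0.097)
print(f"example WER: {example_wer:.3f}")
print(f"relative WER reduction: {reported_gain:.1%}")
```

Under this definition, the drop from 13% to 9.7% is an absolute improvement of 3.3 percentage points, or roughly a quarter of the original errors.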

Based on mature architecture

Built upon Facebook's pre-trained wav2vec2-large-960h-lv60-self model

Model Capabilities

English speech recognition

Medical dialogue transcription

Automatic speech-to-text conversion
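The capabilities above could be exercised with the Hugging Face Transformers library along the following lines. This is a hedged sketch, not the author's documented usage: the Hub id `caurdy/wav2vec2-large-960h-lv60-self_MIDIARIES_72H_FT` is assumed from the developer and model names on this page, `transcribe` is a hypothetical helper, and `librosa` is an assumed choice for audio loading. Wav2vec2 models expect 16 kHz mono audio, so the file is resampled on load.

```python
# Assumed Hub id, inferred from the developer and model name on this page.
MODEL_ID = "caurdy/wav2vec2-large-960h-lv60-self_MIDIARIES_72H_FT"
TARGET_SAMPLE_RATE = 16_000  # wav2vec2 models are trained on 16 kHz audio


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return a greedy CTC transcription of a single audio file."""
    # Heavy dependencies are imported lazily so that merely defining this
    # helper does not require torch, librosa, or transformers.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Load as mono and resample to the rate the model expects.
    speech, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE, mono=True)
    inputs = processor(
        speech,
        sampling_rate=TARGET_SAMPLE_RATE,
        return_tensors="pt",
        padding=True,
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: most likely token per frame, then CTC collapse.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]


if __name__ == "__main__":
    # "interview.wav" is a placeholder path, not a file shipped with the model.
    print(transcribe("interview.wav"))
```

If the fine-tuned repository does not include processor files, loading the processor from the base model `facebook/wav2vec2-large-960h-lv60-self` may work instead, though that is untested here. For long recordings, splitting the audio into shorter segments before calling the helper keeps memory use manageable.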

Use Cases

Healthcare

Medical interview recording

Automatically transcribes conversations between doctors and patients

Word error rate reduced to 9.7% on the MI diary test set

Medical document generation

Transcribes medical interview recordings into text that can serve as the basis for structured documents

