voc2vec-hubert-ls-pt

Developed by alkiskoudounas
voc2vec is a foundation model designed specifically for non-verbal human audio. It is built on the HuBERT framework and pre-trained on 125 hours of non-verbal audio.
Downloads 114
Release date: 4/14/2025

Model Overview

This model targets the classification and analysis of non-verbal human sounds and is particularly suited to tasks such as infant cry recognition.

Model Features

Specialized for non-verbal sounds
A pre-trained model optimized specifically for non-verbal human sounds such as infant cries and laughter
Multi-dataset pre-training
Pre-trained on 125 hours of non-verbal audio drawn from 10 different datasets
HuBERT architecture
Built on the HuBERT framework and inherits its self-supervised audio representation learning
Transfer learning friendly
Pre-training continues from a HuBERT model already pre-trained on LibriSpeech, so the checkpoint is suitable for fine-tuning on downstream tasks
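
As a rough illustration of the fine-tuning setup mentioned above, the sketch below attaches a fresh classification head to the pre-trained encoder using the Hugging Face transformers library. The Hub ID in MODEL_ID and the two class names are assumptions made for illustration (the class names are borrowed from the examples on this page, not from any dataset's official label set), and the training loop itself is omitted.

```python
from typing import Dict, List, Tuple

# Assumption: the checkpoint is published on the Hugging Face Hub under this ID.
MODEL_ID = "alkiskoudounas/voc2vec-hubert-ls-pt"

# Illustrative class names only; replace with the labels of your own dataset.
EXAMPLE_LABELS = ["hunger", "discomfort"]


def build_label_maps(labels: List[str]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Create the label <-> id mappings stored in the model config."""
    label2id = {name: idx for idx, name in enumerate(labels)}
    id2label = {idx: name for name, idx in label2id.items()}
    return label2id, id2label


def load_classifier(labels: List[str]):
    """Load the pre-trained encoder with a newly initialized classification head."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForAudioClassification

    label2id, id2label = build_label_maps(labels)
    return AutoModelForAudioClassification.from_pretrained(
        MODEL_ID,
        num_labels=len(labels),
        label2id=label2id,
        id2label=id2label,
    )
```

Calling load_classifier(EXAMPLE_LABELS) downloads the weights and returns a model whose head is randomly initialized, so it needs to be fine-tuned on labeled audio before its predictions mean anything.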

Model Capabilities

Non-verbal audio classification
Infant cry recognition
Audio feature extraction
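
For the feature extraction capability, a minimal sketch with the Hugging Face transformers library might look like the following. It assumes the checkpoint is available under the Hub ID shown, that it ships a feature extractor configuration, and that it uses the default HuBERT base convolutional encoder (the kernel and stride values below are those defaults, not values confirmed by this page). Input audio is assumed to be 16 kHz mono.

```python
from typing import List

# Assumption: the checkpoint is published on the Hugging Face Hub under this ID.
MODEL_ID = "alkiskoudounas/voc2vec-hubert-ls-pt"
SAMPLE_RATE = 16_000  # HuBERT models are trained on 16 kHz mono audio

# Kernel sizes and strides of the default HuBERT base feature encoder
# (assumed to apply to this checkpoint).
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_output_frames(num_samples: int) -> int:
    """Number of feature frames the encoder emits for a waveform of this length."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        length = (length - kernel) // stride + 1
    return length


def extract_embedding(waveform: List[float]):
    """Return a mean-pooled utterance embedding for a 16 kHz mono waveform."""
    # Imported lazily so the helper above works without torch/transformers.
    import torch
    from transformers import AutoFeatureExtractor, AutoModel

    extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)
    model.eval()

    inputs = extractor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, frames, hidden_size)
    # Average over the time axis to get one fixed-size vector per clip.
    return hidden.mean(dim=1).squeeze(0)
```

Under these assumptions the encoder produces roughly one frame every 20 ms, so num_output_frames(16000) gives the frame count for a one-second clip. Mean pooling is only one simple way to summarize the frames; a downstream classifier could also consume the full frame sequence.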

Use Cases

Infant care
Infant cry recognition
Identify and analyze different types of infant cries (e.g., hunger or discomfort)
Performs well on infant cry datasets such as Donate a Cry
Medical assistance
Non-verbal symptom analysis
Analyze patients' non-verbal sounds to assist in medical diagnosis