
voc2vec

Developed by alkiskoudounas
voc2vec is a foundation model designed specifically for non-linguistic human audio, built on the wav2vec 2.0 framework and pretrained on approximately 125 hours of non-linguistic audio.
Downloads: 223
Release Time: 2/6/2025

Model Overview

voc2vec is a foundation model for non-linguistic human audio, intended primarily for audio classification tasks, in particular the classification and analysis of non-linguistic vocalizations such as infant cries.
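
The snippet below is a minimal sketch of using voc2vec as a feature extractor with the Hugging Face transformers library. The repo id alkiskoudounas/voc2vec, the file name cry_sample.wav, and the mean-pooling step are illustrative assumptions, not details taken from this page.

```python
import torch
import torchaudio
from transformers import AutoFeatureExtractor, AutoModel

# Assumed Hugging Face repo id (developer namespace + model name); verify on the hub.
MODEL_ID = "alkiskoudounas/voc2vec"

feature_extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

# Load a clip and convert it to the mono 16 kHz input that wav2vec 2.0-style models expect.
waveform, sr = torchaudio.load("cry_sample.wav")  # hypothetical file
waveform = torchaudio.functional.resample(waveform, sr, 16_000).mean(dim=0)

inputs = feature_extractor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    frames = model(**inputs).last_hidden_state  # shape: (1, num_frames, hidden_size)

# Mean-pool over time to get one utterance-level embedding for downstream classifiers.
embedding = frames.mean(dim=1)
print(embedding.shape)
```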

Model Features

Non-linguistic vocalization classification
Designed specifically for non-linguistic human audio such as infant cries and laughter.
Multi-dataset pretraining
Pretrained using a collection of 10 different datasets, covering approximately 125 hours of non-linguistic audio.
Multiple model variants
Offers variants built from different pretrained backbones, including checkpoints pretrained on AudioSet and LibriSpeech as well as a HuBERT-based variant (see the sketch after this list).
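
A sketch of selecting a variant by repo id is shown below. Only the developer namespace alkiskoudounas appears on this page, so every repo id other than the base model's is an assumption and should be checked on the Hugging Face hub before use.

```python
from transformers import AutoModel

# Assumed repo ids; only the developer name "alkiskoudounas" is given on this page.
VOC2VEC_VARIANTS = {
    "base": "alkiskoudounas/voc2vec",                 # pretrained on vocalization corpora
    "audioset": "alkiskoudounas/voc2vec-as-pt",       # continued from an AudioSet-pretrained wav2vec 2.0
    "librispeech": "alkiskoudounas/voc2vec-ls-pt",    # continued from a LibriSpeech-pretrained wav2vec 2.0
    "hubert": "alkiskoudounas/voc2vec-hubert-ls-pt",  # continued from a LibriSpeech-pretrained HuBERT
}

model = AutoModel.from_pretrained(VOC2VEC_VARIANTS["audioset"])
```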

Model Capabilities

Non-linguistic vocalization classification
Audio feature extraction
Infant cry recognition

Use Cases

Healthcare
Infant cry analysis
Used to analyze infant cries, helping to identify an infant's needs or health status.
Performs well on the Donate a Cry dataset (see the classification sketch after this list).
Speech research
Non-linguistic vocalization research
Used to study the characteristics and patterns of human non-linguistic vocalizations.
Evaluated on multiple non-linguistic vocalization datasets.
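
As a use-case illustration, the sketch below attaches a randomly initialized classification head to the voc2vec backbone for an infant-cry task. The repo id and the five label names (loosely modeled on the Donate a Cry categories) are assumptions; the head must be fine-tuned on labeled cries before its predictions are meaningful.

```python
import numpy as np
import torch
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

MODEL_ID = "alkiskoudounas/voc2vec"  # assumed repo id
labels = ["belly_pain", "burping", "discomfort", "hungry", "tired"]  # assumed label set

feature_extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
model = AutoModelForAudioClassification.from_pretrained(
    MODEL_ID,
    num_labels=len(labels),
    label2id={l: i for i, l in enumerate(labels)},
    id2label={i: l for i, l in enumerate(labels)},
)

# Forward pass on one second of placeholder audio; in practice, fine-tune the head
# (e.g. with transformers.Trainer) on labeled cry recordings first.
clip = np.zeros(16_000, dtype=np.float32)
inputs = feature_extractor(clip, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[int(logits.argmax(dim=-1))])
```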