VoiceRestore: Open-Source Voice Restoration System for Improving the Quality of Damaged Recordings

VoiceRestore

Developed by jadechoghari

A voice recording quality restoration system based on a flow-matching transformer that significantly improves the quality of damaged recordings.

Audio Enhancement

Transformers

Open Source License: MIT

#Voice Restoration #Flow Matching Transformer #Noise Reduction Enhancement

Downloads: 24

Release Date: 9/27/2024

Model Overview

VoiceRestore is a model designed specifically for repairing damaged voice recordings. It handles background noise, reverberation, distortion, and signal loss.

Model Features

Comprehensive Restoration

Aims to handle voice recording damage of any type and degree; performance on extreme degradations is still being improved

Easy to Use

Provides a simple interface for processing damaged audio

Pre-trained Model

Includes a pre-trained transformer model with 301 million parameters

Model Capabilities

Noise Reduction

Reverberation Elimination

Distortion Repair

Signal Loss Recovery

Use Cases

Voice Restoration

Damaged Recording Repair

Repairs voice recordings with background noise, distortion, and other issues

Significantly improves speech clarity and intelligibility

Historical Recording Restoration

Processes low-quality audio recorded by old recording devices

Restores original voice characteristics

🚀 VoiceRestore: Flow-Matching Transformers for Speech Recording Quality Restoration

VoiceRestore is a state-of-the-art speech restoration model. It can greatly improve the quality of degraded voice recordings. By using flow-matching transformers, this model can effectively deal with various audio imperfections in speech, such as background noise, reverberation, distortion, and signal loss.

It is based on the original VoiceRestore repository, which also provides a demo of audio restorations.

🚀 Quick Start

Using Transformers 🤗

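# Run in a Colab or Jupyter notebook; the "!" and "%" prefixes are notebook magics
!git lfs install
!git clone https://huggingface.co/jadechoghari/VoiceRestore
%cd VoiceRestore
!pip install -r requirements.txt

from transformers import AutoModel

# Path to the cloned model folder (on Colab it is as follows)
checkpoint_path = "/content/VoiceRestore"
# trust_remote_code=True is needed because the model code ships with the repository
model = AutoModel.from_pretrained(checkpoint_path, trust_remote_code=True)
# Restore a short clip: model(input_path, output_path)
model("test_input.wav", "test_output.wav")
# Add short=False if the audio is longer than 10 seconds
model("long.mp3", "long_output.mp3", short=False)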

💻 Usage Examples

Basic Usage

from transformers import AutoModel
# path to the model folder (on colab it's as follows)
checkpoint_path = "/content/VoiceRestore"
model = AutoModel.from_pretrained(checkpoint_path, trust_remote_code=True)
model("test_input.wav", "test_output.wav")
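
To restore several files in one pass, the two-argument call shown above can be wrapped in a small loop. The following is a minimal sketch, not part of the VoiceRestore repository: the helper name `restore_folder` and its parameters are hypothetical, and it only assumes that `restore` is a callable taking an input path and an output path, as `model(...)` does in the example above.

```python
from pathlib import Path


def restore_folder(restore, input_dir, output_dir, pattern="*.wav"):
    """Call restore(input_path, output_path) for every matching file.

    `restore` is any callable with the (input_path, output_path) signature,
    for example the loaded model from the example above. This helper is a
    hypothetical illustration, not part of the VoiceRestore repository.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # Sort so that files are processed in a stable, predictable order.
    for source in sorted(Path(input_dir).glob(pattern)):
        target = output_dir / source.name
        restore(str(source), str(target))
        written.append(target)
    return written
```

With a loaded model, this could be called as `restore_folder(model, "degraded", "restored")`, where both folder names are placeholders.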

Advanced Usage

# If the audio is longer than 10 seconds, add the short=False parameter
model("long.mp3", "long_output.mp3", short=False)
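
The 10-second rule above can be checked programmatically before calling the model. The sketch below uses only Python's standard `wave` module, so it works for uncompressed WAV files only (not for MP3 inputs such as `long.mp3`). The helper names `wav_duration_seconds` and `is_short_clip` are hypothetical and do not come from the VoiceRestore repository.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_short_clip(path, threshold_seconds=10.0):
    """Return True when the clip is no longer than the threshold.

    The 10-second default mirrors the guidance above; the helper itself
    is a hypothetical illustration.
    """
    return wav_duration_seconds(path) <= threshold_seconds
```

A caller could then add `short=False` only when `is_short_clip(path)` returns `False`, and otherwise use the plain two-argument call.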

🔍 Example

Degraded input: (media not reproduced here)

Restored (steps=32, cfg=1.0): (media not reproduced here)

Restored audio (16 steps, strength 0.5): (media not reproduced here)
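
The `steps` values in the captions above most likely refer to the number of integration steps used at sampling time: flow-matching models typically generate output by numerically integrating a learned velocity field from t = 0 to t = 1. The toy sketch below illustrates that idea with plain Euler integration. It is not VoiceRestore's code: the function names are hypothetical, the velocity field is a hand-written stand-in for what a trained transformer would predict, and the `cfg` (guidance) setting is not modeled.

```python
def euler_flow_sample(start, velocity, steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    x = list(start)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1 inside the loop
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def straight_line_velocity(target):
    """Toy velocity field for a straight path ending at `target`.

    For the linear path x_t = (1 - t) * x_0 + t * x_1, the velocity at
    (x_t, t) is (x_1 - x_t) / (1 - t). A real model would predict this
    quantity with a neural network instead of knowing the target.
    """
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity
```

For example, `euler_flow_sample([0.0, 0.0], straight_line_velocity([1.0, -2.0]), steps=16)` moves the starting point to the target. In a learned model the velocity field is only approximate, which is why the step count is exposed as a quality/speed setting.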

✨ Features

Universal Restoration: The model is designed to handle any level and type of voice recording degradation.
Easy to Use: It has a simple interface for processing degraded audio files.
Pretrained Model: It includes a 301-million-parameter transformer model with pre-trained weights. (The model is still being trained, and further checkpoint updates are planned.)

📚 Documentation

Model Details

Property	Details
Model Type	Flow-matching transformer
Parameters	301 million (300M+)
Input	Degraded speech audio (various formats supported)
Output	Restored speech

Limitations and Future Work

The current model is optimized for speech and may not perform optimally on music or other audio types.
Research is ongoing to improve performance on extreme degradations.
Future updates may include real-time processing capabilities.

Citation

If you use VoiceRestore in your research, please cite our paper:

@article{kirdey2024voicerestore,
  title={VoiceRestore: Flow-Matching Transformers for Speech Recording Quality Restoration},
  author={Kirdey, Stanislav},
  journal={arXiv},
  year={2024}
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Based on the E2-TTS implementation by Lucidrains
Special thanks to the open-source community for their invaluable contributions.
