🚀 Model Card for Sigurdur/vits_icelandic_rosa_female_monospeaker
This is a text-to-speech model for Icelandic, fine-tuned from facebook/mms-tts-isl on the Talrómur dataset.
🚀 Quick Start
This model is designed for text-to-speech applications in Icelandic. Here's a basic example of how to use it:
from transformers import VitsModel, AutoTokenizer
import scipy.io.wavfile as wav
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = VitsModel.from_pretrained("Sigurdur/vits_icelandic_rosa_female_monospeaker")
tokenizer = AutoTokenizer.from_pretrained("Sigurdur/vits_icelandic_rosa_female_monospeaker")

text = "Góðan daginn! Ég heiti Rósa, ég er talgervill"
inputs = tokenizer(text, return_tensors="pt")

# Synthesize speech without tracking gradients
with torch.no_grad():
    output = model(**inputs).waveform

# Read the sampling rate from the model config, falling back to 16 kHz
sampling_rate = getattr(model.config, "sampling_rate", 16000)
if not (0 < sampling_rate <= 65535):
    raise ValueError(f"Invalid sampling rate: {sampling_rate}")

# Convert the (1, num_samples) tensor to a 1-D NumPy array
waveform = output.squeeze().cpu().numpy()
Save Output to File
wav.write("output.wav", rate=sampling_rate, data=waveform)
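Because the waveform is a float32 array, `wav.write` stores it as a 32-bit floating-point WAV file, which some audio players may not open. As a minimal sketch, assuming 16-bit PCM output is wanted instead, the samples can be clipped to [-1, 1], scaled, and written with Python's standard `wave` module. The helper names `float_to_pcm16` and `write_pcm16_wav` are hypothetical and not part of any library.

```python
import struct
import wave


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        ints.append(int(round(s * 32767)))
    return struct.pack(f"<{len(ints)}h", *ints)


def write_pcm16_wav(path, samples, rate):
    """Write mono 16-bit PCM audio to `path` at `rate` Hz."""
    frames = float_to_pcm16(samples)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bits
        f.setframerate(rate)
        f.writeframes(frames)
```

With the variables from the example above, the call would be `write_pcm16_wav("output_pcm16.wav", waveform.tolist(), sampling_rate)`.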
View in Jupyter Notebook
from IPython.display import Audio
# Play the 1-D NumPy waveform rather than the raw tensor
Audio(waveform, rate=sampling_rate)
✨ Features
- Icelandic Support: Fine-tuned specifically for Icelandic text-to-speech.
- Based on VITS: Built on the VITS architecture, an end-to-end model that generates the speech waveform directly from text.
📦 Installation
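The examples in this card require the transformers, torch, and scipy packages; IPython is needed only for playback in a Jupyter notebook. No other installation steps are documented.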
💻 Usage Examples
Basic Usage
from transformers import VitsModel, AutoTokenizer
import scipy.io.wavfile as wav
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = VitsModel.from_pretrained("Sigurdur/vits_icelandic_rosa_female_monospeaker")
tokenizer = AutoTokenizer.from_pretrained("Sigurdur/vits_icelandic_rosa_female_monospeaker")

text = "Góðan daginn! Ég heiti Rósa, ég er talgervill"
inputs = tokenizer(text, return_tensors="pt")

# Synthesize speech without tracking gradients
with torch.no_grad():
    output = model(**inputs).waveform

# Read the sampling rate from the model config, falling back to 16 kHz
sampling_rate = getattr(model.config, "sampling_rate", 16000)
if not (0 < sampling_rate <= 65535):
    raise ValueError(f"Invalid sampling rate: {sampling_rate}")

# Convert the (1, num_samples) tensor to a 1-D NumPy array
waveform = output.squeeze().cpu().numpy()
Advanced Usage
The basic usage example covers most common scenarios. No model-specific advanced usage is documented.
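One scenario the basic example does not cover is long input text. A common approach with text-to-speech models, sketched here as an assumption rather than the author's documented method, is to split the text into sentences, synthesize each one separately, and join the resulting waveforms. The helper `split_sentences` below is hypothetical and uses a simple punctuation rule that will not handle abbreviations correctly.

```python
import re


def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


text = "Góðan daginn! Ég heiti Rósa, ég er talgervill."
for sentence in split_sentences(text):
    # Each sentence would be passed to the tokenizer and model
    # exactly as in the basic usage example.
    print(sentence)
```

The per-sentence waveforms could then be joined with `numpy.concatenate` before saving or playback.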
📚 Documentation
Model Details
- Developed by: Sigurdur Haukur Birgisson
- Model type: VITS
- Language(s) (NLP): Icelandic, isl
- License: [More Information Needed]
- Finetuned from model: facebook/mms-tts-isl
Uses
This model is intended for Icelandic text-to-speech applications.
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information is needed for further recommendations.
Training Data
The model was fine-tuned on the Talrómur dataset. Further details: [More Information Needed]
Training Hyperparameters
[More Information Needed]
Evaluation
[More Information Needed]
Model Examination
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
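Such an estimate comes down, roughly, to multiplying the hardware's power draw by the training time and by the carbon intensity of the local electricity grid. A minimal sketch of that arithmetic follows; the function name `estimate_co2_kg` and all input values are hypothetical placeholders, not measurements for this model.

```python
def estimate_co2_kg(power_watts, hours, kg_co2_per_kwh):
    """Rough CO2 estimate: energy used (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


# Hypothetical inputs, for illustration only: a 250 W GPU running for
# 10 hours on a grid emitting 0.4 kg CO2 per kWh.
print(estimate_co2_kg(250, 10, 0.4))
```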
Technical Specifications
[More Information Needed]
Citation
[More Information Needed]
Glossary
[More Information Needed]
More Information
[More Information Needed]
📄 License
[More Information Needed]
Model Card Authors
Sigurdur Haukur Birgisson
Model Card Contact
Feel free to contact me through LinkedIn: Sigurdur Haukur Birgisson