Dia1.6-pt_BR-v1 Open-source Text-to-Speech Model - Free Deployment and Optimization for Brazilian Portuguese Speech Synthesis

Dia1.6 Pt BR V1

Developed by Alissonerdx

A fine-tuned version of the Dia 1.6B text-to-audio model, specifically optimized for Brazilian Portuguese

Speech Synthesis, Other
Open Source License: Apache-2.0
#Brazilian Portuguese speech synthesis #Single speaker optimization #CETUC fine-tuning

Downloads 77

Release Time: 5/5/2025

Model Overview

This is a text-to-speech model focused on Brazilian Portuguese speech synthesis, fine-tuned on the CETUC dataset to generate Portuguese-only speech.

Model Features

Brazilian Portuguese Optimization

Specially fine-tuned for Brazilian Portuguese to provide more authentic speech output

Pure Speech Synthesis

Focuses on plain speech synthesis; the expressive capabilities of the original model (e.g., laughter, emotions) were lost during fine-tuning

Efficient Training

Trained in about 20 hours on a single NVIDIA RTX 4090 GPU

Multi-Version Support

Offers a Portuguese-only version and a version merged with the original Dia weights (alpha = 0.6)

Model Capabilities

Text-to-speech

Brazilian Portuguese speech synthesis

Single-speaker voice generation

Use Cases

Voice Applications

Voice Assistant

Provides voice interaction functionality for Brazilian Portuguese users

Generates natural Brazilian Portuguese speech

Audiobooks

Converts Portuguese text into speech

Smooth speech output

🚀 Dia1.6-Portuguese

This is a fine-tuned text-to-audio model, adapted from the Dia 1.6B model for Brazilian Portuguese. It was trained on the CETUC speech dataset and targets high-quality Brazilian Portuguese speech synthesis.

🚀 Quick Start

The fine-tuned checkpoints are ready to use for Brazilian Portuguese text-to-audio conversion. Prompts should begin with the [S1] speaker token, as in the audio samples listed further down.

✨ Features

Fine-Tuned for Portuguese: Specifically adapted for Brazilian Portuguese, providing accurate and natural speech synthesis.
Single-Speaker Focus: Focuses on clean Brazilian Portuguese speech synthesis with a single speaker token [S1].
Multiple Versions: Available as the fully fine-tuned model and as a version merged with the original weights, each in .pth and .safetensors formats.
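
Since the model expects a single [S1] speaker token and no longer handles expressive cues, it can help to normalize prompts before synthesis. The following is only a minimal sketch: `prepare_prompt` is a hypothetical helper, not part of Dia, and stripping cues such as `(laughs)` is an assumption about what works best, not something the model card prescribes.

```python
import re

SPEAKER_TAG = "[S1]"  # the only speaker token this model card mentions

# Parenthesised non-verbal cues of the kind that appear in the sample prompts.
# Assumption: removing them is harmless, because the fine-tuned model no
# longer renders them.
NONVERBAL = re.compile(
    r"\((?:laughs|sighs|pause|inhales|exhales)\)\s*", re.IGNORECASE
)


def prepare_prompt(text: str) -> str:
    """Hypothetical helper: drop non-verbal cues and ensure a leading [S1]."""
    cleaned = NONVERBAL.sub("", text)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse leftover spaces
    if not cleaned.startswith(SPEAKER_TAG):
        cleaned = f"{SPEAKER_TAG} {cleaned}"
    return cleaned


if __name__ == "__main__":
    print(prepare_prompt("Olá, tudo bem?"))
    print(prepare_prompt("[S1] Oi. (laughs) Tchau."))
```

The helper leaves any other speaker tags untouched, so multi-speaker prompts still need manual editing.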

📚 Documentation

🗣️ About the Model

🧠 Base Model: Dia 1.6B
📦 Dataset: CETUC — 144 hours of Brazilian Portuguese speech (100 speakers)
📝 Transcription: Performed using Whisper V3 Turbo + Pyannote diarization
🔁 Training: 140,000 steps (~1.4 epochs) on a single-speaker subset
⏱️ Hardware: Trained on a single NVIDIA RTX 4090 (≈ 20 hours total)
🎙️ Speaker Token: [S1] (only one speaker present)
⚠️ Note: This model has lost the original English and expressive capabilities (e.g., laughter, emotions) and focuses exclusively on clean Brazilian Portuguese speech synthesis.
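
The training figures above imply a rough throughput, which can be useful when budgeting a similar fine-tune. The arithmetic below is a back-of-the-envelope sketch: it takes the rounded numbers from the list (140,000 steps, about 1.4 epochs, about 20 hours) at face value, so the results are estimates only.

```python
# Figures quoted in the model card (both epochs and hours are approximate).
TOTAL_STEPS = 140_000
EPOCHS = 1.4
TRAINING_HOURS = 20

# Steps needed for one pass over the single-speaker subset.
steps_per_epoch = TOTAL_STEPS / EPOCHS

# Average wall-clock time per optimizer step on the RTX 4090.
seconds_per_step = TRAINING_HOURS * 3600 / TOTAL_STEPS

# Estimated wall-clock time for one full epoch.
hours_per_epoch = TRAINING_HOURS / EPOCHS

if __name__ == "__main__":
    print(f"steps per epoch  ~ {steps_per_epoch:,.0f}")
    print(f"seconds per step ~ {seconds_per_step:.2f}")
    print(f"hours per epoch  ~ {hours_per_epoch:.1f}")
```

In other words, one epoch is roughly 100,000 steps, each step takes about half a second, and a full epoch takes a little over 14 hours under these assumptions.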

🧪 Versions

| Version | Description | File |
| --- | --- | --- |
| `v1` | Fully fine-tuned on Portuguese | `Dia1.6-Portuguese-v1.pth` |
| `v1-safetensors` | Same model as above in `.safetensors` format | `Dia1.6-Portuguese-v1.safetensors` |
| `v1-merged-alpha0.6` | Merged with original Dia weights using `alpha = 0.6` | `Dia1.6-Portuguese-v1-merged.pth` |
| `v1-merged-alpha0.6-safetensors` | Merged version in `.safetensors` format | `Dia1.6-Portuguese-v1-merged.safetensors` |
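
To pick a checkpoint programmatically, the table can be turned into a small lookup. This is a sketch under stated assumptions: the file names are written without spaces around the hyphens, `checkpoint_filename` and `download_checkpoint` are hypothetical helpers, and the Hugging Face repository id is deliberately left as a parameter because this page does not state it. `hf_hub_download` is the standard download function from the `huggingface_hub` package.

```python
# Version-to-file mapping copied from the Versions table.
CHECKPOINT_FILES = {
    "v1": "Dia1.6-Portuguese-v1.pth",
    "v1-safetensors": "Dia1.6-Portuguese-v1.safetensors",
    "v1-merged-alpha0.6": "Dia1.6-Portuguese-v1-merged.pth",
    "v1-merged-alpha0.6-safetensors": "Dia1.6-Portuguese-v1-merged.safetensors",
}


def checkpoint_filename(version: str) -> str:
    """Return the checkpoint file name for a version listed in the table."""
    try:
        return CHECKPOINT_FILES[version]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINT_FILES))
        raise ValueError(
            f"Unknown version {version!r}; expected one of: {known}"
        ) from None


def download_checkpoint(repo_id: str, version: str) -> str:
    """Download the chosen checkpoint and return its local path.

    `repo_id` must be supplied by the caller; it is not given in this page.
    """
    # Imported lazily so the lookup above works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=checkpoint_filename(version))
```

The `.safetensors` variants are generally the safer choice for untrusted downloads, since `.pth` files are pickle-based and can execute code when loaded.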

📁 Files

config.json: Dia model configuration
Dia1.6-Portuguese-v1.pth: Full fine-tuned model
Dia1.6-Portuguese-v1.safetensors: Same as above, but in safetensors format
Dia1.6-Portuguese-v1-merged.pth: Merged version (alpha = 0.6)
Dia1.6-Portuguese-v1-merged.safetensors: Merged version in safetensors format
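
The card does not describe how the merged checkpoint was produced. One common way to merge a fine-tuned model with its base is linear interpolation of matching weights, sketched below. Two things here are assumptions: that the merge was a plain per-parameter interpolation, and that `alpha` weights the fine-tuned model (so `alpha = 0.6` means 60% fine-tuned, 40% original). `merge_state_dicts` is a hypothetical helper; the demo uses plain floats, but the same expression works on PyTorch tensors from a loaded `state_dict`.

```python
from typing import Any, Dict, Mapping


def merge_state_dicts(
    finetuned: Mapping[str, Any],
    original: Mapping[str, Any],
    alpha: float = 0.6,
) -> Dict[str, Any]:
    """Linearly interpolate two state dicts with identical keys.

    merged[k] = alpha * finetuned[k] + (1 - alpha) * original[k]
    Values may be floats, NumPy arrays, or torch tensors.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if finetuned.keys() != original.keys():
        raise ValueError("state dicts must have the same parameter names")
    return {
        name: alpha * finetuned[name] + (1.0 - alpha) * original[name]
        for name in finetuned
    }


if __name__ == "__main__":
    # Toy "weights" standing in for real tensors.
    pt_only = {"layer.weight": 1.0, "layer.bias": 0.0}
    base = {"layer.weight": 0.0, "layer.bias": 1.0}
    print(merge_state_dicts(pt_only, base, alpha=0.6))
```

With `alpha = 1.0` the result equals the fine-tuned weights, and with `alpha = 0.0` it equals the original ones, which makes the direction of the blend easy to sanity-check.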

▶️ Audio Samples

Prompt	Audio Samples
Ex. 1 `[S1] Às vezes, tudo o que você precisa é respirar fundo e lembrar que nem tudo precisa ser resolvido hoje. A calma também é uma forma de seguir em frente.`	🎧 Original (Failed to generate) 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 2 `[S1] Eu lembro exatamente da primeira vez que ouvi minha própria voz gerada por IA. Foi estranho, quase surreal. Mas ao mesmo tempo, foi incrível perceber até onde a tecnologia já chegou.`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 3 `[S1] Era uma vez um menino chamado Leo que adorava olhar para o céu. Todas as noites ele subia no telhado de casa com seu velho binóculo e ficava horas tentando contar as estrelas. Um dia, ele viu algo diferente. Não era um avião, nem um satélite. Era uma luz que piscava lentamente, mudando de cor. No dia seguinte, ninguém acreditou nele. Mas Leo sabia o que tinha visto. E naquela noite, a luz voltou. Só que dessa vez, ela piscou duas vezes... como se estivesse respondendo.`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 4 `[S1] Cara, sério... esse setup ficou simplesmente insane. Nunca vi uma configuração tão limpa!`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 5 `[S1] Aproveite agora a promoção especial da semana. São até cinquenta por cento de desconto em produtos selecionados, por tempo limitado. Corra e garanta o seu antes que acabe.`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 6 `[S1] Se você ainda não testou esse modelo, tá perdendo tempo. (laughs) Ele é rápido, leve e roda até em máquina fraca. Sério, eu não esperava tanto desempenho em algo open source.`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 7 `[S1] Acredite: ninguém no mundo tem exatamente o que você tem. Sua visão, sua voz, sua forma de enxergar as coisas. Isso já é suficiente pra começar.`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 8 `[S1] Você diz que quer mudar, mas continua fazendo tudo igual. Quer resultado novo com atitude velha? Não funciona. O mundo não vai parar pra te esperar, e a oportunidade não fica batendo na porta pra sempre. Ou você levanta agora e faz o que precisa, ou aceita viver sempre no quase.`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 9 `[S1] Você vai desistir agora? Depois de tudo que já passou?` `[S2] (sighs) Eu tô cansado. Nada parece dar certo.` `[S1] Cansado todo mundo fica. Mas você não chegou até aqui por sorte.` `[S2] (pause) Eu só... não sei se ainda consigo.` `[S1] Consegue sim. Você já levantou antes. (inhales) Levanta de novo.` `[S2] (exhales) Tá certo. Não acabou enquanto eu não disser que acabou.` `[S1] Isso. Agora vai lá e faz o que tem que ser feito.`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6
Ex. 10 `[S1] Welcome back to the channel! Today, I’m going to show you how to turn basic text into realistic speech using open-source tools. It’s easier than you think, and by the end of this video, you’ll be able to generate your own voiceovers like a pro.`	🎧 Original 🇧🇷 PT Only 🔀 Merged 0.6

🏷️ Tags

tts, portuguese, finetuned, text-to-audio, CETUC, Dia, speech-synthesis, huggingface, audio-generation

📄 License

Apache 2.0, the same license as the original [Dia](https://huggingface.co/nari-labs/Dia-1.6B) model.

🙏 Acknowledgements

Original model by [nari-labs](https://huggingface.co/nari-labs)
Brazilian Portuguese dataset from CETUC
Transcription with Whisper V3 Turbo and Pyannote
Fine-tuning scripts by [stlohrey/dia-finetuning](https://github.com/stlohrey/dia-finetuning)
Custom training scripts, dataset preparation, and model adaptation by [alisson-anjos](https://github.com/alisson-anjos/dia-finetuning)
