xm_transformer_s2ut_hk-en

Developed by facebook
A fairseq-based speech-to-speech translation model with a single-pass decoder (S2UT, speech-to-unit translation) supporting Hokkien-English bidirectional translation
Downloads 17
Release Time: 10/7/2022

Model Overview

This model is a speech-to-speech translation system that translates Hokkien speech directly into English speech, or vice versa. It pairs a Transformer translation model with a HiFi-GAN vocoder for speech synthesis.

Model Features

End-to-End Speech Translation: Converts source-language speech directly into target-language speech without an intermediate text representation
Multi-Domain Training Data: Trained on supervised and weakly supervised data from domains such as TED talks, TV series, and the TAT corpus
High-Quality Speech Synthesis: Uses the unit_hifigan_mhubert vocoder to generate natural, fluent target speech
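
To make the hand-off between the translation model and the vocoder more concrete, here is a minimal Python sketch of the intermediate representation. In S2UT-style systems, the Transformer predicts a sequence of discrete acoustic unit IDs rather than text, and a unit vocoder turns those IDs into a waveform. The helper names (`parse_units`, `reduce_units`) and the example unit string are assumptions made for illustration; they are not part of the fairseq API, and the sketch does not load or run the actual model.

```python
from itertools import groupby
from typing import List


def parse_units(unit_string: str) -> List[int]:
    """Parse a whitespace-separated string of unit IDs into integers."""
    units = [int(token) for token in unit_string.split()]
    if any(u < 0 for u in units):
        raise ValueError("unit IDs must be non-negative")
    return units


def reduce_units(units: List[int]) -> List[int]:
    """Collapse each run of identical consecutive units into a single unit."""
    return [unit for unit, _ in groupby(units)]


# Hypothetical decoder output (assumed values, not produced by the real model).
# Repeated IDs stand for a unit that is held over several frames.
predicted = "71 71 12 12 12 45 3 3"

units = parse_units(predicted)
reduced = reduce_units(units)

print(units)    # frame-level unit IDs
print(reduced)  # one ID per run of repeats
```

Systems of this kind are often trained on such reduced sequences, with the vocoder predicting how long each unit should last; whether this particular checkpoint does so is not stated on this page.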

Model Capabilities

Hokkien-to-English Speech Translation
English-to-Hokkien Speech Translation
Direct Speech-to-Speech Conversion

Use Cases

Cross-Language Communication
Hokkien-English Real-Time Translation: Facilitates real-time spoken communication between Hokkien and English speakers
Media Content Localization
TV Series Dubbing: Automatically translates Hokkien TV series and dubs them into English