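Model Overview

This model converts Moroccan Arabic dialect (Darija) text into natural-sounding speech and suits applications such as voice assistants and audiobooks.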

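Model Features

Dialect support

Optimized specifically for the Moroccan Arabic dialect to produce more natural speech output

Efficient fine-tuning

Fine-tuned efficiently with Unsloth's SFTTrainer and LoRA

High-quality dataset

Trained on the cleaned DarijaTTS-clean dataset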

Model Capabilities

Moroccan dialect text-to-speech

Speech synthesis

Long-text conversion
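
The model card does not say how long inputs are handled internally. One common approach on the caller's side, sketched here as an assumption rather than the model's own method, is to split the text at sentence boundaries and synthesize each chunk separately. The helper name `split_into_chunks` and the `max_chars` limit are hypothetical.

```python
import re

# Split after Latin sentence punctuation, the Arabic question mark, or the Arabic full stop.
_SENTENCE_END = re.compile(r"(?<=[.!?؟۔])\s+")

def split_into_chunks(text, max_chars=200):
    """Group sentences into chunks of at most max_chars characters where possible.

    A single sentence longer than max_chars is kept whole rather than cut mid-sentence.
    """
    sentences = [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be passed to a synthesis call one at a time, with the resulting audio files joined afterwards.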

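Use Cases

Voice assistants

Dialect voice assistant

Provides a localized voice interaction experience for Moroccan users

Generates natural, fluent Moroccan dialect speech

Education

Dialect learning tool

Helps learners practice Moroccan dialect pronunciation

Provides accurate dialect pronunciation demonstrations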

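🚀 Moroccan Darija Text-to-Speech Model

This is a text-to-speech (TTS) model for Moroccan Darija, fine-tuned from OuteAI/OuteTTS-0.2-500M on the KandirResearch/DarijaTTS-clean dataset. It converts Darija text into natural, fluent speech for use in Darija voice applications.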

🚀 Quick Start

Install dependencies

To run the model, first install outetts, llama-cpp-python, and huggingface_hub:

pip install outetts llama-cpp-python huggingface_hub
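
As an optional sanity check after installation, a small script along these lines can confirm that the three packages are importable. The helper name `missing_modules` is hypothetical; the only fixed detail is that llama-cpp-python is imported under the module name `llama_cpp`.

```python
import importlib.util

# Import names for the pip packages above; llama-cpp-python is imported as llama_cpp.
REQUIRED_MODULES = ["outetts", "llama_cpp", "huggingface_hub"]

def missing_modules(names):
    """Return the module names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```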

Run the model

import outetts
from outetts.models.config import GenerationConfig
from huggingface_hub import hf_hub_download

# Download the quantized GGUF weights from the Hugging Face Hub.
model_path = hf_hub_download(
    repo_id="KandirResearch/DarijaTTS-v0.1-500M",
    filename="unsloth.Q8_0.gguf",
)
# Point outetts at the downloaded GGUF file and the matching tokenizer.
model_config = outetts.GGUFModelConfig_v2(
    model_path=model_path,
    tokenizer_path="KandirResearch/DarijaTTS-v0.1-500M",
)
interface = outetts.InterfaceGGUF(model_version="0.3", cfg=model_config)

def tts(text, temperature=0.3, repetition_penalty=1.1):
    """Synthesize `text`, save it to output.wav, and return the file path."""
    gen_cfg = GenerationConfig(
        text=text,
        temperature=temperature,
        repetition_penalty=repetition_penalty,
        max_length=4096,
    )
    output = interface.generate(config=gen_cfg)
    # Each call writes to the same file, so earlier output is overwritten.
    output_path = "output.wav"
    output.save(output_path)
    return output_path

# Example usage
audio_path = tts("السلام كيداير لاباس عليك؟")
print(f"Generated audio saved to: {audio_path}")
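
Because the function above always writes to output.wav, repeated calls overwrite earlier audio. A minimal sketch of one way around this, assuming a hypothetical helper `output_path_for` and output directory `tts_out`, derives the file name from a hash of the input text:

```python
import hashlib
from pathlib import Path

def output_path_for(text, out_dir="tts_out"):
    """Build a stable .wav path from the text so different inputs are unlikely to collide."""
    # A short SHA-1 prefix keeps file names compact while staying deterministic per text.
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()[:12]
    return Path(out_dir) / f"{digest}.wav"

if __name__ == "__main__":
    for line in ["السلام", "شكرا بزاف"]:
        print(output_path_for(line))
```

The returned path could replace the fixed "output.wav" string inside tts, provided the output directory is created first.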

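✨ Key Features

Targeted: designed specifically for Moroccan Darija, so it handles speech synthesis for the language better.
Fine-tuned: trained with Unsloth's SFTTrainer to improve model performance.
Efficient training: uses LoRA-based fine-tuning to improve training efficiency.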
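
To illustrate why LoRA-based fine-tuning is cheaper than updating full weight matrices, the sketch below compares parameter counts for a single linear layer. LoRA freezes the original d_out x d_in weight and trains two low-rank factors of shapes d_out x r and r x d_in. The layer size and rank used here are hypothetical and are not taken from this model's training configuration.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def full_params(d_in, d_out):
    """Parameters in the full weight matrix that LoRA leaves frozen."""
    return d_in * d_out

if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_in = d_out = 1024
    rank = 16
    lora = lora_trainable_params(d_in, d_out, rank)
    full = full_params(d_in, d_out)
    print(lora, full, lora / full)  # prints: 32768 1048576 0.03125
```

With these assumed numbers, the trainable factors hold about 3% of the parameters of the frozen matrix.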

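📚 Documentation

Model Details

Property	Details
Base model	OuteAI/OuteTTS-0.2-500M
Dataset	KandirResearch/DarijaTTS-clean
Training method	Fine-tuned with Unsloth's `SFTTrainer`
Dataset preparation	Preprocessed following the OuteTTS training guide
Demo	Click here to try it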