# FP16 Efficient Inference
Spark TTS 0.5B Bf16
Spark-TTS-0.5B-fp16 is a text-to-speech model in MLX format that supports both English and Chinese.
Speech Synthesis, Supports Multiple Languages
mlx-community
121
0
Wan2.1 T2V 14B Gguf
Apache-2.0
A text-to-video generation model converted to GGUF format, usable through the ComfyUI-GGUF custom nodes.
Text-to-Video
city96
42.38k
130
Controlnet Illustrious Softedge Hed Sdxl Fp16
A ControlNet model based on Stable Diffusion XL that controls image generation through soft-edge HED (Holistically-Nested Edge Detection).
Image Generation
r3gm
60
0
Controlnet Kohaku Canny Sdxl Fp16
A ControlNet model based on Stable Diffusion XL that provides precise control over image generation through Canny edge detection.
Image Generation
r3gm
19
0
Faster Whisper Small
MIT
A CTranslate2 conversion of the OpenAI Whisper small model for efficient speech recognition.
Speech Recognition, Supports Multiple Languages
Systran
376.48k
13
Faster Whisper Large Zh Cv11
A CTranslate2 conversion of the jonatasgrosman/whisper-large-zh-cv11 model for efficient speech recognition, optimized for Chinese speech.
Speech Recognition, Chinese
arc-r
22
9