# FP16 Efficient Inference

Spark TTS 0.5B Bf16

Spark-TTS-0.5B-fp16 is a text-to-speech model in MLX format that supports both English and Chinese.

Speech Synthesis, Supports Multiple Languages
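As a rough illustration of why half-precision weights matter for inference, the sketch below estimates the raw weight size of a model from its parameter count. It assumes the "0.5B" in the model name means roughly 0.5 billion parameters and counts weights only (no activations, caches, or file metadata); `weight_size_gb` and `BYTES_PER_PARAM` are hypothetical names used for illustration. Since fp16 and bf16 both store 2 bytes per parameter, the estimate is the same for either.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Assumption: nominal parameter count read off the "0.5B" in the model name.
    params = 0.5e9
    for dtype, nbytes in BYTES_PER_PARAM.items():
        print(f"{dtype}: ~{weight_size_gb(params, nbytes):.1f} GB")
```

Under these assumptions, the half-precision formats need about half the storage of fp32 for the same parameter count.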

Wan2.1 T2V 14B Gguf

A text-to-video generation model converted to GGUF format, usable in ComfyUI through the ComfyUI-GGUF custom nodes.

Controlnet Illustrious Softedge Hed Sdxl Fp16

A ControlNet model based on Stable Diffusion XL, specializing in image generation control through soft edge HED (Holistically-Nested Edge Detection).

Image Generation

Controlnet Kohaku Canny Sdxl Fp16

A ControlNet model based on Stable Diffusion XL, specializing in precise image generation control through Canny edge detection.

Image Generation

Faster Whisper Small

A CTranslate2 conversion of the OpenAI Whisper small model for efficient speech recognition.

Speech Recognition, Supports Multiple Languages
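CTranslate2 conversions of Whisper are typically run through the `faster-whisper` Python package. The following is a minimal sketch of that usage, not the listing's own instructions: it assumes `faster-whisper` is installed, that the size name `"small"` resolves to a converted small model, and that `audio.wav` is a hypothetical local file. The helpers `format_segment` and `transcribe` are illustrative names, not part of the library.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcribed segment as '[start -> end] text', times in seconds."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe(audio_path: str, model_name: str = "small") -> list[str]:
    """Transcribe an audio file with faster-whisper and return formatted lines."""
    # Imported here so format_segment stays usable without the package installed.
    from faster_whisper import WhisperModel

    # Assumption: int8 on CPU as a conservative default; float16 is the
    # usual choice when a CUDA GPU is available.
    model = WhisperModel(model_name, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path, beam_size=5)
    print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
    # `segments` is a generator, so decoding happens as the list is built.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe("audio.wav")` would download the model on first use and return one formatted line per segment. Passing a local directory containing a converted model as `model_name` should also work, which is how a specific conversion from this listing would be loaded.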

Faster Whisper Large Zh Cv11

A CTranslate2 conversion of the jonatasgrosman/whisper-large-zh-cv11 model for efficient speech recognition, optimized for Chinese.

Speech Recognition, Chinese

© 2025 AIbase