
# Low Latency Inference

Qwen3 30B A3B FP8 Dynamic
An FP8 dynamically quantized version of the Qwen/Qwen3-30B-A3B model, optimized for inference efficiency on Ampere-architecture GPUs.
Large Language Model Transformers
khajaphysist
403
2
Yolov8s Cs2
An object detection model based on YOLOv8 and YOLOv9, specifically designed for player detection in Counter-Strike 2.
Object Detection
Vombit
17
1
Alexturner
A voice conversion model based on RVC (Retrieval-based Voice Conversion) technology that transforms input audio into speech output in a specific style.
Speech Synthesis Transformers
sail-rvc
371
0
Ms Marco TinyBERT L4
Apache-2.0
An information retrieval model based on the TinyBERT architecture, trained specifically for the MS MARCO passage ranking task.
Text Embedding English
cross-encoder
380
1
© 2025 AIbase