# FP8 Quantization Acceleration

## InternVL3 38B FP8 Dynamic
License: MIT
The FP8 dynamic quantization of OpenGVLab/InternVL3-38B, optimized for high-performance inference with vLLM. It achieves roughly 2x acceleration on vision-language tasks with minimal accuracy loss.
Tags: Image-to-Text, Safetensors, Supports Multiple Languages
Publisher: ConfidentialMind · 5,173 downloads · 1 like
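
As a rough illustration of the vLLM workflow these FP8 checkpoints target, the sketch below runs offline vision-language inference on the InternVL3 quantization. The repository id, image path, prompt placeholder, and context length are assumptions for illustration, not details taken from the listing.

```python
# Minimal offline-inference sketch with vLLM; the repository id, image path,
# prompt placeholder, and max_model_len below are illustrative assumptions.
from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(
    model="ConfidentialMind/InternVL3-38B-FP8-Dynamic",  # assumed repo id
    trust_remote_code=True,
    max_model_len=8192,
)

image = Image.open("example.jpg")                  # any local image
prompt = "<image>\nDescribe this image."           # assumed image placeholder format

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0.2, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

vLLM reads the quantization scheme from the checkpoint's config, so no extra quantization flag is needed when loading an FP8 model exported in this format.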
## Llama 4 Scout 17B 16E Instruct FP8 Dynamic
License: Other
A Llama 4 based multilingual instruction model with 17B active parameters across 16 experts, quantized to FP8 to significantly reduce resource requirements.
Tags: Image-to-Text, Supports Multiple Languages
Publisher: RedHatAI · 5,812 downloads · 8 likes
## Qwen2.5 VL 72B Instruct FP8 Dynamic
License: Apache-2.0
The FP8 quantized version of Qwen2.5-VL-72B-Instruct, accepting vision and text input and producing text output, suitable for multimodal tasks.
Tags: Image-to-Text, Transformers, English
Publisher: RedHatAI · 1,837 downloads · 3 likes
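
These FP8 checkpoints can also be served through vLLM's OpenAI-compatible server rather than the offline API; the sketch below queries such a server with the standard openai client. The repository id, GPU count, and port are assumptions, not details from the listing.

```python
# Assumed server launch (run separately in a shell):
#   vllm serve RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-Dynamic --tensor-parallel-size 4
# The repository id and GPU count above are illustrative assumptions.
from openai import OpenAI

# vLLM exposes an OpenAI-compatible endpoint on port 8000 by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-Dynamic",  # assumed repo id
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }],
    max_tokens=128,
)
print(response.choices[0].message.content)
```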