# FP8 Quantization Acceleration

## InternVL3 38B FP8 Dynamic
License: MIT
The FP8 dynamic quantization of OpenGVLab/InternVL3-38B, optimized for high-performance inference with vLLM. It achieves roughly 2x acceleration on vision-language tasks with minimal accuracy loss.
Tags: Image-to-Text, Safetensors, Supports Multiple Languages
Publisher: ConfidentialMind · 5,173 downloads · 1 like
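
As a rough illustration of the vLLM workflow these FP8 checkpoints target, the sketch below runs offline vision-language inference on the InternVL3 quantization. The repository id, image path, prompt placeholder, and context length are assumptions for illustration, not details taken from the listing.

```python
# Minimal offline-inference sketch with vLLM; the repository id, image path,
# prompt placeholder, and max_model_len below are illustrative assumptions.
from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(
    model="ConfidentialMind/InternVL3-38B-FP8-Dynamic",  # assumed repo id
    trust_remote_code=True,
    max_model_len=8192,
)

image = Image.open("example.jpg")                  # any local image
prompt = "<image>\nDescribe this image."           # assumed image placeholder format

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0.2, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

vLLM reads the quantization scheme from the checkpoint's config, so no extra quantization flag is needed when loading an FP8 model exported in this format.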
## Llama 4 Scout 17B 16E Instruct FP8 Dynamic
License: Other
A Llama 4 based multilingual instruction model with 17B active parameters across 16 experts, quantized to FP8 to significantly reduce resource requirements.
Tags: Image-to-Text, Supports Multiple Languages
Publisher: RedHatAI · 5,812 downloads · 8 likes
## Qwen2.5 VL 72B Instruct FP8 Dynamic
License: Apache-2.0
The FP8 quantized version of Qwen2.5-VL-72B-Instruct, accepting vision and text input and producing text output, suitable for multimodal tasks.
Tags: Image-to-Text, Transformers, English
Publisher: RedHatAI · 1,837 downloads · 3 likes
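
These FP8 checkpoints can also be served through vLLM's OpenAI-compatible server rather than the offline API; the sketch below queries such a server with the standard openai client. The repository id, GPU count, and port are assumptions, not details from the listing.

```python
# Assumed server launch (run separately in a shell):
#   vllm serve RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-Dynamic --tensor-parallel-size 4
# The repository id and GPU count above are illustrative assumptions.
from openai import OpenAI

# vLLM exposes an OpenAI-compatible endpoint on port 8000 by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-Dynamic",  # assumed repo id
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }],
    max_tokens=128,
)
print(response.choices[0].message.content)
```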