# FP8 Quantization Optimization

## Qwen3 14B FP8 Dynamic

License: Apache-2.0 · Publisher: RedHatAI · Tags: Large Language Model, Transformers

Qwen3-14B-FP8-dynamic is an optimized large language model. Quantizing both activations and weights to the FP8 data type reduces GPU memory requirements and improves computational throughput.
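The "dynamic" in these model names refers to computing quantization scales at runtime from the observed activations rather than from a fixed calibration pass. The sketch below illustrates the general idea under stated assumptions: it uses the FP8 E4M3 format's maximum finite magnitude of 448 and models only the per-tensor scale and clipping step, not the 3-bit mantissa rounding that real FP8 hardware performs. The function names are hypothetical, not part of any of the listed models' code.

```python
FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def dynamic_fp8_scale(xs):
    # Per-tensor scale chosen at runtime from the observed abs-max,
    # so the largest value maps onto the FP8 E4M3 limit.
    return max(abs(v) for v in xs) / FP8_E4M3_MAX


def fake_quantize(xs):
    # Scale into the FP8 range, clip, and scale back. This models only
    # the dynamic range handling; true FP8 also rounds each mantissa
    # to 3 bits, which is not emulated here.
    s = dynamic_fp8_scale(xs)
    clipped = [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / s)) for v in xs]
    return [v * s for v in clipped], s
```

Because the scale adapts to each tensor at runtime, no calibration dataset is needed, which is part of what makes these "FP8-dynamic" checkpoints convenient to produce and deploy.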
## Llama 3.3 70B Instruct FP8 Dynamic

Publisher: RedHatAI · Tags: Large Language Model, Transformers, Multilingual

Llama-3.3-70B-Instruct-FP8-dynamic is an optimized large language model. Quantizing both activations and weights to the FP8 data type reduces GPU memory requirements and improves computational throughput, and the model supports commercial and research use in multiple languages.
## Llama 3.1 405B Instruct FP8

Publisher: nvidia · Tags: Large Language Model, Transformers

The NVIDIA Llama 3.1 405B Instruct FP8 model is a quantized version of Meta's Llama 3.1 405B Instruct model. It is an autoregressive language model built on an optimized Transformer architecture, and it can be used for commercial or non-commercial purposes.
© 2025 AIbase