
Meta Llama 3.1 405B Instruct FP8 Dynamic

Developed by RedHatAI
An FP8-quantized version of Meta-Llama-3.1-405B-Instruct, suitable for multilingual commercial and research use and optimized for assistant-like chat scenarios.
Downloads: 97
Release Date: 7/23/2024

Model Overview

This model is a quantized version of Meta-Llama-3.1-405B-Instruct. Quantizing weights and activations to the FP8 data type reduces disk size and GPU memory requirements by approximately 50%. It is intended for assistant-like chat use cases.
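As a rough illustration of how an FP8 checkpoint like this is typically loaded for chat-style inference, the sketch below uses vLLM. The repository identifier, tensor-parallel degree, and sampling settings are assumptions for illustration, not values taken from this page.

```python
# Minimal inference sketch (assumed serving path; not an official recipe).
# The model id and tensor_parallel_size are illustrative assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="RedHatAI/Meta-Llama-3.1-405B-Instruct-FP8-dynamic",  # assumed repo id
    tensor_parallel_size=8,  # a 405B model in FP8 still requires a multi-GPU node
)

sampling = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=256)

# Plain-text prompt for brevity; a production setup would apply the
# model's chat template before generation.
outputs = llm.generate("Summarize the benefits of FP8 quantization.", sampling)
print(outputs[0].outputs[0].text)
```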

Model Features

FP8 quantization
Weights and activations are quantized to the FP8 data type, reducing disk size and GPU memory requirements by approximately 50% (see the sketch after this list).
Multilingual support
Supports multiple languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
High recovery rate
Achieves performance close to the original model across benchmarks, including a 99.0% score recovery on the Arena-Hard evaluation.
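To make the FP8 claim concrete, the following PyTorch sketch shows the basic per-tensor dynamic scaling arithmetic behind FP8 (E4M3) quantization and the resulting roughly 50% storage reduction versus BF16. It is a simplified illustration, not the exact toolchain or recipe used to produce this checkpoint.

```python
# Simplified sketch of per-tensor dynamic FP8 (E4M3) quantization in PyTorch.
# Illustrates the ~50% storage saving vs. BF16; the released checkpoint was
# produced with a dedicated quantization toolchain, not this snippet.
import torch

def quantize_fp8_dynamic(x: torch.Tensor):
    """Scale a tensor into float8_e4m3fn range and cast it."""
    fp8_max = torch.finfo(torch.float8_e4m3fn).max      # 448.0 for E4M3
    scale = x.abs().max().clamp(min=1e-12) / fp8_max     # dynamic per-tensor scale
    x_fp8 = (x / scale).clamp(-fp8_max, fp8_max).to(torch.float8_e4m3fn)
    return x_fp8, scale

def dequantize_fp8(x_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return x_fp8.to(torch.float32) * scale

w = torch.randn(4096, 4096, dtype=torch.bfloat16)
w_fp8, scale = quantize_fp8_dynamic(w.float())

# BF16 stores 2 bytes per element, FP8 stores 1 byte: roughly half the memory.
print(w.element_size(), w_fp8.element_size())  # 2 1
err = (dequantize_fp8(w_fp8, scale) - w.float()).abs().mean()
print(f"mean abs quantization error: {err:.4e}")
```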

Model Capabilities

Text generation
Multilingual conversation
Mathematical reasoning
Multiple-choice tasks

Use Cases

Chatbot
Multilingual assistant
Acts as a multilingual assistant, supporting conversation and task completion in multiple languages.
Achieved a score of 66.7 on the Arena-Hard evaluation.
Research tool
Language model research
Used to study the impact of quantization on the performance of large language models (see the sketch below).
Achieved performance close to the original model on the OpenLLM v1 and v2 evaluations.
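For researchers reproducing this kind of quantization-impact comparison, a typical workflow runs the same open benchmarks against the FP8 and original checkpoints. The sketch below uses the lm-evaluation-harness Python API; the task list, model id, and batch size are assumptions for illustration, not the exact setup behind the scores reported on this page.

```python
# Hypothetical sketch: scoring the FP8 checkpoint on open benchmarks with
# lm-evaluation-harness. Task names and model id are illustrative only, and
# a 405B model would in practice be sharded across many GPUs.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=RedHatAI/Meta-Llama-3.1-405B-Instruct-FP8-dynamic,dtype=auto",
    tasks=["arc_challenge", "gsm8k"],
    batch_size=4,
)
print(results["results"])  # per-task metrics for comparison with the original model
```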