
Mistral Small 24B Instruct 2501 Quantized.w8a8

Developed by RedHatAI
An instruction-tuned Mistral model with 24B parameters, quantized to INT8 to significantly reduce GPU memory requirements and improve computational throughput.
Downloads: 158
Release date: 3/3/2025

Model Overview

A quantized version of Mistral-Small-24B-Instruct-2501 that supports multilingual text generation and dialogue tasks and is well suited to low-latency inference scenarios.
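For low-latency serving, a checkpoint like this is typically loaded with an inference engine such as vLLM, which supports W8A8 compressed checkpoints. The sketch below is a minimal, hedged example: the model ID, sampling parameters, and system prompt are assumptions for illustration, not verified against the actual repository.

```python
# Hedged sketch: serving the quantized checkpoint with vLLM.
# The model ID and parameters below are assumptions based on this card.

def build_chat(user_text: str) -> list[dict]:
    """Build an OpenAI-style message list for the instruct model."""
    return [
        {"role": "system", "content": "You are a helpful multilingual assistant."},
        {"role": "user", "content": user_text},
    ]

def main() -> None:
    # Imported lazily so the helper above stays usable without a GPU.
    from vllm import LLM, SamplingParams

    llm = LLM(model="RedHatAI/Mistral-Small-24B-Instruct-2501-quantized.w8a8")
    params = SamplingParams(temperature=0.7, max_tokens=256)
    outputs = llm.chat(build_chat("Summarize W8A8 quantization in one sentence."), params)
    print(outputs[0].outputs[0].text)

if __name__ == "__main__":
    main()
```

In a dialogue system, `build_chat` would be extended with the running conversation history before each generation call.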

Model Features

Efficient quantization
Uses the W8A8 quantization scheme, cutting memory usage and disk space by roughly 50% while approximately doubling computational throughput.
Multilingual support
Supports text generation and understanding in 24 languages.
Low-latency inference
The optimized model is particularly well suited to dialogue scenarios that require quick responses.
Enterprise-level deployment support
Provides a full-stack deployment solution for the Red Hat ecosystem.
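The "roughly 50% memory" figure above follows directly from the weight precision: W8A8 stores weights in 8 bits instead of the 16 bits used by a BF16 baseline. A back-of-the-envelope check (weights only, ignoring KV cache and activations, which this estimate does not cover):

```python
# Rough weight-memory estimate for a 24B-parameter model.
# These are approximations for the quantization saving, not measurements.

PARAMS = 24e9  # 24B parameters

def weight_gb(bits_per_param: float, params: float = PARAMS) -> float:
    """Weight footprint in gigabytes at the given precision."""
    return params * bits_per_param / 8 / 1e9

bf16 = weight_gb(16)  # 16-bit baseline: ~48 GB
int8 = weight_gb(8)   # INT8 weights:   ~24 GB
print(f"BF16: {bf16:.0f} GB, INT8: {int8:.0f} GB, saving {1 - int8 / bf16:.0%}")
```

The halved weight footprint is also what enables the throughput gain: INT8 tensor operations run at higher rates than 16-bit ones on supported GPUs.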

Model Capabilities

Multilingual text generation
Instruction following
Long document understanding
Programming assistance
Mathematical reasoning

Use Cases

Dialogue systems
Customer service bots
Build low-latency multilingual customer service dialogue systems.
Development assistance
Code generation
Help developers generate and optimize code snippets.
Education
Mathematical problem solving
Explain and solve mathematical problems.
GSM8K evaluation score: 90.00