
Mistral 7B Instruct V0.3 GPTQ 4bit

Developed by RedHatAI
A 4-bit quantized version of Mistral-7B-Instruct-v0.3 that optimizes inference performance through the GPTQ method while maintaining high accuracy.
Downloads: 9,897
Release date: 5/23/2024

Model Overview

This model is a 4-bit weight-quantized version of Mistral-7B-Instruct-v0.3, designed for efficient natural language processing tasks. It improves inference speed while retaining 99.75% of the original model's accuracy.

Model Features

Efficient 4-bit quantization
Compresses the model weights to 4 bits via the GPTQ method, significantly reducing memory usage and compute requirements
High accuracy retention
Retains 99.75% of the original model's accuracy, with minimal performance loss
Optimized inference performance
Supports vLLM's Marlin mixed-precision kernel for efficient inference
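To make the 4-bit compression concrete, here is a minimal sketch of weight-only 4-bit quantization with per-group scales. This is a simplified round-to-nearest illustration, not the actual GPTQ algorithm (GPTQ additionally applies Hessian-based error compensation when rounding); the function names and group size are illustrative.

```python
import numpy as np

def quantize_4bit(weights, group_size=128):
    """Round-to-nearest 4-bit quantization with one scale per group.

    Simplified illustration only; real GPTQ compensates rounding
    error column-by-column using second-order (Hessian) information.
    """
    w = weights.reshape(-1, group_size)
    # int4 symmetric range is -8..7; scale maps the group max to 7
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale, shape):
    """Recover approximate float weights from int4 codes and scales."""
    return (q.astype(np.float32) * scale).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 128)).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s, w.shape)
max_err = np.abs(w - w_hat).max()  # bounded by half a quantization step
```

Each weight is stored in 4 bits instead of 16, roughly a 4x reduction in weight memory (plus a small overhead for the per-group scales), which is what drives the memory and bandwidth savings cited above.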

Model Capabilities

Text generation
Question answering
Code generation
Text summarization
Dialogue
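These capabilities can be served through vLLM, which, per the note above, supports the Marlin mixed-precision kernel for GPTQ checkpoints. The sketch below assumes a CUDA GPU, the vllm package, and network access to download the model; the checkpoint ID is a placeholder, not a confirmed repository name.

```python
# Sketch only: requires a CUDA GPU and the vllm package installed.
# "<gptq-4bit-checkpoint-id>" stands in for the actual RedHatAI
# GPTQ checkpoint ID on Hugging Face.
from vllm import LLM, SamplingParams

# vLLM detects the GPTQ quantization config from the checkpoint
# and can dispatch to the Marlin kernel where supported
llm = LLM(model="<gptq-4bit-checkpoint-id>")
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize the GPTQ method in two sentences."], params)
print(outputs[0].outputs[0].text)
```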

Use Cases

Education
Mathematical problem solving: GSM8K dataset, 5-shot accuracy 45.41%
Knowledge Q&A
Common-sense reasoning: AI2 Reasoning Challenge (ARC), 25-shot accuracy 63.40%
Language understanding
Language understanding evaluation: HellaSwag dataset, 10-shot accuracy 84.04%