Mistral 7B Instruct V0.3 GPTQ 4bit
A 4-bit quantized version of Mistral-7B-Instruct-v0.3 that uses the GPTQ method to optimize inference performance while maintaining high accuracy.
Downloads: 9,897
Release date: 5/23/2024
Model Overview
This model is a 4-bit weight-quantized version of Mistral-7B-Instruct-v0.3, designed for efficient natural language processing tasks. It improves inference speed while retaining 99.75% of the original model's accuracy.
Model Features
Efficient 4-bit quantization
Compresses the model weights to 4 bits using the GPTQ method, significantly reducing memory usage and computational requirements.
High accuracy retention
Retains 99.75% of the original model's accuracy, with minimal performance loss.
Optimized inference performance
Supports vLLM's Marlin mixed-precision kernel for efficient inference.
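The memory savings behind 4-bit weight quantization can be illustrated with a minimal sketch. Note this is plain symmetric group-wise rounding in NumPy for illustration only; the actual GPTQ algorithm additionally uses second-order (Hessian) information to choose quantized values that minimize output error, which is omitted here. The group size of 128 is a common convention, not a value stated by this model card.

```python
import numpy as np

GROUP = 128  # number of weights sharing one scale (illustrative choice)

def quantize_4bit(w, group=GROUP):
    """Symmetric per-group 4-bit quantization of a 1-D weight vector."""
    w = w.reshape(-1, group)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # map max |w| to code 7
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # 4-bit range [-8, 7]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from 4-bit codes and scales."""
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=1024).astype(np.float32)  # toy weight vector

q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)

# Codes fit in 4 bits; reconstruction error stays small relative to the weights.
assert q.min() >= -8 and q.max() <= 7
print(f"max abs error: {np.abs(w - w_hat).max():.5f}")
```

In practice the int8 codes would be packed two per byte, giving roughly a 4x reduction versus fp16 storage (plus a small overhead for the per-group scales); kernels like Marlin fuse the dequantization into the matrix multiply so the compressed weights are used directly at inference time.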
Model Capabilities
Text generation
Question-answering system
Code generation
Text summarization
Dialogue system
Use Cases
Education
Mathematical problem solving: GSM8K dataset, 5-shot accuracy 45.41%
Knowledge Q&A
Common-sense reasoning: AI2 Reasoning Challenge (ARC), 25-shot accuracy 63.40%
Language understanding
Language understanding evaluation: HellaSwag dataset, 10-shot accuracy 84.04%
© 2025 AIbase