
DeepSeek-R1-Distill-Qwen-14B-quantized.w8a8

Developed by neuralmagic
The quantized version of DeepSeek-R1-Distill-Qwen-14B, optimized with INT8 quantization for weights and activations, reducing GPU memory requirements and improving computational efficiency.
Downloads 765
Release Time: 2/4/2025

Model Overview

This model is a quantized version of DeepSeek-R1-Distill-Qwen-14B, optimized with INT8 quantization for both weights and activations, which significantly reduces GPU memory requirements and improves computational throughput. It is suitable for text generation tasks.
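To make the INT8 scheme concrete, here is a minimal illustrative sketch of symmetric per-tensor INT8 quantization in plain Python. This is a simplified model of the general technique, not Neural Magic's actual calibration pipeline; the function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: real value ~= scale * int8.

    The scale maps the largest absolute weight to 127, so every weight
    fits in the signed 8-bit range [-128, 127].
    """
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from INT8 codes."""
    return [scale * v for v in q]

# Toy example: four weights stored in 1 byte each instead of 4 (FP32).
w = [0.05, -0.32, 0.91, -1.27]
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
```

Storing INT8 codes plus one scale per tensor is what cuts memory and disk usage roughly 2x versus FP16 (4x versus FP32), at the cost of small rounding error in `w_hat`.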

Model Features

INT8 Quantization
Optimizes weights and activations with INT8 quantization, significantly reducing GPU memory requirements and disk space usage.
Efficient Inference
Deployed using the vLLM backend, supporting efficient text generation tasks.
High Performance Recovery
Recovers over 99% of the original model's accuracy across multiple evaluation tasks.
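The vLLM deployment mentioned above can be sketched as follows. This is an assumed invocation, not an official quickstart; the Hugging Face model ID and the `--max-model-len` value are assumptions to verify against the actual repository.

```shell
# Hypothetical sketch: serve the quantized model via vLLM's
# OpenAI-compatible server (model ID assumed from the card's naming).
vllm serve neuralmagic/DeepSeek-R1-Distill-Qwen-14B-quantized.w8a8 \
  --max-model-len 4096

# Then query it with the OpenAI-compatible chat completions endpoint.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "neuralmagic/DeepSeek-R1-Distill-Qwen-14B-quantized.w8a8",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```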

Model Capabilities

Text Generation
Dialogue Systems
Code Generation

Use Cases

Dialogue Systems
Intelligent Customer Service
Used to build efficient intelligent customer service systems, providing natural and smooth conversational experiences.
Retains over 99% of the original model's performance on dialogue tasks.
Code Generation
Code Completion
Used for code completion and generation tasks, improving development efficiency.
Achieves a 99.4% performance recovery rate in HumanEval evaluations.