DeepSeek R1 FP4

Developed by NVIDIA
An FP4-quantized version of the DeepSeek R1 model that uses an optimized Transformer architecture for efficient text generation
Downloads: 61.51k
Release Time: 2/21/2025

Model Overview

An FP4-quantized model based on DeepSeek R1, optimized for inference with TensorRT-LLM and supporting generation with context lengths of up to 128K tokens

Model Features

FP4 Quantization Technology
Weights and activations are quantized to FP4 with TensorRT Model Optimizer, reducing storage requirements by 1.6x
Long Context Support
Processes contexts of up to 128K tokens
Blackwell Architecture Optimization
Inference performance is optimized specifically for the NVIDIA Blackwell GPU architecture
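
To make the FP4 quantization feature above more concrete, here is a minimal pure-Python sketch of block-scaled quantization to the 4-bit E2M1 floating-point format. It is an illustration only, not the TensorRT Model Optimizer implementation: the block size of 16, the absmax scaling rule, and the function names (quantize_block, dequantize_block, fake_quantize) are assumptions made for this example and are not taken from the model card.

```python
# Non-negative magnitudes representable in FP4 E2M1
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)

# Assumption: illustrative block size, not taken from the model card.
BLOCK_SIZE = 16


def quantize_block(block):
    """Return (scale, codes) for one block; codes are signed E2M1 values."""
    absmax = max(abs(x) for x in block)
    if absmax == 0.0:
        # An all-zero block needs no scaling.
        return 1.0, [0.0] * len(block)
    # Map the largest magnitude in the block to the largest E2M1 value (6.0).
    scale = absmax / E2M1_MAGNITUDES[-1]
    codes = []
    for x in block:
        target = abs(x) / scale
        # Round to the nearest representable magnitude.
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(nearest if x >= 0 else -nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate real values from a scale and E2M1 codes."""
    return [scale * c for c in codes]


def fake_quantize(values, block_size=BLOCK_SIZE):
    """Quantize then dequantize, to show the rounding error FP4 introduces."""
    out = []
    for start in range(0, len(values), block_size):
        scale, codes = quantize_block(values[start:start + block_size])
        out.extend(dequantize_block(scale, codes))
    return out


if __name__ == "__main__":
    weights = [0.12, -0.40, 0.03, 0.95, -0.77, 0.50]
    print(fake_quantize(weights))
```

In this sketch each value is stored as a 4-bit code, and each block also stores one scale, so the effective number of bits per value is slightly above 4.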

Model Capabilities

Text generation
Long text comprehension
Knowledge Q&A

Use Cases

Content generation
Article continuation
Automatically generates coherent follow-on content from a given opening
Knowledge Q&A
Factual Q&A
Answers a wide range of world-knowledge questions
Achieved 90.7% accuracy on the MMLU benchmark
© 2025 AIbase