🚀 huihui-ai/DeepSeek-V3-0324-bf16
This model is DeepSeek-V3-0324 converted to BF16. We only provide the conversion commands for the Windows environment and information related to ollama.
🚀 Quick Start
Conversion in the Windows environment is much faster than in WSL2 (Ubuntu-22.04), provided you have sufficient memory or virtual memory. A native Linux environment has not been tested.
If you are in a Linux or WSL environment, please refer to huihui-ai/DeepSeek-R1-bf16.
If needed, we can upload the bf16 version.
✨ Features
- Converted from DeepSeek-V3-0324 to BF16.
- Provides the conversion commands and usage information.
📦 Installation
FP8 to BF16
- Download the deepseek-ai/DeepSeek-V3-0324 model, which requires approximately 641 GB of space.
```shell
cd /d C:\Users\admin\models
huggingface-cli download deepseek-ai/DeepSeek-V3-0324 --local-dir ./deepseek-ai/DeepSeek-V3-0324
```
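The download is resumable: re-running the same huggingface-cli command skips files that are already complete. As a quick sanity check, you can count the downloaded weight shards against the shard total listed in model.safetensors.index.json (a minimal cmd sketch, assuming the standard Hugging Face file layout):
```shell
:: Count downloaded weight shards; compare the number printed here with the
:: shard total in model.safetensors.index.json inside the model directory.
dir /b deepseek-ai\DeepSeek-V3-0324\*.safetensors | find /c /v ""
```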
- Create the environment.
```shell
conda create -yn DeepSeek-V3-0324 python=3.10
conda activate DeepSeek-V3-0324
pip install torch --index-url https://download.pytorch.org/whl/cu124
pip install -U triton-windows
pip install transformers==4.46.3
pip install safetensors==0.4.5
pip install sentencepiece
```
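Before starting the conversion, it is worth confirming that this environment can actually see the GPU (a quick check, not part of the original steps):
```shell
:: Prints the installed torch version and whether CUDA is available.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```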
- Convert to BF16, which requires approximately 1.3 TB of additional space. You need the conversion script fp8_cast_bf16.py from the "inference" folder of the deepseek-ai/DeepSeek-V3 repository.
```shell
cd deepseek-ai/DeepSeek-V3/inference
python fp8_cast_bf16.py --input-fp8-hf-path C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324/ --output-bf16-hf-path C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16
```
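fp8_cast_bf16.py rewrites only the weight shards and their index; based on the DeepSeek-V3 inference code, it does not appear to copy the tokenizer or config files, so you may need to bring those over yourself before the GGUF conversion (a sketch, assuming the standard Hugging Face file layout):
```shell
:: Copy tokenizer/config files into the BF16 directory. Do NOT copy
:: model.safetensors.index.json, which the cast script regenerates.
copy C:\Users\admin\models\deepseek-ai\DeepSeek-V3-0324\config.json C:\Users\admin\models\deepseek-ai\DeepSeek-V3-0324-bf16\
copy C:\Users\admin\models\deepseek-ai\DeepSeek-V3-0324\tokenizer.json C:\Users\admin\models\deepseek-ai\DeepSeek-V3-0324-bf16\
copy C:\Users\admin\models\deepseek-ai\DeepSeek-V3-0324\tokenizer_config.json C:\Users\admin\models\deepseek-ai\DeepSeek-V3-0324-bf16\
```
If the copied config.json still carries an FP8 quantization_config block, remove it, since the weights are now BF16.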
BF16 to f16.gguf
- Use the llama.cpp conversion script to convert DeepSeek-V3-0324-bf16 to GGUF format, which requires approximately 1.3 TB of additional space.
```shell
python convert_hf_to_gguf.py C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16 --outfile C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-f16.gguf --outtype f16
```
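Before committing another 227 GB to quantization, you can confirm the conversion produced a well-formed file. The gguf Python package (pip install gguf) ships a gguf-dump tool for this (assuming the script ends up on your PATH):
```shell
:: Dump the GGUF header: architecture, tensor count, and metadata.
gguf-dump C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-f16.gguf
```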
- Use the llama.cpp quantization tool to quantize the model (llama-quantize needs to be compiled first); other quantization types are also available, as shown after the command below. Here we convert to Q2_K, which requires approximately 227 GB of additional space.
```shell
llama-quantize C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-f16.gguf C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-Q2_K.gguf Q2_K
```
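Q2_K is the smallest practical type; the same command accepts any type shown in llama-quantize's usage output. For example, a higher-quality (and larger on disk; exact size not verified here) Q4_K_M build:
```shell
:: Same invocation, different quantization type.
llama-quantize C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-f16.gguf C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-Q4_K_M.gguf Q4_K_M
```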
- Use llama-cli to test.
```shell
llama-cli -m C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-Q2_K.gguf -n 2048
```
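llama-cli can also offload layers to the GPU and run an interactive chat; a sketch (tune -ngl, the number of offloaded layers, to fit your VRAM):
```shell
:: -ngl: layers to offload to the GPU; -cnv: interactive conversation mode.
llama-cli -m C:/Users/admin/models/deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-Q2_K.gguf -ngl 8 -cnv
```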
💻 Usage Examples
Use with ollama
Note: this model requires Ollama 0.5.5
Modelfile
```
FROM deepseek-ai/DeepSeek-V3-0324-bf16/ggml-model-Q2_K.gguf
TEMPLATE """{{- range $i, $_ := .Messages }}
{{- if eq .Role "user" }}<|User|>
{{- else if eq .Role "assistant" }}<|Assistant|>
{{- end }}{{ .Content }}
{{- if eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|Assistant|>
{{- end }}
{{- else if eq .Role "assistant" }}<|end▁of▁sentence|><|begin▁of▁sentence|>
{{- end }}
{{- end }}"""
PARAMETER stop <|begin▁of▁sentence|>
PARAMETER stop <|end▁of▁sentence|>
PARAMETER stop <|User|>
PARAMETER stop <|Assistant|>
PARAMETER num_gpu 1
```
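Save the Modelfile, then register and run the model with ollama:
```shell
ollama create DeepSeek-V3-0324 -f Modelfile
ollama run DeepSeek-V3-0324
```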
📄 License
This project is licensed under the MIT license.
🤝 Donation
If you like it, please click 'like' and follow us for more updates.
You can follow x.com/support_huihui to get the latest model information from huihui.ai.
Your donation helps us continue development and improvement; even the cost of a cup of coffee makes a difference.
- bitcoin:
bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge

