DeepSeek-qwen-Bllossom-32B
The DeepSeek-Bllossom series consists of models additionally trained to address the language-mixing and weak multilingual performance of the original DeepSeek-R1-Distill series. DeepSeek-qwen-Bllossom-32B is built on the DeepSeek-R1-Distill-Qwen-32B model and was developed to improve reasoning performance in Korean.
It is the first model jointly developed by the UNIVA and Bllossom teams.
Model | Base Model | Download |
---|---|---|
DeepSeek-qwen-Bllossom-1.5B | DeepSeek-R1-Distill-Qwen-1.5B | To be released |
DeepSeek-qwen-Bllossom-7B | DeepSeek-R1-Distill-Qwen-7B | To be released |
DeepSeek-llama3.1-Bllossom-8B | DeepSeek-R1-Distill-Llama-8B | 🤗 HuggingFace |
DeepSeek-qwen-Bllossom-14B | DeepSeek-R1-Distill-Qwen-14B | To be released |
DeepSeek-qwen-Bllossom-32B | DeepSeek-R1-Distill-Qwen-32B | 🤗 HuggingFace |
DeepSeek-llama3.3-Bllossom-70B | DeepSeek-R1-Distill-Llama-70B | 🤗 HuggingFace |
Quick Start
DeepSeek-qwen-Bllossom-32B is built on the DeepSeek-R1-Distill-Qwen-32B model. Because the base model was trained mainly on English and Chinese data, its reasoning performance dropped significantly when prompted in Korean. To address this, DeepSeek-Bllossom was further trained to carry out its internal thinking process in English while producing the final response in the language of the input. As a result, reasoning performance in Korean is greatly improved.
Training used Korean and English reasoning data. In addition to the STEM-focused data typically used to train the original DeepSeek-R1 model, data from a variety of other domains was included. Both the dataset design and the model training were aimed at the main goal of DeepSeek-qwen-Bllossom-32B: providing more accurate and reliable reasoning results in Korean.
Features
Post-training
DeepSeek-qwen-Bllossom-32B was post-trained on a variety of internally constructed reasoning data. This process distills the strong reasoning ability and Korean-language competence of larger models into the DeepSeek-R1-Distill-Qwen-32B model, complementing the original model and optimizing it to generate more accurate and reliable responses to complex reasoning problems.
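The card does not publish the post-training data format or pipeline. Purely as a hypothetical illustration, a distillation-style SFT record could pair a Korean question with a teacher model's English reasoning trace and a Korean final answer, mirroring the <think> output format shown in the Usage Examples below:

```python
# Hypothetical illustration only: the actual dataset schema and teacher models
# used for post-training are not published in this card.
distillation_sft_example = {
    "messages": [
        {
            "role": "system",
            "content": "Think step-by-step in English inside <think> tags, "
                       "then answer in the language of the question.",
        },
        {
            # Korean question: "Prove that there are infinitely many primes."
            "role": "user",
            "content": "소수가 무한히 많다는 것을 증명하세요.",
        },
        {
            # Teacher output: English reasoning trace, Korean final answer.
            "role": "assistant",
            "content": (
                "<think>\nAssume finitely many primes p_1, ..., p_n and consider "
                "N = p_1 * ... * p_n + 1; no p_i divides N, a contradiction.\n</think>\n"
                "유클리드의 귀류법에 따라 소수는 무한히 많습니다."
            ),
        },
    ]
}
```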
Usage Examples
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model and tokenizer (the other Bllossom variants load the same way).
model = AutoModelForCausalLM.from_pretrained(
    "UNIVA-Bllossom/DeepSeek-qwen-Bllossom-32B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("UNIVA-Bllossom/DeepSeek-qwen-Bllossom-32B")

# System prompt: think in English inside <think> tags, then give the final answer.
system = '''
You are a highly capable assistant. For every user question, follow these instructions exactly:
1. First, think through the problem step-by-step in English. Enclose all of your internal reasoning between <think> and </think> tags. This chain-of-thought should detail your reasoning process.
2. After the closing </think> tag, provide your final answer.
3. Do not include any additional text or commentary outside of this format.
4. Your output should strictly follow this structure:
<think>
[Your detailed step-by-step reasoning in English]
</think>
[Your final answer]
'''

# Korean example question (translation: "Cheolsu, Younghee, and Minsu received scores in
# 3 games. Younghee's score is twice Minsu's, and Minsu's score is 4 times Cheolsu's.
# If Cheolsu got 10 points, calculate the average score of the three.")
text = "철수, 영희, 민수가 3회의 게임에서 점수를 받았습니다. 영희의 점수는 민수의 점수의 두 배이며, 민수의 점수는 철수의 4배입니다. 철수가 10점을 받았다면 이 3명의 평균 점수를 계산하세요."

chat = [
    {"role": "system", "content": system},
    {"role": "user", "content": text}
]

prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer(
    prompt,
    return_tensors="pt",
    add_special_tokens=True
)
if "token_type_ids" in model_inputs:
    del model_inputs["token_type_ids"]
model_inputs = {k: v.to(model.device) for k, v in model_inputs.items()}

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=8192,
)
```
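The example above stops after generation. A minimal follow-up sketch (not part of the original snippet) decodes the newly generated tokens and splits the English reasoning trace from the final answer:

```python
# Decode only the newly generated tokens (strip the prompt portion).
input_length = model_inputs["input_ids"].shape[1]
output_text = tokenizer.decode(
    generated_ids[0][input_length:],
    skip_special_tokens=True
)

# Separate the English reasoning trace from the final answer, if present.
if "</think>" in output_text:
    reasoning, final_answer = output_text.split("</think>", 1)
    reasoning = reasoning.replace("<think>", "").strip()
    final_answer = final_answer.strip()
else:
    reasoning, final_answer = "", output_text.strip()

print(final_answer)
```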
Documentation
Model Response Comparison
Input (Prompt) | DeepSeek-R1-distill-Llama-70B | DeepSeek-llama3.3-Bllossom-70B |
---|---|---|
Alice, Bob, Charlie got scores in 3 games. Alice's score is twice that of Bob, and Bob's score is 4 times that of Charlie. If Charlie got 10 points, calculate the average score of these three people. | Charlie's score is 10 points. Bob's score is 4 times that of Charlie, so Bob's score is $$ 4 \times 10 = 40 $$ points. Alice's score is twice that of Bob, so Alice's score is $$ 2 \times 40 = 80 $$ points. The total score of the three people is $$ 10 + 40 + 80 = 130 $$ points. The average score is $$ \frac{130}{3} \approx 43.33 $$ points. $$ \boxed{43.33} $$ | Charlie's score is given as 10 points. Bob's score is 4 times Charlie's score, which is calculated as follows: $$ 4 \times 10 = 40 $$ Alice's score is twice Bob's score, which is as follows: $$ 2 \times 40 = 80 $$ The total score of the three people is as follows: $$ 10 + 40 + 80 = 130 $$ The average score of these three people is calculated by dividing the total score by the number of people: $$ \frac{130}{3} $$ Therefore, the average score of these three people is $$ \boxed{\dfrac{130}{3}} $$ . |
Prove the fact that prime numbers are infinite in three ways. | The following are the detailed steps to prove the infinity of prime numbers in three ways: Method 1: Euclid's proof by contradiction. 1. Assumption: Assume that prime numbers are finite, and all prime numbers are $ p_1, p_2, p_3, \ldots, p_n $. 2. Construct a number: Construct a new number $ N = (p_1 \times p_2 \times p_3 \times \ldots \times p_n) + 1 $. 3. Analyze the properties of N: If N is a prime number, then N is not in the original assumed list of prime numbers, which contradicts the assumption. If N is a composite number, then there exists a prime number $ p $ that divides N. But since N has a remainder of 1 when divided by any $ p_i $, $ p $ is not in the original list, which also contradicts the assumption. 4. Conclusion: Therefore, prime numbers must be infinite. Method 2: The property of Fermat numbers. 1. Definition: Fermat numbers are in the form of $ F_k = 2^{2^k} + 1 $, where $ k $ is a non-negative integer. 2. Assumption: Assume that prime numbers are finite, and all prime numbers are $ p_1, p_2, \ldots, p_n $. 3. Construct Fermat numbers: For each $ k $, construct $ F_k $. If $ F_k $ is a prime number, it is not in the original list; if it is a composite number, its prime factors are not in the original list. 4. Conclusion: Therefore, prime numbers must be infinite. Method 3: Euler's totient function and the infinity of prime numbers. 1. Euler's totient function: Euler's totient function $ \phi(n) $ represents the number of positive integers less than $ n $ and relatively prime to $ n $. 2. Assumption: Assume that prime numbers are finite, denoted as $ p_1, p_2, \ldots, p_n $. 3. Construct a number: Consider a number in the form of $ N = p_1^{k_1} p_2^{k_2} \ldots p_n^{k_n} $, and its Euler's totient function is $ \phi(N) = N \prod_{i=1}^{n} \left(1 - \frac{1}{p_i}\right) $. 4. Analysis: If prime numbers are finite, $ \prod_{p \text{ prime}} \left(1 - \frac{1}{p}\right) $ converges, but in fact, this product diverges, leading to a contradiction. 5. Conclusion: Therefore, prime numbers must be infinite. These three methods prove the infinity of prime numbers from different perspectives, demonstrating the profundity of the infinity of prime numbers in number theory. | To prove that prime numbers are infinite in three ways, we can use the following approaches: 1. Euclid's proof by contradiction: Assumption: Assume that prime numbers are a finite number $ p_1, p_2, \dots, p_n $. Construction: Define $ P = p_1 \times p_2 \times \dots \times p_n + 1 $. Conclusion: If $ P $ is a prime number, it is a new prime number not in the list. If $ P $ is a composite number, it must contain a prime number not in the list. In both cases, it contradicts the assumption. Therefore, prime numbers are infinite. 2. Factorial approach: Assumption: Assume that prime numbers are a finite number $ p_1, p_2, \dots, p_n $. Construction: Consider $ n! + 1 $, where $ n $ is the largest prime number. Conclusion: Since $ n! $ is divisible by all $ p_i $, $ n! + 1 $ does not contain any of the $ p_i $. If $ n! + 1 $ is a prime number, it is a new prime number greater than $ n $. If $ n! + 1 $ is a composite number, it must contain a prime number greater than $ n $. In both cases, it contradicts the assumption. Therefore, prime numbers are infinite. 3. Euler's proof using the zeta function: Concept: Euler showed that the zeta function $ \zeta(s) = \sum_{k=1}^\infty \frac{1}{k^s} $ can be expressed as a product over prime numbers: $ \zeta(s) = \prod_{p \text{ prime}} \frac{1}{1 - \frac{1}{p^s}} $. Conclusion: If prime numbers are finite, the product should be finite. However, when $ s $ approaches a value greater than 1, $ \zeta(s) $ diverges to infinity. This requires the product to diverge, and therefore, prime numbers must be infinite. Each method confirms that prime numbers are infinite and is based on different mathematical principles. Euclid's proof uses contradiction, the factorial approach uses simple number theory, and Euler's proof uses tools from analytic number theory. All three methods enhance the understanding of the infinity of prime numbers. |
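As a quick sanity check on the arithmetic in the first prompt (not part of the original card), both responses agree numerically: the exact average is 130/3 ≈ 43.33.

```python
from fractions import Fraction

# Scores from the first comparison prompt.
charlie = 10
bob = 4 * charlie   # Bob scores 4 times Charlie
alice = 2 * bob     # Alice scores twice Bob

average = Fraction(charlie + bob + alice, 3)
print(average, float(average))  # 130/3 43.333...
```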
Benchmark
- Inference was run with dtype float16 (see the generation-settings sketch after this list).
- max_tokens: 32786
- temperature: 0.7
- Evaluation method: each benchmark was run 3 times and the average score was reported.
- _en benchmarks: the original benchmark questions were used as-is.
- _ko benchmarks: the original benchmark questions were translated into high-quality Korean before evaluation.
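For reference, these settings correspond roughly to the following transformers generation arguments; this is a sketch under the assumption that max_tokens maps to max_new_tokens, since the actual evaluation harness is not specified in this card:

```python
from transformers import GenerationConfig

# Sampling settings matching the benchmark description above (assumed mapping).
benchmark_config = GenerationConfig(
    max_new_tokens=32786,  # "max_tokens" above
    do_sample=True,        # temperature > 0 implies sampling
    temperature=0.7,
)
# Usage with the model from the Usage Examples section:
# generated_ids = model.generate(**model_inputs, generation_config=benchmark_config)
```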
Model | AIME24_ko | AIME24_en | MATH500_ko | MATH500_en |
---|---|---|---|---|
DeepSeek-R1-Distill-Llama-8B | 25.56 | 46.67 | 63.40 | 88.87 |
DeepSeek-llama3.1-Bllossom-8B | 36.67 | 40.00 | 78.07 | 87.80 |
DeepSeek-R1-Distill-Qwen-32B | 48.89 | 75.56 | 86.87 | 93.47 |
DeepSeek-qwen-Bllossom-32B | 66.67 | 67.78 | 87.67 | 93.73 |
DeepSeek-R1-Distill-Llama-70B | 58.89 | 70.00 | 88.53 | 93.73 |
DeepSeek-llama3.3-Bllossom-70B | 62.22 | 65.56 | 88.40 | 93.33 |
License
This code repository and the model weights are licensed under the MIT License. The DeepSeek-Bllossom series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:
- DeepSeek-R1-Distill-Qwen-32B is derived from Qwen2.5-32B and is originally licensed under the Apache 2.0 License.
- DeepSeek-qwen-Bllossom-32B is derived from DeepSeek-R1-Distill-Qwen-32B and is originally licensed under the Apache 2.0 License.
Contributors
- UNIVA AI Team (UNIVA, Main contributor)
- Changsu Choi (Graduate student, MLP Lab, Seoul National University of Science and Technology)
- Kyeongtae Lim (Professor, MLP Lab, KAIST)
Contact
If you have any questions, please raise an issue or contact us at frodobaggins@univa.co.kr or ktlim@seoultech.ac.kr.

