Hicoder-R1-Distill-Gemma-27B
Hicoder-R1-Distill-Gemma-27B is a CoT-enabled large language model fine-tuned from Google's Gemma-3 27B base model. It is optimized for Chain-of-Thought (CoT) reasoning and code generation tasks and, notably, was trained on a single RTX 4090D by carefully managing GPU VRAM and system RAM and by applying targeted training techniques.
🚀 Quick Start
You can use this model with the Hugging Face `transformers` library.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model (bfloat16, placed automatically across available devices).
model_id = "tonyli8623/Hicoder-R1-Distill-Gemma-27B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# --- Simple code generation ---
prompt_simple = "Write a Python function to calculate the factorial of a number."
messages_simple = [
    {"role": "user", "content": prompt_simple}
]
input_ids_simple = tokenizer.apply_chat_template(
    messages_simple, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_simple = model.generate(
    input_ids_simple,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
# Decode only the newly generated tokens (skip the prompt).
response_simple = tokenizer.decode(outputs_simple[0][input_ids_simple.shape[1]:], skip_special_tokens=True)
print("--- Simple Code Generation ---")
print(response_simple)

# --- Code generation with an explicit Chain-of-Thought prompt ---
prompt_cot = """Think step-by-step to write a Python function that finds all prime numbers up to a given integer 'n' using the Sieve of Eratosthenes algorithm. Then, provide the function.
Let's break this down:
1. Understand the Sieve of Eratosthenes.
2. Outline the steps needed in the function.
3. Write the Python code based on the outline."""
messages_cot = [
    {"role": "user", "content": prompt_cot}
]
input_ids_cot = tokenizer.apply_chat_template(
    messages_cot, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_cot = model.generate(
    input_ids_cot,
    max_new_tokens=500,
    do_sample=True,
    temperature=0.6,
    top_k=50,
    top_p=0.95,
)
response_cot = tokenizer.decode(outputs_cot[0][input_ids_cot.shape[1]:], skip_special_tokens=True)
print("\n--- Code Generation with CoT ---")
print(response_cot)
```
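In bfloat16, the 27B parameters alone occupy roughly 54 GB, so the snippet above needs more memory than a single 24 GB consumer GPU provides. If you are memory-constrained, one common option (not specific to this model) is 4-bit quantized loading. The minimal sketch below assumes the `bitsandbytes` and `accelerate` packages are installed; the usual quality and speed trade-offs of quantization apply.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tonyli8623/Hicoder-R1-Distill-Gemma-27B"

# 4-bit NF4 quantization config (requires bitsandbytes; settings are illustrative).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```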
✨ Features
- Enhanced CoT Reasoning: Explicitly trained to break down complex problems into intermediate steps before providing a final answer, particularly useful for complex coding or algorithmic tasks.
- Strong Coding Capabilities: Generates, explains, debugs, and translates code across various programming languages (e.g., Python, JavaScript, Java, C++, SQL, etc.).
- Gemma-3 Foundation: Built upon the powerful and efficient architecture of Google's Gemma-3 27B model.
- Distillation Enhanced (Implied): Potentially benefits from knowledge distillation for improved performance relative to standard fine-tuning on the target tasks.
💻 Usage Examples
Basic Usage
The Quick Start code above demonstrates basic usage: simple code generation and code generation with an explicit CoT prompt.
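Beyond plain generation, the same chat-template pattern covers the other tasks listed under Features, such as debugging. The sketch below reuses the `tokenizer` and `model` loaded in Quick Start; the buggy function and the sampling settings are illustrative choices only.

```python
# Debugging example, reusing the tokenizer and model from Quick Start.
buggy_code = '''
def average(numbers):
    total = 0
    for n in numbers:
        total += n
    return total / len(numbers)  # crashes on an empty list
'''

messages_debug = [
    {"role": "user", "content": f"Find the bug in this Python function and provide a fixed version:\n{buggy_code}"}
]

input_ids_debug = tokenizer.apply_chat_template(
    messages_debug, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_debug = model.generate(
    input_ids_debug,
    max_new_tokens=300,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

response_debug = tokenizer.decode(
    outputs_debug[0][input_ids_debug.shape[1]:], skip_special_tokens=True
)
print(response_debug)
```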
Advanced Usage
Prompting: For best results, especially when you want CoT reasoning, explicitly ask the model to "think step-by-step" or to "provide your reasoning process before the code". In the system prompt, add: "You are a code engineer proficient in various programming languages. Before answering, please carefully consider the question and create a logically coherent thought process, starting with `<think>` and ending with `</think>`. After thinking, provide the answer."
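A concrete version of this setup is sketched below, reusing the `tokenizer` and `model` from Quick Start. It assumes the chat template accepts a `system` role and that the model delimits its reasoning with `<think>` and `</think>`; if either assumption does not hold for your checkpoint, fold the instruction into the user message and adjust the tag names. The user prompt and sampling settings are only illustrative.

```python
# Advanced prompting: system prompt requesting an explicit thinking phase.
# Assumes the chat template accepts a "system" role and the model uses
# <think>...</think> delimiters; adjust if your checkpoint differs.
system_prompt = (
    "You are a code engineer proficient in various programming languages. "
    "Before answering, please carefully consider the question and create a "
    "logically coherent thought process, starting with <think> and ending with </think>. "
    "After thinking, provide the answer."
)

messages_adv = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Implement an LRU cache in Python with O(1) get and put."},
]

input_ids_adv = tokenizer.apply_chat_template(
    messages_adv, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs_adv = model.generate(
    input_ids_adv,
    max_new_tokens=800,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
response_adv = tokenizer.decode(outputs_adv[0][input_ids_adv.shape[1]:], skip_special_tokens=True)

# Split the reasoning from the final answer if the tags are present.
if "</think>" in response_adv:
    reasoning, answer = response_adv.split("</think>", 1)
    print("--- Reasoning ---")
    print(reasoning.replace("<think>", "").strip())
    print("--- Answer ---")
    print(answer.strip())
else:
    print(response_adv)
```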
📚 Documentation
Model Overview
| Property | Details |
|----------|---------|
| Base Model | google/gemma-3-27b |
| Fine-tuned by | tonyli8623 |
| Focus Areas | Chain-of-Thought (CoT), Code Generation, Code Explanation, Debugging |
| Language | Primarily English for prompts and reasoning; generates code in multiple languages |
Limitations and Bias
- This model is based on Gemma-3 and inherits its capabilities and limitations.
- While fine-tuned for coding, it may still generate incorrect, inefficient, or insecure code. Always review and test generated code thoroughly (see the sketch after this list).
- The model's knowledge is limited to its training data cutoff.
- Like all LLMs, it may exhibit biases present in the underlying training data.
- Chain-of-Thought reasoning may not always be complete or logically sound.
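As a lightweight way to follow the testing advice above, you can drop a generated function into a few assertions before relying on it. The factorial checks below are only an illustration; adapt them to whatever the model actually produced.

```python
# Minimal sanity checks for a model-generated function (illustrative only).
# Suppose the model produced this factorial implementation:
def factorial(n: int) -> int:
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# Quick checks before trusting the code.
assert factorial(0) == 1
assert factorial(5) == 120
try:
    factorial(-1)
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for negative input")
print("All checks passed.")
```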
License
The license for this model follows the base Gemma-3 model's license plus any additional terms specified by the fine-tuner. Gemma-3 models are governed by the "Gemma Terms of Use". Please consult the license file included with the model or the Gemma Terms of Use.
- Gemma Terms of Use: https://ai.google.dev/gemma/terms
- Fine-tuning Specific License (if any): [Specify if you add Apache 2.0, MIT, etc., or state it follows the base model license]
Citation
If you use this model in your research or work, please consider citing:
```bibtex
@misc{hicoder_r1_distill_gemma_27b_[year],
  title={Hicoder-R1-Distill-Gemma-27B: A Chain-of-Thought and Code Generation Focused Model},
  author={[Your Name/Organization]},
  year={[Year of Release]},
  howpublished={\url{[Link to Model Hub or Repository]}}
}

@misc{gemma3_2024,
  title={Gemma 3 Technical Report},
  author={Gemma Team, Google},
  year={2024},
  howpublished={\url{https://ai.google.dev/gemma}} % Replace with the Gemma 3 technical report link if available
}
```
📧 Contact
For questions, feedback, or issues, please contact tonyli288@gmail.com.