# 🚀 Superthoughts Lite v2 MOE Llama3.2
A powerful and lite reasoning model for various tasks including chemistry, code, math, and conversations.
## 🚀 Quick Start
This is the GGUF version of Superthoughts Lite v2 MOE Llama3.2, with 3.91B parameters and 4 experts in total, 2 of which are active.
You can access different versions of the model.

## ✨ Features
- Better Performance: This non-experimental version offers better accuracy across all tasks, stronger performance, and less looping while generating responses.
- Powerful Reasoning: A powerful, lite reasoning model trained with multiple experts, including chat, math, code, and science reasoning experts.
- Replacement for v1: A direct replacement for Pinkstack/Superthoughts-lite-v1, with much better code generation and text performance.
## 📚 Documentation
### Model Training
We trained the model by first creating a base model shared by all the experts, fine-tuned with GRPO techniques using Unsloth on top of meta-llama/Llama-3.2-1B-Instruct. After that, we trained each expert with SFT and then ran GRPO again. There are 4 experts in total:
- Chat reasoning expert
- Math reasoning expert
- Code reasoning expert
- Science reasoning expert
### System Prompt
You should use the following system prompt:

```
Thinking: enabled.
Follow this format strictly:
<think>
Write your step-by-step reasoning here.
Break down the problem into smaller parts.
Solve each part systematically.
Check your work and verify the answer makes sense.
</think>
[Your final answer after thinking].
```
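As a sketch, the system prompt can be wired into a standard OpenAI-style chat message list. The helper name below is hypothetical; the message structure itself is the common format accepted by runtimes such as llama.cpp and Ollama.

```python
# Recommended system prompt for this model (copied from the section above).
SYSTEM_PROMPT = """Thinking: enabled.
Follow this format strictly:
<think>
Write your step-by-step reasoning here.
Break down the problem into smaller parts.
Solve each part systematically.
Check your work and verify the answer makes sense.
</think>
[Your final answer after thinking]."""


def build_messages(user_prompt: str) -> list[dict]:
    """Build an OpenAI-style message list with the recommended system prompt.

    `build_messages` is an illustrative helper, not part of any shipped API.
    """
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]
```

The resulting list can be passed to any chat-completions endpoint that accepts `messages`.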
### Model Information
The model can generate up to 16,380 tokens and has a context size of 131,072. It has been fine-tuned to place its reasoning between `<think>` and `</think>` XML tags. It may still loop occasionally, but this is rare.
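Because the reasoning is wrapped in `<think>` tags, it can be separated from the final answer with a small helper. This is a minimal sketch (the function is illustrative, not part of the model's tooling):

```python
import re


def split_reasoning(text: str):
    """Split model output into (reasoning, answer).

    Returns (None, text) when no <think>...</think> block is present.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return None, text.strip()
    reasoning = m.group(1).strip()
    answer = text[m.end():].strip()
    return reasoning, answer
```

This lets a UI show the final answer while keeping the chain of thought collapsible or hidden.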
### Limitations
- Safety Alignment: Only minimal safety alignment was performed, so the model can be uncensored at times.
- Hallucination: All large language models (LLMs), including this one, can hallucinate and output false information. Always double-check responses.
- Knowledge Dependency: The chat model may make things up unless you provide it with proper information.
### License
By using this model, you agree to the [LLAMA 3.2 COMMUNITY LICENSE](https://huggingface.co/meta-llama/Llama-3.2-1B/blob/main/LICENSE.txt).
### GGUF Template

```
{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}
Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
{{ $.Tools }}
{{- end }}
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}
{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
```
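When tools are supplied, the template instructs the model to answer with a single JSON object of the form `{"name": ..., "parameters": ...}`. A minimal sketch of parsing such a response on the client side (the function is illustrative; real outputs may need more defensive handling):

```python
import json


def parse_tool_call(output: str):
    """Return a dict with "name" and "parameters" keys if `output` is a
    well-formed tool call per the template's format, else None.

    Assumes the model emitted exactly one JSON object, as the template asks.
    """
    try:
        call = json.loads(output.strip())
    except json.JSONDecodeError:
        return None
    if isinstance(call, dict) and "name" in call and "parameters" in call:
        return call
    return None
```

If `None` is returned, the output can be treated as a normal text reply instead of a tool call.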
## 📦 Information Table

| Property | Details |
|----------|---------|
| Model Type | Superthoughts Lite v2 MOE Llama3.2 |
| Training Data | First, a base model was fine-tuned using GRPO techniques with Unsloth on top of meta-llama/Llama-3.2-1B-Instruct. Then, each expert was trained using SFT and GRPO again. |
| Token Limit | 16,380 |
| Context Size | 131,072 |
## ⚠️ Important Note
- The model has minimal safety alignment, so it can be uncensored at times.
- All large language models can hallucinate and output false information. Always double-check responses.
- The chat model may make things up if not provided with proper information.
## 💡 Usage Tip
Make sure to use the provided system prompt for better results.