# ReasoningCore-3B-RE01

ReasoningCore-3B is a multilingual, reasoning-enhanced large language model developed by EpistemeAI. Pretrained on a large corpus of publicly available data and instruction-tuned, it excels at nuanced reasoning, dialogue management, retrieval, and summarization, and it outperforms many current open-source and proprietary conversational models on a range of industry benchmarks. It has additionally been fine-tuned on a reasoning dataset.
## Important Note

This is an experimental model.
## Features

- Multilingual support for common languages.
- Enhanced reasoning capabilities through fine-tuning on a reasoning dataset.
- Built-in safety guardrails, which can be further strengthened with additional safeguards.
## Installation

Ensure you have `transformers` version 4.43.0 or later installed:

```bash
pip install --upgrade transformers
```
## Usage Examples

### Basic Usage

#### Use a system prompt

```python
SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""
```
#### Use with Transformers

```python
import torch
from transformers import pipeline

model_id = "EpistemeAI/ReasoningCore-3B-R01"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
print(pipe("The secret to effective reasoning is"))
```
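Because the model is prompted to wrap its output in `<reasoning>`/`<answer>` tags, the final answer can be recovered with a small parser. This is a sketch; the sample response string is a hypothetical model output, not real model text.

```python
import re


def extract_answer(text: str):
    """Return the content of the <answer>...</answer> block, or None if absent."""
    match = re.search(r"<answer>\s*(.*?)\s*</answer>", text, re.DOTALL)
    return match.group(1) if match else None


# Hypothetical model output for illustration:
sample = "<reasoning>\n17 * 24 = 408\n</reasoning>\n<answer>\n408\n</answer>"
print(extract_answer(sample))  # → 408
```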
### Advanced Usage

#### Mathematical problems

For mathematical problems, include "Please reason step by step, and put your final answer within \boxed{}" in the system prompt.
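With that instruction, the final answer can then be pulled out of the `\boxed{}` wrapper. This is a sketch; the sample response is a hypothetical model output.

```python
import re

MATH_SYSTEM_PROMPT = (
    "Please reason step by step, and put your final answer within \\boxed{}."
)


def extract_boxed(text: str):
    """Return the content of the last \\boxed{...} in a response, or None."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None


# Hypothetical model output for illustration:
sample = "3x = 15, so x = 5. The answer is \\boxed{5}."
print(extract_boxed(sample))  # → 5
```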
## Documentation

### Model Information

| Property | Details |
| --- | --- |
| Model Developer | EpistemeAI |
| Model Architecture | ReasoningCore-3B is an auto-regressive language model built on an optimized transformer architecture. It incorporates specialized reasoning pathways and has been fine-tuned using Group Robust Preference Optimization (GRPO), supervised learning, and reinforcement learning with human feedback (RLHF) to align with human expectations for clarity, accuracy, and safety in complex tasks. |
| Training Data | A new mix of publicly available online data. |
| Params | 3B |
| Input Modalities | Multilingual text |
| Output Modalities | Multilingual text and code |
| Context Length | 128k |
| GQA | Yes |
| Shared Embeddings | Yes |
| Token Count | Up to 9T tokens |
| Knowledge Cutoff | December 2023 |
| Supported Languages | Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. While pretraining included a broader range of languages, additional languages may be fine-tuned in compliance with the community license and acceptable use policies. |
| Model Release Date | Sept 25, 2024 |
| Status | Static model trained on an offline dataset. Future iterations may further enhance its reasoning capabilities and safety features. |
| License | Use is governed by the [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement). |
| Feedback | For questions or comments, please refer to the [GitHub repository README](https://github.com/meta-llama/llama-models/tree/main/models/llama3_2) or follow the linked instructions. |
### Intended Use

#### Use Cases

- Conversational AI: assistant-like interactions.
- Knowledge retrieval & summarization: dynamic extraction and condensation of information.
- Mobile AI-powered writing assistants: query reformulation and natural language generation.
- General natural language generation: any application that benefits from advanced reasoning abilities.

#### Out of Scope

- Deployments that violate applicable laws or trade compliance regulations.
- Use cases that conflict with the Acceptable Use Policy or licensing terms.
- Deployments in languages not explicitly supported (unless additional safety and performance validations are performed).
### Responsibility & Safety

#### Responsible Deployment

- Approach: ReasoningCore-3B is a foundational technology that includes built-in safety guardrails. Developers are encouraged to integrate additional safeguards tailored to their specific applications.
- System-Level Safety: The model is designed to be deployed as part of a broader system that implements safety measures (e.g., Prompt Guard, Code Shield) to ensure outputs remain safe even under adversarial conditions.

#### Safety Fine-Tuning & Data Strategy

- Objectives:
  - Provide a reliable tool for building secure and helpful reasoning systems.
  - Mitigate adversarial misuse through advanced data selection and response optimization techniques.
- Methodology:
  - Incorporate adversarial prompts during training to refine model refusals and response tone.
  - Combine human-curated data with synthetic data.
  - Apply iterative fine-tuning using supervised learning, rejection sampling, and preference optimization.

#### Evaluations and Red Teaming

- Scaled Evaluations: Dedicated adversarial datasets were used to rigorously test the model's robustness. Developers should perform context-specific evaluations for their own use cases.
- Red Teaming: Experts in cybersecurity, adversarial machine learning, and responsible AI conducted recurring red-team exercises to identify vulnerabilities and improve both performance and safety.

#### Critical Risk Mitigations

- CBRNE: The model has been evaluated to ensure it does not enhance capabilities for harmful activities involving chemical, biological, radiological, nuclear, or explosive materials.
- Child Safety: Expert assessments were conducted to evaluate and mitigate potential child safety risks.
- Cyber Attacks: Measures were taken to ensure the model cannot autonomously facilitate cyber-offensive operations.

#### Ethical Considerations and Limitations

- Core Values: ReasoningCore-3B is built on the values of openness, inclusivity, and helpfulness. It is designed to respect user autonomy and foster free thought and expression while mitigating potential harm.
- Testing and Limitations: Despite extensive testing across diverse scenarios, the model may occasionally produce inaccurate, biased, or objectionable outputs. Developers must perform additional safety testing and integrate further safeguards as needed.
- Resources for safe deployment, from Meta:
  - [Responsible Use Guide](https://llama.meta.com/responsible-use-guide)
  - [Trust and Safety Resources](https://llama.meta.com/trust-and-safety)
  - [Getting Started Guide](https://llama.meta.com/docs/get-started)
## License

This model is developed by EpistemeAI and released under the Apache-2.0 license. It is fine-tuned from the model EpistemeAI/ReasoningCore-3B-0. This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
