🚀 AryaBhatta Model Series
This project presents the AryaBhatta model series, which consists of two models: AryaBhatta-1 and AryaBhatta-2. These models are fine-tuned from either HuggingFaceH4/zephyr-7b-gemma-v0.1 or Google's Gemma and are optimized for 9 Indian languages (Hindi, Tamil, Punjabi, Bengali, Gujarati, Oriya, Telugu, Kannada, Malayalam) along with English.
✨ Features
- Multi-language Support: Fine-tuned on 9 Indian languages plus English, enabling broader language coverage.
- Enhanced Reasoning and Math Skills: Fine-tuning on Microsoft's Orca datasets significantly improves mathematical reasoning.
- Benchmark Performance: Achieves competitive scores on AGIEval, GPT4All, TruthfulQA, and BigBench compared to other 7B models (see Benchmark Scores below).
📦 Installation
No specific installation steps are provided in the original document. The packages sketched below (inferred from the usage example) are typically sufficient; after installing them, follow the usage example.
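A minimal environment sketch, assuming a recent Python and a CUDA-capable GPU; the exact package list and versions are not specified by the original card:

```bash
# Suggested packages for the usage example below (unpinned; adjust versions for your setup)
pip install torch transformers peft accelerate bitsandbytes
```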
💻 Usage Examples
Basic Usage
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

hf_token = "YOUR_HF_ACCESS_TOKEN"  # Hugging Face access token with read permission

# Load the adapter together with its base model
model = AutoPeftModelForCausalLM.from_pretrained(
    "GenVRadmin/AryaBhatta-GemmaOrca",
    load_in_4bit=False,
    token=hf_token,
)
tokenizer = AutoTokenizer.from_pretrained("GenVRadmin/AryaBhatta-GemmaOrca")

# Alpaca-style prompt template: instruction, optional input, empty response slot
input_prompt = """
### Instruction:
{}
### Input:
{}
### Response:
{}"""

input_text = input_prompt.format(
    "Answer this question about India.",
    "Who is the Prime Minister of India",
    "",  # leave the response slot empty; the model fills it in
)

inputs = tokenizer([input_text], return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=300, use_cache=True)
response = tokenizer.batch_decode(outputs)[0]
print(response)
```
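Since the model targets nine Indian languages, the same prompt template can be reused with a non-English instruction. The Hindi prompt below is an illustrative example, not taken from the original card:

```python
# Illustrative Hindi prompt (not from the original card); reuses the template above
hindi_text = input_prompt.format(
    "इस प्रश्न का उत्तर दीजिए।",   # "Answer this question."
    "भारत की राजधानी क्या है?",     # "What is the capital of India?"
    "",
)
inputs = tokenizer([hindi_text], return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=300, use_cache=True)
print(tokenizer.batch_decode(outputs)[0])
```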
🔧 Technical Details
- Fine-tuning Bases: The models are fine-tuned from HuggingFaceH4/zephyr-7b-gemma-v0.1 or Google's Gemma.
- Initial Tuning: To enhance reasoning and math skills, the models are first SFT-tuned on Microsoft's Orca datasets: the Orca Maths Hindi dataset (GenVRadmin/Aryabhatta-Orca-Maths-Hindi) and the original Orca maths dataset (microsoft/orca-math-word-problems-200k). This boosts the MATHS score from 24.3 in Gemma-7B to 25.5 in Zephyr-Gemma and 31.6 in GemmaOrca. A loading sketch for these datasets follows this list.
- Subsequent Tuning: The models are then fine-tuned on GenVR's Samvaad datasets (GenVRadmin/Samvaad-Indic-Positive, GenVRadmin/Samvaad-Tamil-Mixtral, and a subset of GenVRadmin/Samvaad-Mixed-Language-3), followed by various open-sourced instruction datasets such as Telugu-LLM-Labs/yahma_alpaca_cleaned_telugu_filtered_and_romanized, abhinand/tamil-alpaca, etc.
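For quick inspection of the Orca-style SFT data named above, the Hugging Face datasets library can be used. This is a minimal sketch; the "train" split name is an assumption, so check each dataset card:

```python
from datasets import load_dataset

# Datasets referenced in the card; the "train" split name is an assumption
orca_hindi = load_dataset("GenVRadmin/Aryabhatta-Orca-Maths-Hindi", split="train")
orca_math = load_dataset("microsoft/orca-math-word-problems-200k", split="train")

print(orca_hindi)    # column names and row count
print(orca_math[0])  # one math word problem / answer pair
```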
📚 Documentation
Model Variants
There are two models in the AryaBhatta series. One is fine-tuned on Google's Gemma, and the other is fine-tuned on Zephyr's Gemma base. The repo for the Zephyr-based model is GenVRadmin/AryaBhatta-GemmaOrca-2-Merged; a loading sketch follows.
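The merged variant can be loaded directly with transformers rather than peft. This is a sketch that assumes a standard merged causal-LM checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch: load the merged variant as a plain causal LM (hf_token as in the usage example)
model_2 = AutoModelForCausalLM.from_pretrained(
    "GenVRadmin/AryaBhatta-GemmaOrca-2-Merged",
    token=hf_token,
)
tokenizer_2 = AutoTokenizer.from_pretrained("GenVRadmin/AryaBhatta-GemmaOrca-2-Merged")
```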
Benchmark Scores
| Model | AGIEval | GPT4All | TruthfulQA | BigBench | Average ⬇️ |
|---|---|---|---|---|---|
| AryaBhatta-GemmaOrca | 35.9 | 72.26 | 53.85 | 40.35 | 50.59 |
| zephyr-7b-beta | 37.52 | 71.77 | 55.26 | 39.77 | 51.08 |
| zephyr-7b-gemma-v0.1 | 34.22 | 66.37 | 52.19 | 37.10 | 47.47 |
| mlabonne/Gemmalpaca-7B | 21.6 | 40.87 | 44.85 | 30.49 | 34.45 |
| google/gemma-7b-it | 21.33 | 40.84 | 41.70 | 30.25 | 33.53 |
📄 License
This project is licensed under the MIT license.