🚀 VeriUS LLM 8b v0.2
VeriUS LLM is an instruction-following large language model based on Llama 3 8B that supports Turkish. It aims to provide reliable language processing capabilities for Turkish-language tasks.
✨ Features
- Based on the Llama 3 8B architecture, offering strong language understanding and generation abilities.
- Supports Turkish, catering to Turkish-speaking users.
- Fine-tuned on a carefully curated Turkish instruction dataset for better performance.
📦 Installation
This model was trained with Unsloth and uses it for fast inference. For installation instructions, see https://github.com/unslothai/unsloth
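Unsloth can typically be installed from PyPI; the exact command depends on your PyTorch/CUDA setup, so treat the line below as a starting point and check the Unsloth repository for the variant that matches your environment:

```bash
# Starting point only; see the Unsloth repo for the command matching your PyTorch/CUDA versions
pip install unsloth
```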
💻 Usage Examples
Basic Usage
```python
from unsloth import FastLanguageModel

max_seq_len = 1024

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="VeriUs/VeriUS-LLM-8b-v0.2",
    max_seq_length=max_seq_len,
    dtype=None,  # auto-detect (bfloat16 on newer GPUs, float16 otherwise)
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode

# Alpaca-style Turkish prompt template used during fine-tuning
# ("Below is an instruction that describes a task, paired with an input that provides
#  further context. Write a response that appropriately completes the request.")
prompt_template = """Aşağıda, görevini açıklayan bir talimat ve daha fazla bağlam sağlayan bir girdi verilmiştir. İsteği uygun bir şekilde tamamlayan bir yanıt yaz.
### Talimat:
{}
### Girdi:
{}
### Yanıt:
"""

def generate_output(instruction, user_input):
    inputs = tokenizer(
        [prompt_template.format(instruction, user_input)],
        return_tensors="pt",
    ).to("cuda")
    outputs = model.generate(**inputs, max_length=max_seq_len, do_sample=True)
    # Strip the prompt tokens so only the newly generated answer is decoded
    outputs = [output[inputs["input_ids"].shape[-1]:] for output in outputs]
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# "Which is the most populous city in Turkey?"
response = generate_output("Türkiye'nin en kalabalık şehri hangisidir?", "")
print(response)
```
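If you want tokens printed as they are generated (for example, in an interactive session), the standard transformers TextStreamer can be passed to model.generate. Below is a minimal sketch reusing the model, tokenizer, and prompt_template defined above; the example question is only an illustration:

```python
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# "What is the capital of Turkey?"
inputs = tokenizer(
    [prompt_template.format("Türkiye'nin başkenti neresidir?", "")],
    return_tensors="pt",
).to("cuda")

# Generated tokens are streamed to stdout as they are produced
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=256, do_sample=True)
```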
📚 Documentation
Model Details
| Property | Details |
|----------|---------|
| Model Type | VeriUS LLM 8b v0.2 |
| Base Model | unsloth/llama-3-8b-bnb-4bit |
| Training Dataset | A carefully curated general-domain Turkish instruction dataset |
| Training Method | Fine-tuned using QLoRA and ORPO (see the training sketch below) |
TrainingArguments
```yaml
PER_DEVICE_BATCH_SIZE: 2
GRADIENT_ACCUMULATION_STEPS: 4
WARMUP_RATIO: 0.03
NUM_EPOCHS: 2
LR: 0.000008
OPTIM: "adamw_8bit"
WEIGHT_DECAY: 0.01
LR_SCHEDULER_TYPE: "linear"
BETA: 0.1
```
PEFT Arguments
```yaml
RANK: 128
TARGET_MODULES:
  - "q_proj"
  - "k_proj"
  - "v_proj"
  - "o_proj"
  - "gate_proj"
  - "up_proj"
  - "down_proj"
LORA_ALPHA: 256
LORA_DROPOUT: 0
BIAS: "none"
GRADIENT_CHECKPOINT: "unsloth"
USE_RSLORA: false
```
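For reference, here is a hypothetical sketch of how hyperparameters like these could be wired together with Unsloth and trl's ORPOTrainer. This is not the original training script: the dataset name and output directory are placeholders, and exact argument names may differ across unsloth/trl versions.

```python
from unsloth import FastLanguageModel
from trl import ORPOConfig, ORPOTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # 4-bit base model (QLoRA)
    max_seq_length=1024,
    load_in_4bit=True,
)

# Attach LoRA adapters mirroring the PEFT arguments above
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=256,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
    use_rslora=False,
)

# Placeholder dataset: ORPO expects "prompt", "chosen", and "rejected" columns
dataset = load_dataset("your-org/your-turkish-orpo-dataset", split="train")

args = ORPOConfig(
    beta=0.1,
    learning_rate=8e-6,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    warmup_ratio=0.03,
    num_train_epochs=2,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    output_dir="outputs",
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```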
Bias, Risks, and Limitations
⚠️ Important Note
VeriUS LLM is an autoregressive language model primarily designed to predict the next token in a text string. While such models are used for a wide range of applications, this model has not undergone extensive real-world testing, and its effectiveness and reliability across diverse scenarios remain largely unverified.
The base model was trained primarily on standard English text. Even though it has been fine-tuned with a Turkish dataset, its ability to understand and generate slang, informal language, or other languages may be limited, which can lead to errors or misinterpretations. Users should be aware that VeriUS LLM may produce inaccurate or misleading information; treat its outputs as starting points or suggestions rather than definitive answers.
📄 License
This model is distributed under the Llama 3 license.