🚀 Llama-Krikri-8B-Instruct: An Instruction-tuned Large Language Model for the Greek language
Llama-Krikri-8B-Instruct is an instruction-tuned large language model built on Llama-3.1-8B and extended for Greek. It offers enhanced chat and instruction-following capabilities in Greek and English, with strong performance across a wide range of generation, comprehension, and reasoning tasks.
🚀 Quick Start
Using Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"

# Load the model and tokenizer, and move the model to the GPU.
model = AutoModelForCausalLM.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")
model.to(device)

# System prompt (Greek): "You are Krikri, a highly developed Artificial Intelligence model for Greek, and you were trained by ILSP of Athena Research Center."
system_prompt = "Είσαι το Κρικρί, ένα εξαιρετικά ανεπτυγμένο μοντέλο Τεχνητής Νοημοσύνης για τα ελληνικά και εκπαιδεύτηκες από το ΙΕΛ του Ε.Κ. \"Αθηνά\"."
# User prompt (Greek): "How does a kri-kri differ from a llama?"
user_prompt = "Σε τι διαφέρει ένα κρικρί από ένα λάμα;"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

# Apply the chat template, tokenize, and generate a response.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
input_prompt = tokenizer(prompt, return_tensors='pt').to(device)
outputs = model.generate(input_prompt['input_ids'], max_new_tokens=256, do_sample=True)
print(tokenizer.batch_decode(outputs)[0])
Using OpenAI compatible server via vLLM
vllm serve ilsp/Llama-Krikri-8B-Instruct \
--enforce-eager \
--dtype 'bfloat16' \
--api-key token-abc123
Then, use the model through Python:
from openai import OpenAI

api_key = "token-abc123"
base_url = "http://localhost:8000/v1"

# Point the OpenAI client at the local vLLM server started above.
client = OpenAI(
    api_key=api_key,
    base_url=base_url,
)

# System prompt (Greek): "You are an advanced translation system that answers with Python lists. You write nothing else in your answers apart from the translated lists."
system_prompt = "Είσαι ένα ανεπτυγμένο μεταφραστικό σύστημα που απαντάει με λίστες Python. Δεν γράφεις τίποτα άλλο στις απαντήσεις σου πέρα από τις μεταφρασμένες λίστες."
# User prompt (Greek): "Give me the following list with each of its strings translated into Greek: [...]"
user_prompt = "Δώσε μου την παρακάτω λίστα με μεταφρασμένο κάθε string της στα ελληνικά: ['Ethics of duty', 'Postmodern ethics', 'Consequentialist ethics', 'Utilitarian ethics', 'Deontological ethics', 'Virtue ethics', 'Relativist ethics']"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

response = client.chat.completions.create(
    model="ilsp/Llama-Krikri-8B-Instruct",
    messages=messages,
    temperature=0.0,
    top_p=0.95,
    max_tokens=8192,
    stream=False,
)

print(response.choices[0].message.content)
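For interactive use, the same endpoint can also stream the answer token by token. Below is a minimal sketch using the standard OpenAI client streaming interface; it reuses the `client` and `messages` defined above.

stream = client.chat.completions.create(
    model="ilsp/Llama-Krikri-8B-Instruct",
    messages=messages,
    temperature=0.0,
    max_tokens=8192,
    stream=True,  # stream partial deltas instead of waiting for the full completion
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)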
✨ Features
Base Model Features
- Vocabulary extension of the Llama-3.1 tokenizer with Greek tokens (illustrated in the tokenizer sketch after the corpus table below).
- 128k context length (approximately 80,000 Greek words).
- Extended pretraining of Llama-3.1-8B for proficiency in Greek, using a large training corpus:
- 56.7 billion monolingual Greek tokens from publicly available resources.
- Additional sub-corpora with 21 billion monolingual English tokens and 5.5 billion Greek-English parallel data to ensure bilingual capabilities.
- 7.8 billion math and code tokens.
- The training corpus was processed, filtered, and deduplicated. Chosen subsets of the 91-billion-token corpus were upsampled to a total of 110 billion tokens.
| Sub-corpus | # Tokens | Percentage |
|------------|----------|------------|
| Greek      | 56.7 B   | 62.3 %     |
| English    | 21.0 B   | 23.1 %     |
| Parallel   | 5.5 B    | 6.0 %      |
| Math/Code  | 7.8 B    | 8.6 %      |
| Total      | 91 B     | 100 %      |
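To see the effect of the Greek vocabulary extension in practice, one can compare how many tokens the Krikri tokenizer and the original Llama-3.1 tokenizer produce for the same Greek sentence. The sketch below is illustrative: the example sentence is arbitrary and the meta-llama/Llama-3.1-8B repo is gated, so it assumes you have been granted access.

from transformers import AutoTokenizer

krikri_tok = AutoTokenizer.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")
base_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")  # gated repo; requires access

text = "Η τεχνητή νοημοσύνη αλλάζει τον τρόπο που εργαζόμαστε και επικοινωνούμε."  # "AI is changing the way we work and communicate."
print("Krikri tokens:   ", len(krikri_tok(text)["input_ids"]))
print("Llama-3.1 tokens:", len(base_tok(text)["input_ids"]))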
Instruct Model Features
- Enhanced chat capabilities and instruction-following in both Greek and English.
- Document translation between Greek and English, French, German, Italian, Portuguese, Spanish.
- Strong performance on generation, comprehension, and editing tasks, such as summarization, creative content creation, text modification, entity recognition, and sentiment analysis.
- Domain-specific expertise for legal, financial, medical, and scientific applications.
- Retrieval-Augmented Generation (RAG) with 128k context length using multiple documents.
- Improved coding and agentic capabilities with correct formatting and tool use.
- Conversion and structured data extraction (e.g., XML, JSON) in data-to-text & text-to-data settings (see the sketch after this list).
- Analytical thinking and Chain-of-Thought (CoT) reasoning for problem-solving.
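As an illustration of the data-to-text / text-to-data capabilities above, the sketch below asks the model (through the vLLM endpoint started earlier, reusing the `client` from the Quick Start) to return a JSON object extracted from Greek free text. The prompts, field names, and example text are hypothetical and not an official schema.

import json

system_prompt = "Extract the requested information and answer only with a valid JSON object."
# Example text (Greek): "The company Acme hired 120 employees in Athens in 2023."
user_prompt = (
    "Κείμενο: 'Η εταιρεία Acme προσέλαβε 120 υπαλλήλους στην Αθήνα το 2023.'\n"
    "Return a JSON object with the keys: company, employees, city, year."
)
response = client.chat.completions.create(
    model="ilsp/Llama-Krikri-8B-Instruct",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ],
    temperature=0.0,
    max_tokens=256,
)
data = json.loads(response.choices[0].message.content)  # e.g. {"company": "Acme", "employees": 120, ...}
print(data)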
Post-training Methodology
- 2-stage Supervised Fine-Tuning with Greek & English instruction-response pairs (& multi-turn conversations):
- Stage 1: 856,946 instruction-response pairs (371,379 Greek + 485,567 English).
- Stage 2: 638,408 instruction-response pairs (279,948 Greek + 358,460 English).
- Alignment with Greek & English preference triplets (Instruction - Chosen Response - Rejected Response):
- Length Normalized DPO: 92,394 preference triplets (47,132 Greek + 45,262 English); a sketch of a length-normalized DPO loss is shown below.
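For readers unfamiliar with the technique, below is a minimal sketch of a length-normalized DPO loss as it is commonly formulated: the policy/reference log-probability ratios are divided by the response lengths before the standard DPO sigmoid loss is applied. This is a generic illustration, not the actual training code; the function and argument names are assumptions.

import torch.nn.functional as F

def ln_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                ref_chosen_logps, ref_rejected_logps,
                chosen_lengths, rejected_lengths, beta=0.1):
    # Inputs are tensors of summed log-probabilities and token counts per response.
    chosen_ratio = (policy_chosen_logps - ref_chosen_logps) / chosen_lengths
    rejected_ratio = (policy_rejected_logps - ref_rejected_logps) / rejected_lengths
    # Standard DPO sigmoid loss applied to the length-normalized implicit rewards.
    logits = beta * (chosen_ratio - rejected_ratio)
    return -F.logsigmoid(logits).mean()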
Post-training Data Construction
- Collect existing high-quality datasets like Tulu 3, SmolTalk, etc.
- Translate various data into Greek using an in-house translation tool.
- Regenerate translated data and contrast the translated with the regenerated responses to form preference triplets.
- Distill models like Gemma 2 27B IT using the MAGPIE methodology.
- Score data with the Skywork Reward Gemma 2 27B v0.2 reward model and apply rule-based filters (a filtering sketch follows this list).
- Create data for sentence and document translation using parallel corpora from ELRC-SHARE.
- Synthetically extract question-answer pairs and multi-turn dialogues from diverse sources.
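A minimal sketch of the scoring-and-filtering step described above is shown below, assuming the reward model is used as a standard sequence-classification scorer; the Hugging Face repo ID, the score threshold, and the length rule are illustrative assumptions rather than the actual filtering criteria.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

reward_id = "Skywork/Skywork-Reward-Gemma-2-27B-v0.2"  # assumed repo ID for the reward model
rm_tokenizer = AutoTokenizer.from_pretrained(reward_id)
rm = AutoModelForSequenceClassification.from_pretrained(reward_id, torch_dtype=torch.bfloat16, device_map="auto")

def reward_score(prompt: str, response: str) -> float:
    # Score a single prompt-response pair with the reward model.
    conv = [{"role": "user", "content": prompt}, {"role": "assistant", "content": response}]
    inputs = rm_tokenizer.apply_chat_template(conv, tokenize=True, return_tensors="pt").to(rm.device)
    with torch.no_grad():
        return rm(inputs).logits[0][0].item()

def keep(example: dict, min_score: float = 0.0, max_chars: int = 8000) -> bool:
    # Illustrative rule-based filter: drop empty or overly long responses and low-reward pairs.
    resp = example["response"]
    if not resp.strip() or len(resp) > max_chars:
        return False
    return reward_score(example["prompt"], resp) >= min_score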
📚 Documentation
Evaluation Results
In the chat evaluation suite:
- IFEval and MT-Bench:
| | IFEval EL (strict avg) | IFEval EN (strict avg) | MT-Bench EL | MT-Bench EN |
|---------------- |---------------- |----------------- |------------|------------|
| Qwen 2.5 7B Instruct | 46.2% | 74.8% | 5.83 | 7.87 |
| EuroLLM 9B Instruct | 51.3% | 64.5% | 5.98 | 6.27 |
| Aya Expanse 8B | 50.4% | 62.2% | 7.68 | 6.92 |
| Meltemi 7B v1.5 Instruct | 32.7% | 41.2% | 6.25 | 5.46 |
| Llama-3.1-8B Instruct | 45.8% | 75.1% | 6.46 | 7.25 |
| Llama-Krikri-8B Instruct | 67.5% | 82.4% | 7.96 | 7.21 |
Llama-Krikri-8B Instruct outperforms Llama-3.1-8B-Instruct by +21.7% and +7.3% on Greek and English IFEval respectively. It also shows strong chat capabilities in Greek MT-Bench (+0.28 compared to Aya Expanse 8B) and is competitive in the English variant.
- Arena-Hard-Auto:
- Greek version: Using gpt-4o-2024-08-06 as the judge model and gpt-4o-mini-2024-07-18 as the baseline model, Llama-Krikri-8B Instruct scores higher than models over 8 times its size and is competitive with closed-source and high-performing open-source models.
- English version: Using gpt-4-1106-preview as the judge model and gpt-4-0314 as the baseline model, it is competitive with similarly sized LLMs and improves upon Llama-3.1-8B Instruct by +24.5% / +16% (No style control / With style control).
⚠️ Important Note
Judge models are biased towards student models trained on distilled data from them. You can read more here.
📄 License
This model is released under the Llama 3.1 Community License.
Acknowledgements
The ILSP team utilized Amazon's cloud computing services, which were made available via GRNET under the OCRE Cloud framework, providing Amazon Web Services for the Greek Academic and Research Community.
⚠️ Important Note
Please use the official quantized versions (GGUF) or request a specific one. Since we have updated the model's weights, there is no guarantee that third-party quantizations reflect the latest, improved versions.
Following the release of Meltemi-7B on March 26th, 2024, we are happy to welcome Krikri to the family of ILSP open Greek LLMs. Krikri is built on top of Llama-3.1-8B, extending its capabilities for Greek through continual pretraining on a large corpus of high-quality and locally relevant Greek texts. We present Llama-Krikri-8B-Instruct along with the base model, Llama-Krikri-8B-Base.
🚨 More information on post-training, methodology, and evaluation coming soon. 🚨