Llama 3.1 8B AthenaSky MegaMix

Developed by ZeroXClem

An 8B-parameter large language model fused via MergeKit from multiple high-quality models, optimized for reasoning, dialogue, and creative generation

Large Language Model

Transformers

EnglishOpen Source License:Apache-2.0 #Multi-task reasoning #Deep conversation optimization #Role-playing enhancement

Downloads 105

Release Time : 3/11/2025

Model Overview

This model integrates multiple Llama-3.1 variants, excelling in text generation, logical reasoning, and role-playing

Model Features

Advanced reasoning capability

Incorporates Skywork-o1 model to enhance logical thinking and problem-solving abilities

Deep conversational engagement

Integrates Claude-style fine-tuned models to improve dialogue quality and response structure

Versatile role-playing

Combines multiple role-playing optimized models to support immersive interactive experiences

Strong instruction following

Trained on diverse instruction datasets to accurately understand and execute complex instructions

Model Capabilities

Text generation

Logical reasoning

Code generation

Creative writing

Educational assistance

Problem solving

Use Cases

Dialogue & Interaction

Intelligent chat assistant

For building natural and fluent dialogue systems

Achieved 63.01 strict accuracy on IFEval benchmark

Role-playing applications

Supports immersive role-playing and story creation

Education & Research

Academic Q&A

Explains complex academic concepts and theories

Achieved 27.82 accuracy on MMLU-PRO test

Programming assistance

Code generation & completion

Provides programming suggestions and code examples

language:

en license: apache-2.0 library_name: transformers tags:
merge
mergekit
lazymergekit
model_stock
ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix base_model:
Pedro13543/mega_blend_model
Skywork/Skywork-o1-Open-Llama-3.1-8B
Undi95/Meta-Llama-3.1-8B-Claude
mergekit-community/good_mix_model_Stock
mergekit-community/L3.1-Athena-d-8B pipeline_tag: text-generation model-index:
name: Llama-3.1-8B-AthenaSky-MegaMix results:
- task: type: text-generation name: Text Generation dataset: name: IFEval (0-Shot) type: HuggingFaceH4/ifeval args: num_few_shot: 0 metrics:
  - type: inst_level_strict_acc and prompt_level_strict_acc value: 63.01 name: strict accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix name: Open LLM Leaderboard
- task: type: text-generation name: Text Generation dataset: name: BBH (3-Shot) type: BBH args: num_few_shot: 3 metrics:
  - type: acc_norm value: 31.39 name: normalized accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix name: Open LLM Leaderboard
- task: type: text-generation name: Text Generation dataset: name: MATH Lvl 5 (4-Shot) type: hendrycks/competition_math args: num_few_shot: 4 metrics:
  - type: exact_match value: 27.95 name: exact match source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix name: Open LLM Leaderboard
- task: type: text-generation name: Text Generation dataset: name: GPQA (0-shot) type: Idavidrein/gpqa args: num_few_shot: 0 metrics:
  - type: acc_norm value: 3.69 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix name: Open LLM Leaderboard
- task: type: text-generation name: Text Generation dataset: name: MuSR (0-shot) type: TAUR-Lab/MuSR args: num_few_shot: 0 metrics:
  - type: acc_norm value: 6.9 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix name: Open LLM Leaderboard
- task: type: text-generation name: Text Generation dataset: name: MMLU-PRO (5-shot) type: TIGER-Lab/MMLU-Pro config: main split: test args: num_few_shot: 5 metrics:
  - type: acc value: 27.82 name: accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix name: Open LLM Leaderboard

ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix

Overview

ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix is a powerful AI model built through model stock merging using MergeKit. It brings together some of the best models available on Hugging Face, ensuring strong performance in a wide range of NLP tasks, including reasoning, coding, roleplay, and instruction-following.

Model Fusion

This model was created by merging high-quality foundational and fine-tuned models to create an optimized blended architecture that retains the strengths of each contributing model.

Merge Details

Merge Method: model_stock
Base Model: mergekit-community/L3.1-Athena-d-8B
Dtype: bfloat16
Tokenizer Source: mergekit-community/L3.1-Athena-d-8B

Models Merged

The following models contributed to this fusion:

Pedro13543/mega_blend_model - A well-balanced blend of roleplay and instruction-tuned Llama-3.1 variants.
Skywork/Skywork-o1-Open-Llama-3.1-8B - Optimized for reasoning and slow-thinking capabilities.
Undi95/Meta-Llama-3.1-8B-Claude - Fine-tuned on Claude Opus/Sonnet data, improving response depth and conversational engagement.
mergekit-community/good_mix_model_Stock - A diverse mixture including RP-focused and knowledge-heavy datasets.

Configuration

name: ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
base_model: mergekit-community/L3.1-Athena-d-8B
dtype: bfloat16
merge_method: model_stock
models:
  - model: Pedro13543/mega_blend_model
  - model: Skywork/Skywork-o1-Open-Llama-3.1-8B
  - model: Undi95/Meta-Llama-3.1-8B-Claude
  - model: mergekit-community/good_mix_model_Stock
tokenizer_source: mergekit-community/L3.1-Athena-d-8B

Features & Improvements

🔹 Advanced Reasoning & Thoughtfulness - Thanks to Skywork-o1 integration, this model excels in logical thinking and problem-solving.

🔹 Enhanced Conversational Depth - The inclusion of Meta-Llama-3.1-8B-Claude adds better response structuring, making it more engaging in dialogue.

🔹 Versatile Roleplay & Creativity - Leveraging mega_blend_model and good_mix_model_Stock, the model supports immersive roleplaying and storytelling.

🔹 Strong Instruction Following - Trained on various instruction datasets to provide clear, informative, and helpful responses.

Use Cases

Chat & Roleplay - Supports natural, engaging, and dynamic conversational flow.
Programming & Code Generation - Provides reliable code completions and debugging suggestions.
Creative Writing - Generates compelling stories, character dialogues, and immersive text.
Educational Assistance - Helps explain complex topics and answer academic questions.
Logic & Problem-Solving - Can handle reasoning-based and structured thought processes.

🛠 How to Use

🔥 Ollama (Quick Inference)

You can run the model using Ollama for direct testing:

ollama run hf.co/ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix

🤗 Hugging Face Transformers (Python)

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_name = "ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix"

# Load tokenizer & model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, 
    torch_dtype=torch.bfloat16, 
    device_map="auto"
)

# Initialize text generation pipeline
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example prompt
prompt = "Describe the significance of AI ethics in modern technology."

# Generate output
outputs = text_generator(
    prompt,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

print(outputs[0]["generated_text"])

Model Alignment & Ethics

⚠️ Uncensored Use: This model does not apply strict moderation. Users should implement appropriate safety filters before deployment.

⚠️ Responsibility Notice: You are responsible for the outputs generated by this model. It is recommended to apply ethical safeguards and content moderation when integrating this model into applications.

📜 License: Governed by the Meta Llama 3.1 Community License Agreement.

Feedback & Contributions

We welcome feedback, bug reports, and performance evaluations! If you find improvements or wish to contribute, feel free to reach out or submit suggestions.

**ZeroXClem Team | 2025 ** ZXC

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	26.79
IFEval (0-Shot)	63.01
BBH (3-Shot)	31.39
MATH Lvl 5 (4-Shot)	27.95
GPQA (0-shot)	3.69
MuSR (0-shot)	6.90
MMLU-PRO (5-shot)	27.82

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご