CosmicBun-8B Open-source Model - Freely Supports Text Generation in Scientific Fields such as Mathematics, Physics, Chemistry, and Biology

Cosmicbun 8B

Developed by aloobun

CosmicBun-8B is a merged model based on the Llama3-8B architecture, specializing in text generation tasks for scientific fields such as mathematics, physics, chemistry, and biology.

Large Language Model

Transformers

Open Source License:MIT #Multidisciplinary knowledge integration #Scientific reasoning optimization #Few-shot learning

Downloads 19

Release Time : 5/1/2024

Model Overview

This model was created by merging multiple Llama3-8B variants (including dolphin-2.9, Einstein-v6.1, and neural-chat-v1) to enhance performance in science-related tasks.

Model Features

Science domain optimization

Specializes in text generation capabilities for scientific fields such as mathematics, physics, chemistry, and biology

Multi-model merging

Uses DARE/TIES methods to merge multiple Llama3-8B variants, combining the strengths of each model

Hierarchical parameter configuration

Applies different density and weight configurations to various model layers to optimize performance

Model Capabilities

Text generation

Scientific question answering

Mathematical reasoning

Physics concept explanation

Chemistry knowledge Q&A

Biology knowledge Q&A

Use Cases

Education

Scientific question answering

Answers students' questions related to mathematics, physics, chemistry, and biology

Achieves 68.23% accuracy on the GSM8k mathematical reasoning task

Research assistance

Scientific concept explanation

Helps researchers quickly understand complex scientific concepts

Achieves 65.53% accuracy on the MMLU comprehensive knowledge test

🚀 CosmicBun-8B

CosmicBun-8B is a merged pre - trained language model, leveraging the power of multiple base models to achieve excellent performance in text - generation tasks.

📚 Documentation

Model Overview

This is a merge of pre - trained language models created using mergekit.

Merge Method

This model was merged using the DARE TIES merge method using [Locutusque/llama - 3 - neural - chat - v1 - 8b](https://huggingface.co/Locutusque/llama - 3 - neural - chat - v1 - 8b) as a base.

Models Merged

The following models were included in the merge:

[cognitivecomputations/dolphin - 2.9 - llama3 - 8b](https://huggingface.co/cognitivecomputations/dolphin - 2.9 - llama3 - 8b)
[Weyaxi/Einstein - v6.1 - Llama3 - 8B](https://huggingface.co/Weyaxi/Einstein - v6.1 - Llama3 - 8B)

Configuration

The following YAML configuration was used to produce this model:

base_model: Locutusque/llama-3-neural-chat-v1-8b
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 0.0
slices:
- sources:
  - layer_range: [0, 4]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 1.0
      weight: 0.6
  - layer_range: [0, 4]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.6
      weight: 0.5
  - layer_range: [0, 4]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 1.0
      weight: 0.5
- sources:
  - layer_range: [4, 8]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.8
      weight: 0.1
  - layer_range: [4, 8]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 1.0
      weight: 0.2
  - layer_range: [4, 8]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 1.0
      weight: 0.7
- sources:
  - layer_range: [8, 12]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.7
      weight: 0.1
  - layer_range: [8, 12]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.7
      weight: 0.2
  - layer_range: [8, 12]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.7
      weight: 0.6
- sources:
  - layer_range: [12, 16]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.9
      weight: 0.2
  - layer_range: [12, 16]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.6
      weight: 0.6
  - layer_range: [12, 16]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.7
      weight: 0.3
- sources:
  - layer_range: [16, 20]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 1.0
      weight: 0.2
  - layer_range: [16, 20]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 1.0
      weight: 0.2
  - layer_range: [16, 20]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.9
      weight: 0.4
- sources:
  - layer_range: [20, 24]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.7
      weight: 0.2
  - layer_range: [20, 24]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.9
      weight: 0.3
  - layer_range: [20, 24]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 1.0
      weight: 0.4
- sources:
  - layer_range: [24, 28]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 1.0
      weight: 0.4
  - layer_range: [24, 28]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.8
      weight: 0.2
  - layer_range: [24, 28]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.9
      weight: 0.4
- sources:
  - layer_range: [28, 32]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 1.0
      weight: 0.3
  - layer_range: [28, 32]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.9
      weight: 0.2
  - layer_range: [28, 32]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 1.0
      weight: 0.3

📊 Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	68.81
AI2 Reasoning Challenge (25 - Shot)	61.86
HellaSwag (10 - Shot)	84.29
MMLU (5 - Shot)	65.53
TruthfulQA (0 - shot)	54.08
Winogrande (5 - shot)	78.85
GSM8k (5 - shot)	68.23

📄 License

This project is licensed under the MIT license.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご