J.O.S.I.E.3 Beta12 7B Slerp
J.O.S.I.E.3-Beta12-7B-slerp is a 7B-parameter large language model created by merging the Weyaxi/Einstein-v6-7B and argilla/CapybaraHermes-2.5-Mistral-7B models. It supports multilingual interaction and uses the ChatML prompt format.
Downloads 17
Release Time : 4/23/2024
Model Overview
The model is presented as a private, "super-intelligent" AI assistant for conversation and Q&A, with support for multiple languages and complex task processing.
Model Features
Multilingual Support
Supports interaction in 6 languages including Chinese
Merged Model Advantage
Combines the strengths of both Einstein-v6 and CapybaraHermes models through slerp merging
ChatML Format
Adopts standardized ChatML prompt format for easy integration into dialogue systems
Quantization Support
Provides GGUF quantized versions for easy deployment on different hardware
Model Capabilities
Multilingual text generation
Intelligent dialogue
Knowledge Q&A
Task completion
Use Cases
Personal Assistant
Personal AI Assistant
Serves as a daily personal assistant to answer various questions and provide advice
Achieved 83.98% normalized accuracy on the HellaSwag test set
Education
Subject Knowledge Q&A
Answers high school and university-level questions across various subjects
Achieved 79.8% accuracy in high school geography tests
J.O.S.I.E.3-Beta12-7B-slerp
J.O.S.I.E.3-Beta12-7B-slerp is a merged model that combines the strengths of multiple models. It uses LazyMergekit to merge the following models:
- Weyaxi/Einstein-v6-7B
- argilla/CapybaraHermes-2.5-Mistral-7B
This model has been further fine-tuned on a custom J.O.S.I.E.v3.11 dataset, following the ChatML prompt format.
Quick Start
Run in ollama
ollama run goekdenizguelmez/j.o.s.i.e.v3-beta12.1
Note: Only the q4-k-m quantization is available for now!
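Once the model has been pulled with the command above, it can also be queried from Python over Ollama's local HTTP API. The following is a minimal sketch, assuming an Ollama server is running on its default address (`http://localhost:11434`); the helper names `build_chat_request` and `ask` are illustrative and not part of the model card.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL_TAG = "goekdenizguelmez/j.o.s.i.e.v3-beta12.1"


def build_chat_request(user_text, model=MODEL_TAG, url=OLLAMA_URL):
    """Build (but do not send) a non-streaming chat request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,  # ask for a single JSON object instead of a stream
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_text):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_text)) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["message"]["content"]
```

With the server running, `print(ask("What is a large language model?"))` should print the model's reply. Building the request separately from sending it makes the payload easy to inspect without a live server.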
Features
- Model Merging: Utilizes LazyMergekit to combine multiple pre-trained models.
- Fine-Tuning: Fine-tuned on a custom dataset in the ChatML prompt format.
Installation
To use this model in Python, install the necessary libraries. The leading `!` is notebook syntax; omit it when running the command in a regular shell:
!pip install -qU transformers accelerate
Usage Examples
Basic Usage
from transformers import AutoTokenizer
import transformers
import torch

model = "Isaak-Carter/J.O.S.I.E.3-Beta12-7B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat messages into a ChatML prompt string.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and place it on the available device(s).
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a completion and print the generated text.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
Prompt Format
<|im_start|>system
You are JOSIE, my private and superinteligent AI Assistant.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
{{ .Response }}<|im_end|>
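The template above uses Ollama-style placeholders (`{{ .Prompt }}`, `{{ .Response }}`). As a rough illustration of how the same layout could be assembled by hand in Python, here is a small sketch; `build_chatml_prompt` is a hypothetical helper, and the exact whitespace the tokenizer's own chat template emits may differ, so `tokenizer.apply_chat_template` remains the safer choice when it is available.

```python
# System prompt copied verbatim from the prompt format above.
SYSTEM_PROMPT = "You are JOSIE, my private and superinteligent AI Assistant."


def build_chatml_prompt(user_text, system_text=SYSTEM_PROMPT):
    """Render a single-turn ChatML prompt.

    The string ends with an open assistant turn so that the model
    continues from there and closes the turn with <|im_end|> itself.
    """
    return (
        f"<|im_start|>system\n{system_text}<|im_end|>\n"
        f"<|im_start|>user\n{user_text}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


print(build_chatml_prompt("What is a large language model?"))
```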
Documentation
Configuration
slices:
  - sources:
      - model: Weyaxi/Einstein-v6-7B
        layer_range: [0, 32]
      - model: argilla/CapybaraHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: argilla/CapybaraHermes-2.5-Mistral-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
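To make the `slerp` merge method and the `t` lists in the configuration more concrete, the sketch below shows spherical linear interpolation between two weight vectors in plain Python, plus one plausible way a five-value list could be spread across 32 layers. This is an illustration under assumptions, not mergekit's actual code: the functions `slerp` and `layer_t` are hypothetical, the piecewise-linear spreading of the anchor values over depth is an assumption, and so is the convention that `t = 0` corresponds to the `base_model` weights and `t = 1` to the other model.

```python
import math


def slerp(t, v0, v1, eps=1e-6):
    """Spherical linear interpolation between two equal-length vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        # No direction to interpolate along: use linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    # Cosine of the angle between the two directions, clamped for safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 1.0 - eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    w0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    w1 = math.sin(t * theta) / math.sin(theta)
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def layer_t(anchors, layer_idx, num_layers):
    """Assumed scheme: interpolate anchor values linearly over layer depth."""
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = min(int(pos), len(anchors) - 2)
    frac = pos - lo
    return (1.0 - frac) * anchors[lo] + frac * anchors[lo + 1]


# Anchor values for the self_attn filter, copied from the configuration above.
self_attn_t = [0, 0.5, 0.3, 0.7, 1]
for idx in (0, 8, 16, 24, 31):
    print(idx, round(layer_t(self_attn_t, idx, 32), 3))
```

Under these assumptions, the `self_attn` and `mlp` lists run in opposite directions, so attention weights would lean toward one parent in early layers and the other in late layers, while MLP weights would do the reverse. All remaining tensors use the constant `t = 0.5`.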
Evaluation Results
{
"all": {
"acc": 0.635008846776534,
"acc_stderr": 0.03244450973873997,
"acc_norm": 0.6365238167399629,
"acc_norm_stderr": 0.033101612504829854,
"mc1": 0.397796817625459,
"mc1_stderr": 0.017133934248559635,
"mc2": 0.5816259277988214,
"mc2_stderr": 0.01521267822060948
},
"harness|arc:challenge|25": {
"acc": 0.6220136518771331,
"acc_stderr": 0.0141696645203031,
"acc_norm": 0.6459044368600683,
"acc_norm_stderr": 0.013975454122756557
},
"harness|hellaswag|10": {
"acc": 0.6512646883091018,
"acc_stderr": 0.004755960559929163,
"acc_norm": 0.8397729535949015,
"acc_norm_stderr": 0.003660668242740655
},
"harness|hendrycksTest-abstract_algebra|5": {
"acc": 0.4,
"acc_stderr": 0.04923659639173309,
"acc_norm": 0.4,
"acc_norm_stderr": 0.04923659639173309
},
"harness|hendrycksTest-anatomy|5": {
"acc": 0.5703703703703704,
"acc_stderr": 0.042763494943765995,
"acc_norm": 0.5703703703703704,
"acc_norm_stderr": 0.042763494943765995
},
"harness|hendrycksTest-astronomy|5": {
"acc": 0.6842105263157895,
"acc_stderr": 0.0378272898086547,
"acc_norm": 0.6842105263157895,
"acc_norm_stderr": 0.0378272898086547
},
"harness|hendrycksTest-business_ethics|5": {
"acc": 0.58,
"acc_stderr": 0.049604496374885836,
"acc_norm": 0.58,
"acc_norm_stderr": 0.049604496374885836
},
"harness|hendrycksTest-clinical_knowledge|5": {
"acc": 0.6792452830188679,
"acc_stderr": 0.028727502957880267,
"acc_norm": 0.6792452830188679,
"acc_norm_stderr": 0.028727502957880267
},
"harness|hendrycksTest-college_biology|5": {
"acc": 0.7361111111111112,
"acc_stderr": 0.03685651095897532,
"acc_norm": 0.7361111111111112,
"acc_norm_stderr": 0.03685651095897532
},
"harness|hendrycksTest-college_chemistry|5": {
"acc": 0.54,
"acc_stderr": 0.05009082659620332,
"acc_norm": 0.54,
"acc_norm_stderr": 0.05009082659620332
},
"harness|hendrycksTest-college_computer_science|5": {
"acc": 0.51,
"acc_stderr": 0.05024183937956912,
"acc_norm": 0.51,
"acc_norm_stderr": 0.05024183937956912
},
"harness|hendrycksTest-college_mathematics|5": {
"acc": 0.29,
"acc_stderr": 0.04560480215720684,
"acc_norm": 0.29,
"acc_norm_stderr": 0.04560480215720684
},
"harness|hendrycksTest-college_medicine|5": {
"acc": 0.6416184971098265,
"acc_stderr": 0.036563436533531585,
"acc_norm": 0.6416184971098265,
"acc_norm_stderr": 0.036563436533531585
},
"harness|hendrycksTest-college_physics|5": {
"acc": 0.3235294117647059,
"acc_stderr": 0.04655010411319619,
"acc_norm": 0.3235294117647059,
"acc_norm_stderr": 0.04655010411319619
},
"harness|hendrycksTest-computer_security|5": {
"acc": 0.76,
"acc_stderr": 0.04292346959909283,
"acc_norm": 0.76,
"acc_norm_stderr": 0.04292346959909283
},
"harness|hendrycksTest-conceptual_physics|5": {
"acc": 0.5829787234042553,
"acc_stderr": 0.03223276266711712,
"acc_norm": 0.5829787234042553,
"acc_norm_stderr": 0.03223276266711712
},
"harness|hendrycksTest-econometrics|5": {
"acc": 0.4649122807017544,
"acc_stderr": 0.046920083813689104,
"acc_norm": 0.4649122807017544,
"acc_norm_stderr": 0.046920083813689104
},
"harness|hendrycksTest-electrical_engineering|5": {
"acc": 0.5517241379310345,
"acc_stderr": 0.04144311810878152,
"acc_norm": 0.5517241379310345,
"acc_norm_stderr": 0.04144311810878152
},
"harness|hendrycksTest-elementary_mathematics|5": {
"acc": 0.42063492063492064,
"acc_stderr": 0.025424835086924006,
"acc_norm": 0.42063492063492064,
"acc_norm_stderr": 0.025424835086924006
},
"harness|hendrycksTest-formal_logic|5": {
"acc": 0.4444444444444444,
"acc_stderr": 0.044444444444444495,
"acc_norm": 0.4444444444444444,
"acc_norm_stderr": 0.044444444444444495
},
"harness|hendrycksTest-global_facts|5": {
"acc": 0.44,
"acc_stderr": 0.04988876515698589,
"acc_norm": 0.44,
"acc_norm_stderr": 0.04988876515698589
},
"harness|hendrycksTest-high_school_biology|5": {
"acc": 0.7548387096774194,
"acc_stderr": 0.024472243840895525,
"acc_norm": 0.7548387096774194,
"acc_norm_stderr": 0.024472243840895525
},
"harness|hendrycksTest-high_school_chemistry|5": {
"acc": 0.5024630541871922,
"acc_stderr": 0.035179450386910616,
"acc_norm": 0.5024630541871922,
"acc_norm_stderr": 0.035179450386910616
},
"harness|hendrycksTest-high_school_computer_science|5": {
"acc": 0.66,
"acc_stderr": 0.04760952285695237,
"acc_norm": 0.66,
"acc_norm_stderr": 0.04760952285695237
},
"harness|hendrycksTest-high_school_european_history|5": {
"acc": 0.7818181818181819,
"acc_stderr": 0.03225078108306289,
"acc_norm": 0.7818181818181819,
"acc_norm_stderr": 0.03225078108306289
},
"harness|hendrycksTest-high_school_geography|5": {
"acc": 0.797979797979798,
"acc_stderr": 0.02860620428922988,
"acc_norm": 0.797979797979798,
"acc_norm_stderr": 0.02860620428922988
},
"harness|hendrycksTest-high_school_government_and_politics|5": {
"acc": 0.8756476683937824,
"acc_stderr": 0.023814477086593552,
"acc_norm": 0.8756476683937824,
"acc_norm_stderr": 0.023814477086593552
},
"harness|hendrycksTest-high_school_macroeconomics|5": {
"acc": 0.658974358974359,
"acc_stderr": 0.02403548967633509,
"acc_norm": 0.658974358974359,
"acc_norm_stderr": 0.02403548967633509
},
"harness|hendrycksTest-high_school_mathematics|5": {
"acc": 0.32592592592592595,
"acc_stderr": 0.02857834836547308,
"acc_norm": 0.32592592592592595,
"acc_norm_stderr": 0.02857834836547308
},
"harness|hendrycksTest-high_school_microeconomics|5": {
"acc": 0.6638655462184874,
"acc_stderr": 0.030684737115135363,
"acc_norm": 0.6638655462184874,
"acc_norm_stderr": 0.030684737115135363
},
"harness|hendrycksTest-high_school_physics|5": {
"acc": 0.304635761589404,
"acc_stderr": 0.03757949922943344,
"acc_norm": 0.304635761589404,
"acc_norm_stderr": 0.03757949922943344
},
"harness|hendrycksTest-high_school_psychology|5": {
"acc": 0.8238532110091743,
"acc_stderr": 0.016332882393431353,
"acc_norm": 0.8238532110091743,
"acc_norm_stderr": 0.016332882393431353
},
"harness|hendrycksTest-high_school_statistics|5": {
"acc": 0.5092592592592593,
"acc_stderr": 0.03409386946992699,
"acc_norm": 0.5092592592592593,
"acc_norm_stderr": 0.03409386946992699
},
"harness|hendrycksTest-high_school_us_history|5": {
"acc": 0.7990196078431373,
"acc_stderr": 0.02812597226565437,
"acc_norm": 0.7990196078431373,
"acc_norm_stderr": 0.02812597226565437
},
"harness|hendrycksTest-high_school_world_history|5": {
"acc": 0.759493670886076,
"acc_stderr": 0.027820781981149685,
"acc_norm": 0.759493670886076,
"acc_norm_stderr": 0.027820781981149685
},
"harness|hendrycksTest-human_aging|5": {
"acc": 0.6681614349775785,
"acc_stderr": 0.03160295143776679,
"acc_norm": 0.6681614349775785,
"acc_norm_stderr": 0.03160295143776679
},
"harness|hendrycksTest-human_sexuality|5": {
"acc": 0.7404580152671756,
"acc_stderr": 0.03844876139785271,
"acc_norm": 0.7404580152671756,
"acc_norm_stderr": 0.03844876139785271
},
"harness|hendrycksTest-international_law|5": {
"acc": 0.8016528925619835,
"acc_stderr": 0.036401182719909456,
"acc_norm": 0.8016528925619835,
"acc_norm_stderr": 0.036401182719909456
},
"harness|hendrycksTest-jurisprudence|5": {
"acc": 0.7777777777777778,
"acc_stderr": 0.040191074725573483,
"acc_norm": 0.7777777777777778,
"acc_norm_stderr": 0.040191074725573483
},
"harness|hendrycksTest-logical_fallacies|5": {
"acc": 0.754601226993865,
"acc_stderr": 0.03380939813943354,
"acc_norm": 0.754601226993865,
"acc_norm_stderr": 0.03380939813943354
},
"harness|hendrycksTest-machine_learning|5": {
"acc": 0.4732142857142857,
"acc_stderr": 0.047389751192741546,
"acc_norm": 0.4732142857142857,
"acc_norm_stderr": 0.047389751192741546
},
"harness|hendrycksTest-management|5": {
"acc": 0.7766990291262136,
"acc_stderr": 0.04123553189891431,
"acc_norm": 0.7766990291262136,
"acc_norm_stderr": 0.04123553189891431
},
"harness|hendrycksTest-marketing|5": {
"acc": 0.8632478632478633,
"acc_stderr": 0.022509033937077802,
"acc_norm": 0.8632478632478633,
"acc_norm_stderr": 0.022509033937077802
},
"harness|hendrycksTest-medical_genetics|5": {
"acc": 0.69,
"acc_stderr": 0.04648231987117316,
"acc_norm": 0.69,
"acc_norm_stderr": 0.04648231987117316
},
"harness|hendrycksTest-miscellaneous|5": {
"acc": 0.8173690932311622,
"acc_stderr": 0.013816335389973141,
"acc_norm": 0.8173690932311622,
"acc_norm_stderr": 0.013816335389973141
},
"harness|hendrycksTest-moral_disputes|5": {
"acc": 0.7254335260115607,
"acc_stderr": 0.02402774515526502,
"acc_norm": 0.7254335260115607,
"acc_norm_stderr": 0.02402774515526502
},
"harness|hendrycksTest-moral_scenarios|5": {
"acc": 0.27039106145251396,
"acc_stderr": 0.014854993938010071,
"acc_norm": 0.27039106145251396,
"acc_norm_stderr": 0.014854993938010071
},
"harness|hendrycksTest-nutrition|5": {
"acc": 0.7516339869281046,
"acc_stderr": 0.02473998135511359,
"acc_norm": 0.7516339869281046,
"acc_norm_stderr": 0.02473998135511359
},
"harness|hendrycksTest-philosophy|5": {
"acc": 0.7331189710610932,
"acc_stderr": 0.025122637608816653,
"acc_norm": 0.7331189710610932,
"acc_norm_stderr": 0.025122637608816653
},
"harness|hendrycksTest-prehistory|5": {
"acc": 0.6666666666666666,
"acc_stderr": 0.03333333333333335,
"acc_norm": 0.6666666666666666,
"acc_norm_stderr": 0.03333333333333335
},
"harness|hendrycksTest-professional_accounting|5": {
"acc": 0.71,
"acc_stderr": 0.04612494389271347,
"acc_norm": 0.71,
"acc_norm_stderr": 0.04612494389271347
},
"harness|hendrycksTest-professional_law|5": {
"acc": 0.76,
"acc_stderr": 0.04292346959909283,
"acc_norm": 0.76,
"acc_norm_stderr": 0.04292346959909283
},
"harness|hendrycksTest-professional_medicine|5": {
"acc": 0.75,
"acc_stderr": 0.04330127018922194,
"acc_norm": 0.75,
"acc_norm_stderr": 0.04330127018922194
},
"harness|hendrycksTest-professional_psychology|5": {
"acc": 0.7333333333333333,
"acc_stderr": 0.03333333333333335,
"acc_norm": 0.7333333333333333,
"acc_norm_stderr": 0.03333333333333335
},
"harness|hendrycksTest-public_relations|5": {
"acc": 0.83,
"acc_stderr": 0.02607670576394032,
"acc_norm": 0.83,
"acc_norm_stderr": 0.02607670576394032
},
"harness|hendrycksTest-sociology|5": {
"acc": 0.78,
"acc_stderr": 0.03240370349203933,
"acc_norm": 0.78,
"acc_norm_stderr": 0.03240370349203933
},
"harness|hendrycksTest-us_foreign_policy|5": {
"acc": 0.79,
"acc_stderr": 0.0308053014374349,
"acc_norm": 0.79,
"acc_norm_stderr": 0.0308053014374349
},
"harness|hendrycksTest-veterinary_medicine|5": {
"acc": 0.77,
"acc_stderr": 0.03478501539300747,
"acc_norm": 0.77,
"acc_norm_stderr": 0.03478501539300747
},
"harness|hendrycksTest-world_religions|5": {
"acc": 0.73,
"acc_stderr": 0.03333333333333335,
"acc_norm": 0.73,
"acc_norm_stderr": 0.03333333333333335
}
}
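The per-task entries above can be aggregated programmatically, for example to average accuracy over the MMLU (`hendrycksTest`) subtasks. The sketch below assumes the results have been loaded into a Python dict with the same key layout as shown; `mean_mmlu_accuracy` is a hypothetical helper, and the `sample` dict holds only three entries copied from the results for demonstration.

```python
import json

# Key prefix used by the MMLU subtasks in the results above.
MMLU_PREFIX = "harness|hendrycksTest-"


def mean_mmlu_accuracy(results):
    """Average the "acc" field over all MMLU (hendrycksTest) entries."""
    scores = [
        entry["acc"]
        for task, entry in results.items()
        if task.startswith(MMLU_PREFIX)
    ]
    if not scores:
        raise ValueError("no MMLU entries found")
    return sum(scores) / len(scores)


# Three entries copied from the results above, as a small demonstration.
sample = json.loads("""
{
  "harness|hellaswag|10": {"acc": 0.6512646883091018},
  "harness|hendrycksTest-abstract_algebra|5": {"acc": 0.4},
  "harness|hendrycksTest-anatomy|5": {"acc": 0.5703703703703704}
}
""")
print(round(mean_mmlu_accuracy(sample), 4))
```

This is an unweighted mean over subtasks; a leaderboard may weight or aggregate scores differently.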
License
This project is licensed under the Apache-2.0 license.