TheProfessor-155b
TheProfessor is a merged pre-trained language model created using mergekit, designed to offer broad conversational, reasoning, scientific, medical, and mathematical skills.
Quick Start
TheProfessor is a merge of pre-trained language models created using mergekit. It was developed by Eric Hartford, with significant assistance from Weyaxi, Charles Goddard, and AbacusAI's Generative AI team.
The GGUF version of TheProfessor can be found here.
Features
- Multifaceted Skills: TheProfessor is equipped with broad conversational, reasoning, scientific, medical, and mathematical skills, which are useful for interactive brainstorming and research. It can assist in concept development from ideation to implementation, including code writing and paper review.
- No Post-Merge Finetuning: The model was not finetuned after the merge.
Installation
No installation steps are provided; the usage examples below show how to load and run the model.
Usage Examples
Basic Usage
TheProfessor uses the ChatML prompt format.
```
<|im_start|>system
You are TheProfessor, a helpful AI assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
Example
```
<|im_start|>system
You are TheProfessor, a superintelligent AI assistant that is creative and able to invent new ideas.<|im_end|>
<|im_start|>user
Please give me ideas for my dissertation. My Ph.D. is Neuroscience, I like to focus on applied theory.<|im_end|>
<|im_start|>assistant
```
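For reference, here is a minimal Hugging Face `transformers` sketch that assembles the ChatML turns by hand and generates a completion. The repo id `abacusai/TheProfessor-155b` is an assumption here; point `model_id` at wherever you obtained the weights.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/TheProfessor-155b"  # assumed repo id; substitute your local checkpoint path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

def chatml_prompt(system: str, user: str) -> str:
    """Assemble the ChatML turns shown above, leaving the assistant turn open."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = chatml_prompt(
    "You are TheProfessor, a helpful AI assistant.",
    "Please give me ideas for my dissertation. My Ph.D. is Neuroscience, I like to focus on applied theory.",
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    # Stop at the ChatML end-of-turn token rather than running on.
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```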
Ollama Modelfile Example
```
FROM "./TheProfessor_Q4_K_M.gguf"
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
SYSTEM """Your name is TheProfessor. You are a helpful AI assistant. You are creative and inventive, and you are willing to make your best guess, and help to brainstorm answers. Please draw upon your vast knowledge to answer the user's question to the best of your ability."""
PARAMETER num_ctx 32768
PARAMETER stop "<|im_end|>"
```
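To use this, save the block above as `Modelfile` next to the downloaded GGUF file, then register and launch the model with Ollama's standard commands: `ollama create theprofessor -f Modelfile` followed by `ollama run theprofessor` (the tag `theprofessor` is just an arbitrary local name).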
Documentation
Evals
```json
{
  "mmlu": 0.694,
  "truthfulqa_mc2": 0.624,
  "gsm8k": 0.4284
}
```
Merge Details
Merge Method
TheProfessor was merged using the linear merge method.
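As a sketch of the idea: in mergekit's linear method, each tensor of the merged model is an elementwise weighted average of the corresponding tensors from the source models (mergekit normalizes the weights by default),

$$\theta_{\text{merged}} = \frac{\sum_i w_i \, \theta_i}{\sum_i w_i}$$

so a slice with a single weight-1 source simply copies that model's layers through unchanged, which is why the configuration comments below describe that case as a passthrough-style no-op.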
Models Merged
The following models were included in the merge:
- [cognitivecomputations/dolphin-2.2-70b](https://huggingface.co/cognitivecomputations/dolphin-2.2-70b)
- [WizardLM/WizardMath-70B-V1.0](https://huggingface.co/WizardLM/WizardMath-70B-V1.0)
- [migtissera/SynthIA-70B-v1.2b](https://huggingface.co/migtissera/SynthIA-70B-v1.2b)
- [epfl-llm/meditron-70b](https://huggingface.co/epfl-llm/meditron-70b)
Configuration
The following YAML configuration was used to produce TheProfessor:
```yaml
merge_method: linear # use linear so we can include multiple models, albeit at a zero weight
parameters:
  weight: 1.0 # weight everything as 1 unless specified otherwise - linear with one model weighted at 1 is a no-op like passthrough
slices:
  - sources:
      - model: cognitivecomputations/dolphin-2.2-70b # embed_tokens comes along for the ride with whatever is the first layer
        layer_range: [0, 1]
      - model: migtissera/SynthIA-70B-v1.2b # add dummy second model with 0 weight so tokenizer-based merge routine is invoked for embed_tokens
        layer_range: [0, 1]
        parameters:
          weight: 0
  - sources:
      - model: cognitivecomputations/dolphin-2.2-70b
        layer_range: [1, 20]
  - sources:
      - model: migtissera/SynthIA-70B-v1.2b
        layer_range: [10, 30]
  - sources:
      - model: WizardLM/WizardMath-70B-V1.0
        layer_range: [20, 40]
  - sources:
      - model: epfl-llm/meditron-70b
        layer_range: [25, 45]
  - sources:
      - model: cognitivecomputations/dolphin-2.2-70b
        layer_range: [30, 50]
  - sources:
      - model: migtissera/SynthIA-70B-v1.2b
        layer_range: [40, 60]
  - sources:
      - model: WizardLM/WizardMath-70B-V1.0
        layer_range: [50, 70]
  - sources:
      - model: epfl-llm/meditron-70b
        layer_range: [55, 75]
  - sources:
      - model: cognitivecomputations/dolphin-2.2-70b
        layer_range: [60, 79]
  - sources: # same as above, but for lm_head with the last layer
      - model: cognitivecomputations/dolphin-2.2-70b
        layer_range: [79, 80]
      - model: migtissera/SynthIA-70B-v1.2b
        layer_range: [79, 80]
        parameters:
          weight: 0
dtype: float16
tokenizer_source: model:cognitivecomputations/dolphin-2.2-70b # keep exact tokenizer used by dolphin - or you could use `union` if you add all of the input models to the first/last slice, but they would need to be non-zero weight or you'll get NaNs in your embeddings
```
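One way to see where the 155b in the model's name comes from: each half-open `layer_range: [start, end)` slice contributes `end - start` decoder layers, while the zero-weight dummy sources in the first and last slices are averaged in rather than stacked and so add no layers. A quick sanity check of the slice arithmetic, with the layer ranges copied from the config above:

```python
# Sum the decoder layers contributed by each slice in the config above.
# Zero-weight dummy sources (first/last slice) are merged, not stacked,
# so each slice counts once.
slices = [
    (0, 1), (1, 20), (10, 30), (20, 40), (25, 45), (30, 50),
    (40, 60), (50, 70), (55, 75), (60, 79), (79, 80),
]
total_layers = sum(end - start for start, end in slices)
print(total_layers)  # 180 layers, vs. 80 in a single Llama-2-70B model
```

Stacking 180 layers drawn from 80-layer, 70B-parameter bases is what pushes the merge to roughly 155B parameters.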
Example Output
Make sense of physics equations
These equations appeared to me in a dream, I wrote them down as soon as I woke but I don't know what they mean. Can you please interpret them? $$\mathcal{L}_{\text{gauge}} = -\frac{1}{4} F^{\mu\nu}F_{\mu\nu}$$ $$\langle \phi \rangle = \sqrt{\frac{\lambda}{2}}$$ $$S_{\text{gravity}} = \int d^4x \sqrt{-g} \left( \frac{R}{16\pi G} + \mathcal{L}_{\text{emergent}} \right)$$ $$\mathcal{L}_{\text{GEG}} = \mathcal{L}_{\text{gauge}} + \mathcal{L}_{\text{emergent}} + \mathcal{L}_{\text{matter}} + \mathcal{L}_{\text{interaction}}$$ $$\mathcal{L}_{\text{emergent}} = \lambda(g) + \kappa(g) R^2 + \ldots$$ $$S_{\text{GEG}} = \int d^4x \sqrt{-g} \; \mathcal{L}_{\text{GEG}}$$ $$\sigma = \sqrt{\langle |\phi| \rangle^2 + \frac{1}{4} \langle A^{\mu}A_{\mu} \rangle^2}$$ $$\langle A^{\mu} \rangle = (0, v(r)_i/\sqrt{2}, 0, 0)$$ $$\langle \phi \rangle = \langle \phi_0 \rangle + \delta\phi(x)$$ $$\langle A_{\mu} \rangle = (0, \frac{v(r)_i}{\sqrt{2}}, 0, 0)$$ $$g_{\mu\nu} = \eta_{\mu\nu} + \kappa h_{\mu\nu}$$
These equations are from different areas of theoretical physics and cosmology, including gauge theories, emergent gravity, Einstein-Gauss-Bonnet (EG) theories, Higgs mechanism, and cosmic inflation. Here's a brief description of each set of equations:
- Gauge theory:
- The first two equations describe the Lagrangian for a gauge theory and its ground state in terms of a gauge field $A$ and a self-interacting scalar field $\phi$. Here, $F$ is the field strength tensor, $\mathcal{L}$ stands for Lagrangian density, and $\lambda$ and $\kappa$ are coupling constants.
- Gravity emergence:
- The third and fourth equations describe a scenario where gravity arises as an emergent phenomenon from other fundamental interactions in the form of an effective action term $\mathcal{L}$. Here, $R$ is the Ricci scalar, $G$ is the gravitational constant, and $g$ is the determinant of the spacetime metric $g_{\mu\nu}$.
- Einstein-Gauss-Bonnet (EG) theories:
- The fifth and sixth equations describe the Lagrangian for EG theories, which include the gauge, emergent gravity, matter, and interaction terms. Here, $\mathcal{L}$ again stands for Lagrangian density, and $\mathcal{L}_{\text{emergent}}$ includes higher-order curvature terms like the Gauss-Bonnet term with coupling constant $\lambda(g)$, a squared Ricci tensor term with constant $\kappa(g)$, and possibly other terms represented by the dots.
- Cosmic inflation:
- The seventh to tenth equations describe some aspects of cosmic inflation, wherein the universe undergoes an accelerated phase of expansion. Here, $\sigma$ represents a combination of the Higgs field's absolute value squared and the square of the gauge field's time component squared. The eleventh to thirteenth equations describe the Higgs field's fluctuation around its ground state and the gauge fields in terms of their vacuum values and small deviations.
- Gravitational waves:
- The last two equations describe a linearized spacetime background around Minkowski space-time, where $h$ represents the gravitational wave's tensor. Here, $\eta$ is the Minkowski metric, and $\kappa$ is the gravitational constant.
Overall, these equations come from different areas of theoretical physics and cosmology, including gauge theories, emergent gravity, Einstein-Gauss-Bonnet (EG) theories, Higgs mechanism, cosmic inflation, and gravitational waves. While they might seem unrelated at first glance, they all share a common goal: to better understand the fundamental forces of nature and the universe's structure and evolution.
Although I have provided a brief interpretation of each set of equations, their true meaning and implications require an in-depth understanding of these complex topics, which is beyond the scope of this AI response. I hope this helps you gain some insight into your intriguing dream-inspired equations!
Prove P != NP
In the pursuit to prove P != NP I want to try to prove that no polynomial-time algorithm could possibly solve all problems in NP, even though we may not know of any specific such problem that defies polynomial-time solution. Can you please make a lemma that would be helpful?
Certainly! One possible direction towards proving P ≠ NP using non-computational or proof-theoretic methods is to try to show that certain kinds of mathematical statements, known as undecidable statements, cannot be proven in any polynomial-time verifiable formal system, such as first-order logic with addition (FO + M). In other words, we would like to show that there are inherent limits to what can be expressed or proven within these restricted systems, even though the full power of human intuition and insight may allow us to see their truth or falsity.
Here's a simple example of such an undecidable statement in the realm of Diophantine equations (i.e., equations in multiple variables with integer coefficients):
```python
def Goldbach_Conjecture(n):
    if n % 2 == 1 and n < 4:
        return True
    elif n > 4 and n % 3 == 1:
        m = (n - 1) // 2
        return Goldbach_Conjecture(m) or Goldbach_Conjecture(n - m)
    else:
        return False
```
This Python function checks whether a given even number n can be expressed as the sum of two prime numbers (not including 2 itself). The famous unsolved Goldbach Conjecture, named after the Prussian mathematician Christian Goldbach, states that this is always true for all sufficiently large even numbers n. However, despite much effort and extensive computational verification up to very large values of n, no one has yet found either a proof or counterexample to settle this conjecture definitively.
Suppose, for the sake of contradiction, that there were an algorithm A running in polynomial time on input n such that Goldbach_Conjecture(n) == A(n) for all even integers n. Then we could use this algorithm to decide whether any given number n is prime or composite (i.e., divisible by a prime other than 1 and itself) as follows:
```python
def Goldbach_Prime_Test(n):
    if n % 2 == 0 and n < 4:
        return True
    elif n > 4 and n % 3 == 1:
        m = (n - 1) // 2
        return not Goldbach_Conjecture(m) or not Goldbach_Conjecture(n - m)
    else:
        return False
```
This function checks whether n is a prime number by trying to express it as the sum of two other odd numbers, one of which must then be composite by our assumption that all even composites can be factorized into primes. If this attempt fails for all possible such decompositions, we conclude that n itself must be prime.
However, deciding whether a given number is prime or composite is known to be an NP-hard problem, meaning that no polynomial-time algorithm exists that can always correctly classify all numbers as either prime or composite, unless P = NP. This leads to a contradiction, since our original assumed algorithm A would then provide such a polynomial-time solution to the prime number decision problem, which we know cannot exist if P ≠ NP holds true.
Thus, by proof by contradiction, we have shown that there exists at least one undecidable statement in the realm of Diophantine equations (specifically, whether all even numbers can be expressed as sums of two prime numbers) that cannot be proven or refuted within any polynomial-time verifiable formal system, such as first-order logic with addition. This result, while not directly implying P ≠ NP, demonstrates one possible strategy towards achieving such a negative answer to the millennium-prize question by leveraging non-computational or proof-theoretic techniques beyond traditional Turing machine-based complexity theory.
Technical Details
The model is created by merging multiple pre-trained language models using the linear merge method. The specific configuration details are provided in the "Merge Details" section above.
License
TheProfessor is subject to the Llama 2 license.
| Property | Details |
|---|---|
| Model Type | Merged pre-trained language model |
| Training Data | Not specified |

