TheProfessor-155b
TheProfessor is a merged pre-trained language model created using mergekit, designed to offer broad conversational, reasoning, scientific, medical, and mathematical skills.
Quick Start
TheProfessor is a merge of pre-trained language models created using mergekit. It was developed by Eric Hartford, with significant assistance from Weyaxi, Charles Goddard, and AbacusAI's Generative AI team.
The GGUF version of TheProfessor can be found here.
Features
- Multifaceted Skills: TheProfessor is equipped with broad conversational, reasoning, scientific, medical, and mathematical skills, which are useful for interactive brainstorming and research. It can assist in concept development from ideation to implementation, including code writing and paper review.
- No Post-Merge Finetuning: The model was not finetuned after the merge.
Installation
No installation steps are provided; the usage examples below show how to load and run the model.
Usage Examples
Basic Usage
TheProfessor uses the ChatML prompt format.
```
<|im_start|>system
You are TheProfessor, a helpful AI assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
Example
```
<|im_start|>system
You are TheProfessor, a superintelligent AI assistant that is creative and able to invent new ideas.<|im_end|>
<|im_start|>user
Please give me ideas for my dissertation. My Ph.D. is Neuroscience, I like to focus on applied theory.<|im_end|>
<|im_start|>assistant
```
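For reference, here is a minimal Hugging Face `transformers` sketch that assembles the ChatML turns by hand and generates a completion. The repo id `abacusai/TheProfessor-155b` is an assumption here; point `model_id` at wherever you obtained the weights.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/TheProfessor-155b"  # assumed repo id; substitute your local checkpoint path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

def chatml_prompt(system: str, user: str) -> str:
    """Assemble the ChatML turns shown above, leaving the assistant turn open."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = chatml_prompt(
    "You are TheProfessor, a helpful AI assistant.",
    "Please give me ideas for my dissertation. My Ph.D. is Neuroscience, I like to focus on applied theory.",
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    # Stop at the ChatML end-of-turn token rather than running on.
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```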
Ollama Modelfile Example
```
FROM "./TheProfessor_Q4_K_M.gguf"
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
SYSTEM """Your name is TheProfessor. You are a helpful AI assistant. You are creative and inventive, and you are willing to make your best guess, and help to brainstorm answers. Please draw upon your vast knowledge to answer the user's question to the best of your ability."""
PARAMETER num_ctx 32768
PARAMETER stop "<|im_end|>"
```
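To use this, save the block above as `Modelfile` next to the downloaded GGUF file, then register and launch the model with Ollama's standard commands: `ollama create theprofessor -f Modelfile` followed by `ollama run theprofessor` (the tag `theprofessor` is just an arbitrary local name).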
Documentation
Evals
```json
{
  "mmlu": 0.694,
  "truthfulqa_mc2": 0.624,
  "gsm8k": 0.4284
}
```
Merge Details
Merge Method
TheProfessor was merged using the linear merge method.
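As a sketch of the idea: in mergekit's linear method, each tensor of the merged model is an elementwise weighted average of the corresponding tensors from the source models (mergekit normalizes the weights by default),

$$\theta_{\text{merged}} = \frac{\sum_i w_i \, \theta_i}{\sum_i w_i}$$

so a slice with a single weight-1 source simply copies that model's layers through unchanged, which is why the configuration comments below describe that case as a passthrough-style no-op.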
Models Merged
The following models were included in the merge:
- [cognitivecomputations/dolphin-2.2-70b](https://huggingface.co/cognitivecomputations/dolphin-2.2-70b)
- [WizardLM/WizardMath-70B-V1.0](https://huggingface.co/WizardLM/WizardMath-70B-V1.0)
- [migtissera/SynthIA-70B-v1.2b](https://huggingface.co/migtissera/SynthIA-70B-v1.2b)
- [epfl-llm/meditron-70b](https://huggingface.co/epfl-llm/meditron-70b)
Configuration
The following YAML configuration was used to produce TheProfessor:
```yaml
merge_method: linear # use linear so we can include multiple models, albeit at a zero weight
parameters:
  weight: 1.0 # weight everything as 1 unless specified otherwise - linear with one model weighted at 1 is a no-op like passthrough
slices:
  - sources:
      - model: cognitivecomputations/dolphin-2.2-70b # embed_tokens comes along for the ride with whatever is the first layer
        layer_range: [0, 1]
      - model: migtissera/SynthIA-70B-v1.2b # add dummy second model with 0 weight so tokenizer-based merge routine is invoked for embed_tokens
        layer_range: [0, 1]
        parameters:
          weight: 0
  - sources:
      - model: cognitivecomputations/dolphin-2.2-70b
        layer_range: [1, 20]
  - sources:
      - model: migtissera/SynthIA-70B-v1.2b
        layer_range: [10, 30]
  - sources:
      - model: WizardLM/WizardMath-70B-V1.0
        layer_range: [20, 40]
  - sources:
      - model: epfl-llm/meditron-70b
        layer_range: [25, 45]
  - sources:
      - model: cognitivecomputations/dolphin-2.2-70b
        layer_range: [30, 50]
  - sources:
      - model: migtissera/SynthIA-70B-v1.2b
        layer_range: [40, 60]
  - sources:
      - model: WizardLM/WizardMath-70B-V1.0
        layer_range: [50, 70]
  - sources:
      - model: epfl-llm/meditron-70b
        layer_range: [55, 75]
  - sources:
      - model: cognitivecomputations/dolphin-2.2-70b
        layer_range: [60, 79]
  - sources: # same as above, but for lm_head with the last layer
      - model: cognitivecomputations/dolphin-2.2-70b
        layer_range: [79, 80]
      - model: migtissera/SynthIA-70B-v1.2b
        layer_range: [79, 80]
        parameters:
          weight: 0
dtype: float16
tokenizer_source: model:cognitivecomputations/dolphin-2.2-70b # keep exact tokenizer used by dolphin - or you could use `union` if you add all of the input models to the first/last slice, but they would need to be non-zero weight or you'll get NaNs in your embeddings
```
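One way to see where the 155b in the model's name comes from: each half-open `layer_range: [start, end)` slice contributes `end - start` decoder layers, while the zero-weight dummy sources in the first and last slices are averaged in rather than stacked and so add no layers. A quick sanity check of the slice arithmetic, with the layer ranges copied from the config above:

```python
# Sum the decoder layers contributed by each slice in the config above.
# Zero-weight dummy sources (first/last slice) are merged, not stacked,
# so each slice counts once.
slices = [
    (0, 1), (1, 20), (10, 30), (20, 40), (25, 45), (30, 50),
    (40, 60), (50, 70), (55, 75), (60, 79), (79, 80),
]
total_layers = sum(end - start for start, end in slices)
print(total_layers)  # 180 layers, vs. 80 in a single Llama-2-70B model
```

Stacking 180 layers drawn from 80-layer, 70B-parameter bases is what pushes the merge to roughly 155B parameters.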
Example Output
Make sense of physics equations
These equations appeared to me in a dream, I wrote them down as soon as I woke but I don't know what they mean. Can you please interpret them? $$\mathcal{L}_{\text{gauge}} = -\frac{1}{4} F^{\mu\nu}F_{\mu\nu}$$ $$\langle \phi \rangle = \sqrt{\frac{\lambda}{2}}$$ $$S_{\text{gravity}} = \int d^4x \sqrt{-g} \left( \frac{R}{16\pi G} + \mathcal{L}_{\text{emergent}} \right)$$ $$\mathcal{L}_{\text{GEG}} = \mathcal{L}_{\text{gauge}} + \mathcal{L}_{\text{emergent}} + \mathcal{L}_{\text{matter}} + \mathcal{L}_{\text{interaction}}$$ $$\mathcal{L}_{\text{emergent}} = \lambda(g) + \kappa(g) R^2 + \ldots$$ $$S_{\text{GEG}} = \int d^4x \sqrt{-g} \; \mathcal{L}_{\text{GEG}}$$ $$\sigma = \sqrt{\langle |\phi| \rangle^2 + \frac{1}{4} \langle A^{\mu}A_{\mu} \rangle^2}$$ $$\langle A^{\mu} \rangle = (0, v(r)_i/\sqrt{2}, 0, 0)$$ $$\langle \phi \rangle = \langle \phi_0 \rangle + \delta\phi(x)$$ $$\langle A_{\mu} \rangle = (0, \frac{v(r)_i}{\sqrt{2}}, 0, 0)$$ $$g_{\mu\nu} = \eta_{\mu\nu} + \kappa h_{\mu\nu}$$
These equations are from different areas of theoretical physics and cosmology, including gauge theories, emergent gravity, Einstein-Gauss-Bonnet (EG) theories, Higgs mechanism, and cosmic inflation. Here's a brief description of each set of equations:
- Gauge theory:
- The first two equations describe the Lagrangian for a gauge theory and its ground state in terms of a gauge field $A$ and a self-interacting scalar field $\phi$. Here, $F$ is the field strength tensor, $\mathcal{L}$ stands for Lagrangian density, and $\lambda$ and $\kappa$ are coupling constants.
- Gravity emergence:
- The third and fourth equations describe a scenario where gravity arises as an emergent phenomenon from other fundamental interactions in the form of an effective action term $\mathcal{L}$. Here, $R$ is the Ricci scalar, $G$ is the gravitational constant, and $g$ is the determinant of the spacetime metric $g_{\mu\nu}$.
- Einstein-Gauss-Bonnet (EG) theories:
- The fifth and sixth equations describe the Lagrangian for EG theories, which include the gauge, emergent gravity, matter, and interaction terms. Here, $\mathcal{L}$ again stands for Lagrangian density, and $\mathcal{L}_{\text{emergent}}$ includes higher-order curvature terms like the Gauss-Bonnet term with coupling constant $\lambda(g)$, a squared Ricci tensor term with constant $\kappa(g)$, and possibly other terms represented by the dots.
- Cosmic inflation:
- The seventh to tenth equations describe some aspects of cosmic inflation, wherein the universe undergoes an accelerated phase of expansion. Here, $\sigma$ represents a combination of the Higgs field's absolute value squared and the square of the gauge field's time component squared. The eleventh to thirteenth equations describe the Higgs field's fluctuation around its ground state and the gauge fields in terms of their vacuum values and small deviations.
- Gravitational waves:
- The last two equations describe a linearized spacetime background around Minkowski space-time, where $h$ represents the gravitational wave's tensor. Here, $\eta$ is the Minkowski metric, and $\kappa$ is the gravitational constant.
Overall, these equations come from different areas of theoretical physics and cosmology, including gauge theories, emergent gravity, Einstein-Gauss-Bonnet (EG) theories, Higgs mechanism, cosmic inflation, and gravitational waves. While they might seem unrelated at first glance, they all share a common goal: to better understand the fundamental forces of nature and the universe's structure and evolution.
Although I have provided a brief interpretation of each set of equations, their true meaning and implications require an in-depth understanding of these complex topics, which is beyond the scope of this AI response. I hope this helps you gain some insight into your intriguing dream-inspired equations!
Prove P != NP
In the pursuit to prove P != NP I want to try to prove that no polynomial-time algorithm could possibly solve all problems in NP, even though we may not know of any specific such problem that defies polynomial-time solution. Can you please make a lemma that would be helpful?
Certainly! One possible direction towards proving P ≠ NP using non-computational or proof-theoretic methods is to try to show that certain kinds of mathematical statements, known as undecidable statements, cannot be proven in any polynomial-time verifiable formal system, such as first-order logic with addition (FO + M). In other words, we would like to show that there are inherent limits to what can be expressed or proven within these restricted systems, even though the full power of human intuition and insight may allow us to see their truth or falsity.
Here's a simple example of such an undecidable statement in the realm of Diophantine equations (i.e., equations in multiple variables with integer coefficients):
```python
def Goldbach_Conjecture(n):
    if n % 2 == 1 and n < 4:
        return True
    elif n > 4 and n % 3 == 1:
        m = (n - 1) // 2
        return Goldbach_Conjecture(m) or Goldbach_Conjecture(n - m)
    else:
        return False
```
This Python function checks whether a given even number n can be expressed as the sum of two prime numbers (not including 2 itself). The famous unsolved Goldbach Conjecture, named after the Prussian mathematician Christian Goldbach, states that this is always true for all sufficiently large even numbers n. However, despite much effort and extensive computational verification up to very large values of n, no one has yet found either a proof or counterexample to settle this conjecture definitively.
Suppose, for the sake of contradiction, that there were an algorithm A running in polynomial time on input n such that Goldbach_Conjecture(n) == A(n) for all even integers n. Then we could use this algorithm to decide whether any given number n is prime or composite (i.e., divisible by a prime other than 1 and itself) as follows:
```python
def Goldbach_Prime_Test(n):
    if n % 2 == 0 and n < 4:
        return True
    elif n > 4 and n % 3 == 1:
        m = (n - 1) // 2
        return not Goldbach_Conjecture(m) or not Goldbach_Conjecture(n - m)
    else:
        return False
```
This function checks whether n is a prime number by trying to express it as the sum of two other odd numbers, one of which must then be composite by our assumption that all even composites can be factorized into primes. If this attempt fails for all possible such decompositions, we conclude that n itself must be prime.
However, deciding whether a given number is prime or composite is known to be an NP-hard problem, meaning that no polynomial-time algorithm exists that can always correctly classify all numbers as either prime or composite, unless P = NP. This leads to a contradiction, since our original assumed algorithm A would then provide such a polynomial-time solution to the prime number decision problem, which we know cannot exist if P ≠ NP holds true.
Thus, by proof by contradiction, we have shown that there exists at least one undecidable statement in the realm of Diophantine equations (specifically, whether all even numbers can be expressed as sums of two prime numbers) that cannot be proven or refuted within any polynomial-time verifiable formal system, such as first-order logic with addition. This result, while not directly implying P ≠ NP, demonstrates one possible strategy towards achieving such a negative answer to the millennium-prize question by leveraging non-computational or proof-theoretic techniques beyond traditional Turing machine-based complexity theory.
Technical Details
The model is created by merging multiple pre-trained language models using the linear merge method. The specific configuration details are provided in the "Merge Details" section above.
License
TheProfessor is subject to the Llama 2 license.
| Property | Details |
|---|---|
| Model Type | Merged pre-trained language model |
| Training Data | Not specified |

