Llemma
Llemma is a language model designed for mathematics. It addresses the challenge of accurate mathematical reasoning and computation in natural language processing. By leveraging advanced training techniques and high-quality datasets, it delivers strong results on mathematical tasks such as chain-of-thought reasoning and tool-assisted problem solving.
✨ Features
- Strong Mathematical Reasoning: Particularly proficient in chain-of-thought mathematical reasoning.
- Tool Utilization: Capable of using computational tools like Python and formal theorem provers for mathematics.
- Multiple Parameter Versions: Available in 7B and 34B parameter versions to suit different application scenarios.
🚀 Quick Start
Llemma 7B is initialized with Code Llama 7B weights and trained on [Proof-Pile-2](https://huggingface.co/datasets/EleutherAI/proof-pile-2) for 200B tokens. A 34B parameter version, Llemma 34B, is also available.
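As a minimal loading-and-generation sketch with Hugging Face Transformers, assuming the 7B weights are published under the model id `EleutherAI/llemma_7b` (swap in `EleutherAI/llemma_34b` for the larger model):

```python
# Minimal sketch: load Llemma and generate a completion.
# Assumes the model id "EleutherAI/llemma_7b"; adjust if your checkpoint differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/llemma_7b")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/llemma_7b",
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",
)

prompt = "Problem: What is the derivative of x^2 * sin(x)?\nSolution:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```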
📚 Documentation
Evaluations
Llemma models show excellent performance in various mathematical evaluations.
Chain-of-thought Math
On chain-of-thought mathematics tasks, Llemma models outperform Llama-2 and Code Llama, and outperform Minerva when compared at roughly the same model size.
| Model      | Size | GSM8k | OCW   | MMLU-STEM | SAT   | MATH  |
|------------|------|-------|-------|-----------|-------|-------|
| Llama 2    | 7B   | 11.8% | 3.7%  | 29.9%     | 25%   | 3.2%  |
| Code Llama | 7B   | 10.5% | 4.4%  | 25.1%     | 9.4%  | 4.5%  |
| LLEMMA     | 7B   | 36.4% | 7.7%  | 37.7%     | 53.1% | 18.0% |
| Minerva    | 8B   | 16.2% | 7.7%  | 35.6%     | -     | 14.1% |
| Code Llama | 34B  | 29.6% | 7.0%  | 40.5%     | 40.6% | 12.2% |
| LLEMMA     | 34B  | 51.5% | 11.8% | 49.0%     | 71.9% | 25.0% |
| Minerva    | 62B  | 52.4% | 12.0% | 53.9%     | -     | 27.6% |
| Minerva    | 540B | 58.8% | 17.6% | 63.9%     | -     | 33.6% |
Further performance can be extracted with majority voting (maj@k): sample k solutions per problem and keep the most common final answer (a small sketch of the scoring rule follows the table below).
| Model   | Size | GSM8k maj@100 | OCW maj@100 | MMLU-STEM maj@16 | SAT maj@16 | MATH maj@256 |
|---------|------|---------------|-------------|------------------|------------|--------------|
| LLEMMA  | 7B   | 54.0%         | 14.3%       | 49.9%            | 78.1%      | 33.5%        |
| Minerva | 8B   | 28.4%         | 12.5%       | 43.4%            | -          | 25.4%        |
| LLEMMA  | 34B  | 69.3%         | 18.4%       | 59.7%            | 81.3%      | 43.1%        |
| Minerva | 62B  | 68.5%         | 23.5%       | 63.5%            | -          | 43.4%        |
| Minerva | 540B | 78.5%         | 30.8%       | 75.0%            | -          | 50.3%        |
Tool Use and Theorem Proving
In addition to chain-of-thought reasoning, Llemma has strong capabilities in computational mathematics tasks. For the tool-use and formal theorem-proving evaluations, see our paper; a rough illustration of tool-assisted solving appears below.
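The following is an illustrative sketch only, not the paper's evaluation harness: it assumes a workflow in which the model is prompted to emit Python code whose execution yields the answer. The helper name `solve_with_python` and the prompt format are hypothetical.

```python
# Illustrative sketch, NOT the paper's harness: prompt the model to write
# Python for a problem, then execute the emitted code to read off the answer.
def solve_with_python(model, tokenizer, problem: str) -> str:
    # Hypothetical prompt format: steer the model toward assigning `answer`.
    prompt = (
        "# Solve the problem below with Python; store the result in `answer`.\n"
        f"# Problem: {problem}\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    code = tokenizer.decode(output[0], skip_special_tokens=True)
    namespace: dict = {}
    exec(code, namespace)  # caution: runs model-generated code; sandbox in practice
    return str(namespace.get("answer"))
```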
Citation
```bibtex
@misc{azerbayev2023llemma,
  title={Llemma: An Open Language Model For Mathematics},
  author={Zhangir Azerbayev and Hailey Schoelkopf and Keiran Paster and Marco Dos Santos and Stephen McAleer and Albert Q. Jiang and Jia Deng and Stella Biderman and Sean Welleck},
  year={2023},
  eprint={2310.10631},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
📄 License
This project is released under the Llama 2 license.
Additional Information
- Datasets:
  - EleutherAI/proof-pile-2
  - open-web-math/open-web-math
- Language: en
- Tags: math, reasoning

ArXiv | Models | [Data](https://huggingface.co/datasets/EleutherAI/proof-pile-2) | [Code](https://github.com/EleutherAI/math-lm) | Blog | [Sample Explorer](https://llemma-demo.github.io/)
Authors: [Zhangir Azerbayev](https://zhangir-azerbayev.github.io/), Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, Albert Q. Jiang, Jia Deng, Stella Biderman, Sean Welleck