# 🚀 dolly-v2-12b Model Card

`dolly-v2-12b` is an instruction-following large language model developed by Databricks. It was trained on the Databricks machine learning platform and is licensed for commercial use. The model is based on `pythia-12b` and fine-tuned on a dataset of about 15k instruction/response records.
## ✨ Features

- **Commercial Use License**: `dolly-v2-12b` is licensed for commercial use, making it accessible for various business applications.
- **Instruction-Following Capability**: It can follow instructions in multiple domains such as brainstorming, classification, and summarization.
- **Available in Multiple Sizes**: Besides the 12-billion-parameter version, smaller models such as `dolly-v2-7b` and `dolly-v2-3b` are also available.
## 📦 Installation

To use the model with the `transformers` library on a machine with GPUs, first ensure you have the `transformers` and `accelerate` libraries installed. In a Databricks notebook, you can run the following command:

```
%pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"
```
## 💻 Usage Examples

### Basic Usage

```python
import torch
from transformers import pipeline

generate_text = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16,
                         trust_remote_code=True, device_map="auto")

res = generate_text("Explain to me the difference between nuclear fission and fusion.")
print(res[0]["generated_text"])
```
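If the custom Dolly pipeline forwards generation keyword arguments the way the stock text-generation pipeline does (an assumption, not something this card states), you can tune the output with standard `transformers` `generate()` arguments. A minimal sketch with illustrative, not recommended, values:

```python
# Standard transformers generate() arguments, assumed to be forwarded
# by the pipeline to model.generate().
res = generate_text(
    "Explain to me the difference between nuclear fission and fusion.",
    max_new_tokens=256,  # cap the length of the generated response
    do_sample=True,      # sample rather than decode greedily
    temperature=0.7,     # soften the sampling distribution
)
print(res[0]["generated_text"])
```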
### Advanced Usage

If you prefer not to use `trust_remote_code=True`, you can download `instruct_pipeline.py`, store it alongside your notebook, and construct the pipeline yourself:

```python
import torch
from instruct_pipeline import InstructionTextGenerationPipeline
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-12b", padding_side="left")
model = AutoModelForCausalLM.from_pretrained("databricks/dolly-v2-12b", device_map="auto", torch_dtype=torch.bfloat16)

generate_text = InstructionTextGenerationPipeline(model=model, tokenizer=tokenizer)
```
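The resulting pipeline is then called the same way as in the basic example above (assuming it returns the same list-of-records output with a `generated_text` field, as the stock pipeline does):

```python
# Same call pattern as the basic example, assuming matching output format.
res = generate_text("Explain to me the difference between nuclear fission and fusion.")
print(res[0]["generated_text"])
```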
### LangChain Usage

To use the pipeline with LangChain, you must set `return_full_text=True`, because LangChain expects the full text to be returned and the pipeline's default is to return only the newly generated text:

```python
import torch
from transformers import pipeline

generate_text = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16,
                         trust_remote_code=True, device_map="auto", return_full_text=True)
```
Create prompts, with or without context, and use them for prediction:

```python
from langchain import PromptTemplate, LLMChain
from langchain.llms import HuggingFacePipeline

# Template for an instruction alone
prompt = PromptTemplate(
    input_variables=["instruction"],
    template="{instruction}")

# Template for an instruction that references a context passage
prompt_with_context = PromptTemplate(
    input_variables=["instruction", "context"],
    template="{instruction}\n\nInput:\n{context}")

hf_pipeline = HuggingFacePipeline(pipeline=generate_text)

llm_chain = LLMChain(llm=hf_pipeline, prompt=prompt)
llm_context_chain = LLMChain(llm=hf_pipeline, prompt=prompt_with_context)

print(llm_chain.predict(instruction="Explain to me the difference between nuclear fission and fusion.").lstrip())

context = """George Washington (February 22, 1732[b] - December 14, 1799) was an American military officer, statesman,
and Founding Father who served as the first president of the United States from 1789 to 1797."""

print(llm_context_chain.predict(instruction="When was George Washington president?", context=context).lstrip())
```
## 📚 Documentation

### Model Overview

`dolly-v2-12b` is a 12-billion-parameter causal language model created by Databricks. It is derived from EleutherAI's `pythia-12b` and fine-tuned on a ~15k-record instruction corpus generated by Databricks employees and released under a permissive license (CC-BY-SA).
### Known Limitations

#### Performance Limitations

`dolly-v2-12b` is not a state-of-the-art generative language model. It struggles with syntactically complex prompts, programming problems, mathematical operations, and similar tasks.

#### Dataset Limitations

- **The Pile**: The pre-training corpus contains content from the public internet, which may include objectionable material. The model may reflect these shortcomings.
- **`databricks-dolly-15k`**: The training data was generated by Databricks employees from March to April 2023. It may contain typos, factual errors, and biases drawn from Wikipedia.
### Benchmark Metrics

| model | openbookqa | arc_easy | winogrande | hellaswag | arc_challenge | piqa | boolq | gmean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EleutherAI/pythia-2.8b | 0.348 | 0.585859 | 0.589582 | 0.591217 | 0.323379 | 0.73395 | 0.638226 | 0.523431 |
| EleutherAI/pythia-6.9b | 0.368 | 0.604798 | 0.608524 | 0.631548 | 0.343857 | 0.761153 | 0.6263 | 0.543567 |
| databricks/dolly-v2-3b | 0.384 | 0.611532 | 0.589582 | 0.650767 | 0.370307 | 0.742655 | 0.575535 | 0.544886 |
| EleutherAI/pythia-12b | 0.364 | 0.627104 | 0.636148 | 0.668094 | 0.346416 | 0.760065 | 0.673394 | 0.559676 |
| EleutherAI/gpt-j-6B | 0.382 | 0.621633 | 0.651144 | 0.662617 | 0.363481 | 0.761153 | 0.655963 | 0.565936 |
| databricks/dolly-v2-12b | 0.408 | 0.63931 | 0.616417 | 0.707927 | 0.388225 | 0.757889 | 0.568196 | 0.56781 |
| databricks/dolly-v2-7b | 0.392 | 0.633838 | 0.607735 | 0.686517 | 0.406997 | 0.750816 | 0.644037 | 0.573487 |
| databricks/dolly-v1-6b | 0.41 | 0.62963 | 0.643252 | 0.676758 | 0.384812 | 0.773667 | 0.687768 | 0.583431 |
| EleutherAI/gpt-neox-20b | 0.402 | 0.683923 | 0.656669 | 0.7142 | 0.408703 | 0.784004 | 0.695413 | 0.602236 |
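Scores like these are commonly produced with EleutherAI's lm-evaluation-harness. A minimal reproduction sketch, assuming the harness's v0.4+ Python API (`simple_evaluate`, the `hf` backend, and the task names are harness conventions, not part of this model card, and exact numbers can shift across harness versions):

```python
# Hypothetical benchmark run with EleutherAI's lm-evaluation-harness (v0.4+).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face causal-LM backend
    model_args="pretrained=databricks/dolly-v2-12b,dtype=bfloat16",
    tasks=["openbookqa", "arc_easy", "winogrande", "hellaswag",
           "arc_challenge", "piqa", "boolq"],
)

# Print the per-task metric dictionaries.
for task, metrics in results["results"].items():
    print(task, metrics)
```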
## 📄 License

The model is licensed under the MIT license.
## 📚 Citation

```bibtex
@online{DatabricksBlog2023DollyV2,
    author  = {Mike Conover and Matt Hayes and Ankit Mathur and Jianwei Xie and Jun Wan and Sam Shah and Ali Ghodsi and Patrick Wendell and Matei Zaharia and Reynold Xin},
    title   = {Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM},
    year    = {2023},
    url     = {https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm},
    urldate = {2023-06-30}
}
```
Happy Hacking!