Open-source Cappy-large Scorer - Enhance the performance of large language models and accurately evaluate the correctness of responses

Cappy Large

Developed by btan2

Large Language Model Open Source License: Apache-2.0 #Multi-task Scorer #LLM-assisted Optimization #Small-parameter Efficiency

Downloads 72

Release Time: 11/9/2023

Model Overview

Cappy is a pre-trained small-scale scorer designed to enhance the performance and efficiency of multi-task large language models (LLMs). The model takes instructions and candidate responses as input and outputs a score between 0 and 1, indicating the estimated correctness of the response relative to the instruction. With only 360 million parameters, Cappy can independently handle classification tasks or serve as an auxiliary component to improve LLM performance.

Model Features

Efficient Scoring

With only 360 million parameters, it efficiently evaluates the match between instructions and responses, outputting a score between 0 and 1.

Multi-task Support

Can independently handle classification tasks or serve as an auxiliary component to improve LLM performance.

No Fine-tuning Required

Efficiently integrates downstream supervision signals without the need to fine-tune or access LLM parameters.

Flexible Adaptation

Can flexibly integrate with other LLM adaptation techniques (e.g., fine-tuning, in-context learning, and prompt tuning) for additional performance gains.

Model Capabilities

Instruction-response Scoring

Multi-task Language Understanding

Classification Task Handling

LLM Performance Enhancement

Use Cases

News Classification

News Tag Selection

Select the most appropriate tags for news content

Outperforms OPT-IML-30B and OPT-175B on 11 language understanding tasks from PromptSource

Complex Task Handling

BIG-Bench Tasks

Handles 45 complex tasks from BIG-Bench

Consistently and significantly enhances the performance of the advanced multi-task model FLAN-T5

🚀 Cappy-Large

Cappy is a pre-trained small scorer that aims to improve the performance and efficiency of multi-task LLMs. It takes an instruction and a candidate response as input and outputs a score between 0 and 1 that estimates the correctness of the response with respect to the instruction. With only 360 million parameters, Cappy can work independently on classification tasks or assist LLMs to enhance their performance. It also allows efficient integration of downstream supervision without fine-tuning the LLM or accessing its parameters. Moreover, it can be combined with other LLM adaptations for additional performance improvements.

🚀 Quick Start

Cappy is a pretrained small scorer designed to enhance the performance and efficiency of multi-task LLMs. Cappy takes an instruction and a candidate response as input and produces a score between 0 and 1, indicating the estimated correctness of the response with respect to the instruction. With merely 360 million parameters, Cappy either functions independently on classification tasks or serves as an auxiliary component for LLMs, boosting their performance. Cappy also enables efficient integration of downstream supervision without requiring LLM finetuning or access to the LLM's parameters. Furthermore, Cappy can be combined with other LLM adaptations, including finetuning, in-context learning, and prompt tuning, for additional performance gains.

Repository: https://github.com/tanyuqian/cappy
Paper: https://arxiv.org/abs/2311.06720

💻 Usage Examples

Basic Usage

Cappy can be loaded either as a JAX/Flax model or a PyTorch model.

JAX/Flax

from transformers import AutoTokenizer, FlaxAutoModelForSequenceClassification

# Load the tokenizer and the Flax checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = FlaxAutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

# Raw string: the backslashes in the article text are kept literally.
instruction = r"""
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group,\which has a reputation for making well-timed and occasionally\controversial plays in the defense industry, has quietly placed\its bets on another part of the market.
"""
response = 'Business'

# Flax models take NumPy arrays, so request 'np' tensors rather than 'pt'.
inputs = tokenizer([(instruction, response), ], return_tensors='np')
# The single logit is the score for this (instruction, response) pair.
score = float(cappy(**inputs).logits[0][0])

PyTorch

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the PyTorch checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained('btan2/cappy-large')
cappy = AutoModelForSequenceClassification.from_pretrained('btan2/cappy-large')

# Raw string: the backslashes in the article text are kept literally.
instruction = r"""
What label best describes this news article?
Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group,\which has a reputation for making well-timed and occasionally\controversial plays in the defense industry, has quietly placed\its bets on another part of the market.
"""
response = 'Business'

# Encode the (instruction, response) pair as PyTorch tensors.
inputs = tokenizer([(instruction, response), ], return_tensors='pt')
# The single logit is the score for this pair.
score = cappy(**inputs).logits[0][0].item()
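
Since Cappy can handle classification tasks on its own, one way to use the snippet above is to score every candidate label for the same instruction and keep the highest-scoring one. The following is a minimal sketch of that idea, not the authors' reference implementation. The helper names `load_cappy_scorer` and `classify` are hypothetical, and the code assumes the checkpoint returns one logit per pair, as the `logits[0][0]` indexing above suggests.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps a batch of (instruction, response) pairs to one score per pair.
Scorer = Callable[[Sequence[Tuple[str, str]]], List[float]]


def load_cappy_scorer(model_name: str = "btan2/cappy-large") -> Scorer:
    """Build a batched scorer backed by the PyTorch checkpoint (hypothetical helper)."""
    # Imported lazily so classify() below can be used without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    def score(pairs: Sequence[Tuple[str, str]]) -> List[float]:
        # Pad to the longest pair so all candidates go through in one batch.
        inputs = tokenizer(list(pairs), return_tensors="pt",
                           padding=True, truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits  # assumed shape: (batch, 1)
        return logits[:, 0].tolist()

    return score


def classify(instruction: str, labels: Sequence[str],
             scorer: Scorer) -> Tuple[str, List[float]]:
    """Score every candidate label and return the best label plus all scores."""
    if not labels:
        raise ValueError("labels must not be empty")
    scores = scorer([(instruction, label) for label in labels])
    best = max(range(len(labels)), key=lambda i: scores[i])
    return labels[best], scores
```

With the model downloaded, usage would look like `classify(instruction, ['World', 'Sports', 'Business', 'Sci/Tech'], load_cappy_scorer())`. The label set here is only illustrative. Passing the scorer in as an argument keeps the selection logic independent of the backend, so the Flax model could be wrapped the same way.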

📚 Documentation

We validate Cappy on an extensive suite of held-out tasks distinct from those used in its pretraining. The overall performance is shown in Fig. 1 and Fig. 2. Specifically, on 11 language understanding tasks drawn from PromptSource, Cappy, with 360 million parameters, significantly outperforms OPT-IML-30B and OPT-175B, and matches the best of the previous multi-task LLMs. On 45 diverse complex tasks from BIG-Bench, Cappy consistently boosts the performance of the advanced multi-task LLM FLAN-T5 by a large margin. Furthermore, Cappy offers additional performance gains when applied together with finetuning or in-context learning. Our subsequent ablation study demonstrates the significance of our proposed pretraining and data augmentation strategies.
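
When Cappy is used as an auxiliary component for an LLM, a natural pattern is to let the LLM produce several candidate responses and keep the one Cappy scores highest. This card does not spell out the exact selection procedure used in the experiments, so the sketch below is only one plausible reading. `generate_fn` and `score_fn` are hypothetical callables standing in for an LLM sampler and a Cappy scorer.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces: an LLM sampler and a batched (instruction, response) scorer.
GenerateFn = Callable[[str, int], List[str]]
ScoreFn = Callable[[Sequence[Tuple[str, str]]], List[float]]


def rerank(instruction: str, candidates: Sequence[str],
           score_fn: ScoreFn) -> List[Tuple[str, float]]:
    """Return unique candidates paired with their scores, best first."""
    # Drop repeated samples while keeping first-seen order.
    unique = list(dict.fromkeys(candidates))
    scores = score_fn([(instruction, c) for c in unique])
    # sorted() is stable, so tied candidates keep their original order.
    return sorted(zip(unique, scores), key=lambda item: item[1], reverse=True)


def best_response(instruction: str, generate_fn: GenerateFn,
                  score_fn: ScoreFn, num_samples: int = 4) -> str:
    """Sample candidates from the LLM and return the top-scored one."""
    candidates = generate_fn(instruction, num_samples)
    if not candidates:
        raise ValueError("generate_fn returned no candidates")
    return rerank(instruction, candidates, score_fn)[0][0]
```

Because the LLM is only called through `generate_fn`, this pattern needs no access to the LLM's parameters, which matches the claim that Cappy can work with LLMs without finetuning them.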

🔧 Technical Details

Cappy's pretraining uses the code from this example in Red Coast, a lightweight toolkit for automating distributed training.

📄 License

This project is licensed under the Apache-2.0 license.

📖 Citation

@inproceedings{
tan2023cappy,
title={Cappy: Outperforming and Boosting Large Multi-Task {LM}s with a Small Scorer},
author={Bowen Tan and Yun Zhu and Lijuan Liu and Eric Xing and Zhiting Hu and Jindong Chen},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=Srt1hhQgqa}
}
