
LLMLingua-2 BERT Base Multilingual Cased MeetingBank

Developed by Microsoft
A prompt-compression token classification model fine-tuned from the multilingual BERT base model, designed for task-agnostic prompt compression
Downloads: 28.67k
Release date: 3/17/2024

Model Overview

This model performs task-agnostic prompt compression as a token classification task: the predicted retention probability of each token serves as the compression metric. It is particularly suitable for compressing texts such as meeting minutes.
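The selection step above can be sketched as follows. This is a minimal illustration, not the library's implementation: the per-token retention probabilities are hypothetical stand-ins for what the fine-tuned BERT token classifier would predict, and tokens below a threshold are simply dropped.

```python
# Sketch of token-classification prompt compression: keep each token
# whose (assumed) retention probability meets a threshold.

def compress(tokens, probs, threshold=0.5):
    """Keep only tokens whose retention probability is >= threshold."""
    return [t for t, p in zip(tokens, probs) if p >= threshold]

# Hypothetical probabilities; in practice they come from the model.
tokens = ["The", "quarterly", "meeting", "was", "held", "on", "Monday"]
probs = [0.2, 0.9, 0.95, 0.1, 0.8, 0.3, 0.9]

print(" ".join(compress(tokens, probs)))
# -> quarterly meeting held Monday
```

Because selection is per token rather than per task, the same compressed prompt can be reused across different downstream tasks.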

Model Features

Task-Agnostic Prompt Compression
Capable of effective prompt compression without relying on specific downstream tasks
Multilingual Support
Based on a multilingual BERT model, supporting text compression in multiple languages
Data Distillation Training
Trained using the data distillation method proposed in LLMLingua-2, which improves compression quality

Model Capabilities

Text Compression
Token Classification
Meeting Minutes Processing
Multilingual Text Processing

Use Cases

Meeting Minutes Processing
Meeting Minutes Compression
Compress lengthy meeting minutes while retaining key information
Significantly reduces text length while maintaining critical information
Downstream Task Preprocessing
Preprocess input text for downstream tasks like Q&A and summarization
Enhances downstream task efficiency without significantly affecting accuracy
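When preprocessing for a downstream task, a fixed compression budget is often more useful than a probability threshold. The sketch below, under the assumption that per-token retention probabilities are available, keeps the top fraction of tokens by probability while preserving their original order; it is an illustration of the idea, not the library's exact selection logic.

```python
import math

def compress_to_rate(tokens, probs, rate=0.33):
    """Keep the ceil(rate * n) highest-probability tokens, in original order."""
    k = max(1, math.ceil(rate * len(tokens)))
    keep = set(sorted(range(len(tokens)), key=lambda i: probs[i], reverse=True)[:k])
    return [t for i, t in enumerate(tokens) if i in keep]

# Hypothetical probabilities; in practice they come from the token classifier.
tokens = ["Minutes", "of", "the", "budget", "review", "session"]
probs = [0.9, 0.1, 0.2, 0.8, 0.7, 0.3]

print(compress_to_rate(tokens, probs, rate=0.5))
# -> ['Minutes', 'budget', 'review']
```

For actual use, the LLMLingua-2 paper's released `llmlingua` package wraps this model behind a prompt-compressor interface that accepts a target compression rate.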