🚀 pszemraj/pegasus-x-large-book-summary
This model is based on the Transformer architecture and is designed for long-document summarization. It handles long input sequences effectively, achieves strong ROUGE scores on several datasets, and can summarize a wide range of texts, such as earthquake-related passages, academic papers, and lecture transcripts.
🚀 Quick Start
The model can be used to summarize many kinds of text, such as earthquake-related passages, academic papers, and lecture transcripts. Example usage:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("pszemraj/pegasus-x-large-book-summary")
model = AutoModelForSeq2SeqLM.from_pretrained("pszemraj/pegasus-x-large-book-summary")

text = "large earthquakes along a given fault segment do not occur at random intervals because it takes time to accumulate the strain energy for the rupture. The rates at which tectonic plates move and accumulate strain at their boundaries are approximately uniform. Therefore, in first approximation, one may expect that large ruptures of the same fault segment will occur at approximately constant time intervals. If subsequent main shocks have different amounts of slip across the fault, then the recurrence time may vary, and the basic idea of periodic mainshocks must be modified. For great plate boundary ruptures the length and slip often vary by a factor of 2. Along the southern segment of the San Andreas fault the recurrence interval is 145 years with variations of several decades. The smaller the standard deviation of the average recurrence interval, the more specific could be the long term prediction of a future mainshock."

# Tokenize the input and generate a summary with beam search
inputs = tokenizer(text, return_tensors="pt")
summary_ids = model.generate(
    inputs["input_ids"],
    num_beams=2,
    early_stopping=True,
    length_penalty=0.1,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
✨ Key Features
- Long-sequence handling: processes long input sequences (e.g., up to 4096 tokens) at a lower computational cost than conventional Transformer-based models.
- Efficient attention: replaces full attention with block sparse attention, substantially reducing computational complexity.
- Multi-task applicability: suitable for a variety of NLP tasks, particularly long-document summarization.
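Even with long-context support, book-length inputs can exceed the model's input window. A common pattern is to split the document into overlapping chunks, summarize each chunk, and then optionally summarize the concatenated summaries. A minimal word-based sketch of that chunking step (the helper `chunk_text` and its parameters are illustrative, not part of this model's API):

```python
def chunk_text(text: str, max_words: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-based chunks for chunk-wise summarization.

    Note: real tokenizers count subword tokens, not words, so max_words should
    be chosen conservatively relative to the model's token limit.
    """
    words = text.split()
    step = max_words - overlap  # advance by chunk size minus overlap
    chunks = []
    for i in range(0, len(words), step):
        chunks.append(" ".join(words[i : i + max_words]))
        if i + max_words >= len(words):  # last chunk reached the end
            break
    return chunks
```

Each chunk can then be passed through the quick-start pipeline above, and the per-chunk summaries joined into a final input for a second summarization pass.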
📚 Documentation
Model parameters
| Property | Value |
|---|---|
| Max length | 48 |
| Min length | 2 |
| No-repeat n-gram size | 3 |
| Encoder no-repeat n-gram size | 3 |
| Early stopping | Enabled |
| Length penalty | 0.1 |
| Number of beams | 2 |
| Base model | google/pegasus-x-large |
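The parameters above map directly onto `model.generate` keyword arguments in 🤗 Transformers. A small sketch collecting them in one place (the helper name `book_summary_generate_kwargs` is illustrative, not part of the model's API):

```python
def book_summary_generate_kwargs() -> dict:
    """Generation settings mirroring the model-parameter table above."""
    return {
        "max_length": 48,
        "min_length": 2,
        "no_repeat_ngram_size": 3,
        "encoder_no_repeat_ngram_size": 3,  # avoid copying long input n-grams verbatim
        "early_stopping": True,
        "length_penalty": 0.1,
        "num_beams": 2,
    }
```

With the quick-start objects loaded, this would be used as `model.generate(inputs["input_ids"], **book_summary_generate_kwargs())`.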
Datasets and metrics
The model was evaluated on several datasets. Selected results:
samsum dataset

| Metric | Value |
|---|---|
| ROUGE-1 | 33.1401 |
| ROUGE-2 | 9.3095 |
| ROUGE-L | 24.8552 |
| ROUGE-LSUM | 29.0391 |
| loss | 2.288182497024536 |
| gen_len | 45.2173 |
launch/gov_report dataset

| Metric | Value |
|---|---|
| ROUGE-1 | 39.7279 |
| ROUGE-2 | 10.8944 |
| ROUGE-L | 19.7018 |
| ROUGE-LSUM | 36.5634 |
| loss | 2.473011016845703 |
| gen_len | 212.8243 |
billsum dataset

| Metric | Value |
|---|---|
| ROUGE-1 | 42.1065 |
| ROUGE-2 | 15.4079 |
| ROUGE-L | 24.8814 |
| ROUGE-LSUM | 36.0375 |
| loss | 1.9130958318710327 |
| gen_len | 179.2184 |
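The ROUGE-1 scores in the tables above measure unigram overlap between a generated summary and a reference (reported scaled by 100). A toy re-implementation of ROUGE-1 F1 to make the metric concrete; for real evaluation, an established library such as `rouge-score` is typically used:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate summary and a reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 and ROUGE-L follow the same precision/recall/F1 pattern but count bigram matches and longest-common-subsequence matches, respectively.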
📄 License
This model is released under the following licenses:
- Apache-2.0
- BSD-3-Clause