🚀 Text Rewriter (Paraphraser)
This repository contains a text-rewriting model fine-tuned from T5-Base, with 223 million parameters. The model rewrites text effectively, providing users with high-quality paraphrases.
✨ Key Features
- Fine-tuned from T5-Base: builds on a pretrained text-to-text transformer to deliver effective paraphrasing.
- Large training set (430K examples): trained on a comprehensive dataset that combines three open-source sources and was cleaned with several techniques to ensure optimal performance.
- High-quality paraphrases: produces rewrites that substantially change sentence structure while remaining accurate and factually correct.
- Hard for AI detectors to flag: designed to generate natural-sounding paraphrases that are difficult to distinguish from human-written text.
📦 Installation
The documentation does not specify installation steps.
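As a reasonable assumption (not stated on the original card), the usage example below only needs the standard Hugging Face stack, e.g. `pip install transformers torch sentencepiece`; the `sentencepiece` package is typically required for T5 tokenizers.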
💻 Usage Examples
Basic Usage
T5 models expect a task-specific prefix; since this is a paraphrasing task, the prefix `paraphraser: ` is prepended to the input.
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("Ateeqq/Text-Rewriter-Paraphraser")
model = AutoModelForSeq2SeqLM.from_pretrained("Ateeqq/Text-Rewriter-Paraphraser").to(device)

def generate_title(text):
    # Prepend the task prefix and tokenize the input sentence.
    input_ids = tokenizer(
        f'paraphraser: {text}',
        return_tensors="pt",
        padding="longest",
        truncation=True,
        max_length=64,
    ).input_ids.to(device)
    # Diverse beam search: 4 beams split into 4 groups, returning 4 candidates.
    outputs = model.generate(
        input_ids,
        num_beams=4,
        num_beam_groups=4,
        num_return_sequences=4,
        repetition_penalty=10.0,
        diversity_penalty=3.0,
        no_repeat_ngram_size=2,
        temperature=0.8,
        max_length=64,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

text = 'By leveraging prior model training through transfer learning, fine-tuning can reduce the amount of expensive computing power and labeled data needed to obtain large models tailored to niche use cases and business needs.'
generate_title(text)
```
Example output:
```
['The fine-tuning can reduce the amount of expensive computing power and labeled data required to obtain large models adapted for niche use cases and business needs by using prior model training through transfer learning.',
 'fine-tuning, by utilizing prior model training through transfer learning, can reduce the amount of expensive computing power and labeled data required to obtain large models tailored for niche use cases and business needs.',
 'Fine-tunering by using prior model training through transfer learning can reduce the amount of expensive computing power and labeled data required to obtain large models adapted for niche use cases and business needs.',
 'Using transfer learning to use prior model training, fine-tuning can reduce the amount of expensive computing power and labeled data required for large models that are suitable in niche usage cases or businesses.']
```
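The snippet uses diverse beam search (`num_beams=4` split into `num_beam_groups=4` with a `diversity_penalty`), which is why the four returned candidates differ noticeably in structure, while `repetition_penalty` and `no_repeat_ngram_size` discourage the output from echoing the input verbatim.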
📚 Documentation
Inference Parameters
| Parameter | Value |
|-----------|-------|
| Number of beams (`num_beams`) | 3 |
| Number of beam groups (`num_beam_groups`) | 3 |
| Number of returned sequences (`num_return_sequences`) | 1 |
| Repetition penalty (`repetition_penalty`) | 3 |
| Diversity penalty (`diversity_penalty`) | 3.01 |
| No-repeat n-gram size (`no_repeat_ngram_size`) | 2 |
| Temperature (`temperature`) | 0.8 |
| Maximum length (`max_length`) | 64 |
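As a minimal sketch (not part of the original card), the documented settings above map onto the standard `transformers` `generate()` arguments as follows, assuming `model`, `tokenizer`, and `device` are set up as in the usage example:

```python
# Sketch: plugging the documented inference parameters into generate().
input_ids = tokenizer(
    "paraphraser: Your sentence to rewrite goes here.",
    return_tensors="pt",
    truncation=True,
    max_length=64,
).input_ids.to(device)

outputs = model.generate(
    input_ids,
    num_beams=3,
    num_beam_groups=3,
    num_return_sequences=1,
    repetition_penalty=3.0,
    diversity_penalty=3.01,
    no_repeat_ngram_size=2,
    temperature=0.8,
    max_length=64,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```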
Example Texts
| Example Title | Input Text |
|---------------|------------|
| AWS course | paraphraser: Learn to build generative AI applications with an expert AWS instructor with the 2-day Developing Generative AI Applications on AWS course. |
| Generative AI | paraphraser: In healthcare, Generative AI can help generate synthetic medical data to train machine learning models, develop new drug candidates, and design clinical trials. |
| Fine-tuning | paraphraser: By leveraging prior model training through transfer learning, fine-tuning can reduce the amount of expensive computing power and labeled data needed to obtain large models tailored to niche use cases and business needs. |
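As a small illustration (not from the original card), the example prompts above can be run through the `generate_title()` helper defined earlier; note that the helper already prepends the `paraphraser: ` prefix, so the raw sentences are passed without it:

```python
# Hypothetical driver for the example prompts above; generate_title()
# already prepends the "paraphraser: " prefix.
examples = {
    "AWS course": "Learn to build generative AI applications with an expert AWS instructor with the 2-day Developing Generative AI Applications on AWS course.",
    "Generative AI": "In healthcare, Generative AI can help generate synthetic medical data to train machine learning models, develop new drug candidates, and design clinical trials.",
    "Fine-tuning": "By leveraging prior model training through transfer learning, fine-tuning can reduce the amount of expensive computing power and labeled data needed to obtain large models tailored to niche use cases and business needs.",
}
for title, sentence in examples.items():
    print(title)
    for candidate in generate_title(sentence):
        print(" -", candidate)
```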
📄 License
This project is released under the Apache-2.0 license.
🔧 Technical Details
The documentation does not provide specific implementation details, so none are shown here.
🔜 Future Development
(Mention any ongoing development or directions for future improvement in the Discussions section.)