Text Rewriter Paraphraser
This repository houses a fine-tuned text-rewriting model based on T5-Base (223M parameters), designed for effective text paraphrasing.
Features
- Fine-tuned on t5-base: Harnesses the capabilities of a pre-trained text-to-text transfer model for efficient paraphrasing.
- Large Dataset (430k examples): Trained on a comprehensive dataset that combines three open-source datasets and is cleaned using several techniques to ensure optimal performance.
- High-Quality Paraphrases: Generates paraphrases that substantially change the sentence structure while preserving meaning and factual correctness.
- Non-AI Detectable: Aims to create paraphrases that seem natural and are indistinguishable from human-written text.
Model Performance
- Train Loss: 1.0645
- Validation Loss: 0.8761
Quick Start
The T5 model requires a task-related prefix. Since this is a paraphrasing task, we'll add the prefix "paraphraser: " to every input.
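For illustration only (the sentence below is a made-up placeholder), the string actually passed to the tokenizer looks like this:

# Illustrative sketch: only the "paraphraser: " prefix matters; the sentence is arbitrary.
text = "Transfer learning lets a pre-trained model adapt to a new task with less data."
model_input = f"paraphraser: {text}"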
Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("Ateeqq/Text-Rewriter-Paraphraser")
model = AutoModelForSeq2SeqLM.from_pretrained("Ateeqq/Text-Rewriter-Paraphraser").to(device)

def paraphrase(text):
    # Prepend the "paraphraser: " task prefix and tokenize the input.
    input_ids = tokenizer(f'paraphraser: {text}', return_tensors="pt", padding="longest", truncation=True, max_length=64).input_ids.to(device)
    # Diverse beam search: 4 beams split into 4 groups returns 4 distinct paraphrases.
    outputs = model.generate(
        input_ids,
        num_beams=4,
        num_beam_groups=4,
        num_return_sequences=4,
        repetition_penalty=10.0,
        diversity_penalty=3.0,
        no_repeat_ngram_size=2,
        temperature=0.8,
        max_length=64
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

text = 'By leveraging prior model training through transfer learning, fine-tuning can reduce the amount of expensive computing power and labeled data needed to obtain large models tailored to niche use cases and business needs.'
paraphrase(text)
Output
['The fine-tuning can reduce the amount of expensive computing power and labeled data required to obtain large models adapted for niche use cases and business needs by using prior model training through transfer learning.',
'fine-tuning, by utilizing prior model training through transfer learning, can reduce the amount of expensive computing power and labeled data required to obtain large models tailored for niche use cases and business needs.',
'Fine-tunering by using prior model training through transfer learning can reduce the amount of expensive computing power and labeled data required to obtain large models adapted for niche use cases and business needs.',
'Using transfer learning to use prior model training, fine-tuning can reduce the amount of expensive computing power and labeled data required for large models that are suitable in niche usage cases or businesses.']
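As an alternative to calling generate() directly, the same checkpoint can be driven through the high-level transformers pipeline API. The snippet below is a minimal sketch, not part of the original card; the input sentence is borrowed from the widget examples, and the "paraphraser: " prefix is still required:

import torch
from transformers import pipeline

# Sketch: text2text-generation pipeline wrapping the same checkpoint.
device = 0 if torch.cuda.is_available() else -1
paraphraser = pipeline(
    "text2text-generation",
    model="Ateeqq/Text-Rewriter-Paraphraser",
    device=device,
)

# Extra keyword arguments are forwarded to model.generate().
results = paraphraser(
    "paraphraser: In healthcare, Generative AI can help generate synthetic medical data "
    "to train machine learning models, develop new drug candidates, and design clinical trials.",
    num_beams=4,
    num_beam_groups=4,
    num_return_sequences=4,
    repetition_penalty=10.0,
    diversity_penalty=3.0,
    no_repeat_ngram_size=2,
    max_length=64,
)
for candidate in results:
    print(candidate["generated_text"])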
Documentation
Inference Parameters
| Property | Details |
|----------|---------|
| num_beams | 3 |
| num_beam_groups | 3 |
| num_return_sequences | 1 |
| repetition_penalty | 3 |
| diversity_penalty | 3.01 |
| no_repeat_ngram_size | 2 |
| temperature | 0.8 |
| max_length | 64 |
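To reproduce those defaults outside the widget, a minimal sketch (assuming the tokenizer, model, and device from the Basic Usage example are already loaded, and using a placeholder input sentence) would look like:

# Sketch only: plugs the documented inference parameters into model.generate().
# Assumes tokenizer, model, and device are set up as in the Basic Usage example.
input_ids = tokenizer(
    "paraphraser: Your text to rewrite goes here.",
    return_tensors="pt",
    truncation=True,
    max_length=64,
).input_ids.to(device)
outputs = model.generate(
    input_ids,
    num_beams=3,
    num_beam_groups=3,
    num_return_sequences=1,
    repetition_penalty=3.0,
    diversity_penalty=3.01,
    no_repeat_ngram_size=2,
    temperature=0.8,
    max_length=64,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))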
Widget Examples
| Example Title | Text |
|---------------|------|
| AWS course | paraphraser: Learn to build generative AI applications with an expert AWS instructor with the 2-day Developing Generative AI Applications on AWS course. |
| Generative AI | paraphraser: In healthcare, Generative AI can help generate synthetic medical data to train machine learning models, develop new drug candidates, and design clinical trials. |
| Fine Tuning | paraphraser: By leveraging prior model training through transfer learning, fine-tuning can reduce the amount of expensive computing power and labeled data needed to obtain large models tailored to niche use cases and business needs. |
License
This project is licensed under the Apache-2.0 license.
Further Development
For ongoing development and areas for future improvement, see the Discussions tab.