Text Rewriter Paraphraser
This repository houses a fine-tuned text-rewriting model based on T5-Base (223M parameters), designed for effective text paraphrasing.
Features
- Fine-tuned on t5-base: Harnesses the capabilities of a pre-trained text-to-text transfer model for efficient paraphrasing.
- Large Dataset (430k examples): Trained on a comprehensive dataset that combines three open-source datasets and is cleaned using several techniques to ensure optimal performance.
- High-Quality Paraphrases: Generates paraphrases that substantially change the sentence structure while preserving meaning and factual correctness.
- Non-AI Detectable: Aims to create paraphrases that seem natural and are indistinguishable from human-written text.
Model Performance
- Train Loss: 1.0645
- Validation Loss: 0.8761
Quick Start
The T5 model requires a task-related prefix. Since this is a paraphrasing task, we'll add the prefix "paraphraser: " to every input.
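For illustration only (the sentence below is a made-up placeholder), the string actually passed to the tokenizer looks like this:

# Illustrative sketch: only the "paraphraser: " prefix matters; the sentence is arbitrary.
text = "Transfer learning lets a pre-trained model adapt to a new task with less data."
model_input = f"paraphraser: {text}"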
Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("Ateeqq/Text-Rewriter-Paraphraser")
model = AutoModelForSeq2SeqLM.from_pretrained("Ateeqq/Text-Rewriter-Paraphraser").to(device)

def paraphrase(text):
    # Prepend the "paraphraser: " task prefix and tokenize the input.
    input_ids = tokenizer(f'paraphraser: {text}', return_tensors="pt", padding="longest", truncation=True, max_length=64).input_ids.to(device)
    # Diverse beam search: 4 beams split into 4 groups returns 4 distinct paraphrases.
    outputs = model.generate(
        input_ids,
        num_beams=4,
        num_beam_groups=4,
        num_return_sequences=4,
        repetition_penalty=10.0,
        diversity_penalty=3.0,
        no_repeat_ngram_size=2,
        temperature=0.8,
        max_length=64
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

text = 'By leveraging prior model training through transfer learning, fine-tuning can reduce the amount of expensive computing power and labeled data needed to obtain large models tailored to niche use cases and business needs.'
paraphrase(text)
Output
['The fine-tuning can reduce the amount of expensive computing power and labeled data required to obtain large models adapted for niche use cases and business needs by using prior model training through transfer learning.',
'fine-tuning, by utilizing prior model training through transfer learning, can reduce the amount of expensive computing power and labeled data required to obtain large models tailored for niche use cases and business needs.',
'Fine-tunering by using prior model training through transfer learning can reduce the amount of expensive computing power and labeled data required to obtain large models adapted for niche use cases and business needs.',
'Using transfer learning to use prior model training, fine-tuning can reduce the amount of expensive computing power and labeled data required for large models that are suitable in niche usage cases or businesses.']
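As an alternative to calling generate() directly, the same checkpoint can be driven through the high-level transformers pipeline API. The snippet below is a minimal sketch, not part of the original card; the input sentence is borrowed from the widget examples, and the "paraphraser: " prefix is still required:

import torch
from transformers import pipeline

# Sketch: text2text-generation pipeline wrapping the same checkpoint.
device = 0 if torch.cuda.is_available() else -1
paraphraser = pipeline(
    "text2text-generation",
    model="Ateeqq/Text-Rewriter-Paraphraser",
    device=device,
)

# Extra keyword arguments are forwarded to model.generate().
results = paraphraser(
    "paraphraser: In healthcare, Generative AI can help generate synthetic medical data "
    "to train machine learning models, develop new drug candidates, and design clinical trials.",
    num_beams=4,
    num_beam_groups=4,
    num_return_sequences=4,
    repetition_penalty=10.0,
    diversity_penalty=3.0,
    no_repeat_ngram_size=2,
    max_length=64,
)
for candidate in results:
    print(candidate["generated_text"])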
Documentation
Inference Parameters
| Property | Details |
|----------|---------|
| num_beams | 3 |
| num_beam_groups | 3 |
| num_return_sequences | 1 |
| repetition_penalty | 3 |
| diversity_penalty | 3.01 |
| no_repeat_ngram_size | 2 |
| temperature | 0.8 |
| max_length | 64 |
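To reproduce those defaults outside the widget, a minimal sketch (assuming the tokenizer, model, and device from the Basic Usage example are already loaded, and using a placeholder input sentence) would look like:

# Sketch only: plugs the documented inference parameters into model.generate().
# Assumes tokenizer, model, and device are set up as in the Basic Usage example.
input_ids = tokenizer(
    "paraphraser: Your text to rewrite goes here.",
    return_tensors="pt",
    truncation=True,
    max_length=64,
).input_ids.to(device)
outputs = model.generate(
    input_ids,
    num_beams=3,
    num_beam_groups=3,
    num_return_sequences=1,
    repetition_penalty=3.0,
    diversity_penalty=3.01,
    no_repeat_ngram_size=2,
    temperature=0.8,
    max_length=64,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))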
Widget Examples
| Example Title | Text |
|---------------|------|
| AWS course | paraphraser: Learn to build generative AI applications with an expert AWS instructor with the 2-day Developing Generative AI Applications on AWS course. |
| Generative AI | paraphraser: In healthcare, Generative AI can help generate synthetic medical data to train machine learning models, develop new drug candidates, and design clinical trials. |
| Fine Tuning | paraphraser: By leveraging prior model training through transfer learning, fine-tuning can reduce the amount of expensive computing power and labeled data needed to obtain large models tailored to niche use cases and business needs. |
License
This project is licensed under the Apache-2.0 license.
Further Development
For ongoing development and areas for future improvement, see the Discussions tab.