Model Card for CoEdIT-xxl
This model card describes CoEdIT-xxl, a text-editing model fine-tuned from google/flan-t5-xxl on the CoEdIT dataset. It covers the model's details, usage, and citation information.
Quick Start
The CoEdIT-xxl model is ready to use out of the box. The snippet below loads it and corrects the grammar of a sample sentence:
```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("grammarly/coedit-xxl")
model = T5ForConditionalGeneration.from_pretrained("grammarly/coedit-xxl")

# Prepend the edit instruction to the text you want revised.
input_text = "Fix grammatical errors in this sentence: When I grow up, I start to understand what he said is quite right."
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(input_ids, max_length=256)
edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(edited_text)
```
Features
- Text Revision: Given an edit instruction and a source text, the model generates the revised version of the text.
- Multiple Model Sizes: Available in three sizes (CoEdIT-large, CoEdIT-xl, CoEdIT-xxl) with varying parameter counts; see the loading sketch after this list and the Model Sizes table below.
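All three sizes share the same interface, so switching checkpoints is only a name change. A minimal sketch, assuming the sibling checkpoints are published under the same grammarly/ naming scheme as this one:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Sibling checkpoints: grammarly/coedit-large (770M) and
# grammarly/coedit-xl (3B); pick whichever fits your hardware.
checkpoint = "grammarly/coedit-large"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)
```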
Documentation
Installation
The model is loaded through the transformers library, so make sure it is installed in your Python environment.
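For example (assuming a PyTorch backend; sentencepiece is included because T5-family tokenizers typically depend on it):

```bash
pip install transformers torch sentencepiece
```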
Usage Examples
Basic Usage
Basic usage is identical to the Quick Start snippet above: prepend an edit instruction to the text you want revised and decode the generated output.
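CoEdIT is instruction-tuned for several edit intents; the paper covers tasks such as grammar correction, coherence, simplification, paraphrasing, and formality transfer. The sketch below tries a few of these. The instruction phrasings are illustrative examples in the spirit of the CoEdIT dataset, not a canonical or exhaustive list:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("grammarly/coedit-xxl")
model = T5ForConditionalGeneration.from_pretrained("grammarly/coedit-xxl")

# Illustrative instructions in the style of the CoEdIT tasks
# (coherence, simplification, paraphrasing).
examples = [
    "Make this text coherent: Their flight is weak. They run quickly through the tree canopy.",
    "Write a simpler version for the sentence: Although they were losing badly, the team refused to capitulate.",
    "Paraphrase this sentence: The meeting was postponed because of the storm.",
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    outputs = model.generate(input_ids, max_length=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```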
Technical Details
Model Details

| Property | Details |
|----------|---------|
| Model Type | CoEdIT-xxl, fine-tuned from google/flan-t5-xxl |
| Language(s) (NLP) | English |
| Finetuned from model | google/flan-t5-xxl |
| Repository | https://github.com/vipulraheja/coedit |
| Paper | https://arxiv.org/abs/2305.09857 |
Model Sizes

| Model | Number of parameters |
|--------------|----------------------|
| CoEdIT-large | 770M |
| CoEdIT-xl | 3B |
| CoEdIT-xxl | 11B |
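At 11B parameters, the xxl checkpoint needs roughly 44 GB of memory in fp32 (4 bytes per parameter), or about half that in bf16, so reduced-precision loading is often necessary. A sketch of one common approach; note that device_map="auto" additionally requires the accelerate package:

```python
import torch
from transformers import T5ForConditionalGeneration

# bf16 halves memory relative to fp32 (~22 GB for 11B parameters);
# device_map="auto" spreads the weights across available devices.
model = T5ForConditionalGeneration.from_pretrained(
    "grammarly/coedit-xxl",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```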
License
This model is released under the cc-by-nc-4.0 license.
Datasets
The model was trained on the following datasets:
- facebook/asset
- wi_locness
- GEM/wiki_auto_asset_turk
- discofuse
- zaemyung/IteraTeR_plus
- jfleg
- grammarly/coedit
Metrics
Evaluation metrics and results for the model are reported in the paper linked above.
Citation
BibTeX:
```bibtex
@article{raheja2023coedit,
  title={CoEdIT: Text Editing by Task-Specific Instruction Tuning},
  author={Vipul Raheja and Dhruv Kumar and Ryan Koo and Dongyeop Kang},
  year={2023},
  eprint={2305.09857},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
APA:
Raheja, V., Kumar, D., Koo, R., & Kang, D. (2023). CoEdIT: Text Editing by Task-Specific Instruction Tuning. arXiv. https://arxiv.org/abs/2305.09857