# t5-small-finetuned-summarization-xsum
This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the xsum dataset. It summarizes text quickly while using few compute resources.
## Quick Start
This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the xsum dataset. It is fast and lightweight, able to summarize an entire text in under a second, which makes it well suited to low-resource settings.
### Model Demo
[Click here to access the model demo](https://huggingface.co/spaces/Rahmat82/RHM-text-summarizer-light)
It achieves the following results on the evaluation set:
- Loss: 2.2425
- Rouge1: 31.3222
- Rouge2: 10.0614
- Rougel: 25.0513
- Rougelsum: 25.0446
- Gen Len: 18.8044
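To try the model locally, a minimal sketch (not part of the original card) is to load the checkpoint by name with the `transformers` pipeline:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint directly from the Hugging Face Hub by its model id
summarizer = pipeline(
    "summarization",
    model="Rahmat82/t5-small-finetuned-summarization-xsum",
)
print(summarizer("Paste the text you want to summarize here.")[0]["summary_text"])
```

The full examples in the Usage Examples section below show how to load the model and tokenizer explicitly and how to speed inference up with Optimum.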
## Features
- High-speed summarization: whether on GPU or CPU, it can summarize your text in under a second; using Optimum (see the Advanced Usage example below) may make it even faster.
- Lightweight: the model is small, making it suitable for resource-constrained environments.
## Installation
No specific installation steps are provided in the original card. To run the usage examples below, install the libraries they import: transformers (with PyTorch) for the basic example, plus optimum with ONNX Runtime support and accelerate for the advanced example.
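One plausible install command, inferred from those imports rather than from an official requirements list:

```bash
# Packages inferred from the imports in the usage examples below (not an official requirements file)
pip install transformers torch "optimum[onnxruntime]" accelerate
```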
## Usage Examples
### Basic Usage
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub
model_id = "Rahmat82/t5-small-finetuned-summarization-xsum"
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# Wrap them in a summarization pipeline
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)

text_to_summarize = """
The koala is regarded as the epitome of cuddliness. However, animal lovers
will be saddened to hear that this lovable marsupial has been moved to the
endangered species list. The Australian Koala Foundation estimates there are
somewhere between 43,000 - 100,000 koalas left in the wild. Their numbers have
been dwindling rapidly due to disease, loss of habitat, bushfires, being hit
by cars, and other threats. Stuart Blanch from the World Wildlife Fund in
Australia said: "Koalas have gone from no listing to vulnerable to endangered
within a decade. That is a shockingly fast decline." He added that koalas risk
"sliding toward extinction"
"""

print(summarizer(text_to_summarize)[0]["summary_text"])
```
### Advanced Usage
```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSeq2SeqLM
from optimum.pipelines import pipeline
import accelerate  # needed for device_map="auto"

# Export the checkpoint to ONNX and run it with ONNX Runtime via Optimum
model_name = "Rahmat82/t5-small-finetuned-summarization-xsum"
model = ORTModelForSeq2SeqLM.from_pretrained(model_name, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer,
                      device_map="auto", batch_size=12)

text_to_summarize = """
The koala is regarded as the epitome of cuddliness. However, animal lovers
will be saddened to hear that this lovable marsupial has been moved to the
endangered species list. The Australian Koala Foundation estimates there are
somewhere between 43,000 - 100,000 koalas left in the wild. Their numbers have
been dwindling rapidly due to disease, loss of habitat, bushfires, being hit
by cars, and other threats. Stuart Blanch from the World Wildlife Fund in
Australia said: "Koalas have gone from no listing to vulnerable to endangered
within a decade. That is a shockingly fast decline." He added that koalas risk
"sliding toward extinction"
"""

print(summarizer(text_to_summarize)[0]["summary_text"])
```
## Documentation
### Training hyperparameters
The following hyperparameters were used during training (an illustrative mapping onto the Trainer API is sketched after the list):
- learning_rate: 0.0002
- train_batch_size: 28
- eval_batch_size: 28
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
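As an illustration only, here is one way these settings could be expressed with transformers' `Seq2SeqTrainingArguments`; the `output_dir` and the choice of this particular class are assumptions, not taken from the original training script:

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative mapping of the reported hyperparameters onto Seq2SeqTrainingArguments.
# output_dir is hypothetical; the Adam betas/epsilon reported above are the library
# defaults, so they are not set explicitly here.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-summarization-xsum",  # hypothetical
    learning_rate=2e-4,
    per_device_train_batch_size=28,
    per_device_eval_batch_size=28,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,  # mixed precision (Native AMP)
    predict_with_generate=True,  # generate summaries during evaluation for ROUGE
)
```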
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.5078        | 1.0   | 7288  | 2.2860          | 30.9087 | 9.7673  | 24.6951 | 24.6927   | 18.7973 |
| 2.4245        | 2.0   | 14576 | 2.2425          | 31.3222 | 10.0614 | 25.0513 | 25.0446   | 18.8044 |
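For reference, the ROUGE scores above could be re-estimated along the following lines; this is a sketch assuming the `datasets` and `evaluate` libraries and a small validation sample, not the original evaluation script:

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# Hypothetical re-evaluation sketch: score model outputs with ROUGE on a small
# sample of the xsum validation split (this is not the original evaluation code).
summarizer = pipeline("summarization", model="Rahmat82/t5-small-finetuned-summarization-xsum")
rouge = evaluate.load("rouge")

sample = load_dataset("xsum", split="validation[:100]")  # small sample for illustration
predictions = [out["summary_text"] for out in summarizer(sample["document"], truncation=True)]
scores = rouge.compute(predictions=predictions, references=sample["summary"])
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```

Because the numbers in the table were measured on the full evaluation set, a small sample like this will only approximate them.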
### Framework versions
- Transformers 4.37.0
- Pytorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.15.1
## License
This model is released under the Apache-2.0 license.
## Model Information
| Property      | Details                                                |
|---------------|--------------------------------------------------------|
| Model Type    | Fine-tuned version of t5-small on the xsum dataset     |
| Training Data | xsum                                                   |
| Metrics       | Rouge                                                  |
| Pipeline Tag  | summarization                                          |