flan-t5-portuguese-small-summarization Open Source Model - Easily Generate Portuguese Text Summaries

Flan T5 Portuguese Small Summarization

Developed by rhaymison

A Portuguese summarization model fine-tuned from google/flan-t5-small. It is small in size but performs well on Portuguese text summarization tasks.

Text Generation

Transformers

Open Source License: Apache-2.0 #Portuguese summarization #Small-scale efficiency #Medical text optimization

Downloads 45

Release Time: 3/18/2024

Model Overview

This model is specifically optimized for Portuguese text summarization tasks, capable of generating concise and accurate summaries. Due to its small size, occasional accent errors may occur.

Model Features

Portuguese optimization

Specially trained and optimized for Portuguese text, suitable for Portuguese content processing

Lightweight model

Based on the flan-t5-small architecture, with a small model size, suitable for resource-limited environments

Summarization generation

Focused on text summarization tasks, capable of generating concise and accurate summaries

Model Capabilities

Text summarization generation

Portuguese text processing

Use Cases

News summarization

Political news summarization

Summarize political news articles to extract key information

As shown in the example, it can accurately extract key information such as election results

Medical/Psychology content summarization

Medical research summarization

Summarize medical research content to extract key findings

As shown in the example, it can accurately summarize key features of body dysmorphic disorder

🚀 Flan-T5 Portuguese Small Summarization

This model addresses the need for Portuguese-language models built for specific tasks, and it is tuned for summarization. Because the model is small, its output may occasionally contain word accentuation errors.

🚀 Quick Start

First, install the transformers library (the leading "!" runs the command from a notebook cell; omit it in a regular shell):

!pip install transformers

Then, use the following code to perform text summarization:

from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
summarization = pipeline("summarization", model="rhaymison/flan-t5-portuguese-small-summarization", tokenizer="rhaymison/flan-t5-portuguese-small-summarization")

# The prompt starts with the task prefix "sumarize: ".
prompt = """
sumarize: No que consiste o transtorno dismórfico corporal? São pessoas que se acham feias e querem mudar sua aparência de forma obsessiva, mesmo que não tenham nenhum problema. Num dos estudos que fiz, detectamos que de 50% a 54% dos pacientes que procuram cirurgia de face, nariz ou abdômen apresentam essa condição. A cirurgia pode beneficiar aqueles com um quadro leve ou intermediário do transtorno. No entanto, os que apresentam um transtorno mais grave não devem ser operados, e sim encaminhados para tratamento psicológico. A maior dificuldade é que aceitem ajuda. Muitos preferem buscar um médico que dê sinal verde para a intervenção.
"""
output = summarization(prompt)
print(output)

# Sample output, reproduced as generated (including its errors):
#Transtorno dismórfico corporal: o que apresenta o transtorno no deve ser operados, e sim encaminhados para tratamento psicológico. 
#A cirurgia pode beneficiar aqueles com um quadro leve ou intermediário do transtornamento, nariz ou abdômen.
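
The summarization pipeline returns a list with one dictionary per input, and the generated text sits under the "summary_text" key. The sketch below shows one way to pull the string out of that structure. The helper name first_summary and the stand-in result are hypothetical, used only so the snippet runs without downloading the model.

```python
from typing import Dict, List


def first_summary(output: List[Dict[str, str]]) -> str:
    """Return the summary string from a summarization pipeline result."""
    if not output:
        raise ValueError("pipeline returned no results")
    # Each result is a dict; the generated text is under "summary_text".
    return output[0]["summary_text"].strip()


# Hypothetical stand-in for the `output` variable produced by the pipeline call.
example_output = [{"summary_text": " Resumo de exemplo. "}]
print(first_summary(example_output))
```

With the real pipeline, first_summary(output) would be called on the value returned by summarization(prompt).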

✨ Features

Portuguese Summarization: Specialized in Portuguese text summarization tasks.
Performance Metrics: At the final logged evaluation step (step 9000), the model reached ROUGE-1 16.34, ROUGE-2 6.24, and ROUGE-L 14.13; the full history is in the training results table.

📦 Installation

To use this model, you need to install the transformers library:

!pip install transformers

💻 Usage Examples

Basic Usage

from transformers import pipeline
summarization = pipeline("summarization", model="rhaymison/flan-t5-portuguese-small-summarization", tokenizer="rhaymison/flan-t5-portuguese-small-summarization")

prompt = "sumarize: Your Portuguese text here"
output = summarization(prompt)
print(output)
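
When several texts need summarizing, it can help to wrap the prefixing and the result handling in one function. The following is a minimal sketch, not part of the model card: build_prompt and summarize_many are hypothetical helpers, and the summarizer argument is assumed to behave like the pipeline object above (called with a list of strings, returning a list of dicts with a "summary_text" key). Any extra keyword arguments are forwarded unchanged to that callable.

```python
from typing import Callable, Dict, List

TASK_PREFIX = "sumarize: "  # task prefix used in the examples above


def build_prompt(text: str) -> str:
    """Prepend the task prefix and collapse runs of whitespace and newlines."""
    return TASK_PREFIX + " ".join(text.split())


def summarize_many(
    summarizer: Callable[..., List[Dict[str, str]]],
    texts: List[str],
    **generate_kwargs,
) -> List[str]:
    """Summarize each text and return the summaries in input order."""
    prompts = [build_prompt(t) for t in texts]
    results = summarizer(prompts, **generate_kwargs)
    return [r["summary_text"] for r in results]
```

With the pipeline from Basic Usage, a call might look like summarize_many(summarization, ["Primeiro texto", "Segundo texto"], max_length=60), where max_length is a generation argument accepted by the pipeline call.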

📚 Documentation

Model Information

Property	Details
Model Type	flan-t5-base
Base Model	google/flan-t5-small
Library Name	transformers
Pipeline Tag	summarization
Language	Portuguese
License	Apache-2.0
Training Datasets	recogna-nlp/recognasumm
Evaluation Metrics	Rouge

Training and Evaluation

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 5
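
Two of these values are related by simple arithmetic: the total train batch size is the per-step batch size multiplied by the gradient accumulation steps (8 × 2 = 16, assuming a single device). The sketch below also illustrates what a linear schedule does to the learning rate. It assumes no warmup, which the list above does not state, and the total step count of 9375 is an estimate derived from the results table (7500 steps at epoch 4.0, so about 1875 steps per epoch over 5 epochs). The function names are illustrative only.

```python
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
ESTIMATED_TOTAL_STEPS = 9375  # assumption: 1875 steps per epoch * 5 epochs


def effective_batch_size(per_step: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples contributing to each optimizer update."""
    return per_step * accum_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = LEARNING_RATE,
              warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 4500, 9000):
    print(step, linear_lr(step, ESTIMATED_TOTAL_STEPS))
```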

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	RougeL	RougeLsum	Gen Len
1.847	0.27	500	1.7443	15.4969	5.9408	13.5074	14.5518	19.0
1.8333	0.53	1000	1.7194	15.6496	5.8641	13.5584	14.669	19.0
1.8043	0.8	1500	1.7209	15.8523	6.0544	13.7563	14.8941	19.0
1.7903	1.07	2000	1.7156	15.8969	6.0071	13.7534	14.8513	19.0
1.7862	1.33	2500	1.7007	15.8441	5.958	13.66	14.7226	19.0
1.7687	1.6	3000	1.6949	15.9134	6.0486	13.9238	14.9171	19.0
1.7724	1.87	3500	1.6909	15.8827	5.8941	13.7195	14.8736	19.0
1.7653	2.13	4000	1.6811	16.0819	5.9791	13.8639	15.0031	19.0
1.7392	2.4	4500	1.6761	15.706	5.7384	13.5978	14.7374	19.0
1.7578	2.67	5000	1.6729	15.8926	5.9629	13.767	14.9088	19.0
1.7353	2.93	5500	1.6675	16.0266	5.9024	13.8471	14.9721	19.0
1.7425	3.2	6000	1.6626	16.0732	6.1141	13.9016	15.0673	19.0
1.73	3.47	6500	1.6631	16.1333	6.0951	13.9551	15.0686	19.0
1.7355	3.73	7000	1.6616	16.1704	6.1575	14.0481	15.079	19.0
1.7139	4.0	7500	1.6572	16.2592	6.25	14.0403	15.1851	19.0
1.7188	4.27	8000	1.6580	16.1572	6.0661	14.0029	15.0935	19.0
1.7045	4.53	8500	1.6560	16.1409	6.1478	13.9806	15.0795	19.0
1.7201	4.8	9000	1.6541	16.3352	6.2366	14.1335	15.2755	19.0
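
A few figures can be read off the table with simple arithmetic. The sketch below copies three rows (steps 500, 7500, and 9000) and computes the ROUGE-1 gain between the first and last logged checkpoints, the checkpoint with the best score for a given metric among those rows, and a rough estimate of examples seen per epoch. That last estimate assumes a single device and the total train batch size of 16 listed above; the Checkpoint type and helper names are illustrative only.

```python
from typing import List, NamedTuple


class Checkpoint(NamedTuple):
    step: int
    epoch: float
    val_loss: float
    rouge1: float
    rouge2: float
    rouge_l: float


# Rows copied from the training results table (first, epoch 4.0, and last).
CHECKPOINTS: List[Checkpoint] = [
    Checkpoint(500, 0.27, 1.7443, 15.4969, 5.9408, 13.5074),
    Checkpoint(7500, 4.0, 1.6572, 16.2592, 6.25, 14.0403),
    Checkpoint(9000, 4.8, 1.6541, 16.3352, 6.2366, 14.1335),
]
TOTAL_TRAIN_BATCH_SIZE = 16


def best_by(checkpoints: List[Checkpoint], field: str) -> Checkpoint:
    """Checkpoint with the highest value for the given metric field."""
    return max(checkpoints, key=lambda c: getattr(c, field))


def steps_per_epoch(checkpoint: Checkpoint) -> float:
    """Optimizer steps per epoch implied by one table row."""
    return checkpoint.step / checkpoint.epoch


first, last = CHECKPOINTS[0], CHECKPOINTS[-1]
rouge1_gain = last.rouge1 - first.rouge1
print("ROUGE-1 gain:", round(rouge1_gain, 4))
print("Best ROUGE-1 at step:", best_by(CHECKPOINTS, "rouge1").step)
print("Best ROUGE-2 at step:", best_by(CHECKPOINTS, "rouge2").step)
# The epoch column is rounded, so use the row logged at exactly epoch 4.0.
print("Approx. examples per epoch:",
      steps_per_epoch(CHECKPOINTS[1]) * TOTAL_TRAIN_BATCH_SIZE)
```

Note that the best checkpoint depends on the metric: ROUGE-2 peaks at step 7500, while ROUGE-1 and validation loss are best at step 9000.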

Framework versions

Transformers 4.38.2
Pytorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2

Comments

Ideas, help, and bug reports are always welcome. You can contact the author by email: rhaymisoncristian@gmail.com

📄 License

This project is licensed under the Apache-2.0 license.
