Model Card for Model ID
This model is designed to automatically generate titles from paragraphs, offering a practical way to condense long journal abstracts into concise one-liners that can serve as journal titles.
🚀 Quick Start
Use the code below to get started with the model.
from transformers import pipeline

text = """Text that needs to be summarized"""

# Replace path-to-model with the actual checkpoint path or Hub ID.
summarizer = pipeline("summarization", model="path-to-model")
summary = summarizer(text)[0]["summary_text"]
print(summary)
✨ Features
- Text Summarization: generates concise titles for paragraphs of text.
- Tunable for Downstream Tasks: serves as a base language model that can be fine-tuned for downstream tasks (a fine-tuning sketch follows this list).
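The card does not include a training script, so the following is only a minimal fine-tuning sketch for the downstream-task use case. It assumes the t5-small base checkpoint and a dataset with hypothetical "abstract" and "title" columns; the 1,024/24 token limits and 20 epochs mirror figures reported elsewhere in this card, while the remaining hyperparameters are illustrative.

from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Toy data; replace with real abstract/title pairs.
data = Dataset.from_dict({
    "abstract": ["A long abstract describing a study on transformer models ..."],
    "title": ["A Study on Transformer Models"],
})

def preprocess(batch):
    # T5 is text-to-text; a "summarize: " task prefix is the usual convention.
    inputs = tokenizer(["summarize: " + a for a in batch["abstract"]],
                       max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["title"], max_length=24, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = data.map(preprocess, batched=True, remove_columns=data.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="title-generator",   # illustrative output directory
    per_device_train_batch_size=8,  # illustrative; not from the card
    num_train_epochs=20,            # the results table reports epochs 18-20
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()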
📦 Installation
The original card does not specify installation steps.
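A minimal setup for the snippets in this card, assuming a standard Python environment (sentencepiece is required by T5 tokenizers):

pip install transformers sentencepiece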
💻 Usage Examples
Basic Usage
from transformers import pipeline

text = """Text that needs to be summarized"""

# Replace path-to-model with the actual checkpoint path or Hub ID.
summarizer = pipeline("summarization", model="path-to-model")
summary = summarizer(text)[0]["summary_text"]
print(summary)
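Advanced Usage
For finer control than the pipeline offers, the tokenizer and model can be driven directly. This is a sketch, not the author's code: "path-to-model" is the placeholder from Quick Start, the "summarize: " prefix is the usual T5 convention (it is an assumption that this model was trained with it), and the generation settings mirror the 1,024/24 token limits noted under Limitations.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "path-to-model"  # placeholder; replace with the actual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = """Text that needs to be summarized"""
# Tokenize with truncation so inputs never exceed the 1,024-token limit.
inputs = tokenizer("summarize: " + text, return_tensors="pt",
                   max_length=1024, truncation=True)
# Cap generation at the 24-token output limit reported in this card.
output_ids = model.generate(**inputs, max_new_tokens=24, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))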
📚 Documentation
Model Details
Model Description
This is a generative text model that summarizes long journal abstracts into one-liners for use as journal titles.
- Developed by: Tushar Joshi
- Shared by [optional]: Tushar Joshi
- Model type: t5-small
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model [optional]: t5-small baseline
Model Sources [optional]
- Repository: https://huggingface.co/t5-small
Uses
Direct Use
- As a text summarizer for paragraphs.
Out-of-Scope Use
The model should not be used to summarize very long paragraphs; inputs beyond the 1,024-token limit noted below cannot be processed in full.
Bias, Risks, and Limitations
- Maximum input size: 1,024 tokens (a length-check sketch follows this list)
- Maximum output size: 24 tokens
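A quick length check before summarizing can flag inputs that will be cut off. A sketch; "path-to-model" is the placeholder checkpoint path from Quick Start:

from transformers import AutoTokenizer

text = """Text that needs to be summarized"""
tokenizer = AutoTokenizer.from_pretrained("path-to-model")  # placeholder path
num_tokens = len(tokenizer(text)["input_ids"])
if num_tokens > 1024:
    print(f"Input is {num_tokens} tokens; anything past 1024 will be dropped.")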
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
Training Details
Training Data
The training data was curated internally and cannot be released publicly.
Training Procedure
[More Information Needed]
Preprocessing [optional]
[More Information Needed]
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
Training ran on 2 × NVIDIA T4 GPUs and took 4:09:47 (hh:mm:ss) to complete. A dataset of 10,000 examples was used to train the generative model.
Evaluation
Summarization quality was evaluated on 5,000 research-journal abstracts published over the last 20 years.
Testing Data, Factors & Metrics
Test data size: 5,000 examples
Testing Data
The testing data is internally generated and curated.
Factors
[More Information Needed]
Metrics
The model was evaluated with ROUGE metrics. The baseline results are shown below.
Results
| Epoch | Training Loss | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|------:|--------------:|----------------:|--------:|--------:|--------:|-----------:|--------:|
| 18 | 2.442800 | 2.375408 | 0.313700 | 0.134600 | 0.285400 | 0.285400 | 16.414100 |
| 19 | 2.454800 | 2.372553 | 0.312900 | 0.134100 | 0.284900 | 0.285000 | 16.445100 |
| 20 | 2.438900 | 2.372551 | 0.312300 | 0.134000 | 0.284500 | 0.284600 | 16.435500 |
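The evaluation script itself is not included in the card. Below is a minimal sketch for scoring generated titles against reference titles with the Hugging Face evaluate library (requires pip install evaluate rouge_score); the sample strings are placeholders, not data from the card.

import evaluate

rouge = evaluate.load("rouge")
predictions = ["Generated title for a paper"]  # model outputs
references = ["Reference title for a paper"]   # ground-truth titles
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum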
Model Examination [optional]
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator (https://mlco2.github.io/impact#compute) presented in Lacoste et al. (2019).
- Hardware Type: 2 × NVIDIA T4 GPUs
- Hours used: 4.5
- Cloud Provider: GCP
- Compute Region: Ireland
- Carbon Emitted: Unknown
Technical Specifications [optional]
Model Architecture and Objective
[More Information Needed]
Compute Infrastructure
[More Information Needed]
Hardware
[More Information Needed]
Software
[More Information Needed]
Citation [optional]
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
Glossary [optional]
[More Information Needed]
More Information [optional]
[More Information Needed]
Model Card Authors [optional]
Tushar Joshi
Model Card Contact
Tushar Joshi
LinkedIn: https://www.linkedin.com/in/tushar-joshi-816133100/
📄 License
The model is licensed under the Apache 2.0 license.