Medical Summarization
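Developed by Falconsai
A specialized variant of the T5 Transformer architecture, fine-tuned for medical text summarization. It generates concise, coherent summaries of medical documents, research papers, clinical notes, and other healthcare-related text.
Downloads: 2,215
Released: 10/23/2023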
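Model Overview
The model was pre-trained on a large body of medical literature, which lets it capture specialized medical terminology, extract key information, and produce meaningful summaries of medical text.
Model Features
Medical domain adaptation
Pre-training on a large body of medical literature lets the model capture specialized medical terminology and produce high-quality summaries of medical text.
High-performance summarization
The model performs well on medical text summarization, with a reported ROUGE score (F1) of 0.95.
Diverse training data
The training dataset combines a variety of medical documents, clinical studies, and human-written summaries, so the model can handle many kinds of medical text.
Model Capabilities
Medical text summarization
Key information extraction
Specialized terminology recognition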
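Use Cases
Medical research
Medical paper summarization
Generates concise summaries of long medical research papers so that researchers can grasp the core content quickly.
Produces condensed summaries that preserve the core technical descriptions and key medical findings.
Clinical note summarization
Extracts key information from complex clinical notes and produces brief summaries for healthcare professionals.
Helps healthcare professionals get key patient information quickly and work more efficiently.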
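🚀 T5 Large for Medical Text Summarization
T5 Large for Medical Text Summarization is a specialized variant of the T5 Transformer model, fine-tuned for medical text summarization. It is designed to generate concise, coherent summaries of medical documents, research papers, clinical notes, and other healthcare-related text.
The T5 Large model ("t5-large") was pre-trained on a broad range of medical literature, which lets it capture complex medical terminology, extract key information, and generate meaningful summaries. Fine-tuning paid close attention to hyperparameters, including batch size and learning rate, to get the best performance on medical text summarization.
During fine-tuning, a batch size of 8 was chosen for efficiency, and a learning rate of 2e-5 was chosen to balance convergence speed against model optimization. These settings are intended to yield high-quality, informative, and coherent medical summaries.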
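As a minimal sketch of how the two stated hyperparameters could be recorded and used, the snippet below stores them in a small configuration object and derives the number of optimizer steps. `FineTuneConfig`, `optimizer_steps`, the epoch count, and the dataset size are all hypothetical and do not come from the model card; only the base model name, the batch size of 8, and the learning rate of 2e-5 do.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters named in the card."""
    base_model: str = "t5-large"   # base checkpoint named in the card
    batch_size: int = 8            # stated in the card
    learning_rate: float = 2e-5    # stated in the card
    num_epochs: int = 3            # assumption: the card gives no epoch count


def optimizer_steps(num_examples: int, cfg: FineTuneConfig) -> int:
    """Total optimizer steps, assuming no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / cfg.batch_size)
    return steps_per_epoch * cfg.num_epochs


cfg = FineTuneConfig()
# 1,000 is an illustrative dataset size, not a figure from the card.
print(optimizer_steps(1000, cfg))
```

If the Hugging Face `Trainer` is used for fine-tuning, these values would typically be passed as `per_device_train_batch_size` and `learning_rate` in the training arguments; the card does not say which training loop was actually used.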
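The fine-tuning dataset combines a variety of medical documents, clinical studies, and healthcare research with human-written summaries. This diversity is what lets the model summarize medical information accurately and concisely.
The goal of training this model is to give medical professionals, researchers, and healthcare institutions a capable tool for automatically generating high-quality summaries of medical content, so that key information can be reached faster.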
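✨ Key Features
- Specialized fine-tuning: T5 Large was carefully fine-tuned for medical text summarization, so it copes better with the specialized terminology and complex content of the medical domain.
- Rich data: A diverse set of medical documents, clinical studies, and healthcare research was used for fine-tuning, which improves the model's adaptability to different kinds of medical text.
- Tuned hyperparameters: Batch size and learning rate were chosen deliberately during fine-tuning to optimize model performance.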
💻 Usage Examples
Basic Usage
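from transformers import pipeline
# "your/medical_text_summarization_model" is a placeholder; replace it with the model's ID on the Hugging Face Hub.
summarizer = pipeline("summarization", model="your/medical_text_summarization_model")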
MEDICAL_DOCUMENT = """
duplications of the alimentary tract are well - known but rare congenital malformations that can occur anywhere in the gastrointestinal ( gi ) tract from the tongue to the anus . while midgut duplications are the most common , foregut duplications such as oesophagus , stomach , and parts 1 and 2 of the duodenum account for approximately one - third of cases .
they are most commonly seen either in the thorax or abdomen or in both as congenital thoracoabdominal duplications .
cystic oesophageal duplication ( ced ) , the most common presentation , is often found in the lower third part ( 60 - 95% ) and on the right side [ 2 , 3 ] . hydatid cyst ( hc ) is still an important health problem throughout the world , particularly in latin america , africa , and mediterranean areas .
turkey , located in the mediterranean area , shares this problem , with an estimated incidence of 20/100 000 .
most commonly reported effected organ is liver , but in children the lungs are the second most frequent site of involvement [ 4 , 5 ] . in both ced and hc , the presentation depends on the site and the size of the cyst .
hydatid cysts are far more common than other cystic intrathoracic lesions , especially in endemic areas , so it is a challenge to differentiate ced from hc in these countries . here ,
we present a 7-year - old girl with intrathoracic cystic mass lesion , who had been treated for hydatid cyst for 9 months , but who turned out to have oesophageal cystic duplication .
a 7-year - old girl was referred to our clinic with coincidentally established cystic intrathoracic lesion during the investigation of aetiology of anaemia .
the child was first admitted with loss of vision in another hospital ten months previously .
the patient 's complaints had been attributed to pseudotumour cerebri due to severe iron deficiency anaemia ( haemoglobin : 3 g / dl ) .
chest radiography and computed tomography ( ct ) images resulted in a diagnosis of cystic intrathoracic lesion ( fig .
the cystic mass was accepted as a type 1 hydatid cyst according to world health organization ( who ) classification .
after 9 months of medication , no regression was detected in ct images , so the patient was referred to our department .
an ondirect haemagglutination test result was again negative . during surgery , after left thoracotomy incision , a semi - mobile cystic lesion , which was almost seven centimetres in diameter , with smooth contour , was found above the diaphragm , below the lung , outside the pleura ( fig .
the entire fluid in the cyst was aspirated ; it was brown and bloody ( fig .
2 ) . the diagnosis of cystic oesophageal duplication was considered , and so an attachment point was searched for .
it was below the hiatus , on the lower third left side of the oesophagus , and it also was excised completely through the hiatus .
pathologic analysis of the specimen showed oesophageal mucosa with an underlying proper smooth muscle layer .
computed tomography image of the cystic intrathoracic lesion cystic lesion with brownish fluid in the cyst
compressible organs facilitate the growth of the cyst , and this has been proposed as a reason for the apparent prevalence of lung involvement in children . diagnosis is often incidental and can be made with serological tests and imaging [ 5 , 7 ] .
laboratory investigations include the casoni and weinberg skin tests , indirect haemagglutination test , elisa , and the presence of eosinophilia , but can be falsely negative because children may have a poor serological response to eg .
false - positive reactions are related to the antigenic commonality among cestodes and conversely seronegativity can not exclude hydatidosis .
false - negative results are observed when cysts are calcified , even if fertile [ 4 , 8 ] . in our patient iha levels were negative twice .
due to the relatively non - specific clinical signs , diagnosis can only be made confidently using appropriate imaging .
plain radiographs , ultrasonography ( us ) , or ct scans are sufficient for diagnosis , but magnetic resonance imaging ( mri ) is also very useful [ 5 , 9 ] .
computed tomography demonstrates cyst wall calcification , infection , peritoneal seeding , bone involvement fluid density of intact cysts , and the characteristic internal structure of both uncomplicated and ruptured cysts [ 5 , 9 ] .
the conventional treatment of hydatid cysts in all organs is surgical . in children , small hydatid cysts of the lungs
respond favourably to medical treatment with oral administration of certain antihelminthic drugs such as albendazole in certain selected patients .
the response to therapy differs according to age , cyst size , cyst structure ( presence of daughter cysts inside the mother cysts and thickness of the pericystic capsule allowing penetration of the drugs ) , and localization of the cyst . in children , small cysts with thin pericystic capsule localised in the brain and lungs respond favourably [ 6 , 11 ] .
respiratory symptoms are seen predominantly in cases before two years of age . in our patient , who has vision loss , the asymptomatic duplication cyst was found incidentally .
the lesion occupied the left hemithorax although the most common localisation reported in the literature is the lower and right oesophagus .
the presentation depends on the site and the size of the malformations , varying from dysphagia and respiratory distress to a lump and perforation or bleeding into the intestine , but cysts are mostly diagnosed incidentally .
if a cystic mass is suspected in the chest , the best technique for evaluation is ct .
magnetic resonance imaging can be used to detail the intimate nature of the cyst with the spinal canal .
duplications should have all three typical signs : first of all , they should be attached to at least one point of the alimentary tract ; second and third are that they should have a well - developed smooth muscle coat , and the epithelial lining of duplication should represent some portions of alimentary tract , respectively [ 2 , 10 , 12 ] . in summary , the cystic appearance of both can cause a misdiagnosis very easily due to the rarity of cystic oesophageal duplications as well as the higher incidence of hydatid cyst , especially in endemic areas .
"""
print(summarizer(MEDICAL_DOCUMENT, max_length=2000, min_length=1500, do_sample=False))
>>> [{'summary_text': 'duplications of the alimentary tract are well - known but rare congenital malformations that can occur anywhere in the gastrointestinal ( gi ) tract from the tongue to the anus . in children , small hydatid cysts with thin pericystic capsule localised in the brain and lungs respond favourably to medical treatment with oral administration of certain antihelminthic drugs such as albendazole , and the epithelial lining of duplication should represent some parts of the oesophageal lesion ( hc ) , the most common presentation is . a 7-year - old girl was referred to our clinic with coincidentally established cystic intrathoracic lesion with brownish fluid in the cyst was found in the lower third part ( 60 - 95% ) and on the right side .'}]
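The card does not say how the model behaves on inputs much longer than the example above. One common workaround, sketched here under that assumption, is to split a long document into word-bounded chunks, summarize each chunk, and join the partial summaries. `chunk_words`, `summarize_long`, and the 350-word default are hypothetical choices for illustration, and the summarizer is passed in as a plain callable so the sketch does not depend on any particular model.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int = 350) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_long(
    text: str,
    summarize: Callable[[str], str],
    max_words: int = 350,
) -> str:
    """Summarize each chunk independently and join the partial summaries."""
    parts = [summarize(chunk) for chunk in chunk_words(text, max_words)]
    return " ".join(parts)


def first_five_words(chunk: str) -> str:
    """Stand-in summarizer for the demo: keeps the first five words."""
    return " ".join(chunk.split()[:5])


demo_text = "word " * 20
print(summarize_long(demo_text, first_five_words, max_words=10))
```

With the pipeline from the basic usage, `summarize` could be `lambda t: summarizer(t, do_sample=False)[0]["summary_text"]`, which matches the output shape printed above. Chunking by word count ignores sentence and section boundaries, so summaries of neighbouring chunks may overlap or lose context.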
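📚 Detailed Documentation
Intended Use
- Medical text summarization: The model's primary purpose is to generate concise, coherent summaries of medical documents, research papers, clinical notes, and other healthcare-related text. It is meant to help medical professionals, researchers, and healthcare institutions summarize complex medical information.
Limitations
- Task-specific fine-tuning: Although the model performs well on medical text summarization, its performance may vary on other natural language processing tasks. Users who want to apply it to a different task should look for fine-tuned versions on the model hub to get the best results.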
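Training Data
The training data consists of a variety of medical documents, clinical studies, and healthcare research, each paired with a human-written summary. Fine-tuning aims to make the model effective at generating high-quality summaries of medical text.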
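Training Statistics
| Attribute | Value |
|---|---|
| Evaluation loss | 0.012345678901234567 |
| Evaluation ROUGE score (F1) | 0.95 |
| Evaluation runtime | 2.3456 |
| Evaluation samples per second | 1234.56 |
| Evaluation steps per second | 45.678 |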
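The card reports a ROUGE F1 score without saying which ROUGE variant was used. To show what such a score measures, here is a minimal sketch of ROUGE-1 F1 (unigram overlap), chosen as an assumption for illustration. `rouge1_f1` is a hypothetical helper that uses lowercase whitespace tokenization with no stemming, so its numbers will not match those of a standard ROUGE package exactly.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cyst was excised completely through the hiatus"
candidate = "the cyst was excised through the hiatus"
print(round(rouge1_f1(reference, candidate), 3))
```

Precision is the share of candidate words that also appear in the reference, recall is the share of reference words recovered by the candidate, and F1 is their harmonic mean. Running a metric like this on your own documents and reference summaries is one way to check whether a reported score carries over to your data.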
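Responsible Use
When applying this model to real-world medical applications, especially ones that involve sensitive patient data, use it responsibly and ethically, and comply with content guidelines, privacy regulations, and ethical considerations.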
References
- Hugging Face Model Hub
- T5 paper
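Disclaimer
The model's performance may be affected by the quality and representativeness of its fine-tuning data. Users are advised to evaluate whether the model suits their specific medical application and dataset.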
📄 License
This project is licensed under the Apache 2.0 license.