🚀 BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section
This project provides a model for classifying the sections of a document, fine-tuned from microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext.
It classifies the different sections of a medical document, such as background, conclusions, methods, objective, and results.
🚀 Quick Start
Install dependencies
Install the transformers library if needed:

```bash
pip install -U transformers
```
Run the example
Run the following code, replacing the sample text with your own use case:
```python
from transformers import pipeline

model_tag = "ml4pubmed/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section"

# build a text-classification pipeline backed by the section classifier
classifier = pipeline(
    'text-classification',
    model=model_tag,
)

prompt = """
Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
"""

# predict which section of a paper the sentence belongs to
classifier(
    prompt,
)
```
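The pipeline returns one dictionary per input, containing the predicted section label and its confidence score; the exact label strings come from the model's configuration. A minimal sketch of reading the prediction:

```python
result = classifier(prompt)
# result is a list of dicts like [{'label': <section label>, 'score': <confidence>}]
print(result[0]['label'], result[0]['score'])
```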
✨ Features
- Datasets: pubmed, ml4pubmed/pubmed-classification-20k (a loading sketch follows this list).
- Evaluation metric: f1.
- Supported tasks: text classification, document-section classification, sentence classification, document classification, and other medical text-classification tasks.
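For reference, a minimal sketch of loading the classification data with the datasets library; the identifier ml4pubmed/pubmed-classification-20k is taken from the list above, and its availability and split names on the Hugging Face Hub are assumptions:

```python
from datasets import load_dataset

# assumed dataset id (listed above); check the Hub for the exact splits and columns
dataset = load_dataset("ml4pubmed/pubmed-classification-20k")
print(dataset)
```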
📦 安装指南
若需要使用该模型,需安装transformers
库,安装命令如下:
pip install -U transformers
💻 Usage Examples
Basic usage
```python
from transformers import pipeline

model_tag = "ml4pubmed/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section"

classifier = pipeline(
    'text-classification',
    model=model_tag,
)

prompt = """
Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
"""

classifier(
    prompt,
)
```
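In practice the model is typically applied sentence by sentence over an abstract. A minimal sketch, assuming a recent transformers version where the text-classification pipeline accepts top_k; the example sentences are hypothetical placeholders:

```python
from transformers import pipeline

model_tag = "ml4pubmed/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext_pub_section"
classifier = pipeline('text-classification', model=model_tag)

# hypothetical candidate sentences from different sections of an abstract
sentences = [
    "Little is known about the long-term outcomes of this intervention.",
    "We conducted a randomized controlled trial with 200 participants.",
    "Treatment reduced symptom scores by 30% relative to placebo.",
]

# top_k=None asks the pipeline for the score of every section label, not just the best one
for sentence, scores in zip(sentences, classifier(sentences, top_k=None)):
    best = max(scores, key=lambda s: s['score'])
    print(f"{best['label']:>12}  {best['score']:.3f}  {sentence}")
```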
🔧 Technical Details
Training metrics
| Metric | Value |
| --- | --- |
| val_accuracy | 0.8678670525550842 |
| val_matthewscorrcoef | 0.8222037553787231 |
| val_f1score | 0.866841197013855 |
| val_cross_entropy | 0.3674609065055847 |
| epoch | 8.0 |
| train_accuracy_step | 0.83984375 |
| train_matthewscorrcoef_step | 0.7790813446044922 |
| train_f1score_step | 0.837363600730896 |
| train_cross_entropy_step | 0.39843088388442993 |
| train_accuracy_epoch | 0.8538406491279602 |
| train_matthewscorrcoef_epoch | 0.8031334280967712 |
| train_f1score_epoch | 0.8521654605865479 |
| train_cross_entropy_epoch | 0.4116102457046509 |
| test_accuracy | 0.8578397035598755 |
| test_matthewscorrcoef | 0.8091378808021545 |
| test_f1score | 0.8566917181015015 |
| test_cross_entropy | 0.3963385224342346 |
| date_run | Apr-22-2022_t-19 |
| huggingface_tag | microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext |
📄 License
This project is released under the apache-2.0 license.