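🚀 Protein Multi-Label Classification Model
This project is a fine-tuned Bert-Base-Uncased model for multi-label classification. It predicts a protein's function from its amino acid sequence, which makes it useful for biological research and protein analysis.
🚀 Quick Start
Using the model is straightforward: paste a protein sequence into the inference box, and the model returns the probability that the sequence is associated with each of a set of GO terms.
For example, given the following protein sequence: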
MMSTTHLLVFLLGVVTLTTPTFGTYESPNYGKPPTPVFKPPKVKPPPYEPKPPVYEPPKKEKPEPKPPVYAPPKKEKHGPKPTMYEPPKKEKPEPKPPVYTPPKKEVPKPKPPVYEPPKKEKPEPKPPIYTPPKKEKPEPKPPVYEPPKKEKPEPKPPVYTPPKKEKPEPKPPVYEPPKKPPMYEPKPPKPPVYTPPKKEKPEPKPPMYEPPKKPPMYEPKPPKPPVYTPPKKEKPEPKPPMYQPPNNPPIYEPKPPKPPVYAPPKEEKPKPKPPVYEPPAHEPPYGHYPGHPPLGKPQ
the model outputs the following scores:
[
  [
    {"label": "GO:0000122", "score": 0.29775485396385193},
    {"label": "GO:0000070", "score": 0.10477513074874878},
    {"label": "GO:0000075", "score": 0.08593793958425522},
    {"label": "GO:0000118", "score": 0.05860009789466858},
    {"label": "GO:0000082", "score": 0.05373986065387726},
    {"label": "GO:0000077", "score": 0.03928716108202934},
    {"label": "GO:0000096", "score": 0.03705739229917526},
    {"label": "GO:0000079", "score": 0.02797592058777809},
    {"label": "GO:0000045", "score": 0.026528609916567802},
    {"label": "GO:0000097", "score": 0.026119187474250793},
    {"label": "GO:0000086", "score": 0.019697198644280434},
    {"label": "GO:0000049", "score": 0.018551582470536232},
    {"label": "GO:0000041", "score": 0.016929756850004196},
    {"label": "GO:0000054", "score": 0.015105823054909706},
    {"label": "GO:0000083", "score": 0.01434631273150444},
    {"label": "GO:0000105", "score": 0.013960960321128368},
    {"label": "GO:0000076", "score": 0.013064960949122906},
    {"label": "GO:0000109", "score": 0.012523632496595383},
    {"label": "GO:0000113", "score": 0.012152223847806454},
    {"label": "GO:0000062", "score": 0.01127714291214943},
    {"label": "GO:0000101", "score": 0.011041304096579552}
  ]
]
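Since the response is a nested list of label/score records, a small amount of post-processing is usually enough to turn it into a set of predicted GO terms. The following is a minimal sketch, not part of the original project: the helper name `select_labels` and the 0.1 cutoff are hypothetical, and it assumes the outer list holds one inner list per input sequence, as the response above suggests.

```python
import json

# Abbreviated copy of the response shown above (first four records only).
response_text = """
[
  [
    {"label": "GO:0000122", "score": 0.29775485396385193},
    {"label": "GO:0000070", "score": 0.10477513074874878},
    {"label": "GO:0000075", "score": 0.08593793958425522},
    {"label": "GO:0000118", "score": 0.05860009789466858}
  ]
]
"""


def select_labels(predictions, threshold=0.1):
    """Return (label, score) pairs scoring at least `threshold`, highest first.

    `threshold` is an illustrative value, not one recommended by the model authors.
    """
    kept = [(p["label"], p["score"]) for p in predictions if p["score"] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


results = json.loads(response_text)
# Assumption: results[0] holds the records for the first (and only) input sequence.
selected = select_labels(results[0], threshold=0.1)
for label, score in selected:
    print(f"{label}\t{score:.3f}")
```

A suitable cutoff depends on how many false positives the downstream analysis can tolerate, so it is worth tuning on sequences with known annotations rather than reusing the value shown here.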
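📚 Documentation
Model Information
This is a fine-tuned Bert-Base-Uncased model for multi-label classification. The task is to predict a protein's function from its amino acid sequence. The model takes sequence data and protein class names as input and outputs probability scores, that is, the likelihood that the sequence belongs to each class.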
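| Property | Details |
| --- | --- |
| Model type | Fine-tuned Bert-Base-Uncased model |
| Task type | Multi-label classification: predicting function from a protein's amino acid sequence |
| Input | Sequence data and protein class names |
| Output | Probability scores (the likelihood that the sequence belongs to a given class) |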
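To illustrate what per-class probability scores mean in a multi-label setting, here is a minimal sketch of one common scoring scheme: an independent sigmoid applied to each class logit, so that several classes can score high at once and the scores need not sum to 1. This is an assumption for illustration only. The source does not state which activation this model applies, and the logit values and the helper name `multilabel_scores` below are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multilabel_scores(logits: dict) -> dict:
    """Score every class independently; one class's score does not affect another's."""
    return {label: sigmoid(value) for label, value in logits.items()}


# Hypothetical logits, chosen only to show the behaviour (not real model output).
example_logits = {"GO:0000122": 0.0, "GO:0000070": -2.0, "GO:0000075": 2.0}
scores = multilabel_scores(example_logits)
for label, score in scores.items():
    print(f"{label}\t{score:.3f}")
```

Under this scheme each score can be read on its own as "how likely is this class", which is what allows a single sequence to be assigned several classes.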