🚀 ESM-2 Sequence Classifier
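This is a small sequence classifier trained on synthetic data generated by GPT-4. It assigns protein sequences to one of three classes: enzymes (class 0), receptor proteins (class 1), and structural proteins (class 2). The classifier was trained using facebook/esm2_t6_8M_UR50D, one of the ESM-2 models.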
This model has not been thoroughly tested and is intended for experimental and educational purposes only. Use it with caution.
💻 Usage Examples
Basic usage
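import torch
from transformers import AutoTokenizer, EsmForSequenceClassification

# Load the classifier and the tokenizer of the ESM-2 checkpoint it was trained from
model = EsmForSequenceClassification.from_pretrained("AmelieSchreiber/esm2_t6_8M_UR50D_sequence_classifier_v1")
tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")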
new_sequences_0 = [
    "ACGYLKTPKLADPPVLRGDSSVTKAICKPDPVLEK",
    "GVALDECKALDYLPGKPLPMDGKVCQCGSKTPLRP",
    "VLPGYTCGELDCKPGKPLPKCGADKTQVATPFLRG",
    "TCGALVQYPSCADPPVLRGSDSSVKACKKLDPQDK",
    "GALCEECKLCPGADYKPMDGDRLPAAATSKTRPVG",
    "PAVDCKKALVYLPKPLPMDGKVCRGSKTPKTRPYG",
    "VLGYTCGALDCKPGKPLPKCGADKTQVATPFLRGA",
    "CGALVQYPSCADPPVLRGSDSSVKACKKLDPQDKT",
    "ALCEECKLCPGADYKPMDGDRLPAAATSKTRPVGK",
    "AVDCKKALVYLPKPLPMDGKVCRGSKTPKTRPYGR",
]
new_sequences_1 = [
    "VGQRFYGGRQKNRHCELSPLPSACRGSVQGALYTD",
    "KDQVLTVPTYACRCCPKMDSKGRVPSTLRVKSARS",
    "PLAGVACGRGLDYRCPRKMVPGDLQVTPATQRPYG",
    "CGVRLGYPGCADVPLRGRSSFAPRACMKKDPRVTR",
    "RKGVAYLYECRKLRCRADYKPRGMDGRRLPKASTT",
    "RPTGAVNCKQAKVYRGLPLPMMGKVPRVCRSRRPY",
    "RLDGGYTCGQALDCKPGRKPPKMGCADLKSTVATP",
    "LGTCRKLVRYPQCADPPVMGRSSFRPKACCRQDPV",
    "RVGYAMCSPKLCSCRADYKPPMGDGDRLPKAATSK",
    "QPKAVNCRKAMVYRPKPLPMDKGVPVCRSKRPRPY",
]
new_sequences_2 = [
    "VGKGFRYGSSQKRYLHCQKSALPPSCRRGKGQGSAT",
    "KDPTVMTVGTYSCQCPKQDSRGSVQPTSRVKTSRSK",
    "PLVGKACGRSSDYKCPGQMVSGGSKQTPASQRPSYD",
    "CGKKLVGYPSSKADVPLQGRSSFSPKACKKDPQMTS",
    "RKGVASLYCSSKLSCKAQYSKGMSDGRSPKASSTTS",
    "RPKSAASCEQAKSYRSLSLPSMKGKVPSKCSRSKRP",
    "RSDVSYTSCSQSKDCKPSKPPKMSGSKDSSTVATPS",
    "LSTCSKKVAYPSSKADPPSSGRSSFSMKACKKQDPPV",
    "RVGSASSEPKSSCSVQSYSKPSMSGDSSPKASSTSK",
    "QPSASNCEKMSSYRPSLPSMSKGVPSSRSKSSPPYQ",
]
new_sequences = new_sequences_0 + new_sequences_1 + new_sequences_2
# Tokenize all sequences as a single padded batch
inputs = tokenizer(new_sequences, return_tensors="pt", padding=True, truncation=True)
# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits
# Pick the highest-scoring class for each sequence
predicted_class_ids = torch.argmax(logits, dim=-1)
for sequence, predicted_class in zip(new_sequences, predicted_class_ids):
    print(f"Sequence: {sequence}, Predicted class: {predicted_class.item()}")
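The script above prints raw class IDs. As a minimal sketch, the IDs can be turned into readable labels with a confidence score. The class numbering (0, 1, 2) comes from this README; the label strings, the ID2LABEL dictionary, and the softmax and describe_prediction helpers are hypothetical names introduced here for illustration and are not part of the model or of transformers. The example uses plain Python lists so that it runs without downloading the model.

```python
import math

# Class IDs as described in this README; the label strings are illustrative.
ID2LABEL = {0: "enzyme", 1: "receptor protein", 2: "structural protein"}


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    class_id = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[class_id], probs[class_id]


# Made-up logits for a single sequence, used only to exercise the helper.
label, prob = describe_prediction([0.2, 2.1, -1.3])
print(f"{label} ({prob:.2f})")
```

With the real model, one row of the logits tensor could be passed in as `describe_prediction(logits[i].tolist())`. Since the model is described as not thoroughly tested, the probability is best read as a rough indication rather than a calibrated confidence.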
📄 License
This project is released under the MIT License.