FewShotIssueClassifier-NLBSE23: an open-source model for classifying issue reports into four categories

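Developed by PeppoCola

A sentence-similarity model built on Sentence Transformers and fine-tuned for issue report classification into four classes: bug, documentation, feature, and question.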

Text classification

PyTorch

Language: English · License: GPL-3.0 · Tags: #issue-report-classification #few-shot-learning #sentence-similarity

Released: 3/21/2023

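Model overview

The model maps sentences and paragraphs to a 768-dimensional dense vector space. It is fine-tuned for issue report classification, and its embeddings can also be used for clustering or semantic search.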

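Model features

Few-shot learning

Tuned for small amounts of labeled data, which suits settings where annotation resources are limited.

Multi-class classification

Recognizes four issue report types: bug, documentation, feature, and question.

High-dimensional semantic encoding

Maps text to a 768-dimensional dense vector space that preserves semantic information.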

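Model capabilities

Text classification

Semantic similarity computation

Automatic issue report categorization

Short-text vectorization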

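Use cases

Software development support

Automatic issue report classification

Automatically assigns user-submitted issue reports to predefined categories.

Reduces manual triage effort and speeds up issue tracking.

Bug report analysis

Identifies genuine bug reports among a large volume of issue reports.

Helps development teams prioritize critical problems.

Text analysis

Semantic clustering

Automatically groups similar issue reports.

Surfaces duplicate issues or related requests.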
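
As a rough illustration of the duplicate-detection use case, the sketch below flags pairs of issue reports whose embedding vectors have a high cosine similarity. It is not taken from the model card: the helper names `cosine_similarity` and `find_similar_pairs`, the 0.9 threshold, and the toy 3-dimensional vectors are all assumptions made for the example. In practice the vectors would be the 768-dimensional embeddings produced by a Sentence Transformers encoder (for instance through `SentenceTransformer.encode`).

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def find_similar_pairs(
    embeddings: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[Tuple[int, int, float]]:
    """Return (i, j, similarity) for every pair at or above the threshold."""
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            sim = cosine_similarity(embeddings[i], embeddings[j])
            if sim >= threshold:
                pairs.append((i, j, sim))
    return pairs


# Toy 3-dimensional vectors standing in for real 768-dimensional embeddings.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
similar = find_similar_pairs(toy_embeddings, threshold=0.9)
print(similar)
```

The pairwise loop is quadratic in the number of reports, which is acceptable for a small backlog; a larger corpus would likely call for an approximate nearest-neighbor index instead.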

🚀 FewShotIssueClassifier-NLBSE23

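This is a SetFit model built on Sentence Transformers. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks such as clustering or semantic search.

This particular model is fine-tuned for issue report classification into four classes: bug, documentation, feature, and question.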

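🚀 Quick start

The model maps sentences and paragraphs to a 768-dimensional dense vector space, which suits tasks such as clustering or semantic search. It is fine-tuned specifically to classify issue reports as bug, documentation, feature, or question.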

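✨ Key features

Uses Sentence Transformers to map text to a 768-dimensional dense vector space.
Fine-tuned for issue report classification into four classes: bug, documentation, feature, and question.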

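📦 Installation

The original model card lists no installation commands. The usage example requires the `setfit` Python package, which can be installed from PyPI with `pip install setfit` and pulls in `sentence-transformers` as a dependency.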

💻 Usage example

Basic usage

from setfit import SetFitModel

# Example issue texts to classify
sentences = ["error in line 20", "add method list_features"]

# Map the integer class ids predicted by the model to label names
label_mapping = {
    0: "bug",
    1: "documentation",
    2: "feature",
    3: "question",
}

# Load the fine-tuned SetFit model from the Hugging Face Hub
model = SetFitModel.from_pretrained("PeppoCola/FewShotIssueClassifier-NLBSE23")
predictions = model.predict(sentences)
# int() lets tensor or NumPy scalar predictions be used as dictionary keys
print([label_mapping[int(i)] for i in predictions])
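
Building on the usage example above, predictions can be turned into a simple triage view by grouping issue texts under their predicted label. The following is a minimal sketch, not part of the original model card: `group_by_label` is a hypothetical helper, and `fake_predictions` is a hand-written stand-in for the output of `model.predict(...)` so the snippet runs without downloading the model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Same id-to-name mapping as in the usage example.
LABEL_MAPPING = {0: "bug", 1: "documentation", 2: "feature", 3: "question"}


def group_by_label(texts: List[str], predictions: Iterable) -> Dict[str, List[str]]:
    """Group issue texts under the label name of their predicted class id."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for text, pred in zip(texts, predictions):
        # int() accepts plain ints as well as NumPy or tensor scalars.
        grouped[LABEL_MAPPING[int(pred)]].append(text)
    return dict(grouped)


# Hypothetical predictions standing in for model.predict(issues).
issues = ["error in line 20", "add method list_features", "crash on startup"]
fake_predictions = [0, 2, 0]
triaged = group_by_label(issues, fake_predictions)
print(triaged.get("bug", []))
```

Selecting `triaged.get("bug", [])` gives the subset a team might review first, which mirrors the bug report analysis use case described earlier.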

📚 Detailed documentation

Dataset

The model was trained on a subset of the NLBSE23 dataset. The sample was manually labeled and is available on Zenodo.
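
The note in the Zenodo dataset citation below says to merge the CSV with the original dataset after removing duplicates on the 'id' column. One possible reading of that instruction is sketched here with plain Python dictionaries: deduplicate the original rows by 'id', then join the labeled sample onto them. The function names, the inner-join behavior, and every column other than 'id' are assumptions made for illustration; the actual file layout is not described in this document.

```python
from typing import Dict, List

Row = Dict[str, str]


def drop_duplicate_ids(rows: List[Row]) -> List[Row]:
    """Keep the first row seen for each 'id' value."""
    seen = set()
    unique = []
    for row in rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            unique.append(row)
    return unique


def merge_on_id(original: List[Row], sample: List[Row]) -> List[Row]:
    """Inner-join the labeled sample with the deduplicated original on 'id'."""
    by_id = {row["id"]: row for row in drop_duplicate_ids(original)}
    merged = []
    for row in sample:
        if row["id"] in by_id:
            # Columns from the sample override same-named original columns.
            merged.append({**by_id[row["id"]], **row})
    return merged


# Hypothetical rows; every column other than 'id' is invented for illustration.
original_rows = [
    {"id": "1", "title": "error in line 20"},
    {"id": "1", "title": "error in line 20"},
    {"id": "2", "title": "add method list_features"},
]
sample_rows = [{"id": "1", "manual_label": "bug"}]
merged_rows = merge_on_id(original_rows, sample_rows)
print(merged_rows)
```

With real CSV files, the rows could be loaded through `csv.DictReader` from the standard library before being passed to these helpers.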

Citing & authors

@software{Colavito_Few-Shot_Learning_for_2023,
	title        = {{Few-Shot Learning for Issue Report Classification}},
	author       = {Colavito, Giuseppe and Lanubile, Filippo and Novielli, Nicole},
	year         = 2023,
	month        = 2,
	url          = {https://github.com/collab-uniba/Issue-Report-Classification-NLBSE2023},
	version      = {1.0.0}
}

@dataset{colavito_giuseppe_2023_7628150,
  author       = {Colavito Giuseppe and
                  Lanubile Filippo and
                  Novielli Nicole},
  title        = {Few-Shot Learning for Issue Report Classification},
  month        = feb,
  year         = 2023,
  note         = {{To use this, merge the CSV with the original 
                   dataset (after removing duplicates on the 'id'
                   column)}},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.7628150},
  url          = {https://doi.org/10.5281/zenodo.7628150}
}

@inproceedings{Colavito-2023,
	title        = {Few-Shot Learning for Issue Report Classification},
	author       = {Colavito, Giuseppe and Lanubile, Filippo and Novielli, Nicole},
	year         = 2023,
	booktitle    = {2nd International Workshop on Natural Language-Based Software Engineering (NLBSE)}
}