KM45L6V2OC: Open-Source Software Requirement Classification Model for Distinguishing Functional and Non-Functional Requirements

KM45L6V2OC

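Developed by kasrahabib

A software requirement classification model fine-tuned from sentence-transformers/all-MiniLM-L6-v2 that distinguishes functional (F) from non-functional (NF) requirements.

Text classification

Transformers

Language: English. License: Apache-2.0. Tags: #requirement-classification #high-F1 #software-engineering

Downloads: 27

Released: 3/3/2023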

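Model Overview

This model is built for software requirement classification: it labels each requirement as functional or non-functional. It was trained on the Software Requirements Dataset (SWARD).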

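Model Features

High-accuracy classification

Reaches a macro F1 score of 0.99 on the evaluation set when separating functional from non-functional requirements.

Fine-tuned from an established model

Fine-tuned from sentence-transformers/all-MiniLM-L6-v2, so it builds on that model's pretrained representations.

Easy to use

Comes with a complete pipeline usage example, so it can be integrated into existing systems quickly.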

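Model Capabilities

Software requirement classification

Functional requirement identification

Non-functional requirement identification

Natural language processing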

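Use Cases

Software development

Requirements document analysis

Automatically analyze software requirements documents and classify each requirement as functional or non-functional.

Reported macro F1 score of 0.99 on the evaluation set.

Integration with requirements management tools

Integrate the model into a requirements management system to tag requirement types automatically.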

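🚀 kasrahabib/KM45L6V2OC

This model is a fine-tuned version of sentence-transformers/all-MiniLM-L6-v2, trained on the Software Requirements Dataset (SWARD) to classify software requirements as functional (F) or non-functional (NF). It achieves the following results on the evaluation set:

Train loss: 0.0107
Validation loss: 0.0404
Epoch: 14
Final macro F1 score: 0.99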
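
Macro F1 averages the per-class F1 scores with equal weight, so the F and NF classes count equally regardless of how many examples each has. A minimal sketch of that computation follows; the label lists are made-up illustrative values, not this model's evaluation data, and `f1_from_counts` and `macro_f1` are hypothetical helper names.

```python
def f1_from_counts(tp, fp, fn):
    """F1 for one class from true positives, false positives, false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=("F", "NF")):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores)


# Illustrative labels only (NOT the model's evaluation data)
y_true = ["F", "F", "F", "NF", "NF", "NF"]
y_pred = ["F", "F", "NF", "NF", "NF", "NF"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Here the F class has an F1 of 0.8 and the NF class an F1 of 6/7, and the macro score is their plain average.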

Labels: 0 or F -> functional; 1 or NF -> non-functional.
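
The label convention above can be captured in a small lookup so that downstream code accepts either the integer class ID or the short label. This is only a sketch; `describe_label` and the two dictionaries are hypothetical names, not part of the model's API.

```python
# Mapping taken from the label convention stated in this model card
ID2LABEL = {0: "F", 1: "NF"}
LABEL2NAME = {"F": "functional", "NF": "non-functional"}


def describe_label(label):
    """Accept a class ID (0/1) or a short label ('F'/'NF'); return the full name."""
    if isinstance(label, int):
        label = ID2LABEL[label]
    return LABEL2NAME[label]


print(describe_label(0))     # functional
print(describe_label("NF"))  # non-functional
```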

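🚀 Quick Start

✨ Key Features

Fine-tuned from a pretrained model, this classifier separates functional from non-functional software requirements and reaches a macro F1 score of 0.99 on the evaluation set.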

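📦 Installation

To use the model locally, follow these steps:

Clone the repository:

git lfs install
git clone url_of_repo

Locate the path of the downloaded directory.
Assign that path to the model_ckpt variable.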
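
The last two steps amount to pointing `model_ckpt` at the cloned directory. A minimal sketch, assuming the clone contains a `config.json` file at its top level (typical for Transformers model repositories, but an assumption here); `resolve_model_ckpt` and the example directory name are hypothetical.

```python
from pathlib import Path

HUB_ID = "kasrahabib/KM45L6V2OC"


def resolve_model_ckpt(local_dir):
    """Return the local clone path if it looks like a model directory, else the Hub ID."""
    path = Path(local_dir).expanduser()
    # Assumption: a usable clone has config.json at its top level
    if (path / "config.json").is_file():
        return str(path.resolve())
    return HUB_ID


# Hypothetical clone location; falls back to the Hub ID if it does not exist
model_ckpt = resolve_model_ckpt("./KM45L6V2OC")
print(model_ckpt)
```

The resulting `model_ckpt` string can then be used wherever the usage examples expect a checkpoint.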

💻 Usage Examples

Basic Usage

from transformers import pipeline

# Use the TensorFlow weights; model_ckpt can be a Hub ID or a local path
frame_work = 'tf'
task = 'text-classification'
model_ckpt = 'kasrahabib/KM45L6V2OC'

# Build the text-classification pipeline
software_requirement_cls = pipeline(task = task, model = model_ckpt, framework = frame_work)

example_1_f = 'The START NEW PROJECT function shall allow the user to create a new project.'
example_2_nf = 'The email string consists of x@x.x and is less than 31 characters in length and is not empty.'
# Classify both requirements in a single call
software_requirement_cls([example_1_f, example_2_nf])

Output:

[{'label': 'F', 'score': 0.9998922348022461},
 {'label': 'NF', 'score': 0.999846339225769}]
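
The pipeline returns one dictionary per input, each with a `label` and a `score`. One way to use the score is to route low-confidence predictions to manual review. The sketch below operates on the output shown above; the 0.9 threshold and the `flag_for_review` helper are illustrative assumptions, not recommendations from the model author.

```python
def flag_for_review(requirements, predictions, threshold=0.9):
    """Pair each requirement with its prediction and mark low-confidence ones."""
    rows = []
    for text, pred in zip(requirements, predictions):
        rows.append({
            "requirement": text,
            "label": pred["label"],
            "score": pred["score"],
            # Assumption: anything below the threshold deserves a human look
            "needs_review": pred["score"] < threshold,
        })
    return rows


requirements = [
    "The START NEW PROJECT function shall allow the user to create a new project.",
    "The email string consists of x@x.x and is less than 31 characters in length and is not empty.",
]
# Pipeline output copied from the example above
predictions = [
    {"label": "F", "score": 0.9998922348022461},
    {"label": "NF", "score": 0.999846339225769},
]

rows = flag_for_review(requirements, predictions)
for row in rows:
    print(row["label"], row["needs_review"])
```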

Advanced Usage

import numpy as np
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Load the tokenizer and the TensorFlow classification model
model_ckpt = 'kasrahabib/KM45L6V2OC'
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
model = TFAutoModelForSequenceClassification.from_pretrained(model_ckpt)

example_1_f = 'The START NEW PROJECT function shall allow the user to create a new project.'
example_2_nf = 'The email string consists of x@x.x and is less than 31 characters in length and is not empty.'
requirements = [example_1_f, example_2_nf]

# Tokenize the batch, padding to the longest requirement
encoded_requirements = tokenizer(requirements, return_tensors = 'np', padding = 'longest')

# Forward pass, then pick the highest-scoring class for each requirement
y_pred = model(encoded_requirements).logits
classifications = np.argmax(y_pred, axis = 1)

# Map class IDs to label names ('F' or 'NF')
classifications = [model.config.id2label[int(output)] for output in classifications]
print(classifications)

Output:

['F', 'NF']
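
Taking the argmax discards how confident the model is. Applying a softmax to each row of logits recovers per-class probabilities, similar in spirit to the `score` field in the pipeline output. The sketch below uses plain Python so it runs without loading the model; the logit values are made up for illustration and `softmax` is a helper defined here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


ID2LABEL = {0: "F", 1: "NF"}  # label convention from this model card

# Made-up logits for two requirements (NOT real model output)
batch_logits = [[4.2, -4.1], [-3.9, 4.0]]

for row in batch_logits:
    probs = softmax(row)
    best = max(range(len(probs)), key=probs.__getitem__)
    print(ID2LABEL[best], round(probs[best], 4))
```

Subtracting the row maximum before exponentiating leaves the result unchanged but avoids overflow for large logits.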

🔧 Technical Details

Training Hyperparameters

The following hyperparameters were used during training:

Optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 9030, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
Training precision: float32
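
The optimizer configuration specifies a `PolynomialDecay` learning-rate schedule with `power` 1.0, which amounts to a linear decay from 2e-05 to 0.0 over 9030 steps. The sketch below re-implements the non-cycling decay formula in plain Python to show the resulting values; `polynomial_decay` is a helper written for this illustration, not the Keras class itself.

```python
def polynomial_decay(step, initial_lr=2e-05, decay_steps=9030,
                     end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule."""
    step = min(step, decay_steps)  # the rate stays at end_lr after decay_steps
    fraction = 1 - step / decay_steps
    return (initial_lr - end_lr) * fraction ** power + end_lr


# Start, midpoint, end of the schedule, and a step past the end
for step in (0, 4515, 9030, 12000):
    print(step, polynomial_decay(step))
```

With `cycle` set to False, any step beyond `decay_steps` keeps the final learning rate.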