🚀 denis-gordeev/rured2-ner-microsoft-mdeberta-v3-base
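This model performs named entity recognition (NER) for Russian. It is fine-tuned from microsoft/mdeberta-v3-base on a dataset that this card does not name. On the evaluation set it reaches an F1 Micro of 0.5837 and an F1 Macro of 0.3969, with much higher scores on some tags (B-person F1 Micro: 0.9639). It covers a broad set of entity types, including persons, organizations, locations, dates, and monetary amounts.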
🚀 Quick Start
The following code loads the model and defines a helper that tags a piece of text.
Basic usage
# Basic example: run named entity recognition with this model.
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "denis-gordeev/rured2-ner-microsoft-mdeberta-v3-base"
# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForTokenClassification.from_pretrained(model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)


def predict(text: str, glue_tokens=False, output_together=True, glue_words=True):
    # Note: glue_tokens is accepted but not used by this function.
    sigmoid = nn.Sigmoid()
    tokenized = tokenizer(text)
    input_ids = torch.tensor(
        [tokenized["input_ids"]], dtype=torch.long
    ).to(device)
    token_type_ids = torch.tensor(
        [tokenized["token_type_ids"]], dtype=torch.long
    ).to(device)
    attention_mask = torch.tensor(
        [tokenized["attention_mask"]], dtype=torch.long
    ).to(device)
    # Inference only, so gradient tracking is switched off.
    with torch.no_grad():
        preds = model(
            input_ids=input_ids,
            token_type_ids=token_type_ids,
            attention_mask=attention_mask,
        )
    # Multi-label scoring: each label gets an independent probability.
    logits = sigmoid(preds.logits)

    output_tokens = []
    output_preds = []
    id_to_label = {int(k): v for k, v in model.config.id2label.items()}
    for i, token in enumerate(input_ids[0]):
        # Token ids 0 to 3 are skipped as special tokens.
        if token > 3:
            # Keep every label above the 0.5 threshold ...
            class_ids = (logits[0][i] > 0.5).nonzero()
            if class_ids.shape[0] >= 1:
                class_names = [id_to_label[int(cl)] for cl in class_ids]
            else:
                # ... or fall back to the single highest-scoring label.
                class_names = [id_to_label[int(logits[0][i].argmax())]]
            converted_token = tokenizer.convert_ids_to_tokens([int(token)])[0]
            # A leading "▁" marks the first sub-token of a new word.
            new_word_bool = converted_token.startswith("▁")
            converted_token = converted_token.replace("▁", "")
            if glue_words and not new_word_bool and output_tokens:
                # Continuation piece: append it to the previous word,
                # which keeps the labels of its first sub-token.
                output_tokens[-1] += converted_token
            else:
                output_tokens.append(converted_token)
                output_preds.append(class_names)
    if output_together:
        return [[output_tokens[t_i], output_preds[t_i]] for t_i in range(len(output_tokens))]
    return output_tokens, output_preds
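
With `output_together=True`, `predict` returns a list of `[word, [label, ...]]` pairs. A common next step is to merge consecutive tagged words into entity spans. The sketch below is one possible way to do that, not part of the original card. It assumes the label strings follow the `B-<type>` / `I-<type>` / `O` pattern seen in the metric names, and it uses only the first label of each word, which is a simplification for multi-label output. `group_entities` and `sample` are hypothetical names, and `sample` is hand-written illustration data rather than real model output.

```python
from typing import List, Sequence, Tuple


def group_entities(pairs: Sequence[Sequence]) -> List[Tuple[str, str]]:
    """Merge consecutive B-/I- tagged words into (entity_type, text) spans."""
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_words: List[str] = []

    def flush() -> None:
        # Close the span that is currently open, if any.
        if current_words:
            entities.append((current_type, " ".join(current_words)))

    for word, labels in pairs:
        # Simplification: look only at the first predicted label.
        label = labels[0] if labels else "O"
        prefix, _, ent_type = label.partition("-")
        if prefix == "I" and ent_type == current_type and current_words:
            # Continuation of the open span.
            current_words.append(word)
        elif prefix in ("B", "I"):
            # A B- tag, or an I- tag with no matching open span, starts a span.
            flush()
            current_type, current_words = ent_type, [word]
        else:
            # "O" or an unexpected label closes any open span.
            flush()
            current_type, current_words = None, []
    flush()
    return entities


# Hand-written example in the shape returned by predict(); not real model output.
sample = [
    ["Иван", ["B-person"]],
    ["Петров", ["I-person"]],
    ["работает", ["O"]],
    ["в", ["O"]],
    ["Москве", ["B-city"]],
]
print(group_entities(sample))
```

In practice the input would come from `predict(text)`. If the label names in `model.config.id2label` differ from the pattern assumed here, the prefix parsing needs to be adjusted.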
✨ Key Features
- The model is fine-tuned from microsoft/mdeberta-v3-base.
- It achieves the following results on the evaluation set:
  - Loss: 0.0096
  - F1 Micro: 0.5837
  - O F1 Micro: 0.6370
- Per-tag metrics are also reported; B-person F1 Micro, for example, is 0.9639.
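📚 Documentation
Model details
This model is a fine-tuned version of microsoft/mdeberta-v3-base. It was trained on a dataset that this card does not name and achieves the following results on the evaluation set.
Metric | Value |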
---|---|
Loss | 0.0096 |
F1 Micro | 0.5837 |
O F1 Micro | 0.6370 |
O Recall Micro | 0.9242 |
O Precision Micro | 0.4860 |
B-person F1 Micro | 0.9639 |
Many others... | ... |
F1 Macro | 0.3969 |
Recall Macro | 0.5603 |
Precision Macro | 0.3447 |
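
As a quick consistency check on the table above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported values. `f1_from_precision_recall` is a hypothetical helper written for this illustration, not something shipped with the model.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation table.
o_f1 = f1_from_precision_recall(precision=0.4860, recall=0.9242)
macro_hm = f1_from_precision_recall(precision=0.3447, recall=0.5603)

print(f"O F1 from precision/recall: {o_f1:.4f}")             # 0.6370
print(f"Harmonic mean of macro precision/recall: {macro_hm:.4f}")  # 0.4268
```

The O value reproduces the reported O F1 Micro of 0.6370. The macro value does not reproduce the reported F1 Macro of 0.3969. That would be consistent with F1 Macro being an average of per-tag F1 scores rather than the harmonic mean of the two macro averages, although the card does not state how it was computed.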
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10000
- mixed_precision_training: Native AMP
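
To make `lr_scheduler_type: linear` concrete, the sketch below computes a learning rate that decays linearly from the listed `learning_rate` of 1e-05 to zero. It assumes no warmup, since the list above mentions none. The total step count is a hypothetical placeholder, because the card does not give the real number of optimizer steps. `linear_lr` is an illustrative helper, not the trainer's own implementation.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-05) -> float:
    """Learning rate after `step` updates under linear decay to zero (no warmup)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Fraction of training that is still ahead, clamped at zero.
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining


total_steps = 1000  # hypothetical; the real step count is not given in the card
for step in (0, 250, 500, 1000):
    print(step, linear_lr(step, total_steps))
```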
Training results
The training log tracks the columns listed below. No per-step rows are available here, so only the column names and the final macro-averaged scores are shown.

- Training Loss, Epoch, Step, Validation Loss, F1 Micro
- F1 Micro, Recall Micro, and Precision Micro for each of the following tags, in column order: O, B-person, B-norp, B-commodity, B-date, I-date, B-country, B-economic Sector, I-economic Sector, B-news Source, B-profession, I-news Source, I-person, B-organization, I-profession, B-event, B-city, B-gpe, I-event, B-group, B-ordinal, B-product, I-organization, B-money, I-money, B-currency, B-percent, I-percent, I-group, B-cardinal, B-law, I-law, B-fac, I-fac, B-age, I-city, B-work Of Art, I-work Of Art, B-region, I-region, I-cardinal, I-currency, B-quantity, I-quantity, B-crime, I-crime, B-trade Agreement, B-nationality, B-family, I-family, I-product, B-time, I-time, I-commodity, B-application, I-application, I-country, B-award, I-award, I-gpe, B-location, I-location, I-ordinal, I-trade Agreement, B-religion, I-age, B-investment Program, I-investment Program, B-borough, B-price, I-price, B-character, I-character, B-website, B-street, I-street, B-village, I-village, B-disease, I-disease, B-penalty, I-penalty, B-weapon, I-weapon, I-borough, B-vehicle, I-vehicle, B-language, I-language, B-house, I-norp, I-house, I-website
- F1 Macro, Recall Macro, Precision Macro

Metric | Value |
---|---|
F1 Macro | 0.3969 |
Recall Macro | 0.5603 |
Precision Macro | 0.3447 |
📄 License
This model is released under the MIT license.








