CareerBERT-G: an open-source German model designed for job matching and recommendation systems

CareerBERT-G

Developed by lwolfrum2

A German sentence-transformers model fine-tuned on the ESCO taxonomy, designed specifically for job matching and recommendation systems.

Text Embedding

Transformers

German #JobMatching #ESCOTaxonomy #GermanNLP

Downloads: 49

Released: 2/26/2025

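Model Overview

CareerBERT-G is a sentence-transformers model fine-tuned from deepset/gbert-base. It specializes in matching resumes to ESCO occupations and supports job consultation and job recommendation systems.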

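Model Features

Optimized for job matching

Tuned for job consultation and job recommendation scenarios, so that resumes can be matched against job descriptions.

ESCO taxonomy integration

Incorporates the European Skills, Competences, and Occupations (ESCO) taxonomy to provide standardized occupational representations.

Two-step evaluation

Evaluated in two steps: an application-grounded evaluation using EURES job advertisements, and a human-grounded evaluation with HR expert feedback.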

Model Capabilities

Sentence embedding generation

Text similarity computation

Occupational feature extraction

Resume-to-job matching

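Use Cases

Job consultation

Resume-to-job matching

Matches a job seeker's resume against ESCO occupations to produce relevant job recommendations.

Showed robust effectiveness in evaluations by HR experts.

Employment services

Job advertisement analysis

Analyzes job advertisements from platforms such as EURES and extracts standardized occupational features.

Outperformed both traditional and state-of-the-art embedding approaches in the paper's experiments.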

🚀 CareerBERT-G

CareerBERT-G is a sentence embedding model fine-tuned on the ESCO taxonomy. It uses deepset/gbert-base as its base model.

Corresponding paper: https://www.sciencedirect.com/science/article/pii/S0957417425006657

Property	Details
Pipeline tag	sentence-similarity
Tags	sentence-transformers, feature-extraction, sentence-similarity, transformers
Language	de
Base model	deepset/gbert-base

🚀 Quick Start

To use this model, follow the steps below.

📦 Installation

The model is easiest to use with sentence-transformers installed.

pip install -U sentence-transformers

💻 Usage Examples

Basic usage (Sentence-Transformers)

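from sentence_transformers import SentenceTransformer

# Example input sentences to embed
sentences = ["This is an example sentence", "Each sentence is converted"]

# '{MODEL_NAME}' is a template placeholder; replace it with this model's Hugging Face Hub ID
model = SentenceTransformer('{MODEL_NAME}')
embeddings = model.encode(sentences)  # one embedding vector per sentence
print(embeddings)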
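
Once sentences are encoded, resumes and ESCO job titles can be compared in the shared embedding space. The following is a minimal sketch of that ranking step, assuming cosine similarity as the score. The three-dimensional toy vectors, the job titles, and the helper names cosine_similarity and rank_jobs are hypothetical stand-ins for real model.encode output, not part of the model card.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_jobs(resume_embedding, job_embeddings):
    """Return (job_title, score) pairs sorted from most to least similar."""
    scored = [
        (title, cosine_similarity(resume_embedding, embedding))
        for title, embedding in job_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): real embeddings from this model have 768 dimensions.
resume = [0.9, 0.1, 0.2]
jobs = {
    "Softwareentwickler": [0.8, 0.2, 0.1],
    "Krankenpfleger": [0.1, 0.9, 0.3],
    "Buchhalter": [0.3, 0.2, 0.9],
}

ranking = rank_jobs(resume, jobs)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

In practice the toy vectors would be replaced by the rows returned from model.encode for the resume text and for each job title or description.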

Advanced usage (HuggingFace Transformers)

To use the model without sentence-transformers, pass the input through the transformer model and then apply the appropriate pooling operation on top of the contextualized word embeddings.

from transformers import AutoTokenizer, AutoModel
import torch


#Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0] #First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from HuggingFace Hub
# ('{MODEL_NAME}' is a template placeholder; replace it with this model's Hub ID)
tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
model = AutoModel.from_pretrained('{MODEL_NAME}')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling.
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)
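
To see why the attention mask matters in the pooling step, here is a small dependency-free sketch of the same masked average on a single sentence. The function name mean_pool and the two-dimensional toy vectors are illustrative assumptions; the logic mirrors the torch version above, including the 1e-9 guard against division by zero.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, counting only positions where the mask is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vector):
                summed[i] += value
    # Same guard as torch.clamp(..., min=1e-9) in the torch version
    count = max(sum(attention_mask), 1e-9)
    return [s / count for s in summed]


# Two real tokens followed by one padding token (toy values, assumption)
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
naive = [sum(column) / len(tokens) for column in zip(*tokens)]

print("masked mean:", pooled)
print("naive mean: ", naive)
```

The masked mean ignores the padding vector, while the naive mean is pulled toward it. This is the reason padded batches need the mask for correct sentence embeddings.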

📚 Documentation

Evaluation Results

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training

The model was trained with the following parameters.

DataLoader:

torch.utils.data.dataloader.DataLoader of length 3695 with parameters:

{'batch_size': 32, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}

Loss:

sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss with parameters:

{'scale': 20.0, 'similarity_fct': 'cos_sim'}
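
As a rough illustration of what this loss computes, the sketch below treats each (anchor, positive) pair in a batch as the correct match and every other positive in the batch as a negative. Cosine similarities are multiplied by the scale (20.0 above) and fed to a cross-entropy whose target is the matching index. This is a simplified pure-Python sketch with hypothetical two-dimensional vectors, not the library's implementation.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over rows of the scaled similarity matrix.

    Row i holds the similarities of anchor i to every positive in the batch;
    the target class for row i is column i (its own positive).
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, positive) for positive in positives]
        # Numerically stable log-sum-exp
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (assumption): anchors could be resumes, positives their job descriptions.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive is close to its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive is close to the other anchor

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_swapped = multiple_negatives_ranking_loss(anchors, swapped)
print("aligned pairs:", loss_aligned)
print("swapped pairs:", loss_swapped)
```

The loss is near zero when each anchor is most similar to its own positive and grows large when the pairing is wrong, which is what pushes matching texts together in the embedding space.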

Parameters of the fit() method:

{
    "epochs": 1,
    "evaluation_steps": 0,
    "evaluator": "sentence_transformers.evaluation.RerankingEvaluator.RerankingEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 11821.1,
    "weight_decay": 0.01
}
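
The "WarmupLinear" scheduler named above is, as far as I understand it, a linear warmup from zero to the peak learning rate followed by a linear decay to zero. The sketch below shows that shape. The function name warmup_linear_factor and the schedule sizes (100 warmup steps, 1000 total steps) are hypothetical values chosen for readability, not the values used to train this model; only the base learning rate 2e-05 comes from the parameters above.

```python
def warmup_linear_factor(step, warmup_steps, total_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp from 0 up to 1 during warmup
        return step / max(1.0, warmup_steps)
    # Linear decay from 1 down to 0 over the remaining steps
    return max(0.0, (total_steps - step) / max(1.0, total_steps - warmup_steps))


base_lr = 2e-05          # "lr" from optimizer_params above
warmup_steps = 100       # illustrative assumption
total_steps = 1000       # illustrative assumption

schedule = {
    step: base_lr * warmup_linear_factor(step, warmup_steps, total_steps)
    for step in (0, 50, 100, 550, 1000)
}
for step, lr in schedule.items():
    print(f"step {step:4d}: lr = {lr:.2e}")
```

The rate peaks exactly at the end of warmup and reaches zero at the final step.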

Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)

Citation and Authors

@article{ROSENBERGER2025127043,
title = {CareerBERT: Matching resumes to ESCO jobs in a shared embedding space for generic job recommendations},
journal = {Expert Systems with Applications},
volume = {275},
pages = {127043},
year = {2025},
issn = {0957-4174},
doi = {https://doi.org/10.1016/j.eswa.2025.127043},
url = {https://www.sciencedirect.com/science/article/pii/S0957417425006657},
author = {Julian Rosenberger and Lukas Wolfrum and Sven Weinzierl and Mathias Kraus and Patrick Zschech},
keywords = {Job consultation, Job markets, Job recommendation system, BERT, NLP},
abstract = {The rapidly evolving labor market, driven by technological advancements and economic shifts, presents significant challenges for traditional job matching and consultation services. In response, we introduce an advanced support tool for career counselors and job seekers based on CareerBERT, a novel approach that leverages the power of unstructured textual data sources, such as resumes, to provide more accurate and comprehensive job recommendations. In contrast to previous approaches that primarily focus on job recommendations based on a fixed set of concrete job advertisements, our approach involves the creation of a corpus that combines data from the European Skills, Competences, and Occupations (ESCO) taxonomy and EURopean Employment Services (EURES) job advertisements, ensuring an up-to-date and well-defined representation of general job titles in the labor market. Our two-step evaluation approach, consisting of an application-grounded evaluation using EURES job advertisements and a human-grounded evaluation using real-world resumes and Human Resources (HR) expert feedback, provides a comprehensive assessment of CareerBERT’s performance. Our experimental results demonstrate that CareerBERT outperforms both traditional and state-of-the-art embedding approaches while showing robust effectiveness in human expert evaluations. These results confirm the effectiveness of CareerBERT in supporting career consultants by generating relevant job recommendations based on resumes, ultimately enhancing the efficiency of job consultations and expanding the perspectives of job seekers. This research contributes to the field of NLP and job recommendation systems, offering valuable insights for both researchers and practitioners in the domain of career consulting and job matching.}
}