CareerBERT-JG: Open-Source Job-Search Model for Career Counseling and Targeted Job Recommendations


CareerBERT-JG

Developed by lwolfrum2

CareerBERT-JG is a sentence-transformers model fine-tuned on the ESCO taxonomy, designed for career counseling and job recommendation.

Text Embedding, German #JobRecommendation #ESCOTaxonomy #ResumeMatching

Downloads: 309

Released: 2/26/2025

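Model Overview

The model is built on agne/jobGBERT. It computes sentence similarity and supports applications such as career counseling and job recommendation.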

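Model Features

Fine-tuned on the ESCO taxonomy

Fine-tuned specifically on the European Skills, Competences, and Occupations (ESCO) taxonomy, which makes it suited to analyzing the European labor market.

Occupation embedding space

Maps resumes and job descriptions into a shared embedding space so that they can be matched directly.

Efficient pooling

Applies mean pooling to the token embeddings, taking the attention mask into account so that padding tokens do not skew the average.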

Model Capabilities

Sentence embedding generation

Text similarity computation

Occupation relevance analysis

Resume-to-job matching

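Use Cases

Career counseling

Resume-to-job matching

Recommends the most relevant ESCO occupations based on the content of a job seeker's resume.

Showed better matching than traditional methods in expert evaluation.

Career development advice

Analyzes how well a person's existing skills match the requirements of a target position.

Helps career counselors give data-driven advice.

Recruitment systems

Automated resume screening

Quickly matches large numbers of resumes against job requirements.

Improves the efficiency of HR departments.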
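The matching use cases above come down to ranking candidate occupations by their similarity to a resume embedding. The following is a minimal sketch of that idea, not the authors' pipeline: the helper names `cosine_similarity` and `rank_occupations`, the occupation labels, and the 3-dimensional toy vectors are all hypothetical stand-ins for real 768-dimensional embeddings produced by the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_occupations(resume_embedding, occupation_embeddings, top_k=3):
    """Return the top_k (label, score) pairs, most similar first."""
    scored = [
        (label, cosine_similarity(resume_embedding, vector))
        for label, vector in occupation_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy vectors; real embeddings would come from model.encode(...).
occupations = {
    "software developer": [0.9, 0.1, 0.0],
    "data analyst": [0.6, 0.6, 0.1],
    "nurse": [0.0, 0.2, 0.9],
}
resume = [0.8, 0.3, 0.0]

for label, score in rank_occupations(resume, occupations, top_k=2):
    print(f"{label}: {score:.3f}")
```

In practice the occupation embeddings would be computed once and cached, since only the resume embedding changes between queries.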

🚀 CareerBERT-JG

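CareerBERT-JG is a sentence-transformers model fine-tuned on the ESCO taxonomy. It uses agne/jobGBERT as its base model and can be used to compute sentence similarity in support of career counseling and job recommendation.

🚀 Quick Start

The model can be used through either the sentence-transformers library or the HuggingFace Transformers library. Both approaches are described below.

📦 Installation

To use the model through sentence-transformers, install the library first: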

pip install -U sentence-transformers

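💻 Usage Examples

Basic Usage (Sentence-Transformers)

With sentence-transformers installed, using the model is straightforward (replace the {MODEL_NAME} placeholder with the model's HuggingFace Hub identifier):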

from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('{MODEL_NAME}')
embeddings = model.encode(sentences)
print(embeddings)

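Advanced Usage (HuggingFace Transformers)

Without sentence-transformers, you can use the model as follows: first pass your input through the transformer model, then apply the appropriate pooling operation to the contextualized word embeddings.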

from transformers import AutoTokenizer, AutoModel
import torch


#Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0] #First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
model = AutoModel.from_pretrained('{MODEL_NAME}')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling.
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)

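📚 Detailed Documentation

Evaluation Results

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training

The model was trained with the following parameters:

DataLoader: torch.utils.data.dataloader.DataLoader of length 3695 with parameters: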

{'batch_size': 32, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}

Loss: sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss with parameters:

{'scale': 20.0, 'similarity_fct': 'cos_sim'}
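To make the loss settings concrete, here is a minimal pure-Python sketch of the idea behind a multiple-negatives ranking loss with `scale` 20.0 and cosine similarity. It is an illustration under stated assumptions, not the library's implementation: the function names and the 2-dimensional toy vectors are hypothetical, and the vectors are assumed to be non-zero. For each anchor, the paired positive is treated as the correct class and every other positive in the batch serves as a negative; the loss is the cross-entropy over the scaled similarity scores.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and all other positives[j] in the batch act as in-batch negatives."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, positive) for positive in positives]
        # Numerically stable log-sum-exp over the scaled similarities.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(z - max_logit) for z in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy batch: each anchor is close to its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = multiple_negatives_ranking_loss(anchors, positives)
# Reversing the positives pairs each anchor with the wrong text.
shuffled_loss = multiple_negatives_ranking_loss(anchors, positives[::-1])
print(aligned_loss, shuffled_loss)
```

The loss is small when each anchor is most similar to its own positive and grows when the pairing is wrong, which is why a larger batch supplies more in-batch negatives per training step.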

Parameters of the fit() method:

{
    "epochs": 1,
    "evaluation_steps": 0,
    "evaluator": "sentence_transformers.evaluation.RerankingEvaluator.RerankingEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 11821.1,
    "weight_decay": 0.01
}
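The `WarmupLinear` scheduler named above is commonly understood as a linear ramp from zero to the base learning rate over `warmup_steps`, followed by a linear decay to zero at the end of training. The sketch below illustrates that shape in plain Python. The function names are hypothetical, and the default `warmup_steps=100` and `total_steps=1000` are illustrative values, not the ones used to train this model; only `base_lr=2e-5` is taken from the parameters listed above.

```python
def warmup_linear_factor(step, warmup_steps, total_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp from 0 up to 1 during warm-up.
        return step / max(1.0, warmup_steps)
    # Linear decay from 1 down to 0 over the remaining steps.
    return max(0.0, (total_steps - step) / max(1.0, total_steps - warmup_steps))


def learning_rate(step, base_lr=2e-5, warmup_steps=100, total_steps=1000):
    """Learning rate at a given step; the step counts are illustrative assumptions."""
    return base_lr * warmup_linear_factor(step, warmup_steps, total_steps)


for step in (0, 50, 100, 550, 1000):
    print(step, learning_rate(step))
```

Taken at face value, the listed `warmup_steps` of 11821.1 is larger than the 3695 batches in one epoch, so under this sketch the rate would still be ramping up when training ends. This is an observation about the listed numbers, not a statement about how training actually ran.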

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)

Citing & Authors

If you use this model in your research, please cite the following paper:

@article{ROSENBERGER2025127043,
title = {CareerBERT: Matching resumes to ESCO jobs in a shared embedding space for generic job recommendations},
journal = {Expert Systems with Applications},
volume = {275},
pages = {127043},
year = {2025},
issn = {0957-4174},
doi = {https://doi.org/10.1016/j.eswa.2025.127043},
url = {https://www.sciencedirect.com/science/article/pii/S0957417425006657},
author = {Julian Rosenberger and Lukas Wolfrum and Sven Weinzierl and Mathias Kraus and Patrick Zschech},
keywords = {Job consultation, Job markets, Job recommendation system, BERT, NLP},
abstract = {The rapidly evolving labor market, driven by technological advancements and economic shifts, presents significant challenges for traditional job matching and consultation services. In response, we introduce an advanced support tool for career counselors and job seekers based on CareerBERT, a novel approach that leverages the power of unstructured textual data sources, such as resumes, to provide more accurate and comprehensive job recommendations. In contrast to previous approaches that primarily focus on job recommendations based on a fixed set of concrete job advertisements, our approach involves the creation of a corpus that combines data from the European Skills, Competences, and Occupations (ESCO) taxonomy and EURopean Employment Services (EURES) job advertisements, ensuring an up-to-date and well-defined representation of general job titles in the labor market. Our two-step evaluation approach, consisting of an application-grounded evaluation using EURES job advertisements and a human-grounded evaluation using real-world resumes and Human Resources (HR) expert feedback, provides a comprehensive assessment of CareerBERT's performance. Our experimental results demonstrate that CareerBERT outperforms both traditional and state-of-the-art embedding approaches while showing robust effectiveness in human expert evaluations. These results confirm the effectiveness of CareerBERT in supporting career consultants by generating relevant job recommendations based on resumes, ultimately enhancing the efficiency of job consultations and expanding the perspectives of job seekers. This research contributes to the field of NLP and job recommendation systems, offering valuable insights for both researchers and practitioners in the domain of career consulting and job matching.}
}