🚀 Qwen2.5-14B-CIC-ACLARC
This is a model fine-tuned for citation intent classification, built on Qwen 2.5 14B Instruct and trained on the ACL-ARC dataset.
GGUF version: https://huggingface.co/sknow-lab/Qwen2.5-14B-CIC-ACLARC-GGUF
🚀 Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sknow-lab/Qwen2.5-14B-CIC-ACLARC"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

system_prompt = """
# CONTEXT #
You are an expert researcher tasked with classifying the intent of a citation in a scientific publication.

########

# OBJECTIVE #
You will be given a sentence containing a citation, you must output the appropriate class as an answer.

########

# CLASS DEFINITIONS #
The six (6) possible classes are the following: "BACKGROUND", "MOTIVATION", "USES", "EXTENDS", "COMPARES_CONTRASTS", "FUTURE".

The definitions of the classes are:
1 - BACKGROUND: The cited paper provides relevant Background information or is part of the body of literature.
2 - MOTIVATION: The citing paper is directly motivated by the cited paper.
3 - USES: The citing paper uses the methodology or tools created by the cited paper.
4 - EXTENDS: The citing paper extends the methods, tools or data, etc. of the cited paper.
5 - COMPARES_CONTRASTS: The citing paper expresses similarities or differences to, or disagrees with, the cited paper.
6 - FUTURE: The cited paper may be a potential avenue for future work.

########

# RESPONSE RULES #
- Analyze only the citation marked with the @@CITATION@@ tag.
- Assign exactly one class to each citation.
- Respond only with the exact name of one of the following classes: "BACKGROUND", "MOTIVATION", "USES", "EXTENDS", "COMPARES_CONTRASTS", "FUTURE".
- Do not provide any explanation or elaboration.
"""

test_citing_sentence = "However , the method we are currently using in the ATIS domain ( @@CITATION@@ ) represents our most promising approach to this problem."

user_prompt = f"""
{test_citing_sentence}

### Question: Which is the most likely intent for this citation?
a) BACKGROUND
b) MOTIVATION
c) USES
d) EXTENDS
e) COMPARES_CONTRASTS
f) FUTURE

### Answer:
"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated answer is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
Details about the system prompt and query templates can be found in the paper.
A cleanup function may be needed to extract the predicted label from the output. You can find our implementation on GitHub.
📚 Documentation
ACL-ARC Classes

| Class | Description |
|-------|-------------|
| BACKGROUND | The cited paper provides relevant background information or is part of the body of literature. |
| MOTIVATION | The citing paper is directly motivated by the cited paper. |
| USES | The citing paper uses the methodology or tools created by the cited paper. |
| EXTENDS | The citing paper extends the methods, tools, or data of the cited paper. |
| COMPARES_CONTRASTS | The citing paper expresses similarities or differences to, or disagrees with, the cited paper. |
| FUTURE | The cited paper may be a potential avenue for future work. |
📄 License
This project is licensed under the Apache-2.0 License.
📚 Citation
```bibtex
@misc{koloveas2025llmspredictcitationintent,
      title={Can LLMs Predict Citation Intent? An Experimental Analysis of In-context Learning and Fine-tuning on Open LLMs},
      author={Paris Koloveas and Serafeim Chatzopoulos and Thanasis Vergoulis and Christos Tryfonopoulos},
      year={2025},
      eprint={2502.14561},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.14561},
}
```