Qwen2.5-14B-CIC-ACLARC: an open-source model, free to deploy, for classifying citation intent in scientific publications

Qwen2.5 14B CIC ACLARC

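Developed by sknow-lab

A citation intent classification model fine-tuned from Qwen 2.5 14B Instruct, specialized for classifying citation intent in scientific publications.

Text classification

Transformers

Language: English | License: Apache-2.0 | Tags: #CitationIntentClassification #ScientometricAnalysis #ZeroShotInference

Downloads: 24

Release date: 2/21/2025

Model overview

This model was fine-tuned on the ACL-ARC dataset to classify citation intent in scientific publications. It supports six citation intent categories.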

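Model features

Zero-shot classification

Supports zero-shot classification, so the classification task can be run without additional training.

Application to scientometrics

Optimized specifically for scientometrics and citation analysis.

Multi-class classification

Classifies citations into six citation intent categories.

Model capabilities

Citation intent classification

Zero-shot classification

Scientific text analysis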

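Use cases

Academic research

Citation intent analysis

Analyzes citation intent in scientific papers and helps researchers understand how publications cite one another.

Accurately classifies citations into the six citation intent categories.

Bibliometrics

Literature impact assessment

Uses citation intent classification to assess the impact of a publication and how it is used.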
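One way to turn per-citation predictions into an impact signal is to tally the predicted intents for each cited paper. The sketch below is only an illustration: `intent_profile` is a hypothetical helper, and the paper identifiers and labels are made-up example data, not model output.

```python
from collections import Counter, defaultdict

def intent_profile(predictions):
    """Count predicted intents per cited paper.

    predictions: iterable of (cited_paper_id, label) pairs.
    Returns a dict mapping each cited paper to a Counter of labels.
    """
    profile = defaultdict(Counter)
    for cited_id, label in predictions:
        profile[cited_id][label] += 1
    return profile

# Made-up example data, for illustration only.
predictions = [
    ("paper_A", "USES"),
    ("paper_A", "USES"),
    ("paper_A", "BACKGROUND"),
    ("paper_B", "BACKGROUND"),
]

profile = intent_profile(predictions)
for cited_id, counts in profile.items():
    # most_common() lists the intents from most to least frequent.
    print(cited_id, counts.most_common())
```

A paper that is mostly cited as USES or EXTENDS is being built upon, while one cited mostly as BACKGROUND is being mentioned in passing. How to weight the categories is left open here.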

🚀 Qwen2.5-14B-CIC-ACLARC

A model fine-tuned for citation intent classification, based on Qwen 2.5 14B Instruct and trained on the ACL-ARC dataset.

GGUF version: https://huggingface.co/sknow-lab/Qwen2.5-14B-CIC-ACLARC-GGUF

🚀 Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sknow-lab/Qwen2.5-14B-CIC-ACLARC"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

system_prompt = """
# CONTEXT #
You are an expert researcher tasked with classifying the intent of a citation in a scientific publication.

########

# OBJECTIVE # 
You will be given a sentence containing a citation, you must output the appropriate class as an answer.

########

# CLASS DEFINITIONS #

The six (6) possible classes are the following: "BACKGROUND", "MOTIVATION", "USES", "EXTENDS", "COMPARES_CONTRASTS", "FUTURE".

The definitions of the classes are:
1 - BACKGROUND: The cited paper provides relevant Background information or is part of the body of literature.
2 - MOTIVATION: The citing paper is directly motivated by the cited paper.
3 - USES: The citing paper uses the methodology or tools created by the cited paper.
4 - EXTENDS: The citing paper extends the methods, tools or data, etc. of the cited paper.
5 - COMPARES_CONTRASTS: The citing paper expresses similarities or differences to, or disagrees with, the cited paper.
6 - FUTURE: The cited paper may be a potential avenue for future work.

########

# RESPONSE RULES #
- Analyze only the citation marked with the @@CITATION@@ tag.
- Assign exactly one class to each citation.
- Respond only with the exact name of one of the following classes: "BACKGROUND", "MOTIVATION", "USES", "EXTENDS", "COMPARES_CONTRASTS", "FUTURE".
- Do not provide any explanation or elaboration.
"""

test_citing_sentence = "However , the method we are currently using in the ATIS domain ( @@CITATION@@ ) represents our most promising approach to this problem."

user_prompt = f"""
{test_citing_sentence}
### Question: Which is the most likely intent for this citation?
a) BACKGROUND
b) MOTIVATION
c) USES
d) EXTENDS
e) COMPARES_CONTRASTS
f) FUTURE
### Answer:
"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
# Response: USES

The system prompt and query template are described in detail in the paper.
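Because the query template is fixed, it can be wrapped in a small helper so that every citing sentence is formatted the same way. The following is a minimal sketch that reproduces the template from the Quick Start code. The names `build_user_prompt` and `build_messages` are hypothetical and are not part of the authors' code.

```python
# The six class names, in the order used by the query template.
CLASSES = ["BACKGROUND", "MOTIVATION", "USES", "EXTENDS", "COMPARES_CONTRASTS", "FUTURE"]

def build_user_prompt(citing_sentence: str) -> str:
    """Format one citing sentence with the multiple-choice query template."""
    if "@@CITATION@@" not in citing_sentence:
        raise ValueError("the citing sentence must contain the @@CITATION@@ tag")
    # Produces "a) BACKGROUND", "b) MOTIVATION", ... "f) FUTURE".
    options = "\n".join(
        f"{chr(ord('a') + i)}) {name}" for i, name in enumerate(CLASSES)
    )
    return (
        f"\n{citing_sentence}\n"
        "### Question: Which is the most likely intent for this citation?\n"
        f"{options}\n"
        "### Answer:\n"
    )

def build_messages(system_prompt: str, citing_sentence: str) -> list:
    """Build the chat messages expected by tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": build_user_prompt(citing_sentence)},
    ]
```

The result of `build_messages(system_prompt, sentence)` can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` list.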

You may need a cleanup function to extract the predicted label from the model output. The authors' function is available on GitHub.
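A cleanup step of this kind could look like the sketch below. It is not the authors' GitHub function. `extract_label` is a hypothetical helper that assumes the response contains one of the six class names, possibly with an option letter, different casing, or trailing punctuation, and it returns the first class name it finds.

```python
import re
from typing import Optional

# Match a class name that is not embedded in a longer word.
# COMPARES and CONTRASTS may be joined by "_", "-", a space, or nothing.
_LABEL_PATTERN = re.compile(
    r"(?<![A-Z])"
    r"(BACKGROUND|MOTIVATION|USES|EXTENDS|COMPARES[ _\-]?CONTRASTS|FUTURE)"
    r"(?![A-Z])"
)

def extract_label(raw_output: str) -> Optional[str]:
    """Return the first class name found in the model output, or None."""
    match = _LABEL_PATTERN.search(raw_output.upper())
    if match is None:
        return None
    label = match.group(1)
    # Normalize separator variants to the canonical class name.
    if label.startswith("COMPARES"):
        return "COMPARES_CONTRASTS"
    return label
```

Returning `None` for unparseable responses, rather than guessing a class, lets the caller count and inspect failures separately.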

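✨ Key Features

This model is fine-tuned for citation intent classification and was trained on the ACL-ARC dataset. It assigns each citation to one of the following six classes.

Class	Description
Background	The cited paper provides relevant background information or is part of the body of literature.
Motivation	The citing paper is directly motivated by the cited paper.
Uses	The citing paper uses the methodology or tools created by the cited paper.
Extends	The citing paper extends the methods, tools, data, etc. of the cited paper.
Comparison or Contrast	The citing paper expresses similarities or differences to, or disagrees with, the cited paper.
Future	The cited paper may be a potential avenue for future work.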

📄 License

This project is released under the Apache-2.0 license.

📚 Citation

@misc{koloveas2025llmspredictcitationintent,
      title={Can LLMs Predict Citation Intent? An Experimental Analysis of In-context Learning and Fine-tuning on Open LLMs}, 
      author={Paris Koloveas and Serafeim Chatzopoulos and Thanasis Vergoulis and Christos Tryfonopoulos},
      year={2025},
      eprint={2502.14561},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.14561}, 
}