SciScore: An Open-Source Scientific Scoring Model for Evaluating the Scientific Alignment Between Implicit Prompts and Generated Images

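Developed by Jialuo21

SciScore is a scientific scoring model fine-tuned from CLIP-H. It evaluates how well a generated image aligns scientifically with an implicit prompt.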

Text-to-Image

Transformers

License: Apache-2.0 #ScientificImageAlignmentScoring #CLIP-H-FineTunedModel #ImplicitPromptEvaluation

Downloads: 1,627

Release date: March 17, 2025

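Model Overview

SciScore is a vision-language model designed to evaluate how well a scientific image aligns with the prompt that describes it. It helps identify and quantify scientific accuracy in image synthesis.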

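Model Features

Scientific alignment evaluation

Built specifically to assess the alignment between scientific images and their descriptive prompts

High-quality training data

Fine-tuned on the Science-T2I dataset, with a focus on scientific accuracy

CLIP base model

Based on the CLIP-ViT-H-14 model, which provides strong vision-language understanding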

Model Capabilities

Image-text alignment scoring

Scientific accuracy evaluation

Multimodal understanding

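Use Cases

Scientific research

Evaluating generated scientific images

Assess whether AI-generated scientific images accurately reflect the scientific concepts they describe

Quantifies how closely an image matches its scientific description

Validating science education materials

Verify that images in educational materials convey scientific concepts accurately

Helps ensure the scientific accuracy of educational materials

AI-generated content

Evaluating text-to-image models

Assess how accurately different text-to-image models generate scientific images

Provides an objective scoring standard for comparing the scientific performance of different models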
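The model comparison described above can be sketched as a simple aggregation step. This is a minimal illustration, not the authors' evaluation protocol: the record layout `(model_name, prompt_id, score)`, the helper names, and the numbers are all hypothetical, and how each per-image score is obtained from SciScore is left open.

```python
from collections import defaultdict
from statistics import mean

def mean_score_by_model(records):
    """Average the alignment score per model.

    records: iterable of (model_name, prompt_id, score) tuples (hypothetical layout).
    """
    grouped = defaultdict(list)
    for model_name, _prompt_id, score in records:
        grouped[model_name].append(score)
    return {name: mean(scores) for name, scores in grouped.items()}

def rank_models(records):
    """Return (model_name, mean_score) pairs, highest mean first."""
    averages = mean_score_by_model(records)
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

# Made-up scores for two hypothetical text-to-image models on two prompts.
records = [
    ("model_a", "p1", 0.5), ("model_a", "p2", 1.0),
    ("model_b", "p1", 0.25), ("model_b", "p2", 0.75),
]
ranking = rank_models(records)
print(ranking)
```

For the comparison to be meaningful, every model should be scored on the same set of prompts.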

🚀 SciScore

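SciScore is a fine-tuned scoring model that takes an implicit prompt and a generated image as input and outputs a score representing their scientific consistency. The score indicates how well the image matches the scientific description.

🚀 Quick Start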

from transformers import AutoProcessor, AutoModel
from PIL import Image
import torch

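# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"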
processor_name_or_path = "Jialuo21/SciScore"
model_pretrained_name_or_path = "Jialuo21/SciScore"

processor = AutoProcessor.from_pretrained(processor_name_or_path)
model = AutoModel.from_pretrained(model_pretrained_name_or_path).eval().to(device)

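def calc_probs(prompt, images):
    """Return one probability per image for a single prompt (softmax across the images)."""
    # Preprocess the images; the text-only options (padding, truncation, max_length) are not needed here.
    image_inputs = processor(
        images=images,
        return_tensors="pt",
    ).to(device)

    # Tokenize the prompt; CLIP's text encoder accepts at most 77 tokens.
    text_inputs = processor(
        text=prompt,
        padding=True,
        truncation=True,
        max_length=77,
        return_tensors="pt",
    ).to(device)

    with torch.no_grad():
        # L2-normalize both embeddings so that their dot product is a cosine similarity.
        image_embs = model.get_image_features(**image_inputs)
        image_embs = image_embs / torch.norm(image_embs, dim=-1, keepdim=True)

        text_embs = model.get_text_features(**text_inputs)
        text_embs = text_embs / torch.norm(text_embs, dim=-1, keepdim=True)

        # Scale the similarities by the learned temperature, then normalize across the images.
        scores = model.logit_scale.exp() * (text_embs @ image_embs.T)[0]
        probs = torch.softmax(scores, dim=-1)
    return probs.cpu().tolist()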

pil_images = [Image.open("./examples/camera_1.png"), Image.open("./examples/camera_2.png")]
prompt = "A camera screen without electricity sits beside the window, realistic."
print(calc_probs(prompt, pil_images))
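The arithmetic inside `calc_probs` can be reproduced without the model to see what the returned numbers mean. The sketch below uses plain Python and toy 3-dimensional vectors invented for illustration; the helper names are hypothetical, and `logit_scale=100.0` is an assumed value standing in for `model.logit_scale.exp()`.

```python
import math

def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def score_images(text_emb, image_embs, logit_scale=100.0):
    """Return (scaled cosine similarities, softmax probabilities) for one prompt."""
    text_unit = l2_normalize(text_emb)
    sims = []
    for emb in image_embs:
        image_unit = l2_normalize(emb)
        cosine = sum(t * i for t, i in zip(text_unit, image_unit))
        sims.append(logit_scale * cosine)
    return sims, softmax(sims)

# Toy embeddings (not real model output): the first image points almost
# in the same direction as the text, the second does not.
text = [1.0, 0.0, 0.0]
images = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
sims, probs = score_images(text, images)
print(sims, probs)
```

Because the softmax is taken across the images, the probabilities are relative: they sum to 1 over the candidates, and a single image always receives 1.0. When scoring one image against one prompt, the scaled similarity (`scores` in `calc_probs`) is the more informative quantity.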

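✨ Key Features

Feature Overview

SciScore is fine-tuned from the CLIP-H base model on the Science-T2I dataset. Given an implicit prompt and a generated image, it outputs a score representing the scientific consistency between the two.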
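One way to use these scores is to pick the most scientifically consistent image among several candidates for the same prompt. The helper below is a hypothetical sketch: `rank_candidates` is not part of SciScore, and the probabilities are made-up values standing in for the list returned by `calc_probs`.

```python
def rank_candidates(image_names, probs):
    """Pair each image name with its probability and sort, highest first."""
    if len(image_names) != len(probs):
        raise ValueError("image_names and probs must have the same length")
    return sorted(zip(image_names, probs), key=lambda pair: pair[1], reverse=True)

names = ["camera_1.png", "camera_2.png"]
example_probs = [0.2, 0.8]  # made-up values, not real model output
ranked = rank_candidates(names, example_probs)
best_name, best_prob = ranked[0]
print(best_name, best_prob)
```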

📚 Documentation

Resource Links

Citation

@misc{li2025sciencet2iaddressingscientificillusions,
  title={Science-T2I: Addressing Scientific Illusions in Image Synthesis}, 
  author={Jialuo Li and Wenhao Chai and Xingyu Fu and Haiyang Xu and Saining Xie},
  year={2025},
  eprint={2504.13129},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2504.13129}, 
}