🚀 BLaIR-roberta-base
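BLaIR, short for "Bridging Language and Items for Retrieval and Recommendation", is a series of language models pre-trained on the Amazon Reviews 2023 dataset. The models are trained on *(item metadata, language context)* pairs, which enables them to:
- derive strong item text representations for both recommendation and retrieval tasks;
- predict the most relevant item given a simple or complex language context.
[📑 Paper] · [💻 Code] · [🌐 Amazon Reviews 2023 Dataset] · [🤗 Huggingface Datasets] · [🔬 McAuley Lab]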
🚀 Quick Start
Model Details
Use with HuggingFace
import torch
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("hyp1231/blair-roberta-base")
model = AutoModel.from_pretrained("hyp1231/blair-roberta-base")
language_context = 'I need a product that can scoop, measure, and rinse grains without the need for multiple utensils and dishes. It would be great if the product has measurements inside and the ability to rinse and drain all in one. I just have to be careful not to pour too much accidentally.'
item_metadata = [
    'Talisman Designs 2-in-1 Measure Rinse & Strain | Holds up to 2 Cups | Food Strainer | Fruit Washing Basket | Strainer & Colander for Kitchen Sink | Dishwasher Safe - Dark Blue. The Measure Rinse & Strain by Talisman Designs is a 2-in-1 kitchen colander and strainer that will measure and rinse up to two cups. Great for any type of food from rice, grains, beans, fruit, vegetables, pasta and more. After measuring, fill with water and swirl to clean. Strain then pour into your pot, pan, or dish. The convenient size is easy to hold with one hand and is compact to fit into a kitchen cabinet or pantry. Dishwasher safe and food safe.',
    'FREETOO Airsoft Gloves Men Tactical Gloves for Hiking Cycling Climbing Outdoor Camping Sports (Not Support Screen Touch).'
]
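# Encode the language context and the candidate items in a single batch
texts = [language_context] + item_metadata
inputs = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    # Use the hidden state of the first token as the text embedding
    embeddings = model(**inputs, return_dict=True).last_hidden_state[:, 0]
    # L2-normalize so that a dot product equals cosine similarity
    embeddings = embeddings / embeddings.norm(dim=1, keepdim=True)

# Similarity between the language context and each candidate item
print(embeddings[0] @ embeddings[1])
print(embeddings[0] @ embeddings[2])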
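The snippet above scores each item by the dot product of L2-normalized embeddings, which is the cosine similarity. To make that scoring step concrete, here is a minimal, dependency-free sketch that ranks candidate items against a context vector. The helper names `l2_normalize` and `rank_items` and the toy 3-dimensional vectors are assumptions made up for illustration; they are not part of BLaIR or the `transformers` API.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (mirrors `embeddings / embeddings.norm(...)`)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def rank_items(context_vec, item_vecs):
    """Return (item_index, cosine_similarity) pairs, most similar first."""
    query = l2_normalize(context_vec)
    scores = []
    for index, item_vec in enumerate(item_vecs):
        item = l2_normalize(item_vec)
        # Dot product of two unit vectors is their cosine similarity
        scores.append((index, sum(q * v for q, v in zip(query, item))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only (real embeddings are much larger)
toy_context = [1.0, 2.0, 0.0]
toy_items = [[0.0, 1.0, 0.0], [2.0, 4.0, 0.0], [0.0, 0.0, 3.0]]

ranking = rank_items(toy_context, toy_items)
print(ranking)
```

With the real model output, the same helper could be called as `rank_items(embeddings[0].tolist(), embeddings[1:].tolist())`; for large candidate sets, keeping everything in tensors and computing `embeddings[1:] @ embeddings[0]` would likely be faster.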
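📄 License
This project is released under the MIT License.
📚 Citation
If you find the Amazon Reviews 2023 dataset, the BLaIR checkpoints, the Amazon-C4 dataset, or our scripts/code helpful, please cite the following paper.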
@article{hou2024bridging,
title={Bridging Language and Items for Retrieval and Recommendation},
author={Hou, Yupeng and Li, Jiacheng and He, Zhankui and Yan, An and Chen, Xiusi and McAuley, Julian},
journal={arXiv preprint arXiv:2403.03952},
year={2024}
}
📞 Contact
If you run into a problem or have any suggestions or questions, please open an issue or email Yupeng Hou at yphou@ucsd.edu.