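🚀 BART (large-sized model), fine-tuned on Amazon Reviews (English)
This BART model was initially trained on the CNN/DailyMail dataset and then re-trained on English purchase reviews from the Amazon website. The goal is a pipeline dedicated to summarizing user reviews on Amazon.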
🚀 Quick Start
This model is intended for summarizing user reviews on websites. The example below calls it through the pipeline API:
from transformers import pipeline
summarizer = pipeline("summarization", model="mabrouk/amazon-review-summarizer-bart")
review = """ I really like this book. It takes a step-by-step approach to introduce the reader to the IBM Q Experience, to the basics underlying quantum computing, and to the reality of the noise involved in the current machines. This introduction is technical and shows the user how to use the IBM system either directly through the GUI on their website or by running Python code on one's own machine. The text provides examples of small exercises to try and stimulates ideas of new things to try. The IBM Q Exp Qiskit software modules are identified and introduced - Terra, Aer, Ignis, and Aqua, as well as the backends that one can choose to do the computing. The book ends with two great chapters on quantum algorithms.
"""
print(summarizer(review, min_length=60))
# Example output:
# [{'summary_text': 'This book is a great resource, and a great read, to learn about quantum and start writing your first programs, or to brush up on your programming skills. I loved that there is a quiz at the end of every chapter so you can check and see how...'}]
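To summarize several reviews in one call, a small wrapper along the following lines may help. It is only a sketch: `summarize_reviews` is a hypothetical helper name, not part of this model or of transformers, and it assumes the summarizer accepts a list of strings and returns one dict per input with a `summary_text` key, which is how the transformers summarization pipeline behaves.

```python
from typing import Callable, List


def summarize_reviews(
    summarizer: Callable[..., List[dict]],
    reviews: List[str],
    min_length: int = 60,
) -> List[str]:
    """Summarize a batch of reviews and return only the summary strings.

    `summarizer` is assumed to behave like a transformers summarization
    pipeline: called with a list of texts, it returns a list of dicts,
    each holding the summary under the "summary_text" key.
    """
    # Collapse runs of whitespace and drop reviews that are empty.
    cleaned = [" ".join(r.split()) for r in reviews if r.strip()]
    if not cleaned:
        return []
    outputs = summarizer(cleaned, min_length=min_length)
    return [out["summary_text"] for out in outputs]
```

With the pipeline from the example above, this would be called as `summarize_reviews(summarizer, [review_1, review_2])`. Empty reviews are skipped, so the result can be shorter than the input list.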
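✨ Key Features
According to Hugging Face, BART is a Transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function and (2) learning a model to reconstruct the original text.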
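The corrupt-then-reconstruct idea can be illustrated with a toy noising function. The sketch below is not BART's actual training code: it splits on whitespace instead of using a real tokenizer, the span is chosen by hand rather than sampled, and the names `infill_span`, `make_training_pair`, and the `<mask>` string are chosen purely for illustration. It shows one possible noising function, replacing a span of tokens with a single mask token, and the (corrupted input, original target) pair a seq2seq model would learn from.

```python
from typing import List, Tuple

MASK = "<mask>"  # illustrative placeholder token


def infill_span(tokens: List[str], start: int, length: int) -> List[str]:
    """Replace tokens[start:start + length] with a single mask token."""
    if start < 0 or length < 0 or start + length > len(tokens):
        raise ValueError("span out of range")
    return tokens[:start] + [MASK] + tokens[start + length:]


def make_training_pair(text: str, start: int, length: int) -> Tuple[str, str]:
    """Return (corrupted_text, original_text).

    The encoder would read the corrupted text; the decoder would be
    trained to reproduce the original text.
    """
    tokens = text.split()  # whitespace split stands in for a real tokenizer
    corrupted = infill_span(tokens, start, length)
    return " ".join(corrupted), text


source, target = make_training_pair("the book ends with two great chapters", 2, 3)
print(source)  # the book <mask> great chapters
print(target)  # the book ends with two great chapters
```

Because the whole span collapses into one mask token, the model has to infer both the missing words and how many there were.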
📚 Documentation
Dataset
Link: Amazon Reviews Corpus
References
Pre-trained model: facebook/bart-large-cnn
Re-training dataset: Amazon Reviews Corpus