flan-t5-small-keywords: an open-source keyword extraction model for extracting keywords from paragraphs

Flan T5 Small Keywords

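Developed by agentlans

A keyword extraction model fine-tuned from Flan-T5 small, designed specifically to extract keywords from paragraphs.

Large language model

Transformers

English · Open-source license: MIT · #English keyword extraction #Text summarization optimization #SEO assistant tool

Downloads: 1,101

Release date: 9/10/2024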

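Model Overview

This model uses the T5 architecture to identify and output key phrases that summarize the core content of a text. It is suited to summarizing long texts, generating article tags, and identifying the core themes of a document.

Model Features

Paragraph keyword extraction

Extracts keywords and key phrases that summarize the core content of a long paragraph.

Multi-purpose applications

Applicable to text summarization, tag generation, SEO keyword identification, and similar scenarios.

Based on the Flan-T5 architecture

Uses the T5 architecture to treat keyword extraction as a sequence-to-sequence task.

Model Capabilities

Text keyword extraction

Long-text summarization

Key phrase generation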

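Use Cases

Content management

Article tag generation

Automatically generates tags for blogs and articles.

Improves content classification and search efficiency.

Metadata generation

Generates metadata for content management systems.

Simplifies the content management process.

SEO

SEO keyword identification

Identifies the core keywords in a document for SEO purposes.

Helps improve web page search rankings.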
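
The tag and metadata use cases above can be sketched as a small post-processing step. This is a minimal illustration, not part of the model or of any CMS API: `slugify`, `build_metadata`, the `max_tags` cutoff, and the shape of the returned dictionary are all hypothetical, and the input keywords are sample values.

```python
import html
import re


def slugify(keyword: str) -> str:
    """Lowercase a keyword and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", keyword.lower()))


def build_metadata(keywords: list[str], max_tags: int = 5) -> dict:
    """Build tag slugs and an HTML meta keywords element (hypothetical shape)."""
    kept = keywords[:max_tags]
    # Drop keywords that contain no ASCII letters or digits at all.
    tags = [slug for slug in (slugify(k) for k in kept) if slug]
    # Escape the keyword text so it is safe inside an HTML attribute.
    content = html.escape(", ".join(kept), quote=True)
    return {
        "tags": tags,
        "meta_keywords": f'<meta name="keywords" content="{content}">',
    }


# Sample keywords for illustration only.
metadata = build_metadata(["quaint bookstore", "curated collections"])
print(metadata["tags"])
print(metadata["meta_keywords"])
```

Keeping the slug and the display text separate lets a CMS use the slug in URLs while still showing the original phrase to readers.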

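🚀 Keyword Extraction Model

This model is a fine-tuned version of Flan-T5 small, specialized for extracting keywords from paragraphs. It uses the T5 architecture to identify and output key phrases that capture the essence of the input text.

✨ Key Features

The model takes a paragraph as input and generates a list of keywords or key phrases that summarize the main topics and themes of the text. It is particularly useful for:

Summarizing long texts
Generating tags for articles and blog posts
Identifying the main themes of a document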

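🚀 Quick Start

Intended Uses and Limitations

Intended uses:

Quickly summarizing long paragraphs
Generating metadata for content management systems
Assisting with SEO keyword identification

Limitations:

The model may sometimes generate irrelevant keywords.
Performance varies with the length and complexity of the input text.
- For best results, use long, clean text.
- Input length is limited to 512 tokens due to the Flan-T5 architecture.
The model was trained on English text and may not perform well on other languages.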
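
One way to work within the 512-token limit is to split a long document into sentence-aligned chunks and run the model on each chunk. The sketch below is an assumption-laden illustration, not the author's method: `split_into_chunks` is a hypothetical helper, the sentence splitter is deliberately naive, and `count_words` is a stand-in counter. With the real tokenizer, the counter could be something like `lambda s: len(tokenizer(s)["input_ids"])`.

```python
import re
from typing import Callable, List


def split_into_chunks(
    text: str,
    count_tokens: Callable[[str], int],
    max_tokens: int = 512,
) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens.

    A single sentence longer than max_tokens is kept as its own chunk and
    would still need truncation downstream.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > max_tokens:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def count_words(s: str) -> int:
    """Stand-in token counter (whitespace words), used only for this demo."""
    return len(s.split())


demo = "One two three. Four five six. Seven eight nine."
print(split_into_chunks(demo, count_words, max_tokens=6))
# ['One two three. Four five six.', 'Seven eight nine.']
```

Passing the counter in as a function keeps the packing logic independent of any particular tokenizer.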

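Training and Evaluation

The model was fine-tuned on a dataset of English Wikipedia paragraphs paired with their corresponding keywords. The dataset covers a variety of topics, which supports broad applicability.

📦 Installation

Using this model requires the transformers library, which can be installed with the command below. The basic usage example requests PyTorch tensors (return_tensors="pt"), so PyTorch must be installed as well.

pip install transformers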

💻 Usage Example

Basic usage

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "agentlans/flan-t5-small-keywords"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

input_text = "Your paragraph here..."
# Tokenize the paragraph, truncating to the 512-token input limit
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
# Generate the keyword sequence and decode it back to text
outputs = model.generate(**inputs, max_length=512)
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Process the output to get a list of keywords (split and remove duplicates)
keywords = list(set(decoded_output.split('||')))
print(keywords)

Example input paragraph

In the heart of the bustling city, a hidden gem awaits discovery: a quaint little bookstore that seems to have escaped the relentless march of time. As you step inside, the scent of aged paper and rich coffee envelops you, creating an inviting atmosphere that beckons you to explore its shelves. Each corner is adorned with carefully curated collections, from classic literature to contemporary bestsellers, inviting readers of all tastes to lose themselves in the pages of a good book. The soft glow of warm lighting casts a cozy ambiance, while the gentle hum of conversation among fellow book lovers adds to the charm. This bookstore is not just a place to buy books; it's a sanctuary for those seeking solace, inspiration, and a sense of community in the fast-paced world outside.

Example output keywords

['old paper coffee scent', 'cosy hum of conversation', 'quaint bookstore', 'community in the fast-paced world', 'solace inspiration', 'curated collections']
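
The basic usage example removes duplicates with `set`, which does not preserve the order in which the keywords were generated and leaves any surrounding whitespace in place. A minimal sketch of an order-preserving alternative follows, assuming the same `||` separator. `parse_keywords` is a hypothetical helper, not part of transformers or the model repository, and the sample string is made up rather than real model output.

```python
from typing import List


def parse_keywords(decoded_output: str, separator: str = "||") -> List[str]:
    """Split decoded output on separator, trim, and deduplicate in order."""
    seen = set()
    keywords: List[str] = []
    for part in decoded_output.split(separator):
        keyword = part.strip()
        # Compare case-insensitively so "Quaint" and "quaint" count as one.
        key = keyword.casefold()
        if keyword and key not in seen:
            seen.add(key)
            keywords.append(keyword)
    return keywords


# Illustrative string, not real model output.
raw = "quaint bookstore|| curated collections ||Quaint Bookstore||"
print(parse_keywords(raw))
# ['quaint bookstore', 'curated collections']
```

Keeping the first occurrence of each keyword makes the result deterministic across runs, which is convenient when the list is stored as tags or compared in tests.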