detr-doc-table-detection: an open-source model, free to use, for detecting bordered and borderless tables in documents

Detr Doc Table Detection

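Developed by TahaDouaji

A document table detection model based on the DETR architecture that detects bordered and borderless tables in documents.

Object detection

Transformers

Open-source license: Apache-2.0 #document-table-detection #borderless-table-recognition #end-to-end-object-detection

Downloads: 233.45k

Released: 3/11/2022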

Model overview

This document table detection model is fine-tuned from facebook/detr-resnet-50. It is designed to locate table regions in documents and supports both bordered and borderless tables.

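Model features

End-to-end object detection

Uses a Transformer architecture for end-to-end object detection, so hand-designed post-processing such as non-maximum suppression or anchor decoding is not needed.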
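To make the point about post-processing concrete: DETR-style models predict each box as normalized (center_x, center_y, width, height), and the main remaining step is rescaling to pixel corners. The sketch below shows that conversion in plain Python. The function name `cxcywh_to_xyxy` is illustrative and is not part of transformers, whose `post_process_object_detection` performs an equivalent step on tensors.

```python
def cxcywh_to_xyxy(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    xmin = (cx - w / 2) * image_width
    ymin = (cy - h / 2) * image_height
    xmax = (cx + w / 2) * image_width
    ymax = (cy + h / 2) * image_height
    return [xmin, ymin, xmax, ymax]


# A box centered on the page, covering half its width and a quarter of its height.
print(cxcywh_to_xyxy((0.5, 0.5, 0.5, 0.25), image_width=1000, image_height=800))
```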

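Table detection

Optimized specifically for tables in documents; detects both bordered and borderless tables.

Based on the DETR architecture

Builds on DETR's object detection design, with a ResNet-50 backbone for feature extraction.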

Model capabilities

Document table detection

Bordered table recognition

Borderless table recognition

Object detection

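Use cases

Document processing

PDF table extraction

Automatically detects and extracts table regions from PDF documents.

Identifies where tables are located within a document.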
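Because the model works on page images, extracting a table from a PDF usually means rendering the page, detecting the box in pixels, and then either cropping the image or mapping the box back to PDF units. The helpers below are a minimal sketch of those two coordinate steps. Both function names and the 10-pixel default padding are assumptions made for illustration, not part of the model or of any PDF library.

```python
import math


def pad_and_clamp(box, image_width, image_height, padding=10):
    """Grow a pixel box by `padding` on every side, clamped to the page bounds."""
    xmin, ymin, xmax, ymax = box
    return (
        max(0, math.floor(xmin) - padding),
        max(0, math.floor(ymin) - padding),
        min(image_width, math.ceil(xmax) + padding),
        min(image_height, math.ceil(ymax) + padding),
    )


def pixels_to_pdf_points(box, dpi):
    """Scale a pixel box from a page rendered at `dpi` to PDF points (72 per inch)."""
    scale = 72.0 / dpi
    return tuple(round(v * scale, 2) for v in box)


# Hypothetical detection on a 1000 x 1400 pixel page rendered at 144 DPI.
crop_box = pad_and_clamp([5.4, 120.7, 990.2, 640.0], image_width=1000, image_height=1400)
print(crop_box)
print(pixels_to_pdf_points(crop_box, dpi=144))
```

With Pillow, `image.crop(crop_box)` would then cut out the padded region. Note that many PDF libraries place the origin at the bottom-left of the page, so the y values may need flipping before use; that step depends on the library and is left out here.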

Document digitization

Helps convert tables in paper documents into digital form.

Can improve the efficiency and accuracy of document digitization.

🚀 detr-doc-table-detection model card

detr-doc-table-detection is a model trained to detect both bordered and borderless tables in documents. It is built on facebook/detr-resnet-50.

🚀 Quick start

Use the code below to get started with the model.

from transformers import DetrImageProcessor, DetrForObjectDetection
import torch
from PIL import Image

# Replace IMAGE_PATH with the path to a document page image.
# DETR expects 3-channel input, so convert grayscale or RGBA scans to RGB.
image = Image.open("IMAGE_PATH").convert("RGB")

processor = DetrImageProcessor.from_pretrained("TahaDouaji/detr-doc-table-detection")
model = DetrForObjectDetection.from_pretrained("TahaDouaji/detr-doc-table-detection")

inputs = processor(images=image, return_tensors="pt")
# Inference only, so skip gradient tracking.
with torch.no_grad():
    outputs = model(**inputs)

# Convert outputs (normalized boxes and class logits) to pixel-coordinate
# (xmin, ymin, xmax, ymax) boxes, keeping only detections with score > 0.9.
# PIL reports size as (width, height); target_sizes expects (height, width).
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(outputs, target_sizes=target_sizes, threshold=0.9)[0]

for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    box = [round(i, 2) for i in box.tolist()]
    print(
        f"Detected {model.config.id2label[label.item()]} with confidence "
        f"{round(score.item(), 3)} at location {box}"
    )
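The loop above prints detections in whatever order the model returns them. For downstream steps it can help to hold them as plain records sorted roughly in reading order. The helper below is a sketch that works on ordinary Python lists; `to_records` is a hypothetical name, and the `id2label` mapping and sample values in the example are made up for illustration rather than taken from the model's config.

```python
def to_records(scores, labels, boxes, id2label, min_score=0.9):
    """Turn parallel score/label/box lists into dicts, sorted top-to-bottom."""
    records = [
        {
            "label": id2label[label],
            "score": round(score, 3),
            "box": [round(v, 2) for v in box],
        }
        for score, label, box in zip(scores, labels, boxes)
        if score >= min_score
    ]
    # Sort by the top edge (ymin), then the left edge (xmin), to approximate reading order.
    return sorted(records, key=lambda r: (r["box"][1], r["box"][0]))


# Made-up detections: the low-confidence one is dropped, the rest are reordered.
example = to_records(
    scores=[0.95, 0.50, 0.99],
    labels=[0, 0, 0],
    boxes=[[50.0, 400.0, 500.0, 600.0], [10.0, 10.0, 20.0, 20.0], [50.0, 100.0, 500.0, 300.0]],
    id2label={0: "table"},
)
for record in example:
    print(record)
```

With the quickstart's variables, the call would be `to_records(results["scores"].tolist(), results["labels"].tolist(), results["boxes"].tolist(), model.config.id2label)`.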

✨ Key features

detr-doc-table-detection is an object detection model that detects both bordered and borderless tables in documents.

📦 Installation

The README does not include installation instructions.
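In the absence of official instructions, the quickstart's imports indicate what has to be present: `transformers`, `torch`, and `PIL` (distributed on PyPI as Pillow). Some transformers releases also need `timm` to load DETR's ResNet backbone; treat that entry as an assumption to verify against your installed version. The sketch below only checks whether those modules can be found, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names from `module_names` that cannot be found."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Modules imported by the quickstart; `timm` is an assumption, not stated in the README.
required = ["transformers", "torch", "PIL", "timm"]
missing = missing_packages(required)
print("Missing:", ", ".join(missing) if missing else "none")
```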

📚 Documentation

Model details

detr-doc-table-detection is a model trained to detect tables in documents, based on facebook/detr-resnet-50.

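Attribute	Details
Developed by	Taha Douaji
Shared by	Taha Douaji
Model type	Object detection
Language	More information needed
License	Apache-2.0
Parent model	facebook/detr-resnet-50
Resources for more information	Model demo space, associated paper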

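Uses

Direct use

The model can be used directly for object detection, specifically for locating tables in document images.

Out-of-scope use

The model should not be used to intentionally create hostile or alienating environments for people.

Bias, risks, and limitations

Numerous studies have examined bias and fairness in language models (see, for example, Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by this model may include harmful stereotypes across protected classes, identity characteristics, and sensitive social and occupational groups.

⚠️ Important

Users (both direct and downstream) should be aware of the model's risks, biases, and limitations. More information is needed for further recommendations.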

Training details

Training data

The model was trained on the ICDAR2019 Table Dataset.

Environmental impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
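The card reports no hardware or runtime figures, so no estimate can be given for this model. As a rough illustration of the arithmetic such calculators rest on (energy consumed multiplied by the carbon intensity of the local grid), here is a simplified sketch. The function name is hypothetical, every number in the example is made up, and real tools may also account for factors such as datacenter overhead and offsets.

```python
def estimate_co2_kg(gpu_power_watts, hours, num_gpus, grid_kg_co2_per_kwh):
    """Estimate emissions as energy (kWh) times grid carbon intensity (kg CO2eq/kWh)."""
    energy_kwh = gpu_power_watts / 1000 * hours * num_gpus
    return energy_kwh * grid_kg_co2_per_kwh


# Hypothetical run: one 250 W GPU for 10 hours on a 0.4 kg CO2eq/kWh grid.
print(estimate_co2_kg(gpu_power_watts=250, hours=10, num_gpus=1, grid_kg_co2_per_kwh=0.4))
```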

Citation

@article{DBLP:journals/corr/abs-2005-12872,
  author    = {Nicolas Carion and
               Francisco Massa and
               Gabriel Synnaeve and
               Nicolas Usunier and
               Alexander Kirillov and
               Sergey Zagoruyko},
  title     = {End-to-End Object Detection with Transformers},
  journal   = {CoRR},
  volume    = {abs/2005.12872},
  year      = {2020},
  url       = {https://arxiv.org/abs/2005.12872},
  archivePrefix = {arXiv},
  eprint    = {2005.12872},
  timestamp = {Thu, 28 May 2020 17:38:09 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2005-12872.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}