Paligemma 3b Mix 448 Ft TableDetection

Developed by ucsahin
A multimodal table detection model fine-tuned from google/paligemma-3b-mix-448 and specialized for identifying table regions in images.
Downloads 19
Release Time: 5/25/2024

Model Overview

This model takes an image together with a text prompt and predicts bounding box coordinates for the tables in the image. It is suited to document processing and data extraction.

Model Features

Multimodal input processing
Supports simultaneous processing of image and text inputs for vision-language joint understanding
High-precision table detection
Fine-tuned on the pubtables-detection dataset, specifically optimized for table region recognition
Standardized output format
Outputs normalized coordinate values for easy conversion to various bounding box formats
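
The page does not spell out the exact output format, so the following is only a minimal sketch of how the normalized coordinates might be converted to pixel boxes. It assumes the usual PaliGemma detection convention: each box is written as four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max on a 0-1023 grid, followed by a label. The function names `parse_boxes` and `xyxy_to_xywh` are hypothetical helpers, not part of the model or any library.

```python
import re

# Assumption: four <locNNNN> tokens per box, ordered y_min, x_min, y_max, x_max,
# each on a 0-1023 grid, followed by a text label such as "table".
BOX_PATTERN = re.compile(
    r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^<;]+)"
)
LOC_BINS = 1024


def parse_boxes(text, image_width, image_height):
    """Return a list of (label, (x_min, y_min, x_max, y_max)) in pixels."""
    boxes = []
    for match in BOX_PATTERN.finditer(text):
        # Scale each token value from the 0-1023 grid to the 0-1 range.
        y_min, x_min, y_max, x_max = (
            int(value) / LOC_BINS for value in match.groups()[:4]
        )
        label = match.group(5).strip()
        boxes.append((label, (
            round(x_min * image_width),
            round(y_min * image_height),
            round(x_max * image_width),
            round(y_max * image_height),
        )))
    return boxes


def xyxy_to_xywh(box):
    """Convert (x_min, y_min, x_max, y_max) to (x, y, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)


if __name__ == "__main__":
    # Hypothetical model output for a 1024 x 2048 pixel image.
    output = "<loc0256><loc0128><loc0768><loc0896> table"
    for label, box in parse_boxes(output, image_width=1024, image_height=2048):
        print(label, box, xyxy_to_xywh(box))
```

If the model's actual output uses a different token order or grid size, only `BOX_PATTERN`, `LOC_BINS`, and the unpacking order would need to change.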

Model Capabilities

Table detection in images
Bounding box coordinate prediction
Multimodal understanding

Use Cases

Document processing
PDF table extraction
Automatically locates table regions in scanned documents
Outputs standardized coordinates for subsequent OCR processing
Data collection
Web screenshot analysis
Identifies table regions in screenshots
Provides positioning references for web scrapers
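
For the OCR and scraping use cases above, a detected box is typically turned into a crop region before further processing. The sketch below shows one way this might be done, assuming the box has already been converted to pixel coordinates in (x_min, y_min, x_max, y_max) order. The helper name `padded_crop_box` and the default padding of 8 pixels are illustrative assumptions, not values taken from the model.

```python
def padded_crop_box(box, image_width, image_height, padding=8):
    """Expand a pixel box by `padding` on each side, clamped to the image.

    `box` is (x_min, y_min, x_max, y_max) in pixels. A small margin is added
    so that table borders lying on the box edge are not cut off.
    """
    x_min, y_min, x_max, y_max = box
    return (
        max(0, x_min - padding),
        max(0, y_min - padding),
        min(image_width, x_max + padding),
        min(image_height, y_max + padding),
    )


if __name__ == "__main__":
    # Hypothetical detection near the top-left corner of an 800 x 600 image.
    print(padded_crop_box((4, 10, 790, 300), image_width=800, image_height=600))
```

The returned tuple follows the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so the cropped region can be handed to an OCR step of your choice.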