Pix2Struct ChartQA Base

Developed by Google
Pix2Struct is an image-encoder, text-decoder model trained on image-text pairs for a range of tasks; this checkpoint is fine-tuned for chart question answering
Downloads 181
Release Time: 3/21/2023

Model Overview

This model is the Pix2Struct base architecture fine-tuned on the ChartQA dataset. It parses chart images and answers questions about them, with support for multilingual chart comprehension
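
A minimal inference sketch follows. It assumes the checkpoint is published on the Hugging Face Hub as `google/pix2struct-chartqa-base` and that the `transformers`, `torch`, and `Pillow` packages are installed. The function name `answer_chart_question` and the `max_new_tokens` default are illustrative choices, not something this page specifies.

```python
# Assumed Hub identifier for this checkpoint; adjust if it differs.
MODEL_ID = "google/pix2struct-chartqa-base"


def answer_chart_question(
    image_path: str,
    question: str,
    model_id: str = MODEL_ID,
    max_new_tokens: int = 32,
) -> str:
    """Ask one question about one chart image and return the decoded answer."""
    # Heavy dependencies are imported here so that defining the function
    # does not require them to be installed.
    from PIL import Image
    from transformers import (
        Pix2StructForConditionalGeneration,
        Pix2StructProcessor,
    )

    processor = Pix2StructProcessor.from_pretrained(model_id)
    model = Pix2StructForConditionalGeneration.from_pretrained(model_id)

    # The processor takes the chart image together with the question text
    # and returns the tensors the encoder expects.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, text=question, return_tensors="pt")

    # The text decoder generates the answer token by token.
    generated = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(generated[0], skip_special_tokens=True)
```

A call might look like `answer_chart_question("revenue_chart.png", "Which year had the highest revenue?")`, where the file name is a placeholder. The function loads the model on every call to keep the sketch short; a real application would likely load it once and reuse it.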

Model Features

Multitask Pretraining
Pretrained on multiple tasks, including image captioning and visual question answering, to strengthen comprehension
Multilingual Support
Supports chart understanding in multiple languages including English, French, Romanian, and German
HTML Structure Parsing
Pretrained by learning to parse masked webpage screenshots into simplified HTML, which enriches its understanding of visual elements

Model Capabilities

Chart Image Understanding
Visual Question Answering
Multilingual Text Generation
Structured Data Extraction

Use Cases

Education
Textbook Chart Analysis
Helps students understand complex charts and data visualizations in textbooks
Accurately answers various questions about chart data
Business Intelligence
Business Report Analysis
Automatically parses charts and data visualizations in business reports
Quickly extracts key business metrics and trend information
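
The report-analysis use case above can be sketched as a loop that asks several named questions about one chart and collects the answers. This is only an illustration: `extract_metrics`, `parse_number`, and the `answer_fn` callback are hypothetical names, and `answer_fn` stands in for whatever function actually queries the model.

```python
from typing import Callable, Dict, Optional


def parse_number(answer: str) -> Optional[float]:
    """Read a short answer such as '1,200' or '12.5%' as a float, else None."""
    cleaned = answer.strip().replace(",", "").rstrip("%").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def extract_metrics(
    chart: object,
    questions: Dict[str, str],
    answer_fn: Callable[[object, str], str],
) -> Dict[str, dict]:
    """Ask each named question about the chart and collect the answers.

    `answer_fn(chart, question)` is a hypothetical callback that returns the
    model's text answer for one question.
    """
    results = {}
    for name, question in questions.items():
        raw = answer_fn(chart, question)
        results[name] = {
            "question": question,
            "answer": raw,                # text exactly as the model produced it
            "value": parse_number(raw),   # numeric reading, if the answer has one
        }
    return results
```

Keeping the raw answer next to the parsed value makes it possible to check the model's output against the chart before any number is used downstream.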