Diagram Detr R50 Finetuned
An object detection model based on the DETR architecture and fine-tuned on the bpmn-shapes dataset to detect elements in diagrams.
Downloads 26
Release Time: 1/19/2024
Model Overview
A vision-based object detection model using the DETR architecture, specifically optimized for detecting shape elements in BPMN diagrams.
Model Features
Transformer-based detection architecture
Uses the DETR architecture, which pairs a CNN backbone with a Transformer encoder-decoder for end-to-end object detection.
Optimized for diagram elements
Fine-tuned on the bpmn-shapes dataset to detect the shape elements that appear in BPMN diagrams.
Efficient training process
Trained with mixed precision and a linear learning rate scheduler.
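As a rough illustration of what a linear learning rate schedule does, the sketch below computes the learning rate at a given training step. The base learning rate, step counts, and warmup length are hypothetical values chosen for the example; the model card does not state the ones used for this model, and this is not necessarily the exact scheduler implementation used in training.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: optional linear warmup, then linear decay to zero.

    All arguments are illustrative; the actual training values are not
    given in the model card.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * (step / max(1, warmup_steps))
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Hypothetical settings: 100 steps, base learning rate 1e-4, no warmup.
    for s in (0, 25, 50, 75, 100):
        print(s, linear_lr(s, total_steps=100, base_lr=1e-4))
```

With no warmup, the rate starts at the base value and reaches zero at the final step, so later updates are progressively smaller.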
Model Capabilities
Diagram element detection
Object localization
Shape recognition
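To make the localization output concrete: DETR-style models predict each box as normalized (center x, center y, width, height), which usually has to be converted to pixel corner coordinates before drawing or cropping. The following is a minimal sketch of that conversion plus a confidence filter. The label names, scores, and the 0.5 threshold are hypothetical; the model card does not list the bpmn-shapes classes or a recommended threshold.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]


def cxcywh_to_xyxy(box: Box, image_width: int, image_height: int) -> Box:
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    x0 = (cx - w / 2) * image_width
    y0 = (cy - h / 2) * image_height
    x1 = (cx + w / 2) * image_width
    y1 = (cy + h / 2) * image_height
    return (x0, y0, x1, y1)


def keep_confident(
    detections: List[Tuple[str, float, Box]], threshold: float = 0.5
) -> List[Tuple[str, float, Box]]:
    """Keep detections whose score is at least `threshold` (hypothetical default)."""
    return [d for d in detections if d[1] >= threshold]


if __name__ == "__main__":
    # Hypothetical detections: (label, score, normalized cxcywh box).
    raw = [
        ("task", 0.92, (0.5, 0.5, 0.5, 0.5)),
        ("gateway", 0.31, (0.2, 0.2, 0.1, 0.1)),
    ]
    for label, score, box in keep_confident(raw):
        print(label, score, cxcywh_to_xyxy(box, image_width=200, image_height=100))
```

In practice a detection library's post-processing step typically performs this conversion; the sketch only spells out the arithmetic.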
Use Cases
Business process modeling
BPMN diagram analysis
Automatically detects and identifies various process elements in BPMN diagrams.
Validation loss: 0.9817
Document processing
Technical document parsing
Extracts diagrams and graphical elements from technical documents.
© 2025 AIbase