yolov8s-signature-detector open-source model: precise localization of handwritten signatures in document images

Yolov8s Signature Detector

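Developed by tech4humans

A handwritten-signature detection model fine-tuned from YOLOv8s, built to locate signatures in document images

Object Detection

TensorBoard

#HandwrittenSignatureDetection #DocumentProcessing #HighAccuracyObjectDetection

Downloads: 28.14k

Published: 1/3/2025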

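Model Overview

This model is an object detector fine-tuned from the Ultralytics YOLOv8s architecture. It detects and localizes handwritten signature regions in a wide range of document images.

Model Features

High-accuracy signature detection

Reaches 94.5% mAP@0.5 on the test set

Lightweight architecture

Built on the small YOLOv8s architecture, balancing accuracy and inference speed

Multiple formats

Available in PyTorch, ONNX, and TensorRT formats for deployment

Trained on a dedicated dataset

Uses a dataset of 2,819 annotated document images, of which 1,980 form the training split

Model Capabilities

Signature detection in document images

Bounding-box prediction for signature regions

Detection of multiple signatures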

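Use Cases

Document Processing

Contract signature verification

Automatically detects where signatures appear in contract documents

Can be used in automated contract-processing workflows

Bank document processing

Identifies signatures on checks, application forms, and other financial documents

Speeds up financial document processing

Office Automation

Electronic document archiving

Automatically tags signature regions in documents for archiving

Simplifies document management systems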

🚀 YOLOv8s - Handwritten Signature Detection

This repository presents a YOLOv8s-based model fine-tuned to detect handwritten signatures in document images.

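Resource	Link / Badge	Details
Article		A detailed community article covering the full development process of the project
Model files		Available formats: PyTorch, ONNX, TensorRT
Dataset - original		2,819 document images annotated with signature coordinates
Dataset - processed		Augmented and preprocessed version (640 px) used for model training
Notebook - model experiments		Complete training and evaluation pipeline with a choice of architectures (yolo, detr, rt-detr, conditional-detr, yolos)
Notebook - hyperparameter tuning		Optuna trials to optimize the precision/recall balance
Inference server		Complete deployment and inference pipeline using Triton Inference Server
Live demo		Graphical interface with real-time inference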

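🚀 Quick Start

The model can be used from the CLI or from Python code with the Ultralytics library. It can also be run directly with ONNX Runtime or TensorRT.

The final weight files are in the root directory of the repository:

yolov8s.pt (PyTorch format)
yolov8s.onnx (ONNX format)
yolov8s.engine (TensorRT format)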

Python Code

Install dependencies

pip install ultralytics supervision huggingface_hub

Inference code

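import cv2
import supervision as sv

from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download the PyTorch weights from the Hugging Face Hub (cached locally).
model_path = hf_hub_download(
  repo_id="tech4humans/yolov8s-signature-detector",
  filename="yolov8s.pt"
)

model = YOLO(model_path)

image_path = "/path/to/your/image.jpg"
image = cv2.imread(image_path)
if image is None:  # cv2.imread returns None instead of raising on a bad path
    raise FileNotFoundError(image_path)

# Run inference and convert the first result to supervision detections.
results = model(image_path)

detections = sv.Detections.from_ultralytics(results[0])

# Draw the predicted boxes on the image and display it.
box_annotator = sv.BoxAnnotator()
annotated_image = box_annotator.annotate(scene=image, detections=detections)

cv2.imshow("Detections", annotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()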

Make sure the image and model file paths are correct.

CLI

Install dependencies

pip install -U ultralytics "huggingface_hub[cli]"

Inference commands

huggingface-cli download tech4humans/yolov8s-signature-detector yolov8s.pt --local-dir .

yolo predict model=yolov8s.pt source=path/to/image.jpg

Parameters:

model: path to the model weights file.
source: path to the image or directory of images to run detection on.

ONNX Runtime

For optimized inference, inference code that uses onnxruntime with the OpenVINO Execution Provider is available in the handler.py file and in the Hugging Face Space.
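The handler.py code is not reproduced here. As a generic illustration of one step that a fixed-size 640x640 ONNX input often requires, the sketch below maps a predicted box from model-input pixels back to original image pixels. It assumes a letterbox resize (aspect-preserving scale plus centered padding); the actual handler may preprocess differently, and the function names are hypothetical.

```python
def letterbox_params(orig_w, orig_h, size=640):
    """Scale factor and padding for fitting an image into a size x size square.

    Assumption: aspect-preserving resize with the remainder split evenly
    as padding on both sides (a common letterbox scheme).
    """
    scale = min(size / orig_w, size / orig_h)
    new_w, new_h = round(orig_w * scale), round(orig_h * scale)
    pad_x = (size - new_w) / 2
    pad_y = (size - new_h) / 2
    return scale, pad_x, pad_y


def to_original(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from model-input pixels to original pixels."""
    x1, y1, x2, y2 = box
    return (
        (x1 - pad_x) / scale,
        (y1 - pad_y) / scale,
        (x2 - pad_x) / scale,
        (y2 - pad_y) / scale,
    )


# A 1280x960 page is scaled by 0.5 and padded by 80 px above and below.
scale, pad_x, pad_y = letterbox_params(1280, 960)
restored = to_original((100, 180, 300, 280), scale, pad_x, pad_y)
print(scale, pad_x, pad_y, restored)
```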

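✨ Key Features

Accurate detection: a fine-tuned YOLOv8s model that effectively detects handwritten signatures in document images.
Multiple formats: model files are provided in PyTorch, ONNX, and TensorRT formats.
Documented experiments: model selection, hyperparameter tuning, and other experiments are recorded in detail.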

💻 Usage Examples

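Advanced Usage

In practice, you can post-process the detections to fit your needs, for example by cropping and saving each detected signature region. A simple example: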

import cv2
import supervision as sv

from huggingface_hub import hf_hub_download
from ultralytics import YOLO

model_path = hf_hub_download(
  repo_id="tech4humans/yolov8s-signature-detector", 
  filename="yolov8s.pt"
)

model = YOLO(model_path)

image_path = "/path/to/your/image.jpg"
image = cv2.imread(image_path)

results = model(image_path)

detections = sv.Detections.from_ultralytics(results[0])

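# Crop each detected signature out of the original image and save it.
for xyxy in detections.xyxy:
    x1, y1, x2, y2 = map(int, xyxy)  # box corners in pixel coordinates
    signature_area = image[y1:y2, x1:x2]
    cv2.imwrite(f"signature_{x1}_{y1}.jpg", signature_area)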
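When cropping, it can help to add a small margin around each box and to clamp the result to the image bounds, because negative indices in a NumPy slice wrap around instead of failing. The helper below is a minimal sketch of that idea; the function name and the default margin of 10 pixels are illustrative assumptions, not part of the original code.

```python
def padded_crop_box(xyxy, image_width, image_height, pad=10):
    """Expand an (x1, y1, x2, y2) box by `pad` pixels and clamp it to the image.

    `pad` is a hypothetical margin chosen for illustration.
    """
    x1, y1, x2, y2 = (int(v) for v in xyxy)
    x1 = max(0, x1 - pad)
    y1 = max(0, y1 - pad)
    x2 = min(image_width, x2 + pad)
    y2 = min(image_height, y2 + pad)
    return x1, y1, x2, y2


# A box touching the left and right edges of a 640x480 image.
crop = padded_crop_box((5.7, 20.2, 630.9, 100.0), 640, 480)
print(crop)
```

The returned corners can be used directly in a slice such as image[y1:y2, x1:x2].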

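📚 Detailed Documentation

Dataset

Training used a dataset built from two public datasets, Tobacco800 and signatures-xc8up, which were merged and processed in Roboflow.

Dataset summary:

Training set: 1,980 images (70%)
Validation set: 420 images (15%)
Test set: 419 images (15%)
Format: COCO JSON
Resolution: 640x640 pixels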
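COCO JSON stores each bounding box as [x_min, y_min, width, height], whereas the inference code works with corner coordinates (x1, y1, x2, y2). The sketch below shows the conversion on a tiny hand-written sample; the file name, ids, and box values are made up for illustration and do not come from the actual dataset.

```python
import json
from collections import defaultdict

# Hypothetical two-annotation sample in COCO layout (not real dataset content).
SAMPLE = """
{
  "images": [{"id": 1, "file_name": "doc_001.jpg", "width": 640, "height": 640}],
  "categories": [{"id": 1, "name": "signature"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1, "bbox": [100, 400, 220, 60]},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [350, 520, 180, 50]}
  ]
}
"""


def coco_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def boxes_by_file(coco):
    """Group converted boxes under the file name of the image they belong to."""
    names = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[names[ann["image_id"]]].append(coco_to_xyxy(ann["bbox"]))
    return dict(grouped)


boxes = boxes_by_file(json.loads(SAMPLE))
print(boxes)
```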

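Training Process

1. Model Selection

Several object detection models were evaluated to find the one with the best balance of precision, recall, and inference time.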

Metric	rtdetr-l	yolos-base	yolos-tiny	conditional-detr-resnet-50	detr-resnet-50	yolov8x	yolov8l	yolov8m	yolov8s	yolov8n	yolo11x	yolo11l	yolo11m	yolo11s	yolo11n	yolov10x	yolov10l	yolov10b	yolov10m	yolov10s	yolov10n
Inference time - CPU (ms)	583.608	1706.49	265.346	476.831	425.649	1259.47	871.329	401.183	216.6	110.442	1016.68	518.147	381.652	179.792	106.656	821.183	580.767	473.109	320.12	150.076	73.8596
mAP50	0.92709	0.901154	0.869814	0.936524	0.88885	0.794237	0.800312	0.875322	0.874721	0.816089	0.667074	0.707409	0.809557	0.835605	0.813799	0.681023	0.726802	0.789835	0.787688	0.663877	0.734332
mAP50-95	0.622364	0.583569	0.469064	0.653321	0.579428	0.552919	0.593976	0.665495	0.65457	0.623963	0.482289	0.499126	0.600797	0.638849	0.617496	0.474535	0.522654	0.578874	0.581259	0.473857	0.552704

[Figure: Model Selection]

Highlights

Best mAP50: conditional-detr-resnet-50 (0.936524)
Best mAP50-95: yolov8m (0.665495)
Fastest inference time: yolov10n (73.8596 ms)
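The highlights can be reproduced from the comparison table with a few lines of Python. The sketch below also ranks models under a CPU latency budget to show how a speed/accuracy trade-off might be explored; the 250 ms budget is a hypothetical value picked for illustration, not a criterion stated by the authors.

```python
# model: (CPU inference time in ms, mAP50, mAP50-95), copied from the table.
RESULTS = {
    "rtdetr-l": (583.608, 0.92709, 0.622364),
    "yolos-base": (1706.49, 0.901154, 0.583569),
    "yolos-tiny": (265.346, 0.869814, 0.469064),
    "conditional-detr-resnet-50": (476.831, 0.936524, 0.653321),
    "detr-resnet-50": (425.649, 0.88885, 0.579428),
    "yolov8x": (1259.47, 0.794237, 0.552919),
    "yolov8l": (871.329, 0.800312, 0.593976),
    "yolov8m": (401.183, 0.875322, 0.665495),
    "yolov8s": (216.6, 0.874721, 0.65457),
    "yolov8n": (110.442, 0.816089, 0.623963),
    "yolo11x": (1016.68, 0.667074, 0.482289),
    "yolo11l": (518.147, 0.707409, 0.499126),
    "yolo11m": (381.652, 0.809557, 0.600797),
    "yolo11s": (179.792, 0.835605, 0.638849),
    "yolo11n": (106.656, 0.813799, 0.617496),
    "yolov10x": (821.183, 0.681023, 0.474535),
    "yolov10l": (580.767, 0.726802, 0.522654),
    "yolov10b": (473.109, 0.789835, 0.578874),
    "yolov10m": (320.12, 0.787688, 0.581259),
    "yolov10s": (150.076, 0.663877, 0.473857),
    "yolov10n": (73.8596, 0.734332, 0.552704),
}

best_map50 = max(RESULTS, key=lambda m: RESULTS[m][1])
best_map50_95 = max(RESULTS, key=lambda m: RESULTS[m][2])
fastest = min(RESULTS, key=lambda m: RESULTS[m][0])

# Hypothetical constraint: keep only models at or under 250 ms on CPU.
BUDGET_MS = 250.0
within_budget = [m for m, (ms, _, _) in RESULTS.items() if ms <= BUDGET_MS]
best_within_budget = max(within_budget, key=lambda m: RESULTS[m][1])

print(best_map50, best_map50_95, fastest, best_within_budget)
```

Under that illustrative budget, yolov8s has the highest mAP50 among the remaining models.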

Detailed experiments are available on Weights & Biases.

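2. Hyperparameter Tuning

YOLOv8s, which showed a good balance of inference time, precision, and recall, was selected for hyperparameter tuning.

Optuna was used to run 20 optimization trials. The tuning used the following parameter configuration: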

    dropout = trial.suggest_float("dropout", 0.0, 0.5, step=0.1)
    lr0 = trial.suggest_float("lr0", 1e-5, 1e-1, log=True)
    box = trial.suggest_float("box", 3.0, 7.0, step=1.0)
    cls = trial.suggest_float("cls", 0.5, 1.5, step=0.2)
    opt = trial.suggest_categorical("optimizer", ["AdamW", "RMSProp"])
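To get a sense of how large this search space is relative to 20 trials, the discrete choices can be counted directly. The sketch below uses only the ranges and steps shown above; lr0 is sampled continuously on a log scale and is therefore left out of the count. The helper name is an assumption made for illustration.

```python
from math import prod


def n_steps(low, high, step):
    """Number of values in a stepped range that includes both endpoints."""
    return round((high - low) / step) + 1


# Discrete parts of the search space, taken from the configuration above.
discrete = {
    "dropout": n_steps(0.0, 0.5, 0.1),
    "box": n_steps(3.0, 7.0, 1.0),
    "cls": n_steps(0.5, 1.5, 0.2),
    "optimizer": 2,  # "AdamW" or "RMSProp"
}

combinations = prod(discrete.values())
print(discrete, combinations)
```

Even without lr0, 20 trials can visit only a small fraction of these combinations, so the reported best trial is the best of the sampled configurations, not necessarily of the whole space.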

The results are available in the hyperparameter tuning experiment.

[Figure: Hypertuning Sweep]

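3. Evaluation

At the end of training, the model was evaluated on the test set in ONNX (CPU) and TensorRT (GPU - T4) formats. The performance metrics include precision, recall, mAP50, and mAP50-95.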
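Both mAP variants rest on intersection over union (IoU) between a predicted box and a ground-truth box: mAP50 counts a prediction as a match at IoU of at least 0.5, while mAP50-95 averages over thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below illustrates only the IoU and threshold part with made-up boxes; it is not a full mAP implementation.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds used by mAP50-95: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [t / 100 for t in range(50, 100, 5)]

ground_truth = (0, 0, 10, 10)
loose_prediction = (5, 0, 15, 10)  # overlaps half of the ground truth
tight_prediction = (1, 0, 11, 10)  # shifted by one pixel

loose_iou = iou(ground_truth, loose_prediction)
tight_iou = iou(ground_truth, tight_prediction)

# How many of the ten thresholds each prediction would satisfy.
loose_hits = sum(loose_iou >= t for t in THRESHOLDS)
tight_hits = sum(tight_iou >= t for t in THRESHOLDS)
print(loose_iou, loose_hits, tight_iou, tight_hits)
```

The one-pixel shift still fails the strictest thresholds, which is one reason mAP50-95 scores are typically well below mAP50 scores.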

[Figure: Trials]

Results Comparison

Metric	Base model	Best trial (#10)	Difference
mAP50	87.47%	95.75%	+8.28%
mAP50-95	65.46%	66.26%	+0.81%
Precision	97.23%	95.61%	-1.63%
Recall	76.16%	91.21%	+15.05%
F1 score	85.42%	93.36%	+7.94%
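The F1 row follows from the precision and recall rows, since F1 is their harmonic mean: F1 = 2PR / (P + R). The sketch below recomputes it from the rounded table values, so the results agree with the table only to within rounding.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the table, expressed as fractions.
base_f1 = f1_score(0.9723, 0.7616)
best_f1 = f1_score(0.9561, 0.9121)

print(f"base: {base_f1:.4f}, best trial: {best_f1:.4f}")
```

The comparison shows the trade made by tuning: precision drops slightly while recall rises sharply, and the harmonic mean rewards the more balanced pair.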

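Results

After hyperparameter tuning of the YOLOv8s model, the best model achieved the following results on the test set:

Precision: 94.74%
Recall: 89.72%
mAP@50: 94.50%
mAP@50-95: 67.35%
Inference time:
- ONNX Runtime (CPU): 171.56 ms
- TensorRT (GPU - T4): 7.657 ms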
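The two latencies can be turned into rough throughput figures. The sketch below assumes one image per inference call, processed sequentially, with no batching and no pre- or post-processing overhead; these assumptions are not stated in the source, so the numbers are indicative only.

```python
def images_per_second(latency_ms):
    """Throughput implied by a per-image latency, assuming sequential calls."""
    return 1000.0 / latency_ms


ONNX_CPU_MS = 171.56  # ONNX Runtime (CPU), from the results above
TENSORRT_T4_MS = 7.657  # TensorRT (GPU - T4), from the results above

onnx_cpu_ips = images_per_second(ONNX_CPU_MS)
tensorrt_ips = images_per_second(TENSORRT_T4_MS)
speedup = ONNX_CPU_MS / TENSORRT_T4_MS

print(f"{onnx_cpu_ips:.1f} img/s on CPU, {tensorrt_ips:.1f} img/s on T4, {speedup:.1f}x")
```

Under these assumptions, that is roughly 6 images per second on CPU versus about 130 on the T4.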

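Demo

You can explore the model and test real-time inference in the Hugging Face Spaces demo, which is built with Gradio and ONNXRuntime.

Inference Server

To deploy this signature detection model in production, see our inference server repository, which is based on NVIDIA Triton Inference Server.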

Infrastructure

Software

The model was trained and tuned in a Jupyter Notebook environment.

Operating system: Ubuntu 22.04
Python: 3.10.12
PyTorch: 2.5.1+cu121
Ultralytics: 8.3.58
Roboflow: 1.1.50
Optuna: 4.1.0
ONNX Runtime: 1.20.1
TensorRT: 10.7.0

Hardware

Training was performed on a Google Cloud Platform n1-standard-8 instance with the following specifications:

CPU: 8 vCPUs
GPU: NVIDIA Tesla T4

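🔧 Technical Details

This project addresses handwritten signature detection using YOLOv8s as the base model. The dataset combines several public datasets that were unified and preprocessed. During training, multiple object detection models were evaluated to find the best balance of precision, recall, and inference time, and Optuna was then used for hyperparameter tuning to further improve performance. For inference, the model supports multiple formats and environments, including PyTorch, ONNX, and TensorRT, to suit different deployment scenarios.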

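📄 License

Model weights (fine-tuned model): AGPL-3.0

License: GNU Affero General Public License v3.0 (AGPL-3.0)
Terms of use: The fine-tuned model weights, derived from the Ultralytics YOLOv8 model, are licensed under AGPL-3.0. This requires any modifications or derivative works of these weights to also be distributed under AGPL-3.0, and if the model is used as part of a network service, the corresponding source code must be made available.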