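🚀 English Personal Information Anonymizer OpenPII (Ai4Privacy)
This model is designed to redact personally identifiable information (PII) from English text. It was fine-tuned only on the English subset of the open-pii-masking-500k-ai4privacy dataset.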
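📦 Model Information
| Property | Details |
|---|---|
| Model type | English Personal Information Anonymizer OpenPII (Ai4Privacy) |
| Training data | English subset of the open-pii-masking-500k-ai4privacy dataset |
| Task type | Token classification (PII masking) |
| Evaluation metrics | F1 score, precision, recall, accuracy |
| Library name | transformers |
| Pipeline tag | Token classification |
| License | MIT |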
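As a rough sketch of how a token-classification checkpoint like this one might be used for redaction through the `transformers` pipeline API, the helper below replaces each detected span with its label. The model ID is left as a parameter because this card does not state the repository name, and `redact` and `anonymize_with_model` are illustrative names, not part of any published API. The sketch assumes a fast tokenizer, so that the pipeline returns `start` and `end` character offsets.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict]) -> str:
    """Replace each detected span with a [LABEL] placeholder.

    `entities` is expected to look like the output of a transformers
    token-classification pipeline run with aggregation_strategy="simple":
    dicts carrying "entity_group", "start" and "end".
    """
    out = text
    # Work from right to left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        out = out[: ent["start"]] + f"[{ent['entity_group']}]" + out[ent["end"]:]
    return out


def anonymize_with_model(text: str, model_id: str) -> str:
    """Run a PII tagger and redact its spans.

    `model_id` must be supplied by the caller; this card does not name
    the checkpoint, so no default is assumed here.
    """
    from transformers import pipeline  # imported lazily; needs transformers installed

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )
    return redact(text, tagger(text))


# Hand-written spans standing in for model output, for illustration only.
sample = "Call John Smith at 555-0100."
spans = [
    {"entity_group": "GIVENNAME", "start": 5, "end": 9},
    {"entity_group": "SURNAME", "start": 10, "end": 15},
    {"entity_group": "TELEPHONENUM", "start": 19, "end": 27},
]
print(redact(sample, spans))  # Call [GIVENNAME] [SURNAME] at [TELEPHONENUM].
```

Keeping the replacement step separate from the model call makes it possible to test the redaction logic without downloading any weights.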
🔍 Evaluation Metrics
The table below summarizes the detailed evaluation results for each PII label:
| Label | True Positives (TP) | False Positives (FP) | False Negatives (FN) | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|---|
| SURNAME | 3724 | 0 | 26 | 99.31% | 100.0% | 99.31% | 99.65% |
| O (non-PII) | 0 | 368 | 0 | 99.36% | N/A | N/A | N/A |
| TIME | 1934 | 0 | 2 | 99.90% | 100.0% | 99.90% | 99.95% |
| DRIVERLICENSENUM | 505 | 0 | 2 | 99.61% | 100.0% | 99.61% | 99.80% |
| PASSPORTNUM | 566 | 0 | 0 | 100.0% | 100.0% | 100.0% | 100.0% |
| GIVENNAME | 7557 | 0 | 163 | 97.89% | 100.0% | 97.89% | 98.93% |
| TELEPHONENUM | 3637 | 0 | 4 | 99.89% | 100.0% | 99.89% | 99.95% |
| BUILDINGNUM | 418 | 0 | 8 | 98.12% | 100.0% | 98.12% | 99.05% |
| AGE | 164 | 0 | 5 | 97.04% | 100.0% | 97.04% | 98.50% |
| DATE | 2335 | 0 | 0 | 100.0% | 100.0% | 100.0% | 100.0% |
| CITY | 1717 | 0 | 85 | 95.28% | 100.0% | 95.28% | 97.58% |
| TITLE | 363 | 0 | 21 | 94.53% | 100.0% | 94.53% | 97.19% |
| IDCARDNUM | 2008 | 0 | 12 | 99.41% | 100.0% | 99.41% | 99.70% |
| GENDER | 120 | 0 | 1 | 99.17% | 100.0% | 99.17% | 99.59% |
| CREDITCARDNUMBER | 555 | 0 | 3 | 99.46% | 100.0% | 99.46% | 99.73% |
| SEX | 77 | 0 | 2 | 97.47% | 100.0% | 97.47% | 98.72% |
| STREET | 1379 | 0 | 8 | 99.42% | 100.0% | 99.42% | 99.71% |
| TAXNUM | 343 | 0 | 14 | 96.08% | 100.0% | 96.08% | 98.00% |
| EMAIL | 2607 | 0 | 1 | 99.96% | 100.0% | 99.96% | 99.98% |
| SOCIALNUM | 421 | 0 | 1 | 99.76% | 100.0% | 99.76% | 99.88% |
| ZIPCODE | 418 | 0 | 8 | 98.12% | 100.0% | 98.12% | 99.05% |
Overall Evaluation
- Accuracy: 99.17%
- Precision: 98.82%
- Recall: 98.83%
- F1 score: 98.82%
- Total true positives (TP): 30,848
- Total false positives (FP): 368
- Total false negatives (FN): 366
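The overall precision, recall, and F1 figures are consistent with the standard micro-averaged definitions applied to the TP/FP/FN totals. A minimal sketch of that arithmetic follows. The helper name `prf` is illustrative, and the accuracy figure is not reproduced here because the card does not state how it was computed.

```python
def prf(tp: int, fp: int, fn: int):
    """Return (precision, recall, f1) from raw counts, using 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Totals reported in the overall evaluation list.
p, r, f1 = prf(tp=30848, fp=368, fn=366)
print(f"precision={p:.2%} recall={r:.2%} f1={f1:.2%}")
# precision=98.82% recall=98.83% f1=98.82%
```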
Macro-Averaged Metrics
- Accuracy: 98.56%
- Precision: 95.24%
- Recall: 93.83%
- F1 score: 94.52%
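The macro-averaged precision, recall, and F1 appear to be unweighted means over all 21 rows of the per-label table, with the O (non-PII) row counted as zero for each of the three metrics. That reading is an inference from the numbers, not something the card states. A sketch that recomputes the averages under this assumption, with the (TP, FP, FN) counts copied from the per-label table:

```python
# (TP, FP, FN) per label, copied from the per-label evaluation table.
COUNTS = {
    "SURNAME": (3724, 0, 26), "O": (0, 368, 0), "TIME": (1934, 0, 2),
    "DRIVERLICENSENUM": (505, 0, 2), "PASSPORTNUM": (566, 0, 0),
    "GIVENNAME": (7557, 0, 163), "TELEPHONENUM": (3637, 0, 4),
    "BUILDINGNUM": (418, 0, 8), "AGE": (164, 0, 5), "DATE": (2335, 0, 0),
    "CITY": (1717, 0, 85), "TITLE": (363, 0, 21), "IDCARDNUM": (2008, 0, 12),
    "GENDER": (120, 0, 1), "CREDITCARDNUMBER": (555, 0, 3), "SEX": (77, 0, 2),
    "STREET": (1379, 0, 8), "TAXNUM": (343, 0, 14), "EMAIL": (2607, 0, 1),
    "SOCIALNUM": (421, 0, 1), "ZIPCODE": (418, 0, 8),
}


def prf(tp, fp, fn):
    """Per-label (precision, recall, f1); undefined ratios are treated as 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(counts):
    """Unweighted mean of each metric across labels (assumed averaging scheme)."""
    rows = [prf(*c) for c in counts.values()]
    return tuple(sum(col) / len(rows) for col in zip(*rows))


macro_p, macro_r, macro_f1 = macro_average(COUNTS)
print(f"precision={macro_p:.2%} recall={macro_r:.2%} f1={macro_f1:.2%}")
```

Under this scheme a single label with zero precision lowers the macro average by roughly 1/21, which would explain why macro precision sits near 95% even though every PII label reports 100%.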
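⚠️ Model Behavior and Limitations
📄 Disclaimer
This model card details the evaluation metrics and fine-tuning parameters of the English anonymizer. Please note:
- The model is provided "as is" under the MIT license.
- The model is intended for redaction only and does not perform full PII classification.
- Before deploying it to production, users should carefully test and evaluate its performance on their own data.
Ai4Privacy – Committed to protecting personal data in the age of AI.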