Distilbert Base Multilingual Cased Pii

Developed by yonigo
A multilingual PII recognition model fine-tuned from distilbert-base-multilingual-cased to identify personally identifiable information (PII) in text.
Downloads 531
Release Time: 6/25/2024

Model Overview

This model is fine-tuned on the ai4privacy/pii-masking-300k dataset to identify and classify personally identifiable information (PII) in text, such as names, addresses, and phone numbers.

Model Features

Multilingual Support
Based on the multilingual DistilBERT model, it supports PII recognition in multiple languages.
High-precision Recognition
Achieves high F1 scores in several PII categories, for example 0.9833 for Email and 0.9842 for IP.
Lightweight Model
Based on the DistilBERT architecture, it is lighter than the full BERT model while maintaining high performance.

Model Capabilities

Identify personally identifiable information
Multilingual text processing
Entity classification

Use Cases

Data Privacy Protection
Automatic PII Masking
Automatically identifies personally identifiable information in text and performs masking to protect user privacy.
Accurately identifies PII types such as names, phone numbers, and addresses.
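
A minimal sketch of the masking step, not the author's own code. It assumes the model is published on the Hugging Face Hub under the ID yonigo/distilbert-base-multilingual-cased-pii (inferred from the developer and model name on this page) and that it is loaded with the transformers token-classification pipeline using aggregation_strategy="simple", which returns spans with entity_group, score, start, and end keys. The helper names detect_pii and mask_pii, the 0.5 score threshold, and the [LABEL] placeholder format are illustrative choices.

```python
from typing import Dict, List


def detect_pii(
    text: str,
    model_id: str = "yonigo/distilbert-base-multilingual-cased-pii",  # assumed Hub ID
) -> List[Dict]:
    """Run the token-classification pipeline and return aggregated entity spans."""
    # Imported lazily so mask_pii can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole spans
    )
    return tagger(text)


def mask_pii(text: str, entities: List[Dict], min_score: float = 0.5) -> str:
    """Replace each detected span with a [LABEL] placeholder.

    Assumes the spans do not overlap.
    """
    kept = [e for e in entities if e.get("score", 1.0) >= min_score]
    # Work from right to left so earlier character offsets stay valid.
    for ent in sorted(kept, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        text = text[: ent["start"]] + placeholder + text[ent["end"] :]
    return text
```

Typical use would be mask_pii(text, detect_pii(text)). The label names in the placeholders come from whatever label set the model was trained with, so downstream code should not hard-code them without checking the model configuration.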
Compliance Checking
Document Compliance Review
Checks whether documents contain sensitive information that needs protection to ensure compliance with privacy regulations.
Identifies a range of PII types with high accuracy, which helps support compliance reviews.
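
A compliance review like the one described above could be sketched as a per-document summary of detected PII. This is an illustration under stated assumptions: the input is a list of entity spans shaped like the output of the transformers token-classification pipeline with aggregation_strategy="simple" (entity_group and score keys), and the function name pii_report, the 0.5 threshold, and the report layout are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def pii_report(entities: Iterable[Dict], min_score: float = 0.5) -> Dict:
    """Summarize which PII categories were detected in one document."""
    # Count only detections at or above the confidence threshold.
    counts = Counter(
        e["entity_group"] for e in entities if e.get("score", 1.0) >= min_score
    )
    return {"contains_pii": bool(counts), "counts": dict(counts)}
```

A document whose report has contains_pii set to True can then be routed to manual review or masking. The threshold trades missed detections against false alarms and would need tuning on representative documents.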