
SEC-BERT-SHAPE

Developed by nlpaueb
A BERT variant for the financial domain that preserves numeric information by replacing numbers with shape pseudo-tokens
Downloads: 30
Release Time: 3/2/2022

Model Overview

A BERT model designed specifically for financial texts. It improves numeric handling by converting numbers into shape pseudo-tokens (e.g., '53.2' → '[XX.X]'), making it well suited to analyzing financial documents such as 10-K annual reports.
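The shape conversion itself can be illustrated with a small preprocessing sketch. The regex and helper names below are hypothetical, and the released model ships its own fixed inventory of 214 shape pseudo-tokens, so treat this only as an illustration of the idea.

```python
import re

# Hypothetical preprocessing helpers illustrating the shape conversion:
# every digit in a numeric literal is replaced by 'X', so '53.2' -> '[XX.X]'.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")

def to_shape_token(number: str) -> str:
    """Map a numeric literal to its shape pseudo-token, e.g. '53.2' -> '[XX.X]'."""
    return "[" + re.sub(r"\d", "X", number) + "]"

def shape_normalize(text: str) -> str:
    """Replace every number in the text with its shape pseudo-token."""
    return NUMBER_RE.sub(lambda m: to_shape_token(m.group(0)), text)

print(shape_normalize("Total revenues increased 9.4% to $53.2 million in 2021."))
# Total revenues increased [X.X]% to $[XX.X] million in [XXXX].
```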

Model Features

Numeric Shape Standardization
Converts numbers into 214 predefined shape pseudo-tokens (e.g., '[XX.X]'), avoiding the fragmentation of numeric expressions into subword pieces
Financial Domain Pre-training
Pre-trained on roughly 260,000 SEC 10-K annual filings, closely adapting it to the characteristics of financial text
Multi-Version Adaptation
Offered in three variants (SEC-BERT-BASE, SEC-BERT-NUM, SEC-BERT-SHAPE) to cover different scenario requirements

Model Capabilities

Financial text masked prediction (see the sketch after this list)
Financial numeric shape recognition
Financial verb prediction
Numeric unit inference
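These capabilities are exposed through standard masked-language-model inference. Below is a minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub as nlpaueb/sec-bert-shape and that the input has already been shape-normalized; exact scores and rankings will vary.

```python
from transformers import pipeline

# Fill-mask inference with the SEC-BERT-SHAPE checkpoint; numbers in the
# input are written as shape pseudo-tokens, as the model expects.
fill_mask = pipeline("fill-mask", model="nlpaueb/sec-bert-shape")

text = "Total net sales [MASK] [X]% to $[XX.X] billion during [XXXX]."
for prediction in fill_mask(text, top_k=3):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")
# Financial verbs such as 'increased' and 'decreased' should rank highly.
```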

Use Cases

Financial Report Analysis
Financial Metric Trend Prediction
Predicts trends in metrics like sales/profits in annual reports
3x accuracy improvement over base BERT in verb prediction tasks
Numeric Unit Completion
Automatically completes units (million, billion, etc.) for financial values; see the unit-completion sketch after this list
Unit prediction accuracy >97%
Regulatory Document Processing
XBRL Tagging Assistance
Identifies financial numeric entities to assist XBRL tag generation
The underlying approach was published in the FiNER paper at ACL 2022
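For the unit-completion use case, the same fill-mask interface can rank candidate units after a shaped value. This is a sketch under the same assumptions as above, not a reproduction of the reported >97% accuracy figure.

```python
from transformers import pipeline

# Unit-completion sketch: mask the unit word after a shaped financial value
# and let the model rank candidates such as 'million' or 'billion'.
fill_mask = pipeline("fill-mask", model="nlpaueb/sec-bert-shape")

text = "Total net sales decreased [X]% to $[XX.X] [MASK] during [XXXX]."
for prediction in fill_mask(text, top_k=5):
    print(f"{prediction['token_str']:>10}  {prediction['score']:.3f}")
```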