Bert Large Portuguese Cased Legal Mlm Nli Sts V1

Developed by stjiris
A Portuguese BERT model specialized for the legal domain, built on the BERTimbau large model, that supports sentence similarity calculation and semantic search
Downloads 331
Release Time: 1/6/2023

Model Overview

This BERT model is optimized for Portuguese legal texts. It maps sentences and paragraphs into a 1024-dimensional dense vector space, which makes it suitable for natural language processing tasks such as clustering and semantic search.
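The mapping from sentences to vectors can be sketched as follows. This is a minimal sketch, not the developer's documented usage: it assumes the model is published on the Hugging Face Hub under the id `stjiris/bert-large-portuguese-cased-legal-mlm-nli-sts-v1` (inferred from the developer and model name, not stated on this page) and that the `sentence-transformers` package is installed. The helper names `embed` and `check_dim` are hypothetical.

```python
# Assumed Hub id, inferred from the developer name and model name on this page.
MODEL_ID = "stjiris/bert-large-portuguese-cased-legal-mlm-nli-sts-v1"
EXPECTED_DIM = 1024  # dimensionality stated in the model overview


def embed(sentences, model_id=MODEL_ID):
    """Encode sentences into dense vectors (downloads the model on first use)."""
    # Imported lazily so this file can be loaded without the dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    # encode() returns an array with one row per input sentence.
    return model.encode(sentences)


def check_dim(vectors, expected=EXPECTED_DIM):
    """Return True if every vector has the expected number of dimensions."""
    return all(len(v) == expected for v in vectors)
```

Calling `check_dim(embed(["O tribunal negou provimento ao recurso."]))` would confirm whether the loaded model produces vectors of the stated size.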

Model Features

Legal domain optimization
Trained on approximately 30,000 legal documents to adapt the model to legal vocabulary and phrasing
Multi-stage training
Trained in three stages: masked language modeling (MLM) pre-training, natural language inference (NLI) fine-tuning, and semantic textual similarity (STS) fine-tuning
High-dimensional vector space
Generates 1024-dimensional dense vectors that capture semantic features of legal texts

Model Capabilities

Sentence vectorization
Semantic similarity calculation
Legal text analysis
Semantic search
Text clustering
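Similarity calculation and semantic search over sentence vectors can be sketched in a few lines. The example below is illustrative only: it uses toy 3-dimensional vectors in place of real 1024-dimensional embeddings, and the function names `cosine` and `search` are hypothetical, not part of the model's API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def search(query_vec, corpus_vecs, top_k=3):
    """Return (index, score) pairs for the top_k most similar corpus vectors."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for sentence embeddings (real vectors would have 1024 entries).
corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
hits = search([1.0, 0.0, 0.0], corpus, top_k=2)
```

For large repositories, an exhaustive scan like this is usually replaced by an approximate nearest-neighbor index; the ranking logic stays the same.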

Use Cases

Judicial system
Legal document semantic search
Retrieve similar cases from legal document repositories by meaning rather than by keyword match
Applied in the IRIS project to improve legal retrieval efficiency
Judgment analysis
Analyze key sentence similarity in judgments
Natural language processing
Text similarity calculation
Calculate semantic similarity between two Portuguese sentences
Achieved a Pearson correlation coefficient of 0.81 on the assin2 dataset
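A Pearson score of this kind is presumably computed between the model's predicted similarities and the human-annotated similarity scores in the dataset. The sketch below shows that calculation under this assumption; the function name `pearson` is hypothetical and the input values in the usage note are made-up placeholders, not assin2 data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)
```

In an evaluation, `xs` would hold the cosine similarities the model assigns to each sentence pair and `ys` the corresponding gold scores, for example `pearson([0.2, 0.5, 0.9], [1.5, 3.0, 4.8])`.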