Bert Large Portuguese Cased Legal Tsdae Gpl Nli Sts V1

Developed by stjiris
A legal domain-specific Portuguese sentence transformer based on the BERTimbau large model, supporting semantic similarity calculation
Downloads 17
Release Time: 1/5/2023

Model Overview

This is a sentence transformer model optimized for Portuguese legal texts, capable of mapping sentences to a 1024-dimensional vector space, suitable for semantic search, clustering, and text similarity calculation tasks in the legal domain.
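A minimal sketch of how such embeddings are typically produced and compared. It assumes the model is published on the Hugging Face Hub under the ID `stjiris/bert-large-portuguese-cased-legal-tsdae-gpl-nli-sts-v1` (inferred from the model name and developer shown here, not confirmed by this page) and that the `sentence-transformers` package is installed. The helper names `encode_sentences` and `cosine_similarity` are illustrative.

```python
import math

# Assumed Hub ID, inferred from the model name and developer; verify before use.
MODEL_ID = "stjiris/bert-large-portuguese-cased-legal-tsdae-gpl-nli-sts-v1"


def encode_sentences(sentences, model_id=MODEL_ID):
    """Encode sentences into vectors (needs `pip install sentence-transformers`)."""
    # Imported lazily so the similarity helper below works without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    # encode() returns one vector per sentence; 1024 dimensions are expected here.
    return [vec.tolist() for vec in model.encode(sentences)]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; return 0 by convention
    return dot / (norm_a * norm_b)
```

Calling `encode_sentences` on two Portuguese sentences and passing the resulting vectors to `cosine_similarity` gives a score between -1 and 1, where higher values indicate closer meaning.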

Model Features

Legal Domain Optimization
Trained and optimized specifically for Portuguese legal texts, using approximately 30,000 legal document samples
Advanced Training Techniques
Trained with TSDAE (Transformer-based Sequential Denoising Auto-Encoder), combined with Generative Pseudo Labeling (GPL)
Multi-stage Training
Fine-tuned in multiple stages on Natural Language Inference (NLI) and Semantic Textual Similarity (STS) data
High Performance
Achieves Pearson correlation coefficients of 0.77 to 0.84 on multiple Portuguese STS datasets
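For context on the Pearson figures, STS evaluation usually compares the model's predicted similarity for each sentence pair against a human-annotated gold score. The sketch below shows that computation in plain Python. The numbers are toy values chosen for illustration and are not results from this model.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy values for illustration only (hypothetical, not model output).
predicted = [0.91, 0.35, 0.62, 0.10]  # cosine similarities per sentence pair
gold = [4.8, 2.0, 3.1, 0.5]           # human similarity ratings per pair
score = pearson(predicted, gold)
```

A coefficient near 1 means the predicted similarities rise and fall with the human ratings; the scale of the two sequences does not need to match.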

Model Capabilities

Sentence embedding generation
Semantic similarity calculation
Legal text analysis
Portuguese language processing
Text clustering

Use Cases

Legal Text Processing
Legal Document Semantic Search
Implementing semantic-based search functionality in legal document repositories
Reported to perform well in the Supreme Court semantic search system
Case Law Similarity Analysis
Automatically calculating semantic similarity between different case law documents
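Both legal use cases above reduce to ranking documents by embedding similarity. The following is a rough sketch of top-k retrieval over precomputed vectors, using short toy vectors in place of real 1024-dimensional embeddings. The function names `cosine` and `search` are hypothetical and not part of any library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, doc_vecs, top_k=3):
    """Return (doc_index, score) pairs for the top_k most similar documents."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real document embeddings.
docs = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
hits = search([1.0, 0.1, 0.0], docs, top_k=2)
```

For case law similarity, the same `cosine` helper can be applied directly to the embeddings of two decisions. A brute-force scan like this is adequate for small collections; larger repositories would normally use a vector index instead.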
General Text Processing
Text Clustering
Automatically grouping Portuguese documents with similar content
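One simple way to group documents by embedding similarity is a greedy threshold pass, sketched below. This is an illustrative choice and not the method used by the model's authors; the `threshold` value and the function name `greedy_cluster` are assumptions, and the vectors are toy data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_cluster(vectors, threshold=0.8):
    """Assign each vector to the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of indices; the first index is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            # No existing seed was close enough, so start a new cluster.
            clusters.append([i])
    return clusters


# Toy 2-dimensional vectors standing in for real document embeddings.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
groups = greedy_cluster(vectors, threshold=0.8)
```

The result depends on input order and on the threshold, so for production use an established algorithm such as k-means or agglomerative clustering is likely a better fit.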