Bert Large Portuguese Cased Legal Mlm Sts V1.0

Developed by stjiris
A Portuguese sentence-transformers model for the legal domain, built on the BERTimbau large model and intended for computing sentence similarity
Downloads 880
Release Time: 11/22/2022

Model Overview

This is a sentence-transformers model that maps sentences and paragraphs into a 1024-dimensional vector space, suitable for tasks such as clustering or semantic search. The model is specifically optimized for the Portuguese legal domain and trained on multiple Portuguese sentence similarity datasets.
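
As a minimal sketch of how such embeddings might be produced, the following assumes the `sentence-transformers` Python package is installed and that the model is published on the Hugging Face Hub as `stjiris/bert-large-portuguese-cased-legal-mlm-sts-v1.0`. That identifier is inferred from the developer and model name above and is not confirmed on this page. The helper names `encode_sentences` and `has_expected_dim` are illustrative only.

```python
from typing import List, Sequence

# Assumed Hub identifier, inferred from the developer and model name; verify before use.
MODEL_ID = "stjiris/bert-large-portuguese-cased-legal-mlm-sts-v1.0"
EXPECTED_DIM = 1024  # embedding size stated in the model overview


def encode_sentences(sentences: Sequence[str], model_id: str = MODEL_ID) -> List[List[float]]:
    """Encode sentences into dense vectors (downloads the model on first call)."""
    # Imported lazily so the rest of this module works without the package installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(list(sentences)).tolist()


def has_expected_dim(vectors: Sequence[Sequence[float]], dim: int = EXPECTED_DIM) -> bool:
    """Return True if every vector has the expected dimensionality."""
    return all(len(v) == dim for v in vectors)
```

Calling `encode_sentences(["O arguido foi absolvido."])` and passing the result to `has_expected_dim` is a quick way to check that the loaded model matches the 1024-dimensional output described here.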

Model Features

Legal domain optimization
Specifically trained and optimized for the Portuguese legal domain, using approximately 30,000 legal documents as training data
High-performance sentence embedding
Maps sentences and paragraphs into a 1024-dimensional dense vector space, supporting semantic search and clustering tasks
Multi-dataset training
Trained on multiple datasets including assin, assin2, and the Portuguese subset of stsb_multi_mt

Model Capabilities

Sentence embedding generation
Semantic similarity calculation
Legal text processing
Portuguese text analysis
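
Semantic similarity between two sentence embeddings is commonly scored with cosine similarity. This page does not name the metric, so treat that choice as an assumption. The sketch below uses only the standard library and works on any two equal-length vectors, such as the 1024-dimensional embeddings this model produces.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; report no similarity instead of dividing by zero.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Scores near 1 indicate sentences the model considers close in meaning, and scores near 0 indicate unrelated ones.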

Use Cases

Legal text processing
Legal document similarity analysis
Compare semantic similarity between different legal documents
Legal case retrieval
Legal case retrieval system based on semantic similarity
General text processing
Document clustering
Automatically group Portuguese documents with similar content
Semantic search
Build a Portuguese search system based on semantics rather than keywords
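
The retrieval and semantic search use cases above can be sketched as a ranking over precomputed embeddings. This is an illustrative brute-force approach, not a method described by the model's authors: it assumes the query and the documents have already been embedded with the model, and the function names `normalize` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document index, cosine score) pairs for the k best matches."""
    q = normalize(query_vec)
    scored = []
    for index, doc in enumerate(doc_vecs):
        d = normalize(doc)
        # The dot product of unit vectors equals their cosine similarity.
        scored.append((index, sum(x * y for x, y in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

For a large collection of legal documents, an approximate nearest-neighbour index would usually replace the linear scan. The scan is kept here because it is easy to verify.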