Msmarco Distilbert Base Tas B Mmarco Pt 300k

Developed by mpjan
This is a Portuguese sentence embedding model based on the DistilBERT architecture, specifically optimized for semantic similarity tasks.
Downloads 37
Release Time: 11/5/2022

Model Overview

The model maps Portuguese sentences and paragraphs into a 768-dimensional vector space, suitable for natural language processing tasks such as clustering and semantic search.
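A minimal sketch of producing such embeddings might look like the following. It assumes the model is distributed in a format that the `sentence-transformers` library can load, which this page does not state. The model identifier is likewise an assumption, inferred from the page title and the developer name. `has_expected_dim` is a hypothetical helper added only to check the 768-dimensional claim.

```python
from typing import List, Sequence

# Assumption: this Hub identifier is inferred from the page title and the
# developer name; it is not confirmed anywhere on this page.
ASSUMED_MODEL_ID = "mpjan/msmarco-distilbert-base-tas-b-mmarco-pt-300k"
EXPECTED_DIM = 768  # vector size stated in the model overview


def embed(sentences: Sequence[str], model_id: str = ASSUMED_MODEL_ID) -> List[List[float]]:
    """Encode sentences into dense vectors.

    Requires `pip install sentence-transformers` and network access on first use.
    """
    # Imported lazily so the helper below works without the dependency installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    # encode() returns one vector per input sentence as a NumPy array.
    return model.encode(list(sentences)).tolist()


def has_expected_dim(vectors: Sequence[Sequence[float]], dim: int = EXPECTED_DIM) -> bool:
    """Return True if every vector has the expected number of dimensions."""
    return all(len(v) == dim for v in vectors)
```

Calling `embed(["O tempo está ótimo hoje.", "Está um dia de sol."])` and passing the result to `has_expected_dim` is one way to confirm that the loaded model matches the description above.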

Model Features

Portuguese language optimization
Fine-tuned specifically on Portuguese text, with the aim of improving performance on Portuguese semantic understanding tasks
Efficient architecture
Based on the DistilBERT architecture, it is smaller and faster than standard BERT models while retaining most of their performance
Semantic vector representation
Converts text into 768-dimensional dense vectors, capturing deep semantic information

Model Capabilities

Text vectorization
Semantic similarity calculation
Text clustering
Semantic search
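Semantic similarity calculation reduces to comparing two embedding vectors. The sketch below shows two common scoring functions, dot product and cosine similarity, in plain Python. Which score suits this model better depends on how it was trained, which this page does not state, so treat the choice as an assumption to verify.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)
```

Cosine similarity ignores vector length and compares direction only, while the dot product also rewards larger magnitudes.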

Use Cases

Information retrieval
Portuguese document search
Building a semantics-based Portuguese search engine
Can surface relevant documents that keyword search misses, because matching is by meaning rather than exact terms
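The search use case can be sketched as ranking document embeddings by their similarity to a query embedding. The example below is a brute-force illustration over precomputed vectors, using cosine similarity as an assumed scoring function. The function names are hypothetical, and a real deployment over many documents would more likely rely on a vector index.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def search(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the top_k most similar documents."""
    q = normalize(query_vec)
    scored = []
    for idx, doc in enumerate(doc_vecs):
        d = normalize(doc)
        # The dot product of unit vectors equals their cosine similarity.
        score = sum(a * b for a, b in zip(q, d))
        scored.append((score, idx))
    # nlargest returns the best-scoring entries in descending order.
    return [(idx, score) for score, idx in heapq.nlargest(top_k, scored)]
```

In use, `query_vec` would be the embedding of the user's query and `doc_vecs` the embeddings of the indexed Portuguese documents, all produced by the same model.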
Text analysis
Portuguese text clustering
Automatically categorizing Portuguese customer feedback or reviews
Identifies thematic patterns in text without manual labeling
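Clustering without manual labels can be illustrated with a small k-means routine over embedding vectors. This is a simplified sketch, not the method used by any particular product. It seeds the centroids with the first `k` vectors to stay deterministic, which is an assumption made for brevity. A library implementation such as scikit-learn's `KMeans` would normally be preferred.

```python
from typing import List, Optional, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def kmeans(vectors: Sequence[Sequence[float]], k: int, iterations: int = 20) -> List[int]:
    """Assign each vector a cluster label in range(k)."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    # Simplification: seed centroids with the first k vectors.
    centroids = [list(v) for v in vectors[:k]]
    labels: Optional[List[int]] = None
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = []
        for v in vectors:
            distances = [squared_distance(v, c) for c in centroids]
            new_labels.append(distances.index(min(distances)))
        # Update step: move each centroid to the mean of its members.
        for cluster in range(k):
            members = [v for v, label in zip(vectors, new_labels) if label == cluster]
            if members:
                centroids[cluster] = mean_vector(members)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return new_labels
```

Applied to embeddings of customer feedback, reviews that share a label can then be inspected together to find a common theme.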