
Gbert Large Paraphrase Euclidean

Developed by deutsche-telekom
German sentence embedding model for the sentence-transformers library. It maps text to a 1024-dimensional vector space and is optimized for few-shot classification.
Downloads: 19.03k
Release Time: 1/13/2023

Model Overview

This is a German sentence embedding model built on deepset/gbert-large. It was trained with Euclidean distance as the similarity metric and is designed to improve German few-shot classification when combined with SetFit.
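A minimal sketch of how the model might be loaded and compared with Euclidean distance. The Hub ID in `MODEL_ID` is an assumption inferred from the name of the cosine sibling model, and `embed`, `euclidean_distance`, and `demo` are hypothetical helper names. The sentence-transformers import is kept inside `embed` so the distance helper can be used without the library installed.

```python
import math

# Assumed Hub ID, inferred from the cosine sibling model's name.
MODEL_ID = "deutsche-telekom/gbert-large-paraphrase-euclidean"


def embed(sentences):
    """Encode sentences into vectors; downloads the weights on first use."""
    # Imported lazily so the rest of this sketch has no hard dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)
    # encode() returns one vector per input sentence.
    return model.encode(sentences)


def euclidean_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((float(x) - float(y)) ** 2 for x, y in zip(u, v)))


def demo():
    """Embed two German sentences and print their distance."""
    a, b = embed(["Das Wetter ist heute schön.", "Heute scheint die Sonne."])
    print(len(a), euclidean_distance(a, b))
```

Calling `demo()` requires the sentence-transformers package and network access. Smaller distances indicate more similar sentences.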

Model Features

Euclidean distance optimization
Trained with BatchHardSoftMarginTripletLoss using Euclidean distance, so Euclidean distance is the natural metric for comparing its embeddings
High-quality training data
Trained on German back-translation and paraphrase datasets that were rigorously filtered for quality
Few-shot optimization
Specifically designed to improve text classification performance in German few-shot scenarios
Sibling model
A cosine-similarity version is available as a complementary option (deutsche-telekom/gbert-large-paraphrase-cosine)
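To illustrate the idea behind a batch-hard soft-margin triplet loss with Euclidean distance, here is a pure-Python sketch. It is not the sentence-transformers implementation and not the authors' training code. The function name is hypothetical, and skipping anchors that lack a positive or a negative is a simplifying assumption. For each anchor it takes the farthest same-label example and the nearest different-label example, then applies log(1 + exp(d_pos - d_neg)).

```python
import math


def batch_hard_soft_margin_triplet_loss(embeddings, labels):
    """Mean over anchors of log(1 + exp(hardest_pos - hardest_neg))."""
    losses = []
    for i, anchor in enumerate(embeddings):
        pos, neg = [], []
        for j, other in enumerate(embeddings):
            if j == i:
                continue
            d = math.dist(anchor, other)  # Euclidean distance
            (pos if labels[j] == labels[i] else neg).append(d)
        # Simplifying assumption: skip anchors without both kinds of pairs.
        if not pos or not neg:
            continue
        hardest_pos = max(pos)  # farthest example with the same label
        hardest_neg = min(neg)  # nearest example with a different label
        losses.append(math.log1p(math.exp(hardest_pos - hardest_neg)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy 2-D "embeddings": two tight clusters, far apart.
points = [[0, 0], [1, 0], [5, 0], [6, 0]]
well_separated = batch_hard_soft_margin_triplet_loss(points, [0, 0, 1, 1])
mixed_up = batch_hard_soft_margin_triplet_loss(points, [0, 1, 0, 1])
```

The loss is small when same-label points sit closer together than different-label points (`well_separated`) and large when they do not (`mixed_up`).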

Model Capabilities

German text embedding
Sentence similarity calculation
Few-shot learning
Text classification support

Use Cases

Text classification
Few-shot classification tasks
German text classification with limited labeled data
Reported to perform strongly on NLU few-shot benchmark tests
Semantic search
German document retrieval
German document search system based on semantic similarity
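Semantic search with this kind of model amounts to ranking documents by their distance to the query embedding. The sketch below shows that ranking step only. Short hand-written lists stand in for real 1024-dimensional embeddings, and `rank_by_distance` is a hypothetical helper name.

```python
import math


def rank_by_distance(query_vec, doc_vecs):
    """Return document indices ordered from nearest to farthest."""
    distances = [math.dist(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=distances.__getitem__)


# Placeholder vectors; real ones would come from the embedding model.
query = [0.0, 0.0]
documents = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
ranking = rank_by_distance(query, documents)
```

Here `ranking` lists the index of the closest document first. Because the model was trained with Euclidean distance, smaller values mean a better match.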