Codebert Base Cd Ft

Developed by mchochlov
This is a sentence-transformers-based model specifically fine-tuned for code clone detection tasks, capable of mapping code snippets into a 768-dimensional vector space.
Downloads 5,080
Release Time: 8/16/2022

Model Overview

The model is based on the CodeBERT architecture and was fine-tuned with contrastive learning on the BigCloneBench dataset. It is primarily used for code similarity computation and code clone detection.

Model Features

Code-Specific Embedding
Vector representations optimized for code snippets, better capturing semantic features of code.
Clone Detection Optimization
Fine-tuned on the BigCloneBench dataset using contrastive learning, making it particularly suitable for code clone detection scenarios.
High-Dimensional Semantic Representation
Generates 768-dimensional dense vectors that effectively represent deep semantic features of code.
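
A minimal sketch of how the 768-dimensional vectors might be obtained, assuming the sentence-transformers package is installed and that the model is published on the Hugging Face Hub under the id mchochlov/codebert-base-cd-ft. That id is an assumption inferred from the developer and model name above, so verify it before use. The helper names load_model and embed_snippets are hypothetical.

```python
from typing import List, Sequence

EXPECTED_DIM = 768  # embedding size stated in the model description

# Assumed Hugging Face Hub id (not confirmed by this page); verify before use.
MODEL_ID = "mchochlov/codebert-base-cd-ft"


def load_model(model_id: str = MODEL_ID):
    """Load the model with sentence-transformers (imported lazily)."""
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_id)


def embed_snippets(model, snippets: Sequence[str]) -> List[List[float]]:
    """Encode code snippets and check that each vector has the expected size."""
    vectors = model.encode(list(snippets))
    rows = [[float(x) for x in row] for row in vectors]
    for row in rows:
        if len(row) != EXPECTED_DIM:
            raise ValueError(f"expected {EXPECTED_DIM} dimensions, got {len(row)}")
    return rows
```

Typical use would be embed_snippets(load_model(), ["int add(int a, int b) { return a + b; }"]), which downloads the model on first call.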

Model Capabilities

Code Similarity Computation
Code Clone Detection
Code Feature Extraction

Use Cases

Code Analysis
Code Clone Detection
Identify similarities between different code snippets to detect potential code clones.
Can detect Type-1 through Type-4 code clones, that is, from exact copies up to functionally equivalent code with different syntax.
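
Clone detection on top of embeddings usually reduces to comparing two vectors. The sketch below assumes cosine similarity as the measure and a cutoff of 0.8. Both the function names and the threshold are illustrative assumptions, not values documented for this model, and the cutoff should be tuned on labeled clone pairs.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_clone(vec_a: Sequence[float], vec_b: Sequence[float],
             threshold: float = 0.8) -> bool:
    """Flag a pair as a likely clone when similarity reaches the threshold.

    The default threshold is a hypothetical starting point.
    """
    return cosine_similarity(vec_a, vec_b) >= threshold
```

In practice vec_a and vec_b would be the 768-dimensional embeddings of the two snippets being compared.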
Code Search
Rank search results by semantic similarity between the query and candidate snippets for more precise code search.
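
Semantic code search can be sketched as ranking pre-computed snippet embeddings against a query embedding. The example assumes cosine similarity as the ranking score. The names search and _normalize are hypothetical, and the brute-force scan is only suitable for small corpora.

```python
import math
from typing import List, Sequence, Tuple


def _normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def search(query_vec: Sequence[float],
           corpus_vecs: Sequence[Sequence[float]],
           top_k: int = 3) -> List[Tuple[int, float]]:
    """Return (corpus index, cosine score) pairs, best match first."""
    q = _normalize(query_vec)
    scored = [
        (i, sum(x * y for x, y in zip(q, _normalize(v))))
        for i, v in enumerate(corpus_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

The returned indices point back into the list of snippets that produced corpus_vecs.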
Code Quality
Duplicate Code Identification
Identify duplicate or highly similar code fragments in large codebases.
Helps reduce code redundancy and improve maintainability.
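
For duplicate identification across a codebase, one possible approach is to compare every pair of fragment embeddings and merge pairs above a similarity cutoff into groups. The sketch below uses a union-find structure for the merging. The function name group_duplicates and the 0.9 threshold are assumptions for illustration, and the pairwise loop is quadratic, so a large codebase would need an approximate nearest-neighbor index instead.

```python
import math
from typing import Dict, List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def group_duplicates(vectors: Sequence[Sequence[float]],
                     threshold: float = 0.9) -> List[List[int]]:
    """Group indices of fragments whose embeddings are mutually linked
    by at least one pair with similarity >= threshold."""
    n = len(vectors)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if _cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Only groups with more than one member are duplicates.
    return [g for g in groups.values() if len(g) > 1]
```

Each returned group lists the positions of fragments that are candidates for consolidation.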