
CodeBERTaPy

Developed by mrm8488
CodeBERTaPy is a RoBERTa-like model trained on the Python portion of GitHub's CodeSearchNet dataset and optimized specifically for code.
Downloads 66
Release Time: 3/2/2022

Model Overview

CodeBERTaPy is a RoBERTa-like Transformer model optimized for Python code. It uses a 6-layer architecture with 84 million parameters and was trained on the full Python corpus for 4 epochs. Its tokenizer uses a byte-level BPE algorithm trained on code, achieving significantly higher encoding efficiency on source code than natural-language tokenizers.
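Since the model follows the RoBERTa architecture, it can be loaded for masked-token prediction with the Hugging Face transformers library. The sketch below assumes the Hub model ID "mrm8488/codeBERTaPy" (inferred from the author name); verify the exact ID before use.

```python
# Minimal sketch: load CodeBERTaPy in a fill-mask pipeline and predict a masked
# token in a Python snippet. The model ID is an assumption, not confirmed here.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="mrm8488/codeBERTaPy",
    tokenizer="mrm8488/codeBERTaPy",
)

# RoBERTa-style models use "<mask>" as the mask token.
code = "import numpy as np\narr = np.<mask>([1, 2, 3])"
for prediction in fill_mask(code):
    print(prediction["token_str"], round(prediction["score"], 3))
```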

Model Features

Code-optimized tokenizer
Uses a byte-level BPE tokenizer trained specifically on code, producing encoded sequences 33%-50% shorter than those of natural-language tokenizers (see the sketch after this list)
Lightweight architecture
6-layer Transformer with 84 million parameters, comparable in scale to DistilBERT
Python-specific
Trained exclusively on a Python code corpus, giving it a deep understanding of Python syntax
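The tokenizer efficiency claim can be checked directly by comparing encoded sequence lengths against a general-purpose RoBERTa tokenizer. Both model IDs below ("mrm8488/codeBERTaPy" and "roberta-base") are assumptions for illustration, and the 33%-50% figure quoted above will not hold for every snippet.

```python
# Sketch: compare how many tokens each tokenizer needs for the same Python code.
from transformers import AutoTokenizer

code_tok = AutoTokenizer.from_pretrained("mrm8488/codeBERTaPy")  # assumed model ID
text_tok = AutoTokenizer.from_pretrained("roberta-base")

snippet = "def add(a, b):\n    return a + b\n"
print("CodeBERTaPy tokens :", len(code_tok.tokenize(snippet)))
print("roberta-base tokens:", len(text_tok.tokenize(snippet)))
```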

Model Capabilities

Python code completion
Code mask prediction
Code understanding

Use Cases

Code assistance
Variable name prediction
Predicts the correct variable name in loop structures
In an example, the model predicted the variable 'val' with 98% probability
API completion
Predicts framework API calls (e.g., Flask/Keras), as sketched below
Correctly predicted the Flask route parameter 'name' and the Keras layer 'Dense'
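A hedged sketch of the API-completion use case: masking an identifier inside a Flask route and letting the model rank candidates. The snippet and model ID are illustrative assumptions, not reproduced from the original evaluation.

```python
# Sketch: ask the model to complete a masked identifier in a Flask view function.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="mrm8488/codeBERTaPy")  # assumed model ID

flask_snippet = (
    "@app.route('/hello/<name>')\n"
    "def hello(name):\n"
    "    return 'Hello %s!' % <mask>\n"
)
for prediction in fill_mask(flask_snippet):
    print(prediction["token_str"], round(prediction["score"], 3))
```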