code_trans_t5_base_code_documentation_generation_java Open-source Model - Generate Descriptive Documentation for Java Functions for Free

Code Trans T5 Base Code Documentation Generation Java

Developed by SEBIS

A T5-based Java code documentation generation model, specifically designed to generate descriptive documentation for Java functions

Release Time: 3/2/2022

Model Overview

This model is pre-trained on the Java programming language and can automatically generate corresponding documentation based on input Java function code.

Model Features

Java Code Specialized Optimization

Pre-trained and optimized specifically for the Java programming language

Enhanced Tokenization Processing

Performs best with tokenized Java functions

Single-Task Training

Focused solely on code documentation generation for superior performance

Model Capabilities

Java Function Documentation Generation

Java Code Comprehension

Text Summarization Generation

Use Cases

Software Development

Automatic API Documentation Generation

Automatically generates API documentation for Java library functions

BLEU score 17.17 (Java)

Code Comprehension Assistance

Helps developers understand the purpose of complex Java functions

🚀 CodeTrans Model for Java Code Documentation Generation

This is a pre-trained model for the Java programming language that uses the T5 base architecture. It was initially released in this repository. The model was trained on tokenized Java functions and works best on tokenized input.

✨ Features

Based on the t5-base model, with its own SentencePiece vocabulary model.
Trained with single-task training on the CodeSearchNet Corpus Java dataset.
Can generate descriptions for Java functions or be fine-tuned for other Java code tasks.
Can handle unparsed and untokenized Java code, but performs better with tokenized code.
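
The tokenized form mentioned above separates every Java token with a single space. A minimal sketch of producing that form with a regular expression is shown here. The helper name `tokenize_java` and the pattern list are assumptions made for illustration and are not part of the CodeTrans release. The sketch ignores comments and covers only a few multi-character operators, so a real Java lexer would be more robust.

```python
import re

# Hypothetical, simplified token patterns (an assumption, not the CodeTrans tokenizer).
_PATTERNS = [
    r'"(?:\\.|[^"\\])*"',                  # string literal
    r"[A-Za-z_$][A-Za-z0-9_$]*",           # identifier or keyword
    r"\d+(?:\.\d+)?",                      # integer or decimal number
    r"==|!=|<=|>=|&&|\|\||\+\+|--|->|::",  # a few two-character operators
    r"\S",                                 # any other single non-space character
]
_TOKEN_RE = re.compile("|".join(_PATTERNS))


def tokenize_java(code: str) -> str:
    """Return the Java source with its tokens separated by single spaces."""
    return " ".join(_TOKEN_RE.findall(code))


raw = "public static <T,U> Function<T,U> castFunction(Class<U> target){return new CastToClass<T,U>(target);}"
print(tokenize_java(raw))
```

The printed string can be passed to the model in place of the raw source.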

📦 Installation

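The original model card gives no installation steps. The usage example below depends on the Hugging Face transformers library, a backend such as PyTorch, and typically the sentencepiece package for the tokenizer.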

💻 Usage Examples

Basic Usage

The following example generates documentation for a Java function with the Transformers SummarizationPipeline:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, SummarizationPipeline

model_name = "SEBIS/code_trans_t5_base_code_documentation_generation_java"

# T5 is an encoder-decoder model, so load it with the seq2seq auto class
# (AutoModelWithLMHead is deprecated in recent Transformers releases).
pipeline = SummarizationPipeline(
    model=AutoModelForSeq2SeqLM.from_pretrained(model_name),
    tokenizer=AutoTokenizer.from_pretrained(model_name, skip_special_tokens=True),
    device=0,  # first GPU; use device=-1 to run on CPU
)

# The input is a Java function with its tokens separated by single spaces.
tokenized_code = "public static < T , U > Function < T , U > castFunction ( Class < U > target ) { return new CastToClass < T , U > ( target ) ; }"
print(pipeline([tokenized_code]))

Run this example in colab notebook.
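
SummarizationPipeline returns a list with one dictionary per input, and the generated text sits under the `summary_text` key. The sketch below shows one way to turn such a result into a Javadoc comment. The helper `to_javadoc` is hypothetical, and the sample output is made up for illustration; it is not real output from this model.

```python
import textwrap


def to_javadoc(summary: str, width: int = 76) -> str:
    """Wrap a generated description in a Javadoc comment block."""
    # Fall back to one empty line so an empty summary still yields a valid block.
    lines = textwrap.wrap(summary.strip(), width=width) or [""]
    body = "\n".join(f" * {line}".rstrip() for line in lines)
    return f"/**\n{body}\n */"


# Made-up example with the same shape as the pipeline's return value.
sample_output = [{"summary_text": "Returns a function that casts its input to the target class."}]
print(to_javadoc(sample_output[0]["summary_text"]))
```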

📚 Documentation

Model description

This CodeTrans model is based on the t5-base model and has its own SentencePiece vocabulary model. It was trained with single-task training on the CodeSearchNet Corpus Java dataset.

Intended uses & limitations

The model can be used to generate descriptions for Java functions, or it can be fine-tuned on other Java code tasks. It accepts unparsed and untokenized Java code, but performance should be better when the code is tokenized.

🔧 Technical Details

Property	Details
Model Type	CodeTrans model for code documentation generation in Java, based on t5-base
Training Data	The datasets for the supervised training tasks can be downloaded on Link

Evaluation results

For the code documentation tasks, different models achieve the following results on different programming languages (in BLEU score):

Language / Model	Python	Java	Go	PHP	Ruby	JavaScript
CodeTrans-ST-Small	17.31	16.65	16.89	23.05	9.19	13.7
CodeTrans-ST-Base	16.86	17.17	17.16	22.98	8.23	13.17
CodeTrans-TF-Small	19.93	19.48	18.88	25.35	13.15	17.23
CodeTrans-TF-Base	20.26	20.19	19.50	25.84	14.07	18.25
CodeTrans-TF-Large	20.35	20.06	19.54	26.18	14.94	18.98
CodeTrans-MT-Small	19.64	19.00	19.15	24.68	14.91	15.26
CodeTrans-MT-Base	20.39	21.22	19.43	26.23	15.26	16.11
CodeTrans-MT-Large	20.18	21.87	19.38	26.08	15.00	16.23
CodeTrans-MT-TF-Small	19.77	20.04	19.36	25.55	13.70	17.24
CodeTrans-MT-TF-Base	19.77	21.12	18.86	25.79	14.24	18.62
CodeTrans-MT-TF-Large	18.94	21.42	18.77	26.20	14.19	18.83
State of the art	19.06	17.65	18.07	25.16	12.16	14.90
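
The table reports BLEU but does not say which variant or smoothing was used. As a rough illustration of what the metric measures, here is a minimal sketch of a smoothed sentence-level BLEU on a 0 to 100 scale. The function name `sentence_bleu`, the whitespace tokenization, and the add-one smoothing on every n-gram order are assumptions of this sketch. It is not the evaluation script behind the numbers above and will not reproduce them.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference: str, candidate: str, max_n: int = 4) -> float:
    """Smoothed BLEU of one candidate against one reference (illustrative only)."""
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped matches: a candidate n-gram counts at most as often as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        # Add-one smoothing keeps a single missing order from zeroing the score.
        log_precisions.append(math.log((overlap + 1) / (total + 1)))
    # Brevity penalty: candidates shorter than the reference are penalized.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)


reference = "Returns a function that casts its input to the target class"
print(sentence_bleu(reference, reference))
print(sentence_bleu(reference, "Casts the input to the target class"))
```

An exact match scores 100, and the score falls as n-gram overlap with the reference shrinks or the candidate gets shorter than the reference.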

Created by Ahmed Elnaggar and Wei Ding
