code_trans_t5_small_api_generation_multitask_finetune Open-source Model - Boosting Java API Recommendation and Generation Optimization

Code Trans T5 Small Api Generation Multitask Finetune

Developed by SEBIS

Java API recommendation generation model based on the T5-small architecture, pre-trained with multi-task training and then fine-tuned

Large Language Model #Java API Recommendation #Multi-task Pretraining #Code Generation

Downloads 22

Release Time: 3/2/2022

Model Overview

This model is designed to generate API usage recommendations for Java programming tasks. Built on the Transformer architecture, it converts a natural-language task description into a sequence of API calls.

Model Features

Multi-task Pretraining

Pre-trained on 13 supervised tasks and 7 unsupervised datasets to enhance model generalization

Domain Fine-tuning

Specially optimized for Java API recommendation tasks to improve recommendation accuracy

Efficient Architecture

Based on T5-small architecture to balance performance and computational efficiency, suitable for actual deployment

Model Capabilities

Java code analysis

API call recommendation generation

Code-to-text conversion

Use Cases

Development Assistance

IDE Intelligent Completion

Provides API call suggestions for Java developers in integrated development environments

Improves development efficiency and reduces documentation lookup time

Code Documentation Generation

Automatically generates API usage instructions based on code snippets

Achieves a BLEU score of 69.29 on Java (multi-task fine-tuned small model, CodeTrans-MT-TF-Small)

🚀 CodeTrans model for API recommendation generation

A pre-trained model for API recommendation generation using the T5 small model architecture, which can assist in Java programming tasks.

🚀 Quick Start

This is a pre-trained model for API recommendation generation, built on the T5 small model architecture. It was initially released in this repository.

✨ Features

Based on the t5-small model with its own SentencePiece vocabulary model.
Trained on 13 supervised tasks in software development and 7 unsupervised datasets through multi-task training.
Fine-tuned for API recommendation generation for Java APIs.

📦 Installation

The original model card provides no specific installation steps. The usage example below relies on the Hugging Face Transformers library.

💻 Usage Examples

Basic Usage

Here's how to use this model to generate Java API recommendations using the Transformers SummarizationPipeline:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, SummarizationPipeline

model_name = "SEBIS/code_trans_t5_small_api_generation_multitask_finetune"

# AutoModelForSeq2SeqLM replaces the deprecated AutoModelWithLMHead for T5-style models.
pipeline = SummarizationPipeline(
    model=AutoModelForSeq2SeqLM.from_pretrained(model_name),
    tokenizer=AutoTokenizer.from_pretrained(model_name, skip_special_tokens=True),
    device=0,  # first GPU; use device=-1 to run on CPU
)

# Natural-language task description, with spaces around punctuation.
tokenized_code = "parse the uses licence node of this package , if any , and returns the license definition if theres"
pipeline([tokenized_code])

Run this example in colab notebook.
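
The pipeline call above returns a list with one dictionary per input. Below is a minimal sketch of a wrapper around it. It assumes that queries should be lowercased with punctuation separated by spaces, as the sample query appears to be; the exact preprocessing used for training is not stated here. The helper names `normalize_query` and `recommend_apis` are hypothetical and not part of the model card.

```python
import re


def normalize_query(text):
    """Lowercase the text and put spaces around punctuation.

    Assumption: this only mimics the look of the sample query; the
    preprocessing used for training is not given in this document.
    """
    text = text.lower()
    # Surround every punctuation character with spaces.
    text = re.sub(r"([^\w\s])", r" \1 ", text)
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


def recommend_apis(pipe, query):
    """Run a summarization-style pipeline on one query and return the generated text."""
    outputs = pipe([normalize_query(query)])
    # SummarizationPipeline returns one dict per input, keyed by "summary_text".
    return outputs[0]["summary_text"]
```

With the `pipeline` object created earlier, a call could look like `recommend_apis(pipeline, "Read all lines of a text file.")`.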

📚 Documentation

Model Description

This CodeTrans model is based on the t5-small model and has its own SentencePiece vocabulary model. It was pre-trained with multi-task training on 13 supervised tasks in the software development domain and 7 unsupervised datasets, then fine-tuned on the API recommendation generation task for Java APIs.

Intended Uses & Limitations

The model can be used to generate API usage recommendations for Java programming tasks.

🔧 Technical Details

Training Data

The supervised training tasks datasets can be downloaded on Link

Training Procedure

Multi-task Pretraining

The model was trained on a single TPU Pod V3-8 for 500,000 steps in total, using sequence length 512 (batch size 4096). It has approximately 220M parameters in total and uses the encoder-decoder architecture. The optimizer is AdaFactor with an inverse square root learning rate schedule for pre-training.
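
As an illustration of the inverse square root schedule mentioned above, here is a minimal sketch. It assumes the formulation commonly used with T5, where the rate stays constant during warmup and then decays as 1/sqrt(step). The `warmup_steps` value of 10,000 is the usual T5 default and is an assumption here, since this document does not state the value used for CodeTrans.

```python
import math


def inverse_sqrt_lr(step, warmup_steps=10_000):
    """Inverse square root schedule: 1 / sqrt(max(step, warmup_steps)).

    Assumption: warmup_steps=10_000 follows the common T5 setting and is
    not confirmed for this model.
    """
    return 1.0 / math.sqrt(max(step, warmup_steps))


# Print the learning rate at a few points across the 500,000 pre-training steps.
for step in (1, 10_000, 40_000, 500_000):
    print(step, inverse_sqrt_lr(step))
```

The rate is flat at 0.01 for the first 10,000 steps and halves each time the step count quadruples after that.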

Fine-tuning

This model was then fine-tuned on a single TPU Pod V2-8 for 1,150,000 steps in total, using sequence length 512 (batch size 256), on a dataset containing only API recommendation generation data.
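
To put the two training phases side by side, the stated step counts and batch sizes can be multiplied out. This is a small arithmetic sketch rather than a figure from the model card: it counts sequences processed, including any repeated passes over the data, and the helper name `sequences_seen` is hypothetical.

```python
def sequences_seen(steps, batch_size):
    """Total sequences processed during training (repeats included)."""
    return steps * batch_size


# Step counts and batch sizes as stated in the training procedure.
pretrain_sequences = sequences_seen(500_000, 4096)
finetune_sequences = sequences_seen(1_150_000, 256)

print("pre-training sequences:", pretrain_sequences)  # 2048000000
print("fine-tuning sequences:", finetune_sequences)   # 294400000
```

Although fine-tuning runs for more steps, its smaller batch size means it processes far fewer sequences than multi-task pre-training.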

Evaluation Results

Property	Details
Model Type	CodeTrans model for API recommendation generation
Training Data	The supervised training tasks datasets can be downloaded on Link

For the API recommendation generation task, the CodeTrans models achieve the following results on Java (in BLEU score):

Test results:

Language / Model	Java
CodeTrans-ST-Small	68.71
CodeTrans-ST-Base	70.45
CodeTrans-TF-Small	68.90
CodeTrans-TF-Base	72.11
CodeTrans-TF-Large	73.26
CodeTrans-MT-Small	58.43
CodeTrans-MT-Base	67.97
CodeTrans-MT-Large	72.29
CodeTrans-MT-TF-Small	69.29
CodeTrans-MT-TF-Base	72.89
CodeTrans-MT-TF-Large	73.39
State of the art	54.42
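
The table can be read more easily with a few lines of arithmetic. The sketch below copies the BLEU scores from the table and computes the best-scoring variant and the margin of the multi-task fine-tuned small model over the listed state of the art. The variable names are illustrative only.

```python
# BLEU scores on Java, copied from the results table.
bleu = {
    "CodeTrans-ST-Small": 68.71,
    "CodeTrans-ST-Base": 70.45,
    "CodeTrans-TF-Small": 68.90,
    "CodeTrans-TF-Base": 72.11,
    "CodeTrans-TF-Large": 73.26,
    "CodeTrans-MT-Small": 58.43,
    "CodeTrans-MT-Base": 67.97,
    "CodeTrans-MT-Large": 72.29,
    "CodeTrans-MT-TF-Small": 69.29,
    "CodeTrans-MT-TF-Base": 72.89,
    "CodeTrans-MT-TF-Large": 73.39,
}
state_of_the_art = 54.42

# Highest-scoring variant in the table.
best_model = max(bleu, key=bleu.get)

# Margin of the multi-task fine-tuned small model over the state of the art.
small_gain = round(bleu["CodeTrans-MT-TF-Small"] - state_of_the_art, 2)

print(best_model, bleu[best_model])
print("MT-TF-Small gain over state of the art:", small_gain)
```

Among the small variants, multi-task pre-training followed by fine-tuning (MT-TF) gives the highest score, ahead of single-task (ST), transfer learning (TF), and multi-task only (MT).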

Created by Ahmed Elnaggar and Wei Ding
