# VeriGen
VeriGen is a model for automated Verilog RTL code generation. It is a fine-tuned version of an existing code-generation model, trained on a Verilog dataset, and is intended for hardware description language development.
## Quick Start
The model is ready for inference and can be tried directly through the hosted widget; an example input is `module display_hello_word`.
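For a quick local try outside the widget, a minimal sketch using the `transformers` text-generation pipeline is shown below; the checkpoint name is the one used in the usage example later in this card.
```python
# Minimal sketch: run the quick-start prompt locally with the transformers pipeline.
from transformers import pipeline

generator = pipeline("text-generation", model="shailja/fine-tuned-codegen-2B-Verilog")
print(generator("module display_hello_word", max_length=64)[0]["generated_text"])
```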
⨠Features
- Fine - Tuned Model: A 2B parameter fine - tuned version of CodeGen - multi - 2B.
- Trained on Specific Dataset: Trained on Verilog Dataset.
- Long Context Length: Supports a context length of 2048.
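Staying within the 2048-token window means budgeting the prompt length against the requested generation length. The sketch below is only illustrative; it uses the fine-tuned checkpoint from the usage example later in this card.
```python
# Sketch: budget generation length against the 2048-token context window.
from transformers import AutoTokenizer

CONTEXT_LENGTH = 2048
tokenizer = AutoTokenizer.from_pretrained("shailja/fine-tuned-codegen-2B-Verilog")

prompt = "//module half adder "
prompt_tokens = len(tokenizer(prompt).input_ids)
print(f"{prompt_tokens} prompt tokens; up to {CONTEXT_LENGTH - prompt_tokens} tokens available for generation")
```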
## Documentation
### Model Summary
VeriGen is a 2B-parameter model fine-tuned from CodeGen-multi-2B. It was trained on the Verilog dataset with a context length of 2048 tokens.
## Usage Examples
### Basic Usage
The model was trained on Verilog from GitHub and textbooks. It is not an instruction-tuned model, but adding a partial module header such as `module mux` to the prompt can make it a capable Verilog teaching assistant.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Use the fine-tuned 2B Verilog checkpoint for both tokenizer and model
checkpoint = "shailja/fine-tuned-codegen-2B-Verilog"
device = "cuda"
prompt = "//module half adder "

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
# do_sample=True so that temperature and top_p take effect
sample = model.generate(input_ids, do_sample=True, max_length=128, temperature=0.5, top_p=0.9)
# Truncate at the first "endmodule" and re-append the keyword
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
```
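The model and tokenizer loaded above can be reused with a more structured prompt; the module header below is a hypothetical example in the spirit of the `module mux` suggestion.
```python
# Hypothetical prompt: a partial module header with an explicit port list.
prompt = "module mux(input a, input b, input sel, output out);\n"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
sample = model.generate(input_ids, do_sample=True, max_length=256, temperature=0.5, top_p=0.9)
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
```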
## Limitations
The model is trained on Verilog source code from open sources. The predominant natural language in the source code is English, though other languages are also present. The generated Verilog code is not guaranteed to work as intended; it may be inefficient and may contain bugs or exploits. For an in-depth discussion of the model's limitations, see [the paper](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view).
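Because a completion may not even parse, a lightweight safeguard is to run generated RTL through a Verilog front end before using it. The sketch below assumes Icarus Verilog (`iverilog`) is installed locally; it is not part of the VeriGen tooling.
```python
# Sketch: syntax-check generated Verilog with Icarus Verilog (assumes `iverilog` is on PATH).
import subprocess
import tempfile

def parses_ok(verilog_source: str) -> bool:
    """Return True if iverilog can parse and elaborate the given source."""
    with tempfile.NamedTemporaryFile("w", suffix=".v", delete=False) as f:
        f.write(verilog_source)
        path = f.name
    # -t null: run the front end only, without producing an output file
    result = subprocess.run(["iverilog", "-t", "null", path], capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stderr)
    return result.returncode == 0
```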
## Technical Details
### Model
- Architecture: GPT-2 model with multi-query attention
- Pretraining steps: 150k
- Pretraining tokens: ~72B
- Precision: fp16
### Hardware
- GPUs: 3 Tesla A100
- Training time: 8 days
## License
The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can find the full agreement [here](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement).
## Citation
```bibtex
@misc{https://doi.org/10.48550/arxiv.2212.11140,
  doi       = {10.48550/ARXIV.2212.11140},
  url       = {https://arxiv.org/abs/2212.11140},
  author    = {Thakur, Shailja and Ahmad, Baleegh and Fan, Zhenxing and Pearce, Hammond and Tan, Benjamin and Karri, Ramesh and Dolan-Gavitt, Brendan and Garg, Siddharth},
  title     = {Benchmarking Large Language Models for Automated Verilog RTL Code Generation},
  publisher = {arXiv},
  year      = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```
| Property | Details |
|----------|---------|
| Pipeline Tag | text-generation |
| Inference | true |
| Model Type | Fine-tuned version of CodeGen-multi-2B |
| Training Data | Verilog Dataset |
| Library Name | transformers |
| License | bigcode-openrail-m |
| Datasets | shailja/Verilog_GitHub |
## Important Note
The pretraining dataset was not filtered for permissive licenses only. The model can generate source code verbatim from its training data, and that code's license may require attribution and/or compliance with other specific terms. Please read the BigCode [OpenRAIL-M license](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) agreement before accepting it.