GECKO: Generative Language Model for English, Code and Korean
GECKO is a generative language model for English, code, and Korean. Trained on a large-scale corpus, it offers high-quality text generation.
Quick Start
GECKO-7B is a 7B-parameter decoder-only transformer pretrained on 200 billion tokens of Korean, English, and code, including terabytes of Korean text. It is an open-source model released under the Apache 2.0 License. For more details, refer to our technical report.
Features
- Llama Architecture: GECKO uses the Llama architecture, so it integrates easily with other frameworks that support Llama.
Installation
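The README does not pin specific dependencies, so the following is an assumed minimal setup using the standard Hugging Face stack, not an official requirements list:

```shell
# transformers loads the model, accelerate enables device_map="auto"
pip install torch transformers accelerate
```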
Loading the model in half precision (float16 or bfloat16) requires a minimum of roughly 14 GB of memory.
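The ~14 GB figure follows directly from the parameter count: each of the 7B parameters takes 2 bytes in float16/bfloat16, before any activation or framework overhead. A quick back-of-the-envelope check:

```python
# Rough half-precision memory estimate for a 7B-parameter model.
params = 7_000_000_000
bytes_per_param = 2  # float16 / bfloat16
gib = params * bytes_per_param / 1024**3
print(f"{gib:.1f} GiB")  # weights alone: ~13 GiB, hence the ~14 GB guidance
```

The remaining headroom covers activations, the KV cache, and framework overhead during generation.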
Usage Examples
Basic Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = 'kifai/GECKO-7B'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
text = """Explain what this HTML code does, and provide the explanation in English.
```html
<button onclick="alert('Welcome!')">Click Me</button>
```
"""
inputs = tokenizer(text, return_tensors='pt').input_ids.to(model.device)
output = model.generate(inputs, max_new_tokens=512, repetition_penalty=1.2)
print(tokenizer.decode(output[0], skip_special_tokens=True))
Documentation
Model Details
| Property | Details |
|---|---|
| Model Type | GECKO |
| Training Data | A mix of publicly available online data |
| Params | 7B |
| Context Length | 8k |
| GQA | ✗ |
| Tokens | 200B |
| LR | 3.0 × 10⁻⁴ |
Technical Details
GECKO is a generative language model using the Llama architecture. This architecture allows the model to be easily integrated with other frameworks that support Llama.
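This compatibility can be illustrated with the Llama classes in `transformers`. The snippet below builds a tiny, randomly initialized Llama model (the small dimensions are illustrative, not GECKO's actual configuration) to show the architecture-level interface that a full-scale checkpoint like GECKO-7B plugs into:

```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM

# Tiny illustrative config -- GECKO-7B uses the same architecture at full scale.
config = LlamaConfig(
    vocab_size=1000, hidden_size=64, intermediate_size=128,
    num_hidden_layers=2, num_attention_heads=4, num_key_value_heads=4,
)
model = LlamaForCausalLM(config)

# A forward pass returns next-token logits over the vocabulary.
input_ids = torch.randint(0, 1000, (1, 8))
logits = model(input_ids).logits
print(logits.shape)  # (batch, sequence, vocab) = (1, 8, 1000)
```

Because the checkpoint is published in this format, any runtime that consumes Llama-architecture weights can load `kifai/GECKO-7B` without model-specific code.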
Limitations
GECKO is a generative language model and carries the risks common to such models. Testing has mainly been conducted in Korean and has not covered all possible scenarios. As with all large language models, GECKO's outputs cannot be predicted in advance and may sometimes be inaccurate, biased, or otherwise problematic. Developers should therefore conduct safety testing and fine-tune the model for their intended uses before deployment.
License
GECKO is released under the Apache 2.0 license.
Citation
@misc{oh2024gecko,
      title={GECKO: Generative Language Model for English, Code and Korean},
      author={Sungwoo Oh and Donggyu Kim},
      year={2024},
      eprint={2405.15640},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
Acknowledgement
Training was supported by the TPU Research Cloud program.
Contact
We look forward to hearing from you and collaborating with you.