🚀 GPT-2 Base Thai
GPT-2 Base Thai is a causal language model for Thai, built on the OpenAI GPT-2 architecture to provide high-quality text generation and feature extraction.
🚀 Quick Start
GPT-2 Base Thai is a causal language model based on the OpenAI GPT-2 model. It was trained from scratch on the `unshuffled_deduplicated_th` subset of the OSCAR dataset, reaching an evaluation loss of 1.708 and an evaluation perplexity of 5.516.
This model was trained using Hugging Face's Flax framework as part of the JAX/Flax Community Week organized by Hugging Face. All training was done on a TPUv3-8 VM, sponsored by the Google Cloud team.
All scripts used for training can be found in the Files and versions tab, along with the training metrics logged via TensorBoard.
✨ Features
- Based on the OpenAI GPT-2 architecture, suitable for Thai language tasks.
- Trained from scratch on the `unshuffled_deduplicated_th` subset of the OSCAR dataset.
- Reached an evaluation loss of 1.708 and an evaluation perplexity of 5.516.
- Trained with Hugging Face's Flax framework during the JAX/Flax Community Week.
📚 Documentation
Model
| Property | Details |
|----------|---------|
| Model Type | `gpt2-base-thai` |
| #params | 124M |
| Architecture | GPT-2 |
| Training Data | `unshuffled_deduplicated_th` subset of the OSCAR dataset |
Evaluation Results
The model was trained for 3 epochs; the final results at the end of training were:
| Property | Details |
|----------|---------|
| Train Loss | 1.638 |
| Valid Loss | 1.708 |
| Valid PPL | 5.516 |
| Total Time | 6:12:34 |
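As a sanity check, the reported perplexity follows directly from the reported loss: for a causal language model, perplexity is the exponential of the mean per-token cross-entropy loss. A minimal sketch (the tiny gap from the reported 5.516 comes from rounding of the logged loss):

```python
import math

# Perplexity is exp(mean per-token cross-entropy loss)
valid_loss = 1.708
valid_ppl = math.exp(valid_loss)
print(round(valid_ppl, 3))  # close to the reported 5.516
```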
💻 Usage Examples
Basic Usage
As Causal Language Model
```python
from transformers import pipeline

pretrained_name = "flax-community/gpt2-base-thai"

nlp = pipeline(
    "text-generation",
    model=pretrained_name,
    tokenizer=pretrained_name
)

# Generate a continuation of a Thai prompt ("Good morning")
nlp("สวัสดีตอนเช้า")
```
Feature Extraction in PyTorch
```python
from transformers import GPT2Model, GPT2TokenizerFast

pretrained_name = "flax-community/gpt2-base-thai"
model = GPT2Model.from_pretrained(pretrained_name)
tokenizer = GPT2TokenizerFast.from_pretrained(pretrained_name)

# Encode a Thai prompt ("Good morning") and run it through the model
# to obtain per-token hidden states
prompt = "สวัสดีตอนเช้า"
encoded_input = tokenizer(prompt, return_tensors='pt')
output = model(**encoded_input)
```
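The `output.last_hidden_state` tensor has shape `(batch, sequence_length, hidden_size)`, where `hidden_size` is 768 for a base-size GPT-2 model. One common way to turn these per-token states into a single sentence vector is mean pooling over the token axis, sketched here with a dummy NumPy array so it runs without downloading the model:

```python
import numpy as np

# Dummy stand-in for output.last_hidden_state.detach().numpy():
# batch of 1, 5 tokens, 768 hidden dimensions
hidden_states = np.random.rand(1, 5, 768)

# Mean-pool over the token axis to get one vector per sentence
sentence_vector = hidden_states.mean(axis=1)
print(sentence_vector.shape)  # (1, 768)
```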
📄 License
This project is licensed under the MIT license.