ChessGPT-chat-v1 Open-source Dialogue Model - Free Deployment, Focused on Question Answering in the Field of Chess

Chessgpt Chat V1

Developed by Waterhorse

A 2.8-billion-parameter dialogue model specializing in chess, produced by supervised fine-tuning (SFT) of Chessgpt-Base-v1

Large Language Model

Transformers

Language: English
Open Source License: Apache-2.0
Tags: #Chess Dialogue #2.8 Billion Parameters #Strategy Learning

Downloads 218

Release Time: 6/3/2023

Model Overview

A dialogue language model specialized in chess, capable of discussing and answering chess-related questions

Model Features

Chess Domain Specialization

Trained specifically for chess, capable of understanding and generating chess-related content

Dialogue Capability

Supports multi-turn dialogue interactions, able to comprehend and respond to chess-related questions

Open-Source License

Released under Apache 2.0 license, allowing commercial and research use

Model Capabilities

Chess knowledge Q&A

Chess opening analysis

Chess terminology explanation

Multi-turn dialogue interaction

Use Cases

Chess Learning

Opening Analysis

Answers questions about chess openings

Can accurately identify and explain common openings like the Sicilian Defense

Game Discussion

Engages in discussions about specific chess games

AI Research

Strategy Learning Research

Used for studying language model applications in strategy learning

🚀 Chessgpt-Chat-v1

Chessgpt-Chat-v1 is the supervised fine-tuned (SFT) version of Chessgpt-Base-v1, aiming to provide high-quality language processing capabilities in the chess domain.

🚀 Quick Start

Base Model: Chessgpt-base-v1
Chat Version: Chessgpt-chat-v1

We are also actively developing the next-generation model, ChessGPT-V2. Contributions are welcome, especially chess-related datasets. To get involved, please contact xidong.feng.20@ucl.ac.uk.

✨ Features

Model Type: Language model.
Language: English.
License: Apache 2.0.
Model Description: A 2.8B-parameter pretrained language model for chess.

📦 Installation

Inference in float16, as in the example code, requires a GPU with 8GB of memory.
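As a rough sanity check on the 8GB figure, the weight footprint can be estimated from the parameter count and the bytes per parameter. The sketch below assumes a nominal 2.8 billion parameters and counts weights only; activations, the attention cache, and CUDA overhead are not included, so real usage will be higher. The function name `weight_memory_gb` is illustrative.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Estimate weight storage in gigabytes (10^9 bytes); weights only."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 2.8e9  # assumption: nominal parameter count from the model card

fp16_gb = weight_memory_gb(NUM_PARAMS, 2)  # float16: 2 bytes per parameter
fp32_gb = weight_memory_gb(NUM_PARAMS, 4)  # float32: 4 bytes per parameter
print(f"float16 weights: ~{fp16_gb:.1f} GB, float32 weights: ~{fp32_gb:.1f} GB")
```

Under these assumptions the float16 weights take about 5.6 GB, which leaves some headroom on an 8GB card, while float32 weights (about 11.2 GB) would not fit.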

import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version
# (compare parsed versions: as plain strings, '4.9.0' would sort above '4.25.1')
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("Waterhorse/chessgpt-chat-v1")
model = AutoModelForCausalLM.from_pretrained("Waterhorse/chessgpt-chat-v1", torch_dtype=torch.float16)
model = model.to('cuda:0')

# infer
# Conversation between two
prompt = "A friendly, helpful chat between some humans.<|endoftext|>Human 0: 1.e4 c5, what is the name of this opening?<|endoftext|>Human 1:"
# Conversation between more than two
#prompt = "A friendly, helpful chat between some humans.<|endoftext|>Human 0: 1.e4 c5, what is the name of this opening?<|endoftext|>Human 1: Sicilian defense.<|endoftext|>Human 2:"

inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True,
)
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
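
The prompt strings above follow a fixed pattern: a system line, then each turn written as `Human N: text`, all joined by `<|endoftext|>`, ending with an open `Human N:` label for the model to complete. A minimal sketch of helpers for assembling such prompts and trimming the generated text might look like the following. `build_prompt` and `extract_reply` are hypothetical names, not part of the released ChessGPT code, and the assumption that the model ends its turn with `<|endoftext|>` is inferred from the prompt format rather than documented.

```python
from typing import List, Tuple

SYSTEM = "A friendly, helpful chat between some humans."
SEP = "<|endoftext|>"


def build_prompt(turns: List[Tuple[int, str]], next_speaker: int) -> str:
    """Join the system line, prior turns, and an open speaker label with SEP."""
    parts = [SYSTEM]
    parts += [f"Human {speaker}: {text}" for speaker, text in turns]
    parts.append(f"Human {next_speaker}:")  # left open for the model to fill in
    return SEP.join(parts)


def extract_reply(generated: str) -> str:
    """Keep only the text before the first separator (assumed end of turn)."""
    return generated.split(SEP, 1)[0].strip()


question = "1.e4 c5, what is the name of this opening?"
two_party = build_prompt([(0, question)], next_speaker=1)
three_party = build_prompt([(0, question), (1, "Sicilian defense.")], next_speaker=2)
print(two_party)
```

Here `two_party` and `three_party` reproduce the two example prompts from the snippet above. For a longer two-person chat, one plausible convention is to alternate speakers 0 and 1 across turns, though the source shows only single-question examples.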

📚 Documentation

Uses

Excluded uses are described below.

Direct Use

chessgpt-chat-v1 is intended mainly for research on large language models, especially research on policy learning and language modeling.

Out-of-Scope Use

chessgpt-chat-v1 is a language model trained on chess-related data and may not perform well on use cases outside the chess domain.

Bias, Risks, and Limitations

Just as with any language model, chessgpt-chat-v1 carries inherent limitations that necessitate careful consideration. Specifically, it may occasionally generate responses that are irrelevant or incorrect, particularly when tasked with interpreting complex or ambiguous queries. Additionally, given that its training is rooted in online data, the model may inadvertently reflect and perpetuate common online stereotypes and biases.

Evaluation

Please refer to our paper and code for benchmark results.

Citation Information

@article{feng2023chessgpt,
  title={ChessGPT: Bridging Policy Learning and Language Modeling},
  author={Feng, Xidong and Luo, Yicheng and Wang, Ziyan and Tang, Hongrui and Yang, Mengyue and Shao, Kun and Mguni, David and Du, Yali and Wang, Jun},
  journal={arXiv preprint arXiv:2306.09200},
  year={2023}
}

📄 License

This project is licensed under the Apache 2.0 license.
