Jinyong Gpt2
Developed by supermy
A text generation model fine-tuned from GPT-2 to imitate the style of Jin Yong's martial arts novels
Downloads 71
Release Time: 12/2/2022
Model Overview
This model generates continuations in Jin Yong's style: given an opening passage, it automatically extends the text while mimicking the language and narrative characteristics of his martial arts novels.
Model Features
Jin Yong Style Imitation
Capable of generating text with the distinctive features of Jin Yong's martial arts novels, including language style, character dialogue, and plot development.
Long Text Generation
Generates coherent continuations of up to 108 tokens (the max_length used in the usage example below) while maintaining contextual consistency.
Random Sampling
Supports random sampling (do_sample=True), making the generated text more diverse and creative.
Model Capabilities
Martial arts novel continuation
Style imitation
Creative writing assistance
Use Cases
Literary creation
Martial arts novel continuation
Generates follow-up content consistent with Jin Yong's style based on provided excerpts
Produces coherent text that aligns with Jin Yong's style
Creative writing assistance
Provides inspiration and material for writers working on martial arts novels
Diverse plot developments and character dialogue suggestions
Educational research
Literary style research
Used for analyzing linguistic features of Jin Yong's literary style
Generates research samples for style analysis
🚀 Flying Snow, Shooting White Deer: AI for Jin Yong's Novels
This AI model is designed to continue Jin Yong's novels based on given openings, offering a unique way to experience the charm of martial arts literature.
🚀 Quick Start
✨ Features
- Generate continuations for Jin Yong's novels with a given text as the starting point.
📦 Installation
The original document does not list specific installation steps.
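As an assumption (the original card does not specify an environment), the usage examples below rely on the Hugging Face transformers library with a PyTorch backend, which can typically be installed with:

pip install transformers torch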
💻 Usage Examples
Basic Usage
# Invoking the fine-tuned model to continue a given opening.
from transformers import AutoTokenizer, GPT2LMHeadModel, TextGenerationPipeline

# Opening passage to be continued (shown here in English translation).
senc = "These snowflakes are falling, so white and beautiful. In a few days, when the sun comes out, each snowflake will disappear without a trace. By next winter, there will be many more snowflakes, but they won't be these ones from this year."
# Fine-tuned checkpoint identifier from the original example; the published Hub model is "supermy/jinyong-gpt2" (see Advanced Usage).
model_id = "jinyong-gpt2-finetuning"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)
text_generator = TextGenerationPipeline(model, tokenizer)
# Reuse the EOS token as padding to silence the missing pad_token_id warning.
text_generator.model.config.pad_token_id = text_generator.model.config.eos_token_id
text_generator(senc, max_length=108, do_sample=True)
[{'generated_text': 'These snowflakes are falling, so white and beautiful. In a few days, when the sun comes out, each snowflake will disappear without a trace. By next winter, there will be many more snowflakes, but they won\'t be these ones from this year. Anyway, God has eyes. I don\'t know what kind of risks there are?” Just as he was speaking, he suddenly heard Xie Xun\'s howl getting closer. He couldn\'t help opening his mouth and exclaiming. Everyone rushed towards him. They only heard Xie Xun let out a roar, and then he slapped out with his left hand, using his palm force to disperse. Everyone was startled and jumped out of the sea at the same time, both taking a step back. Zhang Cuishan and Yin Susu looked at each other. They both thought about how they could resist with the strength of these two great masters and how they could attack the enemy with today\'s strength.'}]
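Because do_sample=True draws tokens at random, each call returns a different continuation. If repeatable output is needed, transformers' set_seed helper can be called first; a minimal sketch reusing the text_generator and senc defined above (the seed value is arbitrary):

from transformers import set_seed

set_seed(42)  # fixes the sampling RNG so repeated runs produce the same continuation
text_generator(senc, max_length=108, do_sample=True)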
Advanced Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("supermy/jinyong-gpt2")
model = AutoModelForCausalLM.from_pretrained("supermy/jinyong-gpt2")
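The original card stops at loading the model here. A minimal sketch of driving generation directly with model.generate is shown below; the prompt and generation settings mirror the pipeline example above and are assumptions rather than part of the original card.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("supermy/jinyong-gpt2")
model = AutoModelForCausalLM.from_pretrained("supermy/jinyong-gpt2")

# Opening to be continued, shown here in English translation.
prompt = "These snowflakes are falling, so white and beautiful."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=108,                          # same length budget as the pipeline example
        do_sample=True,                          # random sampling for more varied continuations
        pad_token_id=model.config.eos_token_id,  # avoid the missing pad_token_id warning
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))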
📚 Documentation
Model Description
The model is used for AI-generated continuations of Jin Yong's novels. Given an opening text, it can generate subsequent content.
Training Data
The model is trained on the collected novels of Jin Yong ("Flying Snow, Shooting White Deer; Laughing Book, Divine Hero, Leaning on Green Mandarin Duck", the couplet formed from the first characters of his fourteen novel titles).
Training Procedure
- Base model: GPT-2
- Training environment: NVIDIA GPU with 16 GB of memory
- BPE tokenization: vocab_size = 30000 (see the sketch below)
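The card states only the tokenizer type and vocabulary size. A minimal sketch of training a byte-level BPE tokenizer of that size with the Hugging Face tokenizers library follows; the corpus path, minimum frequency, and special tokens are assumptions, not taken from the original.

from tokenizers import ByteLevelBPETokenizer

# Hypothetical corpus file containing Jin Yong's novels, one passage per line.
corpus_files = ["jinyong_novels.txt"]

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=corpus_files,
    vocab_size=30000,      # matches the vocab_size reported in the card
    min_frequency=2,       # assumed cutoff, not from the original
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)
tokenizer.save_model("jinyong-bpe-tokenizer")  # writes vocab.json and merges.txt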
The following are the training statistics:
[INFO|trainer.py:1608] 2022-12-02 19:52:59,024 >> ***** Running training *****
[INFO|trainer.py:1609] 2022-12-02 19:52:59,024 >> Num examples = 9443
[INFO|trainer.py:1610] 2022-12-02 19:52:59,024 >> Num Epochs = 108
[INFO|trainer.py:1611] 2022-12-02 19:52:59,024 >> Instantaneous batch size per device = 12
[INFO|trainer.py:1612] 2022-12-02 19:52:59,024 >> Total train batch size (w. parallel, distributed & accumulation) = 12
[INFO|trainer.py:1613] 2022-12-02 19:52:59,024 >> Gradient Accumulation steps = 1
[INFO|trainer.py:1614] 2022-12-02 19:52:59,024 >> Total optimization steps = 84996
[INFO|trainer.py:1616] 2022-12-02 19:52:59,025 >> Number of trainable parameters = 124439808
[INFO|trainer.py:1608] 2022-12-03 21:44:00,182 >> ***** Running training *****
[INFO|trainer.py:1609] 2022-12-03 21:44:00,182 >> Num examples = 9443
[INFO|trainer.py:1610] 2022-12-03 21:44:00,182 >> Num Epochs = 216
[INFO|trainer.py:1611] 2022-12-03 21:44:00,182 >> Instantaneous batch size per device = 12
[INFO|trainer.py:1612] 2022-12-03 21:44:00,182 >> Total train batch size (w. parallel, distributed & accumulation) = 12
[INFO|trainer.py:1613] 2022-12-03 21:44:00,182 >> Gradient Accumulation steps = 1
[INFO|trainer.py:1614] 2022-12-03 21:44:00,182 >> Total optimization steps = 169992
[INFO|trainer.py:1616] 2022-12-03 21:44:00,183 >> Number of trainable parameters = 124439808
[INFO|trainer.py:1637] 2022-12-03 21:44:00,184 >> Continuing training from checkpoint, will skip to saved global_step
[INFO|trainer.py:1638] 2022-12-03 21:44:00,184 >> Continuing training from epoch 107
[INFO|trainer.py:1639] 2022-12-03 21:44:00,184 >> Continuing training from global step 84500
[INFO|trainer.py:1608] 2022-12-05 07:36:13,626 >> ***** Running training *****
[INFO|trainer.py:1609] 2022-12-05 07:36:13,626 >> Num examples = 9443
[INFO|trainer.py:1610] 2022-12-05 07:36:13,626 >> Num Epochs = 368
[INFO|trainer.py:1611] 2022-12-05 07:36:13,626 >> Instantaneous batch size per device = 12
[INFO|trainer.py:1612] 2022-12-05 07:36:13,626 >> Total train batch size (w. parallel, distributed & accumulation) = 12
[INFO|trainer.py:1613] 2022-12-05 07:36:13,626 >> Gradient Accumulation steps = 1
[INFO|trainer.py:1614] 2022-12-05 07:36:13,626 >> Total optimization steps = 289616
[INFO|trainer.py:1616] 2022-12-05 07:36:13,627 >> Number of trainable parameters = 124439808
[INFO|trainer.py:1637] 2022-12-05 07:36:13,628 >> Continuing training from checkpoint, will skip to saved global_step
[INFO|trainer.py:1638] 2022-12-05 07:36:13,628 >> Continuing training from epoch 255
[INFO|trainer.py:1639] 2022-12-05 07:36:13,628 >> Continuing training from global step 201000
{'loss': 8.0431, 'learning_rate': 4.970998635229893e-05, 'epoch': 0.64}
{'loss': 7.4867, 'learning_rate': 4.94158548637583e-05, 'epoch': 1.27}
{'loss': 7.322, 'learning_rate': 4.912172337521766e-05, 'epoch': 1.91}
......
{'loss': 3.901, 'learning_rate': 2.5010882865076008e-05, 'epoch': 108.01}
{'loss': 3.8959, 'learning_rate': 2.4863817120805686e-05, 'epoch': 108.64}
......
{'loss': 3.1625, 'learning_rate': 4.6090404254317857e-07, 'epoch': 214.1}
{'loss': 3.1592, 'learning_rate': 3.1413242976140055e-07, 'epoch': 214.74}
{'loss': 3.1625, 'learning_rate': 1.6706668549108195e-07, 'epoch': 215.37}
{'train_runtime': 72271.9602, 'train_samples_per_second': 28.222, 'train_steps_per_second': 2.352, 'train_loss': 1.7180436183842016, 'epoch': 216.0}
{'loss': 2.7087, 'learning_rate': 4.2642671675598036e-08, 'epoch': 367.85}
{'train_runtime': 74859.0808, 'train_samples_per_second': 46.421, 'train_steps_per_second': 3.869, 'train_loss': 0.8725239146935282, 'epoch': 368.0}
***** train metrics *****
epoch = 368.0
train_loss = 0.8725
train_runtime = 20:47:39.08
train_samples = 9443
train_samples_per_second = 46.421
train_steps_per_second = 3.869
12/06/2022 04:23:55 - INFO - __main__ - *** Evaluate ***
[INFO|trainer.py:2929] 2022-12-06 04:23:55,953 >> ***** Running Evaluation *****
[INFO|trainer.py:2931] 2022-12-06 04:23:55,953 >> Num examples = 283
[INFO|trainer.py:2934] 2022-12-06 04:23:55,954 >> Batch size = 12
100%|██████████| 24/24 [00:07<00:00, 3.20it/s]
[INFO|modelcard.py:449] 2022-12-06 04:24:04,760 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}, 'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.19599206157122803}]}
***** eval metrics *****
epoch = 368.0
eval_accuracy = 0.196
eval_loss = 7.9524
eval_runtime = 0:00:07.87
eval_samples = 283
eval_samples_per_second = 35.94
eval_steps_per_second = 3.048
perplexity = 2842.2766
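The reported perplexity is the exponential of the evaluation loss, which can be checked directly:

import math

eval_loss = 7.9524          # from the eval metrics above
print(math.exp(eval_loss))  # ≈ 2842.4; matches the reported perplexity of 2842.2766 up to rounding of the loss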
📄 License
No license information is provided in the original document, so this section is skipped.