🚀 wiki_13
This is a fine-tuned model on an unknown dataset, achieving a loss of 2.9591 on the evaluation set.
🚀 Quick Start
This model is a fine-tuned version of an unspecified base model on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9591
📚 Documentation
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 13
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 40000
- training_steps: 100000
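With gradient accumulation of 2 over a per-device batch of 16, the effective train batch size is 16 × 2 = 32, matching `total_train_batch_size`. The `linear` scheduler warms the learning rate up from 0 to 1e-4 over the first 40,000 steps, then decays it linearly to 0 at step 100,000. A minimal sketch of that schedule (not the Transformers implementation itself, just the same shape):

```python
def linear_schedule_lr(step: int,
                       base_lr: float = 1e-4,
                       warmup_steps: int = 40_000,
                       total_steps: int = 100_000) -> float:
    """Learning rate at a given step for linear warmup + linear decay,
    using the hyperparameters listed above."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Effective train batch size: per-device batch x gradient accumulation steps.
effective_batch = 16 * 2  # = 32
```

For example, halfway through warmup (step 20,000) the learning rate is 5e-5, and it peaks at 1e-4 exactly when warmup ends.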
Training results
| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 0.9847  | 2000   | 8.0854          |
| 8.1286        | 1.9695  | 4000   | 7.4147          |
| 8.1286        | 2.9542  | 6000   | 7.2936          |
| 7.3042        | 3.9389  | 8000   | 7.2263          |
| 7.3042        | 4.9237  | 10000  | 7.1350          |
| 7.1348        | 5.9084  | 12000  | 7.0611          |
| 7.1348        | 6.8932  | 14000  | 7.0000          |
| 6.9718        | 7.8779  | 16000  | 6.9539          |
| 6.9718        | 8.8626  | 18000  | 6.8852          |
| 6.8205        | 9.8474  | 20000  | 6.8512          |
| 6.8205        | 10.8321 | 22000  | 6.8137          |
| 6.6971        | 11.8168 | 24000  | 6.7650          |
| 6.6971        | 12.8016 | 26000  | 6.6483          |
| 6.5488        | 13.7863 | 28000  | 6.5099          |
| 6.5488        | 14.7710 | 30000  | 6.2472          |
| 6.2179        | 15.7558 | 32000  | 5.9238          |
| 6.2179        | 16.7405 | 34000  | 5.3578          |
| 5.4765        | 17.7253 | 36000  | 5.0209          |
| 5.4765        | 18.7100 | 38000  | 4.7463          |
| 4.8038        | 19.6947 | 40000  | 4.5390          |
| 4.8038        | 20.6795 | 42000  | 4.3029          |
| 4.341         | 21.6642 | 44000  | 4.1737          |
| 4.341         | 22.6489 | 46000  | 4.0038          |
| 3.993         | 23.6337 | 48000  | 3.8794          |
| 3.993         | 24.6184 | 50000  | 3.7730          |
| 3.74          | 25.6032 | 52000  | 3.6758          |
| 3.74          | 26.5879 | 54000  | 3.6050          |
| 3.5482        | 27.5726 | 56000  | 3.5573          |
| 3.5482        | 28.5574 | 58000  | 3.4807          |
| 3.4039        | 29.5421 | 60000  | 3.4149          |
| 3.4039        | 30.5268 | 62000  | 3.3689          |
| 3.2796        | 31.5116 | 64000  | 3.3317          |
| 3.2796        | 32.4963 | 66000  | 3.2805          |
| 3.1856        | 33.4810 | 68000  | 3.2562          |
| 3.1856        | 34.4658 | 70000  | 3.2052          |
| 3.1083        | 35.4505 | 72000  | 3.1827          |
| 3.1083        | 36.4353 | 74000  | 3.1513          |
| 3.0408        | 37.4200 | 76000  | 3.1234          |
| 3.0408        | 38.4047 | 78000  | 3.0981          |
| 2.9838        | 39.3895 | 80000  | 3.0862          |
| 2.9838        | 40.3742 | 82000  | 3.0890          |
| 2.939         | 41.3589 | 84000  | 3.0375          |
| 2.939         | 42.3437 | 86000  | 3.0297          |
| 2.8967        | 43.3284 | 88000  | 3.0112          |
| 2.8967        | 44.3131 | 90000  | 2.9907          |
| 2.8682        | 45.2979 | 92000  | 2.9836          |
| 2.8682        | 46.2826 | 94000  | 3.0020          |
| 2.8445        | 47.2674 | 96000  | 2.9588          |
| 2.8445        | 48.2521 | 98000  | 2.9804          |
| 2.8208        | 49.2368 | 100000 | 2.9591          |
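Assuming the loss is the standard per-token natural-log cross-entropy (as in Transformers language-model training), the final validation loss of 2.9591 corresponds to a perplexity of exp(2.9591) ≈ 19.3. A quick check:

```python
import math

# Final validation loss from the last row of the table above.
final_val_loss = 2.9591

# Perplexity is the exponential of the per-token cross-entropy loss.
perplexity = math.exp(final_val_loss)
print(round(perplexity, 2))  # ≈ 19.28
```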
Framework versions
- Transformers 4.45.2
- Pytorch 2.5.1+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1