# wiki_13
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves a loss of 2.1200 on the evaluation set.
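Assuming the reported evaluation loss is a mean token-level cross-entropy (the usual convention for language-model training), it corresponds to a perplexity of roughly 8.3. A minimal sketch of that conversion:

```python
import math

# Reported evaluation loss; assumed to be a mean token-level cross-entropy.
eval_loss = 2.1200

# Under that assumption, perplexity is simply the exponential of the loss.
perplexity = math.exp(eval_loss)
print(f"{perplexity:.2f}")  # ≈ 8.33
```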
## Documentation
### Model description

More information needed.

### Intended uses & limitations

More information needed.

### Training and evaluation data

More information needed.
### Training procedure

#### Training hyperparameters
The following hyperparameters were used during training:
| Property | Details |
| --- | --- |
| learning_rate | 0.0001 |
| train_batch_size | 16 |
| eval_batch_size | 16 |
| seed | 13 |
| gradient_accumulation_steps | 2 |
| total_train_batch_size | 32 |
| optimizer | Adam with betas=(0.9,0.999) and epsilon=1e-08 |
| lr_scheduler_type | linear |
| lr_scheduler_warmup_steps | 40000 |
| training_steps | 100000 |
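The `linear` scheduler with 40000 warmup steps ramps the learning rate from 0 up to 0.0001 over the first 40000 steps, then decays it linearly back to 0 at step 100000. The effective batch size of 32 is likewise the product of `train_batch_size` (16) and `gradient_accumulation_steps` (2). A small illustrative reimplementation of that schedule (Transformers provides this behavior via its linear scheduler; this sketch is for clarity only):

```python
def linear_lr(step, base_lr=1e-4, warmup_steps=40_000, total_steps=100_000):
    """Linear warmup to base_lr, then linear decay down to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

linear_lr(20_000)   # 5e-05 — halfway through warmup
linear_lr(40_000)   # 0.0001 — peak learning rate
linear_lr(100_000)  # 0.0 — end of training
```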
#### Training results
| Training Loss | Epoch | Step | Validation Loss |
| --- | --- | --- | --- |
| No log | 1.5662 | 2000 | 7.6811 |
| 7.6872 | 3.1323 | 4000 | 6.6323 |
| 7.6872 | 4.6985 | 6000 | 6.5117 |
| 6.5145 | 6.2647 | 8000 | 6.4461 |
| 6.5145 | 7.8309 | 10000 | 6.3755 |
| 6.3572 | 9.3970 | 12000 | 6.3094 |
| 6.3572 | 10.9632 | 14000 | 6.2737 |
| 6.2339 | 12.5294 | 16000 | 6.2091 |
| 6.2339 | 14.0955 | 18000 | 6.1964 |
| 6.1124 | 15.6617 | 20000 | 6.1261 |
| 6.1124 | 17.2279 | 22000 | 5.9661 |
| 5.9136 | 18.7940 | 24000 | 5.5933 |
| 5.9136 | 20.3602 | 26000 | 5.0449 |
| 5.109 | 21.9264 | 28000 | 4.4819 |
| 5.109 | 23.4926 | 30000 | 4.0711 |
| 4.1502 | 25.0587 | 32000 | 3.7598 |
| 4.1502 | 26.6249 | 34000 | 3.4685 |
| 3.5347 | 28.1911 | 36000 | 3.3009 |
| 3.5347 | 29.7572 | 38000 | 3.1496 |
| 3.1576 | 31.3234 | 40000 | 3.0139 |
| 3.1576 | 32.8896 | 42000 | 2.9557 |
| 2.8847 | 34.4558 | 44000 | 2.8395 |
| 2.8847 | 36.0219 | 46000 | 2.7659 |
| 2.6809 | 37.5881 | 48000 | 2.6953 |
| 2.6809 | 39.1543 | 50000 | 2.6246 |
| 2.5261 | 40.7204 | 52000 | 2.5583 |
| 2.5261 | 42.2866 | 54000 | 2.5142 |
| 2.4073 | 43.8528 | 56000 | 2.4925 |
| 2.4073 | 45.4190 | 58000 | 2.4343 |
| 2.3129 | 46.9851 | 60000 | 2.4278 |
| 2.3129 | 48.5513 | 62000 | 2.3707 |
| 2.23 | 50.1175 | 64000 | 2.3806 |
| 2.23 | 51.6836 | 66000 | 2.3299 |
| 2.1662 | 53.2498 | 68000 | 2.3031 |
| 2.1662 | 54.8160 | 70000 | 2.2718 |
| 2.1093 | 56.3821 | 72000 | 2.2745 |
| 2.1093 | 57.9483 | 74000 | 2.2610 |
| 2.0596 | 59.5145 | 76000 | 2.2490 |
| 2.0596 | 61.0807 | 78000 | 2.1928 |
| 2.0165 | 62.6468 | 80000 | 2.1660 |
| 2.0165 | 64.2130 | 82000 | 2.1797 |
| 1.9818 | 65.7792 | 84000 | 2.1873 |
| 1.9818 | 67.3453 | 86000 | 2.1384 |
| 1.9505 | 68.9115 | 88000 | 2.1419 |
| 1.9505 | 70.4777 | 90000 | 2.1471 |
| 1.9231 | 72.0439 | 92000 | 2.1419 |
| 1.9231 | 73.6100 | 94000 | 2.1390 |
| 1.9072 | 75.1762 | 96000 | 2.1414 |
| 1.9072 | 76.7424 | 98000 | 2.1240 |
| 1.8894 | 78.3085 | 100000 | 2.1200 |
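The Epoch column also lets us back out an approximate dataset size, which the card itself does not state: 2000 optimizer steps cover about 1.5662 epochs, so one epoch is roughly 1277 steps, or about 40.9k training samples at the effective batch size of 32. A rough derivation (these figures are estimates, not from the card):

```python
steps_per_eval = 2000
epochs_per_eval = 1.5662       # epoch value at step 2000, from the table above
total_train_batch_size = 32    # from the hyperparameters table

steps_per_epoch = steps_per_eval / epochs_per_eval
approx_dataset_size = steps_per_epoch * total_train_batch_size
print(round(steps_per_epoch))      # ≈ 1277 steps per epoch
print(round(approx_dataset_size))  # ≈ 40863 training samples
```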
#### Framework versions
| Property | Details |
| --- | --- |
| Transformers | 4.45.2 |
| Pytorch | 2.5.1+cu124 |
| Datasets | 3.0.1 |
| Tokenizers | 0.20.1 |