# Transformers Model
This project is a fine-tuned model based on the transformers library. It builds on a pre-trained model to achieve strong performance on a specific task.
## Quick Start
This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [davanstrien/arxiv-new-datasets](https://huggingface.co/datasets/davanstrien/arxiv-new-datasets) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3220
- Accuracy: 0.945
- F1: 0.9439
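As a quick-start sketch, the model can be loaded for inference with the transformers `pipeline` API. The repository id below is a placeholder, since this card does not state where the fine-tuned weights are published:

```python
from transformers import pipeline

# "your-username/your-model-id" is a placeholder -- this card does not give
# the model's Hub repository id, so substitute the actual one.
classifier = pipeline("text-classification", model="your-username/your-model-id")

# Classify a sample abstract-style text.
print(classifier("A new pretraining objective for long-context encoder models."))
```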
## Documentation
### Model description

More information needed

### Intended uses & limitations

More information needed

### Training and evaluation data

More information needed

### Training procedure

#### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 64
- seed: 42
- optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 30
- label_smoothing_factor: 0.1
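These settings map directly onto `transformers.TrainingArguments`; the sketch below shows that mapping. `output_dir` is a placeholder and is not taken from this card:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters above expressed as TrainingArguments.
# output_dir is a hypothetical name, not stated in this model card.
training_args = TrainingArguments(
    output_dir="modernbert-arxiv-classifier",
    learning_rate=3e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=30,
    label_smoothing_factor=0.1,
)
```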
#### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
| 0.5181 | 1.0 | 300 | 0.4495 | 0.8333 | 0.8051 |
| 0.3804 | 2.0 | 600 | 0.3134 | 0.93 | 0.9268 |
| 0.3083 | 3.0 | 900 | 0.3407 | 0.9233 | 0.9192 |
| 0.2449 | 4.0 | 1200 | 0.3304 | 0.9367 | 0.9370 |
| 0.219 | 5.0 | 1500 | 0.3293 | 0.94 | 0.9377 |
| 0.2095 | 6.0 | 1800 | 0.3735 | 0.9283 | 0.9294 |
| 0.205 | 7.0 | 2100 | 0.3220 | 0.945 | 0.9439 |
| 0.2029 | 8.0 | 2400 | 0.3404 | 0.9367 | 0.9338 |
| 0.2 | 9.0 | 2700 | 0.3431 | 0.9333 | 0.9330 |
| 0.1989 | 10.0 | 3000 | 0.3286 | 0.9383 | 0.9377 |
| 0.1996 | 11.0 | 3300 | 0.3339 | 0.9383 | 0.9365 |
| 0.1986 | 12.0 | 3600 | 0.3295 | 0.9433 | 0.9419 |
#### Framework versions
- Transformers 4.48.2
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
## License
This project is licensed under the Apache 2.0 license.
| Property | Details |
|:---------|:--------|
| Library Name | transformers |
| Model Type | Fine-tuned version of answerdotai/ModernBERT-base |
| Training Data | davanstrien/arxiv-new-datasets |
| Metrics | accuracy, f1 |
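For reference, the accuracy and F1 metrics listed above can be computed with a `compute_metrics` callback passed to the `Trainer`. The sketch below uses the evaluate library; the F1 averaging mode ("weighted") is an assumption, as this card does not specify it:

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair supplied by the Trainer.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(predictions=predictions, references=labels)["accuracy"],
        # "weighted" averaging is an assumption; the card does not state the mode.
        "f1": f1.compute(predictions=predictions, references=labels, average="weighted")["f1"],
    }
```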