# distilbert-qasports

This model is a fine-tuned version of distilbert-base-uncased-distilled-squad for extractive question answering; its evaluation-set results are listed below.
## Quick Start

This model is a fine-tuned version of distilbert-base-uncased-distilled-squad on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4019
- Exact: 76.8699
- F1: 81.3261
- Total: 15041
- HasAns Exact: 76.8699
- HasAns F1: 81.3261
- HasAns Total: 15041
- Best Exact: 76.8699
- Best Exact Thresh: 0.0
- Best F1: 81.3261
- Best F1 Thresh: 0.0
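The card does not yet include usage code. A minimal sketch of running the checkpoint with the `transformers` question-answering pipeline follows; the repo id below points at the base checkpoint as a stand-in, since this model's actual Hub id is not stated in the card — substitute it before use.

```python
from transformers import pipeline

# Stand-in repo id (the base checkpoint); replace with this model's actual Hub id.
qa = pipeline(
    "question-answering",
    model="distilbert/distilbert-base-uncased-distilled-squad",
)

# The pipeline returns a dict with the extracted span, its score, and offsets.
result = qa(
    question="Who won the game?",
    context="The Lakers won the game against the Celtics by 12 points.",
)
print(result["answer"], result["score"])
```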
## Documentation

### Model description

More information needed

### Intended uses & limitations

More information needed

### Training and evaluation data

More information needed

### Training procedure

#### Training hyperparameters
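The Exact and F1 figures reported in this card are standard SQuAD-style metrics: exact match after text normalization, and token-level F1 overlap. A minimal, dependency-free sketch of how such metrics are computed (function names are illustrative, not taken from the evaluation code):

```python
import re
import string
from collections import Counter

def normalize(text):
    """SQuAD-style normalization: lowercase, strip punctuation and articles."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def f1_score(prediction, reference):
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

The reported scores are these per-example values averaged over the evaluation set (scaled to percentages); the HasAns variants restrict the average to answerable questions.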
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
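As a worked illustration of two settings above: the effective batch size is train_batch_size × gradient_accumulation_steps = 16 × 2 = 32, and the linear scheduler decays the learning rate from 1e-05 toward zero over training. A small sketch of that schedule's shape (the `warmup_steps` parameter is an assumption for generality; the card does not state a warmup):

```python
def linear_lr(step, num_training_steps, base_lr=1e-05, warmup_steps=0):
    """Linearly warm up to base_lr, then decay to 0 (shape of the 'linear' schedule)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, num_training_steps - step)
    return base_lr * remaining / max(1, num_training_steps - warmup_steps)

# Effective batch size seen by the optimizer per update:
effective_batch_size = 16 * 2  # train_batch_size * gradient_accumulation_steps
```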
#### Training results

| Training Loss | Epoch | Step | Validation Loss | Exact | F1 | Total | HasAns Exact | HasAns F1 | HasAns Total | Best Exact | Best Exact Thresh | Best F1 | Best F1 Thresh |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.6782 | 0.1325 | 500 | 0.6027 | 74.4099 | 79.3976 | 15041 | 74.4099 | 79.3976 | 15041 | 74.4099 | 0.0 | 79.3976 | 0.0 |
| 0.569 | 0.2649 | 1000 | 0.5509 | 75.1080 | 80.1014 | 15041 | 75.1080 | 80.1014 | 15041 | 75.1080 | 0.0 | 80.1014 | 0.0 |
| 0.5821 | 0.3974 | 1500 | 0.5195 | 75.5535 | 80.3558 | 15041 | 75.5535 | 80.3558 | 15041 | 75.5535 | 0.0 | 80.3558 | 0.0 |
| 0.5814 | 0.5298 | 2000 | 0.4890 | 76.3978 | 81.0751 | 15041 | 76.3978 | 81.0751 | 15041 | 76.3978 | 0.0 | 81.0751 | 0.0 |
| 0.5165 | 0.6623 | 2500 | 0.4729 | 76.2117 | 80.9615 | 15041 | 76.2117 | 80.9615 | 15041 | 76.2117 | 0.0 | 80.9615 | 0.0 |
| 0.4822 | 0.7947 | 3000 | 0.4559 | 76.4976 | 81.2088 | 15041 | 76.4976 | 81.2088 | 15041 | 76.4976 | 0.0 | 81.2088 | 0.0 |
| 0.5015 | 0.9272 | 3500 | 0.4343 | 76.5308 | 81.0962 | 15041 | 76.5308 | 81.0962 | 15041 | 76.5308 | 0.0 | 81.0962 | 0.0 |
| 0.36 | 1.0596 | 4000 | 0.4349 | 76.5308 | 81.0828 | 15041 | 76.5308 | 81.0828 | 15041 | 76.5308 | 0.0 | 81.0828 | 0.0 |
| 0.4052 | 1.1921 | 4500 | 0.4257 | 76.6704 | 81.1909 | 15041 | 76.6704 | 81.1909 | 15041 | 76.6704 | 0.0 | 81.1909 | 0.0 |
| 0.36 | 1.3245 | 5000 | 0.4372 | 77.1624 | 81.7279 | 15041 | 77.1624 | 81.7279 | 15041 | 77.1624 | 0.0 | 81.7279 | 0.0 |
| 0.3597 | 1.4570 | 5500 | 0.4281 | 77.1225 | 81.7018 | 15041 | 77.1225 | 81.7018 | 15041 | 77.1225 | 0.0 | 81.7018 | 0.0 |
| 0.3739 | 1.5894 | 6000 | 0.4064 | 76.8566 | 81.3582 | 15041 | 76.8566 | 81.3582 | 15041 | 76.8566 | 0.0 | 81.3582 | 0.0 |
| 0.4176 | 1.7219 | 6500 | 0.4011 | 76.6438 | 81.0437 | 15041 | 76.6438 | 81.0437 | 15041 | 76.6438 | 0.0 | 81.0437 | 0.0 |
| 0.3924 | 1.8543 | 7000 | 0.3985 | 77.0560 | 81.4585 | 15041 | 77.0560 | 81.4585 | 15041 | 77.0560 | 0.0 | 81.4585 | 0.0 |
| 0.3453 | 1.9868 | 7500 | 0.4019 | 76.8699 | 81.3261 | 15041 | 76.8699 | 81.3261 | 15041 | 76.8699 | 0.0 | 81.3261 | 0.0 |
#### Framework versions
- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
## License

apache-2.0