# model-QA-5-epoch-RU
This model is a fine-tuned version of AndrewChar/diplom-prod-epoch-4-datast-sber-QA, trained to answer questions from a given context, and achieves good performance on the SberSQuAD dataset.
## Quick Start
This model is a fine-tuned version of [AndrewChar/diplom-prod-epoch-4-datast-sber-QA](https://huggingface.co/AndrewChar/diplom-prod-epoch-4-datast-sber-QA) on the sberquad dataset. It achieves the following results on the evaluation set:
- Train Loss: 1.1991
- Validation Loss: 0.0
- Epoch: 5
## Features

### Model description
This model is designed to answer questions based on the context. It is a graduation project.
### Intended uses & limitations
The context should contain no more than 512 tokens.
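Because of the 512-token ceiling, longer contexts have to be truncated or split before they reach the model. As a rough, tokenizer-free sketch (whitespace words stand in for real subword tokens here; the model's own tokenizer would count differently), a long context can be chunked with overlap like this:

```python
def chunk_context(context: str, max_tokens: int = 512, stride: int = 64) -> list[str]:
    """Split a long context into overlapping chunks of at most max_tokens words.

    Whitespace words only approximate the model's subword tokens; the real
    limit should be verified with the model's own tokenizer.
    """
    words = context.split()
    if len(words) <= max_tokens:
        return [" ".join(words)]
    chunks = []
    step = max_tokens - stride  # consecutive chunks overlap by `stride` words
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the last chunk already reaches the end of the context
    return chunks
```

Each chunk can then be passed to the model separately, keeping the answer with the highest score.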
## Installation

No installation steps are provided; the framework versions listed below describe the environment the model was trained with.
## Usage Examples

The original card provides no code examples.
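Although the card itself ships no examples, the model could plausibly be loaded through the `transformers` question-answering pipeline. Note the assumptions: the model id `AndrewChar/model-QA-5-epoch-RU` is inferred from the card's title, and `framework="tf"` from the TensorFlow version listed below; neither is stated explicitly in the card.

```python
def build_qa_pipeline(model_id: str = "AndrewChar/model-QA-5-epoch-RU"):
    """Construct a question-answering pipeline; downloads weights on first use.

    Both the model id and the TF framework flag are assumptions inferred
    from the card, not values the card states.
    """
    from transformers import pipeline  # imported lazily: only needed at call time
    return pipeline("question-answering", model=model_id, framework="tf")

# Example call (commented out to avoid a network download here):
# qa = build_qa_pipeline()
# result = qa(question="Где живут кенгуру?", context="Кенгуру обитают в Австралии.")
# result["answer"] and result["score"] hold the extracted span and its confidence.
```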
## Documentation

### Training and evaluation data
Dataset: SberSQuAD

Results on SberSQuAD: `{'exact_match': 54.586, 'f1': 73.644}`
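For reference, SQuAD-style `exact_match` and `f1` score each prediction against the gold answer by token overlap, then average over questions. A minimal stdlib sketch of the per-question F1 (omitting the answer-normalization details of the official evaluation script) looks like:

```python
from collections import Counter

def qa_token_f1(prediction: str, gold: str) -> float:
    """SQuAD-style token-overlap F1 between a predicted and a gold answer span."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

`exact_match` is simply the fraction of predictions that equal the gold answer after normalization.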
### Training procedure

#### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'Adam', 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-06, 'decay_steps': 2986, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
- training_precision: float32
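With power 1.0 and no cycling, the `PolynomialDecay` config above is a linear ramp from 2e-06 down to 0 over 2986 steps. The schedule Keras implements can be reproduced in a few lines of plain Python:

```python
def polynomial_decay_lr(step: int,
                        initial_lr: float = 2e-06,
                        decay_steps: int = 2986,
                        end_lr: float = 0.0,
                        power: float = 1.0) -> float:
    """Learning rate at `step` under a Keras-style PolynomialDecay (cycle=False)."""
    step = min(step, decay_steps)  # clamp: lr stays at end_lr after decay_steps
    fraction = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction ** power + end_lr
```

For example, halfway through training (step 1493) the learning rate is 1e-06, and it reaches 0 at step 2986 and stays there.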
### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 1.1991     |                 | 5     |
### Framework versions
- Transformers 4.15.0
- TensorFlow 2.7.0
- Datasets 1.17.0
- Tokenizers 0.10.3
## Technical Details
The model is fine-tuned from [AndrewChar/diplom-prod-epoch-4-datast-sber-QA](https://huggingface.co/AndrewChar/diplom-prod-epoch-4-datast-sber-QA) on the SberSQuAD dataset. Training uses the Adam optimizer with a polynomial-decay learning-rate schedule and float32 precision. The evaluation results above report train loss, validation loss, and epoch.
## License

No license information is provided in the original card.