# bert-base-cased-finetuned-cola
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the GLUE CoLA dataset. It was trained to compare [google/fnet-base](https://huggingface.co/google/fnet-base) against [bert-base-cased](https://huggingface.co/bert-base-cased) on text classification tasks. On the evaluation set it reaches a Matthews correlation of 0.5957.
## 🚀 Quick Start
This model was trained with the [run_glue](https://github.com/huggingface/transformers/blob/master/examples/pytorch/text-classification/run_glue.py) script. The following command was used for training:
```bash
#!/usr/bin/bash
python ../run_glue.py \
  --model_name_or_path bert-base-cased \
  --task_name cola \
  --do_train \
  --do_eval \
  --max_seq_length 512 \
  --per_device_train_batch_size 16 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir bert-base-cased-finetuned-cola \
  --push_to_hub \
  --hub_strategy all_checkpoints \
  --logging_strategy epoch \
  --save_strategy epoch \
  --evaluation_strategy epoch
```
## ✨ Features
- A fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the GLUE CoLA dataset.
- Achieves a loss of 0.6747 and a Matthews correlation of 0.5957 on the evaluation set.
- Fine-tuned to compare [google/fnet-base](https://huggingface.co/google/fnet-base) against [bert-base-cased](https://huggingface.co/bert-base-cased).
## 📦 Installation
No installation steps beyond the training script are given in the original card; see the Framework Versions section below for the library versions used.
## 💻 Usage Examples
The original card provides no code examples; the snippet below is a minimal inference sketch.
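A minimal sketch, assuming the checkpoint is loaded from the training `--output_dir` shown above (or from the corresponding Hub repository id), using the 🤗 Transformers `pipeline` API:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint for CoLA acceptability classification.
# "bert-base-cased-finetuned-cola" matches the training --output_dir;
# substitute the Hub repository id if loading remotely.
classifier = pipeline("text-classification", model="bert-base-cased-finetuned-cola")

# CoLA is a binary task: each sentence is judged grammatically acceptable or not.
print(classifier("The book was written by the author."))
# e.g. [{'label': 'LABEL_1', 'score': ...}] (label names depend on the checkpoint config)
```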
## 📚 Documentation
### Model Information
| Property | Details |
|----------|---------|
| Model Name | bert-base-cased-finetuned-cola |
| Base Model | [bert-base-cased](https://huggingface.co/bert-base-cased) |
| Fine-tuned Dataset | GLUE CoLA |
| Comparison Model | [google/fnet-base](https://huggingface.co/google/fnet-base) |
| Paper for Comparison | this paper |
### Evaluation Results
This model achieves the following results on the evaluation set:
- Loss: 0.6747
- Matthews Correlation: 0.5957
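For reference, Matthews correlation is the standard GLUE metric for CoLA. A toy illustration with scikit-learn, not from the original card (run_glue.py computes the same metric internally via the GLUE "cola" metric):

```python
from sklearn.metrics import matthews_corrcoef

# Toy data only, to show what the reported 0.5957 measures.
labels      = [1, 0, 1, 1, 0, 1]   # 1 = acceptable, 0 = unacceptable
predictions = [1, 0, 0, 1, 0, 1]
print(matthews_corrcoef(labels, predictions))  # ~0.7071 on this toy data
```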
### Training Procedure
#### Training Command

The command is identical to the one shown in the Quick Start section above.
#### Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
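For readers configuring the 🤗 `Trainer` directly rather than via `run_glue.py`, the settings above translate roughly into the following `TrainingArguments`; this is a sketch assuming the script's defaults, not part of the original card:

```python
from transformers import TrainingArguments

# Mirrors the run_glue.py flags above (eval batch size 8 and seed 42 are
# script defaults; --max_seq_length is a data argument of run_glue.py,
# not a TrainingArguments field).
args = TrainingArguments(
    output_dir="bert-base-cased-finetuned-cola",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    num_train_epochs=3.0,
    seed=42,
    lr_scheduler_type="linear",
    logging_strategy="epoch",
    save_strategy="epoch",
    evaluation_strategy="epoch",
    push_to_hub=True,
    hub_strategy="all_checkpoints",
)
```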
#### Training Results
| Training Loss | Epoch | Step | Validation Loss | Matthews Correlation |
|:-------------:|:-----:|:----:|:---------------:|:--------------------:|
| 0.4921        | 1.0   | 535  | 0.5283          | 0.5068               |
| 0.2837        | 2.0   | 1070 | 0.5133          | 0.5521               |
| 0.1775        | 3.0   | 1605 | 0.6747          | 0.5957               |
#### Framework Versions
- Transformers 4.11.0.dev0
- Pytorch 1.9.0
- Datasets 1.12.1
- Tokenizers 0.10.3
## 🔧 Technical Details
The model is fine-tuned on the GLUE CoLA dataset to compare the performance of [google/fnet-base](https://huggingface.co/google/fnet-base) and [bert-base-cased](https://huggingface.co/bert-base-cased). Training uses the [run_glue](https://github.com/huggingface/transformers/blob/master/examples/pytorch/text-classification/run_glue.py) script with the hyperparameters listed above; the final checkpoint reaches a Matthews correlation of 0.5957 on the evaluation set.
## 📄 License
This model is licensed under the Apache-2.0 license.