reranker_bincont_filt_train
This project offers a fine-tuned model based on Qwen/Qwen2.5-0.5B-Instruct, designed for text ranking tasks. It provides various quantized versions of the model with different sizes and quantization methods.
🚀 Quick Start
This model is a fine - tuned version of Qwen/Qwen2.5-0.5B-Instruct on the reranker_bincont_filt_train dataset. It achieves a loss of 0.1613 on the evaluation set.
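As a minimal usage sketch (not taken from the original card), the checkpoint linked below under Original Model Details can be loaded as a causal LM with `transformers`. The exact ranking prompt format is not documented here, so the query/document template below is purely illustrative.

```python
# Minimal loading sketch (assumption: the original lightblue checkpoint is used directly
# with transformers; the ranking prompt below is illustrative, not the documented format).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lightblue/reranker_0.5_bincont_filt"  # original (non-quantized) checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Hypothetical query/document pair formatted through the chat template.
messages = [{
    "role": "user",
    "content": "Query: small multilingual reranker\n"
               "Document: Qwen2.5-0.5B-Instruct fine-tuned for text ranking.\n"
               "Is the document relevant to the query?",
}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```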
✨ Features
- Multiple quantized versions of the model are available, offering different trade-offs between size and performance (a file-size comparison sketch follows this list).
- Fine-tuned on the reranker_bincont_filt_train dataset for text ranking tasks.
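As a small, hedged illustration of comparing variant sizes, the snippet below lists a repository's files and sizes via `huggingface_hub`. It is shown against the original lightblue checkpoint linked in this card; the same call works for whichever quantized mirror you choose (those repo ids are not listed here).

```python
# Hedged sketch: list repository files with sizes so size/performance trade-offs can be compared.
from huggingface_hub import model_info

info = model_info("lightblue/reranker_0.5_bincont_filt", files_metadata=True)
for f in sorted(info.siblings, key=lambda s: s.size or 0):
    print(f"{f.rfilename}: {(f.size or 0) / 1e6:.1f} MB")
```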
📦 Installation
No installation steps are provided in the original document.
📚 Documentation
Model Information
- Pipeline Tag: text-ranking
- Quantization: Made by Richard Erkhov
Original Model Details
- Model Creator: https://huggingface.co/lightblue/
- Original Model: https://huggingface.co/lightblue/reranker_0.5_bincont_filt/
Model Description
This model is a fine-tuned version of Qwen/Qwen2.5-0.5B-Instruct on the reranker_bincont_filt_train dataset.
Training and Evaluation
Training Hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 8
- total_eval_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 1.0
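A hedged sketch of how these values map onto `transformers.TrainingArguments` is shown below; the original training script is not part of this card, and `output_dir` is a placeholder.

```python
# Sketch only: reproducing the listed hyperparameters with TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="reranker_bincont_filt_train",  # placeholder path
    learning_rate=1e-05,
    per_device_train_batch_size=1,   # x 8 GPUs -> total train batch size 8
    per_device_eval_batch_size=1,    # x 8 GPUs -> total eval batch size 8
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=1.0,
)
```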
Training Results
| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 0.209         | 0.1000 | 3952  | 0.2207          |
| 0.1541        | 0.2000 | 7904  | 0.2108          |
| 0.1519        | 0.3000 | 11856 | 0.2030          |
| 0.3499        | 0.4000 | 15808 | 0.1939          |
| 0.1045        | 0.5000 | 19760 | 0.1834          |
| 0.1887        | 0.6000 | 23712 | 0.1770          |
| 0.2182        | 0.7001 | 27664 | 0.1695          |
| 0.1281        | 0.8001 | 31616 | 0.1645          |
| 0.1463        | 0.9001 | 35568 | 0.1617          |
Framework Versions
- Transformers 4.46.1
- Pytorch 2.4.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
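To check a local environment against these versions, a quick hedged snippet:

```python
# Print installed versions to compare with the ones listed above.
import transformers, torch, datasets, tokenizers

for name, module in [("Transformers", transformers), ("PyTorch", torch),
                     ("Datasets", datasets), ("Tokenizers", tokenizers)]:
    print(f"{name}: {module.__version__}")
```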
🔧 Technical Details
The model is fine-tuned from Qwen/Qwen2.5-0.5B-Instruct. Training ran in a multi-GPU environment (8 devices) with the hyperparameters listed above; the learning-rate scheduler is cosine with a warm-up ratio of 0.01.
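To make the schedule concrete, here is a small sketch (assumptions: a dummy AdamW optimizer, and a total step count of roughly 39,520 derived from the ~3,952 steps per 0.1 epoch in the results table) of the warm-up and cosine decay produced by `get_cosine_schedule_with_warmup`.

```python
# Hedged illustration of the cosine schedule with a 0.01 warm-up ratio.
import torch
from transformers import get_cosine_schedule_with_warmup

total_steps = 39520                      # ~3952 steps per 0.1 epoch (from the table above)
warmup_steps = int(0.01 * total_steps)   # ~395 warm-up steps

optimizer = torch.optim.AdamW([torch.nn.Parameter(torch.zeros(1))], lr=1e-05,
                              betas=(0.9, 0.999), eps=1e-08)
scheduler = get_cosine_schedule_with_warmup(optimizer, warmup_steps, total_steps)

lrs = []
for _ in range(total_steps):
    optimizer.step()
    scheduler.step()
    lrs.append(scheduler.get_last_lr()[0])

# Peak LR after warm-up, mid-training value, and the near-zero LR at the end of the cosine decay.
print(lrs[warmup_steps - 1], lrs[total_steps // 2], lrs[-1])
```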
📄 License
The model is released under the "other" license.