Sembr2023 Bert Tiny
A text classification model fine-tuned from bert-tiny, achieving good precision and accuracy on the evaluation set
Downloads: 51
Release Time: 11/27/2023
Model Overview
This model is a fine-tuned version of prajjwal1/bert-tiny and is suited to text classification tasks. The specific application scenarios are not documented.
Model Features
Efficient and Lightweight
Based on the bert-tiny architecture, the model is small in size and suitable for resource-constrained environments
Good Classification Performance
Achieves a precision of 0.7983 and an accuracy of 0.9531 on the evaluation set
Optimized Training
Trained using a cosine-decay learning rate scheduler and the Adam optimizer
Model Capabilities
Text Classification
Short Text Processing
Use Cases
Text Analysis
Sentiment Analysis
Can be used for short text sentiment classification
Evaluation set F1 score: 0.7202
Content Classification
Classify text content
Evaluation set accuracy: 0.9531
🚀 sembr2023-bert-tiny
This model is a fine-tuned version of [prajjwal1/bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) on an unknown dataset. It achieves strong results on the evaluation set and can be used for various natural language processing tasks.
🚀 Quick Start
The model is ready to use after fine-tuning: load it and start making predictions according to your specific needs.
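The original card does not include quick-start code. The snippet below is a minimal, illustrative sketch using the Hugging Face Transformers pipeline API; the model ID `sembr2023-bert-tiny` is a placeholder, since the source does not state the actual Hub repository path.

```python
# Illustrative sketch only: quick start with the Transformers pipeline API.
# "sembr2023-bert-tiny" is a placeholder; substitute the model's actual Hub repository ID.
from transformers import pipeline

classifier = pipeline("text-classification", model="sembr2023-bert-tiny")
print(classifier("A short piece of text to classify."))
```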
✨ Features
- Fine-tuned: Based on the prajjwal1/bert-tiny model, it has been fine-tuned on an unknown dataset.
- High Performance: Achieves strong results on multiple evaluation metrics, including precision, recall, F1, and accuracy.
📦 Installation
No specific installation steps are provided in the original document.
💻 Usage Examples
No code examples are provided in the original document.
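As an illustration only, here is a minimal sketch of how a classifier fine-tuned from bert-tiny could be loaded and queried with Hugging Face Transformers; the model ID is a placeholder, and the meaning of the labels depends on the undocumented fine-tuning dataset.

```python
# Illustrative sketch only: load the fine-tuned classifier and run a single prediction.
# "sembr2023-bert-tiny" is a placeholder model ID; replace it with the actual Hub path.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "sembr2023-bert-tiny"  # placeholder, not a confirmed repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "This is a short example sentence."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# The label names depend on the (undocumented) fine-tuning dataset.
predicted_id = int(logits.argmax(dim=-1))
print(model.config.id2label.get(predicted_id, str(predicted_id)))
```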
📚 Documentation
Model Evaluation Results
This model achieves the following results on the evaluation set:
- Loss: 0.2101
- Precision: 0.7983
- Recall: 0.6561
- F1: 0.7202
- IoU: 0.5628
- Accuracy: 0.9531
- Balanced Accuracy: 0.8196
- Overall Accuracy: 0.9387
Training Hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- training_steps: 1000
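As a hedged illustration, these hyperparameters map onto the Hugging Face Transformers `TrainingArguments` class roughly as follows; the training dataset, model head, and data collator are omitted because the source does not document them, and `output_dir` is a placeholder.

```python
# Illustrative sketch only: the listed hyperparameters expressed as TrainingArguments.
# The training dataset and task setup are not documented in the source and are omitted here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="sembr2023-bert-tiny",  # placeholder output directory
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="cosine",
    max_steps=1000,                    # corresponds to training_steps: 1000
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```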
Training Results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | IoU | Accuracy | Balanced Accuracy | Overall Accuracy |
---|---|---|---|---|---|---|---|---|---|---|
1.2554 | 0.06 | 10 | 1.1550 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.8047 | 0.12 | 20 | 0.7616 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.6392 | 0.18 | 30 | 0.6116 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.5328 | 0.24 | 40 | 0.5384 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.4859 | 0.3 | 50 | 0.4982 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.469 | 0.36 | 60 | 0.4726 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.4711 | 0.42 | 70 | 0.4513 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.4341 | 0.48 | 80 | 0.4349 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.4234 | 0.55 | 90 | 0.4181 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.3661 | 0.61 | 100 | 0.3970 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.3901 | 0.67 | 110 | 0.3685 | 0 | 0.0 | 0.0 | 0.0 | 0.9080 | 0.5 | 0.9080 |
0.3493 | 0.73 | 120 | 0.3447 | 0.6074 | 0.0126 | 0.0247 | 0.0125 | 0.9084 | 0.5059 | 0.9081 |
0.3199 | 0.79 | 130 | 0.3309 | 0.6329 | 0.0676 | 0.1222 | 0.0651 | 0.9106 | 0.5318 | 0.9095 |
0.3444 | 0.85 | 140 | 0.3219 | 0.6748 | 0.1406 | 0.2328 | 0.1317 | 0.9147 | 0.5669 | 0.9130 |
0.3131 | 0.91 | 150 | 0.3158 | 0.6768 | 0.2211 | 0.3334 | 0.2000 | 0.9187 | 0.6052 | 0.9154 |
0.2921 | 0.97 | 160 | 0.3100 | 0.7245 | 0.1708 | 0.2765 | 0.1604 | 0.9178 | 0.5821 | 0.9156 |
0.3121 | 1.03 | 170 | 0.3057 | 0.6425 | 0.3246 | 0.4313 | 0.2749 | 0.9213 | 0.6531 | 0.9157 |
0.3267 | 1.09 | 180 | 0.3035 | 0.6597 | 0.3155 | 0.4269 | 0.2714 | 0.9221 | 0.6495 | 0.9168 |
0.28 | 1.15 | 190 | 0.2986 | 0.6836 | 0.3429 | 0.4567 | 0.2960 | 0.9250 | 0.6634 | 0.9171 |
0.2945 | 1.21 | 200 | 0.2929 | 0.7005 | 0.3078 | 0.4276 | 0.2720 | 0.9242 | 0.6472 | 0.9177 |
0.2744 | 1.27 | 210 | 0.2874 | 0.7108 | 0.3406 | 0.4606 | 0.2992 | 0.9266 | 0.6633 | 0.9183 |
0.2563 | 1.33 | 220 | 0.2866 | 0.6712 | 0.4432 | 0.5339 | 0.3641 | 0.9288 | 0.7106 | 0.9182 |
0.2565 | 1.39 | 230 | 0.2793 | 0.7057 | 0.4187 | 0.5256 | 0.3565 | 0.9305 | 0.7005 | 0.9203 |
0.2383 | 1.45 | 240 | 0.2760 | 0.6918 | 0.4493 | 0.5448 | 0.3744 | 0.9309 | 0.7145 | 0.9197 |
0.2477 | 1.52 | 250 | 0.2698 | 0.7317 | 0.4190 | 0.5328 | 0.3632 | 0.9324 | 0.7017 | 0.9218 |
0.2466 | 1.58 | 260 | 0.2674 | 0.7119 | 0.4605 | 0.5593 | 0.3882 | 0.9332 | 0.7208 | 0.9212 |
0.2623 | 1.64 | 270 | 0.2641 | 0.7071 | 0.4675 | 0.5629 | 0.3917 | 0.9332 | 0.7240 | 0.9220 |
0.2308 | 1.7 | 280 | 0.2622 | 0.7169 | 0.4797 | 0.5748 | 0.4033 | 0.9347 | 0.7303 | 0.9225 |
0.2179 | 1.76 | 290 | 0.2577 | 0.7287 | 0.4678 | 0.5698 | 0.3984 | 0.9350 | 0.7251 | 0.9236 |
0.2347 | 1.82 | 300 | 0.2557 | 0.7425 | 0.4651 | 0.5719 | 0.4005 | 0.9360 | 0.7244 | 0.9246 |
0.2175 | 1.88 | 310 | 0.2549 | 0.7314 | 0.4873 | 0.5849 | 0.4133 | 0.9364 | 0.7346 | 0.9244 |
0.2365 | 1.94 | 320 | 0.2524 | 0.7237 | 0.5057 | 0.5954 | 0.4239 | 0.9368 | 0.7431 | 0.9244 |
0.2068 | 2.0 | 330 | 0.2513 | 0.7569 | 0.4744 | 0.5832 | 0.4117 | 0.9376 | 0.7295 | 0.9260 |
0.2004 | 2.06 | 340 | 0.2506 | 0.6962 | 0.5462 | 0.6122 | 0.4411 | 0.9363 | 0.7611 | 0.9234 |
0.231 | 2.12 | 350 | 0.2490 | 0.7145 | 0.5251 | 0.6053 | 0.4340 | 0.9370 | 0.7519 | 0.9241 |
0.2117 | 2.18 | 360 | 0.2457 | 0.7300 | 0.5132 | 0.6027 | 0.4314 | 0.9378 | 0.7470 | 0.9257 |
0.1768 | 2.24 | 370 | 0.2450 | 0.7281 | 0.5273 | 0.6116 | 0.4405 | 0.9384 | 0.7537 | 0.9256 |
0.2013 | 2.3 | 380 | 0.2433 | 0.7198 | 0.5513 | 0.6244 | 0.4539 | 0.9390 | 0.7648 | 0.9258 |
0.2128 | 2.36 | 390 | 0.2405 | 0.7568 | 0.5214 | 0.6174 | 0.4466 | 0.9406 | 0.7522 | 0.9282 |
0.2186 | 2.42 | 400 | 0.2393 | 0.7560 | 0.5215 | 0.6173 | 0.4464 | 0.9405 | 0.7522 | 0.9279 |
0.2105 | 2.48 | 410 | 0.2408 | 0.6966 | 0.5834 | 0.6350 | 0.4652 | 0.9383 | 0.7788 | 0.9246 |
0.2216 | 2.55 | 420 | 0.2382 | 0.7415 | 0.5493 | 0.6311 | 0.4610 | 0.9409 | 0.7650 | 0.9277 |
0.1816 | 2.61 | 430 | 0.2377 | 0.7258 | 0.5768 | 0.6428 | 0.4736 | 0.9410 | 0.7774 | 0.9274 |
0.2136 | 2.67 | 440 | 0.2352 | 0.7506 | 0.5456 | 0.6319 | 0.4619 | 0.9415 | 0.7636 | 0.9284 |
0.2043 | 2.73 | 450 | 0.2341 | 0.7425 | 0.5615 | 0.6394 | 0.4700 | 0.9418 | 0.7709 | 0.9286 |
0.2014 | 2.79 | 460 | 0.2333 | 0.7565 | 0.5572 | 0.6417 | 0.4725 | 0.9428 | 0.7695 | 0.9297 |
0.1862 | 2.85 | 470 | 0.2306 | 0.7744 | 0.5520 | 0.6446 | 0.4755 | 0.9440 | 0.7678 | 0.9313 |
0.1714 | 2.91 | 480 | 0.2312 | 0.7354 | 0.6083 | 0.6658 | 0.4991 | 0.9438 | 0.7931 | 0.9302 |
0.1693 | 2.97 | 490 | 0.2280 | 0.7637 | 0.5768 | 0.6572 | 0.4895 | 0.9447 | 0.7794 | 0.9314 |
0.2043 | 3.03 | 500 | 0.2288 | 0.7577 | 0.5848 | 0.6601 | 0.4927 | 0.9446 | 0.7830 | 0.9314 |
0.2138 | 3.09 | 510 | 0.2256 | 0.7797 | 0.5650 | 0.6552 | 0.4872 | 0.9453 | 0.7744 | 0.9327 |
0.1914 | 3.15 | 520 | 0.2250 | 0.7732 | 0.5873 | 0.6675 | 0.5010 | 0.9462 | 0.7849 | 0.9330 |
0.1647 | 3.21 | 530 | 0.2240 | 0.7586 | 0.6173 | 0.6807 | 0.5160 | 0.9467 | 0.7987 | 0.9329 |
0.1749 | 3.27 | 540 | 0.2237 | 0.7679 | 0.6108 | 0.6804 | 0.5156 | 0.9472 | 0.7961 | 0.9331 |
0.1883 | 3.33 | 550 | 0.2226 | 0.7839 | 0.5992 | 0.6792 | 0.5143 | 0.9479 | 0.7913 | 0.9344 |
0.1657 | 3.39 | 560 | 0.2196 | 0.7856 | 0.6059 | 0.6841 | 0.5199 | 0.9485 | 0.7946 | 0.9353 |
0.1721 | 3.45 | 570 | 0.2217 | 0.7556 | 0.6408 | 0.6935 | 0.5308 | 0.9479 | 0.8099 | 0.9335 |
0.1843 | 3.52 | 580 | 0.2188 | 0.7935 | 0.6010 | 0.6840 | 0.5197 | 0.9489 | 0.7926 | 0.9354 |
0.1709 | 3.58 | 590 | 0.2175 | 0.7993 | 0.6078 | 0.6905 | 0.5273 | 0.9499 | 0.7962 | 0.9364 |
0.1526 | 3.64 | 600 | 0.2168 | 0.7782 | 0.6380 | 0.7012 | 0.5398 | 0.9500 | 0.8098 | 0.9358 |
0.1614 | 3.7 | 610 | 0.2148 | 0.8129 | 0.6083 | 0.6959 | 0.5336 | 0.9511 | 0.7971 | 0.9380 |
0.1585 | 3.76 | 620 | 0.2149 | 0.8046 | 0.6210 | 0.7010 | 0.5396 | 0.9513 | 0.8029 | 0.9377 |
0.1798 | 3.82 | 630 | 0.2163 | 0.7788 | 0.6476 | 0.7072 | 0.5470 | 0.9507 | 0.8145 | 0.9364 |
0.1637 | 3.88 | 640 | 0.2147 | 0.8000 | 0.6276 | 0.7034 | 0.5425 | 0.9513 | 0.8059 | 0.9375 |
0.1542 | 3.94 | 650 | 0.2138 | 0.8004 | 0.6335 | 0.7072 | 0.5471 | 0.9518 | 0.8088 | 0.9379 |
0.1575 | 4.0 | 660 | 0.2146 | 0.7867 | 0.6464 | 0.7097 | 0.5500 | 0.9514 | 0.8143 | 0.9371 |
0.1632 | 4.06 | 670 | 0.2124 | 0.7998 | 0.6368 | 0.7091 | 0.5493 | 0.9519 | 0.8103 | 0.9380 |
0.1687 | 4.12 | 680 | 0.2112 | 0.8129 | 0.6294 | 0.7095 | 0.5498 | 0.9526 | 0.8074 | 0.9390 |
0.1565 | 4.18 | 690 | 0.2129 | 0.7959 | 0.6429 | 0.7113 | 0.5519 | 0.9520 | 0.8131 | 0.9380 |
0.1869 | 4.24 | 700 | 0.2128 | 0.7896 | 0.6526 | 0.7146 | 0.5559 | 0.9521 | 0.8175 | 0.9378 |
0.1689 | 4.3 | 710 | 0.2119 | 0.8052 | 0.6361 | 0.7107 | 0.5512 | 0.9524 | 0.8102 | 0.9385 |
0.1581 | 4.36 | 720 | 0.2126 | 0.7817 | 0.6618 | 0.7167 | 0.5585 | 0.9519 | 0.8215 | 0.9373 |
0.1683 | 4.42 | 730 | 0.2121 | 0.8019 | 0.6442 | 0.7145 | 0.5558 | 0.9526 | 0.8140 | 0.9384 |
0.1735 | 4.48 | 740 | 0.2111 | 0.8009 | 0.6452 | 0.7147 | 0.5560 | 0.9526 | 0.8145 | 0.9387 |
0.1537 | 4.55 | 750 | 0.2104 | 0.7991 | 0.6461 | 0.7145 | 0.5558 | 0.9525 | 0.8148 | 0.9386 |
0.174 | 4.61 | 760 | 0.2112 | 0.8031 | 0.6454 | 0.7156 | 0.5572 | 0.9528 | 0.8147 | 0.9387 |
0.1662 | 4.67 | 770 | 0.2118 | 0.7897 | 0.6586 | 0.7182 | 0.5603 | 0.9525 | 0.8204 | 0.9378 |
0.1486 | 4.73 | 780 | 0.2113 | 0.8009 | 0.6492 | 0.7171 | 0.5590 | 0.9529 | 0.8164 | 0.9386 |
0.1672 | 4.79 | 790 | 0.2110 | 0.8055 | 0.6461 | 0.7170 | 0.5589 | 0.9531 | 0.8152 | 0.9389 |
0.1553 | 4.85 | 800 | 0.2108 | 0.7969 | 0.6527 | 0.7176 | 0.5596 | 0.9528 | 0.8179 | 0.9383 |
0.1504 | 4.91 | 810 | 0.2106 | 0.8047 | 0.6461 | 0.7167 | 0.5585 | 0.9530 | 0.8151 | 0.9389 |
0.176 | 4.97 | 820 | 0.2103 | 0.8059 | 0.6459 | 0.7171 | 0.5589 | 0.9531 | 0.8151 | 0.9389 |
0.1597 | 5 | | | | | | | | | |
📄 License
This project is licensed under the MIT license.