# kpf-sbert-v1.1
This is a sentence-transformers model: it maps sentences and paragraphs to a 768-dimensional dense vector space, which can be used for tasks such as clustering or semantic search.
This model is fine-tuned from the jinmang2/kpfbert model using SentenceBERT. (One more round of NLI-STS training was conducted on kpf-sbert-v1.)
## Quick Start
This model can be used directly for tasks like clustering or semantic search by mapping sentences and paragraphs into a 768-dimensional dense vector space.
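A minimal usage sketch with the sentence-transformers library (the example sentences are placeholders):

```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub
model = SentenceTransformer("bongsoo/kpf-sbert-v1.1")

sentences = ["아침에 커피를 마셨다.", "A cup of coffee in the morning."]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
```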
## Features
- Maps sentences and paragraphs to a 768-dimensional dense vector space.
- Suitable for tasks such as clustering and semantic search (see the sketch below).
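As an illustration of the semantic-search use case, a short sketch using the stock `util.semantic_search` helper (the corpus and query are made-up placeholders):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bongsoo/kpf-sbert-v1.1")

corpus = [
    "한국은행이 기준금리를 동결했다.",
    "프로야구 개막전이 이번 주말에 열린다.",
    "중앙은행이 금리 인상을 검토 중이다.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("기준금리 결정 소식", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], hit["score"])
```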
## Documentation
### Evaluation Results
- For performance measurement, the following Korean (kor) and English (en) evaluation corpora were used:
  - Korean: korsts (1,379 sentence pairs) and klue-sts (519 sentence pairs)
  - English: stsb_multi_mt (1,376 sentence pairs) and glue:stsb (1,500 sentence pairs)
- The performance metric is the Spearman rank correlation of cosine similarities (cosin.spearman; a sketch of the computation follows the table).
- Refer to the evaluation measurement code [here](https://github.com/kobongsoo/BERT/blob/master/sbert/sbert-test3.ipynb).
| Model | korsts | klue-sts | glue(stsb) | stsb_multi_mt(en) |
|:---|:---:|:---:|:---:|:---:|
| distiluse-base-multilingual-cased-v2 | 0.7475 | 0.7855 | 0.8193 | 0.8075 |
| paraphrase-multilingual-mpnet-base-v2 | 0.8201 | 0.7993 | 0.8907 | 0.8682 |
| bongsoo/albert-small-kor-sbert-v1 | 0.8305 | 0.8588 | 0.8419 | 0.7965 |
| bongsoo/klue-sbert-v1.0 | 0.8529 | 0.8952 | 0.8813 | 0.8469 |
| bongsoo/kpf-sbert-v1.0 | 0.8590 | 0.8924 | 0.8840 | 0.8531 |
| bongsoo/kpf-sbert-v1.1 | 0.8750 | 0.8900 | 0.8863 | 0.8554 |
For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net
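The metric above can be reproduced directly; a minimal sketch, assuming STS-style pairs with gold similarity ratings (the pairs and scores below are placeholders, not the actual benchmark data):

```python
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bongsoo/kpf-sbert-v1.1")

# Placeholder STS pairs; the real evaluation uses korsts, klue-sts, etc.
sentences1 = ["고양이가 소파 위에서 잔다.", "오늘 주가가 크게 올랐다."]
sentences2 = ["소파에서 고양이가 자고 있다.", "시장이 급등세를 보였다."]
gold_scores = [4.8, 3.5]  # human similarity ratings

emb1 = model.encode(sentences1, convert_to_tensor=True)
emb2 = model.encode(sentences2, convert_to_tensor=True)

# Cosine similarity of each pair, then Spearman correlation against the gold scores
cosine_scores = util.cos_sim(emb1, emb2).diagonal().cpu().numpy()
print(spearmanr(cosine_scores, gold_scores).correlation)
```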
### Training
- Starting from the jinmang2/kpfbert model, training proceeded through the following stages (epochs in parentheses): sts(10) → distil(10) → nli(3) → sts(10) → nli(3) → sts(10).
The model was trained with the following parameters:
Common
- do_lower_case = 1, correct_bias = 0, pooling_mode = mean
1. STS
- Corpus: korsts (5,749) + kluestsV1.1 (11,668) + stsb_multi_mt (5,749) + mteb/sickr-sts (9,927) + glue stsb (5,749) (total: 38,842 pairs)
- Parameters: lr: 1e-4, eps: 1e-6, warm_step = 10%, epochs: 10, train_batch: 128, eval_batch: 64, max_token_len: 72
- Refer to the training code [here](https://github.com/kobongsoo/BERT/blob/master/sbert/sentece-bert-sts.ipynb).
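A minimal sketch of this stage with the sentence-transformers fit API, assuming mean pooling on top of the base encoder (dataset loading is elided; the two example pairs are placeholders):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Base encoder + mean pooling (pooling_mode = mean, max_token_len = 72)
word_embedding = models.Transformer("jinmang2/kpfbert", max_seq_length=72)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[word_embedding, pooling])

# STS pairs with gold scores normalized to [0, 1]; placeholders for the real corpora
train_examples = [
    InputExample(texts=["고양이가 잔다.", "고양이가 자고 있다."], label=0.96),
    InputExample(texts=["고양이가 잔다.", "주가가 올랐다."], label=0.05),
]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=128)
train_loss = losses.CosineSimilarityLoss(model)

num_epochs = 10
warmup_steps = int(len(train_loader) * num_epochs * 0.1)  # warm_step = 10%
model.fit(
    train_objectives=[(train_loader, train_loss)],
    epochs=num_epochs,
    warmup_steps=warmup_steps,
    optimizer_params={"lr": 1e-4, "eps": 1e-6},
)
```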
2. Distillation
- Teacher model: paraphrase-multilingual-mpnet-base-v2 (max_token_len: 128)
- Corpus: news_talk_en_ko_train.tsv (English-Korean dialogue-news parallel corpus: 1.38M pairs)
- Parameters: lr: 5e-5, eps: 1e-8, epochs: 10, train_batch: 128, eval/test_batch: 64, max_token_len: 128 (to match the teacher model)
- Refer to the training code [here](https://github.com/kobongsoo/BERT/blob/master/sbert/sbert-distillaton.ipynb).
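A sketch of this stage along the lines of the standard sentence-transformers multilingual-distillation recipe (MSE between student and teacher embeddings); whether the linked notebook uses exactly these helpers, and which checkpoint serves as the student, are assumptions:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, losses
from sentence_transformers.datasets import ParallelSentencesDataset

teacher = SentenceTransformer("paraphrase-multilingual-mpnet-base-v2")
student = SentenceTransformer("bongsoo/kpf-sbert-v1.0")  # model from the STS stage (assumption)
student.max_seq_length = 128  # match the teacher's max_token_len

# Each TSV line holds an English sentence and its Korean translation
train_data = ParallelSentencesDataset(student_model=student, teacher_model=teacher)
train_data.load_data("news_talk_en_ko_train.tsv")
train_loader = DataLoader(train_data, shuffle=True, batch_size=128)

# Student embeddings are regressed onto the teacher's embeddings
train_loss = losses.MSELoss(model=student)
student.fit(
    train_objectives=[(train_loader, train_loss)],
    epochs=10,
    optimizer_params={"lr": 5e-5, "eps": 1e-8},
)
```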
3. NLI
- Corpus: training (967,852 pairs): kornli (550,152), kluenli (24,998), glue-mnli (392,702); evaluation (3,519 pairs): korsts (1,500), kluests (519), gluests (1,500)
- Parameters: lr: 3e-5, eps: 1e-8, warm_step = 10%, epochs: 3, train/eval_batch: 64, max_token_len: 128
- Refer to the training code [here](https://github.com/kobongsoo/BERT/blob/master/sbert/sentence-bert-nli.ipynb).
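A sketch of this stage using the classic SBERT NLI setup with a softmax classification head; the exact loss used in the linked notebook is not stated here, so SoftmaxLoss is an assumption:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("bongsoo/kpf-sbert-v1.0")  # model from the previous stage (assumption)

# NLI pairs labeled contradiction / entailment / neutral; placeholders for kornli etc.
label2id = {"contradiction": 0, "entailment": 1, "neutral": 2}
train_examples = [
    InputExample(texts=["남자가 밥을 먹는다.", "남자가 식사 중이다."], label=label2id["entailment"]),
    InputExample(texts=["남자가 밥을 먹는다.", "남자가 잠을 잔다."], label=label2id["contradiction"]),
]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=64)
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=len(label2id),
)

num_epochs = 3
warmup_steps = int(len(train_loader) * num_epochs * 0.1)  # warm_step = 10%
model.fit(
    train_objectives=[(train_loader, train_loss)],
    epochs=num_epochs,
    warmup_steps=warmup_steps,
    optimizer_params={"lr": 3e-5, "eps": 1e-8},
)
```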
## License
No license information provided.
## Citing & Authors
bongsoo