Kosolar 10.7B V0.2

Developed by yanolja
A Korean vocabulary-expanded version of upstage/SOLAR-10.7B-v1.0, fine-tuned on Korean web-crawled datasets.
Downloads 21
Release date: 1/18/2024

Model Overview

This model extends the base model's Korean comprehension by pre-training the embeddings of newly added tokens and partially fine-tuning the `lm_head` embeddings of existing tokens, while leaving the base model's remaining parameters unchanged.
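The idea of training only the new token embeddings while keeping the existing ones fixed can be sketched in PyTorch with a gradient hook. This is a toy illustration, not the authors' training code: the sizes `OLD_VOCAB`, `NEW_TOKENS`, and `HIDDEN`, the helper `freeze_old_rows`, and the simplified loss are all assumptions made for the example, and the stand-in `embed_tokens` and `lm_head` modules replace the real model.

```python
import torch
import torch.nn as nn

# Hypothetical toy sizes; the real model is far larger.
OLD_VOCAB = 100   # tokens already present in the base vocabulary
NEW_TOKENS = 20   # newly added tokens
HIDDEN = 16
VOCAB = OLD_VOCAB + NEW_TOKENS

torch.manual_seed(0)

# Stand-ins for the input embedding and the output projection.
embed_tokens = nn.Embedding(VOCAB, HIDDEN)
lm_head = nn.Linear(HIDDEN, VOCAB, bias=False)

def freeze_old_rows(grad):
    """Zero the gradient for rows that belong to existing tokens."""
    grad = grad.clone()
    grad[:OLD_VOCAB] = 0
    return grad

# Only the rows of the new tokens receive a non-zero gradient.
embed_tokens.weight.register_hook(freeze_old_rows)

before = embed_tokens.weight.detach().clone()
params = list(embed_tokens.parameters()) + list(lm_head.parameters())
optimizer = torch.optim.SGD(params, lr=0.1)

# Every token id appears once; the loss simply predicts the input token,
# which is a simplification of real next-token training.
tokens = torch.arange(VOCAB).view(8, 15)
logits = lm_head(embed_tokens(tokens))
loss = nn.functional.cross_entropy(logits.view(-1, VOCAB), tokens.view(-1))
loss.backward()
optimizer.step()

after = embed_tokens.weight.detach()
old_unchanged = torch.equal(before[:OLD_VOCAB], after[:OLD_VOCAB])
new_changed = not torch.equal(before[OLD_VOCAB:], after[OLD_VOCAB:])
print(old_unchanged, new_changed)
```

Plain SGD is used here because optimizers with weight decay or momentum can still move rows whose gradient is zero; a real setup would need to account for that.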

Model Features

Korean vocabulary expansion
Expanded the vocabulary with 8,960 carefully selected Korean tokens to enhance Korean comprehension.
Selective parameter freezing
Froze the `embed_tokens` layer of existing tokens while unfreezing the `lm_head` layer, balancing Korean capability with original language performance.
Multilingual corpus training
Training data includes Korean web content (83.46%), multilingual corpora (10.69%), and English-to-Korean paragraph pairs (5.86%).
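To make the reported corpus proportions concrete, the following sketch draws training sources according to those weights. It assumes sampling happens per example, which the model description does not state; the keys in `MIXTURE` and the helper `sample_sources` are hypothetical names chosen for illustration. The published percentages sum to 100.01 because of rounding, which `random.choices` handles by normalizing the weights.

```python
import random
from collections import Counter

# Percentages as reported for the training data; key names are hypothetical.
MIXTURE = {
    "korean_web": 83.46,
    "multilingual": 10.69,
    "en_ko_pairs": 5.86,
}

def sample_sources(n, seed=0):
    """Draw n source labels in proportion to the mixture weights."""
    rng = random.Random(seed)
    names = list(MIXTURE)
    weights = list(MIXTURE.values())
    return rng.choices(names, weights=weights, k=n)

N = 100_000
counts = Counter(sample_sources(N))
shares = {name: counts[name] / N for name in MIXTURE}
print(shares)
```

With enough draws, the empirical shares approach the stated proportions.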

Model Capabilities

Korean text generation
Multilingual text generation

Use Cases

Natural language processing
Korean content generation
Generate text content that conforms to Korean language conventions
Multilingual translation assistance
Assist in English-to-Korean translation tasks