# Multi-task Optimization
## Ling Lite 1.5
- Publisher: inclusionAI
- License: MIT
- Tags: Large Language Model, Transformers
- Downloads: 46 | Likes: 3

Ling is a large-scale Mixture of Experts (MoE) language model open-sourced by inclusionAI. The Lite version has 16.8 billion total parameters with 2.75 billion activated parameters, while delivering strong performance.
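For orientation, the sketch below shows one common way to load a chat-tuned MoE checkpoint like this through the Hugging Face transformers API. The repository id and the `trust_remote_code=True` flag are assumptions inferred from this listing, not verified requirements.

```python
# Minimal sketch: loading an MoE chat model with Hugging Face transformers.
# The repo id and the trust_remote_code flag are assumptions from the listing above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/Ling-lite-1.5"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # let transformers pick the checkpoint dtype
    device_map="auto",       # place layers on available devices
    trust_remote_code=True,  # MoE checkpoints often ship custom modeling code
)

messages = [{"role": "user", "content": "Summarize what a Mixture of Experts model is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```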
## OLMo 2 0425 1B Instruct GGUF
- Publisher: unsloth
- License: Apache-2.0
- Tags: Large Language Model, English
- Downloads: 3,137 | Likes: 3

A GGUF build of OLMo 2 0425 1B Instruct, a post-trained variant of the OLMo-2-0425-1B-RLVR1 model that underwent supervised fine-tuning, DPO training, and RLVR training, aiming for state-of-the-art performance across multiple tasks.
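Since this entry ships GGUF files for local inference, here is a hedged sketch of running one with llama-cpp-python. The file name is hypothetical; substitute whichever quantization you actually download from the repository.

```python
# Minimal sketch: running a GGUF quantization locally with llama-cpp-python.
# The model_path is hypothetical; download a .gguf file from the repo first.
from llama_cpp import Llama

llm = Llama(
    model_path="./OLMo-2-0425-1B-Instruct-Q4_K_M.gguf",  # hypothetical local file
    n_ctx=4096,  # context window for this session
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me one sentence about the OLMo project."}],
    max_tokens=128,
)
print(result["choices"][0]["message"]["content"])
```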
## OLMo 2 0425 1B Instruct
- Publisher: allenai
- License: Apache-2.0
- Tags: Large Language Model, Transformers, English
- Downloads: 5,127 | Likes: 33

OLMo 2 1B Instruct is a post-trained variant of the allenai/OLMo-2-0425-1B-RLVR1 model that underwent supervised fine-tuning, DPO training, and RLVR training, aiming for state-of-the-art performance across multiple tasks.
## DeepSeek R1
- Publisher: deepseek-ai
- License: MIT
- Tags: Large Language Model, Transformers
- Downloads: 1.7M | Likes: 12.03k

DeepSeek-R1 is DeepSeek's first-generation reasoning model. Trained with large-scale reinforcement learning, it performs strongly on mathematics, coding, and reasoning tasks.
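A checkpoint of this scale is usually consumed through a hosted, OpenAI-compatible endpoint rather than loaded locally. The sketch below assumes such an endpoint exists; both the base URL and the deployment name are placeholders, not official values.

```python
# Minimal sketch: querying a reasoning model served behind an OpenAI-compatible API.
# The base_url and model name are placeholders, not verified endpoints.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-inference-host/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-r1",  # hypothetical deployment name
    messages=[{"role": "user", "content": "What is 17 * 24? Show your reasoning."}],
)
print(response.choices[0].message.content)
```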
## GTE ModernBERT Base
- Publisher: Alibaba-NLP
- License: Apache-2.0
- Tags: Text Embedding, Transformers, English
- Downloads: 74.52k | Likes: 138

A text embedding model built on the ModernBERT pre-trained encoder. It supports long inputs of up to 8,192 tokens and performs well on evaluation benchmarks such as MTEB, LoCo, and CoIR.
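As a usage illustration, the sketch below encodes two texts and compares them with sentence-transformers. It assumes a recent sentence-transformers release (for the `similarity` helper) and takes the `Alibaba-NLP/gte-modernbert-base` repository id from this listing.

```python
# Minimal sketch: encoding texts with a long-context embedding model via
# sentence-transformers, then comparing them with cosine similarity.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")  # repo id from the listing

docs = [
    "ModernBERT extends the BERT-style encoder with support for long inputs.",
    "The capital of France is Paris.",
]
embeddings = model.encode(docs)                     # one vector per document
scores = model.similarity(embeddings, embeddings)   # pairwise similarity matrix
print(scores)
```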
## Ruri Small
- Publisher: cl-nagoya
- License: Apache-2.0
- Tags: Text Embedding, Japanese
- Downloads: 11.75k | Likes: 9

Ruri is a Japanese text embedding model that efficiently computes sentence similarity and extracts text features.
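A similar sketch for Japanese sentence similarity, assuming the model is likewise usable through sentence-transformers; the query/passage prefixes shown in the example are an assumption to verify against the model card.

```python
# Minimal sketch: Japanese sentence similarity with sentence-transformers.
# The repo id comes from the listing; the "クエリ: " / "文章: " prefixes are an
# assumption about the model's input convention and should be checked.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cl-nagoya/ruri-small")

sentences = [
    "クエリ: 東京の観光名所を教えてください。",
    "文章: 浅草寺や東京タワーは東京を代表する観光スポットです。",
    "文章: 今日の天気は晴れです。",
]
embeddings = model.encode(sentences)
# Compare the query vector against the two passage vectors.
print(model.similarity(embeddings[:1], embeddings[1:]))
```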
## SILMA 9B Instruct V1.0
- Publisher: silma-ai
- Tags: Large Language Model, Transformers, Multilingual
- Downloads: 18.08k | Likes: 74

SILMA-9B-Instruct-v1.0 is a 9-billion-parameter open-source Arabic large language model built on Google's Gemma architecture, and it excels at Arabic-language tasks.
## Beyonder 4x7B V2
- Publisher: mlabonne
- License: Other
- Tags: Large Language Model, Transformers
- Downloads: 758 | Likes: 130

Beyonder-4x7B-v2 is a Mixture of Experts (MoE) large language model composed of four expert modules, each specializing in a different domain: dialogue, programming, creative writing, and mathematical reasoning.
## Hindi TPU ELECTRA
- Publisher: monsoon-nlp
- Tags: Large Language Model, Transformers, Other
- Downloads: 25 | Likes: 1

A Hindi pre-trained language model based on the ELECTRA architecture that outperforms multilingual BERT on a range of Hindi NLP tasks.