# Long Context Processing
InternVL3 78B Pretrained
Other
InternVL3-78B is an advanced multimodal large language model developed by OpenGVLab, delivering strong overall performance. Compared with its predecessor, InternVL 2.5, it offers stronger multimodal perception and reasoning, and extends to new domains such as tool use, GUI agents, industrial image analysis, and 3D visual perception.
Text-to-Image
Transformers Other

OpenGVLab
22
1
InternVL3 2B Instruct
Apache-2.0
InternVL3-2B-Instruct is a supervised fine-tuned version of InternVL3-2B, built with native multimodal pretraining followed by SFT, equipped with strong multimodal perception and reasoning capabilities.
Text-to-Image
Transformers Other

OpenGVLab
1,345
4
Deepcoder 1.5B Preview GGUF
MIT
A code-reasoning large language model fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B, using distributed reinforcement learning to extend its long-context processing capabilities.
Large Language Model English
Mungert
888
2
La Superba 14B Y.2
Apache-2.0
A next-generation language model based on the Qwen 2.5 14B architecture, specifically optimized for mathematical reasoning, programming, and general logical tasks.
Large Language Model
Transformers Supports Multiple Languages

prithivMLmods
19
2
Moderncamembert Cv2 Base
MIT
A French language model pre-trained on 1 trillion tokens of high-quality French text; the French counterpart of ModernBERT.
Large Language Model
Transformers French

almanach
232
2
Minueza 2 96M
Apache-2.0
A compact language model based on the Llama architecture, supporting English and Portuguese, with 96 million parameters and a context length of 4096 tokens.
Large Language Model
Transformers Supports Multiple Languages

Felladrin
357
6
Deepseek V3 0324 GGUF
MIT
A quantized version of DeepSeek V3-0324 that substantially reduces file size while keeping quality close to the Q8_0 reference, positioned as the best-performing quantization in its size class.
Large Language Model Other
ubergarm
1,712
20
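GGUF quantization types such as Q8_0 store each block of weights as one float scale plus small signed integers, which is where the file-size savings come from. The sketch below illustrates that scheme on plain Python lists; it mirrors the idea, not llama.cpp's actual implementation, and the block size and values are illustrative.

```python
# Toy sketch of Q8_0-style block quantization: each block of floats is
# stored as one float scale plus signed 8-bit integers in [-127, 127].
# Illustrates the idea behind GGUF quant types, not llama.cpp's exact code.

def quantize_q8_block(block):
    """Quantize one block of floats to (scale, int8 values)."""
    amax = max(abs(x) for x in block) or 1.0
    scale = amax / 127.0
    qs = [max(-127, min(127, round(x / scale))) for x in block]
    return scale, qs

def dequantize_q8_block(scale, qs):
    """Recover approximate floats from the quantized block."""
    return [q * scale for q in qs]

weights = [0.3, -1.0, 0.25, 0.1]          # illustrative weight block
scale, qs = quantize_q8_block(weights)
restored = dequantize_q8_block(scale, qs)
# Per-value reconstruction error is bounded by scale / 2.
err = max(abs(a - b) for a, b in zip(weights, restored))
```

Storing one 8-bit integer per weight plus a shared scale is roughly a 4x size reduction versus float32, at the cost of the small rounding error bounded above.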
Granite 3.2 2b Instruct GGUF
Apache-2.0
Granite-3.2-2B-Instruct is a 2-billion-parameter long-context model fine-tuned for enhanced reasoning, supporting 12 languages and multi-task use.
Large Language Model
ibm-research
1,476
7
Granite 3.2 8b Instruct GGUF
Apache-2.0
Granite-3.2-8B-Instruct is an 8-billion-parameter long-context model fine-tuned for enhanced reasoning, supporting multiple languages and tasks.
Large Language Model
Transformers

ibm-research
1,059
5
Mmmamba Linear
MIT
mmMamba-linear is a decoder-only multimodal state-space model, distilled from a quadratic-attention model into linear-complexity form using moderate academic compute, enabling efficient multimodal processing.
Image-to-Text
Transformers

hustvl
16
3
Multilingual ModernBert Base Preview
MIT
A multilingual BERT-style model developed by the Algomatic team, supporting fill-mask tasks with an 8,192-token context length and a vocabulary of 151,680 tokens.
Large Language Model
Safetensors
makiart
60
4
Rumodernbert Small
Apache-2.0
A Russian version of ModernBERT, a bidirectional encoder-only Transformer, pre-trained on approximately 2 trillion tokens of Russian, English, and code data, with a context length of up to 8,192 tokens.
Large Language Model
Transformers Supports Multiple Languages

deepvk
619
14
Phi 4 Model Stock V2
Phi-4-Model-Stock-v2 is a large language model merged from multiple Phi-4 variant models using the model_stock merging method, demonstrating strong performance across multiple benchmarks.
Large Language Model
Transformers

bunnycore
56
2
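The model_stock method mentioned above (as implemented in mergekit) chooses per-layer interpolation weights from the geometry of the fine-tuned checkpoints; the core operation, though, is still elementwise combination of parameter tensors. A much simpler uniform average conveys the basic idea of weight-space merging; the dicts of float lists below are stand-ins for real state dicts.

```python
# Minimal weight-space merging sketch: uniform averaging of parameters
# from several fine-tuned checkpoints. mergekit's model_stock method is
# more sophisticated (it derives interpolation weights geometrically),
# but the underlying operation is the same elementwise combination.

def merge_average(checkpoints):
    """checkpoints: list of dicts mapping parameter name -> list of floats."""
    merged = {}
    for name in checkpoints[0]:
        vectors = [ckpt[name] for ckpt in checkpoints]
        merged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return merged

# Two hypothetical fine-tuned checkpoints of the same architecture.
ckpt_a = {"w": [1.0, 2.0], "b": [0.0]}
ckpt_b = {"w": [3.0, 4.0], "b": [1.0]}
merged = merge_average([ckpt_a, ckpt_b])
# merged["w"] == [2.0, 3.0], merged["b"] == [0.5]
```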
Qwen2 VL 2B Instruct GGUF
Apache-2.0
Qwen2-VL-2B-Instruct is a multimodal vision-language model that supports joint image-and-text input, suitable for image understanding and text generation tasks.
Image-to-Text English
gaianet
95
1
HTML Pruner Phi 3.8B
Apache-2.0
An HTML pruning model designed for RAG systems, where HTML is better suited than plain text for modeling retrieval results.
Large Language Model
Transformers English

zstanjj
319
10
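The HTML-Pruner model above learns which HTML blocks to keep for retrieval; a rule-based stub with the standard library's `html.parser` illustrates the goal of pruning (drop markup that carries no retrievable text, keep the visible content). The class name and sample HTML below are hypothetical.

```python
from html.parser import HTMLParser

# Rule-based sketch of HTML pruning for RAG: drop non-content markup
# (scripts, styles) and keep visible text. The actual HTML-Pruner model
# learns which blocks to retain; this stub only illustrates the idea.

class TextPruner(HTMLParser):
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.depth_skipped = 0   # nesting depth inside skipped tags
        self.chunks = []         # retained text fragments

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth_skipped += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth_skipped:
            self.depth_skipped -= 1

    def handle_data(self, data):
        if not self.depth_skipped and data.strip():
            self.chunks.append(data.strip())

html = "<html><script>var x=1;</script><p>Price: <b>42</b> USD</p></html>"
pruner = TextPruner()
pruner.feed(html)
pruned = " ".join(pruner.chunks)
# pruned == "Price: 42 USD"
```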
Jais Family 13b
Apache-2.0
The Jais series is a comprehensive family of English-Arabic bilingual large language models, optimized for Arabic while maintaining strong English capabilities; this is the 13B base model in the family.
Large Language Model Supports Multiple Languages
inceptionai
30
6
Jais Family 6p7b
Apache-2.0
The Jais series is a family of English-Arabic bilingual large language models specifically optimized for Arabic, with strong English capabilities; this model has 6.7 billion parameters.
Large Language Model Supports Multiple Languages
inceptionai
79
6
Jais Family 2p7b Chat
Apache-2.0
Jais is a bilingual large language model family specifically optimized for Arabic, with strong English capabilities, ranging from 590 million to 70 billion parameters; this is the 2.7B chat variant.
Large Language Model
Safetensors Supports Multiple Languages
inceptionai
583
7
Jais Adapted 7b Chat
Apache-2.0
The Jais series is a family of bilingual large language models based on the Llama-2 architecture, optimized for Arabic while maintaining strong English capabilities. This model is the 7-billion-parameter Arabic-adapted chat version, supporting a context length of 4,096 tokens.
Large Language Model Supports Multiple Languages
inceptionai
736
6
Phi 3 Vision 128k Instruct
MIT
Phi-3-Vision-128K-Instruct is a lightweight, cutting-edge open multimodal model supporting a 128K token context length, focusing on high-quality reasoning in text and visual domains.
Image-to-Text
Transformers Other

microsoft
25.19k
958
Phi 3 Mini 128k Instruct
MIT
Phi-3 Mini 128K Instruct is a 3.8B parameter lightweight open-source model focused on reasoning capabilities, supporting 128K context length.
Large Language Model
Transformers Supports Multiple Languages

microsoft
399.68k
1,638
Fireblossom 32K 7B
A 7B-parameter language model merged from fine-tunes of Mistral 7B v0.1 via task arithmetic, supporting a 32K context length and balancing creativity with reasoning.
Large Language Model
Transformers

grimjim
21
3
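Task arithmetic, used to build the merge above, treats each fine-tune as a "task vector" (fine-tuned weights minus base weights); summing scaled task vectors onto the base combines the fine-tunes. A minimal sketch with toy float lists standing in for weight tensors (the checkpoints and weights below are illustrative):

```python
# Task-arithmetic merging sketch: each fine-tune contributes a task
# vector (finetuned - base); scaled task vectors are added back onto
# the base model to combine the fine-tunes in weight space.

def task_vector(finetuned, base):
    """Difference between a fine-tuned checkpoint and its base."""
    return [f - b for f, b in zip(finetuned, base)]

def apply_task_vectors(base, vectors, weights):
    """Add each task vector to the base, scaled by its weight."""
    merged = list(base)
    for vec, w in zip(vectors, weights):
        merged = [m + w * v for m, v in zip(merged, vec)]
    return merged

base = [1.0, 1.0]
ft_a = [2.0, 1.0]   # hypothetical creative fine-tune
ft_b = [1.0, 3.0]   # hypothetical reasoning fine-tune
merged = apply_task_vectors(
    base,
    [task_vector(ft_a, base), task_vector(ft_b, base)],
    weights=[0.5, 0.5],
)
# merged == [1.5, 2.0]
```

The per-vector weights control how strongly each fine-tune's behavior shows up in the merge, which is how such merges trade off, say, creativity against reasoning.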
Xlam V0.1 R
xLAM-v0.1 is a major upgrade in the Large Action Model series, fine-tuned across a wide range of agent tasks and scenarios while maintaining the original model's capabilities with the same parameter count.
Large Language Model
Transformers

Salesforce
190
53
Miqu 1 120b
Other
A 120B-parameter large language model built with the mergekit tool by interleaving layers of miqu-1-70b-sf, a model derived from miqu-1-70b.
Large Language Model
Transformers Supports Multiple Languages

wolfram
15
52
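Interleaved layer merges of the kind described above (mergekit "passthrough" frankenmerges) build a deeper model by alternating overlapping layer ranges from two copies of the same base model. The sketch below shows only the layer-indexing logic; the span and overlap values are illustrative, not the actual miqu-1-120b recipe.

```python
# Sketch of an interleaved frankenmerge: a deeper model is assembled by
# alternating overlapping layer ranges from two copies of a base model.
# Layer counts here are illustrative, not the real miqu-1-120b recipe.

def interleave_layers(n_layers, span, overlap):
    """Return (copy_id, layer_index) pairs for the merged layer stack."""
    stack = []
    start, copy_id = 0, 0
    while start < n_layers:
        end = min(start + span, n_layers)
        stack.extend((copy_id, i) for i in range(start, end))
        # Overlapping ranges repeat some layers, growing the stack.
        start = end - overlap if end < n_layers else end
        copy_id ^= 1  # alternate between the two copies
    return stack

stack = interleave_layers(n_layers=8, span=4, overlap=2)
# The merged stack has 12 layers, deeper than the 8-layer source.
```

Because some layer ranges are duplicated, an interleaved merge of a 70B model can exceed 100B parameters without any new training.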
Blockchainlabs 7B Merged Test2 4 Prune
A pruned version of alnrg2arg/blockchainlabs_7B_merged_test2_4, which itself merges the 7B-parameter models mlabonne/NeuralBeagle14-7B and udkai/Turdus.
Large Language Model
Transformers

alnrg2arg
135
2
Deepmoney 34b 200k Base
Apache-2.0
Deepmoney is a large language model focused on the financial investment domain, trained on high-quality research reports and financial knowledge, aiming to provide professional investment analysis and decision-making support.
Large Language Model
Transformers Supports Multiple Languages

TriadParty
144
69
Neural Chat 7b V3 1
Apache-2.0
A 7-billion-parameter large language model fine-tuned from Mistral-7B on Intel Gaudi 2 processors and aligned with DPO, suitable for a variety of language tasks.
Large Language Model
Transformers English

Intel
3,019
546
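The DPO alignment used for the model above optimizes a simple objective: raise the policy's log-probability margin for the chosen response over the rejected one, relative to a frozen reference model. A scalar sketch of the loss (the scalar log-probabilities below stand in for sums over response tokens and are illustrative):

```python
import math

# DPO (Direct Preference Optimization) loss sketch: the policy is pushed
# to prefer the chosen response over the rejected one more strongly than
# the frozen reference model does. loss = -log sigmoid(beta * margin).

def dpo_loss(policy_chosen, policy_rejected,
             ref_chosen, ref_rejected, beta=0.1):
    """All arguments are response log-probabilities under each model."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# A policy that prefers the chosen answer more than the reference does
# incurs a lower loss than one that prefers the rejected answer.
better = dpo_loss(-10.0, -30.0, -15.0, -25.0)
worse = dpo_loss(-20.0, -15.0, -15.0, -25.0)
```

Because the loss depends only on log-probabilities from the policy and a reference model, DPO avoids training a separate reward model, which is part of its appeal for hardware-constrained fine-tuning.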
Sciphi Self RAG Mistral 7B 32k
MIT
A large language model fine-tuned from Mistral-7B-v0.1 that incorporates Self-RAG techniques and supports a 32k context length.
Large Language Model
Transformers

SciPhi
147
89
Longalpaca 70B
LongAlpaca-70B is fine-tuned with LongLoRA, an efficient technique that extends the long-context capability of large language models via shifted short attention, supporting context lengths from 8k to 100k.
Large Language Model
Transformers

Yukang
1,293
21
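The shifted short attention behind LongLoRA restricts attention to fixed-size token groups, but shifts the group boundaries by half a group for part of the heads so information can flow between neighboring groups. A sketch of just the grouping logic (token and group counts below are illustrative):

```python
# Sketch of LongLoRA-style shifted short attention (S2-Attn) grouping:
# tokens attend only within fixed-size groups; for half the heads the
# boundaries are shifted by half the group size (with wrap-around), so
# adjacent groups exchange information. Only the grouping is shown.

def attention_groups(n_tokens, group_size, shifted):
    """Return the group id each token attends within."""
    offset = group_size // 2 if shifted else 0
    return [((i + offset) % n_tokens) // group_size for i in range(n_tokens)]

plain = attention_groups(8, group_size=4, shifted=False)
shift = attention_groups(8, group_size=4, shifted=True)
# plain: tokens 0-3 form group 0, tokens 4-7 form group 1.
# shift: boundaries move by 2, so tokens 2-5 now share a group.
```

Grouped attention costs grow linearly in sequence length rather than quadratically, which is what makes fine-tuning to 100k-token contexts tractable.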
Open Llm Search
Open LLM Search is a specialized adaptation of Together AI's llama-2-7b-32k model, specifically built for extracting information from web pages.
Large Language Model
Transformers English

masonbarnes
43
10
Xlnet Base Cased
MIT
XLNet is a model pre-trained on English text with a generalized permutation language modeling objective and a Transformer-XL backbone, achieving state-of-the-art results on multiple language tasks.
Large Language Model English
xlnet
166.60k
78
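XLNet's permutation language modeling objective samples a random factorization order and predicts each token from the tokens that precede it in that order, so every position sees bidirectional context across training samples. A toy sketch of which context each prediction gets (function name and example tokens are hypothetical; it assumes unique tokens for simplicity):

```python
import random

# Sketch of XLNet's permutation language modeling objective: sample a
# random factorization order over positions, then predict each token
# from the tokens preceding it *in that order* rather than left-to-right.

def plm_contexts(tokens, seed=0):
    """Return (sampled order, mapping token -> visible context tokens).

    Assumes unique tokens so the dict keys are unambiguous."""
    rng = random.Random(seed)
    order = list(range(len(tokens)))
    rng.shuffle(order)  # the sampled factorization order
    contexts = {}
    for rank, pos in enumerate(order):
        visible = [tokens[p] for p in order[:rank]]
        contexts[tokens[pos]] = visible  # predict tokens[pos] from visible
    return order, contexts

tokens = ["New", "York", "is", "a", "city"]
order, contexts = plm_contexts(tokens)
# The first position in the sampled order is predicted from an empty
# context; the last sees every other token, left and right alike.
```

Averaged over many sampled orders, each token is conditioned on context from both directions, which is how XLNet gains bidirectionality without BERT-style masking.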
Xlnet Large Cased
MIT
XLNet is an unsupervised language representation learning method based on a generalized permutation language modeling objective, using Transformer-XL as the backbone model, excelling in long-context tasks.
Large Language Model
Transformers English

xlnet
2,419
24