# Long Context Support

**Medgemma 4b Pt** (google) · License: Other · Downloads: 1,054 · Likes: 73
MedGemma is a multimodal medical model built on Gemma 3, designed for medical text and image understanding, available in 4B and 27B variants.
Tags: Image-to-Text, Transformers
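
As a rough illustration of how an image-to-text model like this is queried, here is a minimal sketch using the transformers "image-text-to-text" pipeline. The image URL and prompt are placeholders; since the PT checkpoint is the pretrained variant, the instruction-tuned sibling google/medgemma-4b-it is used for direct prompting, and the repo is gated behind a license acceptance on Hugging Face.

```python
# Minimal sketch: prompting MedGemma with a medical image via transformers.
# Assumes a recent transformers release with the "image-text-to-text" pipeline
# and that you have accepted the model license on Hugging Face.
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/medgemma-4b-it")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chest_xray.png"},  # placeholder URL
            {"type": "text", "text": "Describe the findings in this X-ray."},
        ],
    }
]
out = pipe(text=messages, max_new_tokens=128)
# For chat-style input the pipeline returns the full message list;
# the last entry holds the model's reply.
print(out[0]["generated_text"][-1]["content"])
```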
**Qwen3 0.6B GGUF** (QuantFactory) · License: Apache-2.0 · Downloads: 317 · Likes: 1
Qwen3-0.6B is the latest 0.6B-parameter large language model in the Qwen series, supporting switching between reasoning and non-reasoning modes, with strong reasoning, instruction-following, and multilingual capabilities.
Tags: Large Language Model
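
GGUF builds like this one target llama.cpp-based runtimes rather than transformers. A minimal sketch with llama-cpp-python, assuming you have downloaded one of the quantized files (the exact file name below is hypothetical):

```python
# Minimal sketch: running a GGUF quantization of Qwen3-0.6B with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3-0.6B.Q4_K_M.gguf",  # hypothetical local file; use the one you downloaded
    n_ctx=8192,       # context window to allocate; raise it for long-context workloads
    n_gpu_layers=-1,  # offload all layers to the GPU when one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the idea behind mixture-of-experts models."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```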
**Qwen3 30B A3B ERP V0.1** (Aratako) · License: MIT · Downloads: 68 · Likes: 6
A role-play-specialized large language model fine-tuned from Qwen3-30B-A3B-NSFW-JP, supporting long-form Japanese text generation.
Tags: Large Language Model, Transformers, Japanese
**Superthoughts Lite V2 MOE Llama3.2 GGUF** (Pinkstack) · Downloads: 119 · Likes: 2
Superthoughts Lite v2 is a lightweight Mixture of Experts (MoE) model based on the Llama-3.2 architecture, focused on reasoning tasks for higher accuracy and performance.
Tags: Large Language Model, Supports Multiple Languages
**Qwen3 1.7B GGUF** (Qwen) · License: Apache-2.0 · Downloads: 1,180 · Likes: 1
The latest generation of the Qwen (Tongyi Qianwen) series of large language models, supporting switching between thinking and non-thinking modes, with strong reasoning, multilingual, and agent capabilities.
Tags: Large Language Model
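
Several Qwen3 entries in this list mention the thinking / non-thinking switch. With the original transformers checkpoints this is exposed as the enable_thinking flag of the chat template, sketched below against the Qwen/Qwen3-1.7B repo; GGUF builds reach the same behavior through /think and /no_think tags in the prompt.

```python
# Minimal sketch of Qwen3's thinking/non-thinking switch via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen3-1.7B"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 23?"}]

# enable_thinking=True lets the model emit a <think>...</think> reasoning trace
# before the answer; set it to False for a direct reply.
text = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
inputs = tok(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```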
**GLM4 32B Neon V2** (allura-org) · License: MIT · Downloads: 171 · Likes: 7
A role-play fine-tune of GLM-4-32B-0414 with strong performance, a distinctive personality, diverse styles, and polished writing.
Tags: Large Language Model, Transformers, English
**Qwen3 1.7B GGUF** (unsloth) · License: Apache-2.0 · Downloads: 28.55k · Likes: 16
Qwen3-1.7B is the latest generation of the Qwen series at 1.7B parameters, supporting switching between thinking and non-thinking modes, with enhanced reasoning capabilities and multilingual support.
Tags: Large Language Model, English
**Qwen3 0.6B GGUF** (unsloth) · License: Apache-2.0 · Downloads: 53.56k · Likes: 41
Qwen3-0.6B is a 0.6B-parameter large language model developed by Alibaba Cloud, the latest member of the Qwen3 series, supporting over 100 languages with strong reasoning, instruction-following, and multilingual capabilities.
Tags: Large Language Model, English
**Viper Coder V1.7 Vsm6** (prithivMLmods) · License: Apache-2.0 · Downloads: 491 · Likes: 5
Viper-Coder-v1.7-Vsm6 is a large language model built on the Qwen2.5 14B architecture, focused on improving coding efficiency and computational reasoning while optimizing memory usage and reducing redundant text generation.
Tags: Large Language Model, Transformers, Supports Multiple Languages
**Llama 3 70b Arimas Story RP V1.6 3.5bpw H6 Exl2** (kim512) · Downloads: 21 · Likes: 1
A merged model based on Llama-3-70B, specialized in story generation and role-play (RP), combining multiple high-quality models via the breadcrumbs_ties merge method.
Tags: Large Language Model, Transformers
**EXAONE Deep 7.8B GGUF** (QuantFactory) · License: Other · Downloads: 297 · Likes: 3
The EXAONE Deep series excels at reasoning tasks such as mathematics and programming; this 7.8B version outperforms open-source models of similar scale and even surpasses some proprietary models.
Tags: Large Language Model, Supports Multiple Languages
**Modernbert Base Tr Uncased** (artiwise-ai) · License: MIT · Downloads: 159 · Likes: 9
A Turkish pre-trained model based on the ModernBERT architecture, supporting an 8192-token context length with strong performance across multiple domains.
Tags: Large Language Model, Transformers, Other
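
For encoder models like this one, the natural entry point is masked-token prediction. A minimal sketch, assuming the repo id matches the listing name and the checkpoint exposes a standard fill-mask head (the Turkish example sentence is ours):

```python
# Minimal sketch: masked-token prediction with the Turkish ModernBERT model.
from transformers import pipeline

fill = pipeline("fill-mask", model="artiwise-ai/modernbert-base-tr-uncased")
# "Ankara is Turkey's [MASK] city."
for pred in fill("Ankara Türkiye'nin [MASK] şehridir."):
    print(pred["token_str"], round(pred["score"], 3))
```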
**Ola 7b** (THUdyh) · License: Apache-2.0 · Downloads: 1,020 · Likes: 37
Ola-7B is a multimodal large language model jointly developed by Tencent, Tsinghua University, and Nanyang Technological University. Based on the Qwen2.5 architecture, it processes text, image, video, and audio inputs and generates text output.
Tags: Multimodal Fusion, Safetensors, Supports Multiple Languages
**Falcon3 MoE 2x7B Instruct** (ehristoforu) · License: Other · Downloads: 273 · Likes: 10
A Mixture of Experts model combining two Falcon3 7B-IT experts, totaling 13.4 billion parameters, supporting English, French, Spanish, and Portuguese with a context length of up to 32K.
Tags: Large Language Model, Safetensors, English
**Jina Embeddings V2 Base Code GGUF** (gaianet) · License: Apache-2.0 · Downloads: 575 · Likes: 1
Jina Embeddings V2 Base Code is a transformer-based English text embedding model, specialized in feature extraction and sentence-similarity computation for code-related text.
Tags: Text Embedding, English
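
The typical use is scoring similarity between code snippets. A minimal sketch against the original (non-GGUF) jinaai/jina-embeddings-v2-base-code checkpoint via sentence-transformers, which requires trust_remote_code; the GGUF files in this repo serve the same model to llama.cpp-based stacks:

```python
# Minimal sketch: code-to-code similarity with jina-embeddings-v2-base-code.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)

snippets = [
    "def add(a, b): return a + b",           # Python
    "function sum(x, y) { return x + y; }",  # JavaScript
]
vecs = model.encode(snippets, normalize_embeddings=True)
print("cosine similarity:", float(util.cos_sim(vecs[0], vecs[1])))
```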
**MN Slush** (crestf411) · Downloads: 59 · Likes: 27
Slush is a two-stage trained model that uses a high LoRA dropout rate, focused on enhancing creativity and role-playing ability.
Tags: Large Language Model, Transformers
**Magnum V4 72b FP8 Dynamic** (Infermatic) · License: Apache-2.0 · Downloads: 2,106 · Likes: 2
A 72B-parameter large language model fine-tuned from Qwen2.5-72B-Instruct, using dynamic FP8 quantization to improve inference efficiency and aiming to reproduce the prose quality of Claude 3.
Tags: Large Language Model, Transformers, English
**Allegro** (rhymes-ai) · License: Apache-2.0 · Downloads: 250 · Likes: 257
Allegro is an open-source, high-quality text-to-video generation model capable of producing detailed 6-second videos at 720x1280 resolution and 15 FPS.
Tags: Text-to-Video, English
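
diffusers ships an AllegroPipeline for this model; the sketch below follows its documented shape, though argument names can shift between diffusers versions, so treat it as illustrative rather than canonical:

```python
# Minimal sketch: text-to-video with Allegro via diffusers (needs a large GPU).
import torch
from diffusers import AllegroPipeline
from diffusers.utils import export_to_video

pipe = AllegroPipeline.from_pretrained("rhymes-ai/Allegro", torch_dtype=torch.bfloat16)
pipe.to("cuda")
pipe.enable_vae_tiling()  # the VAE is memory-heavy; tiling keeps it on one GPU

prompt = "A sailboat gliding across a calm bay at sunset."
frames = pipe(prompt, num_inference_steps=50).frames[0]
export_to_video(frames, "allegro_sample.mp4", fps=15)  # 15 FPS per the model card
```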
**Mistral Nemo BD RP** (yuyouyu) · License: Apache-2.0 · Downloads: 36 · Likes: 7
A large language model fine-tuned on the BeyondDialogue dataset, designed for Chinese and English role-play scenarios.
Tags: Large Language Model, Safetensors, Supports Multiple Languages
**Internvideo2 Chat 8B InternLM2 5** (OpenGVLab) · License: MIT · Downloads: 60 · Likes: 7
InternVideo2-Chat-8B-InternLM2.5 is a video-text multimodal model that improves video understanding and human-computer interaction by pairing the InternVideo2 video encoder with a large language model (LLM).
Tags: Video-to-Text, Safetensors
**Deepseek V2 Lite** (ZZichen) · Downloads: 20 · Likes: 1
DeepSeek-V2-Lite is a cost-efficient Mixture of Experts (MoE) language model with 16B total parameters and 2.4B active parameters, supporting a 32k context length.
Tags: Large Language Model, Transformers
**Llama3 German 8B 32k** (DiscoResearch) · Downloads: 91 · Likes: 13
A German-optimized large language model based on Meta's Llama3-8B, continually pre-trained on 65 billion German tokens and supporting a 32k context.
Tags: Large Language Model, Transformers, German
**Erosumika 7B V3 7.1bpw Exl2** (Natkituwu) · Downloads: 24 · Likes: 1
A 7.1bpw exl2 quantization of Erosumika-7B-v3, suitable for running 16k context on GPUs with 8GB of VRAM. The base model was created by merging multiple models with the DARE TIES method and is intended mainly for entertainment-oriented fiction writing.
Tags: Large Language Model, Transformers, English
**Meltemi 7B V1** (ilsp) · License: Apache-2.0 · Downloads: 49 · Likes: 51
The first large-scale Greek foundation language model, based on the Mistral-7B architecture and extended with 40 billion tokens of Greek and English text to strengthen its Greek capabilities.
Tags: Large Language Model, Transformers, Supports Multiple Languages
**Midnight Miqu 70B V1.5 GPTQ32G** (Kotokin) · License: Other · Downloads: 175 · Likes: 4
A 70B-parameter large language model merged with the DARE linear method, optimized for role-play and story writing.
Tags: Large Language Model, Transformers
**Codellama 70b Instruct Hf** (meta-llama) · Downloads: 505 · Likes: 18
Code Llama is Meta's series of code generation and understanding models, ranging from 7 billion to 70 billion parameters; this is the 70-billion-parameter instruction-tuned version.
Tags: Large Language Model, Transformers, Other
**Midnight Miqu 70B V1.5** (sophosympatheia) · License: Other · Downloads: 734 · Likes: 199
Midnight-Miqu-70B-v1.5 is a 70B-parameter large language model designed for role-play and story writing, merged from models by sophosympatheia and migtissera.
Tags: Large Language Model, Transformers
**Lemonaderp 4.5.3 GGUF** (KatyTheCutie) · Downloads: 238 · Likes: 28
A 7B-parameter role-play-focused large language model with an 8192-token context length, emphasizing creativity and reduced clichés.
Tags: Large Language Model, English
**Codeninja 1.0 OpenChat 7B** (beowolx) · License: MIT · Downloads: 2,998 · Likes: 105
CodeNinja is an enhanced version of openchat/openchat-3.5-1210, supervised fine-tuned on two large datasets containing more than 400,000 coding instructions.
Tags: Large Language Model, Transformers
**Tinymistral 248M** (Locutusque) · License: Apache-2.0 · Downloads: 1,127 · Likes: 46
A language model scaled down from Mistral 7B to 248 million parameters, designed for text generation tasks and suitable for downstream fine-tuning.
Tags: Large Language Model, Transformers, English
**Codes 7b** (seeklhy) · License: Apache-2.0 · Downloads: 409 · Likes: 8
CodeS-7B is a large language model optimized for SQL generation, incrementally pre-trained from StarCoderBase-7B and supporting a maximum context length of 8,192 tokens.
Tags: Large Language Model, Transformers, Other
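
Text-to-SQL models of this kind are usually prompted with a schema followed by a question. The prompt layout below is a generic illustration, not the exact template from the CodeS paper:

```python
# Minimal sketch: text-to-SQL generation with CodeS-7B via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("seeklhy/codes-7b")
model = AutoModelForCausalLM.from_pretrained("seeklhy/codes-7b", device_map="auto")

prompt = (
    "database schema:\n"
    "create table orders (id int, customer_id int, total decimal, created_at date);\n"
    "question: total revenue per customer in 2024\n"
    "sql:"
)
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```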
**Guanaco 7b Leh V2** (KBlueLeaf) · License: GPL-3.0 · Downloads: 474 · Likes: 37
A multilingual instruction-following language model based on LLaMA 7B, supporting English, Chinese, and Japanese, suitable for chatbots and instruction-following tasks.
Tags: Large Language Model, Transformers, Supports Multiple Languages