The Best 6603 Large Language Model Tools in 2025

Phi 2 GGUF — TheBloke · License: Other · 41.5M downloads · 205 likes
Phi-2 is a small yet powerful language model developed by Microsoft. With 2.7 billion parameters, it focuses on efficient inference and high-quality text generation.
Tags: Large Language Model · Supports Multiple Languages

Roberta Large — FacebookAI · License: MIT · 19.4M downloads · 212 likes
A large English language model pretrained with a masked language modeling objective, using an improved BERT training recipe.
Tags: Large Language Model · English

Distilbert Base Uncased — distilbert · License: Apache-2.0 · 11.1M downloads · 669 likes
DistilBERT is a distilled version of the BERT base model, maintaining similar performance while being more lightweight and efficient, suitable for natural language processing tasks such as sequence classification and token classification.
Tags: Large Language Model · English

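Several entries in this list (DistilBERT, DistilGPT2, DistilRoBERTa) are knowledge-distilled models: the student is trained to match the teacher's temperature-softened output distribution. A minimal sketch of that distillation loss, with purely illustrative logits (not taken from any of these models):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax (subtract max for numerical stability)."""
    z = logits / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in the classic distillation formulation."""
    p = softmax(teacher_logits, T)  # soft teacher targets
    q = softmax(student_logits, T)  # student predictions
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

# Hypothetical logits for one training example:
teacher = np.array([3.0, 1.0, 0.2])
student = np.array([2.5, 1.2, 0.1])
loss = distillation_loss(student, teacher)  # > 0 while distributions differ
```

In practice this soft-target term is combined with the ordinary hard-label loss; the temperature T > 1 exposes the teacher's relative preferences among wrong classes.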
Llama 3.1 8B Instruct GGUF — modularai · 9.7M downloads · 4 likes
Meta Llama 3.1 8B Instruct is a multilingual large language model optimized for multilingual dialogue use cases, excelling on common industry benchmarks.
Tags: Large Language Model · English

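The GGUF builds in this list exist because quantization shrinks weight storage roughly in proportion to bits per weight. A back-of-the-envelope estimate (weights only; ignores KV cache, activations, and file metadata; the 4.5-bit figure stands in for a typical mid-range quant):

```python
def approx_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes."""
    return n_params * bits_per_weight / 8 / 2**30

n = 8.0e9  # Llama 3.1 8B parameter count
fp16 = approx_weight_gib(n, 16)  # roughly 15 GiB at 16-bit
q4 = approx_weight_gib(n, 4.5)   # roughly 4 GiB at ~4.5 bits/weight
```

This is why an 8B model that needs a data-center GPU at fp16 fits comfortably in consumer RAM once quantized.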
Xlm Roberta Base — FacebookAI · License: MIT · 9.6M downloads · 664 likes
XLM-RoBERTa is a multilingual model pretrained on 2.5TB of filtered CommonCrawl data across 100 languages, using masked language modeling as the training objective.
Tags: Large Language Model · Supports Multiple Languages

Roberta Base — FacebookAI · License: MIT · 9.3M downloads · 488 likes
An English pretrained model based on the Transformer architecture, trained on massive amounts of text with a masked language modeling objective; supports text feature extraction and downstream fine-tuning.
Tags: Large Language Model · English

Opt 125m — facebook · License: Other · 6.3M downloads · 198 likes
OPT is an open pre-trained Transformer language model suite released by Meta AI, with parameter sizes ranging from 125 million to 175 billion, designed to match the performance of the GPT-3 series while promoting open research in large-scale language models.
Tags: Large Language Model · English

1 — unslothai · 6.2M downloads · 1 like
A pretrained model based on the transformers library, suitable for various NLP tasks.
Tags: Large Language Model · Transformers

Llama 3.1 8B Instruct — meta-llama · 5.7M downloads · 3,898 likes
Llama 3.1 is Meta's multilingual large language model series, available at 8B, 70B, and 405B parameter scales, supporting 8 languages and code generation, optimized for multilingual dialogue scenarios.
Tags: Large Language Model · Transformers · Supports Multiple Languages

T5 Base — google-t5 · License: Apache-2.0 · 5.4M downloads · 702 likes
T5-Base is a text-to-text Transformer model developed by Google with 220 million parameters, supporting multilingual NLP tasks.
Tags: Large Language Model · Supports Multiple Languages

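T5's "unified text-to-text framework" means every task is posed as string-in, string-out, selected by a task prefix. A sketch of the input construction (the prefixes shown are the documented T5 conventions; no model is loaded here):

```python
def t5_input(task_prefix: str, text: str) -> str:
    """Build a T5-style text-to-text input: '<prefix>: <text>'."""
    return f"{task_prefix}: {text}"

# Translation and summarization use the same model, switched by prefix:
translation = t5_input("translate English to German", "The house is wonderful.")
summary = t5_input("summarize", "Authorities dispatched emergency crews after the storm.")
```

Because the output is also plain text, classification, regression (as digit strings), translation, and summarization all share one decoder and one loss.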
Xlm Roberta Large — FacebookAI · License: MIT · 5.3M downloads · 431 likes
XLM-RoBERTa is a multilingual model pretrained on 2.5TB of filtered CommonCrawl data across 100 languages, trained with a masked language modeling objective.
Tags: Large Language Model · Supports Multiple Languages

Bart Large Mnli — facebook · License: MIT · 3.7M downloads · 1,364 likes
A zero-shot classification model based on the BART-large architecture, fine-tuned on the MultiNLI dataset.
Tags: Large Language Model

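An NLI model like this one does zero-shot classification by treating the input as a premise and each candidate label as a hypothesis ("This example is {label}."), then comparing entailment scores across labels. A sketch of that scoring step, with hypothetical entailment logits standing in for actual model output:

```python
import numpy as np

def hypothesis(label: str) -> str:
    # Common zero-shot hypothesis template (illustrative).
    return f"This example is {label}."

def zero_shot_probs(entailment_logits: dict) -> dict:
    """Softmax over per-label entailment logits (single-label case)."""
    labels = list(entailment_logits)
    z = np.array([entailment_logits[l] for l in labels])
    z = z - z.max()  # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return dict(zip(labels, p))

# Hypothetical logits for premise "one day I will see the world":
logits = {"travel": 4.2, "cooking": -1.3, "dancing": -0.8}
probs = zero_shot_probs(logits)  # "travel" receives most of the mass
```

The appeal is that the label set is chosen at inference time, so no task-specific fine-tuning is needed.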
T5 Small — google-t5 · License: Apache-2.0 · 3.7M downloads · 450 likes
T5-Small is a 60-million-parameter text-to-text Transformer model developed by Google, using a unified text-to-text framework to handle various NLP tasks.
Tags: Large Language Model · Supports Multiple Languages

Flan T5 Base — google · License: Apache-2.0 · 3.3M downloads · 862 likes
FLAN-T5 is a language model optimized through instruction fine-tuning of the T5 model, supporting multilingual task processing and outperforming the original T5 at the same parameter count.
Tags: Large Language Model · Supports Multiple Languages

Albert Base V2 — albert · License: Apache-2.0 · 3.1M downloads · 121 likes
ALBERT is a lightweight pretrained language model based on the Transformer architecture that reduces memory usage through a parameter-sharing mechanism, suitable for English text-processing tasks.
Tags: Large Language Model · English

Distilbert Base Multilingual Cased — distilbert · License: Apache-2.0 · 2.8M downloads · 187 likes
DistilBERT is a distilled version of the BERT base multilingual model, retaining 97% of BERT's performance with fewer parameters and faster speed. It supports 104 languages and is suitable for various natural language processing tasks.
Tags: Large Language Model · Transformers · Supports Multiple Languages

Distilgpt2 — distilbert · License: Apache-2.0 · 2.7M downloads · 527 likes
DistilGPT2 is a lightweight distilled version of GPT-2 with 82 million parameters, retaining GPT-2's core text generation capabilities while being smaller and faster.
Tags: Large Language Model · English

BLEURT 20 D12 — lucadiliello · 2.6M downloads · 1 like
A PyTorch implementation of the BLEURT model, used for text-evaluation tasks in natural language processing.
Tags: Large Language Model · Transformers

Llama 3.2 1B Instruct — meta-llama · 2.4M downloads · 901 likes
Llama 3.2 is a multilingual large language model series developed by Meta, including 1B and 3B pre-trained and instruction-tuned generative models, optimized for multilingual dialogue scenarios and supporting agentic retrieval and summarization tasks.
Tags: Large Language Model · Transformers · Supports Multiple Languages

Qwen2.5 0.5B Instruct — Gensyn · License: Apache-2.0 · 2.4M downloads · 5 likes
A 0.5B-parameter instruction-tuned model designed for the Gensyn RL Swarm, supporting local fine-tuning.
Tags: Large Language Model · Transformers · English

Qwen2.5 1.5B Instruct — Gensyn · License: Apache-2.0 · 2.1M downloads · 4 likes
A 1.5B-parameter instruction-tuned model designed for the Gensyn RL Swarm, supporting local fine-tuning via peer-to-peer reinforcement learning.
Tags: Large Language Model · Transformers · English

Llama 3.2 1B — meta-llama · 2.1M downloads · 1,866 likes
Llama 3.2 is a multilingual large language model series launched by Meta, including 1B and 3B parameter pre-trained and instruction-tuned generative models, optimized for multilingual dialogue scenarios and supporting agentic retrieval and summarization tasks.
Tags: Large Language Model · Transformers · Supports Multiple Languages

Bart Base — facebook · License: Apache-2.0 · 2.1M downloads · 183 likes
BART is a Transformer model combining a bidirectional encoder and an autoregressive decoder, suitable for text generation and understanding tasks.
Tags: Large Language Model · English

Bio ClinicalBERT — emilyalsentzer · License: MIT · 2.0M downloads · 334 likes
Bio+Clinical BERT is a clinical BERT model initialized from BioBERT and trained on all notes from MIMIC-III, suitable for biomedical and clinical text processing.
Tags: Large Language Model · English

Deepseek R1 GGUF — unsloth · License: MIT · 2.0M downloads · 1,045 likes
DeepSeek-R1 in Unsloth's dynamic quantization down to 1.58 bits; an MoE-architecture large language model suited to English-language tasks.
Tags: Large Language Model · English

Biomednlp BiomedBERT Base Uncased Abstract Fulltext — microsoft · License: MIT · 1.7M downloads · 240 likes
BiomedBERT is a biomedical domain-specific language model pretrained on PubMed abstracts and PubMedCentral full-text articles, achieving state-of-the-art performance on multiple biomedical NLP tasks.
Tags: Large Language Model · English

Deepseek R1 — deepseek-ai · License: MIT · 1.7M downloads · 12.03k likes
DeepSeek-R1 is DeepSeek's first-generation reasoning model. Trained with large-scale reinforcement learning, it performs strongly on mathematics, code, and reasoning tasks.
Tags: Large Language Model · Transformers

Codebert Python — neulab · 1.7M downloads · 25 likes
A masked language model trained on Python code, based on microsoft/codebert-base-mlm, primarily used for code evaluation and generation tasks.
Tags: Large Language Model · Transformers

Camembert Base — almanach · License: MIT · 1.7M downloads · 87 likes
A cutting-edge French language model based on RoBERTa, available in 6 different versions.
Tags: Large Language Model · Transformers · French

Firefunction V2 GGUF — MaziyarPanahi · 1.6M downloads · 18 likes
FireFunction V2 is a state-of-the-art function calling model developed by Fireworks AI with a commercially viable license. It is trained on Llama 3 and supports parallel function calls with strong instruction-following capabilities.
Tags: Large Language Model

Deberta V3 Base — microsoft · License: MIT · 1.6M downloads · 316 likes
DeBERTaV3 improves on DeBERTa by adopting ELECTRA-style pretraining with gradient-disentangled embedding sharing, and excels at natural language understanding tasks.
Tags: Large Language Model · English

Llama 3.2 3B Instruct — meta-llama · 1.6M downloads · 1,391 likes
Llama 3.2 is a multilingual large language model series developed by Meta, including 1B and 3B pre-trained and instruction-tuned generative models, optimized for multilingual dialogue scenarios.
Tags: Large Language Model · Transformers · Supports Multiple Languages

Finbert — ProsusAI · 1.6M downloads · 864 likes
FinBERT is a pre-trained natural language processing model specifically designed for financial text sentiment analysis.
Tags: Large Language Model · English

Openelm 1 1B Instruct — apple · 1.5M downloads · 62 likes
OpenELM is a family of open-source efficient language models that use a layer-wise scaling strategy to allocate parameters across the Transformer's layers, improving model accuracy.
Tags: Large Language Model · Transformers

Qwen2 7B Instruct GGUF — MaziyarPanahi · 1.5M downloads · 11 likes
The GGUF-quantized version of Qwen2-7B-Instruct, suitable for local deployment and inference.
Tags: Large Language Model

Byt5 Small — google · License: Apache-2.0 · 1.4M downloads · 69 likes
ByT5 is a tokenizer-free version of Google's T5 that directly processes raw UTF-8 bytes, supporting multilingual text processing with excellent performance on noisy data.
Tags: Large Language Model · Supports Multiple Languages

Deberta Large Mnli — microsoft · License: MIT · 1.4M downloads · 18 likes
DeBERTa is an improved BERT-style model built on a disentangled attention mechanism and an enhanced mask decoder; this large variant is fine-tuned on MNLI and excels at natural language understanding tasks.
Tags: Large Language Model · Transformers · English

Tinyllama 1.1B Chat V1.0 — TinyLlama · License: Apache-2.0 · 1.4M downloads · 1,237 likes
TinyLlama is a lightweight 1.1B-parameter Llama model pre-trained on 3 trillion tokens and fine-tuned for dialogue and alignment, suitable for resource-constrained scenarios.
Tags: Large Language Model · Transformers · English

Bartpho Syllable Base — vinai · License: MIT · 1.3M downloads · 1 like
BARTpho is a pre-trained sequence-to-sequence model for Vietnamese, based on the BART architecture and specifically optimized for the Vietnamese language.
Tags: Large Language Model · Transformers

Stablebeluga2 — petals-team · 1.3M downloads · 19 likes
A large language model fine-tuned from Llama 2 70B on Orca-style datasets; excels at following complex instructions.
Tags: Large Language Model · Transformers · English

Roberta Base — klue · 1.2M downloads · 33 likes
A RoBERTa model pretrained on Korean, suitable for various Korean natural language processing tasks.
Tags: Large Language Model · Transformers · Korean

Distilroberta Base — distilbert · License: Apache-2.0 · 1.2M downloads · 153 likes
DistilRoBERTa is a distilled version of the RoBERTa-base model with fewer parameters and faster inference, suitable for English text-processing tasks.
Tags: Large Language Model · English

AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025 AIbase