
Access Global AI Models - Power Next-Gen Apps
From General to Specialized AI - All Models in One Platform
9,994 models match the criteria
Nsfw Image Detection | Falconsai | Apache-2.0 | Image Classification, Transformers | 82.4M downloads, 588 likes
An NSFW image classification model based on the ViT architecture, pre-trained on ImageNet-21k via supervised learning and fine-tuned on 80,000 images to distinguish between normal and NSFW content.

Fairface Age Image Detection | dima806 | Apache-2.0 | Image Classification, Transformers | 76.6M downloads, 10 likes
An image classification model based on the Vision Transformer architecture, pre-trained on the ImageNet-21k dataset and suitable for multi-category image classification tasks.
Clip Vit Large Patch14 | openai | Image-to-Text | 44.7M downloads, 1,710 likes
CLIP is a vision-language model developed by OpenAI that maps images and text into a shared embedding space through contrastive learning, supporting zero-shot image classification.
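The shared embedding space described above is what makes zero-shot classification possible: the image embedding is compared against one text embedding per candidate label, and the closest match wins. A minimal sketch with made-up vectors standing in for real CLIP encoder outputs:

```python
import numpy as np

def zero_shot_scores(image_emb, text_embs):
    """Cosine similarity between one image embedding and several
    text (label) embeddings, softmax-normalized - the core of
    CLIP-style zero-shot classification."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = txt @ img                     # cosine similarities
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Toy embeddings, NOT real CLIP outputs - purely illustrative.
image = np.array([0.9, 0.1, 0.2])
labels = np.array([[1.0, 0.0, 0.1],       # e.g. "a photo of a cat"
                   [0.0, 1.0, 0.0]])      # e.g. "a photo of a dog"
probs = zero_shot_scores(image, labels)
print(probs)  # probability per candidate label
```

The real model produces the embeddings with separate image and text encoders trained jointly; only the comparison step is shown here.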
Chronos T5 Small | amazon | Apache-2.0 | Climate Model, Transformers | 22.8M downloads, 66 likes
Chronos is a family of pre-trained time series forecasting models based on language model architectures. It converts time series into token sequences through quantization and scaling, and is suited to probabilistic forecasting tasks.
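The quantization-and-scaling step mentioned above can be sketched in a few lines: mean-scale the series, then map each value into one of a fixed number of uniform bins to get integer token IDs. This is a simplified illustration, not the exact Chronos implementation (the real tokenizer's bin layout, special tokens, and scaling details differ):

```python
import numpy as np

def tokenize_series(values, n_bins=10, low=-3.0, high=3.0):
    """Sketch of Chronos-style time-series tokenization:
    mean-scale the series, then uniformly quantize into bin IDs.
    Simplified for illustration only."""
    values = np.asarray(values, dtype=float)
    scale = np.abs(values).mean() or 1.0      # mean scaling
    scaled = values / scale
    # Map each scaled value to a bin index in [0, n_bins - 1].
    bins = np.linspace(low, high, n_bins + 1)
    tokens = np.clip(np.digitize(scaled, bins) - 1, 0, n_bins - 1)
    return tokens, scale

tokens, scale = tokenize_series([10.0, 12.0, 9.0, 30.0])
print(tokens, scale)
```

Once the series is a token sequence, a standard language-model architecture can be trained on it, and sampling multiple continuations yields a probabilistic forecast.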
Roberta Large | FacebookAI | MIT | Large Language Model, English | 19.4M downloads, 212 likes
A large English language model pre-trained with the masked language modeling objective, using improved BERT training methods.
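The masked language modeling objective used by this and the other BERT-style entries corrupts the input by hiding a fraction of tokens and training the model to recover them. A minimal sketch of the input-corruption step, assuming the conventional ~15% mask ratio and a `[MASK]` symbol (the whitespace tokenizer here is purely illustrative):

```python
import random

def mask_tokens(tokens, mask_ratio=0.15, mask_token="[MASK]", seed=1):
    """Sketch of masked-LM input corruption: randomly replace ~15%
    of tokens with a mask symbol and remember the originals as
    prediction targets. Real tokenizers work on subwords."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_ratio:
            masked.append(mask_token)
            targets[i] = tok          # the model must predict this
        else:
            masked.append(tok)
    return masked, targets

tokens = "the quick brown fox jumps over the lazy dog".split()
masked, targets = mask_tokens(tokens)
print(masked, targets)
```

Training then minimizes cross-entropy between the model's prediction at each masked position and the hidden original token.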
Distilbert Base Uncased | distilbert | Apache-2.0 | Large Language Model, English | 11.1M downloads, 669 likes
DistilBERT is a distilled version of the BERT base model, maintaining similar performance while being more lightweight and efficient, suitable for natural language processing tasks such as sequence classification and token classification.
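Knowledge distillation, the technique behind DistilBERT and DistilGPT2 in this list, trains the small student model to match the teacher's softened output distribution. A numpy sketch of the soft-target loss term (the temperature value is illustrative, and real training combines this with the ordinary hard-label loss):

```python
import numpy as np

def softmax(x, T=1.0):
    """Temperature-scaled softmax over a logit vector."""
    z = np.exp((x - x.max()) / T)
    return z / z.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Sketch of the soft-target part of knowledge distillation:
    cross-entropy between the teacher's and the student's
    temperature-softened distributions."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    return -np.sum(p_teacher * log_p_student)

teacher = np.array([3.0, 1.0, 0.2])   # made-up teacher logits
student = np.array([2.5, 1.2, 0.3])   # made-up student logits
print(distillation_loss(student, teacher))
```

The loss is minimized exactly when the student reproduces the teacher's softened distribution, which is why distilled models retain most of the teacher's behavior at a fraction of the size.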
Clipseg Rd64 Refined | CIDAS | Apache-2.0 | Image Segmentation, Transformers | 10.0M downloads, 122 likes
CLIPSeg is an image segmentation model driven by text and image prompts, supporting zero-shot and one-shot image segmentation tasks.

Xlm Roberta Base | FacebookAI | MIT | Large Language Model, Multilingual | 9.6M downloads, 664 likes
XLM-RoBERTa is a multilingual model pre-trained on 2.5TB of filtered CommonCrawl data across 100 languages, using masked language modeling as the training objective.

Roberta Base | FacebookAI | MIT | Large Language Model, English | 9.3M downloads, 488 likes
An English pre-trained model based on the Transformer architecture, trained on massive text with the masked language modeling objective, supporting text feature extraction and downstream fine-tuning.

Vit Face Expression | trpakov | Apache-2.0 | Face-related, Transformers | 9.2M downloads, 66 likes
A facial emotion recognition model fine-tuned from a Vision Transformer (ViT), supporting 7 expression classes.
Chronos Bolt Small | autogluon | Apache-2.0 | Climate Model | 6.2M downloads, 13 likes
Chronos-Bolt is a series of pre-trained time series foundation models based on the T5 architecture, achieving efficient forecasting through chunked encoding and direct multi-step prediction.

1 | unslothai | Large Language Model, Transformers | 6.2M downloads, 1 like
A pretrained model based on the transformers library, suitable for various NLP tasks.
Siglip So400m Patch14 384 | google | Apache-2.0 | Image-to-Text, Transformers | 6.1M downloads, 526 likes
SigLIP is a vision-language model pre-trained on the WebLI dataset, employing an improved sigmoid loss function to optimize image-text matching.
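The improved sigmoid loss noted above differs from CLIP's softmax contrastive loss by scoring every image-text pair independently as a binary match/non-match decision, so no batch-wide normalization is needed. A minimal sketch over a made-up similarity matrix (the real loss also learns a temperature and bias, omitted here):

```python
import numpy as np

def sigmoid_pairwise_loss(logits):
    """Sketch of a SigLIP-style sigmoid loss: each entry of the
    image-text similarity matrix is an independent binary
    classification, with matching pairs on the diagonal."""
    n = logits.shape[0]
    labels = 2 * np.eye(n) - 1            # +1 on diagonal, -1 elsewhere
    # -log sigmoid(label * logit) == log(1 + exp(-label * logit))
    return np.mean(np.log1p(np.exp(-labels * logits)))

# Made-up image-text similarity logits for a batch of 2 pairs.
logits = np.array([[ 4.0, -2.0],
                   [-3.0,  5.0]])
print(sigmoid_pairwise_loss(logits))
```

Because each pair is scored on its own, the loss decomposes over the batch, which is what lets SigLIP scale training without computing a full softmax over all pairs.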
Llama 3.1 8B Instruct | meta-llama | Large Language Model, Transformers, Multilingual | 5.7M downloads, 3,898 likes
Llama 3.1 is Meta's multilingual large language model series, available at 8B, 70B, and 405B parameter scales, supporting 8 languages and code generation, and optimized for multilingual dialogue.

T5 Base | google-t5 | Apache-2.0 | Large Language Model, Multilingual | 5.4M downloads, 702 likes
T5 Base is a text-to-text Transformer model developed by Google with 220 million parameters, supporting multilingual NLP tasks.

Xlm Roberta Large | FacebookAI | MIT | Large Language Model, Multilingual | 5.3M downloads, 431 likes
XLM-RoBERTa is a multilingual model pre-trained on 2.5TB of filtered CommonCrawl data across 100 languages, trained with a masked language modeling objective.

Distilbert Base Uncased Finetuned Sst 2 English | distilbert | Apache-2.0 | Text Classification, English | 5.2M downloads, 746 likes
A text classification model fine-tuned from distilbert-base-uncased on the SST-2 sentiment analysis dataset, reaching 91.3% accuracy.

Dinov2 Small | facebook | Apache-2.0 | Image Classification, Transformers | 5.0M downloads, 31 likes
A small-scale vision Transformer trained with the DINOv2 method, extracting image features through self-supervised learning.

Vit Base Patch16 224 | google | Apache-2.0 | Image Classification | 4.8M downloads, 775 likes
A Vision Transformer model pre-trained on ImageNet-21k and fine-tuned on ImageNet for image classification tasks.

Chronos Bolt Base | autogluon | Apache-2.0 | Climate Model | 4.7M downloads, 22 likes
Chronos-Bolt is a series of pre-trained time series forecasting models that support zero-shot prediction with high accuracy and fast inference.

Whisper Large V3 | openai | Apache-2.0 | Speech Recognition, Multilingual | 4.6M downloads, 4,321 likes
Whisper is an advanced automatic speech recognition (ASR) and speech translation model from OpenAI, trained on over 5 million hours of labeled data, with strong cross-dataset and cross-domain generalization.
Whisper Large V3 Turbo | openai | MIT | Speech Recognition, Transformers, Multilingual | 4.0M downloads, 2,317 likes
Whisper is a state-of-the-art automatic speech recognition (ASR) and speech translation model developed by OpenAI, trained on over 5 million hours of labeled data and demonstrating strong generalization in zero-shot settings.

Bart Large Cnn | facebook | MIT | Text Generation, English | 3.8M downloads, 1,364 likes
A BART model pre-trained on an English corpus and fine-tuned on the CNN/Daily Mail dataset, suitable for text summarization tasks.

Fashion Clip | patrickjohncyh | MIT | Text-to-Image, Transformers, English | 3.8M downloads, 222 likes
FashionCLIP is a vision-language model fine-tuned from CLIP for the fashion domain, capable of generating general-purpose product representations.

Jina Embeddings V3 | jinaai | Text Embedding, Transformers, Multilingual | 3.7M downloads, 911 likes
Jina Embeddings V3 is a multilingual sentence embedding model supporting over 100 languages, specializing in sentence similarity and feature extraction tasks.

Stable Diffusion V1 5 | stable-diffusion-v1-5 | Openrail | Image Generation | 3.7M downloads, 518 likes
Stable Diffusion is a latent text-to-image diffusion model capable of generating realistic images from any text input.

Bart Large Mnli | facebook | MIT | Large Language Model | 3.7M downloads, 1,364 likes
A zero-shot classification model based on the BART-large architecture, fine-tuned on the MultiNLI dataset.
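Zero-shot classification with an NLI model like this one works by recasting each candidate label as an entailment hypothesis: the input text is the premise, and the label the model most strongly entails wins. A sketch of the recasting step (the template string is the commonly used convention; the actual entailment scoring requires the model and is not shown):

```python
def build_nli_pairs(text, labels, template="This example is {}."):
    """Recast zero-shot classification as NLI: the input text is
    the premise, and each candidate label becomes a hypothesis.
    An NLI model's entailment score then ranks the labels."""
    return [(text, template.format(label)) for label in labels]

pairs = build_nli_pairs("one day I will see the world",
                        ["travel", "cooking"])
print(pairs)
```

Each (premise, hypothesis) pair is fed to the NLI model separately, and the label whose hypothesis gets the highest entailment probability is returned as the prediction.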
T5 Small | google-t5 | Apache-2.0 | Large Language Model, Multilingual | 3.7M downloads, 450 likes
T5-Small is a 60-million-parameter model developed by Google, using a unified text-to-text framework to handle various NLP tasks.
Flan T5 Base | google | Apache-2.0 | Large Language Model, Multilingual | 3.3M downloads, 862 likes
FLAN-T5 is an instruction-fine-tuned version of T5, supporting multilingual task processing and outperforming the original T5 at the same parameter count.

Albert Base V2 | albert | Apache-2.0 | Large Language Model, English | 3.1M downloads, 121 likes
ALBERT is a lightweight pre-trained language model based on the Transformer architecture that reduces memory usage through parameter sharing, suitable for English text processing tasks.

Distilbert Base Multilingual Cased | distilbert | Apache-2.0 | Large Language Model, Transformers, Multilingual | 2.8M downloads, 187 likes
DistilBERT is a distilled version of the multilingual BERT base model, retaining 97% of BERT's performance with fewer parameters and faster inference. It supports 104 languages and suits a variety of natural language processing tasks.

Distilgpt2 | distilbert | Apache-2.0 | Large Language Model, English | 2.7M downloads, 527 likes
DistilGPT2 is a lightweight distilled version of GPT-2 with 82 million parameters, retaining GPT-2's core text generation capabilities while being smaller and faster.

Xlm Roberta Base Language Detection | papluca | MIT | Text Classification, Transformers, Multilingual | 2.7M downloads, 333 likes
A language detection model based on XLM-RoBERTa, supporting text classification across 20 languages.

Table Transformer Detection | microsoft | MIT | Object Detection, Transformers | 2.6M downloads, 349 likes
A table detection model based on the DETR architecture, designed for extracting tables from unstructured documents.

Blip Image Captioning Large | Salesforce | Bsd-3-clause | Image-to-Text, Transformers | 2.5M downloads, 1,312 likes
BLIP is a unified vision-language pre-training framework that excels at image captioning, supporting both conditional and unconditional caption generation.
Ms Marco MiniLM L6 V2 | cross-encoder | Apache-2.0 | Text Embedding, English | 2.5M downloads, 86 likes
A cross-encoder model trained on the MS MARCO passage ranking task for query-passage relevance scoring in information retrieval.

Mms 300m 1130 Forced Aligner | MahmoudAshraf | Speech Recognition, Transformers, Multilingual | 2.5M downloads, 50 likes
A text-to-audio forced alignment tool based on a Hugging Face pre-trained model, supporting multiple languages with high memory efficiency.

Llama 3.2 1B Instruct | meta-llama | Large Language Model, Transformers, Multilingual | 2.4M downloads, 901 likes
Llama 3.2 is a multilingual large language model series developed by Meta, including 1B and 3B pre-trained and instruction-tuned generative models, optimized for multilingual dialogue, retrieval, and summarization tasks.

Stable Diffusion Xl Base 1.0 | stabilityai | Image Generation | 2.4M downloads, 6,545 likes
SDXL 1.0 is a diffusion-based text-to-image generation model that employs an ensemble-of-experts latent diffusion pipeline, supporting high-resolution image generation.

Qwen2.5 0.5B Instruct | Gensyn | Apache-2.0 | Large Language Model, Transformers, English | 2.4M downloads, 5 likes
A 0.5B-parameter instruction-tuned model prepared for the Gensyn reinforcement learning swarm, supporting local fine-tuning.

Vit Base Patch16 224 In21k | google | Apache-2.0 | Image Classification | 2.2M downloads, 323 likes
A Vision Transformer model pre-trained on the ImageNet-21k dataset for image classification tasks.

Indonesian Roberta Base Posp Tagger | w11wo | MIT | Sequence Labeling, Transformers, Other | 2.2M downloads, 7 likes
A POS tagging model fine-tuned from an Indonesian RoBERTa model on the IndoNLU dataset, for Indonesian part-of-speech tagging tasks.