# Fast inference
- **Qwen3 0.6B 8bit** (Apache-2.0, mlx-community). 2,625 downloads, 3 likes. Tags: Large Language Model.
  An 8-bit quantized version of Qwen/Qwen3-0.6B, a text-generation model converted for the MLX framework.
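An 8-bit quantized checkpoint like this one stores each weight as an int8 plus a shared scale instead of a 16- or 32-bit float. A minimal plain-Python sketch of the idea (symmetric per-tensor quantization; illustrative only, not necessarily MLX's actual scheme, which may quantize per-group with zero points):

```python
# Symmetric 8-bit quantization: map floats to int8 via one scale, then
# dequantize back. Round-trip error is bounded by half a scale step.

def quantize_8bit(weights):
    """Quantize a list of floats to int8 values plus a per-tensor scale.

    Assumes at least one nonzero weight (otherwise the scale is zero).
    """
    scale = max(abs(w) for w in weights) / 127  # largest magnitude maps to 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_8bit(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.5, 0.9994]
q, scale = quantize_8bit(weights)
approx = dequantize_8bit(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, approx))
assert all(-128 <= v <= 127 for v in q)
assert max_err <= scale / 2  # rounding error stays within half a step
```

Real quantizers typically work per-group and also store zero points, but the core trade-off is the one shown here: 4x less storage than float32 at the cost of a small, bounded rounding error.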
- **Rubert Mini Frida** (MIT, sergeyzh). 1,203 downloads, 9 likes. Tags: Text Embedding, Transformers, multilingual.
  A lightweight, fast variant of the FRIDA model for computing embedding vectors of Russian and English sentences.
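Embedding models such as this one map sentences to vectors that are then compared by cosine similarity. A toy sketch with hand-made three-dimensional vectors standing in for real sentence embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-ins for embeddings of two paraphrases and one unrelated sentence.
cat_ru = [0.9, 0.1, 0.2]    # "кошка сидит" (a cat sits)
cat_en = [0.8, 0.2, 0.25]   # "a cat is sitting"
weather = [0.1, 0.9, -0.3]  # "it will rain tomorrow"

# Paraphrases score higher than unrelated text, across languages.
assert cosine_similarity(cat_ru, cat_en) > cosine_similarity(cat_ru, weather)
```

With a real model the vectors have hundreds of dimensions, but the ranking logic (nearest neighbors by cosine similarity) is exactly this.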
- **Lite Whisper Large V3 Fast** (Apache-2.0, efficient-speech). 25 downloads, 1 like. Tags: Speech Recognition, Transformers.
  Lite-Whisper is a lightweight version of OpenAI Whisper compressed with LiteASR, significantly reducing model size while maintaining high recognition accuracy.
- **Kokoro V1 0** (Apache-2.0, kiriyamaX). 18 downloads, 1 like. Tags: Speech Synthesis, English.
  Kokoro is an open-source text-to-speech model with 82 million parameters; its lightweight architecture achieves sound quality comparable to much larger models while improving generation speed and reducing compute cost.
- **Ai Image Detector Dev Deploy** (haywoodsloan). 59 downloads, 1 like. Tags: Image Classification, TensorBoard.
  An auto-trained image classification model that recognizes multiple common object categories.
- **SD3.5 Large Fp8** (Other license, dyedd). 88 downloads, 2 likes. Tags: Image Generation.
  An FP8-quantized version of Stable Diffusion 3.5 Large for text-to-image generation.
- **Sana 1600M 1024px MultiLing** (Efficient-Large-Model). 111 downloads, 24 likes. Tags: Text-to-Image, multilingual.
  Sana is an efficient text-to-image framework that generates images at resolutions up to 4096×4096 and supports multilingual input.
- **Depthpro Hf** (apple). 13.96k downloads, 52 likes. Tags: 3D Vision, Transformers.
  DepthPro is a foundation model for zero-shot metric monocular depth estimation, producing high-resolution, high-precision depth maps.
- **Midjourney Mini Openvino** (MIT, hsuwill000). 26 downloads, 1 like. Tags: Image Generation, multilingual.
  An OpenVINO-optimized midjourney-mini model for text-to-image generation.
- **Sana 1600M 1024px** (Efficient-Large-Model). 2,327 downloads, 206 likes. Tags: Image Generation, multilingual.
  Sana is an efficient text-to-image framework that generates images up to 4096×4096 and can be deployed on a laptop GPU.
- **SD2.1 Nitro** (Apache-2.0, amd). 117 downloads, 6 likes. Tags: Image Generation.
  An efficient text-to-image model series distilled from mainstream diffusion models on AMD Instinct™ GPUs.
- **Vit Gigantic Patch14 Clip 224.metaclip 2pt5b** (timm). 444 downloads, 0 likes. Tags: Image Classification.
  A vision model trained on the MetaCLIP-2.5B dataset, compatible with both the OpenCLIP and timm frameworks.
- **Vit Huge Patch14 Clip 224.metaclip 2pt5b** (timm). 3,173 downloads, 0 likes. Tags: Image Classification.
  A vision-language model trained on the MetaCLIP-2.5B dataset, supporting zero-shot image classification.
- **Vit Large Patch14 Clip 224.metaclip 2pt5b** (timm). 2,648 downloads, 0 likes. Tags: Image Classification.
  A dual-framework vision model trained on the MetaCLIP-2.5B dataset, supporting zero-shot image classification.
- **Vit Large Patch14 Clip 224.metaclip 400m** (timm). 294 downloads, 0 likes. Tags: Image Classification.
  A Vision Transformer trained on the MetaCLIP-400M dataset, supporting zero-shot image classification.
- **Vit Base Patch16 Clip 224.metaclip 2pt5b** (timm). 889 downloads, 1 like. Tags: Image Classification.
  A vision model trained on the MetaCLIP-2.5B dataset, compatible with both the OpenCLIP and timm frameworks.
- **Vit Base Patch32 Clip 224.metaclip 2pt5b** (timm). 5,571 downloads, 0 likes. Tags: Image Classification.
  A Vision Transformer trained on the MetaCLIP-2.5B dataset, compatible with both the open_clip and timm frameworks.
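The zero-shot classification these CLIP-style models support works by embedding the image and each candidate label's text into a shared space and picking the label whose embedding is most similar to the image's. A schematic sketch with toy vectors in place of the real image and text encoders:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def zero_shot_classify(image_emb, label_embs):
    """Return the label whose text embedding has the highest cosine
    similarity with the image embedding (dot product of unit vectors)."""
    image = normalize(image_emb)
    scores = {
        label: sum(i * t for i, t in zip(image, normalize(emb)))
        for label, emb in label_embs.items()
    }
    return max(scores, key=scores.get)

# Toy embeddings standing in for encoder outputs.
label_embs = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_emb = [0.8, 0.3, 0.1]  # pretend this came from the image encoder

assert zero_shot_classify(image_emb, label_embs) == "a photo of a dog"
```

No task-specific training is needed: changing the candidate labels changes the classifier, which is why one checkpoint serves many classification tasks.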
- **Molmo 7B O Bnb 4bit** (Apache-2.0, cyan2k). 2,467 downloads, 11 likes. Tags: Large Language Model, Transformers.
  A 4-bit quantized version of Molmo-7B-O that significantly reduces memory requirements, suited to resource-constrained environments.
- **Historiccolorsoonr Schnell** (Apache-2.0, AlekseyCalvin). 37 downloads, 1 like. Tags: Image Generation, English.
  A versatile vision-plus-text generation model, particularly suited to producing realistic images that simulate color film photography. It covers visual styles from Autochrome to Kodachrome to Fujifilm and other iconic photographic processes.
- **Mlx FLUX.1 Schnell 4bit Quantized** (Apache-2.0, argmaxinc). 1,644 downloads, 16 likes. Tags: Text-to-Image, English.
  A 4-bit quantized text-to-image model optimized for the MLX framework, supporting efficient image generation.
- **Biomed Right** (gritli). 15 downloads, 0 likes. Tags: Text Classification, Transformers.
  A zero-shot classification model built on the Transformers library, able to perform classification without task-specific training data.
- **Protgpt2 Distilled Tiny** (Apache-2.0, littleworth). 157 downloads, 4 likes. Tags: Protein Model, Transformers.
  A distilled version of ProtGPT2, compressed into a smaller, more efficient model via knowledge distillation that preserves performance while improving inference speed.
- **Llama 3 8b Quantized** (Other license, SweatyCrayfish). 2,037 downloads, 11 likes. Tags: Large Language Model, Transformers, English.
  A 4-bit quantized version of Llama 3 that reduces memory usage and speeds up inference, suitable for environments with limited compute resources.
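The memory saving from quantization is easy to estimate: weight bytes are roughly parameters × bits / 8, ignoring the small overhead of quantization scales and metadata. A back-of-the-envelope sketch for an 8B-parameter model (rough estimates, not measured numbers for this checkpoint):

```python
def model_memory_gb(n_params, bits_per_weight):
    """Approximate weight memory in GB, ignoring scales and metadata."""
    return n_params * bits_per_weight / 8 / 1e9

n_params = 8e9  # a Llama-3-8B-class model
fp16 = model_memory_gb(n_params, 16)  # 16.0 GB
int4 = model_memory_gb(n_params, 4)   # 4.0 GB
print(f"fp16: {fp16:.1f} GB, int4: {int4:.1f} GB, saving: {fp16 / int4:.0f}x")
```

This is why a 4-bit 8B model fits on consumer GPUs and laptops where the fp16 weights alone would not.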
- **Cat Vs Dog Classification** (Apache-2.0, kazuma313). 42 downloads, 1 like. Tags: Image Classification, Transformers.
  An image classification model fine-tuned from Google's ViT on the cats_vs_dogs dataset to distinguish photos of cats from photos of dogs.
- **Tinyllama 1.1B Chat V1.0 GGUF** (Apache-2.0, andrijdavid). 117 downloads, 2 likes. Tags: Large Language Model, English.
  TinyLlama is a lightweight 1.1B-parameter Llama model optimized for chat and programming assistance.
- **Vitforimageclassification** (Apache-2.0, Andron00e). 43 downloads, 2 likes. Tags: Image Classification, Transformers.
  An image classification model fine-tuned from google/vit-base-patch16-224-in21k on the CIFAR10 dataset, reaching 96.78% accuracy.
- **X Ray Ai Detection** (Artef). 22 downloads, 3 likes. Tags: Image Classification, Transformers.
  An X-ray image detection model fine-tuned from AI-image-detector, reaching 99.83% accuracy.
- **Sdxl Chinese Ink Lora** (OpenRAIL, ming-yang). 100 downloads, 9 likes. Tags: Image Generation.
  A Chinese ink-painting style generation model fine-tuned on the Stable Diffusion XL framework.
- **Lcm Lora Sdv1 5** (latent-consistency). 127.41k downloads, 499 likes. Tags: Image Generation.
  A LoRA adapter for Stable Diffusion v1-5 that cuts inference to just 2-8 steps, significantly improving generation speed.
- **Vit Finetuned Vanilla Cifar10 0** (Apache-2.0, 02shanky). 68 downloads, 1 like. Tags: Image Classification, Transformers.
  An image classification model based on the Vision Transformer (ViT) architecture, fine-tuned on the CIFAR-10 dataset to 99.2% accuracy.
- **Voidnoisecore R0829** (Other license, digiplay). 48.27k downloads, 3 likes. Tags: Image Generation.
  A Stable Diffusion-based text-to-image model that generates high-quality images from text descriptions.
- **Fantasticmix2.5d V4.5** (Other license, digiplay). 123 downloads, 1 like. Tags: Image Generation.
  A Stable Diffusion-based text-to-image model that generates high-quality images from text descriptions.
- **Bk Sdm Small** (OpenRAIL, nota-ai). 745 downloads, 31 likes. Tags: Image Generation.
  BK-SDM is an architecture-compressed Stable Diffusion model for efficient, general-purpose text-to-image synthesis; it is made lightweight by removing residual and attention blocks from the U-Net.
- **Mousetrap ButterflyGenerator** (MIT, MouseTrap). 2 downloads, 0 likes. Tags: Image Generation.
  An unconditional diffusion-based image generation model designed specifically to produce cute butterfly images.
- **Codexmd** (MIT, Gouletf). 126 downloads, 1 like. Tags: Large Language Model, Transformers.
- **Vegam Whisper Medium Ml** (MIT, smcproject). 83 downloads, 5 likes. Tags: Speech Recognition, Other.
  A version of thennal/whisper-medium-ml converted to the CTranslate2 model format for Malayalam speech recognition.
- **Ct2fast Opus Mt ROMANCE En** (Apache-2.0, michaelfeil). 74 downloads, 1 like. Tags: Machine Translation, Transformers.
  A CTranslate2-optimized multilingual translation model supporting fast translation from several Romance languages into English.
- **Ct2fast Opus Mt De En** (Apache-2.0, michaelfeil). 72 downloads, 2 likes. Tags: Machine Translation, Transformers.
  A quantized version of the Helsinki-NLP/opus-mt-de-en model, supporting German-to-English machine translation with fast inference via CTranslate2.
- **Deit Tiny Patch16 224 Finetuned Main Gpu 20e Final** (Apache-2.0, Gokulapriyan). 15 downloads, 0 likes. Tags: Image Classification, Transformers.
  A lightweight image classification model based on the DeiT-tiny architecture, reaching 98.56% validation accuracy after fine-tuning on a custom image dataset.
- **Convnext Tiny 224 Finetuned Aiornot** (Apache-2.0, kanak8278). 16 downloads, 0 likes. Tags: Image Classification, Transformers.
  A computer vision model based on the ConvNeXt-Tiny architecture, fine-tuned for image classification on a specific dataset.