# Lightweight vision model
Vintern 3B Beta GGUF
MIT
Vintern-3B-beta is a multilingual foundation model that supports English, Vietnamese, and Chinese, and is mainly used for image-text-to-text tasks.
Image-to-Text
Transformers
Supports Multiple Languages

rootonchair
576
1
Aimv2 Large Patch14 336.apple Pt Dist
AIMv2 is an efficient image encoder, available through the timm library, suitable for a range of computer vision tasks.
Image Classification
Transformers

timm
14
0
Minh
Apache-2.0
YOLOS is an object detection model based on the Vision Transformer (ViT) and trained with the DETR loss; it performs well on the COCO dataset.
Object Detection
minh14122003
14
0
Swin Tiny Patch4 Window7 224 Cifar10
Apache-2.0
A tiny model based on the Swin Transformer architecture, fine-tuned for CIFAR-10 image classification.
Image Classification
Transformers

Skafu
94
1
Deit Tiny Patch16 224 Finetuned Main Gpu 20e Final
Apache-2.0
A lightweight image classification model based on the DeiT-tiny architecture, reaching 98.56% validation accuracy after fine-tuning on a custom image dataset.
Image Classification
Transformers

Gokulapriyan
15
0
Autotrain Pick A Card 3726099222
This is a multi-class image classification model trained via AutoTrain, reaching 90.9% accuracy on the validation set.
Image Classification
Transformers

rwcuffney
16
0
Autotrain Weather Classification 3723199089
This is a multi-class image classification model trained via AutoTrain for weather classification.
Image Classification
Transformers

8kkillian
16
0
Swin Tiny Patch4 Window7 224 Finetuned Ai Not
Apache-2.0
A model based on the Swin Transformer architecture, fine-tuned to detect AI-generated content.
Image Classification
Transformers

LukeSajkowski
17
0
3 Labels
A three-class image classification model trained with AutoTrain, reaching 95% accuracy on the validation set.
Image Classification
Transformers

Ailyth
18
0
Swin Tiny Patch4 Window7 224 Finetuned Aiornot Baseline
Apache-2.0
A vision model based on the Swin Transformer Tiny architecture, fine-tuned for image classification on an unspecified dataset.
Image Classification
Transformers

Thabet
17
0
Swin Tiny Patch4 Window7 224 Finetuned Fluro Cls
Apache-2.0
A model based on the Swin Transformer Tiny architecture, fine-tuned for image classification.
Image Classification
Transformers

zlgao
19
0
Swin Tiny Patch4 Window7 224 Finetuned Woody LeftGR Clean 130epochs
Apache-2.0
An image classification model based on the Swin Transformer Tiny architecture, fine-tuned for 130 epochs on a custom image dataset and reaching 90.23% accuracy.
Image Classification
Transformers

Alex-VisTas
11
0
Autotrain Cat Vs Dogs 1858163503
This is a binary classification model trained using AutoTrain, specifically designed to distinguish between images of cats and dogs.
Image Classification
Transformers

kem000123
10
2
Vit Small Patch16 224
Apache-2.0
ViT-Small model converted from the timm repository, suitable for image classification tasks
Image Classification
Transformers

WinKawaks
447.70k
18
Vit Tiny Patch16 224
Apache-2.0
ViT-Tiny model converted from the timm repository, suitable for image classification tasks and used in the same way as the ViT-base model
Image Classification
Transformers

WinKawaks
692.49k
21
Visual Transformer Chihuahua Cookies
An image classification model based on the Vision Transformer architecture, specifically designed to distinguish between images of Chihuahuas and cookies
Image Classification
Transformers

peterbonnesoeur
15
1
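
Several model names in this list encode the input geometry: `Patch16 224` denotes 16-pixel patches on a 224×224 input, `Patch14 336` denotes 14-pixel patches on a 336×336 input, and `Patch4 Window7 224` adds the Swin attention window size. As a minimal sketch, assuming square inputs and non-overlapping patches, the token and window counts implied by those names can be worked out as below. The function names are hypothetical helpers written for illustration and are not part of timm or Transformers.

```python
def patch_count(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches covering a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def first_stage_windows(image_size: int, patch_size: int, window_size: int) -> int:
    """Number of attention windows in the first stage of a Swin-style model.

    Assumes the patch grid divides evenly into square windows.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    if per_side % window_size != 0:
        raise ValueError("patch grid must be divisible by window_size")
    windows_per_side = per_side // window_size
    return windows_per_side * windows_per_side


if __name__ == "__main__":
    print(patch_count(224, 16))            # Patch16 224
    print(patch_count(336, 14))            # Patch14 336
    print(first_stage_windows(224, 4, 7))  # Patch4 Window7 224
```

These counts cover only the patch grid; any extra tokens a given checkpoint adds (such as a class token) are not included.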