# Transformer architecture

**Sundial Base 128m** (thuml) · License: Apache-2.0 · Downloads: 214 · Likes: 5
Sundial is a series of generative time series foundation models capable of zero-shot inference for both deterministic and probabilistic forecasting.
Tags: Climate Model, Safetensors

**Ast Finetuned Audioset 10 10 0.4593 ONNX** (onnx-community) · Downloads: 684 · Likes: 1
This is the ONNX version of the AST (Audio Spectrogram Transformer) model, designed specifically for audio classification tasks and fine-tuned on the AudioSet dataset.
Tags: Audio Classification, Transformers

**Falcon E 3B Instruct** (tiiuae) · License: Other · Downloads: 225 · Likes: 22
Falcon-E-3B-Instruct is an efficient language model based on a 1.58-bit architecture, optimized for edge devices, with strong inference capability and low memory usage.
Tags: Large Language Model, Transformers

**Orpheus TTS MediaSpeech** (kadirnar) · Downloads: 21 · Likes: 2
An Arabic model trained on the MediaSpeech dataset; its specific uses and capabilities are not yet documented in detail.
Tags: Large Language Model, Transformers, Arabic

**Unt 8b** (omar07ibrahim) · License: Apache-2.0 · Downloads: 33 · Likes: 2
The Camel Model is a text generation model based on the Transformer architecture, supporting Azerbaijani and trained using reinforcement learning.
Tags: Large Language Model, Transformers, Other

**Bidi Eng Pol** (allegro) · Downloads: 185 · Likes: 1
A Transformer-based machine translation model for bidirectional translation between English and Polish.
Tags: Machine Translation, Transformers, Supports Multiple Languages

**Vit Large Patch14 Dinov2.lvd142m** (pcuenq) · License: Apache-2.0 · Downloads: 18 · Likes: 0
A Vision Transformer (ViT) image feature model, pre-trained on the LVD-142M dataset using the self-supervised DINOv2 method.
Tags: Image Classification, Transformers

**Vit Liveness Detection V1.0** (nguyenkhoa) · License: Apache-2.0 · Downloads: 176 · Likes: 1
A face liveness detection model built with the Transformers library that achieves strong performance on its evaluation set.
Tags: Face-related, Transformers

**MOMENT 1 Base** (AutonLab) · License: MIT · Downloads: 4,975 · Likes: 3
MOMENT is a family of general-purpose foundation models for time series analysis, supporting tasks such as forecasting, classification, and anomaly detection, usable out of the box or after fine-tuning.
Tags: Materials Science, Transformers

**Speecht5 Finetuned Emirhan Tr** (emirhanbilgic) · License: MIT · Downloads: 22 · Likes: 1
A Turkish text-to-speech model fine-tuned from Microsoft SpeechT5, capable of generating high-quality Turkish speech.
Tags: Speech Synthesis, TensorBoard, Other

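A minimal sketch of how such a SpeechT5 fine-tune is typically driven with the transformers API. The repo id below is assumed from this listing, and the CMU ARCTIC x-vectors are used as a stand-in speaker embedding:

```python
# Sketch: Turkish TTS with a SpeechT5 fine-tune (repo id assumed from the listing).
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

model_id = "emirhanbilgic/speecht5_finetuned_emirhan_tr"  # assumed repo id
processor = SpeechT5Processor.from_pretrained(model_id)
model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# SpeechT5 needs a speaker embedding; the CMU ARCTIC x-vectors are a common stand-in.
xvectors = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(xvectors[7306]["xvector"]).unsqueeze(0)

inputs = processor(text="Merhaba, bugün hava çok güzel.", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("speech_tr.wav", speech.numpy(), samplerate=16000)
```
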
**Swahili English Translation** (Bildad) · License: MIT · Downloads: 98 · Likes: 2
A Transformer model developed specifically for bidirectional translation between Swahili and English, fine-tuned on 210,000 sentence pairs.
Tags: Machine Translation, Transformers

**Birna Bert** (buetnlpbio) · Downloads: 364 · Likes: 1
A Transformer encoder model based on the BERT architecture, designed specifically for generating RNA sequence embeddings.
Tags: Text Embedding, Transformers

**Dictalm2 It Qa Fine Tune** (618AI) · License: Apache-2.0 · Downloads: 2,900 · Likes: 6
A fine-tuned version of Dicta-IL's dictalm2.0-instruct model, designed for generating Hebrew question-answer pairs.
Tags: Question Answering System, Transformers, Other

**Real3d** (hwjiang) · License: MIT · Downloads: 22 · Likes: 19
Real3D is a 2D-to-3D Transformer model based on the TripoSR architecture, extended to real-world images through unsupervised self-training and automatic data filtering.
Tags: 3D Vision

**Codontransformer** (adibvafa) · License: Apache-2.0 · Downloads: 1,327 · Likes: 7
A codon optimization tool that converts protein sequences into DNA sequences optimized for a target organism.
Tags: Protein Model, Transformers

**Medsam Breast Cancer** (MichaelSoloveitchik) · Downloads: 61 · Likes: 0
An image segmentation model based on the Transformers library, used for segmentation tasks in vision applications.
Tags: Image Segmentation, Transformers, Other

**Segformer B3 Fashion** (sayeed99) · License: Other · Downloads: 75.65k · Likes: 21
A fashion-item image segmentation model based on the SegFormer architecture, designed to identify and segment clothing and accessories.
Tags: Image Segmentation, Transformers

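A minimal sketch of running such a SegFormer checkpoint for semantic segmentation with the standard transformers interface; the repo id and image file name are assumptions:

```python
# Sketch: clothing segmentation with a SegFormer checkpoint (repo id assumed from the listing).
import torch
from PIL import Image
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation

model_id = "sayeed99/segformer-b3-fashion"  # assumed repo id
processor = SegformerImageProcessor.from_pretrained(model_id)
model = SegformerForSemanticSegmentation.from_pretrained(model_id)

image = Image.open("outfit.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (1, num_labels, H/4, W/4)

# Upsample to the original resolution and take the per-pixel argmax.
upsampled = torch.nn.functional.interpolate(
    logits, size=image.size[::-1], mode="bilinear", align_corners=False
)
segmentation = upsampled.argmax(dim=1)[0]
print(segmentation.shape, segmentation.unique())
```
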
**Pllava 7b** (ermu2001) · License: Apache-2.0 · Downloads: 109 · Likes: 13
PLLaVA is an open-source video-language chatbot obtained by fine-tuning a large image-language model on video instruction-following data; it is intended for research on multimodal large models and chatbots.
Tags: Text-to-Video, Transformers

**Trocr Base Spanish** (qantev) · License: MIT · Downloads: 170 · Likes: 5
Base version of the TrOCR model for Spanish printed text, based on the Transformer architecture and fine-tuned on a custom dataset.
Tags: Text Recognition, Transformers, Supports Multiple Languages

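A minimal sketch of the usual TrOCR inference loop (vision encoder plus text decoder); the repo id and input image are assumptions:

```python
# Sketch: printed-text OCR with a TrOCR checkpoint (repo id assumed from the listing).
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

model_id = "qantev/trocr-base-spanish"  # assumed repo id
processor = TrOCRProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

image = Image.open("line_of_text.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```
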
**Granite Timeseries Patchtst** (ibm-granite) · License: Apache-2.0 · Downloads: 1,505 · Likes: 11
PatchTST is a Transformer-based model designed for long-term time series forecasting, using subsequence patching and channel independence to improve prediction accuracy.
Tags: Climate Model, Transformers

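A rough sketch of loading a PatchTST forecaster through the transformers PatchTST classes and running a single forward pass on dummy data; the repo id is assumed from this listing, and the model card should be checked for the actual context length, prediction length, and channel count:

```python
# Sketch: one forward pass through a PatchTST forecaster (repo id assumed from the listing).
import torch
from transformers import PatchTSTForPrediction

model_id = "ibm-granite/granite-timeseries-patchtst"  # assumed repo id
model = PatchTSTForPrediction.from_pretrained(model_id)

context_length = model.config.context_length
num_channels = model.config.num_input_channels

# Dummy batch of past observations: (batch_size, context_length, num_channels).
past_values = torch.randn(1, context_length, num_channels)
with torch.no_grad():
    outputs = model(past_values=past_values)
print(type(outputs))  # the output object carries the model's forecasts
```
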
**Dpt Beit Large 512** (Intel) · License: MIT · Downloads: 2,794 · Likes: 8
A monocular depth estimation model based on the BEiT Transformer, capable of inferring fine-grained depth from a single image.
Tags: 3D Vision, Transformers

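A minimal sketch of monocular depth estimation via the high-level transformers pipeline; the input image name is an assumption:

```python
# Sketch: single-image depth estimation with the depth-estimation pipeline.
from PIL import Image
from transformers import pipeline

depth_estimator = pipeline("depth-estimation", model="Intel/dpt-beit-large-512")
result = depth_estimator(Image.open("room.jpg").convert("RGB"))

# result["depth"] is a PIL image of the predicted depth map;
# result["predicted_depth"] holds the raw tensor.
result["depth"].save("room_depth.png")
```
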
**Llm Jp 13b Instruct Full Jaster Dolly Oasst V1.0** (llm-jp) · License: Apache-2.0 · Downloads: 750 · Likes: 8
A large-scale language model developed by the Japanese LLM-jp project, supporting text generation in Japanese and English.
Tags: Large Language Model, Transformers, Supports Multiple Languages

**Gpt2 Demo** (demo-leaderboard) · License: Other · Downloads: 19.21k · Likes: 1
GPT-2 is a self-supervised pre-trained language model based on the Transformer architecture that excels at text generation tasks.
Tags: Large Language Model, Transformers

**Bge Base En V1.5 Ct2** (winstxnhdw) · License: MIT · Downloads: 30 · Likes: 0
BGE Base English v1.5 is a Transformer-based sentence embedding model, designed for extracting sentence features and computing sentence similarity.
Tags: Text Embedding, Transformers, English

**Discogs Maest 10s Pw 129e** (mtg-upf) · Downloads: 33 · Likes: 0
MAEST is a family of Transformer models based on PaSST, focused on music analysis and particularly strong at music genre classification.
Tags: Audio Classification, Transformers

**Dogs Breed Classification Using Vision Transformers** (AmitMidday) · License: Openrail · Downloads: 27 · Likes: 1
An image classification model for dog breed recognition, supporting English and released under an open license.
Tags: Image Classification, Transformers, English

**Hubert Base Audioset** (ALM) · Downloads: 345 · Likes: 2
An audio representation model based on the HuBERT architecture, pre-trained on the complete AudioSet dataset and suitable for general audio tasks.
Tags: Audio Classification, Transformers

**Dinov2 Large** (facebook) · License: Apache-2.0 · Downloads: 558.78k · Likes: 79
A Vision Transformer trained with the DINOv2 method, extracting robust visual features from massive image data through self-supervised learning.
Tags: Image Classification, Transformers

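A minimal sketch of extracting an image embedding with this checkpoint through the standard transformers auto classes; the input image name is an assumption:

```python
# Sketch: global image features with DINOv2 (facebook/dinov2-large as named in the listing).
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

processor = AutoImageProcessor.from_pretrained("facebook/dinov2-large")
model = AutoModel.from_pretrained("facebook/dinov2-large")

image = Image.open("cat.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The pooled CLS embedding serves as a compact global descriptor of the image.
embedding = outputs.pooler_output  # shape (1, 1024) for the large model
print(embedding.shape)
```
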
**Segformer B0 Finetuned Segments Sidewalk 2** (thesisabc) · Downloads: 16 · Likes: 0
A SegFormer semantic segmentation model fine-tuned on the Segments.ai sidewalk-semantic dataset, suitable for sidewalk scene analysis.
Tags: Image Segmentation, Transformers

**Trocr Base Printed Fr** (agomberto) · License: MIT · Downloads: 110 · Likes: 2
A Transformer-based OCR model for French printed text, filling the lack of a French variant among TrOCR models.
Tags: Image-to-Text, Transformers, French

**Japanese Hubert Base** (rinna) · License: Apache-2.0 · Downloads: 4,550 · Likes: 68
A Japanese HuBERT base model trained by rinna Co., Ltd. on approximately 19,000 hours of the Japanese speech corpus ReazonSpeech v1.
Tags: Speech Recognition, Transformers, Japanese

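A minimal sketch of extracting frame-level speech representations with this HuBERT checkpoint; the audio file name is an assumption, and the input should be 16 kHz mono:

```python
# Sketch: speech feature extraction with rinna/japanese-hubert-base (as named in the listing).
import torch
import soundfile as sf
from transformers import AutoFeatureExtractor, HubertModel

model_id = "rinna/japanese-hubert-base"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = HubertModel.from_pretrained(model_id)

waveform, sample_rate = sf.read("speech_ja.wav")  # expects 16 kHz mono audio
inputs = feature_extractor(waveform, sampling_rate=sample_rate, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (1, frames, 768) for the base model
print(hidden_states.shape)
```
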
**Trocr Processor** (anaghasavit) · Downloads: 18 · Likes: 3
TrOCR is a Transformer-based optical character recognition model designed for handwritten text recognition, fine-tuned on the IAM handwriting database.
Tags: Image-to-Text, Transformers

**Plant Disease Classification2** (ayerr) · Downloads: 40 · Likes: 1
An image classification model based on the Transformers library for identifying and classifying plant diseases.
Tags: Image Classification, Transformers

**Trocr Base Ckb** (razhan) · Downloads: 19 · Likes: 0
An OCR model based on the Transformer architecture, designed for recognizing Central Kurdish text and trained on synthetic data.
Tags: Text Recognition, Transformers

**Pythia 160m** (EleutherAI) · License: Apache-2.0 · Downloads: 163.75k · Likes: 31
Pythia-160M is a language model developed by EleutherAI for interpretability research. It is the 160M-parameter member of the Pythia suite, based on the Transformer architecture and trained on the Pile dataset.
Tags: Large Language Model, Transformers, English

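A minimal sketch of greedy text generation with this checkpoint via the transformers auto classes; the prompt is an arbitrary example:

```python
# Sketch: text generation with EleutherAI/pythia-160m (as named in the listing).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-160m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The Pile is a dataset of", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
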
**BLEURT 20 D12** (lucadiliello) · Downloads: 2.6M · Likes: 1
A PyTorch implementation of the BLEURT model, used for text evaluation tasks in natural language processing.
Tags: Large Language Model, Transformers

**Segformer Finetuned Segments Cmp Facade** (Xpitfire) · License: MIT · Downloads: 379 · Likes: 1
A building facade semantic segmentation model based on the SegFormer architecture, capable of recognizing 12 types of architectural elements.
Tags: Image Segmentation, Transformers, English

**Oneformer Ade20k Swin Tiny** (shi-labs) · License: MIT · Downloads: 12.96k · Likes: 16
The first multi-task universal image segmentation framework, supporting semantic, instance, and panoptic segmentation with a single model.
Tags: Image Segmentation, Transformers

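A minimal sketch of semantic segmentation with this OneFormer checkpoint; the same model also accepts "instance" and "panoptic" task tokens, and the input image name is an assumption:

```python
# Sketch: semantic segmentation with shi-labs/oneformer_ade20k_swin_tiny (as named in the listing).
import torch
from PIL import Image
from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation

model_id = "shi-labs/oneformer_ade20k_swin_tiny"
processor = OneFormerProcessor.from_pretrained(model_id)
model = OneFormerForUniversalSegmentation.from_pretrained(model_id)

image = Image.open("street.jpg").convert("RGB")
inputs = processor(images=image, task_inputs=["semantic"], return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Post-process into a (height, width) map of ADE20K class ids.
semantic_map = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
print(semantic_map.shape)
```
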
**Scinertopic** (RJuro) · License: MIT · Downloads: 71 · Likes: 7
A scientific term recognition model based on SciBERT, supporting NER-enhanced topic modeling.
Tags: Sequence Labeling, Transformers

**Gpt2 Small** (ComCom) · License: MIT · Downloads: 1,032 · Likes: 3
GPT-2 is an autoregressive language model based on the Transformer architecture, pre-trained on a large-scale English corpus through self-supervised learning; it excels at text generation tasks.
Tags: Large Language Model, Transformers, English