# Self-supervised Learning
## Resencl OpenMind MAE (AnonRes)
A model from the first comprehensive benchmark study of self-supervised learning on 3D medical imaging data; the study provides multiple pre-trained checkpoints.
3D Vision · Downloads: 20 · Likes: 0

## Hubert Ecg Small (Edoardo-BS)
A self-supervised pre-trained foundation model for ECG analysis, supporting detection of 164 cardiovascular diseases.
Molecular Model · Transformers · Downloads: 535 · Likes: 2

## Path Foundation (google)
Path Foundation is a machine learning model for histopathology, trained with self-supervised learning to generate 384-dimensional embedding vectors from H&E-stained slides for efficient downstream classifier training.
License: Other · Image Classification · English · Downloads: 220 · Likes: 39

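The intended workflow is to precompute embeddings once and fit a small classifier on top of the frozen features. Below is a minimal linear-probe sketch; the random arrays and labels are stand-ins for embeddings you would extract from H&E patches (the loading code depends on the checkpoint's distribution format, so it is omitted here):

```python
# Linear probing on precomputed 384-dimensional Path Foundation embeddings.
# The embedding and label arrays are hypothetical stand-ins (random data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
train_embeddings = rng.normal(size=(1000, 384))  # hypothetical patch embeddings
train_labels = rng.integers(0, 2, size=1000)     # hypothetical binary labels

clf = LogisticRegression(max_iter=1000)
clf.fit(train_embeddings, train_labels)

test_embeddings = rng.normal(size=(10, 384))
print(clf.predict_proba(test_embeddings))
```
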
## RS M CLIP (joaodaniel)
A multilingual vision-language pre-trained model for the remote sensing field, supporting image-text cross-modal tasks in 10 languages.
License: MIT · Image-to-Text · Multilingual · Downloads: 248 · Likes: 1

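To illustrate the CLIP-style image-text matching this entry describes, here is a sketch using the generic openai/clip-vit-base-patch32 checkpoint as a stand-in; RS-M-CLIP applies the same scoring idea to remote-sensing imagery and multilingual captions, but may ship with its own loading tooling:

```python
# Generic CLIP-style image-text matching sketch (stand-in checkpoint,
# not RS-M-CLIP itself). The image path is hypothetical.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("scene.jpg")  # hypothetical satellite scene
texts = ["an airport", "a forest", "a harbor"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # image-to-text match scores
print(dict(zip(texts, probs[0].tolist())))
```
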
## Dinov2 Giant Patch14 Reg4 (refiners)
DINOv2 is a visual feature extraction model based on the Vision Transformer (ViT); this variant improves feature quality by adding register tokens.
License: Apache-2.0 · Downloads: 17 · Likes: 0

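A minimal feature-extraction sketch with a DINOv2 backbone through transformers; facebook/dinov2-base is a known checkpoint used here for illustration, and the assumption is that this giant patch-14 registers variant exposes the same AutoModel interface:

```python
# Extracting a global image descriptor from a DINOv2 encoder.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base")
model = AutoModel.from_pretrained("facebook/dinov2-base")

image = Image.open("example.jpg")  # any RGB image (hypothetical path)
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

cls_embedding = outputs.last_hidden_state[:, 0]  # CLS token as image feature
print(cls_embedding.shape)  # torch.Size([1, 768]) for the base model
```
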
## Rad Dino Maira 2 (microsoft)
RAD-DINO-MAIRA-2 is a vision transformer trained with DINOv2 self-supervised learning, designed specifically for encoding chest X-ray images.
License: Other · Downloads: 9,414 · Likes: 11

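A sketch of encoding a chest X-ray into a single embedding, assuming the checkpoint follows the standard transformers AutoModel/AutoImageProcessor interface used by its RAD-DINO sibling:

```python
# Encoding a chest X-ray with RAD-DINO-MAIRA-2 (interface assumed to match
# the standard transformers DINOv2 classes; the input file is hypothetical).
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

repo = "microsoft/rad-dino-maira-2"
processor = AutoImageProcessor.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

xray = Image.open("chest_xray.png").convert("RGB")  # hypothetical input
inputs = processor(images=xray, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
image_embedding = outputs.pooler_output  # one vector per image
print(image_embedding.shape)
```
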
## Dasheng 1.2B (mispeech)
Dasheng is a general-purpose audio encoder trained with large-scale self-supervised learning, capable of capturing rich audio information across domains such as speech, music, and environmental sounds.
License: Apache-2.0 · Audio Classification · Transformers · Downloads: 135 · Likes: 0

## Wav2vec2 Base BirdSet XCL (DBD-research-group)
wav2vec 2.0 is a self-supervised framework that learns speech representations from unlabeled audio data.
Audio Classification · Transformers · Downloads: 177 · Likes: 0

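A sketch of extracting self-supervised features from raw audio with the standard wav2vec 2.0 interface; facebook/wav2vec2-base is used as a known checkpoint, and the assumption is that this BirdSet variant loads the same way:

```python
# Frame-level feature extraction from raw waveform with wav2vec 2.0.
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2Model

extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

waveform = torch.randn(16000)  # one second of fake 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state  # (batch, frames, hidden)
print(features.shape)
```
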
## Phikon V2 (owkin)
Phikon-v2 is a Vision Transformer Large model pre-trained on the PANCAN-XL dataset with the DinoV2 self-supervised method, designed specifically for histological image analysis.
License: Other · Image Classification · Transformers · English · Downloads: 64.20k · Likes: 15

## Vqvae (hpcai-tech)
A video VQ-VAE from the VideoGPT project, converted to the Hugging Face format for easier loading.
License: MIT · Video Processing · Transformers · Downloads: 179 · Likes: 6

## Wav2vec2 Base Audioset (ALM)
An audio representation learning model based on the wav2vec 2.0 architecture, pre-trained on the complete AudioSet dataset.
Audio Classification · Transformers · Downloads: 2,191 · Likes: 0

## Hubert Large Audioset (ALM)
A Transformer model based on the HuBERT architecture, pre-trained on the complete AudioSet dataset for general audio representation learning tasks.
Audio Classification · Transformers · Downloads: 79 · Likes: 0

## Wav2vec2 Large Audioset (ALM)
An audio representation model based on the wav2vec 2.0 architecture, pre-trained on the complete AudioSet dataset for general audio tasks.
Audio Classification · Transformers · Downloads: 43 · Likes: 0

## Pubchemdeberta (mschuh)
TwinBooster is a DeBERTa V3 base model fine-tuned on the PubChem bioassay corpus and combined with the Barlow Twins self-supervised learning method for molecular property prediction.
Molecular Model · Transformers · English · Downloads: 14 · Likes: 1

## Hubert Base Korean (team-lucid)
HuBERT (Hidden-Unit BERT) is a speech representation learning model proposed by Facebook that uses self-supervised learning to learn speech features directly from raw waveform signals.
License: Apache-2.0 · Speech Recognition · Korean · Downloads: 54 · Likes: 26

## Videomae Small Finetuned Kinetics (MCG-NJU)
VideoMAE is a masked autoencoder for video, pre-trained with self-supervision and fine-tuned on the Kinetics-400 dataset for video classification tasks.
Video Processing · Transformers · Downloads: 2,152 · Likes: 1

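A classification sketch using the standard transformers VideoMAE classes; the checkpoint id below is inferred from this entry's name, so treat it as an assumption:

```python
# Classifying a clip with a fine-tuned VideoMAE checkpoint.
import numpy as np
import torch
from transformers import VideoMAEImageProcessor, VideoMAEForVideoClassification

repo = "MCG-NJU/videomae-small-finetuned-kinetics"  # assumed checkpoint id
processor = VideoMAEImageProcessor.from_pretrained(repo)
model = VideoMAEForVideoClassification.from_pretrained(repo)

# 16 fake RGB frames standing in for a sampled video clip
video = [np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8) for _ in range(16)]
inputs = processor(video, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[int(logits.argmax(-1))])
```
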
## Vit Base Patch16 224.dino (timm)
A Vision Transformer (ViT) image feature model trained with the self-supervised DINO method, suitable for image classification and feature extraction tasks.
License: Apache-2.0 · Image Classification · Transformers · Downloads: 33.45k · Likes: 5

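Since this is a timm checkpoint, a feature-extraction sketch with the timm API (num_classes=0 strips the classifier head and returns pooled features):

```python
# Using the DINO-pretrained ViT from timm as a frozen feature extractor.
import timm
import torch

model = timm.create_model("vit_base_patch16_224.dino", pretrained=True, num_classes=0)
model.eval()

# Resolve the preprocessing the checkpoint expects (resize, normalization).
data_config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_config, is_training=False)

dummy = torch.randn(1, 3, 224, 224)  # stands in for a transformed image
with torch.no_grad():
    features = model(dummy)  # pooled feature vector
print(features.shape)  # torch.Size([1, 768])
```
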
## Dino Resnet 50 (Ramos-Ramos)
A ResNet-50 model pre-trained with the DINO self-supervised learning method, suitable for visual feature extraction tasks.
Image Classification · Transformers · Downloads: 106 · Likes: 0

## Biomednlp KRISSBERT PubMed UMLS EL (microsoft)
KRISSBERT is a knowledge-enhanced self-supervised learning model for biomedical entity linking; it trains contextual encoders on unannotated text and domain knowledge to handle the diversity and ambiguity of entity names.
License: MIT · Knowledge Graph · Transformers · English · Downloads: 4,643 · Likes: 29

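A deliberately simplified linking sketch: encode a mention in context and rank candidate entities by cosine similarity. The real KRISSBERT pipeline uses UMLS-guided mention prototypes; mean pooling here is only an illustration, and the repo id is inferred from the entry name:

```python
# Simplified entity-linking sketch in the spirit of KRISSBERT.
import torch
from transformers import AutoModel, AutoTokenizer

repo = "microsoft/BiomedNLP-KRISSBERT-PubMed-UMLS-EL"  # inferred repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    return hidden.mean(dim=1).squeeze(0)  # mean-pooled text embedding

mention = embed("The patient was treated with ER antagonists.")
candidates = {"estrogen receptor": embed("estrogen receptor"),
              "emergency room": embed("emergency room")}
scores = {name: torch.cosine_similarity(mention, vec, dim=0).item()
          for name, vec in candidates.items()}
print(scores)
```
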
## Protgpt2 (nferruz)
ProtGPT2 is a protein language model based on the GPT-2 architecture, capable of generating novel protein sequences that retain key features of natural proteins.
License: Apache-2.0 · Protein Model · Transformers · Downloads: 17.99k · Likes: 108

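A generation sketch with the standard text-generation pipeline; the sampling settings follow those suggested on the model card:

```python
# Sampling de novo protein sequences with ProtGPT2.
from transformers import pipeline

generator = pipeline("text-generation", model="nferruz/ProtGPT2")
sequences = generator(
    "<|endoftext|>",        # start token; output resembles FASTA-style lines
    max_length=100,
    do_sample=True,
    top_k=950,              # sampling settings suggested on the model card
    repetition_penalty=1.2,
    num_return_sequences=2,
    eos_token_id=0,
)
for seq in sequences:
    print(seq["generated_text"])
```
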
## Albert Fa Base V2 Ner Peyma (m3hrdadfi)
The first ALBERT model specifically for Persian, based on Google's ALBERT base v2.0 architecture and trained on diverse Persian corpora.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 19 · Likes: 1

## Albert Fa Base V2 Ner Arman (m3hrdadfi)
A lightweight BERT-style model for self-supervised language representation learning in Persian.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 22 · Likes: 3

## Albert Fa Base V2 Sentiment Deepsentipers Multi (m3hrdadfi)
A lightweight BERT-style model designed for self-supervised learning of Persian language representations.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 24 · Likes: 0

## Albert Fa Base V2 Clf Persiannews (m3hrdadfi)
A lightweight BERT-style model designed for self-supervised Persian language representation learning.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 46 · Likes: 3

## Splinter Base (tau)
Splinter is a self-supervised pre-trained model designed for few-shot question answering, pre-trained with the Recurring Span Selection (RSS) objective.
License: Apache-2.0 · Question Answering System · Transformers · English · Downloads: 648 · Likes: 1

## Albert Fa Base V2 Sentiment Binary (m3hrdadfi)
A lightweight BERT-style ALBERT model for self-supervised learning of Persian language representations.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 124 · Likes: 1

## Albert Fa Base V2 Sentiment Deepsentipers Binary (m3hrdadfi)
A lightweight BERT-style model for self-supervised language representation learning in Persian.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 25 · Likes: 0

## Albert Fa Base V2 (m3hrdadfi)
A lightweight BERT-style model for self-supervised learning of Persian language representations.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 43 · Likes: 4

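A masked-token prediction sketch via the fill-mask pipeline; the checkpoint id is inferred from the entry name and organization, so treat it as an assumption:

```python
# Masked-token prediction with the Persian ALBERT base model.
from transformers import pipeline

fill = pipeline("fill-mask", model="m3hrdadfi/albert-fa-base-v2")  # inferred id
# "Tehran is the capital of [MASK]."
for pred in fill("تهران پایتخت [MASK] است."):
    print(pred["token_str"], round(pred["score"], 3))
```
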
## Distilhubert (ntu-spml)
DistilHuBERT is a lightweight speech representation learning model obtained by layer-wise distillation of HuBERT, significantly reducing model size and computational cost while largely maintaining performance.
License: Apache-2.0 · Speech Recognition · Transformers · English · Downloads: 2,962 · Likes: 31

## Splinter Base Qass (tau)
Splinter is a few-shot question answering model pre-trained via self-supervised learning, using the Recurring Span Selection (RSS) objective to mimic the span selection process of extractive QA.
License: Apache-2.0 · Question Answering System · Transformers · English · Downloads: 3,048 · Likes: 1

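An extractive QA sketch with the transformers Splinter classes (the tokenizer inserts the special [QUESTION] token when given a question-context pair); zero-shot quality varies, and the paper's few-shot recipe fine-tunes first:

```python
# Extractive QA with Splinter's pretrained QASS head.
import torch
from transformers import SplinterForQuestionAnswering, SplinterTokenizer

repo = "tau/splinter-base-qass"
tokenizer = SplinterTokenizer.from_pretrained(repo)
model = SplinterForQuestionAnswering.from_pretrained(repo)

question = "Who developed Splinter?"
context = "Splinter was developed by researchers at Tel Aviv University."
inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

start = int(outputs.start_logits.argmax())  # most likely span start
end = int(outputs.end_logits.argmax())      # most likely span end
print(tokenizer.decode(inputs.input_ids[0, start : end + 1]))
```
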
## Albert Fa Base V2 Sentiment Digikala (m3hrdadfi)
A lightweight BERT-style model for self-supervised language representation learning in Persian.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 18 · Likes: 0

## Wav2vec2 FR 1K Base (LeBenchmark)
A base wav2vec 2.0 model trained on 1K hours of French speech, supporting tasks such as speech recognition.
License: Apache-2.0 · Speech Recognition · Transformers · French · Downloads: 85 · Likes: 1

## Wav2vec2 FR 7K Base (LeBenchmark)
A base wav2vec 2.0 model trained on 7.6K hours of French speech, including spontaneous, read, and broadcast speech.
License: Apache-2.0 · Speech Recognition · Transformers · French · Downloads: 26 · Likes: 1

## W2v Xlsr Dutch Lm (Iskaj)
A Dutch speech recognition model based on Facebook's wav2vec2 XLSR architecture, specifically optimized for Dutch.
Speech Recognition · Transformers · Downloads: 23 · Likes: 0

## Albert Fa Zwnj Base V2 (HooshvareLab)
A lightweight BERT-style model for self-supervised language representation learning in Persian.
License: Apache-2.0 · Large Language Model · Transformers · Other · Downloads: 137 · Likes: 4

## Wav2vec2 FR 7K Large (LeBenchmark)
A large wav2vec 2.0 model trained on 7.6K hours of French speech, including spontaneous, read, and broadcast speech.
License: Apache-2.0 · Speech Recognition · Transformers · French · Downloads: 1,091 · Likes: 12