RADIO L

Developed by NVIDIA
AM-RADIO is a visual foundation model developed by NVIDIA Research, featuring an aggregated architecture for unified multi-domain representation, suitable for various computer vision tasks.
Downloads 23.27k
Release Date: 7/23/2024

Model Overview

AM-RADIO is a general-purpose visual foundation model capable of simultaneously extracting global conceptual representations and local spatial features from images, supporting a wide range of computer vision tasks from image classification to semantic segmentation.

Model Features

Dual Output Representation: Outputs both a global conceptual representation (similar to ViT's cls_token) and local spatial features, so it can serve visual tasks of different granularities.
Multi-domain Unification: Achieves a unified visual feature representation across domains through an aggregated architecture.
Flexible Feature Transformation: Supports converting spatial features into the standard (B,D,H,W) tensor format for easy integration into computer vision workflows.
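The conversion to (B,D,H,W) mentioned above can be sketched as follows. This is a minimal illustration, not the model's own code: it assumes the spatial features arrive as a (B,T,D) array of patch tokens in row-major order over the patch grid, and it uses a NumPy array as a stand-in for the model output. The function name, the 224x224 input size, the patch size of 16, and the feature width of 8 are all hypothetical values chosen for the example.

```python
import numpy as np

def tokens_to_feature_map(spatial_features, image_hw, patch_size):
    """Reshape (B, T, D) patch tokens into a (B, D, H, W) feature map.

    Assumes tokens are ordered row-major over the patch grid, so that
    T == (image_h // patch_size) * (image_w // patch_size).
    """
    b, t, d = spatial_features.shape
    h = image_hw[0] // patch_size
    w = image_hw[1] // patch_size
    if t != h * w:
        raise ValueError(f"expected {h * w} tokens, got {t}")
    # (B, T, D) -> (B, H, W, D) -> (B, D, H, W)
    return spatial_features.reshape(b, h, w, d).transpose(0, 3, 1, 2)

# Dummy stand-in for the model's spatial output (hypothetical sizes).
tokens = np.arange(2 * 196 * 8, dtype=np.float32).reshape(2, 196, 8)
fmap = tokens_to_feature_map(tokens, image_hw=(224, 224), patch_size=16)
print(fmap.shape)  # (2, 8, 14, 14)
```

The same reshape-then-permute pattern applies to framework tensors; only the array library changes.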

Model Capabilities

Global image conceptual representation extraction
Local spatial feature extraction
Semantic segmentation support
LLM vision feature integration

Use Cases

Computer Vision
Image Classification: Uses the summary features to classify the whole image.
Semantic Segmentation: Uses spatial_features for pixel-level prediction.
Multimodal Systems
LLM Visual Input: Provides visual features as input to large language models.
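To make the two computer vision use cases concrete, here is a rough sketch of how lightweight heads could sit on top of the two outputs. It is an assumption-laden illustration, not part of the model: the arrays are random stand-ins for the summary and spatial outputs, the weights are untrained placeholders, and every size (feature widths, 14x14 grid, patch size 16, class counts) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
B, C, D, H, W, patch = 2, 32, 8, 14, 14, 16    # hypothetical sizes
num_classes, num_labels = 10, 5                # hypothetical label counts

summary = rng.standard_normal((B, C))          # stand-in for the summary output
fmap = rng.standard_normal((B, D, H, W))       # stand-in for the (B,D,H,W) spatial map

# Image classification: a single linear layer on the summary vector.
w_cls = rng.standard_normal((C, num_classes))  # untrained placeholder weights
class_logits = summary @ w_cls                 # (B, num_classes)
predicted_class = class_logits.argmax(axis=1)  # (B,)

# Semantic segmentation: the same linear map applied at every grid cell
# (equivalent to a 1x1 convolution), then nearest-neighbour upsampling
# from the patch grid back to pixel resolution.
w_seg = rng.standard_normal((D, num_labels))   # untrained placeholder weights
seg_logits = np.einsum("bdhw,dk->bkhw", fmap, w_seg)         # (B, K, H, W)
seg_logits = seg_logits.repeat(patch, axis=2).repeat(patch, axis=3)
label_map = seg_logits.argmax(axis=1)          # (B, H * patch, W * patch)

print(class_logits.shape, label_map.shape)
```

In practice the heads would be trained on labeled data, and a smoother upsampling (for example bilinear) is a common alternative to the nearest-neighbour step used here.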