Cosmos Predict2 14B Text2Image

Developed by NVIDIA
Cosmos-Predict2 is a series of high-performance pre-trained world foundation models designed for physical AI to generate physics-aware images, videos, and world states.
Downloads 312
Release Time: 4/22/2025

Model Overview

A diffusion-based world foundation model that generates dynamic, high-quality images and videos from text, image, or video inputs. It can serve as a building block for applications or research related to world generation.

Model Features

Physics-aware generation
Designed for physical AI, it can generate physics-aware images and videos and simulate physical interactions in the real world.
High-quality output
Generates dynamic, high-quality images and videos with a default resolution of 1280x704 pixels.
Multimodal input support
Supports text, images, or videos as input conditions, flexibly adapting to different application scenarios.
Commercial use license
It can be used for commercial purposes under the NVIDIA Open Model License, and derivative models can be freely created and distributed.
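
As a rough illustration of what the default 1280x704 output size listed above implies, the following sketch computes the aspect ratio, pixel count, and uncompressed size of one frame. The helper name `describe_resolution` and the assumption of 8-bit RGB storage (3 bytes per pixel) are illustrative only and are not taken from the model's documentation.

```python
from math import gcd


def describe_resolution(width: int, height: int) -> dict:
    """Summarize an image resolution (hypothetical helper for illustration)."""
    divisor = gcd(width, height)
    return {
        # Reduced width:height ratio, e.g. (16, 9) for 1920x1080.
        "aspect_ratio": (width // divisor, height // divisor),
        # Total pixel count expressed in megapixels.
        "megapixels": width * height / 1_000_000,
        # Uncompressed size assuming 8-bit RGB, i.e. 3 bytes per pixel.
        "rgb_bytes": width * height * 3,
    }


info = describe_resolution(1280, 704)
print(info)
```

Under these assumptions, the default resolution reduces to a 20:11 aspect ratio at about 0.9 megapixels per image.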

Model Capabilities

Text-to-image generation
Video prediction
Physical scene simulation
Multimodal understanding

Use Cases

Creative content generation
Advertising creative generation
Automatically generates high-quality advertising images based on product descriptions.
Generates product display images that comply with physical laws.
Game development
Game scene generation
Generates physical scenes in games based on text descriptions.
Generates game environments with physical interaction capabilities.
Film and television pre-production
Storyboard generation
Generates film and television storyboard frames based on script descriptions.
Generates storyboard images with dynamic effects.