Vision-Language Models for Vision Tasks: A Survey

Vision-Language Models Tutorial

VisionHub: Learning Task-Plugins for Efficient Universal Vision Model

Abstract: Building on the success of universal language models in natural language processing (NLP), researchers have recently sought to develop methods capable of tackling a broad spectrum of visual ...

The Lancet

Mapping the susceptibility of large language models to medical misinformation across clinical notes and social media: a cross-sectional benchmarking analysis

(a) The Windreich Department of Artificial Intelligence and Human Health, Mount Sinai Health System, New York, NY, USA; (b) The Hasso Plattner Institute for Digital Health at Mount Sinai, Mount Sinai Health ...

GitHub

Motus: A Unified Latent Action World Model

Inference (without pre-encoded T5): ~41 GB; A100 (40GB) / A100 (80GB) / H100 / B200. Motus_Wan2_2_5B_pretrain: Pretrain / VGM Backbone; Stage 1 VGM pretrained checkpoint ...

GitHub

egoPPG: Heart Rate Estimation from Eye Tracking Cameras in Egocentric Systems to Benefit Downstream Vision Tasks (ICCV 2025)

egoPPG is a novel vision task for egocentric systems that recovers a person’s cardiac activity to aid downstream vision tasks. Our method, PulseFormer, continuously estimates the person’s ...

The Robot Report

Vision-language-action models are the next leap in autonomous robotics

The Search Engine for OnlyFans Models Who Look Like Your Crush

Moonshot AI Releases Open-Weight Kimi K2.5 Model with Vision and Agent Swarm Capabilities

Remote Sensing Spatiotemporal Vision–Language Models: A comprehensive survey

AI creates artificial animals that over time develop functioning vision without instruction