Abstract: The language-guided robot grasping task requires a robot agent to integrate multimodal information from both visual and linguistic inputs to predict actions for target-driven grasping. While ...
Abstract: Zero-shot object navigation (ZSON) in unseen environments poses a significant challenge due to the absence of object-specific priors and the need for efficient exploration. Existing ...
The Adorant figurine from Geißenklösterle Cave, approximately 38,000 years old, consists of a small ivory plate bearing an anthropomorphic figure and multiple sequences of notches and dots. The ...
WS-DETR is a real-time traffic object detection model built upon the RT-DETR framework. The model incorporates two key innovations: Wavelet-Mamba Dual Path Block (WM-Dual Block) for improving ...