Abstract: Audio-Visual Question Answering (AVQA) requires complex reasoning across auditory and visual modalities. While recent advancements leverage sophisticated spatio-temporal representations, ...
Abstract: Visual place recognition is a fundamental task, essential for applications such as visual localization and loop closure detection. Existing methods perform well in controlled environments, ...