Abstract: Open-vocabulary 3D scene understanding remains a significant challenge. Recent works have sought to transfer the knowledge embedded in vision-language models from the 2D domain to the 3D domain.