Abstract: In this paper, we present a neat yet effective transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to the corresponding region ...