Abstract: Large multimodal models (LMMs) have recently shown encouraging progress with visual instruction tuning. In this paper, we present the first systematic study to investigate the design choices ...