Abstract: In this article, two new models, namely granulated RCNN (G-RCNN) and multi-class deep SORT (MCD-SORT), are developed for object detection and tracking, respectively, from videos. Object ...
Specifically, for the classification task, the goal is to predict, for each class, the presence or absence of at least one object of that class in a test image. We will use the Pascal VOC 2012 dataset ...
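The presence/absence target described above can be pictured as one 0/1 entry per class. The following is a minimal sketch, not the source's own code: the helper name `presence_vector` and the input format (a plain list of class names annotated in one image) are assumptions made for illustration. The class list is the standard set of 20 Pascal VOC categories.

```python
# The 20 object classes of Pascal VOC 2012.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def presence_vector(object_labels):
    """Hypothetical helper: map the class names annotated in one image
    to a 0/1 list with one entry per class.

    An entry is 1 if at least one object of that class is present,
    no matter how many instances there are, and 0 otherwise.
    """
    present = set(object_labels)
    unknown = present - set(VOC_CLASSES)
    if unknown:
        raise ValueError(f"unknown class names: {sorted(unknown)}")
    return [1 if name in present else 0 for name in VOC_CLASSES]


# An image annotated with one dog and two people.
labels = presence_vector(["dog", "person", "person"])
```

Because each class is decided independently, an image can be positive for several classes at once, which makes this a multi-label problem and not a single-label one.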
Abstract: Audiovisual scenes are pervasive in our daily life. It is commonplace for humans to discriminatively localize different sounding objects, but quite challenging for machines to achieve ...