Abstract: Audio-visual event localization (AVEL) aims to identify both the categories and temporal boundaries of events that are both audible and visible in unconstrained videos. However, the inherent ...