OVIS is a large-scale dataset for occluded video instance segmentation. It consists of 296k high-quality instance masks from 25 semantic categories, in scenes where heavy object occlusions are common.
1st Occluded Video Instance Segmentation Challenge in ICCV 2021
2nd Occluded Video Instance Segmentation Challenge in ECCV 2022
DanceTrack is a multi-human tracking dataset that emphasizes 1) uniform appearance: the humans look highly similar and are almost indistinguishable, and 2) diverse motion: the humans move in complicated patterns and frequently exchange relative positions.
1st Multiple People Tracking in Group Dance Challenge in ECCV 2022
MUSES is a large-scale video dataset designed to spur research on a new task called multi-shot temporal event localization. MUSES has 31,477 event instances over a total of 716 video hours. Its defining characteristic is frequent shot cuts, with an average of 19 shots per instance and 176 shots per video, which induce large intra-instance variations.
YouMVOS is a dataset for multi-shot video object segmentation, consisting of 431K segmentation masks across 200 YouTube videos.