1 code implementation • CVPR 2022 • Minghang Zheng, Yanjie Huang, Qingchao Chen, Yuxin Peng, Yang Liu
Moreover, existing methods train their models to distinguish positive visual-language pairs from negative ones randomly collected from other videos, ignoring the highly confusing video segments within the same video.
Ranked #7 on Temporal Sentence Grounding on Charades-STA