no code implementations • 9 Feb 2021 • Linwei Ye, Mrigank Rochan, Zhi Liu, Xiaoqin Zhang, Yang Wang
In this paper, we propose a cross-modal self-attention (CMSA) module that leverages fine-grained details of individual words and of the input image or video, effectively capturing long-range dependencies between linguistic and visual features.
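The core idea, attending jointly over word and spatial features so each position can draw on both modalities, can be sketched as follows. This is an illustrative toy in numpy, not the paper's implementation; the identity query/key/value projections and feature shapes are assumptions for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_self_attention(visual, linguistic):
    """Toy sketch: self-attention over concatenated visual and word features.

    visual:     (N_v, d) flattened spatial features
    linguistic: (N_w, d) per-word embeddings
    Returns attended features of shape (N_v + N_w, d).
    """
    # Concatenate both modalities into a single token sequence.
    x = np.concatenate([visual, linguistic], axis=0)  # (N, d)
    d = x.shape[1]
    # Identity projections stand in for learned Q/K/V weights (assumption).
    scores = x @ x.T / np.sqrt(d)                     # (N, N) cross-modal affinities
    attn = softmax(scores, axis=-1)
    # Each output token mixes information from both modalities.
    return attn @ x
```

In the real module the projections are learned convolutions and spatial coordinates are appended to the visual features; this sketch only shows the long-range cross-modal mixing.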
Ranked #5 on Referring Expression Segmentation on J-HMDB (Precision@0.9 metric)
1 code implementation • ECCV 2020 • Mrigank Rochan, Mahesh Kumar Krishna Reddy, Linwei Ye, Yang Wang
In this paper, we propose a simple yet effective framework that learns to adapt highlight detection to a user by exploiting the user's history in the form of highlights that the user has previously created.
2 code implementations • ECCV 2020 • Gongyang Li, Zhi Liu, Linwei Ye, Yang Wang, Haibin Ling
In this paper, we propose a novel Cross-Modal Weighting (CMW) strategy to encourage comprehensive interactions between RGB and depth channels for RGB-D SOD.
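One simple way to realize such cross-modal interaction is to let each modality produce channel weights that modulate the other. The sketch below is an assumption-laden toy (global average pooling plus a sigmoid gate, no learned layers), not the CMW module itself.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_modal_weighting(rgb_feat, depth_feat):
    """Toy sketch: each modality gates the other's channels.

    rgb_feat, depth_feat: (C, H, W) feature maps.
    Returns the pair of re-weighted feature maps.
    """
    # Global average pool each modality to a per-channel descriptor.
    rgb_desc = rgb_feat.mean(axis=(1, 2))      # (C,)
    depth_desc = depth_feat.mean(axis=(1, 2))  # (C,)
    # Cross-wise gating: depth weights RGB channels and vice versa.
    rgb_out = rgb_feat * sigmoid(depth_desc)[:, None, None]
    depth_out = depth_feat * sigmoid(rgb_desc)[:, None, None]
    return rgb_out, depth_out
```

In the paper the weighting is learned and applied at multiple levels of the network; the toy only shows the direction of the interaction (each stream informs the other).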
Ranked #9 on RGB-D Salient Object Detection on NJU2K
no code implementations • 30 Jan 2020 • Linwei Ye, Zhi Liu, Yang Wang
Given an input image and a referring expression in the form of a natural language sentence, the goal is to segment the object of interest in the image referred to by the linguistic query.
1 code implementation • CVPR 2019 • Linwei Ye, Mrigank Rochan, Zhi Liu, Yang Wang
This module controls the information flow of features at different levels.
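A gated fusion of this kind can be sketched as a per-position sigmoid gate that decides how much of each level's features passes through. This is a minimal numpy illustration under assumed shapes, not the module described in the paper (which uses learned layers).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(low_level, high_level):
    """Toy sketch: a gate controls the flow of features between levels.

    low_level, high_level: (C, H, W) feature maps at the same resolution.
    The gate interpolates, per channel and position, between the two levels.
    """
    # In the real module the gate would be produced by learned convolutions;
    # here it is derived directly from the features (assumption).
    gate = sigmoid(low_level + high_level)             # values in (0, 1)
    return gate * low_level + (1.0 - gate) * high_level
```

The output stays within the range spanned by the two inputs, so the gate acts as a soft selector rather than an additive merge.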
Ranked #14 on Referring Video Object Segmentation on Refer-YouTube-VOS (using extra training data)
no code implementations • ECCV 2018 • Mrigank Rochan, Linwei Ye, Yang Wang
This paper addresses the problem of video summarization.
no code implementations • 1 Feb 2018 • Linwei Ye, Zhi Liu, Yang Wang
Models based on deep convolutional neural networks (CNNs) have significantly improved the performance of semantic segmentation.