no code implementations • Findings (ACL) 2022 • Zihan Wang, Jiuxiang Gu, Jason Kuen, Handong Zhao, Vlad Morariu, Ruiyi Zhang, Ani Nenkova, Tong Sun, Jingbo Shang
We present a comprehensive study of sparse attention patterns in Transformer models.
no code implementations • 18 Apr 2024 • Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang
Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation, marking a significant milestone towards high-quality open-world entity segmentation in the absence of human-annotated masks.
no code implementations • 6 Nov 2023 • Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian Price, Dan Xu
On the highly competitive ADE20K and COCO benchmarks, our data generation method markedly improves the performance of state-of-the-art segmentation models in semantic segmentation, panoptic segmentation, and instance segmentation.
1 code implementation • 28 May 2023 • Lu Qi, Jason Kuen, Weidong Guo, Jiuxiang Gu, Zhe Lin, Bo Du, Yu Xu, Ming-Hsuan Yang
Despite the progress of image segmentation for accurate visual entity segmentation, completing the diverse requirements of image editing applications for different-level region-of-interest selections remains unsolved.
no code implementations • CVPR 2023 • Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
Given a background image and a segmented object, the goal is to train a model to predict plausible placements (location and scale) of the object for compositing.
no code implementations • ICCV 2023 • Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang
Given the high-quality and -resolution nature of the dataset, we propose CropFormer which is designed to tackle the intractability of instance-level segmentation on high-resolution images.
no code implementations • CVPR 2023 • Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel
We propose a new framework for conditional image synthesis from semantic layouts of any precision levels, ranging from pure text to a 2D semantic canvas with precise shapes.
1 code implementation • 10 Nov 2022 • Lu Qi, Jason Kuen, Weidong Guo, Tiancheng Shen, Jiuxiang Gu, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang
It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.
no code implementations • 13 Oct 2022 • Haoxuan Qu, Yanchao Li, Lin Geng Foo, Jason Kuen, Jiuxiang Gu, Jun Liu
Confidence estimation, a task that aims to evaluate the trustworthiness of the model's prediction output during deployment, has received lots of research attention recently, due to its importance for the safe deployment of deep models.
no code implementations • 17 Aug 2022 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, John Collomosse
We develop an approach for text-to-image generation that embraces additional retrieval images, driven by a combination of implicit visual guidance loss and generative objectives.
no code implementations • 23 Jul 2022 • Li Xu, Haoxuan Qu, Jason Kuen, Jiuxiang Gu, Jun Liu
Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video.
no code implementations • 22 Apr 2022 • Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Nikolaos Barmpalios, Rajiv Jain, Ani Nenkova, Tong Sun
Document intelligence automates the extraction of information from documents and supports many business applications.
Ranked #7 on Document Layout Analysis on PubLayNet val
no code implementations • 31 Mar 2022 • Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
To move a step further, this paper proposes GALA (Geometry-and-Lighting-Aware), a generic foreground object search method with discriminative modeling on geometry and lighting compatibility for open-world image compositing.
1 code implementation • 9 Dec 2021 • Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Ming-Hsuan Yang, Jiaya Jia
To improve instance-level detection/segmentation performance, existing self-supervised and semi-supervised methods extract either task-unrelated or task-specific training signals from unlabeled data.
no code implementations • NeurIPS 2021 • Jiuxiang Gu, Jason Kuen, Vlad Morariu, Handong Zhao, Rajiv Jain, Nikolaos Barmpalios, Ani Nenkova, Tong Sun
Document intelligence automates the extraction of information from documents and supports many business applications.
1 code implementation • CVPR 2022 • Tiancheng Shen, Yuechen Zhang, Lu Qi, Jason Kuen, Xingyu Xie, Jianlong Wu, Zhe Lin, Jiaya Jia
To segment 4K or 6K ultra high-resolution images needs extra computation consideration in image segmentation.
1 code implementation • CVPR 2022 • Dat Huynh, Jason Kuen, Zhe Lin, Jiuxiang Gu, Ehsan Elhamifar
To address this, we propose a cross-modal pseudo-labeling framework, which generates training pseudo masks by aligning word semantics in captions with visual features of object masks in images.
2 code implementations • CVPR 2021 • Lu Qi, Jason Kuen, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya Jia
However, this option traditionally hurts the detection performance much.
2 code implementations • 29 Jul 2021 • Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia
By removing the need of class label prediction, the models trained for such task can focus more on improving segmentation quality.
no code implementations • CVPR 2021 • Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, Hongfu Liu
For downstream usage, we propose a novel modality-adaptive attention mechanism for multimodal feature fusion by adaptively emphasizing language and vision signals.
no code implementations • CVPR 2021 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta
We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.
no code implementations • NeurIPS 2020 • Jiuxiang Gu, Jason Kuen, Shafiq Joty, Jianfei Cai, Vlad Morariu, Handong Zhao, Tong Sun
Structured representations of images that model visual relationships are beneficial for many vision and vision-language applications.
1 code implementation • ICCV 2019 • Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan
Large scale object detection datasets are constantly increasing their size in terms of the number of classes and annotations count.
no code implementations • CVPR 2018 • Ping Hu, Gang Wang, Xiangfei Kong, Jason Kuen, Yap-Peng Tan
Then, the proposed Cascaded Refinement Network(CRN) takes the coarse segmentation as guidance to generate an accurate segmentation of full resolution.
no code implementations • CVPR 2018 • Jianlou Si, Honggang Zhang, Chun-Guang Li, Jason Kuen, Xiangfei Kong, Alex C. Kot, Gang Wang
Typical person re-identification (ReID) methods usually describe each pedestrian with a single feature vector and match them in a task-specific metric space.
1 code implementation • CVPR 2018 • Jason Kuen, Xiangfei Kong, Zhe Lin, Gang Wang, Jianxiong Yin, Simon See, Yap-Peng Tan
We propose a novel approach for cost-adjustable inference in CNNs - Stochastic Downsampling Point (SDPoint).
1 code implementation • 17 Nov 2016 • Jason Kuen, Xiangfei Kong, Gang Wang, Yap-Peng Tan
Deluge Networks (DelugeNets) are deep neural networks which efficiently facilitate massive cross-layer information inflows from preceding layers to succeeding layers.
no code implementations • 14 Apr 2016 • Jason Kuen, Kian Ming Lim, Chin Poo Lee
Visual representation is crucial for a visual tracking method's performances.
no code implementations • CVPR 2016 • Jason Kuen, Zhenhua Wang, Gang Wang
Convolutional-deconvolution networks can be adopted to perform end-to-end saliency detection.
no code implementations • 22 Dec 2015 • Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Li Wang, Gang Wang, Jianfei Cai, Tsuhan Chen
In the last few years, deep learning has led to very good performance on a variety of problems, such as visual recognition, speech recognition and natural language processing.