Search Results for author: Jason Kuen

Found 30 papers, 10 papers with code

Learning Adaptive Axis Attentions in Fine-tuning: Beyond Fixed Sparse Attention Patterns

no code implementations • Findings (ACL) 2022 • Zihan Wang, Jiuxiang Gu, Jason Kuen, Handong Zhao, Vlad Morariu, Ruiyi Zhang, Ani Nenkova, Tong Sun, Jingbo Shang

We present a comprehensive study of sparse attention patterns in Transformer models.

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation

no code implementations • 18 Apr 2024 • Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang

Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation, marking a significant milestone towards high-quality open-world entity segmentation in the absence of human-annotated masks.

Segmentation

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis

no code implementations • 6 Nov 2023 • Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian Price, Dan Xu

On the highly competitive ADE20K and COCO benchmarks, our data generation method markedly improves the performance of state-of-the-art segmentation models in semantic segmentation, panoptic segmentation, and instance segmentation.

Image Generation Image Segmentation +3

AIMS: All-Inclusive Multi-Level Segmentation

1 code implementation • 28 May 2023 • Lu Qi, Jason Kuen, Weidong Guo, Jiuxiang Gu, Zhe Lin, Bo Du, Yu Xu, Ming-Hsuan Yang

Despite the progress of image segmentation for accurate visual entity segmentation, completing the diverse requirements of image editing applications for different-level region-of-interest selections remains unsolved.

Image Segmentation Segmentation +1

TopNet: Transformer-based Object Placement Network for Image Compositing

no code implementations • CVPR 2023 • Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

Given a background image and a segmented object, the goal is to train a model to predict plausible placements (location and scale) of the object for compositing.

Object

High Quality Entity Segmentation

no code implementations • ICCV 2023 • Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

Given the high-quality, high-resolution nature of the dataset, we propose CropFormer, which is designed to tackle the intractability of instance-level segmentation on high-resolution images.

Image Segmentation Segmentation +1

SceneComposer: Any-Level Semantic Image Synthesis

no code implementations • CVPR 2023 • Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel

We propose a new framework for conditional image synthesis from semantic layouts of any precision levels, ranging from pure text to a 2D semantic canvas with precise shapes.

Image Generation

High-Quality Entity Segmentation

1 code implementation • 10 Nov 2022 • Lu Qi, Jason Kuen, Weidong Guo, Tiancheng Shen, Jiuxiang Gu, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.

Image Segmentation Segmentation +2

Improving the Reliability for Confidence Estimation

no code implementations • 13 Oct 2022 • Haoxuan Qu, Yanchao Li, Lin Geng Foo, Jason Kuen, Jiuxiang Gu, Jun Liu

Confidence estimation, a task that aims to evaluate the trustworthiness of the model's prediction output during deployment, has received lots of research attention recently, due to its importance for the safe deployment of deep models.

Image Classification Meta-Learning +1

Text-to-Image Generation via Implicit Visual Guidance and Hypernetwork

no code implementations • 17 Aug 2022 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, John Collomosse

We develop an approach for text-to-image generation that embraces additional retrieval images, driven by a combination of implicit visual guidance loss and generative objectives.

Retrieval Text-to-Image Generation

Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

no code implementations • 23 Jul 2022 • Li Xu, Haoxuan Qu, Jason Kuen, Jiuxiang Gu, Jun Liu

Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video.

Graph Generation Meta-Learning +2

Unified Pretraining Framework for Document Understanding

no code implementations • 22 Apr 2022 • Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Nikolaos Barmpalios, Rajiv Jain, Ani Nenkova, Tong Sun

Document intelligence automates the extraction of information from documents and supports many business applications.

Ranked #7 on Document Layout Analysis on PubLayNet val

Document Layout Analysis Document Understanding +1

GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing

no code implementations • 31 Mar 2022 • Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

To move a step further, this paper proposes GALA (Geometry-and-Lighting-Aware), a generic foreground object search method with discriminative modeling on geometry and lighting compatibility for open-world image compositing.

Object

CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation

1 code implementation • 9 Dec 2021 • Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Ming-Hsuan Yang, Jiaya Jia

To improve instance-level detection/segmentation performance, existing self-supervised and semi-supervised methods extract either task-unrelated or task-specific training signals from unlabeled data.

Object Detection +2

UniDoc: Unified Pretraining Framework for Document Understanding

no code implementations • NeurIPS 2021 • Jiuxiang Gu, Jason Kuen, Vlad Morariu, Handong Zhao, Rajiv Jain, Nikolaos Barmpalios, Ani Nenkova, Tong Sun

Document intelligence automates the extraction of information from documents and supports many business applications.

Document Understanding Self-Supervised Learning

High Quality Segmentation for Ultra High-resolution Images

1 code implementation • CVPR 2022 • Tiancheng Shen, Yuechen Zhang, Lu Qi, Jason Kuen, Xingyu Xie, Jianlong Wu, Zhe Lin, Jiaya Jia

Segmenting 4K or 6K ultra high-resolution images requires extra computational consideration.

4k Image Segmentation +3

Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling

1 code implementation • CVPR 2022 • Dat Huynh, Jason Kuen, Zhe Lin, Jiuxiang Gu, Ehsan Elhamifar

To address this, we propose a cross-modal pseudo-labeling framework, which generates training pseudo masks by aligning word semantics in captions with visual features of object masks in images.

Instance Segmentation Semantic Segmentation

Multi-Scale Aligned Distillation for Low-Resolution Detection

2 code implementations • CVPR 2021 • Lu Qi, Jason Kuen, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya Jia

However, this option traditionally hurts detection performance considerably.

Knowledge Distillation Object Detection +1

Open-World Entity Segmentation

2 code implementations • 29 Jul 2021 • Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia

By removing the need for class label prediction, models trained for this task can focus more on improving segmentation quality.

Image Manipulation Image Segmentation +2

SelfDoc: Self-Supervised Document Representation Learning

no code implementations • CVPR 2021 • Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, Hongfu Liu

For downstream usage, we propose a novel modality-adaptive attention mechanism for multimodal feature fusion by adaptively emphasizing language and vision signals.

Representation Learning

Multimodal Contrastive Training for Visual Representation Learning

no code implementations • CVPR 2021 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.

Cross-Modal Retrieval Image Classification +6

Self-Supervised Relationship Probing

no code implementations • NeurIPS 2020 • Jiuxiang Gu, Jason Kuen, Shafiq Joty, Jianfei Cai, Vlad Morariu, Handong Zhao, Tong Sun

Structured representations of images that model visual relationships are beneficial for many vision and vision-language applications.

Contrastive Learning Language Modelling +1

Scaling Object Detection by Transferring Classification Weights

1 code implementation • ICCV 2019 • Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan

Large scale object detection datasets are constantly increasing their size in terms of the number of classes and annotations count.

Classification General Classification +3

Motion-Guided Cascaded Refinement Network for Video Object Segmentation

no code implementations • CVPR 2018 • Ping Hu, Gang Wang, Xiangfei Kong, Jason Kuen, Yap-Peng Tan

Then, the proposed Cascaded Refinement Network (CRN) takes the coarse segmentation as guidance to generate an accurate segmentation of full resolution.

Object Optical Flow Estimation +4

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification

no code implementations • CVPR 2018 • Jianlou Si, Honggang Zhang, Chun-Guang Li, Jason Kuen, Xiangfei Kong, Alex C. Kot, Gang Wang

Typical person re-identification (ReID) methods usually describe each pedestrian with a single feature vector and match them in a task-specific metric space.

Person Re-Identification

Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks

1 code implementation • CVPR 2018 • Jason Kuen, Xiangfei Kong, Zhe Lin, Gang Wang, Jianxiong Yin, Simon See, Yap-Peng Tan

We propose a novel approach for cost-adjustable inference in CNNs - Stochastic Downsampling Point (SDPoint).

Image Classification Object Recognition

DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows

1 code implementation • 17 Nov 2016 • Jason Kuen, Xiangfei Kong, Gang Wang, Yap-Peng Tan

Deluge Networks (DelugeNets) are deep neural networks which efficiently facilitate massive cross-layer information inflows from preceding layers to succeeding layers.

General Classification

Self-taught learning of a deep invariant representation for visual tracking via temporal slowness principle

no code implementations • 14 Apr 2016 • Jason Kuen, Kian Ming Lim, Chin Poo Lee

Visual representation is crucial for a visual tracking method's performance.

Visual Tracking

Recurrent Attentional Networks for Saliency Detection

no code implementations • CVPR 2016 • Jason Kuen, Zhenhua Wang, Gang Wang

Convolutional-deconvolution networks can be adopted to perform end-to-end saliency detection.

Saliency Detection

Recent Advances in Convolutional Neural Networks

no code implementations • 22 Dec 2015 • Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Li Wang, Gang Wang, Jianfei Cai, Tsuhan Chen

In the last few years, deep learning has led to very good performance on a variety of problems, such as visual recognition, speech recognition and natural language processing.

Speech Recognition
