no code implementations • ECCV 2020 • Wenqing Chu, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji
Each connection extracts the style feature of the latent feature maps in the encoder and then performs a residual learning based mapping function in the global information space guided by the target attributes.
no code implementations • ECCV 2020 • Yunhang Shen, Rongrong Ji, Yan Wang, Zhiwei Chen, Feng Zheng, Feiyue Huang, Yunsheng Wu
Weakly supervised object detection (WSOD) has attracted extensive research attention due to its great flexibility of exploiting large-scale image-level annotation for detector training.
1 code implementation • ECCV 2020 • Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian
On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.
no code implementations • 2 May 2024 • Yiwei Ma, Zhekai Lin, Jiayi Ji, Yijun Fan, Xiaoshuai Sun, Rongrong Ji
Recent advancements in automatic 3D avatar generation guided by text have made significant progress.
no code implementations • 1 May 2024 • Yian Zhao, Kehan Li, Zesen Cheng, Pengchong Qiao, Xiawu Zheng, Rongrong Ji, Chang Liu, Li Yuan, Jie Chen
In this work, we introduce Granularity-Controllable Interactive Segmentation (GraCo), a novel approach that allows precise control of prediction granularity by introducing additional parameters to input.
1 code implementation • 26 Apr 2024 • Ziyue Zhang, Mingbao Lin, Rongrong Ji
We introduce ObjectAdd, a training-free diffusion modification method to add user-expected objects into user-specified area.
no code implementations • 24 Apr 2024 • Timin Gao, Peixian Chen, Mengdan Zhang, Chaoyou Fu, Yunhang Shen, Yan Zhang, Shengchuan Zhang, Xiawu Zheng, Xing Sun, Liujuan Cao, Rongrong Ji
This paper delves into the realm of multimodal CoT to solve intricate visual reasoning tasks with multimodal large language models(MLLMs) and their cognitive capability.
1 code implementation • 23 Apr 2024 • Mingbao Lin, Zhihang Lin, Wengyi Zhan, Liujuan Cao, Rongrong Ji
Transforming large pre-trained low-resolution diffusion models to cater to higher-resolution demands, i. e., diffusion extrapolation, significantly improves diffusion adaptability.
no code implementations • 23 Apr 2024 • Wensheng Pan, Timin Gao, Yan Zhang, Runze Hu, Xiawu Zheng, Enwei Zhang, Yuting Gao, Yutao Liu, Yunhang Shen, Ke Li, Shengchuan Zhang, Liujuan Cao, Rongrong Ji
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
no code implementations • 22 Apr 2024 • Chi Huang, Xinyang Li, Shengchuan Zhang, Liujuan Cao, Rongrong Ji
As a preliminary work, NeRF-Det unifies the tasks of novel view synthesis and 3D perception, demonstrating that perceptual tasks can benefit from novel view synthesis methods like NeRF, significantly improving the performance of indoor multi-view 3D object detection.
no code implementations • 17 Apr 2024 • Yongdong Luo, Haojia Lin, Xiawu Zheng, Yigeng Jiang, Fei Chao, Jie Hu, Guannan Jiang, Songan Zhang, Rongrong Ji
3D Visual Grounding (3DVG) and 3D Dense Captioning (3DDC) are two crucial tasks in various 3D applications, which require both shared and complementary information in localization and visual-language relationships.
1 code implementation • 31 Mar 2024 • Xiaorui Huang, Gen Luo, Chaoyang Zhu, Bo Tong, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji
Recently, Segment Anything Model (SAM) has become a research hotspot in the fields of multimedia and computer vision, which exhibits powerful yet versatile capabilities on various (un) conditional image segmentation tasks.
1 code implementation • 31 Mar 2024 • Lirui Zhao, Yue Yang, Kaipeng Zhang, Wenqi Shao, Yuxin Zhang, Yu Qiao, Ping Luo, Rongrong Ji
Text-to-image (T2I) generative models have attracted significant attention and found extensive applications within and beyond academic research.
1 code implementation • 27 Mar 2024 • Zhongxi Chen, Ke Sun, Ziyin Zhou, Xianming Lin, Xiaoshuai Sun, Liujuan Cao, Rongrong Ji
The rapid progress in deep learning has given rise to hyper-realistic facial forgery methods, leading to concerns related to misinformation and security risks.
no code implementations • 22 Mar 2024 • Qiong Wu, Weihao Ye, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji
In this paper, we propose a novel parameter and computation efficient tuning method for Multi-modal Large Language Models (MLLMs), termed Efficient Attention Skipping (EAS).
no code implementations • 19 Mar 2024 • Jianlong Hu, Xu Chen, Zhenye Gan, Jinlong Peng, Shengchuan Zhang, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Liujuan Cao, Rongrong Ji
To address the challenge of real-world anomaly detection, we propose a new framework named Dual Memory bank enhanced representation learning for Anomaly Detection (DMAD).
1 code implementation • 19 Mar 2024 • Yuexiao Ma, Huixia Li, Xiawu Zheng, Feng Ling, Xuefeng Xiao, Rui Wang, Shilei Wen, Fei Chao, Rongrong Ji
Among these techniques, Post-Training Quantization (PTQ) has emerged as a subject of considerable interest due to its noteworthy compression efficiency and cost-effectiveness in the context of training.
no code implementations • 15 Mar 2024 • Jinxia Xie, Bineng Zhong, Zhiyi Mo, Shengping Zhang, Liangtao Shi, Shuxiang Song, Rongrong Ji
Firstly, we introduce a set of learnable and autoregressive queries to capture the instantaneous target appearance changes in a sliding window fashion.
1 code implementation • 11 Mar 2024 • Jinlu Zhang, Yiyi Zhou, Qiancheng Zheng, Xiaoxiong Du, Gen Luo, Jun Peng, Xiaoshuai Sun, Rongrong Ji
Text-to-3D-aware face (T3D Face) generation and manipulation is an emerging research hot spot in machine learning, which still suffers from low efficiency and poor quality.
no code implementations • 10 Mar 2024 • Xiaobin Hu, Xu Peng, Donghao Luo, Xiaozhong Ji, Jinlong Peng, Zhengkai Jiang, Jiangning Zhang, Taisong Jin, Chengjie Wang, Rongrong Ji
Our DiffuMatting shows several potential applications (e. g., matting-data generator, community-friendly art design and controllable generation).
1 code implementation • 5 Mar 2024 • Gen Luo, Yiyi Zhou, Yuxin Zhang, Xiawu Zheng, Xiaoshuai Sun, Rongrong Ji
Contrary to previous works, we study this problem from the perspective of image resolution, and reveal that a combination of low- and high-resolution visual features can effectively mitigate this shortcoming.
Ranked #61 on Visual Question Answering on MM-Vet
no code implementations • 23 Feb 2024 • Hui Lin, Zhiheng Ma, Rongrong Ji, YaoWei Wang, Zhou Su, Xiaopeng Hong, Deyu Meng
This paper focuses on semi-supervised crowd counting, where only a small portion of the training data are labeled.
1 code implementation • 19 Feb 2024 • Song Guo, Fan Wu, Lei Zhang, Xiawu Zheng, Shengchuan Zhang, Fei Chao, Yiyu Shi, Rongrong Ji
For instance, on the Wikitext2 dataset with LlamaV1-7B at 70% sparsity, our proposed EBFT achieves a perplexity of 16. 88, surpassing the state-of-the-art DSnoT with a perplexity of 75. 14.
1 code implementation • 24 Jan 2024 • Yimin Xu, Nanxi Gao, Zhongyun Shan, Fei Chao, Rongrong Ji
In contrast to traditional image restoration methods, all-in-one image restoration techniques are gaining increased attention for their ability to restore images affected by diverse and unknown corruption types and levels.
no code implementations • 22 Jan 2024 • Xudong Li, Jingyuan Zheng, Runze Hu, Yan Zhang, Ke Li, Yunhang Shen, Xiawu Zheng, Yutao Liu, Shengchuan Zhang, Pingyang Dai, Rongrong Ji
Blind Image Quality Assessment (BIQA) aims to evaluate image quality in line with human perception, without reference benchmarks.
1 code implementation • 18 Jan 2024 • Zesen Cheng, Kehan Li, Hao Li, Peng Jin, Chang Liu, Xiawu Zheng, Rongrong Ji, Jie Chen
To mold instance queries to follow Brownian bridge and accomplish alignment with class texts, we design Bridge-Text Alignment (BTA) to learn discriminative bridge-level representations of instances via contrastive objectives.
no code implementations • 18 Jan 2024 • Yunpeng Gong, Zhun Zhong, Zhiming Luo, Yansong Qu, Rongrong Ji, Min Jiang
For instance, infrared images are typically grayscale, unlike visible images that contain color information.
1 code implementation • 5 Jan 2024 • Yunshan Zhong, Yuyao Zhou, Yuxin Zhang, Fei Chao, Rongrong Ji
The proposed method, referred to as Unpaired Demoireing (UnDeM), synthesizes pseudo moire images from unpaired datasets, generating pairs with clean images for training demoireing models.
1 code implementation • 19 Dec 2023 • Sihan Liu, Yiwei Ma, Xiaoqing Zhang, Haowei Wang, Jiayi Ji, Xiaoshuai Sun, Rongrong Ji
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing, delineating specific regions in aerial images as described by textual queries.
no code implementations • 11 Dec 2023 • Xudong Li, Timin Gao, Xiawu Zheng, Runze Hu, Jingyuan Zheng, Yunhang Shen, Ke Li, Yutao Liu, Pingyang Dai, Yan Zhang, Rongrong Ji
The current state-of-the-art No-Reference Image Quality Assessment (NR-IQA) methods typically use feature extraction in upstream backbone networks, which assumes that all extracted features are relevant.
no code implementations • 11 Dec 2023 • Xu Peng, Junwei Zhu, Boyuan Jiang, Ying Tai, Donghao Luo, Jiangning Zhang, Wei Lin, Taisong Jin, Chengjie Wang, Rongrong Ji
Moreover, these methods often grapple with identity distortion and limited expression diversity.
1 code implementation • 9 Dec 2023 • Lirui Zhao, Yuxin Zhang, Mingbao Lin, Fei Chao, Rongrong Ji
The poor cross-architecture generalization of dataset distillation greatly weakens its practical significance.
2 code implementations • 4 Dec 2023 • Yunhang Shen, Chaoyou Fu, Peixian Chen, Mengdan Zhang, Ke Li, Xing Sun, Yunsheng Wu, Shaohui Lin, Rongrong Ji
However, predominant paradigms, driven by casting instance-level tasks as an object-word alignment, bring heavy cross-modality interaction, which is not effective in prompting object detection and visual grounding.
no code implementations • 1 Dec 2023 • Xudong Li, Jingyuan Zheng, Xiawu Zheng, Runze Hu, Enwei Zhang, Yuting Gao, Yunhang Shen, Ke Li, Yutao Liu, Pingyang Dai, Yan Zhang, Rongrong Ji
Concretely, by innovatively introducing a novel feature distillation method in IQA, we propose a new framework to learn comparative knowledge from non-aligned reference images.
1 code implementation • 30 Nov 2023 • Yiwei Ma, Yijun Fan, Jiayi Ji, Haowei Wang, Xiaoshuai Sun, Guannan Jiang, Annan Shu, Rongrong Ji
Nevertheless, a substantial domain gap exists between 2D images and 3D assets, primarily attributed to variations in camera-related attributes and the exclusive presence of foreground objects.
1 code implementation • 16 Nov 2023 • Yunshan Zhong, Jiawei Hu, Mingbao Lin, Mengzhao Chen, Rongrong Ji
Albeit the scalable performance of vision transformers (ViTs), the dense computational costs (training & inference) undermine their position in industrial applications.
1 code implementation • 27 Oct 2023 • Danni Yang, Jiayi Ji, Xiaoshuai Sun, Haowei Wang, Yinan Li, Yiwei Ma, Rongrong Ji
Remarkably, our SS-PNG-NW+ outperforms fully-supervised models with only 30% and 50% supervision data, exceeding their performance by 0. 8% and 1. 1% respectively.
1 code implementation • 17 Oct 2023 • Haowei Wang, Jiayi Ji, Tianyu Guo, Yilong Yang, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji
To address this, we introduce two cascading modules based on the barycenter of the mask, which are Coordinate Guided Aggregation (CGA) and Barycenter Driven Localization (BDL), responsible for segmentation and detection, respectively.
1 code implementation • 14 Oct 2023 • Jiayi Ji, Haowei Wang, Changli Wu, Yiwei Ma, Xiaoshuai Sun, Rongrong Ji
The rising importance of 3D understanding, pivotal in computer vision, autonomous driving, and robotics, is evident.
1 code implementation • 13 Oct 2023 • Yuxin Zhang, Lirui Zhao, Mingbao Lin, Yunyun Sun, Yiwu Yao, Xingjia Han, Jared Tanner, Shiwei Liu, Rongrong Ji
Inspired by the Dynamic Sparse Training, DSnoT minimizes the reconstruction error between the dense and sparse LLMs, in the fashion of performing iterative weight pruning-and-growing on top of sparse LLMs.
1 code implementation • ICCV 2023 • Lijiang Li, Huixia Li, Xiawu Zheng, Jie Wu, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan, Fei Chao, Rongrong Ji
Therefore, we propose to search the optimal time steps sequence and compressed model architecture in a unified framework to achieve effective image generation for diffusion models without any further training.
1 code implementation • 27 Aug 2023 • Yaozong Zheng, Bineng Zhong, Qihua Liang, Guorong Li, Rongrong Ji, Xianxian Li
In this paper, we present a simple, flexible and effective vision-language (VL) tracking pipeline, termed \textbf{MMTrack}, which casts VL tracking as a token generation task.
no code implementations • 24 Aug 2023 • Huafeng Kuang, Jie Wu, Xiawu Zheng, Ming Li, Xuefeng Xiao, Rui Wang, Min Zheng, Rongrong Ji
Furthermore, DLIP succeeds in retaining more than 95% of the performance with 22. 4% parameters and 24. 8% FLOPs compared to the teacher model and accelerates inference speed by 2. 7x.
1 code implementation • 23 Aug 2023 • Haojia Lin, Yongdong Luo, Xiawu Zheng, Lijiang Li, Fei Chao, Taisong Jin, Donghao Luo, Yan Wang, Liujuan Cao, Rongrong Ji
This elaborate design enables 3DRefTR to achieve both well-performing 3DRES and 3DREC capacities with only a 6% additional latency compared to the original 3DREC model.
1 code implementation • 22 Aug 2023 • Tao Chen, Ze Lin, Hui Li, Jiayi Ji, Yiyi Zhou, Guanbin Li, Rongrong Ji
Furthermore, we model product attributes based on both text and image modalities so that multi-modal product characteristics can be manifested in the generated summaries.
no code implementations • 20 Aug 2023 • Shuman Fang, Zhiwen Lin, Ke Yan, Jie Li, Xianming Lin, Rongrong Ji
However, these methods ignore the relationship among humans, objects, and interactions: 1) human features are more contributive than object ones to interaction prediction; 2) interactive information disturbs the detection of objects but helps human detection.
no code implementations • 11 Aug 2023 • Ke Sun, Shen Chen, Taiping Yao, Xiaoshuai Sun, Shouhong Ding, Rongrong Ji
In this paper, we focus on a novel and challenging problem: Continual Face Forgery Detection (CFFD), which aims to efficiently learn from new forgery attacks without forgetting previous ones.
1 code implementation • ICCV 2023 • Jie Hu, Chen Chen, Liujuan Cao, Shengchuan Zhang, Annan Shu, Guannan Jiang, Rongrong Ji
Through extensive experiments conducted on the COCO and Cityscapes datasets, we demonstrate that PAIS is a promising framework for semi-supervised instance segmentation, particularly in cases where labeled data is severely limited.
1 code implementation • 6 Aug 2023 • Haowei Wang, Jiji Tang, Jiayi Ji, Xiaoshuai Sun, Rongsheng Zhang, Yiwei Ma, Minda Zhao, Lincheng Li, Zeng Zhao, Tangjie Lv, Rongrong Ji
Insufficient synergy neglects the idea that a robust 3D representation should align with the joint vision-language space, rather than independently aligning with each modality.
no code implementations • 4 Aug 2023 • Shuman Fang, Shuai Liu, Jie Li, Guannan Jiang, Xianming Lin, Rongrong Ji
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects, which plays a curtail role in high-level semantic understanding tasks.
no code implementations • 31 Jul 2023 • Ke Sun, Shen Chen, Taiping Yao, Haozhe Yang, Xiaoshuai Sun, Shouhong Ding, Rongrong Ji
To address this issues, in this paper, we propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
1 code implementation • 30 Jun 2023 • Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, DaCheng Tao
Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when adding a perturbation to the weight.
no code implementations • 27 Jun 2023 • Qiong Wu, Shubin Huang, Yiyi Zhou, Pingyang Dai, Annan Shu, Guannan Jiang, Rongrong Ji
Prompt tuning is a parameter-efficient way to deploy large-scale pre-trained models to downstream tasks by adding task-specific tokens.
3 code implementations • 23 Jun 2023 • Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, Yunsheng Wu, Rongrong Ji
Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image.
no code implementations • 9 Jun 2023 • Yuxin Zhang, Mingbao Lin, Yunshan Zhong, Mengzhao Chen, Fei Chao, Rongrong Ji
This paper presents a Spatial Re-parameterization (SpRe) method for the N:M sparsity in CNNs.
1 code implementation • CVPR 2023 • Zhenglin Zhou, Huaxia Li, Hong Liu, Nanyang Wang, Gang Yu, Rongrong Ji
To solve this problem, we propose a Self-adapTive Ambiguity Reduction (STAR) loss by exploiting the properties of semantic ambiguity.
Ranked #1 on Face Alignment on 300W
1 code implementation • 1 Jun 2023 • Shubin Huang, Qiong Wu, Yiyi Zhou, WeiJie Chen, Rongsheng Zhang, Xiaoshuai Sun, Rongrong Ji
In addition, we also experiment DVP with the recently popular adapter approach to keep the most parameters of PLMs intact when adapting to VL tasks, helping PLMs achieve a quick shift between single- and multi-modal tasks.
1 code implementation • ICCV 2023 • Mengzhao Chen, Wenqi Shao, Peng Xu, Mingbao Lin, Kaipeng Zhang, Fei Chao, Rongrong Ji, Yu Qiao, Ping Luo
Token compression aims to speed up large-scale vision transformers (e. g. ViTs) by pruning (dropping) or merging tokens.
Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)
1 code implementation • 29 May 2023 • Zhongxi Chen, Ke Sun, Xianming Lin, Rongrong Ji
Due to the stochastic sampling process of diffusion, our model is capable of sampling multiple possible predictions from the mask distribution, avoiding the problem of overconfident point estimation.
1 code implementation • NeurIPS 2023 • Gen Luo, Yiyi Zhou, Tianhe Ren, Shengxin Chen, Xiaoshuai Sun, Rongrong Ji
To validate MMA, we apply it to a recent LLM called LLaMA and term this formed large vision-language instructed model as LaVIN.
1 code implementation • 14 May 2023 • Yunshan Zhong, Mingbao Lin, Yuyao Zhou, Mengzhao Chen, Yuxin Zhang, Fei Chao, Rongrong Ji
However, in this paper, we investigate existing methods and observe a significant accumulation of quantization errors caused by frequent bit-width switching of weights and activations, leading to limited performance.
1 code implementation • 10 May 2023 • Yunshan Zhong, Mingbao Lin, Jingjing Xie, Yuxin Zhang, Fei Chao, Rongrong Ji
Compared to the common iterative exhaustive search algorithm, our strategy avoids the enumeration of all possible combinations in the universal set, reducing the time complexity from exponential to linear.
1 code implementation • ICCV 2023 • You Huang, Hao Yang, Ke Sun, Shengchuan Zhang, Liujuan Cao, Guannan Jiang, Rongrong Ji
Interactive image segmentation enables annotators to efficiently perform pixel-level annotation for segmentation tasks.
1 code implementation • 29 Mar 2023 • Xingbin Liu, Huafeng Kuang, Hong Liu, Xianming Lin, Yongjian Wu, Rongrong Ji
Deep neural networks have been applied in many computer vision tasks and achieved state-of-the-art performance.
1 code implementation • ICCV 2023 • Yiwei Ma, Xiaioqing Zhang, Xiaoshuai Sun, Jiayi Ji, Haowei Wang, Guannan Jiang, Weilin Zhuang, Rongrong Ji
Text-driven 3D stylization is a complex and crucial task in the fields of computer vision (CV) and computer graphics (CG), aimed at transforming a bare mesh to fit a target text.
1 code implementation • 27 Mar 2023 • Xingbin Liu, Huafeng Kuang, Xianming Lin, Yongjian Wu, Rongrong Ji
By revisiting the previous methods, we find different adversarial training methods have distinct robustness for sample instances.
2 code implementations • CVPR 2023 • Jie Hu, Linyan Huang, Tianhe Ren, Shengchuan Zhang, Rongrong Ji, Liujuan Cao
To reduce the computational overhead, we design a feature pyramid aggregator for the feature map extraction, and a separable dynamic decoder for the panoptic kernel generation.
1 code implementation • CVPR 2023 • Yuexiao Ma, Huixia Li, Xiawu Zheng, Xuefeng Xiao, Rui Wang, Shilei Wen, Xin Pan, Fei Chao, Rongrong Ji
In particular, we first formulate the oscillation in PTQ and prove the problem is caused by the difference in module capacity.
1 code implementation • CVPR 2022 • Peng Mi, Jianghang Lin, Yiyi Zhou, Yunhang Shen, Gen Luo, Xiaoshuai Sun, Liujuan Cao, Rongrong Fu, Qiang Xu, Rongrong Ji
In this paper, we study teacher-student learning from the perspective of data initialization and propose a novel algorithm called Active Teacher(Source code are available at: \url{https://github. com/HunterJ-Lin/ActiveTeacher}) for semi-supervised object detection (SSOD).
1 code implementation • CVPR 2023 • Suhang Ye, Yingyi Zhang, Jie Hu, Liujuan Cao, Shengchuan Zhang, Lei Shen, Jun Wang, Shouhong Ding, Rongrong Ji
Specifically, DistilPose maximizes the transfer of knowledge from the teacher model (heatmap-based) to the student model (regression-based) through Token-distilling Encoder (TDE) and Simulated Heatmaps.
1 code implementation • 22 Feb 2023 • Gen Luo, Yiyi Zhou, Lei Jin, Xiaoshuai Sun, Rongrong Ji
In addition to this challenge, we also reveal two key issues in one-stage SSOD, which are low-quality pseudo-labeling and multi-task optimization conflict, respectively.
1 code implementation • 16 Feb 2023 • Gen Luo, Minglang Huang, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Zhiyu Wang, Rongrong Ji
Experimental results show the superior performance and efficiency of RepAdapter than the state-of-the-art PETL methods.
1 code implementation • 13 Feb 2023 • Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Rongrong Ji
In this paper, we study the local visual modeling with grid features for image captioning, which is critical for generating accurate and detailed captions.
1 code implementation • 13 Feb 2023 • Yuxin Zhang, Yiting Luo, Mingbao Lin, Yunshan Zhong, Jingjing Xie, Fei Chao, Rongrong Ji
We focus on addressing the dense backward propagation issue for training efficiency of N:M fine-grained sparsity that preserves at most N out of M consecutive weights and achieves practical speedups supported by the N:M sparse tensor core.
1 code implementation • 4 Feb 2023 • Yuxin Zhang, Mingbao Lin, Xunchao Li, Han Liu, Guozhi Wang, Fei Chao, Shuai Ren, Yafei Wen, Xiaoxin Chen, Rongrong Ji
In this paper, we launch the first study on accelerating demoireing networks and propose a dynamic demoireing acceleration method (DDA) towards a real-time deployment on mobile devices.
no code implementations • 3 Feb 2023 • Lei Tan, Pingyang Dai, Qixiang Ye, Mingliang Xu, Yongjian Wu, Rongrong Ji
Based on the observation and analysis of SA-Softmax, we modify the SA-Softmax with the Feature Mask and Absolute-Similarity Term to alleviate the ambiguous optimization during model training.
no code implementations • 2 Feb 2023 • Lei Tan, Yukang Zhang, ShengMei Shen, Yan Wang, Pingyang Dai, Xianming Lin, Yongjian Wu, Rongrong Ji
Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces a main challenge of the modality discrepancy.
no code implementations • 29 Jan 2023 • Qiong Wu, Jiahan Li, Pingyang Dai, Qixiang Ye, Liujuan Cao, Yongjian Wu, Rongrong Ji
The knowledge transfer between two networks is based on an asymmetric mutual learning manner.
no code implementations • CVPR 2023 • Lei Jin, Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Annan Shu, Rongrong Ji
Based on RefCLIP, we further propose the first model-agnostic weakly supervised training scheme for existing REC models, where RefCLIP acts as a mature teacher to generate pseudo-labels for teaching common REC models.
1 code implementation • ICCV 2023 • Song Guo, Lei Zhang, Xiawu Zheng, Yan Wang, Yuchao Li, Fei Chao, Chenglin Wu, Shengchuan Zhang, Rongrong Ji
In this paper, we try to solve this problem by introducing a principled and unified framework based on Information Bottleneck (IB) theory, which further guides us to an automatic pruning approach.
no code implementations • CVPR 2023 • Jiamu Sun, Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Zhiyu Wang, Rongrong Ji
In this paper, we present the first attempt of semi-supervised learning for REC and propose a strong baseline method called RefTeacher.
no code implementations • ICCV 2023 • Zhiwei Chen, Jinren Ding, Liujuan Cao, Yunhang Shen, Shengchuan Zhang, Guannan Jiang, Rongrong Ji
Weakly supervised object localization (WSOL) aims to localize objects based on only image-level labels as supervision.
1 code implementation • CVPR 2023 • Tie Hu, Mingbao Lin, Lizhou You, Fei Chao, Rongrong Ji
In contrast to conventional pixel-to-pixel match methods in feature map distillation, our DCD utilizes teacher discriminator as a transformation to drive intermediate results of the student generator to be perceptually close to corresponding outputs of the teacher generator.
1 code implementation • ICCV 2023 • Mengzhao Chen, Mingbao Lin, Zhihang Lin, Yuxin Zhang, Fei Chao, Rongrong Ji
Due to the subtle designs of the self-motivated paradigm, our SMMix is significant in its smaller training overhead and better performance than other CutMix variants.
1 code implementation • 21 Dec 2022 • Lizhou You, Mingbao Lin, Tie Hu, Fei Chao, Rongrong Ji
This paper proposes a content relationship distillation (CRD) to tackle the over-parameterized generative adversarial networks (GANs) for the serviceability in cutting-edge devices.
1 code implementation • 8 Dec 2022 • Yunshan Zhong, Lizhou You, Yuxin Zhang, Fei Chao, Yonghong Tian, Rongrong Ji
Specifically, the encoder extracts the shadow feature of a region identity which is then paired with another region identity to serve as the generator input to synthesize a pseudo image.
no code implementations • 6 Dec 2022 • Zequan Xu, Lianyun Li, Hui Li, Qihang Sun, Shaofeng Hu, Rongrong Ji
This paper illustrates our BMA detection system SGRL (Self-supervised Graph Representation Learning) used in WeChat, a representative MMMA with over a billion users.
1 code implementation • CVPR 2023 • Haojia Lin, Xiawu Zheng, Lijiang Li, Fei Chao, Shanshan Wang, Yan Wang, Yonghong Tian, Rongrong Ji
However, the lack of a unified framework to interpret those networks makes any systematic comparison, contrast, or analysis challenging, and practically limits healthy development of the field.
Ranked #2 on 3D Semantic Segmentation on OpenTrench3D
1 code implementation • 12 Nov 2022 • Yunshan Zhong, Gongrui Nan, Yuxin Zhang, Fei Chao, Rongrong Ji
In QAT, the contemporary experience is that all quantized weights are updated for an entire training process.
1 code implementation • 11 Oct 2022 • Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, DaCheng Tao
One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized change of training loss when adding a perturbation to the weight.
1 code implementation • 25 Sep 2022 • Dongli Tan, Jiang-Jiang Liu, Xingyu Chen, Chao Chen, Ruixin Zhang, Yunhang Shen, Shouhong Ding, Rongrong Ji
In this paper, we propose an efficient structure named Efficient Correspondence Transformer (ECO-TR) by finding correspondences in a coarse-to-fine manner, which significantly improves the efficiency of functional correspondence model.
1 code implementation • 8 Sep 2022 • Xingbin Liu, Jinghao Zhou, Tao Kong, Xianming Lin, Rongrong Ji
Masked autoencoders have become popular training paradigms for self-supervised visual representation learning.
1 code implementation • 27 Aug 2022 • Hong Yang, Gongrui Nan, Mingbao Lin, Fei Chao, Yunhang Shen, Ke Li, Rongrong Ji
Finally, the LSA modules are further developed to fully use the prior information in non-shadow regions to cleanse the shadow regions.
no code implementations • 21 Aug 2022 • Qiong Wu, Jiaer Xia, Pingyang Dai, Yiyi Zhou, Yongjian Wu, Rongrong Ji
Visible-infrared person re-identification (VI-ReID) is a task of matching the same individuals across the visible and infrared modalities.
1 code implementation • 19 Jul 2022 • Lei Tan, Pingyang Dai, Rongrong Ji, Yongjian Wu
Although person re-identification has achieved an impressive improvement in recent years, the common occlusion case caused by different obstacles is still an unsettled issue in real application scenarios.
1 code implementation • 19 Jul 2022 • Xudong Mao, Liujuan Cao, Aurele T. Gnanha, Zhenguo Yang, Qing Li, Rongrong Ji
The recently proposed pivotal tuning model makes significant progress towards reconstruction and editability, by using a two-step approach that first inverts the input image into a latent code, called pivot code, and then alters the generator so that the input image can be accurately mapped into the pivot code.
1 code implementation • CVPR 2023 • Jingjia Huang, Yinan Li, Jiashi Feng, Xinglong Wu, Xiaoshuai Sun, Rongrong Ji
We then introduce \textbf{Clover}\textemdash a Correlated Video-Language pre-training method\textemdash towards a universal Video-Language model for solving multiple video understanding tasks with neither performance nor efficiency compromise.
Ranked #1 on Video Question Answering on LSMDC-FiB
1 code implementation • 15 Jul 2022 • Jiazhen Ji, Huan Wang, Yuge Huang, Jiaxiang Wu, Xingkun Xu, Shouhong Ding, Shengchuan Zhang, Liujuan Cao, Rongrong Ji
This paper proposes a privacy-preserving face recognition method using differential privacy in the frequency domain.
1 code implementation • 15 Jul 2022 • Yiwei Ma, Guohai Xu, Xiaoshuai Sun, Ming Yan, Ji Zhang, Rongrong Ji
However, cross-grained contrast, which is the contrast between coarse-grained representations and fine-grained representations, has rarely been explored in prior research.
Ranked #12 on Video Retrieval on MSVD
1 code implementation • 14 Jun 2022 • Yuxin Zhang, Mingbao Lin, Zhihang Lin, Yiting Luo, Ke Li, Fei Chao, Yongjian Wu, Rongrong Ji
In this paper, we show that the N:M learning can be naturally characterized as a combinatorial problem which searches for the best combination candidate within a finite collection.
1 code implementation • 23 May 2022 • Mingbao Lin, Mengzhao Chen, Yuxin Zhang, Chunhua Shen, Rongrong Ji, Liujuan Cao
Experimental results on ImageNet demonstrate that our SuperViT can considerably reduce the computational costs of ViT models with even performance increase.
2 code implementations • 10 May 2022 • Yimin Xu, Mingbao Lin, Hong Yang, Fei Chao, Rongrong Ji
Inspired by the fact that the color mapping of the non-shadow region is easier to learn, our SADC processes the non-shadow region with a lightweight convolution module in a computationally cheap manner and recovers the shadow region with a more complicated convolution module to ensure the quality of image reconstruction.
no code implementations • 29 Apr 2022 • Yuting Gao, Jinfeng Liu, Zihan Xu, Jun Zhang, Ke Li, Rongrong Ji, Chunhua Shen
Large-scale vision-language pre-training has achieved promising results on downstream tasks.
no code implementations • 28 Apr 2022 • Shaohui Lin, Bo Ji, Rongrong Ji, Angela Yao
Multi-exit architectures consist of a backbone and branch classifiers that offer shortened inference pathways to reduce the run-time of deep neural networks.
1 code implementation • 17 Apr 2022 • Gen Luo, Yiyi Zhou, Jiamu Sun, Xiaoshuai Sun, Rongrong Ji
But the most encouraging finding is that with much less training overhead and parameters, SimREC can still achieve better performance than a set of large-scale pre-trained models, e. g., UNITER and VILLA, portraying the special role of REC in existing V&L research.
1 code implementation • 16 Apr 2022 • Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Yan Wang, Liujuan Cao, Yongjian Wu, Feiyue Huang, Rongrong Ji
Despite the exciting performance, Transformer is criticized for its excessive parameters and computation cost.
1 code implementation • 2 Apr 2022 • Jing He, Yiyi Zhou, Qi Zhang, Jun Peng, Yunhang Shen, Xiaoshuai Sun, Chao Chen, Rongrong Ji
Pixel synthesis is a promising research paradigm for image generation, which can well exploit pixel-wise prior knowledge for generation.
3 code implementations • 30 Mar 2022 • Chaoyang Zhu, Yiyi Zhou, Yunhang Shen, Gen Luo, Xingjia Pan, Mingbao Lin, Chao Chen, Liujuan Cao, Xiaoshuai Sun, Rongrong Ji
In this paper, we propose a simple yet universal network termed SeqTR for visual grounding tasks, e. g., phrase localization, referring expression comprehension (REC) and segmentation (RES).
1 code implementation • CVPR 2022 • Qinqin Zhou, Kekai Sheng, Xiawu Zheng, Ke Li, Xing Sun, Yonghong Tian, Jie Chen, Rongrong Ji
Recently, Vision Transformer (ViT) has achieved remarkable success in several computer vision tasks.
1 code implementation • 21 Mar 2022 • Bohong Chen, Mingbao Lin, Kekai Sheng, Mengdan Zhang, Peixian Chen, Ke Li, Liujuan Cao, Rongrong Ji
To that effect, we construct an Edge-to-PSNR lookup table that maps the edge score of an image patch to the PSNR performance for each subnet, together with a set of computation costs for the subnets.
no code implementations • 13 Mar 2022 • Chengpeng Dai, Fuhai Chen, Xiaoshuai Sun, Rongrong Ji, Qixiang Ye, Yongjian Wu
Recently, automatic video captioning has attracted increasing attention, where the core challenge lies in capturing the key semantic items, like objects and actions as well as their spatial-temporal correlations from the redundant frames and semantic content.
no code implementations • 12 Mar 2022 • Fuhai Chen, Rongrong Ji, Chengpeng Dai, Xuri Ge, Shengchuang Zhang, Xiaojing Ma, Yue Gao
Echocardiography is widely used to clinical practice for diagnosis and treatment, e. g., on the common congenital heart defects.
1 code implementation • 8 Mar 2022 • Mengzhao Chen, Mingbao Lin, Ke Li, Yunhang Shen, Yongjian Wu, Fei Chao, Rongrong Ji
Our proposed CF-ViT is motivated by two important observations in modern ViT models: (1) The coarse-grained patch splitting can locate informative regions of an input image.
1 code implementation • 8 Mar 2022 • Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji
However, these methods suffer from severe performance degradation when quantizing the SR models to ultra-low precision (e. g., 2-bit and 3-bit) with the low-cost layer-wise quantizer.
1 code implementation • CVPR 2022 • Hui Lin, Zhiheng Ma, Rongrong Ji, YaoWei Wang, Xiaopeng Hong
Secondly, we design the Local Attention Regularization to supervise the training of LRA by minimizing the deviation among the attention for different feature locations.
1 code implementation • 15 Feb 2022 • Mingbao Lin, Liujuan Cao, Yuxin Zhang, Ling Shao, Chia-Wen Lin, Rongrong Ji
Then, we introduce a recommendation-based filter selection scheme where each filter recommends a group of its closest filters.
1 code implementation • 30 Jan 2022 • Yuxin Zhang, Mingbao Lin, Mengzhao Chen, Fei Chao, Rongrong Ji
We prove that supermask training is to accumulate the criteria of gradient-driven sparsity for both removed and preserved weights, and it can partly solve the independence paradox.
no code implementations • 11 Jan 2022 • Zhengying Liu, Adrien Pavao, Zhen Xu, Sergio Escalera, Fabio Ferreira, Isabelle Guyon, Sirui Hong, Frank Hutter, Rongrong Ji, Julio C. S. Jacques Junior, Ge Li, Marius Lindauer, Zhipeng Luo, Meysam Madadi, Thomas Nierhoff, Kangning Niu, Chunguang Pan, Danny Stoll, Sebastien Treguer, Jin Wang, Peng Wang, Chenglin Wu, Youcheng Xiong, Arbe r Zela, Yang Zhang
Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly.
no code implementations • 4 Jan 2022 • Tianshuo Xu, Lijiang Li, Peng Mi, Xiawu Zheng, Fei Chao, Rongrong Ji, Yonghong Tian, Qiang Shen
PSNR-oriented models are a critical class of super-resolution models with applications across various fields.
no code implementations • CVPR 2022 • Mingrui Wu, Xuying Zhang, Xiaoshuai Sun, Yiyi Zhou, Chao Chen, Jiaxin Gu, Xing Sun, Rongrong Ji
Current Image captioning (IC) methods predict textual words sequentially based on the input visual information from the visual feature extractor and the partially generated sentence information.
1 code implementation • CVPR 2022 • Xiawu Zheng, Xiang Fei, Lei Zhang, Chenglin Wu, Fei Chao, Jianzhuang Liu, Wei Zeng, Yonghong Tian, Rongrong Ji
Building upon RMI, we further propose a new search algorithm termed RMI-NAS, facilitating with a theorem to guarantee the global optimal of the searched architecture.
no code implementations • 27 Dec 2021 • Ke Sun, Taiping Yao, Shen Chen, Shouhong Ding, Jilin L, Rongrong Ji
With various facial manipulation techniques arising, face forgery detection has drawn growing attention due to security concerns.
1 code implementation • 10 Dec 2021 • Shuman Fang, Jie Li, Xianming Lin, Rongrong Ji
By treating the attack of both specific data and a modified model as a task, we expect the adversarial perturbations to adopt enough tasks for generalization.
1 code implementation • CVPR 2022 • Yunshan Zhong, Mingbao Lin, Gongrui Nan, Jianzhuang Liu, Baochang Zhang, Yonghong Tian, Rongrong Ji
In this paper, we observe an interesting phenomenon of intra-class heterogeneity in real data and show that existing methods fail to retain this property in their synthetic images, which causes a limited performance increase.
1 code implementation • NeurIPS 2021 • Shaojie Li, Jie Wu, Xuefeng Xiao, Fei Chao, Xudong Mao, Rongrong Ji
In this work, we revisit the role of discriminator in GAN compression and design a novel generator-discriminator cooperative compression scheme for GAN compression, termed GCC.
1 code implementation • 17 Oct 2021 • Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Yongjian Wu, Yue Gao, Rongrong Ji
Based on the LaConv module, we further build the first fully language-driven convolution network, termed as LaConvNet, which can unify the visual recognition and multi-modal reasoning in one forward structure.
1 code implementation • 25 Sep 2021 • Qinglin Liu, Haozhe Xie, Shengping Zhang, Bineng Zhong, Rongrong Ji
Finally, we use the matting module which takes the image, trimap and context features to estimate the alpha matte.
Ranked #6 on Image Matting on Composition-1K (using extra training data)
1 code implementation • 16 Sep 2021 • Yuexiao Ma, Taisong Jin, Xiawu Zheng, Yan Wang, Huixia Li, Yongjian Wu, Guannan Jiang, Wei zhang, Rongrong Ji
Instead of solving a problem of the original integer programming, we propose to optimize a proxy metric, the concept of network orthogonality, which is highly correlated with the loss of the integer programming but also easy to optimize with linear programming.
1 code implementation • 12 Sep 2021 • Bohong Chen, Mingbao Lin, Rongrong Ji, Liujuan Cao
At the end of training, our PSS-Net retains the best subnet in each pool to entitle a fast switch of high-quality subnets for inference when the available resources vary.
1 code implementation • 9 Sep 2021 • Yunshan Zhong, Mingbao Lin, Mengzhao Chen, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji
While post-training quantization receives popularity mostly due to its evasion in accessing the original complete training dataset, its poor performance also stems from scarce images.
no code implementations • ICCV 2021 • Hongjun Chen, Jinbao Wang, Hong Cai Chen, XianTong Zhen, Feng Zheng, Rongrong Ji, Ling Shao
Annotation burden has become one of the biggest barriers to semantic segmentation.
Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation
1 code implementation • 19 Aug 2021 • Xiawu Zheng, Yuexiao Ma, Teng Xi, Gang Zhang, Errui Ding, Yuchao Li, Jie Chen, Yonghong Tian, Rongrong Ji
This practically limits the application of model compression when the model needs to be deployed on a wide range of devices.
1 code implementation • ICLR 2021 • Xinshuai Dong, Anh Tuan Luu, Rongrong Ji, Hong Liu
Robustness against word substitutions has a well-defined and widely acceptable form, i. e., using semantically similar words as substitutions, and thus it is considered as a fundamental stepping-stone towards broader robustness in natural language processing.
1 code implementation • 14 Jul 2021 • Mingbao Lin, Bohong Chen, Fei Chao, Rongrong Ji
Each filter in our DCFF is firstly given an inter-similarity distribution with a temperature parameter as a filter proxy, on top of which, a fresh Kullback-Leibler divergence based dynamic-coded criterion is proposed to evaluate the filter importance.
1 code implementation • 29 Jun 2021 • Peng Tu, Yawen Huang, Rongrong Ji, Feng Zheng, Ling Shao
To take advantage of the labeled examples and guide unlabeled data learning, we further propose a mask generation module to generate high-quality pseudo masks for the unlabeled data.
no code implementations • CVPR 2021 • Yunhang Shen, Liujuan Cao, Zhiwei Chen, Feihong Lian, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji
To date, learning weakly supervised panoptic segmentation (WSPS) with only image-level labels remains unexplored.
1 code implementation • CVPR 2021 • Qiong Wu, Pingyang Dai, Jie Chen, Chia-Wen Lin, Yongjian Wu, Feiyue Huang, Bineng Zhong, Rongrong Ji
In this paper, we propose a joint Modality and Pattern Alignment Network (MPANet) to discover cross-modality nuances in different patterns for visible-infrared person Re-ID, which introduces a modality alleviation module and a pattern alignment module to jointly extract discriminative features.
1 code implementation • CVPR 2021 • Xuying Zhang, Xiaoshuai Sun, Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji
Then, we build a BERTbased language model to extract language context and propose Adaptive-Attention (AA) module on top of a transformer decoder to adaptively measure the contribution of visual and language cues before making decisions for word prediction.
1 code implementation • 18 Jun 2021 • YuHan Wang, Xu Chen, Junwei Zhu, Wenqing Chu, Ying Tai, Chengjie Wang, Jilin Li, Yongjian Wu, Feiyue Huang, Rongrong Ji
In this work, we propose a high fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results.
Ranked #7 on Face Swapping on FaceForensics++
1 code implementation • 4 Jun 2021 • Shaokun Zhang, Xiawu Zheng, Chenyi Yang, Yuchao Li, Yan Wang, Fei Chao, Mengdi Wang, Shen Li, Jun Yang, Rongrong Ji
Motivated by the necessity of efficient inference across various constraints on BERT, we propose a novel approach, YOCO-BERT, to achieve compress once and deploy everywhere.
1 code implementation • 31 May 2021 • Mingbao Lin, Yuxin Zhang, Yuchao Li, Bohong Chen, Fei Chao, Mengdi Wang, Shen Li, Yonghong Tian, Rongrong Ji
We also provide a workflow of filter rearrangement that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for accuracy improvements and then applies similar rearrangement to the next-layer weights in the input channel dimension to ensure correct convolutional operations.
1 code implementation • CVPR 2021 • Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji
Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression.
no code implementations • NeurIPS 2021 • Gengchen Duan, Taisong Jin, Rongrong Ji, Ling Shao, Baochang Zhang, Feiyue Huang, Yongjian Wu
In this article, we propose a novel auxiliary learning induced graph convolutional network in a multi-task fashion.
no code implementations • 6 May 2021 • Shen Chen, Taiping Yao, Yang Chen, Shouhong Ding, Jilin Li, Rongrong Ji
Specifically, we propose a Multi-scale Patch Similarity Module (MPSM), which measures the similarity between features of local regions and forms a robust and generalized similarity pattern.
1 code implementation • 3 May 2021 • Jie Hu, Liujuan Cao, Yao Lu, Shengchuan Zhang, Yan Wang, Ke Li, Feiyue Huang, Ling Shao, Rongrong Ji
However, such an upgrade is not applicable to instance segmentation, due to its significantly higher output dimensions compared to object detection.
Ranked #21 on Instance Segmentation on COCO test-dev
no code implementations • NeurIPS 2021 • Yixu Wang, Jie Li, Hong Liu, Yan Wang, Yongjian Wu, Feiyue Huang, Rongrong Ji
We argue this is due to the lack of rich information in the probability prediction and the overfitting caused by hard labels.
1 code implementation • 24 Apr 2021 • Yuxin Zhang, Mingbao Lin, Chia-Wen Lin, Jie Chen, Feiyue Huang, Yongjian Wu, Yonghong Tian, Rongrong Ji
Specifically, to model the contribution of each channel to differentiating categories, we develop a class-wise mask for each channel, implemented in a dynamic training manner w. r. t.
2 code implementations • 18 Apr 2021 • Yuxin Zhang, Mingbao Lin, Yunshan Zhong, Fei Chao, Rongrong Ji
Existing studies achieve the sparsity of neural networks via time-consuming weight training or complex searching on networks with expanded width, which greatly limits the applications of network pruning.
no code implementations • 6 Apr 2021 • Boyu Yang, Mingbao Lin, Binghao Liu, Mengying Fu, Chang Liu, Rongrong Ji, Qixiang Ye
By tentatively expanding network nodes, LEC-Net enlarges the representation capacity of features, alleviating feature drift of old network from the perspective of model regularization.
1 code implementation • 26 Mar 2021 • Shaojie Li, Mingbao Lin, Yan Wang, Yongjian Wu, Yonghong Tian, Ling Shao, Rongrong Ji
Besides, a self-distillation module is adopted to convert the feature map of deeper layers into a shallower one.
no code implementations • 25 Mar 2021 • Kekai Sheng, Ke Li, Xiawu Zheng, Jian Liang, WeiMing Dong, Feiyue Huang, Rongrong Ji, Xing Sun
However, considering that the configuration of attention, i. e., the type and the position of attention module, affects the performance significantly, it is more generalized to optimize the attention configuration automatically to be specialized for arbitrary UDA scenario.
Ranked #1 on Partial Domain Adaptation on Office-Home
3 code implementations • ICCV 2021 • Zihan Xu, Mingbao Lin, Jianzhuang Liu, Jie Chen, Ling Shao, Yue Gao, Yonghong Tian, Rongrong Ji
We prove that reviving the "dead weights" by ReCU can result in a smaller quantization error.
2 code implementations • CVPR 2021 • Bohao Li, Boyu Yang, Chang Liu, Feng Liu, Rongrong Ji, Qixiang Ye
Few-shot object detection has made substantial progressby representing novel class objects using the feature representation learned upon a set of base class objects.
Ranked #14 on Few-Shot Object Detection on MS-COCO (10-shot)
1 code implementation • CVPR 2021 • Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji
Recently, image-to-image translation has made significant progress in achieving both multi-label (\ie, translation conditioned on different labels) and multi-style (\ie, generation with diverse styles) tasks.
Disentanglement Multimodal Unsupervised Image-To-Image Translation +1
2 code implementations • 18 Feb 2021 • Liming Jiang, Zhengkui Guo, Wayne Wu, Zhaoyang Liu, Ziwei Liu, Chen Change Loy, Shuo Yang, Yuanjun Xiong, Wei Xia, Baoying Chen, Peiyu Zhuang, Sili Li, Shen Chen, Taiping Yao, Shouhong Ding, Jilin Li, Feiyue Huang, Liujuan Cao, Rongrong Ji, Changlei Lu, Ganchao Tan
This paper reports methods and results in the DeeperForensics Challenge 2020 on real-world face forgery detection.
2 code implementations • 16 Feb 2021 • Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Fei Chao, Chia-Wen Lin, Ling Shao
In this paper, we show that our weight binarization provides an analytical solution by encoding high-magnitude weights into +1s, and 0s otherwise.
no code implementations • 1 Feb 2021 • Jian Zhang, Ying Tai, Taiping Yao, Jia Meng, Shouhong Ding, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji
Face authentication on mobile end has been widely applied in various scenarios.
1 code implementation • 20 Jan 2021 • Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, Feiyue Huang, Qixiang Ye
Inspired by the face recognition community, we use a message passing algorithm Affinity Propagation on the weight matrices to obtain an adaptive number of exemplars, which then act as the preserved filters.
1 code implementation • 16 Jan 2021 • Yunpeng Luo, Jiayi Ji, Xiaoshuai Sun, Liujuan Cao, Yongjian Wu, Feiyue Huang, Chia-Wen Lin, Rongrong Ji
Descriptive region features extracted by object detection networks have played an important role in the recent advancements of image captioning.
1 code implementation • ICCV 2021 • Jie Li, Rongrong Ji, Peixian Chen, Baochang Zhang, Xiaopeng Hong, Ruixin Zhang, Shaoxin Li, Jilin Li, Feiyue Huang, Yongjian Wu
A common practice is to start from a large perturbation and then iteratively reduce it with a deterministic direction and a random one while keeping it adversarial.
no code implementations • ICCV 2021 • Qinqin Zhou, Xiawu Zheng, Liujuan Cao, Bineng Zhong, Teng Xi, Gang Zhang, Errui Ding, Mingliang Xu, Rongrong Ji
EC-DARTS decouples different operations based on their categories to optimize the operation weights so that the operation gap between them is shrinked.
no code implementations • ICCV 2021 • Peixian Chen, Wenfeng Liu, Pingyang Dai, Jianzhuang Liu, Qixiang Ye, Mingliang Xu, Qi'an Chen, Rongrong Ji
To avoid such problematic models in occluded person ReID, we propose the Occlusion-Aware Mask Network (OAMN).
1 code implementation • ICCV 2021 • Yiyi Zhou, Tianhe Ren, Chaoyang Zhu, Xiaoshuai Sun, Jianzhuang Liu, Xinghao Ding, Mingliang Xu, Rongrong Ji
Due to the superior ability of global dependency modeling, Transformer and its variants have become the primary choice of many vision-and-language tasks.
no code implementations • ICCV 2021 • Yunhang Shen, Liujuan Cao, Zhiwei Chen, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji
Weakly supervised instance segmentation (WSIS) with only image-level labels has recently drawn much attention.
1 code implementation • 13 Dec 2020 • Jiayi Ji, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, Yongjian Wu, Yue Gao, Rongrong Ji
The latter contains a Global Adaptive Controller that can adaptively fuse the global information into the decoder to guide the caption generation.
1 code implementation • NeurIPS 2020 • Yunhang Shen, Rongrong Ji, Zhiwei Chen, Yongjian Wu, Feiyue Huang
In this paper, we propose a unified WSOD framework, termed UWSOD, to develop a high-capacity general detection model with only image-level labels, which is self-contained and does not require external modules or additional supervision.
no code implementations • 1 Dec 2020 • Mingbao Lin, Rongrong Ji, Xiaoshuai Sun, Baochang Zhang, Feiyue Huang, Yonghong Tian, DaCheng Tao
To achieve fast online adaptivity, a class-wise updating method is developed to decompose the binary code learning and alternatively renew the hash functions in a class-wise fashion, which well addresses the burden on large amounts of training batches.
1 code implementation • 17 Nov 2020 • Shaojie Li, Mingbao Lin, Yan Wang, Fei Chao, Ling Shao, Rongrong Ji
The latter simultaneously distills informative attention maps from both the generator and discriminator of a pre-trained model to the searched generator, effectively stabilizing the adversarial training of our light-weight model.
1 code implementation • ECCV 2020 • Huixia Li, Chenqian Yan, Shaohui Lin, Xiawu Zheng, Yuchao Li, Baochang Zhang, Fan Yang, Rongrong Ji
Specifically, most state-of-the-art SR models without batch normalization have a large dynamic quantization range, which also serves as another cause of performance drop.
2 code implementations • NeurIPS 2020 • Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Yan Wang, Yongjian Wu, Feiyue Huang, Chia-Wen Lin
In this paper, for the first time, we explore the influence of angular bias on the quantization error and then introduce a Rotated Binary Neural Network (RBNN), which considers the angle alignment between the full-precision weight vector and its binarized version.
2 code implementations • CVPR 2021 • Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun
Then we force the model to pull the feature of the distracting video and the feature of the original video closer, so that the model is explicitly restricted to resist the background influence, focusing more on the motion changes.
3 code implementations • 12 Sep 2020 • Jinpeng Wang, Yuting Gao, Ke Li, Jianguo Hu, Xinyang Jiang, Xiaowei Guo, Rongrong Ji, Xing Sun
Specifically, we construct a positive clip and a negative clip for each video.
no code implementations • 8 Sep 2020 • Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, Rongrong Ji, David Doermann, Guodong Guo
In this paper, binarized neural architecture search (BNAS), with a search space of binarized convolutions, is introduced to produce extremely compressed models to reduce huge computational cost on embedded devices for edge computing.
no code implementations • SIGKDD 2020 • Shuyi Ji, Yifan Feng, Rongrong Ji, Xibin Zhao, Wanwan Tang, Yue Gao.
Second, the hypergraph structure is employed for modeling users and items with explicit hybrid high-order correlations.
no code implementations • ECCV 2020 • Hanlin Chen, Baochang Zhang, Song Xue, Xuan Gong, Hong Liu, Rongrong Ji, David Doermann
Deep convolutional neural networks (DCNNs) have dominated as the best performers in machine learning, but can be challenged by adversarial attacks.
no code implementations • 27 Jul 2020 • Pingyang Dai, Peixian Chen, Qiong Wu, Xiaopeng Hong, Qixiang Ye, Qi Tian, Rongrong Ji
This drawback limits the flexibility of UDA in complicated open-set tasks where no labels are shared between domains.
1 code implementation • 27 Jul 2020 • Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji
Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly adopts the trained model to the target domain without model updating.
Domain Generalization Generalizable Person Re-identification
2 code implementations • ECCV 2020 • Yunpeng Zhai, Qixiang Ye, Shijian Lu, Mengxi Jia, Rongrong Ji, Yonghong Tian
Often the best performing deep neural models are ensembles of multiple base-level networks, nevertheless, ensemble learning with respect to domain adaptive person re-ID remains unexplored.
Domain Adaptive Person Re-Identification Ensemble Learning +1
no code implementations • CVPR 2020 • Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Guodong Guo, Rongrong Ji
Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure.
1 code implementation • CVPR 2020 • Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian
In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.
1 code implementation • CVPR 2020 • Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian
For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.
1 code implementation • 26 Apr 2020 • Hao Cheng, Fanxu Meng, Ke Li, Yuting Gao, Guangming Lu, Xing Sun, Rongrong Ji
To gain a universal improvement on both valid and invalid filters, we compensate grafting with distillation (\textbf{Cultivation}) to overcome the drawback of grafting .
no code implementations • CVPR 2020 • Yunpeng Zhai, Shijian Lu, Qixiang Ye, Xuebo Shan, Jie Chen, Rongrong Ji, Yonghong Tian
Domain adaptive person re-identification (re-ID) is a challenging task, especially when person identities in target domains are unknown.
Ranked #8 on Unsupervised Domain Adaptation on Duke to Market
1 code implementation • ICCV 2021 • Jie Hu, Liujuan Cao, Qixiang Ye, Tong Tong, Shengchuan Zhang, Ke Li, Feiyue Huang, Rongrong Ji, Ling Shao
Based on the experimental results, we present three new findings that provide fresh insights into the inner logic of DNNs.
no code implementations • 25 Mar 2020 • Bin Zhang, Jian Li, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yili Xia, Wenjiang Pei, Rongrong Ji
In this paper, we propose a novel Automatic and Scalable Face Detector (ASFD), which is based on a combination of neural architecture search techniques as well as a new loss design.
2 code implementations • CVPR 2020 • Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Liujuan Cao, Chenglin Wu, Cheng Deng, Rongrong Ji
In addition, we address a key challenge in this multi-task setup, i. e., the prediction conflict, with two innovative designs namely, Consistency Energy Maximization (CEM) and Adaptive Soft Non-Located Suppression (ASNLS).
Generalized Referring Expression Comprehension Referring Expression +2
2 code implementations • CVPR 2020 • Zedu Chen, Bineng Zhong, Guorong Li, Shengping Zhang, Rongrong Ji
Most of the existing trackers usually rely on either a multi-scale searching scheme or pre-defined anchor boxes to accurately estimate the scale and aspect ratio of a target.
2 code implementations • CVPR 2020 • Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, Ling Shao
The principle behind our pruning is that low-rank feature maps contain less information, and thus pruned results can be easily reproduced.
2 code implementations • ECCV 2020 • Yuge Huang, Pengcheng Shen, Ying Tai, Shaoxin Li, Xiaoming Liu, Jilin Li, Feiyue Huang, Rongrong Ji
To improve the performance on those hard samples for general tasks, we propose a novel Distribution Distillation Loss to narrow the performance gap between easy and hard samples, which is a simple, effective and generic for various types of facial variations.
1 code implementation • 23 Jan 2020 • Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji
Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.
1 code implementation • 23 Jan 2020 • Mingbao Lin, Rongrong Ji, Yuxin Zhang, Baochang Zhang, Yongjian Wu, Yonghong Tian
In this paper, we propose a new channel pruning method based on artificial bee colony algorithm (ABC), dubbed as ABCPruner, which aims to efficiently find optimal pruned structure, i. e., channel number in each layer, rather than selecting "important" channels as previous works did.
2 code implementations • CVPR 2020 • Fanxu Meng, Hao Cheng, Ke Li, Zhixin Xu, Rongrong Ji, Xing Sun, Gaungming Lu
To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks.
1 code implementation • 7 Dec 2019 • Yiyi Zhou, Rongrong Ji, Gen Luo, Xiaoshuai Sun, Jinsong Su, Xinghao Ding, Chia-Wen Lin, Qi Tian
Referring Expression Comprehension (REC) is an emerging research spot in computer vision, which refers to detecting the target region in an image given an text description.
1 code implementation • 3 Dec 2019 • Fengxiang Yang, Ke Li, Zhun Zhong, Zhiming Luo, Xing Sun, Hao Cheng, Xiaowei Guo, Feiyue Huang, Rongrong Ji, Shaozi Li
This procedure encourages that the selected training samples can be both clean and miscellaneous, and that the two models can promote each other iteratively.
Ranked #9 on Unsupervised Domain Adaptation on Market to Duke
no code implementations • NeurIPS 2019 • Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang
To model these two inherent diversities in image captioning, we propose a Variational Structured Semantic Inferring model (termed VSSI-cap) executed in a novel structured encoder-inferer-decoder schema.
no code implementations • 26 Nov 2019 • Kekai Sheng, Wei-Ming Dong, Menglei Chai, Guohui Wang, Peng Zhou, Feiyue Huang, Bao-Gang Hu, Rongrong Ji, Chongyang Ma
In this paper, we revisit the problem of image aesthetic assessment from the self-supervised feature learning perspective.
no code implementations • 25 Nov 2019 • Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, David Doermann, Rongrong Ji
A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models.
3 code implementations • 11 Nov 2019 • Chuming Lin, Jian Li, Yabiao Wang, Ying Tai, Donghao Luo, Zhipeng Cui, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji
In this paper, we propose an efficient and unified framework to generate temporal action proposals named Dense Boundary Generator (DBG), which draws inspiration from boundary-sensitive methods and implements boundary classification and action completeness regression for densely distributed proposals.
Ranked #7 on Temporal Action Localization on FineAction
no code implementations • 30 Oct 2019 • Wenjie Ding, Xing Wei, Rongrong Ji, Xiaopeng Hong, Qi Tian, Yihong Gong
We propose a \emph{more universal} adversarial perturbation (MUAP) method for both image-agnostic and model-insensitive person Re-ID attack.
no code implementations • CVPR 2019 • Chunlei Liu, Wenrui Ding, Xin Xia, Baochang Zhang, Jiaxin Gu, Jianzhuang Liu, Rongrong Ji, David Doermann
The CiFs can be easily incorporated into existing deep convolutional neural networks (DCNNs), which leads to new Circulant Binary Convolutional Networks (CBCNs).