Search Results for author: Luc van Gool

Found 544 papers, 274 papers with code

Modeling the Effects of Windshield Refraction for Camera Calibration

no code implementations • ECCV 2020 • Frank Verbiest, Marc Proesmans, Luc van Gool

Instead of using a generalized camera approach, we propose a novel approach to jointly optimize a traditional camera model, and a mathematical representation of the windshield’s surface.

Autonomous Driving • Camera Calibration


Fixing Localization Errors to Improve Image Classification

1 code implementation • ECCV 2020 • Guolei Sun, Salman Khan, Wen Li, Hisham Cholakkal, Fahad Shahbaz Khan, Luc van Gool

This way, in an effort to fix localization errors, our loss provides an extra supervisory signal that helps the model to better discriminate between similar classes.

Classification • General Classification • +3


Sharing Key Semantics in Transformer Makes Efficient Image Restoration

no code implementations • 30 May 2024 • Bin Ren, Yawei Li, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Rita Cucchiara, Luc van Gool, Ming-Hsuan Yang, Nicu Sebe

Additionally, for IR, it is commonly noted that small segments of a degraded image, particularly those closely aligned semantically, provide particularly relevant information to aid in the restoration process, as they contribute essential contextual cues crucial for accurate reconstruction.

Image Restoration


Towards a Generalist and Blind RGB-X Tracker

no code implementations • 28 May 2024 • Yuedong Tan, Zongwei Wu, Yuqian Fu, Zhuyun Zhou, Guolei Sun, Chao Ma, Danda Pani Paudel, Luc van Gool, Radu Timofte

With the emergence of a single large model capable of successfully solving a multitude of tasks in NLP, there has been growing research interest in achieving similar goals in computer vision.

Inductive Bias • Multi-Label Classification • +1


Splat-SLAM: Globally Optimized RGB-only SLAM with 3D Gaussians

1 code implementation • 26 May 2024 • Erik Sandström, Keisuke Tateno, Michael Oechsle, Michael Niemeyer, Luc van Gool, Martin R. Oswald, Federico Tombari

In response, we propose the first RGB-only SLAM system with a dense 3D Gaussian map representation that utilizes all benefits of globally optimized tracking by adapting dynamically to keyframe pose and depth updates by actively deforming the 3D Gaussian map.

3D Reconstruction • Simultaneous Localization and Mapping


Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar

no code implementations • 7 May 2024 • David Borts, Erich Liang, Tim Brödermann, Andrea Ramazzina, Stefanie Walz, Edoardo Palladin, Jipeng Sun, David Bruggemann, Christos Sakaridis, Luc van Gool, Mario Bijelic, Felix Heide

Neural fields have been broadly investigated as scene representations for the reproduction and novel generation of diverse outdoor scenes, including those autonomous vehicles and robots must handle.

Autonomous Vehicles


Self-Explainable Affordance Learning with Embodied Caption

no code implementations • 8 Apr 2024 • Zhipeng Zhang, Zhimin Wei, Guolei Sun, Peng Wang, Luc van Gool

In the field of visual affordance learning, previous methods mainly used abundant images or videos that delineate human behavior patterns to identify action possibility regions for object manipulation, with a variety of applications in robotic tasks.


Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models

no code implementations • 8 Apr 2024 • Saman Motamed, Wouter Van Gansbeke, Luc van Gool

With recent advances in image and video diffusion models for content creation, a plethora of techniques have been proposed for customizing their generated content.

Video Editing


Empowering Image Recovery: A Multi-Attention Approach

no code implementations • 6 Apr 2024 • Juan Wen, Yawei Li, Chao Zhang, Weiyan Hou, Radu Timofte, Luc van Gool

Integration of attention mechanisms across feature and positional dimensions further enhances the recovery of fine details.

Image Restoration


HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud

2 code implementations • 4 Apr 2024 • Wencan Cheng, Hao Tang, Luc van Gool, Jong Hwan Ko

Extracting keypoint locations from input hand frames, known as 3D hand pose estimation, is a critical task in various human-computer interaction applications.

3D Hand Pose Estimation


Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning

1 code implementation • 4 Apr 2024 • Rui Li, Tobias Fischer, Mattia Segu, Marc Pollefeys, Luc van Gool, Federico Tombari

We propose KYN, a novel method for single-view scene reconstruction that reasons about semantic and spatial context to predict each point's density.

3D Scene Reconstruction • Depth Estimation • +2


Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation

no code implementations • 4 Apr 2024 • Elham Amin Mansour, Ozan Unal, Suman Saha, Benjamin Bejar, Luc van Gool

A key challenge in panoptic UDA is reducing the domain gap between a labeled source and an unlabeled target domain while harmonizing the subtasks of semantic and instance segmentation to limit catastrophic interference.

Autonomous Driving • Instance Segmentation • +3


I-Design: Personalized LLM Interior Designer

no code implementations • 3 Apr 2024 • Ata Çelen, Guo Han, Konrad Schindler, Luc van Gool, Iro Armeni, Anton Obukhov, Xi Wang

Interior design allows us to be who we are and live how we want - each design is as unique as our distinct personality.

Language Modelling • Large Language Model • +2


A Unified and Interpretable Emotion Representation and Expression Generation

no code implementations • 1 Apr 2024 • Reni Paskaleva, Mykyta Holubakha, Andela Ilic, Saman Motamed, Luc van Gool, Danda Paudel

However, emotions are often compound, e.g., happily surprised, and can be mapped to the action units (AUs) used for expressing emotions, and trivially to the canonical ones.


GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM

1 code implementation • 28 Mar 2024 • Ganlin Zhang, Erik Sandström, Youmin Zhang, Manthan Patel, Luc van Gool, Martin R. Oswald

To alleviate this issue, with the aid of a monocular depth estimator, we introduce a novel DSPO layer for bundle adjustment which optimizes the pose and depth of keyframes along with the scale of the monocular depth.

Simultaneous Localization and Mapping


UniDepth: Universal Monocular Metric Depth Estimation

1 code implementation • 27 Mar 2024 • Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis, Mattia Segu, Siyuan Li, Luc van Gool, Fisher Yu

However, the remarkable accuracy of recent MMDE methods is confined to their training domains.

Ranked #4 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Monocular Depth Estimation


Towards Online Real-Time Memory-based Video Inpainting Transformers

no code implementations • 24 Mar 2024 • Guillaume Thiry, Hao Tang, Radu Timofte, Luc van Gool

Video inpainting tasks have seen significant improvements in recent years with the rise of deep neural networks and, in particular, vision transformers.

Video Inpainting


FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks

no code implementations • 11 Mar 2024 • Muhammad Saif Ullah Khan, Muhammad Ferjad Naeem, Federico Tombari, Luc van Gool, Didier Stricker, Muhammad Zeshan Afzal

We propose FocusCLIP, integrating subject-level guidance--a specialized mechanism for target-specific supervision--into the CLIP framework for improved zero-shot transfer on human-centric tasks.

Ranked #1 on Age Classification on EMOTIC

Activity Recognition • Age Classification • +1


Rethinking Few-shot 3D Point Cloud Semantic Segmentation

1 code implementation • 1 Mar 2024 • Zhaochong An, Guolei Sun, Yun Liu, Fayao Liu, Zongwei Wu, Dan Wang, Luc van Gool, Serge Belongie

The former arises from non-uniform point sampling, allowing models to distinguish the density disparities between foreground and background for easier segmentation.

Few-shot 3D Point Cloud Semantic Segmentation • Segmentation • +1


Loopy-SLAM: Dense Neural SLAM with Loop Closures

no code implementations • 14 Feb 2024 • Lorenzo Liso, Erik Sandström, Vladimir Yugay, Luc van Gool, Martin R. Oswald

Neural RGBD SLAM techniques have shown promise in dense Simultaneous Localization And Mapping (SLAM), yet face challenges such as error accumulation during camera tracking resulting in distorted maps.

Simultaneous Localization and Mapping


Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

no code implementations • 5 Feb 2024 • Yuqian Fu, Yu Wang, Yixuan Pan, Lian Huai, Xingyu Qiu, Zeyu Shangguan, Tong Liu, Yanwei Fu, Luc van Gool, Xingqun Jiang

This paper studies the challenging cross-domain few-shot object detection (CD-FSOD), aiming to develop an accurate object detector for novel domains with minimal labeled examples.

Cross-Domain Few-Shot • Few-Shot Object Detection • +3


Key-Graph Transformer for Image Restoration

no code implementations • 4 Feb 2024 • Bin Ren, Yawei Li, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Rita Cucchiara, Luc van Gool, Nicu Sebe

While it is crucial to capture global information for effective image restoration (IR), integrating such cues into transformer-based methods becomes computationally expensive, especially with high input resolution.

Graph Attention • Image Restoration


Image Fusion via Vision-Language Model

no code implementations • 3 Feb 2024 • Zixiang Zhao, Lilun Deng, Haowen Bai, Yukun Cui, Zhipeng Zhang, Yulun Zhang, Haotong Qin, Dongdong Chen, Jiangshe Zhang, Peng Wang, Luc van Gool

Therefore, we introduce a novel fusion paradigm named image Fusion via vIsion-Language Model (FILM), for the first time, utilizing explicit textual information in different source images to guide image fusion.

Decoder • Language Modelling


Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes

1 code implementation • 27 Jan 2024 • Diandian Guo, Deng-Ping Fan, Tongyu Lu, Christos Sakaridis, Luc van Gool

The estimation of implicit cross-frame correspondences and the high computational cost have long been major challenges in video semantic segmentation (VSS) for driving scenes.

Motion Estimation • Segmentation • +2


MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty

no code implementations • 23 Jan 2024 • Tim Brödermann, David Bruggemann, Christos Sakaridis, Kevin Ta, Odysseas Liagouris, Jason Corkill, Luc van Gool

Achieving level-5 driving automation in autonomous vehicles necessitates a robust semantic visual perception system capable of parsing data from different sensors across diverse conditions.

Autonomous Vehicles • Panoptic Segmentation


Graph Transformer GANs with Graph Masked Modeling for Architectural Layout Generation

no code implementations • 15 Jan 2024 • Hao Tang, Ling Shao, Nicu Sebe, Luc van Gool

Finally, we propose a novel self-guided pre-training method for graph representation learning.

Generative Adversarial Network • Graph Representation Learning • +1


InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes

no code implementations • 10 Jan 2024 • Mohamad Shahbazi, Liesbeth Claessens, Michael Niemeyer, Edo Collins, Alessio Tonioni, Luc van Gool, Federico Tombari

We introduce InseRF, a novel method for generative object insertion in the NeRF reconstructions of 3D scenes.

3D scene Editing • Monocular Depth Estimation • +2


Learning to Prompt with Text Only Supervision for Vision-Language Models

1 code implementation • 4 Jan 2024 • Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Muzammal Naseer, Luc van Gool, Federico Tombari

While effective, most of these works require labeled data which is not practical, and often struggle to generalize towards new datasets due to over-fitting on the source data.

Prompt Engineering


Residual Learning for Image Point Descriptors

no code implementations • 24 Dec 2023 • Rashik Shrestha, Ajad Chhatkuli, Menelaos Kanakis, Luc van Gool

Such an approach of optimization allows us to discard learning knowledge already present in non-differentiable functions such as the hand-crafted descriptors and only learn the residual knowledge in the main network branch.

Camera Localization • Ensemble Learning


Ternary-type Opacity and Hybrid Odometry for RGB-only NeRF-SLAM

no code implementations • 20 Dec 2023 • Junru Lin, Asen Nachkov, Songyou Peng, Luc van Gool, Danda Pani Paudel

To foster this line of research, we also propose a simple yet novel visual odometry scheme that uses a hybrid combination of volumetric and warping-based image renderings.

Visual Odometry


Diffusion-Based Particle-DETR for BEV Perception

no code implementations • 18 Dec 2023 • Asen Nachkov, Martin Danelljan, Danda Pani Paudel, Luc van Gool

For the enhanced safety of AVs, modeling perception uncertainty in BEV is crucial.

Autonomous Vehicles • Object • +2


G-MEMP: Gaze-Enhanced Multimodal Ego-Motion Prediction in Driving

no code implementations • 13 Dec 2023 • M. Eren Akbiyik, Nedko Savov, Danda Pani Paudel, Nikola Popovic, Christian Vater, Otmar Hilliges, Luc van Gool, Xi Wang

In contrast, we focus on inferring the ego trajectory of a driver's vehicle using their gaze data.

Decision Making • motion prediction • +1


Zero-Shot Point Cloud Registration

no code implementations • 5 Dec 2023 • Weijie Wang, Guofeng Mei, Bin Ren, Xiaoshui Huang, Fabio Poiesi, Luc van Gool, Nicu Sebe, Bruno Lepri

The cornerstone of ZeroReg is the novel transfer of image features from keypoints to the point cloud, enriched by aggregating information from 3D geometric neighborhoods.

Decoder • Point Cloud Registration


DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

no code implementations • 5 Dec 2023 • Yuru Jia, Lukas Hoyer, Shengyu Huang, Tianfu Wang, Luc van Gool, Konrad Schindler, Anton Obukhov

Large, pretrained latent diffusion models (LDMs) have demonstrated an extraordinary ability to generate creative content, specialize to user data through few-shot fine-tuning, and condition their output on other modalities, such as semantic maps.

Ranked #5 on Domain Generalization on GTA-to-Avg(Cityscapes,BDD,Mapillary)

Autonomous Driving • Domain Generalization • +1


LALM: Long-Term Action Anticipation with Language Models

no code implementations • 29 Nov 2023 • Sanghwan Kim, Daoji Huang, Yongqin Xian, Otmar Hilliges, Luc van Gool, Xi Wang

Understanding human activity is a crucial yet intricate task in egocentric vision, a field that focuses on capturing visual perspectives from the camera wearer's viewpoint.

Action Anticipation • Action Recognition • +4


Continuous Pose for Monocular Cameras in Neural Implicit Representation

1 code implementation • 28 Nov 2023 • Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc van Gool

In this paper, we showcase the effectiveness of optimizing monocular camera poses as a continuous function of time.

Simultaneous Localization and Mapping


Single-Model and Any-Modality for Video Object Tracking

1 code implementation • 27 Nov 2023 • Zongwei Wu, Jilai Zheng, Xiangxuan Ren, Florin-Alexandru Vasluianu, Chao Ma, Danda Pani Paudel, Luc van Gool, Radu Timofte

In practice, most existing RGB trackers learn a single set of parameters to use them across datasets and applications.

Ranked #17 on Rgb-T Tracking on LasHeR

Object • Rgb-T Tracking • +1


SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

1 code implementation • 27 Nov 2023 • Lukas Hoyer, David Joseph Tan, Muhammad Ferjad Naeem, Luc van Gool, Federico Tombari

In SemiVL, we propose to integrate rich priors from VLM pre-training into semi-supervised semantic segmentation to learn better semantic decision boundaries.

Ranked #1 on Semi-Supervised Semantic Segmentation on PASCAL VOC 2012 732 labeled (using extra training data)

Decoder • Segmentation • +1


2D Feature Distillation for Weakly- and Semi-Supervised 3D Semantic Segmentation

no code implementations • 27 Nov 2023 • Ozan Unal, Dengxin Dai, Lukas Hoyer, Yigit Baran Can, Luc van Gool

As 3D perception problems grow in popularity and the need for large-scale labeled datasets for LiDAR semantic segmentation increases, new methods arise that aim to reduce the necessity for dense annotations by employing weakly-supervised training.

Ranked #1 on 3D Semantic Segmentation on ScribbleKITTI

2D Semantic Segmentation • 3D Semantic Segmentation • +3


Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance in Text-to-Image Diffusion Models

no code implementations • 23 Nov 2023 • Saman Motamed, Danda Pani Paudel, Luc van Gool

To enable customized content creation based on a few example images of a concept, methods such as Textual Inversion and DreamBooth invert the desired concept and enable synthesizing it in new scenes.

Language Modelling • Large Language Model • +3


3D Compression Using Neural Fields

no code implementations • 21 Nov 2023 • Janis Postels, Yannick Strümpler, Klara Reichard, Luc van Gool, Federico Tombari

Neural Fields (NFs) have gained momentum as a tool for compressing various data modalities - e.g., images and videos.

Attribute


Deep Equilibrium Diffusion Restoration with Parallel Sampling

1 code implementation • 20 Nov 2023 • JieZhang Cao, Yue Shi, Kai Zhang, Yulun Zhang, Radu Timofte, Luc van Gool

Due to the inherent property of diffusion models, most existing methods need long serial sampling chains to restore HQ images step-by-step, resulting in expensive sampling time and high computation costs.

Image Restoration


Model-aware 3D Eye Gaze from Weak and Few-shot Supervisions

1 code implementation • 20 Nov 2023 • Nikola Popovic, Dimitrios Christodoulou, Danda Pani Paudel, Xi Wang, Luc van Gool

In this work, we propose to predict 3D eye gaze from weak supervision of eye semantic segmentation masks and direct supervision of a few 3D gaze vectors.

Semantic Segmentation


MoVideo: Motion-Aware Video Generation with Diffusion Models

no code implementations • 19 Nov 2023 • Jingyun Liang, Yuchen Fan, Kai Zhang, Radu Timofte, Luc van Gool, Rakesh Ranjan

While recent years have witnessed great progress on using diffusion models for video generation, most of them are simple extensions of image generation frameworks, which fail to explicitly consider one of the key differences between videos and images, i.e., motion.

Image Generation • Image to Video Generation • +1


Contrastive Learning for Multi-Object Tracking with Transformers

no code implementations • 14 Nov 2023 • Pierre-François De Plaen, Nicola Marinello, Marc Proesmans, Tinne Tuytelaars, Luc van Gool

The DEtection TRansformer (DETR) opened new possibilities for object detection by modeling it as a translation task: converting image features into object-level representations.

Ranked #1 on Multiple Object Tracking on BDD100K test

Contrastive Learning • Multi-Object Tracking • +4


Learning Robust Multi-Scale Representation for Neural Radiance Fields from Unposed Images

no code implementations • 8 Nov 2023 • Nishant Jain, Suryansh Kumar, Luc van Gool

The key ideas presented in this paper are (i) Recovering accurate camera parameters via a robust pipeline from unposed day-to-day images is equally crucial in neural novel view synthesis problem; (ii) It is rather more practical to model object's content at different resolutions since dramatic camera motion is highly likely in day-to-day unposed images.

Depth Estimation • Depth Prediction • +3


Long-Term Invariant Local Features via Implicit Cross-Domain Correspondences

no code implementations • 6 Nov 2023 • Zador Pataki, Mohammad Altillawi, Menelaos Kanakis, Rémi Pautrat, Fengyi Shen, Ziyuan Liu, Luc van Gool, Marc Pollefeys

Our proposed method enhances cross-domain localization performance, significantly reducing the performance gap.

Visual Localization


Towards High-quality HDR Deghosting with Conditional Diffusion Models

no code implementations • 2 Nov 2023 • Qingsen Yan, Tao Hu, Yuan Sun, Hao Tang, Yu Zhu, Wei Dong, Luc van Gool, Yanning Zhang

To address this challenge, we formulate the HDR deghosting problem as an image generation that leverages LDR features as the diffusion model's condition, consisting of the feature condition generator and the noise predictor.

Denoising • Image Generation


SILC: Improving Vision Language Pretraining with Self-Distillation

no code implementations • 20 Oct 2023 • Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai, Lukas Hoyer, Luc van Gool, Federico Tombari

However, the contrastive objective used by these models only focuses on image-text alignment and does not incentivise image feature learning for dense prediction tasks.

Ranked #1 on Open Vocabulary Semantic Segmentation on PascalVOC-20b

Classification • Contrastive Learning • +8


Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding

1 code implementation • NeurIPS 2023 • Zhejun Zhang, Alexander Liniger, Christos Sakaridis, Fisher Yu, Luc van Gool

The real-world deployment of an autonomous driving system requires its components to run on-board and in real-time, including the motion prediction module that predicts the future trajectories of surrounding traffic participants.

Autonomous Driving • motion prediction


Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

no code implementations • 18 Oct 2023 • Jan-Nico Zaech, Martin Danelljan, Tolga Birdal, Luc van Gool

Adiabatic quantum computing (AQC) is a promising approach for discrete and often NP-hard optimization problems.

Clustering


Discwise Active Learning for LiDAR Semantic Segmentation

no code implementations • 23 Sep 2023 • Ozan Unal, Dengxin Dai, Ali Tamer Unal, Luc van Gool

Finally we propose a semi-supervised learning approach to utilize all frames within our dataset and improve performance.

Active Learning • LIDAR Semantic Segmentation • +1


Breathing New Life into 3D Assets with Generative Repainting

2 code implementations • 15 Sep 2023 • Tianfu Wang, Menelaos Kanakis, Konrad Schindler, Luc van Gool, Anton Obukhov

Diffusion-based text-to-image models ignited immense attention from the vision community, artists, and content creators.


Deformable Neural Radiance Fields using RGB and Event Cameras

no code implementations • ICCV 2023 • Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc van Gool

In this work, we develop a novel method to model the deformable neural radiance fields using RGB and event cameras.


Temporal-aware Hierarchical Mask Classification for Video Semantic Segmentation

1 code implementation • 14 Sep 2023 • Zhaochong An, Guolei Sun, Zongwei Wu, Hao Tang, Luc van Gool

Modern approaches have proved the huge potential of addressing semantic segmentation as a mask classification task which is widely used in instance-level segmentation.

Classification • Decoder • +3


Three Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

no code implementations • 8 Sep 2023 • Ozan Unal, Christos Sakaridis, Suman Saha, Fisher Yu, Luc van Gool

A common formulation to tackle 3D visual grounding is grounding-by-detection, where localization is done via bounding boxes.

3D Instance Segmentation • 3D visual grounding • +3


Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving

no code implementations • ICCV 2023 • Thomas E. Huang, Yifan Liu, Luc van Gool, Fisher Yu

VTD is a promising new direction for exploring the unification of perception tasks in autonomous driving.

Autonomous Driving • Representation Learning • +1


Neural Gradient Regularizer

1 code implementation • 31 Aug 2023 • Shuang Xu, Yifan Wang, Zixiang Zhao, Jiangjun Peng, Xiangyong Cao, Deyu Meng, Yulun Zhang, Radu Timofte, Luc van Gool

NGR is applicable to various image types and different image processing tasks, functioning in a zero-shot learning fashion, making it a versatile and plug-and-play regularizer.

Zero-Shot Learning


Introducing Language Guidance in Prompt-based Continual Learning

1 code implementation • ICCV 2023 • Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Didier Stricker, Federico Tombari, Muhammad Zeshan Afzal

While the model faces a disjoint set of classes in each task in this setting, we argue that these classes can be encoded to the same embedding space of a pre-trained language encoder.

Continual Learning


DiffI2I: Efficient Diffusion Model for Image-to-Image Translation

no code implementations • 26 Aug 2023 • Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool

Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and employ a lighter denoising network and fewer iterations.

Denoising • Image-to-Image Translation • +2


DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation

no code implementations • ICCV 2023 • Hanqing Wang, Wei Liang, Luc van Gool, Wenguan Wang

VLN-CE is a recently released embodied task, where AI agents need to navigate a freely traversable environment to reach a distant target location, given language instructions.

Decision Making • Navigate • +1


When Super-Resolution Meets Camouflaged Object Detection: A Comparison Study

no code implementations • 8 Aug 2023 • Juan Wen, Shupeng Cheng, Peng Xu, BoWen Zhou, Radu Timofte, Weiyan Hou, Luc van Gool

Super Resolution (SR) and Camouflaged Object Detection (COD) are two hot topics in computer vision with various joint applications.

Object • object-detection • +2


How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges

1 code implementation • 27 Jul 2023 • Haotong Qin, Ge-Peng Ji, Salman Khan, Deng-Ping Fan, Fahad Shahbaz Khan, Luc van Gool

Google's Bard has emerged as a formidable competitor to OpenAI's ChatGPT in the field of conversational AI.


Prior Based Online Lane Graph Extraction from Single Onboard Camera Image

no code implementations • 25 Jul 2023 • Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

Thus, online estimation of the lane graph is crucial for widespread and reliable autonomous navigation.

Autonomous Navigation


Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis

1 code implementation • 22 Jul 2023 • Hao Tang, Guolei Sun, Nicu Sebe, Luc van Gool

To tackle 2), we design an effective module to selectively highlight class-dependent feature maps according to the original semantic layout to preserve the semantic information.

Contrastive Learning • Image Generation


Improving Online Lane Graph Extraction by Object-Lane Clustering

no code implementations • ICCV 2023 • Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

In this work, we propose an architecture and loss formulation to improve the accuracy of local lane graph estimates by using 3D object detection outputs.

3D Object Detection • Autonomous Driving • +4


AutoDecoding Latent 3D Diffusion Models

1 code implementation • NeurIPS 2023 • Evangelos Ntavelis, Aliaksandr Siarohin, Kyle Olszewski, Chaoyang Wang, Luc van Gool, Sergey Tulyakov

We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.


Prompting Diffusion Representations for Cross-Domain Semantic Segmentation

no code implementations • 5 Jul 2023 • Rui Gong, Martin Danelljan, Han Sun, Julio Delgado Mangas, Luc van Gool

Intrigued by this result, we set out to explore how well diffusion-pretrained representations generalize to new domains, a crucial ability for any representation.

Domain Generalization • Image Generation • +2


Unbalanced Optimal Transport: A Unified Framework for Object Detection

1 code implementation • CVPR 2023 • Henri De Plaen, Pierre-François De Plaen, Johan A. K. Suykens, Marc Proesmans, Tinne Tuytelaars, Luc van Gool

The approach is well suited for GPU implementation, which proves to be an advantage for large-scale models.

Object • object-detection • +1


Palm: Predicting Actions through Language Models @ Ego4D Long-Term Action Anticipation Challenge 2023

1 code implementation • 28 Jun 2023 • Daoji Huang, Otmar Hilliges, Luc van Gool, Xi Wang

We present Palm, a solution to the Long-Term Action Anticipation (LTA) task utilizing vision-language and large language models.

Action Anticipation • Image Captioning • +3


UncLe-SLAM: Uncertainty Learning for Dense Neural SLAM

1 code implementation • 19 Jun 2023 • Erik Sandström, Kevin Ta, Luc van Gool, Martin R. Oswald

We present an uncertainty learning framework for dense neural simultaneous localization and mapping (SLAM).

Simultaneous Localization and Mapping


SF-FSDA: Source-Free Few-Shot Domain Adaptive Object Detection with Efficient Labeled Data Factory

no code implementations • 7 Jun 2023 • Han Sun, Rui Gong, Konrad Schindler, Luc van Gool

Domain adaptive object detection aims to leverage the knowledge learned from a labeled source domain to improve the performance on an unlabeled target domain.

Object • object-detection • +2


Condition-Invariant Semantic Segmentation

1 code implementation • 27 May 2023 • Christos Sakaridis, David Bruggemann, Fisher Yu, Luc van Gool

Motivated by these findings, we propose to leverage stylization in performing feature-level adaptation by aligning the internal network features extracted by the encoder of the network from the original and the stylized view of each input image with a novel feature invariance loss.

Segmentation • Semantic Segmentation • +1


Equivariant Multi-Modality Image Fusion

3 code implementations • 19 May 2023 • Zixiang Zhao, Haowen Bai, Jiangshe Zhang, Yulun Zhang, Kai Zhang, Shuang Xu, Dongdong Chen, Radu Timofte, Luc van Gool

These components enable the net training to follow the principles of the natural sensing-imaging process while satisfying the equivariant imaging prior.

Self-Supervised Learning


Denoising Diffusion Models for Plug-and-Play Image Restoration

2 code implementations • 15 May 2023 • Yuanzhi Zhu, Kai Zhang, Jingyun Liang, JieZhang Cao, Bihan Wen, Radu Timofte, Luc van Gool

Although diffusion models have shown impressive performance for high-quality image synthesis, their potential to serve as a generative denoiser prior to the plug-and-play IR methods remains to be further explored.

Deblurring • Denoising • +4


StyleGenes: Discrete and Efficient Latent Distributions for GANs

no code implementations • 30 Apr 2023 • Evangelos Ntavelis, Mohamad Shahbazi, Iason Kastanis, Radu Timofte, Martin Danelljan, Luc van Gool

Thus, by independently sampling a variant for each gene and combining them into the final latent vector, our approach can represent a vast number of unique latent samples from a compact set of learnable parameters.

Disentanglement


Event-Free Moving Object Segmentation from Moving Ego Vehicle

2 code implementations • 28 Apr 2023 • Zhuyun Zhou, Zongwei Wu, Danda Pani Paudel, Rémi Boutteau, Fan Yang, Luc van Gool, Radu Timofte, Dominique Ginhac

Subsequently, we devise EmoFormer, a novel network able to exploit the event data.

Autonomous Driving Object +6

Neural Implicit Dense Semantic SLAM

no code implementations • 27 Apr 2023 • Yasaman Haghighi, Suryansh Kumar, Jean-Philippe Thiran, Luc van Gool

Visual Simultaneous Localization and Mapping (vSLAM) is a widely used technique in robotics and computer vision that enables a robot to create a map of an unfamiliar environment using a camera sensor while simultaneously tracking its position over time.

Scene Understanding Semantic Segmentation +1

EDAPS: Enhanced Domain-Adaptive Panoptic Segmentation

1 code implementation • ICCV 2023 • Suman Saha, Lukas Hoyer, Anton Obukhov, Dengxin Dai, Luc van Gool

EDAPS significantly improves the state-of-the-art performance for panoptic segmentation UDA by a large margin of 20% on SYNTHIA-to-Cityscapes and even 72% on the more challenging SYNTHIA-to-Mapillary Vistas.

Ranked #1 on Domain Adaptation on Panoptic SYNTHIA-to-Mapillary

Domain Adaptation Instance Segmentation +2

Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation

3 code implementations • 26 Apr 2023 • Lukas Hoyer, Dengxin Dai, Luc van Gool

As previous UDA&DG semantic segmentation methods are mostly based on outdated networks, we benchmark more recent architectures, reveal the potential of Transformers, and design the DAFormer network tailored for UDA&DG.

Ranked #6 on Domain Generalization on GTA-to-Avg(Cityscapes,BDD,Mapillary)

Domain Generalization Image Segmentation +2

Indiscernible Object Counting in Underwater Scenes

1 code implementation • CVPR 2023 • Guolei Sun, Zhaochong An, Yun Liu, Ce Liu, Christos Sakaridis, Deng-Ping Fan, Luc van Gool

We further advance the frontier of this field by systematically studying a new challenge named indiscernible object counting (IOC), the goal of which is to count objects that are blended with respect to their surroundings.

Benchmarking Object +2

Advances in Deep Concealed Scene Understanding

1 code implementation • 21 Apr 2023 • Deng-Ping Fan, Ge-Peng Ji, Peng Xu, Ming-Ming Cheng, Christos Sakaridis, Luc van Gool

Concealed scene understanding (CSU) is a hot computer vision topic aiming to perceive objects exhibiting camouflage.

Scene Understanding Semantic Segmentation

Quantum Annealing for Single Image Super-Resolution

no code implementations • 18 Apr 2023 • Han Yao Choong, Suryansh Kumar, Luc van Gool

As a result, in this work, we perform an early exploration of applying a quantum computing algorithm to this important image enhancement problem, i.e., SISR.

Combinatorial Optimization Image Enhancement +1

SMAE: Few-shot Learning for HDR Deghosting with Saturation-Aware Masked Autoencoders

no code implementations • CVPR 2023 • Qingsen Yan, Song Zhang, Weiye Chen, Hao Tang, Yu Zhu, Jinqiu Sun, Luc van Gool, Yanning Zhang

In this work, we propose a novel semi-supervised approach to realize few-shot HDR imaging via two stages of training, called SSHDR.

Few-Shot Learning Pseudo Label

SAM Struggles in Concealed Scenes -- Empirical Study on "Segment Anything"

no code implementations • 12 Apr 2023 • Ge-Peng Ji, Deng-Ping Fan, Peng Xu, Ming-Ming Cheng, BoWen Zhou, Luc van Gool

Segmenting anything is a ground-breaking step toward artificial general intelligence, and the Segment Anything Model (SAM) greatly fosters the foundation models for computer vision.

CamDiff: Camouflage Image Augmentation via Diffusion Model

1 code implementation • 11 Apr 2023 • Xue-Jing Luo, Shuo Wang, Zongwei Wu, Christos Sakaridis, Yun Cheng, Deng-Ping Fan, Luc van Gool

Specifically, we leverage the latent diffusion model to synthesize salient objects in camouflaged scenes, while using the zero-shot image classification ability of the Contrastive Language-Image Pre-training (CLIP) model to prevent synthesis failures and ensure the synthesized object aligns with the input prompt.

Image Augmentation Image Classification +3

Point-SLAM: Dense Neural Point Cloud-based SLAM

2 code implementations • ICCV 2023 • Erik Sandström, Yue Li, Luc van Gool, Martin R. Oswald

We propose a dense neural simultaneous localization and mapping (SLAM) approach for monocular RGBD input which anchors the features of a neural scene representation in a point cloud that is iteratively generated in an input-dependent data-driven manner.

Simultaneous Localization and Mapping

Online Lane Graph Extraction from Onboard Video

no code implementations • 3 Apr 2023 • Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

One of the most common and useful representations of such an understanding is the BEV lane graph.

Autonomous Driving Navigate

Single Image Depth Prediction Made Better: A Multivariate Gaussian Take

no code implementations • CVPR 2023 • Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc van Gool

Accordingly, we introduce an approach that performs continuous modeling of per-pixel depth, where we can predict and reason about the per-pixel depth and its distribution.

Depth Estimation Depth Prediction

Enhanced Stable View Synthesis

no code implementations • CVPR 2023 • Nishant Jain, Suryansh Kumar, Luc van Gool

Extensive evaluation of our approach on popular benchmark datasets, such as Tanks and Temples, shows substantial improvement in view synthesis results compared to the prior art.

3D Reconstruction Novel View Synthesis

Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration

no code implementations • CVPR 2023 • Guofeng Mei, Hao Tang, Xiaoshui Huang, Weijie Wang, Juan Liu, Jian Zhang, Luc van Gool, Qiang Wu

Deep point cloud registration methods struggle with partial overlaps and rely on labeled data.

Point Cloud Registration

NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions

1 code implementation • 22 Mar 2023 • Mohamad Shahbazi, Evangelos Ntavelis, Alessio Tonioni, Edo Collins, Danda Pani Paudel, Martin Danelljan, Luc van Gool

Pose-conditioned convolutional generative models struggle with high-quality 3D-consistent image generation from single-view datasets, due to their lack of sufficient 3D priors.

Image Generation Inductive Bias

Lidar Line Selection with Spatially-Aware Shapley Value for Cost-Efficient Depth Completion

no code implementations • 21 Mar 2023 • Kamil Adamczewski, Christos Sakaridis, Vaishakh Patil, Luc van Gool

Lidar is a vital sensor for estimating the depth of a scene.

Depth Completion

DiffIR: Efficient Diffusion Model for Image Restoration

1 code implementation • ICCV 2023 • Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Luc van Gool

Diffusion models (DMs) have achieved SOTA performance by modeling the image synthesis process as a sequential application of a denoising network.

Denoising Image Generation +1

Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution

no code implementations • ICCV 2023 • Zixiang Zhao, Jiangshe Zhang, Xiang Gu, Chengli Tan, Shuang Xu, Yulun Zhang, Radu Timofte, Luc van Gool

Then, the extracted features are mapped to the spherical space to complete the separation of private features and the alignment of shared features.

Contrastive Learning Depth Map Super-Resolution

Graph Transformer GANs for Graph-Constrained House Generation

no code implementations • CVPR 2023 • Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc van Gool

We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.

Generative Adversarial Network House Generation +1

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

3 code implementations • ICCV 2023 • Zixiang Zhao, Haowen Bai, Yuanzhi Zhu, Jiangshe Zhang, Shuang Xu, Yulun Zhang, Kai Zhang, Deyu Meng, Radu Timofte, Luc van Gool

To leverage strong generative priors and address challenges such as unstable training and lack of interpretability for GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).

Denoising

Contrastive Model Adaptation for Cross-Condition Robustness in Semantic Segmentation

1 code implementation • ICCV 2023 • David Bruggemann, Christos Sakaridis, Tim Brödermann, Luc van Gool

We investigate normal-to-adverse condition model adaptation for semantic segmentation, whereby image-level correspondences are available in the target domain.

Ranked #1 on Source-Free Domain Adaptation on Cityscapes to ACDC

Contrastive Learning Semantic Segmentation +2

TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction

2 code implementations • 7 Mar 2023 • Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc van Gool

We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving, and based on TrafficBots we obtain a world model tailored for the planning module of autonomous vehicles.

Autonomous Driving Model-based Reinforcement Learning +1

A Multiplicative Value Function for Safe and Efficient Reinforcement Learning

1 code implementation • 7 Mar 2023 • Nick Bührer, Zhejun Zhang, Alexander Liniger, Fisher Yu, Luc van Gool

To this end, we propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.

Navigate reinforcement-learning +3

Efficient and Explicit Modelling of Image Hierarchies for Image Restoration

1 code implementation • CVPR 2023 • Yawei Li, Yuchen Fan, Xiaoyu Xiang, Denis Demandolx, Rakesh Ranjan, Radu Timofte, Luc van Gool

The aim of this paper is to propose a mechanism to efficiently and explicitly model image hierarchies in the global, regional, and local range for image restoration.

Ranked #1 on Image Defocus Deblurring on DPD (Dual-view)

Image Deblurring Image Defocus Deblurring +1

VA-DepthNet: A Variational Approach to Single Image Depth Prediction

2 code implementations • 13 Feb 2023 • Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc van Gool

While state-of-the-art deep neural network methods for SIDP learn the scene depth from images in a supervised setting, they often overlook the invaluable invariances and priors in the rigid scene space, such as the regularity of the scene.

Ranked #21 on Monocular Depth Estimation on NYU-Depth V2

Depth Prediction Monocular Depth Estimation

No One Left Behind: Real-World Federated Class-Incremental Learning

2 code implementations • 2 Feb 2023 • Jiahua Dong, Hongliu Li, Yang Cong, Gan Sun, Yulun Zhang, Luc van Gool

These issues cause the global model to undergo catastrophic forgetting on old categories when local clients receive new categories consecutively with limited memory for storing old categories.

Class Incremental Learning Federated Learning +1

Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation

no code implementations • 22 Jan 2023 • Razvan-George Pasca, Alexey Gavryushin, Muhammad Hamza, Yen-Ling Kuo, Kaichun Mo, Luc van Gool, Otmar Hilliges, Xi Wang

This task requires an understanding of the spatio-temporal context formed by past actions on objects, coined action context.

Common Sense Reasoning Image Captioning

Event-Based Frame Interpolation with Ad-hoc Deblurring

no code implementations • CVPR 2023 • Lei Sun, Christos Sakaridis, Jingyun Liang, Peng Sun, JieZhang Cao, Kai Zhang, Qi Jiang, Kaiwei Wang, Luc van Gool

The performance of video frame interpolation is inherently correlated with the ability to handle motion in the input scene.

Deblurring Image Deblurring +1

Self-Supervised Burst Super-Resolution

no code implementations • ICCV 2023 • Goutam Bhat, Michaël Gharbi, Jiawen Chen, Luc van Gool, Zhihao Xia

Extensive experiments on real and synthetic data show that, despite only using noisy bursts during training, models trained with our self-supervised strategy match, and sometimes surpass, the quality of fully-supervised baselines trained with synthetic data or weakly-paired ground-truth.

Super-Resolution

Continuous Pseudo-Label Rectified Domain Adaptive Semantic Segmentation With Implicit Neural Representations

no code implementations • CVPR 2023 • Rui Gong, Qin Wang, Martin Danelljan, Dengxin Dai, Luc van Gool

Unsupervised domain adaptation (UDA) for semantic segmentation aims at improving the model performance on the unlabeled target domain by leveraging a labeled source domain.

Pseudo Label Semantic Segmentation +1

Beyond SOT: Tracking Multiple Generic Objects at Once

1 code implementation • 22 Dec 2022 • Christoph Mayer, Martin Danelljan, Ming-Hsuan Yang, Vittorio Ferrari, Luc van Gool, Alina Kuznetsova

Our approach achieves a 4x faster run-time in case of 10 concurrent objects compared to tracking each object independently and outperforms existing single object trackers on our new benchmark.

Attribute Object +1

One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers

no code implementations • 14 Dec 2022 • Rui Gong, Qin Wang, Dengxin Dai, Luc van Gool

Thus, we aim to relieve this need for a large amount of real data, and explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization (OSDG) problem, where only one real-world data sample is available.

Autonomous Driving Domain Adaptation +1

CamoFormer: Masked Separable Attention for Camouflaged Object Detection

1 code implementation • 10 Dec 2022 • Bowen Yin, Xuying Zhang, Qibin Hou, Bo-Yuan Sun, Deng-Ping Fan, Luc van Gool

How to identify and segment camouflaged objects from the background is challenging.

Decoder Object +2

Source-free Depth for Object Pop-out

1 code implementation • ICCV 2023 • Zongwei Wu, Danda Pani Paudel, Deng-Ping Fan, Jingjing Wang, Shuo Wang, Cédric Demonceaux, Radu Timofte, Luc van Gool

In this work, we adapt such depth inference models for object segmentation using the objects' "pop-out" prior in 3D.

Object object-detection +3

CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution

1 code implementation • CVPR 2023 • JieZhang Cao, Qin Wang, Yongqin Xian, Yawei Li, Bingbing Ni, Zhiming Pi, Kai Zhang, Yulun Zhang, Radu Timofte, Luc van Gool

We explicitly design an implicit attention network to learn the ensemble weights for the nearby local features.

Image Super-Resolution

I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification

no code implementations • CVPR 2023 • Muhammad Ferjad Naeem, Muhammad Gul Zain Ali Khan, Yongqin Xian, Muhammad Zeshan Afzal, Didier Stricker, Luc van Gool, Federico Tombari

Our proposed model, I2MVFormer, learns multi-view semantic embeddings for zero-shot image classification with these class views.

Classification Image Classification +3

Surface Normal Clustering for Implicit Representation of Manhattan Scenes

1 code implementation • ICCV 2023 • Nikola Popovic, Danda Pani Paudel, Luc van Gool

In this work, we aim to leverage the geometric prior of Manhattan scenes to improve the implicit neural radiance field representations.

Clustering Novel View Synthesis

MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

1 code implementation • CVPR 2023 • Lukas Hoyer, Dengxin Dai, Haoran Wang, Luc van Gool

MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA.

Ranked #1 on Image-to-Image Translation on Cityscapes-to-Foggy Cityscapes

Image Classification object-detection +4

Knowledge Distillation based Degradation Estimation for Blind Super-Resolution

1 code implementation • 30 Nov 2022 • Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool

It consists of a knowledge distillation based implicit degradation estimator network (KD-IDE) and an efficient SR network.

Blind Super-Resolution Image Super-Resolution +1

CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion

3 code implementations • CVPR 2023 • Zixiang Zhao, Haowen Bai, Jiangshe Zhang, Yulun Zhang, Shuang Xu, Zudi Lin, Radu Timofte, Luc van Gool

We then introduce a dual-branch Transformer-CNN feature extractor with Lite Transformer (LT) blocks leveraging long-range attention to handle low-frequency global features and Invertible Neural Networks (INN) blocks focusing on extracting high-frequency local information.

object-detection Object Detection +1

DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models

no code implementations • ICCV 2023 • Shengqu Cai, Eric Ryan Chan, Songyou Peng, Mohamad Shahbazi, Anton Obukhov, Luc van Gool, Gordon Wetzstein

Scene extrapolation -- the idea of generating novel views by flying into a given image -- is a promising, yet challenging task.

Ranked #1 on Perpetual View Generation on LHQ

Denoising Perpetual View Generation

Piecewise Planar Hulls for Semi-Supervised Learning of 3D Shape and Pose from 2D Images

no code implementations • 14 Nov 2022 • Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

On the one hand, the proposed method learns to segment these planar hulls from the labeled data.

Object Pose Estimation

Advancing Learned Video Compression with In-loop Frame Prediction

1 code implementation • 13 Nov 2022 • Ren Yang, Radu Timofte, Luc van Gool

In this paper, we propose an Advanced Learned Video Compression (ALVC) approach with the in-loop frame prediction module, which is able to effectively predict the target frame from the previously compressed frames, without consuming any bit-rate.

MS-SSIM SSIM +1

MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

no code implementations • 8 Nov 2022 • Andrey Ignatov, Anastasia Sycheva, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc van Gool

While neural networks-based photo processing solutions can provide a better image quality compared to the traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity.

PyNet-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks

1 code implementation • 8 Nov 2022 • Andrey Ignatov, Grigory Malivenko, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc van Gool

The increased importance of mobile photography created a need for fast and performant RAW image processing pipelines capable of producing good visual results in spite of the mobile camera sensor limitations.

Towards Versatile Embodied Navigation

1 code implementation • 30 Oct 2022 • Hanqing Wang, Wei Liang, Luc van Gool, Wenguan Wang

With the emergence of varied visual navigation tasks (e.g., image-/object-/audio-goal and vision-language navigation) that specify the target in different ways, the community has made appealing advances in training specialized agents capable of handling individual navigation tasks well.

Decision Making Vision-Language Navigation +1

TripletTrack: 3D Object Tracking using Triplet Embeddings and LSTM

no code implementations • 28 Oct 2022 • Nicola Marinello, Marc Proesmans, Luc van Gool

We start from an off-the-shelf 3D object detector, and apply a tracking mechanism where objects are matched by an affinity score computed on local object feature embeddings and motion descriptors.

3D Object Tracking Autonomous Driving +2

Masked Vision-Language Transformer in Fashion

1 code implementation • 27 Oct 2022 • Ge-Peng Ji, Mingcheng Zhuge, Dehong Gao, Deng-Ping Fan, Christos Sakaridis, Luc van Gool

We present a masked vision-language transformer (MVLT) for fashion-specific multi-modal representation.

Image Reconstruction Retrieval

Learning Attention Propagation for Compositional Zero-Shot Learning

no code implementations • 20 Oct 2022 • Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal

CAPE learns to identify this structure and propagates knowledge between them to learn class embedding for all seen and unseen compositions.

Compositional Zero-Shot Learning

Multi-View Photometric Stereo Revisited

no code implementations • 14 Oct 2022 • Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

The proposed approach in this paper exploits the benefit of uncertainty modeling in a deep neural network for a reliable fusion of photometric stereo (PS) and multi-view stereo (MVS) network predictions.

3D Shape Representation

Composite Learning for Robust and Effective Dense Predictions

no code implementations • 13 Oct 2022 • Menelaos Kanakis, Thomas E. Huang, David Bruggemann, Fisher Yu, Luc van Gool

In this paper, we find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.

Ranked #104 on Semantic Segmentation on NYU Depth v2

Boundary Detection Monocular Depth Estimation +3

SiNeRF: Sinusoidal Neural Radiance Fields for Joint Pose Estimation and Scene Reconstruction

1 code implementation • 10 Oct 2022 • Yitong Xia, Hao Tang, Radu Timofte, Luc van Gool

NeRFmm is a Neural Radiance Fields (NeRF) method that deals with joint optimization tasks, i.e., reconstructing real-world scenes and registering camera parameters simultaneously.

Image Generation Pose Estimation

Robustifying the Multi-Scale Representation of Neural Radiance Fields

no code implementations • 9 Oct 2022 • Nishant Jain, Suryansh Kumar, Luc van Gool

Although the recently proposed Mip-NeRF can handle multi-scale imaging problems with NeRF, it cannot handle camera pose estimation errors.

Graph Neural Network Pose Estimation

Basic Binary Convolution Unit for Binarized Image Restoration Network

2 code implementations • 2 Oct 2022 • Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc van Gool

In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for IR tasks.

Binarization Image Restoration +1

TT-NF: Tensor Train Neural Fields

1 code implementation • 30 Sep 2022 • Anton Obukhov, Mikhail Usvyatsov, Christos Sakaridis, Konrad Schindler, Luc van Gool

Learning neural fields has been an active topic in deep learning research, focusing, among other issues, on finding more compact and easy-to-fit representations.

Denoising Low-rank compression

Physical Adversarial Attack meets Computer Vision: A Decade Survey

1 code implementation • 30 Sep 2022 • Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin'ichi Satoh, Luc van Gool, Zheng Wang

Building upon this foundation, we uncover the pervasive role of artifacts carrying adversarial perturbations in the physical world.

Adversarial Attack Medical Diagnosis

Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection

1 code implementation • 28 Sep 2022 • Yifan Lu, Gurkirt Singh, Suman Saha, Luc van Gool

We propose a novel domain adaptive action detection approach and a new adaptation protocol that leverages the recent advancements in image-level unsupervised domain adaptation (UDA) techniques and handles the vagaries of instance-level video data.

Action Detection Pseudo Label +2

I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification

no code implementations • 21 Sep 2022 • Muhammad Ferjad Naeem, Yongqin Xian, Luc van Gool, Federico Tombari

In order to distill discriminative visual words from noisy documents, we introduce a new cross-modal attention module that learns fine-grained interactions between image patches and document words.

Generalized Zero-Shot Learning Image Classification +2

Spatio-Temporal Action Detection Under Large Motion

no code implementations • 6 Sep 2022 • Gurkirt Singh, Vasileios Choutas, Suman Saha, Fisher Yu, Luc van Gool

Current methods for spatiotemporal action tube detection often extend a bounding box proposal at a given keyframe into a 3D temporal cuboid and pool features from nearby frames.

Action Detection

Learning Task-Oriented Flows to Mutually Guide Feature Alignment in Synthesized and Real Video Denoising

no code implementations • 25 Aug 2022 • JieZhang Cao, Qin Wang, Jingyun Liang, Yulun Zhang, Kai Zhang, Radu Timofte, Luc van Gool

To this end, we propose a new multi-scale refined optical flow-guided video denoising method, which is more robust to different noise levels.

Ranked #1 on Video Denoising on VideoLQ

Denoising Optical Flow Estimation +1

ManiFlow: Implicitly Representing Manifolds with Normalizing Flows

no code implementations • 18 Aug 2022 • Janis Postels, Martin Danelljan, Luc van Gool, Federico Tombari

In contrast to prior work, we approach this problem by generating samples from the original data distribution given full knowledge about the perturbed distribution and the noise model.

Surface Reconstruction

AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility

1 code implementation • 14 Aug 2022 • Mubashir Noman, Wafa Al Ghallabi, Daniya Najiha, Christoph Mayer, Akshay Dudhane, Martin Danelljan, Hisham Cholakkal, Salman Khan, Luc van Gool, Fahad Shahbaz Khan

While greatly benefiting tracking research, existing benchmarks no longer pose the same difficulty as before, with recent trackers achieving higher performance mainly due to (i) the introduction of more sophisticated transformer-based methods and (ii) the lack of diverse scenarios with adverse visibility, such as severe weather conditions, camouflage, and imaging effects.

Visual Object Tracking Visual Tracking

Reference-based Image Super-Resolution with Deformable Attention Transformer

1 code implementation • 25 Jul 2022 • JieZhang Cao, Jingyun Liang, Kai Zhang, Yawei Li, Yulun Zhang, Wenguan Wang, Luc van Gool

Reference-based image super-resolution (RefSR) aims to exploit auxiliary reference (Ref) images to super-resolve low-resolution (LR) images.

Ranked #1 on Reference-based Super-Resolution on CUFED5 - 4x upscaling

Image Super-Resolution Reference-based Super-Resolution

Towards Interpretable Video Super-Resolution via Alternating Optimization

1 code implementation • 21 Jul 2022 • JieZhang Cao, Jingyun Liang, Kai Zhang, Wenguan Wang, Qin Wang, Yulun Zhang, Hao Tang, Luc van Gool

These issues can be alleviated by a cascade of three separate sub-tasks, including video deblurring, frame interpolation, and super-resolution, which, however, would fail to capture the spatial and temporal correlations among video sequences.

Deblurring Space-time Video Super-resolution +2

Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation

1 code implementation • 21 Jul 2022 • Guolei Sun, Yun Liu, Hao Tang, Ajad Chhatkuli, Le Zhang, Luc van Gool

The essence of video semantic segmentation (VSS) is how to leverage temporal information for prediction.

Optical Flow Estimation Semantic Segmentation +1

Refign: Align and Refine for Adaptation of Semantic Segmentation to Adverse Conditions

1 code implementation • 14 Jul 2022 • David Bruggemann, Christos Sakaridis, Prune Truong, Luc van Gool

Due to the scarcity of dense pixel-level semantic annotations for images recorded in adverse visual conditions, there has been a keen interest in unsupervised domain adaptation (UDA) for the semantic segmentation of such images.

Ranked #1 on Semantic Segmentation on Dark Zurich

Semantic Segmentation Unsupervised Domain Adaptation

Organic Priors in Non-Rigid Structure from Motion

no code implementations • 13 Jul 2022 • Suryansh Kumar, Luc van Gool

Besides that, the paper provides insights into the NRSfM factorization -- both in terms of shape and motion -- and is the first approach to show the benefit of single rotation averaging for NRSfM.

OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers

1 code implementation • 5 Jul 2022 • Jialun Pei, Tianyang Cheng, Deng-Ping Fan, He Tang, Chuanbo Chen, Luc van Gool

We present OSFormer, the first one-stage transformer framework for camouflaged instance segmentation (CIS).

Instance Segmentation Semantic Segmentation

L2E: Lasers to Events for 6-DoF Extrinsic Calibration of Lidars and Event Cameras

1 code implementation • 3 Jul 2022 • Kevin Ta, David Bruggemann, Tim Brödermann, Christos Sakaridis, Luc van Gool

As neuromorphic technology is maturing, its application to robotics and autonomous vehicle systems has become an area of active research.

Autonomous Driving Camera Calibration

HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection

1 code implementation • 30 Jun 2022 • Tim Broedermann, Christos Sakaridis, Dengxin Dai, Luc van Gool

Besides standard cameras, autonomous vehicles typically include multiple additional sensors, such as lidars and radars, which help acquire richer information for perceiving the content of the driving scene.

Ranked #1 on 2D Object Detection on Clear Weather

Autonomous Vehicles object-detection +3

3D-Aware Video Generation

1 code implementation • 29 Jun 2022 • Sherwin Bahmani, Jeong Joon Park, Despoina Paschalidou, Hao Tang, Gordon Wetzstein, Leonidas Guibas, Luc van Gool, Radu Timofte

Generative models have emerged as an essential building block for many image synthesis and editing tasks.

Image Generation Video Generation

SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation

1 code implementation • CVPR 2022 • Tao Sun, Mattia Segu, Janis Postels, Yuxuan Wang, Luc van Gool, Bernt Schiele, Federico Tombari, Fisher Yu

Adapting to a continuously evolving environment is a safety-critical challenge inevitably faced by all autonomous driving systems.

Autonomous Driving Domain Adaptation

Structured Sparsity Learning for Efficient Video Super-Resolution

1 code implementation • CVPR 2023 • Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc van Gool

In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.

Video Super-Resolution

Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation

1 code implementation • 13 Jun 2022 • Wouter Van Gansbeke, Simon Vandenhende, Luc van Gool

This paper presents MaskDistill: a novel framework for unsupervised semantic segmentation based on three key ideas.

Ranked #4 on Unsupervised Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)

Object Segmentation +1

Paper
Code

Recurrent Video Restoration Transformer with Guided Deformable Attention

3 code implementations • 5 Jun 2022 • Jingyun Liang, Yuchen Fan, Xiaoyu Xiang, Rakesh Ranjan, Eddy Ilg, Simon Green, JieZhang Cao, Kai Zhang, Radu Timofte, Luc van Gool

Specifically, RVRT divides the video into multiple clips and uses the previously inferred clip feature to estimate the subsequent clip feature.

Ranked #1 on Video Super-Resolution on Vid4 - 4x upscaling - BD degradation

Analog Video Restoration Deblurring +3

336

Paper
Code

Gradient Obfuscation Checklist Test Gives a False Sense of Security

no code implementations • 3 Jun 2022 • Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc van Gool

It has since become a trend to use these five characteristics as a sufficient test, to determine whether or not gradient obfuscation is the main source of robustness.

Paper
Add Code

GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector

3 code implementations • 30 May 2022 • Peng Zheng, Huazhu Fu, Deng-Ping Fan, Qi Fan, Jie Qin, Yu-Wing Tai, Chi-Keung Tang, Luc van Gool

In this paper, we present a novel end-to-end group collaborative learning network, termed GCoNet+, which can effectively and efficiently (250 fps) identify co-salient objects in natural scenes.

Ranked #1 on Co-Salient Object Detection on CoCA

Co-Salient Object Detection Object +2

212

Paper
Code

Deep Gradient Learning for Efficient Camouflaged Object Detection

1 code implementation • 25 May 2022 • Ge-Peng Ji, Deng-Ping Fan, Yu-Cheng Chou, Dengxin Dai, Alexander Liniger, Luc van Gool

This paper introduces DGNet, a novel deep framework that exploits object gradient supervision for camouflaged object detection (COD).

Defect Detection Object +4

Paper
Code

Unsupervised Flow-Aligned Sequence-to-Sequence Learning for Video Restoration

1 code implementation • 20 May 2022 • Jing Lin, Xiaowan Hu, Yuanhao Cai, Haoqian Wang, Youliang Yan, Xueyi Zou, Yulun Zhang, Luc van Gool

On the other hand, we equip the sequence-to-sequence model with an unsupervised optical flow estimator to maximize its potential.

Ranked #2 on Video Enhancement on MFQE v2

Deblurring Optical Flow Estimation +3

149

Paper
Code

Degradation-Aware Unfolding Half-Shuffle Transformer for Spectral Compressive Imaging

1 code implementation • 20 May 2022 • Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Henghui Ding, Yulun Zhang, Radu Timofte, Luc van Gool

In coded aperture snapshot spectral compressive imaging (CASSI) systems, hyperspectral image (HSI) reconstruction methods are employed to recover the spatial-spectral signal from a compressed measurement.

Ranked #1 on Spectral Reconstruction on Real HSI

Compressive Sensing Image Reconstruction +1

505

Paper
Code

Revisiting Random Channel Pruning for Neural Network Compression

1 code implementation • CVPR 2022 • Yawei Li, Kamil Adamczewski, Wen Li, Shuhang Gu, Radu Timofte, Luc van Gool

The proposed approach provides a new way to compare different methods, namely how well they behave compared with random pruning.

Neural Network Compression

Paper
Code

A Continual Deepfake Detection Benchmark: Dataset, Methods, and Essentials

1 code implementation • 11 May 2022 • Chuqiao Li, Zhiwu Huang, Danda Pani Paudel, Yabin Wang, Mohamad Shahbazi, Xiaopeng Hong, Luc van Gool

Within the proposed benchmark, we explore some commonly known essentials of standard continual learning.

Continual Learning DeepFake Detection +2

Paper
Code

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations • 11 May 2022 • Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29.00dB on DIV2K validation set.

Image Super-Resolution

117

Paper
Code

HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation

1 code implementation • 27 Apr 2022 • Lukas Hoyer, Dengxin Dai, Luc van Gool

Therefore, we propose HRDA, a multi-resolution training approach for UDA, that combines the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies with a learned scale attention, while maintaining a manageable GPU memory footprint.

Ranked #3 on Semantic Segmentation on GTAV-to-Cityscapes Labels

Segmentation Semantic Segmentation +3

230

Paper
Code

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

3 code implementations • 17 Apr 2022 • Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Radu Timofte, Luc van Gool

Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI).

Ranked #1 on Spectral Reconstruction on ARAD-1K

Spectral Reconstruction Spectral Super-Resolution

4,327

Paper
Code

Neural Vector Fields for Implicit Surface Representation and Inference

1 code implementation • 13 Apr 2022 • Edoardo Mello Rella, Ajad Chhatkuli, Ender Konukoglu, Luc van Gool

With neural networks, several other variations and training principles have been proposed with the goal to represent all classes of shapes.

Paper
Code

Learning Local and Global Temporal Contexts for Video Semantic Segmentation

1 code implementation • CVPR 2022 • Guolei Sun, Yun Liu, Henghui Ding, Min Wu, Luc van Gool

Specifically, we uniformly sample certain frames from the video and extract global contextual prototypes by k-means.

Segmentation Semantic Segmentation +1

Paper
Code

Learning Online Multi-Sensor Depth Fusion

1 code implementation • 7 Apr 2022 • Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc van Gool

Multi-sensor depth fusion is able to substantially improve the robustness and accuracy of 3D reconstruction methods, but existing techniques are not robust enough to handle sensors which operate with diverse value ranges as well as noise and outlier statistics.

3D Reconstruction Mixed Reality +1

Paper
Code

P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior

1 code implementation • CVPR 2022 • Vaishakh Patil, Christos Sakaridis, Alexander Liniger, Luc van Gool

We focus on the supervised setup, in which ground-truth depth is available only at training time.

Ranked #6 on Depth Estimation on NYU-Depth V2

Monocular Depth Estimation Scene Understanding

119

Paper
Code

Arbitrary-Scale Image Synthesis

1 code implementation • CVPR 2022 • Evangelos Ntavelis, Mohamad Shahbazi, Iason Kastanis, Radu Timofte, Martin Danelljan, Luc van Gool

Positional encodings have enabled recent works to train a single adversarial network that can generate images of different scales.

Image Generation

Paper
Code

Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models

no code implementations • 5 Apr 2022 • Jose L. Vazquez, Alexander Liniger, Wilko Schwarting, Daniela Rus, Luc van Gool

Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.

motion prediction

Paper
Add Code

Direct Dense Pose Estimation

no code implementations • 4 Apr 2022 • Liqian Ma, Lingjie Liu, Christian Theobalt, Luc van Gool

In addition, DDP is computationally more efficient than previous dense pose estimation methods, and it reduces jitters when applied to a video sequence, which is a problem plaguing the previous methods.

Action Recognition Pose Estimation +2

Paper
Add Code

FoV-Net: Field-of-View Extrapolation Using Self-Attention and Uncertainty

no code implementations • 4 Apr 2022 • Liqian Ma, Stamatios Georgoulis, Xu Jia, Luc van Gool

The ability to make educated predictions about their surroundings, and associate them with certain confidence, is important for intelligent systems, like autonomous vehicles and robots.

Autonomous Vehicles Decision Making

Paper
Add Code

Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation

1 code implementation • CVPR 2022 • Hanqing Wang, Wei Liang, Jianbing Shen, Luc van Gool, Wenguan Wang

Since the rise of vision-language navigation (VLN), great progress has been made in instruction following -- building a follower to navigate environments under the guidance of instructions.

counterfactual Data Augmentation +3

Paper
Code

LiDAR Snowfall Simulation for Robust 3D Object Detection

1 code implementation • CVPR 2022 • Martin Hahner, Christos Sakaridis, Mario Bijelic, Felix Heide, Fisher Yu, Dengxin Dai, Luc van Gool

Due to the difficulty of collecting and annotating training data in this setting, we propose a physically based method to simulate the effect of snowfall on real clear-weather LiDAR point clouds.

Ranked #1 on 3D Object Detection on Heavy Snowfall

Autonomous Driving Object +3

166

Paper
Code

Rethinking Semantic Segmentation: A Prototype View

1 code implementation • CVPR 2022 • Tianfei Zhou, Wenguan Wang, Ender Konukoglu, Luc van Gool

Prevalent semantic segmentation solutions, despite their different network designs (FCN based or attention based) and mask decoding strategies (parametric softmax based or pixel-query based), can be placed in one category, by considering the softmax weights or query vectors as learnable class prototypes.

Segmentation Semantic Segmentation

328

Paper
Code

Video Polyp Segmentation: A Deep Learning Perspective

4 code implementations • 27 Mar 2022 • Ge-Peng Ji, Guobao Xiao, Yu-Cheng Chou, Deng-Ping Fan, Kai Zhao, Geng Chen, Luc van Gool

We present the first comprehensive video polyp segmentation (VPS) study in the deep learning era.

Ranked #2 on Video Polyp Segmentation on SUN-SEG-Easy (Unseen)

Attribute Segmentation +4

412

Paper
Code

Spatially Multi-conditional Image Generation

no code implementations • 25 Mar 2022 • Ritika Chakraborty, Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc van Gool

However, multi-conditional image generation is a very challenging problem due to the heterogeneity and the sparsity of the (in practice) available conditioning labels.

Conditional Image Generation Missing Labels

Paper
Add Code

Continual Test-Time Domain Adaptation

2 code implementations • CVPR 2022 • Qin Wang, Olga Fink, Luc van Gool, Dengxin Dai

However, real-world machine perception systems are running in non-stationary and continually changing environments where the target domain distribution can change over time.

Test-time Adaptation

209

Paper
Code

Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis

2 code implementations • 24 Mar 2022 • Kai Zhang, Yawei Li, Jingyun Liang, JieZhang Cao, Yulun Zhang, Hao Tang, Deng-Ping Fan, Radu Timofte, Luc van Gool

While recent years have witnessed a dramatic upsurge of exploiting deep neural networks toward solving image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains unsolved.

Ranked #1 on Image Denoising on urban100 sigma15

Image Denoising Image-to-Image Translation

601

Paper
Code

Robust Visual Tracking by Segmentation

2 code implementations • 21 Mar 2022 • Matthieu Paul, Martin Danelljan, Christoph Mayer, Luc van Gool

We infer a bounding box from the segmentation mask, validate our tracker on challenging tracking datasets and achieve the new state of the art on LaSOT with a success AUC score of 69.7%.

Decoder Segmentation +5

3,114

Paper
Code

Transforming Model Prediction for Tracking

1 code implementation • CVPR 2022 • Christoph Mayer, Martin Danelljan, Goutam Bhat, Matthieu Paul, Danda Pani Paudel, Fisher Yu, Luc van Gool

Optimization based tracking methods have been widely successful by integrating a target model prediction module, providing effective global reasoning by minimizing an objective function.

Ranked #21 on Visual Object Tracking on LaSOT (Precision metric)

Inductive Bias Visual Object Tracking

3,114

Paper
Code

Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild

no code implementations • 20 Mar 2022 • Ardhendu Shekhar Tripathi, Martin Danelljan, Samarth Shukla, Radu Timofte, Luc van Gool

We propose a trainable Image Signal Processing (ISP) framework that produces DSLR quality images given RAW images captured by a smartphone.

Motion Estimation

Paper
Add Code

Scribble-Supervised LiDAR Semantic Segmentation

3 code implementations • CVPR 2022 • Ozan Unal, Dengxin Dai, Luc van Gool

Densely annotating LiDAR point clouds remains too expensive and time-consuming to keep up with the ever growing volume of data.

Ranked #2 on 3D Semantic Segmentation on ScribbleKITTI

3D Semantic Segmentation LIDAR Semantic Segmentation +1

309

Paper
Code

Zero Pixel Directional Boundary by Vector Transform

1 code implementation • ICLR 2022 • Edoardo Mello Rella, Ajad Chhatkuli, Yun Liu, Ender Konukoglu, Luc van Gool

One of the key problems in boundary detection is the label representation, which typically leads to class imbalance and, as a consequence, to thick boundaries that require non-differential post-processing steps to be thinned.

Boundary Detection

Paper
Code

Revisiting Deep Semi-supervised Learning: An Empirical Distribution Alignment Framework and Its Generalization Bound

no code implementations • 13 Mar 2022 • Feiyu Wang, Qin Wang, Wen Li, Dong Xu, Luc van Gool

Benefiting from this new perspective, we first propose a new deep semi-supervised learning framework called Semi-supervised Learning by Empirical Distribution Alignment (SLEDA), in which existing technologies from the domain adaptation community can be readily used to address the semi-supervised learning problem through reducing the empirical distribution distance between labeled and unlabeled data.

Data Augmentation Domain Adaptation

Paper
Add Code

Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction

1 code implementation • 9 Mar 2022 • Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool

Many algorithms have been developed to solve the inverse problem of coded aperture snapshot spectral imaging (CASSI), i.e., recovering the 3D hyperspectral images (HSIs) from a 2D compressive measurement.

Ranked #2 on Spectral Reconstruction on Real HSI

Compressive Sensing Image Reconstruction +1

505

Paper
Code

Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences

1 code implementation • CVPR 2022 • Prune Truong, Martin Danelljan, Fisher Yu, Luc van Gool

We propose Probabilistic Warp Consistency, a weakly-supervised learning objective for semantic matching.

Weakly-supervised Learning

632

Paper
Code

Barlow constrained optimization for Visual Question Answering

1 code implementation • 7 Mar 2022 • Abhishek Jha, Badri N. Patro, Luc van Gool, Tinne Tuytelaars

In this paper, we propose a novel regularization for VQA models, Constrained Optimization using Barlow's theory (COB), that improves the information content of the joint space by minimizing the redundancy.

Question Answering Visual Question Answering

Paper
Code

ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization

1 code implementation • 7 Mar 2022 • Menelaos Kanakis, Simon Maurer, Matteo Spallanzani, Ajad Chhatkuli, Luc van Gool

Efficient detection and description of geometric regions in images is a prerequisite in visual systems for localization and mapping.

Homography Estimation Quantization +1

Paper
Code

HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging

2 code implementations • CVPR 2022 • Xiaowan Hu, Yuanhao Cai, Jing Lin, Haoqian Wang, Xin Yuan, Yulun Zhang, Radu Timofte, Luc van Gool

On the one hand, the proposed HR spatial-spectral attention module with its efficient feature fusion provides continuous and fine pixel-level features.

Ranked #5 on Spectral Reconstruction on Real HSI

Compressive Sensing Image Reconstruction +2

505

Paper
Code

Uncertainty-Aware Deep Multi-View Photometric Stereo

no code implementations • CVPR 2022 • Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

At each pixel, our approach either selects or discards deep-PS and deep-MVS network prediction depending on the prediction uncertainty measure.

Surface Reconstruction

Paper
Add Code

Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation

2 code implementations • 26 Feb 2022 • Shengqu Cai, Anton Obukhov, Dengxin Dai, Luc van Gool

We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image.

3D-Aware Image Synthesis Novel View Synthesis +2

267

Paper
Code

Adiabatic Quantum Computing for Multi Object Tracking

no code implementations • CVPR 2022 • Jan-Nico Zaech, Alexander Liniger, Martin Danelljan, Dengxin Dai, Luc van Gool

Multi-Object Tracking (MOT) is most often approached in the tracking-by-detection paradigm, where object detections are associated through time.

Multi-Object Tracking Object

Paper
Add Code

Fast Online Video Super-Resolution with Deformable Attention Pyramid

no code implementations • 3 Feb 2022 • Dario Fuoli, Martin Danelljan, Radu Timofte, Luc van Gool

Our DAP aligns and integrates information from the recurrent state into the current frame prediction.

Video Super-Resolution

Paper
Add Code

VRT: A Video Restoration Transformer

1 code implementation • 28 Jan 2022 • Jingyun Liang, JieZhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc van Gool

Besides, parallel warping is used to further fuse information from neighboring frames by parallel feature warping.

Ranked #1 on Deblurring on BASED

Deblurring Denoising +7

1,273

Paper
Code

Revisiting RCAN: Improved Training for Image Super-Resolution

6 code implementations • 27 Jan 2022 • Zudi Lin, Prateek Garg, Atmadeep Banerjee, Salma Abdel Magid, Deqing Sun, Yulun Zhang, Luc van Gool, Donglai Wei, Hanspeter Pfister

Image super-resolution (SR) is a fast-moving field with novel architectures attracting the spotlight.

Image Super-Resolution

Paper
Code

RePaint: Inpainting using Denoising Diffusion Probabilistic Models

3 code implementations • CVPR 2022 • Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc van Gool

In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.

Denoising Image Inpainting

10,906

Paper
Code

Collapse by Conditioning: Training Class-conditional GANs with Limited Data

1 code implementation • ICLR 2022 • Mohamad Shahbazi, Martin Danelljan, Danda Pani Paudel, Luc van Gool

On the contrary, we observe that class-conditioning causes mode collapse in limited data settings, where unconditional learning leads to satisfactory generative ability.

Generative Adversarial Network

Paper
Code

End-To-End Optimization of LiDAR Beam Configuration for 3D Object Detection and Localization

1 code implementation • 11 Jan 2022 • Niclas Vödisch, Ozan Unal, Ke Li, Luc van Gool, Dengxin Dai

In this work, we take a new route to learn to optimize the LiDAR beam configuration for a given application.

3D Object Detection object-detection

Paper
Code

Flow-Guided Sparse Transformer for Video Deblurring

1 code implementation • 6 Jan 2022 • Jing Lin, Yuanhao Cai, Xiaowan Hu, Haoqian Wang, Youliang Yan, Xueyi Zou, Henghui Ding, Yulun Zhang, Radu Timofte, Luc van Gool

Exploiting similar and sharper scene patches in spatio-temporal neighborhoods is critical for video deblurring.

Ranked #1 on Deblurring on DVD

Deblurring Optical Flow Estimation

149

Paper
Code

Sound and Visual Representation Learning with Multiple Pretraining Tasks

no code implementations • CVPR 2022 • Arun Balajee Vasudevan, Dengxin Dai, Luc van Gool

Specifically, for this study, we investigate binaural sounds and image data in isolation.

Incremental Learning Representation Learning +3

Paper
Add Code

Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation

1 code implementation • CVPR 2022 • Shengqu Cai, Anton Obukhov, Dengxin Dai, Luc van Gool

We propose a pipeline to generate Neural Radiance Fields (NeRF) of an object or a scene of a specific class, conditioned on a single input image.

3D-Aware Image Synthesis Novel View Synthesis +2

267

Paper
Code
