Search Results for author: Kailun Yang

Found 96 papers, 78 papers with code

DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction

no code implementations • 9 May 2024 • Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang

In this paper, we revisit the temporal fusion of vectorized HD maps, focusing on temporal instance consistency and temporal map consistency learning.

Contrastive Learning Scene Understanding +1

Paper
Add Code

Design, analysis, and manufacturing of a glass-plastic hybrid minimalist aspheric panoramic annular lens

no code implementations • 5 May 2024 • Shaohua Gao, Qi Jiang, Yiqi Liao, Yi Qiu, Wanglei Ying, Kailun Yang, Kaiwei Wang, Benhao Zhang, Jian Bai

We propose a high-performance glass-plastic hybrid minimalist aspheric panoramic annular lens (ASPAL) to solve several major limitations of the traditional panoramic annular lens (PAL), such as large size, high weight, and complex system.

Paper
Add Code

Towards Consistent Object Detection via LiDAR-Camera Synergy

1 code implementation • 2 May 2024 • Kai Luo, Hao Wu, Kefu Yi, Kailun Yang, Wei Hao, Rongdong Hu

In light of this, this paper introduces an end-to-end Consistency Object Detection (COD) algorithm framework that requires only a single forward inference to simultaneously obtain an object's position in both point clouds and images and establish their correlation.

Object object-detection +2

Paper
Code

Global Search Optics: Automatically Exploring Optimal Solutions to Compact Computational Imaging Systems

no code implementations • 30 Apr 2024 • Yao Gao, Qi Jiang, Shaohua Gao, Lei Sun, Kailun Yang, Kaiwei Wang

In this work, we present Global Search Optics (GSO) to automatically design compact computational imaging systems through two parts: (i) Fused Optimization Method for Automatic Optical Design (OptiFusion), which searches for diverse initial optical systems under certain design specifications; and (ii) Efficient Physic-aware Joint Optimization (EPJO), which conducts parallel joint optimization of initial optical systems and image reconstruction networks with the consideration of physical constraints, culminating in the selection of the optimal solution.

Image Reconstruction

Paper
Add Code

CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions

1 code implementation • 25 Apr 2024 • Haoyuan Li, Qi Hu, You Yao, Kailun Yang, Peng Chen

Furthermore, we introduce the Cross-modality Fusion Mamba with Weather-removal (CFMW) to augment detection accuracy in adverse weather conditions.

Multispectral Object Detection Object +2

Paper
Code

MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model

1 code implementation • 19 Apr 2024 • Kang Zeng, Hao Shi, Jiacheng Lin, Siyu Li, Jintao Cheng, Kaiwei Wang, Zhiyong Li, Kailun Yang

In this paper, we propose a novel LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model, termed MambaMOS.

Object Semantic Segmentation

Paper
Code

Skeleton-Based Human Action Recognition with Noisy Labels

1 code implementation • 15 Mar 2024 • Yi Xu, Kunyu Peng, Di Wen, Ruiping Liu, Junwei Zheng, Yufan Chen, Jiaming Zhang, Alina Roitberg, Kailun Yang, Rainer Stiefelhagen

In this study, we bridge this gap by implementing a framework that augments well-established skeleton-based human action recognition methods with label-denoising strategies from various research areas to serve as the initial benchmark.

Action Recognition Denoising +3

Paper
Code

Real-World Computational Aberration Correction via Quantized Domain-Mixing Representation

1 code implementation • 15 Mar 2024 • Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang

Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal performance in real-world applications.

Unsupervised Domain Adaptation

Paper
Code

OccFiner: Offboard Occupancy Refinement with Hybrid Propagation

no code implementations • 13 Mar 2024 • Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang

Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision.

3D Semantic Scene Completion

Paper
Add Code

EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving

1 code implementation • 28 Feb 2024 • Jiacheng Lin, Jiajun Chen, Kunyu Peng, Xuan He, Zhiyong Li, Rainer Stiefelhagen, Kailun Yang

This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and appears as a challenging problem in autonomous driving.

Autonomous Driving Multi-Object Tracking +1

Paper
Code

Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers

1 code implementation • 30 Jan 2024 • Jianbin Jiao, Xina Cheng, WeiJie Chen, Xiaoting Yin, Hao Shi, Kailun Yang

Due to the challenges in data collection, mainstream datasets of 3D human pose estimation are primarily composed of multi-view video data collected in laboratory environments, which contains rich spatial-temporal correlation information besides the image frame content.

3D Human Pose Estimation Scene Understanding

Paper
Code

LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras

1 code implementation • 30 Jan 2024 • Fei Teng, Jiaming Zhang, Jiawei Liu, Kunyu Peng, Xina Cheng, Zhiyong Li, Kailun Yang

Previous approaches predominantly employ a custom two-stream design to discover the implicit angular feature within light field cameras, leading to significant information isolation between different LF representations.

Data Augmentation Decoder +3

Paper
Code

Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation

1 code implementation • 30 Jan 2024 • Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen

Integrating information from multiple modalities enhances the robustness of scene perception systems in autonomous vehicles, providing a more comprehensive and reliable sensory framework.

Autonomous Vehicles Scene Segmentation

Paper
Code

Navigating Open Set Scenarios for Skeleton-based Action Recognition

1 code implementation • 11 Dec 2023 • Kunyu Peng, Cheng Yin, Junwei Zheng, Ruiping Liu, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg

In real-world scenarios, human actions often fall outside the distribution of training data, making it crucial for models to recognize known actions and reject unknown ones.

Novelty Detection Open Set Action Recognition +3

Paper
Code

Rethinking Event-based Human Pose Estimation with 3D Event Representations

1 code implementation • 8 Nov 2023 • Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni, Kailun Yang, Kaiwei Wang

Experiments on EV-3DPW demonstrate that the robustness of our proposed 3D representation methods compared to traditional RGB images and event frame techniques under the same backbones.

Autonomous Driving Pose Estimation

Paper
Code

CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

1 code implementation • 4 Oct 2023 • Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety.

Ranked #2 on 3D Object Detection on Rope3D

feature selection Monocular 3D Object Detection +1

Paper
Code

Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision

1 code implementation • 21 Sep 2023 • Yiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

These works overlooked the differences in performance among modalities, which led to the propagation of erroneous knowledge between modalities while only three fundamental modalities, i. e., joints, bones, and motions are used, hence no additional modalities are explored.

Action Recognition Knowledge Distillation +3

Paper
Code

Unveiling the Hidden Realm: Self-supervised Skeleton-based Action Recognition in Occluded Environments

1 code implementation • 21 Sep 2023 • Yifei Chen, Kunyu Peng, Alina Roitberg, David Schneider, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

To integrate action recognition methods into autonomous robotic systems, it is crucial to consider adverse situations involving target occlusions.

Action Recognition Imputation +1

Paper
Code

S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection

no code implementations • 2 Sep 2023 • Xuan He, Kailun Yang, Junwei Zheng, Jin Yuan, Luis M. Bergasa, HUI ZHANG, Zhiyong Li

These methods typically use visual and depth representations to generate query points on objects, whose quality plays a decisive role in the detection accuracy.

Monocular 3D Object Detection object-detection

Paper
Add Code

FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving

1 code implementation • 14 Aug 2023 • Zhonghua Yi, Hao Shi, Kailun Yang, Qi Jiang, Yaozu Ye, Ze Wang, Huajian Ni, Kaiwei Wang

Based on the modeling method, we present FocusFlow, a framework consisting of 1) a mix loss function combined with a classic photometric loss function and our proposed Conditional Point Control Loss (CPCL) function for diverse point-wise supervision; 2) a conditioned controlling model which substitutes the conventional feature encoder by our proposed Condition Control Encoder (CCE).

Autonomous Driving Optical Flow Estimation +1

Paper
Code

EPCFormer: Expression Prompt Collaboration Transformer for Universal Referring Video Object Segmentation

1 code implementation • 8 Aug 2023 • Jiajun Chen, Jiacheng Lin, Zhiqiang Xiao, Haolong Fu, Ke Nai, Kailun Yang, Zhiyong Li

Next, we propose an Expression Alignment (EA) mechanism for audio and text expressions.

Ranked #10 on Referring Expression Segmentation on Refer-YouTube-VOS (2021 public validation)

Contrastive Learning Object +5

Paper
Code

Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment for Markup-to-Image Generation

1 code implementation • 2 Aug 2023 • Guojin Zhong, Jin Yuan, Pan Wang, Kailun Yang, Weili Guan, Zhiyong Li

The recently rising markup-to-image generation poses greater challenges as compared to natural image generation, due to its low tolerance for errors as well as the complex sequence and context correlations between markup and rendered image.

Denoising Image Generation

Paper
Code

OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation

2 code implementations • 28 Jul 2023 • Fei Teng, Jiaming Zhang, Kunyu Peng, Yaonan Wang, Rainer Stiefelhagen, Kailun Yang

To avoid feature loss during network propagation and simultaneously streamline the redundant information from the light field camera, we present a simple yet very effective Sub-Aperture Fusion Module (SAFM) to embed sub-aperture images into angular features without any additional memory cost.

Autonomous Driving Scene Understanding +1

Paper
Code

Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments

1 code implementation • 15 Jul 2023 • Ruiping Liu, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Ke Cao, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

Grounded Situation Recognition (GSR) is capable of recognizing and interpreting visual scenes in a contextually intuitive way, yielding salient activities (verbs) and the involved entities (roles) depicted in images.

Decoder Grounded Situation Recognition +2

Paper
Code

Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents

no code implementations • 15 Jul 2023 • Ke Cao, Ruiping Liu, Ze Wang, Kunyu Peng, Jiaming Zhang, Junwei Zheng, Zhifeng Teng, Kailun Yang, Rainer Stiefelhagen

On the other hand, the entire line segment detected by the visual subsystem overcomes the limitation of the LiDAR subsystem, which can only perform the local calculation for geometric features.

Autonomous Navigation Pose Estimation +2

Paper
Add Code

Towards Anytime Optical Flow Estimation with Event Cameras

1 code implementation • 11 Jul 2023 • Yaozu Ye, Hao Shi, Kailun Yang, Ze Wang, Xiaoting Yin, Yining Lin, Mao Liu, Yaonan Wang, Kaiwei Wang

We then propose EVA-Flow, an EVent-based Anytime Flow estimation network to produce high-frame-rate event optical flow with only low-frame-rate optical flow ground truth for supervision.

Autonomous Driving Motion Estimation +1

Paper
Code

Minimalist and High-Quality Panoramic Imaging with PSF-aware Transformers

1 code implementation • 22 Jun 2023 • Qi Jiang, Shaohua Gao, Yao Gao, Kailun Yang, Zhonghua Yi, Hao Shi, Lei Sun, Kaiwei Wang

In this paper, we propose a Panoramic Computational Imaging Engine (PCIE) to address minimalist and high-quality panoramic imaging.

Super-Resolution

Paper
Code

VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation

1 code implementation • 11 Jun 2023 • Xu Zhang, Kailun Yang, Jiacheng Lin, Jin Yuan, Zhiyong Li, Shutao Li

Specifically, we design a Prompt-unified Encoder (PuE) by using Gaussian mapping to generate a unified one-dimensional vector for click, box, and scribble prompts, which well captures users' intentions as well as provides a denser representation of user prompts.

Image Segmentation Segmentation +1

Paper
Code

LF-PGVIO: A Visual-Inertial-Odometry Framework for Large Field-of-View Cameras using Points and Geodesic Segments

1 code implementation • 11 Jun 2023 • Ze Wang, Kailun Yang, Hao Shi, Yufan Zhang, Zhijie Xu, Fei Gao, Kaiwei Wang

The purpose of our research is to unleash the potential of point-line odometry with large-FoV omnidirectional cameras, even for cameras with negative-plane FoV.

Line Detection

Paper
Code

Towards Source-free Domain Adaptive Semantic Segmentation via Importance-aware and Prototype-contrast Learning

2 code implementations • 2 Jun 2023 • Yihong Cao, HUI ZHANG, Xiao Lu, Zheng Xiao, Kailun Yang, Yaonan Wang

It utilizes a well-trained source model and unlabeled target data to achieve adaptation in the target domain.

Segmentation Semantic Segmentation +2

Paper
Code

Exploring Few-Shot Adaptation for Activity Recognition on Diverse Domains

2 code implementations • 15 May 2023 • Kunyu Peng, Di Wen, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg

In this work, we focus on Few-Shot Domain Adaptation for Activity Recognition (FSDA-AR), which leverages a very small amount of labeled target videos to achieve effective adaptation.

Action Recognition Unsupervised Domain Adaptation

Paper
Code

SSD-MonoDETR: Supervised Scale-aware Deformable Transformer for Monocular 3D Object Detection

1 code implementation • 12 May 2023 • Xuan He, Fan Yang, Kailun Yang, Jiacheng Lin, Haolong Fu, Meng Wang, Jin Yuan, Zhiyong Li

To tackle this problem, this paper proposes a novel "Supervised Scale-aware Deformable Attention" (SSDA) for monocular 3D object detection.

Monocular 3D Object Detection Object +1

Paper
Code

Bi-Mapper: Holistic BEV Semantic Mapping for Autonomous Driving

1 code implementation • 7 May 2023 • Siyu Li, Kailun Yang, Hao Shi, Jiaming Zhang, Jiacheng Lin, Zhifeng Teng, Zhiyong Li

At the same time, an Across-Space Loss (ASL) is designed to mitigate the negative impact of geometric distortions.

Autonomous Driving

Paper
Code

AdaptiveClick: Clicks-aware Transformer with Adaptive Focal Loss for Interactive Image Segmentation

1 code implementation • 7 May 2023 • Jiacheng Lin, Jiajun Chen, Kailun Yang, Alina Roitberg, Siyu Li, Zhiyong Li, Shutao Li

Interactive Image Segmentation (IIS) has emerged as a promising technique for decreasing annotation time.

Decoder Image Segmentation +2

Paper
Code

FishDreamer: Towards Fisheye Semantic Completion via Unified Image Outpainting and Segmentation

1 code implementation • 24 Mar 2023 • Hao Shi, Yu Li, Kailun Yang, Jiaming Zhang, Kunyu Peng, Alina Roitberg, Yaozu Ye, Huajian Ni, Kaiwei Wang, Rainer Stiefelhagen

This paper raises the new task of Fisheye Semantic Completion (FSC), where dense texture, structure, and semantics of a fisheye image are inferred even beyond the sensor field-of-view (FoV).

Image Outpainting Semantic Segmentation

Paper
Code

PanoVPR: Towards Unified Perspective-to-Equirectangular Visual Place Recognition via Sliding Windows across the Panoramic View

1 code implementation • 24 Mar 2023 • Ze Shi, Hao Shi, Kailun Yang, Zhe Yin, Yining Lin, Kaiwei Wang

To address this, we propose \textit{PanoVPR}, a perspective-to-equirectangular (P2E) visual place recognition framework that employs sliding windows to eliminate feature truncation caused by hard cropping.

Autonomous Driving Image Retrieval +2

Paper
Code

360BEV: Panoramic Semantic Mapping for Indoor Bird's-Eye View

1 code implementation • 21 Mar 2023 • Zhifeng Teng, Jiaming Zhang, Kailun Yang, Kunyu Peng, Hao Shi, Simon Reiß, Ke Cao, Rainer Stiefelhagen

Seeing only a tiny part of the whole is not knowing the full circumstance.

Semantic Segmentation

Paper
Code

Delivering Arbitrary-Modal Semantic Segmentation

1 code implementation • CVPR 2023 • Jiaming Zhang, Ruiping Liu, Hao Shi, Kailun Yang, Simon Reiß, Kunyu Peng, Haodong Fu, Kaiwei Wang, Rainer Stiefelhagen

To make this possible, we present the arbitrary cross-modal segmentation model CMNeXt.

Ranked #1 on Semantic Segmentation on DSEC

Segmentation Semantic Segmentation +1

126

Paper
Code

Towards Activated Muscle Group Estimation in the Wild

1 code implementation • 2 Mar 2023 • Kunyu Peng, David Schneider, Alina Roitberg, Kailun Yang, Jiaming Zhang, Chen Deng, Kaiyu Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen

In this paper, we tackle the new task of video-based Activated Muscle Group Estimation (AMGE) aiming at identifying active muscle regions during physical activity in the wild.

Human Activity Recognition Knowledge Distillation +1

Paper
Code

MateRobot: Material Recognition in Wearable Robotics for People with Visual Impairments

1 code implementation • 28 Feb 2023 • Junwei Zheng, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

People with Visual Impairments (PVI) typically recognize objects through haptic perception.

Material Recognition Semantic Segmentation

Paper
Code

Computational Imaging for Machine Perception: Transferring Semantic Segmentation beyond Aberrations

1 code implementation • 21 Nov 2022 • Qi Jiang, Hao Shi, Shaohua Gao, Jiaming Zhang, Kailun Yang, Lei Sun, Huajian Ni, Kaiwei Wang

Further, we propose Computational Imaging Assisted Domain Adaptation (CIADA) to leverage prior knowledge of CI for robust performance in SSOA.

Scene Understanding Semantic Segmentation +1

Paper
Code

Beyond the Field-of-View: Enhancing Scene Visibility and Perception with Clip-Recurrent Transformer

3 code implementations • 21 Nov 2022 • Hao Shi, Qi Jiang, Kailun Yang, Xiaoting Yin, Huajian Ni, Kaiwei Wang

In this paper, we propose the concept of online video inpainting for autonomous vehicles to expand the field of view, thereby enhancing scene visibility, perception, and system safety.

Ranked #1 on Seeing Beyond the Visible on KITTI360-EX

Autonomous Vehicles object-detection +4

Paper
Code

LF-VISLAM: A SLAM Framework for Large Field-of-View Cameras with Negative Imaging Plane on Mobile Agents

3 code implementations • 12 Sep 2022 • Ze Wang, Kailun Yang, Hao Shi, Peng Li, Fei Gao, Jian Bai, Kaiwei Wang

As loop closure on wide-FoV panoramic data further comes with a large number of outliers, traditional outlier rejection methods are not directly applicable.

Autonomous Driving Simultaneous Localization and Mapping

Paper
Code

Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation

1 code implementation • 25 Jul 2022 • Jiaming Zhang, Kailun Yang, Hao Shi, Simon Reiß, Kunyu Peng, Chaoxiang Ma, Haodong Fu, Philip H. S. Torr, Kaiwei Wang, Rainer Stiefelhagen

In this paper, we address panoramic semantic segmentation which is under-explored due to two critical challenges: (1) image distortions and object deformations on panoramas; (2) lack of semantic annotations in the 360-degree imagery.

Ranked #1 on Semantic Segmentation on SynPASS

Pseudo Label Segmentation +2

Paper
Code

Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers

1 code implementation • 13 Jul 2022 • Chang Chen, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

Humans have an innate ability to sense their surroundings, as they can extract the spatial representation from the egocentric perception and form an allocentric semantic map via spatial transformation and memory updating.

Decoder Semantic Segmentation

Paper
Code

Multi-modal Depression Estimation based on Sub-attentional Fusion

1 code implementation • 13 Jul 2022 • Ping-Cheng Wei, Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

Failure to timely diagnose and effectively treat depression leads to over 280 million people suffering from this psychological disorder worldwide.

Paper
Code

Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning

1 code implementation • 21 Jun 2022 • Alexander Jaus, Kailun Yang, Rainer Stiefelhagen

In order to overcome the lack of annotated panoramic images, we propose a framework which allows model training on standard pinhole images and transfers the learned features to the panoramic domain in a cost-minimizing way.

Contrastive Learning Domain Generalization +3

Paper
Code

Annular Computational Imaging: Capture Clear Panoramic Images through Simple Lens

1 code implementation • 13 Jun 2022 • Qi Jiang, Hao Shi, Lei Sun, Shaohua Gao, Kailun Yang, Kaiwei Wang

In this paper, we propose an Annular Computational Imaging (ACI) framework to break the optical limit of light-weight PAL design.

Image Restoration

Paper
Code

Efficient Human Pose Estimation via 3D Event Point Cloud

1 code implementation • 9 Jun 2022 • Jiaan Chen, Hao Shi, Yaozu Ye, Kailun Yang, Lei Sun, Kaiwei Wang

We then leverage the rasterized event point cloud as input to three different backbones, PointNet, DGCNN, and Point Transformer, with two linear layer decoders to predict the location of human keypoints.

Ranked #1 on 3D Human Pose Estimation on DHP19

3D Human Pose Estimation Edge-computing +1

Paper
Code

Review on Panoramic Imaging and Its Applications in Scene Understanding

no code implementations • 11 May 2022 • Shaohua Gao, Kailun Yang, Hao Shi, Kaiwei Wang, Jian Bai

However, while satisfying the need for large-FoV photographic imaging, panoramic imaging instruments are expected to have high resolution, no blind area, miniaturization, and multidimensional intelligent perception, and can be combined with artificial intelligence methods towards the next generation of intelligent instruments, enabling deeper understanding and more holistic perception of 360-degree real-world surrounding environments.

Autonomous Driving Depth Estimation +4

Paper
Add Code

Towards Automatic Parsing of Structured Visual Content through the Use of Synthetic Data

no code implementations • 29 Apr 2022 • Lukas Scholch, Jonas Steinhauser, Maximilian Beichter, Constantin Seibold, Kailun Yang, Merlin Knäble, Thorsten Schwarz, Alexander Mädche, Rainer Stiefelhagen

In this work, we propose a synthetic dataset, containing SVCs in the form of images as well as ground truths.

Paper
Add Code

Is my Driver Observation Model Overconfident? Input-guided Calibration Networks for Reliable and Interpretable Confidence Estimates

no code implementations • 10 Apr 2022 • Alina Roitberg, Kunyu Peng, David Schneider, Kailun Yang, Marios Koulakis, Manuel Martinez, Rainer Stiefelhagen

In this work, we for the first time examine how well the confidence values of modern driver observation models indeed match the probability of the correct outcome and show that raw neural network-based approaches tend to significantly overestimate their prediction quality.

Action Recognition Image Classification

Paper
Add Code

Indoor Navigation Assistance for Visually Impaired People via Dynamic SLAM and Panoptic Segmentation with an RGB-D Sensor

no code implementations • 3 Apr 2022 • Wenyan Ou, Jiaming Zhang, Kunyu Peng, Kailun Yang, Gerhard Jaworek, Karin Müller, Rainer Stiefelhagen

Then, poses and speed of tracked dynamic objects can be estimated, which are passed to the users through acoustic feedback.

Motion Estimation Object +1

Paper
Add Code

Towards Robust Semantic Segmentation of Accident Scenes via Multi-Source Mixed Sampling and Meta-Learning

1 code implementation • 19 Mar 2022 • Xinyu Luo, Jiaming Zhang, Kailun Yang, Alina Roitberg, Kunyu Peng, Rainer Stiefelhagen

Autonomous vehicles utilize urban scene segmentation to understand the real world like a human and react accordingly.

Ranked #1 on Semantic Segmentation on DADA-seg (using extra training data)

Autonomous Vehicles Decoder +4

Paper
Code

MatchFormer: Interleaving Attention in Transformers for Feature Matching

1 code implementation • 17 Mar 2022 • Qing Wang, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

While detector-based methods coupled with feature descriptors struggle in low-texture scenes, CNN-based methods with a sequential extract-to-match pipeline, fail to make use of the matching capacity of the encoder and tend to overburden the decoder for matching.

Decoder Homography Estimation +2

161

Paper
Code

CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers

1 code implementation • 9 Mar 2022 • Jiaming Zhang, Huayao Liu, Kailun Yang, Xinxin Hu, Ruiping Liu, Rainer Stiefelhagen

Pixel-wise semantic segmentation of RGB images can be advanced by exploiting complementary features from the supplementary modality (X-modality).

Ranked #1 on Semantic Segmentation on SpectralWaste

Autonomous Vehicles Image Segmentation +5

284

Paper
Code

Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation

1 code implementation • CVPR 2022 • Jiaming Zhang, Kailun Yang, Chaoxiang Ma, Simon Reiß, Kunyu Peng, Rainer Stiefelhagen

To get around this domain difference and bring together semantic annotations from pinhole- and 360-degree surround-visuals, we propose to learn object deformations and panoramic image distortions in the Deformable Patch Embedding (DPE) and Deformable MLP (DMLP) components which blend into our Transformer for PAnoramic Semantic Segmentation (Trans4PASS) model.

Ranked #2 on Semantic Segmentation on SynPASS

Scene Understanding Semantic Segmentation +1

Paper
Code

TransDARC: Transformer-based Driver Activity Recognition with Latent Space Feature Calibration

1 code implementation • 2 Mar 2022 • Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

This module operates in the latent feature-space enriching and diversifying the training set at feature-level in order to improve generalization to novel data appearances, (e. g., sensor changes) and general feature quality.

Human Activity Recognition

Paper
Code

PanoFlow: Learning 360° Optical Flow for Surrounding Temporal Understanding

1 code implementation • 27 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Ze Wang, Yaozu Ye, Zhe Yin, Shi Meng, Peng Li, Kaiwei Wang

PanoFlow achieves state-of-the-art performance on the public OmniFlowNet and the established FlowScape benchmarks.

Autonomous Vehicles Optical Flow Estimation

Paper
Code

TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation

2 code implementations • 27 Feb 2022 • Ruiping Liu, Kailun Yang, Alina Roitberg, Jiaming Zhang, Kunyu Peng, Huayao Liu, Yaonan Wang, Rainer Stiefelhagen

Semantic segmentation benchmarks in the realm of autonomous driving are dominated by large pre-trained transformers, yet their widespread adoption is impeded by substantial computational costs and prolonged training durations.

Autonomous Driving Knowledge Distillation +3

Paper
Code

LF-VIO: A Visual-Inertial-Odometry Framework for Large Field-of-View Cameras with Negative Plane

1 code implementation • 25 Feb 2022 • Ze Wang, Kailun Yang, Hao Shi, Peng Li, Fei Gao, Kaiwei Wang

To tackle this issue, we propose LF-VIO, a real-time VIO framework for cameras with extremely large FoV.

Autonomous Driving Visual Odometry

114

Paper
Code

Delving Deep into One-Shot Skeleton-based Action Recognition with Diverse Occlusions

2 code implementations • 23 Feb 2022 • Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

Yet, the research of data-scarce recognition from skeleton sequences, such as one-shot action recognition, does not explicitly consider occlusions despite their everyday pervasiveness.

Ranked #1 on Action Classification on Toyota Smarthome dataset (Accuracy metric)

Action Classification Action Recognition +2

Paper
Code

CSFlow: Learning Optical Flow via Cross Strip Correlation for Autonomous Driving

1 code implementation • 2 Feb 2022 • Hao Shi, Yifan Zhou, Kailun Yang, Xiaoting Yin, Kaiwei Wang

In this paper, we propose a new deep network architecture for optical flow estimation in autonomous driving--CSFlow, which consists of two novel modules: Cross Strip Correlation module (CSC) and Correlation Regression Initialization module (CRI).

Autonomous Driving Optical Flow Estimation

Paper
Code

Should I take a walk? Estimating Energy Expenditure from Video Data

1 code implementation • 1 Feb 2022 • Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

To study this underresearched task, we introduce Vid2Burn -- an omni-source benchmark for estimating caloric expenditure from video data featuring both, high- and low-intensity activities for which we derive energy expenditure annotations based on models established in medical literature.

Video Recognition

Paper
Code

Exploring Event-driven Dynamic Context for Accident Scene Segmentation

1 code implementation • 9 Dec 2021 • Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen

Moreover, in order to evaluate the segmentation performance in traffic accidents, we provide a pixel-wise annotated accident dataset, namely DADA-seg, which contains a variety of critical scenarios from traffic accidents.

Ranked #3 on Semantic Segmentation on DADA-seg (using extra training data)

Scene Segmentation Segmentation

Paper
Code

Affect-DML: Context-Aware One-Shot Recognition of Human Affect using Deep Metric Learning

1 code implementation • 30 Nov 2021 • Kunyu Peng, Alina Roitberg, David Schneider, Marios Koulakis, Kailun Yang, Rainer Stiefelhagen

Human affect recognition is a well-established research area with numerous applications, e. g., in psychological care, but existing methods assume that all emotions-of-interest are given a priori as annotated training examples.

Emotion Recognition Metric Learning +1

Paper
Code

Event-Based Fusion for Motion Deblurring with Cross-modal Attention

1 code implementation • 30 Nov 2021 • Lei Sun, Christos Sakaridis, Jingyun Liang, Qi Jiang, Kailun Yang, Peng Sun, Yaozu Ye, Kaiwei Wang, Luc van Gool

Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times.

Ranked #3 on Deblurring on GoPro (using extra training data)

Deblurring Image Deblurring +1

128

Paper
Code

Transfer beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation

1 code implementation • 21 Oct 2021 • Jiaming Zhang, Chaoxiang Ma, Kailun Yang, Alina Roitberg, Kunyu Peng, Rainer Stiefelhagen

We look at this problem from the perspective of domain adaptation and bring panoramic semantic segmentation to a setting, where labelled training data originates from a different distribution of conventional pinhole camera images.

Ranked #7 on Semantic Segmentation on DensePASS (using extra training data)

Autonomous Vehicles Segmentation +2

Paper
Code

Trans4Trans: Efficient Transformer for Transparent Object and Semantic Scene Segmentation in Real-World Navigation Assistance

1 code implementation • 20 Aug 2021 • Jiaming Zhang, Kailun Yang, Angela Constantinescu, Kunyu Peng, Karin Müller, Rainer Stiefelhagen

In this paper, we build a wearable system with a novel dual-head Transformer for Transparency (Trans4Trans) perception model, which can segment general- and transparent objects.

Ranked #2 on Semantic Segmentation on DADA-seg (using extra training data)

Decoder Navigate +2

Paper
Code

Panoramic Depth Estimation via Supervised and Unsupervised Learning in Indoor Scenes

1 code implementation • 18 Aug 2021 • Keyang Zhou, Kailun Yang, Kaiwei Wang

With a comprehensive variety of experiments, this research demonstrates the effectiveness of our schemes aiming for indoor scene perception.

Camera Calibration Monocular Depth Estimation +2

Paper
Code

Flying Guide Dog: Walkable Path Discovery for the Visually Impaired Utilizing Drones and Transformer-based Semantic Segmentation

1 code implementation • 16 Aug 2021 • Haobin Tan, Chang Chen, Xinyu Luo, Jiaming Zhang, Constantin Seibold, Kailun Yang, Rainer Stiefelhagen

By recognizing the color of pedestrian traffic lights, our prototype can help the user to cross a street safely.

Conditional Image Generation Segmentation +1

Paper
Code

DensePASS: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation with Attention-Augmented Context Exchange

1 code implementation • 13 Aug 2021 • Chaoxiang Ma, Jiaming Zhang, Kailun Yang, Alina Roitberg, Rainer Stiefelhagen

First, we formalize the task of unsupervised domain adaptation for panoramic semantic segmentation, where a network trained on labelled examples from the source domain of pinhole camera data is deployed in a different target domain of panoramic images, for which no labels are available.

Segmentation Semantic Segmentation +1

Paper
Code

HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor

no code implementations • 7 Jul 2021 • Huayao Liu, Ruiping Liu, Kailun Yang, Jiaming Zhang, Kunyu Peng, Rainer Stiefelhagen

To tackle these issues, we propose HIDA, a lightweight assistive system based on 3D point cloud instance segmentation with a solid-state LiDAR sensor, for holistic indoor detection and avoidance.

Ranked #18 on 3D Instance Segmentation on ScanNet(v2)

3D Instance Segmentation Point Cloud Segmentation +2

Paper
Add Code

Trans4Trans: Efficient Transformer for Transparent Object Segmentation to Help Visually Impaired People Navigate in the Real World

1 code implementation • 7 Jul 2021 • Jiaming Zhang, Kailun Yang, Angela Constantinescu, Kunyu Peng, Karin Müller, Rainer Stiefelhagen

Common fully glazed facades and transparent objects present architectural barriers and impede the mobility of people with low vision or blindness, for instance, a path detected behind a glass door is inaccessible unless it is correctly perceived and reacted.

Ranked #1 on Semantic Segmentation on Trans10K

Decoder Navigate +2

Paper
Code

MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense Top-View Understanding

1 code implementation • 1 Jul 2021 • Kunyu Peng, Juncong Fei, Kailun Yang, Alina Roitberg, Jiaming Zhang, Frank Bieder, Philipp Heidenreich, Christoph Stiller, Rainer Stiefelhagen

At the heart of all automated driving systems is the ability to sense the surroundings, e. g., through semantic segmentation of LiDAR sequences, which experienced a remarkable progress due to the release of large datasets such as SemanticKITTI and nuScenes-LidarSeg.

3D Object Detection Graph Attention +4

Paper
Code

Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos

no code implementations • 15 May 2021 • Lei Sun, Jia Wang, Kailun Yang, Kaikai Wu, Xiangdong Zhou, Kaiwei Wang, Jian Bai

A lightweight panoramic annular semantic segmentation neural network model is designed to achieve high-accuracy and real-time scene parsing.

Ranked #72 on Semantic Segmentation on Cityscapes val

Scene Parsing Scene Segmentation +1

Paper
Add Code

Capturing Omni-Range Context for Omnidirectional Segmentation

1 code implementation • CVPR 2021 • Kailun Yang, Jiaming Zhang, Simon Reiß, Xinxin Hu, Rainer Stiefelhagen

Convolutional Networks (ConvNets) excel at semantic segmentation and have become a vital component for perception in autonomous driving.

Ranked #10 on Semantic Segmentation on DensePASS (using extra training data)

Autonomous Driving Image Segmentation +2

Paper
Code

Panoptic Lintention Network: Towards Efficient Navigational Perception for the Visually Impaired

1 code implementation • 6 Mar 2021 • Wei Mao, Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen

Based on Lintention, we then devise a novel panoptic segmentation model which we term Panoptic Lintention Net.

Instance Segmentation Panoptic Segmentation +1

Paper
Code

Perception Framework through Real-Time Semantic Segmentation and Scene Recognition on a Wearable System for the Visually Impaired

no code implementations • 6 Mar 2021 • Yingzhi Zhang, Haoye Chen, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

As the scene information, including objectness and scene type, are important for people with visual impairment, in this work we present a multi-task efficient perception system for the scene parsing and recognition tasks.

Real-Time Semantic Segmentation Scene Recognition

Paper
Add Code

Panoramic Panoptic Segmentation: Towards Complete Surrounding Understanding via Unsupervised Contrastive Learning

1 code implementation • 1 Mar 2021 • Alexander Jaus, Kailun Yang, Rainer Stiefelhagen

In order to overcome the lack of annotated panoramic images, we propose a framework which allows model training on standard pinhole images and transfers the learned features to a different domain.

Contrastive Learning Panoptic Segmentation +2

Paper
Code

DR-TANet: Dynamic Receptive Temporal Attention Network for Street Scene Change Detection

1 code implementation • 1 Mar 2021 • Shuo Chen, Kailun Yang, Rainer Stiefelhagen

Street scene change detection continues to capture researchers' interests in the computer vision community.

Autonomous Vehicles Change Detection +3

Paper
Code

Panoramic annular SLAM with loop closure and global optimization

no code implementations • 26 Feb 2021 • Hao Chen, Weijian Hu, Kailun Yang, Jian Bai, Kaiwei Wang

In this paper, we propose panoramic annular simultaneous localization and mapping (PA-SLAM), a visual SLAM system based on panoramic annular lens.

Loop Closure Detection Simultaneous Localization and Mapping

Paper
Add Code

Polarization-driven Semantic Segmentation via Efficient Attention-bridged Fusion

1 code implementation • 26 Nov 2020 • Kaite Xiang, Kailun Yang, Kaiwei Wang

Semantic Segmentation (SS) is promising for outdoor scene perception in safety-critical applications like autonomous vehicles, assisted navigation and so on.

Ranked #5 on Semantic Segmentation on UPLight

Autonomous Vehicles Semantic Segmentation

Paper
Code

ISSAFE: Improving Semantic Segmentation in Accidents by Fusing Event-based Data

1 code implementation • 20 Aug 2020 • Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen

Ensuring the safety of all traffic participants is a prerequisite for bringing intelligent vehicles closer to practical applications.

Ranked #6 on Semantic Segmentation on KITTI-360

Autonomous Vehicles Benchmarking +2

Paper
Code

Can we cover navigational perception needs of the visually impaired by panoptic segmentation?

no code implementations • 20 Jul 2020 • Wei Mao, Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen

Navigational perception for visually impaired people has been substantially promoted by both classic and deep learning based segmentation methods.

Instance Segmentation Panoptic Segmentation +1

Paper
Add Code

Real-time Fusion Network for RGB-D Semantic Segmentation Incorporating Unexpected Obstacle Detection for Road-driving Images

1 code implementation • 24 Feb 2020 • Lei Sun, Kailun Yang, Xinxin Hu, Weijian Hu, Kaiwei Wang

Semantic segmentation has made striking progress due to the success of deep convolutional neural networks.

Ranked #11 on Semantic Segmentation on EventScape

Autonomous Driving Real-Time Semantic Segmentation +1

Paper
Code

Universal Semantic Segmentation for Fisheye Urban Driving Images

1 code implementation • 31 Jan 2020 • Yaozu Ye, Kailun Yang, Kaite Xiang, Juan Wang, Kaiwei Wang

In this paper, a seven degrees of freedom (DoF) augmentation method is proposed to transform rectilinear image to fisheye image in a more comprehensive way.

Autonomous Driving Image Segmentation +2

Paper
Code

DS-PASS: Detail-Sensitive Panoramic Annular Semantic Segmentation through SwaftNet for Surrounding Sensing

1 code implementation • 17 Sep 2019 • Kailun Yang, Xinxin Hu, Hao Chen, Kaite Xiang, Kaiwei Wang, Rainer Stiefelhagen

Semantically interpreting the traffic scene is crucial for autonomous transportation and robotics systems.

Ranked #35 on Semantic Segmentation on DensePASS

Decoder Segmentation +2

Paper
Code

Semantic Segmentation of Panoramic Images Using a Synthetic Dataset

1 code implementation • 2 Sep 2019 • Yuanyou Xu, Kaiwei Wang, Kailun Yang, Dongming Sun, Jia Fu

In addition, it has been shown that by using panoramic images with a 180 degree FoV as training data the model has better performance.

Segmentation Semantic Segmentation

Paper
Code

See Clearer at Night: Towards Robust Nighttime Semantic Segmentation through Day-Night Image Conversion

no code implementations • 16 Aug 2019 • Lei Sun, Kaiwei Wang, Kailun Yang, Kaite Xiang

However, in face of adverse conditions such as the nighttime, semantic segmentation loses its accuracy significantly.

Ranked #8 on Semantic Segmentation on Nighttime Driving

Segmentation Semantic Segmentation

Paper
Add Code

A Multimodal Vision Sensor for Autonomous Driving

no code implementations • 15 Aug 2019 • Dongming Sun, Xiao Huang, Kailun Yang

This paper describes a multimodal vision sensor that integrates three types of cameras, including a stereo camera, a polarization camera and a panoramic camera.

Autonomous Driving Semantic Segmentation

Paper
Add Code

A Comparative Study of High-Recall Real-Time Semantic Segmentation Based on Swift Factorized Network

1 code implementation • 26 Jul 2019 • Kaite Xiang, Kaiwei Wang, Kailun Yang

Furthermore, we make a detailed analysis and comparison of the three proposed methods on the promotion of recall rate.

Autonomous Vehicles Real-Time Semantic Segmentation

Paper
Code

Importance-Aware Semantic Segmentation with Efficient Pyramidal Context Network for Navigational Assistant Systems

1 code implementation • 25 Jul 2019 • Kaite Xiang, Kaiwei Wang, Kailun Yang

Semantic Segmentation (SS) is a task to assign semantic label to each pixel of the images, which is of immense significance for autonomous vehicles, robotics and assisted navigation of vulnerable road users.

Autonomous Vehicles Segmentation +1

Paper
Code

ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation

1 code implementation • 24 May 2019 • Xinxin Hu, Kailun Yang, Lei Fei, Kaiwei Wang

The main contributions lie in the Attention Complementary Module (ACM) and the architecture with three parallel branches.

Ranked #4 on Semantic Segmentation on KITTI-360

RGBD Semantic Segmentation Segmentation +2

159

Paper
Code

Panoramic Annular Localizer: Tackling the Variation Challenges of Outdoor Localization Using Panoramic Annular Images and Active Deep Descriptors

2 code implementations • 14 May 2019 • Ruiqi Cheng, Kaiwei Wang, Shufei Lin, Weijian Hu, Kailun Yang, Xiao Huang, Huabing Li, Dongming Sun, Jian Bai

The panoramic annular images captured by the single camera are processed and fed into the NetVLAD network to form the active deep descriptor, and sequential matching is utilized to generate the localization result.

Autonomous Vehicles Camera Localization +2

Paper
Code

Visual Localization of Key Positions for Visually Impaired People

no code implementations • 9 Oct 2018 • Ruiqi Cheng, Kaiwei Wang, Longqing Lin, Kailun Yang

On the off-the-shelf navigational assistance devices, the localization precision is limited to the signal error of global navigation satellite system (GNSS).

Visual Localization

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.