no code implementations • 2 Apr 2023 • Yuren Cong, Wentong Liao, Bodo Rosenhahn, Michael Ying Yang
Scene graph generation is conventionally evaluated by (mean) Recall@K, which measures the ratio of correctly predicted triplets that appear in the ground truth.
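The Recall@K metric mentioned above can be sketched as follows; this is a minimal illustration of the idea (tuple names and the toy triplets are invented for the example), not the paper's evaluation code:

```python
def recall_at_k(pred_triplets, gt_triplets, k):
    """Fraction of ground-truth triplets found among the top-k predictions.

    pred_triplets: (subject, predicate, object) tuples sorted by descending
    confidence; gt_triplets: set of ground-truth tuples.
    """
    top_k = set(pred_triplets[:k])
    hits = sum(1 for t in gt_triplets if t in top_k)
    return hits / len(gt_triplets)

gt = {("man", "riding", "horse"), ("horse", "on", "grass")}
preds = [("man", "riding", "horse"),
         ("man", "wearing", "hat"),
         ("horse", "on", "grass")]
print(recall_at_k(preds, gt, k=2))  # 0.5: one of two ground-truth triplets is in the top 2
```

Mean Recall@K averages this quantity per predicate class before aggregating, which prevents frequent predicates from dominating the score.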
no code implementations • CVPR 2023 • Qianli Feng, Raghudeep Gadde, Wentong Liao, Eduard Ramon, Aleix Martinez
We derive a method that yields highly accurate semantic segmentation maps without the use of any additional neural network, layers, manually annotated training data, or supervised training.
no code implementations • ICCV 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang
Given a target age code, the generated face image is expected to be age-sensitive, as reflected by bio-plausible transformations of shape and texture, while preserving identity.
1 code implementation • ICCV 2021 • Yuren Cong, Wentong Liao, Hanno Ackermann, Bodo Rosenhahn, Michael Ying Yang
Compared to the task of scene graph generation from images, it is more challenging because of the dynamic relationships between objects and the temporal dependencies between frames, which allow for a richer semantic interpretation.
1 code implementation • CVPR 2022 • Kai Hu, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn
Text-to-image synthesis (T2I) aims to generate photo-realistic images which are semantically consistent with the text descriptions.
1 code implementation • CVPR 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang
We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and location-sensitive appearance representation in their discriminators.
Ranked #1 on Layout-to-Image Generation on COCO-Stuff 128x128
2 code implementations • 30 Oct 2020 • Hao Cheng, Wentong Liao, Xuejiao Tang, Michael Ying Yang, Monika Sester, Bodo Rosenhahn
In our framework, first, the spatial context between agents is explored by using self-attention architectures.
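The spatial-context step above relies on self-attention across agents. As a hedged sketch (the function and feature dimensions are hypothetical, and a learned model would use separate query/key/value projections), scaled dot-product self-attention over per-agent features looks like:

```python
import numpy as np

def self_attention(agent_feats):
    """Scaled dot-product self-attention across agents (illustrative sketch).

    agent_feats: (n_agents, d) array; for simplicity the same features act
    as queries, keys, and values.
    """
    d = agent_feats.shape[1]
    scores = agent_feats @ agent_feats.T / np.sqrt(d)       # (n, n) pairwise affinities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)           # softmax over agents
    return weights @ agent_feats                            # context-aware agent features

feats = np.random.default_rng(0).normal(size=(5, 16))       # five agents, 16-dim features
ctx = self_attention(feats)
print(ctx.shape)  # (5, 16)
```

Each output row is a convex combination of all agents' features, so every agent's representation is conditioned on the others' spatial context.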
1 code implementation • 15 Jun 2020 • Hao Cheng, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn, Monika Sester
Trajectory prediction is critical for planning safe future movements, yet it remains challenging even over the next few seconds in urban mixed traffic.
no code implementations • 28 May 2020 • Wentong Liao, Xiang Chen, Jingfeng Yang, Stefan Roth, Michael Goesele, Michael Ying Yang, Bodo Rosenhahn
This strengthens the local feature invariance for the resampled features and enables detecting vehicles in an arbitrary orientation.
2 code implementations • 29 Apr 2020 • Sen He, Wentong Liao, Hamed R. Tavakoli, Michael Yang, Bodo Rosenhahn, Nicolas Pugeault
Inspired by the successes in text analysis and translation, previous work has proposed the transformer architecture for image captioning.
1 code implementation • 5 Apr 2020 • Tongxin Hu, Vasileios Iosifidis, Wentong Liao, Hang Zhang, Michael Ying Yang, Eirini Ntoutsi, Bodo Rosenhahn
In this paper, we propose FairNN, a neural network that performs joint feature representation and classification for fairness-aware learning.
1 code implementation • 14 Feb 2020 • Hao Cheng, Wentong Liao, Michael Ying Yang, Monika Sester, Bodo Rosenhahn
At inference time, we combine the past context and motion information of the target agent with samples of the latent variables to predict multiple realistic future trajectories.
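The latent-variable sampling described above can be sketched as follows. This is an illustrative stand-in, not the paper's model: the decoder here is a random linear map, whereas the actual method learns a conditional generative decoder; all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_trajectories(context, n_samples=5, horizon=12, z_dim=8):
    """Sketch of multimodal prediction: draw latent samples, decode each
    into a future trajectory conditioned on the encoded past context.

    context: (d,) encoding of the target agent's past motion and context.
    """
    d = context.shape[0]
    W_c = rng.normal(size=(d, horizon * 2))      # stand-in decoder weights (context)
    W_z = rng.normal(size=(z_dim, horizon * 2))  # stand-in decoder weights (latent)
    trajectories = []
    for _ in range(n_samples):
        z = rng.normal(size=z_dim)               # each latent sample yields one mode
        traj = (context @ W_c + z @ W_z).reshape(horizon, 2)
        trajectories.append(traj)
    return np.stack(trajectories)                # (n_samples, horizon, 2) future (x, y) points

out = predict_trajectories(np.ones(32))
print(out.shape)  # (5, 12, 2)
```

The key structural point is that the context term is shared across samples while the latent draw varies, so the predictions form multiple plausible futures around a common conditioning signal.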
1 code implementation • ECCV 2020 • Yuren Cong, Hanno Ackermann, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn
Detected objects, their labels and the discovered relations can be used to construct a scene graph which provides an abstract semantic interpretation of an image.
Ranked #8 on Scene Graph Generation on Visual Genome
no code implementations • 3 Apr 2019 • Wentong Liao, Cuiling Lan, Wen-Jun Zeng, Michael Ying Yang, Bodo Rosenhahn
We further explore more powerful representations by integrating language prior with the visual context in the transformation for the scene graph generation.
no code implementations • 26 Oct 2018 • Michael Ying Yang, Wentong Liao, Chun Yang, Yanpeng Cao, Bodo Rosenhahn
The experimental results show that the proposed approach outperforms the state-of-the-art methods and is effective in recognizing complex security events.
no code implementations • 9 Feb 2018 • Michael Ying Yang, Wentong Liao, Yanpeng Cao, Bodo Rosenhahn
In our framework, three levels of video events are connected by Hierarchical Dirichlet Process (HDP) model: low-level visual features, simple atomic activities, and multi-agent interactions.
no code implementations • 9 Feb 2018 • Michael Ying Yang, Matthias Reso, Jun Tang, Wentong Liao, Bodo Rosenhahn
Therefore, we formulate a graphical model to select a proposal stream for each object in which the pairwise potentials consist of the appearance dissimilarity between different streams in the same video and also the similarity between the streams in different videos.
1 code implementation • 9 Feb 2018 • Wentong Liao, Michael Ying Yang, Ni Zhan, Bodo Rosenhahn
Moreover, we trained the model jointly on six different datasets, which differs from the common practice of training a model on a single dataset and testing it on that same dataset.
no code implementations • 22 Jan 2018 • Michael Ying Yang, Wentong Liao, Xinbo Li, Bodo Rosenhahn
Also, the focal loss function substitutes for the conventional cross-entropy loss in both the region proposal network and the final classifier.
no code implementations • 16 Nov 2017 • Wentong Liao, Lin Shuai, Bodo Rosenhahn, Michael Ying Yang
Most of the existing works treat this task as a pure visual classification task: each type of relationship or phrase is classified as a relation category based on the extracted visual features.
no code implementations • 19 Sep 2016 • Michael Ying Yang, Wentong Liao, Hanno Ackermann, Bodo Rosenhahn
In contrast to previous methods for extracting support relations, the proposed approach generates more accurate results, and does not require a pixel-wise semantic labeling of the scene.