no code implementations • 20 Mar 2024 • BoWen Zhang, Tianyu Yang, Yu Li, Lei Zhang, Xi Zhao
In this paper, we present a triplane autoencoder, which encodes 3D models into a compact triplane latent space to effectively compress both the 3D geometry and texture information.
1 code implementation • 16 Mar 2024 • Zhe Kong, Yong Zhang, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, GuanYing Chen, Wei Liu, Wenhan Luo
We also observe that the initial denoising timestep for noise blending is key to preserving identity and layout.
no code implementations • 6 Feb 2024 • Zhenwen Liang, Kehan Guo, Gang Liu, Taicheng Guo, Yujun Zhou, Tianyu Yang, Jiajun Jiao, Renjie Pi, Jipeng Zhang, Xiangliang Zhang
The paper introduces SceMQA, a novel benchmark for scientific multimodal question answering at the college entrance level.
1 code implementation • 19 Jan 2024 • Wenlong Liu, Tianyu Yang, YuHan Wang, QiZhi Yu, Lei Zhang
Finally, we propose a KNN interpolation mechanism for the mask attention module of the spotting head to better handle primitive mask downsampling, which operates at the primitive level, in contrast to the pixel level used for images.
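The KNN interpolation idea can be illustrated with a minimal sketch: values defined on a downsampled set of points are propagated to a denser set by inverse-distance weighting over the k nearest neighbors. The function name and implementation below are hypothetical, not the paper's code.

```python
import numpy as np

def knn_interpolate(src_pos, src_val, dst_pos, k=3):
    """Interpolate values at dst_pos from the k nearest src_pos points,
    weighted by inverse distance (illustrative sketch only)."""
    # pairwise distances between destination and source points: (n_dst, n_src)
    d = np.linalg.norm(dst_pos[:, None, :] - src_pos[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k]           # indices of k nearest neighbors
    nd = np.take_along_axis(d, idx, axis=1)      # their distances
    w = 1.0 / (nd + 1e-8)                        # inverse-distance weights
    w /= w.sum(axis=1, keepdims=True)            # normalize per destination point
    return (src_val[idx] * w[..., None]).sum(axis=1)
```

In the paper's setting the interpolated quantities would be mask-attention features on CAD primitives rather than scalar values, but the neighbor-weighted averaging is the same.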
1 code implementation • 18 Jan 2024 • Xuangeng Chu, Yu Li, Ailing Zeng, Tianyu Yang, Lijian Lin, Yunfei Liu, Tatsuya Harada
Head avatar reconstruction, crucial for applications in virtual reality, online meetings, gaming, and film industries, has garnered substantial attention within the computer vision community.
no code implementations • 10 Dec 2023 • Maomao Li, Yu Li, Tianyu Yang, Yunfei Liu, Dongxu Yue, Zhihui Lin, Dong Xu
This paper presents a video inversion approach for zero-shot video editing, which aims to model the input video with low-rank representation during the inversion process.
no code implementations • 18 Oct 2023 • Xinhua Cheng, Tianyu Yang, Jianan Wang, Yu Li, Lei Zhang, Jian Zhang, Li Yuan
Recent text-to-3D generation methods achieve impressive 3D content creation capacity thanks to the advances in image diffusion models and optimizing strategies.
no code implementations • 16 Oct 2023 • Yukai Shi, Jianan Wang, He Cao, Boshi Tang, Xianbiao Qi, Tianyu Yang, Yukun Huang, Shilong Liu, Lei Zhang, Heung-Yeung Shum
In this paper, we present TOSS, which introduces text to the task of novel view synthesis (NVS) from just a single RGB image.
no code implementations • 12 Oct 2023 • Haohan Weng, Tianyu Yang, Jianan Wang, Yu Li, Tong Zhang, C. L. Philip Chen, Lei Zhang
Large image diffusion models enable novel view synthesis with high quality and excellent zero-shot capability.
no code implementations • ICCV 2023 • Qiangqiang Wu, Tianyu Yang, Wei Wu, Antoni Chan
The current popular methods for video object segmentation (VOS) implement feature matching through several hand-crafted modules that separately perform feature extraction and matching.
1 code implementation • 24 May 2023 • Tianyu Yang, Thy Thy Tran, Iryna Gurevych
These models also suffer from posterior collapse, i.e., the decoder tends to ignore latent variables and directly access information captured in the encoder through the cross-attention mechanism.
1 code implementation • CVPR 2023 • Qiangqiang Wu, Tianyu Yang, Ziquan Liu, Baoyuan Wu, Ying Shan, Antoni B. Chan
However, we find that this simple baseline heavily relies on spatial cues while ignoring temporal relations for frame reconstruction, thus leading to sub-optimal temporal matching representations for VOT and VOS.
Ranked #1 on Visual Object Tracking on TrackingNet (AUC metric)
no code implementations • 7 Dec 2022 • Yue Ma, Tianyu Yang, Ying Shan, Xiu Li
This paper presents SimVTP: a Simple Video-Text Pretraining framework via masked autoencoders.
Ranked #16 on Moment Retrieval on Charades-STA
1 code implementation • 23 Nov 2022 • Yingqing He, Tianyu Yang, Yong Zhang, Ying Shan, Qifeng Chen
Diffusion models have shown remarkable results recently but require significant computational resources.
Ranked #2 on Video Generation on Taichi
1 code implementation • CVPR 2022 • Zhihui Lin, Tianyu Yang, Maomao Li, Ziyu Wang, Chun Yuan, Wenhao Jiang, Wei Liu
Matching-based methods, especially those based on space-time memory, are significantly ahead of other solutions in semi-supervised video object segmentation (VOS).
Tasks: Semantic Segmentation, Semi-Supervised Video Object Segmentation, +1
1 code implementation • 21 Jul 2022 • Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou
To further enhance the temporal reasoning ability of the learned feature, we propose a context projection head and a temporal aware contrastive loss to perceive the contextual relationships.
1 code implementation • CVPR 2022 • Can Zhang, Tianyu Yang, Junwu Weng, Meng Cao, Jue Wang, Yuexian Zou
These pre-trained models can be sub-optimal for temporal localization tasks due to the inherent discrepancy between video-level classification and clip-level localization.
no code implementations • 8 Mar 2022 • Tianyu Yang, Hanzhou Wu, Biao Yi, Guorui Feng, Xinpeng Zhang
In this paper, we propose a novel LS method to modify a given text by pivoting it between two different languages and embed secret data by applying a GLS-like information encoding strategy.
no code implementations • CVPR 2022 • Jingjing Li, Tianyu Yang, Wei Ji, Jue Wang, Li Cheng
Inspired by recent success in unsupervised contrastive representation learning, we propose a novel denoised cross-video contrastive algorithm, aiming to enhance the feature discrimination ability of video snippets for accurate temporal action localization in the weakly-supervised setting.
1 code implementation • CVPR 2022 • Shuangrui Ding, Maomao Li, Tianyu Yang, Rui Qian, Haohang Xu, Qingyi Chen, Jue Wang, Hongkai Xiong
To alleviate such bias, we propose Foreground-background Merging (FAME) to deliberately compose the moving foreground region of the selected video onto the static background of others.
1 code implementation • CVPR 2021 • Tian Pan, Yibing Song, Tianyu Yang, Wenhao Jiang, Wei Liu
By enhancing the temporal robustness of the encoder and modeling the temporal decay of the keys, our VideoMoCo improves MoCo temporally based on contrastive learning.
Ranked #76 on Action Recognition on HMDB-51
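The temporal-decay idea can be sketched as attenuating older keys in the MoCo queue when forming contrastive logits, so stale negatives count less. The function below is a simplified stand-in, not the paper's implementation; the decay schedule and names are assumptions.

```python
import numpy as np

def decayed_infonce_logits(q, pos_k, queue, ages, t=0.07, decay=0.99):
    """InfoNCE-style logits where each negative key in the queue is
    attenuated by decay**age (simplified stand-in for temporal decay).

    q:      (N, D) query embeddings
    pos_k:  (N, D) positive key embeddings
    queue:  (K, D) negative keys from the memory queue
    ages:   (K,)   age of each key, in update steps
    """
    l_pos = (q * pos_k).sum(-1, keepdims=True)   # (N, 1) positive similarities
    l_neg = q @ queue.T                          # (N, K) negative similarities
    l_neg = l_neg * (decay ** ages)[None, :]     # older keys contribute less
    return np.concatenate([l_pos, l_neg], axis=1) / t
```

The resulting logits would feed a standard cross-entropy loss with label 0, as in MoCo.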
no code implementations • CVPR 2020 • Tianyu Yang, Pengfei Xu, Runbo Hu, Hua Chai, Antoni B. Chan
In this paper, we design a tracking model consisting of response generation and bounding box regression: the first component produces a heat map indicating the presence of the object at different positions, and the second regresses the relative bounding-box shifts to anchors mounted on sliding-window locations.
no code implementations • 12 Jul 2019 • Tianyu Yang, Antoni B. Chan
The reading and writing process of the external memory is controlled by an LSTM network with the search feature map as input.
no code implementations • 24 May 2019 • Makoto Naruse, Takashi Matsubara, Nicolas Chauvet, Kazutaka Kanno, Tianyu Yang, Atsushi Uchida
Here we utilize chaotic time series generated experimentally by semiconductor lasers as the latent variables of a GAN, whereby the inherent nature of chaos can be reflected or transformed into the generated output data.
1 code implementation • ECCV 2018 • Tianyu Yang, Antoni B. Chan
In this paper, we propose a dynamic memory network to adapt the template to the target's appearance variations during tracking.
1 code implementation • 13 Aug 2017 • Tianyu Yang, Antoni B. Chan
Recently, using convolutional neural networks (CNNs) has gained popularity in visual tracking due to their robust feature representations of images.