no code implementations • 20 Mar 2024 • BoWen Zhang, Tianyu Yang, Yu Li, Lei Zhang, Xi Zhao
In this paper, we present a triplane autoencoder, which encodes 3D models into a compact triplane latent space to effectively compress both the 3D geometry and texture information.
1 code implementation • 16 Mar 2024 • Zhe Kong, Yong Zhang, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, GuanYing Chen, Wei Liu, Wenhan Luo
We also observe that the initial denoising timestep for noise blending is key to preserving identity and layout.
no code implementations • 6 Feb 2024 • Zhenwen Liang, Kehan Guo, Gang Liu, Taicheng Guo, Yujun Zhou, Tianyu Yang, Jiajun Jiao, Renjie Pi, Jipeng Zhang, Xiangliang Zhang
The paper introduces SceMQA, a novel benchmark for scientific multimodal question answering at the college entrance level.
1 code implementation • 19 Jan 2024 • Wenlong Liu, Tianyu Yang, YuHan Wang, QiZhi Yu, Lei Zhang
Finally, we propose a KNN interpolation mechanism for the mask attention module of the spotting head to better handle primitive mask downsampling, which operates at the primitive level, in contrast to the pixel level used for images.
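The KNN interpolation idea can be illustrated with a minimal sketch: values defined on a downsampled set of points are propagated to a denser set by inverse-distance weighting over the k nearest neighbors. The function name and implementation below are hypothetical, not the paper's code.

```python
import numpy as np

def knn_interpolate(src_pos, src_val, dst_pos, k=3):
    """Interpolate values at dst_pos from the k nearest src_pos points,
    weighted by inverse distance (illustrative sketch only)."""
    # pairwise distances between destination and source points: (n_dst, n_src)
    d = np.linalg.norm(dst_pos[:, None, :] - src_pos[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k]           # indices of k nearest neighbors
    nd = np.take_along_axis(d, idx, axis=1)      # their distances
    w = 1.0 / (nd + 1e-8)                        # inverse-distance weights
    w /= w.sum(axis=1, keepdims=True)            # normalize per destination point
    return (src_val[idx] * w[..., None]).sum(axis=1)
```

In the paper's setting the interpolated quantities would be mask-attention features on CAD primitives rather than scalar values, but the neighbor-weighted averaging is the same.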
1 code implementation • 18 Jan 2024 • Xuangeng Chu, Yu Li, Ailing Zeng, Tianyu Yang, Lijian Lin, Yunfei Liu, Tatsuya Harada
Head avatar reconstruction, crucial for applications in virtual reality, online meetings, gaming, and film industries, has garnered substantial attention within the computer vision community.
no code implementations • 10 Dec 2023 • Maomao Li, Yu Li, Tianyu Yang, Yunfei Liu, Dongxu Yue, Zhihui Lin, Dong Xu
This paper presents a video inversion approach for zero-shot video editing, which aims to model the input video with low-rank representation during the inversion process.
no code implementations • 18 Oct 2023 • Xinhua Cheng, Tianyu Yang, Jianan Wang, Yu Li, Lei Zhang, Jian Zhang, Li Yuan
Recent text-to-3D generation methods achieve impressive 3D content creation capacity thanks to the advances in image diffusion models and optimizing strategies.
no code implementations • 16 Oct 2023 • Yukai Shi, Jianan Wang, He Cao, Boshi Tang, Xianbiao Qi, Tianyu Yang, Yukun Huang, Shilong Liu, Lei Zhang, Heung-Yeung Shum
In this paper, we present TOSS, which introduces text to the task of novel view synthesis (NVS) from just a single RGB image.
no code implementations • 12 Oct 2023 • Haohan Weng, Tianyu Yang, Jianan Wang, Yu Li, Tong Zhang, C. L. Philip Chen, Lei Zhang
Large image diffusion models enable novel view synthesis with high quality and excellent zero-shot capability.
no code implementations • ICCV 2023 • Qiangqiang Wu, Tianyu Yang, Wei Wu, Antoni Chan
The current popular methods for video object segmentation (VOS) implement feature matching through several hand-crafted modules that separately perform feature extraction and matching.
1 code implementation • 24 May 2023 • Tianyu Yang, Thy Thy Tran, Iryna Gurevych
These models also suffer from posterior collapse, i.e., the decoder tends to ignore latent variables and directly access information captured in the encoder through the cross-attention mechanism.
1 code implementation • CVPR 2023 • Qiangqiang Wu, Tianyu Yang, Ziquan Liu, Baoyuan Wu, Ying Shan, Antoni B. Chan
However, we find that this simple baseline heavily relies on spatial cues while ignoring temporal relations for frame reconstruction, thus leading to sub-optimal temporal matching representations for VOT and VOS.
Ranked #1 on Visual Object Tracking on TrackingNet (AUC metric)
no code implementations • 7 Dec 2022 • Yue Ma, Tianyu Yang, Ying Shan, Xiu Li
This paper presents SimVTP: a Simple Video-Text Pretraining framework via masked autoencoders.
Ranked #16 on Moment Retrieval on Charades-STA
1 code implementation • 23 Nov 2022 • Yingqing He, Tianyu Yang, Yong Zhang, Ying Shan, Qifeng Chen
Diffusion models have shown remarkable results recently but require significant computational resources.
Ranked #2 on Video Generation on Taichi
1 code implementation • CVPR 2022 • Zhihui Lin, Tianyu Yang, Maomao Li, Ziyu Wang, Chun Yuan, Wenhao Jiang, Wei Liu
Matching-based methods, especially those based on space-time memory, are significantly ahead of other solutions in semi-supervised video object segmentation (VOS).
Tasks: Semantic Segmentation, Semi-Supervised Video Object Segmentation, +1
1 code implementation • 21 Jul 2022 • Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou
To further enhance the temporal reasoning ability of the learned feature, we propose a context projection head and a temporal aware contrastive loss to perceive the contextual relationships.
1 code implementation • CVPR 2022 • Can Zhang, Tianyu Yang, Junwu Weng, Meng Cao, Jue Wang, Yuexian Zou
These pre-trained models can be sub-optimal for temporal localization tasks due to the inherent discrepancy between video-level classification and clip-level localization.
no code implementations • 8 Mar 2022 • Tianyu Yang, Hanzhou Wu, Biao Yi, Guorui Feng, Xinpeng Zhang
In this paper, we propose a novel LS method to modify a given text by pivoting it between two different languages and embed secret data by applying a GLS-like information encoding strategy.
no code implementations • CVPR 2022 • Jingjing Li, Tianyu Yang, Wei Ji, Jue Wang, Li Cheng
Inspired by recent success in unsupervised contrastive representation learning, we propose a novel denoised cross-video contrastive algorithm, aiming to enhance the feature discrimination ability of video snippets for accurate temporal action localization in the weakly-supervised setting.
1 code implementation • CVPR 2022 • Shuangrui Ding, Maomao Li, Tianyu Yang, Rui Qian, Haohang Xu, Qingyi Chen, Jue Wang, Hongkai Xiong
To alleviate such bias, we propose Foreground-background Merging (FAME) to deliberately compose the moving foreground region of the selected video onto the static background of others.
1 code implementation • CVPR 2021 • Tian Pan, Yibing Song, Tianyu Yang, Wenhao Jiang, Wei Liu
By enhancing the temporal robustness of the encoder and modeling the temporal decay of the keys, our VideoMoCo improves MoCo temporally based on contrastive learning.
Ranked #76 on Action Recognition on HMDB-51
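The temporal-decay idea can be sketched as attenuating older keys in the MoCo queue when forming contrastive logits, so stale negatives count less. The function below is a simplified stand-in, not the paper's implementation; the decay schedule and names are assumptions.

```python
import numpy as np

def decayed_infonce_logits(q, pos_k, queue, ages, t=0.07, decay=0.99):
    """InfoNCE-style logits where each negative key in the queue is
    attenuated by decay**age (simplified stand-in for temporal decay).

    q:      (N, D) query embeddings
    pos_k:  (N, D) positive key embeddings
    queue:  (K, D) negative keys from the memory queue
    ages:   (K,)   age of each key, in update steps
    """
    l_pos = (q * pos_k).sum(-1, keepdims=True)   # (N, 1) positive similarities
    l_neg = q @ queue.T                          # (N, K) negative similarities
    l_neg = l_neg * (decay ** ages)[None, :]     # older keys contribute less
    return np.concatenate([l_pos, l_neg], axis=1) / t
```

The resulting logits would feed a standard cross-entropy loss with label 0, as in MoCo.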
no code implementations • CVPR 2020 • Tianyu Yang, Pengfei Xu, Runbo Hu, Hua Chai, Antoni B. Chan
In this paper, we design a tracking model consisting of response generation and bounding box regression: the first component produces a heat map indicating the presence of the object at different positions, and the second regresses the relative bounding-box shifts to anchors mounted on sliding-window locations.
no code implementations • 12 Jul 2019 • Tianyu Yang, Antoni B. Chan
The reading and writing process of the external memory is controlled by an LSTM network with the search feature map as input.
no code implementations • 24 May 2019 • Makoto Naruse, Takashi Matsubara, Nicolas Chauvet, Kazutaka Kanno, Tianyu Yang, Atsushi Uchida
Here we utilize chaotic time series generated experimentally by semiconductor lasers as the latent variables of a GAN, whereby the inherent nature of chaos can be reflected or transformed into the generated output data.
1 code implementation • ECCV 2018 • Tianyu Yang, Antoni B. Chan
In this paper, we propose a dynamic memory network to adapt the template to the target's appearance variations during tracking.
1 code implementation • 13 Aug 2017 • Tianyu Yang, Antoni B. Chan
Recently, using convolutional neural networks (CNNs) has gained popularity in visual tracking due to their robust feature representations of images.