1 code implementation • ECCV 2020 • Guangrui Li, Guoliang Kang, Wu Liu, Yunchao Wei, Yi Yang
The goal of CCM is to acquire synthetic images whose distribution is similar to that of the real images in the target domain, so that the domain gap can be naturally alleviated by training on content-consistent synthetic images.
Ranked #12 on Semantic Segmentation on GTAV-to-Cityscapes Labels
1 code implementation • NAACL 2022 • John Lalor, Yi Yang, Kendall Smith, Nicole Forsgren, Ahmed Abbasi
While much work has highlighted biases embedded in state-of-the-art language models, and more recent efforts have focused on how to debias, research assessing the fairness and performance of biased/debiased models on downstream prediction tasks has been limited.
1 code implementation • EMNLP 2021 • Ahmed Abbasi, David Dobolyi, John P. Lalor, Richard G. Netemeyer, Kendall Smith, Yi Yang
We also discuss the important implications of our work and resulting testbed for future NLP research on psychometrics and fairness.
no code implementations • ACL 2022 • Yue Guo, Yi Yang, Ahmed Abbasi
Specifically, we propose a variant of the beam search method to automatically search for biased prompts such that the cloze-style completions are the most different with respect to different demographic groups.
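The idea of searching prompt space with a beam can be illustrated with a toy sketch. Here `disparity` is a made-up stand-in for querying a language model with each candidate prompt and measuring how far apart the cloze completions are across demographic groups; the vocabulary and scores are hypothetical, not from the paper.

```python
# Beam-search-style search for bias-revealing prompts (toy sketch).
VOCAB = ["the", "doctor", "nurse", "brilliant", "emotional", "works"]

def disparity(prompt_tokens):
    # Toy score: pretend stereotype-loaded words widen the gap between
    # groups; a real system would compare LM completion distributions.
    loaded = {"nurse": 0.9, "emotional": 0.8, "brilliant": 0.5}
    return sum(loaded.get(t, 0.1) for t in prompt_tokens)

def beam_search_biased_prompts(length=3, beam_width=2):
    beams = [([], 0.0)]
    for _ in range(length):
        candidates = []
        for tokens, _ in beams:
            for w in VOCAB:
                new = tokens + [w]
                candidates.append((new, disparity(new)))
        # Keep the prompts whose completions differ most across groups.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

best_prompt, score = beam_search_biased_prompts()[0]
print(best_prompt, round(score, 2))
```

With the toy scorer, the beam converges on the most stereotype-loaded token sequence, which is exactly the behavior the search exploits at scale.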
no code implementations • ACL 2022 • Chengyu Chuang, Yi Yang
Given the prevalence of NLP models in financial decision making systems, this work raises the awareness of their potential implicit preferences in the stock markets.
1 code implementation • Findings (EMNLP) 2021 • Hanyu Duan, Yi Yang, Kar Yan Tam
Numeracy plays a key role in natural language understanding.
no code implementations • 30 Apr 2024 • Kaiqiao Han, Yi Yang, Zijie Huang, Xuan Kan, Yang Yang, Ying Guo, Lifang He, Liang Zhan, Yizhou Sun, Wei Wang, Carl Yang
Brain network analysis is vital for understanding the neural interactions regarding brain structures and functions, and identifying potential biomarkers for clinical phenotypes.
no code implementations • 28 Apr 2024 • Yunbing Jia, Xiaoyu Kong, Fan Tang, Yixing Gao, WeiMing Dong, Yi Yang
In this paper, we reveal the two sides of data augmentation: enhancements in closed-set recognition correlate with a significant decrease in open-set recognition.
no code implementations • 25 Apr 2024 • Zhihao Zhu, Ninglu Shao, Defu Lian, Chenwang Wu, Zheng Liu, Yi Yang, Enhong Chen
Large language models (LLMs) show early signs of artificial general intelligence but struggle with hallucinations.
no code implementations • 25 Apr 2024 • Kaixin Shen, Ruijie Quan, Linchao Zhu, Jun Xiao, Yi Yang
AudioScenic exploits the inherent properties of audio, namely, audio magnitude and frequency, to guide the editing process, aiming to control the temporal dynamics and enhance the temporal consistency.
no code implementations • 25 Apr 2024 • Kaixin Shen, Ruijie Quan, Linchao Zhu, Jun Xiao, Yi Yang
In this study, we introduce a framework called Multi-Agent Trajectory prediction via neural interaction Energy (MATE).
2 code implementations • 21 Apr 2024 • Wenhao Wang, Yifan Sun, Zhentao Tan, Yi Yang
To accommodate the "seen → unseen" generalization scenario, we construct the first large-scale pattern dataset named AnyPattern, which has the largest number of tamper patterns (90 for training and 10 for testing) among all the existing ones.
no code implementations • 20 Apr 2024 • Ben Eisner, Yi Yang, Todor Davchev, Mel Vecerik, Jonathan Scholz, David Held
In this work, we propose a method for precise relative pose prediction which is provably SE(3)-equivariant, can be learned from only a few demonstrations, and can generalize across variations in a class of objects.
no code implementations • 11 Apr 2024 • Yufeng Yue, Meng Yu, Luojie Yang, Yi Yang
Image restoration is rather challenging in adverse weather conditions, especially when multiple degradations occur simultaneously.
no code implementations • 10 Apr 2024 • Longwei Zou, Qingyang Wang, Han Zhao, Jiangang Kong, Yi Yang, Yangdong Deng
Fast-growing large-scale language models are delivering unprecedented performance on almost all natural language processing tasks.
2 code implementations • 8 Apr 2024 • Yufeng Yue, Yinan Deng, Jiahui Wang, Yi Yang
Implicit reconstruction of ESDF (Euclidean Signed Distance Field) involves training a neural network to regress the signed distance from any point to the nearest obstacle, which has the advantages of lightweight storage and continuous querying.
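The regression target such a network learns can be made concrete with a toy 2-D example: the signed distance from a query point to the nearest obstacle boundary (negative inside). The circle obstacles below are made up for illustration; they are not from the paper.

```python
import math

# Ground-truth Euclidean signed distance a network would regress:
# positive outside an obstacle, negative inside, zero on the surface.
# Obstacles here are toy circles (center_x, center_y, radius).
OBSTACLES = [(0.0, 0.0, 1.0), (3.0, 0.0, 0.5)]

def esdf(x, y):
    # Distance to the nearest obstacle boundary, signed by containment.
    return min(math.hypot(x - cx, y - cy) - r for cx, cy, r in OBSTACLES)

# Training pairs for an implicit network f_theta(x, y) ≈ esdf(x, y);
# the network then supports continuous querying at arbitrary points.
samples = [((x * 0.5, 0.0), esdf(x * 0.5, 0.0)) for x in range(-4, 9)]
print(esdf(2.0, 0.0))  # 0.5: half a unit from the second circle
```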
no code implementations • 5 Apr 2024 • Wenguan Wang, Yi Yang, Yunhe Pan
Visual knowledge is a new form of knowledge representation that can encapsulate visual concepts and their relations in a succinct, comprehensive, and interpretable manner, with a deep root in cognitive psychology.
no code implementations • 4 Apr 2024 • Lei Zhang, YuHang Zhou, Yi Yang, Xinbo Gao
Despite providing high-performance solutions for computer vision tasks, deep neural network (DNN) models have been proven extremely vulnerable to adversarial attacks.
no code implementations • 2 Apr 2024 • Tianhao Zhao, Yongcan Chen, Yu Wu, Tianyang Liu, Bo Du, Peilun Xiao, Shi Qiu, Hongda Yang, Guozhen Li, Yi Yang, Yutian Lin
In the first stage, we train a BEV autoencoder to reconstruct the BEV segmentation maps given corrupted noisy latent representation, which urges the decoder to learn fundamental knowledge of typical BEV patterns.
no code implementations • 30 Mar 2024 • Ruijie Quan, Wenguan Wang, Fan Ma, Hehe Fan, Yi Yang
We select the highest-scoring clusters and use their medoid nodes for the next iteration of clustering, until we obtain a hierarchical and informative representation of the protein.
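The cluster-then-keep-medoids loop can be sketched in one dimension. Everything here is a hypothetical simplification: the seeding, distance, and data are toys standing in for the paper's protein-graph clustering.

```python
def medoid(cluster):
    # The member minimizing total distance to the rest of its cluster.
    return min(cluster, key=lambda p: sum(abs(p - q) for q in cluster))

def cluster_round(points, k):
    # One toy 1-D clustering pass: assign each point to its nearest of
    # k seeds drawn from the sorted data, then keep each cluster's medoid.
    seeds = sorted(points)[:: max(1, len(points) // k)][:k]
    clusters = {s: [] for s in seeds}
    for p in points:
        nearest = min(seeds, key=lambda s: abs(p - s))
        clusters[nearest].append(p)
    return [medoid(c) for c in clusters.values() if c]

def hierarchical_medoids(points, k, rounds):
    # Repeatedly cluster and keep only medoids, coarsening each round
    # into a smaller, more informative representation.
    for _ in range(rounds):
        if len(points) <= k:
            break
        points = cluster_round(points, k)
    return sorted(points)

nodes = [0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 9.8, 9.9, 10.0]
print(hierarchical_medoids(nodes, 3, 2))  # [0.2, 5.1, 9.9]
```

Each round replaces a cluster by its most central member, so the representation shrinks while staying anchored to actual data points.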
no code implementations • 29 Mar 2024 • Ruijie Quan, Wenguan Wang, Zhibo Tian, Fan Ma, Yi Yang
Reconstructing the viewed images from human brain activity bridges human and computer vision through the Brain-Computer Interface.
1 code implementation • 26 Mar 2024 • Guikun Chen, Xia Li, Yi Yang, Wenguan Wang
In this work, we propose feature extraction with clustering (FEC), a conceptually elegant yet surprisingly ad-hoc interpretable neural clustering framework, which views feature extraction as a process of selecting representatives from data and thus automatically captures the underlying data distribution.
1 code implementation • 25 Mar 2024 • Yuhang Ding, Liulei Li, Wenguan Wang, Yi Yang
This enables knowledge acquired from prior slices to assist in segmenting the current slice, efficiently bridging communication between remote slices using merely 2D networks.
no code implementations • 24 Mar 2024 • Yucheng Suo, Fan Ma, Linchao Zhu, Yi Yang
The pseudo-word tokens generated in this stream are explicitly aligned with fine-grained semantics in the text embedding space.
no code implementations • 24 Mar 2024 • Xiangpeng Yang, Linchao Zhu, Hehe Fan, Yi Yang
We find that the crux of the issue stems from the imprecise distribution of attention weights across designated regions, including inaccurate text-to-attribute control and attention leakage.
no code implementations • 24 Mar 2024 • Zhuoyi Peng, Yi Yang
We study the patent phrase similarity inference task, which measures the semantic similarity between two patent phrases.
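The similarity inference task reduces to scoring phrase pairs in an embedding space. The bag-of-words embedding below is a toy stand-in for a learned patent-domain encoder; the phrases are invented examples.

```python
from collections import Counter
import math

def embed(phrase):
    # Bag-of-words vector; a real system would use a patent-domain
    # encoder so that synonyms like "unit" and "assembly" score closer.
    return Counter(phrase.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

sim = cosine(embed("heat exchanger assembly"), embed("heat exchanger unit"))
print(round(sim, 3))  # 0.667
```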
no code implementations • 23 Mar 2024 • Shuai Zhao, Linchao Zhu, Ruijie Quan, Yi Yang
Once these concealed passphrases in user documents, referred to as ghost sentences, are identified in the generated content of LLMs, users can be certain that their data was used for training.
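The user-side check is conceptually a membership test over generated text. The passphrases and generated output below are invented for illustration; a real deployment would sample many generations rather than scan one string.

```python
# Toy detection loop: a user checks whether any of their planted
# "ghost sentences" surfaces verbatim in text generated by an LLM,
# which would indicate the user's documents were in the training data.
ghost_sentences = [
    "the violet walrus hums at half past never",
    "quartz rivers fold twice under paper moons",
]

def data_was_used(generated_text, passphrases):
    # Return every planted passphrase found verbatim in the output.
    return [p for p in passphrases if p in generated_text.lower()]

output = "... the violet walrus hums at half past never, as they say ..."
print(data_was_used(output, ghost_sentences))
```

Because the passphrases are nonsensical and unique, a verbatim match is overwhelming evidence of memorization rather than coincidence.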
1 code implementation • 22 Mar 2024 • Tuo Feng, Wenguan Wang, Fan Ma, Yi Yang
Consequently, it is essential to develop LiDAR perception methods that are both efficient and effective.
1 code implementation • 22 Mar 2024 • Lei Zhang, Xiaowei Fu, Fuxiang Huang, Yi Yang, Xinbo Gao
Person re-identification (ReID) has made great strides thanks to data-driven deep learning techniques.
no code implementations • 21 Mar 2024 • Jiaxin Liu, Yi Yang, Kar Yan Tam
In this paper, we introduce the Financial-STS task, a financial domain-specific NLP task designed to measure the nuanced semantic similarity between pairs of financial narratives.
1 code implementation • 21 Mar 2024 • Rui Liu, Wenguan Wang, Yi Yang
To achieve a comprehensive 3D representation with fine-grained details, we introduce a Volumetric Environment Representation (VER), which voxelizes the physical world into structured 3D cells.
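The voxelization step behind such a representation can be sketched simply: map continuous 3-D points to discrete cell indices at a fixed cell size, the precursor to learning per-cell features. The cell size and points are arbitrary illustration values.

```python
# Toy voxelization: bucket 3-D points into structured cells, the first
# step of building a volumetric environment representation.
def voxelize(points, cell=0.5):
    cells = {}
    for x, y, z in points:
        idx = (int(x // cell), int(y // cell), int(z // cell))
        cells.setdefault(idx, []).append((x, y, z))
    return cells

pts = [(0.1, 0.2, 0.0), (0.3, 0.4, 0.1), (1.2, 0.0, 0.0)]
grid = voxelize(pts)
print(sorted(grid))  # [(0, 0, 0), (2, 0, 0)]
```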
1 code implementation • 14 Mar 2024 • Yinan Deng, Jiahui Wang, Jingyu Zhao, Xinyu Tian, Guangyan Chen, Yi Yang, Yufeng Yue
In this work, we propose OpenGraph, the first open-vocabulary hierarchical graph representation designed for large-scale outdoor environments.
1 code implementation • 10 Mar 2024 • Wenhao Wang, Yi Yang
However, Sora, along with other text-to-video diffusion models, is highly reliant on prompts, and there is no publicly available dataset that features a study of text-to-video prompts.
no code implementations • 6 Mar 2024 • Xiangquan Gui, Binxuan Zhang, Li Li, Yi Yang
To solve such problems, in this paper, we (1) propose DLP-GAN (Draw Modern Chinese Landscape Photos with Generative Adversarial Network), an unsupervised cross-domain image translation framework with a novel asymmetric cycle mapping, and (2) introduce a generator based on a dense-fusion module to match different translation directions.
1 code implementation • 5 Mar 2024 • Miaomiao Li, Jiaqi Zhu, Yang Wang, Yi Yang, Yilin Li, Hongan Wang
Weakly supervised text classification (WSTC), also called zero-shot or dataless text classification, has attracted increasing attention due to its applicability to classifying large volumes of text in the dynamic and open Web environment, since it requires only a limited set of seed words (label names) for each category instead of labeled data.
no code implementations • 15 Feb 2024 • Hanyu Duan, Yi Yang, Kar Yan Tam
More specifically, we check whether and how an LLM reacts differently in its hidden states when it answers a question right versus when it hallucinates.
no code implementations • 15 Feb 2024 • Chao Wang, Hehe Fan, Ruijie Quan, Yi Yang
The protein first passes through the protein encoders and the PLP-former to produce protein embeddings, which are then projected by the adapter to conform with the LLM.
no code implementations • 9 Feb 2024 • Zhenglin Zhou, Fan Ma, Hehe Fan, Yi Yang
Specifically, we incorporate the FLAME into both 3D representation and score distillation: 1) FLAME-based 3D Gaussian splatting, driving 3D Gaussian points by rigging each point to a FLAME mesh.
1 code implementation • 8 Feb 2024 • Dewei Zhou, You Li, Fan Ma, Xiaoting Zhang, Yi Yang
Lastly, we aggregate all the shaded instances to provide the necessary information for accurately generating multiple instances in stable diffusion (SD).
Ranked #1 on Conditional Text-to-Image Synthesis on COCO-MIG
1 code implementation • 5 Feb 2024 • Sheng Luo, Wei Chen, Wanxin Tian, Rui Liu, Luanxuan Hou, Xiubao Zhang, Haifeng Shen, Ruiqi Wu, Shuyi Geng, Yi Zhou, Ling Shao, Yi Yang, Bojun Gao, Qun Li, Guobin Wu
Foundation models have indeed made a profound impact on various fields, emerging as pivotal components that significantly shape the capabilities of intelligent systems.
1 code implementation • 1 Feb 2024 • Chao Liang, Fan Ma, Linchao Zhu, Yingying Deng, Yi Yang
Moreover, we introduce the 3D facial prior to equip our model with control over the human head in a flexible and 3D-consistent manner.
2 code implementations • 1 Feb 2024 • Carl Doersch, Yi Yang, Dilara Gokay, Pauline Luc, Skanda Koppula, Ankush Gupta, Joseph Heyward, Ross Goroshin, João Carreira, Andrew Zisserman
To endow models with greater understanding of physics and motion, it is useful to enable them to perceive how solid surfaces move and deform in real scenes.
no code implementations • 31 Jan 2024 • Xu Zhang, Yiming Mo, Wenguan Wang, Yi Yang
In response, we exploit easy-to-access unpaired data (i.e., one component of a product-reactant(s) pair) to generate in-silico paired data and facilitate model training.
2 code implementations • 29 Jan 2024 • Qingwen Zhang, Yi Yang, Heng Fang, Ruoyu Geng, Patric Jensfelt
Scene flow estimation determines a scene's 3D motion field, by predicting the motion of points in the scene, especially for aiding tasks in autonomous driving.
Ranked #1 on Scene Flow Estimation on Argoverse 2
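A minimal baseline makes the task definition concrete: the flow for each point can be approximated as its offset to the nearest neighbour in the next frame. This nearest-neighbour matcher and the two tiny point clouds are illustrative stand-ins, not the paper's learned method.

```python
import math

# Toy scene flow: per-point 3-D motion estimated by nearest-neighbour
# matching between two consecutive LiDAR-like frames.
def nearest_neighbor_flow(frame_t, frame_t1):
    flow = []
    for p in frame_t:
        match = min(frame_t1, key=lambda q: math.dist(p, q))
        flow.append(tuple(m - c for c, m in zip(p, match)))
    return flow

frame_t = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_t1 = [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)]
print(nearest_neighbor_flow(frame_t, frame_t1))
```

Learned estimators improve on this baseline chiefly by resolving ambiguous matches and handling points that appear or disappear between frames.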
1 code implementation • 27 Jan 2024 • Yixuan Tang, Yi Yang
We hope MultiHop-RAG will be a valuable resource for the community in developing effective RAG systems, thereby facilitating greater adoption of LLMs in practice.
no code implementations • 23 Jan 2024 • Kexin Li, Tao Jiang, Zongxin Yang, Yi Yang, Yueting Zhuang, Jun Xiao
Interactive Video Object Segmentation (iVOS) is a challenging task that requires real-time human-computer interaction.
no code implementations • 20 Jan 2024 • Yanlong Zang, Han Yang, Jiaxu Miao, Yi Yang
Image-based virtual try-on systems, which fit new garments onto human portraits, are gaining research attention. An ideal pipeline should preserve the static features of clothes (like textures and logos) while also generating dynamic elements (e.g., shadows, folds) that adapt to the model's pose and environment. Previous works fail specifically at generating dynamic features, as they trivially preserve the warped in-shop clothes by compositing them with a predicted alpha mask. To break the dilemma between over-preservation and texture loss, we propose a novel diffusion-based product-level virtual try-on pipeline, i.e., PLTON, which can preserve the fine details of logos and embroideries while producing realistic clothes shading and wrinkles. The main insights are three-fold: 1) Adaptive Dynamic Rendering: we take a pre-trained diffusion model as a generative prior and tame it with image features, training a dynamic extractor from scratch to generate dynamic tokens that preserve high-fidelity semantic information.
1 code implementation • 19 Jan 2024 • Xiangpeng Yang, Linchao Zhu, Xiaohan Wang, Yi Yang
(2) Equipping the visual and text encoders with separate prompts fails to mitigate the visual-text modality gap.
1 code implementation • 16 Jan 2024 • Zongxin Yang, Guikun Chen, Xiaodi Li, Wenguan Wang, Yi Yang
Considering the video modality better reflects the ever-changing nature of real-world scenarios, we exemplify DoraemonGPT as a video agent.
no code implementations • 12 Jan 2024 • Yuanzhi Liang, Linchao Zhu, Yi Yang
To address this challenge, we introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
1 code implementation • 8 Jan 2024 • Chuyang Zhao, Yifan Sun, Wenhao Wang, Qiang Chen, Errui Ding, Yi Yang, Jingdong Wang
The traditional training procedure using one-to-one supervision in the original DETR lacks direct supervision for the object detection candidates.
no code implementations • 1 Jan 2024 • Xiao Pan, Zongxin Yang, Shuai Bai, Yi Yang
Targeting these issues, we propose GD²-NeRF, a Generative Detail compensation framework via GAN and Diffusion that is inference-time finetuning-free and produces vivid, plausible details.
1 code implementation • 23 Dec 2023 • MingWei Li, Jiachen Tao, Zongxin Yang, Yi Yang
In this paper, we introduce Human101, a novel framework adept at producing high-fidelity dynamic 3D human reconstructions from 1-view videos by training 3D Gaussians in 100 seconds and rendering in 100+ FPS.
no code implementations • 18 Dec 2023 • Zhihao Zhu, Rui Fan, Chenwang Wu, Yi Yang, Defu Lian, Enhong Chen
Some adversarial attacks have achieved model stealing against recommender systems, to some extent, by collecting abundant training data of the target model (target data) or issuing a mass of queries.
no code implementations • 18 Dec 2023 • Zhihao Zhu, Chenwang Wu, Rui Fan, Yi Yang, Defu Lian, Enhong Chen
Recent research demonstrates that GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions.
no code implementations • 13 Dec 2023 • Yuanyou Xu, Zongxin Yang, Yi Yang
For geometry, we propose to constrain the optimized avatar in a decent global shape with a template avatar.
no code implementations • 12 Dec 2023 • Fan Ma, Xiaojie Jin, Heng Wang, Yuchen Xian, Jiashi Feng, Yi Yang
This amplifies the effect of visual tokens on text generation, especially when the relative distance is longer between visual and text tokens.
Ranked #6 on Zero-Shot Video Question Answer on MSRVTT-QA
1 code implementation • 11 Dec 2023 • Sarin Chandy, Varun Gangal, Yi Yang, Gabriel Maggiotti
DYAD is based on a bespoke near-sparse matrix structure which approximates the dense "weight" matrix W that matrix-multiplies the input in the typical realization of such a layer, a.k.a. DENSE.
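One simple instance of a near-sparse replacement for a dense matmul is a block-diagonal weight, shown below; note this is a generic illustration of the idea, not DYAD's actual (more elaborate) structure.

```python
# Sketch: replace a dense y = W x with a block-diagonal W, so each
# square block multiplies only its own slice of the input. Parameter
# count drops from n*n to the sum of the block sizes squared.
def block_diag_matvec(blocks, x):
    # blocks: list of square matrices applied to consecutive slices of x.
    y, i = [], 0
    for B in blocks:
        n = len(B)
        seg = x[i:i + n]
        y.extend(sum(B[r][c] * seg[c] for c in range(n)) for r in range(n))
        i += n
    return y

blocks = [[[1, 2], [3, 4]], [[5, 0], [0, 5]]]
print(block_diag_matvec(blocks, [1, 1, 2, 3]))  # [3, 7, 10, 15]
```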
no code implementations • 10 Dec 2023 • Zechuan Zhang, Zongxin Yang, Yi Yang
A key limitation of previous methods is their insufficient prior guidance in transitioning from 2D to 3D and in texture prediction.
no code implementations • 1 Dec 2023 • João Carreira, Michael King, Viorica Pătrăucean, Dilara Gokay, Cătălin Ionescu, Yi Yang, Daniel Zoran, Joseph Heyward, Carl Doersch, Yusuf Aytar, Dima Damen, Andrew Zisserman
We introduce a framework for online learning from a single continuous video stream -- the way people and animals learn, without mini-batches, data augmentation or shuffling.
no code implementations • 29 Nov 2023 • Jianfeng Zhang, Xuanmeng Zhang, Huichao Zhang, Jun Hao Liew, Chenxu Zhang, Yi Yang, Jiashi Feng
We study the problem of creating high-fidelity and animatable 3D avatars from only textual descriptions.
no code implementations • 27 Nov 2023 • Yu Lu, Linchao Zhu, Hehe Fan, Yi Yang
Text-to-video (T2V) generation is a rapidly growing research area that aims to translate the scenes, objects, and actions within complex video text into a sequence of coherent visual frames.
no code implementations • 23 Nov 2023 • Hao Feng, Yi Yang, Zhu Han
Experimental results suggest that the proposed method surpasses the baseline in perceiving vehicles in blind spots and effectively compresses communication data.
no code implementations • 21 Nov 2023 • Mu Chen, Zhedong Zheng, Yi Yang
Based on this observation, we propose a depth-aware framework that explicitly leverages depth estimation to mix the categories and facilitate the two complementary tasks, i.e., segmentation and depth learning, in an end-to-end manner.
no code implementations • 20 Nov 2023 • Zhichao Zuo, Zhao Zhang, Yan Luo, Yang Zhao, Haijun Zhang, Yi Yang, Meng Wang
This paper presents a novel framework termed Cut-and-Paste for real-world semantic video editing under the guidance of a text prompt and an additional reference image.
1 code implementation • 20 Nov 2023 • Zhiyuan Min, Yawei Luo, Wei Yang, Yuesong Wang, Yi Yang
Different from existing methods that consider cross-view and along-epipolar information independently, EVE-NeRF conducts view-epipolar feature aggregation in an entangled manner by injecting scene-invariant appearance-continuity and geometry-consistency priors into the aggregation process.
Ranked #1 on Generalizable Novel View Synthesis on Shiny dataset
no code implementations • 20 Nov 2023 • Yanyan Wei, Zhao Zhang, Jiahuan Ren, Xiaogang Xu, Richang Hong, Yi Yang, Shuicheng Yan, Meng Wang
The generalization capability of existing image restoration and enhancement (IRE) methods is constrained by the limited pre-trained datasets, making it difficult to handle agnostic inputs such as different degradation levels and scenarios beyond their design scopes.
no code implementations • 17 Nov 2023 • Yi Yang, Hanyu Duan, Ahmed Abbasi, John P. Lalor, Kar Yan Tam
Although a burgeoning literature has emerged on stereotypical bias mitigation in PLMs, such as work on debiasing gender and racial stereotyping, how such biases manifest and behave internally within PLMs remains largely unknown.
no code implementations • 17 Nov 2023 • Hanyu Duan, Yixuan Tang, Yi Yang, Ahmed Abbasi, Kar Yan Tam
In this work, we explore the relationship between ICL and IT by examining how the hidden states of LLMs change in these two paradigms.
1 code implementation • 14 Nov 2023 • Yi Yang, Qingwen Zhang, Ci Li, Daniel Simões Marta, Nazre Batool, John Folkesson
Autonomous driving has made remarkable advancements in recent years, evolving into a tangible reality.
no code implementations • 27 Oct 2023 • Yucheng Suo, Linchao Zhu, Yi Yang
This task aims to identify the instance mask that is most related to a referring expression without training on pixel-level annotations.
no code implementations • 25 Oct 2023 • Zizhao Zhang, Yi Yang, Lutong Zou, He Wen, Tao Feng, Jiaxuan You
Benefiting from high-quality datasets and standardized evaluation metrics, machine learning (ML) has achieved sustained progress and widespread applications.
no code implementations • 22 Oct 2023 • Hao Di, Yi Yang, Haishan Ye, Xiangyu Chang
Personalization aims to characterize individual preferences and is widely applied across many fields.
1 code implementation • 19 Oct 2023 • Yixuan Tang, Yi Yang, Allen H Huang, Andy Tam, Justin Z Tang
In this work, we introduce an entity-level sentiment classification dataset, called FinEntity, that annotates financial entity spans and their sentiment (positive, neutral, and negative) in financial news.
no code implementations • 19 Oct 2023 • Yue Guo, Chenxi Hu, Yi Yang
Temporal data distribution shift is prevalent in the financial text.
no code implementations • 19 Oct 2023 • Yue Guo, Zian Xu, Yi Yang
This study compares the performance of encoder-only and decoder-only language models.
1 code implementation • 19 Oct 2023 • Barrett Martin Lattimer, Patrick Chen, Xinyuan Zhang, Yi Yang
We introduce SCALE (Source Chunking Approach for Large-scale inconsistency Evaluation), a task-agnostic model for detecting factual inconsistencies using a novel chunking strategy.
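The chunking strategy can be sketched as: split the long source into chunks, score the claim against each chunk, and keep the maximum, so support found anywhere in the source counts. The lexical-overlap scorer here is a toy stand-in for SCALE's learned consistency model, and the example texts are invented.

```python
# Chunk-and-max sketch of large-scale factual inconsistency checking.
def chunks(words, size):
    return [words[i:i + size] for i in range(0, len(words), size)]

def support(claim_words, chunk_words):
    # Toy score: fraction of claim words present in the chunk.
    c, k = set(claim_words), set(chunk_words)
    return len(c & k) / len(c)

def consistency(source, claim, size=8):
    cw = claim.lower().split()
    return max(support(cw, ch) for ch in chunks(source.lower().split(), size))

src = ("the merger closed in june raising revenue sharply "
       "analysts expect margins to recover next quarter")
print(consistency(src, "the merger closed in june"))  # 1.0
```

Chunking keeps each scoring call small regardless of source length, which is what makes the approach tractable for long documents.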
no code implementations • 16 Oct 2023 • Chao Liang, Linchao Zhu, Humphrey Shi, Yi Yang
Sample selection is an effective way to deal with label noise.
no code implementations • 16 Oct 2023 • Zhe Wang, Petar Veličković, Daniel Hennes, Nenad Tomašev, Laurel Prince, Michael Kaisers, Yoram Bachrach, Romuald Elie, Li Kevin Wenliang, Federico Piccinini, William Spearman, Ian Graham, Jerome Connor, Yi Yang, Adrià Recasens, Mina Khan, Nathalie Beauguerlange, Pablo Sprechmann, Pol Moreno, Nicolas Heess, Michael Bowling, Demis Hassabis, Karl Tuyls
The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC.
no code implementations • IEEE Transactions on Multimedia 2023 • Yuanzhi Liang, Linchao Zhu, Xiaohan Wang, Yi Yang
Video captioning is a more challenging task compared to image captioning, primarily due to differences in content density.
Ranked #5 on Video Captioning on VATEX (using extra training data)
no code implementations • ICCV 2023 • Xuanmeng Zhang, Jianfeng Zhang, Rohan Chacko, Hongyi Xu, Guoxian Song, Yi Yang, Jiashi Feng
We study the problem of 3D-aware full-body human generation, aiming at creating animatable human avatars with high-quality textures and geometries.
no code implementations • ICCV 2023 • Liulei Li, Wenguan Wang, Yi Yang
Current high-performance semantic segmentation models are purely data-driven sub-symbolic approaches and blind to the structured nature of the visual world.
1 code implementation • NeurIPS 2023 • Zechuan Zhang, Li Sun, Zongxin Yang, Ling Chen, Yi Yang
Reconstructing 3D clothed human avatars from single images is a challenging task, especially when encountering complex poses and loose clothing.
1 code implementation • 18 Sep 2023 • Kexin Li, Zongxin Yang, Lei Chen, Yi Yang, Jun Xiao
However, existing methods exhibit two limitations: 1) they address video temporal features and audio-visual interactive features separately, disregarding the inherent spatial-temporal dependence of combined audio and video, and 2) they inadequately introduce audio constraints and object-level information during the decoding stage, resulting in segmentation outcomes that fail to comply with audio directives.
1 code implementation • 16 Sep 2023 • Yi Yang, Qingwen Zhang, Thomas Gilles, Nazre Batool, John Folkesson
Although pretraining techniques are growing in popularity, little work has been done on pretrained learning-based motion prediction methods in autonomous driving.
1 code implementation • 15 Sep 2023 • Yi Yang, Yixuan Tang, Kar Yan Tam
We present a new financial domain large language model, InvestLM, tuned on LLaMA-65B (Touvron et al., 2023), using a carefully curated instruction dataset related to financial investment.
no code implementations • 14 Sep 2023 • Yu Gao, Lutong Su, Hao Liang, Yufeng Yue, Yi Yang, Mengyin Fu
In this paper, we propose MC-NeRF, a method that enables joint optimization of both intrinsic and extrinsic parameters alongside NeRF.
1 code implementation • 13 Sep 2023 • Dongwei Ren, Wei Shang, Yi Yang, WangMeng Zuo
To aggregate long-term sharp features from detected sharp frames, we utilize a global Transformer with multi-scale matching capability.
1 code implementation • ICCV 2023 • Yuan Gan, Zongxin Yang, Xihang Yue, Lingyun Sun, Yi Yang
Audio-driven talking-head synthesis is a popular research topic for virtual human-related applications.
no code implementations • 10 Sep 2023 • Shuangkang Fang, Yufeng Wang, Yi Yang, Yi-Hsuan Tsai, Wenrui Ding, Shuchang Zhou, Ming-Hsuan Yang
To tackle these issues, we introduce a text-driven editing method, termed DN2N, which allows for the direct acquisition of a NeRF model with universal editing capabilities, eliminating the requirement for retraining.
1 code implementation • 4 Sep 2023 • Yunhong Lou, Linchao Zhu, Yaxiong Wang, Xiaohan Wang, Yi Yang
We present DiverseMotion, a new approach for synthesizing high-quality human motions conditioned on textual descriptions while preserving motion diversity. Despite recent significant progress in text-based human motion generation, existing methods often prioritize fitting training motions at the expense of action diversity.
Ranked #3 on Motion Synthesis on HumanML3D (using extra training data)
no code implementations • 30 Aug 2023 • Mel Vecerik, Carl Doersch, Yi Yang, Todor Davchev, Yusuf Aytar, Guangyao Zhou, Raia Hadsell, Lourdes Agapito, Jon Scholz
For robots to be useful outside labs and specialized factories we need a way to teach them new useful behaviors quickly.
no code implementations • 29 Aug 2023 • Yukun Su, Yi Yang
With the development of information technology, robot technology has made great progress in various fields.
1 code implementation • ICCV 2023 • Yuanyou Xu, Zongxin Yang, Yi Yang
Tracking any given object(s) spatially and temporally is a common purpose in Visual Object Tracking (VOT) and Video Object Segmentation (VOS).
Ranked #11 on Visual Object Tracking on LaSOT
no code implementations • ICCV 2023 • Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang
Recent advances in semi-supervised semantic segmentation have been heavily reliant on pseudo labeling to compensate for limited labeled data, disregarding the valuable relational knowledge among semantic concepts.
no code implementations • ICCV 2023 • Jinyu Chen, Wenguan Wang, Si Liu, Hongsheng Li, Yi Yang
CCPD transfers the fundamental, point-to-point wayfinding skill that is well trained on the large-scale PointGoal task to ORAN, so as to help ORAN to better master audio-visual navigation with far fewer training samples.
1 code implementation • ICCV 2023 • Lin Li, Guikun Chen, Jun Xiao, Yi Yang, Chunping Wang, Long Chen
Specifically, we first decompose each relation triplet feature into two components: intrinsic feature and extrinsic feature, which correspond to the intrinsic characteristics and extrinsic contexts of a relation triplet, respectively.
1 code implementation • ICCV 2023 • Rui Liu, Xiaohan Wang, Wenguan Wang, Yi Yang
Vision-language navigation (VLN), which entails an agent to navigate 3D environments following human instructions, has shown great advances.
no code implementations • 31 Jul 2023 • Yue Zhang, Hehe Fan, Yi Yang, Mohan Kankanhalli
The proposed method, named Mixture of Depth and Point cloud video experts (DPMix), achieved the first place in the 4D Action Segmentation Track of the HOI4D Challenge 2023.
1 code implementation • ICCV 2023 • Jiahao Li, Zongxin Yang, Xiaohan Wang, Jianxin Ma, Chang Zhou, Yi Yang
Our method includes an encoder-decoder transformer architecture that fuses 2D and 3D representations to achieve 2D&3D-aligned results in a coarse-to-fine manner, and a novel 3D joint contrastive learning approach that explicitly adds global supervision for the 3D feature space.
1 code implementation • ICCV 2023 • Tuo Feng, Wenguan Wang, Xiaohan Wang, Yi Yang, Qinghua Zheng
The mined patterns are, in turn, used to repaint the embedding space, so as to respect the underlying distribution of the entire training dataset and improve the robustness to the variations.
1 code implementation • 25 Jul 2023 • Haitian Zeng, Xiaohan Wang, Wenguan Wang, Yi Yang
We introduce a novel speaker model, KEFA, for navigation instruction generation.
1 code implementation • 24 Jul 2023 • Yuanzhi Liang, Linchao Zhu, Yi Yang
MOE challenges models to understand characters' intentions and accurately determine their actions within intricate contexts involving multi-character and novel object interactions.
no code implementations • ICCV 2023 • Xiao Pan, Zongxin Yang, Jianxin Ma, Chang Zhou, Yi Yang
However, such SPC-based representation i) optimizes under a volatile observation space, which leads to pose misalignment between the training and inference stages, and ii) lacks the global relationships among human parts that are critical for handling the incomplete painted SMPL.
no code implementations • 13 Jul 2023 • Shuo Huang, Zongxin Yang, Liangting Li, Yi Yang, Jia Jia
Large-scale pre-trained vision-language models allow for the zero-shot text-based generation of 3D avatars.
1 code implementation • 10 Jul 2023 • Meng Li, Yahan Yu, Yi Yang, Guanghao Ren, Jian Wang
In this paper, we propose a deep learning-based character stroke extraction method that takes semantic features and prior information of strokes into consideration.
no code implementations • 5 Jul 2023 • Jiahao Li, Yuanyou Xu, Zongxin Yang, Yi Yang, Yueting Zhuang
The Associating Objects with Transformers (AOT) framework has exhibited exceptional performance in a wide range of complex scenarios for video object segmentation.
no code implementations • 5 Jul 2023 • Yuanyou Xu, Jiahao Li, Zongxin Yang, Yi Yang, Yueting Zhuang
MSDeAOT efficiently propagates object masks from previous frames to the current frame using two feature scales of 16 and 8.
1 code implementation • 3 Jul 2023 • Chao Liang, Zongxin Yang, Linchao Zhu, Yi Yang
In real-world scenarios, collected and annotated data often exhibit the characteristics of multiple classes and long-tailed distribution.
no code implementations • 25 Jun 2023 • Zhoufutu Wen, Xinyu Zhao, Zhipeng Jin, Yi Yang, Wei Jia, Xiaodong Chen, Shuanglong Li, Lin Liu
The core of DIA is a query-image matching module performing ad image retrieval and relevance modeling.
1 code implementation • 15 Jun 2023 • Jiayi Shao, Xiaohan Wang, Ruijie Quan, Yi Yang
This report presents ReLER submission to two tracks in the Ego4D Episodic Memory Benchmark in CVPR 2023, including Natural Language Queries and Moment Queries.
Ranked #1 on Moment Queries on Ego4D
1 code implementation • ICCV 2023 • Carl Doersch, Yi Yang, Mel Vecerik, Dilara Gokay, Ankush Gupta, Yusuf Aytar, Joao Carreira, Andrew Zisserman
We present a novel model for Tracking Any Point (TAP) that effectively tracks any queried point on any physical surface throughout a video sequence.
Ranked #1 on Visual Tracking on Kinetics
no code implementations • 10 Jun 2023 • Shuo Huang, Jia Jia, Zongxin Yang, Wei Wang, Haozhe Wu, Yi Yang, Junliang Xing
However, motion interpolation is a more complex problem that takes isolated poses (e.g., only one start pose and one end pose) as input.
no code implementations • 3 Jun 2023 • Xu Zhang, Zhedong Zheng, Xiaohan Wang, Yi Yang
We propose a novel Consensus Network (Css-Net) that self-adaptively learns from noisy triplets to minimize the negative effects of triplet ambiguity.
1 code implementation • 2 Jun 2023 • Xiaoyong Mei, Yi Yang, Ming Li, Changqin Huang, Kai Zhang, Pietro Lió
In this study, we propose a feature reuse framework that guides the step-by-step texture reconstruction process through different stages, reducing the negative impacts of perceptual and adversarial loss.
1 code implementation • 29 May 2023 • Shuai Zhao, Xiaohan Wang, Linchao Zhu, Yi Yang
Given a single test sample, the VLM is forced to maximize the CLIP reward between the input and sampled results from the VLM output distribution.
1 code implementation • 28 May 2023 • Wenjie Zhuo, Yifan Sun, Xiaohan Wang, Linchao Zhu, Yi Yang
Consequently, using multiple positive samples with enhanced diversity further improves contrastive learning due to better alignment.
no code implementations • ICCV 2023 • Jiayi Shao, Xiaohan Wang, Ruijie Quan, Junjun Zheng, Jiang Yang, Yi Yang
Temporal action localization (TAL), which involves recognizing and locating action instances, is a challenging task in video understanding.
Ranked #9 on Temporal Action Localization on THUMOS’14
no code implementations • 24 May 2023 • Feifei Shao, Yawei Luo, Lei Chen, Ping Liu, Wei Yang, Yi Yang, Jun Xiao
In this paper, we conduct a thorough causal analysis to investigate the origins of biased activation.
2 code implementations • NeurIPS 2023 • Viorica Pătrăucean, Lucas Smaira, Ankush Gupta, Adrià Recasens Continente, Larisa Markeeva, Dylan Banarse, Skanda Koppula, Joseph Heyward, Mateusz Malinowski, Yi Yang, Carl Doersch, Tatiana Matejovicova, Yury Sulsky, Antoine Miech, Alex Frechette, Hanna Klimczak, Raphael Koster, Junlin Zhang, Stephanie Winkler, Yusuf Aytar, Simon Osindero, Dima Damen, Andrew Zisserman, João Carreira
We propose a novel multimodal video benchmark - the Perception Test - to evaluate the perception and reasoning skills of pre-trained multimodal models (e.g., Flamingo, SeViLA, or GPT-4).
1 code implementation • 23 May 2023 • Shuai Zhao, Ruijie Quan, Linchao Zhu, Yi Yang
With such merits, we transform CLIP into a scene text reader and introduce CLIP4STR, a simple yet effective STR method built upon image and text encoders of CLIP.
Ranked #1 on Scene Text Recognition on Uber-Text
1 code implementation • 22 May 2023 • Kezhou Lin, Xiaohan Wang, Linchao Zhu, Ke Sun, Bang Zhang, Yi Yang
In this paper, we tackle the problem of sign language translation (SLT) without gloss annotations.
1 code implementation • 22 May 2023 • Jinliang Deng, Xiusi Chen, Renhe Jiang, Du Yin, Yi Yang, Xuan Song, Ivor W. Tsang
The core issue in MTS forecasting is how to effectively model complex spatial-temporal patterns.
Ranked #1 on Time Series Forecasting on Weather (96)
no code implementations • 22 May 2023 • Xingjian He, Sihan Chen, Fan Ma, Zhicheng Huang, Xiaojie Jin, Zikang Liu, Dongmei Fu, Yi Yang, Jing Liu, Jiashi Feng
Towards this goal, we propose a novel video-text pre-training method dubbed VLAB: Video Language pre-training by feature Adapting and Blending, which transfers CLIP representations to video pre-training tasks and develops unified video multimodal models for a wide range of video-text tasks.
Ranked #1 on Visual Question Answering (VQA) on MSVD-QA (using extra training data)
1 code implementation • 20 May 2023 • Yi Yang, Hejie Cui, Carl Yang
The human brain is the central hub of the neurobiological system, controlling behavior and cognition in complex ways.
1 code implementation • NeurIPS 2023 • Guangyan Chen, Meiling Wang, Yi Yang, Kai Yu, Li Yuan, Yufeng Yue
Large language models (LLMs) based on the generative pre-training transformer (GPT) have demonstrated remarkable effectiveness across a diverse range of downstream tasks.
Ranked #3 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)
1 code implementation • 17 May 2023 • Dewei Zhou, Zongxin Yang, Yi Yang
Recovering noise-covered details from low-light images is challenging, and the results given by previous methods leave room for improvement.
Ranked #6 on Low-Light Image Enhancement on LOL
1 code implementation • 11 May 2023 • Yangming Cheng, Liulei Li, Yuanyou Xu, Xiaodi Li, Zongxin Yang, Wenguan Wang, Yi Yang
This report presents a framework called Segment And Track Anything (SAMTrack) that allows users to precisely and effectively segment and track any object in a video.
2 code implementations • 8 May 2023 • Yuanyou Xu, Zongxin Yang, Yi Yang
Considering the challenges in panoptic VOS, we propose a strong baseline method named panoptic object association with transformers (PAOT), which uses panoptic identification to associate objects with a pyramid architecture on multiple scales.
1 code implementation • 26 Apr 2023 • Bingqian Lin, Zicong Chen, Mingjie Li, Haokun Lin, Hang Xu, Yi Zhu, Jianzhuang Liu, Wenjia Cai, Lei Yang, Shen Zhao, Chenfei Wu, Ling Chen, Xiaojun Chang, Yi Yang, Lei Xing, Xiaodan Liang
In MOTOR, we combine two kinds of basic medical knowledge, i.e., general and specific knowledge, in a complementary manner to boost the general pretraining process.
2 code implementations • 20 Apr 2023 • Wenhao Wang, Yifan Sun, Yi Yang
Video Copy Detection (VCD) has been developed to identify instances of unauthorized or duplicated video content.
no code implementations • CVPR 2023 • Zongheng Tang, Yifan Sun, Si Liu, Yi Yang
Second, through our design, the object queries and the foreground query in the decoder share consensus on the class semantics, therefore making the strong and weak supervision mutually benefit each other for domain alignment.
no code implementations • CVPR 2023 • Yaowei Li, Ruijie Quan, Linchao Zhu, Yi Yang
Large-scale pre-training has brought unimodal fields such as computer vision and natural language processing to a new era.
1 code implementation • NeurIPS 2023 • Wenhao Wang, Yifan Sun, Wei Li, Yi Yang
This paper explores a hierarchical prompting mechanism for the hierarchical image classification (HIC) task.
1 code implementation • 8 Apr 2023 • Shuangkang Fang, Yufeng Wang, Yi Yang, Weixin Xu, Heng Wang, Wenrui Ding, Shuchang Zhou
To address this limitation and maximize the potential of each architecture, we propose Progressive Volume Distillation with Active Learning (PVD-AL), a systematic distillation method that enables any-to-any conversions between different architectures.
1 code implementation • 6 Apr 2023 • Jiancan Wu, Yi Yang, Yuchun Qian, Yongduo Sui, Xiang Wang, Xiangnan He
Then, we identify the crux of why traditional influence functions fail for graph unlearning, and devise Graph Influence Function (GIF), a model-agnostic unlearning method that can efficiently and accurately estimate parameter changes in response to an $\epsilon$-mass perturbation in deleted data.
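The influence-function machinery that GIF adapts to graph unlearning can be sketched on a problem where everything is closed-form: estimating how ridge-regression parameters shift when one training point is removed, via the classic approximation Δθ ≈ (1/n) H⁻¹ ∇ℓ(z_removed, θ̂). This is a generic illustration under stated assumptions, not the paper's method or API; all names are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam=0.1):
    # Minimizer of (1/n) * sum of squared errors / 2 + (lam/2) * ||theta||^2
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

lam = 0.1
theta = ridge_fit(X, y, lam)
n, d = X.shape
H = X.T @ X / n + lam * np.eye(d)             # Hessian of the regularized risk

i = 7                                          # index of the point to "unlearn"
grad_i = (X[i] @ theta - y[i]) * X[i]          # gradient of the removed point's loss
theta_if = theta + np.linalg.solve(H, grad_i) / n   # influence-function estimate

# Compare against an exact retrain without point i.
theta_exact = ridge_fit(np.delete(X, i, 0), np.delete(y, i, 0), lam)
print(np.linalg.norm(theta_if - theta_exact))  # much closer than doing nothing
```

The point of the estimate is that it avoids retraining: one Hessian solve replaces a full re-fit of the model on the remaining data.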
2 code implementations • CVPR 2023 • Wei Shang, Dongwei Ren, Yi Yang, Hongzhi Zhang, Kede Ma, WangMeng Zuo
Moreover, on the seemingly implausible x16 interpolation task, our method outperforms existing methods by more than 1.5 dB in terms of PSNR.
1 code implementation • CVPR 2023 • Xiaolong Shen, Zongxin Yang, Xiaohan Wang, Jianxin Ma, Chang Zhou, Yi Yang
However, using a single kind of modeling structure is difficult to balance the learning of short-term and long-term temporal correlations, and may bias the network to one of them, leading to undesirable predictions like global location shift, temporal inconsistency, and insufficient local details.
Ranked #46 on 3D Human Pose Estimation on 3DPW
no code implementations • 26 Mar 2023 • Xihan Wang, Xi Xu, Yu Gao, Yi Yang, Yufeng Yue, Mengyin Fu
Compared with previous work on multi-point representation, the experiments show that CRRS improves training performance in both accuracy and stability.
no code implementations • 26 Mar 2023 • Dianyi Yang, Jiadong Tang, Yu Gao, Yi Yang, Mengyin Fu
This leads to poor performance on some fisheye vision tasks.
1 code implementation • 23 Mar 2023 • Wenqing Wang, Yawei Luo, Zhiqing Chen, Tao Jiang, Lei Chen, Yi Yang, Jun Xiao
Specifically, DLL decouples the predicate labels and adopts separate classifiers to learn actional and spatial patterns respectively.
Ranked #1 on Video scene graph generation on ImageNet-VidVRD
no code implementations • 20 Mar 2023 • Xingchen Li, Long Chen, Guikun Chen, Yinfu Feng, Yi Yang, Jun Xiao
To this end, we propose a novel Decomposed Prototype Learning (DPL).
1 code implementation • 18 Mar 2023 • Fanglei Xue, Yifan Sun, Yi Yang
This paper explores an expression-related self-supervised learning (SSL) method (ContraWarping) to perform expression classification in the 5th Affective Behavior Analysis in-the-wild (ABAW) competition.
1 code implementation • CVPR 2023 • Liulei Li, Wenguan Wang, Tianfei Zhou, Jianwu Li, Yi Yang
The objective of this paper is self-supervised learning of video object segmentation.
1 code implementation • 16 Mar 2023 • Fanglei Xue, Yifan Sun, Yi Yang
Therefore, given a facial image, ContraWarping employs some global transformations and local warping to generate its positive and negative samples and sets up a novel contrastive learning framework.
1 code implementation • CVPR 2023 • Xiaohan Wang, Wenguan Wang, Jiayi Shao, Yi Yang
Recently, visual-language navigation (VLN) -- which requires robot agents to follow navigation instructions -- has made great progress.
1 code implementation • 6 Mar 2023 • Wei Li, Linchao Zhu, Longyin Wen, Yi Yang
This decoder is both data-efficient and computation-efficient: 1) it only requires the text data for training, easing the burden on the collection of paired data.
no code implementations • 1 Mar 2023 • Jingli Shi, Weihua Li, Quan Bai, Yi Yang, Jianhua Jiang
Aspect term extraction is a fundamental task in fine-grained sentiment analysis, which aims at detecting customers' opinion targets from reviews of products or services.
no code implementations • 22 Jan 2023 • Juncheng Li, Siliang Tang, Linchao Zhu, Wenqiao Zhang, Yi Yang, Tat-Seng Chua, Fei Wu, Yueting Zhuang
To systematically benchmark the compositional generalizability of temporal grounding models, we introduce a new Compositional Temporal Grounding task and construct two new dataset splits, i.e., Charades-CG and ActivityNet-CG.
no code implementations • 18 Jan 2023 • Fan Ma, Xiaojie Jin, Heng Wang, Jingjia Huang, Linchao Zhu, Jiashi Feng, Yi Yang
Specifically, text-video localization consists of moment retrieval, which predicts start and end boundaries in videos given the text description, and text localization, which matches the subset of texts with the video features.
no code implementations • 17 Jan 2023 • Yu Gao, Xi Xu, Tianji Jiang, Siyuan Chen, Yi Yang, Yufeng Yue, Mengyin Fu
For example, 2D object detection usually requires a large amount of 2D annotation data with high cost.
1 code implementation • 3 Jan 2023 • Zhen Yao, Wen Zhang, Mingyang Chen, Yufeng Huang, Yi Yang, Huajun Chen
And in AnKGE, we train an analogy function for each level of analogical inference with the original element embedding from a well-trained KGE model as input, which outputs the analogical object embedding.
1 code implementation • 3 Jan 2023 • Feifei Shao, Yawei Luo, Fei Gao, Yi Yang, Jun Xiao
Previous weakly-supervised object localization (WSOL) methods aim to expand activation map discriminative areas to cover the whole objects, yet neglect two inherent challenges when relying solely on image-level labels.
no code implementations • CVPR 2023 • Jiaxu Miao, Zongxin Yang, Leilei Fan, Yi Yang
In this work, we propose FedSeg, a basic federated learning approach for class-heterogeneous semantic segmentation.
no code implementations • CVPR 2023 • Guangrui Li, Guoliang Kang, Xiaohan Wang, Yunchao Wei, Yi Yang
With the help of adversarial training, the masking module can learn to generate source masks to mimic the pattern of irregular target noise, thereby narrowing the domain gap.
no code implementations • CVPR 2023 • Hehe Fan, Linchao Zhu, Yi Yang, Mohan Kankanhalli
Deep neural networks on regular 1D lists (e.g., natural languages) and irregular 3D sets (e.g., point clouds) have made tremendous achievements.
1 code implementation • ICCV 2023 • Guangyan Chen, Meiling Wang, Li Yuan, Yi Yang, Yufeng Yue
In this paper, a critical observation is made that the invisible parts of each point cloud can be directly utilized as inherent masks, and the aligned point cloud pair can be regarded as the reconstruction target.
1 code implementation • ICCV 2023 • Liangqi Li, Jiaxu Miao, Dahu Shi, Wenming Tan, Ye Ren, Yi Yang, ShiLiang Pu
Current methods for open-vocabulary object detection (OVOD) rely on a pre-trained vision-language model (VLM) to acquire the recognition ability.
1 code implementation • ICCV 2023 • Heng Zhao, Shenxing Wei, Dahu Shi, Wenming Tan, Zheyang Li, Ye Ren, Xing Wei, Yi Yang, ShiLiang Pu
Taking the symmetry properties of objects into consideration, we design a symmetry-aware matching loss to facilitate the learning of dense point-wise geometry features and improve the performance considerably.
no code implementations • CVPR 2023 • Tianyi Ma, Yifan Sun, Zongxin Yang, Yi Yang
Based on these two common practices, the key point of ProD is using the prompting mechanism in the transformer to disentangle the domain-general (DG) and domain-specific (DS) knowledge from the backbone feature.
1 code implementation • CVPR 2023 • Chao Wang, Zhedong Zheng, Ruijie Quan, Yifan Sun, Yi Yang
(2) The conventional paradigm usually focuses on mining the abnormal pattern of a superimposed image to separate the noise, which de facto conflicts with the primary image restoration task.
1 code implementation • ICCV 2023 • Yuanzhi Liang, Xiaohan Wang, Linchao Zhu, Yi Yang
Experimental results and visualizations, based on a large-scale dataset PartNet-Mobility, show the effectiveness of MAAL in learning multi-modal data and solving the 3D articulated object affordance problem.
5 code implementations • CVPR 2023 • Wenhao Wu, Xiaohan Wang, Haipeng Luo, Jingdong Wang, Yi Yang, Wanli Ouyang
In this paper, we propose a novel framework called BIKE, which utilizes the cross-modal bridge to explore bidirectional knowledge: i) We introduce the Video Attribute Association mechanism, which leverages the Video-to-Text knowledge to generate textual auxiliary attributes for complementing video recognition.
Ranked #1 on Zero-Shot Action Recognition on ActivityNet
no code implementations • 25 Dec 2022 • Xiaolong Shen, Zhedong Zheng, Yi Yang
As its name suggests, it is made up of two modules: Part-level Spatial Modeling and Part-level Temporal Modeling.
1 code implementation • CVPR 2023 • Difei Gao, Luowei Zhou, Lei Ji, Linchao Zhu, Yi Yang, Mike Zheng Shou
To build Video Question Answering (VideoQA) systems capable of assisting humans in daily activities, seeking answers from long-form videos with diverse and complex events is a must.
Ranked #2 on Video Question Answering on AGQA 2.0 balanced
1 code implementation • 29 Nov 2022 • Shuangkang Fang, Weixin Xu, Heng Wang, Yi Yang, Yufeng Wang, Shuchang Zhou
In this paper, we propose Progressive Volume Distillation (PVD), a systematic distillation method that allows any-to-any conversions between different architectures, including MLP, sparse or low-rank tensors, hashtables and their compositions.
Ranked #1 on Novel View Synthesis on NeRF (Average PSNR metric)
1 code implementation • IEEE Transactions on Neural Networks and Learning Systems 2022 • Yuanzhi Liang, Linchao Zhu, Xiaohan Wang, Yi Yang
Second, we instantiate the loss function and provide a strong baseline for FGVC, where the performance of a naive backbone can be boosted and be comparable with recent methods.
Ranked #28 on Fine-Grained Image Classification on CUB-200-2011
no code implementations • 19 Nov 2022 • Yi Yang, Zhong-Qiu Zhao, Quan Bai, Qing Liu, Weihua Li
Due to the dynamic nature, the proposed algorithms can also estimate true labels online without re-visiting historical data.
no code implementations • 18 Nov 2022 • Yanyan Wei, Zhao Zhang, ZhongQiu Zhao, Yang Zhao, Richang Hong, Yi Yang
Stereo images, containing left- and right-view images with disparity, have recently been utilized to solve low-level vision tasks, e.g., rain removal and super-resolution.
1 code implementation • 17 Nov 2022 • Jiayi Shao, Xiaohan Wang, Yi Yang
Moreover, in order to better capture the long-term temporal dependencies in the long videos, we propose a segment-level recurrence mechanism.
1 code implementation • 15 Nov 2022 • Leilei Gan, Baokui Li, Kun Kuang, Yating Zhang, Lei Wang, Luu Anh Tuan, Yi Yang, Fei Wu
Given the fact description text of a legal case, legal judgment prediction (LJP) aims to predict the case's charge, law article and penalty term.
1 code implementation • 14 Nov 2022 • Mu Chen, Zhedong Zheng, Yi Yang, Tat-Seng Chua
In an attempt to fill this gap, we propose a unified pixel- and patch-wise self-supervised learning framework, called PiPa, for domain adaptive semantic segmentation that facilitates intra-image pixel-wise correlations and patch-wise semantic consistency against different contexts.
Ranked #1 on Semantic Segmentation on SYNTHIA-to-Cityscapes
no code implementations • 11 Nov 2022 • Yong Hong, Deren Li, Shupei Luo, Xin Chen, Yi Yang, Mi Wang
This study proposes an improved end-to-end multi-target tracking algorithm that adapts to multi-view, multi-scale scenes based on the self-attention mechanism of the transformer's encoder-decoder structure.
no code implementations • 10 Nov 2022 • Tingyu Wang, Zhedong Zheng, Zunjie Zhu, Yuhan Gao, Yi Yang, Chenggang Yan
Cross-view geo-localization aims to spot images of the same location shot from two platforms, e.g., the drone platform and the satellite platform.
no code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence 2022 • Chuchu Han, Zhedong Zheng, Kai Su, Dongdong Yu, Zehuan Yuan, Changxin Gao, Nong Sang, Yi Yang
Person search aims at localizing and recognizing query persons from raw video frames, which is a combination of two sub-tasks, i.e., pedestrian detection and person re-identification.
Ranked #3 on Person Search on PRW
no code implementations • 9 Nov 2022 • Zhao Zhang, Suiyi Zhao, Xiaojie Jin, Mingliang Xu, Yi Yang, Shuicheng Yan
In this paper, we present an embarrassingly simple yet effective solution to a seemingly impossible mission, low-light image enhancement (LLIE) without access to any task-related data.
3 code implementations • 7 Nov 2022 • Carl Doersch, Ankush Gupta, Larisa Markeeva, Adrià Recasens, Lucas Smaira, Yusuf Aytar, João Carreira, Andrew Zisserman, Yi Yang
Generic motion understanding from video involves not only tracking objects, but also perceiving how their surfaces deform and move.
no code implementations • 5 Nov 2022 • Zhe Liu, Yun Li, Lina Yao, Xiaojun Chang, Wei Fang, XiaoJun Wu, Yi Yang
We design Semantic Attention (SA) and generative Knowledge Disentanglement (KD) to learn the dependence of feasibility and contextuality, respectively.
no code implementations • 2 Nov 2022 • Huan Zheng, Zhao Zhang, Jicong Fan, Richang Hong, Yi Yang, Shuicheng Yan
Specifically, we present a decoupled interaction module (DIM) that aims for sufficient dual-view information interaction.
no code implementations • 28 Oct 2022 • Wenguan Wang, Yi Yang, Fei Wu
Neural-symbolic computing (NeSy), which pursues the integration of the symbolic and statistical paradigms of cognition, has been an active research area of Artificial Intelligence (AI) for many years.
1 code implementation • 20 Oct 2022 • Zhuo Chen, Wen Zhang, Yufeng Huang, Mingyang Chen, Yuxia Geng, Hongtao Yu, Zhen Bi, Yichi Zhang, Zhen Yao, Wenting Song, Xinliang Wu, Yi Yang, Mingyi Chen, Zhaoyang Lian, YingYing Li, Lei Cheng, Huajun Chen
In this work, we share our experience on tele-knowledge pre-training for fault analysis, a crucial task in telecommunication applications that requires a wide range of knowledge normally found in both machine log data and product documents.
1 code implementation • DeepMind 2022 • Viorica Pătrăucean, Lucas Smaira, Ankush Gupta, Adrià Recasens Continente, Larisa Markeeva, Dylan Banarse, Mateusz Malinowski, Yi Yang, Carl Doersch, Tatiana Matejovicova, Yury Sulsky, Antoine Miech, Skanda Koppula, Alex Frechette, Hanna Klimczak, Raphael Koster, Junlin Zhang, Stephanie Winkler, Yusuf Aytar, Simon Osindero, Dima Damen, Andrew Zisserman and João Carreira
We propose a novel multimodal benchmark – the Perception Test – that aims to extensively evaluate perception and reasoning skills of multimodal models.
no code implementations • 18 Oct 2022 • Ruijun Li, Weihua Li, Yi Yang, Hanyu Wei, Jianhua Jiang, Quan Bai
Recently, diffusion models have been proven to perform remarkably well in text-to-image synthesis tasks in a number of studies, immediately presenting new research opportunities for image generation.
Ranked #1 on Text-to-Image Generation on Multi-Modal-CelebA-HQ
2 code implementations • 18 Oct 2022 • Zongxin Yang, Yi Yang
To solve such a problem and further facilitate the learning of visual embeddings, this paper proposes a Decoupling Features in Hierarchical Propagation (DeAOT) approach.
Ranked #1 on Semi-Supervised Video Object Segmentation on VOT2020
2 code implementations • 13 Oct 2022 • Jian-Wei Zhang, Yifan Sun, Yi Yang, Wei Chen
With a rethink of recent advances, we find that the current FSS framework has deviated far from the supervised segmentation framework: Given the deep features, FSS methods typically use an intricate decoder to perform sophisticated pixel-wise matching, while the supervised segmentation methods use a simple linear classification head.
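The "simple linear classification head" that this entry contrasts with intricate FSS decoders can be made concrete: per-pixel logits are just a linear map of the deep feature at each location, i.e., a 1x1 convolution. A minimal numpy sketch; all shapes and names are illustrative assumptions, not any particular model's layout.

```python
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(64, 8, 8))      # C x H x W deep feature map
W = rng.normal(size=(21, 64)) * 0.01     # num_classes x C linear head (1x1 conv)
b = np.zeros(21)                         # per-class bias

# Per-pixel logits: apply the same linear map at every spatial location.
logits = np.einsum('kc,chw->khw', W, feats) + b[:, None, None]
pred = logits.argmax(0)                  # H x W per-pixel class map
print(pred.shape)                        # (8, 8)
```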
1 code implementation • 8 Oct 2022 • Yi Yang, Chen Zhang, Dawei Song
Recent advances in distilling pretrained language models have discovered that, besides the expressiveness of knowledge, student-friendliness should be taken into consideration to realize a truly knowledgeable teacher.
3 code implementations • 5 Oct 2022 • Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang
Going beyond this, we propose GMMSeg, a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature, class).
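The dense generative-classifier idea behind an entry like GMMSeg can be sketched generically: fit a small Gaussian mixture per class over features, then label each feature by the class with the highest likelihood (uniform class prior). The toy below uses a few plain EM steps with isotropic components on 2-D features; it is an illustrative sketch under these assumptions, not the paper's implementation, and all function names are hypothetical.

```python
import numpy as np

def fit_gmm(X, k=2, iters=20, seed=0):
    # Fit a k-component isotropic Gaussian mixture to X (n x d) with EM.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]          # component means
    var = np.full(k, X.var() + 1e-6)                 # isotropic variances
    pi = np.full(k, 1.0 / k)                         # mixing weights
    for _ in range(iters):
        # E-step: responsibilities under each component
        d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        logp = -0.5 * d2 / var - 0.5 * d * np.log(2 * np.pi * var) + np.log(pi)
        logp -= logp.max(1, keepdims=True)
        r = np.exp(logp); r /= r.sum(1, keepdims=True)
        # M-step: re-estimate parameters from responsibilities
        nk = r.sum(0) + 1e-9
        mu = (r.T @ X) / nk[:, None]
        var = (r * d2).sum(0) / (nk * d) + 1e-6
        pi = nk / n
    return mu, var, pi

def gmm_loglik(X, params):
    # Log-likelihood of each row of X under the fitted mixture.
    mu, var, pi = params
    d = X.shape[1]
    d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
    logp = -0.5 * d2 / var - 0.5 * d * np.log(2 * np.pi * var) + np.log(pi)
    m = logp.max(1, keepdims=True)
    return (m + np.log(np.exp(logp - m).sum(1, keepdims=True))).ravel()

# Two toy "classes" of 2-D features, one mixture model per class.
rng = np.random.default_rng(1)
X0 = rng.normal([0, 0], 0.3, (200, 2))
X1 = rng.normal([3, 3], 0.3, (200, 2))
models = [fit_gmm(X0), fit_gmm(X1)]

def classify(X):
    # Bayes rule with a uniform class prior: argmax of per-class likelihood.
    scores = np.stack([gmm_loglik(X, m) for m in models], 1)
    return scores.argmax(1)

print(classify(np.array([[0.1, -0.2], [2.9, 3.1]])))  # → [0 1]
```

The design point is that such a classifier models the feature distribution itself, so unlikely inputs get low likelihood under every class, which a purely discriminative softmax head cannot express.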
no code implementations • 2 Oct 2022 • Jiahuan Ren, Zhao Zhang, Richang Hong, Mingliang Xu, Yi Yang, Shuicheng Yan
Low-light image enhancement (LLIE) aims at improving the illumination and visibility of dark images with lighting noise.
no code implementations • 30 Sep 2022 • Shuai Zhao, Xiaohan Wang, Linchao Zhu, Yi Yang
In this work, we present a one-stage solution to obtain pre-trained small models without the need for extra teachers, namely, slimmable networks for contrastive self-supervised learning (\emph{SlimCLR}).
no code implementations • 23 Sep 2022 • Tan Yu, Zhipeng Jin, Jie Liu, Yi Yang, Hongliang Fei, Ping Li
To overcome the limitations of behavior ID features in modeling new ads, we exploit the visual content in ads to boost the performance of CTR prediction models.
no code implementations • 19 Sep 2022 • Tan Yu, Jie Liu, Yi Yang, Yi Li, Hongliang Fei, Ping Li
How to pair the video ads with the user search is the core task of Baidu video advertising.
no code implementations • 7 Aug 2022 • Lin Li, Long Chen, Hanrong Shi, Wenxiao Wang, Jian Shao, Yi Yang, Jun Xiao
To this end, we propose a novel model-agnostic Label Semantic Knowledge Distillation (LS-KD) for unbiased SGG.
1 code implementation • 5 Aug 2022 • Feng Zhu, Zongxin Yang, Xin Yu, Yi Yang, Yunchao Wei
In this work, we propose a new online VIS paradigm named Instance As Identity (IAI), which models temporal information for both detection and tracking in an efficient way.
1 code implementation • 3 Aug 2022 • Xingchen Li, Long Chen, Wenbo Ma, Yi Yang, Jun Xiao
However, we argue that most existing WSSGG works only focus on object-consistency, which means the grounded regions should have the same object category label as text entities.
no code implementations • 3 Aug 2022 • Benyuan Sun, Jin Dai, Zihao Liang, Congying Liu, Yi Yang, Bo Bai
SIMT lays the foundation of pre-training with large-scale multi-task multi-domain datasets and is proved essential for stable training in our GPPF experiments.
no code implementations • 27 Jul 2022 • Lin Li, Long Chen, Hanrong Shi, Hanwang Zhang, Yi Yang, Wei Liu, Jun Xiao
To this end, we propose a novel NoIsy label CorrEction and Sample Training strategy for SGG: NICEST.
1 code implementation • 26 Jul 2022 • Wenhao Wang, Yifan Sun, Zongxin Yang, Yi Yang
While model ensemble is common, we show that combining the vision models and vision-language models brings particular benefits from their complementarity and is a key factor to our superiority.
1 code implementation • 20 Jul 2022 • Yi Yang, Chen Zhang, Benyou Wang, Dawei Song
To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets).
1 code implementation • 19 Jul 2022 • Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang
We propose MHR-Net, a novel method for recovering Non-Rigid Shapes from Motion (NRSfM).
no code implementations • 8 Jul 2022 • Yucheng Suo, Zhedong Zheng, Xiaohan Wang, Bang Zhang, Yi Yang
We optimize the two losses and keypoint detector network in an end-to-end manner.