no code implementations • 7 Feb 2024 • Xi Chen, Yang Cai, Yuan Wu, Bo Xiong, Taesung Park
Recently, MBConv blocks, initially designed for efficiency in resource-limited settings and later adapted for cutting-edge image classification, have demonstrated significant potential in image classification tasks.
1 code implementation • 14 Dec 2023 • Bo Xiong, Mojtaba Nayyeri, Linhao Luo, ZiHao Wang, Shirui Pan, Steffen Staab
NestE represents each atomic fact as a $1\times3$ matrix, and each nested relation is modeled as a $3\times3$ matrix that rotates the $1\times3$ atomic fact matrix through matrix multiplication.
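As a rough illustration of the matrix view described above, the following minimal sketch multiplies a $1\times3$ row by a $3\times3$ rotation matrix. It is a toy under stated assumptions, not NestE itself: the entries are plain real scalars, the rotation is a fixed rotation about one axis, and the names `rotation_z`, `matmul_row`, `atomic_fact`, and `nested_relation` are hypothetical. The listing does not describe how the paper parameterizes its matrices.

```python
import math


def rotation_z(theta):
    """Return a 3x3 rotation matrix (rotation by theta about the third axis)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def matmul_row(row, matrix):
    """Multiply a 1x3 row vector by a 3x3 matrix, returning a 1x3 row."""
    return [sum(row[k] * matrix[k][j] for k in range(3)) for j in range(3)]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


# Toy 1x3 "atomic fact": one scalar each for head, relation, tail (assumed values).
atomic_fact = [1.0, 2.0, 3.0]

# Toy 3x3 "nested relation": a quarter-turn rotation (assumed parameterization).
nested_relation = rotation_z(math.pi / 2)

# The nested relation acts on the atomic fact through matrix multiplication.
rotated_fact = matmul_row(atomic_fact, nested_relation)
```

Because the matrix is a rotation, the length of the row is unchanged; only its orientation changes, which is the property the word "rotates" suggests.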
no code implementations • 26 Nov 2023 • Xinyuan Wang, Changqing Su, Bo Xiong
Sparse-view CT reconstruction, aimed at reducing X-ray radiation risks, frequently suffers from image quality degradation, manifested as noise and artifacts.
1 code implementation • 15 Nov 2023 • Zifeng Ding, Heling Cai, Jingpei Wu, Yunpu Ma, Ruotong Liao, Bo Xiong, Volker Tresp
We first input the text descriptions of KG relations into large language models (LLMs) for generating relation representations, and then introduce them into embedding-based TKGF methods.
no code implementations • 3 Nov 2023 • Bo Xiong, Changqing Su, Zihan Lin, You Zhou, Zhaofei Yu
Here, we propose a neural rendering method for CT reconstruction, named Iterative Neural Adaptive Tomography (INeAT), which incorporates iterative posture optimization to effectively counteract the influence of posture perturbations in data, particularly in cases involving significant posture variations.
no code implementations • 14 Oct 2023 • Changqing Su, Zihan Lin, You Zhou, Shuai Wang, Yuhan Gao, Chenggang Yan, Bo Xiong
Moreover, by introducing temporal continuity, our method achieves a superior compression ratio on time-series data of zebrafish blood vessels.
1 code implementation • 4 Sep 2023 • Linhao Luo, Jiaxin Ju, Bo Xiong, Yuan-Fang Li, Gholamreza Haffari, Shirui Pan
Logical rules are essential for uncovering the logical connections between relations, which could improve reasoning performance and provide interpretable results on knowledge graphs (KGs).
1 code implementation • 4 Aug 2023 • Jiapu Wang, Boyue Wang, Meikang Qiu, Shirui Pan, Bo Xiong, Heng Liu, Linhao Luo, Tengfei Liu, Yongli Hu, BaoCai Yin, Wen Gao
Temporal characteristics are prominently evident in a substantial volume of knowledge, which underscores the pivotal role of Temporal Knowledge Graphs (TKGs) in both academia and industry.
1 code implementation • 3 Jun 2023 • Bo Xiong, Mojtaba Nayyeri, Shirui Pan, Steffen Staab
Although some recent works have proposed to embed hyper-relational KGs, these methods fail to capture essential inference patterns of hyper-relational facts such as qualifier monotonicity, qualifier implication, and qualifier mutual exclusion, limiting their generalization capability.
no code implementations • 11 May 2023 • Ming Jin, Guangsi Shi, Yuan-Fang Li, Qingsong Wen, Bo Xiong, Tian Zhou, Shirui Pan
In this paper, we establish a theoretical framework that unravels the expressive power of spectral-temporal GNNs.
no code implementations • 28 Apr 2023 • Yuchen Liu, Natasha Ong, Kaiyan Peng, Bo Xiong, Qifan Wang, Rui Hou, Madian Khabsa, Kaiyue Yang, David Liu, Donald S. Williamson, Hanchao Yu
Our model encodes different views of the input signal and builds several channel-resolution feature stages to process the multiple views of the input at different resolutions in parallel.
no code implementations • 24 Apr 2023 • Bo Xiong, Mojtaba Nayyeri, Ming Jin, Yunjie He, Michael Cochez, Shirui Pan, Steffen Staab
Geometric relational embeddings map relational data as geometric objects that combine vector information suitable for machine learning and structured/relational information for structured/relational reasoning, typically in low dimensions.
no code implementations • 12 Apr 2023 • Jiaying Lu, Jiaming Shen, Bo Xiong, Wenjing Ma, Steffen Staab, Carl Yang
Medical decision-making processes can be enhanced by comprehensive biomedical knowledge bases, which require fusing knowledge graphs constructed from different sources via a uniform index system.
no code implementations • 21 Mar 2023 • Yunjie He, Mojtaba Nayyeri, Bo Xiong, Evgeny Kharlamov, Steffen Staab
However, the role of such patterns in answering FOL queries by query embedding models has not yet been studied in the literature.
4 code implementations • CVPR 2022 • Karttikeya Mangalam, Haoqi Fan, Yanghao Li, Chao-yuan Wu, Bo Xiong, Christoph Feichtenhofer, Jitendra Malik
Reversible Vision Transformers achieve a reduced memory footprint of up to 15.5x at roughly identical model complexity, parameters and accuracy, demonstrating the promise of reversible vision transformers as an efficient backbone for hardware resource limited training regimes.
1 code implementation • 30 Nov 2022 • Yookoon Park, Mahmoud Azab, Bo Xiong, Seungwhan Moon, Florian Metze, Gourab Kundu, Kirmani Ahmed
Cross-modal contrastive learning has led the recent advances in multimodal retrieval with its simplicity and effectiveness.
no code implementations • 1 Jun 2022 • Bo Xiong, Shichao Zhu, Mojtaba Nayyeri, Chengjin Xu, Shirui Pan, Chuan Zhou, Steffen Staab
Recent knowledge graph (KG) embeddings have been advanced by hyperbolic geometry due to its superior capability for representing hierarchies.
1 code implementation • 24 Jan 2022 • Bo Xiong, Nico Potyka, Trung-Kien Tran, Mojtaba Nayyeri, Steffen Staab
Namely, the learned model of BoxEL embedding with loss 0 is a (logical) model of the KB.
1 code implementation • CVPR 2022 • Chao-yuan Wu, Yanghao Li, Karttikeya Mangalam, Haoqi Fan, Bo Xiong, Jitendra Malik, Christoph Feichtenhofer
Instead of trying to process more frames at once like most existing methods, we propose to process videos in an online fashion and cache "memory" at each iteration.
Ranked #3 on Action Anticipation on EPIC-KITCHENS-100 (using extra training data)
7 code implementations • CVPR 2022 • Yanghao Li, Chao-yuan Wu, Haoqi Fan, Karttikeya Mangalam, Bo Xiong, Jitendra Malik, Christoph Feichtenhofer
In this paper, we study Multiscale Vision Transformers (MViTv2) as a unified architecture for image and video classification, as well as object detection.
Ranked #1 on Action Classification on Kinetics-600 (GFLOPs metric)
1 code implementation • 18 Nov 2021 • Haoqi Fan, Tullie Murrell, Heng Wang, Kalyan Vasudev Alwala, Yanghao Li, Yilei Li, Bo Xiong, Nikhila Ravi, Meng Li, Haichuan Yang, Jitendra Malik, Ross Girshick, Matt Feiszli, Aaron Adcock, Wan-Yen Lo, Christoph Feichtenhofer
We introduce PyTorchVideo, an open-source deep-learning library that provides a rich set of modular, efficient, and reproducible components for a variety of video understanding tasks, including classification, detection, self-supervised learning, and low-level processing.
1 code implementation • 6 Jun 2021 • Bo Xiong, Shichao Zhu, Nico Potyka, Shirui Pan, Chuan Zhou, Steffen Staab
Empirical results demonstrate that our method outperforms Riemannian counterparts when embedding graphs of complex topologies.
2 code implementations • CVPR 2021 • Christoph Feichtenhofer, Haoqi Fan, Bo Xiong, Ross Girshick, Kaiming He
We present a large-scale study on unsupervised spatiotemporal representation learning from videos.
Ranked #3 on Self-Supervised Action Recognition on HMDB51
7 code implementations • ICCV 2021 • Haoqi Fan, Bo Xiong, Karttikeya Mangalam, Yanghao Li, Zhicheng Yan, Jitendra Malik, Christoph Feichtenhofer
We evaluate this fundamental architectural prior for modeling the dense nature of visual signals for a variety of video recognition tasks where it outperforms concurrent vision transformers that rely on large scale external pre-training and are 5-10x more costly in computation and parameters.
Ranked #14 on Action Classification on Charades
1 code implementation • CVPR 2021 • Yanghao Li, Tushar Nagarajan, Bo Xiong, Kristen Grauman
We introduce an approach for pre-training egocentric video models using large-scale third-person video datasets.
no code implementations • ICCV 2021 • Bo Xiong, Haoqi Fan, Kristen Grauman, Christoph Feichtenhofer
We present a multiview pseudo-labeling approach to video learning, a novel framework that uses complementary views in the form of appearance and motion information for semi-supervised learning in video.
no code implementations • 18 Nov 2020 • Bo Xiong, Yimin Huang, Hanrong Ye, Steffen Staab, Zhenguo Li
MOFA pursues several rounds of HPO, where each round alternates between exploration of hyperparameter space by factorial design and exploitation of evaluation results by factorial analysis.
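One round of the explore-then-exploit loop described above might look like the following minimal sketch. It rests on several assumptions that the listing does not confirm: a full factorial design is used for simplicity, the "factorial analysis" is reduced to comparing per-level mean scores (main effects), and `toy_objective`, the hyperparameter names, and their levels are hypothetical stand-ins for a real training run.

```python
from itertools import product
from statistics import mean


def factorial_design(levels):
    """Enumerate every combination of levels (a full factorial design)."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[n] for n in names))]


def main_effects(configs, scores, levels):
    """Average score per level of each hyperparameter (a simple factorial analysis)."""
    return {
        name: {lv: mean(s for c, s in zip(configs, scores) if c[name] == lv)
               for lv in values}
        for name, values in levels.items()
    }


def toy_objective(cfg):
    """Hypothetical stand-in for a validation loss; lower is better."""
    return (cfg["lr"] - 0.01) ** 2 + 0.001 * abs(cfg["batch_size"] - 64)


# Exploration: evaluate every configuration in the design.
levels = {"lr": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}
configs = factorial_design(levels)
scores = [toy_objective(c) for c in configs]

# Exploitation: keep, for each hyperparameter, the level with the lowest mean score.
effects = main_effects(configs, scores, levels)
best = {name: min(per_level, key=per_level.get)
        for name, per_level in effects.items()}
```

In a multi-round setting, the levels for the next round would presumably be narrowed around `best`; how MOFA actually does this is not stated in the listing.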
no code implementations • CVPR 2019 • Bo Xiong, Yannis Kalantidis, Deepti Ghadiyaram, Kristen Grauman
Highlight detection has the potential to significantly ease video browsing, but existing methods often suffer from expensive supervision requirements, where human viewers must manually identify highlights in training videos.
no code implementations • ECCV 2018 • Bo Xiong, Kristen Grauman
360° panoramas are a rich medium, yet notoriously difficult to visualize in the 2D image plane.
no code implementations • 11 Aug 2018 • Bo Xiong, Suyog Dutt Jain, Kristen Grauman
We propose an end-to-end learning framework for segmenting generic objects in both images and videos.
no code implementations • 31 Mar 2018 • Bo Xiong, Kristen Grauman
360$^{\circ}$ panoramas are a rich medium, yet notoriously difficult to visualize in the 2D image plane.
4 code implementations • CVPR 2018 • Ruohan Gao, Bo Xiong, Kristen Grauman
Second, we show the power of hallucinated flow for recognition, successfully transferring the learned motion into a standard two-stream network for activity recognition.
no code implementations • CVPR 2017 • Suyog Dutt Jain, Bo Xiong, Kristen Grauman
Our method learns to combine appearance and motion information to produce pixel level segmentation masks for all prominent objects in videos.
no code implementations • 30 Apr 2017 • Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, Kristen Grauman
We propose the ambiguity problem for the foreground object segmentation task and motivate the importance of estimating and accounting for this ambiguity when designing vision systems.
no code implementations • 19 Jan 2017 • Suyog Dutt Jain, Bo Xiong, Kristen Grauman
We propose an end-to-end learning framework for generating foreground object segmentations.
no code implementations • ICCV 2015 • Bo Xiong, Gunhee Kim, Leonid Sigal
To address this, we propose a storyline representation that expresses an egocentric video as a set of story elements, jointly inferred through MRF inference, comprising actors, locations, supporting objects and events, depicted on a timeline.