1 code implementation • ECCV 2020 • Ziheng Cheng, Ruiying Lu, Zhengjue Wang, Hao Zhang, Bo Chen, Ziyi Meng, Xin Yuan
This measurement and the modulation masks are fed into our Recurrent Neural Network (RNN) to reconstruct the desired high-speed frames.
no code implementations • GWC 2018 • Aliaksandr Huminski, Hao Zhang
The procedure of extraction includes three steps and the results are based on the analysis of the whole set of verbs in WordNet.
no code implementations • EMNLP 2020 • Zhengjue Wang, Zhibin Duan, Hao Zhang, Chaojie Wang, Long Tian, Bo Chen, Mingyuan Zhou
Abstractive document summarization is a comprehensive task including document understanding and summary generation, an area in which Transformer-based models have achieved state-of-the-art performance.
no code implementations • ACL 2022 • Sicheng Yu, Qianru Sun, Hao Zhang, Jing Jiang
Translate-train is a general training approach to multilingual tasks.
no code implementations • SemEval (NAACL) 2022 • Junyu Lu, Hao Zhang, Tongyue Zhang, Hongbo Wang, Haohao Zhu, Bo Xu, Hongfei Lin
For Subtask B, framed as a multi-label classification problem, we utilize various improved multi-label cross-entropy loss functions and analyze the performance of our method.
1 code implementation • COLING 2022 • Yangjun Wu, Han Wang, Dongxiang Zhang, Gang Chen, Hao Zhang
Specifically, we design five types of templates as instructional prompts, and each template includes a question that acts as the driver to teach UGEN to grasp the paradigm, options that list the candidate intents or slots to reduce the answer search space, and a context that denotes the original utterance.
1 code implementation • 16 May 2024 • Tianhe Ren, Qing Jiang, Shilong Liu, Zhaoyang Zeng, Wenlong Liu, Han Gao, Hongjie Huang, Zhengyu Ma, Xiaoke Jiang, Yihao Chen, Yuda Xiong, Hao Zhang, Feng Li, Peijun Tang, Kent Yu, Lei Zhang
Empirical results demonstrate the effectiveness of Grounding DINO 1.5, with the Grounding DINO 1.5 Pro model attaining a 54.3 AP on the COCO detection benchmark and a 55.7 AP on the LVIS-minival zero-shot transfer benchmark, setting new records for open-set object detection.
no code implementations • 9 May 2024 • Wenwen Zhang, Hao Zhang, Zenan Jiang, Jing Wang, Amir Servati, Peyman Servati
The wearable gait analysis suit captures the gait cycle, pattern, and parameters for both normal and pathological subjects.
1 code implementation • 30 Apr 2024 • Hang Du, Sicheng Zhang, Binzhu Xie, Guoshun Nan, Jiayang Zhang, Junrui Xu, Hangyu Liu, Sicong Leng, Jiangming Liu, Hehe Fan, Dajiu Huang, Jing Feng, Linli Chen, Can Zhang, Xuhuan Li, Hao Zhang, Jianhang Chen, Qimei Cui, Xiaofeng Tao
In pursuit of these answers, we present a comprehensive benchmark for Causation Understanding of Video Anomaly (CUVA).
no code implementations • 28 Apr 2024 • Huanshuo Liu, Bo Chen, Menghui Zhu, Jianghao Lin, Jiarui Qin, Yang Yang, Hao Zhang, Ruiming Tang
Specifically, a knowledge base, consisting of a retrieval-oriented embedding layer and a knowledge encoder, is designed to preserve and imitate the retrieved & aggregated representations in a decomposition-reconstruction paradigm.
no code implementations • 24 Apr 2024 • Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao
Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation.
no code implementations • 23 Apr 2024 • Kuicai Dong, Derrick Goh Xin Deik, Yi Quan Lee, Hao Zhang, Xiangyang Li, Cong Zhang, Yong Liu
As they do not consider content structures, the resultant chunks can exclude vital information or include irrelevant content.
1 code implementation • 12 Apr 2024 • Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou
The quadratic complexity and weak length extrapolation of Transformers limit their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.
no code implementations • 3 Apr 2024 • Hao Zhang, Fuhui Zhou, Qihui Wu, Naofal Al-Dhahir
Moreover, a modular semi-supervised learning method that combines labeled and unlabeled data using MixMatch is exploited to further improve the classification performance under few-sample conditions.
no code implementations • 3 Apr 2024 • Longfei Yun, Yonghao Zhuang, Yao Fu, Eric P Xing, Hao Zhang
Like dense models, training MoEs requires answering the same question: given a training budget, what is the optimal allocation on the model size and number of tokens?
no code implementations • 1 Apr 2024 • Fenggen Yu, Yiming Qian, Xu Zhang, Francisca Gil-Ureta, Brian Jackson, Eric Bennett, Hao Zhang
We present a differentiable rendering framework to learn structured 3D abstractions in the form of primitive assemblies from sparse RGB images capturing a 3D object.
1 code implementation • 26 Mar 2024 • YuQi Yang, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Bo Li
Furthermore, to control the parameters and computational cost brought by the increase in the number of experts, we take inspiration from LoRA and propose to leverage the low-rank format of a vanilla convolution in the expert network.
1 code implementation • 25 Mar 2024 • Xunpeng Yi, Han Xu, Hao Zhang, Linfeng Tang, Jiayi Ma
Through the text semantic encoder and semantic interaction fusion decoder, Text-IF supports all-in-one degradation-aware processing of infrared and visible images and produces interactive, flexible fusion outcomes.
no code implementations • 21 Mar 2024 • YuQi Yang, Peng-Tao Jiang, Jing Wang, Hao Zhang, Kai Zhao, Jinwei Chen, Bo Li
Multi-modal large language models (MLLMs) can understand image-language prompts and demonstrate impressive reasoning ability.
no code implementations • 19 Mar 2024 • Mingyue Cheng, Xiaoyu Tao, Qi Liu, Hao Zhang, Yiheng Chen, Chenyi Lei
To address this challenge, we propose CrossTimeNet, a novel cross-domain SSL learning framework to learn transferable knowledge from various domains to largely benefit the target downstream task.
no code implementations • 19 Mar 2024 • Hongyang Li, Hao Zhang, Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Lei Zhang
Based on the observation that point tracking bears a great resemblance to object detection and tracking, we borrow designs from DETR-like algorithms to address the task of TAP.
no code implementations • 14 Mar 2024 • Hao Zhang, Wenqi Shao, Hong Liu, Yongqiang Ma, Ping Luo, Yu Qiao, Kaipeng Zhang
To bridge this gap, we introduce AVIBench, a framework designed to analyze the robustness of LVLMs when facing various adversarial visual-instructions (AVIs), including four types of image-based AVIs, ten types of text-based AVIs, and nine types of content bias AVIs (such as gender, violence, cultural, and racial biases, among others).
no code implementations • 13 Mar 2024 • Mingyue Cheng, Hao Zhang, Jiqian Yang, Qi Liu, Li Li, Xin Huang, Liwei Song, Zhi Li, Zhenya Huang, Enhong Chen
Through this gateway, users have the opportunity to submit their questions, testing the models on a personalized and potentially broader range of capabilities.
no code implementations • 12 Mar 2024 • Mingyue Cheng, Hao Zhang, Qi Liu, Fajie Yuan, Zhi Li, Zhenya Huang, Enhong Chen, Jun Zhou, Longfei Li
It is also significant to model the semantic relatedness reflected in content features, e.g., images and text.
1 code implementation • 7 Mar 2024 • Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, Ion Stoica
To address this issue, we introduce Chatbot Arena, an open platform for evaluating LLMs based on human preferences.
1 code implementation • 6 Mar 2024 • Zequn Zeng, Yan Xie, Hao Zhang, Chiyu Chen, Zhengjue Wang, Bo Chen
The MeaCap framework achieves state-of-the-art performance on a series of zero-shot IC settings.
no code implementations • 4 Mar 2024 • Cong Geng, Tian Han, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Søren Hauberg, Bo Li
Generative models have shown strong generation ability while efficient likelihood estimation is less explored.
1 code implementation • 28 Feb 2024 • Siqi Kou, Lanxiang Hu, Zhezhi He, Zhijie Deng, Hao Zhang
Parallel decoding methods such as Jacobi decoding show promise for more efficient LLM inference, as they break the sequential nature of the LLM decoding process and transform it into parallelizable computation.
no code implementations • 5 Feb 2024 • Maham Tanveer, Yizhi Wang, Ruiqi Wang, Nanxuan Zhao, Ali Mahdavi-Amiri, Hao Zhang
We present AnaMoDiff, a novel diffusion-based method for 2D motion analogies that is applied to raw, unannotated videos of articulated characters.
no code implementations • 4 Feb 2024 • Jingyu Hu, Ka-Hei Hui, Zhengzhe Liu, Hao Zhang, Chi-Wing Fu
First, we design the coupled neural shape (CNS) representation for supporting 3D shape editing.
1 code implementation • 3 Feb 2024 • Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang
Autoregressive decoding of large language models (LLMs) is memory-bandwidth bound, resulting in high latency and significant waste of the parallel processing power of modern accelerators.
no code implementations • 2 Feb 2024 • Reyna Abhyankar, Zijian He, Vikranth Srivatsa, Hao Zhang, Yiying Zhang
Large language models are increasingly integrated with external tools and APIs like ChatGPT plugins to extend their capability beyond language-centric tasks.
no code implementations • 30 Jan 2024 • Hao Zhang, Qingfeng Lin, Yang Li, Lei Cheng, Yik-Chung Wu
This problem is even more severe in cell-free networks as there are many of these parameters to be acquired.
no code implementations • 26 Jan 2024 • Zihao Li, Sixu Li, Hao Zhang, Yang Zhou, Siyang Xie, Yunlong Zhang
While perception systems in Connected and Autonomous Vehicles (CAVs), which encompass both communication technologies and advanced sensors, promise to significantly reduce human driving errors, they also expose CAVs to various cyberattacks.
1 code implementation • 25 Jan 2024 • Tianhe Ren, Shilong Liu, Ailing Zeng, Jing Lin, Kunchang Li, He Cao, Jiayu Chen, Xinyu Huang, Yukang Chen, Feng Yan, Zhaoyang Zeng, Hao Zhang, Feng Li, Jie Yang, Hongyang Li, Qing Jiang, Lei Zhang
We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to combine with the segment anything model (SAM).
1 code implementation • 25 Jan 2024 • Mathieu Ravaut, Hao Zhang, Lu Xu, Aixin Sun, Yong Liu
Conversational recommender systems (CRS) aim to recommend relevant items to users by eliciting user preference through natural language conversation.
1 code implementation • 19 Jan 2024 • Hao Zhang, Shuaijie Zhang
Existing research improves regression performance by utilizing the geometric relationship between bounding boxes, while ignoring the impact of the distribution of difficult and easy samples on bounding box regression.
no code implementations • 16 Jan 2024 • Hao Zhang, Fang Li, Samyak Rawlekar, Narendra Ahuja
Our method simultaneously estimates the visible (explicit) representation (3D shapes, colors, camera parameters) and the implicit skeletal representation, from motion cues in the object video without 3D supervision.
1 code implementation • 15 Jan 2024 • Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang
AI for drug discovery has been a research hotspot in recent years, and SMILES-based language models have been increasingly applied in drug molecular design.
no code implementations • 14 Jan 2024 • Shiming Wang, Zhe Ji, Liyao Xiang, Hao Zhang, Xinbing Wang, Chenghu Zhou, Bo Li
However, such methods cannot defend against adaptive attacks, in which an attacker takes a countermove against a known defence strategy.
no code implementations • 10 Jan 2024 • JianQiao Sun, Yudi Su, Hao Zhang, Ziheng Cheng, Zequn Zeng, Zhengjue Wang, Bo Chen, Xin Yuan
To address these problems, in this paper, we propose a novel VC pipeline to generate captions directly from the compressed measurement, which can be captured by a snapshot compressive sensing camera; we dub our model SnapCap.
no code implementations • 9 Jan 2024 • Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.
Ranked #3 on Table-based Fact Verification on TabFact
no code implementations • 6 Jan 2024 • Nianwen Si, Hao Zhang, WeiQiang Zhang
Large language models are known for encoding a vast amount of factual knowledge, but they often become outdated due to the ever-changing nature of external information.
1 code implementation • 5 Jan 2024 • Hao Zhang, Yu-Wing Tai, Chi-Keung Tang
However, simultaneously achieving multi-view consistency and temporal coherence while editing video sequences remains a formidable challenge.
1 code implementation • 29 Dec 2023 • Hao Zhang, Shuaijie Zhang
As an important component of the detector localization branch, bounding box regression loss plays a significant role in object detection tasks.
no code implementations • 28 Dec 2023 • Hao Zhang, Qi Wang, Jun Shi, Shihui Ying, Zhijie Wen
In this paper, we construct a novel Deep Unfolding Network with Spatial Alignment, termed DUN-SA, to appropriately embed the spatial alignment task into the reconstruction process.
1 code implementation • 25 Dec 2023 • Yucong Luo, Mingyue Cheng, Hao Zhang, Junyu Lu, Qi Liu, Enhong Chen
In this study, we propose LLMXRec, a simple yet effective two-stage explainable recommendation framework aimed at further boosting the explanation quality by employing LLMs.
no code implementations • 24 Dec 2023 • Ming Yan, Ruihao Li, Hao Zhang, Hao Wang, Zhilan Yang, Ji Yan
Language agents have shown impressive problem-solving skills within defined settings and brief timelines.
1 code implementation • NeurIPS 2023 • Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang
A central challenge in this field is to generate molecules with specific properties while also producing a wide range of diverse candidates.
no code implementations • 21 Dec 2023 • Peng Gao, Ahmed Jaafar, Brian Reily, Christopher Reardon, Hao Zhang
However, visual observations of an object may not be available when it is referred to, and the number of objects and attributes may also be unbounded in open worlds.
1 code implementation • 12 Dec 2023 • Xueyan Zou, Linjie Li, JianFeng Wang, Jianwei Yang, Mingyu Ding, Zhengyuan Yang, Feng Li, Hao Zhang, Shilong Liu, Arul Aravinthan, Yong Jae Lee, Lijuan Wang
The proposed interface is adaptive to new tasks and new models.
no code implementations • 12 Dec 2023 • Sixu Li, Mohammad Anis, Dominique Lord, Hao Zhang, Yang Zhou, Xinyue Ye
This paper presents a generic analytical framework tailored for surrogate safety measures (SSMs) that is versatile across various highway geometries, capable of encompassing vehicle dynamics of differing dimensionality and fidelity, and suitable for dynamic, real-world environments.
1 code implementation • 12 Dec 2023 • Jingze You, Chao Huang, Hao Zhang
Recently, a novel system identification method based on invariant subspace theory was introduced, aiming to address the identification problem of continuous-time (CT) linear time-invariant (LTI) systems by combining time-domain and frequency-domain methods.
no code implementations • 9 Dec 2023 • Hao Zhang, Fang Li, Lu Qi, Ming-Hsuan Yang, Narendra Ahuja
Addressing Out-Of-Distribution (OOD) Segmentation and Zero-Shot Semantic Segmentation (ZS3) is challenging, necessitating segmenting unseen classes.
1 code implementation • 5 Dec 2023 • Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang
To address this issue, we have created GVC data that allows for the combination of grounding and chat capabilities.
no code implementations • 3 Dec 2023 • Yizhi Wang, Wallace Lira, Wenqi Wang, Ali Mahdavi-Amiri, Hao Zhang
Our key observation is that object slicing is more advantageous than altering views to reveal occluded structures.
1 code implementation • 29 Nov 2023 • Yurui Zhu, Xueyang Fu, Peng-Tao Jiang, Hao Zhang, Qibin Sun, Jinwei Chen, Zheng-Jun Zha, Bo Li
This research focuses on the issue of single-image reflection removal (SIRR) in real-world conditions, examining it from two angles: the collection pipeline of real reflection pairs and the perception of real reflection locations.
no code implementations • 27 Nov 2023 • Nianwen Si, Hao Zhang, Heyu Chang, Wenlin Zhang, Dan Qu, WeiQiang Zhang
We further present evaluation datasets used in existing methods, and finally conclude this survey by presenting the ongoing challenges and future directions.
3 code implementations • 22 Nov 2023 • Feng Li, Qing Jiang, Hao Zhang, Tianhe Ren, Shilong Liu, Xueyan Zou, Huaizhe Xu, Hongyang Li, Chunyuan Li, Jianwei Yang, Lei Zhang, Jianfeng Gao
In-context prompting in large language models (LLMs) has become a prevalent approach to improve zero-shot capabilities, but this idea is less explored in the vision domain.
1 code implementation • 22 Nov 2023 • Zhiqin Chen, Qimin Chen, Hang Zhou, Hao Zhang
To accommodate structural variations in the collection, our network composes each shape by a selected subset of template parts which are affine-transformed.
no code implementations • 19 Nov 2023 • Lv Tang, Peng-Tao Jiang, Zhihao Shen, Hao Zhang, Jinwei Chen, Bo Li
Large Vision-Language Models (LVLMs) have seen burgeoning development and increasing attention recently.
1 code implementation • 9 Nov 2023 • Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li
LLaVA-Plus is a general-purpose multimodal assistant that expands the capabilities of large multimodal models.
Ranked #1 on LMM real-life tasks on Leaderboard
no code implementations • 8 Nov 2023 • Jin-Jian Xu, Hao Zhang, Chao-Sheng Tang, Lin Li, Bin Shi
Experimental results demonstrate that the effectiveness, versatility, and heuristics of the proposed framework have great potential in solving geoscience image recognition problems.
no code implementations • 6 Nov 2023 • Hao Zhang, Zhendong Pang, Jiangpeng Wang, Teng Li
Deep neural networks (DNNs) that tackle the time series classification (TSC) task have provided a promising framework in signal processing.
1 code implementation • 6 Nov 2023 • Hao Zhang, Cong Xu, Shuaijie Zhang
Based on the above, we first analyzed the BBR model and concluded that distinguishing different regression samples and using different scales of auxiliary bounding boxes to calculate losses can effectively accelerate the bounding box regression process.
Ranked #1 on Object Detection on AI-TOD (mAP50 metric)
no code implementations • 6 Nov 2023 • Hao Zhang
In spatial statistics and machine learning, the kernel matrix plays a pivotal role in prediction, classification, and maximum likelihood estimation.
no code implementations • 1 Nov 2023 • Hao Zhang, Mingyue Cheng, Qi Liu, Zhiding Liu, Enhong Chen
Sequential recommender systems (SRS) have gained widespread popularity in recommendation due to their ability to effectively capture dynamic user preferences.
1 code implementation • 29 Oct 2023 • Hao Zhang, Yang Liu, Xiaoyan Liu, Tianming Liang, Gaurav Sharma, Liang Xue, Maozu Guo
We introduce a novel graph-based framework for alleviating key challenges in distantly-supervised relation extraction and demonstrate its effectiveness in the challenging and important domain of biomedical data.
1 code implementation • 28 Oct 2023 • Yangjun Wu, Kebin Fang, Dongxiang Zhang, Han Wang, Hao Zhang, Gang Chen
Structured dropout approaches, such as attention dropout and DropHead, have been investigated to regularize the multi-head attention mechanism in Transformers.
no code implementations • 25 Oct 2023 • Hao Zhang, Fang Li, Narendra Ahuja
Current techniques for NeRF decomposition involve a trade-off between the flexibility of processing open-vocabulary queries and the accuracy of 3D segmentation.
no code implementations • 23 Oct 2023 • Zihao Yan, Fubao Su, Mingyang Wang, Ruizhen Hu, Hao Zhang, Hui Huang
We introduce an active 3D reconstruction method which integrates visual perception, robot-object interaction, and 3D scanning to recover both the exterior and interior, i.e., unexposed, geometries of a target 3D object.
no code implementations • 20 Oct 2023 • Arya D. McCarthy, Hao Zhang, Shankar Kumar, Felix Stahlberg, Ke Wu
One challenge in speech translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations.
no code implementations • 18 Oct 2023 • Wei Huang, Fan Gao, Junting Wang, Hao Zhang
The underwater Sound Speed Profile (SSP) distribution has a great influence on the propagation mode of acoustic signals; thus, fast and accurate estimation of the SSP is of great importance in building underwater observation systems.
no code implementations • 17 Oct 2023 • Juzhan Xu, Minglun Gong, Hao Zhang, Hui Huang, Ruizhen Hu
We present a novel learning framework to solve the transport-and-packing (TAP) problem in 3D.
3 code implementations • 17 Oct 2023 • Jianwei Yang, Hao Zhang, Feng Li, Xueyan Zou, Chunyuan Li, Jianfeng Gao
We present Set-of-Mark (SoM), a new visual prompting method, to unleash the visual grounding abilities of large multimodal models (LMMs), such as GPT-4V.
no code implementations • 15 Oct 2023 • Huilin Zhou, Huijie Tang, Mingjie Li, Hao Zhang, Zhenyu Liu, Quanshi Zhang
The AI model has surpassed human players in the game of Go, and it is widely believed that the AI model has encoded new knowledge about the Go game beyond human players.
1 code implementation • 13 Oct 2023 • Dongsheng Jiang, Yuchen Liu, Songlin Liu, Jin'e Zhao, Hao Zhang, Zhen Gao, Xiaopeng Zhang, Jin Li, Hongkai Xiong
By simply equipping it with an MLP layer for alignment, DINO surpasses CLIP in fine-grained related perception tasks.
no code implementations • 12 Oct 2023 • Wei Huang, Jixuan Zhou, Fan Gao, Jiajun Lu, Sijia Li, Pengfei Wu, Junting Wang, Hao Zhang, Tianhe Xu
The proposal of SSP inversion methods greatly improves convenience and real-time performance, but their accuracy is not as good as that of the direct measurement method.
no code implementations • 12 Oct 2023 • Wei Huang, Hao Zhang, Kaitao Meng, Fan Gao, Wenzhou Sun, Jianxu Shu, Tianhe Xu, Deshi Li
To tackle this issue, we propose an iterative ray tracing 3D underwater localization (IRTUL) method for stratification compensation.
no code implementations • 11 Oct 2023 • Xiaoxuan Liu, Lanxiang Hu, Peter Bailis, Ion Stoica, Zhijie Deng, Alvin Cheung, Hao Zhang
We develop a prototype of online speculative decoding based on online knowledge distillation and evaluate it using both synthetic and real query data on several popular LLMs.
no code implementations • 8 Oct 2023 • Hao Zhang, Lumin Xu, Shenqi Lai, Wenqi Shao, Nanning Zheng, Ping Luo, Yu Qiao, Kaipeng Zhang
Current image-based keypoint detection methods for animal (including human) bodies and faces are generally divided into fully supervised and few-shot class-agnostic approaches.
1 code implementation • 5 Oct 2023 • Dacheng Li, Rulin Shao, Anze Xie, Eric P. Xing, Xuezhe Ma, Ion Stoica, Joseph E. Gonzalez, Hao Zhang
FlashAttention (Dao, 2023) effectively reduces the quadratic peak memory usage to linear in training transformer-based large language models (LLMs) on a single GPU.
no code implementations • 3 Oct 2023 • Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Xiaolin Jiao
The training of LST consists of two stages: (1) Modality adjustment, where the adapter is tuned to align speech representation with text embedding space, and (2) Downstream task fine-tuning, where both the adapter and LLM model are trained to optimize performance on the E2EST task.
no code implementations • 27 Sep 2023 • Hao Zhang, Yixuan Zhang, Meng Yu, Dong Yu
In this paper, we introduce a novel training framework designed to comprehensively address the acoustic howling issue by examining its fundamental formation process.
no code implementations • 27 Sep 2023 • Yixuan Zhang, Hao Zhang, Meng Yu, Dong Yu
Acoustic howling suppression (AHS) is a critical challenge in audio communication systems.
no code implementations • 25 Sep 2023 • Hao Zhang, Chunyan Feng, Jiahui Yang, Zheng Li, Caili Guo
More importantly, few works consider the background frames that are similar to action frames in pixels but dissimilar in semantics, which also leads to inaccurate temporal boundaries.
1 code implementation • 21 Sep 2023 • Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang
Studying how people interact with large language models (LLMs) in real-world scenarios is increasingly important due to their widespread use in various applications.
1 code implementation • 19 Sep 2023 • Jinye Ran, Guanghua Zhang, Ximei Zhang, Juan Xie, Fan Xia, Hao Zhang
Domain adaptation (DA) has been widely applied in the diabetic retinopathy (DR) grading of unannotated ultra-wide-field (UWF) fundus images, which can transfer annotated knowledge from labeled color fundus images.
1 code implementation • 19 Sep 2023 • Junzhe Jiang, Shang Qu, Mingyue Cheng, Qi Liu, Zhiding Liu, Hao Zhang, Rujiao Zhang, Kai Zhang, Rui Li, Jiatong Li, Min Gao
Recommender systems are indispensable in the realm of online applications, and sequential recommendation has enjoyed considerable prevalence due to its capacity to encapsulate the dynamic shifts in user interests.
no code implementations • 19 Sep 2023 • Vincent Perot, Kai Kang, Florian Luisier, Guolong Su, Xiaoyu Sun, Ramya Sree Boppana, Zilong Wang, Jiaqi Mu, Hao Zhang, Nan Hua
Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP), improving the state of the art on many existing tasks and exhibiting emergent capabilities.
no code implementations • 16 Sep 2023 • Heming Wang, Meng Yu, Hao Zhang, Chunlei Zhang, Zhongweiyang Xu, Muqiao Yang, Yixuan Zhang, Dong Yu
Enhancing speech signal quality in adverse acoustic environments is a persistent challenge in speech processing.
no code implementations • 13 Sep 2023 • Hao Zhang, Yao Feng, Peter Kulits, Yandong Wen, Justus Thies, Michael J. Black
We argue that existing methods are limited because they employ a monolithic modeling approach, using a single representation for the head, face, hair, and accessories.
no code implementations • 13 Sep 2023 • Hao Zhang, Jin-Jian Xu, Hong-Wei Cui, Lin Li, Yaowen Yang, Chao-Sheng Tang, Niklas Boers
Critically, the scalability and generalizability of GFMs empower them to address a wide array of prediction, simulation, and decision tasks related to the intricate interactions among Earth system components.
4 code implementations • 12 Sep 2023 • Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica
On top of it, we build vLLM, an LLM serving system that achieves (1) near-zero waste in KV cache memory and (2) flexible sharing of KV cache within and across requests to further reduce memory usage.
no code implementations • ICCV 2023 • Xunpeng Yi, Han Xu, Hao Zhang, Linfeng Tang, Jiayi Ma
Therefore, Diff-Retinex formulates the low-light image enhancement problem into Retinex decomposition and conditional image generation.
no code implementations • 14 Aug 2023 • Shaan Bijwadia, Shuo-Yiin Chang, Weiran Wang, Zhong Meng, Hao Zhang, Tara N. Sainath
Text injection for automatic speech recognition (ASR), wherein unpaired text-only data is used to supplement paired audio-text data, has shown promising improvements for word error rate.
no code implementations • ICCV 2023 • Hongyang Li, Hao Zhang, Zhaoyang Zeng, Shilong Liu, Feng Li, Tianhe Ren, Lei Zhang
Existing feature lifting approaches, such as Lift-Splat-based and 2D attention-based, either use estimated depth to get pseudo LiDAR features and then splat them to a 3D space, which is a one-pass operation without feature refinement, or ignore depth and lift features by 2D attention mechanisms, which achieve finer semantics while suffering from a depth ambiguity problem.
1 code implementation • 10 Jul 2023 • Feng Li, Hao Zhang, Peize Sun, Xueyan Zou, Shilong Liu, Jianwei Yang, Chunyuan Li, Lei Zhang, Jianfeng Gao
In this paper, we introduce Semantic-SAM, a universal image segmentation model that enables segmenting and recognizing anything at any desired granularity.
1 code implementation • 24 Jun 2023 • Daniel Zou, Xinchen Jin, Xueyang Yu, Hao Zhang, James Demmel
In anticipation of workloads that involve serving many such large models to handle different tasks, we develop Computron, a system that uses memory swapping to serve multiple distributed models on a shared GPU cluster.
no code implementations • 16 Jun 2023 • Hongcheng Gao, Hao Zhang, Yinpeng Dong, Zhijie Deng
Text-to-image (T2I) diffusion models (DMs) have shown promise in generating high-quality images from textual descriptions.
no code implementations • 16 Jun 2023 • Ke Deng, Zhiyuan He, Hao Zhang, Haohan Lin, DeSheng Wang
In future 6G Mobile Edge Computing (MEC), autopilot systems require the capability of processing multimodal data with strong interdependencies.
no code implementations • 14 Jun 2023 • Jingyu Hu, Ka-Hei Hui, Zhengzhe Liu, Hao Zhang, Chi-Wing Fu
This paper presents CLIPXPlore, a new framework that leverages a vision-language model to guide the exploration of the 3D shape space.
1 code implementation • 12 Jun 2023 • Tianhe Ren, Shilong Liu, Feng Li, Hao Zhang, Ailing Zeng, Jie Yang, Xingyu Liao, Ding Jia, Hongyang Li, He Cao, Jianan Wang, Zhaoyang Zeng, Xianbiao Qi, Yuhui Yuan, Jianwei Yang, Lei Zhang
To address this issue, we develop a unified, highly modular, and lightweight codebase called detrex, which supports a majority of the mainstream DETR-based instance recognition algorithms, covering various fundamental tasks, including object detection, segmentation, and pose estimation.
5 code implementations • NeurIPS 2023 • Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, Ion Stoica
Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences.
Ranked #3 on Long-Context Understanding on Ada-LEval (TSort)
1 code implementation • 9 Jun 2023 • Jianghao Lin, Xinyi Dai, Yunjia Xi, Weiwen Liu, Bo Chen, Hao Zhang, Yong Liu, Chuhan Wu, Xiangyang Li, Chenxu Zhu, Huifeng Guo, Yong Yu, Ruiming Tang, Weinan Zhang
In this paper, we conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.
1 code implementation • 8 Jun 2023 • Qimin Chen, Zhiqin Chen, Hang Zhou, Hao Zhang
Furthermore, we showcase the ability of our method to learn geometric details and textures from shapes reconstructed from real-world photos.
no code implementations • 7 Jun 2023 • Aditya Vora, Akshay Gadi Patil, Hao Zhang
We demonstrate that our approach is not only able to complete the surface geometry but also reconstructs surface details to a reasonable extent from a few disparate input views.
2 code implementations • NeurIPS 2023 • Hao Zhang, Yanbo Xu, Tianyuan Dai, Yu-Wing Tai, Chi-Keung Tang
The ability to create high-quality 3D faces from a single image has become increasingly important with wide applications in video conferencing, AR/VR, and advanced video editing in movie industries.
1 code implementation • 30 May 2023 • Jing Wang, Aixin Sun, Hao Zhang, XiaoLi Li
Given a query, the task of Natural Language Video Localization (NLVL) is to localize a temporal moment in an untrimmed video that semantically matches the query.
no code implementations • 29 May 2023 • Dingdong Yang, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang
Our key codes and feature grids are jointly trained continuously with well-defined gradient flows, leading to high usage rates of the feature grids and improved generative modeling compared to discrete Vector Quantization (VQ).
no code implementations • 28 May 2023 • W. Ronny Huang, Hao Zhang, Shankar Kumar, Shuo-Yiin Chang, Tara N. Sainath
We address this limitation by distilling punctuation knowledge from a bidirectional teacher language model (LM) trained on written, punctuated text.
no code implementations • 27 May 2023 • Sukru Yaren Gelbal, Mustafa Ridvan Cantas, Bilin Aksun Guvenc, Levent Guvenc, Gopichandra Surnilla, Hao Zhang
The work we discuss in this paper is related to a mobile application that utilizes the mobile phone sensors and Bluetooth communication to implement Personal Safety Message (PSM) broadcast using the SAE J2735 standard to create a Pedestrian to Vehicle (P2V) based safety warning structure.
no code implementations • 26 May 2023 • Zhijie Deng, Hongcheng Gao, Yibo Miao, Hao Zhang
The detection of machine-generated text, especially from large language models (LLMs), is crucial in preventing serious social problems resulting from their misuse.
1 code implementation • 18 May 2023 • Tingting Wu, Xiao Ding, Minji Tang, Hao Zhang, Bing Qin, Ting Liu
To mitigate the effects of label noise, learning with noisy labels (LNL) methods are designed to achieve better generalization performance.
no code implementations • 4 May 2023 • Chen-Yu Lee, Chun-Liang Li, Hao Zhang, Timothy Dozat, Vincent Perot, Guolong Su, Xiang Zhang, Kihyuk Sohn, Nikolai Glushnev, Renshen Wang, Joshua Ainslie, Shangbang Long, Siyang Qin, Yasuhisa Fujii, Nan Hua, Tomas Pfister
In FormNetV2, we introduce a centralized multimodal graph contrastive learning strategy to unify self-supervised pre-training for all modalities in one loss.
no code implementations • 4 May 2023 • Hao Zhang, Meng Yu, Yuzhong Wu, Tao Yu, Dong Yu
During offline training, a pre-processed signal obtained from the Kalman filter and an ideal microphone signal generated via teacher-forced training strategy are used to train the deep neural network (DNN).
no code implementations • 2 May 2023 • Hao Zhang, Meng Yu, Dong Yu
In particular, the interplay between acoustic echo and acoustic howling in a hybrid meeting makes the joint suppression of them difficult.
3 code implementations • 25 Apr 2023 • Tianhe Ren, Jianwei Yang, Shilong Liu, Ailing Zeng, Feng Li, Hao Zhang, Hongyang Li, Zhaoyang Zeng, Lei Zhang
This work presents Focal-Stable-DINO, a strong and reproducible object detection model which achieves 64.6 AP on COCO val2017 and 64.8 AP on COCO test-dev using only 700M parameters without any test time augmentation.
Ranked #5 on Object Detection on COCO minival (using extra training data)
no code implementations • 20 Apr 2023 • Hao Zhang, Dan Qu, Keji Shao, Xukui Yang
In contrast to the general dropout method, which randomly drops neurons, DropDim drops part of the embedding dimensions.
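The contrast drawn in this snippet can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the one-draw-per-dimension masking, and the inverted-dropout rescaling by 1/(1-p) are all assumptions made for the example.

```python
import random

def element_dropout(x, p, rng):
    """Standard (inverted) dropout: zero each scalar independently with probability p."""
    scale = 1.0 / (1.0 - p)
    return [[0.0 if rng.random() < p else v * scale for v in row] for row in x]

def dimension_dropout(x, p, rng):
    """Assumed DropDim-style variant: one Bernoulli draw per embedding dimension,
    shared by every position in the sequence, so whole columns are zeroed."""
    dim = len(x[0])
    keep = [rng.random() >= p for _ in range(dim)]
    scale = 1.0 / (1.0 - p)
    return [[v * scale if keep[j] else 0.0 for j, v in enumerate(row)] for row in x]

# Toy input: a sequence of 4 positions with 8-dimensional embeddings.
embeddings = [[1.0] * 8 for _ in range(4)]
dropped = dimension_dropout(embeddings, 0.5, random.Random(0))
```

With element-wise dropout the zeros are scattered across the matrix, whereas the dimension-wise variant removes the same coordinates at every position.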
no code implementations • 20 Apr 2023 • Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Wei-Qiang Zhang
However, the final model often performs worse on the MT task than the MT model trained alone, which means that the knowledge transfer ability of this method is also limited.
no code implementations • 20 Apr 2023 • Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Zhen Li
Existing techniques often attempt to transfer knowledge from a powerful machine translation (MT) model to a speech translation (ST) model using elaborate techniques, which often require transcription as extra input during training.
no code implementations • 16 Apr 2023 • Zhifeng Ma, Hao Zhang, Jie Liu
The drastic variation of motion in spatial and temporal dimensions makes the video prediction task extremely challenging.
no code implementations • 13 Apr 2023 • Akshay Gadi Patil, Yiming Qian, Shan Yang, Brian Jackson, Eric Bennett, Hao Zhang
The dominant majority of 3D models that appear in gaming, VR/AR, and those we use to train geometric deep learning algorithms are incomplete, since they are modeled as surface meshes and are missing their interior structures.
2 code implementations • NeurIPS 2023 • Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, JianFeng Wang, Lijuan Wang, Jianfeng Gao, Yong Jae Lee
In SEEM, we propose a novel decoding mechanism that enables diverse prompting for all types of segmentation tasks, aiming at a universal segmentation interface that behaves like large language models (LLMs).
2 code implementations • ICCV 2023 • Shilong Liu, Tianhe Ren, Jiayu Chen, Zhaoyang Zeng, Hao Zhang, Feng Li, Hongyang Li, Jun Huang, Hang Su, Jun Zhu, Lei Zhang
We point out that the unstable matching in DETR is caused by a multi-optimization path problem, which is highlighted by the one-to-one matching design in DETR.
no code implementations • 6 Apr 2023 • Hao Zhang
Then we present a labeled span mechanism to extract the objects and relations simultaneously: it generates labeled spans whose start and end positions indicate the objects, and whose labels correspond to the relations between subject and objects.
1 code implementation • 31 Mar 2023 • Haritz Puerto, Tim Baumgärtner, Rachneet Sachdeva, Haishuo Fang, Hao Zhang, Sewin Tariverdian, Kexin Wang, Iryna Gurevych
To ease research in multi-agent models, we extend UKP-SQuARE, an online platform for QA research, to support three families of multi-agent systems: i) agent selection, ii) early-fusion of agents, and iii) late-fusion of agents.
no code implementations • 21 Mar 2023 • Ruiqi Wang, Akshay Gadi Patil, Fenggen Yu, Hao Zhang
We introduce the first active learning (AL) framework for high-accuracy instance segmentation of moveable parts from RGB images of real indoor scenes.
no code implementations • 18 Mar 2023 • Hao Zhang, Yeo Keat Ee, Basura Fernando
Existing works highlight cues utilizing a specific prompt (e.g., colorful prompt).
Ranked #1 on Visual Abductive Reasoning on SHERLOCK
no code implementations • 16 Mar 2023 • Tsun-Hsuan Wang, Pingchuan Ma, Andrew Everett Spielberg, Zhou Xian, Hao Zhang, Joshua B. Tenenbaum, Daniela Rus, Chuang Gan
Existing work has typically been tailored for particular environments or representations.
1 code implementation • ICCV 2023 • Maham Tanveer, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang
We introduce a novel method to automatically generate an artistic typography by stylizing one or more letter fonts to visually convey the semantics of an input word, while ensuring that the output remains readable.
2 code implementations • ICCV 2023 • Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang
We present OpenSeeD, a simple Open-vocabulary Segmentation and Detection framework that jointly learns from different segmentation and detection datasets.
Ranked #2 on Instance Segmentation on ADE20K val (using extra training data)
1 code implementation • 13 Mar 2023 • Feng Li, Ailing Zeng, Shilong Liu, Hao Zhang, Hongyang Li, Lei Zhang, Lionel M. Ni
Recent DEtection TRansformer-based (DETR) models have obtained remarkable performance.
1 code implementation • CVPR 2023 • Hao Zhang, Feng Li, Huaizhe xu, Shijia Huang, Shilong Liu, Lionel M. Ni, Lei Zhang
We present a mask-piloted Transformer which improves masked-attention in Mask2Former for image segmentation.
7 code implementations • 9 Mar 2023 • Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang
To effectively fuse language and vision modalities, we conceptually divide a closed-set detector into three phases and propose a tight fusion solution, which includes a feature enhancer, a language-guided query selection, and a cross-modality decoder for cross-modality fusion.
Ranked #1 on Zero-Shot Object Detection on MSCOCO
1 code implementation • CVPR 2023 • Zequn Zeng, Hao Zhang, Zhengjue Wang, Ruiying Lu, Dongsheng Wang, Bo Chen
Zero-shot capability has been considered as a new revolution of deep learning, letting machines work on tasks without curated training data.
1 code implementation • 1 Mar 2023 • Mingyue Cheng, Qi Liu, Zhiding Liu, Hao Zhang, Rujiao Zhang, Enhong Chen
In this work, we propose TimeMAE, a novel self-supervised paradigm for learning transferrable time series representations based on transformer networks.
no code implementations • 25 Feb 2023 • Huilin Zhou, Hao Zhang, Huiqi Deng, Dongrui Liu, Wen Shen, Shih-Han Chan, Quanshi Zhang
Therefore, in this paper, we investigate the generalization power of each interactive concept, and we use the generalization power of different interactive concepts to explain the generalization power of the entire DNN.
no code implementations • 25 Feb 2023 • Hao Zhang, Hongyang Li, Ailing Zeng, Feng Li, Shilong Liu, Xingyu Liao, Lei Zhang
To address the second issue, we introduce an auxiliary learning task called Depth-aware Negative Suppression loss.
2 code implementations • 22 Feb 2023 • Zhuohan Li, Lianmin Zheng, Yinmin Zhong, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E. Gonzalez, Ion Stoica
Model parallelism is conventionally viewed as a method to scale a single large deep learning model beyond the memory limits of a single device.
no code implementations • 18 Feb 2023 • Hao Zhang, Meng Yu, Dong Yu
In this paper, we formulate acoustic howling suppression (AHS) as a supervised learning problem and propose a deep learning approach, called Deep AHS, to address it.
no code implementations • 8 Feb 2023 • Muhammad Hassan, Hao Zhang, Ahmed Fateh Ameen, Home Wu Zeng, Shuye Ma, Wen Liang, Dingqi Shang, Jiaming Ding, Ziheng Zhan, Tsz Kwan Lam, Ming Xu, Qiming Huang, Dongmei Wu, Can Yang Zhang, Zhou You, Awiwu Ain, Pei Wu Qin
Our proposed DL models, named FAG-Net and FGC-Net, respectively estimate biological traits (age and gender) and generate fundus images.
no code implementations • 29 Jan 2023 • Yixuan Zhang, Meng Yu, Hao Zhang, Dong Yu, DeLiang Wang
The robustness of the Kalman filter to double talk and its rapid convergence make it a popular approach for addressing acoustic echo cancellation (AEC) challenges.
no code implementations • 25 Jan 2023 • Yonggang Li, Hao Zhang
In this paper, we propose a self-supervised twin network approach based on this a priori.
no code implementations • ICCV 2023 • Fenggen Yu, Yiming Qian, Francisca Gil-Ureta, Brian Jackson, Eric Bennett, Hao Zhang
We present the first active learning tool for fine-grained 3D part labeling, a problem which challenges even the most advanced deep learning (DL) methods due to the significant structural variations among the small and intricate parts.
1 code implementation • 12 Jan 2023 • Zhenfang Chen, Qinhong Zhou, Yikang Shen, Yining Hong, Hao Zhang, Chuang Gan
The see stage scans the image and grounds the visual concept candidates with a visual perception model.
no code implementations • 5 Jan 2023 • Jasmine Collins, Anqi Liang, Jitendra Malik, Hao Zhang, Frédéric Devernay
We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model.
no code implementations • CVPR 2023 • Feng Li, Ailing Zeng, Shilong Liu, Hao Zhang, Hongyang Li, Lei Zhang, Lionel M. Ni
Recent DEtection TRansformer-based (DETR) models have obtained remarkable performance.
no code implementations • 28 Dec 2022 • Hao Zhang, Tingting Wu, Siyao Cheng, Jie Liu
Federated learning (FL) is an emerging paradigm to train models with distributed data from numerous Internet of Things (IoT) devices.
no code implementations • 19 Dec 2022 • Arya D. McCarthy, Hao Zhang, Shankar Kumar, Felix Stahlberg, Axel H. Ng
A challenge in spoken language translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations.
1 code implementation • CVPR 2023 • Yizhi Wang, Zeyu Huang, Ariel Shamir, Hui Huang, Hao Zhang, Ruizhen Hu
We introduce anchored radial observations (ARO), a novel shape encoding for learning implicit field representation of 3D shapes that is category-agnostic and generalizable amid significant shape variations.
no code implementations • 30 Nov 2022 • Hao Zhang, Nan Zhang, Ruixin Zhang, Lei Shen, Yingyi Zhang, Meng Liu
The existing graph methods have demonstrated that 3D geometric information is significant for better performance in MPP.
1 code implementation • 28 Nov 2022 • Shilong Liu, Yaoyuan Liang, Feng Li, Shijia Huang, Hao Zhang, Hang Su, Jun Zhu, Lei Zhang
As phrase extraction can be regarded as a 1D text segmentation problem, we formulate PEG as a dual detection problem and propose a novel DQ-DETR model, which introduces dual queries to probe different features from image and text for object prediction and phrase mask prediction.
Ranked #7 on Referring Expression Comprehension on RefCOCO
1 code implementation • 21 Nov 2022 • Hao Zhang, Tianyuan Dai, Yu-Wing Tai, Chi-Keung Tang
This paper presents the first significant work on directly predicting 3D face landmarks on neural radiance fields (NeRFs).
no code implementations • 15 Nov 2022 • Shijia Huang, Feng Li, Hao Zhang, Shilong Liu, Lei Zhang, LiWei Wang
Our mutual supervision contains two directions.
no code implementations • 14 Nov 2022 • Zifeng Wang, Zizhao Zhang, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Jennifer Dy, Vincent Perot, Tomas Pfister
Zero-shot transfer learning for document understanding is a crucial yet under-investigated scenario to help reduce the high cost involved in annotating document entities.
no code implementations • 10 Nov 2022 • Yonghao Zhuang, Hexu Zhao, Lianmin Zheng, Zhuohan Li, Eric P. Xing, Qirong Ho, Joseph E. Gonzalez, Ion Stoica, Hao Zhang
This pattern emerges when the two paradigms of model parallelism - intra-operator and inter-operator parallelism - are combined to support large models on large clusters.
no code implementations • 9 Nov 2022 • Yangjun Wu, Kebin Fang, Yao Zhao, Hao Zhang, Lifeng Shi, Mengqi Zhang
To accomplish punctuation restoration, most existing methods focus on introducing extra information (e.g., part-of-speech) or addressing the class imbalance problem.
1 code implementation • 2 Nov 2022 • Dacheng Li, Rulin Shao, Hongyi Wang, Han Guo, Eric P. Xing, Hao Zhang
Through extensive evaluations, we show that MPCFORMER significantly speeds up Transformer inference in MPC settings while achieving similar ML performance to the input model.
1 code implementation • 23 Oct 2022 • Zhijie Deng, Jiaxin Shi, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu
Unlike prior spectral methods such as Laplacian Eigenmap that operate in a nonparametric manner, Neural Eigenmap leverages NeuralEF to parametrically model eigenfunctions using a neural network.
no code implementations • 20 Oct 2022 • Zeyu Huang, Juzhan Xu, Sisi Dai, Kai Xu, Hao Zhang, Hui Huang, Ruizhen Hu
Given a few object manipulation demos, NIFT guides the generation of the interaction imitation for a new object instance by matching the Neural Interaction Template (NIT) extracted from the demos in the target Neural Interaction Field (NIF) defined for the new object.
1 code implementation • 19 Oct 2022 • Hao Zhang
A goodness-of-fit metric for LMD similar to the coefficient of determination is defined and used to measure the linear dependency of a set of LMs.
no code implementations • 17 Oct 2022 • Jacqueline R. M. A. Maasch, Hao Zhang, Qian Yang, Fei Wang, Volodymyr Kuleshov
The cost of manual data labeling can be a significant obstacle in supervised learning.
1 code implementation • 13 Oct 2022 • Dacheng Li, Hongyi Wang, Eric Xing, Hao Zhang
Scaling up model sizes can lead to fundamentally new capabilities in many machine learning (ML) tasks.
no code implementations • 11 Oct 2022 • Matthew Brendel, Chang Su, Zilong Bai, Hao Zhang, Olivier Elemento, Fei Wang
Single-cell RNA-sequencing (scRNA-seq) has become a routinely used technique to quantify the gene expression profile of thousands of single cells simultaneously.
1 code implementation • 27 Sep 2022 • Hao Zhang, Hao Wang, Zhen Kan
Automaton based approaches have enabled robots to perform various complex tasks.
1 code implementation • 22 Sep 2022 • Haoyu Hu, Xinyu Yi, Hao Zhang, Jun-Hai Yong, Feng Xu
Single view-based reconstruction of hand-object interaction is challenging due to the severe observation missing caused by occlusions.
no code implementations • 21 Sep 2022 • Yilin Liu, Liqiang Lin, Yue Hu, Ke Xie, Chi-Wing Fu, Hao Zhang, Hui Huang
To reconstruct a new urban scene, we first build the 3D scene proxy, then rely on the reconstruction quality and uncertainty measures predicted by our network, based on the proxy geometry, to guide the drone path planning.
no code implementations • 21 Aug 2022 • Tingting Wu, Xiao Ding, Hao Zhang, Jinglong Gao, Li Du, Bing Qin, Ting Liu
To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful (e.g., easy to hard) sequence.
1 code implementation • 19 Aug 2022 • Rachneet Sachdeva, Haritz Puerto, Tim Baumgärtner, Sewin Tariverdian, Hao Zhang, Kexin Wang, Hossain Shaikh Saadi, Leonardo F. R. Ribeiro, Iryna Gurevych
In this paper, we introduce SQuARE v2, the new version of SQuARE, to provide an explainability infrastructure for comparing models based on methods such as saliency maps and graph-based explanations.
no code implementations • 7 Aug 2022 • Longxiang Jiang, Liyuan Wang, Xinkun Chu, Yonghao Xiao, Hao Zhang
Solving partial differential equations (PDEs) is an important research means in the fields of physics, biology, and chemistry.
1 code implementation • 15 Jul 2022 • Zhicai Wang, Yanbin Hao, Xingyu Gao, Hao Zhang, Shuo Wang, Tingting Mu, Xiangnan He
They use token-mixing layers to capture cross-token interactions, as opposed to the multi-head self-attention mechanism used by Transformers.
1 code implementation • 12 Jul 2022 • Hao Zhang, Lechao Cheng, Yanbin Hao, Chong-Wah Ngo
By replacing a vanilla 2D attention with the LAPS, we could adapt a static transformer into a video one, with zero extra parameters and negligible computation overhead (~2.6%).
no code implementations • 30 Jun 2022 • Rui Ding, Hao Zhang, Fuhui Zhou, Qihui Wu, Zhu Han
In order to tackle these problems, a novel data-and-knowledge dual-driven automatic modulation classification scheme based on radio frequency machine learning is proposed by exploiting the attribute features of different modulations.
1 code implementation • 8 Jun 2022 • Jun Yan, Huilin Yin, Xiaoyang Deng, Ziming Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll
Since adversarial vulnerability can be regarded as a high-frequency phenomenon, it is essential to regulate the adversarially-trained neural network models in the frequency domain.
1 code implementation • 7 Jun 2022 • Zhifeng Ma, Hao Zhang, Jie Liu
Spatiotemporal predictive learning, which predicts future frames through historical prior knowledge with the aid of deep learning, is widely used in many fields.
no code implementations • 7 Jun 2022 • Chi Zhang, Lijuan Liu, Xiaoxue Zang, Frederick Liu, Hao Zhang, Xinying Song, Jindong Chen
Convolutional Neural Networks (CNN) have dominated the field of detection ever since the success of AlexNet in ImageNet classification [12].
9 code implementations • CVPR 2023 • Feng Li, Hao Zhang, Huaizhe xu, Shilong Liu, Lei Zhang, Lionel M. Ni, Heung-Yeung Shum
In this paper we present Mask DINO, a unified object detection and segmentation framework.
Ranked #1 on Panoptic Segmentation on COCO test-dev
no code implementations • 30 May 2022 • Xu Cheng, Hao Zhang, Yue Xin, Wen Shen, Jie Ren, Quanshi Zhang
We also prove that adversarial training tends to strengthen the influence of unconfident input samples with large gradient norms in an exponential manner.
no code implementations • 27 May 2022 • Yushi Cao, Zhiming Li, Tianpei Yang, Hao Zhang, Yan Zheng, Yi Li, Jianye Hao, Yang Liu
In this paper, we combine the above two paradigms together and propose a novel Generalizable Logic Synthesis (GALOIS) framework to synthesize hierarchical and strict cause-effect logic programs.
no code implementations • 23 May 2022 • Hao Zhang, Ruimao Zhang, Zhanglin Peng, Junle Wang, Yanqing Jing
A simple pixel selection strategy followed with the construction of multi-level contrastive units is introduced to optimize the model for both domain adaptation and active supervised learning.
1 code implementation • 15 May 2022 • Cheng Zhang, Hao Zhang, Jie Wang
We present a system called TP3 to perform a downstream task of transformers on generating question-answer pairs (QAPs) from a given article.
no code implementations • 7 May 2022 • Yujia Xue, Siming Zheng, Waleed Tahir, Zhengjue Wang, Hao Zhang, Ziyi Meng, Lei Tian, Xin Yuan
We consider image and video compression on resource-limited platforms.
no code implementations • 5 May 2022 • Neil Jethani, Aahlad Puli, Hao Zhang, Leonid Garber, Lior Jankelson, Yindalon Aphinyanaphongs, Rajesh Ranganath
We found ECG-based assessment outperforms the ADA Risk test, achieving a higher area under the curve (0.80 vs. 0.68) and positive predictive value (13% vs. 9%) -- 2.6 times the prevalence of diabetes in the cohort.
1 code implementation • 26 Apr 2022 • Zixuan Su, Hao Zhang, Jingjing Chen, Lei Pang, Chong-Wah Ngo, Yu-Gang Jiang
Neural networks for visual content understanding have recently evolved from convolutional ones (CNNs) to transformers.
1 code implementation • 7 Apr 2022 • Hao Zhang, Tingting Wu, Siyao Cheng, Jie Liu
On the other hand, it enlarges the distances between local models, resulting in an aggregated global model with poor performance.
1 code implementation • 2 Apr 2022 • Jing-Xiao Liao, Bo-Jian Hou, Hang-Cheng Dong, Hao Zhang, Xiaoge Zhang, Jinwei Sun, Shiping Zhang, Feng-Lei Fan
Encouraged by this inspiring theoretical result on heterogeneous networks, we directly integrate conventional and quadratic neurons in an autoencoder to make a new type of heterogeneous autoencoders.
no code implementations • 18 Mar 2022 • Yang Zhao, Hao Zhang, Xiuyuan Hu
Optimizers in RST would perform a Bernoulli trial at each iteration to choose randomly from base algorithms (SGD) and sharpness-aware algorithms (SAM) with a probability arranged by a predefined scheduling function.
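The per-iteration choice described in this snippet can be sketched as follows. This is a hedged illustration only: the linear schedule and the names `sam_probability` and `choose_optimizer_steps` are hypothetical, and the snippet does not say which scheduling functions the paper actually uses.

```python
import random

def sam_probability(step, total_steps):
    """Hypothetical linear schedule: the chance of a SAM step grows from 0 toward 1."""
    return min(1.0, max(0.0, step / total_steps))

def choose_optimizer_steps(total_steps, schedule, rng):
    """Run one Bernoulli trial per iteration to pick the update rule for that step."""
    choices = []
    for step in range(total_steps):
        use_sam = rng.random() < schedule(step, total_steps)
        choices.append("SAM" if use_sam else "SGD")
    return choices

plan = choose_optimizer_steps(10, sam_probability, random.Random(0))
```

In a real training loop each entry of `plan` would select between a plain SGD update and a sharpness-aware update; here only the selection logic is shown.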
1 code implementation • CVPR 2022 • Yanbin Hao, Hao Zhang, Chong-Wah Ngo, Xiangnan He
By utilizing calibrators to embed features with four different kinds of contexts in parallel, the learnt representation is expected to be more resilient to diverse types of activities.
Ranked #3 on Egocentric Activity Recognition on EGTEA
no code implementations • 9 Mar 2022 • Hao Zhang, Jie Wang
We present a hierarchical neural network model called SemText to detect HTML boilerplate based on a novel semantic representation of HTML tags, class names, and text blocks.
no code implementations • 9 Mar 2022 • Hao Zhang, You Zhou, Jie Wang
We construct a contextual network to represent a document with syntactic and semantic relations between word-sentence pairs, based on which we devise an unsupervised algorithm called CNATAR (Contextual Network And Text Analysis Rank) to score sentences, and rank them through a bi-objective 0-1 knapsack maximization problem over topic analysis and sentence scores.
15 code implementations • 7 Mar 2022 • Hao Zhang, Feng Li, Shilong Liu, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum
Compared to other models on the leaderboard, DINO significantly reduces its model size and pre-training data size while achieving better results.
Ranked #1 on Real-Time Object Detection on COCO 2017 val
no code implementations • 3 Mar 2022 • Feng Li, Hao Zhang, Yi-Fan Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, Pengchuan Zhang, Lei Zhang
This survey is inspired by the remarkable progress in both computer vision and natural language processing, and recent trends shifting from single modality processing to multiple modality comprehension.
16 code implementations • CVPR 2022 • Feng Li, Hao Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, Lei Zhang
Our method is universal and can be easily plugged into any DETR-like methods by adding dozens of lines of code to achieve a remarkable improvement.
no code implementations • 16 Feb 2022 • Hao Zhang, You-Chi Cheng, Shankar Kumar, W. Ronny Huang, Mingqing Chen, Rajiv Mathews
Capitalization normalization (truecasing) is the task of restoring the correct case (uppercase or lowercase) of noisy text.
no code implementations • 13 Feb 2022 • En Yen Puang, Hao Zhang, Hongyuan Zhu, Wei Jing
In this paper we present SA-CNN, a hierarchical and lightweight self-attention based encoding and decoding architecture for representation learning of point cloud data.
1 code implementation • 10 Feb 2022 • Muberra Ozmen, Hao Zhang, Pengyun Wang, Mark Coates
These examples motivate the modelling of multiple types of bi-directional relationships between labels.
1 code implementation • 8 Feb 2022 • Yang Zhao, Hao Zhang, Xiuyuan Hu
In this paper, we propose an effective method to improve the model generalization by additionally penalizing the gradient norm of loss function during optimization.
1 code implementation • 7 Feb 2022 • Yilin He, Chaojie Wang, Hao Zhang, Bo Chen, Mingyuan Zhou
This paper introduces a graph generative process to model how the observed edges are generated by aggregating the node interactions over a set of overlapping node communities, each of which contributes to the edges via a logical OR mechanism.
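A logical OR over communities has a simple probabilistic reading, sketched below. The sketch assumes each community independently proposes the edge with its own probability, which the snippet does not state; the function name `edge_probability` is hypothetical.

```python
def edge_probability(community_probs):
    """Probability that at least one community generates the edge,
    assuming independent contributions: 1 - prod_k (1 - p_k)."""
    p_no_edge = 1.0
    for p in community_probs:
        p_no_edge *= (1.0 - p)
    return 1.0 - p_no_edge

# Two communities, each proposing the edge with probability 0.5.
p_edge = edge_probability([0.5, 0.5])
```

Under this reading an edge is absent only if every community fails to generate it, so adding a community can never lower the edge probability.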
2 code implementations • 4 Feb 2022 • Zhiqin Chen, Andrea Tagliasacchi, Thomas Funkhouser, Hao Zhang
We introduce neural dual contouring (NDC), a new data-driven approach to mesh reconstruction based on dual contouring (DC).