1 code implementation • EMNLP 2021 • Jihao Shi, Xiao Ding, Li Du, Ting Liu, Bing Qin
Many open-domain question answering problems can be cast as a textual entailment task, where a question and candidate answers are concatenated to form hypotheses.
1 code implementation • COLING 2022 • Xiao Ding, Bowen Chen, Li Du, Bing Qin, Ting Liu
To fill the gap, we propose CogBERT, a framework that can induce fine-grained cognitive features from cognitive data and incorporate cognitive features into BERT by adaptively adjusting the weight of cognitive features for different NLP tasks.
no code implementations • 13 Apr 2024 • Yijiang Liu, Rongyu Zhang, Huanrui Yang, Kurt Keutzer, Yuan Du, Li Du, Shanghang Zhang
Large Language Models (LLMs) have demonstrated significant potential in performing multiple tasks in multimedia applications, ranging from content generation to interactive entertainment and artistic creation.
1 code implementation • 2 Apr 2024 • Zhouhao Sun, Xiao Ding, Li Du, Bibo Cai, Jinglong Gao, Ting Liu, Qin Bing
To address this issue, we propose a novel framework, named Generalizable and Faithful Reasoner (GFaiR), which introduces the paradigm of resolution refutation.
no code implementations • 18 Feb 2024 • Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Zhouhao Sun, Jun Shi, Ting Liu, Bing Qin
Through pretraining on a corpus with various sources, Large Language Models (LLMs) have gained impressive performance.
1 code implementation • 31 Jan 2024 • Jianing Li, Xi Nan, Ming Lu, Li Du, Shanghang Zhang
To overcome this limitation in MLLMs, we introduce Proximity Question Answering (Proximity QA), a novel framework designed to enable MLLMs to infer the proximity relationship between objects in images.
no code implementations • 15 Jan 2024 • Rongyu Zhang, Zefan Cai, Huanrui Yang, Zidong Liu, Denis Gudovskiy, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Baobao Chang, Yuan Du, Li Du, Shanghang Zhang
Finetuning a pretrained vision model (PVM) is a common technique for learning downstream vision tasks.
no code implementations • 29 Dec 2023 • Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Jason Eisner, Holden Lee, Ryan Cotterell
Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence.
1 code implementation • 14 Dec 2023 • Xingrun Xing, Li Du, Xinyuan Wang, Xianlin Zeng, Yequan Wang, Zheng Zhang, Jiajun Zhang
Specifically, we first analyze the binarization error in self-attention operations and derive the polynomials of binarization error.
no code implementations • 7 Nov 2023 • Ryan Cotterell, Anej Svete, Clara Meister, Tianyu Liu, Li Du
Large language models have become one of the most commonly deployed NLP inventions.
1 code implementation • 19 Oct 2023 • Franz Nowak, Anej Svete, Li Du, Ryan Cotterell
We extend the Turing completeness result to the probabilistic case, showing how a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.
no code implementations • 11 Sep 2023 • Li Du, Yequan Wang, Xingrun Xing, Yiqun Ya, Xiang Li, Xin Jiang, Xuezhi Fang
Although demonstrating superb performance on various NLP tasks, large language models (LLMs) still suffer from the hallucination problem, which threatens the reliability of LLMs.
no code implementations • 7 Sep 2023 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun, Yequan Wang
We demonstrate that a 101B-parameter LLM with 0.31T tokens can be trained with a budget of 100K US dollars.
no code implementations • 30 Aug 2023 • Qingyuan Li, Yifan Zhang, Liang Li, Peng Yao, Bo Zhang, Xiangxiang Chu, Yerui Sun, Li Du, Yuchen Xie
In this study, we propose a novel W4A8 post-training quantization method for the available open-sourced LLMs, which combines the advantages of both recipes.
no code implementations • ICCV 2023 • Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang
Multi-view 3D detection based on BEV (bird-eye-view) has recently achieved significant improvements.
1 code implementation • 29 Jun 2023 • Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Tim Vieira, Mrinmaya Sachan, Ryan Cotterell
Via submodular functions, we prove that the iterative greedy version is a $\frac{1}{{\sigma(\boldsymbol{\mu}^\star)}}(1-e^{-{\sigma(\boldsymbol{\mu}^\star)}})$-approximation of an optimal merge sequence, where ${\sigma(\boldsymbol{\mu}^\star)}$ is the total backward curvature with respect to the optimal merge sequence $\boldsymbol{\mu}^\star$.
1 code implementation • 29 Jun 2023 • Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Mrinmaya Sachan, Ryan Cotterell
Subword tokenization is a key part of many NLP pipelines.
1 code implementation • NeurIPS 2023 • Afra Amini, Li Du, Ryan Cotterell
In this paper, we take an important step toward building a principled approach for sampling from language models with gradient-based methods.
no code implementations • 20 May 2023 • Li Du, Hongyuan Mei, Jason Eisner
To predict the next token, autoregressive models ordinarily examine the past.
no code implementations • 20 Dec 2022 • Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell
Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings.
1 code implementation • 16 Dec 2022 • Kai Xiong, Xiao Ding, Zhongyang Li, Li Du, Bing Qin, Yi Zheng, Baoxing Huai
Causal chain reasoning (CCR) is an essential ability for many decision-making AI systems, which requires the model to build reliable causal chains by connecting causal pairs.
no code implementations • 6 Dec 2022 • Lirui Xiao, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang
CSQ stabilizes the bit-level mixed-precision training process with a bi-level gradual continuous sparsification on both the bit values of the quantized weights and the bit selection in determining the quantization precision of each layer.
1 code implementation • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang
In this paper, we propose a unified framework named BEV-LGKD to transfer the knowledge in the teacher-student manner.
no code implementations • CVPR 2023 • Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang
Building on the theoretical insight, NoisyQuant achieves the first success on actively altering the heavy-tailed activation distribution with additive noisy bias to fit a given quantizer.
no code implementations • 18 Oct 2022 • Shuo Xie, Jiahao Qiu, Ankita Pasad, Li Du, Qing Qu, Hongyuan Mei
We propose to select layers based on the variability of their hidden states given a task-specific corpus.
no code implementations • 26 Aug 2022 • Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang
In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike camera.
no code implementations • 21 Aug 2022 • Tingting Wu, Xiao Ding, Hao Zhang, Jinglong Gao, Li Du, Bing Qin, Ting Liu
To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful (e.g., easy to hard) sequence.
no code implementations • 14 Aug 2022 • Bowen Chen, Xiao Ding, Li Du, Qin Bing, Ting Liu
Given a task, humans learn from easy to hard, whereas the model learns randomly.
no code implementations • Findings (ACL) 2022 • Li Du, Xiao Ding, Yue Zhang, Kai Xiong, Ting Liu, Bing Qin
To this end, we incorporate an additional structured variable into BERT to learn to predict the event connections in the training process.
no code implementations • 8 Feb 2022 • Guhong Nie, Lirui Xiao, Menglong Zhu, Dongliang Chu, Yue Shen, Peng Li, Kang Yang, Li Du, Bo Chen
For binary neural networks (BNNs) to become the mainstream on-device computer vision algorithm, they must achieve a better speed-vs-accuracy tradeoff than 8-bit quantization and establish a similar degree of general applicability in vision tasks.
no code implementations • 12 Oct 2021 • Zhuang Shao, Xiaoliang Chen, Li Du, Lei Chen, Yuan Du, Wei Zhuang, Huadong Wei, Chenjia Xie, Zhongfeng Wang
To maintain real-time processing in embedded systems, large on-chip memory is required to buffer the interlayer feature maps.
1 code implementation • ACL 2021 • Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin
ExCAR first acquires additional evidence information from a large-scale causal event graph as logical rules for causal reasoning.
1 code implementation • ACL 2021 • Li Du, Xiao Ding, Ting Liu, Bing Qin
Abductive reasoning aims at inferring the most plausible explanation for observed events, which would play critical roles in various NLP applications, such as reading comprehension and question answering.
no code implementations • IJCNLP 2019 • Li Du, Xiao Ding, Ting Liu, Zhongyang Li
Event understanding and event-centered commonsense reasoning are crucial for natural language processing (NLP).
no code implementations • 6 Sep 2019 • Jinming Lu, Siyuan Lu, Zhisheng Wang, Chao Fang, Jun Lin, Zhongfeng Wang, Li Du
With the increasing size of Deep Neural Network (DNN) models, the high memory space requirements and computational complexity have become an obstacle for efficient DNN implementations.
16 code implementations • ICLR 2020 • Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi
Despite considerable advancements with deep neural language models, the enigma of neural text degeneration persists when these models are tested as text generators.
no code implementations • 19 Sep 2017 • Yuan Du, Li Du, Xuefeng Gu, Jieqiong Du, X. Shawn Wang, Boyu Hu, Mingzhe Jiang, Xiaoliang Chen, Junjie Su, Subramanian S. Iyer, Mau-Chung Frank Chang
The proposed computing engine is composed of a scalable CTT multiplier array and energy efficient analog-digital interfaces.
no code implementations • 15 Sep 2017 • Yuan Du, Li Du, Yilei Li, Junjie Su, Mau-Chung Frank Chang
Deep convolutional neural networks (CNN) are widely used in modern artificial intelligence (AI) and smart vision systems but are also limited by computation latency, throughput, and energy efficiency in resource-limited scenarios, such as mobile devices, internet of things (IoT), unmanned aerial vehicles (UAV), and so on.
no code implementations • 8 Jul 2017 • Li Du, Yuan Du, Yilei Li, Mau-Chung Frank Chang
To implement image detection using CNN on internet of things (IoT) devices, a streaming hardware accelerator is proposed.