no code implementations • 27 Feb 2024 • Qi Zhang, Yiming Zhang, Haobo Wang, Junbo Zhao
When it comes to datasets synthesized by LLMs, a common scenario in this field, dirty samples will even be selected with a higher probability than other samples.
1 code implementation • 24 Jan 2024 • Qi Wei, Lei Feng, Haobo Wang, Bo An
To address this limitation, we propose a noIse-Tolerant Expert Model (ITEM) for debiased learning in sample selection.
1 code implementation • 23 Jan 2024 • Ru Peng, Heming Zou, Haobo Wang, Yawen Zeng, Zenan Huang, Junbo Zhao
The core of the MDE is to establish a meta-distribution statistic over the information (energy) associated with individual samples, and then to offer a smoother representation enabled by energy-based learning.
1 code implementation • 27 Nov 2023 • Ruixuan Xiao, Yiwen Dong, Junbo Zhao, Runze Wu, Minmin Lin, Gang Chen, Haobo Wang
Although many solutions, such as active learning for small language models (SLMs) and the prevalent in-context learning in the era of large language models (LLMs), have been proposed and alleviate the labeling burden to some extent, their performance still depends on human intervention.
no code implementations • 2 Nov 2023 • Peng Fu, Yiming Zhang, Haobo Wang, Weikang Qiu, Junbo Zhao
Briefly, the core of this technique is rooted in an ideological emphasis on the pruning and purification of the external knowledge base to be injected into LLMs.
no code implementations • 4 Oct 2023 • Hao Chen, Qi Zhang, Zenan Huang, Haobo Wang, Junbo Zhao
Distributional shift between domains poses great challenges to modern machine learning algorithms.
1 code implementation • ICCV 2023 • Ru Peng, Qiuyang Duan, Haobo Wang, Jiachen Ma, Yanbo Jiang, Yongjun Tu, Xiu Jiang, Junbo Zhao
In this work, we propose Contrastive Automatic Model Evaluation (CAME), a novel AutoEval framework that removes the training set from the loop.
1 code implementation • 28 Jul 2023 • Renyu Zhu, Haoyu Liu, Runze Wu, Minmin Lin, Tangjie Lv, Changjie Fan, Haobo Wang
In this paper, we investigate the problem of learning with noisy labels in real-world annotation scenarios, where noise can be categorized into two types: factual noise and ambiguity noise.
no code implementations • 17 Jul 2023 • Liangyu Zha, Junlin Zhou, Liyao Li, Rui Wang, Qingyi Huang, Saisai Yang, Jing Yuan, Changbao Su, Xiang Li, Aofeng Su, Tao Zhang, Chen Zhou, Kaizhe Shou, Miao Wang, Wufang Zhu, Guoshan Lu, Chao Ye, Yali Ye, Wentao Ye, Yiming Zhang, Xinglong Deng, Jie Xu, Haobo Wang, Gang Chen, Junbo Zhao
Tables are prevalent in real-world databases, requiring significant time and effort for humans to analyze and manipulate.
1 code implementation • 10 Jul 2023 • Chao Ye, Guoshan Lu, Haobo Wang, Liyao Li, Sai Wu, Gang Chen, Junbo Zhao
Tabular data pervades the landscape of the World Wide Web, playing a foundational role in the digital architecture that underpins online information.
1 code implementation • 12 Jun 2023 • Senlin Shu, Shuo He, Haobo Wang, Hongxin Wei, Tao Xiang, Lei Feng
In this paper, we propose a generalized URE that can be equipped with arbitrary loss functions while maintaining the theoretical guarantees, given unlabeled data for LAC.
1 code implementation • 15 May 2023 • Wentao Ye, Mingfeng Ou, Tianyi Li, Yipeng Chen, Xuetao Ma, Yifan Yanggong, Sai Wu, Jie Fu, Gang Chen, Haobo Wang, Junbo Zhao
With most of the related literature in the era of LLM uncharted, we propose an automated workflow that copes with an upscaled number of queries/responses.
1 code implementation • 14 May 2023 • Zenan Huang, Haobo Wang, Junbo Zhao, Nenggan Zheng
Understanding the dynamics of time series data typically requires identifying the unique latent factors for data generation, a.k.a.
no code implementations • 10 May 2023 • Haobo Wang, Shisong Yang, Gengyu Lyu, Weiwei Liu, Tianlei Hu, Ke Chen, Songhe Feng, Gang Chen
In partial multi-label learning (PML), each data example is equipped with a candidate label set, which consists of multiple ground-truth labels and other false-positive labels.
1 code implementation • 11 Apr 2023 • Jianan Yang, Haobo Wang, YanMing Zhang, Ruixuan Xiao, Sai Wu, Gang Chen, Junbo Zhao
Recent large-scale generative modeling has attained unprecedented performance, especially in producing high-fidelity images driven by text prompts.
1 code implementation • ICCV 2023 • Zenan Huang, Haobo Wang, Junbo Zhao, Nenggan Zheng
In this work, we first show that the failure of conventional ML models in DG can be attributed to an inadequate identification of causal structures.
1 code implementation • 21 Sep 2022 • Haobo Wang, Mingxuan Xia, Yixuan Li, Yuren Mao, Lei Feng, Gang Chen, Junbo Zhao
Partial-label learning (PLL) is a peculiar weakly-supervised learning task where the training samples are generally associated with a set of candidate labels instead of a single ground truth.
1 code implementation • 21 Jul 2022 • Ruixuan Xiao, Yiwen Dong, Haobo Wang, Lei Feng, Runze Wu, Gang Chen, Junbo Zhao
To overcome the potential side effects of an excessive clean-set selection procedure, we further devise a novel SSL framework that is able to train balanced and unbiased classifiers on the separated clean and noisy samples.
Ranked #1 on Learning with noisy labels on CIFAR-10N-Worst
1 code implementation • 22 Jan 2022 • Haobo Wang, Ruixuan Xiao, Yixuan Li, Lei Feng, Gang Niu, Gang Chen, Junbo Zhao
Partial label learning (PLL) is an important problem that allows each training example to be labeled with a coarse candidate set, which well suits many real-world data annotation scenarios with label ambiguity.
1 code implementation • ICLR 2022 • Haobo Wang, Ruixuan Xiao, Sharon Li, Lei Feng, Gang Niu, Gang Chen, Junbo Zhao
Partial label learning (PLL) is an important problem that allows each training example to be labeled with a coarse candidate set, which well suits many real-world data annotation scenarios with label ambiguity.
1 code implementation • 19 Jun 2021 • Abhinav Goel, Caleb Tung, Xiao Hu, Haobo Wang, James C. Davis, George K. Thiruvathukal, Yung-Hsiang Lu
At each node in the hierarchy, a small DNN identifies a different attribute of the query image.
no code implementations • 23 Nov 2020 • Weiwei Liu, Haobo Wang, Xiaobo Shen, Ivor W. Tsang
Exabytes of data are generated daily by humans, leading to the growing need for new efforts in dealing with the grand challenges for multi-label learning brought by big data.
2 code implementations • 6 Jun 2019 • Justas Dauparas, Haobo Wang, Avi Swartz, Peter Koo, Mor Nitzan, Sergey Ovchinnikov
Revealing the functional sites of biological sequences, such as evolutionarily conserved, structurally interacting, or co-evolving protein sites, is a fundamental yet challenging task.
Quantitative Methods