1 code implementation • 23 May 2024 • Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
In WISE, we design a dual parametric memory scheme, which consists of a main memory for the pretrained knowledge and a side memory for the edited knowledge.
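To make the dual-memory idea concrete, here is a minimal, illustrative PyTorch sketch of such a layer. The activation-distance router and the `routing_threshold` parameter are assumptions made for illustration, not WISE's exact routing mechanism.

```python
# Illustrative sketch of a dual-memory layer in the spirit of the snippet
# above; the router and threshold are assumed, not the paper's mechanism.
import copy
import torch
import torch.nn as nn

class DualMemoryFFN(nn.Module):
    def __init__(self, main_ffn: nn.Module, routing_threshold: float = 0.5):
        super().__init__()
        self.main = main_ffn                 # frozen: pretrained knowledge
        self.side = copy.deepcopy(main_ffn)  # trainable: edited knowledge
        self.threshold = routing_threshold
        for p in self.main.parameters():
            p.requires_grad_(False)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        main_out, side_out = self.main(h), self.side(h)
        # Route to the side memory only when its activation departs enough
        # from the main memory's, i.e., the input falls in an edited region.
        score = (side_out - main_out).norm(dim=-1, keepdim=True)
        return torch.where(score > self.threshold, side_out, main_out)
```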
no code implementations • 23 May 2024 • Zexi Li, Lingzhi Gao, Chao Wu
Generative artificial intelligence (GenAI) has made significant progress in understanding world knowledge and generating content from human languages across various modalities, such as text-to-text large language models, text-to-image Stable Diffusion, and text-to-video Sora.
no code implementations • 16 May 2024 • Jinge Wu, Hang Dong, Zexi Li, Arijit Patra, Honghan Wu
The infrequency and heterogeneity of clinical presentations in rare diseases often lead to underdiagnosis and their exclusion from structured datasets.
no code implementations • 29 Feb 2024 • Zexi Li, Jie Lin, Zhiqi Li, Didi Zhu, Chao Wu
Bridging the gap between LMC and FL, in this paper, we leverage fixed anchor models to empirically and theoretically study the transitivity property of connectivity from two models (LMC) to a group of models (model fusion in FL).
no code implementations • 19 Feb 2024 • Didi Zhu, Zhongyi Sun, Zexi Li, Tao Shen, Ke Yan, Shouhong Ding, Kun Kuang, Chao Wu
Catastrophic forgetting emerges as a critical challenge when fine-tuning multi-modal large language models (MLLMs), where improving performance on unseen tasks often leads to a significant performance drop on the original tasks.
1 code implementation • 10 Feb 2024 • Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, Siheng Chen
Trained on massive publicly available data, large language models (LLMs) have demonstrated tremendous success across various fields.
no code implementations • 2 Feb 2024 • Zexi Li, Zhiqi Li, Jie Lin, Tao Shen, Tao Lin, Chao Wu
In deep learning, stochastic gradient descent often yields functionally similar yet widely scattered solutions in the weight space even under the same initialization, causing barriers in the Linear Mode Connectivity (LMC) landscape.
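The LMC barrier mentioned here is typically measured by evaluating the loss along the straight line between two weight vectors. Below is a minimal sketch of that measurement; the caller-supplied `loss_fn`, which evaluates a model on a fixed batch, is an assumed helper.

```python
# Minimal sketch of measuring an LMC barrier between two trained models.
# `loss_fn(model) -> float` is an assumed caller-supplied evaluator on a
# fixed batch; only floating-point tensors are interpolated.
import copy
import torch

def lmc_barrier(model_a, model_b, loss_fn, steps: int = 11) -> float:
    sa, sb = model_a.state_dict(), model_b.state_dict()
    probe = copy.deepcopy(model_a)
    alphas = [i / (steps - 1) for i in range(steps)]
    losses = []
    for alpha in alphas:
        interp = {k: torch.lerp(sa[k], sb[k], alpha)
                  if sa[k].is_floating_point() else sa[k] for k in sa}
        probe.load_state_dict(interp)
        losses.append(loss_fn(probe))
    # Barrier: largest gap between the path loss and the linear baseline
    # connecting the endpoint losses; near zero means the two models are
    # linearly mode-connected.
    return max(l - ((1 - a) * losses[0] + a * losses[-1])
               for l, a in zip(losses, alphas))
```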
1 code implementation • 19 Dec 2023 • Ruiyuan Zhang, Jiaxiang Liu, Zexi Li, Hao Dong, Jie Fu, Chao Wu
Therefore, there is a need to develop a scalable framework for geometric fracture assembly without relying on semantic information.
no code implementations • 30 Nov 2023 • Lingzhi Gao, Zexi Li, Yang Lu, Chao Wu
A typical line of pFL work focuses on label distribution skew and adopts a decoupling scheme, where the model is split into a common feature extractor and two prediction heads (generic and personalized).
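The decoupling described in the snippet can be sketched as follows; the layer sizes are placeholder assumptions, and the convention shown (aggregate the extractor and generic head on the server, keep the personalized head local) is the usual reading of such schemes.

```python
# Minimal sketch of a decoupled pFL client model: shared extractor plus a
# generic head (aggregated across clients) and a personalized head (local).
import torch.nn as nn

class DecoupledClientModel(nn.Module):
    def __init__(self, in_dim: int = 784, feat_dim: int = 128, num_classes: int = 10):
        super().__init__()
        self.extractor = nn.Sequential(
            nn.Flatten(), nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.generic_head = nn.Linear(feat_dim, num_classes)   # sent to the server
        self.personal_head = nn.Linear(feat_dim, num_classes)  # never leaves the client

    def forward(self, x, personalized: bool = True):
        z = self.extractor(x)
        head = self.personal_head if personalized else self.generic_head
        return head(z)
```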
no code implementations • 28 Jun 2023 • Didi Zhu, Zexi Li, Min Zhang, Junkun Yuan, Yunfeng Shao, Jiashuo Liu, Kun Kuang, Yinchuan Li, Chao Wu
It is found that the NC optimality of text-to-image representations is positively correlated with downstream generalizability, a correlation that becomes more pronounced under class-imbalanced settings.
no code implementations • ICCV 2023 • Didi Zhu, Yinchuan Li, Junkun Yuan, Zexi Li, Kun Kuang, Chao Wu
To address this issue, we propose a Universal Attention Matching (UniAM) framework that exploits the self-attention mechanism in vision transformers to capture crucial object information.
Ranked #1 on Universal Domain Adaptation on Office-Home
no code implementations • 12 Apr 2023 • Zexi Li, Qunwei Li, Yi Zhou, Wenliang Zhong, Guannan Zhang, Chao Wu
Federated learning (FL) is a popular paradigm of edge computing that does not compromise users' privacy.
no code implementations • 6 Apr 2023 • Chenrui Wu, Zexi Li, Fangxin Wang, Chao Wu
It includes a noise-resilient local solver and a robust global aggregator.
1 code implementation • ICCV 2023 • Zexi Li, Xinyi Shang, Rui He, Tao Lin, Chao Wu
Recent advances in neural collapse have shown that the classifiers and feature prototypes under perfect training scenarios collapse into an optimal structure called simplex equiangular tight frame (ETF).
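For reference, a simplex ETF with K classes in d dimensions (d >= K) can be constructed as M = sqrt(K/(K-1)) * U (I_K - (1/K) 1 1^T), where U has orthonormal columns; a minimal sketch of this standard construction follows.

```python
# Minimal sketch of constructing a simplex equiangular tight frame (ETF):
# K unit-norm class prototypes with equal pairwise cosine -1/(K-1), the
# optimal structure that classifiers collapse to under neural collapse.
import torch

def simplex_etf(num_classes: int, dim: int) -> torch.Tensor:
    assert dim >= num_classes
    # Orthonormal columns U via reduced QR of a random Gaussian matrix.
    u, _ = torch.linalg.qr(torch.randn(dim, num_classes))
    center = torch.eye(num_classes) - torch.ones(num_classes, num_classes) / num_classes
    scale = (num_classes / (num_classes - 1)) ** 0.5
    return scale * u @ center  # columns are the class prototypes

M = simplex_etf(num_classes=10, dim=64)
gram = M.T @ M  # ~1 on the diagonal, ~-1/9 off-diagonal for K = 10
```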
1 code implementation • 14 Feb 2023 • Zexi Li, Tao Lin, Xinyi Shang, Chao Wu
In federated learning (FL), weighted aggregation of local models is conducted to generate a global model, and the aggregation weights are normalized (the sum of weights is 1) and proportional to the local data sizes.
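A minimal sketch of this standard (FedAvg-style) aggregation rule is shown below; it assumes floating-point parameters and that each client supplies a state dict plus its local data size.

```python
# Minimal sketch of weighted aggregation as described above: weights are
# proportional to local data sizes and normalized to sum to 1.
def aggregate(local_states, data_sizes):
    total = float(sum(data_sizes))
    weights = [n / total for n in data_sizes]  # normalized aggregation weights
    return {k: sum(w * s[k] for w, s in zip(weights, local_states))
            for k in local_states[0]}

# Usage: global_model.load_state_dict(
#     aggregate([m.state_dict() for m in client_models], [600, 400, 1000]))
```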
no code implementations • 23 Mar 2022 • Zexi Li, Jiaxun Lu, Shuang Luo, Didi Zhu, Yunfeng Shao, Yinchuan Li, Zhimeng Zhang, Yongheng Wang, Chao Wu
In the literature, centralized clustered FL algorithms require the number of clusters to be assumed in advance and hence cannot effectively explore the latent relationships among clients.
no code implementations • 26 Oct 2021 • Shuang Luo, Didi Zhu, Zexi Li, Chao Wu
Although federated learning endows distributed clients with a cooperative training mode under the premise of protecting data privacy and security, clients are still vulnerable to adversarial samples due to a lack of robustness.