1 code implementation • 12 Oct 2023 • Yite Wang, Jiahao Su, Hanlin Lu, Cong Xie, Tianyi Liu, Jianbo Yuan, Haibin Lin, Ruoyu Sun, Hongxia Yang
Our empirical results demonstrate that LEMON reduces computational costs by 56.7% for Vision Transformers and 33.2% for BERT when compared to training from scratch.
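The excerpt doesn't spell out LEMON's expansion operators, so as a rough illustration of the general idea behind lossless (function-preserving) model expansion, here is a minimal NumPy sketch in the spirit of Net2Net-style width growth, not the paper's actual method: duplicating a hidden unit and halving its outgoing weights leaves the network's output unchanged, so training can resume from the wider model instead of from scratch.

```python
import numpy as np

def expand_width(W1, b1, W2, idx):
    """Duplicate hidden unit `idx` and halve its outgoing weights so the
    wider network computes exactly the same function (Net2Net-style sketch,
    not LEMON's actual operators)."""
    W1e = np.vstack([W1, W1[idx:idx + 1]])    # copy the unit's incoming weights
    b1e = np.append(b1, b1[idx])              # copy its bias
    W2e = np.hstack([W2, W2[:, idx:idx + 1]]) # copy its outgoing weights
    W2e[:, idx] /= 2.0                        # split the unit's contribution
    W2e[:, -1] /= 2.0                         # between the old and new copy
    return W1e, b1e, W2e

rng = np.random.default_rng(0)
W1, b1, W2 = rng.normal(size=(4, 3)), rng.normal(size=4), rng.normal(size=(2, 4))
x = rng.normal(size=3)
W1e, b1e, W2e = expand_width(W1, b1, W2, idx=1)
h, he = np.maximum(W1 @ x + b1, 0), np.maximum(W1e @ x + b1e, 0)  # ReLU layers
assert np.allclose(W2 @ h, W2e @ he)  # output unchanged after expansion
```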
no code implementations • 13 Apr 2022 • Hanlin Lu, Changchang Liu, Shiqiang Wang, Ting He, Vijay Narayanan, Kevin S. Chan, Stephen Pasteris
Coresets are small, weighted summaries of larger datasets that aim to provide provable error bounds for machine learning (ML) tasks while significantly reducing communication and computation costs.
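To make the "small, weighted summary" idea concrete, here is a minimal sketch using uniform sampling with weights n/m, which makes the weighted cost on the summary an unbiased estimate of the full-data cost. This is only a baseline stand-in; the constructions with provable worst-case bounds use smarter (e.g., sensitivity-based) sampling.

```python
import numpy as np

def uniform_coreset(X, m, rng):
    """Sample m points uniformly and weight each by n/m, so the weighted
    cost on the summary estimates the full-data cost in expectation
    (a simple baseline, not the paper's construction)."""
    idx = rng.choice(len(X), size=m, replace=False)
    return X[idx], np.full(m, len(X) / m)

def kmeans_cost(X, centers, w=None):
    """(Weighted) k-means cost: sum of squared distances to nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(1)
    return float(d2.sum() if w is None else (w * d2).sum())

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 5))
centers = rng.normal(size=(3, 5))
S, w = uniform_coreset(X, m=200, rng=rng)
print(kmeans_cost(X, centers), kmeans_cost(S, centers, w))  # close in expectation
```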
no code implementations • 8 Feb 2021 • Hanlin Lu, Ting He, Shiqiang Wang, Changchang Liu, Mehrdad Mahdavi, Vijaykrishnan Narayanan, Kevin S. Chan, Stephen Pasteris
We consider the problem of computing the k-means centers for a large high-dimensional dataset in the context of edge-based machine learning, where data sources offload computation to nearby edge servers.
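A minimal sketch of the offloading pattern this setting suggests, under assumed details not given in the excerpt: each data source ships a small weighted summary to the edge server, which then runs weighted Lloyd's iterations on the pooled summaries instead of the raw data.

```python
import numpy as np

def weighted_kmeans(P, w, k, iters=50, rng=None):
    """Weighted Lloyd's algorithm: k-means over a weighted point set,
    as an edge server might run on pooled summaries."""
    rng = rng or np.random.default_rng(0)
    C = P[rng.choice(len(P), size=k, replace=False)]
    for _ in range(iters):
        a = ((P[:, None] - C[None]) ** 2).sum(-1).argmin(1)  # assign step
        for j in range(k):                                   # update step
            m = a == j
            if m.any():
                C[j] = (w[m, None] * P[m]).sum(0) / w[m].sum()
    return C

# Hypothetical setup: two sources each send a 50-point weighted summary.
rng = np.random.default_rng(1)
summaries = [(rng.normal(loc=c, size=(50, 4)), np.full(50, 20.0)) for c in (0, 5)]
P = np.vstack([s for s, _ in summaries])
w = np.concatenate([wt for _, wt in summaries])
centers = weighted_kmeans(P, w, k=2)  # computed at the edge server
```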
no code implementations • 6 Jul 2020 • Hanlin Lu, Changchang Liu, Ting He, Shiqiang Wang, Kevin S. Chan
Distributed machine learning generally aims to train a global model on distributed data without gathering all the data at a centralized location; two different approaches have been proposed: collecting and aggregating local models (federated learning), and collecting and training over representative data summaries (coresets).
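The two approaches can be contrasted in a few lines. The following sketch is illustrative only (a toy least-squares learner and uniform stand-in summaries, none of which come from the paper): federated averaging combines locally trained models weighted by dataset size, while the coreset route pools small weighted samples and trains once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical clients with local data drawn from y = 2x + noise.
data = [rng.normal(size=(n, 1)) for n in (80, 120)]
labels = [2.0 * X[:, 0] + 0.1 * rng.normal(size=len(X)) for X in data]

def fit_ls(X, y, w=None):
    """(Weighted) least squares, used as the stand-in local learner."""
    w = np.ones(len(X)) if w is None else w
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Approach 1 (federated learning): train locally, average the models.
local = [fit_ls(X, y) for X, y in zip(data, labels)]
sizes = [len(X) for X in data]
fed_model = sum(n * m for m, n in zip(local, sizes)) / sum(sizes)

# Approach 2 (coreset): each client ships a weighted sample, train once.
S, yS, wS = [], [], []
for X, y in zip(data, labels):
    idx = rng.choice(len(X), size=20, replace=False)
    S.append(X[idx]); yS.append(y[idx]); wS.append(np.full(20, len(X) / 20))
core_model = fit_ls(np.vstack(S), np.concatenate(yS), np.concatenate(wS))
print(fed_model, core_model)  # both should be close to [2.0]
```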
no code implementations • 11 Apr 2019 • Hanlin Lu, Ming-Ju Li, Ting He, Shiqiang Wang, Vijaykrishnan Narayanan, Kevin S. Chan
A coreset, a summary of the original dataset in the form of a small weighted set in the same sample space, provides a promising approach to enabling machine learning over distributed data.
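One practical consequence of the coreset living in the same sample space: any learner that accepts per-sample weights can consume it unchanged. A short sketch (the uniform sample here is only a hypothetical stand-in for a real coreset construction):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Stand-in weighted summary (S, w) of the large dataset.
idx = rng.choice(len(X), size=500, replace=False)
S, w = X[idx], np.full(500, len(X) / 500)

full = LogisticRegression().fit(X, y)
core = LogisticRegression().fit(S, y[idx], sample_weight=w)  # same API, tiny input
print(np.mean(full.predict(X) == core.predict(X)))  # high agreement expected
```

Here the weights are uniform, so `sample_weight` is a no-op; it matters when the summary is drawn non-uniformly, as in importance-sampling-based coresets.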