no code implementations • 18 Apr 2024 • David Restrepo, Chenwei Wu, Constanza Vásquez-Venegas, Luis Filipe Nakayama, Leo Anthony Celi, Diego M López
In the big data era, integrating diverse data modalities poses significant challenges, particularly in complex fields like healthcare.
no code implementations • 4 Oct 2023 • Chenwei Wu, Li Erran Li, Stefano Ermon, Patrick Haffner, Rong Ge, Zaiwei Zhang
Compositionality is a common property in many modalities including natural languages and images, but the compositional generalization of multi-modal models is not well-understood.
1 code implementation • 17 Apr 2023 • Kathryn Wantlin, Chenwei Wu, Shih-Cheng Huang, Oishi Banerjee, Farah Dadabhoy, Veeral Vipin Mehta, Ryan Wonhee Han, Fang Cao, Raja R. Narayan, Errol Colak, Adewole Adamson, Laura Heacock, Geoffrey H. Tison, Alex Tamkin, Pranav Rajpurkar
Finally, we evaluate performance on out-of-distribution data collected at hospitals other than those that provided the training data, representing naturally occurring distribution shifts that frequently degrade the performance of medical AI models.
1 code implementation • 24 Feb 2023 • Muthu Chidambaram, Chenwei Wu, Yu Cheng, Rong Ge
Furthermore, drawing from the growing body of work on self-supervised learning, we propose a novel masking objective for which recovering the ground-truth dictionary is in fact optimal as the signal increases for a large class of data-generating processes.
1 code implementation • 24 Oct 2022 • Muthu Chidambaram, Xiang Wang, Chenwei Wu, Rong Ge
Mixup is a data augmentation technique that relies on training using random convex combinations of data points and their labels.
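As a concrete illustration of that idea, here is a minimal NumPy sketch of the standard mixup augmentation (the function name and Beta-distribution parameterization are illustrative, not taken from the paper's code):

```python
import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=None):
    """Return a mixed batch: convex combinations of inputs and one-hot labels.

    x: (batch, ...) array of inputs; y: (batch, num_classes) one-hot labels.
    alpha: Beta distribution parameter controlling interpolation strength.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)              # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))            # random pairing of examples
    x_mix = lam * x + (1 - lam) * x[perm]     # convex combination of inputs
    y_mix = lam * y + (1 - lam) * y[perm]     # same combination of the labels
    return x_mix, y_mix
```

Sampling the coefficient from Beta(alpha, alpha) is the common convention: alpha near 1 yields strong interpolation, while small alpha keeps most examples nearly unmixed.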
1 code implementation • ICLR 2022 • Muthu Chidambaram, Xiang Wang, Yuzheng Hu, Chenwei Wu, Rong Ge
Despite seeing very few true data points during training, models trained using Mixup seem to still minimize the original empirical risk and exhibit better generalization and robustness on various tasks when compared to standard training.
no code implementations • NeurIPS 2020 • Xiang Wang, Chenwei Wu, Jason D. Lee, Tengyu Ma, Rong Ge
We show that in a lazy training regime (similar to the NTK regime for neural networks) one needs at least $m = \Omega(d^{l-1})$, while a variant of gradient descent can find an approximate tensor when $m = O^*(r^{2.5l}\log d)$.
no code implementations • 8 Oct 2020 • Yikai Wu, Xingyu Zhu, Chenwei Wu, Annie Wang, Rong Ge
We can analyze the properties of these smaller matrices and prove the structure of the top eigenspace for random 2-layer networks.
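For readers who want to inspect such spectra directly, the snippet below numerically builds the loss Hessian of a tiny random two-layer tanh network and looks at its top eigenvalues; it is a generic sketch for exploring the top eigenspace, not the paper's decomposition into smaller matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n = 3, 4, 64                                    # input dim, hidden width, samples
X = rng.normal(size=(n, d)); y = rng.normal(size=n)   # random data (illustrative only)

def loss(p):
    """MSE loss of a 2-layer tanh network; p packs both weight layers."""
    W = p[: h * d].reshape(h, d)                      # first-layer weights
    a = p[h * d :]                                    # second-layer weights
    return 0.5 * np.mean((np.tanh(X @ W.T) @ a - y) ** 2)

p = rng.normal(size=h * d + h)                        # a random 2-layer network

# Numerical Hessian via central second differences (the network is tiny).
m, eps = p.size, 1e-3
H = np.zeros((m, m))
for i in range(m):
    for j in range(m):
        ei = np.zeros(m); ei[i] = eps
        ej = np.zeros(m); ej[j] = eps
        H[i, j] = (loss(p + ei + ej) - loss(p + ei - ej)
                   - loss(p - ei + ej) + loss(p - ei - ej)) / (4 * eps**2)

evals, evecs = np.linalg.eigh(H)                      # ascending eigenvalues
print("top-5 Hessian eigenvalues:", np.round(evals[-5:], 3))
```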
1 code implementation • 24 Sep 2020 • Chenwei Wu, Chenzhuang Du, Yang Yuan
In the classical multi-party computation setting, multiple parties jointly compute a function without revealing their own input data.
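The textbook illustration of this setting is additive secret sharing, sketched below for a joint sum; this is the classic construction used to explain the setting, not necessarily the protocol studied in the paper:

```python
import secrets

P = 2**61 - 1  # a large prime modulus (illustrative choice)

def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def secure_sum(inputs):
    """Each party shares its input; the total is recovered without any single
    party seeing another party's raw value.  (In a real protocol the shares
    would be exchanged over a network rather than held in one process.)"""
    n = len(inputs)
    all_shares = [share(v, n) for v in inputs]             # party i shares inputs[i]
    partial = [sum(all_shares[i][j] for i in range(n)) % P  # party j sums what it holds
               for j in range(n)]
    return sum(partial) % P                                  # equals sum(inputs) mod P

assert secure_sum([3, 5, 7]) == 15
```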
1 code implementation • 30 Jun 2020 • Xiang Wang, Shuai Yuan, Chenwei Wu, Rong Ge
Solving this problem with a learning-to-learn approach -- running meta-gradient descent on a meta-objective defined over the trajectory that the optimizer generates -- was recently shown to be effective.
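A stripped-down sketch of that idea, assuming a toy quadratic task and a single learnable step size (finite differences stand in for the meta-gradient; all names and constants are illustrative):

```python
import numpy as np

TARGET = 3.0

def inner_loss(w):
    return 0.5 * (w - TARGET) ** 2             # toy task the learned optimizer must solve

def unrolled_loss(eta, w0=0.0, steps=5):
    """Run `steps` of gradient descent with step size eta; the meta-objective
    is the loss at the end of the trajectory the optimizer generates."""
    w = w0
    for _ in range(steps):
        w = w - eta * (w - TARGET)              # gradient of inner_loss is (w - TARGET)
    return inner_loss(w)

def meta_gradient(eta, eps=1e-5):
    """Finite-difference estimate of d(meta-objective)/d(eta)."""
    return (unrolled_loss(eta + eps) - unrolled_loss(eta - eps)) / (2 * eps)

# Meta-gradient descent on the optimizer's own parameter.
eta = 0.05
print("meta-objective before:", unrolled_loss(eta))
for _ in range(300):
    eta -= 0.02 * meta_gradient(eta)
print("meta-objective after :", unrolled_loss(eta), "learned step size:", round(eta, 3))
```

In practice the learned optimizer has many more parameters and the meta-gradient is typically computed by differentiating through the unrolled update steps rather than by finite differences.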
no code implementations • ICLR 2018 • Chenwei Wu, Jiajun Luo, Jason D. Lee
Deep learning models can be efficiently optimized via stochastic gradient descent, but there is little theoretical evidence to support this.