1 code implementation • 28 Mar 2024 • Niklas Stoehr, Mitchell Gordon, Chiyuan Zhang, Owen Lewis
Can we localize the weights and mechanisms used by a language model to memorize and recite entire paragraphs of its training data?
no code implementations • 26 Mar 2024 • Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang
We demonstrate a substantial gap between the privacy guarantees of the Adaptive Batch Linear Queries (ABLQ) mechanism under different types of batch sampling: (i) Shuffling, and (ii) Poisson subsampling; the typical analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) follows by interpreting it as a post-processing of ABLQ.
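The two batch samplers compared in this entry can be contrasted with a minimal sketch (illustrative only; the paper's ABLQ mechanism and its privacy analysis are not reproduced here). Shuffling partitions a permuted dataset into fixed-size batches, while Poisson subsampling includes each example in each batch independently, so batch sizes vary:

```python
import random

def shuffle_batches(n, batch_size, seed=0):
    """Shuffling: permute indices once, then split into fixed-size batches."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i:i + batch_size] for i in range(0, n, batch_size)]

def poisson_batches(n, q, steps, seed=0):
    """Poisson subsampling: each example joins each batch independently
    with probability q, so batch sizes are random."""
    rng = random.Random(seed)
    return [[i for i in range(n) if rng.random() < q] for _ in range(steps)]
```

The structural difference, that shuffling yields a partition while Poisson sampling yields independent random subsets, is precisely what makes the two samplers behave differently under privacy accounting.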
no code implementations • 26 Jan 2024 • Lynn Chua, Qiliang Cui, Badih Ghazi, Charlie Harrison, Pritish Kamath, Walid Krichene, Ravi Kumar, Pasin Manurangsi, Krishna Giri Narra, Amer Sinha, Avinash Varadarajan, Chiyuan Zhang
Motivated by problems arising in digital advertising, we introduce the task of training differentially private (DP) machine learning models with semi-sensitive features.
no code implementations • NeurIPS 2023 • Ashwinkumar Badanidiyuru, Badih Ghazi, Pritish Kamath, Ravi Kumar, Ethan Leeman, Pasin Manurangsi, Avinash V Varadarajan, Chiyuan Zhang
We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP).
1 code implementation • 18 Jul 2023 • Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang
Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks $\textit{memorize}$ "hard" examples in the final few layers of the model.
no code implementations • 27 Jun 2023 • Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Ayush Sekhari, Chiyuan Zhang
Subsequently, given any subset of examples that are requested to be unlearnt, the goal is to learn, without knowledge of the original training dataset, a good predictor that matches the one that would have been produced by training from scratch on the surviving examples.
no code implementations • 8 May 2023 • Badih Ghazi, Pritish Kamath, Ravi Kumar, Raghu Meka, Pasin Manurangsi, Chiyuan Zhang
We introduce a new mechanism for stochastic convex optimization (SCO) with user-level differential privacy guarantees.
no code implementations • 12 Dec 2022 • Badih Ghazi, Pritish Kamath, Ravi Kumar, Ethan Leeman, Pasin Manurangsi, Avinash V Varadarajan, Chiyuan Zhang
We study the task of training regression models with the guarantee of label differential privacy (DP).
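As background for label DP, a minimal sketch of the textbook baseline: adding Laplace noise to each label, which guarantees epsilon label-DP for labels of bounded range by the standard Laplace mechanism. This is an illustration of the setting, not the mechanism proposed in the paper; all function names here are hypothetical.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) via the inverse-CDF transform."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def randomize_labels(labels, epsilon, label_range, rng=None):
    """Add Laplace(label_range / epsilon) noise to each label; by the
    Laplace mechanism this gives epsilon label-DP for bounded labels."""
    rng = rng or random.Random(0)
    scale = label_range / epsilon
    return [y + laplace_noise(scale, rng) for y in labels]
```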
no code implementations • 21 Nov 2022 • Carson Denison, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Krishna Giri Narra, Amer Sinha, Avinash V Varadarajan, Chiyuan Zhang
A well-known algorithm in privacy-preserving ML is differentially private stochastic gradient descent (DP-SGD).
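The core DP-SGD update can be sketched in a few lines: clip each per-example gradient to a norm bound, average, and add Gaussian noise calibrated to that bound (a minimal scalar/list sketch of the standard algorithm, not the paper's specific variant):

```python
import math
import random

def dpsgd_step(params, per_example_grads, clip_norm, noise_mult, lr, rng):
    """One DP-SGD step: clip each example's gradient to clip_norm,
    average, and add Gaussian noise scaled to the clipping bound."""
    clipped = []
    for g in per_example_grads:
        norm = math.sqrt(sum(v * v for v in g))
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        clipped.append([v * scale for v in g])
    n = len(per_example_grads)
    mean = [sum(col) / n for col in zip(*clipped)]
    sigma = noise_mult * clip_norm / n  # noise std per coordinate
    noisy = [m + rng.gauss(0.0, sigma) for m in mean]
    return [p - lr * g for p, g in zip(params, noisy)]
```

Clipping bounds each example's influence on the update, which is what makes the added Gaussian noise sufficient for a differential privacy guarantee.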
no code implementations • 31 Oct 2022 • Daphne Ippolito, Florian Tramèr, Milad Nasr, Chiyuan Zhang, Matthew Jagielski, Katherine Lee, Christopher A. Choquette-Choo, Nicholas Carlini
Studying data memorization in neural language models helps us understand the risks (e.g., to privacy or copyright) associated with models regurgitating training data and aids in the development of countermeasures.
no code implementations • 30 Jun 2022 • Matthew Jagielski, Om Thakkar, Florian Tramèr, Daphne Ippolito, Katherine Lee, Nicholas Carlini, Eric Wallace, Shuang Song, Abhradeep Thakurta, Nicolas Papernot, Chiyuan Zhang
In memorization, models overfit specific training examples and become susceptible to privacy attacks.
no code implementations • 21 Jun 2022 • Nicholas Carlini, Matthew Jagielski, Chiyuan Zhang, Nicolas Papernot, Andreas Terzis, Florian Tramer
Machine learning models trained on private datasets have been shown to leak their private data.
1 code implementation • 26 May 2022 • Emmanuel Abbe, Samy Bengio, Elisabetta Cornacchia, Jon Kleinberg, Aryo Lotfi, Maithra Raghu, Chiyuan Zhang
More generally, the paper considers the learning of logical functions with gradient descent (GD) on neural networks.
1 code implementation • 15 Apr 2022 • Weiyan Shi, Ryan Shea, Si Chen, Chiyuan Zhang, Ruoxi Jia, Zhou Yu
Exploiting the fact that sensitive information in language data tends to be sparse, Shi et al. (2021) formalized an extension of DP called Selective Differential Privacy (SDP), which protects only the sensitive tokens defined by a policy function.
2 code implementations • 15 Feb 2022 • Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramer, Chiyuan Zhang
Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized training data verbatim.
no code implementations • 15 Oct 2021 • Yao Qin, Chiyuan Zhang, Ting Chen, Balaji Lakshminarayanan, Alex Beutel, Xuezhi Wang
We show that patch-based negative augmentation consistently improves robustness of ViTs across a wide set of ImageNet based robustness benchmarks.
4 code implementations • NeurIPS 2021 • Maithra Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, Alexey Dosovitskiy
Finally, we study the effect of (pretraining) dataset scale on intermediate features and transfer learning, and conclude with a discussion on connections to new architectures such as the MLP-Mixer.
2 code implementations • 27 Jul 2021 • Chiyuan Zhang, Maithra Raghu, Jon Kleinberg, Samy Bengio
In PVR, this is done by having one part of the task input act as a pointer, giving instructions on a different input location, which forms the output.
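The pointer-value-retrieval task described above can be sketched directly: the first input acts as a pointer into the rest of the input, and the label is the value at that position (a minimal version of the task family; the paper studies richer variants and aggregation functions):

```python
def pvr(digits):
    """Pointer Value Retrieval: the first digit points at a position
    in the remainder of the input; the output is the value stored there."""
    pointer, values = digits[0], digits[1:]
    return values[pointer]
```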
1 code implementation • ACL 2022 • Katherine Lee, Daphne Ippolito, Andrew Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini
As a result, over 1% of the unprompted output of language models trained on these datasets is copied verbatim from the training data.
no code implementations • 15 Mar 2021 • Piotr Teterwak, Chiyuan Zhang, Dilip Krishnan, Michael C. Mozer
We use our reconstruction model as a tool for exploring the nature of representations, including: the influence of model architecture and training objectives (specifically robust losses), the forms of invariance that networks achieve, representational differences between correctly and incorrectly classified images, and the effects of manipulating logits and images.
no code implementations • NeurIPS 2021 • Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi, Chiyuan Zhang
The Randomized Response (RR) algorithm is a classical technique to improve robustness in survey aggregation, and has been widely adopted in applications with differential privacy guarantees.
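The classical binary RR mechanism is easy to state: report the true bit with probability e^eps / (e^eps + 1), otherwise flip it, and debias the aggregate afterwards. A minimal sketch (the paper's contribution concerns optimal variants of RR, not this basic form):

```python
import math
import random

def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit

def debias_mean(reports, epsilon):
    """Unbiased estimate of the true mean: since E[report] =
    (1 - p) + true_mean * (2p - 1), invert that affine map."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return (sum(reports) / len(reports) - (1.0 - p)) / (2.0 * p - 1.0)
```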
1 code implementation • NeurIPS 2020 • Behnam Neyshabur, Hanie Sedghi, Chiyuan Zhang
One desired capability for machines is the ability to transfer their knowledge of one domain to another where data is (usually) scarce.
no code implementations • NeurIPS 2020 • Vitaly Feldman, Chiyuan Zhang
First, natural image and data distributions are (informally) known to be long-tailed, that is, to have a significant fraction of rare and atypical examples.
1 code implementation • 8 Feb 2020 • Ziheng Jiang, Chiyuan Zhang, Kunal Talwar, Michael C. Mozer
We obtain empirical estimates of this score for individual instances in multiple data sets, and we show that the score identifies out-of-distribution and mislabeled examples at one end of the continuum and strongly regular examples at the other end.
2 code implementations • NeurIPS 2019 • Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, Samy Bengio
Investigating the learned representations and features, we find that some of the differences from transfer learning are due to the over-parametrization of standard models rather than sophisticated feature reuse.
no code implementations • ICLR 2020 • Chiyuan Zhang, Samy Bengio, Moritz Hardt, Michael C. Mozer, Yoram Singer
We study the interplay between memorization and generalization of overparameterized networks in the extreme case of a single training example and an identity-mapping task.
2 code implementations • ICML Workshop Deep_Phenomen 2019 • Chiyuan Zhang, Samy Bengio, Yoram Singer
Broadly speaking, the layers of large deep neural networks can be categorized as either "robust" or "critical".
1 code implementation • 22 Sep 2018 • Tom B. Brown, Nicholas Carlini, Chiyuan Zhang, Catherine Olsson, Paul Christiano, Ian Goodfellow
We introduce a two-player contest for evaluating the safety and robustness of machine learning systems, with a large prize pool.
1 code implementation • 18 Apr 2018 • Chiyuan Zhang, Oriol Vinyals, Remi Munos, Samy Bengio
We conclude with a general discussion on overfitting in RL and a study of the generalization behaviors from the perspective of inductive bias.
no code implementations • ICML 2018 • Neil C. Rabinowitz, Frank Perbet, H. Francis Song, Chiyuan Zhang, S. M. Ali Eslami, Matthew Botvinick
We design a Theory of Mind neural network -- a ToMnet -- which uses meta-learning to build models of the agents it encounters, from observations of their behaviour alone.
no code implementations • 7 Jan 2018 • Chiyuan Zhang, Qianli Liao, Alexander Rakhlin, Brando Miranda, Noah Golowich, Tomaso Poggio
In Theory IIb, we characterize, with a mix of theory and experiments, the optimization of deep convolutional networks by Stochastic Gradient Descent.
7 code implementations • 10 Nov 2016 • Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance.
6 code implementations • 21 Apr 2016 • Tianqi Chen, Bing Xu, Chiyuan Zhang, Carlos Guestrin
In the extreme case, our analysis also shows that the memory consumption can be reduced to O(log n) with as little as O(n log n) extra cost for forward computation.
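The memory/compute trade-off rests on activation checkpointing: keep only every k-th activation during the forward pass and recompute each segment from its checkpoint during the backward pass. A toy scalar sketch of the idea on a chain of layers f_i(x) = sin(x + b_i) (illustrative only; the paper applies this to general computation graphs):

```python
import math

def forward_segment(x, biases):
    """Forward through layers f_i(x) = sin(x + b_i), returning all activations."""
    acts = [x]
    for b in biases:
        acts.append(math.sin(acts[-1] + b))
    return acts

def grad_full(x, biases):
    """Baseline backprop: store every activation (O(n) memory)."""
    acts = forward_segment(x, biases)
    g = 1.0
    for i in reversed(range(len(biases))):
        g *= math.cos(acts[i] + biases[i])  # local derivative needs the input
    return g

def grad_checkpointed(x, biases, k):
    """Store only every k-th activation; recompute each segment of k
    layers from its checkpoint during the backward pass."""
    ckpts = {0: x}
    cur = x
    for i, b in enumerate(biases):
        cur = math.sin(cur + b)
        if (i + 1) % k == 0:
            ckpts[i + 1] = cur
    g = 1.0
    starts = sorted(s for s in ckpts if s < len(biases))
    for s in reversed(starts):
        seg = biases[s:s + k]
        acts = forward_segment(ckpts[s], seg)  # recompute this segment only
        for j in reversed(range(len(seg))):
            g *= math.cos(acts[j] + seg[j])
    return g
```

With k ≈ sqrt(n) this stores O(sqrt(n)) activations at one extra forward pass; applying the scheme recursively is what drives memory down to O(log n) at O(n log n) extra forward cost.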
2 code implementations • 3 Dec 2015 • Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang
This paper describes both the API design and the system implementation of MXNet, and explains how embedding of both symbolic expression and tensor operation is handled in a unified fashion.
no code implementations • NeurIPS 2015 • Charlie Frogner, Chiyuan Zhang, Hossein Mobahi, Mauricio Araya-Polo, Tomaso Poggio
In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance.
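On a one-dimensional ordered label space, the Wasserstein (earth mover's) distance between two histograms has a closed form: the L1 distance between their CDFs. A minimal sketch of that special case (the paper's loss handles general multi-label ground metrics, which this does not):

```python
def wasserstein_1d(p, q):
    """W1 distance between two histograms on the same ordered, unit-spaced
    bins: the L1 distance between their cumulative distributions."""
    assert abs(sum(p) - sum(q)) < 1e-9  # requires equal total mass
    cdf_p = cdf_q = 0.0
    total = 0.0
    for a, b in zip(p, q):
        cdf_p += a
        cdf_q += b
        total += abs(cdf_p - cdf_q)
    return total
```

Unlike a bin-wise loss, this distance grows with how far mass must be moved, which is what lets the loss exploit similarity structure among labels.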
no code implementations • 16 Jun 2014 • Georgios Evangelopoulos, Stephen Voinea, Chiyuan Zhang, Lorenzo Rosasco, Tomaso Poggio
Recognition of speech, and in particular the ability to generalize and learn from small sets of labelled examples like humans do, depends on an appropriate representation of the acoustic input.
no code implementations • 1 Apr 2014 • Chiyuan Zhang, Georgios Evangelopoulos, Stephen Voinea, Lorenzo Rosasco, Tomaso Poggio
We present the main theoretical and computational aspects of a framework for unsupervised learning of invariant audio representations, empirically evaluated on music genre classification.
no code implementations • NeurIPS 2012 • Binbin Lin, Sen yang, Chiyuan Zhang, Jieping Ye, Xiaofei He
MTVFL has the following key properties: (1) the vector fields we learn are close to the gradient fields of the prediction functions; (2) within each task, the vector field is required to be as parallel as possible, which is expected to span a low-dimensional subspace; (3) the vector fields from all tasks share a low-dimensional subspace.
no code implementations • NeurIPS 2011 • Binbin Lin, Chiyuan Zhang, Xiaofei He
To achieve this goal, we show that the second order smoothness measures the linearity of the function, and the gradient field of a linear function has to be a parallel vector field.