no code implementations • 23 Apr 2024 • Gavin Brown, Jonathan Hayase, Samuel Hopkins, Weihao Kong, Xiyang Liu, Sewoong Oh, Juan C. Perdomo, Adam Smith
We present a sample- and time-efficient differentially private algorithm for ordinary least squares, with error that depends linearly on the dimension and is independent of the condition number of $X^\top X$, where $X$ is the design matrix.
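The excerpt states the guarantee only; as a rough illustration of the general family of approaches, here is a minimal sketch of DP-style linear regression via perturbed sufficient statistics. The noise scales `sigma_cov` and `sigma_xy` are placeholder assumptions, not the paper's calibration, and this is not the authors' condition-number-free algorithm.

```python
import numpy as np

def dp_ols_sketch(X, y, rng, sigma_cov=1.0, sigma_xy=1.0, ridge=1e-3):
    """Toy sketch of DP-style OLS: perturb the sufficient statistics.

    sigma_cov / sigma_xy are placeholder noise scales, NOT a calibrated
    privacy guarantee; a real guarantee would tie them to (eps, delta)
    and to norm bounds on the data.
    """
    n, d = X.shape
    E = rng.normal(scale=sigma_cov, size=(d, d))
    E = (E + E.T) / 2.0                       # symmetric noise for X^T X
    noisy_cov = X.T @ X + E
    noisy_xy = X.T @ y + rng.normal(scale=sigma_xy, size=d)
    # A small ridge term keeps the noisy covariance invertible.
    return np.linalg.solve(noisy_cov + ridge * n * np.eye(d), noisy_xy)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
beta = np.arange(1.0, 6.0)
y = X @ beta + rng.normal(size=2000)
print(dp_ols_sketch(X, y, rng))               # close to [1, 2, 3, 4, 5]
```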
no code implementations • 28 Nov 2023 • Weihao Kong, Mingda Qiao, Rajat Sen
We study the problem of recovering Gaussian data under adversarial corruptions when the noise is low-rank and the corruptions occur at the coordinate level.
no code implementations • 14 Nov 2023 • Reese Pathak, Rajat Sen, Weihao Kong, Abhimanyu Das
In this work, we investigate the hypothesis that transformers can learn an optimal predictor for mixtures of regressions.
1 code implementation • 14 Oct 2023 • Abhimanyu Das, Weihao Kong, Rajat Sen, Yichen Zhou
Motivated by recent advances in large language models for Natural Language Processing (NLP), we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of public datasets comes close to the accuracy of state-of-the-art supervised forecasting models for each individual dataset.
no code implementations • 5 Sep 2023 • Ayush Jain, Rajat Sen, Weihao Kong, Abhimanyu Das, Alon Orlitsky
A common approach assumes that the sources fall in one of several unknown subgroups, each with an unknown input distribution and input-output relationship.
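A minimal sketch of the data model this excerpt describes, with hypothetical parameter values: each source is assigned to one of $k$ unknown subgroups, and each subgroup has its own input distribution and (here, assumed linear) input-output relationship.

```python
import numpy as np

rng = np.random.default_rng(1)
k, d, n_sources, n_per_source = 3, 4, 10, 20

# Each subgroup has its own input mean and regression vector
# (hypothetical values for illustration).
means = rng.normal(size=(k, d))
betas = rng.normal(size=(k, d))

data = []
for _ in range(n_sources):
    g = rng.integers(k)                       # unknown subgroup of this source
    X = means[g] + rng.normal(size=(n_per_source, d))
    y = X @ betas[g] + 0.1 * rng.normal(size=n_per_source)
    data.append((X, y))                       # the learner sees only (X, y)
```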
3 code implementations • 17 Apr 2023 • Abhimanyu Das, Weihao Kong, Andrew Leach, Shaan Mathur, Rajat Sen, Rose Yu
Recent work has shown that simple linear models can outperform several Transformer-based approaches in long-term time-series forecasting.
Ranked #3 on Time Series Forecasting on ETTh2 (192) Multivariate
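For context, a minimal sketch of the kind of simple linear baseline referred to above: one least-squares linear map from a lookback window to a forecast horizon, fit on sliding windows. The `lookback` and `horizon` values are arbitrary; this illustrates the idea, not the paper's model.

```python
import numpy as np

def fit_linear_forecaster(series, lookback=32, horizon=8):
    """Least-squares linear map from a lookback window to a horizon window."""
    n_windows = len(series) - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n_windows)])
    Y = np.stack([series[i + lookback:i + lookback + horizon]
                  for i in range(n_windows)])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)   # shape (lookback, horizon)
    return W

rng = np.random.default_rng(2)
t = np.arange(500)
series = np.sin(t / 10.0) + 0.1 * rng.normal(size=500)
W = fit_linear_forecaster(series)
forecast = series[-32:] @ W                     # predicts the next 8 steps
```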
no code implementations • 19 Feb 2023 • Jonathan N. Lee, Weihao Kong, Aldo Pacchiano, Vidya Muthukumar, Emma Brunskill
Whether this is possible for more realistic context distributions has remained an open and important question for tasks such as model selection.
no code implementations • 30 Jan 2023 • Xiyang Liu, Prateek Jain, Weihao Kong, Sewoong Oh, Arun Sai Suggala
Under label-corruption, this is the first efficient linear regression algorithm to guarantee both $(\varepsilon,\delta)$-DP and robustness.
no code implementations • 23 Nov 2022 • Abhimanyu Das, Ayush Jain, Weihao Kong, Rajat Sen
We begin the study of list-decodable linear regression using batches.
no code implementations • 9 Jun 2022 • Pranjal Awasthi, Abhimanyu Das, Weihao Kong, Rajat Sen
We study the problem of learning generalized linear models under adversarial corruptions.
no code implementations • 27 May 2022 • Xiyang Liu, Weihao Kong, Prateek Jain, Sewoong Oh
For sub-Gaussian data, we provide nearly optimal statistical error rates even for $n=\tilde O(d)$.
no code implementations • 21 Apr 2022 • Abhimanyu Das, Weihao Kong, Biswajit Paria, Rajat Sen
Probabilistic, hierarchically coherent forecasting is a key problem in many practical forecasting applications -- the goal is to obtain coherent probabilistic predictions for a large number of time series arranged in a pre-specified tree hierarchy.
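To make the coherence constraint concrete, here is a minimal sketch over a hypothetical two-level tree: forecasts are produced at the leaves and aggregated bottom-up, so every parent equals the sum of its children by construction. This illustrates coherence itself, not the paper's probabilistic method.

```python
import numpy as np

# Hypothetical hierarchy: total -> {A, B}, A -> {A1, A2}, B -> {B1, B2}.
children = {"total": ["A", "B"], "A": ["A1", "A2"], "B": ["B1", "B2"]}
leaf_forecasts = {"A1": np.array([1.0, 2.0]),
                  "A2": np.array([0.5, 0.5]),
                  "B1": np.array([2.0, 1.0]),
                  "B2": np.array([1.0, 1.0])}

def coherent(node):
    """Bottom-up aggregation: every parent is the sum of its children."""
    if node not in children:
        return leaf_forecasts[node]
    return sum(coherent(c) for c in children[node])

print(coherent("total"))   # equals A1 + A2 + B1 + B2 by construction
```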
no code implementations • 12 Nov 2021 • Xiyang Liu, Weihao Kong, Sewoong Oh
The key insight is that if we design an exponential mechanism that accesses the data only via one-dimensional robust statistics, then the resulting local sensitivity can be dramatically reduced.
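A minimal sketch of that pattern for one concrete one-dimensional statistic, the median: each candidate output is scored by a rank-based utility with sensitivity 1, and the output is sampled via the exponential mechanism. This illustrates the general recipe rather than the paper's specific mechanism.

```python
import numpy as np

def exp_mech_median(x, candidates, eps, rng):
    """Exponential mechanism for the median of x over a candidate grid.

    Utility = -|#points below candidate - n/2|; changing one data point
    changes this utility by at most 1, so its sensitivity is 1.
    """
    n = len(x)
    utilities = -np.abs(np.searchsorted(np.sort(x), candidates) - n / 2.0)
    weights = np.exp(eps * utilities / 2.0)   # /2 for a sensitivity-1 utility
    weights /= weights.sum()
    return rng.choice(candidates, p=weights)

rng = np.random.default_rng(3)
x = rng.normal(loc=5.0, size=1000)
candidates = np.linspace(0.0, 10.0, 201)
print(exp_mech_median(x, candidates, eps=1.0, rng=rng))   # near 5.0
```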
no code implementations • 6 Jun 2021 • Zhen Miao, Weihao Kong, Ramya Korlakai Vinayak, Wei Sun, Fang Han
This paper investigates the theoretical and empirical performance of Fisher-Pitman-type permutation tests for assessing the equality of unknown Poisson mixture distributions.
1 code implementation • 22 Apr 2021 • Jonathan Hayase, Weihao Kong, Raghav Somani, Sewoong Oh
There have been promising attempts to use the intermediate representations of such a model to separate corrupted examples from clean ones.
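As a rough sketch of the general idea (not the paper's method), one can whiten the intermediate representations and flag the examples with the largest Mahalanobis-style scores as suspect:

```python
import numpy as np

def outlier_scores(feats, reg=1e-3):
    """Mahalanobis-style scores on intermediate representations.

    A crude sketch: corrupted examples often sit far from the clean
    bulk in feature space, so large scores are treated as suspect.
    """
    centered = feats - feats.mean(axis=0)
    cov = centered.T @ centered / len(feats) + reg * np.eye(feats.shape[1])
    inv = np.linalg.inv(cov)
    return np.einsum("ij,jk,ik->i", centered, inv, centered)

rng = np.random.default_rng(4)
clean = rng.normal(size=(500, 16))
poisoned = rng.normal(loc=4.0, size=(10, 16))   # shifted cluster
feats = np.vstack([clean, poisoned])
scores = outlier_scores(feats)
print(np.argsort(scores)[-10:])                 # indices 500..509 dominate
```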
1 code implementation • NeurIPS 2021 • Xiyang Liu, Weihao Kong, Sham Kakade, Sewoong Oh
In statistical learning and analysis from shared data, which is increasingly adopted in frameworks such as federated learning and meta-learning, there are two major concerns: privacy and robustness.
no code implementations • 19 Nov 2020 • Jonathan N. Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill
Towards this end, we consider the problem of model selection in RL with function approximation, given a set of candidate RL algorithms with known regret guarantees.
no code implementations • NeurIPS 2020 • Weihao Kong, Raghav Somani, Sham Kakade, Sewoong Oh
Together, this approach is robust against outliers and achieves a graceful statistical trade-off; the lack of $\Omega(k^{1/2})$-size tasks can be compensated for with smaller tasks, which can now be as small as $O(\log k)$.
no code implementations • ICML 2020 • Weihao Kong, Raghav Somani, Zhao Song, Sham Kakade, Sewoong Oh
In modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labeled data.
no code implementations • 12 Dec 2019 • Weihao Kong, Gregory Valiant, Emma Brunskill
We study the problem of estimating the expected reward of the optimal policy in the stochastic disjoint linear bandit setting.
no code implementations • 28 Nov 2019 • Ramya Korlakai Vinayak, Weihao Kong, Sham M. Kakade
Given these paired observations, $\{(X_i, Y_i)\}_{i=1}^N$, our goal is to accurately estimate the \emph{distribution of the change in parameters}, $\delta_i := q_i - p_i$, over the population, as well as properties of interest such as the \emph{$\ell_1$-magnitude of the change}, from sparse observations ($t \ll N$).
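A minimal simulation of this setup with hypothetical values: the naive plug-in estimate $\hat{\delta}_i = Y_i/t - X_i/t$ is unbiased for each $\delta_i$, but its $\ell_1$-magnitude is badly inflated when $t$ is small, which is what motivates estimating the distribution of $\delta_i$ directly.

```python
import numpy as np

rng = np.random.default_rng(5)
N, t = 100000, 5
p = rng.beta(2, 5, size=N)                       # hypothetical population p_i
q = np.clip(p + rng.normal(scale=0.05, size=N), 0, 1)
delta = q - p

X = rng.binomial(t, p)                           # t draws per entity, before
Y = rng.binomial(t, q)                           # t draws per entity, after
delta_hat = Y / t - X / t                        # plug-in estimate of delta_i

print("true    E|delta|:", np.abs(delta).mean())
print("plug-in E|delta|:", np.abs(delta_hat).mean())   # inflated for small t
```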
no code implementations • 12 Feb 2019 • Ramya Korlakai Vinayak, Weihao Kong, Gregory Valiant, Sham M. Kakade
Precisely, for sufficiently large $N$, the MLE achieves the information-theoretically optimal error bound of $\mathcal{O}(\frac{1}{t})$ for $t < c\log{N}$, with respect to the earth mover's distance (between the estimated and true distributions).
no code implementations • 31 May 2018 • Ilias Diakonikolas, Weihao Kong, Alistair Stewart
An error of $\Omega (\epsilon \sigma)$ is information-theoretically necessary, even with infinite sample size.
no code implementations • NeurIPS 2018 • Weihao Kong, Gregory Valiant
In this setting, we show that with $O(\sqrt{d})$ samples, one can accurately estimate the fraction of the variance of the label that can be explained via the best linear function of the data.
no code implementations • NeurIPS 2017 • Kevin Tian, Weihao Kong, Gregory Valiant
Consider the following estimation problem: there are $n$ entities, each with an unknown parameter $p_i \in [0, 1]$, and we observe $n$ independent random variables, $X_1, \ldots, X_n$, with $X_i \sim \mathrm{Binomial}(t, p_i)$.
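A minimal simulation of this setup, assuming a hypothetical Beta population for the $p_i$: the naive plug-in distribution of $X_i/t$ is over-dispersed relative to the true distribution of the $p_i$ when $t$ is small, which is the difficulty the paper addresses.

```python
import numpy as np

rng = np.random.default_rng(6)
n, t = 50000, 10
p = rng.beta(1, 3, size=n)          # hypothetical true parameter population
X = rng.binomial(t, p)              # one Binomial(t, p_i) observation each

# Naive plug-in: the empirical distribution of X_i / t. For small t this
# is over-dispersed relative to the true distribution of the p_i.
print("true  std of p_i:", p.std())
print("naive std of X/t:", (X / t).std())
```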
no code implementations • 21 Feb 2016 • Qingqing Huang, Sham M. Kakade, Weihao Kong, Gregory Valiant
When can accurate reconstruction be accomplished in the sparse data regime?
1 code implementation • 30 Jan 2016 • Weihao Kong, Gregory Valiant
We consider this fundamental recovery problem in the regime where the number of samples is comparable to, or even sublinear in, the dimensionality of the distribution in question.
no code implementations • NeurIPS 2012 • Weihao Kong, Wu-Jun Li
Most existing hashing methods use projection functions to map the original data to several real-valued dimensions, and then quantize each projected dimension into one bit (zero or one) by thresholding.
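A minimal sketch of exactly that projection-then-threshold scheme, using random Gaussian projections with sign thresholding (the classic one-bit-per-dimension baseline the excerpt describes):

```python
import numpy as np

def hash_bits(X, n_bits, rng):
    """Project data onto random directions, then threshold each
    projected dimension into one bit (classic one-bit quantization)."""
    W = rng.normal(size=(X.shape[1], n_bits))   # random projection functions
    return (X @ W > 0).astype(np.uint8)         # threshold at zero

rng = np.random.default_rng(7)
X = rng.normal(size=(4, 8))
codes = hash_bits(X, n_bits=16, rng=rng)
# Hamming distance between codes approximates angular distance between rows.
print(np.count_nonzero(codes[0] != codes[1]))
```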