Search Results for author: Florentina Bunea

Found 11 papers, 3 papers with code

Estimation and inference for the Wasserstein distance between mixing measures in topic models

no code implementations • 26 Jun 2022 • Xin Bing, Florentina Bunea, Jonathan Niles-Weed

Our results establish this metric to be a canonical choice.

Likelihood estimation of sparse topic distributions in topic models and its applications to Wasserstein document distance calculations

no code implementations • 12 Jul 2021 • Xin Bing, Florentina Bunea, Seth Strimas-Mackey, Marten Wegkamp

When $A$ is unknown, we estimate $T$ by optimizing the likelihood function corresponding to a generic plug-in estimator $\hat{A}$ of $A$.

Topic Models

Prediction in latent factor regression: Adaptive PCR and beyond

no code implementations • 20 Jul 2020 • Xin Bing, Florentina Bunea, Seth Strimas-Mackey, Marten Wegkamp

Our primary contribution is in establishing finite sample risk bounds for prediction with the ubiquitous Principal Component Regression (PCR) method, under the factor regression model, with the number of principal components adaptively selected from the data -- a form of theoretical guarantee that is surprisingly lacking from the PCR literature.

Model Selection regression

Interpolating Predictors in High-Dimensional Factor Regression

no code implementations • 6 Feb 2020 • Florentina Bunea, Seth Strimas-Mackey, Marten Wegkamp

If the effective rank of the covariance matrix $\Sigma$ of the $p$ regression features is much larger than the sample size $n$, we show that the min-norm interpolating predictor is not desirable, as its risk approaches the risk of trivially predicting the response by 0.

regression Vocal Bursts Intensity Prediction

Optimal estimation of sparse topic models

no code implementations • 22 Jan 2020 • Xin Bing, Florentina Bunea, Marten Wegkamp

We derive a finite sample upper bound for our estimator, and show that it matches the minimax lower bound in many scenarios.

Dimensionality Reduction Topic Models +1

High-Dimensional Inference for Cluster-Based Graphical Models

no code implementations • 13 Jun 2018 • Carson Eisenach, Florentina Bunea, Yang Ning, Claudiu Dinicu

We employ model assisted clustering, in which the clusters contain features that are similar to the same unobserved latent variable.

Clustering Vocal Bursts Intensity Prediction

A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics

1 code implementation • 17 May 2018 • Xin Bing, Florentina Bunea, Marten Wegkamp

We propose a new method of estimation in topic models that is not a variation on the existing simplex-finding algorithms and that estimates the number of topics $K$ from the observed data.

Topic Models valid

Adaptive Estimation in Structured Factor Models with Applications to Overlapping Clustering

no code implementations • 23 Apr 2017 • Xin Bing, Florentina Bunea, Yang Ning, Marten Wegkamp

This work introduces a novel estimation method, called LOVE, of the entries and structure of a loading matrix $A$ in a sparse latent factor model $X = AZ + E$, for an observable random vector $X \in \mathbb{R}^p$, with correlated unobservable factors $Z \in \mathbb{R}^K$, with $K$ unknown, and independent noise $E$. Each row of $A$ is scaled and sparse.

Clustering

PECOK: a convex optimization approach to variable clustering

1 code implementation • 16 Jun 2016 • Florentina Bunea, Christophe Giraud, Martin Royer, Nicolas Verzelen

The problem of variable clustering is that of grouping similar components of a $p$-dimensional vector $X=(X_{1},\ldots, X_{p})$, and estimating these groups from $n$ independent copies of $X$.

Statistics Theory

Model Assisted Variable Clustering: Minimax-optimal Recovery and Algorithms

1 code implementation • 8 Aug 2015 • Florentina Bunea, Christophe Giraud, Xi Luo, Martin Royer, Nicolas Verzelen

We quantify the difficulty of clustering data generated from a G-block covariance model in terms of cluster proximity, measured with respect to two related, but different, cluster separation metrics.

Clustering

Convex Banding of the Covariance Matrix

no code implementations • 23 May 2014 • Jacob Bien, Florentina Bunea, Luo Xiao

Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results.
