Exchangeable Variational Autoencoders with Applications to Genomic Data

Exchangeable-structured datapoints are ubiquitous in statistical problems, ranging from point clouds to graphs to sets. Particularly in biological settings, where multiple experiments derived from a noisy scientific process attempt to measure a latent variable of interest, experimental datapoints are often exchangeable, demanding the development of methods that can exploit this structure. Modern machine learning approaches to scalable Bayesian inference typically use autoencoding variational Bayes, marrying ideas from deep learning and probabilistic modeling to achieve practical inference for expressive models. However, current VAE-based approaches do not naturally handle exchangeable (but non-iid) datapoints. Moreover, exchangeable-structured datapoints are often heterogeneous in dimension, precluding a straightforward application of the vanilla VAE framework. In this work, we develop the Exchangeable Variational Autoencoder, which provides inferential and computational benefits while enabling data with varying set sizes to be handled robustly in the VAE framework. We then demonstrate its efficacy in two settings: (1) on the well-studied Latent Dirichlet Allocation model and (2) on the bootstrapped, isoform-level uncertainty estimates of single-cell RNA-seq data.
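
The abstract does not describe the encoder architecture, so the following is only an illustrative sketch of one common way to make an encoder insensitive to element order and set size: embed each element independently, then mean-pool the embeddings before producing the parameters of the approximate posterior. The function names, layer sizes, and the choice of mean pooling are all assumptions made for illustration, not the paper's method.

```python
import math
import random

def make_weights(rows, cols, rng):
    """Small random weight matrix as a list of rows (hypothetical init)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def affine_tanh(x, weights):
    """Apply tanh(W x) to a single feature vector x."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

def encode_set(points, w_embed, w_mean, w_logvar):
    """Map a set of feature vectors to the mean and log-variance of q(z | set).

    Each element is embedded independently and the embeddings are averaged.
    Averaging is symmetric in its arguments, so the result does not depend
    on element order, and it is defined for any non-empty set size.
    """
    embedded = [affine_tanh(p, w_embed) for p in points]
    n = len(embedded)
    pooled = [sum(col) / n for col in zip(*embedded)]
    mean = [sum(w * v for w, v in zip(row, pooled)) for row in w_mean]
    logvar = [sum(w * v for w, v in zip(row, pooled)) for row in w_logvar]
    return mean, logvar

# Assumed toy dimensions, chosen only for the demonstration.
rng = random.Random(0)
IN_DIM, HID_DIM, LATENT_DIM = 3, 8, 2
w_embed = make_weights(HID_DIM, IN_DIM, rng)
w_mean = make_weights(LATENT_DIM, HID_DIM, rng)
w_logvar = make_weights(LATENT_DIM, HID_DIM, rng)

# Two sets of different sizes share the same encoder weights.
small_set = [[rng.gauss(0, 1) for _ in range(IN_DIM)] for _ in range(4)]
large_set = [[rng.gauss(0, 1) for _ in range(IN_DIM)] for _ in range(11)]

mean_a, logvar_a = encode_set(small_set, w_embed, w_mean, w_logvar)
mean_b, _ = encode_set(list(reversed(small_set)), w_embed, w_mean, w_logvar)
mean_c, _ = encode_set(large_set, w_embed, w_mean, w_logvar)
```

Here `mean_a` and `mean_b` agree up to floating-point rounding because reversing the set only reorders the terms of the average, and `mean_c` has the same length as `mean_a` even though the two sets differ in size. Other symmetric pooling functions (sum, max) would give the same order-invariance; mean pooling is used here only because it keeps the pooled scale independent of set size.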
