Papers

My research is supported in part by the Simons Foundation. Previously, my research was supported in part by the National Science Foundation (CAREER award DMS-1653017; DMS-1405746) and the National Institutes of Health (R01 GM123993).

Preprints

Yiling Huang, Snigdha Panigrahi, Guo Yu, and Jacob Bien (2025) Reluctant Interaction Inference after Additive Modeling [pdf]
Xiaozhu Zhang, Jacob Bien, and Armeen Taeb (2025) Quantifying Uncertainty and Stability Among Highly Correlated Predictors: A Subspace Perspective [pdf] [software] [code to reproduce all results]
Sangwon Hyun, Tim Coleman, Francois Ribalet, and Jacob Bien (2025) Trend Filtered Mixture of Experts for Automated Gating of High-Frequency Flow Cytometry Data [pdf] [software]
Ameer Dharamshi, Anna Neufeld, Lucy Gao, Daniela Witten, and Jacob Bien (2025) Thinning a Wishart Random Matrix [pdf] [software]
Oh-Ran Kwon, Gourab Mukherjee, and Jacob Bien (2024) Semi-Supervised Learning of Noisy Mixture of Experts Models [pdf]
Gregory Faletto and Jacob Bien (2022) Cluster Stability Selection [pdf] [software]
Guo Yu, Jacob Bien, and Ryan Tibshirani (2019) Reluctant Interaction Modeling [pdf] [software]
Jacob Bien (2016) Simulator: An Engine to Streamline Simulations [pdf] [website]

Publications

Ronan Perry, Snigdha Panigrahi, Jacob Bien, and Daniela Witten (2025) Inference on the Proportion of Variance Explained in Principal Component Analysis, accepted to Journal of the American Statistical Association [pdf]
Ameer Dharamshi, Anna Neufeld, Lucy Gao, Jacob Bien, and Daniela Witten (2025) Decomposing Gaussians with Unknown Covariance, accepted to Biometrika [pdf] [software]
Anna Neufeld, Ameer Dharamshi, Lucy L Gao, Daniela Witten, and Jacob Bien (2025) Discussion of “Data Fission: Splitting a Single Data Point”, Journal of the American Statistical Association [pdf] [software]
Jacob Bien and Gourab Mukherjee (2025) Generative AI for Data Science 101: Coding Without Learning to Code, Journal of Statistics and Data Science Education [pdf]
Adel Javanmard, Simeng Shao, and Jacob Bien (2025) Prediction Sets for High-Dimensional Mixture of Experts Models, Journal of the Royal Statistical Society, Series B [pdf]
Ameer Dharamshi, Anna Neufeld, Keshav Motwani, Lucy L Gao, Daniela Witten, and Jacob Bien (2025) Generalized Data Thinning using Sufficient Statistics, Journal of the American Statistical Association [pdf] [software]
Arkajyoti Saha, Daniela Witten, and Jacob Bien (2025) Inferring Independent Sets of Gaussian Variables after Thresholding Correlations, Journal of the American Statistical Association [pdf] [software]
Lucy L Gao, Jacob Bien, and Daniela Witten (2024) Selective Inference for Hierarchical Clustering, Journal of the American Statistical Association [pdf] [website] [software]
Simeng Shao, Jacob Bien, and Adel Javanmard (2024) Controlling the False Split Rate in Tree-Based Aggregation, Journal of the American Statistical Association [software]
Gregory Faletto and Jacob Bien (2023) Predicting Rare Events by Shrinking Towards Proportional Odds, International Conference on Machine Learning 2023 [pdf] [software]
Sangwon Hyun, Mattias Rolf Cape, Francois Ribalet, and Jacob Bien (2023) Modeling Cell Populations Measured by Flow Cytometry with Covariates using Sparse Mixture of Regressions, Annals of Applied Statistics [pdf] [software]
Andee Kaplan and Jacob Bien (2023) Interactive Exploration of Large Dendrograms with Prototypes, The American Statistician [pdf] [software] [code for paper examples]
Bror Fredrik Jönsson, Christopher Follett, Jacob Bien, Stephanie Dutkiewicz, Sangwon Hyun, Gemma Kulk, Gael Forget, Christian Müller, Marie-Fanny Racault, Christopher Nigel Hill, Thomas Jackson, and Shubha Sathyendranath (2023) Using Probability Density Functions to Evaluate Models (PDFEM, V1.0) to Compare a Biogeochemical Model with Satellite Derived Chlorophyll, Geoscientific Model Development [pdf]
Ryan Reynolds, Sangwon Hyun, Benjamin Tully, Jacob Bien, and Naomi M Levine (2023) Identification of Microbial Metabolic Functional Guilds from Large Genomic Datasets, Frontiers in Microbiology [pdf]
Ines Wilms, Sumanta Basu, Jacob Bien, and David S Matteson (2023) Sparse Identification and Estimation of Large-Scale Vector Autoregressive Moving Averages, Journal of the American Statistical Association [pdf] [software]
Ines Wilms and Jacob Bien (2022) Tree-Based Node Aggregation in Sparse Graphical Models, The Journal of Machine Learning Research [pdf] [software]
Evan L. Ray, Logan C. Brooks, Jacob Bien, Matthew Biggerstaff, Nikos I. Bosse, Johannes Bracher, Estee Y. Cramer, Sebastian Funk, Aaron Gerding, Michael A. Johansson, Aaron Rumack, Yijin Wang, Martha Zorn, Ryan J. Tibshirani, and Nicholas G. Reich (2022) Comparing Trained and Untrained Probabilistic Ensemble Forecasts of COVID-19 Cases and Deaths in the United States, International Journal of Forecasting [pdf]
Sangwon Hyun, Aditya Mishra, Christopher L. Follett, Bror Jonsson, Gemma Kulk, Gael Forget, Marie-Fanny Racault, Thomas Jackson, Stephanie Dutkiewicz, Christian L. Müller, and Jacob Bien (2022) Ocean Mover’s Distance: Using Optimal Transport for Analyzing Oceanographic Data, Proceedings of the Royal Society A [pdf] [software]
Estee Y. Cramer, …, Jacob Bien, …, and Nicholas G. Reich (2022) Evaluation of Individual and Ensemble Probabilistic Forecasts of COVID-19 Mortality in the US, Proceedings of the National Academy of Sciences [pdf]
Lucy L. Gao, Daniela Witten, and Jacob Bien (2022) Testing for Association in Multiview Network Data, Biometrics [pdf] [software] [code to reproduce all results]
Guo Yu, Daniela Witten, and Jacob Bien (2022) Controlling Costs: Feature Selection on a Budget, Stat [pdf]
Daniel J. McDonald, Jacob Bien, Alden Green, Addison J. Hu, Nat DeFries, Sangwon Hyun, Natalia L. Oliveira, James Sharpnack, Jingjing Tang, Robert Tibshirani, Valérie Ventura, Larry Wasserman, and Ryan J. Tibshirani (2021) Can Auxiliary Indicators Improve COVID-19 Forecasting and Hotspot Prediction?, Proceedings of the National Academy of Sciences [pdf] [supplement] [code to reproduce all results]
Alex Reinhart, …, Jacob Bien, …, Roni Rosenfeld, and Ryan Tibshirani (2021) An Open Repository of Real-Time COVID-19 Indicators, Proceedings of the National Academy of Sciences [pdf] [supplement] [code to reproduce all results]
Jacob Bien, Xiaohan Yan, Leo Simpson, and Christian Müller (2021) Tree-Aggregated Predictive Modeling of Microbiome Data, Scientific Reports [pdf] [software] [code to reproduce all results]
Xiaohan Yan and Jacob Bien (2021) Rare Feature Selection in High Dimensions, Journal of the American Statistical Association [pdf] [software] [vignette]
William B Nicholson, Ines Wilms, Jacob Bien, and David S Matteson (2020) High Dimensional Forecasting via Interpretable Vector Autoregression, The Journal of Machine Learning Research [pdf] [software]
Shuxiao Chen and Jacob Bien (2020) Valid Inference Corrected for Outlier Removal, Journal of Computational and Graphical Statistics [pdf] [software]
Lucy L Gao, Jacob Bien, and Daniela Witten (2020) Are Clusterings of Multiple Data Views Independent?, Biostatistics [pdf] [software] [code to reproduce all results]
Guo Yu, Jacob Bien, and Daniela Witten (2019) Discussion of “Covariate-Assisted Ranking and Screening for Large-Scale Two-Sample Inference”, Journal of the Royal Statistical Society, Series B [pdf] [supplement]
Guo Yu and Jacob Bien (2019) Estimating the Error Variance in a High-Dimensional Linear Model, Biometrika [pdf] [journal] [software] [vignette]
Jacob Bien (2019) Graph-Guided Banding of the Covariance Matrix, Journal of the American Statistical Association [pdf] [software]
James Li, Jacob Bien, and Martin T Wells (2018) rTensor: An R Package for Multidimensional Array (Tensor) Unfolding, Multiplication, and Decomposition, Journal of Statistical Software [pdf]
Jacob Bien, Irina Gaynanova, Johannes Lederer, and Christian L Müller (2019) Prediction Error Bounds for Linear Regression with the TREX, Test [pdf]
Jacob Bien, Irina Gaynanova, Johannes Lederer, and Christian L Müller (2018) Non-Convex Global Minimization and False Discovery Rate Control for the TREX, Journal of Computational and Graphical Statistics [pdf] [software]
Xiaohan Yan and Jacob Bien (2017) Hierarchical Sparse Modeling: A Choice of Two Group Lasso Formulations, Statistical Science [pdf] [software]
Guo Yu and Jacob Bien (2017) Learning Local Dependence in Ordered Data, Journal of Machine Learning Research [pdf] [software] [vignette]
Ines Wilms, Sumanta Basu, Jacob Bien, and David S Matteson (2017) Interpretable Vector AutoRegressions with Exogenous Time Series, arXiv preprint arXiv:1711.03623 [pdf]
William B Nicholson, David S Matteson, and Jacob Bien (2017) VARX-L: Structured Regularization for Large Vector Autoregressions with Exogenous Variables, International Journal of Forecasting [pdf] [software]
Yin Lou, Jacob Bien, Rich Caruana, and Johannes Gehrke (2016) Sparse Partially Linear Additive Models, Journal of Computational and Graphical Statistics [pdf] [software]
Jacob Bien, Florentina Bunea, and Luo Xiao (2016) Convex Banding of the Covariance Matrix, Journal of the American Statistical Association [pdf] [software] [vignette]
Jacob Bien and Daniela Witten (2016) Penalized Estimation in Complex Models (book chapter) in P. Bühlmann, P. Drineas, M. Kane, and M. van der Laan (Eds.), Handbook of Big Data. Chapman and Hall/CRC Reference. [link]
Jacob Bien, Noah Simon, and Robert Tibshirani (2015) Convex Hierarchical Testing of Interactions, Annals of Applied Statistics [pdf] [software]
Jacob Bien, Jonathan Taylor, and Robert Tibshirani (2013) A Lasso for Hierarchical Interactions, Annals of Statistics [pdf] [software]
Jacob Bien and Marten Wegkamp (2013) Discussion of “Correlated Variables in Regression: Clustering and Sparse Estimation”, Journal of Statistical Planning and Inference
Jacob Bien and Robert Tibshirani (2011) Hierarchical Clustering with Prototypes via Minimax Linkage, Journal of the American Statistical Association [software]
Jacob Bien and Robert Tibshirani (2011) Sparse Estimation of a Covariance Matrix, Biometrika [software]
Robert Tibshirani, Jacob Bien, Jerome Friedman, Trevor Hastie, Noah Simon, Jonathan Taylor, and Ryan J Tibshirani (2012) Strong Rules for Discarding Predictors in Lasso-Type Problems, Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Jacob Bien and Robert Tibshirani (2011) Prototype Selection for Interpretable Classification, Annals of Applied Statistics [software]
Neema Moraveji, Daniel Russell, Jacob Bien, and David Mease (2011) Measuring Improvement in User Search Performance Resulting from Optimal Search Tips, Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information [abstract]
Jacob Bien, Yan Xu, and Michael Mahoney (2010) CUR from a Sparse Optimization Viewpoint, Advances in Neural Information Processing Systems 23 [pdf]