Erin Craig

Assistant Professor of Biostatistics, University of Michigan

I am an assistant professor at the University of Michigan’s Department of Biostatistics. My work focuses on improving healthcare through:

My research applications are most often in cancer and immunology, but I am generally driven to solve problems that will help people.

I earned a PhD in Biomedical Data Science from Stanford, where I was advised by Rob Tibshirani. I also hold a BA in Mathematics and an MS in Data Science from New College of Florida, where I was advised by Pat McDonald and Gary Kalmanovich.

Before my PhD, I led the math content team at Wolfram|Alpha. Notably, we developed Step-by-step Solutions and the Wolfram Problem Generator.

Want to collaborate? Please reach out at ercr@umich.edu.

Selected publications

Craig, Pilanci, Le Menestrel, Narasimhan, Rivas, Gullaksen, Dehghannasiri, Salzman, Taylor, Tibshirani. Pretraining and the lasso. Journal of the Royal Statistical Society, Series B, 2026. doi

Zaslavsky, Craig, et al. Disease diagnostics using machine learning of B cell and T cell receptor sequences. Science, 387(6736), 2025. doi

Craig, Keyes, et al. Annotation-free discovery of disease-relevant cells in single-cell datasets. Science Advances, 11(35), 2025. doi

Le Menestrel, Craig, Tibshirani, Hastie, Rivas. Using pre-training and interaction modeling for ancestry-specific disease prediction using multiomics data from the UK Biobank. PLoS One, 20(12), 2025. doi

Craig, Zhong, Tibshirani. A review of survival stacking: a method to cast survival regression analysis as a classification problem. International Journal of Biostatistics, 21(1), 2025. doi

Hamilton, Craig, et al. CAR19 monitoring by peripheral blood immunophenotyping reveals histology-specific expansion and toxicity. Blood Advances, 8(12), 2024. doi

Full list of publications, talks, and grants in my CV and on Google Scholar.

Software

ptLasso. Pretraining for the lasso: transfer learning for sparse, interpretable linear models. Video and code tutorials!

MMIL. Mixture models for multiple-instance learning: find disease-relevant cells from patient-level labels.

sweetspot. Find and assess treatment-effect sweet spots in clinical trials.
