student work
nice job everyone!
:::
2024:
- Sara Colando (2024): “Selecting ChIP-Seq Normalization Methods from the Perspective of their Technical Conditions”; Currently: PhD candidate, Carnegie Mellon University, Statistics.
2023:
- Ian Krupkin (2023): “Prediction Error Estimation in Random Forests”; Currently: Boston Consulting Group.
- Olivia Leu (2023): “Mathematics of Redistricting: Identifying Gerrymandering Through Outlier Analysis”; Currently: Data Analyst, Democracy Program at The Carter Center.
- Summer Will (2023): “Theoretical Properties of Oversampling Techniques”; Currently: Digital Analytics Intern, CVS Health.
- Julie Ye (2023): “Permutation Tests for Multiple Linear Regression Models”; Currently: MS candidate, Yale University, Statistics & Data Science.
2022:
- Will Gray (2022): “Detecting Rotation Periods of Near-Earth Asteroids: An investigation of Fourier Analysis and the Lomb-Scargle Periodogram”.
- Lauren Quesada (2022): “Permutation Tests: A Deep Dive into Applications in Multiple Linear Regression”; Currently: PhD candidate, Colorado School of Mines, Statistics.
- Moe Sunami (2022): “Conformal Prediction Intervals”; Currently: Carbon Data Analyst, Watershed
- Nick Waalkes (2022): “Simulation and Application Study of Online False Discovery Control Methods”; Currently: Analyst, Ridgepeak Partners.
2021:
- Ethan Ashby (2021): “Extracting hitherto unseen variant signals from the cancer genome using data de-sparsification strategies”; Currently: PhD candidate, University of Washington, Biostatistics.
- Annie Cohen (2021, Scripps): “Investigating changes in the SEIRMD model applied to COVID-19”; Currently: PhD candidate, University of Michigan, Biostatistics.
- Emma Godfrey (2021): “Non-parametric Alternative Techniques for Propensity Score Estimation”; Currently: Statistical Analyst, ZipRecruiter
- Nat Serrurier (2021, PPA Biology): “Palliative Care and Dementia: An Underutilized Method in the Fight Against a Health Crisis”; Currently: Doctor of Physical Therapy candidate, Columbia University, Vagelos College of Physicians and Surgeons.
2020:
- Helen Lan (2020): “Multiple Comparison Test”; Currently: MBA candidate, University of Pennsylvania, The Wharton School.
- Nolan McCafferty (2020): “Style Transfer with Neural Networks”; Currently: Software Engineer, Omnistrate
- Zach Senator (2020): “Random Forests and Beyond”; Currently: Investment Banking Associate, J.P. Morgan.
- Amy Watt (2020): “The Expectation Maximization Algorithm in RNA-Sequencing Read Alignment”; Currently: PhD candidate, University of Washington, Biostatistics.
2019:
- Alex Gui (2019): “Local Prediction Confidence for Classification Random Forests”; MS 2021, Stanford University, Statistics: Data Science, Currently: Data Scientist, Meta.
- Frances Hung (2019): “Active Learning Experimental Design of Bayesian Networks”; MS 2021, Duke University, Statistics, Currently: JD candidate, UCLA
- Vedant Vohra (2019): “Estimating the proportion who benefit from a treatment in a Randomized Controlled Trial”; Currently: PhD candidate, UC San Diego, Economics
- Justin Weltz (2019): “Over-Policing and Fairness in Machine Learning”; Currently: PhD candidate, Duke University, Statistics
- Christina Duron (2019, PhD CGU): “The Distribution of Betweenness Centrality in Exponential Random Graph Models”; Currently: Assistant Professor of Mathematics, Pepperdine University
2018:
- Chris Barnes (2018): “Artistic Style Transfer using Deep Learning”; Currently: Partner, Skye Global Management.
- Kalyan Chadalavada (2018): “Partial Least Squares Regression in Football Projections”; MS 2019, Tulane University, Pharmacology, Currently: Practice Manager, Springfield Pulmonary Medicine and Sleep Clinic.
- Luis Espino (2018): “Racism without a Face: Predictive Statistics in the Criminal Justice System”; Currently: Director of Product Management, fwd.us, Community Department.
- Kashvi Tibrewal (2018): “Evaluating Splitting Criteria in Classification Trees”; MBA 2024, University of Michigan, Stephen Ross School of Business.
2017:
- Benji Lu (2017): “Constructing Prediction Intervals for Random Forests”; Ph.D. 2024, UC Berkeley, Statistics, JD 2024, Yale University. Currently: .
- Maria Martinez Lainez (2017): “The EM algorithm and RNA sequencing”; Currently: Software Engineer II, Intuit.
- Yenny Zhang (2017): “Integrating Random Forests into the Bag of Little Bootstraps”; Currently: Senior Software Engineer, Medallion.
2016:
- Isaiah Boone (2016): “SVM and the Application of Prediction Rules”; Currently: Partner, Sequoia Capital.
- John Bryan (2016): “Developing Inference Frameworks for Random Forests Using Bag of Little Bootstraps and Related Methods”; M.D. 2023, Northwestern University Feinberg School of Medicine. Currently: ophthalmology resident, Northwestern University Department of Ophthalmology
- Ciaran Evans (2016): “Normalization of RNA-Seq data in the case of asymmetric differential expression”; Ph.D. 2021, Carnegie Mellon University, Statistics. Currently: Assistant Professor of Mathematics & Statistics, Wake Forest University.
- Dylan Quantz (2016): “Analyzing Centrality in Complex Gene Networks”; Currently: Player Development Trainee, Atlanta Braves.
2015:
- Rebecca Baiman (2015): “A Critical Comparison of Methods in Statistical Inference Education”; Masters of Education 2017, Math Secondary Education, Vanderbilt University; Currently: Ph.D. candidate, CU Boulder, Atmospheric Science.
- Jacob Fiksel (2015): “Differential Gene Expression Analysis with Microarray and RNA-seq Data”; Ph.D. 2020, Johns Hopkins Bloomberg School of Public Health, Biostatistics; Currently: Senior Preclinical Research Statistician, Vertex Pharmaceuticals.
Jacob’s blog (about grad school and data science, among other things)
Interview of Jacob as part of The Johns Hopkins #100 Alumni Voices Project
- Chris Garnatz (2015): “Trusting the Black Box: Confidence with Bag of Little Bootstraps”; Currently: Data Engineer, TRM Labs
- Caroline Zaia (2015): “Multilevel Regression in Value Added Modeling for Teacher Assessment”; Currently: Revenue & Margin Manager, Stitch Fix
2014:
- Maricela Cruz (2014): “Long-term Averages of the Stochastic Logistic Map”; Ph.D. 2019, University of California, Irvine, Statistics; Currently: Assistant Investigator at Kaiser Permanente Washington Health Research Institute
- Thalia Rodriguez (2014): “Towards a More Conceptual Way of Understanding and Implementing Inferential Rules”; MEd 2015, the University of Southern California Rossier School of Education; Currently: Mathematics Teacher, Heninger Elementary.
- Brian Williamson (2014): “Shrinkage Estimators for High-Dimensional Covariance Matrices”; Ph.D. 2019, University of Washington, Biostatistics; Currently: Assistant Investigator, Kaiser Permanente Washington Health Research Institute
2013:
- Melinda Borello (2013, Pitzer): “Standardization and Singular Value Decomposition in Canonical Correlation Analysis”.
- Jacob Coleman (2013): “Robust Sparse Canonical Correlation Analysis and PITCHf/x”; Ph.D. 2019, Duke University, Statistics; Currently: .
- Karl Kumbier (2013): “Detecting and Estimating Filamentary Structures in the Presence of Background Noise”; Ph.D. 2019, UC Berkeley, Statistics; Currently: Data Scientist, Pharmaceutical Chemistry, School of Pharmacy, UCSF.
- Guy Stevens (2013): “Bayesian Statistics and Baseball”; Currently: Owner & CEO, Winning Insight Consulting.
- Yuanxi Zhang (2013): “How Does a Bayesian Investor Time the Market”; M.S. 2018, U Chicago, Economics.
2012:
- Tim Stutz (2012): “Modeling the Evolution of Sexual Diploid Populations via a Stochastic Moran Process”; Ph.D. 2020, UCLA, Biomathematics. Currently: .
2011:
- Kate Brieger (2011, EA, independent study): “The Evolution of Statistics in Medicine”; M.D. / Ph.D. 2022, University of Michigan, Epidemiology; Currently: Clinical Fellow in Psychiatry, Brigham and Women’s Hospital, Harvard Medical School.
- Christine Ju (2011, Scripps): “Determining Overrepresentation of Gene Ontology Terms using the Hypergeometric Distribution”; M.S. 2013, Duke, Biostatistics; Currently: Associate Director, Biostatistics, ALX Oncology.
2010:
- Minsoo Kim (2010): “Statistical Classification”; M.S. 2015, U Georgia, Statistics; Currently: Scientific Computing Professional Associate, Carl Vinson Institute of Government.
- Mary Owen (2010): “Tukey’s Biweight Correlation and the Breakdown”; Currently: Operations Assistant, Jennie’s Kitchen.
- Mark Simon (2010): “Randomly Generating Computationally Stable Correlation Matrices”; Currently: Senior Investment Analyst, Taal Capital Management.
2009:
- Patrick Kimes (2009): “Understanding q-values as a More Intuitive Alternative to p-values”; Ph.D. 2015 (“New Statistical Learning Approaches with Applications to RNA-Sequencing Data.” Advisor: J.S. Marron), UNC, Statistics; Currently: Principal Statistical Scientist, Genentech.
- Alison Kosel (2009): “Simulating Correlated Multivariate Normal Data”; Ph.D. 2016 (“Local Estimation of Patient Prognosis” Advisor: Patrick Heagerty), U Washington, Biostatistics; Currently: Data Scientist, Facebook.
- Daniel Scinto (2009): “Stock Ranking and Portfolio Selection: Revising and Developing Z-scores”; Currently: Portfolio Manager, Alphadyne Asset Management.
2008:
- Brianna Pasco (2008, Scripps): “A Basic Introduction and Comparison of Linear Discriminant Analysis and Support Vector Machines”.
- Nick Conway (2008): “A Resistant Measure of Distance in DNA Microarrays”.
- Austen Head (2008): “Correlation Correction of Sample Measures from Bivariate Distributions”; Ph.D. 2014, Stanford, Statistics; Currently: Staff Machine Learning Engineer, PayJoy.
- Robert Kurtzman (2008): “The Advantages of a Biweight Metric in Clustering Microarray Data”; Ph.D. 2015, UCLA, Economics; Currently: Group Manager, Federal Reserve Board.
2007:
- Jeffery Joe Nanda (2007): “Correcting for Bias in Correlation Coefficients Due to Intraindividual Variability”; MBA 2011, Stanford; Currently: Head of Investments, American Triple.
- Andrea Vijverberg Seo (2007): “Clustering Microarray Data”; M.S. Computational Finance 2008; Currently: Quantitative Model Analyst, Counterparty Risk, US Bank
- Jonathan Buster Zalkind (2007): “Four Colors is not Enough: Visualizations of Simulated Spatial-Model Elections Under Different Voting Methods”; MBA, The University of Chicago Booth School of Business, Currently: Vice President, Playco.
2006:
- Aya Mitani (2006, Pitzer): “Biweight Correlation as a Measure of Distance between Genes on a Microarray” (abstract, presentation); MPH 2008, Yale, Biostatistics; Ph.D. 2019, Boston University, Biostatistics; Currently: Assistant Professor of Biostatistics, Dalla Lana School of Public Health, University of Toronto.
2005:
- Joseph Richards (2005): “Classification of Geologic Units on Ganiki Planitia Quadrangle (V14) Venus Using Statistical Clustering Methods”; Ph.D. 2010, Carnegie Mellon University, Statistics; Currently: COO, Down to Cook.
- Alison Wise (2005): “Statistical Analysis of Microarrays to Determine Genetic Changes in Aging Yeast”; Ph.D. candidate, UNC, Biostatistics.
2004:
- Lee (Strassenburg) Shanahan (2004): “A Statistical Comparison of the Average Waiting Times Between Flares in Lupus Patients”; Currently: Technical Solutions Engineer at findhelp.
2003:
- Veronica (Montes De Oca) Aispuro (2003): “Methods for Evaluating Health Care Claims Data” (an application of Bootstrapping); Ph.D. 2008, University of California, Riverside, Applied Statistics; Currently: Senior Director of Stars Survey Analytics, UnitedHealth Group.
2024:
- Jazelle Saligumba ’26, Kellie Au ’26, Julianne Louie ’26, Chau Vu ’26, Yunju Song ’26 DataFest - The Don Ylvisaker Best Insight Award (Honorable Mention)
- Sara Colando ’24 compilation of data science ethics curricula, to accompany our joint paper Philosophy within Data Science Ethics Courses.
2022:
- Pipi Gao ’22 Finalist in the MD++ Datathon 2022 as part of team WeLoveRData. The Datathon was founded and organized by Lathan Liou ’19.
- Xuehuai He ’25, Saatvik Kher ’24, Samson Zhang ’25 DataFest - The Don Ylvisaker Best Insight Award (Honorable Mention)
- Aditya Bhalla ’23, Alan Zhou ’23 DataFest - Best Use of Statistical Models (Honorable Mention)
2021:
- Ethan Ashby ’21; A Regularized Cox Regression Approach to the Health Evaluation and Linkage to Primary Care (HELP) Clinical Trial; 2nd place Paper, Undergraduate Statistics Class (intermediate) Project Competition
- Hannah Mandel ’23, Emily Tomz ’23, Adeena Liang ’23, Chloe Sun ’23, Ian Krupkin ’23 DataFest - Judges’ Choice Award
2020:
- Amber Lee ’22; Exploring Missingness and its Implications on Traffic Stop Data; 2nd place Paper, Undergraduate Statistics Research Project Competition
- Amber Lee ’22, Arm Wonghirundacha ’22, Emma Godfrey ’21, Ethan Ong ’21, Ivy Yuan ’21, Oliver Chang ’22, and Will Gray ’22; Data Exploration of US Police Stops, Data Science Research Circle, supervised by Jo Hardin and Ghassan Sarkis
- Guy Thampakkul ’23, Tai Xiang ’23; DataFest - Judges’ Choice Award
2019:
- Christina Duron, PhD Claremont Graduate University 2019 The Distribution of Betweenness Centrality in Exponential Random Graph Models
- Amy Watt ’20, Adam Rees ’20, Ethan Ashby ’21, Connor Ford ’20, Madelyn Andersen ’22 (HMC); DataFest – Best Use of External Data
2018:
- Vedant Vohra ’19, Zihao Xu ’19, Madison Hobbs ’19 (Scripps), Xiaotong Gui ’19 DataFest - Best Insight, honorable mention
2017:
- Zihao Xu ’19; Bag of Little Random Bootstraps; Winning Paper, Undergraduate Statistics Research Project Competition
- Jeff Carney ’18, Hyeong Shin ’19, Adam Starr ’18 DataFest - Judges’ Choice Award
2014:
- Tim Kaye ’15, David Khatami ’16, Daniel Metz ’16, Emily Proulx ’16; Quantifying and Comparing Centrality Measures for Network Individuals as Applied to the Enron Corpus; Winning Paper, Undergraduate Statistics Research Project Competition, Data Science Research Circle, supervised by Jo Hardin and Ghassan Sarkis
- Tim Kaye ’15, David Khatami ’16, Daniel Metz ’16, Emily Proulx ’16 Quantifying and Comparing Centrality Measures for Network Individuals as Applied to the Enron Corpus; SIAM Undergraduate Research Online, 7: 2014.
2013:
- Jacob Coleman ’13, Maricela Cruz ’14, Bill DeRose ’15, Ciaran Evans ’15, Rob Knickerbocker ’15, Kevin Lu ’14, Derek Owens-Oas ’13, Ben Shand ’14, Brian Williamson ’14; DataFest - Best Insight
2012:
- Karl Kumbier ’13, Erika Parks ’13, Joseph Replogle ’13 DataFest - Best Use of External Data
- Drew DiPalma ’13, Tim Stutz ’12; DataFest - Best Visualization, honorable mention