student work
nice job everyone!
2024:
- Sara Colando (2024): “Selecting ChIP-Seq Normalization Methods from the Perspective of their Technical Conditions”; Currently: PhD candidate, Carnegie Mellon University, Statistics.
2023:
- Ian Krupkin (2023): “Prediction Error Estimation in Random Forests”; Currently: Boston Consulting Group.
- Olivia Leu (2023): “Mathematics of Redistricting: Identifying Gerrymandering Through Outlier Analysis”; Currently: Data Analyst, Democracy Program at The Carter Center.
- Summer Will (2023): “Theoretical Properties of Oversampling Techniques”; Currently: Digital Analytics Intern, CVS Health.
- Julie Ye (2023): “Permutation Tests for Multiple Linear Regression Models”; Currently: MS candidate, Yale University, Statistics & Data Science.
2022:
- Will Gray (2022): “Detecting Rotation Periods of Near-Earth Asteroids: An investigation of Fourier Analysis and the Lomb-Scargle Periodogram”.
- Lauren Quesada (2022): “Permutation Tests: A Deep Dive into Applications in Multiple Linear Regression”; Currently: PhD candidate, Colorado State University, Statistics.
- Moe Sunami (2022): “Conformal Prediction Intervals”; Currently: Carbon Data Analyst, Watershed
- Nick Waalkes (2022): “Simulation and Application Study of Online False Discovery Control Methods”; Currently: Analyst, Ridgepeak Partners.
2021:
- Ethan Ashby (2021): “Extracting hitherto unseen variant signals from the cancer genome using data de-sparsification strategies”; Currently: PhD candidate, University of Washington, Biostatistics.
- Annie Cohen (2021, Scripps): “Investigating changes in the SEIRMD model applied to COVID-19”; Currently: PhD candidate, University of Michigan, Biostatistics.
- Emma Godfrey (2021): “Non-parametric Alternative Techniques for Propensity Score Estimation”; Currently: Statistical Analyst, ZipRecruiter
- Nat Serrurier (2021, PPA Biology): “Palliative Care and Dementia: An Underutilized Method in the Fight Against a Health Crisis”.
2020:
- Helen Lan (2020): “Multiple Comparison Test”; Currently: Analyst, Cornerstone Research.
- Nolan McCafferty (2020): “Style Transfer with Neural Networks”; Currently: Software Engineer, Amazon.
- Zach Senator (2020): “Random Forests and Beyond”; Currently: Business Analyst, Strategy & Operations, Deloitte.
- Amy Watt (2020): “The Expectation Maximization Algorithm in RNA-Sequencing Read Alignment”; Currently: Data Scientist, Analytics, Facebook.
2019:
- Alex Gui (2019): “Local Prediction Confidence for Classification Random Forests”; MS 2021, Stanford University, Statistics: Data Science, Currently: Data Scientist at Pinterest
- Frances Hung (2019): “Active Learning Experimental Design of Bayesian Networks”; MS 2021, Duke University, Statistics, Currently: JD candidate, UCLA
- Vedant Vohra (2019): “Estimating the proportion who benefit from a treatment in a Randomized Controlled Trial”; Currently: PhD candidate, UC San Diego, Economics
- Justin Weltz (2019): “Over-Policing and Fairness in Machine Learning”; Currently: PhD candidate, Duke University, Statistics
- Christina Duron (2019, PhD CGU): “The Distribution of Betweenness Centrality in Exponential Random Graph Models”; Currently: Assistant Professor of Mathematics, Pepperdine University
2018:
- Chris Barnes (2018): “Artistic Style Transfer using Deep Learning”; Currently: Analyst, KKR & Co. L.P.
- Kalyan Chadalavada (2018): “Partial Least Squares Regression in Football Projections”; MS 2019, Tulane University, Pharmacology, Currently: Practice Manager, Springfield Pulmonary Medicine and Sleep Clinic.
- Luis Espino (2018): “Racism without a Face: Predictive Statistics in the Criminal Justice System”; Currently: Technology Associate, fwd.us, Community Department.
- Kashvi Tibrewal (2018): “Evaluating Splitting Criteria in Classification Trees”; Currently: IEQ Capital
2017:
- Benji Lu (2017): “Constructing Prediction Intervals for Random Forests”; Ph.D. candidate, UC Berkeley, Statistics; JD candidate, Yale University.
- Maria Martinez (2017): “The EM algorithm and RNA sequencing”; Currently: Software Engineer, Intuit.
- Yenny Zhang (2017): “Integrating Random Forests into the Bag of Little Bootstraps”; Currently: Software Engineer, Medallion.
2016:
- Isaiah Boone (2016): “SVM and the Application of Prediction Rules”; Currently: Partner, Sequoia Capital.
- John Bryan (2016): “Developing Inference Frameworks for Random Forests Using Bag of Little Bootstraps and Related Methods”; M.D. candidate, Northwestern University Feinberg School of Medicine.
- Ciaran Evans (2016): “Normalization of RNA-Seq data in the case of asymmetric differential expression”; Ph.D. 2021, Carnegie Mellon University, Statistics. Currently: Assistant Professor of Mathematics & Statistics, Wake Forest University.
- Dylan Quantz (2016): “Analyzing Centrality in Complex Gene Networks”; Currently: Player Development Trainee, Atlanta Braves.
2015:
- Rebecca Baiman (2015): “A Critical Comparison of Methods in Statistical Inference Education”; Masters of Education 2017, Math Secondary Education, Vanderbilt University; Currently: Ph.D. candidate, CU Boulder, Atmospheric Science.
- Jacob Fiksel (2015): “Differential Gene Expression Analysis with Microarray and RNA-seq Data”; Ph.D. 2020, Johns Hopkins Bloomberg School of Public Health, Biostatistics; Currently: Senior Preclinical Research Statistician, Vertex Pharmaceuticals.
Jacob’s blog (about grad school and data science, among other things)
Interview of Jacob as part of The Johns Hopkins #100 Alumni Voices Project
- Chris Garnatz (2015): “Trusting the Black Box: Confidence with Bag of Little Bootstraps”; Currently: Data Architect, Spring.
- Caroline Zaia (2015): “Multilevel Regression in Value Added Modeling for Teacher Assessment”; Currently: Merchandise Planner, Stitch Fix
2014:
- Maricela Cruz (2014): “Long-term Averages of the Stochastic Logistic Map”; Ph.D. 2019, University of California, Irvine, Statistics; Currently: Assistant Investigator at Kaiser Permanente Washington Health Research Institute
- Thalia Rodriguez (2014): “Towards a More Conceptual Way of Understanding and Implementing Inferential Rules”; M.A. in Teaching 2016, the University of Southern California Rossier School of Education; Currently: Mathematics Teacher, Santa Ana Unified School District
- Brian Williamson (2014): “Shrinkage Estimators for High-Dimensional Covariance Matrices”; Ph.D. 2019, University of Washington, Biostatistics; Currently: Assistant Investigator, Kaiser Permanente Washington Health Research Institute
2013:
- Melinda Borello (2013, Pitzer): “Standardization and Singular Value Decomposition in Canonical Correlation Analysis”.
- Jacob Coleman (2013): “Robust Sparse Canonical Correlation Analysis and PITCHf/x”; Ph.D. 2019, Duke University, Statistics; Currently: Senior Quantitative Analyst, Los Angeles Dodgers.
- Karl Kumbier (2013): “Detecting and Estimating Filamentary Structures in the Presence of Background Noise”; Ph.D. 2019, UC Berkeley, Statistics; Currently: Postdoctoral researcher, UCSF.
- Guy Stevens (2013): “Bayesian Statistics and Baseball”; Currently: Data Science Lead, Viaduct.
- Yuanxi Zhang (2013): “How Does a Bayesian Investor Time the Market”; M.S. 2018, U Chicago, Economics.
2012:
- Tim Stutz (2012): “Modeling the Evolution of Sexual Diploid Populations via a Stochastic Moran Process”; Ph.D. candidate, UCLA, Biomathematics.
2011:
- Kate Brieger (2011, EA, independent study): “The Evolution of Statistics in Medicine”; M.D. / Ph.D. 2022, University of Michigan, Epidemiology; Currently: resident, Brigham and Women’s Hospital, Harvard Medical School.
- Christine Ju (2011, Scripps): “Determining Overrepresentation of Gene Ontology Terms using the Hypergeometric Distribution”; M.S. 2013, Duke, Biostatistics; Currently: Senior Manager, Biostatistics, ALX Oncology.
2010:
- Minsoo Kim (2010): “Statistical Classification”; M.S. 2015, U Georgia, Statistics; Currently: Scientific Computing Professional Associate, Carl Vinson Institute of Government.
- Mary Owen (2010): “Tukey’s Biweight Correlation and the Breakdown”; Currently: Pastry Cook, Marta.
- Mark Simon (2010): “Randomly Generating Computationally Stable Correlation Matrices”; Currently: Financial Analyst, Taal Capital Management.
2009:
- Patrick Kimes (2009): “Understanding q-values as a More Intuitive Alternative to p-values”; Ph.D. 2015 (“New Statistical Learning Approaches with Applications to RNA-Sequencing Data.” Advisor: J.S. Marron), UNC, Statistics; Currently: Senior Statistical Scientist, Genentech.
- Alison Kosel (2009): “Simulating Correlated Multivariate Normal Data”; Ph.D. 2016 (“Local Estimation of Patient Prognosis” Advisor: Patrick Heagerty), U Washington, Biostatistics; Currently: Data Scientist, Facebook.
- Daniel Scinto (2009): “Stock Ranking and Portfolio Selection: Revising and Developing Z-scores”; Currently: Partner, BFAM Partners.
2008:
- Brianna Pasco (2008, Scripps): “A Basic Introduction and Comparison of Linear Discriminant Analysis and Support Vector Machines”.
- Nick Conway (2008): “A Resistant Measure of Distance in DNA Microarrays”.
- Austen Head (2008): “Correlation Correction of Sample Measures from Bivariate Distributions”; Ph.D. 2014, Stanford, Statistics; Currently: Head of Data Science, PayJoy.
- Robert Kurtzman (2008): “The Advantages of a Biweight Metric in Clustering Microarray Data”; Ph.D. 2015, UCLA, Economics; Currently: Principal Economist, Federal Reserve Board.
2007:
- Jeffery Joe Nanda (2007): “Correcting for Bias in Correlation Coefficients Due to Intraindividual Variability”; MBA 2011, Stanford; Currently: Investment Manager, IFC Asset Management Company.
- Andrea Vijverberg (2007): “Clustering Microarray Data”; M.S. Computational Finance 2008; Currently: FX Options Trader, Bunge.
- Jonathan Buster Zalkind (2007): “Four Colors is not Enough: Visualizations of Simulated Spatial-Model Elections Under Different Voting Methods”; MBA, The University of Chicago Booth School of Business, Currently: Vice President, Playco.
2006:
- Aya Mitani (2006, Pitzer): “Biweight Correlation as a Measure of Distance between Genes on a Microarray” (abstract, presentation); MPH 2008, Yale, Biostatistics; Ph.D. 2019, Boston University, Biostatistics; Currently: Assistant Professor of Biostatistics, Dalla Lana School of Public Health, University of Toronto.
2005:
- Joseph Richards (2005): “Classification of Geologic Units on Ganiki Planitia Quadrangle (V14) Venus Using Statistical Clustering Methods”; Ph.D. 2010, Carnegie Mellon University, Statistics; Currently: COO, Down to Cook.
- Alison Wise (2005): “Statistical Analysis of Microarrays to Determine Genetic Changes in Aging Yeast”; Ph.D. candidate, UNC, Biostatistics.
2004:
- Lee (Strassenburg) Shanahan (2004): “A Statistical Comparison of the Average Waiting Times Between Flares in Lupus Patients”; Currently: Technical Solutions Engineer at findhelp.
2003:
- Veronica (Montes De Oca) Aispuro (2003): “Methods for Evaluating Health Care Claims Data” (an application of Bootstrapping); Ph.D. 2008, University of California, Riverside, Applied Statistics; Currently: Director of Stars Survey Analytics, UnitedHealth Group.
2022:
- Pipi Gao ’22 Finalist in the MD++ Datathon 2022 as part of team WeLoveRData. The Datathon was founded and organized by Lathan Liou ’19.
- Xuehuai He ’25, Saatvik Kher ’24, Samson Zhang ’25 DataFest - The Don Ylvisaker Best Insight Award (Honorable Mention)
- Aditya Bhalla ’23, Alan Zhou ’23 DataFest - Best Use of Statistical Models (Honorable Mention)
2021:
- Ethan Ashby ’21 A Regularized Cox Regression Approach to the Health Evaluation and Linkage to Primary Care (HELP) Clinical Trial 2nd place Paper Undergraduate Statistics Class (intermediate) Project Competition
- Hannah Mandel ’23, Emily Tomz ’23, Adeena Liang ’23, Chloe Sun ’23, Ian Krupkin ’23 DataFest - Judges’ Choice Award
2020:
- Amber Lee ’22 Exploring Missingness and its Implications on Traffic Stop Data 2nd place Paper Undergraduate Statistics Research Project Competition
- Amber Lee ’22, Arm Wonghirundacha ’22, Emma Godfrey ’21, Ethan Ong ’21, Ivy Yuan ’21, Oliver Chang ’22, and Will Gray ’22; Data Exploration of US Police Stops, Data Science Research Circle, supervised by Jo Hardin and Ghassan Sarkis
- Guy Thampakkul ’23, Tai Xiang ’23; DataFest - Judges’ Choice Award
2019:
- Christina Duron, PhD Claremont Graduate University 2019 The Distribution of Betweenness Centrality in Exponential Random Graph Models
- Amy Watt ’20, Adam Rees ’20, Ethan Ashby ’21, Connor Ford ’20, Madelyn Andersen ’22 (HMC); DataFest – Best Use of External Data
2018:
- Vedant Vohra ’19, Zihao Xu ’19, Madison Hobbs ’19 (Scripps), Xiaotong Gui ’19 DataFest - Best Insight, honorable mention
2017:
- Zihao Xu ’19 Bag of Little Random Bootstraps Winning Paper Undergraduate Statistics Research Project Competition
- Jeff Carney ’18, Hyeong Shin ’19, Adam Starr ’18 DataFest - Judges’ Choice Award
2014:
- Tim Kaye ’15, David Khatami ’16, Daniel Metz ’16, Emily Proulx ’16 Quantifying and Comparing Centrality Measures for Network Individuals as Applied to the Enron Corpus Winning Paper Undergraduate Statistics Research Project Competition, Data Science Research Circle, supervised by Jo Hardin and Ghassan Sarkis
- Tim Kaye ’15, David Khatami ’16, Daniel Metz ’16, Emily Proulx ’16 Quantifying and Comparing Centrality Measures for Network Individuals as Applied to the Enron Corpus; SIAM Undergraduate Research Online, 7: 2014.
2013:
- Jacob Coleman ’13, Maricela Cruz ’14, Bill DeRose ’15, Ciaran Evans ’15, Rob Knickerbocker ’15, Kevin Lu ’14, Derek Owens-Oas ’13, Ben Shand ’14, Brian Williamson ’14; DataFest - Best Insight
2012:
- Karl Kumbier ’13, Erika Parks ’13, Joseph Replogle ’13 DataFest - Best Use of External Data
- Drew DiPalma ’13, Tim Stutz ’12; DataFest - Best Visualization, honorable mention