2025
Enumerating and assessing the travel of homeless people in western Washington
In collaboration with social scientist and researcher Zack Almquist, PhD, I am helping estimate the number of people living unsheltered within Washington state. We are combining a novel survey methodology with spatial analysis (e.g., small-area estimation) to account for the limitations of small sample sizes. I am comparing the performance of direct estimators to more sophisticated global and local smoothing models.
We are currently preparing a publication of this work for submission to the scientific journal Nature Cities.
Analysis of the Pediatric Health Information System (PHIS), 2016-2021
Under the sponsorship of primary care pediatrician Alexis Ball, MD, MPP, and principal data scientist Dwight Barry, PhD, my team of fellow graduates described and modeled recent trends in pediatric health.
Alongside Lingfei “Ellen” Jiang, Shizhao “Joshua” Yang, and Ruyue “Jasmine” Wang, I co-authored a Statistical Analysis Plan of advanced statistical techniques for clustered longitudinal data (e.g., mixed-effect models and interrupted time-series analysis) and conducted comparative analysis across patient demographics. I also facilitated team discussion and decision-making, and spearheaded scientific approaches and project planning.
View our final presentation, delivered March 10th, 2025.
In collaboration with epidemiologists and health specialists, I analyzed the survival and relapse of patients following their bone marrow transplantation. I translated scientific aims to statistical methods (e.g., survival analysis with competing risks), implemented data analysis in R, performed hypothesis testing, and contributed to post-hoc discussion.
Read our final report.
2024
Analysis of the National Registry of Exonerees
In collaboration with professional statistician Nayak Pollisar, PhD and fellow graduate student Cindy Elder, MA, MS, I studied factors that contributed to the wrongful convictions of people enrolled in a national registry. I completed intensive data cleaning and produced data visualizations and summaries to identify common demographic profiles within the registry. I fit multiple nonlinear regression models to investigate meaningful associations with penalty-related outcomes (e.g., a wrongful conviction’s sentence and duration). I primarily investigated the interrelationship between race and alleged crime through hypothesis tests of statistical interaction, also known as effect modification.
I collaborated with three epidemiologists to analyze the risk of pulmonary tuberculosis in a matched case-control clinical study. I translated scientific aims to statistical methods (e.g., logistic regression), implemented data analysis in R, performed hypothesis testing, and authored relevant sections of a summary report.
2023
An efficient learning model to predict environmental suitability
As a solution to a unique agricultural classification problem, in which we must teach a computer to distinguish between two classes but do not have complete faith in our labelled data, I developed a specialized learning model under the guidance of John Korah, PhD. The model iteratively optimizes itself to honor a statistical principle known as “maximum (informational) entropy”. Our application used a large geographic data set, which I pre-processed with Python 3 scripts and ESRI’s ArcGIS software. We designed our algorithm to run in parallel to shorten computation time.
View the poster I presented at Cal Poly Pomona’s College of Science research symposium and the Southern California Conference for Undergraduate Research (SCCUR).