2025

Homelessness in King County, WA

Enumerating and assessing the travel of people experiencing homelessness in western Washington

In collaboration with social scientist and researcher Zack Almquist, PhD, I am helping estimate the number of people living unsheltered within Washington state. We are combining a novel survey methodology with spatial analysis (e.g., small-area estimation) to account for the limitations of small sample sizes. I am comparing the performance of direct estimators to that of more sophisticated global and local smoothing models.

We are currently preparing a manuscript on this work for submission to the journal Nature Cities.

Acute leukemia, a blood and bone marrow cancer

In collaboration with epidemiologists and health specialists, I analyzed the survival and relapse of patients following their bone marrow transplantation. I translated scientific aims to statistical methods (e.g., survival analysis with competing risks), implemented data analysis in R, performed hypothesis testing, and contributed to post-hoc discussion.

Read our final report.

2024

Race and wrongful convictions in the United States

Analysis of the National Registry of Exonerations

In collaboration with professional statistician Nayak Pollisar, PhD and fellow graduate student Cindy Elder, MA, MS, I studied factors that contributed to the wrongful convictions of people enrolled in a national registry. I completed intensive data cleaning and produced data visualizations and summaries to identify common demographic profiles within the registry. I fit multiple nonlinear regression models to investigate meaningful associations with penalty-related outcomes (e.g., a wrongful conviction’s sentence and duration). I primarily investigated the interrelated association between race and alleged crime through hypothesis tests of statistical interaction, also known as effect modification.

Pulmonary tuberculosis, a bacterial lung infection

I collaborated with three epidemiologists to analyze the risk of pulmonary tuberculosis using data from a matched case-control clinical study. I translated scientific aims to statistical methods (e.g., logistic regression), implemented data analysis in R, performed hypothesis testing, and authored relevant sections of a summary report.

2023

Maximum entropy for machine learning

An efficient learning model to predict environmental suitability

To solve a unique agricultural classification problem, in which we must teach a computer to distinguish between two classes but do not have complete faith in our labelled data, I developed a specialized learning model that iteratively optimizes itself to honor a statistical principle known as “maximum (informational) entropy”, under the guidance of John Korah, PhD. Our application used a large geographic data set, which I pre-processed with Python 3 scripts and ESRI’s ArcGIS software. We designed our algorithm to run in parallel to shorten computation time.

View the poster I presented at Cal Poly Pomona’s College of Science research symposium and the Southern California Conference for Undergraduate Research (SCCUR).