Overview
Analysed several real-world public-health datasets in R, covering alcohol affordability, alcohol-related deaths, household alcohol expenditure and prescription trends. I cleaned and subset the data, explored the distributions visually, and used confidence intervals and hypothesis tests to evaluate claims and quantify uncertainty in the results.
Why it’s interesting
This project demonstrates practical statistical reasoning on messy real-world data rather than idealised textbook examples. The main challenge was deciding when statistical assumptions were reasonable. In one test, for example, I had to conclude that I could not proceed, because the data were too skewed for a standard normal-based analysis to be convincing.
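The kind of assumption check described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the data are simulated, `sample_skewness` is a hypothetical helper (base R has no built-in skewness function), and the cut-offs in the decision rule are assumptions chosen for the example.

```r
# Illustrative only: simulated right-skewed data, not the project's datasets.
set.seed(42)
x <- rlnorm(40, meanlog = 0, sdlog = 1)

# Hypothetical helper: moment-based sample skewness.
# Values near 0 suggest symmetry; large positive values suggest a long right tail.
sample_skewness <- function(v) {
  v <- v[!is.na(v)]
  m <- mean(v)
  mean((v - m)^3) / mean((v - m)^2)^1.5
}

skew <- sample_skewness(x)

# Shapiro-Wilk normality test from the stats package.
sw <- shapiro.test(x)

# Assumed decision rule (thresholds are for illustration only):
# proceed with a normal-based method only if the data look roughly symmetric.
normal_ok <- abs(skew) < 1 && sw$p.value > 0.05

if (!normal_ok) {
  message("Data look too skewed for a normal-based test; consider a transformation or a non-parametric method.")
}
```

In practice a histogram or Q-Q plot (`hist(x)`, `qqnorm(x)`) would usually sit alongside numerical checks like these, since a single threshold rarely settles the question on its own.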
Key Technical Points
- Data preparation: Cleaned and subset multiple public-health datasets covering alcohol affordability, household expenditure on alcohol, alcohol-related deaths and prescription trends.
- Exploratory analysis: Produced visualisations to inspect long-term trends, compare variables and understand the shape of the data.
- Statistical inference: Constructed confidence intervals and ran hypothesis tests to evaluate regional differences in prescription trends.
- Presentation: Explained the methodology and findings to a panel of my peers, covering both the statistical results and the reasoning behind them.
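The preparation, exploration and inference steps listed above might look roughly like this in R. It is a sketch under stated assumptions: the data frame `rx`, its column names, the two region labels and all values are hypothetical and simulated, and the Welch two-sample t-test is used only as one plausible choice of test, not necessarily the one used in the project.

```r
# Hypothetical stand-in for a prescriptions dataset (simulated values).
set.seed(1)
rx <- data.frame(
  region = rep(c("North", "South"), each = 30),
  items_per_1000 = c(rnorm(30, mean = 52, sd = 8),
                     rnorm(30, mean = 47, sd = 8))
)
rx$items_per_1000[c(3, 41)] <- NA  # inject missing values to mimic messy data

# Data preparation: keep only complete rows.
rx_clean <- rx[complete.cases(rx), ]

# Exploratory analysis: group summaries.
# A plot such as boxplot(items_per_1000 ~ region, data = rx_clean)
# would normally accompany these.
group_means <- tapply(rx_clean$items_per_1000, rx_clean$region, mean)
group_sds   <- tapply(rx_clean$items_per_1000, rx_clean$region, sd)

# Statistical inference: Welch two-sample t-test (t.test's default for two groups),
# which also returns a 95% confidence interval for the difference in means.
res <- t.test(items_per_1000 ~ region, data = rx_clean)

ci      <- res$conf.int   # interval for mean(North) - mean(South)
p_value <- res$p.value
```

Reporting the confidence interval alongside the p-value shows the plausible size of the regional difference, not just whether it is distinguishable from zero.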
Tech Stack
- Analysis: R, Statistics