I'm Ahmad Abdel-Azim.

Studying statistics and molecular & cellular biology at Harvard

I develop and implement novel statistical methods to overcome challenges in genomic medicine, all while collaborating in fast-paced, impact-oriented, and innovative research environments.

I am driven to apply computational and statistical techniques to advance the prospect of precision medicine and design patient-tailored therapies for the treatment of human diseases. Toward that goal, I have worked as a researcher and intern at several academic labs, companies, and pharmaceutical startups. At Harvard, I am currently pursuing an A.B./A.M. concurrent degree in statistics and molecular & cellular biology.

This website is a collection of my projects and professional experiences. I also write about various topics in statistics and deep learning here!



HSPH Researcher, Xihong Lin Lab @ Harvard 2022 - Present


Regeneron Intern, Regeneron Genetics Center 2023 - 2023
Biotia Bioinformatics Consultant 2022 - 2023
BWH Researcher, Kwiatkowski Lab @ Harvard 2020 - 2022
MIT CSAIL Research Student, Manolis Kellis Lab @ MIT (6.047) 2021 - 2021
HMS Researcher, Debora Marks Lab @ Harvard 2019 - 2020


STAT 117 Teaching Fellow, Statistics Dept. (Harvard) Spring 2024
STAT 110 Teaching Fellow, Statistics Dept. (Harvard) Fall 2023
STAT 185 Teaching Fellow, Statistics Dept. (Harvard) Fall 2023
STAT 111 Teaching Fellow, Statistics Dept. (Harvard) Spring 2023
MCB 112 Course Assistant, MCB Dept. (Harvard) Fall 2022

Selected Projects

Disease Risk Prediction Ongoing @ Lin Lab
Developing polygenic risk scores for correlated data structures.

We introduce a unified statistical framework for polygenic risk score prediction in the presence of correlated data, leveraging mixed effects modeling. We are evaluating the performance of this new approach via simulation across several genetic architectures. We further apply our approach to compute PRS for common complex diseases in the UK Biobank and demonstrate the potential of our method for clinical translation of polygenic prediction.
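
For readers new to polygenic prediction, here is a minimal sketch of the standard additive score that such methods build on: a weighted sum of allele counts, PRS_i = sum_j beta_j * g_ij. It is not the mixed-effects framework described above and does not model correlation between individuals. The function name, effect sizes, and genotypes are hypothetical values chosen for illustration.

```python
def polygenic_risk_scores(genotypes, effect_sizes):
    """Return one additive score per individual.

    genotypes:    list of rows, one per individual, holding allele counts (0, 1, or 2)
    effect_sizes: per-variant weights, e.g. estimated from a GWAS (hypothetical here)
    """
    scores = []
    for row in genotypes:
        if len(row) != len(effect_sizes):
            raise ValueError("each genotype row needs one entry per variant")
        # Weighted sum of allele counts across variants.
        scores.append(sum(beta * g for beta, g in zip(effect_sizes, row)))
    return scores


# Toy data: two individuals, three variants (illustrative values only).
example_effects = [0.2, -0.1, 0.05]
example_genotypes = [[0, 1, 2], [2, 0, 1]]
example_scores = polygenic_risk_scores(example_genotypes, example_effects)
```

In practice the weights would come from an association study, and the resulting scores are usually standardized before being used for risk stratification.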

Links: GitHub, Project Page


Deep learning for AD diagnosis November 2021
Integrating multiple data modalities for accurate diagnosis.

We apply deep learning techniques to predict Alzheimer's disease (AD) onset and progression from genetic, neuroimaging, and clinical data, with the aim of rapid and precise diagnosis. We employ XGBoost models to select important genomic features, build a 3D convolutional neural network (CNN) for MRI data, and design a classifier that combines the retained features from both models with clinical information. With a focus on accuracy and predictive capabilities, the method shows promise for early detection.

Links: GitHub, Project Page


Latent Regression Analysis May 2023
Building a Gibbs sampler to aggregate expert insights.

We present a comprehensive statistical framework for analyzing expert rankings of NFL quarterbacks. The approach uses latent regression models designed to handle the idiosyncrasies and biases inherent in expert evaluations. We derive and implement a Gibbs sampler in R that infers a consensus ranking from multiple experts while accounting for missing data and heterogeneity in expert opinions. The method prioritizes exploring underlying patterns in the data over simply applying the rankings, offering a robust tool for similar analyses in other domains.
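
To give a flavor of the idea, below is a heavily simplified sketch of a Gibbs sampler for rank aggregation, written in Python rather than R. It is not the project's model: it assumes a plain Thurstone-style setup with no covariates and no expert-specific effects, where each expert's latent score for an item is drawn from N(mu_item, 1) and the observed ranking is the ordering of those scores. Items an expert did not rank are simply left out of that expert's list. All function names, the prior variance, and the toy rankings are assumptions made for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
INF = float("inf")


def sample_truncated_normal(mean, lower, upper, rng):
    """Draw from N(mean, 1) restricted to [lower, upper] by inverting the CDF."""
    p_lo = STD_NORMAL.cdf(lower - mean)
    p_hi = STD_NORMAL.cdf(upper - mean)
    u = rng.uniform(p_lo, p_hi)
    u = min(max(u, 1e-12), 1 - 1e-12)  # keep inv_cdf away from 0 and 1
    draw = mean + STD_NORMAL.inv_cdf(u)
    return min(max(draw, lower), upper)  # guard against tail round-off


def gibbs_consensus(rankings, n_items, n_iter=2000, burn_in=500,
                    prior_var=1.0, seed=0):
    """rankings: one list per expert, item indices ordered best to worst.

    Returns (consensus ordering, posterior mean of each item's latent quality).
    """
    rng = random.Random(seed)
    mu = [0.0] * n_items
    # Start from latent scores that already respect each expert's ordering.
    z = [{item: -0.5 * pos for pos, item in enumerate(r)} for r in rankings]
    mu_sum = [0.0] * n_items

    for it in range(n_iter):
        # Step 1: latent scores given the means, constrained by rank neighbors.
        for r, z_j in zip(rankings, z):
            for pos, item in enumerate(r):
                upper = z_j[r[pos - 1]] if pos > 0 else INF
                lower = z_j[r[pos + 1]] if pos < len(r) - 1 else -INF
                z_j[item] = sample_truncated_normal(mu[item], lower, upper, rng)

        # Step 2: item means given the latent scores, with an N(0, prior_var) prior.
        for i in range(n_items):
            scores = [z_j[i] for z_j in z if i in z_j]
            post_var = 1.0 / (len(scores) + 1.0 / prior_var)
            post_mean = post_var * sum(scores)
            mu[i] = rng.gauss(post_mean, post_var ** 0.5)

        if it >= burn_in:
            for i in range(n_items):
                mu_sum[i] += mu[i]

    post_means = [s / (n_iter - burn_in) for s in mu_sum]
    consensus = sorted(range(n_items), key=lambda i: -post_means[i])
    return consensus, post_means


if __name__ == "__main__":
    # Toy data: six hypothetical experts, four items; one expert skipped item 2.
    toy = [[0, 1, 2, 3], [0, 2, 1, 3], [0, 1, 2, 3],
           [0, 1, 3], [0, 1, 2, 3], [0, 2, 1, 3]]
    order, means = gibbs_consensus(toy, n_items=4)
    print(order, [round(m, 2) for m in means])
```

The two steps alternate between imputing latent scores that are consistent with each observed ranking and updating the item-level means. A latent regression version would replace the free means with a regression on item covariates and add expert-level terms.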

Links: GitHub, Project Page
