I bridge 10+ years of healthcare operations with modern data engineering and clinical machine learning, turning messy healthcare data into production-grade pipelines and predictive models.
I started my career in healthcare operations, working across payers, hospitals, and clinical practices, and spent a decade learning how healthcare data is actually generated, broken, and used at the ground level.
Now I'm translating that domain depth into a technical skill set: building end-to-end data pipelines aligned to clinical standards, training ML models on real-world healthcare datasets, and evaluating LLMs in clinical contexts.
My edge isn't just the code; it's that I've lived inside the systems the data comes from. I know why claims get denied, what makes a clinical workflow break, and what a data model needs in order to be genuinely useful to a health system.
I'm actively seeking a Data Analyst, Data Scientist, or Analytics Engineer role where I can build data infrastructure with real clinical impact.
End-to-end data engineering, clinical ML, and AI evaluation work built across graduate research and real-world internship experience.
Built a full data engineering pipeline ingesting real adverse event reports from the FDA's openFDA API. Designed an ICH E2B(R3)-aligned MySQL schema, built an automated ETL process, and deployed a Flask + Plotly dashboard for exploratory signal detection. The design mirrors how pharmacovigilance systems work in industry.
Trained logistic regression and random forest classifiers on the TwoSIDES dataset to predict adverse drug-drug interactions. Covers the full ML pipeline, from feature engineering to model evaluation, with a focus on clinical interpretability.
Built a logistic regression model predicting cardiovascular disease (CVD) risk using three independent public clinical datasets. Validated the model across datasets to test generalizability, a key challenge in clinical predictive modeling.
Evaluated LLM performance on clinical tasks at Northwell Health. Conducted structured misclassification analysis to surface failure modes, edge cases, and bias patterns, directly informing responsible deployment decisions.
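As a rough illustration of what a structured misclassification analysis can look like, here is a minimal sketch. It is not the tooling used at Northwell: the record fields (`expected`, `predicted`, `subgroup`), the labels, and the toy data are all hypothetical. It counts which label confusions occur most often and compares error rates across subgroups, which is one simple way to surface failure modes and possible bias patterns.

```python
from collections import Counter


def misclassification_report(records):
    """Summarize model errors from labeled evaluation records.

    Each record is a dict with hypothetical keys:
      'expected'  - the reference label
      'predicted' - the model's label
      'subgroup'  - any slice to compare (e.g. patient population)
    Returns (confusions, error_rates):
      confusions  - Counter of (expected, predicted) pairs for errors only
      error_rates - dict mapping subgroup -> fraction of records misclassified
    """
    confusions = Counter()
    totals = Counter()
    errors = Counter()
    for r in records:
        totals[r["subgroup"]] += 1
        if r["expected"] != r["predicted"]:
            confusions[(r["expected"], r["predicted"])] += 1
            errors[r["subgroup"]] += 1
    error_rates = {group: errors[group] / totals[group] for group in totals}
    return confusions, error_rates


# Synthetic toy data, for illustration only.
records = [
    {"expected": "urgent", "predicted": "urgent", "subgroup": "adult"},
    {"expected": "urgent", "predicted": "routine", "subgroup": "adult"},
    {"expected": "routine", "predicted": "routine", "subgroup": "adult"},
    {"expected": "routine", "predicted": "routine", "subgroup": "pediatric"},
    {"expected": "urgent", "predicted": "routine", "subgroup": "pediatric"},
    {"expected": "routine", "predicted": "urgent", "subgroup": "pediatric"},
]

confusions, error_rates = misclassification_report(records)

# Most frequent confusion first, then error rate per subgroup.
for (expected, predicted), count in confusions.most_common():
    print(f"{expected} -> {predicted}: {count}")
for group, rate in sorted(error_rates.items()):
    print(f"{group}: {rate:.2f}")
```

A large gap in error rate between subgroups, or one confusion pair dominating the counts, is the kind of pattern that would warrant a closer manual review of the underlying cases.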
Building an analytics engineering pipeline using CMS Medicare data. Applying dbt to model claims data from raw staging to production-ready mart tables with data quality tests and documentation.
Developing a structured evaluation framework for clinical NLP using scispaCy and MIMIC-III discharge summaries. Builds on the methodology from my Northwell internship to create reusable evaluation tooling for clinical AI.
Built through graduate coursework, self-directed learning, and applied project work across the full data lifecycle.
A decade of healthcare operations, now intersecting with data science and health AI.
I'm actively seeking Data Scientist and Analytics Engineer roles. If you're building something at the intersection of health data and AI, I'd love to talk.