Machine Learning & Artificial Intelligence | Department of Medicine Statistics Core

Unit Lead

Dennis Ruenger, PhD

Principal Statistician, Division of General Internal Medicine and Health Services Research

Dennis Ruenger, PhD, is a statistician and research psychologist whose work spans cognitive science, behavioral health, and clinical medicine. He earned his doctorate in cognitive psychology at Humboldt-Universität zu Berlin and held postdoctoral appointments at UC Santa Barbara and USC, where he studied the cognitive and neural mechanisms of habit formation. Before joining DOMStat in 2019, he conducted research with high-risk transgender women and substance-using populations at Friends Research Institute in Los Angeles. His training in experimental design and quantitative behavioral science shapes how he translates clinical research questions into well-specified statistical problems.

At DOMStat, Dennis collaborates across cardiology, oncology, pulmonary and critical care, surgery, and health policy. His methodological focus includes clinical prediction modeling, causal inference for observational data, machine learning for clinical decision support, and the use of large language models for thematic analysis of free-text survey data. He has particular experience with difference-in-differences designs and interactive R Shiny-based tools for communicating model output to clinical audiences.

About

We work with investigators on research problems that involve prediction, classification, or extraction of information from complex clinical data. Our team of statisticians focuses on clinical prediction modeling, decision support tool development, natural language processing of electronic health records, and related ML/AI methods. We emphasize approaches rooted in biostatistical thinking (model validation, calibration, interpretability, and fairness) and support projects from study design and grant preparation through analysis and publication.

Services

Grant/Proposal Development Support

Consult on study design for ML/AI-based research projects
Evaluate sample size and power considerations for prediction modeling studies
Draft ML/AI methodology, power and sample size, and analysis plan sections for grant applications
Advise on reporting guidelines and risk-of-bias assessment tools (e.g., TRIPOD, TRIPOD-AI, PROBAST)
Navigate ethical considerations for algorithmic research (bias auditing, fairness)

Predictive Modeling & Clinical Decision Support

Develop and validate clinical prediction models
- Supervised learning: penalized regression (LASSO, ridge, elastic net), random forests, gradient-boosted machines (e.g., XGBoost, LightGBM), support vector machines
- Survival and time-to-event prediction: Cox-based ML methods, random survival forests, penalized Cox regression
- Model selection, hyperparameter tuning, and cross-validation strategies
Perform internal and external validation, calibration assessment, and recalibration; optimize classification thresholds (e.g., decision curve analysis, net benefit)
Stratify patient risk and identify clinical phenotypes
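
As a rough illustration of the net benefit quantity that underlies decision curve analysis, the sketch below computes net benefit at a chosen risk threshold and compares it with a treat-everyone strategy. It is a minimal example under stated assumptions, not the unit's implementation: the function names and the toy data are hypothetical, and a real analysis would add confidence intervals and a full range of thresholds.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    Uses the standard definition: TP/n - (FP/n) * threshold / (1 - threshold).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # odds at the threshold
    return tp / n - weight * fp / n


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes and predicted risks, made up purely for illustration.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.05, 0.5, 0.35]

for pt in (0.1, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"threshold={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

Plotting these values across a range of thresholds gives a decision curve; a model is useful at a given threshold only if its net benefit exceeds both the treat-all strategy and zero (treat none).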

Natural Language Processing (NLP) & Unstructured Data Classification

Clinical Information Extraction & Text Processing
- Extract clinical concepts from free-text EHR notes, pathology reports, and radiology narratives
- Classify text, recognize named entities, and analyze sentiment in medical narratives
- Apply rule-based and validated NLP pipelines to structure free-text clinical data for statistical analysis
Phenotyping & Cohort Construction
- Identify phenotypes and define cohorts from unstructured clinical data
- Develop probabilistic phenotype algorithms integrating structured EHR data and NLP-derived features
- Design and conduct chart review validation studies to evaluate NLP-derived phenotypes
Advanced Methods & LLM Applications
- Leverage large language models (LLMs) and transformer-based architectures for thematic analysis of free-text survey responses and qualitative coding at scale

Model Interpretability & Fairness

Quantify feature importance and variable contributions (SHAP, partial dependence plots, permutation importance)
Audit algorithmic fairness across demographic and clinical subgroups
Prepare transparency reports and model documentation for publication
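
One simple form a subgroup fairness audit can take is a comparison of error rates across groups. The sketch below is a minimal example, assuming binary outcomes, binary predictions, and a single grouping variable; the function name and the toy data are hypothetical, and a real audit would also examine calibration and uncertainty within each subgroup.

```python
from collections import defaultdict


def rates_by_group(y_true, y_pred, groups):
    """Return {group: (sensitivity, false_positive_rate)}.

    A rate is None when the group has no positives (or no negatives).
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for y, yhat, g in zip(y_true, y_pred, groups):
        if y == 1:
            key = "tp" if yhat == 1 else "fn"
        else:
            key = "fp" if yhat == 1 else "tn"
        counts[g][key] += 1

    result = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        sensitivity = c["tp"] / positives if positives else None
        fpr = c["fp"] / negatives if negatives else None
        result[g] = (sensitivity, fpr)
    return result


# Toy labels, predictions, and subgroup membership, made up for illustration.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

for group, (sens, fpr) in sorted(rates_by_group(y_true, y_pred, groups).items()):
    print(f"group={group}  sensitivity={sens}  false_positive_rate={fpr}")
```

Large gaps in sensitivity or false positive rate between groups are a signal to investigate further, not a verdict on their own, since small subgroups produce noisy estimates.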

Interactive Dashboards & Deployment

Build interactive R Shiny and web-based dashboards for model visualization and clinical use
Create real-time prediction interfaces with gauge charts, risk displays, and decision boxes
Deploy ML model prototypes to support research dissemination

Manuscript Preparation & Reporting

Write statistical and methodological sections for ML/AI studies
Prepare TRIPOD-compliant reports of prediction model development and validation
Generate publication-ready figures: ROC/PR curves, calibration plots, SHAP visualizations, feature importance panels
Support responses to reviewers and manuscript revisions

