Course Aim: To link the conceptual and practical views of integrating machine learning and statistical techniques for predicting clinical outcomes.
Course Prerequisite: Programming in Python. All examples will run in Jupyter Notebook.
Course Objectives are to:
- Understand the methodology for developing and validating predictive models for clinical outcomes.
- Learn the main state-of-the-art machine learning and statistical techniques commonly used in the predictive modeling literature.
- Apply the methodology to different use cases.
- Practice these use cases in Python on MIMIC data and on individual-level satisfaction, choice and preference, and response data, using publicly available or provided datasets.
Initial Course Contents:
- Behavioral Use Cases:
- Covariates of patient satisfaction
- Structure of stated preferences
- Predictors of response to an outbound communication campaign
- MIMIC Use Cases:
- Disease diagnosis
- Clustering symptoms and comorbidities
- Day 1: Introduction:
- Part 1: Basics
- Research Question: Participants should be able to distinguish among questions related to classification, prediction, and clustering.
- Descriptive Analysis:
- Summary statistics.
- Statistical hypothesis testing.
- Bayesian approach: frequentist perspective vs. Bayesian perspective.
- Machine Learning Techniques, primarily:
- Classification: rule-based, tree-based, function-based, and Bayesian methods.
- Clustering: K-means and hierarchical clustering.
- Validation Methods and Metrics:
- Methods: cross-validation, bootstrapping, and approaches to validating clustering.
- Metrics: AUC, PPV, NPV, F-measure, purity (clustering), p-values, and confidence intervals.
- Result Interpretation:
- Threshold settings for minimizing classification errors.
- Black-box versus interpretable machine learning techniques.
- Health Data Problems: Debates, Principles, and Methods
- Missing data
- Imbalanced classes
- Part 2: Pandas and sklearn practice
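To give a flavor of the Day 1 practice session, the sketch below ties together several Part 1 topics: cross-validation, AUC, PPV, NPV, threshold settings, and imbalanced classes. It is only an illustration under stated assumptions, not course material: the data are synthetic (generated with scikit-learn's `make_classification`, not MIMIC), logistic regression is an arbitrary model choice, the thresholds 0.2 and 0.5 are arbitrary, and `ppv_npv` is a hypothetical helper written for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Assumption: synthetic, imbalanced binary outcome (roughly 10% positives).
# This is NOT clinical data; it only stands in for a real cohort.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=0
)

model = LogisticRegression(max_iter=1000)
# Stratified folds keep the class ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold predicted probabilities: each row is scored by a model
# that never saw that row during training.
proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
auc = roc_auc_score(y, proba)
print(f"Cross-validated AUC: {auc:.3f}")


def ppv_npv(y_true, scores, threshold):
    """Hypothetical helper: PPV and NPV after thresholding the scores."""
    y_pred = (scores >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Moving the threshold trades PPV against NPV; the values are arbitrary.
for threshold in (0.2, 0.5):
    ppv, npv = ppv_npv(y, proba, threshold)
    print(f"threshold={threshold}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

With rare positive cases, AUC alone can look reassuring while PPV stays low, which is one reason the threshold and the class balance are discussed together.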
- Day 2: Regression and Classification
- Part 1: Single Methods [Regression, Logistic Regression, Decision Tree, KNN, SVM]
- Part 2: Combining Methods [Boosting, Bagging, Voting]
- Day 3: Clustering and Deep Learning Introduction
- Part 1: Clustering [K-means, Hierarchical Clustering, and Model-Based Clustering]
- Part 2: Deep Learning History and Neural Networks
- Day 4: Keras (Deep learning in Python)
- We will work on binary outcomes for classification.
- No time-dependent or longitudinal analysis.
- For each use case or algorithm, we will start with a toy example to demonstrate the related basics.
- Based on course discussion and student feedback, we will continue refining the course materials.
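A toy example of the kind described above might look like the following sketch, which compares three of the Day 2 single methods with a voting ensemble on a binary outcome. Everything specific here is an assumption made for illustration: the data are synthetic (`make_classification`), the hyperparameters are arbitrary, soft voting is just one of several ways to combine methods, and the `results` dictionary is a hypothetical name.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: a small synthetic binary-outcome dataset, not course data.
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

# Single methods; scaling is wrapped in a pipeline so it is re-fit
# inside every cross-validation fold (no leakage from held-out data).
single_models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=42),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Combined method: soft voting averages the predicted probabilities.
voting = VotingClassifier(estimators=list(single_models.items()), voting="soft")

results = {}
for name, model in {**single_models, "soft_voting": voting}.items():
    # 5-fold cross-validated AUC for each candidate model.
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    results[name] = scores.mean()
    print(f"{name}: mean AUC = {results[name]:.3f}")
```

The ensemble is not guaranteed to beat every single method; comparing them under the same cross-validation scheme is the point of the exercise.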
Course Fee (4-day course):
Students (Undergraduate, Graduate & Post Doc) $60.00
Faculty & Non-Academic $180.00
Entire Summer Course (choose up to eleven courses):
Students (Undergraduate, Graduate & Post Doc) $125.00
Faculty & Non-Academic $400.00