July 25-28, 2017
(Instructors: Samir Abdelrahman and Andrew Redd)
Course Aim: To describe how the conceptual and practical views of integrating machine learning and statistical techniques are linked when predicting clinical outcomes.
Course Objectives are to:
- Understand the methodology for developing and validating predictive models for clinical outcomes.
- Learn the main state-of-the-art machine learning and statistical techniques commonly used in the predictive modeling literature.
- Apply the methodology to different use cases.
- Practice these use cases in Python/R on MIMIC, a publicly available dataset.
Initial Course Contents:
- Day 1: Introduction:
- Research Question: Participants should understand the differences among questions related to classification, prediction, and clustering.
- Data Quality Methods: These include basic statistical methods and distributions used to identify noise, outliers, and missing data.
- Predictor Selection Methods: These include machine learning feature selection methods and hypothesis tests.
- Machine Learning Techniques: These primarily include:
- Classification: Rule-based, tree-based, function-based, and Bayesian methods
- Clustering: K-means and hierarchical clustering
- Validation Methods and Metrics: These include:
- Cross-validation and bootstrapping methods, and how to validate clustering.
- Metrics: AUC, PPV, NPV, F-measure, purity (clustering), p-values, and confidence intervals.
- Result Interpretation:
- Threshold setting
- Black-box versus interpretable machine learning techniques
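To make the Day 1 metrics and threshold-setting topics concrete, here is a minimal sketch in plain Python of how PPV, NPV, sensitivity, and the F-measure depend on the chosen threshold, and how AUC can be computed without choosing one. The function names and the toy labels and scores are hypothetical illustrations, not course material or MIMIC data; in practice a library implementation would normally be used.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def ratio(num, den):
    """Divide safely; return NaN when the denominator is zero."""
    return num / den if den else float("nan")


def threshold_metrics(y_true, scores, threshold):
    """PPV, NPV, sensitivity, and F-measure (F1) at one threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    return {
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical binary outcomes and predicted risks (toy values).
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print("AUC:", auc(y_true, scores))  # 0.9375
for t in (0.35, 0.5):
    print("threshold", t, threshold_metrics(y_true, scores, t))
```

Lowering the threshold from 0.5 to 0.35 in this toy data raises sensitivity at the cost of no additional false positives, while AUC stays the same because it does not depend on any single threshold.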
- Days 2 and 3: Use Cases: [Apply individual techniques from the above]
- Disease Diagnosis
- Clustering symptoms and comorbidities
- Day 4: Combining Modeling Techniques: [Apply approaches that combine the above techniques]
- Super Learner
- Bagging and Boosting
- Voting and Stacking
- Meta-Classification: A classifier that uses another classifier.
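As a rough illustration of two of the Day 4 ideas, the sketch below combines bagging (training each model on a bootstrap resample) with majority voting for a binary outcome. The base learner is a one-feature decision stump chosen only to keep the example short; the function names, the toy data, and the number of models are assumptions made for illustration, and boosting, stacking, and the Super Learner are not shown.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-feature decision stump that predicts 1 when x >= cut.

    Every observed value, plus +infinity (predict all 0), is tried as a
    cut; the first cut with the fewest training errors is kept.
    """
    best_cut, best_err = None, None
    for cut in sorted(set(xs)) + [float("inf")]:
        err = sum((1 if x >= cut else 0) != y for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_cut, best_err = cut, err
    return best_cut


def predict_stump(cut, x):
    """Predict the binary outcome for one value with a fitted stump."""
    return 1 if x >= cut else 0


def majority_vote(votes):
    """Return the most common label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Bagging: fit one stump per bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    cuts = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        cuts.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return cuts


def predict_bagged(cuts, x):
    """Combine the stumps' predictions by majority vote."""
    return majority_vote([predict_stump(c, x) for c in cuts])


# Hypothetical single predictor and binary outcome (toy values).
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

cuts = fit_bagged_stumps(xs, ys)
for x in (0.5, 9.5):
    print(x, "->", predict_bagged(cuts, x))
```

An odd number of models is used so that a binary majority vote cannot tie. Stacking would replace the fixed vote with a second model trained on the base models' predictions.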
- We will work on binary outcomes for classification.
- No time-series or longitudinal analysis will be covered.
- For each use case, we will present an overview of the predictive modeling literature and use the best-performing approaches (if any).
- Based on the literature overview for each use case and further discussion, we will continue refining the course materials.