Healthcare data are often available to payers and healthcare systems in real time, but are massive, high dimensional, and complex. Machine learning merges statistics, computer science, artificial intelligence, and information theory, and offers powerful computational tools to extract useful information from complex healthcare data, build highly interpretable models, and make accurate predictions. This course gives an overview of the foundational principles and concepts of statistical machine learning, then introduces several commonly used machine learning techniques and their practical applications in healthcare and pharmaceutical outcomes research. Different machine learning approaches will be demonstrated in R, including tree-based methods, penalized regression, and neural network analysis, as well as techniques for dimension reduction/feature selection. Participants will gain hands-on experience with machine learning, including interpreting and evaluating the results and prediction performance of machine learning models.
The distinction between prediction modeling and causal inference research on real-world data in pharmacoepidemiology will also be presented and discussed. This is an entry-level course, but it is designed for those with some familiarity with traditional statistical modeling techniques (eg, linear regression, logistic regression).
Participants who wish to gain hands-on experience are required to bring their laptops with R and RStudio installed.