Robert B. Gramacy, Professor of Statistics

Intermediate Data Analytics and Machine Learning

CMDA/CS/STAT 4654 is a technical analytics course that will teach supervised and unsupervised learning strategies, including regression, generalized linear models, regularization, dimension reduction methods, tree-based methods for classification, and clustering. Upper-level analytical methods, e.g., neural networks and Gaussian processes, are shown in practice. It is targeted towards students who have completed (and remember the concepts from) a course in introductory statistics and mathematical modeling. We will make extensive use of calculus, linear algebra, and probability. Computational tools, such as the R language for statistical computing, will be used for illustration in class and will be essential for completing homework problems.


  • Class is canceled on Wednesday Feb 7. Office hours on Tuesday Feb 6 are canceled. Monday and Wednesday (TA) office hours are still on. Homework 1 is still due Feb 7.
  • The TA will hold office hours in the Old Security Building, Wed 1-2pm and Thu 11am-12pm.
  • Lectures will primarily be slide-based, supplemented by board calculations and computing demonstrations in R. For complete notes you must come to class!


Homework Due at the start of lecture


The recommended language for this course is R, which can be obtained from CRAN. Other languages such as MATLAB are allowed but are not recommended. Examples in lecture, and help in office hours, etc., will be exclusively in R. Below are some helpful R resources: