Robert B. Gramacy, Professor of Statistics

Teaching

I regularly teach the following undergraduate classes through the Department of Statistics at Virginia Tech.

(Int.) Data Analytics (& Machine Learning)

STAT/CS 5525 (CMDA/CS/STAT 4654) is a technical analytics course that will teach supervised and unsupervised learning strategies, including regression, generalized linear models, regularization, dimension reduction methods, tree-based methods for classification, and clustering. Upper-level analytical methods, e.g., neural networks and Gaussian processes, are shown in practice. It is targeted towards students who have completed (and remember the concepts from) a course in introductory statistics and mathematical modeling. We will make extensive use of calculus, linear algebra, and probability. Computational tools, such as the R language for statistical computing, will be used for illustration in class and will be essential for completing homework problems.

(4654) Course Syllabus (5525) Course Syllabus

Nonparametric Statistics

STAT 3504 is an undergraduate course focused on statistical methodology based on ranks, empirical distributions, and runs. Topics include one- and two-sample tests, ANOVA, correlation, goodness of fit, rank regression, R-estimates, and confidence intervals. We will compare these methods with classical parametric methods, with an emphasis on assumptions and interpretation. It is targeted towards students who have completed (and remember the concepts from) a course in introductory statistics. We will make extensive use of computational tools, such as the R language for statistical computing, both for illustration in class and in homework problems.

Course Syllabus

Integrated Quantitative Science II

CMDA 2006 is a second class on statistical and applied mathematical methods. Statistical topics approach modern methodology and implementation, beginning by revisiting the basics: probability distributions, principles of estimation, likelihood, sampling distributions, multivariate analysis, linear regression, ANOVA, categorical data, nonlinear regression, nonparametric methods, and simulation. We introduce and make extensive use of computational tools, such as the R language for statistical computing, both for illustration in class and in homework problems.

Course Syllabus

In alternating years, I usually teach one of the following two graduate classes through the Department of Statistics at Virginia Tech.

Advanced Statistical Computing

STAT 6984 is a second (graduate) course on statistical computing. Although basics will be revisited, the pace will be swift. The main programming language will be R, but we will explore many other languages and tools. We will learn how statisticians can best leverage modern desktop computing (multiple cores), cluster computing (multiple nodes), and distributed computing (Hadoop/Amazon EC2). Part of that preparation will be going "back to basics": navigating the Unix shell, manipulating data therein, compiling libraries with make, version control (e.g., Git), and good habits/best practices in code development and data management.

Course Syllabus

Surrogate Modeling

STAT 6544 is a graduate "topics" statistics course at the interface between mathematical modeling via computer simulation, computer model meta-modeling (i.e., emulation/surrogate modeling), calibration of computer models to data from field experiments, and model-based sequential design and optimization under uncertainty. The treatment will include some of the historical methodology in the literature, and canonical examples, but will concentrate on modern statistical methods, computation, and implementation in R, as motivated by the type and size of modern applications and data.

Course Syllabus

Previously, I gave the following graduate classes within the Booth School of Business at the University of Chicago. See my CV for a more complete teaching record, including classes given while at the Statistical Laboratory at the University of Cambridge.

Bayesian Inference

BUS 41913 is a graduate course in Bayesian inference. The course will focus on understanding the principles underlying Bayesian modeling and on building experience in the use of Bayesian analysis for making inferences about real-world problems. Particular attention will be paid to the computational techniques (e.g., MCMC) needed for most problems and their implementation in the R language for statistical computing.

Course Syllabus

Applied Regression Analysis

BUS 41100 (Sections 01, 02 and 085) is a course about regression, a powerful and widely used data analysis technique. Students will learn how to use regression to analyze a variety of complex real-world problems. Heavy emphasis will be placed on analysis of actual datasets and on implementation in the R language for statistical computing. Topics covered include simple linear regression, multiple regression, prediction, variable selection, residual diagnostics, time series (auto-regression), and classification (logistic regression).

Course Syllabus