Robert B. Gramacy Professor of Statistics
Teaching
I regularly teach the following undergraduate classes through the Department of Statistics at Virginia Tech.
(Int.) Data Analytics (& Machine Learning)
STAT/CS 5525 (CMDA/CS/STAT 4654)
is a technical analytics course that will teach supervised and unsupervised learning strategies, including regression, generalized linear models, regularization, dimension reduction methods, tree-based methods for classification, and clustering. Upper-level analytical methods are shown in practice: e.g., neural networks and Gaussian processes. It is targeted towards students who have completed (and remember the concepts from) a course in introductory statistics and mathematical modeling.
We will make extensive use of calculus, linear algebra, and probability.
Computational tools, such as the R language for statistical computing, will be used for illustration in class and will be essential for completing homework problems.
Nonparametric Statistics
STAT 3504
is an undergraduate course focused on statistical methodology based on ranks, empirical distributions, and runs.
Topics include one- and two-sample tests, ANOVA, correlation, goodness of fit, rank regression, R-estimates and confidence intervals.
We will compare these methods with classical parametric methods, with an emphasis on assumptions and interpretation.
It is targeted towards students who have completed (and remember the concepts from) a course in introductory statistics.
We will make extensive use of computational tools, such as the R language for statistical computing, both for illustration in class and in homework problems.
Integrated Quantitative Science II
CMDA 2006
is a second class on statistical and applied mathematical methods. Statistical topics approach
modern methodology and implementation, beginning
by revisiting the basics: probability distributions, principles of estimation, likelihood,
sampling distributions, multivariate analysis, linear regression, ANOVA, categorical
data, nonlinear regression, nonparametric methods and simulation.
We introduce and make extensive use of computational tools, such as the R language for statistical computing, both for illustration in class and in homework problems.
In alternating years, I usually teach one of the following two graduate classes, through the Department of Statistics at Virginia Tech.
Advanced Statistical Computing
STAT 6984
is a second (graduate) course on statistical computing. Although basics will be revisited,
the pace will be swift. The main programming language will be R,
but we will explore many other languages and tools. We will learn how statisticians can best leverage modern
desktop computing (multiple cores), cluster computing (multiple nodes) and distributed computing (Hadoop/Amazon EC2). Part of that preparation will be going "back to basics": navigating the Unix shell,
manipulating data therein, compiling libraries with make, version control
(e.g., Git), and good habits/best practice with code development and data management.
Surrogate Modeling
STAT 6544 is a graduate "topics" statistics course at the interface between mathematical modeling via
computer simulation, computer model meta-modeling (i.e., emulation/surrogate modeling), calibration of
computer models to data from field experiments, and model-based sequential design and optimization under
uncertainty. The treatment will include some of the historical methodology in the literature, and canonical
examples, but will concentrate on modern statistical methods, computation and implementation in R,
as motivated by modern application/data type and size.
Previously, I gave the following graduate classes within the Booth School of Business at the University of Chicago. See my CV for a more complete teaching record, including classes given while at the Statistical Laboratory at the University of Cambridge.
Bayesian Inference
BUS 41913 is a graduate course in Bayesian inference. The course will focus on understanding the principles underlying Bayesian modeling and on building experience in the use of Bayesian analysis for making inference about real-world problems. Particular attention will be paid to the computational techniques (e.g., MCMC) needed for most problems and their implementation in the R language for statistical computing.
Applied Regression Analysis
BUS 41100 (Sections 01, 02 and 085) is a course about regression, a powerful and widely used data analysis technique. Students will learn how to use regression to analyze a variety of complex real-world problems. Heavy emphasis will be placed on analysis of actual datasets, and implementation in the R language for statistical computing. Topics covered include: simple linear regression, multiple regression, prediction, variable selection, residual diagnostics, time series (auto-regression), and classification (logistic regression).