Robert B. Gramacy Professor of Statistics

Multivariate normal inference under monotone missingness

monomvn is an R package for estimation of multivariate normal and Student-t data of arbitrary dimension where the pattern of missing data is monotone. Through the use of parsimonious/shrinkage regressions (plsr, pcr, lasso, ridge, etc.), where standard regressions fail, the package can handle a nearly arbitrary amount of missing data.
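To make "monotone" concrete: a missingness pattern is monotone when the columns can be ordered so that, in every row, once a column is missing all later columns are missing too. The base-R sketch below checks that property. The helper name `is_monotone` is hypothetical and is not part of monomvn; it is only meant to illustrate the definition.

```r
# Hypothetical helper (not part of monomvn): TRUE if the NA pattern of y is
# monotone after sorting columns by their number of missing entries.
is_monotone <- function(y) {
  miss <- is.na(as.matrix(y))
  # Order columns from most observed to least observed.
  miss <- miss[, order(colSums(miss)), drop = FALSE]
  if (ncol(miss) < 2) return(TRUE)
  # Monotone: wherever a column is missing, the next column is missing too.
  for (j in seq_len(ncol(miss) - 1)) {
    if (any(miss[, j] & !miss[, j + 1])) return(FALSE)
  }
  TRUE
}

# Made-up data: histories of decreasing length give a monotone pattern.
y <- cbind(a = c(1, 2, 3, 4),
           b = c(1, 2, 3, NA),
           c = c(1, 2, NA, NA))
is_monotone(y)

# Row 1 now misses b but still observes c, which breaks monotonicity.
y2 <- y
y2[1, "b"] <- NA
is_monotone(y2)
```

Sorting by missing-value count is enough here because, in a monotone pattern, the sets of missing rows are nested, so only adjacent columns need to be compared.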

This software is licensed under the GNU Lesser General Public License (LGPL), version 2 or later. See the change log and an archive of previous versions.

The current version provides:

Maximum likelihood inference, optionally with parsimonious/shrinkage regressions such as ridge, lasso, partial least squares, principal components, etc.
Bayesian inference employing scale-mixture data augmentation
A fully functional standalone interface to Bayesian lasso (from Park & Casella), Normal-Gamma (from Griffin & Brown), horseshoe (from Carvalho, Polson, & Scott), and ridge regression, with model selection via Reversible Jump and Student-t errors (from Geweke)
Monotone data augmentation, which extends the Bayesian approach to arbitrary missingness patterns
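As a rough sketch of the standalone Bayesian lasso interface, the example below calls `blasso` on simulated data once the package is installed as described under "Obtaining the package". The data and the "true" coefficients are made up for illustration, and the burn-in length of 100 is an arbitrary assumption rather than a recommendation.

```r
library(monomvn)

# Simulated regression data (illustrative values only).
set.seed(1)
n <- 50
m <- 5
X <- matrix(rnorm(n * m), n, m)
beta_true <- c(2, 0, 0, -1, 0)
y <- drop(X %*% beta_true + rnorm(n))

# T is the number of MCMC rounds; RJ = TRUE enables Reversible Jump model
# selection; verb = 0 suppresses progress output.
fit <- blasso(X, y, T = 1000, RJ = TRUE, verb = 0)

# Rows of fit$beta are posterior samples of the coefficients.
# Drop an assumed burn-in of 100 rounds before summarizing.
burn <- 100
colMeans(fit$beta[-seq_len(burn), , drop = FALSE])
```

See the `blasso` help page for how to select the other priors listed above and for the full set of arguments.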

Obtaining the package

Download R from cran.r-project.org by selecting the version for your operating system.
Install the monomvn, pls, and lars packages from within R.
R> install.packages(c("monomvn", "pls", "lars"))
Optionally, install the mvtnorm and accuracy packages.
R> install.packages(c("mvtnorm", "accuracy"))
Load the package as you would any other R package.
R> library(monomvn)
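
With the package loaded, a first fit might look like the sketch below. It uses the `cement.miss` data set that ships with monomvn (Hald's cement data with a monotone missingness pattern, as analyzed in Little and Rubin) and the default settings; the `method` argument of `monomvn` chooses which parsimonious regression is used when a standard regression is not feasible.

```r
library(monomvn)

# Example data bundled with the package: monotone missingness pattern.
data(cement.miss)

# Maximum likelihood fit with default settings.
fit <- monomvn(cement.miss)

fit$mu  # estimated mean vector
fit$S   # estimated covariance matrix
```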

Documentation

See the package documentation. A PDF version of the reference manual (the help pages) is also available. The help pages can also be accessed from within R. Try starting with:
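
The standard R help calls below are one way in, assuming monomvn is already installed. The first lists every documented topic in the package, and the second opens the help page for the main fitting function.

```r
library(monomvn)

# Index of all help topics in the package.
help(package = "monomvn")

# Help page for the main function.
?monomvn
```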

References

Gramacy, R.B., Pantaleo, E. (2010). Shrinkage regression for multivariate inference with missing data, and an application to portfolio balancing. Bayesian Analysis 5(2), pp. 237-262; preprint on arXiv:0907.2135
Gramacy, R.B., Lee, J.H. (2007). On estimating covariances between many assets with histories of highly variable length. arXiv:0710.5837
Roderick J.A. Little and Donald B. Rubin (2002). Statistical Analysis with Missing Data, Second Edition. Wiley.
Bjorn-Helge Mevik and Ron Wehrens (2007). The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software 18(2)
Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least Angle Regression (with discussion). Annals of Statistics 32(2)
Park, T., Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association 103(482), pp. 681-686
Griffin, J.E., Brown, P.J. (2010). Inference with Normal-Gamma prior distributions in regression problems. Bayesian Analysis 5(1), pp. 171-188
Carvalho, C.M., Polson, N.G., and Scott, J.G. (2010) The horseshoe estimator for sparse signals. Biometrika 97(2): pp. 465-480.
Geweke, J. (1996). Variable selection and model comparison in regression. In Bayesian Statistics 5. Editors: J.M. Bernardo, J.O. Berger, A.P. Dawid and A.F.M. Smith, 609-620. Oxford Press.
Trevor Hastie, Robert Tibshirani and Jerome Friedman (2002). Elements of Statistical Learning. Springer, NY.
Some of the code for monomvn, and its subroutines, was inspired by code written by Daniel Heitjan.