November 7, 2017



Rcpp is one of the most popular package on CRAN.

It is dedicated to simplify the connection between R and C++.

Useful resources:

  • Advanced R by Hadley Wickham

  • Rcpp gallery

  • Eddelbuettel, Dirk (2013) Seamless R and C++ Integration with Rcpp. Springer, New York. ISBN 978-1-4614-6867-7.

Installing Rcpp


  • Windows: Rtools

  • OSX: Xcode command line tools

  • Linux (e.g., Debian/Ubuntu): core software developement utilities

sudo apt-get install r-base-dev

Let's check that it works:

## [1] 2

R programming

  • Vectorization is primordial for performant R code (related functions are loops in C…)

  • But you need a lot of R vocabulary …

  • … and the code is somewhat harder to read

  • in addition, not always possible

Why C++

  • loops are ok!

  • many existing and optimized built-in libraries (e.g., STL)

  • recursion

  • much easier than C or Fortran (with Rcpp)


  • development time is longer, harder debugging
  • need to learn another language

Rcpp is here to make C++ code similar to R code

  • Well integrated in RStudio

  • but the resulting code is less generic


A useful package for comparing pieces of code: microbenchmark.


V <- runif(1e7)

microbenchmark(V*0.1, V/10)


Rcpp types

First example

Let's create a .cpp file, usually in a src folder

#include <Rcpp.h> 
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
  return x * 2;

and source it (e.g., with sourceCpp(file = ) or from RStudio).

Can also be used directly in 'R':

cppFunction("NumericVector timesTwo(NumericVector x) {
  return x * 2;

Second example: column-wise multiplication

We aim at rescaling columns. R versions:

## Don't do this
naive_mat <- function(m, v){
  for(i in 1:nrow(m)){
    for(j in 1:ncol(m)){m[i, j] <- m[i, j]/v[j]}

## Better
naive_mat2 <- function(m, v){
  for(j in 1:ncol(m)){m[, j] <- m[, j]/v[j]}

Second example (cont'd)

By Googling, you can find more efficient versions:

m <- matrix(rnorm(1200000), ncol=600)
v <- rep(c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5), length = ncol(m))

microbenchmark(# naive_mat(m,v), ## way too slow
  naive_mat2(m, v),
  t(t(m) * v), 
  m %*% diag(v),
  m * rep(v,,length(v))), 
  m * rep(v, rep(nrow(m),length(v))), 
  m * rep(v, each = nrow(m)))

Second example (Rcpp version)

## Create a new matrix and return it
NumericMatrix naive_mat_cpp0(NumericMatrix m, NumericVector v){
  NumericMatrix m2(m.nrow(), m.ncol());
  for(int i = 0; i < m.nrow(); i++){
    for(int j = 0; j < m.ncol(); j++){
      m2(i, j) = m(i, j) * v(j);

microbenchmark(m * rep(v,,length(v))),
               naive_mat_cpp0(m, v))

Second example (alternative Rcpp version)

## Modifies m directly
void naive_mat_cpp1(NumericMatrix m, NumericVector v){
  for(int i = 0; i < m.nrow(); i++){
    for(int j = 0; j < m.ncol(); j++){
      m(i, j) *= v(j); // equivalent to m(i, j) = m(i, j) * v(j)

print(m[1:3,1:3]); naive_mat_cpp1(m, v); print(m[1:3,1:3])

microbenchmark(m * rep(v,,length(v))),
               naive_mat_cpp1(m, v))

Second example (cont'd)

What exactly is slow? Accessing the elements (e.g., m[i,j] or m(i,j))

Instead, let's exploit that matrices are vectors (stored by columns):

## Modifies m directly, works with pointers
cppFunction("void naive_mat_cpp_p(NumericMatrix m, NumericVector v){
  double* ptrm = &m(0,0); // pointer to m
  double* ptrv = &v(0); // pointer to v
  for(int j = 0; j < m.ncol(); j++, ptrv++){
    for(int i = 0; i < m.nrow(); i++, ptrm++){
      *ptrm *= *ptrv;

microbenchmark(m * rep(v,,length(v))),
               naive_mat_cpp_p(m, v))

Main differences

  • indices starts at 0

  • explicit return with predefined type

  • semicolons at the end of statements

  • use parenthesis instead of square brackets to access elements

  • C++ bool is true or false, R logical is (TRUE, FALSE, NA)

  • careful with types, e.g., division by integer:

evalCpp("5/3"); evalCpp("5/3.")
## [1] 1
## [1] 1.666667

Third example: STL library

The Standard Template Library contains useful data structures.

  IntegerVector which_cpp(NumericVector v, double d){
    std::vector<int> res; // empty vector
    for(int i = 0; i < v.length(); i++){
      if(v[i] >= d) res.push_back(i+1);

w <- runif(1e7)

microbenchmark(which(w >= 0.99999),
               which_cpp(w, 0.99999))


Rcpp sugar

Make C++ functions look like their R equivalent.

No need to reinvent the wheel.

Available functions include:

  • logical operators (<, <=, >, >=, ==, !=, !)

  • arithmetic operators (+, -, *, /)

  • logical summary function (any(), all())

  • stats functions (p/d/r/q)

  • other functions (abs, pow, diag, pmin, sapply))

Sugar example


any_naR <- function(x) any(

cppFunction("bool any_naC(NumericVector x) {
  return is_true(any(is_na(x)));

x0 <- runif(1e5); x1 <- c(x0, NA); x2 <- c(NA, x0)

  any_naR(x0), any_naC(x0),
  any_naR(x1), any_naC(x1),
  any_naR(x2), any_naC(x2)


R functions can be called from C++

 NumericMatrix chol_from_cpp(NumericMatrix M, Function f){
xgrid <- seq(0,1, length.out = 101)

M <- exp(-(as.matrix(dist(xgrid)^2))) + diag(1e-6, 101)

chol_M <- chol_from_cpp(M, chol)

Robject is more generic to collect the output of an R function.


Linear algebra

Popular Rcpp package include RcppArmadillo and RcppEigen relying on the corresponding C++ template libraries.


Rcpp works with OpenMP (maybe not on MacOS) by adding

  • // [[Rcpp::plugins(openmp)]]

to the .cpp file.


Rcpp Modules provide a way to expose more complex C++ classes and the related functions to R.

One example is given at the end of the Quick reference guide

Creating packages

Creating a Rcpp package from RStudio

It is well automated.

Here: R 3.4.2 and Rcpp 0.12.13, devtools 1.11.1

Also, roxygen2 makes documenting functions easier.