## Instructions

This homework is due on Tuesday, December 5th at 12:30pm (the start of class). Please turn in all your work. The primary purpose of this homework is to get familiar with ARC. This description may change at any time; however, notices about substantial changes (requiring more or less work) will also be posted on the class web page. Note that submission has two parts: Canvas and Bitbucket (in asc-repo/hwk/hw6). You don’t need to use R Markdown, but your work should be just as pretty if you want full marks.

## Problem 1: Duplicating the examples (33 pts)

Get the MCPI and MH (snow) examples from the ARC R User Guide to work on dragonstooth. You may need to modify the qsub scripts. Provide evidence that the code worked, which may include a summary of the output, a jobload summary, the output of gstatement -h -a ascclass, etc.

## Problem 2: Predicting satellite drag (33 pts)

Run the satellite drag bakeoff, provided on the class web page, on dragonstooth.

- Allocate five nodes (so that five-fold CV is used); you may need to reserve up to 12 hours.
- Provide evidence, via jobload, that the code is fully utilizing the resources you have allocated; when it is done, show the allocation expenditure via gstatement -h -a ascclass.
- Provide a boxplot of the resulting RMSPEs.
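
To illustrate why five nodes pair naturally with five-fold CV, here is a minimal sketch of a random five-fold partition in which each held-out fold could be handled by one node. It is not the bakeoff code from the class web page; the helper `make_folds`, the seed, and the sample size of 1000 are hypothetical choices made only for illustration.

```python
import random

def make_folds(n, k=5, seed=0):
    """Randomly partition indices 0..n-1 into k near-equal folds (hypothetical helper)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Taking every k-th shuffled index gives k disjoint folds covering all indices.
    return [idx[i::k] for i in range(k)]

n = 1000  # illustrative sample size, not the size of the satellite drag data
folds = make_folds(n)
for node, test_idx in enumerate(folds, start=1):
    held_out = set(test_idx)
    train_idx = [i for i in range(n) if i not in held_out]
    print(f"node {node}: train on {len(train_idx)}, predict {len(test_idx)}")
```

Each node would then fit on its training indices and predict its own held-out fold, so the five folds run at the same time rather than one after another.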

## Problem 3: Spam Bakeoff (34 pts)

Revisit the spam bakeoff from homework 4:

1. first with GNU parallel on cascades; you may wish to consult spam_mc.qsub, but you will need to make some modifications in order to get GNU parallel to distribute runs to multiple nodes;
2. then with parallel/Rmpi on dragonstooth; you may wish to consult spam_snow.R.

In both cases you must set things up so that all cores of five nodes are fully utilized simultaneously.

- Perform at least thirty reps of 10-fold CV.
- Note that on cascades there are 32 cores per node, but only 24 on dragonstooth.
- Also note that nothing is OMP-parallelized here (although MKL is used), so you don’t need a special mpirun setup as we did for satellite drag. (Using those arguments will slow things down.)
- Provide evidence, via jobload, that the code is fully utilizing the resources you have allocated; when it is done, show the allocation expenditure via gstatement -h -a ascclass.
- Provide a boxplot of the resulting hit rates, side-by-side for parts 1 and 2.
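
As a rough planning aid for the requirements above, the sketch below enumerates the (rep, fold) tasks for thirty reps of 10-fold CV and compares the task count with the cores available on five nodes. It assumes the core counts quoted above are per node and that each (rep, fold) pair is one independent single-core task; the names `cv_tasks` and `rounds_needed` are hypothetical and are not taken from spam_mc.qsub or spam_snow.R.

```python
import math
from itertools import product

def cv_tasks(reps=30, folds=10):
    """Enumerate (rep, fold) pairs, one per independent CV task (hypothetical helper)."""
    return list(product(range(1, reps + 1), range(1, folds + 1)))

def rounds_needed(n_tasks, nodes, cores_per_node):
    """Lower bound on rounds of work if every core runs one task at a time."""
    return math.ceil(n_tasks / (nodes * cores_per_node))

tasks = cv_tasks()
# Core counts per node as stated in the problem (assumed to be per-node figures).
for cluster, cores in [("cascades", 32), ("dragonstooth", 24)]:
    total = 5 * cores
    print(f"{cluster}: {len(tasks)} tasks on {total} cores, "
          f"at least {rounds_needed(len(tasks), 5, cores)} rounds")
```

Under these assumptions there are 300 tasks for 160 cores on cascades and 120 on dragonstooth, so the final round leaves some cores idle. Since the problem asks for at least thirty reps, the number of reps is one knob that could be adjusted to balance the load.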