Department of Statistics, Virginia Tech

This homework is due on **Tuesday, November 14th at 2pm** (the start of class). Please turn in all your work. This homework primarily covers statistical tests of variances based on ranks, and measures of rank correlation.

*Test the following data to see if the variation in high temperature in Des Moines is higher than that for the high temperature in Spokane, for randomly sampled days in the summer.*

```
desmoines <- c(83, 91, 94, 89, 89, 96, 91, 92, 90)
spokane <- c(78, 82, 81, 77, 79, 81, 80, 81)
```

- (10 pts)
*Perform the test “by hand”: show the ranking, clearly state the hypotheses, test statistic and conclusion.*

The hypotheses are

\[ \begin{aligned} \mathcal{H}_0 &: \; \sigma^2_\mathrm{S} \geq \sigma^2_\mathrm{DM} \\ \mathcal{H}_1 &: \; \sigma^2_\mathrm{S} < \sigma^2_\mathrm{DM} \end{aligned} \]

- Equivalently, we could frame the hypotheses in terms of independence in the null.

This is a lower-tailed test, and we will treat Spokane as \(X\) and Des Moines as \(Y\).
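For reference, the squared ranks test is based on the absolute deviations from the (estimated) means,

\[ U_i = |X_i - \bar{X}|, \qquad V_j = |Y_j - \bar{Y}|, \]

and the test statistic is the sum of the squared ranks of the \(U_i\) in the combined sample of \(U\)s and \(V\)s,

\[ T = \sum_{i=1}^{n_x} R(U_i)^2. \]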

```
x <- spokane
y <- desmoines
xbar <- mean(x)
ybar <- mean(y)
```

Now we can calculate the \(U\)s and \(V\)s.

```
u <- abs(x - xbar)
v <- abs(y - ybar)
```

Next calculate the ranks of the combined sample.

```
r <- rank(c(u, v))
ru <- r[seq_along(u)]
rv <- r[-seq_along(u)]
list(spokane=ru, desmoines=rv)
```

```
## $spokane
## [1] 12 13 7 14 5 7 1 7
##
## $desmoines
## [1] 17.0 2.5 15.0 10.5 10.5 16.0 2.5 9.0 4.0
```

- As you can see there are several ties.

Let's calculate \(t\), the observed value of \(T\).

```
t <- sum(ru^2)
t
```

`## [1] 682`

Now let's calculate the standardized version, \(t_1\) for \(T_1\), which is appropriate when we have ties. First some auxiliary calculations …

```
nx <- length(x)
ny <- length(y)
N <- nx + ny
r2bar <- (sum(ru^2) + sum(rv^2))/N
r2bar
```

`## [1] 104.8235`

```
r4sum <- sum(r^4)
r4sum
```

`## [1] 326429.2`

We’re ready to go.
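In symbols, the ties-corrected statistic computed below is

\[ T_1 = \frac{T - n_x \bar{R^2}}{\sqrt{\frac{n_x n_y}{N(N-1)} \sum_{i=1}^N R_i^4 - \frac{n_x n_y}{N-1} \left( \bar{R^2} \right)^2}}, \]

where the \(R_i\) are the ranks of the combined sample and \(\bar{R^2}\) is their average squared value; under \(\mathcal{H}_0\), \(T_1\) is approximately standard normal.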

```
t1 <- (t - nx*r2bar)/sqrt(nx*ny/(N*(N-1))*r4sum - nx*ny/(N-1)*r2bar^2)
t1
```

`## [1] -0.8144833`

And the \(p\)-value for this lower-tailed test may be calculated as follows.

`pnorm(t1)`

`## [1] 0.2076841`

- We fail to reject the null hypothesis at the 5% level, and conclude there is not enough evidence to say that the variation in high temperature in Des Moines is higher than that in Spokane.

- (5 pts)
*Perform the test “using a software library”.*

Below we borrow the library we coded up in class. I have added the suffix "`approx`" to clarify that the CLT is used.

```
sqranks.test.approx <- function(x, y, alternative=c("two.sided", "greater", "less"))
{
  ## check for partial matches of the alternative argument
  alternative <- match.arg(alternative)

  ## calculate the absolute discrepancies from the (estimated) means
  xbar <- mean(x)
  ybar <- mean(y)
  u <- abs(x - xbar)
  v <- abs(y - ybar)

  ## calculate the observed ranks, and t
  r <- rank(c(u, v))
  ru <- r[seq_along(u)]
  rv <- r[-seq_along(u)]
  t <- sum(ru^2)

  ## calculate the standardized version, t1
  nx <- length(x)
  ny <- length(y)
  N <- nx + ny
  r2bar <- (sum(ru^2) + sum(rv^2))/N
  r4sum <- sum(r^4)
  t1 <- (t - nx*r2bar)/sqrt(nx*ny/(N*(N-1))*r4sum - nx*ny/(N-1)*r2bar^2)

  ## calculate the p-value
  if(alternative == "two.sided")
    phi <- 2*pnorm(-abs(t1))
  else if(alternative == "less")
    phi <- pnorm(t1)
  else if(alternative == "greater")
    phi <- pnorm(t1, lower.tail=FALSE)

  ## return the stats and p-value
  return(list(t=t, t1=t1, p.value=phi))
}
```

Now we can use it on our data.

`sqranks.test.approx(spokane, desmoines, alternative="less")`

```
## $t
## [1] 682
##
## $t1
## [1] -0.8144833
##
## $p.value
## [1] 0.2076841
```

- Identical result.

*In a controlled environment laboratory, 10 men and 10 women were tested to determine the room temperature they found to be the most comfortable. Their results were:*

```
men <- c(74, 72, 77, 76, 76, 73, 75, 73, 74, 75)
women <- c(75, 77, 78, 79, 77, 73, 78, 79, 78, 80)
```

*Assuming that these temperatures resemble a random sample from their respective populations, is the variation in comfortable temperature the same for men and women?*

- (10 pts)
*Perform the test “by hand”: show the ranking, clearly state the hypotheses, test statistic and conclusion.*

The hypotheses are

\[ \begin{aligned} \mathcal{H}_0 &: \; \sigma^2_\mathrm{M} = \sigma^2_\mathrm{W} \\ \mathcal{H}_1 &: \; \sigma^2_\mathrm{M} \ne \sigma^2_\mathrm{W} \end{aligned} \]

- Equivalently, we could frame the hypotheses in terms of independence in the null.

This is a two-tailed test, and we will treat men as \(X\) and women as \(Y\).

First calculate the averages to use in place of the means.

```
x <- men
y <- women
xbar <- mean(x)
ybar <- mean(y)
```

Now we can calculate the \(U\)s and \(V\)s.

```
u <- abs(x - xbar)
v <- abs(y - ybar)
```

Next calculate the ranks of the combined sample.

```
r <- rank(c(u, v))
ru <- r[seq_along(u)]
rv <- r[-seq_along(u)]
rbind(men=ru, women=rv)
```

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## men 4.5 17.5 17.5 11.5 11.5 11.5 4.5 11.5 4.5 4.5
## women 16.0 1.5 8.0 14.5 1.5 20.0 8.0 14.5 8.0 19.0
```

- As you can see there are several ties.

Let's calculate \(t\), the observed value of \(T\).

```
t <- sum(ru^2)
t
```

`## [1] 1222.5`

Now let's calculate the standardized version, \(t_1\) for \(T_1\), which is appropriate when we have ties. First some auxiliary calculations …

```
nx <- length(x)
ny <- length(y)
N <- nx + ny
r2bar <- (sum(ru^2) + sum(rv^2))/N
r2bar
```

`## [1] 142.825`

```
r4sum <- sum(r^4)
r4sum
```

`## [1] 715743.9`

We’re ready to go.

```
t1 <- (t - nx*r2bar)/sqrt(nx*ny/(N*(N-1))*r4sum - nx*ny/(N-1)*r2bar^2)
t1
```

`## [1] -0.7229738`

And the \(p\)-value for this two-tailed test may be calculated as follows.

`2*pnorm(- abs(t1))`

`## [1] 0.469696`

- We fail to reject \(\mathcal{H}_0\) at the 5% level and conclude there is not enough evidence to say that the variability for men and women is different.

- (5 pts)
*Perform the test “using a software library”.*

`sqranks.test.approx(men, women)`

```
## $t
## [1] 1222.5
##
## $t1
## [1] -0.7229738
##
## $p.value
## [1] 0.469696
```

- Same result.

*The tensile strength of silicone rubber is thought to be a function of curing temperature. A study was carried out in which samples of 12 specimens of the rubber were prepared using curing temperatures of 20 degrees Celsius and 45 degrees Celsius. The data below show the tensile strength values in megapascals.*

```
c20 <- c(2.07, 2.14, 2.22, 2.03, 2.21, 2.03, 2.05, 2.18, 2.09, 2.14, 2.11, 2.02)
c45 <- c(2.52, 2.15, 2.49, 2.03, 2.37, 2.05, 1.99, 2.42, 2.08, 2.42, 2.29, 2.01)
```

*You should perform all the following tasks using software.*

- (5 pts)
*Calculate numerical summaries of your data, this includes at least finding the mean, the median and standard deviation for each type of fabric.*

The mean, median and standard deviation for both curing temperatures may be computed as follows.

```
rbind(c20=c(mean=mean(c20), median=median(c20), sd=sd(c20)),
      c45=c(mean=mean(c45), median=median(c45), sd=sd(c45)))
```

```
## mean median sd
## c20 2.1075 2.10 0.07085517
## c45 2.2350 2.22 0.20317928
```

- The means and medians are very similar, but there is a factor of three difference in the standard deviations.

A quick way to get similar information is via the `summary` command. Rather than standard deviation we get the inter-quartile range.

`rbind(c20=summary(c20), c45=summary(c45))`

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## c20 2.02 2.045 2.10 2.1075 2.15 2.22
## c45 1.99 2.045 2.22 2.2350 2.42 2.52
```

- These results are less stark. Clearly the means are quite similar, but it is a tough call on the variation.

- (5 pts)
*Construct a helpful plot to compare your data, I suggest a boxplot. If you can create a single boxplot to summarize both curing temperatures, even better. Comment on the plots in terms of mean/median values and variability.*

```
boxplot(c20, c45, names=c("20 degrees", "45 degrees"), ylab="tensile strength",
        xlab="curing temperature", main="Silicone Rubber")
```

- This looks more convincing: the `c45` numbers seem to exhibit a longer right-hand tail, i.e., higher variability. It could be that the means are different too.

- (10 pts)
*Perform a two sided test comparing the mean value of tensile strength at both temperatures. You should provide the hypotheses, and values for the test statistic and \(p\)-value.*

We shall test the following two-tailed hypotheses.

\[ \begin{aligned} \mathcal{H}_0 &: \; \mu_{20} = \mu_{45} \\ \mathcal{H}_1 &: \; \mu_{20} \ne \mu_{45} \end{aligned} \]

Let's go through the test fairly quickly because we did a bunch of these on the last homework. We'll treat `c20` as \(X\) and `c45` as \(Y\).

```
x <- c20
y <- c45
nx <- length(x)
ny <- length(y)
r <- rank(c(x, y))
rx <- r[seq_along(x)]
t <- sum(rx)
t
```

`## [1] 130.5`

Let's see how many ties there are.

```
ties <- length(r) - length(unique(r))
ties
```

`## [1] 5`

Substantial, so we will use the Gaussian approximation.
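In symbols, the ties-adjusted rank-sum statistic computed below is

\[ T_1 = \frac{T - n_x \frac{N+1}{2}}{\sqrt{\frac{n_x n_y}{N(N-1)} \sum_{i=1}^N R_i^2 - \frac{n_x n_y (N+1)^2}{4(N-1)}}}, \]

where \(T = \sum_{i=1}^{n_x} R(X_i)\) and the \(R_i\) are the ranks of the combined sample; under \(\mathcal{H}_0\) it is approximately standard normal.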

```
N <- nx + ny
r2sum <- sum(r^2)
r2sum
```

`## [1] 4896.5`

```
t1 <- t - nx*(N+1)/2
t1 <- t1/sqrt(nx*ny/((N*(N-1)))*r2sum - nx*ny*(N+1)^2/(4*(N-1)))
t1
```

`## [1] -1.12755`

And then the \(p\)-value calculation from a standard normal.

`2*min(pnorm(t1), pnorm(t1, lower.tail=FALSE))`

`## [1] 0.2595099`

- Not enough evidence to reject \(\mathcal{H}_0\) at the 5% level: we cannot conclude that the means differ.

- (10 pts)
*Perform a two sided test comparing the variability of tensile strength at both temperatures. You should provide the hypotheses, and values for the test statistic and p-value.*

Now the hypotheses are

\[ \begin{aligned} \mathcal{H}_0 &: \; \sigma^2_\mathrm{20} = \sigma^2_\mathrm{45} \\ \mathcal{H}_1 &: \; \sigma^2_\mathrm{20} \ne \sigma^2_\mathrm{45} \end{aligned} \]

- Equivalently, we could frame the hypotheses in terms of independence in the null.

This is a two-tailed test, and we will treat `c20` as \(X\) and `c45` as \(Y\). Since this is mostly a cut-and-paste of the code above, we'll skip most of the commentary.

```
x <- c20
y <- c45
xbar <- mean(x)
ybar <- mean(y)
u <- abs(x - xbar)
v <- abs(y - ybar)
r <- rank(c(u, v))
ru <- r[seq_along(u)]
rv <- r[-seq_along(u)]
rbind(c20=ru, c45=rv)
```

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## c20 5 3.5 14 9.5 13 9.5 7 8 2 3.5 1 12
## c45 24 11.0 23 20.0 15 18.0 22 18 16 18.0 6 21
```

- As you can see there are several ties, so we will use the CLT approximation again.

```
t <- sum(ru^2)
nx <- length(x)
ny <- length(y)
N <- nx + ny
r2bar <- (sum(ru^2) + sum(rv^2))/N
r4sum <- sum(r^4)
t1 <- (t - nx*r2bar)/sqrt(nx*ny/(N*(N-1))*r4sum - nx*ny/(N-1)*r2bar^2)
t1
```

`## [1] -3.575146`

And the \(p\)-value for this two-tailed test may be calculated as follows.

`2*pnorm(- abs(t1))`

`## [1] 0.0003500321`

- An easy rejection of \(\mathcal{H}_0\) at the 5% level; we conclude that the variability of tensile strength at 20 degrees differs from that at 45 degrees.

*Random samples from each of three different types of light bulbs were tested to see how long the light bulbs lasted, with the following results:*

```
bulbs <- list(
  A=c(73, 64, 67, 62, 70),
  B=c(84, 80, 81, 77),
  C=c(82, 79, 71, 75))
```

*Do these results indicate a significant difference in the variability between brands?*

We consider the following hypotheses.

\[ \begin{aligned} \mathcal{H}_0 &: \; \mbox{All 3 types of bulbs have identically distributed life spans, except for possibly different means} \\ \mathcal{H}_1 &: \; \mbox{Some variances of the bulb life spans are not equal to each other} \end{aligned} \]

First, let's summarize the sizes of the samples.

```
k <- length(bulbs)
n <- sapply(bulbs, length)
N <- sum(n)
```

Then calculate absolute deviations from respective (estimated) means.

`U <- lapply(bulbs, function(x) { abs(x - mean(x)) })`

Next calculate the ranks of the combined samples.

`r <- rank(unlist(U))`

Then calculate the sum of the squared ranks for each sample as follows.

```
g <- factor(rep.int(seq_len(k), n))
s <- tapply(r, g, function(x) { sum(x^2) })
s
```

```
## 1 2 3
## 355 157 306
```

Next calculate the observed values of auxiliary \(D^2\) and \(\bar{S}\) quantities.

```
sbar <- sum(s)/N
d2 <- (sum(r^4) - N*sbar^2)/(N-1)
c(sbar, d2)
```

`## [1] 62.92308 3130.34776`

Finally, we are ready to calculate \(t_2\), the observed value of \(T_2\).
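In symbols, with \(S_j\) denoting the sum of squared ranks in sample \(j\),

\[ T_2 = \frac{1}{D^2} \left( \sum_{j=1}^k \frac{S_j^2}{n_j} - N \bar{S}^2 \right), \]

which under \(\mathcal{H}_0\) is approximately \(\chi^2\) with \(k - 1\) degrees of freedom.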

```
t2 <- (sum(s^2/n) - N*sbar^2)/d2
t2
```

`## [1] 1.055849`

The \(p\)-value is calculated as follows.

`pchisq(t2, k-1, lower.tail=FALSE)`

`## [1] 0.589828`

- We fail to reject the null hypothesis: there is not enough evidence to say that the variability differs across the different types of lightbulbs.

*An investment class was divided into three groups of students. One group was instructed to invest in bonds, the second in blue chips stocks, and the third in speculative issues. Each student “invested” $10,000 and evaluated the hypothetical profit or loss at the end of 3 months with the following results.*

```
invest <- list(bonds=c(146, 180, 192, 185, 153),
               bluechip=c(176, 110, 212, 108, 196),
               speculative=c(-540, 1052, 642, -281, 67))
```

*Is the difference in variance significant?*

Again consider the following hypotheses.

\[ \begin{aligned} \mathcal{H}_0 &: \; \mbox{All 3 strategies have identically distributed returns, except for possibly different means} \\ \mathcal{H}_1 &: \; \mbox{Some variances of the strategies are not equal to each other} \end{aligned} \]

Since this is very similar to the previous problem we will dispense with the chatter.

```
k <- length(invest)
n <- sapply(invest, length)
N <- sum(n)
U <- lapply(invest, function(x) { abs(x - mean(x)) })
r <- rank(unlist(U))
g <- factor(rep.int(seq_len(k), n))
s <- tapply(r, g, function(x) { sum(x^2) })
sbar <- sum(s)/N
d2 <- (sum(r^4) - N*sbar^2)/(N-1)
t2 <- (sum(s^2/n) - N*sbar^2)/d2
t2
```

`## [1] 11.70985`

The \(p\)-value is calculated as follows.

`pchisq(t2, k-1, lower.tail=FALSE)`

`## [1] 0.002865751`

- We can reject \(\mathcal{H}_0\) at the 5% level and conclude the variability differs across investment strategy.

*A husband and wife went bowling together and kept their scores for 10 games to see if there was a correlation between their scores. The scores were recorded in order as follows.*

```
bowling <- data.frame(husband=c(147, 158, 131, 142, 183, 151, 196, 129, 155, 158),
                      wife=c(122, 128, 125, 123, 115, 120, 108, 143, 124, 123))
```

- (10 pts) Compute \(\rho\).

Here are three ways to do it.

- The hard way:

```
rx <- rank(bowling$husband)
ry <- rank(bowling$wife)
n <- nrow(bowling)
numer <- sum(rx*ry) - n*((n+1)/2)^2
denomx <- sqrt(sum(rx^2) - n*((n+1)/2)^2)
denomy <- sqrt(sum(ry^2) - n*((n+1)/2)^2)
rho <- numer/(denomx*denomy)
rho
```

`## [1] -0.6128049`
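The "hard way" above implements the ties-adjusted formula for Spearman's \(\rho\),

\[ \rho = \frac{\sum_{i=1}^n R(X_i) R(Y_i) - n \left( \frac{n+1}{2} \right)^2}{\sqrt{\sum_{i=1}^n R(X_i)^2 - n \left( \frac{n+1}{2} \right)^2} \sqrt{\sum_{i=1}^n R(Y_i)^2 - n \left( \frac{n+1}{2} \right)^2}}, \]

which reduces to the familiar \(1 - 6 \sum_i d_i^2 / (n(n^2 - 1))\) shortcut when there are no ties.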

- The easy way:

`cor(rank(bowling$husband), rank(bowling$wife))`

`## [1] -0.6128049`

- The easiest way:

`cor(bowling$husband, bowling$wife, method="spearman")`

`## [1] -0.6128049`

- All identical.
- It would appear that the husband and wife’s scores are negatively correlated. When one is doing well, the other suffers.

- (10 pts) Test the hypothesis of independence using a two-tailed test.

Performing a two-tailed test under the following hypotheses is straightforward.

\[ \begin{aligned} \mathcal{H}_0 &: \; \mbox{The husband and wife's scores are mutually independent.} \\ \mathcal{H}_1 &: \; \mbox{Either (a) there is a tendency for the larger values of the husband's score to be paired with larger values of the wife's, or} \\ & \;\;\;\; \mbox{(b) larger values of the husband's score with smaller values of the wife's, and vice versa with smaller and larger switched.} \end{aligned} \]

As a first cut, the lower-tail probability may be obtained as simply as:

```
library(SuppDists)
pSpearman(rho, n)
```

`## [1] 0.0333675`

However, there are ties in the ranks

`rbind(husband=rx, wife=ry)`

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## husband 4 7.5 2 3.0 9 5 10 1 6 7.5
## wife 4 9.0 8 5.5 2 3 1 10 7 5.5
```

so the exact null distribution is not strictly valid; moreover, the test is two-tailed. Doubling the smaller tail probability gives

`2*min(pSpearman(rho, n), 1-pSpearman(rho, n))`

`## [1] 0.06673501`
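For comparison, when `exact=FALSE` the `cor.test` routine below relies on the asymptotic \(t\) approximation

\[ t = \rho \sqrt{\frac{n - 2}{1 - \rho^2}} \sim t_{n-2} \quad \mbox{under } \mathcal{H}_0, \]

which with \(\rho \approx -0.613\) and \(n = 10\) gives \(t \approx -2.19\) and a two-tailed \(p\)-value of about 0.06.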

Using the library testing routine we obtain

`cor.test(bowling$husband, bowling$wife, method="spearman", exact=FALSE)`

```
##
## Spearman's rank correlation rho
##
## data: bowling$husband and bowling$wife
## S = 266.11, p-value = 0.05961
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.6128049
```

- A very similar result.
- This is a tough call, but the more appropriate two-tailed computations agree that the \(p\)-value is above the 5% threshold. Therefore we (barely) fail to reject \(\mathcal{H}_0\): there is not enough evidence to conclude that the couple's scores are dependent.