Instructions

This homework is due on Thursday, October 26th at 2pm (the start of class). Please turn in all your work. This homework primarily covers \(r \times c\) contingency tables, median test, goodness-of-fit test, contingency coefficients and Cochran’s test.

Problem 1: Grading policies (15 pts)

Three professors are teaching large classes in introductory statistics. At the end of the semester, they compare grades to see if there are significant differences in their grading policies. Are these differences significant?

Prof A B C D F WP WF
Smith 12 45 49 6 13 18 2
Jones 10 32 43 18 4 12 6
White 15 19 32 20 6 9 7
  1. (3 pts) Which test are you using?

Pearson’s Chi-square test for difference in probabilities.

  1. (7 pts) Perform the appropriate test “by hand”.

The hypotheses are

\[ \begin{aligned} \mathcal{H}_0 &: \; \mbox{All of the probabilities for the same grade are equal to each other} \\ \mathcal{H}_1 &: \; \mbox{All of the probabilities for the same grade are not equal to each other} \end{aligned} \]

First, lets collect the essence of the table in R.

O <- rbind(smith=c(12, 45, 49, 6, 13, 18, 2),
    jones=c(10, 32, 43, 18, 4, 12, 6),
    white=c(15, 19, 32, 20, 6, 9, 7))
colnames(O) <- c("A", "B", "C", "D", "F", "WP", "WF")

Now we can calculate column and row sums, etc.

n <- rowSums(O)
C <- colSums(O)
N <- sum(n)

And we can flesh out the table with those values.

tab <- rbind(O, C)
tab <- cbind(tab, c(n, N))
rownames(tab)[nrow(tab)] <- colnames(tab)[ncol(tab)] <- "tot"
tab
##        A  B   C  D  F WP WF tot
## smith 12 45  49  6 13 18  2 145
## jones 10 32  43 18  4 12  6 125
## white 15 19  32 20  6  9  7 108
## tot   37 96 124 44 23 39 15 378

Now, calculate the expected counts under the null hypothesis.

E <- outer(n, C/N)

And then the test statistic.

t <- sum((O - E)^2/E)
t
## [1] 28.91509

Finally, the \(p\)-value is

df <- (length(n)-1)*(length(C)-1)
df
## [1] 12
pchisq(t, df, lower.tail=FALSE)
## [1] 0.004055965
  • An easy reject of \(\mathcal{H}_0\) at the 5% level. We conclude that there are significant differences in their grading policies.
  1. (5 pts) Perform the test in (ii) “using a software library”.
chisq.test(O, correct=FALSE)
## Warning in chisq.test(O, correct = FALSE): Chi-squared approximation may be
## incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  O
## X-squared = 28.915, df = 12, p-value = 0.004056
  • Same result; never-mind the warning.

Problem 2: Religious preferences (15 pts)

A random sample of 30 graduating university seniors was categorized by college and religious preference as follows. Is there a relationship between college and religious preference?

Arts & Sciences Business Engineering Other
Catholic Protestant Catholic Protestant
Catholic Catholic Protestant Catholic
Jewish Jewish Protestant Catholic
Catholic Jewish Protestant Other
Protestant Protestant Other Catholic
Protestant Protestant Other
Other Other Catholic
Protestant Jewish
Other Other
  1. (3 pts) Which test are you using?

Chi-squared test for independence.

  1. (7 pts) Perform the appropriate test “by hand”.

Here we will test the following hypotheses.

\[ \begin{aligned} \mathcal{H}_0: & \; \mbox{Religion is independent of college affiliation} \\ \mathcal{H}_1: & \; \mbox{Religion depends, in part, on college affiliation} \end{aligned} \]

First we have to build the contingency table, which I will do below directly in R.

O <- rbind(catholic=c(3, 1, 1, 4),
    jewish=c(1, 2, 0, 1),
    protestant=c(3, 3, 3, 1),
    other=c(2, 1, 1, 3))
colnames(O) <- c("art & science", "business", "engineering", "other")
R <- rowSums(O)
C <- colSums(O)
N <- sum(R)
tab <- rbind(O, C)
tab <- cbind(tab, c(R, N))
rownames(tab)[nrow(tab)] <- colnames(tab)[ncol(tab)] <- "total"
tab
##            art & science business engineering other total
## catholic               3        1           1     4     9
## jewish                 1        2           0     1     4
## protestant             3        3           3     1    10
## other                  2        1           1     3     7
## total                  9        7           5     9    30

Now we can calculate the expected counts.

E <- outer(R, C/N)

Then the observed value of the \(T\) statistic can be calculated as follows.

t <- sum((O-E)^2/E)

And the \(p\)-value can be calculated as.

df <- (length(R)-1)*(length(C) - 1)
pchisq(t, df, lower.tail=FALSE)
## [1] 0.6781851
  • We fail to reject \(\mathcal{H}_0\) at the 5% level and conclude that religious preference and college are independent.
  1. (5 pts) Perform the test in (ii) “using a software library”.
chisq.test(O, correct=FALSE)
## Warning in chisq.test(O, correct = FALSE): Chi-squared approximation may be
## incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  O
## X-squared = 6.6048, df = 9, p-value = 0.6782
  • Same result; never-mind the warning.

Problem 3: Compression strength (15 pts)

An experiment was conducted to determine the compression strength (lb) for different types of boxes. We are interested in determining if the median is the same for all types of boxes.

 
Box 1 655.5 788.3 734.3 721.4
Box 2 789.2 772.5 786.9 686.1
Box 3 737.1 539.0 696.3 671.7
Box 4 535.1 628.7 542.4 559.0
  1. (10 pts) Perform the test “by hand” including pairwise comparisons if appropriate.

We will be performing the median test with the following hypotheses.

\[ \begin{aligned} \mathcal{H}_0 &: \; \mbox{All boxes have the same median.} \\ \mathcal{H}_1 &: \; \mbox{At least two boxes have different medians.} \end{aligned} \]

Lets create the lists in R.

boxes <- list(box1=c(655.5, 788.3, 734.3, 721.4),
    box2=c(789.2, 772.5, 786.9, 686.1),
    box3=c(737.1, 539.0, 696.3, 671.7),
    box4=c(535.1, 628.7, 542.4, 559.0))

First we extract the lengths of the lists

n <- sapply(boxes, length)
n
## box1 box2 box3 box4 
##    4    4    4    4

and the grand median.

grandmed <- median(unlist(boxes))
grandmed
## [1] 691.2

With that we can create the table. First lets calculate the “greater” row and then the “lesser” row.

Og <- sapply(boxes, function(x) { sum(x > grandmed) } )
Oleq <- n - Og
O <- rbind(Og, Oleq)
rownames(O) <- c("> Med", "<= Med")
O
##        box1 box2 box3 box4
## > Med     3    3    2    0
## <= Med    1    1    2    4

Then we can calculate the row sums.

a <- sum(Og)
b <- sum(Oleq)
N <- sum(n)

Now we can dress up the table for inspection, although we don’t need it for anything specifically.

tab <- rbind(O, n)
tab <- cbind(tab, c(a, b, N))
rownames(tab)[nrow(tab)] <- colnames(tab)[ncol(tab)] <- "tot"
tab
##        box1 box2 box3 box4 tot
## > Med     3    3    2    0   8
## <= Med    1    1    2    4   8
## tot       4    4    4    4  16

The test statistic may be calculated as follows.

t <- (N^2 / (a*b)) * sum((Og - n*a/N)^2/n)
t
## [1] 6

Finally, the \(p\)-value is …

df <- length(boxes) - 1
pchisq(t, df, lower.tail=FALSE)
## [1] 0.1116102
  • We fail to reject the null hypothesis at the 5% level, and conclude that all boxes have the same median.
  1. (5 pts) Perform the test “using a software library”.

Here we will borrow the library we created in class.

median.test <- function(x) 
  {
    ## summarize lengths of populations
    n <- sapply(x, length)

    ## calculate grand mean and build observation table
    grandmed <- median(unlist(x))
    Og <- sapply(x, function(x) { sum(x > grandmed) } )
    Oleq <- n - Og
    O <- rbind(Og, Oleq)

    ## calculate totals columns of table
    a <- sum(Og)
    b <- sum(Oleq)
    N <- sum(n)
    
    ## build the table for visualization
    tab <- rbind(Og, Oleq, n)
    tab <- cbind(tab, c(a, b, N))
    rownames(tab)[nrow(tab)] <- colnames(tab)[ncol(tab)] <- "tot"

    ## calculate the test statistic, null distribution and p-value 
    t <- (N^2 / (a*b)) * sum((Og - n*a/N)^2/n)
    df <- length(x) - 1
    phi <- pchisq(t, df, lower.tail=FALSE)

    ## return info
    return(list(grand.med=grandmed, table=tab, t.stat=t, p.value=phi))
}

Here it is on our boxes data.

median.test(boxes)
## $grand.med
## [1] 691.2
## 
## $table
##      box1 box2 box3 box4 tot
## Og      3    3    2    0   8
## Oleq    1    1    2    4   8
## tot     4    4    4    4  16
## 
## $t.stat
## [1] 6
## 
## $p.value
## [1] 0.1116102
  • Same result; excellent!

Problem 4: Stock performance (15 pts)

A random sample of 30 stocks was selected from each of the three major U.S exchanges, and their performance over the previous year was monitored. The median performance for all 90 stocks was noted and the following table constructed:

Exchange Above Median
New York 18
American 17
NASDAQ 10

Was there a significant difference in the performance of stocks on the three exchanges during the previous year?

  1. (8 pts) Perform the test “by hand”.

Again, we are performing the median test with the same hypotheses as above.

This is a little easier because we have some of the table already populated for us.

n <- rep(30, 3)
Og <- c(18, 17, 10)
Oleq <- n - Og
O <- rbind(Og, Oleq)
colnames(O) <- c("NY", "American", "NASDAQ")
rownames(O) <- c("> Med", "<= Med")
O
##        NY American NASDAQ
## > Med  18       17     10
## <= Med 12       13     20

Now we can dress up the table for inspection, although we don’t need it for anything specifically.

tab <- rbind(O, n)
tab <- cbind(tab, c(a, b, N))
rownames(tab)[nrow(tab)] <- colnames(tab)[ncol(tab)] <- "tot"
tab
##        NY American NASDAQ tot
## > Med  18       17     10   8
## <= Med 12       13     20   8
## tot    30       30     30  16

The test statistic may be calculated as follows.

t <- (N^2 / (a*b)) * sum((Og - n*a/N)^2/n)
t
## [1] 5.066667

Finally, the \(p\)-value is …

df <- length(n) - 1
pchisq(t, df, lower.tail=FALSE)
## [1] 0.07939393
  • Fail to reject the null at the 5% level; we conclude that all exchanges have the same median.
  1. (7 pts) Perform the test “using a software library”.

We have two choices here: (a) we could try to use our library function above, but we’ll need to fabricate some original data with an appropriate list of values; or (b), we can modify the function to work only with the counts, rather than directly with the list.

For (a) consider the following.

grandmed <- 0
exchange <- list(NY=c(rep(1,Og[1]), rep(-1,Oleq[1])),
    American=c(rep(1,Og[2]), rep(-1,Oleq[2])),
    NASDAQ=c(rep(1,Og[3]), rep(-1,Oleq[3])))
median.test(exchange)
## $grand.med
## [1] 0
## 
## $table
##      NY American NASDAQ tot
## Og   18       17     10  45
## Oleq 12       13     20  45
## tot  30       30     30  90
## 
## $t.stat
## [1] 5.066667
## 
## $p.value
## [1] 0.07939393
  • Bingo.

Now for (b), consider the following new library function.

median.test.counts <- function(O) 
  {
    ## summarize lengths of populations
    n <- colSums(O)

    ## calculate totals columns of table
    a <- sum(Og)
    b <- sum(Oleq)
    N <- sum(n)
    
    ## build the table for visualization
    tab <- rbind(Og, Oleq, n)
    tab <- cbind(tab, c(a, b, N))
    rownames(tab)[nrow(tab)] <- colnames(tab)[ncol(tab)] <- "tot"

    ## calculate the test statistic, null distribution and p-value 
    t <- (N^2 / (a*b)) * sum((Og - n*a/N)^2/n)
    df <- ncol(O) - 1
    phi <- pchisq(t, df, lower.tail=FALSE)

    ## return info
    return(list(table=tab, t.stat=t, p.value=phi))
}

Lets try it.

median.test.counts(O)
## $table
##      NY American NASDAQ tot
## Og   18       17     10  45
## Oleq 12       13     20  45
## tot  30       30     30  90
## 
## $t.stat
## [1] 5.066667
## 
## $p.value
## [1] 0.07939393
  • Perfect.

Problem 5: Chromosomal structure (15 pts)

A certain type of insect that is found in lakes in the southwestern United States is studied to see if the chromosomal structure is significantly different among states. The number of insects of various chromosomal types is recorded as follows:

Type Texas New Mexico Arizona California
A 54 72 83 96
B 20 6 18 6
C 17 3 12 0
D 0 12 14 1
E 0 10 0 0
  1. (10 pts) Compute the following “by hand”:

    1. (3 pts) \(T\).

To compute \(T\) we will need to complete the totals column in the table.

First, lets tabulate the data values in R.

O <- rbind(A=c(54, 72, 83, 96),
    B=c(20, 6, 18, 6),
    C=c(17, 3, 12, 0),
    D=c(0, 12, 14, 1),
    E=c(0, 10, 0, 0))
colnames(O) <- c("Texas", "New Mexico", "Arizona", "California")

Now we can compute the row an column totals.

r <- rowSums(O)
c <- colSums(O)
N <- sum(r)

Then we can fill out the table.

tab <- cbind(O, r)
tab <- rbind(tab, c(c, N))
rownames(tab)[nrow(tab)] <- colnames(tab)[ncol(tab)] <- "tot"
tab
##     Texas New Mexico Arizona California tot
## A      54         72      83         96 305
## B      20          6      18          6  50
## C      17          3      12          0  32
## D       0         12      14          1  27
## E       0         10       0          0  10
## tot    91        103     127        103 424

The expected counts are …

E <- outer(r, c/N)

… so the observed value of the test statistic is

t <- sum((O-E)^2/E)
t
## [1] 100.913
  1. (2 pts) \(R_1\).
r1 <- t/(N*(min(length(r),length(c))-1))
r1
## [1] 0.07933409
  1. (1 pts) Cramer’s contingency coefficient.
Ccc <- sqrt(r1)
Ccc
## [1] 0.2816631
  1. (2 pts) Pearson’s contingency coefficient.
r2 <- sqrt(t/(N+t))
r2
## [1] 0.4384598
  1. (2 pts) Pearson’s Mean-Square contingency coefficient.
r3 <- t/N
r3
## [1] 0.2380023
  1. (5 pts) Find the available coefficients “using a software library”.
library(vcd)
## Loading required package: grid
assocstats(O)
##                     X^2 df   P(> X^2)
## Likelihood Ratio 108.40 12 0.0000e+00
## Pearson          100.91 12 3.3307e-16
## 
## Phi-Coefficient   : NA 
## Contingency Coeff.: 0.438 
## Cramer's V        : 0.282

Problem 6: Rolling dice (10 pts)

A die was cast 600 times with the following results:

 
Occurrence 1 2 3 4 5 6
Frequency 87 96 108 89 122 98

Is the die balanced?

  1. (5 pts) Perform the test by hand.

First, lets code the data in R.

O <- c(87, 96, 108, 89, 122, 98)
N <- sum(O)
N
## [1] 600

A balanced die would have a probability of \(1/6\) for each number on each of the six faces. Therefore, in a total number of counts of, 600, we would expect the following counts.

pstar <- rep(1/6, 6)
E <- N * pstar
E
## [1] 100 100 100 100 100 100

To settle the discrepancy we can perform a Goodness-of-fit test, with the null hypothesis that \(p_i* = 1/6\) for \(i=1,\dots, 6\), versus the alternative that at least two of those probabilities are different.

The test statistic is calculated as follows.

t <- sum((O-E)^2/E)
t
## [1] 8.58

And the \(p\)-value is

pchisq(t, length(O)-1, lower.tail=FALSE)
## [1] 0.1270355
  • Therefore we fail to reject the null hypothesis at the 5% level and conclude that the die is fair.
  1. (5 pts) Perform the test using software.
chisq.test(O, p=pstar)
## 
##  Chi-squared test for given probabilities
## 
## data:  O
## X-squared = 8.58, df = 5, p-value = 0.127
  • Same result.

Problem 7: Power of tests (15 pts)

In an attempt to compare the relative power of three statistical tests, 100 sets of artificial data were generated using a computer. On each set of data the three statistical tests were used with \(\alpha=0.05\), and the decision to accept or reject the null hypothesis was recored. The results are as follows:

Test 1 Test 2 Test 3 Numbers of Sets of Data
Accept Accept Accept 26
Accept Accept Reject 6
Accept Reject Accept 12
Reject Accept Accept 4
Reject Reject Accept 18
Reject Accept Reject 5
Accept Reject Reject 7
Reject Reject Reject 22

Is there a difference in the power of the three tests when applied to populations from which the simulated data were obtained?

  1. (3 pts) Which test are you using?

Cochran’s test.

  1. (7 pts) Perform the appropriate test “by hand”.

The trick here is to expand out each binary pattern above the requestie number of times, as indicated in the final column. The version below codes “Accept” as 1 and “Reject” as 0.

m1 <- matrix(rep(c(1,1,1),26), nrow=26, byrow=TRUE)
m2 <- matrix(rep(c(1,1,0),6), nrow=6, byrow=TRUE)
m3 <- matrix(rep(c(1,0,1),12), nrow=12, byrow=TRUE)
m4 <- matrix(rep(c(0,1,1),4), nrow=4, byrow=TRUE)
m5 <- matrix(rep(c(0,0,1),18), nrow=18, byrow=TRUE)
m6 <- matrix(rep(c(0,1,0),5), nrow=5, byrow=TRUE)
m7 <- matrix(rep(c(1,0,0),7), nrow=7, byrow=TRUE)
m8 <- matrix(rep(c(0,0,0),22), nrow=22, byrow=TRUE)
X <- rbind(m1, m2, m3, m4, m5, m6, m7, m8)

Now we can calculate the row and column sums.

R <- rowSums(X)
C <- colSums(X)
N <- sum(R)

I’ll forgo presenting the table because it is too big.

The observed value of the test statistic is

t <- length(C)*(length(C)-1)*(sum((C - N/length(C))^2))/sum(R*(length(C)- R))
t
## [1] 10.42308

Then we may calculate the \(p\)-value as follows.

pchisq(t, length(C)-1, lower.tail=FALSE)
## [1] 0.005453278
  • Reject \(\mathcal{H}_0\) at the 5% level and conclude that the three test do not have the same statistical power.
  1. (5 pts) Perform the test in (ii) “using a software library”.

The library we built in class is pasted below

cochran.test <- function(X) 
  {
    ## build out table
    R <- rowSums(X)
    C <- colSums(X)
    N <- sum(R)
    tab <- rbind(X, C)
    tab <- cbind(tab, c(R, N))
    rownames(tab)[nrow(tab)] <- colnames(tab)[ncol(tab)] <- "tot"

    ## calculate test statistic
    t <- length(C)*(length(C)-1)*(sum((C - N/length(C))^2))/sum(R*(length(C)- R))

    ## calculate the p-value
    phi <- pchisq(t, length(C)-1, lower.tail=FALSE)

    return(list(table=tab, t.stat=t, p.value=phi))
  }

Here is the result on this data, showing just the test statistic and \(p\)-value, so that big table doesn’t get printed to the screen.

out <- cochran.test(X)
out$t.stat
## [1] 10.42308
out$p.value
## [1] 0.005453278
  • Bingo!