Department of Statistics, Virginia Tech

It is estimated that at least half of the men who currently undergo an operation to remove prostate cancer suffer from a particular undesirable side effect. In an effort to reduce the likelihood of this side effect the FDA studied a new method of performing the operation. Out of 19 operations only 3 men suffered the unpleasant side effect. Is it safe to conclude the new method of operation is effective in reducing the side effect?

Let \(p\) be the probability of the patient experiencing the side effect. Then this is a lower-tailed test of

\[ \mathcal{H}_0: p \geq 0.5 \quad \mbox{ v.s. } \quad H_1: p < 0.5. \]

- Under \(\mathcal{H}_0\) the test statistic has the null distribution \(T \sim \mathrm{Bin}(19, 0.5)\).
- The observed value of \(T\) is \(t=3\).
- So the \(p\)-value is \(\mathbb{P}(T \leq 3)\) which may be calculated in R as

```
t <- 3
n <- 19
pstar <- 0.5
phi <- pbinom(t, n, pstar)
phi
```

`## [1] 0.002212524`

- At the typical signifigance level of \(\alpha = 0.05\) this is a clear rejection of \(\mathcal{H}_0\).

To find a confidence interval we need to find the value of \(p\) would yield \(\mathbb{P}(T \leq t) = \alpha\) for some choice of \(\alpha\).

- For example \(\alpha = 0.05\).

```
alpha <- 0.05
f <- function(x, alpha, lower.tail=TRUE) { alpha - pbinom(t, n, x, lower.tail=lower.tail) }
soln <- uniroot(f, c(0,1), alpha=alpha)
pU <- soln$root
pU
```

`## [1] 0.3594267`

- Here is what we’re doing visually. Lets plot the binomial cumulative mass as a function of \(p^* \in [0,1]\) and draw a horizontal line for \(\alpha\) over it.

```
p <- seq(0,1, length=1000)
plot(p, pbinom(t, n, p), type="l", lwd=2, main="finding upper confidence limit")
abline(h=alpha, col="gray")
legend("topright", c("pbinom up to t=3 (n=19)", "alpha=0.05"), lty=1, lwd=c(2,1), col=c("black", "gray"))
```

- The easier way using the “Bayesian” Beta distribution idea.

`qbeta(1-alpha, t+1, n-t)`

`## [1] 0.3594256`

- Almost identical.
- Therefore a 95% CI is \([0,0.36]\).

R has a built-in library for this sort of thing.

`binom.test(t, n, alternative="less")`

```
##
## Exact binomial test
##
## data: t and n
## number of successes = 3, number of trials = 19, p-value = 0.002213
## alternative hypothesis: true probability of success is less than 0.5
## 95 percent confidence interval:
## 0.0000000 0.3594256
## sample estimates:
## probability of success
## 0.1578947
```

- Notice that you get both the test and the confidence interval from one command.

Under simple Mendelian inheritance a cross between plants of two particular genotypes may be expected to produce progeny one-fourth of which are “dwarf” and three-fourths of which are “tall”. In an experiment to determine if the assumption of simple Mendelian inheritance is reasonable in a certain situation, a cross results in progeny having 243 dwarf plants and 682 tall plants.

If “class 1” denotes “tall”, then \(p^* = 3/4\) and \(T\) is the number of tall plants. The null hypothesis of simple Mendelian inheritance is equivalent under the model to \(\mathcal{H}_0: p = 3/4\), with the alternative of interest being two-sided, \(\mathcal{H}_1: p \ne 3/4\).

We may now commence with the test.

- Here are the relevant values

```
n <- 925
t <- 682
pstar <- 3/4
```

- The observed test statistic, 682, is below the null mean which is 693.75, so we have to calculate \(t^U\),
- the number values on the right hand side of the mean have mass less than or equal to the mass at \(t\).

```
rhs <- seq(ceiling(n*pstar), n)
tU <- sum(dbinom(rhs, n, pstar) <= dbinom(t, n, pstar))
tU
```

`## [1] 220`

- Now we can follow the formula to calculate the \(p\)-value \(\phi = \mathbb{P}(T \leq t) + \mathbb{P}(T > t^U)\).

```
phi <- pbinom(t, n, pstar) + pbinom(n - tU, n, pstar, lower.tail=FALSE)
phi
```

`## [1] 0.3824916`

- A simpler alternative is to take two times the lower of the two probabilities \(\mathbb{P}(T \leq t)\) and \(\mathbb{P}(T \geq t)\). Since \(t\) is below the null mean the former will be smaller (check this), and so we obtain

`2*pbinom(t, n, pstar)`

`## [1] 0.3920185`

- clearly a decent approximation, which we will find handy when things get more complicated going forward.

For the CI we follow the same procedure as above, except use \(\alpha/2\).

- First via optimization.

```
alpha <- 0.05
f <- function(x, alpha, lower.tail=TRUE) { alpha - pbinom(t, n, x, lower.tail=lower.tail) }
pU <- uniroot(f, c(0,1), alpha=alpha/2)$root
pL <- uniroot(f, c(0,1), alpha=alpha/2, lower.tail=FALSE)$root
c(pL, pU)
```

`## [1] 0.7087742 0.7654085`

- Alternatively via the Beta quantiles.

`c(qbeta(alpha/2, t+1, n-t+1), qbeta(1-alpha/2, t+1, n-t))`

`## [1] 0.7079776 0.7654066`

Finally, we can do the whole thing with one call to a library routine.

`binom.test(t, n, p=3/4)`

```
##
## Exact binomial test
##
## data: t and n
## number of successes = 682, number of trials = 925, p-value =
## 0.3825
## alternative hypothesis: true probability of success is not equal to 0.75
## 95 percent confidence interval:
## 0.7076683 0.7654066
## sample estimates:
## probability of success
## 0.7372973
```