## Instructions

This homework is due on Thursday, September 14th at 2pm (the start of class). Please turn in all your work. The purpose of this homework is to refresh concepts learned in previous statistics courses.

### Problem 1: Sample Space (10 pts)

An experiment consists of tossing a die and then flipping a coin once if the number on the die is even. If the number on the die is odd, the coin is flipped twice. Using the notation 4H, for example, to denote the outcome that the die comes up 4 and then the coin comes up heads, and 3HT to denote the outcome that the die comes up 3 followed by a head and then a tail on the coin, list all the elements of the sample space. Hint: There are 18.
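
A multi-stage sample space like this can be enumerated by building each branch separately and then combining them. The sketch below uses a different, made-up experiment (flip a coin; on heads roll a die, on tails flip the coin two more times), so the variable names and the experiment itself are assumptions for illustration only.

```r
# Hypothetical experiment (NOT the one in Problem 1):
# flip a coin; on heads roll a die once, on tails flip the coin two more times.

# Heads branch: H followed by the die face.
heads_branch <- paste0("H", 1:6)

# Tails branch: T followed by every ordered pair of coin results.
flips <- expand.grid(first = c("H", "T"), second = c("H", "T"),
                     stringsAsFactors = FALSE)
tails_branch <- paste0("T", flips$first, flips$second)

# The sample space is the union of the two branches.
sample_space <- c(heads_branch, tails_branch)
print(sample_space)
print(length(sample_space))
```

Counting the outcomes per branch (6 for heads, 2 x 2 = 4 for tails) is a quick check that nothing was missed.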

### Problem 2: Probability (10 pts)

In the field of quality control, the science of statistics is often used to determine if a process is “out of control”. Suppose the process is, indeed, out of control and 20% of items produced are defective.

1. If three items arrive off the process line in succession, what is the probability that all three are defective?
2. If four items arrive in succession, what is the probability that exactly three are defective?
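
Questions of this type rest on the items being independent, so probabilities multiply, and "exactly k of n" counts add a binomial coefficient. A minimal sketch with a made-up defect rate of 10% (an assumption, not the value in the problem) shows the by-hand formula next to R's `dbinom`:

```r
# Hypothetical defect probability, chosen only for illustration.
p <- 0.1

# All three of three independent items defective: multiply the probabilities.
p_all_three <- p^3

# Exactly two defective out of five: choose which two, then multiply.
p_two_of_five <- choose(5, 2) * p^2 * (1 - p)^3

# dbinom() evaluates the same binomial probability mass function.
print(c(by_hand = p_two_of_five, dbinom = dbinom(2, size = 5, prob = p)))
```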

### Problem 3: Binomial Distribution (15 pts)

The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are known to have contracted this disease, what is the probability that

1. at least 10 survive.
2. from 3 to 8 survive.
3. exactly 5 survive.

### Problem 4: Binomial Probabilities (15 pts)

A multiple-choice quiz has 200 questions, each with 4 possible answers of which only 1 is correct. Suppose that a student has no knowledge of 80 of the 200 questions and therefore guesses on them. Calculate the probability that

1. guesswork on the 80 questions will yield fewer than 20 correct answers.
2. guesswork on the 80 questions will yield more than 40 correct answers.
3. guesswork yields from 25 to 30 correct answers inclusive.

Hint: feel free to use the normal approximation to the binomial distribution. Going forward, however, I will encourage you to calculate these values directly, for example with the `pbinom` function in R.
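
To make the hint concrete, here is a rough sketch of how the three kinds of events ("fewer than", "more than", "from a to b inclusive") map onto `pbinom`, with the normal approximation alongside for comparison. The values n = 50 and p = 0.3 are made up for illustration and are not the ones in this problem.

```r
# Hypothetical setup: X ~ Binomial(n = 50, p = 0.3).
n <- 50
p <- 0.3

# Exact probabilities from the binomial CDF. pbinom(q, ...) returns P(X <= q),
# so strict inequalities need the endpoint shifted by one.
p_less_than_10 <- pbinom(9, n, p)                       # P(X < 10) = P(X <= 9)
p_more_than_20 <- pbinom(20, n, p, lower.tail = FALSE)  # P(X > 20)
p_12_to_18     <- pbinom(18, n, p) - pbinom(11, n, p)   # P(12 <= X <= 18)

# Normal approximation with a continuity correction of 0.5.
mu    <- n * p
sigma <- sqrt(n * p * (1 - p))
approx_12_to_18 <- pnorm(18.5, mu, sigma) - pnorm(11.5, mu, sigma)

print(c(exact = p_12_to_18, normal_approx = approx_12_to_18))
```

The most common mistake is an off-by-one at the endpoints, which is why the comments spell out each inequality.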

### Problem 5: Confidence Intervals (20 pts)

The following data represent the running times of films produced by motion picture companies.

    times <- c(103, 94, 110, 87, 98, 97, 82, 123, 92, 175, 88, 118)

Assume a normal distribution and do the following:

1. Find a 95% confidence interval for the mean of film running times.
2. Find a 95% confidence interval for the variance of running times.
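
As a reminder of the mechanics under the normality assumption, the interval for the mean uses the t distribution and the interval for the variance uses the chi-square distribution, both with n - 1 degrees of freedom. The sketch below uses a made-up vector `x` (an assumption, not the film data):

```r
# Hypothetical measurements, for illustration only.
x <- c(12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5)
n     <- length(x)
alpha <- 0.05
xbar  <- mean(x)
s     <- sd(x)

# Mean: xbar +/- t_{alpha/2, n-1} * s / sqrt(n)
mean_ci <- xbar + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)

# Variance: ((n-1) s^2 / chi2_{upper}, (n-1) s^2 / chi2_{lower}).
# The larger quantile goes in the denominator of the lower limit.
var_ci <- (n - 1) * s^2 / qchisq(c(1 - alpha / 2, alpha / 2), df = n - 1)

print(mean_ci)
print(var_ci)
```

The interval for the mean can be cross-checked against `t.test(x)$conf.int`.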

You now find that these data were collected from two different companies as follows:

    list(C1=times[1:5], C2=times[6:length(times)])
    ## $C1
    ## [1] 103  94 110  87  98
    ##
    ## $C2
    ## [1]  97  82 123  92 175  88 118

1. Find a 90% confidence interval for the difference between the average running times of films produced by the two companies under the assumption of unknown but equal variances.
2. Find a 90% confidence interval for the difference between the average running times of films produced by the two companies under the assumption of unknown and unequal variances.
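
In R, the two variance assumptions correspond to the `var.equal` argument of `t.test`: `TRUE` gives the pooled interval and `FALSE` gives the Welch interval with approximate degrees of freedom. A minimal sketch on two made-up samples `a` and `b` (assumptions, not the company data):

```r
# Hypothetical samples, for illustration only.
a <- c(20.1, 22.4, 19.8, 21.5, 23.0)
b <- c(18.2, 25.1, 17.9, 24.3, 19.5, 26.0)

# Pooled-variance interval (unknown but equal variances).
pooled <- t.test(a, b, var.equal = TRUE, conf.level = 0.90)

# Welch interval (unknown and unequal variances).
welch <- t.test(a, b, var.equal = FALSE, conf.level = 0.90)

print(pooled$conf.int)
print(welch$conf.int)

# Degrees of freedom: n1 + n2 - 2 for pooled, typically smaller for Welch.
print(c(pooled = unname(pooled$parameter), welch = unname(welch$parameter)))
```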

### Problem 6: Confidence Intervals for Proportions (10 pts)

1. In a random sample of $$n=500$$ families owning television sets in the city of Hamilton, Canada, it is found that $$x=345$$ subscribed to HBO. Find a 95% confidence interval for the actual proportion of families in this city who subscribe to HBO.
2. A certain change in a process for manufacture of component parts is being considered. Samples are taken using both the existing and the new procedure so as to determine if the new process results in an improvement. If 75 of 1500 items from the existing procedure were found to be defective, and 80 of 2000 items from the new procedure were found to be defective, find a 90% confidence interval for the true difference in the fraction of defectives between the existing and the new process.
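
For large samples, both intervals follow the usual normal-approximation (Wald) form: estimate plus or minus a z quantile times the standard error. The counts below are made up for illustration and are not those in the problems; the variable names are likewise assumptions.

```r
# One proportion: hypothetical 60 successes out of 200.
x <- 60
n <- 200
phat <- x / n
z95  <- qnorm(0.975)
one_prop_ci <- phat + c(-1, 1) * z95 * sqrt(phat * (1 - phat) / n)

# Difference of two proportions: hypothetical 30/400 versus 20/500.
x1 <- 30; n1 <- 400
x2 <- 20; n2 <- 500
p1 <- x1 / n1
p2 <- x2 / n2
z90 <- qnorm(0.95)
se_diff <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff_ci <- (p1 - p2) + c(-1, 1) * z90 * se_diff

print(one_prop_ci)
print(diff_ci)
```

Note that a 90% interval uses the 0.95 quantile, since 5% is left in each tail.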

### Problem 7: Hypothesis Tests (20 pts)

1. The Edison Electric Institute has published figures on the annual number of kilowatt hours expended by various home appliances. It is claimed that a vacuum cleaner expends an average of 46 kilowatt hours per year. If a random sample of 12 homes included in a planned study indicates that vacuum cleaners expend an average of 42 kilowatt hours per year with a standard deviation of 11.9 kilowatt hours, does this suggest at the 0.05 level of significance that vacuum cleaners expend, on the average, less than 46 kilowatt hours annually? Assume the population of kilowatt hours to be normal.
2. An experiment was performed to compare the abrasive wear of two different laminated materials. Twelve pieces of material 1 were tested by exposing each piece to a machine measuring wear. Ten pieces of material 2 were similarly tested. In each case, the depth of wear was observed. The samples of material 1 gave an average (coded) wear of 85 units with a sample standard deviation of 4, while the samples of material 2 gave an average of 81 and a sample standard deviation of 5. Can we conclude at the 0.05 level of significance that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units? Assume the populations to be approximately normal with equal variances.
3. A builder claims that heat pumps are installed in 70% of all homes being constructed today in the city of Richmond, VA. Would you agree with this claim if a random survey of new homes in this city shows that 8 out of 15 had heat pumps installed? Use a 0.01 level of significance.
4. A vote is to be taken among the residents of a town and the surrounding county to determine whether a proposed chemical plant should be constructed. The construction site is within the town limits, and for this reason many voters in the county feel that the proposal will pass because of the large proportion of town voters who favor the construction. To determine if there is a significant difference in the proportion of town voters and county voters favoring the proposal, a poll is taken. If 120 of 200 town voters favor the proposal and 240 of 500 county voters favor it, would you agree that the proportion of town voters favoring the proposal is higher than the proportion of county voters at the 0.05 significance level?
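
When only summary statistics are given, as in the problems above, the test statistic and p-value can be computed directly rather than through `t.test` or `prop.test`. The sketch below shows a one-sided one-sample t test and a one-sided two-proportion z test with a pooled proportion; all numbers are made up for illustration and are not taken from the problems.

```r
# One-sample t test from summaries (hypothetical values).
# H0: mu = 100 versus H1: mu < 100.
n    <- 20
xbar <- 98
s    <- 6
mu0  <- 100
t_stat  <- (xbar - mu0) / (s / sqrt(n))
p_value <- pt(t_stat, df = n - 1)          # lower-tail p-value

# Two-proportion z test (hypothetical counts).
# H0: p1 = p2 versus H1: p1 > p2, using the pooled proportion under H0.
x1 <- 45; n1 <- 100
x2 <- 30; n2 <- 100
p_pool <- (x1 + x2) / (n1 + n2)
z_stat <- (x1 / n1 - x2 / n2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_value_z <- pnorm(z_stat, lower.tail = FALSE)  # upper-tail p-value

print(c(t = t_stat, p = p_value))
print(c(z = z_stat, p = p_value_z))
```

The direction of the alternative hypothesis decides which tail is used, so it is worth writing H0 and H1 down before computing anything.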