This homework is due on Wednesday, January 24th at 4pm (the start of class). Its purpose is to refresh prerequisite concepts. All work must be submitted electronically. For full credit you must show all of your steps. Using computational tools (e.g., R) is encouraged. When you use them, show code inputs and outputs in-line (not as an appendix), accompanied by plain English that briefly explains what the code is doing.

Problem 1: Gaussian probabilities (10 pts)

Suppose \(X \sim \mathcal{N}(-10, 5^2)\), i.e., \(X\) has a Gaussian (normal) distribution with a mean of \(-10\) and a variance of 25.

  1. (6 pts) Compute the following: \[ \begin{aligned} \mathbb{P}(X > -10) && \mathbb{P}(X < -20) && \mbox{and} && \mathbb{P}(X = 0) \end{aligned} \]

  2. (4 pts) Express \(\mathbb{P}(-22 \leq X \leq -12)\) in terms of \(Z\), the standard normal random variable: \(Z \sim \mathcal{N}(0,1)\), and then use that expression to calculate the value of that probability statement.
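
As a way of checking a standardization by software, the sketch below compares a probability computed directly with `pnorm()` against the same probability computed after converting to the standard normal scale. The variable \(W \sim \mathcal{N}(3, 2^2)\) and the interval endpoints are hypothetical values chosen only for illustration; they are not the ones from this problem.

```r
# Hypothetical example: W ~ N(3, 2^2). These values are illustrative only.
mu <- 3
sigma <- 2   # pnorm() takes the standard deviation, not the variance
a <- 1
b <- 5

# P(a <= W <= b) computed directly on the original scale.
p_direct <- pnorm(b, mean = mu, sd = sigma) - pnorm(a, mean = mu, sd = sigma)

# The same probability after standardizing: Z = (W - mu) / sigma.
p_standard <- pnorm((b - mu) / sigma) - pnorm((a - mu) / sigma)

# The two routes should agree up to floating-point error.
all.equal(p_direct, p_standard)
```

Note that `pnorm()` defaults to `mean = 0` and `sd = 1`, which is why the standardized call needs no extra arguments.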

Problem 2: Functions of random variables (10 pts)

Suppose that \(\mathbb{E}\{X\} = \mathbb{E}\{Y\} = 0\), \(\mathbb{V}\mathrm{ar}\{X\} = \mathbb{V}\mathrm{ar}\{Y\} = 1\) and \(\mathbb{C}\mathrm{or}(X,Y) = 0.5\). Compute:

\[ \begin{aligned} \mathbb{E}\{3X - 2Y\} && \mathbb{V}\mathrm{ar}\{3X - 2Y\} && \mbox{and} && \mathbb{E}\{X^2\}. \end{aligned} \]
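
A hand calculation here can be sanity-checked by simulation. The sketch below is one possible approach and rests on assumptions not stated in the problem: it draws Gaussian pairs (the problem does not require Gaussianity), and the seed and sample size are arbitrary. The empirical summaries should land close to, but not exactly on, the analytical answers.

```r
set.seed(1)    # arbitrary seed, for reproducibility
N <- 1e5       # number of simulated pairs (arbitrary)
rho <- 0.5     # target correlation from the problem statement

# One way to build a pair with mean 0, variance 1 and correlation rho:
# mix x with independent noise e. Using Gaussians is an assumption.
x <- rnorm(N)
e <- rnorm(N)
y <- rho * x + sqrt(1 - rho^2) * e

# Check that the simulated pair matches the stated moments.
c(mean_x = mean(x), mean_y = mean(y), var_x = var(x), var_y = var(y),
  cor_xy = cor(x, y))

# Empirical mean and variance of the linear combination.
w <- 3 * x - 2 * y
mean(w)
var(w)
```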

Problem 3: Summation notation – computation (10 pts)

Let \(z\) be a vector of length \(n = 4\), stored in R as z as follows.

z <- c(2, -2, 3, -3)
n <- length(z)
n
## [1] 4
  1. (3 pts) Compute \(\sum_{i=1}^n z_i\) for z defined above.
  2. (3 pts) Let \(\bar{z} = \frac{1}{n} \sum_{i=1}^n z_i\) and calculate \(\sum_{i=1}^n (z_i - \bar{z})^2\) for z above.
  3. (4 pts) Provide an expression for the sample variance using summation notation, generically for an independent and identically distributed (iid) sample of observations \(z_1, \dots, z_n\). Then calculate the sample variance of z as defined above.
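
The R functions that map onto these sums can be sketched on a small example. The vector `w` below is hypothetical and deliberately different from the z above; the built-in `var()` is included only so that a hand calculation can be compared against it.

```r
w <- c(1, 4, 7)   # hypothetical data, not the z from this problem
m <- length(w)

total <- sum(w)            # sum of the elements
wbar <- total / m          # sample mean, same as mean(w)
ss <- sum((w - wbar)^2)    # sum of squared deviations from the mean

total
wbar
ss
var(w)   # built-in sample variance, for comparison with a hand calculation
```

Arithmetic in R is vectorized, so `w - wbar` subtracts the mean from every element without an explicit loop.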

Problem 4: Summation notation – algebra (10 pts)

For two general collections of \(n\) numbers \(X_1, \dots, X_n\) and \(Y_1, \dots, Y_n\) show that

\[ \sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^n (X_i - \bar{X}) Y_i. \]
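
Before attempting the algebra, it can be reassuring to see the identity hold numerically. The sketch below uses randomly generated numbers (the seed, length, and distributions are arbitrary choices) and compares both sides. A numerical check of this kind is not a proof and does not replace the derivation asked for above.

```r
set.seed(42)          # arbitrary seed
m <- 8                # arbitrary length
x_demo <- runif(m)    # hypothetical X values
y_demo <- rnorm(m)    # hypothetical Y values

# Left-hand side: both collections centered.
lhs <- sum((x_demo - mean(x_demo)) * (y_demo - mean(y_demo)))

# Right-hand side: only X centered.
rhs <- sum((x_demo - mean(x_demo)) * y_demo)

# Equal up to floating-point error.
all.equal(lhs, rhs)
```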

Problem 5: The sampling distribution (15 pts)

Suppose that we have a random sample \(Y_1, \dots, Y_n\) where \(Y_i \stackrel{\mathrm{iid}}{\sim} \mathcal{N}(\mu, 4)\) for \(i=1,\dots, n\), and where \(\mu\) denotes the mean of the Gaussian distribution.

  1. (4 pts) What is the expectation of the sample mean: \(\mathbb{E}\{ \bar{Y} \}\)?
  2. (4 pts) What is the variance of the sample mean: \(\mathbb{V}\mathrm{ar}\{ \bar{Y} \}\)?
  3. (4 pts) What is the variance of another iid realization, \(Y_{n+1}\), from the same distribution?
  4. (3 pts) What is the standard error of \(\bar{Y}\)?
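
Answers to these questions can be checked empirically with a small Monte Carlo experiment. The sketch below assumes hypothetical values for \(\mu\), the sample size, and the number of replications, none of which come from the problem. It repeatedly draws a sample, records its mean, and then summarizes the collection of means.

```r
set.seed(123)   # arbitrary seed
mu <- 1         # hypothetical mean; the problem leaves mu unspecified
n_obs <- 25     # hypothetical sample size
reps <- 10000   # number of simulated samples (arbitrary)

# rnorm() takes a standard deviation, so variance 4 means sd = 2.
ybar <- replicate(reps, mean(rnorm(n_obs, mean = mu, sd = 2)))

mean(ybar)   # empirical counterpart of the expectation of the sample mean
var(ybar)    # empirical counterpart of the variance of the sample mean
sd(ybar)     # empirical counterpart of the standard error
```

Rerunning with a different `n_obs` shows how the spread of the sample mean changes with the sample size.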

Problem 6: Calculus (25 pts)

  1. (6 pts) Let \(f(x) = \lambda e^{-\lambda x}\) for some fixed parameter \(\lambda\) and calculate the following. \[ \begin{aligned} \frac{d}{dx} f(x) && \int_0^1 f(x) \; dx && \mbox{and} && \int_0^\infty f(x) \; dx \end{aligned} \]

  2. (7 pts) Let \(g(x) = \exp\left\{-\frac{(x - \mu)^2}{2 \sigma^2}\right\}\) for some fixed parameters \(\mu\) and \(\sigma^2\) and calculate the following. \[ \begin{aligned} \frac{d}{dx} g(x) && \int_{-\infty}^\infty g(x) \; dx && \mbox{and} && \int_{\mu}^\infty g(x) \; dx \end{aligned} \]

  3. (8 pts) With \(g(x)\) defined above, but now viewing it as a function of \(x\) and \(\mu\), i.e., \(g(x, \mu)\), find an expression for the value of \(\mu\) for fixed \(x_1,\dots, x_n\) which maximizes \[ \prod_{i=1}^n g(x_i, \mu). \] Hint: start by taking the \(\log\).

  4. (4 pts) Again with \(g(x)\) defined as above, setting \(\mu=2\) and \(\sigma^2=4\), evaluate the following. \[ \int_3^4 g(x) \; dx \] Hint: you may find software/numerical procedures helpful here.
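
For the numerical route suggested in the hint, base R provides `integrate()`. The sketch below shows how it is called, using hypothetical integrands that are unrelated to \(g(x)\): one over a finite interval and one over the whole real line.

```r
# Hypothetical integrand, chosen because the exact answer is easy to verify.
h <- function(x) x^2

res <- integrate(h, lower = 0, upper = 3)
res$value       # numerical estimate of the integral
res$abs.error   # estimate of the absolute error

# Infinite limits are allowed via -Inf and Inf.
res_inf <- integrate(function(x) 1 / (1 + x^2), lower = -Inf, upper = Inf)
res_inf$value
```

The function passed to `integrate()` must be vectorized, that is, it must accept a vector of inputs and return a vector of the same length.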

Problem 7: Linear algebra (20 pts)

  1. (4 pts) Suppose that \(X\) is an \(n \times p\) matrix, and \(Y\) is an \(n \times 1\) matrix, i.e., an \(n\)-vector. Write \(X^\top Y\) as a vector of sums using the \(\Sigma\) notation.

  2. (3 pts) Using \(X\) and \(Y\) as above, what is the dimension of the following compound matrix–vector product?

\[ (X^\top X)^{-1} X^\top Y \]

  3. (6 pts) Now, let \(\beta\) be a \(p \times 1\) vector, with \(X\) and \(Y\) defined as above. Find an expression for the value of \(\beta\) that gives the smallest value of \[ || Y - X \beta ||^2 = (Y - X\beta)^\top (Y - X\beta). \] Hint: start with \(p=1\) and see if that helps guide you toward the general-\(p\) solution.

  4. (2 pts) What condition must \(X\) satisfy in order for such a solution (your expression for the optimal \(\beta\) above) to exist?

  5. (5 pts) Suppose \(X\) and \(Y\) are defined by the X and y variables in R below. Calculate the value of \(\beta\) minimizing \(|| Y - X \beta ||^2\). Hint: using R’s built-in matrix–vector operations is easier than writing your own with double sums.

X <- cbind(1, 1:10)
y <- c(1.391, 0.036, 1.625, 2.427, 3.162, 3.181, 4.715, 1.678, 7.074, 5.981)
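
The built-in matrix operations mentioned in the hint are sketched below on a small hypothetical system. The matrix `A` and vector `b` are made up for illustration and are not the X and y above. The sketch shows the transpose, the matrix product, and two uses of `solve()`.

```r
A <- cbind(c(2, 1), c(1, 3))   # hypothetical 2 x 2 matrix
b <- c(3, 5)                   # hypothetical right-hand side

t(A)       # transpose
A %*% b    # matrix-vector product (note %*%, not *)

A_inv <- solve(A)       # matrix inverse
sol <- solve(A, b)      # solves A %*% sol = b without forming the inverse

# Both routes give the same solution.
all.equal(as.vector(A_inv %*% b), sol)
all.equal(as.vector(A %*% sol), b)
```

Using `*` between matrices in R multiplies element by element, so `%*%` is needed for products in the linear-algebra sense.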