Homework 4

Problem 1

To determine whether the bacteria count was lower in the west basin of Lake Macatawa than in the east basin, $n = 37$ samples of water were taken from the west basin and the number of bacteria colonies in $100$ milliliters of water was counted. The sample characteristics were $\bar{x} = 11.95$ and $s = 11.80$, measured in hundreds of colonies. Find an approximate $95\%$ confidence interval for the mean number of colonies (say, $\mu_W$) in $100$ milliliters of water in the west basin.

Solution.

While the true standard deviation is unknown, the sample size is large enough to allow us to use the central limit theorem and replace $\sigma$ with $s$. With $z_{0.025} \approx 1.96$, the confidence interval will be

$$11.95 \pm z_{0.025} \frac{11.80}{\sqrt{37}} = \boxed{\left[8.15, 15.75\right]}.$$
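As a quick numerical check of the interval above, here is a minimal Python sketch. It uses only the standard library; the variable names are illustrative and not part of the original solution.

```python
from math import sqrt
from statistics import NormalDist

# Sample summary from the problem statement (hundreds of colonies).
n, x_bar, s = 37, 11.95, 11.80

# z_{0.025}: the 97.5th percentile of the standard normal, about 1.96.
z = NormalDist().inv_cdf(0.975)

# Large-sample interval: x_bar +/- z * s / sqrt(n).
margin = z * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"[{lower:.2f}, {upper:.2f}]")
```

Because $s$ stands in for the unknown $\sigma$, the interval is only approximate, which is why the problem asks for an approximate interval.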

Problem 2

A leakage test was conducted to determine the effectiveness of a seal designed to keep the inside of a plug airtight. An air needle was inserted into the plug, and the plug and needle were placed under water. The pressure was then increased until leakage was observed. Let $X$ equal the pressure in pounds per square inch. Assume that the distribution of $X$ is $\mathcal{N}\left(\mu, \sigma^2\right)$. The following $n = 10$ observations of $X$ were obtained:

$$\begin{array}{rrrrr} 3.1 & 3.3 & 4.5 & 4.2 & 3.5 \\ 3.5 & 3.7 & 4.2 & 3.9 & 3.3 \end{array}$$

Use the observations to:

  1. Find a point estimate of $\mu$.
  2. Find a point estimate of $\sigma$.
  3. Find a $95\%$ one-sided confidence interval for $\mu$ that provides an upper bound for $\mu$.
Solution.
  1. $\boxed{\bar{x} \approx 3.72}$
  2. $\boxed{s \approx 0.46}$
  3. Since the sample is small and the variance is unknown, we need to use a $t$-distribution with $10 - 1 = 9$ degrees of freedom:

    $$T = \frac{\bar{X} - \mu}{S/\sqrt{10}} \sim t\left(9\right)$$

    For a $95\%$ one-sided confidence interval that bounds $\mu$ from above, we use

    $$\mathbb{P}\left(T \geq -t_{0.05}\left(9\right)\right) = 0.95.$$

    This gives the one-sided interval

    $$\mu \leq \bar{x} + t_{0.05}\left(9\right) \frac{s}{\sqrt{10}} \approx 3.99, \quad \text{that is,} \quad \boxed{\left(-\infty, 3.99\right]}.$$
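The three answers can be reproduced with a short Python sketch. The critical value $t_{0.05}(9) \approx 1.833$ is hard-coded from a standard $t$-table (an assumption of this sketch, since the standard library has no $t$ quantile function), and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Observed leakage pressures (psi) from the problem statement.
pressures = [3.1, 3.3, 4.5, 4.2, 3.5, 3.5, 3.7, 4.2, 3.9, 3.3]
n = len(pressures)

x_bar = mean(pressures)   # point estimate of mu
s = stdev(pressures)      # sample standard deviation (divides by n - 1)

# Assumed table value: upper 5% point of t with 9 degrees of freedom.
T_05_9 = 1.833

# One-sided 95% upper confidence bound for mu.
upper = x_bar + T_05_9 * s / sqrt(n)

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, upper bound = {upper:.2f}")
```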

Problem 3

Students in a semester-long health-fitness program have their percentage of body fat measured at the beginning of the semester and at the end of the semester. The following measurements give these percentages for $9$ men and for $8$ women:

$$\begin{array}{cc}
\text{Males} & \text{Females} \\\hline \\[-2ex]
\begin{array}{cc}
\text{Pre-program (\%)} & \text{Post-program (\%)} \\\hline \\[-2ex]
11.10 & \phantom{1}9.97 \\
19.50 & 15.80 \\
14.00 & 13.02 \\
\phantom{1}8.30 & \phantom{1}9.28 \\
12.40 & 11.51 \\
\phantom{1}7.89 & \phantom{1}7.40 \\
12.10 & 10.70 \\
\phantom{1}8.30 & 10.40 \\
12.31 & 11.40
\end{array}
&
\begin{array}{cc}
\text{Pre-program (\%)} & \text{Post-program (\%)} \\\hline \\[-2ex]
22.90 & 22.89 \\
31.60 & 33.47 \\
27.70 & 25.75 \\
21.70 & 19.80 \\
19.36 & 18.00 \\
25.03 & 22.33 \\
26.90 & 25.26 \\
25.75 & 24.90 \\
\phantom{s}
\end{array}
\end{array}$$
  1. Let $X$ be the change in percentage of body fat for females before and after the program. Assume $X \sim \mathcal{N}\left(\mu_X, \sigma_X^2\right)$ for some unknown $\sigma_X^2$. Construct a $90\%$ confidence interval for $\mu_X$.

  2. Let $Y$ be the change in percentage of body fat for males before and after the program. Assume $Y \sim \mathcal{N}\left(\mu_Y, \sigma_Y^2\right)$ for some unknown $\sigma_Y^2$. Construct a $90\%$ confidence interval for $\mu_Y$.

  3. Assume $\sigma_X^2 = \sigma_Y^2 = \sigma^2$ for some unknown $\sigma^2$. Construct a $90\%$ confidence interval for $\mu_X - \mu_Y$.

  4. Is the program effective in reducing percentage of body fat? Is it more effective for males or for females?

Solution.
  1. Since the sample size is small and the variance is unknown, we use a $t$-distribution for our confidence interval. Our sample for $X$ (post-program minus pre-program, females) is

    $$\begin{array}{rrrr} -0.01 & 1.87 & -1.95 & -1.90 \\ -1.36 & -2.70 & -1.64 & -0.85 \end{array}$$

    Thus, our confidence interval is

    $$\bar{x} \pm t_{0.05}\left(7\right) \frac{s_X}{\sqrt{8}} = \boxed{\left[-2.03, -0.11\right]}.$$
  2. Similarly, our sample for $Y$ (post-program minus pre-program, males) is

    $$\begin{array}{rrrrr} -1.13 & -3.70 & -0.98 & 0.98 & -0.89 \\ -0.49 & -1.40 & 2.10 & -0.91 \end{array}$$

    We get the confidence interval

    $$\bar{y} \pm t_{0.05}\left(8\right) \frac{s_Y}{\sqrt{9}} = \boxed{\left[-1.71, 0.28\right]}.$$
  3. Since the variances are assumed to be equal, we use a $t$-distribution with $8 + 9 - 2 = 15$ degrees of freedom. Recall the pooled sample variance:

    $$S_p^2 = \frac{\left(n_X-1\right) S_X^2 + \left(n_Y-1\right) S_Y^2}{n_X + n_Y - 2}.$$

    With $n_X = 8$ females and $n_Y = 9$ males, the final confidence interval for $\mu_X - \mu_Y$ will be

    $$\bar{x} - \bar{y} \pm t_{0.05}\left(15\right) s_p \sqrt{\frac{1}{8} + \frac{1}{9}} = \boxed{\left[-1.65, 0.94\right]}.$$
  4. Based on the samples and confidence intervals, the program seems to be effective for females, since their interval lies entirely below $0$. The result is inconclusive for males, since their interval contains $0$. Similarly, the confidence interval for the difference of the means contains $0$, so there is no evidence that the program is more effective for a particular sex.
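The computations in parts 1 to 3 can be sketched in Python as follows. Here $X$ is the change for females and $Y$ the change for males (post-program minus pre-program), as defined in the problem statement. The critical values in `T_TABLE` are hard-coded from a standard $t$-table, and the helper names `changes` and `t_interval` are hypothetical, introduced only for this illustration.

```python
from math import sqrt
from statistics import mean, variance

# Body fat percentages from the table in the problem statement.
female_pre = [22.90, 31.60, 27.70, 21.70, 19.36, 25.03, 26.90, 25.75]
female_post = [22.89, 33.47, 25.75, 19.80, 18.00, 22.33, 25.26, 24.90]
male_pre = [11.10, 19.50, 14.00, 8.30, 12.40, 7.89, 12.10, 8.30, 12.31]
male_post = [9.97, 15.80, 13.02, 9.28, 11.51, 7.40, 10.70, 10.40, 11.40]

# Assumed table values of t_{0.05}(df), keyed by degrees of freedom.
T_TABLE = {7: 1.895, 8: 1.860, 15: 1.753}


def changes(pre, post):
    """Per-person change: post-program minus pre-program."""
    return [after - before for before, after in zip(pre, post)]


def t_interval(sample, t_crit):
    """Two-sided interval: mean +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    margin = t_crit * sqrt(variance(sample) / n)
    center = mean(sample)
    return center - margin, center + margin


x = changes(female_pre, female_post)   # females, n_X = 8
y = changes(male_pre, male_post)       # males,   n_Y = 9

ci_x = t_interval(x, T_TABLE[len(x) - 1])
ci_y = t_interval(y, T_TABLE[len(y) - 1])

# Pooled-variance interval for mu_X - mu_Y (equal variances assumed).
n_x, n_y = len(x), len(y)
df = n_x + n_y - 2
pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / df
margin = T_TABLE[df] * sqrt(pooled_var * (1 / n_x + 1 / n_y))
diff = mean(x) - mean(y)
ci_diff = (diff - margin, diff + margin)

for label, ci in [("mu_X", ci_x), ("mu_Y", ci_y), ("mu_X - mu_Y", ci_diff)]:
    print(f"{label}: [{ci[0]:.2f}, {ci[1]:.2f}]")
```

An interval that excludes $0$ suggests a real change in body fat for that group; an interval that contains $0$ is inconclusive at the $90\%$ level.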

Problem 4

Let $X_1, \ldots, X_5$ be a random sample of SAT mathematics scores, assumed to be $\mathcal{N}\left(\mu_X, \sigma^2\right)$, and let $Y_1, \ldots, Y_5$ be an independent random sample of SAT verbal scores, assumed to be $\mathcal{N}\left(\mu_Y, \sigma^2\right)$. If the following data are observed:

$$\begin{aligned} x_1 &= 644, & x_2 &= 493, & x_3 &= 532, & x_4 &= 462, & x_5 &= 565 \\ y_1 &= 632, & y_2 &= 472, & y_3 &= 492, & y_4 &= 661, & y_5 &= 540, \end{aligned}$$

find a $90\%$ confidence interval for $\mu_X - \mu_Y$.

Solution.

Because the sample size is small and the common variance is unknown, we use a $t$-distribution with $5 + 5 - 2 = 8$ degrees of freedom. The confidence interval is

$$\bar{x} - \bar{y} \pm t_{0.05}\left(8\right) s_p \sqrt{\frac{1}{5} + \frac{1}{5}} = \boxed{\left[-111.27, 70.87\right]}.$$

Alternatively, you can also view the problem as $n = 5$ samples of the differences $X_i - Y_i$, where

$$X - Y \sim \mathcal{N}\left(\mu_X - \mu_Y, 2\sigma^2\right),$$

in which case you use a $t$-distribution with $5 - 1 = 4$ degrees of freedom and the sample standard deviation $s$ of the differences, which gives

$$\bar{x} - \bar{y} \pm t_{0.05}\left(4\right) \frac{s}{\sqrt{5}} = \boxed{\left[-115.98, 75.58\right]}.$$
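Both approaches can be compared numerically with the sketch below. The critical values $t_{0.05}(8) \approx 1.860$ and $t_{0.05}(4) \approx 2.132$ are hard-coded from a standard $t$-table (an assumption of this sketch), so the last digit of each endpoint may differ slightly from a computation with more precise quantiles.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Observed SAT scores from the problem statement.
math_scores = [644, 493, 532, 462, 565]
verbal_scores = [632, 472, 492, 661, 540]
n = len(math_scores)

# Assumed table values for the upper 5% points of the t-distribution.
T_05_8 = 1.860
T_05_4 = 2.132

diff = mean(math_scores) - mean(verbal_scores)

# Approach 1: pooled two-sample interval with 2n - 2 = 8 degrees of freedom.
pooled_var = ((n - 1) * variance(math_scores)
              + (n - 1) * variance(verbal_scores)) / (2 * n - 2)
margin_pooled = T_05_8 * sqrt(pooled_var * (1 / n + 1 / n))
ci_pooled = (diff - margin_pooled, diff + margin_pooled)

# Approach 2: one-sample interval on the differences x_i - y_i (4 df).
d = [x - y for x, y in zip(math_scores, verbal_scores)]
margin_d = T_05_4 * stdev(d) / sqrt(n)
ci_d = (diff - margin_d, diff + margin_d)

print(f"pooled:      [{ci_pooled[0]:.2f}, {ci_pooled[1]:.2f}]")
print(f"differences: [{ci_d[0]:.2f}, {ci_d[1]:.2f}]")
```

The second interval is wider here because it spends only $4$ degrees of freedom on estimating the variance instead of $8$.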

Problem 5

Let $X$ and $Y$ be the lifetimes (in hours) of two types of light bulbs, respectively. Assume $X \sim \mathcal{N}\left(\mu_X, 689\right)$ and $Y \sim \mathcal{N}\left(\mu_Y, 735\right)$. Suppose a random sample of $23$ type $X$ light bulbs yields an average lifetime of $956.2$ hours, and a random sample of $28$ type $Y$ light bulbs yields an average lifetime of $978.6$ hours. Construct a $95\%$ confidence interval for $\mu_X - \mu_Y$.

Solution.

The variances are known, so $\bar{X} - \bar{Y}$ is exactly normal and we can use the normal quantile $z_{0.025} \approx 1.96$:

$$\bar{x} - \bar{y} \pm z_{0.025} \sqrt{\frac{\sigma_X^2}{23} + \frac{\sigma_Y^2}{28}} = \boxed{\left[-37.09, -7.71\right]}.$$
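A minimal Python sketch of this computation, using only the standard library (the variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

# Known variances, sample sizes, and sample means from the problem statement.
var_x, n_x, mean_x = 689, 23, 956.2
var_y, n_y, mean_y = 735, 28, 978.6

# z_{0.025}: the 97.5th percentile of the standard normal.
z = NormalDist().inv_cdf(0.975)

# Standard error of the difference of two independent sample means.
se = sqrt(var_x / n_x + var_y / n_y)

diff = mean_x - mean_y
ci = (diff - z * se, diff + z * se)

print(f"[{ci[0]:.2f}, {ci[1]:.2f}]")
```

Since the whole interval lies below $0$, the data suggest that type $Y$ bulbs last longer on average.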