Homework 8
Problem 1
Let $X$ be a Bernoulli random variable with success probability $p$. Consider the null hypothesis $H_0\colon p = \frac{1}{2}$ and the alternative hypothesis $H_1\colon p = \frac{1}{3}$. For a sample size $n = 5$, find $C$, the best critical region of size $\alpha = 0.1875$. Find the power of the test associated with $C$.
Solution.
To find a best critical region, we need to:
1. Write down the likelihood ratio $\frac{L(\theta_0)}{L(\theta_1)}$.
2. Rewrite the inequality $\frac{L(\theta_0)}{L(\theta_1)} \leq k$ in the form $u \leq c$ or $u \geq c$, where $u = u(X_1, \ldots, X_n)$ is some statistic (for example, $\sum_{i=1}^n X_i$). We typically take logs at this step.
3. If given $\alpha$, solve for $c$ in
$$
P(u \leq c \mid H_0) = \alpha
\quad\text{or}\quad
P(u \geq c \mid H_0) = \alpha
$$
using the distribution of $u$. The direction of the inequality is exactly the same as the one you got in step 2.
First, let's write down the likelihood ratio.
$$
\frac{L\left(\frac{1}{2}\right)}{L\left(\frac{1}{3}\right)}
= \frac{\prod_{i=1}^5 \left(\frac{1}{2}\right)^{x_i} \left(1 - \frac{1}{2}\right)^{1-x_i}}{\prod_{i=1}^5 \left(\frac{1}{3}\right)^{x_i} \left(1 - \frac{1}{3}\right)^{1-x_i}}
= \frac{\left(\frac{1}{2}\right)^5}{\left(\frac{1}{3}\right)^{\sum_{i=1}^5 x_i} \left(\frac{2}{3}\right)^{5-\sum_{i=1}^5 x_i}}.
$$
I will use the notation $y = \sum_{i=1}^5 x_i$ for brevity. Next, we take logs on both sides of $\frac{L\left(\frac{1}{2}\right)}{L\left(\frac{1}{3}\right)} \leq k$, which gives
$$
\begin{aligned}
\frac{\left(\frac{1}{2}\right)^5}{\left(\frac{1}{3}\right)^{\sum_{i=1}^5 x_i} \left(\frac{2}{3}\right)^{5-\sum_{i=1}^5 x_i}} \leq k
&\iff 5\ln\left(\frac{1}{2}\right) - y \ln\left(\frac{1}{3}\right) - (5 - y) \ln\left(\frac{2}{3}\right) \leq \ln k \\
&\iff y\ln 2 \leq \ln k - 5\ln\left(\frac{1}{2}\right) + 5\ln\left(\frac{2}{3}\right) && (*) \\
&\iff y \leq \frac{\ln k - 5\ln\left(\frac{1}{2}\right) + 5\ln\left(\frac{2}{3}\right)}{\ln 2} = c.
\end{aligned}
$$
This means that a best critical region is of the form $Y = \sum_{i=1}^5 X_i \leq c$. Recall that a sum of $5$ independent Bernoulli trials (with the same success probability) has distribution $\operatorname{Bin}(5, p)$, so we need to solve
$$
P\left(Y \leq c \mid p = \frac{1}{2}\right) = 0.1875
$$
for $c$. Using the binomial cdf table, we get $c = 1$, since $P\left(Y \leq 1 \mid p = \frac{1}{2}\right) = \frac{1 + 5}{32} = 0.1875$. Thus, the best critical region is $C = \{Y \leq 1\}$.
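The table lookup can also be reproduced numerically. The following is a minimal Python sketch, assuming only the standard library; the helper name `binom_cdf` and the float tolerance are illustrative choices, not part of the course material. It evaluates the $\operatorname{Bin}(5, \frac{1}{2})$ cdf at every possible cutoff and picks the one whose size equals $0.1875$.

```python
from math import comb

def binom_cdf(c, n, p):
    """P(Y <= c) for Y ~ Bin(n, p), computed by summing the pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

n, p0, alpha = 5, 0.5, 0.1875

# Size of the region {Y <= c} under H0 for every possible cutoff c.
sizes = {c: binom_cdf(c, n, p0) for c in range(n + 1)}
for c, size in sizes.items():
    print(f"c = {c}: P(Y <= c | p = 1/2) = {size:.5f}")

# Pick the cutoff whose size matches alpha (tolerance is an illustrative choice).
c_star = next(c for c, size in sizes.items() if abs(size - alpha) < 1e-12)
print("cutoff with size alpha:", c_star)
```

Because the binomial distribution is discrete, only a handful of sizes are attainable without randomization; $\alpha = 0.1875$ happens to be one of them.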
The power is
$$
\begin{aligned}
K\left(\frac{1}{3}\right)
&= P\left(Y \leq 1 \mid p = \frac{1}{3}\right) \\
&= \sum_{k=0}^1 \binom{5}{k} \left(\frac{1}{3}\right)^k \left(1 - \frac{1}{3}\right)^{5-k} \\
&= \boxed{0.4609}.
\end{aligned}
$$
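As a check on the arithmetic, the power can be computed exactly with rational numbers. This is a small sketch using Python's standard `fractions` module; the helper name `binom_cdf` is an illustrative choice.

```python
from fractions import Fraction
from math import comb

def binom_cdf(c, n, p):
    """P(Y <= c) for Y ~ Bin(n, p); exact when p is a Fraction."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# Power = probability of landing in the critical region {Y <= 1} under H1: p = 1/3.
power = binom_cdf(1, 5, Fraction(1, 3))
print(power, "=", round(float(power), 4))
```

The exact value is $\frac{32}{243} + \frac{80}{243} = \frac{112}{243}$, which rounds to $0.4609$.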
Problem 2
Let $X$ have an exponential distribution with a mean of $\theta$; that is, the pdf of $X$ is
$$
f(x; \theta) = \frac{1}{\theta} e^{-\frac{x}{\theta}}, \quad 0 < x < \infty.
$$
Let $X_1, \ldots, X_n$ be a random sample from this distribution.
1. Show that a best critical region for testing $H_0\colon \theta = 3$ against $H_1\colon \theta = 5$ can be based on the statistic $\sum_{i=1}^n X_i$.
2. If $n = 12$, use the fact that $\frac{2}{\theta} \sum_{i=1}^{12} X_i$ is $\chi^2(24)$ to find a best critical region of size $\alpha = 0.1$.
Solution.
The likelihood is
$$
L(\theta)
= \prod_{i=1}^n \frac{1}{\theta} e^{-\frac{x_i}{\theta}}
= \frac{1}{\theta^n} e^{-\frac{1}{\theta} \sum_{i=1}^n x_i}.
$$
Thus,
$$
\frac{L(3)}{L(5)}
= \left(\frac{5}{3}\right)^n \exp\left(\left(-\frac{1}{3} + \frac{1}{5}\right) \sum_{i=1}^n x_i\right).
$$
We get
$$
\begin{aligned}
\left(\frac{5}{3}\right)^n \exp\left(\left(-\frac{1}{3} + \frac{1}{5}\right) \sum_{i=1}^n x_i\right) \leq k
&\iff n\ln\left(\frac{5}{3}\right) + \left(-\frac{1}{3} + \frac{1}{5}\right) \sum_{i=1}^n x_i \leq \ln k \\
&\iff \left(-\frac{1}{3} + \frac{1}{5}\right) \sum_{i=1}^n x_i \leq \ln k - n\ln\left(\frac{5}{3}\right) \\
&\iff \sum_{i=1}^n x_i \geq \frac{\ln k - n\ln\left(\frac{5}{3}\right)}{-\frac{1}{3} + \frac{1}{5}} = c.
\end{aligned}
$$
Note that the inequality flips in the last step because $-\frac{1}{3} + \frac{1}{5} < 0$.
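To see numerically that the ratio depends on the data only through $\sum x_i$ and decreases as the sum grows, here is a small Python sketch. The sample values are made up for illustration, and the function names are hypothetical.

```python
from math import exp, isclose

def likelihood(theta, xs):
    """Exponential likelihood L(theta) = theta^(-n) * exp(-sum(xs) / theta)."""
    return theta ** (-len(xs)) * exp(-sum(xs) / theta)

def ratio(xs):
    """Likelihood ratio L(3) / L(5) for the sample xs."""
    return likelihood(3, xs) / likelihood(5, xs)

# Made-up samples of size 12 with increasing sums 12, 24, 36, 48, 60.
samples = [[s / 12] * 12 for s in (12, 24, 36, 48, 60)]
ratios = [ratio(xs) for xs in samples]
print([f"{r:.4f}" for r in ratios])  # strictly decreasing in the sum

# Two different made-up samples with the same sum give the same ratio.
a = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
b = [3.5] * 12
print(isclose(ratio(a), ratio(b)))
```

Since a small ratio corresponds to a large sum, rejecting when the ratio is at most $k$ is the same as rejecting when $\sum x_i \geq c$.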
From part 1, we need to solve
$$
P\left(\sum_{i=1}^n X_i \geq c \mid \theta = 3\right) = 0.1.
$$
Here, we don't know the distribution of $\sum_{i=1}^{12} X_i$, but we do know the distribution of $\frac{2}{\theta} \sum_{i=1}^{12} X_i$, so we write
$$
\begin{gathered}
P\left(\sum_{i=1}^{12} X_i \geq c \mid \theta = 3\right)
= P\left(\frac{2}{3} \sum_{i=1}^{12} X_i \geq \frac{2c}{3} \mid \theta = 3\right)
= 0.1 \\
\implies
P\left(\frac{2}{3} \sum_{i=1}^{12} X_i \leq \frac{2c}{3} \mid \theta = 3\right)
= 0.9.
\end{gathered}
$$
When using the chi-square table, you have to read it carefully. Even though the last probability is equal to $0.9$, the subscript in $\chi^2_{\alpha}(r)$ refers to the upper-tail probability, so we need to use $\chi^2_{0.1}(24) = 33.20$. Setting $\frac{2c}{3} = \chi^2_{0.1}(24)$ gives
$$
c = \frac{3 \chi^2_{0.1}(24)}{2} = 49.8.
$$
Thus, a best critical region is
$$
\boxed{\sum_{i=1}^{12} X_i \geq 49.8}.
$$
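The table value can be double-checked without a statistics library. For even degrees of freedom $r = 2m$, the chi-square upper tail has the closed form $P(\chi^2(2m) > x) = e^{-x/2} \sum_{j=0}^{m-1} \frac{(x/2)^j}{j!}$. The sketch below uses that identity together with bisection; this is a swapped-in numerical approach for checking, not the method used in the solution, and the function names and search bounds are illustrative assumptions.

```python
from math import exp, factorial

def chi2_sf_even(x, df):
    """P(chi^2(df) > x) for even df, via the closed-form Erlang tail sum."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2
    return exp(-half) * sum(half**j / factorial(j) for j in range(df // 2))

def upper_quantile(alpha, df, lo=0.0, hi=200.0):
    """Find x with P(chi^2(df) > x) = alpha by bisection (the tail is decreasing in x)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf_even(mid, df) > alpha:
            lo = mid  # tail still too heavy, move right
        else:
            hi = mid
    return (lo + hi) / 2

q = upper_quantile(0.1, 24)  # upper 10% point of chi^2(24)
c = 3 * q / 2                # cutoff for the sum of the 12 observations
print(f"chi2_0.1(24) = {q:.2f}, c = {c:.1f}")
```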
Problem 3
(If you finished your homework early, note that the problem was changed to use a two-sided alternative and one of the parts was removed.)
Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a normal distribution $\mathcal{N}(\mu, 100)$.
1. To test $H_0\colon \mu = 230$ against $H_1\colon \mu \neq 230$, what is the critical region specified by the likelihood ratio test?
2. If a random sample of $n = 16$ yielded $\bar{x} = 232.6$, is $H_0$ accepted at a significance level of $\alpha = 0.1$?
Solution.
The likelihood is
$$
\begin{aligned}
L(\mu)
&= \prod_{i=1}^n \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left(-\frac{1}{2\sigma^2} (x_i - \mu)^2\right) \\
&= \frac{1}{\left(2\pi\sigma^2\right)^{\frac{n}{2}}} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2\right).
\end{aligned}
$$
To perform a likelihood ratio test, we need to maximize the likelihood over all values of $\mu$, so we essentially need to find the MLE. We can maximize the log-likelihood as usual:
$$
\begin{gathered}
\ln L(\mu)
= -\frac{n}{2} \ln\left(2\pi \sigma^2\right) - \frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2 \\
\implies \frac{\partial}{\partial \mu} \ln L(\mu)
= \frac{1}{\sigma^2} \sum_{i=1}^n (x_i - \mu).
\end{gathered}
$$
Setting the derivative equal to $0$ and solving gives $\widehat{\mu} = \bar{x}$. Thus,
$$
\begin{aligned}
\lambda
&= \frac{L(230)}{L(\bar{x})} \\
&= \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^n \left((x_i - 230)^2 - (x_i - \bar{x})^2\right)\right) \\
&= \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^n \left(x_i - 230 - (x_i - \bar{x})\right) \left(x_i - 230 + (x_i - \bar{x})\right)\right) \\
&= \exp\left(-\frac{1}{2\sigma^2} (\bar{x} - 230) \sum_{i=1}^n (2x_i - 230 - \bar{x})\right) \\
&= \exp\left(-\frac{1}{2\sigma^2} (\bar{x} - 230) (2n\bar{x} - 230n - n\bar{x})\right) \\
&= \exp\left(-\frac{n}{2\sigma^2} (\bar{x} - 230)^2\right).
\end{aligned}
$$
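The simplification can be sanity-checked numerically by comparing the ratio of the two likelihoods with the closed form on simulated data. This is only a sketch: the sample is randomly generated for illustration (its mean of 232 and the seed are arbitrary assumptions), and the function name is hypothetical.

```python
import random
from math import exp, isclose, pi

def likelihood(mu, xs, sigma2=100.0):
    """Normal likelihood with known variance sigma2."""
    n = len(xs)
    sse = sum((x - mu) ** 2 for x in xs)
    return (2 * pi * sigma2) ** (-n / 2) * exp(-sse / (2 * sigma2))

random.seed(0)  # arbitrary seed, only for reproducibility
xs = [random.gauss(232.0, 10.0) for _ in range(16)]  # made-up sample
n = len(xs)
xbar = sum(xs) / n

lam_direct = likelihood(230.0, xs) / likelihood(xbar, xs)  # L(230) / L(xbar)
lam_closed = exp(-n / (2 * 100.0) * (xbar - 230.0) ** 2)   # simplified form
print(lam_direct, lam_closed, isclose(lam_direct, lam_closed, rel_tol=1e-9))
```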
We get
$$
\begin{aligned}
\lambda \leq k
&\iff \exp\left(-\frac{n}{2\sigma^2} (\bar{x} - 230)^2\right) \leq k \\
&\iff -\frac{n}{2\sigma^2} (\bar{x} - 230)^2 \leq \ln k \\
&\iff (\bar{x} - 230)^2 \geq -\frac{2\sigma^2 \ln k}{n} \\
&\iff \left|\bar{x} - 230\right| \geq \sqrt{-\frac{2\sigma^2 \ln k}{n}} = c'.
\end{aligned}
$$
(I write $c'$ here because we're going to replace the constant one more time.)
Note that the inequality flips when we divide by the negative constant and that $-\ln k \geq 0$ (since $\lambda \leq 1$ always, only $k \leq 1$ is of interest), so the negative sign inside the square root looks funny but isn't actually a problem. This tells us that our critical region has the form
$$
\left|\bar{X} - 230\right| \geq c'.
$$
Recall that under $H_0$,
$$
Z = \frac{\bar{X} - 230}{10/\sqrt{16}} \sim \mathcal{N}(0, 1)
$$
and that $\left|\bar{X} - 230\right| \geq c'$ is equivalent to $|Z| = \frac{\left|\bar{X} - 230\right|}{10/\sqrt{16}} \geq \frac{c'}{10/\sqrt{16}} = c$. Thus, to ensure that the test has significance level $\alpha$, we need
$$
\begin{gathered}
P\left(\left|\bar{X} - 230\right| \geq c'\right)
= P\left(|Z| \geq c\right)
= \alpha \\
\implies c = z_{\frac{\alpha}{2}},
\end{gathered}
$$
so the critical region is
$$
\boxed{|Z| \geq z_{\frac{\alpha}{2}}}.
$$
From the $z$-table, we have $z_{\frac{0.1}{2}} = z_{0.05} = 1.645$, so we reject if $|z| \geq 1.645$. The observed $z$-statistic is
$$
|z| = \frac{|232.6 - 230|}{10/\sqrt{16}} = 1.04,
$$
so we fail to reject $H_0$ at $\alpha = 0.1$.
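The same decision can be reproduced with Python's standard library. This is a minimal sketch: `statistics.NormalDist` supplies the normal quantile in place of the $z$-table, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 230.0, 10.0, 16, 232.6, 0.1

z_obs = abs(xbar - mu0) / (sigma / sqrt(n))    # observed |z|
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point, about 1.645
p_value = 2 * (1 - NormalDist().cdf(z_obs))    # two-sided p-value
reject = z_obs >= z_crit

print(f"|z| = {z_obs:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

The two-sided p-value is larger than $\alpha = 0.1$, which agrees with the comparison of $|z|$ against the critical value.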
Problem 4
Let $X$ equal the number of female children in a three-child family. We shall use a chi-square goodness-of-fit statistic to test the null hypothesis that the distribution of $X$ is $\operatorname{Bin}\left(3, \frac{1}{2}\right)$.
1. Define the test statistic and critical region, using an $\alpha = 0.05$ significance level.
2. Among students who were taking a statistics course, $52$ came from families with three children. Let $x = 0, 1, 2$, and $3$ represent the number of female children, and for these $52$ families the frequencies for each possible outcome are $5, 17, 24$, and $6$, respectively. Calculate the value of the test statistic and state your conclusion.
Solution.
Testing the goodness-of-fit of this model means we need to test the hypothesis
$$
H_0\colon p_i = p_{i0},
\quad p_{i0} = P\left(\operatorname{Bin}\left(3, \frac{1}{2}\right) = i\right),
\quad 0 \leq i \leq 3.
$$
These are
$$
p_{00} = \frac{1}{8},
\quad p_{10} = \frac{3}{8},
\quad p_{20} = \frac{3}{8},
\quad p_{30} = \frac{1}{8}.
$$
There are $4$ probabilities to test, so $k = 4$. The test statistic is
$$
Q_{k-1}
= Q_3
= \sum_{i=0}^3 \frac{(x_i - np_{i0})^2}{np_{i0}},
$$
and we reject if $q_3 \geq \chi^2_{0.05}(3) = 7.815$.
Our observed test statistic is
$$
q_3
= \frac{\left(5 - 52 \cdot \frac{1}{8}\right)^2}{52 \cdot \frac{1}{8}}
+ \frac{\left(17 - 52 \cdot \frac{3}{8}\right)^2}{52 \cdot \frac{3}{8}}
+ \frac{\left(24 - 52 \cdot \frac{3}{8}\right)^2}{52 \cdot \frac{3}{8}}
+ \frac{\left(6 - 52 \cdot \frac{1}{8}\right)^2}{52 \cdot \frac{1}{8}}
= 1.7436,
$$
so we fail to reject $H_0$ at $\alpha = 0.05$.
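The statistic is easy to recompute by hand or in code. Below is a short Python sketch using only the standard library; the critical value $7.815$ is taken from the table lookup in the text, not computed, and the variable names are illustrative.

```python
from math import comb

observed = [5, 17, 24, 6]                 # families with 0, 1, 2, 3 girls
n = sum(observed)                         # 52 families
probs = [comb(3, i) / 8 for i in range(4)]  # Bin(3, 1/2) pmf: 1/8, 3/8, 3/8, 1/8
expected = [n * p for p in probs]         # 6.5, 19.5, 19.5, 6.5

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
q3 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical = 7.815                          # chi^2_{0.05}(3), from the table

print(f"q3 = {q3:.4f}")
print("reject H0" if q3 >= critical else "fail to reject H0")
```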
Problem 5
In the Michigan Lottery Daily3 Game, twice a day a three-digit integer is generated one digit at a time. Let $p_i$ denote the probability of generating digit $i$, $i = 0, 1, \ldots, 9$. Let $\alpha = 0.05$, and use the following 50 digits to test $H_0\colon p_0 = p_1 = \cdots = p_9 = \frac{1}{10}$.
$$
\begin{array}{rrrrrrrrrr}
1 & 6 & 9 & 9 & 3 & 8 & 5 & 0 & 6 & 7 \\
4 & 7 & 5 & 9 & 4 & 6 & 5 & 6 & 4 & 4 \\
4 & 8 & 0 & 9 & 3 & 2 & 1 & 5 & 4 & 5 \\
7 & 3 & 2 & 1 & 4 & 6 & 7 & 1 & 3 & 4 \\
4 & 8 & 8 & 6 & 1 & 6 & 1 & 2 & 8 & 8
\end{array}
$$
Solution.
We have $10$ probabilities to test, so $k = 10$. Thus, our test statistic is $Q_{k-1} = Q_9$ and we reject if $q_9 \geq \chi^2_{0.05}(9) = 16.92$. From counting the numbers in the list, our observed values are given by the following table.
$$
\begin{array}{r|rrrrrrrrrr}
x & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\text{count} & 2 & 6 & 3 & 4 & 9 & 5 & 7 & 4 & 6 & 4
\end{array}
$$
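Counting by hand is error-prone, so the tally can be verified with a few lines of Python. This sketch transcribes the 50 digits from the problem statement row by row and uses `collections.Counter` from the standard library.

```python
from collections import Counter

# The 50 digits from the problem statement, one row per line.
digits = [
    1, 6, 9, 9, 3, 8, 5, 0, 6, 7,
    4, 7, 5, 9, 4, 6, 5, 6, 4, 4,
    4, 8, 0, 9, 3, 2, 1, 5, 4, 5,
    7, 3, 2, 1, 4, 6, 7, 1, 3, 4,
    4, 8, 8, 6, 1, 6, 1, 2, 8, 8,
]

counts = Counter(digits)
table = [counts[d] for d in range(10)]  # counts for digits 0 through 9, in order
print(table)
```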
For each $p_i$, the expected number of observations is $50 \cdot \frac{1}{10} = 5$, so
$$
q_9 = \frac{(2 - 5)^2}{5} + \frac{(6 - 5)^2}{5} + \cdots + \frac{(4 - 5)^2}{5} = 7.6.
$$
Thus, we fail to reject $H_0$ at $\alpha = 0.05$.
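The full sum hidden behind the ellipsis can be written out in code. The following Python sketch recomputes $q_9$ from the observed counts; the critical value $16.92$ is the table value quoted in the solution rather than something computed here.

```python
counts = [2, 6, 3, 4, 9, 5, 7, 4, 6, 4]   # observed counts for digits 0..9
n = sum(counts)                            # 50 digits in total
expected = n / len(counts)                 # 5 per digit under H0

# Pearson chi-square statistic with equal expected counts.
q9 = sum((c - expected) ** 2 / expected for c in counts)
critical = 16.92                           # chi^2_{0.05}(9), from the table

print(f"q9 = {q9:.1f}")
print("reject H0" if q9 >= critical else "fail to reject H0")
```

The squared deviations are $9, 1, 4, 1, 16, 0, 4, 1, 1, 1$, which sum to $38$, and $38 / 5 = 7.6$.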