Efficiency and Relative Efficiency
of Tests. Chi-Square Tests
Scientific Seminar
“Asymptotic Statistics”
Olena Korzhevska
Table of Contents
• Relative Efficiency of Tests
– Asymptotic Power Functions. Consistency. Asymptotic
Relative Efficiency
• Efficiency of Tests
– Asymptotic Representation Theorem. Testing Normal
Means. Local Asymptotic Normality. One-Sample Location.
Two-Sample Problems
• Chi-Square Tests
– Quadratic Forms in Normal Vectors. Pearson Statistic.
Testing Independence. Goodness-of-Fit Tests. Asymptotic
Efficiency
14/02/2015
Olena Korzhevska. Asymptotic Statistics
Seminar
2/42
1. Relative Efficiency of Tests
Asymptotic Power Functions
• The relative efficiency of two sequences of tests is the
quotient of the numbers of observations needed with the two
tests to obtain the same level and power.
• Testing problem:
$H_0: \theta \in \Theta_0$ vs. $H_1: \theta \in \Theta_1$
• The power function of a test that rejects $H_0$ if a test statistic $T_n$ falls into a critical region $K_n$:
$\theta \mapsto \pi_n(\theta) = P_\theta(T_n \in K_n)$
• The test is of level $\alpha$ if its size $\sup\{\pi_n(\theta) : \theta \in \Theta_0\}$ does not exceed $\alpha$.
• The sequence of tests is asymptotically of level $\alpha$ if
$\limsup_{n\to\infty} \sup_{\theta \in \Theta_0} \pi_n(\theta) \le \alpha.$
Asymptotic Power Functions
• The test with power function $\pi_n$ is better than the test with power function $\pi_n'$ if both
$\pi_n(\theta) \le \pi_n'(\theta)$ for $\theta \in \Theta_0$,
$\pi_n(\theta) \ge \pi_n'(\theta)$ for $\theta \in \Theta_1$.
• Aim: to compare tests asymptotically.
• Consider two sequences of tests, with power functions $\pi_n$ and $\pi_n'$ (the tests in each sequence are of the same type).
Asymptotic Power Functions
• First idea: compare limiting power functions of the form
$\pi(\theta) = \lim_{n\to\infty} \pi_n(\theta).$
• Example (Sign test). $X_1, X_2, \ldots, X_n$ are random variables from a distribution with unique median $\theta$.
- Test: $H_0: \theta = 0$ vs. $H_1: \theta > 0$,
- Test statistic: $S_n = n^{-1} \sum_{i=1}^n 1\{X_i > 0\}$,
- Distribution function of the observations: $F(x - \theta)$,
- Mean and variance of $S_n$: $\mu(\theta) = 1 - F(-\theta)$ and $\sigma^2(\theta)/n = (1 - F(-\theta))\,F(-\theta)/n$, so that $\mu(0) = 1/2$ and $\sigma^2(0) = 1/4$,
- $\sqrt{n}\,(S_n - \mu(\theta)) \rightsquigarrow N(0, \sigma^2(\theta))$ asymptotically,
- under $H_0$: $\sqrt{n}\,(S_n - 1/2) \rightsquigarrow N(0, 1/4)$.
Asymptotic Power Functions
• Example (Sign test).
- The test that rejects $H_0$ if $\sqrt{n}\,(S_n - 1/2) > \tfrac{1}{2} z_\alpha$ has power function
$\pi_n(\theta) = P_\theta\big(\sqrt{n}\,(S_n - \mu(\theta)) > \tfrac{1}{2} z_\alpha - \sqrt{n}\,(\mu(\theta) - \mu(0))\big)$
$= 1 - \Phi\Big(\frac{\tfrac{1}{2} z_\alpha - \sqrt{n}\,(F(0) - F(-\theta))}{\sigma(\theta)}\Big) + o(1).$
- As $F(0) - F(-\theta) > 0$ for every $\theta > 0$, it follows that for $\alpha = \alpha_n \to 0$ sufficiently slowly
$\pi_n(\theta) \to 0$ if $\theta = 0$, and $\pi_n(\theta) \to 1$ if $\theta > 0$.
- In this case the limit power function corresponds to the perfect test
with all error probabilities equal to zero.
Asymptotic Power Functions
How do we compare tests?
We need to make the problem of discriminating between the
null and the alternative hypotheses more difficult as n increases.
It is natural to consider a shrinking alternative, that converges to
the null.
To test: $H_0: \theta = 0$ vs. $H_1: \theta = \theta_n > 0$, with $\theta_n \to 0$.
Example (Sign test, continued). (on a board)
In this situation a reasonable method for the asymptotic comparison of two sequences of tests is to consider local limiting power functions:
$\pi(h) = \lim_{n\to\infty} \pi_n(h/\sqrt{n}), \quad h \ge 0.$
Asymptotic Power Functions
Theorem: Suppose that $T_n$, $\mu$, and $\sigma$ are such that, for all $h$ and $\theta_n = h/\sqrt{n}$,
$\frac{\sqrt{n}\,(T_n - \mu(\theta_n))}{\sigma(\theta_n)} \rightsquigarrow N(0,1)$ under $\theta_n$,
where $\mu$ is differentiable at 0 and $\sigma$ is continuous at 0. Then the tests that reject $H_0: \theta = 0$ for large values of $T_n$ and are asymptotically of level $\alpha$ satisfy, for all $h$,
$\pi_n(h/\sqrt{n}) \to 1 - \Phi\Big(z_\alpha - h\,\frac{\mu'(0)}{\sigma(0)}\Big).$
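The limit in the theorem is easy to evaluate numerically. The following is a minimal sketch, assuming only the formula $1 - \Phi(z_\alpha - h\,\mu'(0)/\sigma(0))$; the function name `local_limiting_power` and the slope value 1.0 are hypothetical choices for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution N(0, 1)


def local_limiting_power(h, slope, alpha=0.05):
    """Limit of pi_n(h / sqrt(n)): 1 - Phi(z_alpha - h * slope).

    `slope` stands for mu'(0) / sigma(0); the name is illustrative.
    """
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)  # upper alpha-quantile
    return 1.0 - STD_NORMAL.cdf(z_alpha - h * slope)


# Hypothetical slope of 1.0: the power starts at alpha and increases with h.
for h in (0.0, 1.0, 2.0, 3.0):
    print(h, round(local_limiting_power(h, slope=1.0), 3))
```

At $h = 0$ the function returns the level $\alpha$. Doubling the slope has the same effect as doubling $h$, which is why the slope is the natural measure of quality here.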
Asymptotic Power Functions
Proof:
Substituting $h = 0$ shows that the asymptotic level of the test is $\alpha$ iff $H_0: \theta = 0$ is rejected for
$\frac{\sqrt{n}\,(T_n - \mu(0))}{\sigma(0)} > z_\alpha.$
Thus,
$\pi_n(\theta_n) = P_{\theta_n}\big(\sqrt{n}\,(T_n - \mu(0)) > \sigma(0)\, z_\alpha\big)$
$= P_{\theta_n}\Big(\frac{\sqrt{n}\,(T_n - \mu(\theta_n))}{\sigma(\theta_n)} > \frac{\sigma(0)\, z_\alpha - \sqrt{n}\,(\mu(\theta_n) - \mu(0))}{\sigma(\theta_n)}\Big)$
$\to 1 - \Phi\Big(z_\alpha - h\,\frac{\mu'(0)}{\sigma(0)}\Big),$
because $\sqrt{n}\,(\mu(\theta_n) - \mu(0)) \to h\,\mu'(0)$ by differentiability of $\mu$ at 0, and $\sigma(\theta_n) \to \sigma(0)$ by continuity of $\sigma$ at 0.
Asymptotic Power Functions
• $\mu'(0)/\sigma(0)$ is the slope of the sequence of tests.
• Example (Sign test): The sign test has slope $\mu'(0)/\sigma(0) = 2f(0)$.
• Example (t-test): $T_n = \bar{X}/S_n$, with $\sqrt{n}\,(\bar{X} - \theta)/S_n \rightsquigarrow N(0,1)$ under $\theta$. Reject $H_0$ if $\sqrt{n}\,T_n > z_\alpha$.
$\sqrt{n}\Big(\frac{\bar{X}}{S_n} - \frac{h/\sqrt{n}}{\sigma}\Big) = \frac{\sqrt{n}\,(\bar{X} - h/\sqrt{n})}{S_n} + h\Big(\frac{1}{S_n} - \frac{1}{\sigma}\Big) \rightsquigarrow N(0,1)$ under $\theta_n = h/\sqrt{n}$,
where $\sigma$ is the standard deviation of a single observation. Hence $\mu(\theta) = \theta/\sigma$ and $\sigma(\theta) = 1$, and the slope of the t-test is $\mu'(0)/\sigma(0) = 1/\sigma$.
Asymptotic Power Functions
Example (Sign test vs. t-test):
• $X_1, X_2, \ldots, X_n$ is a random sample from a density $f(x - \theta)$, where $f$ is symmetric about 0, has a unique median and a finite second moment.
• Test: $H_0: \theta = 0$, i.e., that the observations are symmetrically distributed around 0. Compare the performance of the sign test and the t-test.
• It suffices to compare the slopes of the two tests: $2f(0)$ and $1/\sigma$, respectively.
• For $N(0,1)$ the slopes are $\sqrt{2/\pi}$ and 1.
Asymptotic Power Functions
Relative efficiency of the sign test versus the t-test for some distributions.

DISTRIBUTION    EFFICIENCY (SIGN/T-TEST)
Logistic        $\pi^2/12$
Normal          $2/\pi$
Laplace         $2$
Uniform         $1/3$
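The table entries can be reproduced from the two slopes. This is a minimal sketch, assuming the relative efficiency is the squared ratio of the slopes $2f(0)$ and $1/\sigma$, that is $(2 f(0)\,\sigma)^2$. The function name and the particular scaling chosen for each distribution are illustrative assumptions; the ratio does not depend on the scale.

```python
import math


def sign_vs_t_efficiency(f0, sigma):
    """Squared ratio of the slopes 2*f(0) (sign test) and 1/sigma (t-test)."""
    return (2.0 * f0 * sigma) ** 2


# (density at the median f(0), standard deviation sigma) under a standard scaling
DISTRIBUTIONS = {
    "Logistic": (0.25, math.pi / math.sqrt(3.0)),
    "Normal": (1.0 / math.sqrt(2.0 * math.pi), 1.0),
    "Laplace": (0.5, math.sqrt(2.0)),
    "Uniform": (0.5, 1.0 / math.sqrt(3.0)),  # uniform on [-1, 1]
}

efficiencies = {
    name: sign_vs_t_efficiency(f0, sigma)
    for name, (f0, sigma) in DISTRIBUTIONS.items()
}

for name, value in efficiencies.items():
    print(f"{name:10s}{value:.4f}")
```

Values above 1 (Laplace) favour the sign test; values below 1 (normal, uniform, logistic) favour the t-test.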
Consistency
• Definition: A sequence of tests with power functions $\theta \mapsto \pi_n(\theta)$ is asymptotically consistent at level $\alpha$ against the alternative $\theta$ if it is asymptotically of level $\alpha$ and $\pi_n(\theta) \to 1$.
• If a family of sequences of tests contains for every level $\alpha$ a sequence that is consistent against every alternative, then the corresponding tests are simply called consistent.
Consistency
Lemma 1: Let $T_n$ be a sequence of statistics such that $T_n \to \mu(\theta)$ in $P_\theta$-probability for every $\theta$. Then the family of tests that reject the null hypothesis $H_0: \theta = 0$ for large values of $T_n$ is consistent against every $\theta$ such that $\mu(\theta) > \mu(0)$.
Lemma 2: Suppose that $T_n$, $\mu$, and $\sigma$ are such that, for all $h$ and $\theta_n = h/\sqrt{n}$,
$\frac{\sqrt{n}\,(T_n - \mu(\theta_n))}{\sigma(\theta_n)} \rightsquigarrow N(0,1)$ under $\theta_n$,
with $\mu'(0) > 0$, $\sigma$ continuous at 0 and $\sigma(0) > 0$. Suppose that the tests that reject $H_0$ for large values of $T_n$ have nondecreasing power functions $\theta \mapsto \pi_n(\theta)$. Then this family of tests is consistent against every alternative $\theta > 0$.
Moreover, if $\pi_n(0) \to \alpha$, then $\pi_n(\theta_n) \to \alpha$ when $\sqrt{n}\,\theta_n \to 0$, and $\pi_n(\theta_n) \to 1$ when $\sqrt{n}\,\theta_n \to \infty$.
Consistency
• Example (t-test):
The two-sample t-statistic $(\bar{Y}_n - \bar{X}_n)/S$ converges in probability to $E(Y - X)/\sigma$, where
$\sigma^2 = \lim_{n\to\infty} \operatorname{var}(\bar{Y}_n - \bar{X}_n).$
If the null hypothesis postulates that $EY = EX$, then the test that rejects the null hypothesis for large values of the t-statistic is consistent against every alternative for which $EY > EX$.
Asymptotic relative efficiency
• Sequences of tests can be ranked in quality by comparing their asymptotic power functions.
• For the test statistics we have seen so far, this comparison involves the "slopes" of the tests.
• The concept of relative efficiency yields a method to quantify the interpretation of the slopes.
Asymptotic relative efficiency
• Sequence of testing problems: $H_0: \theta = 0$ vs. $H_1: \theta = \theta_\nu$.
• Requirement: the tests need to attain asymptotic level $\alpha$ and power $\gamma \in (\alpha, 1)$.
• $\pi_n$ is the power function of a test if $n$ observations are available; $n_\nu$ is the minimal number of observations such that both
$\pi_{n_\nu}(0) \le \alpha$ and $\pi_{n_\nu}(\theta_\nu) \ge \gamma.$
• The limit (if it exists) $\lim_{\nu\to\infty} n_{\nu,2}/n_{\nu,1}$ is called the (asymptotic) relative efficiency or Pitman efficiency of the first sequence of tests with respect to the second one.
• A relative efficiency larger than 1 indicates that fewer observations are needed with the first sequence of tests, which may then be considered the better one.
Asymptotic relative efficiency
Theorem: Consider statistical models $(P_{n,\theta} : \theta \ge 0)$ such that $\|P_{n,\theta} - P_{n,0}\| \to 0$ as $\theta \to 0$, for every $n$.
Let $T_{n,1}$, $T_{n,2}$ be sequences of statistics such that
$\frac{\sqrt{n}\,(T_{n,i} - \mu_i(\theta_n))}{\sigma_i(\theta_n)} \rightsquigarrow N(0,1)$ under $\theta_n$, for every sequence $\theta_n \to 0$,
for functions $\mu_i$ differentiable at 0 with $\mu_i'(0) > 0$ and $\sigma_i$ continuous at 0 with $\sigma_i(0) > 0$, $i \in \{1, 2\}$. Then the relative efficiency of the tests that reject $H_0: \theta = 0$ for large values of $T_{n,i}$ is equal to
$\Big(\frac{\mu_1'(0)/\sigma_1(0)}{\mu_2'(0)/\sigma_2(0)}\Big)^2$
for every sequence of alternatives $\theta_\nu \downarrow 0$, independently of $\alpha > 0$ and $\gamma \in (\alpha, 1)$.
If the power functions of the tests based on $T_{n,i}$ are nondecreasing for every $n$, then the assumption of asymptotic normality of $T_{n,i}$ can be relaxed to asymptotic normality under every sequence $\theta_n = O(1/\sqrt{n})$ only.
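The "number of observations" reading of the theorem can be illustrated with the normal approximation to the power. This is a rough sketch, assuming the power at $\theta$ with $n$ observations is approximately $1 - \Phi(z_\alpha - \sqrt{n}\,\theta \cdot \text{slope})$; the function name, the alternative $\theta = 0.1$ and the target power 0.8 are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def approx_sample_size(theta, slope, alpha=0.05, gamma=0.8):
    """Approximate n with level alpha and power gamma at the alternative theta.

    Solves 1 - Phi(z_alpha - sqrt(n) * theta * slope) = gamma for n.
    """
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    z_gamma = STD_NORMAL.inv_cdf(gamma)
    return ((z_alpha + z_gamma) / (slope * theta)) ** 2


# Sign test (slope sqrt(2/pi)) versus t-test (slope 1) for N(theta, 1) data.
n_sign = approx_sample_size(theta=0.1, slope=math.sqrt(2.0 / math.pi))
n_t = approx_sample_size(theta=0.1, slope=1.0)

# Relative efficiency of the sign test with respect to the t-test: n_t / n_sign.
relative_efficiency = n_t / n_sign
print(round(n_sign), round(n_t), round(relative_efficiency, 4))
```

The ratio equals the squared ratio of the slopes whatever values of `theta`, `alpha` and `gamma` are plugged in, which matches the "independently of $\alpha$ and $\gamma$" clause of the theorem.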
2. Efficiency of Tests
Asymptotic Representation Theorem
• A randomized test (test function) $\phi$ in an experiment $(\mathcal{X}, \mathcal{A}, P_h : h \in H)$ is a measurable map $\phi: \mathcal{X} \to [0,1]$ on the sample space.
• The power function of a test $\phi$ is the function $h \mapsto \pi(h) = E_h \phi(X)$.
Theorem: Let the sequence of experiments $\mathcal{E}_n = (P_{n,h} : h \in H)$ converge to a dominated experiment $\mathcal{E} = (P_h : h \in H)$. Suppose that a sequence of power functions $\pi_n$ of tests in $\mathcal{E}_n$ converges pointwise, i.e., $\pi_n(h) \to \pi(h)$ for every $h$ and some arbitrary function $\pi$. Then $\pi$ is a power function in the limit experiment, i.e., there exists a test $\phi$ in $\mathcal{E}$ with $\pi(h) = E_h \phi(X)$ for every $h$.
Testing Normal Means
• Suppose $X$ is $N_k(h, \Sigma)$-distributed, with $\Sigma$ known and $h$ unknown.
• Test: $H_0: c^T h = 0$ vs. $H_1: c^T h > 0$, for a known vector $c$ with $c^T \Sigma c > 0$.
Proposition: The test that rejects $H_0$ if $c^T X > z_\alpha \sqrt{c^T \Sigma c}$ is uniformly most powerful at level $\alpha$ for testing $H_0: c^T h = 0$ vs. $H_1: c^T h > 0$, based on $X$.
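The test in the Proposition can be written out directly. This is a minimal sketch using plain Python lists; the helper names and the example vector `c` and matrix `cov` are hypothetical. The power formula uses the fact that $c^T X$ is $N(c^T h, c^T \Sigma c)$-distributed.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def quad_form(c, cov):
    """c^T Sigma c for a vector c and a covariance matrix given as a list of rows."""
    k = len(c)
    return sum(c[i] * cov[i][j] * c[j] for i in range(k) for j in range(k))


def rejects_h0(x, c, cov, alpha=0.05):
    """Reject H0: c^T h = 0 when c^T x > z_alpha * sqrt(c^T Sigma c)."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    stat = sum(ci * xi for ci, xi in zip(c, x))
    return stat > z_alpha * math.sqrt(quad_form(c, cov))


def power(h, c, cov, alpha=0.05):
    """P_h(reject) = 1 - Phi(z_alpha - c^T h / sqrt(c^T Sigma c))."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    shift = sum(ci * hi for ci, hi in zip(c, h)) / math.sqrt(quad_form(c, cov))
    return 1.0 - STD_NORMAL.cdf(z_alpha - shift)


# Hypothetical example: k = 2, contrast c = (1, -1).
c = [1.0, -1.0]
cov = [[1.0, 0.5], [0.5, 2.0]]
print(rejects_h0([3.0, 0.0], c, cov), round(power([2.0, 0.0], c, cov), 3))
```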
Local Asymptotic Normality
• If the model $(P_\theta : \theta \in \Theta)$ is differentiable in quadratic mean, then the local experiments converge to a Gaussian experiment (recall yesterday's last talk!):
$\big(P^n_{\theta_0 + h/\sqrt{n}} : h \in \mathbb{R}^k\big) \to \big(N(h, I_{\theta_0}^{-1}) : h \in \mathbb{R}^k\big)$
• The sequence of power functions $\theta \mapsto \pi_n(\theta)$ in the original experiments induces the sequence of power functions $h \mapsto \pi_n(\theta_0 + h/\sqrt{n})$ in the local experiments. Suppose $\pi_n(\theta_0 + h/\sqrt{n}) \to \pi(h)$ for every $h$ and some function $\pi$. Then, by the asymptotic representation theorem, this limit $\pi$ is a power function in the Gaussian limit experiment.
Local Asymptotic Normality
• Suppose $\theta$ is real and $\pi_n$ is of asymptotic level $\alpha$ for testing
$H_0: \theta \le \theta_0$ vs. $H_1: \theta > \theta_0$.
Then $\pi(0) = \lim_{n\to\infty} \pi_n(\theta_0) \le \alpha$, and hence $\pi$ corresponds to a level $\alpha$ test for
$H_0: h = 0$ vs. $H_1: h > 0$
in the limit experiment.
• By the Proposition for testing normal means, $\pi$ must be bounded by the power function of the uniformly most powerful level $\alpha$ test in the limit experiment. Thus, for every $h$ (with $c = 1$, $\Sigma = I_{\theta_0}^{-1}$ in the Proposition),
$\lim_{n\to\infty} \pi_n(\theta_0 + h/\sqrt{n}) \le 1 - \Phi\big(z_\alpha - h\sqrt{I_{\theta_0}}\big).$
Local Asymptotic Normality
• As stated earlier, a sequence of power functions with
$\pi_n(\theta_0 + h/\sqrt{n}) \to 1 - \Phi(z_\alpha - hs)$
for every $h$ has slope $s$. From the upper bound, $\sqrt{I_{\theta_0}}$ is the largest possible slope.
• The relative efficiency of the best test and the test with slope $s$ is
$I_{\theta_0}/s^2,$
which can be interpreted as the number of observations needed with the given sequence of tests with slope $s$, divided by the number of observations needed with the best test to obtain the same power.
Local Asymptotic Normality
Theorem 15.4:
Let $\Theta \subset \mathbb{R}^k$ be open and let $\psi: \Theta \to \mathbb{R}$ be differentiable at $\theta_0$, with nonzero gradient $\dot\psi_{\theta_0}$ and such that $\psi(\theta_0) = 0$. Let $(P_{n,\theta} : \theta \in \Theta)$ be locally asymptotically normal at $\theta_0$ with nonsingular Fisher information $I_{\theta_0}$ and norming constants $r_n \to \infty$.
Then the power functions $\theta \mapsto \pi_n(\theta)$ of any sequence of level $\alpha$ tests for testing
$H_0: \psi(\theta) \le 0$ vs. $H_1: \psi(\theta) > 0$ satisfy, for every $h$ such that $\dot\psi_{\theta_0} h > 0$,
$\limsup_{n\to\infty} \pi_n\Big(\theta_0 + \frac{h}{r_n}\Big) \le 1 - \Phi\Big(z_\alpha - \frac{\dot\psi_{\theta_0} h}{\sqrt{\dot\psi_{\theta_0} I_{\theta_0}^{-1} \dot\psi_{\theta_0}^T}}\Big).$
Local Asymptotic Normality
Addendum:
Let $T_n$ be statistics such that
$T_n = \frac{\dot\psi_{\theta_0} I_{\theta_0}^{-1} \Delta_{n,\theta_0}}{\sqrt{\dot\psi_{\theta_0} I_{\theta_0}^{-1} \dot\psi_{\theta_0}^T}} + o_{P_{n,\theta_0}}(1).$
Then the sequence of tests that reject $H_0$ for values of $T_n$ exceeding $z_\alpha$ is asymptotically optimal in the sense that, for every $h$,
$P_{\theta_0 + r_n^{-1} h}(T_n \ge z_\alpha) \to 1 - \Phi\Big(z_\alpha - \frac{\dot\psi_{\theta_0} h}{\sqrt{\dot\psi_{\theta_0} I_{\theta_0}^{-1} \dot\psi_{\theta_0}^T}}\Big).$
*($\Delta_{n,\theta_0}$ is a sequence of statistics that converges in distribution under $\theta_0$ to a normal $N_k(0, I_{\theta_0})$-distribution.)
Local Asymptotic Normality
• The point $\theta_0$ in the theorem is on the boundary of $H_0$ and $H_1$.
• If the dimension $k > 1$, then this boundary is $(k-1)$-dimensional, and there are many possible values for $\theta_0$.
• If the dimension $k = 1$, the boundary point $\theta_0$ is typically unique and hence known, and we could use $T_n = I_{\theta_0}^{-1/2} \Delta_{n,\theta_0}$ to construct an optimal sequence of tests for the problem $H_0: \theta = \theta_0$. These are known as score tests.
One-Sample Location
• $X_1, X_2, \ldots, X_n$ is a sample from a density $f(x - \theta)$, where $f$ is symmetric about 0 and has finite Fisher information $I_f$; $f$ may be known or (partially) unknown.
• To test: $H_0: \theta = 0$ vs. $H_1: \theta > 0$.
• For fixed $f$, $\big(\prod_{i=1}^n f(x_i - \theta) : \theta \in \mathbb{R}\big)$ is locally asymptotically normal at $\theta = 0$ with $\Delta_{n,0} = -n^{-1/2} \sum_{i=1}^n (f'/f)(X_i)$, norming rate $\sqrt{n}$, and Fisher information $I_f$.
• From the preceding sections, the best asymptotic level $\alpha$ power function for known $f$ is $1 - \Phi\big(z_\alpha - h\sqrt{I_f}\big)$.
• $T_n = -\frac{1}{\sqrt{n}} \frac{1}{\sqrt{I_f}} \sum_{i=1}^n \frac{f'}{f}(X_i) + o_{P_0}(1)$
• Then, according to Theorem 15.4, the sequence of tests that reject $H_0$ if $T_n > z_\alpha$ attains the bound and hence is asymptotically optimal.
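The leading term of $T_n$ is straightforward to compute once the score function $f'/f$ and $I_f$ are known. This is a minimal sketch, assuming a fully known $f$ and dropping the $o_{P_0}(1)$ remainder; the function names and sample values are hypothetical. The standard normal ($f'/f(x) = -x$, $I_f = 1$) and the standard logistic ($f'/f(x) = -\tanh(x/2)$, $I_f = 1/3$) serve as examples.

```python
import math


def score_statistic(xs, score, fisher_info):
    """T_n = -(n * I_f)^(-1/2) * sum_i (f'/f)(X_i), without the o_P(1) term."""
    n = len(xs)
    return -sum(score(x) for x in xs) / math.sqrt(n * fisher_info)


def normal_score(x):
    """f'/f for the standard normal density; Fisher information is 1."""
    return -x


def logistic_score(x):
    """f'/f for the standard logistic density; Fisher information is 1/3."""
    return -math.tanh(x / 2.0)


sample = [0.3, -0.1, 1.2, 0.8, -0.4]  # hypothetical observations
t_normal = score_statistic(sample, normal_score, 1.0)
t_logistic = score_statistic(sample, logistic_score, 1.0 / 3.0)
print(round(t_normal, 3), round(t_logistic, 3))
```

For the normal score the statistic reduces to $\sqrt{n}\,\bar{X}_n$. The logistic score is bounded, so single large observations have limited influence on the statistic.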
One-Sample Location
Example (t-test):
The standard normal density $f_0$ possesses score function $(f_0'/f_0)(x) = -x$ and Fisher information $I_{f_0} = 1$. Consequently, if the underlying distribution is normal, then the optimal test statistics should satisfy $T_n = \sqrt{n}\,\bar{X}_n/\sigma + o_{P_0}(n^{-1/2})$.
The t-statistics $\bar{X}_n/S_n$* fulfill the requirements. That is the case because for normally distributed observations the t-test is uniformly most powerful for every finite $n$ and hence is certainly asymptotically optimal.
*The t-statistic simply replaces the unknown standard deviation $\sigma$ by an estimate $S_n$.
One-Sample Location
In this example, the t-statistic simply replaces the unknown standard deviation $\sigma$ by an estimate. This approach can be followed for most scale families. Under some regularity conditions, the statistics
$T_n = -\frac{1}{\sqrt{n}} \frac{1}{\sqrt{I_{f_0}}} \sum_{i=1}^n \frac{f_0'}{f_0}\Big(\frac{X_i}{\hat\sigma_n}\Big)$
should yield asymptotically optimal tests, given a consistent sequence of scale estimators $\hat\sigma_n$.
3. Chi-Square Tests
Quadratic Forms in Normal Vectors
• $\chi^2_k$ is the distribution of $\sum_{i=1}^k Z_i^2$ for i.i.d. $N(0,1)$-distributed $Z_1, Z_2, \ldots, Z_k$.
• $\sum_{i=1}^k Z_i^2 = \|Z\|^2$ is the squared norm of the standard normal vector $Z = (Z_1, \ldots, Z_k)$.
Lemma: If the vector $X$ is $N_k(0, \Sigma)$-distributed, then $\|X\|^2$ is distributed as $\sum_{i=1}^k \lambda_i Z_i^2$ for i.i.d. $N(0,1)$-distributed $Z_1, \ldots, Z_k$ and $\lambda_1, \ldots, \lambda_k$ the eigenvalues of $\Sigma$.
Proof: There exists an orthogonal matrix $O$ such that $O \Sigma O^T = \operatorname{diag}(\lambda_i)$. Then the vector $OX \sim N_k(0, \operatorname{diag}(\lambda_i))$, which is the same as the distribution of the vector $(\sqrt{\lambda_1} Z_1, \ldots, \sqrt{\lambda_k} Z_k)$. Now $\|X\|^2 = \|OX\|^2$ has the same distribution as $\sum_{i=1}^k \lambda_i Z_i^2$.
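One consequence of the lemma is that the mean and variance of $\|X\|^2$ depend on $\Sigma$ only through its eigenvalues: since $E Z_i^2 = 1$ and $\operatorname{var} Z_i^2 = 2$, they equal $\sum_i \lambda_i$ and $2 \sum_i \lambda_i^2$. This is a small sketch for $k = 2$, where the eigenvalues have a closed form; the covariance matrix is a hypothetical example.

```python
import math


def eigenvalues_2x2(s11, s12, s22):
    """Eigenvalues of the symmetric matrix [[s11, s12], [s12, s22]]."""
    centre = (s11 + s22) / 2.0
    radius = math.hypot((s11 - s22) / 2.0, s12)
    return centre - radius, centre + radius


# Hypothetical covariance matrix Sigma = [[2, 1], [1, 2]].
lam1, lam2 = eigenvalues_2x2(2.0, 1.0, 2.0)

mean_sq_norm = lam1 + lam2                    # E ||X||^2 = sum of eigenvalues
var_sq_norm = 2.0 * (lam1 ** 2 + lam2 ** 2)   # var ||X||^2 = 2 * sum of squares

print(lam1, lam2, mean_sq_norm, var_sq_norm)
```

The mean equals the trace of $\Sigma$, as it must. When all eigenvalues are 0 or 1, the sum reduces to a chi-square variable with as many degrees of freedom as there are unit eigenvalues.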
Pearson Statistics
• Suppose we observe $X_n = (X_{n,1}, \ldots, X_{n,k})$ with multinomial distribution corresponding to $n$ trials and $k$ classes having probabilities $p = (p_1, \ldots, p_k)$.
• The Pearson statistic for testing $H_0: p = a$ is given by
$C_n(a) = \sum_{i=1}^k \frac{(X_{n,i} - n a_i)^2}{n a_i}.$
Theorem: If the vector $X_n$ is multinomially distributed with parameters $n$ and $a = (a_1, \ldots, a_k) > 0$, then the sequence $C_n(a)$ converges in distribution to the $\chi^2_{k-1}$-distribution under $a$.
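The Pearson statistic is a direct sum over the $k$ cells. This is a minimal sketch; the function name and the die-roll counts are hypothetical.

```python
def pearson_statistic(counts, probs):
    """C_n(a) = sum_i (X_i - n * a_i)^2 / (n * a_i) for multinomial counts X."""
    n = sum(counts)
    return sum((x - n * a) ** 2 / (n * a) for x, a in zip(counts, probs))


# Hypothetical data: 60 rolls of a die, null hypothesis a_i = 1/6 for every face.
counts = [8, 12, 9, 11, 14, 6]
probs = [1.0 / 6.0] * 6
c_n = pearson_statistic(counts, probs)
print(round(c_n, 3))
```

By the theorem, `c_n` would be compared with a quantile of the $\chi^2_5$-distribution here, since $k - 1 = 5$; the upper 5% point is about 11.07.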
Pearson Statistics
• The Pearson statistic is oddly asymmetric in the observed and true frequencies (which is motivated by the form of the asymptotic covariance matrix).
• One method to symmetrize the statistic leads to the Hellinger statistic
$H_n^2(a) = 4 \sum_{i=1}^k \frac{(X_{n,i} - n a_i)^2}{\big(\sqrt{X_{n,i}} + \sqrt{n a_i}\big)^2} = 4 \sum_{i=1}^k \big(\sqrt{X_{n,i}} - \sqrt{n a_i}\big)^2.$
• Up to a multiplicative constant, this is the Hellinger distance between the discrete probability distributions on $\{1, \ldots, k\}$ with probability vectors $a$ and $X_n/n$, respectively.
• As $(X_n/n - a) \to 0$ in probability, $H_n^2$ is asymptotically equivalent to $C_n$.
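The two expressions for the Hellinger statistic, and its closeness to the Pearson statistic, can be checked numerically. This is a minimal sketch; the function names and counts are hypothetical.

```python
import math


def hellinger_statistic(counts, probs):
    """H_n^2(a) = 4 * sum_i (sqrt(X_i) - sqrt(n * a_i))^2."""
    n = sum(counts)
    return 4.0 * sum(
        (math.sqrt(x) - math.sqrt(n * a)) ** 2 for x, a in zip(counts, probs)
    )


def hellinger_ratio_form(counts, probs):
    """The same statistic written as 4 * sum (X_i - n a_i)^2 / (sqrt(X_i) + sqrt(n a_i))^2."""
    n = sum(counts)
    return 4.0 * sum(
        (x - n * a) ** 2 / (math.sqrt(x) + math.sqrt(n * a)) ** 2
        for x, a in zip(counts, probs)
    )


def pearson_statistic(counts, probs):
    """C_n(a) = sum_i (X_i - n * a_i)^2 / (n * a_i)."""
    n = sum(counts)
    return sum((x - n * a) ** 2 / (n * a) for x, a in zip(counts, probs))


counts = [8, 12, 9, 11, 14, 6]  # hypothetical die-roll counts, n = 60
probs = [1.0 / 6.0] * 6
h2 = hellinger_statistic(counts, probs)
c_n = pearson_statistic(counts, probs)
print(round(h2, 3), round(c_n, 3))
```

The two forms agree because $(x - y) = (\sqrt{x} - \sqrt{y})(\sqrt{x} + \sqrt{y})$. When $X_{n,i} \approx n a_i$, the denominator $(\sqrt{X_{n,i}} + \sqrt{n a_i})^2$ is close to $4 n a_i$, which is where the factor 4 and the asymptotic equivalence with $C_n$ come from.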
Testing Independence
• Suppose that each element of a population can be classified by two characteristics, having $k$ and $r$ levels, respectively:
$$\begin{array}{ccc|c}
N_{11} & \cdots & N_{1r} & N_{1\cdot} \\
\vdots & \ddots & \vdots & \vdots \\
N_{k1} & \cdots & N_{kr} & N_{k\cdot} \\
\hline
N_{\cdot 1} & \cdots & N_{\cdot r} & N
\end{array}$$
• The classification of a random sample of size $n$ from the population gives a matrix $X_{n,ij}$, multinomially distributed with parameters $n$ and probabilities $p_{ij} = N_{ij}/N$.
• $H_0: p_{ij} = a_i b_j$ (the categories are independent) for unknown probability vectors $a_i$ and $b_j$.
Testing Independence
• The ML estimators of $a$ and $b$ under $H_0$:
$\hat{a}_i = X_{n,i\cdot}/n$ and $\hat{b}_j = X_{n,\cdot j}/n$
• Modified Pearson statistic with these estimators:
$C_n(\hat{a}_n \otimes \hat{b}_n) = \sum_{i=1}^k \sum_{j=1}^r \frac{(X_{n,ij} - n \hat{a}_i \hat{b}_j)^2}{n \hat{a}_i \hat{b}_j}$
Corollary: If the $(k \times r)$ matrices $X_n$ are multinomially distributed with parameters $n$ and $p_{ij} = a_i b_j > 0$, then the sequence $C_n(\hat{a}_n \otimes \hat{b}_n)$ converges in distribution to the $\chi^2_{(k-1)(r-1)}$-distribution.
Testing Independence
Example: Google wants to test the performance of new search algorithms. Google might test three algorithms using a sample of 10,000 google.com search queries.

Search algorithm    current   test 1   test 2    Total
No new search          3511     1749     1818     7078
New search             1489      751      682     2922
Total                  5000     2500     2500    10000

• To test:
$H_0$: The algorithms each perform equally well.
$H_1$: The algorithms do not perform equally well.
Testing Independence
Example: ML estimators for $a$ and $b$: $\hat{a}_i = X_{n,i\cdot}/n$, $\hat{b}_j = X_{n,\cdot j}/n$.
• $n \hat{a}_i \hat{b}_j = X_{n,i\cdot} \cdot X_{n,\cdot j}/n$ is the expected count of each cell $(i, j)$, shown in parentheses:

Search algorithm    current        test 1          test 2           Total
No new search       3511 (3539)    1749 (1769.5)   1818 (1769.5)     7078
New search          1489 (1461)     751 (730.5)     682 (730.5)      2922
Total               5000           2500            2500             10000

• $C_n(\hat{a}_n \otimes \hat{b}_n) = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}} = 6.120$
• $df = (k-1)(r-1) = (2-1)(3-1) = 2$
• p-value = 0.047, thus we reject $H_0$ at significance level $\alpha = 0.05$. That is, the data provide convincing evidence that there is some difference in performance among the algorithms.
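The numbers in this example can be reproduced in a few lines. This is a minimal sketch using only the standard library; the function name is hypothetical. The p-value line relies on the fact that the $\chi^2_2$ survival function is $e^{-x/2}$, so it is valid only for two degrees of freedom.

```python
import math


def independence_chi_square(table):
    """Modified Pearson statistic and degrees of freedom for a k x r count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # n * a_i-hat * b_j-hat
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: "No new search", "New search"; columns: current, test 1, test 2.
table = [[3511, 1749, 1818], [1489, 751, 682]]
stat, df = independence_chi_square(table)

# Survival function of the chi-square distribution with 2 degrees of freedom.
p_value = math.exp(-stat / 2.0)
print(round(stat, 3), df, round(p_value, 3))
```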
Goodness-of-Fit Tests
• Given a random sample $X_1, X_2, \ldots, X_n$ from a distribution $P$, we want to test $H_0: P \in \mathcal{P}_0$.
• Testing goodness-of-fit typically focuses on no particular alternative, which is why $\chi^2$ statistics are reasonable.
• Partition $\mathcal{X} = \cup_j \mathcal{X}_j$ of the sample space into finitely many sets.
• $\mathbb{P}_n(A) = n^{-1} \#(1 \le i \le n : X_i \in A)$ is the fraction of observations in $A$.
• The vector $n(\mathbb{P}_n(\mathcal{X}_1), \ldots, \mathbb{P}_n(\mathcal{X}_k))$ is multinomially distributed; the modified chi-squared statistic is given by
$\sum_{i=1}^k \frac{n\big(\mathbb{P}_n(\mathcal{X}_i) - P(\mathcal{X}_i)\big)^2}{P(\mathcal{X}_i)}.$
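The partition step can be sketched as follows. This is an illustration under a simplifying assumption: the null hypothesis is taken to be simple (a single, fully specified $P$), so the cell probabilities are known rather than estimated. The cut points, the uniform null on $[0, 1)$ and the sample are all hypothetical.

```python
from bisect import bisect_right


def cell_counts(sample, cut_points):
    """Counts per cell for the partition (-inf, c1), [c1, c2), ..., [c_last, inf)."""
    counts = [0] * (len(cut_points) + 1)
    for x in sample:
        counts[bisect_right(cut_points, x)] += 1
    return counts


def chi_square_gof(sample, cut_points, cell_probs):
    """sum_i n * (P_n(cell_i) - P(cell_i))^2 / P(cell_i) for a fully specified P."""
    n = len(sample)
    fractions = [c / n for c in cell_counts(sample, cut_points)]
    return sum(n * (f - p) ** 2 / p for f, p in zip(fractions, cell_probs))


# Hypothetical null: uniform distribution on [0, 1), four cells of probability 1/4.
cut_points = [0.25, 0.5, 0.75]
cell_probs = [0.25, 0.25, 0.25, 0.25]
sample = [0.05, 0.12, 0.31, 0.33, 0.48, 0.52, 0.77, 0.91]

counts = cell_counts(sample, cut_points)
stat = chi_square_gof(sample, cut_points, cell_probs)
print(counts, stat)
```

With a composite null hypothesis the cell probabilities would have to be replaced by estimates, which changes the limiting distribution of the statistic; that case is not covered by this sketch.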
Asymptotic Efficiency
• The asymptotic null distributions of various versions of the
Pearson statistic enable us to set critical values but by
themselves do not give information on the asymptotic power
of the tests.
• The asymptotic power can be measured in various ways:
– The most important method is to consider local limiting power functions (discussed earlier).
– A second method to evaluate the asymptotic power is by Bahadur efficiencies.
Thank you for your attention.