A student of architecture travels to France to visit the 92 cathedrals of the Roman Catholic Church in France. France is divided into 9 regions (excluding overseas territories). During his trips he visits 45 different cathedrals, and the number of cathedrals visited in each region is given in the table below. Use a chi-square goodness-of-fit test to test the hypothesis that a cathedral visited by the student is equally likely to be in any region of France.

An alternative hypothesis is that the probability that a cathedral in a region is visited by the student is proportional to the number of cathedrals in that region of France. Show that $E_i$, the expected number of cathedrals visited by the student in the $i$th region, is given by

$$E_i = \frac{r_i \sum_{j=1}^{k} n_j}{\sum_{j=1}^{k} r_j}, \quad \text{for } 1 \le i \le k.$$

Proof. Let $R = \sum_{j=1}^{k} r_j$. Under the alternative hypothesis, the probability that a cathedral in a region is visited by the student is proportional to the number of cathedrals in that region of France, so the probability distribution is

Region j        1        2        ...      k
----------------------------------------------
Probability     r_1/R    r_2/R    ...      r_k/R

The total number of cathedrals visited by the student is $N = \sum_{j=1}^{k} n_j$, so the expected number of cathedrals visited by the student in the $i$th region is $E_i = N r_i / R$, namely

$$E_i = \frac{r_i \sum_{j=1}^{k} n_j}{\sum_{j=1}^{k} r_j}, \quad \text{for } 1 \le i \le k,$$

where $n_i$ is the number of cathedrals visited by the student in the $i$th region, $r_i$ is the number of cathedrals in the $i$th region, and $k$ is the number of regions.

Hence use a chi-square goodness-of-fit test to test this alternative hypothesis.

Region (i)               1    2    3    4    5    6    7    8    9    Total
---------------------------------------------------------------------------
Number visited (n_i)     8    4    3    4    10   0    9    7    0    45
Number in region (r_i)   8    8    10   13   10   11   12   11   9    92

By the above proof, we can compute the expected number of cathedrals visited by the student in the $i$th region. The expected values are shown in parentheses after the observed counts in the following table.
Region (i)               1         2         3         4         5         6         7         8         9         Total
------------------------------------------------------------------------------------------------------------------------
Number visited (n_i)     8(3.91)   4(3.91)   3(4.89)   4(6.36)   10(4.89)  0(5.38)   9(5.87)   7(5.38)   0(4.40)   45
Number in region (r_i)   8         8         10        13        10        11        12        11        9         92

Then we can compute the chi-square test statistic as follows.

$$
\begin{aligned}
\chi^2 &= \sum_{i=1}^{k} \frac{(n_i - E_i)^2}{E_i} \\
&= \frac{(8-3.91)^2}{3.91} + \frac{(4-3.91)^2}{3.91} + \frac{(3-4.89)^2}{4.89} + \frac{(4-6.36)^2}{6.36} + \frac{(10-4.89)^2}{4.89} \\
&\quad + \frac{(0-5.38)^2}{5.38} + \frac{(9-5.87)^2}{5.87} + \frac{(7-5.38)^2}{5.38} + \frac{(0-4.40)^2}{4.40} \\
&= 4.278 + 0.002 + 0.73 + 0.876 + 5.34 + 5.38 + 1.67 + 0.488 + 4.40 \\
&= 23.164
\end{aligned}
$$

The number of degrees of freedom is $k - 1 = 9 - 1 = 8$. Looking up the p-value in the tables at http://duke.usask.ca/~rbaker/Tables.html, we get

$$\Pr(\chi^2 \ge 23.164) \approx 0.0032 < 0.05.$$

------------------------------------------------------------
Note: How to use that website? Visit the website above and click "Go". Then choose "p-value for chi-squared distribution", enter the chi-squared value 23.164 and the degrees of freedom 8, and click "Go". It returns the p-value $\Pr(\chi^2 \ge 23.164) \approx 0.0032$.
------------------------------------------------------------

So, at the 5% significance level, we should reject the hypothesis that the probability that a cathedral in a region is visited by the student is proportional to the number of cathedrals in that region of France.
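The calculation above can also be checked numerically. The following is a minimal Python sketch, not part of the original solution. The function names (`expected_counts`, `chi_square_statistic`, `chi_square_sf_even_df`) are hypothetical helpers introduced here for illustration. The p-value is computed with the closed-form tail probability of the chi-square distribution, which is valid only for an even number of degrees of freedom (here 8), so the website lookup is not needed.

```python
import math

# Data from the tables above: visited counts n_i and cathedrals per region r_i.
visited = [8, 4, 3, 4, 10, 0, 9, 7, 0]
in_region = [8, 8, 10, 13, 10, 11, 12, 11, 9]


def expected_counts(visited, in_region):
    """Return E_i = N * r_i / R, with N = sum of n_j and R = sum of r_j."""
    total_visited = sum(visited)      # N = 45
    total_in_region = sum(in_region)  # R = 92
    return [total_visited * r / total_in_region for r in in_region]


def chi_square_statistic(observed, expected):
    """Return the sum over regions of (n_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_sf_even_df(x, df):
    """Return Pr(X >= x) for a chi-square variable with an EVEN df.

    Uses the closed form exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


expected = expected_counts(visited, in_region)
statistic = chi_square_statistic(visited, expected)
df = len(visited) - 1  # k - 1 = 8
p_value = chi_square_sf_even_df(statistic, df)

print("expected counts:", [round(e, 2) for e in expected])
print("chi-square statistic:", round(statistic, 3))
print("degrees of freedom:", df)
print("p-value:", round(p_value, 4))
```

Rounded to two decimals, the expected counts agree with the values in parentheses in the table. Because the sketch keeps the unrounded $E_i$, its statistic comes out slightly below the hand-computed 23.164, which uses rounded expected values. The p-value stays at about 0.003, so the conclusion is unchanged.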