Tutorial 1 – Basic concepts, probabilities and frequencies

F73PS1 - Term 1 - 2006-7
Solutions
1.
Consider the following probabilities and discuss which of them can be interpreted as
frequencies. If so, describe the repeatable experiment to which the frequency refers.
Otherwise, give a reason why the probability has no frequentist interpretation.
i)
the probability that you pass this module;
ii)
the probability that a person randomly selected from the population scores 20
with a single throw of a dart.
iii)
the probability that you score 20 with a single throw of a dart. (What might
happen if you tried to repeat the experiment?)
iv)
the probability of scoring a double 6 with a pair of dice.
v)
the probability that the human race will become extinct within 1000 years due to
global warming.
Solution:
i) There is no repeatable experiment here, since the proposition refers to a set of
circumstances which is not repeatable. Therefore there is no frequentist interpretation to
this probability.
ii) There is a repeatable experiment here. We could repeatedly sample from the population
and the probability has an interpretation as a frequency.
iii) Arguably this experiment is not repeatable under identical conditions because each time
you throw the dart, your competence may be expected to increase due to practice effects.
iv) Yes - there is a clear interpretation as a frequency, since you could repeatedly throw the
dice under identical conditions.
v) This is definitely a one-off event and there is no interpretation of the probability as a
frequency.
2. Let S be the sample space for an experiment and let P() define a probability function on
the subsets of S. Use the axioms of probability to convince yourself that:
i)
P(E) ≤ 1, for any event E.
ii)
If E = {s1, s2, s3, …, sk} then P(E) = P(s1) + P(s2) + … + P(sk)
iii)
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Solution:
The axioms (see notes) state that: A) P(E) ≥ 0 for all events E; B) P(S) = 1; C) if E and F
have no outcomes in common then P(E ∪ F) = P(E) + P(F). This leads to the following
solutions for i)-iii) above.
i) Let E be any event (subset of sample space S) and let F be the set of all outcomes that are
not in E. (Notation: F = S\E or E^c). Then E and F are disjoint and E ∪ F = S. It follows from
B and C above that
P(E) + P(F) = 1,
so that P(E) = 1 - P(F). Moreover, P(F) ≥ 0 by A, so that P(E) ≤ 1.
ii) We can write E = {s1} ∪ {s2} ∪ {s3} ∪ ... ∪ {sk}, a union of disjoint events. Applying
axiom C to this situation, it follows that P(E) = P(s1) + P(s2) + ... + P(sk).
iii) Let B\A denote the set of elements that are in B but not in A and let A\B denote the set of
elements in A but not in B. Then A ∪ B = (A\B) ∪ (A ∩ B) ∪ (B\A) and these three sets are
disjoint from each other.
From C) we have that
P(A) = P(A\B) + P(A ∩ B)
P(B) = P(B\A) + P(A ∩ B)
P(A ∪ B) = P(A\B) + P(B\A) + P(A ∩ B).
Substituting P(A\B) = P(A) - P(A ∩ B) and P(B\A) = P(B) - P(A ∩ B) into the last line gives
P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
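The identity in iii) can be checked numerically on a small example. The sketch below assumes a hypothetical sample space (one roll of a fair die) and two illustrative events A and B chosen only for this check; it is not part of the proof.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair die, each outcome has probability 1/6.
S = {1, 2, 3, 4, 5, 6}
prob = {s: Fraction(1, 6) for s in S}

def P(event):
    """Probability of an event (a subset of S), summing outcome probabilities (result ii)."""
    return sum((prob[s] for s in event), Fraction(0))

A = {2, 4, 6}   # illustrative event: the score is even
B = {4, 5, 6}   # illustrative event: the score is at least 4

# The three disjoint pieces used in the proof: A\B, A ∩ B and B\A.
pieces = [A - B, A & B, B - A]
assert set().union(*pieces) == A | B

lhs = P(A | B)                    # P(A ∪ B)
rhs = P(A) + P(B) - P(A & B)      # P(A) + P(B) - P(A ∩ B)
print(lhs, rhs)
```

Exact fractions are used so that the two sides can be compared without rounding error.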
3.
Random sampling from a finite population. Suppose that we have n objects of which r
share a particular characteristic. Suppose we select an object at random in such a
way that each of the objects is equally likely to be chosen.
i)
Let A denote the event that the selected object has the characteristic. Show from
the axioms that P(A) = r/n.
Solution: There are n possible outcomes from this experiment corresponding to the n
different objects that can be drawn. Now r of these, {s1, ..., sr} say, form the event A. For
any outcome s, P(s) = 1/n since all objects are equally likely. It follows that
P(A) = P(s1) + P(s2) + ... + P(sr) = r/n.
ii)
Now suppose that the first object is not replaced and a second object is then
drawn. Let B denote the event that the second object drawn has the characteristic.
Discuss whether the events A and B are independent.
Solution: If A occurs then only (r-1) of the remaining (n-1) objects have the property, so
the probability that B occurs in this case is (r-1)/(n-1). If A doesn't occur then
after the removal of the first object there are still r objects with the property, in which case
the probability that B occurs is r/(n-1). Therefore the probability that B occurs differs
depending on whether A has occurred or not, so A and B are not independent.
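The argument above can be checked by brute force for a small population. The following is a minimal sketch, assuming hypothetical values n = 10 and r = 3 and labelling the objects so that the first r have the characteristic; the function name and dictionary keys are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def two_draw_probabilities(n, r):
    """Exact probabilities for two draws without replacement (requires 0 < r < n).

    Objects are labelled 0..n-1 and, by assumption, objects 0..r-1 have the characteristic.
    """
    outcomes = list(permutations(range(n), 2))  # ordered pairs of distinct objects
    total = len(outcomes)                       # n * (n - 1) equally likely outcomes

    def has(obj):
        return obj < r

    p_a = Fraction(sum(has(first) for first, _ in outcomes), total)
    p_b = Fraction(sum(has(second) for _, second in outcomes), total)
    p_ab = Fraction(sum(has(f) and has(s) for f, s in outcomes), total)
    return {
        "P(A)": p_a,
        "P(B)": p_b,
        "P(A and B)": p_ab,
        "P(B given A)": p_ab / p_a,
        "P(B given not A)": (p_b - p_ab) / (1 - p_a),
    }

results = two_draw_probabilities(10, 3)  # hypothetical values of n and r
for name, value in results.items():
    print(name, "=", value)
```

For independent events P(A and B) would equal P(A) × P(B); here the two conditional probabilities differ, so the product rule fails.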
iii)
Discuss the circumstances under which we might reasonably claim that A and B
are independent. How do they depend on the values of n and r?
Solution: Suppose n and r are large, e.g. n = 1000 and r = 200. Then the probability that B
occurs is 199/999 if A has occurred and 200/999 if A has not occurred. Both of these
probabilities are very close to 0.2, so the probability that B occurs is not significantly
affected by whether A has or hasn't occurred, and we can reasonably treat the two events as
approximately independent. In general the two conditional probabilities differ by 1/(n-1),
which is small compared with r/n when n and r are both large.
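To see how quickly the dependence fades, the two conditional probabilities can be tabulated for growing populations. This is a small sketch; apart from n = 1000, r = 200, the population sizes are hypothetical and keep r/n fixed at 0.2.

```python
from fractions import Fraction

def conditional_probs(n, r):
    """Return (P(B given A), P(B given not A)) for sampling without replacement."""
    return Fraction(r - 1, n - 1), Fraction(r, n - 1)

# Hypothetical populations with r/n = 0.2 throughout.
for n, r in [(10, 2), (100, 20), (1000, 200), (100000, 20000)]:
    given_a, given_not_a = conditional_probs(n, r)
    gap = given_not_a - given_a   # always equal to 1/(n-1)
    print(f"n={n:6d}  r={r:5d}  {float(given_a):.5f}  {float(given_not_a):.5f}  gap={float(gap):.6f}")
```

The gap shrinks as n grows, which is why treating A and B as independent becomes a reasonable approximation for large populations.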
4. 1000 randomly selected school pupils from Edinburgh and 1000 randomly selected school
pupils from Glasgow take a test of mathematical ability. Out of the 10 highest-scoring
students, 2 are from Edinburgh and 8 are from Glasgow. Suppose that the variation in
ability to undertake the test in pupils is identical in the two cities.
i)
Under this assumption what is the probability that the best student is from
Glasgow?
Solution: If the variation in ability is identical then the best student can be considered to be a
random draw from the set of all 2000 and, by the results of question 3, the probability should
be 1/2.
ii)
What is the probability that the 2nd best student is from Glasgow: i) when the best
student is from Edinburgh? ii) when the best student is from Glasgow?
Solution: Again, under the assumption of identical variation in ability in the two groups, the
second best student is equally likely to be any of the 1999 pupils who remain after the best
student is identified. Therefore in case i) the probability is 1000/1999, and in case ii) it is
999/1999. Both of these probabilities are approximately 1/2.
iii)
Explain qualitatively why (roughly speaking) the number of students from
Glasgow in the top 10 has the same probability function as the number of heads
obtained when a fair coin is tossed ten times.
Solution: Repeating the logic of ii), we can see that the probability that
the next best student is from Glasgow is only weakly affected by the city of origin of the
students above them. For the 10th student, the probability that they are from Glasgow
ranges from 991/1991 (= 0.498) to 1000/1991 (= 0.502), depending on how many students
from Glasgow are placed above them. Therefore we can think of the origin of the person in
ith position, i = 1, 2, ..., 10, as being selected with equal probability from the set {Edinburgh,
Glasgow}, regardless of who is above them.
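The quality of this approximation can be examined directly. The sketch below assumes, as in the solution, that under identical variation the top 10 is a random selection of 10 pupils from the 2000, so that the exact number from Glasgow follows a hypergeometric distribution; the function names are illustrative.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(k successes) when drawing `draws` objects without replacement."""
    return comb(successes, k) * comb(population - successes, draws - k) / comb(population, draws)

def binom_pmf(k, n, p):
    """P(k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exact distribution of the number of Glasgow pupils in the top 10 (1000 per city)
# compared with the coin-tossing approximation Binomial(10, 0.5).
exact = [hypergeom_pmf(k, 2000, 1000, 10) for k in range(11)]
approx = [binom_pmf(k, 10, 0.5) for k in range(11)]

for k in range(11):
    print(f"{k:2d}  exact={exact[k]:.5f}  binomial={approx[k]:.5f}")
```

The two columns should agree closely, reflecting how little each selection changes the composition of the remaining pupils.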
iv)
Calculate a p-value to quantify the strength of evidence against the hypothesis that
there is no difference in mathematical ability of pupils in the two cities.
Solution: Our outcome of 8 from Glasgow out of 10 in the top 10 looks a little extreme
(perhaps) under the hypothesis that the variation is identical in the two cities. Under this
hypothesis, the number of Glasgow students in the top 10, which we call X, follows a
Binomial(10, 0.5) distribution by the logic of part iii). To get a p-value we need to calculate
the probability that we obtain an observation at least as extreme as the current one. This is
P(X ≤ 2) + P(X ≥ 8) = 2P(X ≤ 2) (by symmetry).
From tables this probability is 2 × 0.0547 = 0.109. When the variation is identical in the two
cities we would get an observation at least as extreme around 11% of the time. This does not
represent strong evidence against the hypothesis of equal ability.
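The tabulated value can be reproduced by summing the Binomial(10, 0.5) probabilities directly. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p, observed = 10, 0.5, 8

# Two-sided p-value: outcomes at least as extreme as 8 in either direction.
lower_tail = sum(binom_pmf(k, n, p) for k in range(0, n - observed + 1))   # P(X <= 2)
upper_tail = sum(binom_pmf(k, n, p) for k in range(observed, n + 1))       # P(X >= 8)
p_value = lower_tail + upper_tail

print(f"P(X <= 2) = {lower_tail:.4f}")
print(f"p-value   = {p_value:.4f}")
```

Each tail is 56/1024, so the p-value is 112/1024, which rounds to the 0.109 obtained from tables.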
5. The Binomial(n, p) distribution describes the distribution of the number of successes X that
are recorded from n independent repetitions of a trial when the probability of success on
any trial is p. Consider the following experiments on individuals, each of which involves
counting the number of successes:
i)
A randomly chosen individual is asked to take 10 ‘shots’ at a basketball goal and
the number of ‘baskets’ is counted.
ii)
In a parapsychology experiment on telepathy, a randomly chosen person is asked
to identify the (hidden) score on a fair die on 10 successive rolls, and the number
of correct guesses is counted.
iii)
Out of 20 identical boxes, 10 contain cash prizes and the remainder are empty. A
contestant is asked to select 10 different boxes and the number of prizes is counted.
Discuss the extent to which the outcome of each of these experiments could be modelled
with a Binomial(10, p) distribution for some suitable p. If it could not, give reasons, and
suggest how and why the distribution of outcomes might deviate from a binomial distribution,
for example by having too many extreme values.
Solution: The assumptions underlying the binomial distribution are that it counts the
number of successes X out of n independent trials where the probability of success is
constant for each trial. We need to see whether these assumptions are valid for the three
cases.
i) Probably not Binomial since the probability of success p would vary between subjects,
depending on ability at ball games, and for a given subject may tend to increase depending
on how many trials had been performed (a practice effect). Over many subjects you would
expect to see a number of very high scores and very low scores.
ii) It seems plausible that the outcome of this experiment would follow a binomial
distribution (particularly if you're sceptical about telepathy as a real phenomenon). For any
subject and any trial the probability that they guess correctly is, arguably, 1/6, so the scores
out of 10 would follow a Binomial(10, 1/6) distribution.
iii) The key thing to notice is that since they must select 10 different boxes, each time they
select a box with a prize, it becomes harder to choose one of the remaining prizes. (See
discussion in question 3). Therefore we can't think of the outcome of 10 trials as being
independent with fixed probabilities, so the Binomial distribution will not be a good
representation.
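For case iii) the size of the deviation can be quantified. The sketch below assumes the contestant picks the 10 boxes at random, so that the number of prizes follows a hypergeometric distribution, and compares it with a Binomial(10, 1/2) distribution having the same mean; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, population=20, prizes=10, draws=10):
    """P(k prizes) when `draws` different boxes are chosen at random."""
    return Fraction(comb(prizes, k) * comb(population - prizes, draws - k),
                    comb(population, draws))

def binom_pmf(k, n=10, p=Fraction(1, 2)):
    """P(k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mean_and_variance(pmf, support):
    """Mean and variance of a distribution given its pmf on a finite support."""
    mean = sum(k * pmf(k) for k in support)
    variance = sum((k - mean) ** 2 * pmf(k) for k in support)
    return mean, variance

support = range(11)
hyper_mean, hyper_var = mean_and_variance(hypergeom_pmf, support)
binom_mean, binom_var = mean_and_variance(binom_pmf, support)

print("boxes:    mean", hyper_mean, "variance", hyper_var)
print("binomial: mean", binom_mean, "variance", binom_var)
print("P(no prizes):", float(hypergeom_pmf(0)), "versus", float(binom_pmf(0)))
```

Under these assumptions the box experiment has the same mean as the binomial but a smaller variance, so it produces fewer extreme values rather than too many.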