
The geometric proof of the probability theory (otherwise called the Meridian Probability
Theorem)
By Atovigba, Garshagu Michael Vershima (B.Sc. Ed. Mathematics, M.Ed. Mathematics
Education Student, Benue State University, Makurdi, Nigeria)
Abstract
This work seeks to provide the geometric proof of the probability theory, a proof that has been
sought for over 300 years since the theory was propounded by Fermat, Huygens, Bernoulli and
others. The proof, entitled the Meridian Probability Function, satisfies the age-long tradition that
the sum of all probabilities in a sample space is unity and that the probability of an event in the
sample space ranges between zero and unity. The work takes the hyper sphere as the probabilistic
sphere upon which the Meridian Probability Function is defined. It distinguishes between
possibility and probability, and it polarizes the probability of an event happening against the
probability of the event not happening, the two of which sum to unity within the 0 to 1 range.
Introduction
Probability is the vehicle that enables the statistician to use information in a sample to make
inferences about, or describe, the population from which the sample was obtained (Mendenhall, 1975).
Mendenhall gives the building blocks of probability as follows:
(1) An event is a simple event if it cannot be decomposed; an experiment results in one and only one of the simple events, e.g. in a coin toss $e_1$ is head and $e_2$ is tail.
(2) The set of all sample points of the experiment is called the sample space.
(3) An event is a collection of sample points. He further notes that if $e_1, e_2, e_3, \ldots, e_n$ have equal chances of occurrence, then $P(e_1) = P(e_2) = \ldots = P(e_n) = 1/n$, so that "to each point in the sample space, we assign a number called the probability of $e_i$, denoted by the symbol $P(e_i)$, such that $0 \le P(e_i) \le 1$ and $\sum P(e_i) = 1$."
(4) Events A, B are independent if $P(A/B) = P(A)$ and $P(B/A) = P(B)$. When they are not independent, $P(A/B) = \frac{P(AB)}{P(B)}$ and $P(B/A) = \frac{P(AB)}{P(A)}$.
(5) A, B are mutually exclusive if $AB$ is a null set.
(6) For mutually exclusive events, $P(A \cup B) = P(A) + P(B) - P(AB)$ where $P(AB) = 0$; else, $P(AB) \ne 0$.
(7) For independent A, B, $P(AB) = P(A)P(B)$, while in general $P(AB) = P(A)P(B/A) = P(B)P(A/B)$.
On probability distributions, Mendenhall says that a probability distribution (for discrete random
variables) is a formula, table or graph that provides the probability associated with each value of
the random variable. A discrete random variable is one that can assume a countable number of
values, while a continuous random variable assumes the infinitely large number of values
corresponding to the points on a line interval. Two requirements exist for a probability
distribution:
1. $0 \le P(y_i) \le 1$, and
2. $\sum P(y_i) = 1$
A critique of these criteria
This research looks at typical literature on probability, represented by Mendenhall
(ibid:98), which gives, for instance, the probability distribution for y in the toss of a die in the
following table.
no.   P(e_i)   y   P(y)
1     1/6      1   1/6
2     1/6      2   1/6
3     1/6      3   1/6
4     1/6      4   1/6
5     1/6      5   1/6
6     1/6      6   1/6
$\sum P(y) = 1$
This researcher observes that if we instead calculate (by the classical approach) the respective
probabilities of not having faces 1, 2, 3, 4, 5, 6, then each has $P(y_i) = 5/6$. This satisfies the
first condition, $0 < P(y_i) < 1$, but fails the second criterion, $\sum P(y_i) = 1$, since 5/6 taken
6 times is 5 > 1. This is the collapse of the classical probability theorem. The classical school,
therefore, works only for a coin toss, where P(H) = P(T) = ½ and the two add to 1, and where the
probability of 'not head' = probability of 'not tail' = ½ and the two again add to 1. This is
generalized for dual events. But beyond the toss of a coin or dualism (e.g. a pluralistic die),
$\sum P(y_i)$ is not always 1, especially when we turn the classical theory on the probabilities of
events not happening.
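The arithmetic behind this critique can be reproduced in a few lines. The following is only a sketch, assuming a fair six-sided die as in the table; the variable names are illustrative and not part of the original text.

```python
from fractions import Fraction

# Classical probabilities for a fair six-sided die (fairness is assumed, as in the table).
faces = range(1, 7)
p_face = {y: Fraction(1, 6) for y in faces}

# Probability of "not face y", taken as the complement 1 - P(y).
p_not_face = {y: 1 - p_face[y] for y in faces}

sum_faces = sum(p_face.values())          # total over the six faces
sum_not_faces = sum(p_not_face.values())  # total over the six "not face" events

print(sum_faces, sum_not_faces)
```

The first total is the one the classical criteria refer to; the second is the total of 5/6 taken six times that the critique points at.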
To save probability theory from this collapse, the meridian probability theorem takes probability
as a statistic using the hyper sphere, which gives geometric credence to the criteria provided by
Fermat, Pascal, Huygens, Bernoulli and others. That is, for the respective probabilities of event x
happening, x not happening, and the total probability of events $x_i$ in a sample space:
$0 < P(x) < 1$, $0 < P(x^1) < 1$ and $\sum P(x_i) = 1$.
Statement of the meridian probability theorem (the geometric proof of the probability
theory):
Let x: 0 < x < 1, corresponding with the perpendicular distance from the arbitrary north (or south)
pole of the probabilistic sphere (hyper sphere), be the possibility of k event happening; then the
probability of k actually happening is
$P(k) = \cos^2\theta = 2x - x^2$
and the probability of k not happening is
$P(k^1) = \sin^2\theta = (1 - x)^2$
$\forall\ 0 < x < 1$ and $0 < \theta < \pi/2$,
where $\theta$ is the latitude corresponding with x perpendicular distance from the arbitrary north (or
south) pole of the probabilistic sphere.
It is purely a geometrical analysis comparing observable possibilities, which are likened to
perpendicular distances of corresponding latitudes of the hyper sphere. This generates a sense
of probability that is provable and tenable within the defined framework, as Fermat and others
recommended.
Definition of operational terms
Definition 1: The Probabilistic Sphere:
The Probabilistic Sphere is the hyper sphere, a sphere of unit radius R = 1.
Definition 2: The Probabilistic Sphere and the Meridian Function:
Given the Meridian Function $r^2 = 2Rx - x^2$ … (1)
(Atovigba, 2003); with R = 1, the probabilistic sphere has
$r^2 = 2x - x^2$ … (2).
Definition 3: Possibility: R = 1 is the total possibility of an aggregate of events occurring, and x:
0 < x < 1 is the possibility of a particular k occurrence out of a group of the aggregate of the
events.
Definition 4: Hemispherical restriction of the meridian probability:
Since R = 1, the probabilistic sphere restricts the variable latitude $\theta$ to the northern (or southern)
hemisphere, so that $0 < \theta < \pi/2$.
Definition 5: The perpendicular distance x of the variable latitude $\theta$ of the probabilistic sphere is
the possibility of k event occurring; hence 1 - x is the possibility of k not occurring.
Specifically,
0 < x < 1 and
0 < 1 - x < 1.
Definition 6: $P(k)$ is the probability of k out of an event occurring and $P(k^1)$ the probability of k
out of an event not occurring, where x of the probabilistic sphere is the possibility of k of the
event occurring, and 1 - x the possibility of k not occurring:
0 < x < 1 and 0 < 1 - x < 1.
Theorem 1: The Meridian Probability Theorem (the geometric proof of the probability theory):
Let x: 0 < x < 1, corresponding with the perpendicular distance from the arbitrary north (or south)
pole of the probabilistic sphere, be the possibility of k of an event happening; then the
probability of k actually happening is $P(k) = \cos^2\theta = 2x - x^2$ and the probability of k not
happening is $P(k^1) = \sin^2\theta = (1 - x)^2$, $\forall\ 0 < x < 1$ and $0 < \theta < \pi/2$, where $\theta$ is the latitude
corresponding with x perpendicular distance from the arbitrary north (or south) pole of the
probabilistic sphere.
Proof:
We are to prove that $0 < P(k) < 1$ and $0 < P(k^1) < 1$, $\forall\ 0 < x < 1$ and $0 < \theta < \pi/2$.
Now, let R = 1, thus the probabilistic sphere. If r is the radius of the latitude $\theta$, then 0 < r < R = 1.
Suppose that
$P(k) = \cos^2\theta = \left(\frac{r}{R}\right)^2 = r^2 = 2x - x^2$ … (3)
since R = 1 and by (2). Also, 0 < r < R = 1 implies $0 < r^2 < 1$, implying $0 < \cos^2\theta < 1$. Hence,
$0 < P(k) < 1$.
Similarly, if
$P(k^1) = \sin^2\theta = \left(\frac{1 - x}{R}\right)^2 = (1 - x)^2$ … (4)
since R = 1, then since 0 < (1 - x) < 1, so also $0 < (1 - x)^2 < 1$, implying $0 < \sin^2\theta < 1$ or $0 < P(k^1) < 1$. Proved.
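The relations in Theorem 1 can be checked numerically. The sketch below assumes that the latitude is recovered from x through $\sin\theta = 1 - x$ (one reading of relation (4) with R = 1); the function names `meridian_probabilities` and `latitude` are hypothetical helpers introduced only for this illustration.

```python
import math

def meridian_probabilities(x: float) -> tuple[float, float]:
    """Return (P(k), P(k^1)) for a possibility x with 0 < x < 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("possibility x must satisfy 0 < x < 1")
    p_happen = 2.0 * x - x * x   # P(k) = 2x - x^2
    p_not = (1.0 - x) ** 2       # P(k^1) = (1 - x)^2
    return p_happen, p_not

def latitude(x: float) -> float:
    """Latitude theta in radians, assuming sin(theta) = (1 - x)/R with R = 1."""
    return math.asin(1.0 - x)

for x in (0.1, 0.25, 0.5, 0.9):
    p, q = meridian_probabilities(x)
    theta = latitude(x)
    # Both probabilities should match the trigonometric forms and stay inside (0, 1).
    assert math.isclose(p, math.cos(theta) ** 2)
    assert math.isclose(q, math.sin(theta) ** 2)
    assert 0.0 < p < 1.0 and 0.0 < q < 1.0
    print(f"x={x}: theta={theta:.4f} rad, P(k)={p:.4f}, P(k^1)={q:.4f}")
```

Under this reading, every x in (0, 1) maps to a latitude strictly between 0 and $\pi/2$, which is the hemispherical restriction of Definition 4.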
Remark 1:
Theorem 1 makes it clear that the higher the possibility of k happening, the higher the
probability of its occurrence in a trial, and vice versa. Also, as will be seen in the example below,
possibility is different from the meridian probability.
Example 1:
The toss of a coin has possibility of head x = ½. Hence, the possibility of not a head is $x^1$ = ½.
Thus the probability of k = 1 head in the single toss will be
$P(k) = 2x - x^2 = 1 - \frac{1}{4} = \frac{3}{4}$.
And the probability of k = 1 not a head is
$P(k^1) = (1 - x)^2 = \left(1 - \frac{1}{2}\right)^2 = \left(\frac{1}{2}\right)^2 = \frac{1}{4}$.
Remark 2: Example 1 shows that the Meridian Probability is a simple model that polarizes $P(k)$
and $P(k^1)$: that is, an event either happens or does not happen. This helps in defending the
forthcoming Meridian Probability Function, which should sum up to unity.
Theorem 2: The Meridian Probability Function mpf:
The Meridian Probability Function mpf is a dualism of $P(k)$ and $P(k^1)$ such that
$f(x) = P(k) + P(k^1) = 1$
$\forall\ 0 < x < 1$.
Proof:
We are to prove that
$\int f(x)\,dx = 1$, $0 < P(k) < 1$ and $0 < P(k^1) < 1$,
in line with probabilistic traditions in which the sum of probabilities should be unity while the
probability of success lies between 0 and 1, and the probability of failure must lie between 0 and
1.
Already, we have explained that the meridian probability is a dualism or polarity of $P(k)$ and
$P(k^1)$; and Theorem 1 has adequately restricted both $P(k)$ and $P(k^1)$ within the [0,1] framework of
probabilities.
Now,
$\int_{x=0}^{1} f(x)\,dx = \int_{0}^{1} \left[P(k) + P(k^1)\right] dx$ : 0 < x < 1.
Since $P(k) + P(k^1) = 2x - x^2 + (1 - x)^2 = 1$, the integrand is 1. Hence,
$\int_{x=0}^{1} f(x)\,dx = x\,\Big|_{0}^{1} = 1$,
which is the required proof.
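Example 2:
Consider Example 1, where $P(k) = \frac{3}{4}$ and $P(k^1) = \frac{1}{4}$. The integrand is then $P(k) + P(k^1) = \frac{3}{4} + \frac{1}{4} = 1$. Hence,
$\int_{x=0}^{1} f(x)\,dx = \int_{0}^{1} \left[P(k) + P(k^1)\right] dx = 1$.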
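A short numerical check of Theorem 2 may help here. This is a sketch only: it evaluates $f(x) = P(k) + P(k^1)$ exactly with rational arithmetic at a set of sample points and approximates the integral over (0, 1) with a midpoint rule. The number of subintervals `n` is an arbitrary choice made for the illustration.

```python
from fractions import Fraction

def f(x: Fraction) -> Fraction:
    """Meridian probability function f(x) = P(k) + P(k^1)."""
    return (2 * x - x * x) + (1 - x) ** 2

# Midpoints of n equal subintervals of (0, 1); n is an arbitrary illustrative choice.
n = 100
midpoints = [Fraction(2 * j + 1, 2 * n) for j in range(n)]
values = [f(x) for x in midpoints]

# Midpoint-rule approximation of the integral of f over (0, 1).
integral = sum(values) * Fraction(1, n)

print(set(values), integral)
```

Because the $x^2$ and $2x$ terms cancel, every sampled value of f equals 1 exactly, and the midpoint sum reproduces the value of the integral without rounding error.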
Theorem 3: Mutually Exclusive Events: Let s be the sample space of N total population containing
$E_i$ mutually exclusive events (groups), $i = 1, 2, \ldots \in \mathbb{Z}^{+}$. Then, the possibility of $k_i$ out of any $E_i$
happening is
$x = \frac{{}^{E_i}C_{k_i}}{{}^{N}C_{s}}$
where ${}^{E_i}C_{k_i}$ is the combination of $E_i$ taking $k_i$ expectation, and ${}^{N}C_{s}$ is the combination of N
population taking s sample representing total possibility, and
$P(k) = 2x - x^2$ as usual.
Proof:
We are to show that
$P(k) = 2x - x^2$
where
$x = \frac{{}^{E_i}C_{k_i}}{{}^{N}C_{s}}$,
with ${}^{E_i}C_{k_i}$ the combination of $E_i$ taking $k_i$ expectation, and ${}^{N}C_{s}$ the combination
of N population taking s sample representing total possibility.
Let s be a sample out of a finite N population, and suppose we expect finite k out of a finite $E_i$ group.
Hence,
$0 < k < E_i < s < N$, i = 1, 2, ….
Clearly,
$0 < \frac{{}^{E_i}C_{k_i}}{{}^{N}C_{s}} < 1$ with $x = \frac{{}^{E_i}C_{k_i}}{{}^{N}C_{s}}$.
It follows then that
$0 < P(k) = 2x - x^2 < 1$ as proof.
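The possibility in Theorem 3 is a ratio of two binomial coefficients, which can be computed directly. The sketch below uses hypothetical numbers (N = 12, s = 6, $E_i$ = 5, $k_i$ = 2) chosen only so that $0 < k < E_i < s < N$ holds; the function names are illustrative. The helper rejects inputs for which the ratio does not land in (0, 1), since the theorem takes that range as given.

```python
from fractions import Fraction
from math import comb

def possibility(group_size: int, expected: int, population: int, sample: int) -> Fraction:
    """x = C(E_i, k_i) / C(N, s), checked against the range 0 < x < 1."""
    x = Fraction(comb(group_size, expected), comb(population, sample))
    if not 0 < x < 1:
        raise ValueError(f"x = {x} falls outside (0, 1) for these inputs")
    return x

def meridian_probability(x: Fraction) -> Fraction:
    """P(k) = 2x - x^2."""
    return 2 * x - x * x

# Hypothetical values: N = 12, s = 6, E_i = 5, k_i = 2.
x = possibility(group_size=5, expected=2, population=12, sample=6)
p = meridian_probability(x)
print(x, p)
```

Using `Fraction` keeps the ratio exact, so the range check on x is not affected by floating-point rounding.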
Theorem 4:
For mutually exclusive groups $E_i$ of a sample space s and respective expectations $k_i$, the
possibility of all $k_i$'s after T trials is
$x = \sum \left( \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}} \right)$,
i = 1, 2, … with $P(k) = 2x - x^2$ holding.
Proof:
${}^{TN}C_{s}$ is the number of equal parts of graduating R = 1 for the probabilistic sphere. Hence, $k_1, k_2,
\ldots, k_i$ expectations from respective exclusive $E_1, E_2, \ldots, E_i$ groups after T times implies $TE_1,
TE_2, \ldots, TE_i$ respective trials for each group.
Hence,
$x_1 = \frac{{}^{TE_1}C_{k_1}}{{}^{TN}C_{s}}$,
$x_2 = \frac{{}^{TE_2}C_{k_2}}{{}^{TN}C_{s}}$,
…,
$x_i = \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}$.
Hence,
$x = x_1 + x_2 + \ldots + x_i = \frac{{}^{TE_1}C_{k_1}}{{}^{TN}C_{s}} + \frac{{}^{TE_2}C_{k_2}}{{}^{TN}C_{s}} + \ldots + \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}} = \sum \left( \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}} \right)$,
i = 1, 2, … with $P(k) = 2x - x^2$ holding.
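The sum in Theorem 4 can be evaluated term by term in the same way. The following sketch assumes hypothetical values (N = 12, s = 6, T = 2 and three groups given as $(E_i, k_i)$ pairs); neither the numbers nor the function name come from the original text.

```python
from fractions import Fraction
from math import comb

def total_possibility(groups, population: int, sample: int, trials: int) -> Fraction:
    """x = sum over i of C(T*E_i, k_i) / C(T*N, s) for (E_i, k_i) pairs."""
    denominator = comb(trials * population, sample)
    return sum(Fraction(comb(trials * e, k), denominator) for e, k in groups)

# Hypothetical groups (E_i, k_i) with N = 12, s = 6 and T = 2 trials.
groups = [(3, 1), (4, 2), (5, 2)]
x = total_possibility(groups, population=12, sample=6, trials=2)
p = 2 * x - x * x  # P(k) = 2x - x^2

print(x, p)
```

All terms share the denominator ${}^{TN}C_{s}$, so it is computed once and reused for each group.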
Theorem 5: Independent Events:
Let T be the number of trials for some k expectation given independent groups of s sample space out
of some N population; then the possibility of k being in $E_i$ of s is
$x = \frac{{}^{TE_i}C_{k}}{{}^{TN}C_{s}}$
with $P(k) = 2x - x^2$ following,
where ${}^{TE_i}C_{k}$ is the possibility of k happening from $E_i$ and ${}^{TN}C_{s}$ the total possibility of sampling N up
to T times.
Proof:
Let N, s be population and sample respectively; then T trials increase N to TN, so that the
combination of possible s sample becomes ${}^{TN}C_{s}$.
Thus R = 1 is graduated into ${}^{TN}C_{s}$ equal parts as total possibility, so that
$0 < {}^{TE_i}C_{k} < {}^{TN}C_{s}$.
Dividing through by ${}^{TN}C_{s}$,
$0 < \frac{{}^{TE_i}C_{k}}{{}^{TN}C_{s}} < 1$ or
0 < x < 1.
It follows then that $P(k) = 2x - x^2$.
Definition 7: Events not totally exclusive:
Let the groups out of the sample space s not be totally exclusive, i.e. they have common elements;
then let $E_1 \cap E_2 = E_n$, a non-empty set; then $E_1$ and $E_2$ are not exclusive.
Theorem 6: For any groups $E_1$ and $E_2$ in sample space s such that $E_1 \cap E_2 = E_n$, a non-empty set,
the possibility of k happening out of $E_1 \cap E_2 = E_n$ after T trials with N population is
$x = \frac{{}^{TE_n}C_{k}}{{}^{TN}C_{s}}$
with $P(k) = 2x - x^2$ following.
Proof:
Let $E_1 \cap E_2 = E_n$, a non-empty set, be finite and identifiable. Hence, let $E_n < s, TN$, and any k in
$E_n$ is susceptible to the same rule as for any finite $E_i$. Hence, Theorem 1 follows; or
$x = \frac{{}^{TE_n}C_{k}}{{}^{TN}C_{s}}$
with $P(k) = 2x - x^2$ following.
Definition 8: Meridian Probability as statistic:
The Meridian Probability is a statistical measure like any other measure, so that the Mean Probability
of Occurrence $\mu_i$ and the Mean Probability of Non-occurrence $\mu_i^1$ over all i are defined
respectively as:
$\mu_i = \frac{\sum P(k_i)}{i}$, and $\mu_i^1 = \frac{\sum P(k_i^1)}{i}$,
where i = 1, 2, … is the number of observations.
Theorem 7:
Let x: 0 < x < 1 be the possibility of some $k_i$ of an event $E_i$ of a sample space s out of N
population happening, corresponding with the perpendicular distance from the arbitrary north (or
south) pole of the probabilistic sphere. Then:
The probability of all $k_i$ (i = 1, 2, …) actually happening is equal to the mean probability of
occurrence of $k_1, k_2, \ldots, k_i$, where $P(k) = 2x - x^2$,
and the probability of $k_i$ not happening is $P(k^1) = (1 - x)^2$, $\forall\ 0 < x < 1$.
Proof:
We are to prove that $0 < \mu_i < 1$ and $0 < \mu_i^1 < 1$, $\forall\ 0 < x < 1$,
i = 1, 2, …, given some possibilities for all $k_i$'s such that all respective $x_i$'s satisfy $0 < x_i < 1$.
Now,
$\mu_i = \frac{\sum P(k_i)}{i} = \frac{P(k_1) + P(k_2) + \ldots + P(k_i)}{i}$
and since for T trials,
$x_1 = \frac{{}^{TE_1}C_{k_1}}{{}^{TN}C_{s}}$, $x_2 = \frac{{}^{TE_2}C_{k_2}}{{}^{TN}C_{s}}$, …, $x_i = \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}$,
each i generates
$0 < P(k_i) = 2\left(\frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}\right) - \left(\frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}\right)^2 = \frac{2\,{}^{TE_i}C_{k_i}\,{}^{TN}C_{s} - \left({}^{TE_i}C_{k_i}\right)^2}{\left({}^{TN}C_{s}\right)^2} < 1$,
and it implies that any sum > 1 is again divided by i so that
$0 < \frac{\sum P(k_i)}{i} = \frac{1}{i} \sum \frac{2\,{}^{TE_i}C_{k_i}\,{}^{TN}C_{s} - \left({}^{TE_i}C_{k_i}\right)^2}{\left({}^{TN}C_{s}\right)^2} < 1$.
Hence,
$0 < \frac{\sum P(k_i)}{i} < 1$ or $0 < \mu_i < 1$, and $0 < \mu_i^1 < 1$.
Example:
A box contains 3 black, 4 red, 5 yellow balls. 1 ball is picked. What is the probability that the
ball is black or red or yellow?
Solution:
By the meridian probability theorem, since N = 3 + 4 + 5 = 12:
For black with $E_b = 3$, $x_b = 3/12 = 1/4$; hence $P(k_b) = 2\left(\frac{1}{4}\right) - \left(\frac{1}{4}\right)^2 = \frac{7}{16}$.
For red with $E_r = 4$, $x_r = 4/12 = 1/3$. Hence, $P(k_r) = 2\left(\frac{1}{3}\right) - \left(\frac{1}{3}\right)^2 = \frac{5}{9}$.
For yellow with $E_y = 5$, $x_y = 5/12$. Hence, $P(k_y) = 2\left(\frac{5}{12}\right) - \left(\frac{5}{12}\right)^2 = \frac{95}{144}$.
By Theorem 7, we have been asked to find the mean probability of k being b, r or y on the index
of i = 3, which must conform with the range $0 < \mu_i < 1$.
Now,
$0 < \mu_i = \frac{\sum P(k_i)}{i} = \frac{P(k_1) + P(k_2) + P(k_3)}{3} = \frac{\frac{7}{16} + \frac{5}{9} + \frac{95}{144}}{3} = \frac{34272}{62208} = \frac{119}{216} < 1$,
which satisfies the framework.
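The figures in this example can be reproduced with exact fractions. The sketch below takes only the ball counts from the example; the variable and function names are illustrative.

```python
from fractions import Fraction

# Ball counts from the example: 3 black, 4 red, 5 yellow.
counts = {"black": 3, "red": 4, "yellow": 5}
N = sum(counts.values())

def p_happen(x: Fraction) -> Fraction:
    """P(k) = 2x - x^2 for a possibility x."""
    return 2 * x - x * x

# Possibility of each colour is its share of the N balls; apply P(k) to each.
probabilities = {colour: p_happen(Fraction(n, N)) for colour, n in counts.items()}

# Mean probability of occurrence over the i = 3 colours.
mean_probability = sum(probabilities.values()) / len(probabilities)

print(probabilities, mean_probability)
```

The three per-colour values come out as 7/16, 5/9 and 95/144, and their mean reduces to 119/216, which lies inside the range required by Theorem 7.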
Theorem 8: The Mean Probability Function:
Let $P(k)$ be a statistic so that $\mu_i = \frac{\sum P(k_i)}{i}$ and $\mu_i^1 = \frac{\sum P(k_i^1)}{i}$; i = 1, 2, …;
then the Mean Meridian Probability Function is
$F(x) = \mu_i + \mu_i^1$.
Proof:
We are to prove that F(x) = 1 in line with traditions.
Now we have already demonstrated in Theorem 7 that $0 < \mu_i < 1$.
With
$\mu_i = \frac{\sum P(k_i)}{i}$, and $\mu_i^1 = \frac{\sum P(k_i^1)}{i}$,
and writing the sums out term by term,
$F(x) = \frac{P(k_1) + P(k_1^1) + P(k_2) + P(k_2^1) + \ldots + P(k_i) + P(k_i^1)}{i}$.
Hence,
$F(x) = \frac{\sum P(k_i)}{i} + \frac{\sum P(k_i^1)}{i} = \frac{\sum \left[P(k_i) + P(k_i^1)\right]}{i} = \frac{1}{i} \cdot i = 1$,
since each $P(k_i) + P(k_i^1) = 1$ and there are i number of $\left[P(k_i) + P(k_i^1)\right]$'s. Proved.
Remark:
Thus infinitely many trials, $i \to \infty$, could be made, but the mean probability function would still work out
within the framework of summing to unity.
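The statement of Theorem 8 can be illustrated with a small computation. This sketch assumes a list of possibilities as input (here the three values 1/4, 1/3 and 5/12 from the ball example); the function name `mean_meridian_function` is hypothetical.

```python
from fractions import Fraction

def mean_meridian_function(possibilities):
    """Return (mu, mu1, F) with mu the mean of P(k_i), mu1 the mean of P(k_i^1), F = mu + mu1."""
    i = len(possibilities)
    p_occur = [2 * x - x * x for x in possibilities]   # P(k_i) = 2x - x^2
    p_not = [(1 - x) ** 2 for x in possibilities]      # P(k_i^1) = (1 - x)^2
    mu = sum(p_occur) / i
    mu1 = sum(p_not) / i
    return mu, mu1, mu + mu1

# Possibilities taken from the ball example: 3/12, 4/12 and 5/12.
xs = [Fraction(1, 4), Fraction(1, 3), Fraction(5, 12)]
mu, mu1, F = mean_meridian_function(xs)

print(mu, mu1, F)
```

Each pair $P(k_i) + P(k_i^1)$ contributes exactly 1 to the numerator, so F equals 1 for any number of observations, in line with the remark above.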
References
Atovigba, G.M.V. (2003). The Sphere: a quadratic approach. Makurdi, Nigeria: Gee
Tigons.
Mendenhall, W. (1975). Introduction to probability and statistics (4th ed.). North Scituate,
Mass.: Duxbury Press.