The Geometric Proof of the Probability Theory (Otherwise Called the Meridian Probability Theorem)

By Atovigba, Garshagu Michael Vershima
(B.Sc. Ed. Mathematics; M.Ed. Mathematics Education Student, Benue State University, Makurdi, Nigeria)

Abstract

This work seeks to provide the geometric proof of the probability theory, which has been sought for over 300 years since the theory was propounded by Fermat, Huygens, Bernoulli and others. The proof is entitled the Meridian Probability Function, and it satisfies the age-long tradition that the sum of all probabilities in a sample space is unity and that the probability of an event in the sample space ranges between zero and unity. The work takes the hyper sphere, regarded here as the probabilistic sphere, as the surface upon which the Meridian Probability Function is defined. The work distinguishes between possibility and probability, and polarizes the probability of an event happening against the probability of the event not happening, both of which sum to unity within the 0 to 1 framework.

Introduction

Probability is the vehicle that enables the statistician to use information in a sample to make inferences about, or describe, the population from which the sample was obtained (Mendenhall, 1975). Mendenhall gives the building blocks of probability as follows:

(1) An event is simple if it cannot be decomposed, and an experiment results in one and only one of the simple events; e.g. in a coin toss, $e_1$ is head and $e_2$ is tail.
(2) The set of all sample points of the experiment is called the sample space.
(3) An event is a collection of sample points. He further notes that if $e_1, e_2, \dots, e_n$ have equal chances of occurrence, then $P(e_1) = P(e_2) = \dots = P(e_n) = 1/n$, so that "to each point in the sample space, we assign a number called the probability of $e_i$, denoted by the symbol $P(e_i)$, such that $0 \le P(e_i) \le 1$ and $\sum P(e_i) = 1$."
(4) Events $A$, $B$ are independent if $P(A/B) = P(A)$ and $P(B/A) = P(B)$. They are not independent if $P(A/B) = P(AB)/P(B) \ne P(A)$ and $P(B/A) = P(AB)/P(A) \ne P(B)$.
(5) $A$, $B$ are mutually exclusive if $AB$ is a null set.
(6) For mutually exclusive events, $P(A \cup B) = P(A) + P(B) - P(AB)$, where $P(AB) = 0$; otherwise, $P(AB) \ne 0$.
(7) For independent $A$, $B$, $P(AB) = P(A)P(B)$, while in general $P(AB) = P(A)P(B/A) = P(B)P(A/B)$.

On probability distributions, Mendenhall says that a probability distribution (for discrete random variables) is a formula, table or graph that provides the probability associated with each value of the random variable. (A discrete random variable is one that can assume a countable number of values, while a continuous random variable assumes the infinitely large number of values corresponding to the points on a line interval.) Two requirements exist for a probability distribution:

1. $0 \le P(y_i) \le 1$
2. $\sum P(y_i) = 1$

A critique of these criteria

This research looks at typical literature on probability, represented by Mendenhall (ibid.: 98), which says, for instance, that for the toss of a die the probability distribution for $y$ is given in the following tables.

no.    P(e_i)
1      1/6
2      1/6
3      1/6
4      1/6
5      1/6
6      1/6

y      1     2     3     4     5     6
P(y)   1/6   1/6   1/6   1/6   1/6   1/6

$\sum P(y) = 1$

This researcher observes that if we instead calculate the respective probabilities (classical approach) of not having faces 1, 2, 3, 4, 5, 6, then each has $P(y_i) = 5/6$, which satisfies the first criterion, $0 < P(y_i) < 1$, but fails to satisfy the second criterion, $\sum P(y_i) = 1$, since 5/6 added six times is $5 > 1$. This is the collapse of the classical probability theorem. The classical school, therefore, works only for a coin toss, with $P(H) = P(T) = 1/2$, which add up to 1; likewise the probability of "not head" and the probability of "not tail" are each 1/2 and add up to 1. This is generalized for dual events. But beyond the toss of the coin, or dualism (e.g. a pluralistic die), $\sum P(y_i)$ is not always 1, especially when we turn the classical theory on the probabilities of events not happening.
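The tally in this critique can be reproduced directly. The sketch below is an illustration only: the function name `complement_sum` is an assumption introduced here, and the code does nothing beyond adding up the "not this outcome" probabilities quoted above for n equally likely outcomes.

```python
from fractions import Fraction


def complement_sum(n_outcomes):
    """Add up P(not outcome i) over n equally likely outcomes.

    Hypothetical helper, not part of the source paper.
    """
    p_not = 1 - Fraction(1, n_outcomes)  # P(not y_i) = 1 - 1/n
    return n_outcomes * p_not            # the same value added n times


print("coin:", complement_sum(2))  # two outcomes, each "not" has probability 1/2
print("die: ", complement_sum(6))  # six outcomes, each "not" has probability 5/6
```

For n equally likely outcomes the tally is n(1 - 1/n) = n - 1, which equals 1 only when n = 2; this matches the coin and die figures quoted in the text.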
To save probability theory from this collapse, the meridian probability theorem takes probability as a statistic using the hyper sphere, which gives geometric credence to the criteria provided by Fermat, Pascal, Huygens, Bernoulli and others; that is, for the respective probabilities of event $x$ happening, $x$ not happening, and the total probability of events $x_i$ in a sample space, $0 < P(x) < 1$, $0 < P(x') < 1$ and $\sum P(x_i) = 1$.

Statement of the meridian probability theorem (the geometric proof of the probability theory)

Let $x$, $0 < x < 1$, corresponding with the perpendicular distance from the arbitrary north (or south) pole of the probabilistic sphere (hyper sphere), be the possibility of event $k$ happening; then the probability of $k$ actually happening is

$$P(k) = \cos^2\theta = 2x - x^2$$

and the probability of $k$ not happening is

$$P(k') = \sin^2\theta = (1 - x)^2$$

for all $0 < x < 1$ and $0 < \theta < \pi/2$, where $\theta$ is the latitude corresponding with the perpendicular distance $x$ from the arbitrary north (or south) pole of the probabilistic sphere.

It is purely a geometrical analysis comparing observable possibilities, likened to the perpendicular distances of corresponding latitudes of the hyper sphere, which generates a sense of probability that is provable and tenable within the defined framework, as Fermat and others recommended.

Definition of operational terms

Definition 1: The Probabilistic Sphere: The Probabilistic Sphere is the hyper sphere, a sphere of unit radius $R = 1$.

Definition 2: The Probabilistic Sphere and the Meridian Function: Given the Meridian Function $r^2 = 2Rx - x^2$ … (1) (Atovigba, 2003), where $r$ is the radius of the latitude at perpendicular distance $x$ from the pole (so that $r^2 = R^2 - (R - x)^2$); with $R = 1$, the probabilistic sphere has $r^2 = 2x - x^2$ … (2).

Definition 3: Possibility: $R = 1$ is the total possibility of an aggregate of events occurring, and $x$, $0 < x < 1$, is the possibility of a particular occurrence $k$ out of the aggregate of events.
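The relation between the possibility x, the latitude and the two probabilities stated above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `meridian_probabilities` and the sample values of x are illustrative assumptions. It assumes R = 1 as in Definition 2 and takes the latitude angle to be the one whose circle has radius r, so that its cosine equals r.

```python
import math


def meridian_probabilities(x):
    """Return (P(k), P(k'), theta) for a possibility x on the unit sphere.

    Hypothetical helper: theta is recovered from r = cos(theta), an
    assumption consistent with r^2 = 2x - x^2 when R = 1.
    """
    if not 0 < x < 1:
        raise ValueError("possibility x must lie strictly between 0 and 1")
    r_squared = 2 * x - x ** 2                 # equation (2), with R = 1
    theta = math.acos(math.sqrt(r_squared))    # latitude angle in (0, pi/2)
    return math.cos(theta) ** 2, math.sin(theta) ** 2, theta


for x in (0.1, 0.25, 0.5, 0.9):
    p_k, p_not_k, theta = meridian_probabilities(x)
    # cos^2(theta) should match 2x - x^2 and sin^2(theta) should match (1 - x)^2
    assert math.isclose(p_k, 2 * x - x ** 2)
    assert math.isclose(p_not_k, (1 - x) ** 2)
    print(f"x={x:.2f}  theta={theta:.4f}  P(k)={p_k:.4f}  P(k')={p_not_k:.4f}")
```

Working through the angle rather than the closed forms makes the geometric reading explicit: the two printed probabilities are the squared cosine and squared sine of one and the same latitude.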
Definition 4: Hemispherical restriction of the meridian probability: Since $R = 1$, the probabilistic sphere restricts the variable latitude $\theta$ to the northern (or southern) hemisphere, so that $0 < \theta < \pi/2$.

Definition 5: The perpendicular distance $x$ of the variable latitude $\theta$ of the probabilistic sphere is the possibility of event $k$ occurring; hence $1 - x$ is the possibility of $k$ not occurring. Specifically, $0 < x < 1$ and $0 < 1 - x < 1$.

Definition 6: $P(x)$, also written $P(k)$, is the probability of $k$ out of an event occurring, and $P(x')$, also written $P(k')$, is the probability of $k$ out of an event not occurring, where $x$ of the probabilistic sphere is the possibility of $k$ of the event occurring and $1 - x$ the possibility of $k$ not occurring: $0 < x < 1$ and $0 < 1 - x < 1$.

Theorem 1: The Meridian Probability Theorem (the geometric proof of the probability theory): Let $x$, $0 < x < 1$, corresponding with the perpendicular distance from the arbitrary north (or south) pole of the probabilistic sphere, be the possibility of $k$ of an event happening; then the probability of $k$ actually happening is $P(k) = \cos^2\theta = 2x - x^2$ and the probability of $k$ not happening is $P(k') = \sin^2\theta = (1 - x)^2$, for all $0 < x < 1$ and $0 < \theta < \pi/2$, where $\theta$ is the latitude corresponding with the perpendicular distance $x$ from the arbitrary north (or south) pole of the probabilistic sphere.

Proof: We are to prove that $0 < P(k) < 1$ and $0 < P(k') < 1$ for all $0 < x < 1$ and $0 < \theta < \pi/2$.

Now, let $R = 1$, thus the probabilistic sphere. If $r$ is the radius of the latitude, then $0 < r < R = 1$. Suppose that

$$P(k) = \cos^2\theta = \left(\frac{r}{R}\right)^2 = r^2 = 2x - x^2 \quad \dots (2)$$

since $R = 1$. Also, $0 < r < R = 1$ implies $0 < r^2 < 1$, implying $0 < \cos^2\theta < 1$. Hence $0 < P(k) < 1$.

Similarly, if

$$P(k') = \sin^2\theta = \left(\frac{1 - x}{R}\right)^2 = (1 - x)^2 \quad \dots (4)$$

since $R = 1$, then, since $0 < 1 - x < 1$, so also $0 < (1 - x)^2 < 1$, implying $0 < \sin^2\theta < 1$, or $0 < P(k') < 1$. Proved.

Remark 1: Theorem 1 makes it abundantly clear that the higher the possibility of $k$ happening, the higher the probability of its occurrence in a trial, and vice versa.
Also, as will be seen in the example below, possibility is different from the meridian probability.

Example 1: The toss of a coin has possibility of a head $x = 1/2$. Hence, the possibility of not a head is $x' = 1/2$. Thus the probability of $k = 1$ head in the single toss will be

$$P(k) = 2x - x^2 = 1 - \frac{1}{4} = \frac{3}{4},$$

and the probability of $k = 1$ not being a head is

$$P(k') = (1 - x)^2 = \left(1 - \frac{1}{2}\right)^2 = \left(\frac{1}{2}\right)^2 = \frac{1}{4}.$$

Remark 2: Example 1 shows that the Meridian Probability is a simple model that polarizes $P(k)$ and $P(k')$: that is, an event either happens or does not happen. This helps in defending the forthcoming Meridian Probability Function, which should sum to unity.

Theorem 2: The Meridian Probability Function (mpf): The Meridian Probability Function is a dualism of $P(k)$ and $P(k')$ such that $f(x) = P(k) + P(k') = 1$ for all $0 < x < 1$.

Proof: We are to prove that $f(x) = 1$ for all $0 < P(k) < 1$ and $0 < P(k') < 1$, in line with probabilistic traditions in which the sum of probabilities should be unity while the probability of success lies between 0 and 1 and the probability of failure lies between 0 and 1. Already, we have explained that the meridian probability is a dualism or polarity of $P(k)$ and $P(k')$, and Theorem 1 has adequately restricted both $P(k)$ and $P(k')$ within the $[0, 1]$ framework of probabilities. By Theorem 1, $P(k) + P(k') = (2x - x^2) + (1 - 2x + x^2) = 1$. Now,

$$f(x) = \int_{x=0}^{1} \left[P(k) + P(k')\right] dx, \quad 0 < x < 1.$$

Hence,

$$f(x) = \Big[x\Big]_{x=0}^{1} = 1 - 0 = 1,$$

which is the required proof.

Example 2: Consider Example 1. Hence,

$$f(x) = \int_{x=0}^{1} \left[P(k) + P(k')\right] dx = \frac{3}{4} + \frac{1}{4} = 1.$$

Theorem 3: Mutually Exclusive Events: Let $s$ be the sample space of a total population $N$ containing mutually exclusive events (groups) $E_i$, $i = 1, 2, \dots \in \mathbb{Z}$. Then the possibility of $k_i$ out of any $E_i$ happening is

$$x = \frac{{}^{E_i}C_{k_i}}{{}^{N}C_{s}},$$

where ${}^{E_i}C_{k_i}$ is the combination of $E_i$ taking $k_i$ expectation, and ${}^{N}C_{s}$ is the combination of the population $N$ taking the sample $s$, representing total possibility, and $P(k) = 2x - x^2$ as usual.
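The possibility defined in Theorem 3 can be evaluated exactly with rational arithmetic. The sketch below is illustrative only: the function names `possibility` and `meridian` and the values N = 12, s = 6, E_i = 4, k_i = 2 are assumptions chosen here (with k_i < E_i < s < N), not figures from the source.

```python
from fractions import Fraction
from math import comb


def possibility(E_i, k_i, N, s):
    """x = C(E_i, k_i) / C(N, s), the possibility in Theorem 3."""
    return Fraction(comb(E_i, k_i), comb(N, s))


def meridian(x):
    """Return (P(k), P(k')) = (2x - x^2, (1 - x)^2) for a possibility x."""
    return 2 * x - x ** 2, (1 - x) ** 2


# Hypothetical group of 4 with 2 expected, in a sample of 6 from 12.
x = possibility(E_i=4, k_i=2, N=12, s=6)
p_k, p_not_k = meridian(x)

print("x      =", x)
print("P(k)   =", p_k)
print("P(k')  =", p_not_k)
print("sum    =", p_k + p_not_k)
```

Using `Fraction` avoids rounding, so the check that the two probabilities sum to exactly 1 is not affected by floating-point error.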
Proof: We are to show that $P(k) = 2x - x^2$, where

$$x = \frac{{}^{E_i}C_{k_i}}{{}^{N}C_{s}},$$

with ${}^{E_i}C_{k_i}$ the combination of $E_i$ taking $k_i$ expectation, and ${}^{N}C_{s}$ the combination of the population $N$ taking the sample $s$, representing total possibility. Let $s$ be a sample out of a finite population $N$, and let us expect a finite $k$ out of a finite group $E_i$. Hence $0 < k < E_i < s < N$, $i = 1, 2, \dots$. Clearly,

$$0 < \frac{{}^{E_i}C_{k_i}}{{}^{N}C_{s}} < 1, \quad \text{with } x = \frac{{}^{E_i}C_{k_i}}{{}^{N}C_{s}}.$$

It follows then that $0 < P(k) = 2x - x^2 < 1$, as proof.

Theorem 4: For mutually exclusive groups $E_i$ of a sample space $s$ and respective expectations $k_i$, the possibility of all the $k_i$ after $T$ trials is

$$x = \sum \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}, \quad i = 1, 2, \dots,$$

with $P(k) = 2x - x^2$ holding.

Proof: ${}^{TN}C_{s}$ is the number of equal parts into which $R = 1$ is graduated for the probabilistic sphere. Hence, expectations $k_1, k_2, \dots, k_i$ from the respective exclusive groups $E_1, E_2, \dots, E_i$ after $T$ times imply $TE_1, TE_2, \dots, TE_i$ respective trials for each group. Hence,

$$x_1 = \frac{{}^{TE_1}C_{k_1}}{{}^{TN}C_{s}}, \quad x_2 = \frac{{}^{TE_2}C_{k_2}}{{}^{TN}C_{s}}, \quad \dots, \quad x_i = \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}.$$

Hence,

$$x = x_1 + x_2 + \dots + x_i = \frac{{}^{TE_1}C_{k_1}}{{}^{TN}C_{s}} + \frac{{}^{TE_2}C_{k_2}}{{}^{TN}C_{s}} + \dots + \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}} = \sum \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}, \quad i = 1, 2, \dots,$$

with $P(k) = 2x - x^2$ holding.

Theorem 5: Independent Events: Let $T$ be the number of trials for some expectation $k$, given independent groups of a sample space $s$ out of some population $N$; then the possibility of $k$ being in $E_i$ of $s$ is

$$x = \frac{{}^{TE_i}C_{k}}{{}^{TN}C_{s}},$$

with $P(k) = 2x - x^2$ following, where ${}^{TE_i}C_{k}$ is the possibility of $k$ happening from $E_i$ and ${}^{TN}C_{s}$ is the total possibility of sampling $N$ up to $T$ times.

Proof: Let $N$, $s$ be the population and sample respectively; then $T$ trials increase $N$ to $TN$, so that the combination of the possible sample $s$ becomes ${}^{TN}C_{s}$. Thus $R = 1$ is graduated into ${}^{TN}C_{s}$ equal parts as total possibility, so that $0 < {}^{TE_i}C_{k} < {}^{TN}C_{s}$. Dividing through by ${}^{TN}C_{s}$,

$$0 < \frac{{}^{TE_i}C_{k}}{{}^{TN}C_{s}} < 1, \quad \text{or} \quad 0 < x < 1.$$

It follows then that $P(k) = 2x - x^2$.

Definition 7: Events not totally exclusive: Let the groups out of the sample space $s$ not be totally exclusive, i.e. they have common elements; then let $E_1 \cap E_2 = E_n$, a non-empty set; then $E_1$, $E_2$ are not exclusive.

Theorem 6: For any groups $E_1$ and $E_2$ in a sample space $s$ such that $E_1 \cap E_2 = E_n$, a non-empty set, the possibility of $k$ happening out of $E_1 \cap E_2 = E_n$ after $T$ trials with population $N$ is

$$x = \frac{{}^{TE_n}C_{k}}{{}^{TN}C_{s}},$$

with $P(k) = 2x - x^2$ following.

Proof: Let $E_1 \cap E_2 = E_n$, a non-empty set, be finite and identifiable. Hence, let $E_n < s$, $TN$, so that any $k$ in $E_n$ is subject to the same rule as for any finite $E_i$. Hence Theorem 1 follows; or

$$x = \frac{{}^{TE_n}C_{k}}{{}^{TN}C_{s}},$$

with $P(k) = 2x - x^2$ following.

Definition 8: Meridian Probability as statistic: The Meridian Probability is a statistical measure like any other, so that the Mean Probability of Occurrence of all $k_i$ (i.e. $\mu_i$) and the Mean Probability of Non-occurrence of all $k_i$ (i.e. $\mu_i'$) are defined respectively as

$$\mu_i = \frac{\sum P(k_i)}{i} \quad \text{and} \quad \mu_i' = \frac{\sum P(k_i')}{i},$$

where $i = 1, 2, \dots$ is the number of observations.

Theorem 7: Let $x$, $0 < x < 1$, be the possibility of some $k_i$ of an event $E_i$ of a sample space $s$ out of a population $N$ happening, corresponding with the perpendicular distance from the arbitrary north (or south) pole of the probabilistic sphere; then the probability of all $k_i$ ($i = 1, 2, \dots$) actually happening is equal to the mean probability of occurrence of $k_1, k_2, \dots, k_i$, where $P(k) = 2x - x^2$ and the probability of $k_i$ not happening is $P(k') = (1 - x)^2$.

Proof: We are to prove that $0 < \mu_i < 1$ and $0 < \mu_i' < 1$ for all $0 < x < 1$, $i = 1, 2, \dots$, given some possibilities for all the $k_i$ such that all the respective $x_i$ satisfy $0 < x_i < 1$. Now,

$$\mu_i = \frac{\sum P(k_i)}{i} = \frac{P(k_1) + P(k_2) + \dots + P(k_i)}{i},$$

and since, for $T$ trials,

$$x_1 = \frac{{}^{TE_1}C_{k_1}}{{}^{TN}C_{s}}, \quad x_2 = \frac{{}^{TE_2}C_{k_2}}{{}^{TN}C_{s}}, \quad \dots, \quad x_i = \frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}},$$

each $i$ generates

$$0 < P(k_i) = 2\left(\frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}\right) - \left(\frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}\right)^2 < 1,$$

and it implies that any sum greater than 1 is again divided by $i$, so that

$$0 < \frac{\sum P(k_i)}{i} = \frac{1}{i}\sum\left[2\left(\frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}\right) - \left(\frac{{}^{TE_i}C_{k_i}}{{}^{TN}C_{s}}\right)^2\right] < 1.$$

Hence $0 < \mu_i < 1$, and similarly $0 < \mu_i' < 1$.

Example: A box contains 3 black, 4 red and 5 yellow balls. One ball is picked.
What is the probability that the ball is black or red or yellow?

Solution: By the meridian probability theorem, since $N = 3 + 4 + 5 = 12$:

For black, with $E_b = 3$, $x_b = 3/12 = 1/4$; hence $P(k_b) = 2\left(\frac{1}{4}\right) - \left(\frac{1}{4}\right)^2 = \frac{7}{16}$.

For red, with $E_r = 4$, $x_r = 4/12 = 1/3$; hence $P(k_r) = 2\left(\frac{1}{3}\right) - \left(\frac{1}{3}\right)^2 = \frac{5}{9}$.

For yellow, with $E_y = 5$, $x_y = 5/12$; hence $P(k_y) = 2\left(\frac{5}{12}\right) - \left(\frac{5}{12}\right)^2 = \frac{95}{144}$.

By Theorem 7, we have been asked to find the mean probability of $k$ being $b$, $r$ or $y$ on the index $i = 3$, which must conform with the range $0 < \mu_i < 1$. Now,

$$0 < \mu_i = \frac{\sum P(k_i)}{i} = \frac{P(k_1) + P(k_2) + P(k_3)}{3} = \frac{\frac{7}{16} + \frac{5}{9} + \frac{95}{144}}{3} = \frac{34272}{62208} = \frac{119}{216} < 1,$$

which satisfies the framework.

Theorem 8: The Mean Probability Function: Let $P(k)$ be a statistic, so that

$$\mu_i = \frac{\sum P(k_i)}{i} \quad \text{and} \quad \mu_i' = \frac{\sum P(k_i')}{i}, \quad i = 1, 2, \dots;$$

then the Mean Meridian Probability Function is $F(x) = \mu_i + \mu_i'$.

Proof: We are to prove that $F(x) = 1$, in line with traditions. We have already demonstrated in Theorem 7 that $0 < \mu_i < 1$. With $\mu_i = \frac{\sum P(k_i)}{i}$ and $\mu_i' = \frac{\sum P(k_i')}{i}$,

$$F(x) = \frac{1}{i}\left[P(k_1) + P(k_1') + P(k_2) + P(k_2') + \dots + P(k_i) + P(k_i')\right].$$

Hence,

$$F(x) = \frac{\sum P(k_i)}{i} + \frac{\sum P(k_i')}{i} = \frac{\sum \left[P(k_i) + P(k_i')\right]}{i} = \frac{i}{i} = 1,$$

since each $P(k_i) + P(k_i') = 1$ and there are $i$ such sums. Proved.

Remark: Thus infinitely many trials $i$ could be made, but the mean probability function would still work out within the framework of summing to unity.

References

Atovigba, G.M.V. (2003). The Sphere: A Quadratic Approach. Makurdi, Nigeria: Gee Tigons.

Mendenhall, W. (1975). Introduction to Probability and Statistics (4th ed.). North Scituate, Mass.: Duxbury Press.