Dependence and its Measuring

Dr Boyan Dimitrov, Kettering University, Mathematics Department
Abstract. Dependence in the world of uncertainty is a complex concept, and textbooks tend to avoid any discussion of it. In the classical approach, conditional probability is used to derive further rules for operations with probabilities. We use it to establish a concept of dependence proposed about 50 years ago, and to show ways of interpreting and measuring it when two random events A and B are dependent. We then apply it to some examples to illustrate how suitable this approach is for the study of local dependence.
1. Introduction
What I intend to tell you here you cannot find in any contemporary textbook. I am not sure you could find it even in older textbooks on Probability and Statistics. But I read it more than 40 years ago in the Bulgarian textbook on Probability written by the great Bulgarian mathematician Nikola Obreshkov (1963). Over the years I have seen lots of sources of different kinds, and never met it in other textbooks or monographs. There is not a single word about these basics even in the well-known Encyclopedia of Statistical Sciences, published more than 20 years later.
Not long ago I started working on measures of dependence between random variables (r.v.'s). I found that researchers and practitioners develop a variety of approaches to model dependence between r.v.'s (copulas, regressions, correlations, and indexes of various kinds). In these they use specific tools based on knowledge well above the basics, and out of the reach of an ordinary student. And there has never been an attempt to touch the base where Obreshkov felt the things but left them just in the mist. So I decided to start from there, the zero point. How far can I get? I cannot guess. I trust my incentives and throw myself bravely into the challenge. Hopefully, this is not due to my ignorance and lack of information. Forgive me if I rediscover the wheel. I truly hope to find followers among the readers of this article.
To me the things look natural and simple. All the prerequisites are: what the probability of a random event is, when we have dependence, what the conditional probability of a random event is when another event occurs, and several basic rules for calculating probabilities related to pairs of events. Of course, some interpretations of the facts which the probability contains are also helpful.
When necessary, we will repeat here the most needed facts regarding probability rules for two random events. Those who are familiar with more than an introductory course in Probability and Statistics will find well-known things here, and let us not blame the authors of the textbooks they have used for the gaps we fill in. For beginners, let what is written here be a challenge to their wish to get deeper into the essence of the concept of dependence. We would like the examples discussed here to be used for further ideas of applications, and to illustrate the rich opportunities for the use of this approach in practice.
2. Dependent events. Connection between random events
Let A and B be two arbitrary random events. It is well known that A and B are independent only when the probability of their joint occurrence is equal to the product of the probabilities of their individual appearance, i.e. when

P(A ∩ B) = P(A)·P(B).        (1)
The readers familiar with the basics of probability theory know that independence is equivalent to the fact that the conditional probability of one of the events, given that the other event occurred, does not change and remains equal to its original, unconditional probability, i.e. in such cases

P(A | B) = P(A).        (2)
The only inconvenience in equation (2) as a definition of independence is that it requires P(B) > 0, i.e. B has to be a possible event. Otherwise, the conditional probability

P(A | B) = P(A ∩ B) / P(B)        (3)

is not defined, since on the right-hand side of (3) there would be an improper division by zero. At the same time, even if P(B) = 0 (then B is called an impossible, or zero, event), equation (1) is fulfilled. This is due to the fact that the inclusion A ∩ B ⊆ B implies the equality 0 = 0, since 0 ≤ P(A ∩ B) ≤ P(B) = 0, whatever the probability of the event A is. By the way, the identity in equation (1) also holds when P(B) = 1, i.e. when B is a sure event. Then it is true that

P(A ∩ B) = P(A) + P(B) − P(A ∪ B) = P(A) + 1 − 1 = P(A).

Therefore, in (1) the equality sign is guaranteed. This means that a zero event, as well as a sure event, is independent of any other event, including themselves.
The most important fact is that when equality in (1) does not hold, the events A and B are dependent.
Dependence in the world of uncertainty is a complex concept. Too little has been done to explain its essentials, and the textbooks avoid any discussion in this regard. In the classical approach to probability, with a finite number of equally likely elementary outcomes, equation (3) is used to determine the conditional probabilities, and this definition is a base for deriving further rules for operations with probabilities. We now establish what the concept of dependence is, and what the ways of interpreting and measuring it are when A and B are dependent events. In addition, we naturally assume that neither of these events is a zero or a sure event. The terminology used is the one proposed by Obreshkov (1963).
Definition 1. The number

δ(A, B) = P(A ∩ B) − P(A)·P(B)        (4)

is called the connection between the events A and B.
We immediately derive the following properties of the connection between two random events.
δ1) The connection δ(A, B) between two random events equals zero if and only if these events are independent. This includes the cases when one of the events is a zero or a sure event.
δ2) The connection between the events A and B is symmetric, i.e. δ(A, B) = δ(B, A).
δ3) If A₁, A₂, …, A_j, … are mutually exclusive events, then
δ(A₁ ∪ A₂ ∪ …, B) = δ(A₁, B) + δ(A₂, B) + …,
i.e. the function δ(A, B) is additive with respect to either of its arguments. Therefore, it is also a continuous function, as the probabilities in its construction are.
δ4) It is true that δ(A ∪ C, B) = δ(A, B) + δ(C, B) − δ(A ∩ C, B), and most of the properties of the probability function for random events carry over to the connection function.
δ5) The connection between the events A and B̄ (the complement of the event B) is equal in magnitude to the connection between the events A and B, but has the opposite sign, i.e. δ(A, B̄) = −δ(A, B). Indeed, from P(A) = P(A ∩ B) + P(A ∩ B̄) we obtain
δ(A, B̄) = P(A ∩ B̄) − P(A)·P(B̄) = [P(A) − P(A ∩ B)] − P(A)[1 − P(B)] = −P(A ∩ B) + P(A)·P(B) = −δ(A, B).
The connection between the complementary events Ā and B̄ is the same as the one between A and B, i.e. δ(Ā, B̄) = δ(A, B). This property follows immediately after a double application of the complement rule just proven (together with the symmetry δ2)).
δ6) If the occurrence of A implies the occurrence of B, i.e. when A ⊆ B, then δ(A, B) = P(A)[1 − P(B)] = P(A)·P(B̄), and the connection between the events A and B is positive. The two events are then called positively associated.
δ7) When A and B are mutually exclusive, i.e. when A ∩ B = ∅, then δ(A, B) = −P(A)·P(B), and the connection between the events A and B is negative.
δ8) When δ(A, B) > 0, the occurrence of one of the two events increases the probability (that is, the conditional probability) of the occurrence of the other event. The following relation is true:

P(A | B) = P(A) + δ(A, B) / P(B).        (5)

By making use of (3), according to the multiplication rule for probabilities we get the equality P(A ∩ B) = P(A | B)·P(B), and by substituting it in (4) we obtain one more representation of the connection between the two events A and B:

δ(A, B) = [P(A | B) − P(A)]·P(B).        (6)

This equation, solved with respect to P(A | B), gives relation (5). Let us note specifically that δ(A, B) ≠ 0 is a guarantee that P(A)·P(B) > 0.
Property δ8) can be reversed for the cases when δ(A, B) < 0 (then we also have P(A)·P(B) > 0), with the conclusion that if the connection is negative, the occurrence of one of the events decreases the chances for the other one to occur. Equation (5) remains true. It also indicates that knowledge of the connection is very important, and can be used for calculation of the posterior probabilities, similar to what people customarily do when applying the Bayes rule. In our case we do not really need to know any complete system of hypotheses, which is required in order to apply the Bayes rule. It is sufficient to know only the numeric value of the connection δ(A, B) between the two events, and their prior probabilities, in order to evaluate exactly the posterior probability of either of the two events when we know that the other one has occurred. We anticipate most applications of the measures of dependence considered here to be for similar purposes.
We will call the events A and B positively associated when δ(A, B) > 0, and negatively associated when δ(A, B) < 0. The reason for this is relationship (5) (we connect it to the increase or decrease of the conditional probability of the occurrence of one of the events when the other one occurs), as well as the analogy with similar situations concerning random variables.
δ11) The connection between any two events A and B satisfies the inequalities
max{−P(A)·P(B), −[1 − P(A)][1 − P(B)]} ≤ δ(A, B) ≤ min{P(A)[1 − P(B)], [1 − P(A)]·P(B)}.
We call these the Fréchet-Hoeffding inequalities. They also indicate that the value of the connection, as a measure of dependence, is between −¼ and +¼.
Example 1. There are 1000 observations on the stock market, and it is found that in 80 cases there was a significant increase in the oil prices (event A). For the same observations it is registered that there was a significant increase of the income at the Money Market (event B) in 50 cases. A simultaneous significant increase in both investments (event A ∩ B) is observed on 20 occasions. Let us determine the connection between the two events, and see how the information about the occurrence of one of these events can help to make a forecast for the appearance of the other event.
According to the frequency estimation of the probabilities, we have
P(A) = 80/1000 = .08;  P(B) = 50/1000 = .05;  and  P(A ∩ B) = 20/1000 = .02.
In accordance with Definition 1 we get
δ(A, B) = .02 − (.08)(.05) = .016.
Therefore, by equation (5) we find that if it is known that there is a significant increase in the investments in the money market, then the probability to see also a significant increase in the oil price is
P(A | B) = .08 + (.016)/(.05) = .4.
Analogously, if we have the information that there is a significant increase in the oil prices on the market, then the chances to get also significant gains in the money market on the same day will be estimated as follows:
P(B | A) = .05 + (.016)/(.08) = .25.
It is understandable that these numbers can also be obtained if one uses formula (3), when P(A ∩ B) is known. In our case we assume that we know only the numerical value of the connection δ(A, B) and the individual prior probabilities P(A) and P(B). And namely the knowledge of these numbers seems much more natural in real life and in practical use, as well as when one wants to model dependence between random events for other purposes.
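The calculation in Example 1 is easy to script. The following minimal sketch is our own illustration (the function names are not from any referenced source) and simply reproduces the numbers above from the raw frequencies:

```python
def connection(p_a, p_b, p_ab):
    """Connection delta(A, B) = P(A and B) - P(A)P(B), Definition 1."""
    return p_ab - p_a * p_b

def posterior(p_x, p_other, delta):
    """Posterior P(X | other event) = P(X) + delta / P(other event), equation (5)."""
    return p_x + delta / p_other

# Frequencies of Example 1
N, k_a, k_b, k_ab = 1000, 80, 50, 20
p_a, p_b, p_ab = k_a / N, k_b / N, k_ab / N

d = connection(p_a, p_b, p_ab)         # 0.016
print(posterior(p_a, p_b, d))          # P(A | B) = 0.4
print(posterior(p_b, p_a, d))          # P(B | A) = 0.25
```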
Remark 1. For those who know more about the science of uncertainty, for instance what a random variable (r.v.) is and what an expected value (mathematical expectation) is, we note the following: if we introduce the r.v.'s which are the indicators of the considered random events, i.e. I_A = 1 when the event A occurs and I_A = 0 when the complementary event Ā occurs, then E(I_A) = P(A) and
Cov(I_A, I_B) = E(I_A·I_B) − E(I_A)·E(I_B) = E(I_{A∩B}) − P(A)·P(B) = P(A ∩ B) − P(A)·P(B) = δ(A, B).
Therefore, the connection between two random events equals the covariance between their indicators.
Comment: Similar to the covariance between two r.v.'s, the numerical value of the connection δ(A, B) does not speak clearly about the magnitude of this connection between A and B. It is intuitively clear that the deepest connection should be between two coinciding events, i.e. the strongest connection must hold when A = B. In such cases we have P(A) = P(B), and also
δ(A, B) = P(A) − P²(A).
Let us see some numbers. Assume that A = B, and P(A) = P(B) = .05. Then we find δ(A, B) = δ(A, A) = .05 − .0025 = .0475, i.e. the connection of the event A with itself has the very low value .0475. Moreover, the value of the connection varies together with the probability of the event A.
Let us look at another example where P(A) = .3, P(B) = .4, but A may occur with B as well as with B̄, and P(A | B) = .6. Then, according to (6), we obtain δ(A, B) = (.6 − .3)(.4) = .12. The value of this connection is about 2.5 times stronger than the previously considered one, despite the fact that in the first case the occurrence of B guarantees the occurrence of A.
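To see numerically how the connection of an event with itself varies with its probability (as noted above), one can tabulate δ(A, A) = P(A) − P²(A) for several values of P(A). This is only an illustrative sketch of ours:

```python
# delta(A, A) = P(A) - P(A)^2 depends strongly on P(A); it peaks at P(A) = 1/2.
for p in (0.05, 0.1, 0.3, 0.5, 0.9):
    print(p, round(p - p * p, 4))
# 0.05 -> 0.0475, 0.1 -> 0.09, 0.3 -> 0.21, 0.5 -> 0.25, 0.9 -> 0.09
```

The maximum possible value .25 at P(A) = ½ agrees with the bounds of property δ11) and with the corollary proven in Section 3.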
For this reason other measures of the strength of the dependence between two random events are introduced. In this way more opportunities for penetration into the complex concept of dependence are offered.
3. Regression coefficients as measures of dependence between random events
We start with an explanation and interpretation of the probability P(A) and of the conditional probability P(A | B).
There is a general concept of the probability P(A): it is the measure of the chances of the random event A to occur in a single experiment. When this experiment is, by assumption, repeatable many times, P(A) is approximately equal to the proportion of those experiments where the event A occurs, relative to all performed experiments.
Analogous is the interpretation of the conditional probability P(A | B). In repeatable experiments it is approximately equal to the proportion of those experiments where both events A and B occur simultaneously, relative to all counts where the event B occurred. In other words, the conditional probability P(A | B) is the conditional measure of the chances for the event A to occur when it is already known that the other event B has occurred.
When B is a zero event, then P(B) = 0, and the conditional probability by rule (3) cannot be defined, and is usually considered as undefined. It is convenient, and we will accept once and forever, that in such cases P(A | B) = P(A), since A and B are independent events according to the fulfillment of identity (1). We also have P(A | B) = P(A) when the event B is a sure event, i.e. when P(B) = 1.
With this agreement and these interpretations of the conditional probability we introduce the next measure of the dependence of the event A on the event B.
Definition 2. The regression coefficient r_B(A) of the event A with respect to the event B is the difference between the conditional probability for the event A to occur given the event B, and the conditional probability for the event A to occur given the complementary event B̄ of the event B, namely

r_B(A) = P(A | B) − P(A | B̄).        (7)

We immediately notice that, according to our convention above, the regression coefficient r_B(A) is always defined, for any pair of events A and B (zero, sure, or arbitrary random).
Analogously is defined the regression coefficient r_A(B) of the event B with respect to the event A, namely

r_A(B) = P(B | A) − P(B | Ā).        (8)
Establishing the properties of the two regression coefficients will show what else these measures contain in regard to the dependence between the two events A and B. The following statements hold:
(r1) The equality r_B(A) = r_A(B) = 0 takes place if and only if the two events are independent.
Proof. This statement obviously holds when one of the events (assume that this is B) is a zero or a sure event. Hence, let us consider the case 0 < P(B) < 1. Then we also have 0 < P(B̄) < 1.
By representing the two conditional probabilities in (7) according to (3), and carrying out the following chain of equivalent equalities, we get
r_B(A) = P(A ∩ B)/P(B) − P(A ∩ B̄)/P(B̄) = [P(A ∩ B)·P(B̄) − P(A ∩ B̄)·P(B)] / [P(B)·P(B̄)]
= {P(A ∩ B)[1 − P(B)] − P(A ∩ B̄)·P(B)} / [P(B)·P(B̄)] = {P(A ∩ B) − [P(A ∩ B) + P(A ∩ B̄)]·P(B)} / [P(B)·P(B̄)].
The expression in the square brackets in the numerator of the last fraction equals P(A ∩ B) + P(A ∩ B̄) = P(A). After we take into account the definition of the connection δ(A, B) between the two events, from the last fraction we find that r_B(A) and δ(A, B) are related by the identity

r_B(A) = δ(A, B) / {P(B)[1 − P(B)]},  and analogously,  r_A(B) = δ(A, B) / {P(A)[1 − P(A)]}.        (9)

Therefore, r_B(A) = r_A(B) = 0 only when δ(A, B) = 0. According to property δ1) of the connection δ(A, B), this equality to zero is fulfilled only when the events A and B are independent.
In order to avoid continuous references to the extreme situations in the proofs further on, from now on we will consider only the general situation when neither of the events A and B is a zero or a sure event. However, all the statements that follow remain true for those situations too, due to our agreements above.
(r2) The regression coefficients r_B(A) and r_A(B) are numbers with equal signs, and this is the sign of their connection δ(A, B). However, their numerical values are not always equal. For the equality r_B(A) = r_A(B) to be valid, it is necessary and sufficient that P(A)[1 − P(A)] = P(B)[1 − P(B)].
Proof. It follows from (9) that the connection δ(A, B) and the two regression coefficients are also related by the equalities

δ(A, B) = r_B(A)·P(B)[1 − P(B)] = r_A(B)·P(A)[1 − P(A)].        (10)

From these, statement (r2) follows.
(r3) The regression coefficients r_B(A) and r_A(B) are numbers between −1 and 1, i.e. they satisfy the inequalities
−1 ≤ r_B(A) ≤ 1;  −1 ≤ r_A(B) ≤ 1.
(r3.1) The equality r_B(A) = 1 holds only when the random event A coincides (is equivalent) with the event B. Then the equality r_A(B) = 1 is also valid.
(r3.2) The equality r_B(A) = −1 holds only when the random event A coincides (is equivalent) with the event B̄, the complement of the event B. Then the equality r_A(B) = −1 is also valid, and respectively Ā = B.
Proof. The inequalities in property (r3) follow from equations (7) and (8), the very definition of the regression coefficients. These are differences of two probabilities, and each probability is a number between 0 and 1. The extreme values of the differences are pointed out in (r3.1) and (r3.2).
Assume that we have r_B(A) = 1. Then, according to (7), it means that P(A | B) = 1 and P(A | B̄) = 0. But these equalities are equivalent to the identities P(A ∩ B) = P(B) and P(A ∩ B̄) = 0. From these it follows that
P(A) = P(A ∩ B) + P(A ∩ B̄) = P(A ∩ B) = P(B).
Therefore, in this case the event A is equivalent to A ∩ B, and it is equivalent to B. From this fact it follows that the events A and B are equivalent, since the relation of equivalence is transitive.
Conversely, when A is equivalent to the event B (usually written as A = B), then we have P(A | B) = 1 and P(A | B̄) = 0, and according to (7) we get r_B(A) = 1. Property (r3.1) is proven.
Analogous considerations apply in the case r_B(A) = −1. According to (7) it means that P(A | B) = 0 and P(A | B̄) = 1. From these equations, by following the way shown above, we arrive at the conclusion that the events A and B̄ are equivalent. Therefore, property (r3) is proven.
(r4) It is fulfilled that r_B̄(A) = −r_B(A), as well as r_B(Ā) = −r_B(A), and hence also r_B̄(Ā) = r_B(A). The analogous identities hold for r_A(B).
Proof. The first equality is obvious from definition (7) applied to B̄, since the complement of B̄ is the event B. The second equality is a consequence of (7) and of the equalities P(Ā | B) = 1 − P(A | B) and P(Ā | B̄) = 1 − P(A | B̄). In this way we get
r_B(Ā) = P(Ā | B) − P(Ā | B̄) = 1 − P(A | B) − [1 − P(A | B̄)] = −r_B(A).
(r5) For any sequence of mutually exclusive events A₁, A₂, … it is fulfilled that
r_B(A₁ ∪ A₂ ∪ …) = r_B(A₁) + r_B(A₂) + … .
(r6) The regression coefficient possesses the property
r_B(A ∪ C) = r_B(A) + r_B(C) − r_B(A ∩ C).
The two properties (r5) and (r6) are a simple transfer of the respective properties of the conditional probabilities P(A | B) and P(A | B̄), plus some easy algebraic manipulations with the explicit expressions on both sides of the written equations. We omit the details here.
We will dare to interpret the properties (r3) of the regression coefficients in the following way: the closer the numerical value of r_B(A) is to 1, the "denser within each other" are the events A and B, considered as sets of outcomes of the experiment. In a similar way we interpret values of the regression coefficient r_B(A) close to −1: then the "denser within each other" are the events A and B̄ (the complement of the event B), considered as sets of outcomes of the experiment. By the way, we do not forget that when r_B(A) = 1, then also r_A(B) = 1, and simultaneously we have r_B̄(A) = −1 = r_Ā(B).
Remark 2. Students, and some practitioners, frequently mix up the concepts of mutually exclusive events (the fact that A ∩ B = ∅, which means the impossibility of a simultaneous occurrence of the two events A and B) and mutually independent events, expressed by equations (1) or (2). For random events that are neither zero nor sure events, independence requires that A ∩ B ≠ ∅. For this reason the equalities r_B(A) = r_A(B) = δ(A, B) = 0, which are equivalent to independence between A and B, also indicate that A ∩ B ≠ ∅.
For completeness we will consider here some particular cases and specific forms of the connection δ(A, B) and of the regression coefficients r_B(A) and r_A(B), in view of the mutual location of the two random events A and B in the sample space Ω of all possible outcomes of an experiment. These mutual locations, without the case where B is inside A, are shown on the Venn diagrams of Fig. 1.

Fig. 1. Venn diagrams of the mutual locations: a. A is part of B (A ⊆ B); b. A and B are mutually exclusive (A ∩ B = ∅); c. General location (A ∩ B ≠ ∅).

Actually, all of the shown cases are particular, but those on Fig. 1a and Fig. 1b are considered as the specific particular cases.
For the case 1a, when A ⊆ B, we have A ∩ B = A, P(A | B) = P(A)/P(B), and P(A | B̄) = 0. We find accordingly
δ(A, B) = P(A)[1 − P(B)] = P(A)·P(B̄),  r_B(A) = P(A)/P(B),  r_A(B) = P(B̄)/P(Ā).
It is worth noticing that in all cases of A ⊆ B or B ⊆ A, the dependence measures between the two events (the connection, as well as both regression coefficients) are positive. The measures for the case B ⊆ A are symmetric to those for the case 1a, with an exchange of the roles of the events A and B.
For the case pictured on Fig. 1b we find that the following equalities hold:
δ(A, B) = −P(A)·P(B),  r_B(A) = −P(A)/P(B̄),  r_A(B) = −P(B)/P(Ā),
and all of these measures are simultaneously negative.
For the general case 1c one may get positive as well as negative measures of dependence. For example, if P(A) = P(B) = .5 and P(A ∩ B) = .3, then the connection and both regression coefficients are positive; if P(A) = P(B) = .5 and P(A ∩ B) = .1, all these measures are negative. The sign of the dependence could be interpreted as a trend in the dependence toward one of the extreme situations 1a or 1b.
If we combine the properties (r2) and (r3) of the regression coefficients, we obtain the following interesting statement:
Corollary. The numerical value of the connection δ(A, B) between two random events is always a number between −¼ and ¼, i.e. −1/4 ≤ δ(A, B) ≤ 1/4.
Proof. According to equations (10) and property (r3) of the regression coefficients, we always have δ(A, B) = r_B(A)·P(B)[1 − P(B)] ≤ P(B)[1 − P(B)]. The probability p = P(B) is always a number between 0 and 1, and the function g(p) = p(1 − p) reaches its maximum ¼ within p ∈ [0, 1] at p = ½. Substituting these values in the above inequality gives the inequality on the right-hand side of the corollary. The left-hand side inequality is obtained similarly, using the minimal value −1 of the regression coefficient r_B(A) instead of its maximal value.
It is also interesting that the equality signs in the corollary hold not only when the two random events A and B (respectively A and B̄) are equivalent; in addition it is necessary that P(B) = P(B̄) = 1/2. Therefore, the maximal connection is obtained when the two events coincide and their probability equals ½.
Example 1 (continued):
We calculate here the values of the two regression coefficients r_B(A) and r_A(B) according to the data of the example given above. We will use formulas (9). In this way we find:
The regression coefficient of the event A (a significant increase of the oil prices on the market) with respect to the event B (a significant increase in the Money Market return) has the numerical value
r_B(A) = (.016)/[(.05)(.95)] = .3368.
At the same time we have
r_A(B) = (.016)/[(.08)(.92)] = .2174.
One thing we immediately see: the measure of dependence of the event A with respect to the event B, expressed by the numeric value of the regression coefficient r_B(A), is about 1.5 times stronger than the strength of dependence of B with respect to A, shown by the numerical value of the regression coefficient r_A(B). There exists an obvious asymmetry in the dependence between random events.
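The two regression coefficients follow from the connection and the marginal probabilities by formula (9). A short sketch of ours with the Example 1 numbers:

```python
def regression_coefficient(delta, p_condition):
    """r_C(E) = delta(E, C) / [P(C) * (1 - P(C))], equation (9)."""
    return delta / (p_condition * (1.0 - p_condition))

p_a, p_b, delta = 0.08, 0.05, 0.016              # values from Example 1
r_b_of_a = regression_coefficient(delta, p_b)    # r_B(A) ≈ 0.3368
r_a_of_b = regression_coefficient(delta, p_a)    # r_A(B) ≈ 0.2174
print(round(r_b_of_a, 4), round(r_a_of_b, 4), round(r_b_of_a / r_a_of_b, 2))
# 0.3368  0.2174  1.55   (the asymmetry mentioned above)
```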
(r7) Fréchet-Hoeffding inequalities for the regression coefficients between two random events:

max{−P(A)/[1 − P(B)], −[1 − P(A)]/P(B)} ≤ r_B(A) ≤ min{P(A)/P(B), [1 − P(A)]/[1 − P(B)]};

max{−P(B)/[1 − P(A)], −[1 − P(B)]/P(A)} ≤ r_A(B) ≤ min{P(B)/P(A), [1 − P(B)]/[1 − P(A)]}.

The proofs are relatively simple consequences of property δ11) of the connection function. This last property is anticipated to be used in the simulation of dependent random events with a desired value of the regression coefficient and given marginal probabilities P(A) and P(B). The given inequalities show that some restrictions must be satisfied in this respect.
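The inequalities in (r7) can be checked mechanically before one attempts to simulate dependent events with prescribed marginals. A small feasibility-check sketch (our own, with hypothetical target values):

```python
def r_b_of_a_bounds(p_a, p_b):
    """Fréchet-Hoeffding bounds for r_B(A), property (r7); assumes 0 < P(B) < 1."""
    lower = max(-p_a / (1.0 - p_b), -(1.0 - p_a) / p_b)
    upper = min(p_a / p_b, (1.0 - p_a) / (1.0 - p_b))
    return lower, upper

def feasible(r, p_a, p_b):
    lo, hi = r_b_of_a_bounds(p_a, p_b)
    return lo <= r <= hi

print(r_b_of_a_bounds(0.08, 0.05))        # ≈ (-0.0842, 0.9684)
print(feasible(0.3368, 0.08, 0.05))       # True  (the Example 1 value)
print(feasible(-0.5, 0.08, 0.05))         # False (too negative for these marginals)
```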
Remark 3. It took some time to clarify for myself the true reason for the name "regression coefficient". It contains a hint to follow, and it needs some additional knowledge about random variables (r.v.'s), regression modeling, and the concepts of expectation and variance.
Let I_A(ω) and I_B(ω) be the indicator r.v.'s as introduced in Remark 1, where the argument ω is a symbol for an arbitrary outcome of the experiment. Formally, construct the following "regression model", which represents a possible linear relationship

I_B(ω) = α + β·I_A(ω) + ε(ω).        (11)

It allows one to "predict" the value of the indicator I_B(ω) if one knows just the value of the indicator I_A(ω), and admits an error ε(ω) = I_B(ω) − [α + β·I_A(ω)] in this prediction with the following desired properties: ε has zero expectation and minimum variance, i.e. the coefficients α and β are such numbers that

E[I_B(ω) − α − β·I_A(ω)] = 0,  and  Var[I_B(ω) − α − β·I_A(ω)] = min_{a,b} Var[I_B(ω) − a − b·I_A(ω)].        (12)

The first equation gives the relation α = P(B) − β·P(A). When this is substituted in the second equation of (12), it turns into a one-variable optimization problem. By applying the standard mathematical calculations and some algebra, we arrive at the conclusion that the value of β must minimize the expression
β²·P(A)[1 − P(A)] − 2β·[P(A ∩ B) − P(A)·P(B)] + P(B)[1 − P(B)].
Therefore, its value is

β = [P(A ∩ B) − P(A)·P(B)] / [P(A)·P(Ā)] = δ(A, B) / [P(A)·P(Ā)],        (13)

where δ(A, B) is the connection between the two events discussed in the previous section. Hence, the other "optimal" coefficient has the value

α = P(B) − δ(A, B) / P(Ā).        (14)
If one substitutes P(B) = P(A ∩ B) + P(B ∩ Ā) in the numerator of expression (13), and uses formula (3) for the conditional probabilities, then after some regrouping of terms in the numerator and a partitioning of the obtained expression into simple fractions, one will agree that the optimal value of the coefficient β has the following equivalent form of representation:
β = P(B | A) − P(B | Ā) = r_A(B).
Analogous manipulations with expression (14) will show that the coefficient α in equation (11) has the following equivalent representation:
α = P(B | Ā).
If in addition we take into account that the indicators of the complementary events are related by the equation I_Ā(ω) = 1 − I_A(ω), and use it in the main regression model (11), we will see that the same equation can be used for the "optimal prediction" of the values of I_B(ω) when the indicator r.v. I_Ā(ω) is used; namely, we get
I_B(ω) = P(B | Ā) + r_A(B)·I_A(ω) + ε = P(B | A) − r_A(B)·I_Ā(ω) + ε.
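The derivation above can be verified numerically: the least-squares slope of I_B on I_A equals r_A(B), and the intercept equals P(B | Ā). The following sketch (ours, not part of the original exposition) uses the joint distribution of the indicators implied by the Example 1 probabilities:

```python
p_a, p_b, p_ab = 0.08, 0.05, 0.02

# Joint distribution of the pair (I_A, I_B): the four points and their probabilities
points = {(1, 1): p_ab,
          (1, 0): p_a - p_ab,
          (0, 1): p_b - p_ab,
          (0, 0): 1 - p_a - p_b + p_ab}

e_a = sum(p * x for (x, _), p in points.items())            # E(I_A) = P(A)
e_b = sum(p * y for (_, y), p in points.items())            # E(I_B) = P(B)
cov = sum(p * x * y for (x, y), p in points.items()) - e_a * e_b
var_a = e_a * (1 - e_a)                                     # Var(I_A) = P(A)P(not A)

beta = cov / var_a                    # least-squares slope of (11)
alpha = e_b - beta * e_a              # least-squares intercept of (11)

r_a_of_b  = p_ab / p_a - (p_b - p_ab) / (1 - p_a)           # P(B|A) - P(B|not A)
p_b_not_a = (p_b - p_ab) / (1 - p_a)                        # P(B|not A)
print(round(beta, 4), round(r_a_of_b, 4))    # both ≈ 0.2174
print(round(alpha, 4), round(p_b_not_a, 4))  # both ≈ 0.0326
```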
With this Remark 3 we gave an explanation of the genealogy of the regression coefficients. It is possible that in a textbook this measure of dependence would be introduced not as in Definition 2, at the early stage with the probability concepts and rules, but much later, after distributions and expectations for random vectors are introduced. In our opinion, such a delay would lose the opportunity to offer an early discussion of dependence, and the challenge to offer the student an important gate to many small research studies on dependence.
The obtained regression equation gives us the opportunity to explain the meaning of the specific numerical values of the regression coefficient r_A(B). We need to think in terms of outcomes ω of the experiment (or in terms of individuals who carry the feature A, equivalent to a statement that ω favors the event A). Then P(B | Ā) is something like the "net value" of the indicator variable I_B(ω), and r_A(B) is the (positive or negative) contribution of any single outcome ω to the prediction of the value of I_B(ω) via the values of I_A(ω). The greater the value of r_A(B) is, the more the outcomes that favor the event A will contribute to (will also favor) the values of the event B. The asymmetry in this form of dependence of one event on the other can be explained by the capacity of one or the other event. Events with less capacity (a smaller amount of favorable outcomes) will have less influence on events with larger capacity. Therefore, when r_A(B) is less than r_B(A), it can be concluded that the event A is "more powerful" in its influence on B than the power of the event B in its influence on A. We should agree with this and accept it as reflecting what indeed exists in real life. At the same time, by catching the asymmetry with the proposed measures we are convinced of their flexibility and utility.
We guess that it is now possible to use this for the construction of some gradation with respect to the magnitude of the strength of dependence of one of the events on the other one, according to the distance of the regression coefficient from zero (where independence stays). For instance, if the value is within .05 of zero, the event could be classified as "almost independent" of the other; for distances between .05 and .2 from zero, the event may be classified as weakly dependent on the other; if the distance is between .2 and .45, the event could be classified as moderately dependent; from .45 to .8, as dependent on average; and above .8, as strongly dependent. Everybody will understand that this classification is pretty much conditional, and is made up by the author. However, it shows a possibility for the use of these coefficients.
What may be interesting here is that, despite the asymmetry, it is possible, when we fix one of the events, say B, and consider any finite sequence A₁, A₂, …, A_n of given random events, to order these events according to their "magnitude of influence on the event B", which corresponds to the inverse order of the absolute values of their regression coefficients with respect to the event B.
One last thing we would like to discuss here is how to use the known values of the regression coefficients to predict the posterior probabilities, e.g. P(B | A), when the prior (marginal) probabilities P(A) and P(B) are known. Assume that r_A(B) is known. Then equations (9) and (10) allow evaluating r_B(A), and therefore in the reverse case the rules will be symmetric to what we show here for the evaluation of the posterior (conditional) probability P(B | A).
From Definition 2, property δ5), and equations (10) we get the sequence of equivalent presentations

P(B | A) = r_A(B) + P(B | Ā) = r_A(B) + P(B) + δ(Ā, B)/P(Ā) = P(B) + r_A(B)·[1 − P(A)].

If we substitute in this last rule the calculated values r_A(B) = .2174, P(A) = .08 and P(B) = .05, we will get the same value P(B | A) = .05 + .2174·(.92) = .25.
4. Correlation between two random events
It is not known to me why this brilliant mathematician, a doctor of the French Sorbonne, Obreshkov, introduces the correlation between two random events as equal to the geometric average of the two regression coefficients r_B(A) and r_A(B), with the sign of either one. His definition is as follows:
Definition 3. The correlation coefficient between two events A and B is the number

R(A, B) = ±√[r_B(A)·r_A(B)],        (15)

whose sign, plus or minus, is the sign of either of the two regression coefficients.
If we use the representations (9) in equation (15), we immediately obtain an equivalent representation of the correlation coefficient R(A, B) in terms of the connection δ(A, B), namely

R(A, B) = δ(A, B) / √[P(A)·P(Ā)·P(B)·P(B̄)] = [P(A ∩ B) − P(A)·P(B)] / √[P(A)·P(Ā)·P(B)·P(B̄)].        (16)

We do not forget that neither of the events A or B is a zero or a sure event. However, if that happens, then δ(A, B) = r_B(A) = r_A(B) = R(A, B) = 0.
Remark 4. From representation (16) we understand that, in fact, the correlation coefficient R(A, B) between the events A and B is equal to the correlation coefficient ρ(I_A, I_B) between the random variables I_A and I_B (the indicators of the two random events A and B, exactly as these are defined in Remark 1). To see this, one needs the concept of the correlation coefficient between two random variables, which is based on the concepts of a random variable (r.v.) X, of probability distribution, of mathematical expectation E(X) and variance D(X) of a r.v., of two-dimensional distributions, and of the related expected values of functions of random variables. Those who have this knowledge should be familiar with the formula

ρ(X, Y) = Cov(X, Y) / √[D(X)·D(Y)] = [E(XY) − E(X)·E(Y)] / √[D(X)·D(Y)],

which defines the correlation coefficient between two r.v.'s X and Y. When X = I_A and Y = I_B, it is true that
E(XY) = E(I_A·I_B) = P(A ∩ B);  E(I_A) = P(A);  D(I_A) = P(A)[1 − P(A)] = P(A)·P(Ā),
and ultimately it will be obtained that ρ(I_A, I_B) = R(A, B).
It is conceivable to use this approach as the definition of the correlation coefficient between two random events, namely as equal to the correlation between their indicators. But such an approach would require the introduction of all the complex concepts listed above (including integration of functions of r.v.'s and more), and would not be available at the basics of Probability Theory. For this reason, we personally admire the approach proposed by Obreshkov.
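A quick numerical confirmation of Remark 4 and of Definition 3, again with the Example 1 probabilities (a sketch of ours): the geometric-average definition (15) and the indicator-correlation representation (16) give the same number.

```python
import math

p_a, p_b, p_ab = 0.08, 0.05, 0.02
delta = p_ab - p_a * p_b

r_b_of_a = delta / (p_b * (1 - p_b))          # equation (9)
r_a_of_b = delta / (p_a * (1 - p_a))

sign = 1 if delta >= 0 else -1
R_def3 = sign * math.sqrt(r_b_of_a * r_a_of_b)                 # Definition 3
R_eq16 = delta / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))  # equation (16)

print(round(R_def3, 4), round(R_eq16, 4))     # both ≈ 0.2706
```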
Let us now discuss the properties of R(A, B) and see what the knowledge of the correlation coefficient between two random events gives.
R1. It is fulfilled that R(A, B) = 0 if and only if the two events A and B are independent.
Proof. First, let us note that equation (16) seems useless when P(A) = 0 or P(B) = 0 (i.e. when one of the events is a zero event), or when P(A) = 1 or P(B) = 1 (i.e. when one of the events is a sure event). But in such situations A and B are independent. According to our earlier agreements we have δ(A, B) = r_B(A) = r_A(B) = 0, and also R(A, B) = 0. However, if R(A, B) = 0 on other occasions (when P(A) ∈ (0, 1) and P(B) ∈ (0, 1)), then the zero correlation coefficient holds only when also δ(A, B) = 0, i.e. when A and B are independent.
R2. The correlation coefficient R(A, B) is always a number between −1 and +1, i.e. −1 ≤ R(A, B) ≤ 1.
R2.1. The equality R(A, B) = 1 holds if and only if the events A and B are equivalent, i.e. when A = B.
R2.2. The equality R(A, B) = −1 holds if and only if the events A and B̄ are equivalent, i.e. when A = B̄ (then, of course, it also holds that Ā = B).
Proof. The assertions here follow directly from Definition 3 and from properties (r3) of the regression coefficients r_B(A) and r_A(B).
With some more work these properties can be derived from equation (16), if this equality is used as a direct definition of the correlation coefficient between two random events and nothing is known about the regression coefficients. We leave this challenge to the readers.
R3. The correlation coefficient R(A, B) has the same sign as the other measures of dependence between the two random events A and B (and this is the sign of the connection δ(A, B), as well as the sign of the two regression coefficients r_B(A) and r_A(B)). The knowledge of R(A, B) allows calculating the posterior probability of one of the events under the condition that the other one has occurred. For instance, P(B | A) will be determined by the rule

P(B | A) = P(B) + R(A, B)·√[P(Ā)·P(B)·P(B̄) / P(A)].        (17)

Proof. The first part of the statement follows from relationship (16) and from equations (9). To prove (17) we use equation (16) and the presentation
δ(A, B) = [P(B | A) − P(B)]·P(A)
in it. This leads to the relation
R(A, B) = [P(B | A) − P(B)]·P(A) / √[P(A)·P(Ā)·P(B)·P(B̄)].
When this equation is solved with respect to P(B | A), it gives (17).
Once again, equation (17) allows calculating the posterior probability P(B | A) of the random event B when the correlation coefficient R(A, B) and the prior probabilities P(A) and P(B) of both events are known, plus the information that the other event A has occurred. This rule is reminiscent of the Bayes rule for posterior probabilities. However, in our case there is no need for B to be a member of a complete system of events (the so-called "hypotheses", representing a partitioning of the sure event Ω into mutually exclusive particular cases of the form Ω = B₁ + … + B_n). Also, there is no need for the conditional probabilities of the event A under the assumption that either of the hypotheses took place, which are needed in order to be able to apply the Bayes rule

P(B_j | A) = P(A | B_j)·P(B_j) / [P(A | B₁)·P(B₁) + … + P(A | B_n)·P(B_n)]

for the posterior probabilities. For our rule there is no necessity for the event B to be part of a system of hypotheses; it can be any event. It is sufficient to have the prior probabilities P(A) and P(B) available (then the probabilities of their complements P(Ā) and P(B̄) needed in (17) are determined by the well-known relations P(Ā) = 1 − P(A)), to have an available numerical value of the correlation coefficient R(A, B), and to apply rule (17).
Important Note: We should note that in the definition of R(A, B) (by formula (16) and Definition 1, or according to Definition 3 and the equations participating in Definition 2) only probabilities participate. And these probabilities have natural frequency-based statistical estimations, so that there is an easy and natural way to estimate the correlation coefficient R(A, B). This is our reason to believe that the rule proposed here for estimating (evaluating) the posterior probabilities can be turned into a powerful tool for calculating posterior probabilities, with a brilliant use of the statistical information for practical purposes.
In addition, we notice that the net increase or decrease in the posterior probability compared to the prior probability, as expressed by formula (17), is equal to the quantity R(A, B)·√[P(Ā)·P(B)·P(B̄)/P(A)], and depends only on the value of the mutual correlation R(A, B) (positive or negative) and on the prior probabilities of the two events A and B. A comprehensive rule can also be written for the case when it is known that the complement Ā of the event A has occurred: then the quantity R(A, B) must be replaced by −R(A, B), the symbol A must be replaced by Ā, and vice versa, Ā is to be replaced by A. The readers will easily work out the details and get the equation

P(B | Ā) = P(B) − R(A, B)·√[P(A)·P(B)·P(B̄) / P(Ā)].

R4. It is fulfilled that R(Ā, B) = R(A, B̄) = −R(A, B), and R(Ā, B̄) = R(A, B).
Proof. These equations follow from presentation (16) for R(A, B), and from property δ5) of the connection δ(A, B) between two events, applied to the possible combinations of these events and their complements.
Particular cases. The rules found above always work when neither of the random events A or B is a zero or a sure event; otherwise we have R(A, B) = 0. Consider now the mutual allocations of the two events shown on Fig. 1. For each of these particular cases the correlation coefficient takes the following specific values:
1a. When A ⊆ B, then
R(A, B) = √[P(A)·P(B̄) / (P(Ā)·P(B))];
1b. Whenever the two events are mutually exclusive, i.e. A ∩ B = ∅, then
R(A, B) = −√[P(A)·P(B) / (P(Ā)·P(B̄))].
The use of the numerical values of the correlation coefficient is similar to the use of the two regression coefficients. The closer R(A, B) is to zero, the "closer" the two events A and B are to independence. Let us note once again that R(A, B) = 0 if and only if the two events are independent.
For random variables a similar statement is not true. The equality to zero of their mutual correlation coefficient does not mean independence, but only registers an absence of correlation. The two random variables are then called non-correlated.
The closer R(A, B) is to the number 1, the "denser one within the other" are the events A and B, and when R(A, B) = 1 the two events coincide (are equivalent).
The closer R(A, B) is to the number −1, the "denser one within the other" are the events A and B̄, and when R(A, B) = −1 these two events coincide (are equivalent). Equally dense one within the other are then the events Ā and B.
These interpretations seem convenient when conducting research and investigations associated with qualitative (non-numeric) factors and characteristics. Such cases are common in sociology, ecology, jurisprudence, medicine, criminology, design of experiments, and other similar areas.
R5. Fréchet-Hoeffding inequalities for the correlation coefficient:

max{−√[P(A)·P(B) / (P(Ā)·P(B̄))], −√[P(Ā)·P(B̄) / (P(A)·P(B))]} ≤ R(A, B) ≤ min{√[P(A)·P(B̄) / (P(Ā)·P(B))], √[P(Ā)·P(B) / (P(A)·P(B̄))]}.

Its proof is similar to those for the regression coefficients. We omit the details. Just notice that these inequalities are important if one wants to construct (e.g. for simulation purposes) events with given individual probabilities and a desired mutual correlation.
Example 1 (continued): We will calculate the numerical value of the correlation coefficient R(A, B) for the events considered in Example 1 according to its definition, because we have the numerical values of the two regression coefficients r_A(B) and r_B(A) from the previous section. In this way we get
R(A, B) = √[(.3368)(.2174)] = .2706.
Analogously to the work with the regression coefficients, the numeric value of the correlation coefficient could be used for some classification of the degree (strength) of the mutual dependence. The practical implementation will give a clear indication about the rules of such classifications. From our example we see that the correlation coefficient is something in-between the two regression coefficients. To a certain degree it absorbs the imbalance (the asymmetry) between the two regression coefficients, and looks like a well-balanced measure of dependence between the two events, including its magnitude of strength.
Assume that R(A, B) = .2706 is known, as calculated. Then rule (17) allows evaluating P(B | A), as well as P(A | B), the posterior (conditional) probabilities of one event given the information that the other one occurs. If we substitute P(A) = .08 and P(B) = .05, we will get the same value
P(B | A) = .05 + (.2706)·√[(.92)(.05)(.95)/(.08)] = .25
as in Example 1 of Section 2.
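Rule (17) is straightforward to apply once R(A, B) and the two prior probabilities are in hand. A short sketch of ours reproducing the posterior probabilities of the example:

```python
import math

def posterior_from_correlation(p_target, p_given, R):
    """P(target | given) = P(target) + R * sqrt(P(given') P(target) P(target') / P(given)),
    rule (17); the prime denotes the complementary event."""
    return p_target + R * math.sqrt((1 - p_given) * p_target * (1 - p_target) / p_given)

p_a, p_b, R = 0.08, 0.05, 0.2706
print(round(posterior_from_correlation(p_b, p_a, R), 4))   # P(B | A) ≈ 0.25
print(round(posterior_from_correlation(p_a, p_b, R), 4))   # P(A | B) ≈ 0.40
```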
Similar examples could be given in a variety of areas of our life. For instance, one could consider the possible degree of dependence between tornado touchdowns in Kansas (event A) and in Alabama (event B); in sociology, a family with 3 or more children (event A) and an income above the average (event B); in medicine, someone who gets an infarct (event A) and a stroke (event B). More examples, far better and more meaningful, are expected when the revenue of this approach is assessed.
5. Empirical estimation of the measures of dependence between random events
The fact that these measures of the dependence between random events are made of their probabilities makes them very attractive and, at the same time, easy for statistical estimation and practical use.
It is well known that if in N independent experiments an event A occurs k_A times, the statistical estimator of the probability P(A) is the ratio k_A/N. In this way all the probabilities in the definitions of the introduced measures can be statistically estimated and, when replaced in the equations for the measures, they represent the respective statistical estimations of these measures. By the way, the estimators obtained in this approach are also maximum likelihood estimators of these characteristics, since the estimator of the probability P(A) is a maximum likelihood estimator.
Let in N independent experiments (or observations) the random event A occur k_A times, the random event B occur k_B times, and the event A ∩ B occur k_{A∩B} times. Then the statistical estimators of our measures of dependence are respectively as follows:
For the connection between the two events the estimator is given by the formula

δ̂(A, B) = k_{A∩B}/N − (k_A/N)·(k_B/N).

For the two regression coefficients the estimators are

r̂_A(B) = [k_{A∩B}/N − (k_A/N)·(k_B/N)] / [(k_A/N)·(1 − k_A/N)];

r̂_B(A) = [k_{A∩B}/N − (k_A/N)·(k_B/N)] / [(k_B/N)·(1 − k_B/N)],

and the correlation coefficient has the estimator

R̂(A, B) = [k_{A∩B}/N − (k_A/N)·(k_B/N)] / √[(k_A/N)·(1 − k_A/N)·(k_B/N)·(1 − k_B/N)].
According to the rules of statistical estimation, these estimators are all consistent (i.e. for large numbers of observations their values are, with high probability, close to the true values of the estimated parameters). Moreover, the estimator of the connection δ̂(A, B) is also unbiased, since its expectation is the very connection function δ(A, B), i.e. there is no systematic error in this estimate.
As a consequence, the estimators proposed here for the measures of the dependence between any two random events can be used for practical purposes with the reasonable interpretations and explanations shown above in our theoretical discussion and, to a certain extent, in our example.
As we see, the use of the conditional probabilities in the estimation of the regression coefficients is not needed. We personally are excited by the opportunities offered by this approach. Even in the example given here we are using the frequency interpretation of the probabilities, and not any assumed theoretical values.
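All four empirical estimators of this section can be packed into a single routine. The sketch below is ours; it takes the raw counts and returns the estimates, and applied to the counts of Example 1 it reproduces the values obtained earlier.

```python
import math

def dependence_estimates(N, k_a, k_b, k_ab):
    """Empirical estimators of Section 5 from the counts of A, B and A ∩ B in N trials."""
    p_a, p_b, p_ab = k_a / N, k_b / N, k_ab / N
    delta = p_ab - p_a * p_b
    return {
        "delta":  delta,
        "r_A(B)": delta / (p_a * (1 - p_a)),
        "r_B(A)": delta / (p_b * (1 - p_b)),
        "R":      delta / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b)),
    }

print(dependence_estimates(1000, 80, 50, 20))
# {'delta': 0.016, 'r_A(B)': 0.2174, 'r_B(A)': 0.3368, 'R': 0.2706}  (values rounded)
```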
6. Some warnings
First of all, we should note that the introduced measures of dependence between random events are not transitive. It is possible that the random event A is positively associated with the random event B, and this event B is positively associated with a third random event C, but the event A is negatively associated with C. To see this it is sufficient to imagine the events A and B compatible (with a non-empty intersection, as shown on Fig. 1c), and the events B and C also compatible, while A and C are incompatible, mutually exclusive, and therefore with a negative connection. Then the situation mentioned here may be observed. As we have seen, for mutually exclusive events the connection is negative, while for the non-exclusive pairs (A, B) and (B, C) every kind of dependence is possible. However, in these facts we also see a lot of flexibility when studying the dependence not as an integral feature, but as composed of a number of particular details.
7. An illustration of possible applications
As an illustration of what one can do with the measures of dependence between two random events proposed here, we analyze the data from Table 2.4 of the book of Alan Agresti, Categorical Data Analysis (2006).
The following table represents the observed data about the yearly income of people and their job satisfaction.
Table 1: Observed frequencies of Income and Job Satisfaction

Income US $$        Very Dissatisfied   Little Satisfied   Moderately Satisfied   Very Satisfied   Total (marginal)
< 6,000                    20                  24                   80                  82               206
6,000–15,000               22                  38                  104                 125               289
15,000–25,000              13                  28                   81                 113               235
> 25,000                    7                  18                   54                  92               171
Total (marginal)           62                 108                  319                 412               901
When we apply the empirical rules for evaluating the probabilities in each category,

P_{i,j} = n_{i,j}/n,   P_{i,.} = n_{i,.}/n,   P_{.,j} = n_{.,j}/n,

the above table produces the empirical probabilities for a new observation to fall in the respective cell.
Table 2: Empirical estimations of the probabilities P_{i,j}, P_{i,.}, P_{.,j} for each particular case

Income US $$        Very Dissatisfied   Little Satisfied   Moderately Satisfied   Very Satisfied   Total (marginal)
< 6,000                  .02220              .02664               .08879              .09101            .22864
6,000–15,000             .02442              .04217               .11543              .13873            .32075
15,000–25,000            .01443              .03108               .08990              .12542            .26083
> 25,000                 .00776              .01998               .05993              .10211            .18978
Total (marginal)         .06881              .11987               .35405              .45727           1.00000
Applying the rules given by the definitions of the proposed measures of dependence between random events, and using either the empirical probabilities of Table 2 or, alternatively, the rules for the empirical estimation of these measures described in Section 5, we obtain the following Tables 3 to 6.
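For readers who want to redo these computations, the following sketch (ours) produces the connection matrix of Table 3 directly from the counts of Table 1; the other tables are obtained from it in the same spirit.

```python
# delta(A_i, B_j) = P_ij - P_i. * P_.j , with the empirical probabilities of Table 2.
counts = [  # rows: the four income groups; columns: VD, LS, MS, VS
    [20, 24,  80,  82],
    [22, 38, 104, 125],
    [13, 28,  81, 113],
    [ 7, 18,  54,  92],
]
n = sum(sum(row) for row in counts)                     # 901 observations
row_m = [sum(row) / n for row in counts]                # income marginals P_i.
col_m = [sum(col) / n for col in zip(*counts)]          # satisfaction marginals P_.j

delta = [[counts[i][j] / n - row_m[i] * col_m[j] for j in range(4)]
         for i in range(4)]
print(round(delta[0][0], 6))   # 0.006465 ≈ 0.006467, the first entry of Table 3
                               # (small differences come from rounding in Table 2)
```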
Table 3: Empirical estimations of the connection function for each particular category of Income and Job Satisfaction, δ(IncomeGroup_i, Satisfaction_j)

Income US $$           Very Dissatisfied   Little Satisfied   Moderately Satisfied   Very Satisfied
< 6,000                     .006467            −.000768              .007840            −.013541
6,000–15,000                .002349             .003722              .001868            −.007939
15,000–25,000              −.003518            −.000186             −.002447             .006150
> 25,000                   −.005299            −.002769             −.007262             .015330
Total sum in a column        0                   0                    0                   0
An interesting and important feature of this table is that the sum of all entries in a row, as well as the sum of all entries in a particular column, equals zero. This property is in accordance with property δ3), because each such sum represents the total connection of the respective category of the given factor with the union of all categories of the other factor, which equals the sure event.
The numerical values of the connection function do not show the magnitude of the dependence between the categories of the income and the levels of job satisfaction. However, their sign indicates a direction of the possible dependence compared to the "neutral stage" of independence. A positive sign indicates a positive local association between these two variables, and a negative sign indicates a negative association in the locality of these particular categories of the two variables. (In the original color version of the tables, the negative associations are marked in cold blue, and the positive areas of association are highlighted in warm pink.)
The numerical values of the regression coefficients are true measures of the magnitude of the dependence between the two variables. Besides the positive association between the lowest categories of the income and the low levels of satisfaction, we observe a negative association between the low level of income and the highest levels of job satisfaction. However, these magnitudes are small, close to the zone of independence. We also observe the asymmetry of the dependence when comparing the corresponding entries of Tables 4 and 5. For instance, look at the first entries in both tables of the regression coefficients: the one in Table 4 is about 3 times greater than the respective entry in Table 5. It says that the income group is about three times more dependent on the answers about job satisfaction (Table 4) than the job satisfaction answer is on the income group.
Table 4: Empirical estimations of the regression coefficient of each particular level of income with respect to the job satisfaction, r_{Satisfaction_j}(IncomeGroup_i)

Income US $$        Very Dissatisfied   Little Satisfied   Moderately Satisfied   Very Satisfied
< 6,000                  .100933            −.00727               .034281            −.05456
6,000–15,000             .036663             .035276              .00817             −.03199
15,000–25,000           −.054900            −.00176              −.0107               .024782
> 25,000                −.082696            −.02625              −.03175              .061768
Table 5: Empirical estimations of the regression coefficient of each particular level of the job satisfaction with respect to the income, r_{IncomeGroup_i}(Satisfaction_j)

Income US $$        Very Dissatisfied   Little Satisfied   Moderately Satisfied   Very Satisfied
< 6,000                  .036670            −.00435               .044454            −.07677
6,000–15,000             .010783             .017082              .008576            −.03644
15,000–25,000           −.018246            −.00096              −.01269              .0319
> 25,000                −.034460            −.01801              −.04723              .099694

For instance, the number r_{Very Dissatisfied}(< 6,000) = .100933 in Table 4 indicates a positive dependence of the category of the lowest income "< 6,000" on the category "Very Dissatisfied" of the Job Satisfaction variable. The same number with a negative sign, r_{Very Dissatisfied}(income ≥ 6,000) = −.100933, indicates the negative strength of dependence of all the other income categories, higher than "< 6,000", on the category "Very Dissatisfied" of the Job Satisfaction variable. Similarly, the sums of the numbers from several cells in a column of Table 4 (or in a row of Table 5) will indicate the strength of dependence of the union of the categories of the respective factor "Income" on the category of "Job Satisfaction" corresponding to that column (with an analogous switch of the factors' interpretation).
The two regression coefficient matrices allow us to calculate the correlation coefficients between every pair of particular categories of the two factors, according to the rules of Section 4. Table 6 summarizes these calculations. The numbers actually represent the numerical estimations of the respective correlation coefficients. Obviously, each of these numbers gives the local average measure of dependence between the two factors. Unfortunately, the summation of the numbers in a vertical or horizontal line does not have the same or a similar meaning as in the cases of the connection or regression matrices. Also, the sums of the numbers in a row or in a column do not equal zero as above. A graphical presentation of the information given in Tables 3 to 6 is shown on Fig. 2 – Fig. 5 at the end.
Table 6: Empirical estimations of the correlation coefficient between each particular income group and the categories of the job satisfaction, R(IncomeGroup_i, Satisfaction_j)

Income US $$        Very Dissatisfied   Little Satisfied   Moderately Satisfied   Very Satisfied
< 6,000                  .060838            −.005623              .039037            −.064721
6,000–15,000             .019883             .024548              .008371            −.034144
15,000–25,000           −.031649            −.001302             −.011653             .028117
> 25,000                −.053383            −.02174              −.038723             .078472
A prediction of the income group when the job satisfaction, the marginal probabilities, and the connection (or correlation coefficient) matrix are known.
If one knows the category B of the job satisfaction, and has handy the connection function, or either of the other measures of dependence between these categories, plus the marginal unconditional probabilities P(A) and P(B) of the particular groups, then the conditional (posterior) probabilities P(A | B) for the income groups can be re-evaluated by making use of one of the rules (5) or (17), or an equivalent one. The following Table 7 presents these probabilities and, for comparison, the prior probabilities P(A) are given in the last column.
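Continuing the sketch shown before Table 3 (and reusing its `counts`, `row_m`, `col_m` and `delta`), the whole of Table 7 is one application of rule (5):

```python
# P(A_i | B_j) = P(A_i) + delta_ij / P(B_j), rule (5), for every income/satisfaction pair
forecast = [[row_m[i] + delta[i][j] / col_m[j] for j in range(4)] for i in range(4)]

print(round(forecast[0][0], 4))   # ≈ 0.3226: P(income < 6,000 | Very Dissatisfied)
# every column sums to 1, since the income groups partition the sure event
print([round(sum(forecast[i][j] for i in range(4)), 4) for j in range(4)])
```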
Table 7: Forecast of the probabilities P(A_i | B_j) = P(A_i) + δ(A_i, B_j)/P(B_j) of a particular income group, given the categories of the job satisfaction

Income US $$                  Very Dissatisfied   Little Satisfied   Moderately Satisfied   Very Satisfied   Unconditional P(A)
< 6,000                            .322628            .222241              .250784             .199029           .22864
6,000–15,000                       .354890            .351798              .326027             .303387           .32075
15,000–25,000                      .209708            .259281              .253919             .274280           .26083
> 25,000                           .112774            .166681              .169270             .223304           .18978
Total (Σ_i P(A_i | B_k) = 1)      1.00000            1.00000              1.00000             1.00000           1.00000
The red numbers (in the original color version) show the "hot local positions", where the conditional probability increases compared to the prior (unconditional) probability. The blue-colored numbers show the places of a local decrease in the posterior probability, and these are the places where the connection is negative. Now we know that if someone's answer is "Very Dissatisfied", then the highest chance is that it comes from someone who belongs to the income range 6,000–15,000. The chances that such an answer comes from the income group "< 6,000" have increased by approximately .10. If someone's answer is "Very Satisfied", then the lowest chances are that it comes from the income group "< $6,000", and this is totally different (as is the entire ordering of the income classes) from the prior distribution of the income. Also, the sum of the numbers in a column equals 1, since these are all the possible parts of the sure event (S = ∪_i A_i).
Analogously, if one knows the income group A, and has handy the connection function, or either of the other measures of dependence, plus the marginal probabilities P(B_j) of the particular groups, then the conditional (posterior) probabilities P(B_j | A_i) of the job satisfaction groups of answers can be re-evaluated by making use of the same rules respectively. The following Table 8 presents these probabilities and, for comparison, the priors P(B_j) are given in the last row.
Table 8: Forecast of the probabilities P(B_j | A_i) = P(B_j) + δ(A_i, B_j)/P(A_i) of a particular job satisfaction category, given the income group

Income US $$            Very Dissatisfied   Little Satisfied   Moderately Satisfied   Very Satisfied   Total (Σ_k P(B_k | A_i) = 1)
< 6,000                      .097096            .116515              .388340             .398049               1
6,000–15,000                 .076134            .131473              .359875             .432518               1
15,000–25,000                .055323            .119158              .344669             .480850               1
> 25,000                     .040889            .105280              .315787             .538044               1
Unconditional P(B)           .06881             .11987               .35405              .45727                1.00000
Here we would like to notice that similar "categorizations" can be made for any two numeric random variables, and what we see and read in the above tables can be used for studies of the local structure of dependence between random variables.
8. Conclusions
We discussed four measures of dependence between two random events. These measures are equivalent, and exhibit natural properties. The numerical values of the regression coefficients and of the correlation coefficient may serve as indicators of the magnitude of dependence between random events.
These measures provide simple ways to detect independence, coincidence, and degree of dependence.
When either measure of dependence is known, as well as the individual probability of each event, this allows restoration of all the other measures of dependence, and of the joint probability. It also serves for a better prediction of the chance of occurrence of one event, given that the other one occurs.
If applied to the events A = [a ≤ X < b] and B = [c ≤ Y < d], these measures immediately turn into measures of the LOCAL DEPENDENCE between the r.v.'s X and Y, associated with the rectangle [a, b] × [c, d] on the plane. Therefore, the measures proposed and discussed here offer a great tool in the study of local dependence.
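As a final illustration of this last remark, here is a small sketch (ours, on synthetic data) that estimates the four measures for the events A = [a ≤ X < b] and B = [c ≤ Y < d] from a sample of pairs (X, Y):

```python
import math, random

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(5000)]
sample = [(x, 0.6 * x + 0.8 * random.gauss(0, 1)) for x in xs]   # a dependent pair (X, Y)

def local_dependence(sample, a, b, c, d):
    """Empirical delta, regression and correlation coefficients for the events
    A = [a <= X < b] and B = [c <= Y < d] (Section 5 estimators)."""
    N = len(sample)
    k_a  = sum(1 for x, y in sample if a <= x < b)
    k_b  = sum(1 for x, y in sample if c <= y < d)
    k_ab = sum(1 for x, y in sample if a <= x < b and c <= y < d)
    p_a, p_b, p_ab = k_a / N, k_b / N, k_ab / N
    delta = p_ab - p_a * p_b
    return {"delta": delta,
            "r_A(B)": delta / (p_a * (1 - p_a)),
            "r_B(A)": delta / (p_b * (1 - p_b)),
            "R": delta / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))}

print(local_dependence(sample, 0.0, 1.0, 0.0, 1.0))   # local measures on [0,1) x [0,1)
```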
References
[1] A. Agresti (2006). Categorical Data Analysis. John Wiley & Sons, New York.
[2] B. Dimitrov and N. Yanev (1991). Probability and Statistics, A Textbook. Sofia University "Kliment Ohridski", Sofia (second edition 1998, third edition 2007).
[3] N. Obreshkov (1963). Probability Theory. Nauka i Izkustvo, Sofia (in Bulgarian).
[4] Encyclopedia of Statistical Sciences (1981–1988), v. 1 – v. 9. Editors-in-Chief S. Kotz and N. L. Johnson. John Wiley & Sons, New York.
Fig. 2. Surface plot of the connection function between income levels and satisfaction levels, according to the data of Table 3.

Fig. 3. Surface plot of the regression coefficient function r_{Satisfaction_j}(IncomeGroup_i), according to the data of Table 4.

Fig. 4. Surface plot of the regression coefficient function r_{IncomeGroup_i}(Satisfaction_j), according to the data of Table 5.

Fig. 5. Surface plot of the correlation coefficient function R(IncomeGroup_i, Satisfaction_j), according to the data of Table 6.