202-207.

Neighborhoods Systems: Measure, Probability and Belief Functions

T. Y. Lin (1)(*)   tylin@cs.sjsu.edu
Y. Y. Yao (2)   yyao@flash.lakeheadu.ca

(1) Berkeley Initiative in Soft Computing, Department of Electrical Engineering and Computer Science, University of California, Berkeley, California 94720
(2) Department of Computer Science, Lakehead University, Thunder Bay, Ontario, Canada P7B 5E1

Abstract: The notion of neighborhood system is a mathematical formalism for "negligible quantity." It formulates the mathematical concept of neighborhoods in the context of advanced computing. Neighborhood systems, by definition, include topology (topological neighborhood systems), rough sets (S5-neighborhood systems) and binary relations (basic neighborhood systems). In this paper, real valued functions based on neighborhood systems are studied. The study covers many important quantities in uncertainty, such as belief functions, measure, and probability; fuzzy sets are not included here, because they were reported elsewhere. It seems that neighborhood systems are an effective underlying data structure for managing uncertainty.

Keywords: binary relation, measure, neighborhood, probability, qualitative fuzzy set, rough set, topology.

----------------
* This research is partially supported by Electric Power Research Institute, Palo Alto, California.
* On leave from San Jose State University (tylin@cs.sjsu.edu).

1. Introduction

Representing and measuring uncertainty are critical in advanced computing. Various theories have been proposed. The most notable theories are Zadeh's fuzzy theories [24] and Pawlak's rough set theory [16]. In this paper, we discuss one of the most encompassing notions of uncertainty from a geometric point of view, namely, neighborhood systems. Roughly, a neighborhood system assigns each object a (possibly empty, finite, or infinite) family of non-empty subsets. Such subsets, called neighborhoods, represent the semantics of "negligible," which is the essence of uncertainty handling: neglect the negligible.

Using neighborhoods, one can define open sets, hence the interior and closure of any subset (as in topological spaces) [6], [19], [20]. By taking equivalence classes as neighborhoods, the lower and upper approximations are precisely the interior and closure, respectively. Rough set approximation is a special form of neighborhood theory; it was called category [2]. Generalized rough sets based on various modal logics [21] are all special forms of neighborhood systems, called basic neighborhood systems.

The notion of neighborhood systems is one of the 'correct' mathematical formalisms for expressing the semantics of approximation and uncertainty in the context of advanced computing. Our interests stemmed from database retrieval and mining [15], [3], [7], [8], [9], [2], [10], [21]. One may view it as a first step toward the granulate mathematics described by Lotfi Zadeh [23]. Some basic notions and their applications to qualitative fuzzy theory were reported in [13], [14]. In this paper, we turn our attention to the quantitative aspect of uncertainty. One could develop a full-fledged measure and probability theory as in classical mathematics [5]. At this early stage, we focus on its applications; belief functions are formulated using the notion of neighborhood systems [18].

2. Neighborhood Systems

Since the systematic study of neighborhood systems is relatively recent, we recall some of our motivation from [14]. Neighborhood systems are abstracted from numerical analysis. In any standard procedure of finding approximate solutions, the very first step is to choose a "small" number $\varepsilon$, or equivalently, an $\varepsilon$-neighborhood for each point on the real line. During the process of finding approximate solutions, this particular family of neighborhoods never changes. In other words, the only relevant notion of the real line topology [6] is this particular family of chosen neighborhoods.
We can view the first step as the step of setting up a proper context for discussion, that is, one sets up the "standard" for what "near" means in this special circumstance. Such a family of chosen neighborhoods, not the full topology, is the essential formalism for approximation.

The neighborhood is a fundamental notion in mathematical analysis. It is also a common notion in many other areas. It appears in logic [1], in a text on genetic algorithms [4], rough sets [16], generalized rough sets [21], and databases. A systematic study in the context of advanced computing was started by the first author and his students. The study was motivated by database retrieval and mining [15], [3], [7], [8], [9], [2], [10].

2.1 Definitions and Properties

In this section, we recall some notions of neighborhood systems from [14]. Let U be the universe of discourse and p be an object or point in U.

1. A neighborhood, denoted by N(p), or simply N, of p is a non-empty subset of U, which may or may not contain the object p. Any subset that contains a (non-empty) neighborhood is a neighborhood.

2. A neighborhood system of an object p, denoted by NS(p), is a maximal family of neighborhoods of p. If p has no neighborhood, then NS(p) is an empty family; in this case, we simply say that p has no neighborhood.

3. A neighborhood system of U, denoted by NS(U), is the collection of NS(p) for all p in U. Such a neighborhood system may also be called F-topology (read as finite type topology). For simplicity, a set U together with NS(U) is called a neighborhood system space or a neighborhood system. A neighborhood system is called a Frechet (V) space if every NS(p) is non-empty [19].

4. N is open if N is a neighborhood of every object in N.

5. More generally, a subset X of U is open if for every object p in X, there is a neighborhood $N(p) \subseteq X$. A subset X is closed if its complement is open.

6. NS(p) and NS(U) are open if every neighborhood is open. NS(U) is topological if U is the usual topological space [6]. Both NS(U) and the collection of open sets are called a topology if U is a topological space.

7. Let X be a subset of U. The lower approximation of X can be defined by

   $I[X] = \{\, p : \text{there is an } N(p) \subseteq X \,\}$ = interior of X,

   that is, I[X] is the largest open set contained in X.

8. Similarly, the upper approximation of X can be defined by ($\emptyset$ is the empty set)

   $C[X] = \{\, p : \forall N(p),\ X \cap N(p) \neq \emptyset \,\}$ = closure of X,

   that is, C[X] is the smallest closed set that contains X. I[X] and C[X] are precisely the lower and upper approximations in rough set theory.

9. A topological space is a neighborhood system space, but not the converse.

10. Intersections and finite unions of closed sets are closed.

11. In topological spaces, unions and finite intersections of open sets are open. In neighborhood systems, unions are open, but intersections may not be open.

2.2 Basic Neighborhoods and Binary Relations

13. A minimal neighborhood of p, denoted by MN(p), is a minimal member of NS(p) in the sense that MN(p) contains no member of NS(p) as a proper subset. In general such MN(p) may or may not exist. The maximal family of all MN(p) at p will be denoted by MNS(p). The family of MNS(p) for all p will be denoted by MNS(U). Let n(p) be the number of distinct minimal neighborhoods in MNS(p).
If, for all p, n(p) = n is a constant integer, MNS(U) is an n-minimal neighborhood system, denoted by n-MNS(U); we will be interested in 1-minimal neighborhood systems, called basic (binary) neighborhood systems BS(U). BS(U) can be defined by a binary relation and vice versa; see below. So "B" in BS(U) may be read as basic neighborhood or binary neighborhood.

14. Let R be a binary relation defined on U; then $B(p) = \{\, x : p R x \,\}$ is a neighborhood of p. So a binary relation R gives rise to a basic (binary) neighborhood system. Conversely, one can use the basic neighborhoods to define the binary relation. From the implementation point of view, we can rephrase basic neighborhood systems as follows:

15. A basic neighborhood system BS(U) is a data structure that assigns to each datum a list of data.

16. Let a neighborhood system NS(p) at p be given. A minimal member of NS(p) may or may not exist. For example, a neighborhood system of a real number has no minimal neighborhood.

17. A binary relation on U defines one and only one basic (binary) neighborhood system; the correspondences are summarized in the table below [3].

   Binary Relations            Basic (Binary) Neighborhoods
   serial                 <->  serial
   reflexive              <->  reflexive
   symmetric              <->  symmetric
   transitive             <->  transitive
   Euclidean              <->  Euclidean
   reflexive, symmetric   <->  Brower
   reflexive, transitive  <->  S4 (topological)
   equivalence            <->  S5 (clopen topology)

Similarly, the n-graded binary relations [21] correspond to n-minimal neighborhood systems.

3. Measure and Probability

Let U be the universe. We will be interested in the following notions [5].

1. A ring (or Boolean ring) of sets is a non-empty class R of sets such that if $E \in R$ and $F \in R$, then $E \cup F \in R$ and $E - F \in R$. In other words, a ring is a non-empty class of sets which is closed under the formation of unions and differences. Let E be a class of sets. It is not difficult to show that there exists a unique ring R(E), the smallest ring containing E; it will be called the ring generated by E.

2. An algebra (or Boolean algebra) of sets is a non-empty class R of sets such that
   (a) if $E \in R$ and $F \in R$, then $E \cup F \in R$;
   (b) if $E \in R$, then $E' \in R$.
   Since $E - F = (E' \cup F)'$, it follows that every algebra is a ring.

3. A σ-ring (or σ-Boolean ring) of sets is a non-empty class S of sets such that
   (a) if $E \in S$ and $F \in S$, then $E - F \in S$;
   (b) if $E_i \in S$ for $i = 1, 2, \ldots$, then $\bigcup_i E_i \in S$.
   A σ-algebra is a σ-ring containing U. Since we are interested in finite universes, a σ-ring (σ-algebra) is the same as a ring (algebra).

4. A non-empty class H of sets is hereditary if, whenever $E \in H$ and $F \subseteq E$, then $F \in H$. The power set of X is a hereditary class. For every ring R, H(R) is the smallest hereditary ring generated by R.

5. A measure is an extended real valued, non-negative, and countably additive set function m, defined on a ring, and such that $m(\emptyset) = 0$. If m is a measure on a ring R and if, for every E in H(R),

   $m^*(E) \equiv \inf \{\, \sum_n m(E_n) : E \subseteq \bigcup_n E_n,\ E_n \in R \,\}$,

   then $m^*$ is an outer measure on H(R); if m is finite or σ-finite, so is $m^*$. Here inf is the greatest lower bound. We are interested in finite universes only, so the infinite sum is in fact a finite sum.

6. Let m be a measure on a ring R. For every E in H(R), we define

   $m_*(E) \equiv \sup \{\, \sum_n m(E_n) : E \supseteq \bigcup_n E_n,\ E_n \in R \text{ pairwise disjoint} \,\}$.

   Then $m_*$ is called the inner measure induced by m, where sup is the least upper bound. The infinite sum is finite; see item 5.

7. From [5], p. 50, item 5 can be improved to

   $m^*(E) \equiv \inf \{\, m(F) : E \subseteq F,\ F \in R \,\}$.

   Since U is finite, S(R) = R, where S(R) denotes the σ-ring generated by R, and the extended measure is m itself.

4. Borel Sets for Neighborhood Systems

8. Traditionally a Borel set is defined on a topological space; we will extend it to a neighborhood system space. Let C be the class of all compact and closed sets. As usual, the class of Borel sets is the σ-ring generated by the class of all compact and closed sets; it will be denoted by BOrel(U).

9. Since we consider finite neighborhood systems only, all closed sets are compact, a σ-ring is a ring, and a σ-algebra is an algebra.

10. BOrel(U) is an algebra generated by closed sets; BOrel(U) is an algebra generated by open sets.

11. If U is an S5-neighborhood system space, then BOrel(U) is the collection of all definable sets, namely, finite unions of equivalence classes; definable sets are the clopen sets.

Proposition 1. Let U be a finite S5-neighborhood system space and m a measure (for example, the counting measure, that is, the cardinal number of a finite set) on BOrel(U). Then the outer measure and inner measure

   $m^*(E) = m(C[E])$,   $m_*(E) = m(I[E])$

are the measures of the upper and lower approximations, respectively.

Corollary 2. Let U be a finite S5-neighborhood system space and m a measure. Define r(E) = m(E)/TOTAL, where TOTAL = m(U). Then r is a probability measure, and its outer and inner probability measures are the probabilities of the upper and lower approximations; they are plausibility and belief functions, respectively; see [17].

r is an important measure, so we give a formal definition next.

12. Let U be a finite set and m the counting measure.
For simplicity, the probability measure r(E) = m(E)/TOTAL will be called the counting probability measure.

13. If U is an S4-neighborhood system space, then BOrel(U) is the class of all finite, disjoint unions of proper differences of closed sets; equally, BOrel(U) is the class of all finite, disjoint unions of proper differences of open sets.

Proposition 3. Let U be a finite S4-neighborhood system space and m a measure (for example, the counting measure, that is, the cardinal number of a finite set) on BOrel(U). Then the inner measure and outer measure are

   $m_*(E) = \sup \{\, m(F) : E \supseteq F \text{ and } F \text{ is closed} \,\}$,
   $m^*(E) = \inf \{\, m(F) : E \subseteq F \text{ and } F \text{ is open} \,\}$.

5. Belief Functions

Let us recall some notions from [18]. Let U be a finite set and POwer(U) be its power set. If we use a full word as a notation, we capitalize the first two characters, so that one can distinguish between notations and words.

14. Belief function: A unit interval valued function $Bel : POwer(U) \to [0, 1]$ is called a belief function if
   (a) $Bel(\emptyset) = 0$;
   (b) $Bel(U) = 1$;
   (c) for every finite collection $E_j$, $j = 1, 2, \ldots, n$, of subsets of U,

       $Bel\big(\bigcup_{j=1}^{n} E_j\big) \ \geq\ \sum_{s} (-1)^{t}\, Bel(E_s)$,

       where s ranges over all non-empty subsets of $\{1, 2, \ldots, n\}$, $E_s = \bigcap_{j \in s} E_j$, and $t = |s| + 1$, with $|s|$ the cardinal number of s.

   Bel can be constructed from a basic probability (see next item): $Bel(A) = \sum_{B \subseteq A} m(B)$, where B varies through all subsets of A.

15. Basic probability: A unit interval valued function $m : POwer(U) \to [0, 1]$ is called a basic probability if
   (a) $m(\emptyset) = 0$ (0 is also used to denote the empty set);
   (b) $\sum_n m(E_n) = 1$, where $E_n$ varies through POwer(U).

   The name is somewhat deceptive: what we really have here is a generalization of a probability mass function from points to subsets. If we assign basic probability to each basic neighborhood (and zero to all other sets), we immediately get a belief function on U.
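The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the universe `U`, the relation `R`, and the helper names are hypothetical, and the particular mass assignment (each distinct basic neighborhood weighted by the share of points that have it) is one possible choice, not one prescribed by the text. Bel follows the formula of item 14; `pl` is the usual dual plausibility, the total mass of the neighborhoods that meet A.

```python
from collections import Counter
from fractions import Fraction


def basic_neighborhoods(universe, relation):
    """B(p) = {x : p R x} for each p; `relation` is a set of pairs (p, x)."""
    return {p: frozenset(x for (q, x) in relation if q == p) for p in universe}


def mass_from_neighborhoods(nbhd):
    """Assumed choice: the mass of a basic neighborhood is the share of
    points that have it. Empty neighborhoods get no mass, all other
    subsets implicitly get mass zero."""
    counts = Counter(b for b in nbhd.values() if b)
    total = sum(counts.values())
    return {b: Fraction(c, total) for b, c in counts.items()}


def bel(mass, a):
    """Bel(A): total mass of the focal sets contained in A."""
    a = frozenset(a)
    return sum((w for b, w in mass.items() if b <= a), Fraction(0))


def pl(mass, a):
    """Pl(A): total mass of the focal sets that intersect A."""
    a = frozenset(a)
    return sum((w for b, w in mass.items() if b & a), Fraction(0))


# Hypothetical example: four points, a relation that is not an equivalence.
U = {"a", "b", "c", "d"}
R = {("a", "a"), ("a", "b"), ("b", "a"), ("b", "b"),
     ("c", "c"), ("c", "d"), ("d", "d")}
mass = mass_from_neighborhoods(basic_neighborhoods(U, R))

print(bel(mass, {"c", "d"}))  # 1/2
print(pl(mass, {"d"}))        # 1/2
```

Exact fractions are used so that the conditions Bel(0) = 0 and Bel(U) = 1 can be checked without rounding error.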
More generally, if we assign basic probability to all minimal neighborhoods (and zero to all other sets), we again get a belief function on U. Neighborhood systems (of finite spaces) are the most natural underlying structure for belief functions. Conversely, if a space has a belief function, then there is a very natural neighborhood system associated to the belief function: given a belief function, there is a basic probability. The collection of sets on which the basic probability is non-zero is a neighborhood system. Namely, we have the following:

Proposition 4. Let U be a finite space. U has a belief function iff there is a neighborhood system on U.

In the next few paragraphs, we describe some specific examples of the previous proposition.

16. In [11], Lin and Hadjimichael studied non-classificatory learning. It is a multilevel learning. Mathematically, it is a sequence of mappings. At the first level, it maps each point (a base concept) to its unique basic neighborhood, called a concept of level 1. The family of such basic neighborhoods is denoted by COncept(1) or simply COncept. In the general step, it maps each point in COncept(n) to an element in COncept(n+1), where COncept(n+1) is called the concept of level (n+1) and is the family of basic neighborhoods of COncept(n). Implicitly the level one learning is also in [12]; the soft rules in level 0 are hard rules in level 1.

17. Let COncept = $\{C_1, \ldots, C_i, \ldots, C_n\}$ be the distinct list of basic neighborhoods (note that two distinct points may have the same neighborhood). For example, in rough set theory, COncept is the set of equivalence classes. Let m be the "external sum measure." Recall that $|\cdot|$ is the cardinal number.

   $m(C_i) = |C_i|$,
   $m(COncept) = |C_1| + |C_2| + \ldots + |C_i| + \ldots + |C_n|$.

   Next, we consider the following basic probability of the basic neighborhood system, $m : POwer(U) \to [0, 1]$, defined by the equations
   (a) $m(C_i) = m(C_i) / m(COncept)$, where m on the right-hand side is the external sum measure;
   (b) $m(A) = 0$ if $A \notin COncept$.

   m induces a belief function on U; we call it the external counting belief function, denoted by e_c_Bel, of the basic neighborhood system. The "same" m, as a probability mass function, induces a probability measure P_m on POwer(COncept).

18. Now we will consider the learning map

   $LEarn : U \to COncept$

   defined by $LEarn(x) = C_i$, where $C_i$ is the unique basic neighborhood of x, equivalently the concept learned by x. Some comments are in order. For each x there is a unique basic neighborhood, so LEarn is a well defined map (we also consider multi-valued learning [11]). The map LEarn gives rise to a partition on U (its quotient set is isomorphic to COncept). The probability measure P_m on POwer(COncept) induces a probability measure r_m on U as follows:

   $r_m(LEarn^{-1}(X)) = P_m(X)$,   $X \in POwer(COncept)$.

   Note that the collection of all inverse images, $LEarn^{-1}(X)$, $X \in POwer(COncept)$, is the σ-ring generated by the equivalence classes of U. So we have results similar to Corollary 2.

Theorem 5. Let U be a finite basic neighborhood system, and let COncept be the distinct list of basic neighborhoods. The outer and inner probability measures of r_m are the probabilities of the upper and lower approximations, respectively, with respect to the equivalence relation induced by LEarn. [...]* = e_c_Bel (item 17).

19. This is a generalization of Pawlak's results. This theorem is related to, but different from, [22].

20. We will call this σ-alg[...] probability; the belief function [...]

6. Conclusions

Neighborhood systems were introduced [...] arena of advanced computing for mod[...] retrievals. It turns out to be a very effe[...] uncertainty. In this paper, we examine [...] functions," measure, probability, and b[...] of neighborhood systems; fuzzy sets w[...] study conclude that neighborhoods ma[...] mathematical formalism for uncertaint[...] effective formalism to granulate inform[...]

Acknowledgment

This author would like to express his [...] Professor Zadeh for his kind guidance [...] to join the Berkeley Initiative in Soft C[...] (BISC). Our deepest thanks also go to Dr.
Martin Wildberger at EPRI, Electric Power Research Institute, for his generous sponsorship.

References

1. Bäck, T., Evolutionary Algorithm in Theory and Practice, Oxford University Press, 1996.
2. Bairamian, S., Goal Search in Relational Databases, Thesis, California State University at Northridge, 1989.
3. Chellas, B., Modal Logic, an Introduction, Cambridge University Press, 1980.
4. Engesser, K., "Some connections between topological and Modal Logic," Mathematical Logic Quarterly, 41, 49-64, 1995.
5. Halmos, P., Measure Theory, Van Nostrand, 1950.
6. Kelley, J., General Topology, Van Nostrand, 1955.
7. Lin, T. Y., "Neighborhood Systems and Relational Database," Proceedings of CSC '88, February, 1988.
8. Lin, T. Y., "Neighborhood Systems and Approximation in Database and Knowledge Base Systems," Proceedings of the Fourth International Symposium on Methodologies of Intelligent Systems, Poster Session, October 12-15, 1989.
9. Lin, T. Y., "Rough Sets, Neighborhood Systems and Approximation," Fifth International Symposium on Methodologies of Intelligent Systems, Selected Papers, Oct. 1990. (Q. Liu, K. J. Huang and W. Chen).
10. Lin, T. Y., "Topological Data Models and Approximate Retrieval and Reasoning," Proceedings of Annual ACM Conference, February, 1989.
11. Lin, T. Y. and Hadjimichael, M., "Nonclassificatory Generalization in Data Mining," Proceedings of The Fourth Workshop on Rough Sets, Fuzzy Sets and Machine Discovery, Tokyo, Japan, November 8-10, 1996.
12. Lin, T. Y. and Yao, Y. Y., "Mining Soft Rules Using Rough Sets and Neighborhoods," Symposium on Modeling, Analysis and Simulation, CESA'96 IMACS Multiconference (Computational Engineering in Systems Applications), Lille, France, July 9-12, 1996, Vol. 2 of 2, pp. 1095-1100. (Coauthor Yao)
13. Lin, T. Y., "A Set Theory for Soft Computing," Proceedings of 1996 IEEE International Conference on Fuzzy Systems, New Orleans, Louisiana, September 8-11, 1996.
14. Lin, T. Y., "Neighborhood Systems - Applications to Qualitative Fuzzy and Rough Sets," Advances in Information Sciences, Volume IV, Ed. Paul Wang.
15. Motro, A., "Supporting goal queries in relational databases," Expert Database Systems, Proceedings of the First International Conference, L. Kerschberg, Institute of Information Management, Technology and Policy, University of S. Carolina, 1986.
16. Pawlak, Z., Rough Sets. Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, 1991.
17. Pawlak, Z., "Rough Probability," Bull. Pol. Acad. Sci., Math. 32, 607-615, 1984.
18. Shafer, G., A Mathematical Theory of Evidence, Princeton University Press, 1976.
19. Sierpinski, W. and Krieger, C., General Topology, University of Toronto Press, 1956.
20. Yao, Y., "Two Views of Rough Sets on Finite Universes," Journal of Approximate Reasoning. To appear.
21. Yao, Y. and Lin, T. Y., "Generalization of Rough Sets using Modal Logics," Intelligent Automation and Soft Computing, an International Journal, to appear, 1996.
22. Yao and Lingras, "Belief Function in Rough Set Models," Proceedings of Second Annual Joint Conference on Information Science, Wrightsville Beach, North Carolina, Sept. 28-Oct. 1, 1995, pp. 190-193.
23. Zadeh, L., "The Key Roles of Information Granulation and Fuzzy Logic in Human Reasoning," 1996 IEEE International Conference on Fuzzy Systems, September 8-11, 1996.
24. Zadeh, L., "Fuzzy Sets," Information and Control, 8, 1965, pp. 338-353.

Tsau Young (T. Y.) Lin received his Ph.D. from Yale University, and is now a Professor at San Jose State University and Visiting Scholar in BISC, University of California-Berkeley. He has served as chair and member of program committees of various conferences and workshops, and as associate editor and editorial board member of several international journals. He is the president of the International Rough Set Society. His interests include approximation in database and knowledge-base systems, data mining, data security, fuzzy sets, intelligent control, Petri nets, and rough sets (alphabetical order).

Yiyu (Y. Y.) Yao received his Ph.D. from the University of Regina and is now an associate professor at Lakehead University. He has served on the program committees of several conferences and workshops, and is an associate editor of an international journal. He is the secretary of rough control group. His interests include fuzzy sets, information retrieval, and rough sets (alphabetical order).