CONCEPTUAL INTERPRETATION OF FUZZY THEORY

Karl Erich Wolff
Fachbereich Mathematik und Naturwissenschaften, Fachhochschule Darmstadt, Schöfferstr. 3, D-64295 Darmstadt, Germany
Forschungsgruppe Begriffsanalyse der Technischen Universität Darmstadt
Ernst Schröder Zentrum für Begriffliche Wissensverarbeitung

In: Zimmermann, H.J. (ed.): EUFIT '98, 6th European Congress on Intelligent Techniques and Soft Computing, 1998, Vol. I, 555-562.

ABSTRACT: The central result of this paper is the development of L-Fuzzy Scaling Theory, where $(L, \le)$ is an arbitrary ordered set which replaces the real unit interval of classical Fuzzy Theory. The underlying basic observation is that linguistic variables play the same role in Fuzzy Theory as conceptual scales in Formal Concept Analysis. The main steps are the introduction of the concept lattice of an L-Fuzzy set, the Representation Theorem for cut contexts, the Dedekind-MacNeille Embedding Theorem, and the introduction of (realized) L-Fuzzy linguistic variables and their products, which leads to a conceptual interpretation of Fuzzy implications.

1 INTRODUCTION

L.A. Zadeh started his foreword to “Fuzzy Set Theory and its Applications” (Zimmermann 1991) with the words: “As its name implies, the theory of fuzzy sets is, basically, a theory of graded concepts – a theory in which everything is a matter of degree or, to put it figuratively, everything has elasticity.” This paper shows that Fuzzy Theory is a theory of “graded concepts” not only in the metaphorical sense but also in the sense of Formal Concept Analysis (FCA), a theory developed by R. Wille (1982).
FCA is based on classical (crisp) set theory. It formalizes the object-attribute relation “an object has an attribute” by a formal context $(G, M, I)$ where $I \subseteq G \times M$, and it introduces the set of “formal concepts” of a given formal context. This set of formal concepts, ordered in a natural way by set inclusion, is a complete lattice, called the concept lattice of the given formal context. If the concept lattice is finite it can be represented graphically by a line diagram. This looks like a very rigid discrete structure, far away from elasticity and continuity. But there is a deep connection between the finite lattices and the lattice given by the continuum of the real numbers including $+\infty$ and $-\infty$ with the usual real ordering: they are complete lattices, and every complete lattice is (up to isomorphism) a concept lattice and vice versa. Indeed, the construction of the real numbers from the rational numbers by Dedekind cuts is just a special case of the construction of a concept lattice from a given ordered set, namely building the smallest complete lattice containing the given order, called the Dedekind-MacNeille completion. This becomes very important for this paper: firstly, the elasticity of the continuum can be described in a formal conceptual way, and secondly, the generalization of the classical unit interval $[0,1]$ to an arbitrary lattice or even an arbitrary ordered set can also be captured by building the Dedekind-MacNeille completion of the chosen ordered set. This gives us a very useful freedom: we can choose the logic as a meaningful ordered set, for example as the concept lattice of a formal context.

2 BASIC NOTIONS IN L-FUZZY THEORY

Let $X$ be a set and $(L, \le)$ an ordered set, i.e. $\le$ is a reflexive, antisymmetric and transitive relation on the set $L$. $(L, \le)$ is often chosen as a lattice, mostly as the complete lattice $([0,1], \le)$ of the real unit interval with the usual real ordering.
The set $F(X,L) := \{ f \mid f: X \to L \}$ is called the set of all L-Fuzzy sets (or L-membership functions) on $X$. The order relation $\le$ on $L$ induces an order relation on $F(X,L)$ which will also be denoted by $\le$: for any $f_1, f_2 \in F(X,L)$ let $f_1 \le f_2$ iff $\forall x \in X: f_1(x) \le f_2(x)$. Then $(F(X,L), \le)$ is also an ordered set, and it is a (complete) lattice if $(L, \le)$ is a (complete) lattice.

3 BASIC NOTIONS IN FORMAL CONCEPT ANALYSIS

Formal Concept Analysis was introduced by Wille (1982). The interested reader is also referred to Ganter, Wille (1996), Wille (1996) and, for a short introduction, to Wolff (1994). We recall the central ideas and the basic notions. Formal Concept Analysis is based on a formalization of the philosophical understanding of a concept and on the conviction that human thinking and communication always take place in contexts which determine the specific meaning of the concepts used (Wille 1996). Therefore Formal Concept Analysis starts with the formalization of contexts:

A (formal) context is defined as a triple $K = (G, M, I)$, where $G$ and $M$ are sets and $I$ is a binary relation between $G$ and $M$, i.e. $I \subseteq G \times M$. If $(g,m) \in I$ we write “$g\,I\,m$” and read it “the object $g$ has the attribute $m$”. For a given formal context $K = (G, M, I)$ the formal concepts of $K$ are introduced as pairs $(A, B)$ such that $A \subseteq G$, $B \subseteq M$, $A^{\uparrow} = B$ and $B^{\downarrow} = A$, where the upper derivation $A^{\uparrow}$ of $A$ in $K$ is defined by $A^{\uparrow} := \{ m \in M \mid \forall g \in A: g\,I\,m \}$ and the lower derivation is $B^{\downarrow} := \{ g \in G \mid \forall m \in B: g\,I\,m \}$. $A$ is called the extent and $B$ the intent of the concept $(A, B)$.

On the set $B(K)$ of all concepts of $K$ the subconcept-superconcept relation $\le$ is defined by: $(A_1, B_1) \le (A_2, B_2)$ if $A_1 \subseteq A_2$ (which is equivalent to $B_2 \subseteq B_1$). The ordered set $(B(K), \le)$ is a complete lattice, called the concept lattice of $K$. The context $K$ can be reconstructed from its concept lattice since $G$ is the extent of the largest concept, $M$ is the intent of the smallest concept and $g\,I\,m$ iff $\gamma g \le \mu m$, where $\gamma g := (\{g\}^{\uparrow\downarrow}, \{g\}^{\uparrow})$ is the object concept of $g$ and $\mu m := (\{m\}^{\downarrow}, \{m\}^{\downarrow\uparrow})$ is the attribute concept of $m$.
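The derivation operators and the reconstruction rule recalled above can be tried out on a small example. The following Python sketch is only an illustration: the toy context (objects `g1`, `g2`, `g3`, attributes `a`, `b`) and all function names are hypothetical, and the brute-force enumeration over all subsets of the object set is an assumption suitable only for very small contexts.

```python
from itertools import combinations

def up(A, M, I):
    """Upper derivation: all attributes shared by every object in A."""
    return frozenset(m for m in M if all((g, m) in I for g in A))

def down(B, G, I):
    """Lower derivation: all objects having every attribute in B."""
    return frozenset(g for g in G if all((g, m) in I for m in B))

def concepts(G, M, I):
    """All formal concepts (extent, intent), found by closing every subset of G."""
    found = set()
    for r in range(len(G) + 1):
        for A in combinations(sorted(G), r):
            B = up(A, M, I)
            found.add((down(B, G, I), B))
    return found

def object_concept(g, G, M, I):
    B = up({g}, M, I)
    return (down(B, G, I), B)

def attribute_concept(m, G, M, I):
    A = down({m}, G, I)
    return (A, up(A, M, I))

# Hypothetical toy context, not taken from the paper.
G = {"g1", "g2", "g3"}
M = {"a", "b"}
I = {("g1", "a"), ("g2", "a"), ("g2", "b"), ("g3", "b")}

lattice = concepts(G, M, I)
for A, B in sorted(lattice, key=lambda c: len(c[0])):
    print(sorted(A), sorted(B))
```

Comparing extents by set inclusion then gives the subconcept-superconcept order, and checking `object_concept(g, ...)[0] <= attribute_concept(m, ...)[0]` for every pair recovers the incidence relation of the toy context.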
Each finite concept lattice $(B(K), \le)$ can be represented by a line diagram, which is a Hasse diagram of $(B(K), \le)$ such that the point representing an object concept $c$ is labeled with the names of all objects $g$ such that $\gamma g = c$. The set of these objects is called the contingent of $c$, its cardinality the contingency number or the frequency of $c$. Dually, a point representing an attribute concept $c$ is labeled with the names of all attributes $m$ such that $\mu m = c$. Therefore the context $K$ can be reconstructed from a line diagram by the reading rule: an object $g$ has an attribute $m$ iff there is an ascending path from the point labeled “g” to the point labeled “m”. Examples of contexts and concept lattices are given in section 7.

4 THE CONCEPT LATTICE OF AN L-FUZZY SET

The central connection between Fuzzy sets and crisp (usual) sets is given by the cuts: for each L-Fuzzy set $f \in F(X,L)$ and each $\alpha \in L$ the $\alpha$-cut of $f$ is defined to be the set $f_\alpha := \{ x \in X \mid f(x) \ge \alpha \}$. The $\alpha$-cuts of $f$ are used to describe L-Fuzzy sets by concept lattices.

Definition: Let $X$ be a set and $(L, \le)$ an ordered set. For each L-Fuzzy set $f \in F(X,L)$ let $I_f := \{ (\alpha, x) \mid f(x) \ge \alpha \}$ and $K_f := (L, X, I_f)$. $K_f$ is called the cut context of the L-Fuzzy set $f$.

The name “cut context” was chosen since for each object $\alpha \in L$ the intent of the object concept $\gamma(\alpha)$ is just the $\alpha$-cut $f_\alpha$. The set $C(f) := \{ f_\alpha \mid \alpha \in L \}$ is called the cut system of $f$.

Example: Classical Fuzzy sets. Let $X := \mathbb{R}$ be the set of real numbers and $L = [0,1]$ the real unit interval ordered by the usual real order $\le$. We call the Fuzzy sets in $F(\mathbb{R}, [0,1])$ the classical Fuzzy sets. Figure 1 shows a classical Fuzzy set, some of its cuts and some concepts of the concept lattice of the cut context.

Figure 1: A classical Fuzzy set and some cuts - the concept lattice of its cut context is a chain

In Figure 1 we indicate that the concept lattice of a classical Fuzzy set is a chain. This is a consequence of the following embedding theorems.

Embedding Theorem 1: Let $X$ be a set, $(L, \le)$ an ordered set and $f \in F(X,L)$.
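Then $\gamma: L \to B(K_f)$, where $\gamma(\alpha) := (\alpha^{\uparrow\downarrow}, \alpha^{\uparrow})$ is the object concept of $\alpha$, is an order preserving map from $(L, \le)$ into $(B(K_f), \le)$. $\gamma$ is an order embedding iff for all $\alpha, \beta \in L$: $(f_\alpha \supseteq f_\beta \Rightarrow \alpha \le \beta)$.

Proof: Let $\alpha, \beta \in L$, then $\alpha \le \beta \Rightarrow f_\alpha \supseteq f_\beta \Leftrightarrow \gamma(\alpha) \le \gamma(\beta)$.

The well-known representation theorem (Böhme 1993) for $[0,1]$-Fuzzy sets on a set $X$ states that each $[0,1]$-Fuzzy set $f$ can be represented by the family $(f_\alpha \mid \alpha \in [0,1])$ of its cuts using the formula $f(x) = \sup\{ \min\{\alpha, (c(f_\alpha))(x)\} \mid \alpha \in [0,1] \}$, where $c(f_\alpha)$ denotes the characteristic function of $f_\alpha$. This can be generalized to an arbitrary order $(L, \le)$ by the following theorem:

Representation theorem for cut contexts of L-Fuzzy sets: Let $X$ be a set, $(L, \le)$ an ordered set and $f \in F(X,L)$. Then $f$ can be reconstructed from the cut context $K_f := (L, X, I_f)$ and $(L, \le)$ by the formula $f(x) = \max\{ \alpha \in L \mid \alpha\,I_f\,x \}$, i.e. $f(x)$ is the maximum in $(L, \le)$ of the extent of the attribute concept of $x$ in $(B(K_f), \le)$.

Proof: Let $x \in X$, then $f(x) = \max\{ \alpha \in L \mid \alpha \le f(x) \} = \max\{ \alpha \in L \mid \alpha\,I_f\,x \}$.

Definition: Let $(L, \le)$ be an ordered set. The concept lattice of the context $(L, L, \le)$ is called the Dedekind-MacNeille completion of $(L, \le)$, denoted by $DM(L)$.

It is well known (Davey, Priestley 1990; Ganter, Wille 1996) that $DM(L)$ is (up to isomorphism) the smallest complete lattice into which order embeddings from $(L, \le)$ exist. An order embedding $\iota: L \to DM(L)$ is defined by $\iota(\alpha) := ((\alpha], [\alpha))$, where $(\alpha] := \{ \beta \in L \mid \beta \le \alpha \}$ and $[\alpha) := \{ \beta \in L \mid \beta \ge \alpha \}$. This order embedding preserves suprema and infima as far as they exist in $(L, \le)$.

Embedding Theorem 2: Let $X$ be a set, $(L, \le)$ an ordered set and $f \in F(X,L)$. Then the Dedekind-MacNeille embedding $\iota$ restricted to $fX$, $\iota|_{fX}: fX \to DM(L)$, satisfies $\iota|_{fX} = \varepsilon \circ \delta \circ \gamma|_{fX}$, where $\gamma: L \to B(K_f)$ is the object concept mapping of $K_f$, $\delta: B(K_f) \to B(S_f)$, $\delta((A,B)) := (A, A^{\uparrow})$ (derivation in $S_f$) is a lattice isomorphism onto the concept lattice of $S_f := (L, fX, \le)$, and $\varepsilon: B(S_f) \to DM(L)$, $\varepsilon((A,B)) := (A, A^{\uparrow})$ (derivation in $(L, L, \le)$) is an infimum-preserving order embedding.

Proof: Let $x \in X$ and $\alpha := f(x)$, then $\gamma(\alpha) = (f_\alpha^{\downarrow}, f_\alpha) = ((\alpha], f_\alpha)$. The last equation holds since $(\alpha] \subseteq f_\alpha^{\downarrow} = \{ \beta \in L \mid \forall z \in X\, ( f(z) \ge \alpha \Rightarrow f(z) \ge \beta ) \}$ and on the other side $f_\alpha^{\downarrow} \subseteq (\alpha]$ since $\alpha = f(x)$.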
The lattice isomorphism $\delta$ fixes extents, hence $\delta\gamma(\alpha) = ((\alpha], (\alpha]^{\uparrow}) = ((\alpha], [\alpha) \cap fX)$. Therefore $\varepsilon\delta\gamma(\alpha) = \varepsilon((\alpha], [\alpha) \cap fX) = ((\alpha], [\alpha)) = \iota(\alpha)$.

Example: If $X$ is an arbitrary set, $([0,1], \le)$ the usual complete lattice on the unit interval and $f \in F(X, [0,1])$, then the mapping $\varepsilon \circ \delta$ is an order embedding of the concept lattice $B(K_f)$ into the Dedekind-MacNeille completion of $([0,1], \le)$, which is isomorphic to the chain $([0,1], \le)$. Hence $B(K_f)$ is a chain.

5 FUZZY SCALING THEORY

The central observation which combines Fuzzy Theory and Conceptual Scaling Theory and leads to the development of Fuzzy Scaling Theory is that linguistic variables and conceptual scales serve the same purpose, namely a formal description of a language about the values of some measurement function. We now transfer the main ideas of Conceptual Scaling Theory to Fuzzy Theory. Zadeh (1973, 1975) introduced the notion of a linguistic variable in the following way:

Def.: (Zadeh 1975) “By a linguistic variable we mean a variable whose values are words or sentences in a natural or artificial language. For example, Age is a linguistic variable if its values are linguistic rather than numerical, i.e., young, not young, very young, quite young, old, not very old and not very young, etc., rather than 20, 21, 22, 23,....”

The following definition introduces the notion of a linguistic variable over an arbitrary ordered set.

Def.: A linguistic variable (over an ordered set $(L, \le)$) is a quintuple $(X, V, \mu, L, \le)$, where $X$ is a set (called the “domain”), $V$ is a set (of “linguistic values”), $(L, \le)$ is an ordered set and $\mu$ is a mapping $\mu: V \to F(X, L)$ which represents each linguistic value $v$ by an L-Fuzzy set $\mu_v := \mu(v)$ on $X$. A classical linguistic variable is a linguistic variable over the usual ordering of the unit interval $([0,1], \le)$.

Def.: For each linguistic variable $\lambda := (X, V, \mu, L, \le)$ the scale $S_\lambda$ of $\lambda$ is defined by the formal context $S_\lambda := (X, V \times L, I_\lambda)$, where for $x \in X$ and $(v, \alpha) \in V \times L$: $x\,I_\lambda\,(v, \alpha) :\Leftrightarrow \alpha \le \mu_v(x)$.
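The scale of a linguistic variable and the Representation Theorem for cut contexts can be checked together on a small case. The sketch below is only an illustration under stated assumptions: the logic is a finite chain of rational numbers (so that Python's built-in `max` computes the maximum in the order), and the domain of three ages with the linguistic values `young` and `old` and their membership degrees are hypothetical, not data from the paper.

```python
from fractions import Fraction as Fr

# Assumed finite chain L with the usual order of rational numbers.
L = [Fr(0), Fr(1, 2), Fr(1)]
# Hypothetical domain X and membership functions mu[v][x].
X = [20, 40, 60]
mu = {
    "young": {20: Fr(1), 40: Fr(1, 2), 60: Fr(0)},
    "old":   {20: Fr(0), 40: Fr(1, 2), 60: Fr(1)},
}

def cut(f, alpha):
    """alpha-cut of a fuzzy set f given as a dict x -> degree."""
    return {x for x in f if f[x] >= alpha}

def scale(X, mu, L):
    """Incidence of the scale: x is related to (v, alpha) iff alpha <= mu_v(x)."""
    return {(x, (v, a)) for x in X for v in mu for a in L if a <= mu[v][x]}

def reconstruct(I, v, x, L):
    """Recover mu_v(x) as the largest alpha with x related to (v, alpha)."""
    return max(a for a in L if (x, (v, a)) in I)

I = scale(X, mu, L)
recovered = {v: {x: reconstruct(I, v, x, L) for x in X} for v in mu}
print(recovered == mu)
```

For each fixed linguistic value, the columns `(v, alpha)` of this scale are exactly the cuts of the membership function of `v`, which is why no information is lost when passing from the membership functions to the scale.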
For $v \in V$ the subcontext $S_v := (X, \{v\} \times L, I_\lambda \cap (X \times (\{v\} \times L)))$ of $S_\lambda$ is isomorphic to the dual of the cut context $K_{\mu_v} = (L, X, \{ (\alpha, x) \mid \alpha \le \mu_v(x) \})$. Hence $S_\lambda$ is isomorphic to the apposition (cf. Ganter, Wille 1996, p.40) $\big|_{v \in V} (K_{\mu_v})^d$ of all dual cut contexts of the membership functions $\mu(v)$ of the linguistic variable $\lambda$ (for examples see section 7). Therefore the concept lattice $B(S_\lambda)$ can be supremum-embedded into the direct product of the concept lattices $B((K_{\mu_v})^d)$, which can be supremum-embedded into the direct product $DM(L^d)^V$ of $V$ copies of the dual of the Dedekind-MacNeille completion $DM(L)$ (by Embedding Theorem 2).

Now we introduce the combination of a measurement function with a linguistic variable.

Def.: Let $\lambda := (X, V, \mu, L, \le)$ be a linguistic variable and $m: G \to X$ a function from an arbitrary set $G$ into $X$. Then the tuple $(G, m, \lambda) := (G, m, X, V, \mu, L, \le)$ is called a realized linguistic variable of $\lambda$, and $\lambda$ is called the linguistic variable of $(G, m, \lambda)$. The function $m$ is called a measurement function from the set $G$ of objects into the domain $X$ of the linguistic variable $\lambda$.

Def.: Let $\rho := (G, m, X, V, \mu, L, \le)$ be a realized linguistic variable over $(L, \le)$. Then the formal context $K_\rho := (G, V \times L, J_\rho)$, where the incidence relation $J_\rho$ is defined for each $g \in G$ and each $(v, \alpha) \in V \times L$ by $g\,J_\rho\,(v, \alpha) :\Leftrightarrow \alpha \le \mu_v(m(g))$, is called the derived context of the realized linguistic variable $\rho$. For $v \in V$ the subcontext $K_{\rho,v} := (G, \{v\} \times L, J_\rho \cap (G \times (\{v\} \times L)))$ of $K_\rho$ is called the derived context of the linguistic value $v$. Hence the derived context $K_\rho$ is the apposition of the derived contexts of the linguistic values: $K_\rho = \big|_{v \in V} K_{\rho,v}$.

If $\lambda := (X, V, \mu, L, \le)$ is the linguistic variable of $\rho := (G, m, X, V, \mu, L, \le)$ there is a simple relation between the scale $S_\lambda$ and the derived context $K_\rho$: for each $g \in G$ and each $(v, \alpha) \in V \times L$

$g\,J_\rho\,(v, \alpha) \Leftrightarrow \alpha \le \mu_v(m(g)) \Leftrightarrow m(g)\,I_\lambda\,(v, \alpha)$.

Hence the concept lattice $B(K_\rho)$ can be supremum-embedded into the concept lattice $B(S_\lambda)$ (Ganter, Wille 1996, p.98).

The following definition introduces the product of realized linguistic variables on the same set of objects.
Def.: Let $\rho := (G, m, X, V, \mu, L, \le)$ and $\rho' := (G, m', X', V', \mu', L', \le')$ be two realized linguistic variables on the same set $G$ of objects. The mapping $m \times m': G \to X \times X'$ defined by $(m \times m')(g) := (m(g), m'(g))$ is called the product of the two measurement functions $m$ and $m'$. The mapping $\mu \times \mu': V \times V' \to F(X \times X', L \times L')$ is defined by $(\mu \times \mu')(v, v') := \mu_v \times \mu'_{v'}$, where $(\mu_v \times \mu'_{v'})(x, x') := (\mu_v(x), \mu'_{v'}(x')) \in L \times L'$ for all $(x, x') \in X \times X'$. $(L \times L', \le)$ is the usual product of the ordered sets $(L, \le)$ and $(L', \le')$, where $(a, a') \le (b, b') :\Leftrightarrow a \le b$ and $a' \le' b'$ (for all $a, b \in L$, $a', b' \in L'$). Then the tuple $\rho \times \rho' := (G, m \times m', X \times X', V \times V', \mu \times \mu', L \times L', \le)$ is a realized linguistic variable over the product $(L \times L', \le)$, called the product of $\rho$ and $\rho'$. The quintuple $\lambda \times \lambda' := (X \times X', V \times V', \mu \times \mu', L \times L', \le)$ is called the product of the corresponding linguistic variables $\lambda$ and $\lambda'$.

Discussion of this definition:
1. The product $\rho \times \rho'$ of two realized linguistic variables $\rho$ and $\rho'$ clearly contains all the information about $\rho$ and $\rho'$.
2. If one tries to define another product of two classical linguistic variables $\lambda$ and $\lambda'$ which is again a classical linguistic variable and contains all the information about $\lambda$ and $\lambda'$, then the information about $\lambda$ and $\lambda'$ represented in the product $[0,1] \times [0,1]$ has to be represented in $[0,1]$. As far as I can see there is no natural way to do this. This difficulty does not occur in the definition above since we used an arbitrary ordered set instead of the usual ordering on the unit interval: the direct product of two ordered sets is an ordered set, while the direct product of two intervals is not an interval.

6 CONCEPTUAL INTERPRETATION OF FUZZY IMPLICATIONS

Let $\rho := (G, m, X, V, \mu, L, \le)$ and $\rho' := (G, m', X', V', \mu', L', \le')$ be two realized linguistic variables on the same set $G$ of objects. Let $v \in V$, $v' \in V'$. What should be the meaning of the Fuzzy implication “If $m$ is $v$, then $m'$ is $v'$”?
In classical Fuzzy Theory one tries to represent the meaning of this implication by the construction of a Fuzzy relation $R \in F(X \times X', [0,1])$ which depends only on the Fuzzy sets $\mu(v)$ and $\mu'(v')$. Examples are the Kleene-Dienes implication, the Łukasiewicz implication, the stochastic implication, the Goguen implication, the Gödel implication, the Sharp implication and the Mamdani implication (Driankov et al. 1993, pp. 85-93), and they all have some disadvantages (Hellendoorn 1990, 1992).

In contrast to this approach the following conceptual interpretation is emphasized: instead of constructing a Fuzzy relation $R \in F(X \times X', [0,1])$, which depends neither on the measurement functions $m$ and $m'$ nor on the object set $G$, we construct a formal context depending on the two given realized linguistic variables. The central idea is the representation of the meaning of the Fuzzy implication “If $m$ is $v$, then $m'$ is $v'$” in terms of the distribution of the objects over the object concepts of this formal context. The most important candidate for a suitable context is the derived context $K_{\rho \times \rho'}$ of the direct product $\rho \times \rho' = (G, m \times m', X \times X', V \times V', \mu \times \mu', L \times L', \le)$ of the realized linguistic variables $\rho$ and $\rho'$. To use the conceptual implications between attributes of $K_{\rho \times \rho'}$ we need the incidence relation $I$ of $K_{\rho \times \rho'} = (G, (V \times V') \times (L \times L'), I)$, which is given by

$g\,I\,((v, v'), (\alpha, \alpha')) \Leftrightarrow (\alpha, \alpha') \le (\mu \times \mu')(v, v')([m \times m'](g)) \Leftrightarrow (\alpha \le \mu_v(m(g))$ and $\alpha' \le' \mu'_{v'}(m'(g))) \Leftrightarrow (g\,J_\rho\,(v, \alpha)$ and $g\,J_{\rho'}\,(v', \alpha'))$.

Hence there is an infimum-preserving order embedding from the concept lattice $B(K_{\rho \times \rho'})$ into the concept lattice of the apposition $K_\rho \mid K_{\rho'}$ of the derived contexts $K_\rho$ and $K_{\rho'}$ of the realized linguistic variables $\rho$ and $\rho'$. This embedding is very useful for the interpretation of dependencies between the given realized linguistic variables: in practice the set $G$ of measured objects is finite, hence the concept lattices of $K_\rho$, $K_{\rho'}$ and $K_{\rho \times \rho'}$ are also finite.
Therefore the number of attribute concepts in these concept lattices is finite, even if the sets $V$ and $V'$ of linguistic values and the logics $L$ and $L'$ are infinite. There are several possibilities to represent dependencies between some chosen attributes. The main tools are implications (Ganter, Wille 1996, p.79), partial implications (Luxenburger 1991) and dependency tables (Wehrle 1997, Wolff 1998).

In some situations it is meaningful to interpret the Fuzzy implication “If $m$ is $v$, then $m'$ is $v'$” as an implication between attributes of the apposition $K_\rho \mid K_{\rho'}$: for some suitably chosen $(\alpha, \alpha') \in L \times L'$ the implication $((v, \alpha) \to (v', \alpha'))$ is valid in $K_\rho \mid K_{\rho'}$, which means:

$\forall g \in G\, ( \alpha \le \mu_v(m(g)) \Rightarrow \alpha' \le' \mu'_{v'}(m'(g)) )$.

This “implication interpretation” is sometimes “too crisp”, especially if implications occur which are “nearly” valid, namely valid up to some counterexamples. We relax the implication to a partial implication and interpret the meaning of the given Fuzzy implication by the following “accuracy interpretation”: for some suitably chosen $(\alpha, \alpha') \in L \times L'$ and some $p \in [0,1]$: $n(\alpha, \alpha') / n(\alpha) \ge p$, where $n(\alpha, \alpha')$ is the number of objects $g$ which satisfy ($\alpha \le \mu_v(m(g))$ and $\alpha' \le' \mu'_{v'}(m'(g))$) and $n(\alpha)$ is the number of objects $g$ which satisfy $\alpha \le \mu_v(m(g))$. This means that the implication $((v, \alpha) \to (v', \alpha'))$ has an accuracy of at least $p$ (Luxenburger 1991).

In practice it is much more informative to understand the whole object distribution (in the subcontext of $K_\rho \mid K_{\rho'}$ which has all attributes belonging to $v$ and $v'$) than to look only at the implications or partial implications mentioned above. Our main tool to understand this object distribution is its dependency table. It represents, in the form of a modified contingency table, the deviations of the object distribution from a certain expected object distribution, for example the object distribution which can be expected under the assumption of stochastic independence of the given attributes $m$ and $m'$.
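The implication interpretation and the accuracy interpretation described above both reduce to counting objects in the derived contexts. The following Python sketch illustrates this under stated assumptions: the five objects, the two measurement functions `m` and `m2`, the membership functions and the chain of rational degrees are all hypothetical, and the function name `accuracy` is introduced here only for illustration.

```python
from fractions import Fraction as Fr

def accuracy(G, m, mu_v, alpha, m2, mu2_v, alpha2):
    """Return n(alpha, alpha') / n(alpha) for the partial implication
    (v, alpha) -> (v', alpha'), or None if no object satisfies the premise."""
    premise = [g for g in G if alpha <= mu_v[m[g]]]
    if not premise:
        return None
    both = [g for g in premise if alpha2 <= mu2_v[m2[g]]]
    return Fr(len(both), len(premise))

# Hypothetical data: five objects, two measurements with values 1, 2, 3.
G = ["g1", "g2", "g3", "g4", "g5"]
m = {"g1": 3, "g2": 3, "g3": 2, "g4": 1, "g5": 3}
m2 = {"g1": 3, "g2": 2, "g3": 3, "g4": 1, "g5": 3}
# Hypothetical membership functions of one linguistic value per variable.
mu_high = {1: Fr(0), 2: Fr(1, 2), 3: Fr(1)}
mu2_high = {1: Fr(0), 2: Fr(1, 2), 3: Fr(1)}

strict = accuracy(G, m, mu_high, Fr(1), m2, mu2_high, Fr(1))
relaxed = accuracy(G, m, mu_high, Fr(1), m2, mu2_high, Fr(1, 2))
print(strict, relaxed)
```

An accuracy of 1 means that the implication is valid in the apposition of the two derived contexts; smaller values quantify how many objects satisfying the premise are counterexamples.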
The dependency table leads to further useful interpretations of the Fuzzy implication “If $m$ is $v$, then $m'$ is $v'$”. For details the reader is referred to Wehrle (1997), Wolff (1998).

7 EXAMPLES

The following example of a (realized) linguistic variable and its (derived) context is taken from data of a university ranking (SPIEGEL 1989, 1990; WOLFF 1992). In the chosen part of the SPIEGEL questionnaire, which concerns the quality of research and teaching of their professors, 158 Bavarian students answered several questions using the values 1,...,6 (1 := very bad, 6 := very good); missing values are denoted by “/”. The meaning of these values can be formalized by the following linguistic variable $\lambda := (X, V, \mu, L, \le)$, where $X := \{1, 2, 3, 4, 5, 6, /\}$, $V := \{\text{bad}, \text{good}, \text{missing}\}$ and $L := \{0, 1/3, 2/3, 1\}$ is ordered by the usual ordering of rational numbers. The chosen membership functions and the scale $S_\lambda$ of the linguistic variable $\lambda$ are given in the following table, where a column such as b1/3 abbreviates the attribute (bad, 1/3) and a cross in row $x$ and column $(v, \alpha)$ means $\alpha \le \mu_v(x)$:

x | bad  good miss
1 | 1    0    0
2 | 2/3  0    0
3 | 1/3  0    0
4 | 0    1/3  0
5 | 0    2/3  0
6 | 0    1    0
/ | 0    0    1

x | b0   b1/3 b2/3 b1  | g0   g1/3 g2/3 g1  | m0   m1/3 m2/3 m1
1 | x    x    x    x   | x                  | x
2 | x    x    x        | x                  | x
3 | x    x             | x                  | x
4 | x                  | x    x             | x
5 | x                  | x    x    x        | x
6 | x                  | x    x    x    x   | x
/ | x                  | x                  | x    x    x    x

Table 1: The membership functions of the linguistic variable $\lambda$ and the scale $S_\lambda$

The concept lattice $B(S_\lambda)$ is represented by the line diagram in Figure 1.1:

Figure 1: The line diagram of the scale $S_\lambda$, the corresponding conceptual scale and the line diagram of its derived context

Reading example in Figure 1.1: The object 2 in $S_\lambda$ has exactly the attributes (bad, 2/3), (bad, 1/3), (bad, 0), (miss, 0) and (good, 0). The lattice structure of Figure 1.1 can be described with fewer and more easily understood attributes, namely the attributes “1”,..., “6” and “/”. The concept lattice of the corresponding conceptual scale is represented in Figure 1.2.
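The scale of this example can be recomputed directly from the membership functions of Table 1. The following Python sketch does so and lists the attributes of the object 2 from the reading example; the variable and function names are chosen here for illustration only, and the linguistic value “missing” is abbreviated as `miss`.

```python
from fractions import Fraction as Fr

# Logic L and domain X of the example; "/" denotes a missing value.
L = [Fr(0), Fr(1, 3), Fr(2, 3), Fr(1)]
X = ["1", "2", "3", "4", "5", "6", "/"]
z = Fr(0)
# Membership functions as given in Table 1.
mu = {
    "bad":  dict(zip(X, [Fr(1), Fr(2, 3), Fr(1, 3), z, z, z, z])),
    "good": dict(zip(X, [z, z, z, Fr(1, 3), Fr(2, 3), Fr(1), z])),
    "miss": dict(zip(X, [z, z, z, z, z, z, Fr(1)])),
}

def intent(x):
    """All scale attributes (v, alpha) of the value x, i.e. alpha <= mu_v(x)."""
    return {(v, a) for v in mu for a in L if a <= mu[v][x]}

def extent(v, a):
    """All values x having the scale attribute (v, alpha)."""
    return frozenset(x for x in X if a <= mu[v][x])

# Distinct attribute extents of the scale.
attribute_extents = {extent(v, a) for v in mu for a in L}

for v, a in sorted(intent("2")):
    print(v, a)
```

The twelve scale attributes have only eight distinct extents, since the three attributes with degree 0 all have the whole domain as extent and the three positive degrees of `miss` all have the extent consisting of “/” alone.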
If we construct the derived context of the conceptual scale of Figure 1.2 for the question about the quality of the consultation of professors, we get the concept lattice represented in Figure 1.3, which is isomorphic to the concept lattice $B(K_\rho)$ of the corresponding realized scale $\rho := (G, m, \lambda)$, where $G$ is the set of the 158 students and $m(g)$ is the value given by student $g$ for the consultation question.

Reading example in Figure 1.3: There are exactly 10 students who judged the quality of the consultation of their professors as “very bad”.

It is obvious that the lattice in Figure 1.2 represents a meaningful frame for an embedding of the values of the set $X$ and can therefore be understood as a “nonlinear” logic, that is, an ordered set which is not a chain. To describe this idea precisely we take the inverse order $(L_1, \le_1)$ of the lattice in Figure 1.2 (or the inverse order $(L_2, \le_2)$ of the seven labeled points in Figure 1.2) as a meaningful nonlinear logic of a new linguistic variable $\lambda'$ with only one membership function, indicated by the labels 1,..., 6, / in Figure 1.2. Then the concept lattice $B(S_{\lambda'})$ is isomorphic to $B(S_\lambda)$. This example shows that the logic of a linguistic variable should and can be chosen as a meaningful and possibly nonlinear order (for example the concept lattice of a suitable context).

We do not have enough space in this paper to discuss concrete examples of the above-mentioned interpretations of Fuzzy implications using conceptual implications, partial implications and dependency tables for the direct product of two realized linguistic variables.

8 CONCLUSIONS

This is the first paper introducing Fuzzy Scaling Theory, developed from Conceptual Scaling Theory. The underlying basic observation is that linguistic variables play the same role in Fuzzy Theory as conceptual scales in Formal Concept Analysis.
Therefore the narrow linear logics and the insignificant abstract logics can be replaced by meaningful logics, for example concept lattices combining meaningful attributes with values of given measurements. Since any ordered set can be embedded into a complete lattice, the Dedekind-MacNeille completion of an ordered set plays an important role as the minimal complete lattice into which the given ordered set can be embedded. This leads to the notion of an L-Fuzzy set over an arbitrary ordered set $(L, \le)$. The central role of the cuts of a membership function is the reason for the introduction of the cut context of an L-Fuzzy set. This makes it possible to describe linguistic variables by conceptual scales without any loss of information, which allows the ideas of Conceptual Scaling Theory to be transferred into Fuzzy Scaling Theory. This results in the definition of the direct product of (realized) linguistic variables, which cannot be formulated in the restricted theory of linear logics since the direct product of linear logics is not a linear logic. I believe that this was the main obstacle in the development of classical Fuzzy Theory.

A second obstacle lies in the difficulty of distinguishing between the role of the set $G$ of measured objects (in our example the set of students) and the role of the set $X$ of values of the given measurement. Both sets are used as sets of formal objects: $X$ as the object set of the scale of a linguistic variable and $G$ as the object set of the derived context of a realized linguistic variable. The introduction of Fuzzy Scaling Theory overcomes these difficulties and leads to an interpretation of Fuzzy implications in terms of the object distribution in the derived context of the product of realized linguistic variables.
In this paper I did not discuss any of the difficult problems which arise if we relate the lattice of the real unit interval to the usual truncated algebraic operations and try to understand the meaning of these operations. I only interpreted and extended the ordinal part of Fuzzy Theory, which uses only the ordinal structure of the logic $L$.

Finally I would like to mention some connections to the book on Fuzzy Concepts by Silke Pollandt (1996), which is based on her doctoral thesis (Umbreit 1995). This book is the first one combining Fuzzy Theory and Formal Concept Analysis. It is based on L-Fuzzy algebras in the sense of Wechler (1978) and introduces L-Fuzzy contexts and their fuzzy concept lattices. Each L-Fuzzy context can be viewed as a formal description of a linguistic variable and contains the scale of this linguistic variable as a subcontext, but the relation to Conceptual Scaling Theory is not formally developed in this book.