Visualization of Weighted Lattices for Data Analysis
Tim Hannan, Lance Miller, Alex Pogel
Physical Science Laboratory, New Mexico State University
apogel@psl.nmsu.edu

Weighted lattices
• A weighted lattice is a pair (L, w: L → [0,1]), where L is a (join semi-)lattice and w is an order-preserving map (level of generality to be explained)
• (For us) L is always finite
• This talk: provide motivation (from applications) for defining criteria by which to judge methods of drawing weighted lattices

Our Motivation
General task: provide analysis of time-series data
Need: a visualization of time-series data, e.g. as an interval is moved across a timeline
[Figure: an interval sliding along a timeline from t0 to tf]
Goal: near real-time analysis tool

Formal Concept Analysis (FCA)
FCA background: input binary relation I ⊆ G × M → Galois correspondence P(G) ⇄ P(M) → closure system
Result: the concept lattice (aka Galois lattice, etc.) is labeled and used for analysis
The concept lattice is an example of a weighted lattice, with the relative cardinality of domain sets used to define weight (let w(C) be the relative size of the extent)

Application Problems
1. The lattice representation of data (a binary relation) can be highly sensitive (in terms of cardinality) to minor variation in the data
2. Order-theoretic presentations of lattices are often ineffective for the task of finding weak implications (naturally occurring or introduced by noise)

Problem 1: highly sensitive
Two 5 × 5 contexts that differ in a single entry (a1, o5):

First context
      o1  o2  o3  o4  o5
a1     1   1   1   1   1
a2     1   1   0   0   1
a3     1   0   0   0   1
a4     0   0   1   0   1
a5     0   0   1   1   1

Second context
      o1  o2  o3  o4  o5
a1     1   1   1   1   0
a2     1   1   0   0   1
a3     1   0   0   0   1
a4     0   0   1   0   1
a5     0   0   1   1   1

Proposed Solution
1. Extend the usual order-based drawings of lattices to include two factors: order and weight values
2. Compare various weight functions, some involving order only and some with support, and use these to evaluate
3.
Introduce various criteria for judging a weight function with respect to the variation it introduces in the face of minor variation in data

OUTLINE
• Basic constructions of Formal Concept Analysis, plus attribute logic and association rules
• The problems and a new drawing tool
• Theoretical examples and existing data sets
• Criteria

Basic construction
Input binary relation: I ⊆ G × M
Maps:
(-)’: P(G) → P(M), H’ = {m in M : for all h in H, h I m},
(-)’: P(M) → P(G), dually,
yield a Galois correspondence between P(G) and P(M),
which induces a closure system, a complete lattice
More directly: the complete meet subsemilattice of P(G) that is generated by {{m}’ : m in M} (aka the smallest topped intersection structure in P(G) generated by {{m}’ : m in M})

Basic construction
The domain and codomain closure systems (on G and M) are dually isomorphic, so we consider one lattice, represented by closed set pairs (H,N) [H’ = N, N’ = H], where H is in P(G) and N is in P(M), ordered to reflect one of the closure systems: inclusion in the 1st coordinate

Important labeling
The lattice is labeled:
({g}’’,{g}’) is labeled g
({m}’,{m}’’) is labeled m
I is preserved in the lattice: g I m iff ({g}’’,{g}’) ≤ ({m}’,{m}’’)

Animal Context
[Table: 35 × 11 binary context. Objects: animal1 through animal35. Attributes: has_fur, has_scales, 4legged, warm-blooded, air_breather, lives_in_water, livestock, common_pet, man-eater, nocturnal, lays_eggs]
Binary relation is preserved

Weight
function from “extent”
The concept lattice is an example of a weighted lattice (L, w: L → [0,1]), where we use the relative cardinality of domain sets to define w for C = (H,N):
$w(C) = \frac{|\pi_1(C)|}{|G|} = \frac{|\mathrm{extent}(C)|}{|G|} = \frac{|H|}{|G|}$

View weight (via color) for global view
[Figures: 10-chain with uniform weights; 10-chain with mostly low weights. Color scale (%): 100, 88.5, 75, 66, 50, 45, 33, 25, 12.5]

Implications in the labeled lattice
Interpreting: view G as a set of objects, M as a set of attributes, and I as the satisfaction relation; then
for n, m in M, n → m iff ({n}’,{n}’’) ≤ ({m}’,{m}’’), i.e. {n}’ ⊆ {m}’: “Every object that satisfies n also satisfies m”
And for A, B subsets of M, A → B iff (A’,A’’) ≤ (B’,B’’)

Example: Animals
A subinterval of the lattice:
fourlegged implies airbreather
pet implies warm-blooded (iguana?)
and pet and nocturnal implies fur

Association rules = Weak Implications
An oft-seen topic in data mining is the mining of association rules from a data set (binary relation I)
An association rule is a pair (A,B), with A, B subsets of M, interpreted to say
“in cases where A holds, B also holds” (weakened implication)
“in the event of A, event B also occurs” (conditional event)
Important additional information is needed to evaluate a pair: Confidence(A,B) and Support(A,B)

Support
The function supp: P(M) × P(M) → [0,1] outputs
$\mathrm{supp}((A,B)) = \frac{|A' \cap B'|}{|G|}$
[the percent of overall evidence for which the rule is positively witnessed]
Grocery Example: with G being shoppers and M being items they may have bought, supp(beer → pretzels) = 0.22 means that of all grocery shoppers, 22% bought both beer and pretzels

Confidence
The function conf: P(M) × P(M) → [0,1] outputs
$\mathrm{conf}((A,B)) = \frac{|A' \cap B'|}{|A'|}$
[the percent of those instances where the hypothesis holds for which the conclusion also holds]
Grocery Example: conf(beer → pretzels) = 0.84 means that of those shoppers who bought beer, 84% also bought pretzels

Use
of support & confidence
• The user indicates support and confidence thresholds, to filter the massive output (2^(2|M|) rules)
• In practice, setting supp and conf may require trial and error to find values that give presentable info
• Like FCA, this is an exploratory data analysis tool
• E.g., a Boston University CS (GNU GPL) tool: ARMiner (short for Association Rule Miner)

livestock → fur, 80% confidence
Identified because of the similarity in color between “livestock” and the concept node below it
Support = 11%

Support of a concept
• Define the support of a concept C = (H,N) (as for w) by
  $s(C) = \frac{|\mathrm{extent}(C)|}{|G|} = \frac{|\pi_1(C)|}{|G|}$
• So that the support of an implication (viewed as an association rule with 100% confidence) is the support of its premise
• And the support of an association rule is the support of the concept through which it is expressed:
  $\mathrm{supp}((A,B)) = \frac{|A' \cap B'|}{|G|} = s(C)$
[Figure: lattice nodes labeled A, B, and C]

SARS data: 43 × 13 … 105 concepts
NOTE: 2^13 = 8192

Violent Deaths (MA, 2000) data: 5% cutoff threshold, to battle screen bottleneck
Violent Deaths (MA, 2000) data: towards OR

Utility as part of the KDD Process
• Needs attention given to data preparation (work)
• Needs more attention to training/testing, for built-in verification of discovered rules
• No domain-specific constructions (advantage?)
• Does not scale without clustering (universal?)

Potential Over-sensitivity
Problem: the cardinality of the concept lattice arising from a given data set can vary drastically with only minor changes in the data set
This is a problem for applications because
1. Noise is often present in data (for many reasons)
2. Changes in data are introduced for analysis
Focus: on lattice diagrams that do not vary much, then definitions of what that means, when minor changes occur in the data set
Desire: an idea of the degree to which this is possible

3. Worst case analysis: exponential
[Tables: four n × n contexts with objects g1, …, gn and attributes a1, …, an, all entries 1 except for zeros on the diagonal]
• No zeros (all entries 1): yields 1-element lattice
• One zero, at (g1, a1): yields 2-chain
• Two zeros, at (g1, a1) and (g2, a2): yields 2^2
• n zeros, one at each (gi, ai): yields 2^n = P({1,2,…,n})
Here each change of 1/n^2 induces a doubling of lattice cardinality

Countries: orgs & weapons
[Table: 243 × 13 binary context of countries (Afghanistan through Niger shown) against the attributes N, seekN, C, B, M, NATO, OPEC, EU, UNSC, OIC, G8, SST, AL]
3159 = 243 × 13 entries, comparing original with 1% noise and 2% noise

data                      # of concepts
Original                  65
1% noise = 32 changes     100
2% noise = 64 changes     140

Animal context
[Table: the 35 × 11 animal context]
35 × 11 = 385 entries

data                      # of concepts
Original                  35
1% noise = 4 changes      41   (+17% over previous row)
2% noise = 8 changes      59   (+44% over previous row)
3% noise = 12 changes     62   (+5% over previous row)
Overall increase from original to 3% noise: 77%

Problem 2.
usual presentations of lattices can obscure weak implications, as node placement uses covering relations
Recall: conf(livestock → fur) = 0.8
Same rule, new diagram (two slides)
Zoom in, two methods
80% of livestock have fur: confidence(livestock → fur) = 0.8

Connected spaces
• Counterexamples in Topology (28 spaces, 10 props.)
• Table of connected spaces modified at 1 posn.

GraphWin: Experimental tool
• Details on this drawing program (largely unfinished)
• Constructed to quickly generate multiple alternatives
• To test the range of possibilities and be able to generate and evaluate criteria for the problem
• Two other excellent drawing programs (also Java):
  – ConExp (Sergey Yevtushenko): nice interface, useful controls, available at SourceForge
  – LatDrawWin (Ralph Freese): control over attraction and repulsion, avail. at www.math.hawaii.edu/~ralph/LatDraw

Choices in GraphWin
Vector sums chosen vs. level-wise criteria optimization
Choose representation method, to govern vector sums
• Additive line diagram (ALD): use positions in intent(C)
• Use vectors of upper covers, and symmetry (or some other rule) in meet irreducible cases
The algorithm then uses the covering relation to march down through the lattice, treating new elements only after their upper covers are placed

Additive Line diagram for poset P
• Introduce a set representation: an order-embedding (or dual order-embedding) rep: P → P(X)
  Can set X = G (X = M), or use irreducible versions
• And a placement of the representation set: vec: X → R^3
• And then position each vector based on that information:
  $pos(p) = n + \sum_{x \in rep(p)} vec(x)$

Upper covers method for P
Place maximal elements on a circle in z = 0, centered at (0,0,0)
For further lower covers,
1. If an element has more than one upper cover, use the vector sum of their positions
2.
If element has only one upper cover x, find all such lower covers of x, and distribute all uniformly about a circle d units below

Choices in GraphWin
Choose a height dampening method, to provide a z-value at each vector placement
• No dampening
• Longest path layer: to each element a we assign height s, the length of the longest path to the top (negated)
• Balanced height (Freese): to a we assign height r-s+k, where r is s(0) in [0,a] and k is s(0) in L
• Support
• Log(support+1)
• Weight value (from (L, w: L → [0,1])), which includes f(support) for any order-preserving f: [0,1] → [0,1]
Notice that each choice can be converted into some o.p. w: L → [0,1]

Apply weight values
Once all vector positions are determined, replace coordinates (x_p, y_p, z_p) by
$\frac{w(p)}{|z_p|}\,(x_p, y_p, z_p)$
In particular, this forces the new z-value to be w(p) in absolute value

Choices in GraphWin
Choose level improvements: once a level has all its positions given,
• Shift the level so that its center of mass matches the center of mass of all the previous levels’ points, projected into one z = c plane; this amounts to the xy-position of 1_L, (0,0)
• Dilate (expand/contract) the elements about the center of mass (e.g. to preserve density)
• Other local optimizations are available

4. Theoretical examples
To see the difference between order-theoretic drawing and order+support drawing
• 2^5 and 2^4
• A near-Boolean algebra

[Figure slides]
ConExp on 2^5
LatDrawWin on 2^5
GraphWin on 2^5: distension; ALD vs Upper Covers, no dampening
GraphWin on 2^5: ALD vs Upper Covers, LPL; ALD vs Upper Covers, Freese
GraphWin on 2^5: ALD vs Upper Covers, support
2^3 × 2^2: ALD vs Upper Covers, LOG(support)
GraphWin on 2^5: 2^3 × 2^2, ALD vs Upper Covers, support
2^4: LatDrawWin
2^4: GraphWin (three slides)
Various forms of BA3: ALD-support, UC-support (three slides)

4. Isolated Cluster: as input; no dampening; LatDrawWin
4. Isolated Cluster: ALD, Freese & LPL
4.
Isolated Cluster: ALD & UppCov, Freese
4. Isolated Cluster: ALD, Support & Log(Support)
4. Isolated Cluster: ALD & UppCov, no height control
4. Isolated Cluster: UppCov, Support & Log(Support)
Slight node movement

5. Using the tool on real data sets
Now we consider real data sets, original vs. noisy versions:
• Animal context
• Countries: organizations and weapons
• Stock exchanges, 1986 (some time shifting)
To see how the different lattice drawing methods fare with the sensitivity problem

Animal context
35 × 11 = 385 entries

data                      # of concepts
Original                  35
1% noise = 4 changes      41   (+17% over previous row)
2% noise = 8 changes      59   (+44% over previous row)
3% noise = 12 changes     62   (+5% over previous row)
Overall increase from original to 3% noise: 77%

Animal Comparisons [figure slides]
UC-BH: orig.; 1% noise; 2% noise; 3% noise
UC-BH: orig. vs 1% vs 2%
ALD-LPL: orig. vs 1% vs 2%
LatDrawWin: orig. vs 1% vs 2%
ALD-Support: orig. vs 1% vs 2%
ALD-Support: orig.; 1%; 2%; 3%

Countries: orgs & weapons
[Table: the 243 × 13 countries context shown earlier]
3159 = 243 × 13 entries, comparing original with 1% and 2% noise versions

table        # of concepts
original     65
1% noise     100
2% noise     140

[Figure slides]
ALD-no dampening: orig. (65 concepts); 1% change (100); 2% change (140)
ALD-support: orig. (65 concepts); 1% change (100); 2% change (140)
ALD-LPL vs BH
Lung Cancer, Bird Keeping: orig.; 10% noise

Stock Exchanges UP & DOWN ’86-’87
• Now we view six-month periods, with only one-week shifts, yielding max. 4% change (5/125)
• ALD-Support (resp. ALD-LOG(Support)) does well in preserving structure

Six month intervals, 1 week shifts

Weeks shifted   # concepts
0               122
1               122
2               120
3               120
4               118
5               121
6               120
7               120
8               123
9               128
10              128
11-14           122

[Figures: 1/1986-6/1986 with 0 through 12 shifts, one diagram per shift. Color scale (%): 100, 88.5, 75, 66, 50, 45, 25, 12.5]

Compact spaces, noised up
[Chart: # of concepts (axis 0 to 350) against # of errors (axis 0 to 12)]

# of errors   # of concepts
0             101
1             107
2             114
3             127
4             135
5             140
6             151
7             177
8             225
9             230
10            237
11            311

[Figure slides]
Compact – ALD-support: 0 through 11 errors (one slide each)
Compact spaces UC-support: 0, 1, 2, 3 errors (one slide each)
Compact spaces
UC-support: 4, 5, 6, 7, 8, 9, 10, 11 errors (one slide each)
Compact spaces, noised up, ALD: 0,1,2,3; 4,5,6,7; 8,9,10,11; 12,13,14,15
Compact spaces UC-support: 0,1,2,3; 4,5,6,7; 8,9,10,11; 12,13,14,15

6. Need Criteria
• Need a distance function to measure the overall change in the lattices, drawn using weight functions, when the dataset is changed
• Idea: take the max of radii that can be placed around all nodes of one lattice so that all nodes of the other are captured within (compute in both directions)
• Count the balls only according to values of sim: L1 × L2 → [0,1]
• OR: use these balls to discuss edges; every edge in one lattice must be between balls with edges in the other
• A function of the radius: the percent of the larger lattice covered by balls of the smaller lattice, in combination with the radii
[Figure: Shift right lattice onto left lattice]

References
R. Agrawal, T. Imielinski, and A. Swami, Mining association rules between sets of items in large databases, In ACM SIGMOD Intl. Conf. Management of Data, May 1993.
R. Freese, LatDrawWin.java, at http://www.math.hawaii.edu/~ralph/LatDraw/
B. Ganter and R. Wille, Formal Concept Analysis: Mathematical Foundations, Springer 1999.
G. Stumme, R. Taouil, Y. Bastide, N. Pasquier, and L. Lakhal, Computing iceberg concept lattices with Titanic, In Data & Knowledge Engineering, 42 (2002), pp. 189-222.
S. Yevtushenko, http://sourceforge.net/projects/conexp, Release 1.0 (2002); now Release 1.1 is available (May 2003).
M. Zaki and M.
Ogihara, Theoretical Foundations of Association Rules, In Proceedings of 3rd SIGMOD'98 Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD'98), Seattle, Washington, USA, June 1998.
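
Worked example (illustrative sketch)
The basic construction, the support and confidence formulas, and the worst-case analysis from the talk can be tried out on small contexts. The Python sketch below computes concept extents by closing the attribute extents {m}’ under intersection. It is not the GraphWin code; the function names (prime, extents, support, confidence, diagonal_context) and the five-shopper data are hypothetical, chosen only for illustration, and the value returned by confidence when A’ is empty is an assumed convention.

```python
def prime(context, attrs):
    """A': the objects that have every attribute in attrs."""
    return frozenset(g for g, row in context.items() if set(attrs) <= row)


def extents(context, attributes):
    """Concept extents: G together with all intersections of the sets {m}'."""
    closed = {frozenset(context)}  # start from the top extent G
    for m in attributes:
        m_extent = prime(context, {m})
        # add the intersection of {m}' with every extent found so far
        closed |= {e & m_extent for e in closed}
    return closed


def support(context, a, b):
    """supp((A, B)) = |A' ∩ B'| / |G|."""
    return len(prime(context, a) & prime(context, b)) / len(context)


def confidence(context, a, b):
    """conf((A, B)) = |A' ∩ B'| / |A'|; 1.0 if A' is empty (assumed convention)."""
    a_ext = prime(context, a)
    return len(a_ext & prime(context, b)) / len(a_ext) if a_ext else 1.0


def diagonal_context(n, zeros):
    """n x n context of all 1s, except that g_i lacks a_i for the first `zeros` indices."""
    attrs = [f"a{j}" for j in range(n)]
    ctx = {
        f"g{i}": {a for j, a in enumerate(attrs) if not (i == j and i < zeros)}
        for i in range(n)
    }
    return ctx, attrs


# Hypothetical shopper data (not from the talk): object -> set of attributes.
shoppers = {
    "s1": {"beer", "pretzels"},
    "s2": {"beer", "pretzels"},
    "s3": {"beer"},
    "s4": {"pretzels"},
    "s5": set(),
}

# Worst case: each extra zero on the diagonal doubles the number of concepts.
counts = [len(extents(*diagonal_context(5, k))) for k in range(6)]
print(counts)  # [1, 2, 4, 8, 16, 32]

print(support(shoppers, {"beer"}, {"pretzels"}))     # 2 of 5 shoppers bought both
print(confidence(shoppers, {"beer"}, {"pretzels"}))  # 2 of the 3 beer buyers
```

The weight w(C) of a concept with extent H is len(H) / len(context) in this representation. The closure loop can produce exponentially many extents, which is exactly what the diagonal example exhibits.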