Structural and Quantitative Characteristics of Information from the Point of View of Invariance
Marcin Jan Schroeder
Akita International University, Akita, Japan
mjs@aiu.ac.jp
Philosophy of Information and Information Processing, PIIP 2015 at Pembroke College Oxford, March 27, 2015

Main Points
1. The role of invariance (symmetry) in science, mathematics, physics, philosophy, etc.
2. Invariance was already considered in the context of the measure of information by Hartley.
3. Nontrivial forms of invariance require some structural constraints. This leads to structural aspects of information.
4. Dualism of selective and structural manifestations of information.
5. Quantitative characteristics of the selective manifestation of information.
6. Qualitative characteristics of the structural manifestation of information.
7. Quantitative characteristics m(L) and m*(L) of the structural manifestation of information.

Invariance and the Scientific Method: Physics
Objective description of reality, i.e. description invariant (or covariant) with respect to the change of "observer" (or reference frame). Consider transformations of the description corresponding to changes of the reference frame; then find the invariants of such transformations. In mechanics: translation in time – energy, translation in space – momentum, rotation – angular momentum.

Invariance and the Scientific Method: Mathematics
Geometry in the Erlangen Program of Felix Klein (1872): the study of invariants with respect to a specific group of transformations of the plane. For Euclidean geometry these are the transformations which preserve the distance between points. A change of the symmetry group leads to a different type of geometry. In topology, we are interested in the invariants of continuous transformations, etc. Transformations form the algebraic structure of a group with respect to the composition of transformations. This group is a symmetry group for the invariants of the transformations.

Invariance and the Scientific Method: Other disciplines
Chemistry: The symmetry of molecules (i.e. the group of transformations preserving the configuration of atoms in the molecule) is associated with chemical properties of compounds.
Geology: Classification of minerals based on the symmetry of their crystal structure.
Biology: The mystery of the absence in nature of the symmetry between organic molecules of the same chemical composition but of mirror-image configuration (chirality), a symmetry which is present in laboratory synthesis.
Philosophy: Structuralism (Jean Piaget) as a methodological tool based on the idea of the study of invariants of relevant transformations.

Symmetry Breaking Marks Many Critical Moments of Science
A majority of the important moments in the history of science came with discoveries of symmetry breaking. Examples: (1) Aristarchus of Samos – the asymmetry in the sizes of the Sun and the Earth (the former much bigger than the latter, established as a result of ingenious experimental work) as an argument for the heliocentric model. (2) Special Relativity – the breaking of the Galileo group symmetry in Maxwell's equations, leading to the transition to the Lorentz/Poincaré group. (3) The discovery of the violation of parity conservation in the beta decay of Cobalt-60, leading to the unification of the weak nuclear and electromagnetic interactions. Another type of "symmetry crisis": the Loschmidt (1876) / Kelvin (1874) paradox: mechanics is invariant with respect to time reversal, but thermodynamics is not (entropy cannot decrease in time!). How can we claim that thermodynamic phenomena can be derived from mechanics? Boltzmann's answer – the statistical interpretation of entropy.
Contribution of Ralph Hartley to Information Theory
In his famous 1948 paper Claude Shannon made reference to the paper of Ralph Hartley [Hartley, R. V. L. (1928). Transmission of Information. Bell Sys. Tech. J., 7(3), 535-563] and in the 1980s gave Hartley credit for the inspiration which stimulated his own work, but Hartley's work deserves more recognition than just: "I started with information theory, inspired by Hartley's paper, which was a good paper, but it did not take account of things like noise and best encoding and probabilistic aspects" (interview by Ellersick, 1984). Actually, Hartley does write about these matters, but he comes to different conclusions and goes in a different direction.

Information Theory Started from the Study of Invariance
The derivation (on p. 540) of Hartley's formula for the measure of information, H = n log_m(s), where s is the number of symbols, n the number of selections, and m an arbitrary integer base of the logarithm, involved the assumption of invariance (!). Hartley considered physical "primary symbols" grouped (ENCODING!) to represent psychologically determined "secondary symbols", and derived the formula from the invariance with respect to the grouping of primary symbols to represent secondary ones. Hartley also considered invariance with respect to the language of communication!
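Hartley's invariance requirement can be checked with a short calculation. The sketch below is an illustration added here, not Hartley's own notation (the function name and the chosen numbers are assumptions for the example): when secondary symbols are formed by grouping k primary symbols from an alphabet of size s1, so that s2 = s1^k, a message of n2 secondary selections corresponds to n1 = k·n2 primary selections, and the measure H = n log s takes the same value in both descriptions.

```python
import math

def hartley_information(n, s):
    """Hartley's measure H = n * log2(s): n selections from an alphabet of s symbols."""
    return n * math.log2(s)

# Primary alphabet of s1 symbols; each secondary symbol is a group of k primary symbols.
s1, k = 2, 3            # e.g. binary primary symbols grouped in threes
s2 = s1 ** k            # 8 secondary symbols obtainable from such groups
n2 = 10                 # number of secondary selections in a message
n1 = k * n2             # the same message counted in primary selections

print(hartley_information(n2, s2))   # 30.0, computed from the secondary description
print(hartley_information(n1, s1))   # 30.0, computed from the primary description
# The measure is invariant under the regrouping of primary into secondary symbols;
# this invariance is the requirement from which the logarithmic form is derived.
```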
Structural Characteristics of Information
If we want to go further into the invariance of information, we have to introduce a conceptual framework including characteristics of the structural aspects of information. In a purely probabilistic study of information, as in the work of Hartley or Shannon (there is nothing in entropy that is not already in the probability distribution, so its subject is rather the selection of an element out of a given set described by a probability distribution than a theory of information), there are no explicit constraints on the symmetry group, and no structural characteristics are considered. Hartley addressed the issue of invariance, but only from the point of view of making the constraints minimal, i.e. of eliminating structural aspects. In the following, a general approach to information will be utilized in which both selection and structure have their roles.

Preliminaries: Information Defined in Terms of the One-Many Opposition
Information is an identification of a variety. Alternative: Information is that which gives unity to a variety.
• The one-many opposition is categorical (undefined).
• The word "variety" (many) is understood in a very general (categorical) way, as the opposite of "one"; in its common-sense meaning it is synonymous with the words "many", "plurality", "multiplicity", "set."
• Identification of a variety is understood as that which makes the many one.
M. J. Schroeder, Philosophical Foundations for the Concept of Information: Selective and Structural Information. Proceedings of the Third International Conference on the Foundations of Information Science, Paris 2005. http://www.mdpi.org/fis2005/

Preliminaries: Two Basic Manifestations of Information
There are two most fundamental ways this "one-ness" can be realized:
• by the selection of one element of the variety out of many,
• or by the structure on the variety which binds the elements into unity.
So, we can consider two different manifestations of information related to these two forms of identification: selective and structural, coexisting with each other. The choice (selection) of an element in the variety requires that the elements have some structure distinguishing (identifying) them. This relation between selection and structure makes the manifestations dual.

Preliminaries: General Theory of Information in a Nutshell (1)
An information system consists of: an information carrier = variety = set S with a Moore family ℳ of subsets of S (or, equivalently, a closure operator f on the set S for which ℳ is its family of closed subsets). The choice of this family is responsible for the type of information system (geometric, topological, logical, etc.). HERE ENTERS THE STRUCTURAL ASPECT OF INFORMATION! If ℳ consists of all subsets of S, we get the orthodox Shannon type of information theory. Information itself is a dually-hereditary subfamily ℳ0 of ℳ, closed with respect to finite intersections (a filter). It gives the set S unity either by facilitating the selection of its element(s) or by giving S some structure. The alternative, but equivalent, formulation is based on the concept of a closure operator.

Preliminaries: General Theory of Information (2)
In the formalism based on the concept of a closure operator f: 2^S → 2^S satisfying, for all A, B ⊆ S: A ⊆ f(A), A ⊆ B ⇒ f(A) ⊆ f(B), and f(f(A)) = f(A), the Moore family ℳ is the set of f-closed subsets (those with f(A) = A), and it acquires a complete lattice structure Lf with respect to inclusion. This lattice Lf can be interpreted as a logic of the information system. Its decomposability describes the level of information integration. In the orthodox "information theory" we have the trivial closure operator f: 2^S → 2^S with f(A) = A for every A ⊆ S. All subsets are closed and the logic of information is an atomic Boolean algebra. In this special case information is associated with the relation R defined for x ∈ S and A ⊆ S by: xRA iff x ∈ A. In the more general case we have the relation R defined for x ∈ S and B ⊆ S by: xRB iff x ∈ f(B).
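The definitions above can be made concrete in a few lines of code. The sketch below is an illustration added to the text (the four-point carrier and the particular "interval" closure are assumptions chosen for the example, not taken from the slides): it checks the three closure axioms for all subsets of S and lists the resulting Moore family of f-closed sets, which, ordered by inclusion, forms the complete lattice Lf.

```python
from itertools import chain, combinations

S = {0, 1, 2, 3}   # a small information carrier

def closure(A):
    """An 'interval' (convex-hull) closure on points of a line: a simple geometric closure operator."""
    A = set(A)
    if not A:
        return frozenset()
    return frozenset(x for x in S if min(A) <= x <= max(A))

def subsets(X):
    """All subsets of X as frozensets."""
    X = list(X)
    return [frozenset(c) for c in chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))]

# Verify the closure axioms: A <= f(A); A <= B implies f(A) <= f(B); f(f(A)) = f(A).
for A in subsets(S):
    assert A <= closure(A)
    assert closure(closure(A)) == closure(A)
    for B in subsets(S):
        if A <= B:
            assert closure(A) <= closure(B)

# The Moore family of f-closed subsets; ordered by inclusion it forms the lattice Lf.
moore_family = sorted({closure(A) for A in subsets(S)}, key=lambda X: (len(X), sorted(X)))
print([set(X) for X in moore_family])
# Closed sets here are the empty set, the singletons and the "intervals" {i, ..., j}.
# Replacing closure() by the identity map gives the trivial case in which every subset
# is closed and Lf is a Boolean algebra (the orthodox Shannon-type information system).
```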
Preliminaries: General Theory of Information (3)
Shannon explored (without much success) the study of the linguistic structure of messages through conditional probabilities of characters in sequences of increasing length. Hartley considered structural characteristics in the audio-visual encoding of information, but without any specific idea of structural constraints. Bar-Hillel and Carnap, in order to develop a semantic theory of information, tried to involve the logical structure of language, but with an equally vague connection between information and structure. René Thom in "Structural Stability and Morphogenesis" gave the clearest recognition of the fact that it is structure that carries information, but his understanding of structure was quite narrow (geometric or topological structures).

Preliminaries: General Theory of Information (4)
In the present approach, the restriction on the type of a structure carrying information is minimal. On the other hand, structures can be very complex, requiring an arbitrarily rich symmetry group of transformations. The main point of difference between the traditional (non-structural) approach and the present one is that in the former information is associated with the fact of belonging to some arbitrary set (the information that x has some property is equivalent to x ∈ A, where A consists of all elements with this property). Here, not every set can be associated with some property, only sets which have structural meaning appropriate for the given case. Selective and structural manifestations of information are dual, in the sense that one always coexists with the other.

Quantitative Characteristics of Information
Two manifestations of information:
• Selective manifestation: by the selection of one element of the variety out of many,
• Structural manifestation: by the structure on the variety which binds the elements into a whole.
The selective manifestation can be associated with the study of information initiated by Hartley and Shannon; there are many quantitative measures when the selection is described by a probability distribution. The structural manifestation can be associated with the attempts to study information in terms of form, but all these attempts were qualitative. Is an invariant quantitative characterization of the structural manifestation possible?

Quantitative Characteristics of Selective Manifestation of Information (1)
Classical approach of Shannon:
H = - Σ_{i=1..n} p_i log2(p_i), where Σ_{i=1..n} p_i = 1 and p_i ≥ 0 for all i.
The index i represents here each of the elements of the information carrier S. We can see that every transformation T of the set S (a bijective function) preserves the value of H, as long as the probability assigned to the element i in S equals the probability assigned to T(i) in T(S). If we can ignore structural aspects of information, or if within the structure a probability distribution has meaning, we can use this, or any other "measure of information" (i.e. a statistical measure of choice) based on a probability distribution. However, for many (in fact, the majority of) structures there is no meaningful association of a probability distribution with the structure of the information carrier.

Quantitative Characteristics of Selective Manifestation of Information (2)
The study of information within a system, rather than in the context of communication, has its consequences. For this reason Schrödinger introduced the concept of "negative entropy", later renamed "negentropy". But this creates other problems. A better solution is to use an alternative measure:
Inf(n,p) = Σ_{i=1..n} p_i log2(n p_i), where Σ_{i=1..n} p_i = 1 and p_i ≥ 0 for all i.
Obviously Inf(n,p) = Hmax – H(n,p). Then we have a characteristic of the degree of determination of information, or a relative measure of information:
Inf*(n,p) = Σ_{i=1..n} p_i log_n(n p_i), where Σ_{i=1..n} p_i = 1 and p_i ≥ 0 for all i.
Or, more simply, Inf*(n,p) = Inf(n,p)/Infmax, and then 0 ≤ Inf*(n,p) ≤ 1. (M. J. Schroeder, "An Alternative to Entropy in the Measurement of Information", Entropy, 2004, 6, 388-412.)
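The formulas on the last two slides can be compared directly on a small probability distribution. The sketch below is illustrative (the distribution is an arbitrary assumption for the example): it computes H, Inf(n,p) and Inf*(n,p), confirms that Inf(n,p) = Hmax − H(n,p) with Hmax = log2(n), and shows that H is unchanged by a permutation (bijective relabelling) of the carrier.

```python
import math

def H(p):
    """Shannon entropy H = -sum p_i log2(p_i), terms with p_i = 0 omitted."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def Inf(p):
    """Inf(n, p) = sum p_i log2(n p_i); equals Hmax - H with Hmax = log2(n)."""
    n = len(p)
    return sum(pi * math.log2(n * pi) for pi in p if pi > 0)

def Inf_star(p):
    """Relative measure Inf*(n, p) = Inf(n, p) / log2(n), so 0 <= Inf* <= 1."""
    return Inf(p) / math.log2(len(p))

p = [0.5, 0.25, 0.125, 0.125]
n = len(p)

print(H(p))                          # 1.75
print(Inf(p), math.log2(n) - H(p))   # both 0.25, i.e. Inf = Hmax - H
print(Inf_star(p))                   # 0.125

# H depends only on the multiset of probabilities, so it is invariant under any
# bijective transformation (relabelling) of the carrier S:
print(H([0.125, 0.5, 0.125, 0.25]))  # 1.75 again
```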
How to Characterize Structural Manifestation of Information?
The level of information integration describes the specific type of the structure imposed on the variety. This structure may have different levels of integration, related to its decomposability into component structures. Decomposability of the structure can be described in terms of the reducibility (or irreducibility) of the logic Lf of the information system into a direct product of component lattices. Marcin J. Schroeder, Quantum Coherence without Quantum Mechanics in Modeling the Unity of Consciousness. In P. Bruza, et al. (Eds.), QI 2009, LNAI 5494, Springer, pp. 97-112, 2009. Quantum mechanics provides examples of completely integrated information systems, but there are many other examples, for instance geometric information systems.

Logic of Completely Disintegrated Information System Described by Boolean Algebra
[Lattice diagrams: the Boolean logic displayed as a direct product of its components]
Complete Reduction to Coherent Components
[Lattice diagrams: the complete factorization into coherent components]
Logic of Completely Integrated Information System
The logic of this system cannot be represented as a product of logics of any other structures. It represents a logic of a completely integrated information system. It is not a quantum information system, as the logic does not satisfy some additional conditions.

Other Examples of Logics for Completely Integrated Information Systems (Classic Non-distributive Lattices M5 and N5)
[Diagrams: the Hasse diagrams of M5 and N5]

Formal Tools for Analysis of the Level of Information Integration
In this and the following slides references will be made to: Birkhoff, G. (1967). Lattice Theory, 3rd ed. Providence, R.I.: American Mathematical Society Colloquium Publications, Vol. XXV. The main tool for the reducibility/irreducibility of posets is the concept of the center (Chpt. III, 8):
Def. The center of a poset P with 0 and 1 is the set C of elements (called "central elements") which have one component 0 and the other 1 under some direct factorization of P.
Thm. 10. The center C of a poset P with 0 and 1 is a Boolean lattice in which joins and meets represent joins and meets in P.

Formal Tools for Analysis of the Level of Information Integration
Def. An element a of a lattice L with 0 and 1 is neutral iff (a, x, y) ∈ D for all x, y in L, i.e. the triple a, x, y generates a distributive sublattice of L.
Thm. 12. The center of a lattice with 0 and 1 consists of its complemented, neutral elements.
Fact. 0 and 1 are central elements in every poset with 0 and 1.
It follows from Thm. 12 above that the lattices M5 and N5 are irreducible and that every Boolean lattice is identical with its center. We can observe that although the Exchange Property of Steinitz
(wE) ∀A ⊆ S ∀x, y ∈ S: x ∉ f(A) & x ∈ f(A ∪ {y}) ⇒ y ∈ f(A ∪ {x})
does not by itself imply complete irreducibility, it does so if the closure of every two-element set has at least three elements.

Quantitative Characteristics of Structural Manifestation of Information
FROM NOW ON WE WILL ASSUME THAT THE LOGIC OF INFORMATION Lf IS FINITE!
Lemma: If lattices L1 and L2 have centers C1 and C2 respectively, then the direct product L1 x L2 has C1 x C2 as its center. It is a simple corollary of Thm. 11. We will write |L| for the number of elements in the set L. Now we can show, using Thm. 10 and Thm. 11, that the number of irreducible components of the logic Lf is log2(|C|). This number gives us some indication regarding the reducibility of the logic (complete irreducibility corresponds to the value 1; an increase indicates that the number of irreducible components is increasing). But as long as we do not know the size of the logic, the value of such a measure is limited. It is better to consider first a measure of complexity m(L).

Quantitative Characteristics of Structural Manifestation of Information
Def. Measure of complexity of the logic L: m(L) = log2(|L|/|C|) = log2(|L|) - log2(|C|).
Then, if L is a Boolean lattice (completely reducible), m(L) = 0, but when L is completely irreducible, m(L) = log2(|L|) – 1. Since the center is preserved by all lattice automorphisms, so is m(L). Also, m is "semi-additive" in the sense that m(L1 x L2) = m(L1) + m(L2). In particular, for a logic L = L1 x L2 x L3 x … x Lk where all Li are irreducible, in agreement with the definition we have m(L) = log2(|L|) – k = log2(|L|) – log2(|C|).

Quantitative Characteristics of Structural Manifestation of Information
On the previous slide m(L) was called a measure of complexity, not of irreducibility, as it increases to infinity with the size of the logic L. We have the simple irreducible two-element logic with m(L) = 0, as it is a Boolean lattice. Also, for the completely irreducible logics with lattices M5 and N5 we have m(M5) = m(N5) = log2(5/2) ≈ 1.32, while there are many reducible (although not completely reducible) logics with higher values of m(L). So, in order to have a measure of irreducibility, we can introduce a relative measure m* which is an invariant of transformations preserving the structure of information:
Def. For lattices with at least two elements: m*(L) = m(L)/(mmax + 1) = m(L)/log2(|L|) = log2(|L|/|C|) / log2(|L|), where mmax is the maximum complexity m(L) for a logic of size |L|.
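Both m(L) and m*(L) reduce to simple arithmetic once |L| and |C| are known. The sketch below is illustrative (the lattice and center sizes are entered by hand from the examples in the text rather than computed from the lattices themselves): it reproduces the values quoted for M5 and N5, shows that a Boolean logic has m = m* = 0, and checks the semi-additivity of m on a direct product, using |L1 x L2| = |L1|·|L2| and, by the Lemma above, |C1 x C2| = |C1|·|C2|.

```python
import math

def m(size_L, size_C):
    """Measure of complexity m(L) = log2(|L|) - log2(|C|)."""
    return math.log2(size_L) - math.log2(size_C)

def m_star(size_L, size_C):
    """Measure of information integration m*(L) = m(L) / log2(|L|)."""
    return m(size_L, size_C) / math.log2(size_L)

# Completely irreducible five-element logics M5 and N5: |L| = 5, center C = {0, 1}.
print(m(5, 2), m_star(5, 2))       # ~1.32 and ~0.57, as quoted in the text

# A Boolean logic on a 3-element carrier: |L| = 8 and L is identical with its center.
print(m(8, 8), m_star(8, 8))       # 0.0 and 0.0

# Semi-additivity of m on a direct product, e.g. L = M5 x N5:
print(m(5 * 5, 2 * 2), m(5, 2) + m(5, 2))   # both ~2.64
```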
Quantitative Characteristics of Structural Manifestation of Information
From the definition, m*(L) = log2(|L|/|C|) / log2(|L|) = 1 – [log2(|C|) / log2(|L|)]. Then it is easy to see that if L is a Boolean lattice, m*(L) = 0, but when L is completely irreducible, m*(L) = 1 – [1/log2(|L|)]. So 0 ≤ m*(L) < 1, and for completely irreducible logics m* is an increasing function of the size of L with limit 1 at infinity. m*(M5) = m*(N5) = 1 – 1/log2(5) ≈ 0.57; m*(D10) ≈ 0.70. It is not a surprise that m* is not semi-additive (in the sense in which m is), because m* measures irreducibility: when we have a product of logics, we cannot expect an increase of irreducibility. Instead we have a logarithmically weighted mean:
m*(L1 x L2) = α m*(L1) + β m*(L2), where α + β = 1, α = log2(|L1|) / [log2(|L1|) + log2(|L2|)] and β = log2(|L2|) / [log2(|L1|) + log2(|L2|)].

Conclusion
Both measures, the measure of complexity m(L) and the measure of information integration m*(L), are invariants of all transformations which preserve the logic of information. There are many cases in which there is a high level of information integration (the logic of information is not Boolean), but we can still define a generalized form of probabilistic measure. For those cases, we can consider both types of the measure of information. An example can be found in quantum mechanical information systems. A further question is how to measure the selective manifestation for an information system whose logic does not admit orthocomplementation, so that the concept of a probabilistic measure does not make sense.

References (Selection)
Theory of selective and structural information:
Schroeder, M. J. (2011). From Philosophy to Theory of Information. Intl. J. Information Theories and Applications, 18(1), 56-68.
Schroeder, M. J. (2014). Algebraic Model for the Dualism of Selective and Structural Manifestations of Information. In M. Kondo (Ed.), RIMS Kokyuroku, No. 1915. Kyoto: Research Institute for Mathematical Sciences, Kyoto University, pp. 44-52.
Information integration in terms of irreducibility into a product of component structures:
Schroeder, M. J. (2009). Quantum Coherence without Quantum Mechanics in Modeling the Unity of Consciousness. In P. Bruza, et al. (Eds.), QI 2009, LNAI 5494, Springer, pp. 97-112.
Structural analysis of complexity in terms of information:
Schroeder, M. J. (2013). The Complexity of Complexity: Structural vs. Quantitative Approach. In: Proceedings of the International Conference on Complexity, Cybernetics, and Informing Science CCISE 2013, Porto, Portugal. http://www.iiis-summer13.org/ccise/VirtualSession/viewpaper.asp?C2=CC195GT&vc=58/
Logic and information:
Schroeder, M. J. (2012). Search for Syllogistic Structure of Semantic Information. Journal of Applied Non-Classical Logics, 22, 101-127.