Structural and Quantitative Characteristics of Information from the Point of View of Invariance
Marcin Jan Schroeder
Akita International University, Akita, Japan
mjs@aiu.ac.jp
Philosophy of Information and Information Processing (PIIP) 2015, Pembroke College, Oxford, March 27, 2015
Main Points
1. The role of invariance (symmetry) in science: mathematics, physics, philosophy, etc.
2. Invariance was already considered in the context of the measure of information by Hartley.
3. Nontrivial forms of invariance require some structural constraints. This leads to structural aspects of information.
4. Dualism of selective and structural manifestations of information.
5. Quantitative characteristics of the selective manifestation of information.
6. Qualitative characteristics of the structural manifestation of information.
7. Quantitative characteristics m(L) and m*(L) of the structural manifestation of information.
Invariance and the Scientific Method
Physics
Objective description of reality, i.e. invariant (or
covariant) with respect to the change of
“observer” (or reference frame).
Consider transformations of the description
corresponding to changes of the reference
frame.
Then, find the invariants of such
transformations.
In mechanics: translation in time – energy,
translation in space – momentum, rotation –
angular momentum
Invariance and the Scientific Method
Mathematics
Geometry in the Erlangen Program of Felix Klein (1872):
the study of invariants with respect to a specific group of
transformations of the plane. For Euclidean geometry these are the
transformations which preserve distances between points.
A change of the symmetry group leads to a different type of
geometry.
In topology, we are interested in the invariants of
continuous transformations, etc.
Transformations form an algebraic structure of a group with
respect to the composition of transformations. This group
is a symmetry group for the invariants of the
transformations.
Invariance and the Scientific Method
Other disciplines
Chemistry: Symmetry of molecules (i.e. group of
transformations preserving configuration of atoms in the
molecules) is associated with chemical properties of
compounds.
Geology: Classification of minerals based on symmetry of
their crystal structure.
Biology: The mystery of the lack of symmetry in nature between
organic molecules of the same chemical composition
but different spatial configuration (chirality), a symmetry
which is present in laboratory synthesis.
Philosophy: Structuralism (Jean Piaget) as a
methodological tool based on the idea of the study of
invariants of relevant transformations.
Symmetry Breaking Marks Many
Critical Moments of Science
The majority of important moments in the history of science came with
discoveries of symmetry breaking.
Examples: (1) Aristarchus of Samos – asymmetry in the size of the Sun
and the Earth (the former much bigger than the latter, established by
ingenious experimental work) as an argument for the heliocentric model.
(2) Special Relativity – Breaking of Galileo Group Symmetry in Maxwell
equations leading to the transition to Lorentz/Poincare Group. (3)
Discovery of the violation of parity conservation in the beta decay of
Cobalt-60 leading to the unification of weak nuclear and
electromagnetic interactions.
Another type of “symmetry crisis”: Loschmidt 1876 (Kelvin 1874)
Paradox: Mechanics is invariant with respect to time reversal, but
thermodynamics is not (entropy cannot decrease in time!). How can we
claim that thermodynamic phenomena can be derived from mechanics?
Boltzmann’s answer – Statistical interpretation of entropy.
Contribution of Ralph Hartley to
Information Theory
In his famous 1948 paper Claude Shannon made
reference to the paper of Ralph Hartley [Hartley,
R. V. L. (1928). Transmission of Information.
Bell Sys. Tech. J., 7(3), 535-563.] and gave
Hartley credit in the 1980’s for the inspiration
which stimulated his own work, but Hartley’s
work deserves more recognition than just:
“I started with information theory, inspired by Hartley’s
paper, which was a good paper, but it did not take
account of things like noise and best encoding and
probabilistic aspects” (interview by Ellersick, 1984)
Actually, Hartley does write about these matters, but comes to
different conclusions and goes in a
different direction.
Information Theory Started from
the Study of Invariance
Derivation (on p. 540) of Hartley’s formula for the
measure of information, H = n logm(s),
where s is the number of symbols, n is the number of
selections, and m is an arbitrary integer (the base of the
logarithm), involved the assumption of invariance (!).
Hartley considered physical “primary symbols” grouped (ENCODING!)
to represent psychologically determined “secondary
symbols”, and derived the formula from the
invariance with respect to the grouping of primary symbols
to represent secondary ones. Hartley also considered
invariance with respect to the language of communication!
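A small Python sketch (my own illustration, not from the slides) of the invariance behind this formula: grouping k primary symbols into one secondary symbol raises the alphabet size from s to s^k while dividing the number of selections by k, and the value n log(s) is unchanged.

import math

def hartley(n, s, base=2):
    # Hartley measure for n selections from an alphabet of s symbols
    return n * math.log(s, base)

s, n, k = 2, 12, 3                   # 12 binary selections, grouped 3 at a time
primary = hartley(n, s)              # 12 * log2(2) = 12
secondary = hartley(n // k, s ** k)  # 4 * log2(8) = 12
assert math.isclose(primary, secondary)
print(primary, secondary)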
Structural Characteristics of
Information
If we want to go further into invariance of information we have to
introduce a conceptual framework including characteristics of the
structural aspects of information.
In a purely probabilistic study of information, as in the work of
Hartley or Shannon (there is nothing in entropy that is not already in
the probability distribution, so its subject is rather the selection of
an element out of a given set described by a probability
distribution than a theory of information), there are no explicit
constraints on the symmetry group, and no structural
characteristics are considered. Hartley addressed the issue of
invariance, but only from the point of view of making the
constraints minimal, i.e. eliminating structural aspects.
In the following a general approach to information will be
utilized in which both selection and structure have their roles.
Preliminaries: Information
Defined in terms of the one-many opposition
Information is an identification of a variety.
Alternative: Information is that which gives
unity to variety.
• One-many opposition is categorical (undefined)
• The word “variety” (many) is understood in a very
general (categorical) way, as the opposite of “one” and
in common sense meaning is synonymous with the
words “many”, “plurality”, “multiplicity”, “set.”
• Identification of a variety is understood as that which
makes the many one.
M. J. Schroeder, Philosophical Foundations for the Concept of Information: Selective and
Structural Information. Proceedings of the Third International Conference on the
Foundations of Information Science, Paris 2005. http://www.mdpi.org/fis2005/
Preliminaries: Two Basic
Manifestations of Information
There are two most fundamental ways this “one-ness” can
be realized:
• by the selection of one element of the variety out of
many,
• or by the structure on the variety which binds the
elements into unity.
So, we can consider two different manifestations of
information related to these two forms of identification:
selective and structural, coexisting with each other.
The choice (selection) of an element in the variety requires
that the elements have some structure distinguishing
(identifying) them. This relation between selection and
structure makes the two manifestations dual.
Preliminaries: General Theory of
Information in a Nutshell (1)
Information system consists of:
information carrier = variety = set S with a Moore family ℳ of
subsets of S (or equivalently a closure operator f on the set S for which
ℳ is its family of closed subsets).
The choice of this family is responsible for the type of information
system (geometric, topological, logical, etc.). HERE ENTERS THE
STRUCTURAL ASPECT OF INFORMATION! If ℳ consists
of all subsets of S, we get the orthodox Shannon type of
information theory.
Information itself is a dually-hereditary subfamily ℳ0 of ℳ, closed
with respect to finite intersections (a filter).
It gives the set S unity either by facilitating selection of its
element(s) or by giving S some structure.
The alternative, but equivalent formulation is based on the concept of a closure
operator.
Preliminaries: General Theory of
Information (2)
In the formalism based on the concept of a closure operator f:
f: 2^S → 2^S, ∀A,B ⊆ S: A ⊆ f(A) & [A ⊆ B ⇒ f(A) ⊆ f(B)]
& f(f(A)) = f(A), the Moore family ℳ is the set of f-closed subsets
(f(A) = A) and ℳ acquires a complete lattice structure Lf with
respect to inclusion. This lattice Lf can be interpreted as the logic
of the information system. Its decomposability describes the
level of information integration. In the orthodox “information
theory” we have the trivial closure operator:
f: 2^S → 2^S, ∀A ⊆ S: f(A) = A. All subsets are closed and the
logic of information is an atomic Boolean algebra. In this
special case information is associated with the relation R:
∀x ∈ S ∀A ⊆ S: xRA iff x ∈ A. In the more general case we
have the relation R: ∀x ∈ S ∀B ⊆ S: xRB iff x ∈ f(B).
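To make this concrete, here is a small Python sketch (my own illustration, not from the slides): a closure operator defined from a hand-picked Moore family on a three-element set, with a check of the three closure axioms; the f-closed subsets, ordered by inclusion, form the lattice Lf.

from itertools import chain, combinations

def powerset(S):
    S = list(S)
    return [frozenset(c) for c in chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))]

# A hand-picked Moore family on S = {1, 2, 3}: closed under intersections and containing S.
S = frozenset({1, 2, 3})
moore_family = [frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 3}), S]

def f(A):
    # closure of A = intersection of all members of the Moore family containing A
    return frozenset.intersection(*[M for M in moore_family if A <= M])

# Check the closure axioms: A ⊆ f(A), monotonicity, idempotence.
for A in powerset(S):
    assert A <= f(A) and f(f(A)) == f(A)
    for B in powerset(S):
        if A <= B:
            assert f(A) <= f(B)

closed = sorted({f(A) for A in powerset(S)}, key=len)
print([set(C) for C in closed])  # the f-closed subsets; ordered by inclusion they form the logic Lf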
Preliminaries: General Theory of
Information (3)
Shannon explored (without much success) the study of linguistic
structure of messages through conditional probabilities of
characters in sequences of increasing length.
Hartley considered structural characteristics in audio-visual
encoding of information, but without any specific idea of
structural constraints.
Bar-Hillel and Carnap, in order to develop semantic theory of
information, tried to involve the logical structure of language,
but with equally vague connection between information and
structure.
Rene Thom in “Structural Stability and Morphogenesis” gave
the clearest recognition of the fact that it is structure that carries
information, but his understanding of structure was quite
narrow (geometric or topological structures).
Preliminaries: General Theory of
Information (4)
In the present approach, the restriction on the type of a structure
carrying information is minimal. On the other hand, structures
can be very complex, requiring an arbitrarily rich symmetry
group of transformations.
The main point of difference between the traditional (non-structural) approach and the present one is that in the former
information is associated with the fact of belonging to some
arbitrary set (information that x has some property is equivalent
to x ∈ A, where A consists of all elements with this property).
Here, not every set can be associated with some property, only
sets which have a structural meaning appropriate for the case.
Selective and structural manifestations of information are
dual, in the sense that one always coexists with the other.
Quantitative characteristics of
Information
Two manifestations of information:
• Selective manifestation by the selection of one
element of the variety out of many,
• Structural manifestation by the structure on the
variety which binds the elements into a whole.
Selective manifestation can be associated with the study of
information initiated by Hartley and Shannon. There are
many quantitative measures when the selection is described
by a probability distribution.
Structural manifestation can be associated with the
attempts to study information in terms of form, but all these
attempts were qualitative.
Is an invariant quantitative characterization of
structural manifestation possible?
Quantitative Characteristics of Selective
Manifestation of Information (1)
Classical approach of Shannon:
H = - Σ pi log2 pi (i = 1,…,n), where Σ pi = 1 and pi ≥ 0 for all i.
The index i represents here each of the elements of the
information carrier S. We can see that every transformation T
of the set S (bijective function) preserves the value of H, as
long as the probability assigned to T(i) in T(S) equals the
probability assigned to i in S.
If we can ignore structural aspects of information, or if within the
structure a probability distribution has meaning, we can use this,
or any other “measure of information” (i.e. a statistical measure
of choice) based on a probability distribution.
However, for many (the majority of) structures there is no
meaningful association of a probability distribution with the
structure of the information carrier.
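A small Python sketch (my own, not from the slides) of this invariance: H depends only on the multiset of probabilities, so relabelling the carrier by any bijection T leaves it unchanged.

import math, random

def shannon_entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

p = [0.5, 0.25, 0.125, 0.125]
perm = random.sample(range(len(p)), len(p))  # a random bijection T of the index set
p_transformed = [p[i] for i in perm]         # the probability of T(i) in T(S) equals that of i in S
assert math.isclose(shannon_entropy(p), shannon_entropy(p_transformed))
print(shannon_entropy(p))                    # 1.75 bits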
Quantitative Characteristics of Selective
Manifestation of Information (2)
The study of information within a system, not in the context of
communication, leads to certain consequences. For this reason Schrödinger
introduced the concept of “negative entropy”, later renamed “negentropy”. But
this creates other problems. A better solution is to use an alternative measure:
Inf(n,p) = Σ pi log2(n·pi) (i = 1,…,n), where Σ pi = 1 and pi ≥ 0 for all i.
Obviously Inf(n,p) = Hmax – H(n,p). Then we have a characteristic of the
degree of determination of information, or a relative measure of information:
Inf*(n,p) = Σ pi logn(n·pi) (i = 1,…,n), where Σ pi = 1 and pi ≥ 0 for all i,
or simpler Inf*(n,p) = Inf(n,p)/Infmax, and then 0 ≤ Inf*(n,p) ≤ 1.
(M. J. Schroeder “An Alternative to Entropy in the Measurement of
Information” Entropy, 2004, 6, 388-412)
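A short Python sketch (mine, following the formulas above, with an assumed example distribution) checking that Inf(n,p) = Hmax – H(n,p) and that the relative measure Inf* stays between 0 and 1.

import math

def H(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def Inf(p):
    n = len(p)
    return sum(pi * math.log2(n * pi) for pi in p if pi > 0)

def Inf_star(p):
    return Inf(p) / math.log2(len(p))  # Inf_max = log2(n), reached by a one-point distribution

p = [0.5, 0.25, 0.125, 0.125]
assert math.isclose(Inf(p), math.log2(len(p)) - H(p))  # Inf = Hmax - H
print(Inf(p), Inf_star(p))                             # 0.25 and 0.125 for this distribution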
How to Characterize Structural
Manifestation of Information?
Level of Information Integration describes specific
type of the structure imposed on the variety. This
structure may have different levels of integration related
to its decomposability into component structures.
Decomposability of the structure can be described in
terms of irreducibility of the logic Lf of information
system into a direct product of component lattices.
Marcin J. Schroeder, Quantum Coherence without Quantum Mechanics in
Modeling the Unity of Consciousness. In P. Bruza, et al. (Eds.) QI 2009,
LNAI 5494, Springer, pp. 97-112, 2009.
Quantum mechanics provides examples of completely
integrated information systems, but there are many
other examples, for instance geometric info systems.
Logic of Completely Disintegrated
Information System Described by Boolean
Algebra
[Diagram: the Boolean lattice represented as a direct product (×) of component lattices.]
Complete Reduction to Coherent
Components
[Diagram: the logic fully reduced to a direct product (×) of coherent component lattices.]
Logic of Completely Integrated
Information System
The logic of this system cannot be represented as a product of the
logics of any other structures. It represents the logic of a completely
integrated information system. It is not a quantum information system,
as the logic does not satisfy some additional conditions.
Other Examples of Logics for Completely
Integrated Information Systems
(Classic Non-distributive Lattices M5 and N5)
[Diagrams of the lattices M5 and N5.]
Formal Tools for Analysis of the
Level of Information Integration
In this and the following slides the references will be made
to:
Birkhoff, G. (1967). Lattice Theory, 3rd. ed. Providence, R.
I.: American Mathematical Society Colloquium
Publications, Vol. XXV.
The main tool for reducibility/irreducibility of the posets is the
concept of a center (Chpt. III,8):
Def. The center of a poset P with 0 and 1 is the set C of elements
(called “central elements”) which have one component 0 and
the other 1 under some direct factorization of P.
Thm. 10. The center C of a poset P with 0 and 1 is a Boolean
lattice in which joins and meets represent joins and meets in P.
Formal Tools for Analysis of the Level
of Information Integration
Def. An element a of a lattice L with 0 and 1 is neutral iff (a,x,y) ∈ D
for all x,y in L, i.e. the triple a, x, y generates a distributive
sublattice of L.
Thm. 12. The center of a lattice with 0 and 1 consists of its
complemented, neutral elements.
Fact. 0 and 1 are central elements in every poset with 0 and 1.
It follows from Thm. 12 above that the lattices M5 and N5 are
irreducible and that every Boolean lattice is identical with its
center.
We can observe that the Exchange Property of Steinitz
(wE) ∀A ⊆ S ∀x,y ∈ S: x ∉ f(A) & x ∈ f(A ∪ {y}) ⇒
y ∈ f(A ∪ {x}) by itself
does not imply complete irreducibility, but it does if every
two-element set has a closure with at least three elements.
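As a brute-force illustration (my own Python sketch, not from the slides), the following code computes the center of M5 as its complemented, neutral elements (Thm. 12), testing neutrality with the standard median identity (a∧x)∨(x∧y)∨(y∧a) = (a∨x)∧(x∨y)∧(y∨a); the result {0, 1} confirms that M5 is irreducible.

from itertools import product

elements = ['0', 'a', 'b', 'c', '1']
# order relation of M5: 0 below everything, 1 above everything, a, b, c pairwise incomparable
leq = {(x, y) for x in elements for y in elements if x == y or x == '0' or y == '1'}

def meet(x, y):
    lower = [z for z in elements if (z, x) in leq and (z, y) in leq]
    return max(lower, key=lambda z: sum((w, z) in leq for w in elements))  # greatest lower bound

def join(x, y):
    upper = [z for z in elements if (x, z) in leq and (y, z) in leq]
    return min(upper, key=lambda z: sum((w, z) in leq for w in elements))  # least upper bound

def neutral(a):
    # median identity: (a∧x)∨(x∧y)∨(y∧a) = (a∨x)∧(x∨y)∧(y∨a) for all x, y
    return all(join(join(meet(a, x), meet(x, y)), meet(y, a)) ==
               meet(meet(join(a, x), join(x, y)), join(y, a))
               for x, y in product(elements, elements))

def complemented(a):
    return any(meet(a, b) == '0' and join(a, b) == '1' for b in elements)

center = [a for a in elements if neutral(a) and complemented(a)]
print(center)  # ['0', '1']: the center is trivial, so M5 is irreducible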
Quantitative Characteristics of
Structural Manifestation of Information
FROM NOW ON WE WILL ASSUME THAT THE LOGIC
OF INFORMATION Lf IS FINITE!
Lemma: If lattices L1 and L2 have their centers C1 and C2
respectively, then the direct product L1 x L2 has C1 x C2 as its
center.
It is a simple corollary of Thm. 11.
We will write |L| for the number of elements in set L.
Now we can show, using Thm. 10 and Thm. 11, that the number of
irreducible components of the logic Lf is log2(|C|). This
number gives us some indication regarding the reducibility of
the logic (complete irreducibility corresponds to the value 1; an
increase of the value indicates that the number of irreducible
components is increasing). But as long as we do not know the size
of the logic, the value of such a measure is limited. It is better to
consider first a measure of complexity m(L).
Quantitative Characteristics of
Structural Manifestation of Information
Def. Measure of complexity of logic L
m(L) = log2 (|L|/|C|) = log2 (|L|) - log2 (|C|)
Then, if L is a Boolean lattice (completely reducible),
then m(L) = 0, but when L is completely irreducible,
then m(L) = log2 (|L|) – 1. Since the center is
preserved by all lattice automorphisms, so is m(L).
Also, m is “semi-additive” in the sense that:
m(L1 x L2) = m(L1) + m(L2). In particular for a
logic L= L1 x L2 x L3 x…x Lk where all Li are
irreducible, in agreement with the definition we have
m(L) = log2 (|L|) – k = log2 (|L|) – log2 (|C|).
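A quick numerical Python sketch (mine) of m(L) computed from the sizes |L| and |C| alone, for the examples used in these slides, together with a check of semi-additivity on a product.

import math

def m(size_L, size_C):
    return math.log2(size_L) - math.log2(size_C)

print(m(8, 8))   # an 8-element Boolean lattice: the center is the whole lattice, so m = 0
print(m(5, 2))   # M5 or N5: completely irreducible, center {0, 1}, m = log2(5/2) ≈ 1.32
# semi-additivity: |L1 x L2| = |L1||L2| and |C1 x C2| = |C1||C2|
assert math.isclose(m(5 * 5, 2 * 2), m(5, 2) + m(5, 2))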
Quantitative Characteristics of
Structural Manifestation of Information
On the previous slide m(L) was called a measure of complexity,
not of irreducibility, as it is increasing to infinity with the size
of the logic L. We have simple irreducible two-element logic
with m(L) = 0, as it is a Boolean lattice. Also, for the
completely irreducible logics with lattices M5 and N5 we have:
m(M5) = m(N5) = log2 (5/2) ≈ 1.32 while there are many
reducible (although not completely) logics with higher values
of m(L). So, in order to have a measure of irreducibility we
can introduce a relative measure m* which is an invariant of
transformations preserving information structure:
Def. For lattices with at least two elements:
m*(L) = m(L)/(mmax + 1) = m(L)/log2(|L|) =
log2(|L|/|C|) / log2(|L|), where mmax is the maximum
complexity m(L) for a logic of size |L|.
Quantitative Characteristics of Structural
Manifestation of Information
From the definition m*(L) = log2(|L|/|C|) / log2(|L|) =
1 – [log2(|C|) / log2(|L|)]. Then it is easy to see that
if L is a Boolean lattice, m*(L) = 0, but when L is completely
irreducible, then: m*(L) = 1 – [1/log2(|L|)].
So 0 ≤ m*(L) < 1 and, for completely irreducible L, m* is an
increasing function of the size of L with limit 1 at infinity.
m*(M5) = m*(N5) = 1 – 1/log2(5) ≈ 0.57; m*(D10) ≈ 0.70
It is not a surprise that m* is not semi-additive (in the sense in
which m is), because m* measures irreducibility. When we
have a product of logics, we cannot expect an increase of
irreducibility. Instead we have a logarithmic weighted mean:
m*(L1 x L2) = α m*(L1) + β m*(L2), where α + β = 1,
α = log2(|L1|) / [log2(|L1|) + log2(|L2|)] and
β = log2(|L2|) / [log2(|L1|) + log2(|L2|)].
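A short Python check (mine, using only lattice sizes and an assumed pair of factors) of the weighted-mean formula for m* on a product, with M5 and an 8-element Boolean lattice as the components.

import math

def m_star(size_L, size_C):
    return 1 - math.log2(size_C) / math.log2(size_L)

L1, C1 = 5, 2   # M5: irreducible, center {0, 1}
L2, C2 = 8, 8   # an 8-element Boolean lattice: center is the whole lattice
alpha = math.log2(L1) / (math.log2(L1) + math.log2(L2))
beta = math.log2(L2) / (math.log2(L1) + math.log2(L2))
lhs = m_star(L1 * L2, C1 * C2)                        # m* of the product, from |L1||L2| and |C1||C2|
rhs = alpha * m_star(L1, C1) + beta * m_star(L2, C2)  # the logarithmic weighted mean
assert math.isclose(lhs, rhs)
print(lhs)                                            # ≈ 0.25 for this pair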
Conclusion
Both measures, the measure of complexity m(L) and the measure of
information integration m*(L), are invariants of all
transformations which preserve the logic of information.
There are many cases when there is a high level of
information integration (the information logic is not Boolean),
but we can still define a generalized form of a probabilistic
measure. For those cases, we can consider both types of the
measure of information. An example can be found in
quantum mechanical information systems.
A further question is how to measure the selective manifestation
for an information system whose logic does not admit
orthocomplementation, so that the concept of a
probabilistic measure does not make sense.
References (Selection)
Theory of selective and structural information:
Schroeder, M. J. (2011). From Philosophy to Theory of Information, Intl. J.
Information Theories and Applications, vol. 18, no. 1, 56-68.
Schroeder, M. J. (2014). Algebraic Model for the Dualism of Selective and Structural
Manifestations of Information. In M. Kondo (Ed.), RIMS Kokyuroku, No. 1915.
Kyoto: Research Institute for Mathematical Sciences, Kyoto University, pp. 44-52.
Information integration in terms of irreducibility into product of component structures:
Schroeder, M. J. (2009). Quantum Coherence without Quantum Mechanics in
Modeling the Unity of Consciousness. In P. Bruza, et al. (Eds.) QI 2009, LNAI
5494, Springer, pp. 97-112.
Structural analysis of complexity in terms of information:
Schroeder, M. J. (2013). The Complexity of Complexity:
Structural vs. Quantitative Approach. In: Proceedings of the International
Conference on Complexity, Cybernetics, and Informing Science CCISE 2013,
Porto, Portugal. http://www.iiis-summer13.org/ccise/VirtualSession/viewpaper.asp?C2=CC195GT&vc=58/
Logic and information:
Schroeder, M. J. (2012). Search for Syllogistic Structure of Semantic Information.
Journal of Applied Non-Classical Logics, 22, 101-127.