A-03-Lévy-Preface 15-22 (18 Feb 08) SP FINAL

COLLECTIVE INTELLIGENCE: CREATING A PROSPEROUS WORLD AT PEACE
A metalanguage for computer-augmented collective intelligence
Prof. Pierre Lévy, CRC, FRSC1
The semantic interoperability problem
The universe of communication opened up to us by the interconnection of
digital data and automatic manipulators of symbols—in other words,
cyberspace—henceforth constitutes the virtual memory of collective human
intelligence. Yet, at the symbolic level, important obstacles hinder digital
memory from working fully in the service of an optimal management of
knowledge. These obstacles can be decomposed into two interdependent subgroups.
The first one concerns the multiplicity and the incompatibility of symbolic
systems:

plurality of natural languages;

incompatibility and inadaptation of the numerous indexation and
cataloguing systems inherited from the print era (that were not designed
to exploit the general interconnection and computing power of
cyberspace);

multiplicity and incompatibility of taxonomies, thesauri,
terminologies, ontologies and classification systems.

1 Pierre Lévy is a philosopher who has devoted his professional life to understanding
the cultural and cognitive implications of digital technologies, to promoting their best
social uses and to studying the phenomenon of human collective intelligence. Additional
biographic and reference information is on the last page of this chapter.
PREFACES
The second sub-group of obstacles concerns the difficulties encountered
by computer science when it tries to take into account the meaning of
documents by means of general methods.
Current commercial search engines base their search on strings of
characters and not on concepts. For example, when a user enters
the request «dog», this word is processed as the string of characters «d, o, g»
and not as a concept that could be translated into several languages (chien, kelb,
cane...), that is a subclass of both mammals and pets, and that constitutes
(for example) the superclass of bulldogs and Dobermans.
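The contrast between the two kinds of search can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the Concept
record, the DOG entry and the two matching functions are hypothetical, and they
describe neither a real search engine nor IEML itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """Hypothetical language-independent concept record (illustration only)."""
    ident: str
    labels: frozenset                      # words naming the concept in several languages
    superclasses: frozenset = field(default_factory=frozenset)
    subclasses: frozenset = field(default_factory=frozenset)


# Toy data taken from the "dog" example in the text.
DOG = Concept(
    ident="dog",
    labels=frozenset({"dog", "chien", "kelb", "cane"}),
    superclasses=frozenset({"mammal", "pet"}),
    subclasses=frozenset({"bulldog", "doberman"}),
)


def string_match(query: str, text: str) -> bool:
    """String-based search: true only if the exact characters occur in the text."""
    return query.lower() in text.lower()


def concept_match(concept: Concept, text: str) -> bool:
    """Concept-based search: true if any label or subclass name occurs as a word."""
    words = set(text.lower().replace(".", " ").split())
    return bool(words & (concept.labels | concept.subclasses))


french = "Le chien dort."
print(string_match("dog", french))   # False: the characters d, o, g do not occur
print(concept_match(DOG, french))    # True: "chien" is a label of the same concept
```

The string search misses the French sentence, while the concept lookup finds it
because the match is made on the concept and not on its spelling in one language.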
The so-called semantic web, despite its technical sophistication, still
does not foster the practical progress in the organization and retrieval of
collective memory that is expected from it. It suffers from the same limitation
of perspective as artificial intelligence. For its leaders, the task of exploiting
computers for the augmentation of human intelligence is restricted to the
automation of logical operations on standard data formats. The design of
original symbolic systems for the notation of meaning that could take
advantage of the new possibilities of automatic processing at the service of
human collective intelligence is not addressed by the semantic web.
The IEML initiative
In order to overcome the contemporary obstacles to a full exploitation of the
new opportunities opened up by cyberspace to human collective intelligence,
the Canada Research Chair in collective intelligence at the University of Ottawa
has undertaken the task of designing and implementing a metalanguage for
semantic addressing. The metalanguage is called IEML for Information
Economy MetaLanguage.
The Information Economy MetaLanguage (IEML) is a formal language
for the expression of semantic sets. It is designed to denote formally—or to
address—concepts as semantic sets. Concepts, and networks of concepts, of
whatever complexity, can be formalized and uniquely identified—or
addressed—by semantic sets expressed in IEML.
Thanks to the regularity of IEML grammar (which is designed in such a
way that semantic structures are mirrored by syntactic structures), many
computable functions can be applied to IEML expressions, including ordering,
visualization and semantic distance measurement functions.
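To give a concrete idea of what a computable function on semantic sets can look
like, here is a minimal Python sketch. It rests on assumptions that are not in
the text: semantic sets are modelled as plain sets of opaque element names
("e1", "e2", ...), the distance is the Jaccard distance (a generic set distance
chosen for illustration, not the measure defined by IEML), and the ordering is
an arbitrary but deterministic one.

```python
def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - |a & b| / |a | b|: 0.0 for identical sets, 1.0 for disjoint ones."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


# Toy semantic sets made of opaque, hypothetical element names.
x = frozenset({"e1", "e2", "e3"})
y = frozenset({"e2", "e3", "e4"})
z = frozenset({"e5"})

print(jaccard_distance(x, y))  # 0.5: two shared elements out of four
print(jaccard_distance(x, z))  # 1.0: nothing in common

# A deterministic ordering: by size first, then by the sorted member names.
ordered = sorted([y, z, x], key=lambda s: (len(s), sorted(s)))
print([sorted(s) for s in ordered])
```

Such functions are only possible because the expressions are sets with a regular
structure; free natural language tags offer nothing comparable to compute on.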
TECHNICAL PREFACE
To avoid any misunderstanding, I want to stress here that IEML is not
supposed to replace or compete with any data format like XML, RDF or OWL.
IEML has been designed to replace natural language expressions in whatever
data format. The use of IEML expressions to tag semantic metadata on digital
documents may be preferred to the use of natural language expressions because
semantic sets expressed formally in IEML allow a larger range of computable
functions. So, the IEML initiative is not competing with the semantic web: it
prepares the construction of the next layer of cyberspace.
IEML grammar is a singular abstract structure that can be expressed by
different syntaxes (or notation systems) according to different purposes. For
example, there is an XML-IEML syntax (XML: eXtensible Markup Language)
and a STAR-IEML syntax (STAR: Symbolic Tool for Augmented Reasoning).
In STAR syntax, semantic addresses begin with a "*" and are closed by a
"**". There is an objective relationship between semantic addresses expressed
in STAR-IEML and semantic addresses expressed in XML-IEML. In general,
automatic translations can be provided between different IEML syntaxes
because they share the same grammar. For practical purposes:

IEML expressions of semantic sets can be used as semantic metadata;

IEML is the basis for the expression of IEML ontologies, that can be
defined as functions on semantic sets, including relations between
semantic sets;

IEML paves the way for a generation of semantic search engines and
tagging machines that can be customized according to their original
semantic perspectives but can also cooperate by a collective
intelligence protocol for the standard exchange of semantic metadata.
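The STAR delimiters described above ("*" to open, "**" to close) are simple
enough to check mechanically. The following Python sketch is an assumption-laden
illustration: the function names strip_star and wrap_star are hypothetical, and
the code only handles the outer delimiters, not the IEML grammar inside them.
The sample address is the one used in the example later in this preface.

```python
def strip_star(address: str) -> str:
    """Return the text between the opening "*" and the closing "**".

    Raises ValueError if the address is not delimited in this way.
    """
    if len(address) < 3 or not address.startswith("*") or not address.endswith("**"):
        raise ValueError(f"not a STAR-delimited address: {address!r}")
    return address[1:-2]


def wrap_star(body: str) -> str:
    """Wrap a bare expression in STAR delimiters."""
    return f"*{body}**"


body = strip_star("*M:O:.**")
print(body)             # M:O:.
print(wrap_star(body))  # *M:O:.**
```

A translator between two IEML syntaxes would start from a step like this one,
then re-emit the same grammatical structure in the other notation.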
An on-line IEML-natural languages dictionary establishes the
correspondence between the expressions of the metalanguage and their
interpretation in natural languages. The grammar, dictionary and various
software modules based on the use of the metalanguage are open-source and
available for free.
The layers of digital memory addressing
In order to understand the need for a new layer of memory addressing in
cyberspace, we have to analyze the arrangement of the preceding layers.
Figure 1: Layers of Digital Memory Addressing
First Layer (bit addressing)
At the level of the computers that compose the nodes within cyberspace, the
local system for addressing bits of information is managed in a decentralized
fashion by various operating systems (such as Unix or Windows), then used by
software applications. The development of computing in the 1950s created
technical conditions for a remarkable augmentation in the arithmetical and
logical processing of information.
Second Layer (server addressing)
At the level of the network of networks, each server has an attributed address,
according to the universal protocol of the Internet. IP (Internet Protocol)
addresses are used by the information routing—or commutation—system that
makes the Internet work. The development of the Internet in the 1980s
corresponds to the advent of personal computing, the growth of virtual
communities, and the beginning of the convergence of the media and
telecommunications in the digital universe.
Third Layer (page addressing)
At the level of the World Wide Web, the pages of documents, in turn, have a
universal address according to the system of URLs (Uniform
Resource Locators), and the links between documents are handled according to
the HTTP standard (HyperText Transfer Protocol). Web addresses and
hypertext links are used by search engines and Web surfers. The popularization
of the Web from 1995 onward helped give rise to a global public multimedia
sphere.
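The second and third layers can be seen side by side in any Web address. The
short Python sketch below uses only the standard library; the page path and the
numeric address are assumptions made for illustration (192.0.2.10 belongs to a
range reserved for documentation and is not the address of a real server).

```python
import ipaddress
from urllib.parse import urlsplit

# Layer 3: a page address. The path is hypothetical; only the site name
# comes from this chapter.
url = "http://www.ieml.org/some/page.html"
parts = urlsplit(url)
print(parts.scheme)    # http
print(parts.hostname)  # www.ieml.org
print(parts.path)      # /some/page.html

# Layer 2: a server address, here a documentation-only IPv4 address.
server = ipaddress.ip_address("192.0.2.10")
print(server.version)  # 4
```

The host name is resolved to a server address of layer 2, and the path then
designates a page on that server: each layer addresses a finer unit of memory.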
Fourth Layer (concept addressing)
The semantic space takes the form of an additional layer of digital memory,
resting on a universal addressing system for concepts: IEML. As a coordinate
system of the semantic space, IEML makes it possible to automatically manage
the relationships among the meaningful content of documents,
independently of the natural languages in which the documents are written.
Semantic computing is dedicated to the automatic manipulation of IEML
expressions that address the data. In so doing, it increases human capacity for
interpretation of the virtual memory from a practically infinite array of
semantic perspectives. New devices for multimedia exploration of the dynamic
universe of concepts could draw on semantic computing.
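A minimal sketch of concept addressing, in Python, may help to fix ideas. All
of it is assumed for illustration: the document URLs are placeholders, and the
strings ADDR_DOG and ADDR_CAT stand in for concept addresses without being
valid IEML expressions.

```python
from collections import defaultdict

# Documents written in different languages, tagged with opaque concept addresses.
documents = [
    {"url": "http://example.org/en/1", "lang": "en",
     "text": "The dog sleeps.", "concepts": {"ADDR_DOG"}},
    {"url": "http://example.org/fr/1", "lang": "fr",
     "text": "Le chien dort.", "concepts": {"ADDR_DOG"}},
    {"url": "http://example.org/it/1", "lang": "it",
     "text": "Il gatto dorme.", "concepts": {"ADDR_CAT"}},
]


def build_index(docs):
    """Map each concept address to the set of page addresses tagged with it."""
    index = defaultdict(set)
    for doc in docs:
        for address in doc["concepts"]:
            index[address].add(doc["url"])
    return index


index = build_index(documents)
print(sorted(index["ADDR_DOG"]))
# ['http://example.org/en/1', 'http://example.org/fr/1']
```

The English and French pages are retrieved together because the lookup is made
on the concept address of the fourth layer, which sits on top of the page
addresses of the third layer and ignores the language of the text.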
A glimpse into the generative semantics behind IEML
The epistemological principle that guided me in the invention of IEML is
that the complexity and the variety of the automatic operations that can be
performed on variables depend on the structure of the variables. According to
this principle, IEML is a symbolic system whose expressions allow a
greater range of automatic operations than the expressions of natural languages.
The core of IEML's regularity is its generative structure. A full technical
description of IEML is not possible in the context of this book. Nevertheless, I
can offer the reader a glimpse into the "generative semantics"
that is at the basis of the metalanguage.
Any IEML expression of a semantic set is composed from five primitive
elements and an empty subset of elements. Sets and subsets of primitive
elements are represented by ten characters.
From the primitive elements of the first layer, a generative operation
produces recursively five layers of generated elements called flows. So, there
are six layers in the IEML stack.
Except for the first layer, the elements of which are primitives, a flow of
layer n is a triple (source, destination and translator) of flows from layer n-1.
The first role of a flow of layer n is an element of layer n-1 and is called the
source of the flow. The second role of a flow of layer n is an element of layer n-1
and is called the destination of the flow. The third role of a flow of layer n is
an element of layer n-1 and is called the translator of the flow. The order of
magnitude of the number of semantic elements at layer 6 is 10^69.
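The recursive definition of a flow maps naturally onto a small data structure.
The Python sketch below is an illustration under assumptions: the class names
Primitive and Flow are hypothetical, the empty element is left out for
simplicity, and the letters S, U and B are borrowed from the example that
follows in this preface.

```python
from dataclasses import dataclass
from typing import Union

MAX_LAYER = 6  # the text describes a stack of six layers


@dataclass(frozen=True)
class Primitive:
    """An element of layer 1."""
    symbol: str

    @property
    def layer(self) -> int:
        return 1


@dataclass(frozen=True)
class Flow:
    """An element of layer n: a triple of elements taken from layer n-1."""
    source: "Element"
    destination: "Element"
    translator: "Element"

    def __post_init__(self):
        layers = {self.source.layer, self.destination.layer, self.translator.layer}
        if len(layers) != 1:
            raise ValueError("the three roles must be filled from the same layer")
        if self.source.layer >= MAX_LAYER:
            raise ValueError("no flow can be generated above layer 6")

    @property
    def layer(self) -> int:
        return self.source.layer + 1


Element = Union[Primitive, Flow]

s, u, b = Primitive("S"), Primitive("U"), Primitive("B")
flow2 = Flow(source=s, destination=u, translator=b)           # built from layer 1
flow3 = Flow(source=flow2, destination=flow2, translator=flow2)
print(flow2.layer, flow3.layer)  # 2 3
```

Because every flow knows the layer of its three role players, a program can
check the well-formedness of an expression before doing anything else with it.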
Punctuation marks, here in the layer generative order (: . – ' , _), make
the generative operations explicit and permit the parsing of expressions.
Example:
*M:O:.** == *(S:U:.|S:A:.|B:U:.|B:A:.|T:U:.|T:A:.)**
The expression *M:O:.** is a category of layer 2, so it is closed with a "."
*M:** is the source player of layer 1 (the noun-type primitive category), so it is
closed with a ":"
*O:** is the destination player of layer 1 (the verb-type primitive category), so it is
closed with a ":"
*S:U:.**, *S:A:.**, etc. are flows of layer 2 produced by the generative operation.
As they are flows of layer two, they are closed by ".". They are structured by two roles:
source and destination. The players of these roles are primitive elements of layer 1,
expressed by token characters closed by the mark of layer 1 ":".
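The expansion of a layer 2 category into its flows can be reproduced with a few
lines of Python. This is a sketch under assumptions: the function name
expand_layer2 is hypothetical, and the membership of the two layer 1 categories
(M covering S, B and T; O covering U and A) is read off the example above and
not taken from a formal definition of the grammar.

```python
from itertools import product

# Assumed content of the two layer 1 categories, inferred from the example.
CATEGORIES = {"M": ("S", "B", "T"), "O": ("U", "A")}


def expand_layer2(address: str) -> str:
    """Expand a STAR address of layer 2 into the union of its flows."""
    if not (address.startswith("*") and address.endswith(".**")):
        raise ValueError(f"not a layer 2 STAR address: {address!r}")
    body = address[1:-3]                             # "M:O:" for "*M:O:.**"
    players = [p for p in body.split(":") if p]      # ["M", "O"]
    if len(players) != 2:
        raise ValueError("expected exactly a source and a destination")
    # A category stands for several primitives; a primitive stands for itself.
    choices = [CATEGORIES.get(p, (p,)) for p in players]
    flows = [f"{src}:{dst}:." for src, dst in product(*choices)]
    return "*(" + "|".join(flows) + ")**"


print(expand_layer2("*M:O:.**"))
# *(S:U:.|S:A:.|B:U:.|B:A:.|T:U:.|T:A:.)**
```

The function enumerates every (source, destination) pair, which is why three
source players and two destination players yield six flows of layer 2.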
Figure 2: Layer Flows
IEML makes possible very compact expressions of all sorts of semantic
sets. From the expressions of sets of layer n, the grammatical structure of IEML
allows for the automatic generation of graphs (trees, cycles) and matrices of
sets from layer n-1. These graphs and matrices can be used for navigation,
visualization and channeling of information value, according to the choices of
communities of users.
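As a small illustration of such a matrix, the Python sketch below arranges the
flows of a layer 2 category in rows and columns indexed by players of layer 1.
The function name flow_matrix is hypothetical, and the row and column players
(S, B, T and U, A) are assumptions taken from the example given earlier.

```python
def flow_matrix(sources, destinations):
    """Return a matrix of layer 2 flows: one row per source, one column per destination."""
    return [[f"{src}:{dst}:." for dst in destinations] for src in sources]


matrix = flow_matrix(("S", "B", "T"), ("U", "A"))
for row in matrix:
    print("  ".join(row))
# S:U:.  S:A:.
# B:U:.  B:A:.
# T:U:.  T:A:.
```

A navigation or visualization tool could display such a grid and let a
community of users decide which rows or columns deserve their attention.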
Figure 3: High-Level Overview
Reference (forthcoming): Metalanguage (2009). Hermes Science, London.
English bibliography
Cyberculture. (2001). Minnesota U.P. (first edition: Odile Jacob, Paris, 1997,
313 pp.)
Becoming Virtual. (1998). Plenum Press, NY. (first edition: La Découverte,
Paris, 1995, 180 pp.)
Collective Intelligence. (1997). Plenum Press, NY. Paperback (1999): Perseus
Books, Cambridge Mass. (first edition: La Découverte, Paris, 1994, 245
pp.)
Web address: www.ieml.org