Classifications and concepts: towards an elementary theory of knowledge interaction

Richard P. Smiraglia

Information Organization Research Group, University of Wisconsin-Milwaukee,

Milwaukee, Wisconsin, USA, and

Charles van den Heuvel

Huygens Institute for the History of The Netherlands, The Hague,

The Netherlands


Purpose – This paper seeks to outline the central role of concepts in the knowledge universe, and the intertwining roles of works, instantiations, and documents. In particular, the authors are interested in ontological and epistemological aspects of concepts and in the question of the extent to which natural languages are needed to link concepts in order to create meaningful patterns.

Design/methodology/approach – The authors describe the quest for the smallest elements of knowledge from a historical perspective. They focus on the metaphor of the universe of knowledge and its impact on classification and retrieval of concepts. They outline the major components of an elementary theory of knowledge interaction.

Findings – The paper outlines the major components of an elementary theory of knowledge interaction that is based on the structure of knowledge rather than on the content of documents, in which semantics becomes not a matter of synonymous concepts, but rather of coordinating knowledge structures. The evidence is derived from existing empirical research.

Originality/value – The paper shifts the bases for knowledge organization from a search for a universal order to an understanding of a universal structure within which many context-dependent orders are possible.

Keywords Concept theory, Classification, History of information science, Information management, Information retrieval languages, Universe of knowledge, Instantiation, Information retrieval

Paper type Conceptual paper

1. Introduction

In Weaving the Web Tim Berners-Lee (2000, pp. 35-36) refers to the study of small objects in physics to create simple rules for global systems in his future visions of the world wide web and the semantic web:

One of the beautiful things about physics is its ongoing quest to find simple rules that describe the behavior of very small objects. Once found, these rules can often be scaled up to describe the behavior of monumental systems in the real world. For example, by understanding how two molecules of a gas interact when they collide, scientists using suitable mathematics can deduce how billions of billions of gas molecules – say, the earth’s atmosphere – will change. This allows them to analyze global weather patterns, and thus predict the weather. If rules governing hypertext links between servers and browsers stayed simple, then our web of a few documents could grow to a global web.

Journal of Documentation

The authors owe a tremendous debt of gratitude to Thomas M. Dousa for his advice and counsel, and for his very thorough reading of the drafts of this paper.

Berners-Lee’s explanation of the potential growth of the global web of knowledge uses a metaphor from physics, in which the complexity of monumental systems in the real world is understood by analyzing small objects. This approach stands in a very long historical tradition.

Since Antiquity, observations of the order of the cosmos have been used to explain the order of universal knowledge. Discussions on the substance of the universe played a role in the use of metaphors to explain classification in the rising discipline of library science at the end of the nineteenth century. In recent studies we returned to these metaphors and questioned their historical readings based on a transition from universe of knowledge to universe of concept systems (van den Heuvel and Smiraglia, 2010; Smiraglia and van den Heuvel, 2011; Smiraglia et al., 2011). From the existing historiography in library and information science it becomes clear that, before we can speak of a classification theory, the relationship between classification and concepts in knowledge organization needs to be reexamined. How can it be that the order of that which is known is dependent on the behavior of an unexplained phenomenon? And yet that is where the domain of knowledge organization finds itself.

Hjørland (2009) made a valuable attempt to formulate the outline of a “concept-theory”. However, we claim that a successful application of such a theory can only be validated in the context of a larger framework of elementary structures of knowledge. Earlier we sketched such a framework focusing on the UDC, which we reuse here for the analysis of theoretical views on classifications (Smiraglia et al., 2011).

Components of elementary structures of knowledge

Hjørland (2009) signals the problem of definition of “concept.” Indeed, competing definitions and various propositions for synonyms or descriptions appear in the literature on classification. It is impossible to match notions of “concept” exactly with terms such as ideas, facets, isolates or elements, which often have different meanings in various classification systems and which are used only implicitly by classificationists in their theories. Nevertheless, we need some sort of overview of attempts to map notions of concepts and components of elementary structures of knowledge in various classification systems, in order to enable the validation of an elementary theory of knowledge interaction. Here we discuss notions of elementary components in the work of Paul Otlet (1868-1944). In particular, a typescript with the title “Structure and classification of knowledge. General considerations and synoptic table” [Structure et Classification des Connaissances. Considerations Generales et Tableau Synoptique] of March 13th, 1928 is of interest, since it discusses Otlet’s theoretical interpretations of various components of knowledge, which he could not include in the UDC, or “documentary classification” as he called it in this context, but which could be used in “preparation of its revision”[1].

Relations between elementary structures of knowledge

Hjørland states that concepts can only be understood in context. The same applies to what we described above as “components of elementary structures of knowledge”. If we cannot exactly match the various notions of terms of which concepts are built up, we might at least try to describe the relations between the elementary structures of knowledge and their design to come to higher levels of concreteness. From linguistic theory we have learned that both syntax and semantics are of importance in the understanding of concepts. Since various classifications have been discussed as artificial languages, we are in particular interested in the interplay between syntax and semantics in the relations between elementary structures of knowledge from a historical perspective.

Interaction between elementary structures of knowledge

Given our assumption that it is impossible to completely match concepts with the components of elementary structures of knowledge, and that the order and character of the relationships in classification are shaped by our perceptions as well, we claim that complete data integration is impossible and that we should focus on the ways in which we interact with knowledge rather than on knowledge organization per se.

2. Elementary components: the historical quest for knowledge particles

The order of knowledge in explanations of classifications was not infrequently compared with the order of atoms and molecules in the universe. Bliss wrote a historiographical overview of the use of the universe of knowledge metaphor in classification as early as 1929, while more than a half century later historians of library and information science, such as Wilson, Miksa and Beghtol juxtaposed it with a new metaphor in “classification theory”: the universe of concepts (Bliss, 1929; Wilson, [1968] 1978; Miksa, 1992, 1998; Beghtol, 2008; Smiraglia et al., 2011).

Since John Dalton formulated his atomic theory in A New System of Chemical Philosophy (1808 and 1810), a debate had developed about whether the universe was built up from matter or from gas (kinetic energy). The Austrian physicist Ludwig Boltzmann (1844-1906) adhered to the theory that the universe was built up from matter with atoms as its basic constituents, while the German Nobel prize winner in chemistry Wilhelm Ostwald (1853-1932) until 1908 denied the existence of atoms and proclaimed that the whole universe consisted of kinetic energy, and that matter is nothing but a spatial grouping of various energies. Around the turn of the century it had just become known that atoms could be divided into subparticles. In 1897 Joseph John Thomson (1856-1940) announced the discovery of the electron and in 1901 Max Planck (1858-1947) individuated the quantum. Four years later Albert Einstein (1879-1955) published the first part of his relativity theory, on special relativity (discussing the interactions between elementary particles and spacetime), to be followed by a publication on general relativity (discussing its implications for the universe) in 1916. These discoveries, followed by the introduction of quantum mechanics and relativity theory at the beginning of the twentieth century, gave a new impetus to the existing debate on the substance of the universe.

Thomas S. Kuhn in The Structure of Scientific Revolutions uses this transition from atomism in the Newtonian tradition to quantum mechanics as an example to support his view that the discovery of a new and unexpected phenomenon that might be revolutionary in one (sub-)discipline is not necessarily so in another one. Although such a revolutionary discovery might lead to a shift of paradigms in more disciplines it hardly ever results in a fundamental change of rules therein (Kuhn, 1996, p. 49).

Following Kuhn’s line of reasoning, we claim that the transition in views on the relationships between atoms and the universe at large perhaps did not change existing library classification systems and practices to a great extent, but certainly had an impact on the views of some classifiers on the future of knowledge organization in the first half of the twentieth century.

Several authors have recognized the search of Paul Otlet (1868-1944) for the smallest elements of the universe of knowledge. Rayward (1990, pp. 6-8) described the process referred to as the Monographic Principle, in which documents in whatever format or matter can be dissected into what Otlet had called the most elementary “item of information with its own identity”, classed in the Universal Decimal Classification system, and then disseminated and recombined in different formats (Rayward, 1990, p. 1) (see Figure 1). Frohmann (2008, p. 79) read this process as being in the service of the full revelation of facts. Ducheyne (2005, p. 114) discussed the atomist approach to illustrate Otlet’s objectivist view on language and to draw parallels with Wittgenstein’s (2007) logical atomism. He returned to this metaphor to question past claims about the influence of positivism on Otlet’s works and to get a better understanding of the latter’s views on ontology and epistemology (Ducheyne, 2009). More recently, Thomas M. Dousa compared the atomist metaphor in the work of Otlet with the contemporary views of the American classifier Ernest Cushing Richardson (1860-1939), who in his Classification, Theoretical and Practical (Richardson, 1901) had described the concept of the “unit idea” as the simplest element in the “order of things” (Dousa, 2010a, p. 19).

Figure 1.

Otlet’s search for the smallest particles of knowledge must be seen in relation to these developments in the sciences at the end of the nineteenth and the beginning of the twentieth century. He was in direct contact, for instance, with the aforementioned Wilhelm Ostwald, and indirectly, as we will see later, with the protagonists of the debates around relativity theory and quantum mechanics.

Otlet’s monographical principle, which allowed the splitting up and recombination of knowledge into smaller parts, was based on Ostwald’s “Monographieprinzip”, which the latter had described as “the principle of the independent preservation of smaller pieces of thought” (Rayward, 1975, 1990; Hapke, 2008). It implies a direct link between small immaterial components of knowledge production, and material ones in the form of documentation. We do not know whether Otlet and Ostwald discussed issues of the substance of the universe.

Although there is no evidence that Otlet was influenced by Ostwald’s views on the universe, the latter would bring atomist theory closer to him in an indirect way. Ostwald had created the International Institute for Organizing Intellectual Work, The Bridge, under the auspices of Emperor Wilhelm II. Otlet figures, in his role of General Secretary of the International Institute of Bibliography, on the first page of the list of members that The Bridge published in 1913. Although Ostwald stood in the tradition of the nineteenth century atomist debate on the substance of the universe, The Bridge would create a brief, but important moment in discussions around more revolutionary developments in physics and chemistry at the beginning of the twentieth century. For the first meeting of 1911, The Bridge invited outstanding scholars in atomic theory of the time such as Ernest Rutherford, Henri Poincaré and Marie Curie and entrepreneurs/benefactors such as Andrew Carnegie and Ernest Solvay. The Bridge had a very short life, but in the same year of its founding Ernest Solvay was able to enlarge the group with the first protagonists of the debates that would dominate theoretical physics for the greater part of the twentieth century. In 1911 Solvay organized the conference Radiation and the Quanta that brought together Albert Einstein, Max Planck, Ernest Rutherford, Marie Curie and others in the Conseil de Physique that would meet in Brussels to discuss the very latest developments in the recent atomic and quantum theories. The meetings were by invitation only and Paul Otlet was not among those invited.

However, Otlet was connected indirectly to this network via Robert B. Goldschmidt (1877-1935), with whom he had worked on the Microphotographic Book (and with whom he would continue working). Goldschmidt was one of the scholars who had convinced Solvay to create an International Institute of Physics (and to support a similar project for Chemistry that was brought in by Wilhelm Ostwald) and was a participant in the Solvay meetings of 1911 and 1913 (Lambert, 2010, 163 and 167). Especially from Otlet’s later work it becomes clear that he followed these new discoveries with much interest. In Monde, Otlet referred to the theories of Einstein and Riemann and argued that they implied new conceptualizations of space. In an unpublished document in the Archives of the Mundaneum of May 1939 (source: Mons. -Mundaneum, EUM – Farde 57–Inventions– 54-19390511) with the title “Element, atom, ion and chemical affinity: historical”, Otlet lists protagonists in the development of theories on the atom and its subparticles from 1807 until 1930, with the names of Dalton, Faraday, Van ’t Hoff, Ostwald, Rutherford, Bohr, Goldschmidt, and others. Otlet’s historical overview of the atom by scientists was not just an attempt to create a lemma for one of his many encyclopedias.

Another document with the title Unity from 1931 reveals a more fundamental interest in the history of the atom (source Mons, Mundaneum, EUM/Farde 57/Inventions/Old number VMT N 4937 1931.12.16). In this document Otlet states that all the sciences are developing towards unity. This statement is based on two observations:

(1) all the sciences can be reduced to physics; and

(2) physics possibly can be reduced to one single formula.

In short, physics plays in Otlet’s view an instrumental role in the unification of the sciences where he finds an analogy for reducing the complexity of the world.

In Monde Otlet (1935) tried to capture the complete reality of the world in one single formula. In his view, the synthesis of the world is the product of object and subject, but also of what is unknown and mysterious (Otlet, 1935, p. XXI). This synthesis is expressed first in French words, followed by a letter code based on the first letters of these words. In order to make this equation independent of language, Otlet proposed a numerical annotation of decimal fractions (see Figure 2).

Otlet’s formula of the world is visualized in a construction of spheres of various sizes (see Figure 3). This visualization shows two spheres of which the outer one represents the objects: (0,1) things (nature, man, society and divinity), (0,2) space and (0,3) time, and the inner one the subjects: (0,5) Creations and (0,6) Expressions, and (0,7) the Unknown and Mystery circling around the central globe representing (0,4) the Self.

This representation of Monde as a whole, indicated with the numerical code (0,8), forms part of an oblong folder in which all eight elements are represented separately and, in the last image, turn in circles around the world documentation center (0,9) Mundaneum.
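As an illustration only, the numerical codes described above can be written out as a small lookup table. The following Python sketch is not part of Otlet's work; the dictionary layout and the helper names (as_fraction, in_notational_order, label) are assumptions introduced here, while the codes and English labels are those given in the description of the spheres.

```python
from fractions import Fraction

# Codes and labels as listed in the description of Otlet's spheres;
# the data structure itself is a hypothetical illustration.
WORLD_EQUATION = {
    "0,1": "Things (nature, man, society and divinity)",
    "0,2": "Space",
    "0,3": "Time",
    "0,4": "The Self",
    "0,5": "Creations",
    "0,6": "Expressions",
    "0,7": "The Unknown and Mystery",
    "0,8": "Monde (the World as a whole)",
    "0,9": "Mundaneum",
}


def as_fraction(code):
    """Read a decimal fraction written with a comma, such as '0,4'."""
    return Fraction(code.replace(",", "."))


def in_notational_order(codes):
    """Order codes by numerical value, independently of any verbal label."""
    return sorted(codes, key=as_fraction)


def label(code):
    """Look up the English label attached to a code."""
    return WORLD_EQUATION[code]


if __name__ == "__main__":
    for code in in_notational_order(WORLD_EQUATION):
        print(code, label(code))
```

The point of the sketch is that the ordering depends only on the numbers: the labels could be replaced by terms in any language without affecting the sequence.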

On this representation of the World, Otlet wrote the comment “the equation of the world develops like this. It is at the same time its classification.” Ducheyne uses this visualization to underline Otlet’s attempt to connect the microcosm of the human being with the macrocosm of the universe, whereby all knowable elements of reality and the relations between them could be overseen, comprehended, and contemplated (Ducheyne, 2009, p. 234).

In another study (van den Heuvel and Smiraglia, 2010, pp. 51-52) we argued that the growing interest of Otlet in new developments in physics can, on the one hand, explain the theory he developed in Monde, but on the other hand inevitably leads to a form of tension within his all-encompassing system. Otlet refers to the contradictions that have arisen between the theory of relativity and quantum theory, and it becomes clear that the notion of timespace and the problem of scalability of laws applicable to macro- and micro-physical objects have implications for his own universal knowledge system (Otlet, 1935, pp. 16-18, pp. 29-31; van den Heuvel and Smiraglia, 2010; van den Heuvel and Rayward, 2011). In the introduction of Monde, Otlet explained that the equation of the world is never completely perfect as a result of limitations of the human mind in perceiving reality, but that at the same time the mind never stops analyzing and making mental constructions of reality: “It makes use of representative concepts [emphasis original] of reality” (Otlet, 1935, p. VII). How do we have to read Otlet’s concepts in the context of his organization of knowledge and how do they relate to atomic interpretations of his work? Can we compare for instance Frohmann’s (2008) interpretation of atoms as facts with the linguistic atoms described by Ducheyne (2005, p. 114)?

Figure 2.

Otlet signals the tension between reality and finding the right representation to capture reality when he tries to come to a synoptic table of the structure and classification of knowledge. “The creation of such a table”, Otlet observes:

[ . . . ] offers various difficulties because:

Tradition has not created abridged terms that allow the design of a whole group of ideas desired to be classified by one single word.

There are overlaps and interpretations of rubrics, for instance from a static point of view (matter, form) and from a dynamic point of view (where one distinguishes energy and direction, teleology)” (Otlet, 1928, not numbered, p. 1, see [1]).

Within his classification of knowledge, Otlet seems to try to develop a typology of “fundamental distinctions” and to outline some dimensions, or “general points of view”, of classification. For the latter he distinguishes within classification, for example, the dimension “abstract, concrete”: abstract series (conception), concrete series (real object and its production/function, use) and descriptive series (history, science, variation and statistics of object). He formulates “things” (choses) and terms that represent them, such as: a material object, a material/immaterial being, a phenomenon, an idea (conception, theory, abstract point of view), an intellectual work, a social organism (entity), a person or a class of persons, a human act (operation) and a fact/an event (Otlet, 1928, p. 2, see [1]).

Figure 3.

Although Otlet does not provide exact definitions of “things”, his distinction between different sorts of things makes clear that he differentiates between objects, beings, ideas and facts.

This distinction becomes clearer when he describes the purpose of the table.

Otlet considered the development of such a table as a scientific and theoretical exercise, with in principle no documentary purpose. Although it could serve for the preparation and revision of the decimal classification in the future, its purpose was not to be incorporated in the existing UDC. Nevertheless, Otlet considered the development of such a table relevant for various reasons:

[ . . . ] such as for the construction of new sciences, the completion of existing sciences, the comparison hereof, to judge relative importance of questions, to obtain a classification of objects, even of the universe, and for discovering, by analogy and in an almost automatic and mechanical way, new truths and facts (Otlet, 1928, not numbered, p. 1, see [1]).

This description makes clear that Otlet separates the structure of knowledge from its documentation and its classification (see Figure 4).

Conceptions form part of the ideas we have of reality, they can serve to obtain a classification of objects – even of the whole universe – and to find new truths and facts. The aim of classification is to abstract the reality as represented in documents to come to facts that can be recombined in new (formats of) documents. This process is well described in his later publication Monde:

The world reveals itself to man in society in four modalities: The real world (Reality), the known world (Thought), the expressed world (language), the graphic/described [Otlet uses the word graphisé] World (Document) [...] In principle, they should concord perfectly, but in fact they do not. Thought does not know all; language does not express all and the document does not register all (Otlet, 1935, pp. VII-VIII).

“However, they tend to do it”, Otlet continues, “or should tend to do it.” (Otlet, 1935, p. VIII).

Conceptions do not form part of classification, but can facilitate classification:

[ . . . ] the expression of each concept leads to a corresponding vocabulary term; the relation between those terms, leads to procedures of expression that form the grammar. Higher and more general the relations between concepts lead to logic. Applied to an organization of knowledge, the ordering of the elements of logic leads to science. In the end the whole of science leads to the knowledge of the World in its totality (Otlet, 1935, p. VII).

In this incremental model, languages (Otlet often uses the term expressions as a synonym) play a crucial role in bringing knowledge and its documentation together.

3. Elementary relations: languages and notations

In the historiography of knowledge organization classifications have been discussed as artificial languages. For instance, it is a recurrent theme in the publications of Ranganathan and interpreters of his work. In a few studies the theme is discussed in relation to Otlet’s views on classification and the UDC. Ducheyne (2005) discussed Otlet’s views in relation to Wittgenstein’s “linguistic atomism”, while Dousa recently discussed the Universal Decimal Classification as a documentary language (Dousa, 2010b) and as an artificial language (Smiraglia et al., 2011). Otlet indeed refers to classification as a language. In his On the Structure of Classification (Otlet, 1895-1896, transl. Rayward, 1990, p. 52) he compares his numerical notation to spoken language:

Classification numbers will [ . . . ] be complex numerical expressions made up of different factors whose respective meanings when juxtaposed will express a complex idea after the fashion of compound words in spoken languages.

Figure 4.

Ducheyne’s comparison of Otlet’s views on knowledge organization in relation to Wittgenstein’s “linguistic atomism” is indeed quite tempting. Wittgenstein’s views were based on the doctrine of logical atomism that the mathematician Bertrand Russell introduced in his 1911 lecture to the French Philosophical Society: Le Réalisme Analytique. However, the latter acknowledged in his The Philosophy of Logical Atomism (1918) that he was much indebted to the ideas of his friend and former pupil Wittgenstein (Proops, 2011). The doctrine of logical atomism states that elementary propositions assert the existence of atomic states of affairs, are mutually independent and are combinations of semantically simple symbols or “names” that refer to “objects.” To these simple objects correspond primitive signs (linguistic atoms) that cannot be dissected further by definition. Ducheyne states that both Otlet and Wittgenstein subscribed to the doctrine of linguistic objectivism which he defined as follows:

Linguistic atoms uniquely correspond to certain discrete and well-defined elements in the world and further combinations of these linguistic atoms can objectively capture “the order of the world” (Ducheyne, 2005, p. 114).

When describing the need for the creation of scientific language and a future bibliographical classification of the social sciences, Otlet is using atoms, or more precisely molecules, as a metaphor. Referring to “the precision of language in chemistry”, Otlet states:

A word [ . . . ] not only evokes the object named in its concrete form, but also by logical association, all the characteristics and attributes of the object in the same way that the formula for a compound expresses its relationships and quickly makes its elements evident (Otlet, 1891-1892, translated in Rayward, 1990, p. 19).

However, it is questionable whether Wittgenstein’s attempt to identify objects by means of semantically simple concepts, in short to find the elements in a conceptual universe, can be read in the same way as Otlet’s atoms as information in a universe of documents.

Otlet clearly makes a distinction between what he calls a scientific language, in which expressions of ideas are held together by a logical grammar, and a documentary language (described by Dousa, 2010b) that organizes a multitude and variety of documents. The scientific language, based on the development of concepts, and the documentary one, directed at the combination of facts, thus do not coincide. Otlet tries to overcome this divergence in the notation systems of his classifications of knowledge and the UDC.

Above we already discussed how Otlet tried to capture the order of the world in one single formula and had described the synthesis of the world first in a letter code derived from the French terms followed by a numerical notation of decimal fractions.

This simple replacement of verbal concepts by numerical notation might seem trivial at first sight. However, Otlet sees enormous potentiality in numerical notation:

It is concise. It is international, not depending on any language. It refers to concepts not to words and their fluctuant synonyms. It marks the order to follow or the classification of concepts. Applied to an exposition of data or facts (données), it marks the successive order. Finally, the formula (equation) constitutes the plan itself according to which the exposition of data can be developed and references to it can be made (Otlet, 1935, p. XXII).

Otlet links his equations of the world to the classification and the notation of knowledge in the UDC in the future. He foresees a further development of the UDC in which ways will be found to merge classifications (“refondues”), in order to create a single instrument that can be applied both to the sciences and to their documentation (Otlet, 1935, p. XXII and note[1]).

In short, the classifications of concepts and of facts in the UDC are not contradictory, but complementary. They are merely complementary at first, but Otlet hopes that notation can overcome the difficulties in the classifications of knowledge that he had signaled before in his manuscript “Structure and classification of knowledge. General considerations and synoptic table”.

Otlet’s notion that decimal notation is crucial for merging the sciences and documentation in one classification is directly related to his view on the role mathematics plays within contemporary developments in physics. In a chapter in Monde with the title “Limited intelligence and mathematics”, Otlet observes that the contradiction between quantum theory and wave-mechanics in modern physics can only be explained if one exchanges the very subjective point of view of the commonly used language of representation in physics for a mathematical point of view (Otlet, 1935, p. 29). He suggests that the fundamental problems of scalability in physics can be solved by mathematics “and provide the necessary language for our deductions” (Otlet, 1935, p. 29). This view seems incompatible with Wittgenstein’s later rejection of one of the fundamental principles of logical atomism, i.e. the independence of elementary propositions, since it was in his view mathematically impossible. Ducheyne’s observation that Otlet’s description of Monde corresponds with Wittgenstein’s logical atomism seems therefore to be valid only for the early publications of the philosopher on this matter. However, by the time that Otlet had formulated his order of the world in Monde (1935), Wittgenstein (1929) had already referred in his Some Remarks on Logical Form to the limitation of mathematical notation and suggested that terms had to replace real numbers in atomic propositions (Proops, 2011). In his Philosophical Grammar Wittgenstein (1936) distanced himself further from a definite dissection of propositions in mathematical terms when he stated:

I spoke as if there was a calculus in which such a dissection would be possible (Wittgenstein, 1936, p. 211, quote from Proops, 2011).

While Wittgenstein had encountered the limitations of mathematical notation, Otlet stressed its future potential not only for understanding the contradictions in physics sketched here above, but also as a creative force to apprehend reality in general:

Mathematics [...] is not merely a tool to enable higher levels of abstraction but has become thought (pensée) itself, at least to consider what is its substitute and its successor. Mathematics is not just a translator but a producer of concepts, that are untranslatable in another language (Otlet, 1935, p. 31; van den Heuvel and Smiraglia, 2010, p. 52).

Otlet’s notion that the mathematical representation of physical reality is not just a translation but a producer of concepts has important implications for the role he ascribes to classification. It implies that the notation gives a dynamic quality to the classification system itself. Instead of being built up purely by clear semantic content from components purified by the “external” Monographic Principle (van den Heuvel, 2008), the UDC as a producer of knowledge would also have kinetic energy, similar to the way in which Otlet considered the book as a dynamic embodiment of energy and thought (compare Day, 1997, p. 312). It would become an active instrumentation producing concepts from the inside out to create a better world.

Otlet’s view of the UDC as a dynamic producer of concepts, expressed above, is of interest for the potential role of classifications in hypothesis generation described by Miksa (1992, p. 120):

The goal of this possible application is to explore how documents might possibly relate to one another, not for the purpose of retrieving the documents per se, but rather for the purpose of clustering them according to their intellectual likeness – to aid scholars, for example, in the intellectual process of organizing the ideas.

The question of how a classification can have intrinsic dynamic qualities that make it suitable for interaction by its users, rather than an object of study for data integration by classificationists, motivates our search for an elementary theory of knowledge interaction.

4. Toward an elementary theory of knowledge interaction

We argued that it is impossible to complete the various notions of terms of which concepts are built up, and that it might be more useful to describe the dimensions of the relations between the elementary structures of knowledge to come to a better understanding of concepts. Moreover, we described how one and the same classificationist can develop classification languages and notations that run parallel but are not necessarily integrated. Finally we observed that classification languages are not necessarily static objects of study in which syntax and semantics serve as frameworks to integrate concepts, but also can have intrinsic dynamic qualities as producers of knowledge. This raises the question of how we perceive and interact with this knowledge production. Here we try to build up a contextual framework for our thought experiment of an elementary theory of knowledge interaction. We anchor our theory against the backdrop of Patrick Wilson’s ([1968] 1978, 6 ff.) “bibliographical universe” and Jesse Shera’s (1951) views on classification as the basis of bibliographic organization.

For Wilson the bibliographical universe is a concept space – an abstract intellectual milieu – in which all knowledge resides, and all retrievable knowledge (that which has been recorded, and the recordings that have been stored) exists insofar as possible in relation to its whole and its parts as well as to its universal backdrop. That is, the whole of knowledge is present in a discrete if infinite space, that knowledge which has been recorded is directly retrievable, that which has not been recorded is perhaps potentially retrievable if its boundaries and loci can be established, and pathways exist both to collocate and disambiguate. In other words, Wilson’s universe is comprehensible if, as Otlet advised, all knowable elements and relations between them are contemplated. Wilson says that his universe is bibliographical because the objects that populate it are writings (Wilson uses the word “texts,” to distinguish a work from any semiotic instance of it; we use “writing” to give the sense of generic, abstract creation apart from any semantic instance of it; see Smiraglia, 2001, p. 3), but we can infer from his discussion of the processes of description, exploitation, and the construction and use of the bibliographical apparatus that he would clearly extend the reach of his universe to all exemplars of recorded knowledge (what in information science is usually called a document). Thus we know that texts and other writings (such as maps, paintings, sculptures, and musical works, for instance) are directly retrievable entities in the bibliographical universe. But so are potentially recordable entities so long as their ideation is in some sense retrievable.

We found inspiration in the interpretations of Shera and Farradane to outline some requirements of such an elementary theory of knowledge (Smiraglia and van den Heuvel, 2011):


We are in search of groupings that will be meaningful in relevant contexts or relationship (compare Shera, 1951, p. 85).



We are in search of multiple approaches to the relata rather than the provision of alternative locations for individual units (compare Shera, 1951, p. 88).

We are in need of capturing processes (not objects), which can be represented as events in analogy with structures of thought (compare Shera, 1951, p. 80) and temporary perceptions that allow for concurrence, comparison and association (compare Farradane, 1952, p. 81).

Shera disassociated the idea content from its physical embodiment (Shera, 1951, p. 81).

Moreover, the idea that a knowledge element could get a new function in another ensemble tallies with the pragmatic approach of Shera to classification that any single unit may be meaningful in any number of different relationships. “Thus it is the external relations, the environment, of the concepts that are all important to the act of classifying” (Shera, 1951, pp. 83-84). Otlet’s model on which we focused has the same limitation as most classifications in that it focused on one universal order of knowledge. We are interested in the nature and behavior of knowledge unities in various constellations or universes to formulate an alternative to a universal classificatory order, in order to create (temporary) interfaces that allow for interactions of knowledge between various universes (Smiraglia and van den Heuvel, 2011).

4.1 Entities in universes of knowledge

Therefore in our construct the metaphorical bibliographical universes are populated by entities – knowable elements of reality – that can be seen to exist in relationship to each other – relationships of nearness and distance, of joint motion, of evolution over time, etc. Smiraglia (1996) extended the metaphor logically by suggesting that works and their instantiations cluster in metaphorical constellations, having orbital and therefore gravitational relationship to each other, and that there are different sorts of celestial bodies in the bibliographical universe. These “constellations” are groupings of instantiations of works – not only the progenitor work itself, but also its editions, translations, abridgments, adaptations, excerpts, etc., and their instantiations as well.

These have been termed variously “bibliographic families” (Wilson), “superworks” (Svenonius, 2001), “textual identity networks” (Leazer and Furner, 1999), and “instantiation networks” (Smiraglia, 2008a).

At a meta-level we can consider Svenonius’ (2001, p. 54) set of entities to be the basic components that make up the bibliographical universe: works, authors, titles, editions, subjects, classifications, indexes, documents, productions, carriers, and locations. Bean and Green (2001) compile a summary of the kinds of relationships that can exist within, between and among units of recorded knowledge – semantic and syntactical relationships, bibliographic relationships (explicit relationships that exist among documentary records of knowledge), warrant, and relevance. Collectively, these entities and the relationships among them, both expressed and as yet undiscovered, are the population of the bibliographic universe, which traditionally has been seen as an object of primary interest to the domain of knowledge organization. It is easy to see that this set of entities and relationships bears striking resemblance to Otlet’s equation and enumeration – things, and representations of them – in Figure 2. There we see that the physical world is considered to be made up of objects in space and time, and knowledge of the physical world is embodied in representations whether in synthesized narrative or mere words, embracing that which is known and that which is unknown (and therefore unrecorded but potentially knowable). Knowledge that is recorded can be organized using the entities named, and retrieval of that knowledge is facilitated by the syndetic pathways among them, which result from the mapping of relationships. Knowledge that has not been recorded remains as potential information, but often remains outside the realm of knowledge organization.

Empirical evidence about both entities and relationships outlines the quantitative parameters of a theory of knowledge interaction. Such a theory can at present embrace only recorded knowledge, about which a fair amount of empirical evidence exists; but the potential that as yet unrecorded knowledge might also be described by this theory must be admitted, however haltingly. That a theory of knowledge must precede a theory of knowledge interaction seems obvious, but perhaps must be stated here explicitly. The empirical studies in question include Tillett’s (1982) ground-breaking study of bibliographic relationships, Taylor’s studies that demonstrate the constant distribution of author productivity (Taylor-Dowell, 1982; Taylor and Paff, 1986; Taylor, 1992), several studies that demonstrated a predictable distribution of documentary forms (Smiraglia, 2002a, b), and those that demonstrate the universality of the phenomenon of instantiation among information objects (Smiraglia, 2008a; Greenberg, 2009). We use the verb “outlines” here to indicate how little empirical evidence of these entities and relationships exists as yet. As though we were standing at a metaphorical horizon, which allows us to observe an overview, we have broad enough empirical understanding to create a complex description of the broad outlines of the part of the universe of knowledge that we can see. But clearly there is room for much more evidence gathering.

4.1.1 A taxonomy of entities in universes of knowledge. Specifically, we want to assert a taxonomy of the universe of knowledge such that entities are:

works are made up of ideas;

ideas are made up of concepts;

concepts, the atomic elements of the bibliographical universe, are represented by signs; and

signs, concepts, and works constitute taxons, which constitute canons.

And processes are:



perception, which can be described by dynamic semiotic action, which can be either triadic (as in the Peircean R-I-O triad) or dyadic (as in the Saussurean signified-signifier dyad); and

events along a continuum in both time and intellectual space, which is described by the information phenomenon of instantiation.
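The entity side of this taxonomy can be read as a simple containment structure. The following Python fragment is only a minimal sketch under our own assumptions: the class names, fields, and the illustrative concept labels are hypothetical and are not part of the theory itself. It shows works made up of ideas, ideas made up of concepts, concepts represented by signs, and works grouped into taxons and canons.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sign:
    """A representation (word, image, notation) that stands for a concept."""
    label: str


@dataclass
class Concept:
    """Atomic element of the bibliographical universe, represented by signs."""
    name: str
    signs: list[Sign] = field(default_factory=list)


@dataclass
class Idea:
    """An idea is made up of concepts."""
    name: str
    concepts: list[Concept] = field(default_factory=list)


@dataclass
class Work:
    """A work is made up of ideas."""
    title: str
    ideas: list[Idea] = field(default_factory=list)

    def concepts(self) -> set[str]:
        """Names of all concepts reachable through the work's ideas."""
        return {c.name for idea in self.ideas for c in idea.concepts}


@dataclass
class Taxon:
    """A grouping of works (with the concepts and signs they carry)."""
    name: str
    works: list[Work] = field(default_factory=list)


@dataclass
class Canon:
    """A canon is constituted by taxons."""
    name: str
    taxons: list[Taxon] = field(default_factory=list)


# Illustrative data only: the idea and concept labels are assumptions.
setting = Idea("setting", [
    Concept("American Civil War", [Sign("Civil War")]),
    Concept("Georgia", [Sign("Georgia")]),
])
colour = Idea("colour cinematography", [
    Concept("Technicolor", [Sign("Technicolor")]),
])

novel = Work("Gone with the Wind (novel)", [setting])
film = Work("Gone with the Wind (movie)", [setting, colour])

taxon = Taxon("Gone with the Wind", [novel, film])
canon = Canon("hypothetical canon", [taxon])

# Concepts common to both instantiations of the work.
shared = novel.concepts() & film.concepts()
print(sorted(shared))
```

In such a sketch the overlap of concept sets is one possible, and deliberately crude, way of expressing that two instantiations share ideational content; the process side of the taxonomy (perception and instantiation over time) is not modelled here.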

Works are linked in space and time in two distinct and very different ways: by ideational and semantic evolution (the “works” phenomenon), and by non-semantic attribution sequences (the “citation” phenomenon). That is, specific texts of works are linked by their common ideational or semantic expressions (ideas, or strings of meaning, which might be words or sounds or images or mere impulses) which instantiate across cultural boundaries in response to canonical catalysts. Thus, for example, Gone with the Wind the novel, Gone with the Wind the screenplay, and Gone with the Wind the movie all share (or are linked by) common ideas (plot, setting, characters, etc.) and common semantic expressions (the words, dialogue, etc.). They also arose in a chronological sequence to meet some demand, and they all exist in many editions, versions, etc. (instantiations). But other, specific ideas are linked by attribution networks traced across space and time through citation links; these ideas instantiate within and among intellectual boundaries according to canonical norms, and the citation links provide means for visualizing social perception of intellectual similarity or distance.

4.2 The role of the concept

Knowledge organization has a strong basis in concept-theory (Dahlberg’s term was “concept-theoretic”), which is essentially a semantic approach to the classification of that which is known and expressed. The problem of the definition of “concept” has been repeatedly highlighted, notably by Hjørland (2009). Competing definitions appear in the literature, which might be one reason Hjørland calls it a socially negotiated construct. The concept as seen in KO today is a single, simple, unsubdivided ideational entity. Dahlberg (2006, p. 12), for example, suggests that in the taxonomy of knowledge, concepts form what she calls “knowledge units,” which are a synthesis of concept characteristics represented by signs. For example, “medicine” and “forensic” are concepts; when combined into “forensic medicine” they form a knowledge unit. In her schema, concepts then are recognized by their synthesized characteristics, which she calls knowledge elements. In our example, then, “forensic medicine” has a meaning – its knowledge element which is a synthesis – which is different from simply the combination of “forensic” and “medicine.”

Such characteristics are perceptual, allowing concepts to be represented differentially by means of facets, but they are “not to be confused with features of concepts, e.g. broader, narrower, related, etc.” Hjørland similarly suggests that concepts are dependent on perception; in particular, he demonstrates the manner in which epistemological points-of-view affect the understanding of concepts, such that (Hjørland, 2009, p. 1527):


The ideal of empiricism is to define concepts by clustering similar objects (relying on features that can be observed “objectively” and avoiding theoretical selection of defining properties).

The ideal of rationalism is to define concepts by a set of primitive concepts (or “semantic primitives”) considered “given”.

The ideal of historicism is to define concepts (a) genealogically and (b) by explicating their relations to theories and discourses.

The ideal of pragmatism is to define concepts by deciding which class of things best serves a given purpose and then to fixate this class in a sign.

In each instance the concept is a single element at the most basic level of understanding – an “atomic element” in other words; the epistemological perspectives suggest different approaches to the synthesis of characteristics, and domain-specific discourses provide context for the representational signs. Later in the same article Hjørland (2009, p. 1529) says that knowledge organization systems “should not consider concepts to be universal but to be linked to certain discourses and interests.” More recently, Szostak (2011, p. 2247) has proposed an approach to interdisciplinary classification in which what he calls “basic concepts” are “concepts that can readily be ascribed similar meanings across disciplines or cultures.” His proposal is that a concern for interdisciplinary knowledge organization is to make a distinction between such basic concepts, which can be understood in many domains, and what he calls “complex concepts,” which are those that can be understood only within discrete domains.

The concept, then, is roughly equivalent to Ranganathan’s (1967) use of the term “isolate” in apposition to the term subject. It is also parallel to Edmund Husserl’s (1913, 1950) notion of the “eideia,” which has been used to explain the noetic function of perception along multiple individual egotistical continua (Smiraglia, 2008b, 2010; Smiraglia and Gabel, 2009). It is in language – representational, traditional, contextual semantics – that “concept” and “likeness” clash. It is here that perception becomes a critical process because any given concept is context dependent. And it is here that we return to the semantic hindrance that lies between the expression of thought and the labeling of concepts. Therefore a concept-theory must be seen as a substrate of all knowledge, as it were, a layer that lies beneath and supports expressed knowledge, such that a foundational theory allows the interpolation of domain-specific contexts. A stated theory that points to empirical means of both evidence and functionality is critical to the furtherance of knowledge organization as a domain. Here we present the outline of such a theory, and we point to the empirical means for further research.

Domain-specific analyses underlie the comprehension of all of knowledge organization. They require careful empirical research into the concepts that constitute any domain’s ontology, as well as the epistemological shading of the origin of its conceptual base. Precisely the work that now is being conducted in knowledge organization to explicate the axes of specific domains is the beginning of the laying of such a foundation. This foundational theory, or as we call it here an elementary theory of knowledge interaction, is that which has been articulated in the preceding section. The entities of the records of knowledge and the relationships among them are the constants on which domain-specific contexts may rest, and from which domain-specific concept-theoretic structures may emerge. Such domain-specific structures then may co-exist in parallel, in much the same way disciplines co-exist in a classification such as the UDC. The flexibility of the faceted structure of UDC and of its movable form divisions is indicative of the fundamental function of the substrate, providing the fundament on which domain-specific ground can flourish from which, in turn, concepts emerge to populate the realm of knowledge.

4.2.1 Semiotic dynamicity. Concepts themselves are represented by signs and the expressions that signify them – which can be grouped, and which can function dynamically in space and time. This dynamic action is described by the semiotic process, which moves signs and the larger entities that contain them from inception to perception – as it were from thought to ideation – along a constant trajectory. Just as concepts are diverse, so may signs be described diversely as “likenesses, indices, and symbols” (Peirce, [1894] 1998, p. 10). Likenesses (Peirce also called these “icons”) imitate what they represent, indications (or “indices”) point to things to which they are connected, and symbols are associated with the things they represent because of a pattern of usage.

These terms represent a dynamic progression of signs: likenesses being furthest removed from the objects they represent, indices being physically connected, and symbols being intellectually connected. Thus the degree of closeness is a measure of the force of a sign. Signs, and therefore the concepts they signify, are then dynamic in multiple ways. Perception of them is subject to semiotic action, and their degree of force is subject to spatial motion along a representational continuum. Concepts are combined in various ways, which can be measured, as can their associated elemental signs. Domain analytical tools, such as co-word analysis, citation analysis, author co-citation analysis, journal co-citation analysis, and content analysis, are the methodological tools that are at present in use for the demarcation of the conceptual bounds of individual domains. Other sorts of analyses, such as those applied to social tagging or those used for the development of thesauri, and those in use in the field of terminography, are similarly metrical tools for the demarcation of the conceptual bounds of individual domains. What is needed is the theoretical base we supply here, as well as much more empirical analysis using the methods described here, as well as the development of new methods such as the emerging methods for subject ontogeny (Tennis, 2003; Salah et al.


Those that are combined in deliberate ways may rise to the level of works, but there are many combinations of concepts that are informative but which are not deliberate works (Smiraglia, 2006, 2008a).
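As a rough illustration of how combinations of concepts “can be measured,” the co-word analysis named above can be reduced to counting how often two terms appear together in the same document. The fragment below is a minimal sketch only: the function name `coword_counts` and the three term lists are hypothetical, and real domain analyses apply normalization and clustering steps that are omitted here.

```python
from collections import Counter
from itertools import combinations


def coword_counts(documents):
    """Count how often each unordered pair of terms co-occurs in a document."""
    pairs = Counter()
    for terms in documents:
        # Deduplicate and sort so that each pair has one canonical order.
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical index terms for three documents (illustrative only).
docs = [
    ["classification", "concept", "semiotics"],
    ["classification", "concept", "facet"],
    ["concept", "semiotics", "instantiation"],
]

counts = coword_counts(docs)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Pairs with high counts suggest concepts that a domain habitually combines; pairs that never co-occur remain candidates for the as yet undiscovered relationships discussed earlier.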

Research in KO that makes use of semiotic theory has demonstrated the ways in which signs may be viewed as either static or dynamic, and those that are dynamic may instantiate in a variety of ways. Smiraglia (2002c, 2008a) has demonstrated the ways in which works may function as signs – most works (most ideas, most signs, etc.) exist in single immutable instances, but those that become accepted as cultural icons (art works, sacred texts, oral history, folk songs, tunes, etc.) become dynamic. Static signs conform to Saussurean semiosis of signified and signifier, but dynamic signs conform to the mutable nature of Peircean symbols. A Peircean symbol is usually represented as a triad of Representamen, Interpretant, and Object. Each Object, on being perceived, becomes a new Representamen, meaning the symbol is potentially infinitely mutating. Friedman (2006, 2007) has demonstrated how concept maps empirically demonstrate Peircean and Saussurean semiosis. Mai (2001) suggested Peircean semiotic theory could be used to create a clearer focus on the analysis of subject content of documents, extracting concepts in a more reliable way, so as to represent them better in KO schema.
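The chaining of triads described in this paragraph, in which each perceived Object becomes the Representamen of the next triad, can be sketched as follows. This is an illustration under our own assumptions: the `Triad` class, the `perceive` function, and the placeholder labels are hypothetical, and the sketch follows the description given in the text rather than any formal model of semiosis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triad:
    """One Representamen-Interpretant-Object (R-I-O) triad."""
    representamen: str
    interpretant: str
    obj: str


def perceive(current: Triad, interpretant: str, obj: str) -> Triad:
    """Build the next triad: the Object just perceived becomes the new Representamen."""
    return Triad(representamen=current.obj, interpretant=interpretant, obj=obj)


# Placeholder labels only; they stand for arbitrary signs.
first = Triad("R0", "I0", "O0")
second = perceive(first, "I1", "O1")
third = perceive(second, "I2", "O2")
chain = [first, second, third]

for triad in chain:
    print(triad)
```

Because `perceive` can be applied to its own result without limit, the chain has no built-in end point, which is the sense in which the symbol is “potentially infinitely mutating.”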

Works or other information objects that are subject to taxonomic inclusion – that is, the concatenation of mutable mutating instantiations – are those that constitute measurable canons (bodies of ideation in the form of works, such as “a literature”). For example, canons can range from “the arts and sciences,” which is the basis of most undergraduate university education in the USA, to “library and information science,” arguably the canon closest to this paper, which includes classical work by Dewey and Cutter, Otlet and Briet, Buckland and Bates, Goldhor, Rayward, Day, and so forth (Doherty, 1998). A canon is the literature accepted as foundational for a domain, and therefore, a canon can be as broad or narrow as its domain. It is canonicity, or acceptance into a canon, that has been demonstrated to be associated with a high degree of instantiation. Put more simply, a work or a set of works, once accepted into a canon, comes into demand, which causes more editions, translations, adaptations, commentaries, etc. to be generated by the domain. These canons provide the warrant for most classificatory activity in KO. Instantiation has been shown to be a continuum along which ideation is combined with intellectual force into the expressions of works (Smiraglia, 2008a). Motion is the pathway of ideation in the process of instantiation. Earlier we made reference to the classic novel/motion picture Gone with the Wind; it is easy to see that the sequence of novel-screenplay-movie-etc. takes place on a temporal continuum; at each point, the ideation (the ideas in the novel) is combined with intellectual force to generate a new expression (novel to screenplay, or screenplay to movie, for instance). Such motion can be positive, negative, or both – that is, it can add to the original ideation, subtract from it, or abandon it completely – and therefore is analogous to the semiotic/semiosis dichotomy in which signified-signifier/representamen-interpretant-object are the descriptors of the motion of concepts when intellectual force is applied. Instantiation networks (in this case the set of all expressions known as Gone with the Wind), or canons of deliberately constituted expressed knowledge, can be open or closed.

KO is notoriously amenable to external explanations and many other approaches have been incorporated – e.g. semiotics, post-modernism, phenomenology – to help flesh out the limits of concept-theory. In particular, authors pointing to postmodernist approaches have suggested the domain of KO should move away from a search for universal explanations and seek instead to find concrete, domain-specific ontologies (for example, Miksa, 1998; Mai, 1999; Hjørland and Albrechtsen, 1999). One could understand this movement within KO to be a renewed approach to domain-specific explanations in search of concepts with clear local relevance. Our contribution is different in that we here provide a different epistemological reading of “knowledge” as elemental, elementary, theory – a new but very useful approach to the consideration of our own paradigm – by outlining a semantics that is based on structure and on related forces between components, rather than on content.

Semiotic theory defines the manner in which fluidity inheres in the recognition of symbolic knowledge. Phenomenology uses the egotistical eideia to define the manner in which the same phenomenon encountered at different points seems to have different reflective perceptual definitions. Post-modernism allows us to embrace all explanations at once, comprehending the possibility that each might contribute one element to an eventual overall theory of knowledge. The empirical studies cited above demonstrate with clarity the dynamic forces that propel knowledge constantly into new dimensions.

5. Conclusion

Our objective is a theory that would support a process that resembles Otlet’s grinder of all sorts of documents (see Figure 1) into which pre-coordinated knowledge is entered, inside which its components – works, ideas, and at the most basic level concepts – are disassembled and reassembled, the result of which is semantic hyperlinking based on knowledge structure in addition to traditional semantics. We have explored some significant historical understandings of the relationship between ideas, concepts, and knowledge, which is their synthesis. We suggest the necessity of drilling down the taxonomy of knowledge elements to find the core particles, which can be used to generate classes of structural knowledge elements, rather than universal concatenations of what Szostak calls “basic concepts”. We also suggest that faceted classification should embrace more than simple categorical aspects of complex concepts. Rather, facets should be used to represent multiple dimensions of concepts by temporal interfaces with instantiating information objects and semiotic interfaces insofar as possible to reveal the dynamic nature of concepts across otherwise distinctly different domains. For example, a temporal interface for instantiating information objects is offered by the FRBR model of work → expression → manifestation; a semiotic interface is, in other words, a documentation of evolving signs – a trivial example is the painting Mona Lisa and the iconic New Yorker cover featuring the face of Monica Lewinsky in a photographically contrived image of the painting (http://

Our exposition of the dissection and reassemblage of concepts is an attempt to demonstrate how a systematization of these approaches can bring further insight about the structure of knowledge, and therefore of the meaning of its extension. Our approach is very similar to that put forward by Stock (2010), in which many of the same components are combined with reference to the role of knowledge organization systems (KOSs) in support of the information retrieval process. However, the taxonomy we generate is a first step toward demonstrating a semantics that is based on the structure of knowledge (i.e. the footprint created when the elements of the theory outlined here are cumulated for any concept, idea, or work) rather than the content of documents (i.e. the mere names of the concepts represented in it), so as to enable the development of mechanisms for linking related knowledge entities with (so far) undiscovered similarities.


1. The Hague, Netherlands – National Library KB – Archives of FID – Box 95 – x 058:025.46 “andere classificatie systemen – MD 25.49 – Otlet’s numbering Note 2197 (ante 368) –

