Conceptual Spaces
Part 1: Fundamental notions

P.D. Bruza
Information Ecology Project
Distributed Systems Technology Centre

Opening remarks
- This tutorial is more about cognitive science than IR, is fragmented, and offers a somewhat personal interpretation
- The content is drawn mostly from Gärdenfors’ “Conceptual Spaces: The geometry of thought”, MIT Press, 2000.
- Also driven by some personal intuition:
  - The model theory for IR should be rooted in cognitive semantics
  - How do you capture these semantics in a computational form, and what can you do with them?

Gärdenfors’ point of departure
- How can representations (information) in a cognitive system be modelled in an appropriate way?
  - Symbolic perspective: representation via symbols; a cognitive system is described by a Turing machine (cognition = computation = symbol manipulation)
  - Associationist perspective: representation via associations between “different kinds of information elements” (e.g. connectionism, where associations are modelled by artificial neural networks)

The problem with the symbolic and associationist perspectives
- “mechanisms of concept acquisition, which are paramount for the understanding of many cognitive phenomena, cannot be given a satisfactory treatment in any of these representational forms”
  - Concept acquisition (learning) is closely tied with similarity
  - Geometric representation: similarity can be “modelled in a natural way”

Gärdenfors’ cognitive model
- symbolic: Propositional representation
- conceptual: Geometric representation
- associationist (sub-conceptual): Connectionist representation

Conceptual spaces outline
- Quality dimension
- Domain (Context)
- Property
- Concept
- “Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics”
- How can conceptual spaces be realized (e.g., for IR)

Quality dimensions
- Represent various “qualities” of an object:
  - Temperature
  - Weight
  - Brightness
  - Pitch
  - Height
  - Width
  - Depth
A distinction is made between
“scientific” and “phenomenal” (psychological) dimensions

Quality dimensions (con’t)
- “Each quality dimension is endowed with certain geometrical structures (in some cases topological or ordering relations)”
- Weight: isomorphic to the non-negative reals
- Quality dimensions may have a discrete geometric structure
- Discrete structure divides objects into disjoint classes
  - Kinship relation: father, mother, sister, etc. (geometric structure = discrete points)
- “Even for discrete dimensions we can distinguish a rudimentary geometric structure”

Phenomenal vs. scientific interpretations of dimensions
- Phenomenal interpretation: dimensions originate from cognitive structures (perception, memories) of humans or other organisms
  - E.g. (height, width, depth), hue, pitch
- Scientific interpretation: dimensions are treated as part of a scientific theory
  - E.g., weight

Example: colour
- Hue: the particular shade of colour
  - Geometric structure: circle
  - Value: polar coordinate
- Chromaticity: the saturation of the colour, from grey to higher intensities
  - Geometric structure: segment of reals
  - Value: real number
- Brightness: black to white
  - Geometric structure: reals in [0,1]
  - Value: real number

Example: colour (hue, chromaticity, brightness)
- NB: the geometric structure allows phenomenologically “complementary” and “opposite” hues to be distinguished

Integral and separable dimensions
- Dimensions are integral if an object cannot be assigned a value in one dimension without giving it a value in another:
  - E.g. cannot distinguish hue without brightness, or pitch without loudness
- Dimensions that are not integral are said to be separable
- Psychologically, integral and separable dimensions are assumed to differ in cross-dimensional similarity:
  - integral dimensions are higher in cross-dimensional similarity than separable dimensions.
  - (This point will motivate how similarities in the conceptual space are calculated depending on whether dimensions are integral or separable. N.B.
IR matching functions treat all dimensions equally)

Where do dimensions originate from?
- Scientific dimensions: tightly connected to the measurement methods used
- Psychological dimensions:
  - Some dimensions appear innate, or developed very early; e.g. inside/outside, dangerous/not-dangerous. (These appear to be preconscious)
  - Dimensions are necessary for learning: to make sense of “blooming, buzzing, confusion”.
  - Dimensions are added by the learning process to expand the conceptual space: e.g., young children have difficulty in identifying whether two objects differ w.r.t. brightness or size, even though they can see the objects differ in some way.
- “Both differentiation and dimensionalization occur throughout one’s lifetime”.

In summary,
- Quality dimensions are the building blocks of representations within a conceptual space
- Gärdenfors’ rebuttal of logical positivism:
  - “Humans and other animals can represent the qualities of objects, for example, when planning an action, without presuming an internal language or another symbolic system in which these qualities are expressed.
As a consequence, I claim that the quality dimensions of conceptual spaces are independent of symbolic representations and more fundamental than these”

Domains and conceptual space
- A domain is a set of integral dimensions that are separable from all other dimensions, i.e. a separable subspace (e.g., hue, chromaticity, brightness)
- A conceptual space is a collection of one or more domains
  - Cognitive structure is defined in terms of domains as it is assumed that an object can be ascribed certain properties independently of other properties
- Not all domains are assumed to be metric: a domain may be an ordering with no distance defined
- Domains are not independent, but may be correlated, e.g., the ripeness and colour domains co-vary in the space of fruits

Properties and concepts: general idea
- A property is a region in a subspace (domain)
- A concept is based on several separable subspaces

Example property: “red”
[Figure: the region “red” in the colour domain; axes: hue, chromaticity, brightness]
- Criterion P: A natural property is a convex region of a domain (subspace)
- “natural”: those properties that are natural for the purposes of problem solving, planning, communicating, etc.

Motivation for convex regions
[Figure: two regions, one labelled “Convex” and one labelled “Not convex”, each containing points x and y]
- x and y are points (objects) in the conceptual space
- If x and y both have property P, then any object between x and y is assumed to have property P

Remarks about Criterion P
- Criterion P: A natural property is a convex region of a domain (subspace)
- Assumption: “Most properties expressed by simple words in natural languages can be analyzed as natural properties”
- “The
semantics of the linguistic constituents (e.g. “red”) is severely constrained by the underlying conceptual space” (i.e. no “bleen”)
- “Criterion P provides an account of properties that is independent of both possible worlds and objects”
- Strong connection between convex regions and prototype theory (categorization)
- (Easier to understand how inductive inferences are made)

Example concept: “apple”
Apple = < < , , , , texture, fruit, nutrition> , >
- Criterion C: A natural concept is represented as a set of regions in a number of domains together with an assignment of salience weights to the domains and information about how the regions in the different domains are correlated

Concepts and inference (in passing)
- The salience of different domains determines which associations can be made, and which inferences can be triggered
  - Context: moving a piano leads to the association “heavy”
- More about this next time...

How to model relevance: concept?
  Topicality        About my topic
  Novelty           Unique or the only source; familiar
  Currency          Up-to-date
  Quality           Well written, credible
  Presentation      Comprehensive
  Source aspects    Prominent author
  Info aspects      Theoretical paper
  Appeal            Enjoyable
Table from Yuan, Belkin and Kim, ACM SIGIR 2002 Poster

How to model a document(s): ?
- “An exosomatic memory is a computerized system that operates as an extension to human memory. Ideally, use of an exosomatic system would be transparent, so that finding information would seem the same as remembering it to the human user” (B.C. Brookes, 1975)
  - To create computerized representations of data sets that are consistent with human perception of the data sets
  - To enable personalized relations to representations of data sets
  - To provide natural interfaces for interaction with exosomatic memory
Newby, G. Cognitive space and information space.
JASIST 52(12), 2001

Term = dimension
- “Since many of the fundamental quality dimensions are determined by our perceptual mechanisms, there is a direct link between properties described by regions of such dimensions and perceptions” (rats!)
- However, dimensional spaces based on terms have shown marked correlation with human information processing:
  - On HAL: “It is difficult to know how to encode abstract concepts with traditional semantic features. Global co-occurrence models, such as HAL, may provide a solution to part of this problem”
  - So, using terms as dimensions in a global co-occurrence model leads to useful vector representations of abstract concepts
  - HAL’s results seem to be echoed by Newby using Principal Component Analysis on a term-term co-occurrence matrix

Text fragment = dimension
- For example, a (term x document) matrix
- Latent semantic analysis produces vector representations of words in a reduced dimensional space:
  - LSA correlates with human information processing on a number of tasks, e.g., semantic priming
  - Landauer et al. often use short fragments (dimension = 1 or 2 sentences)
- Dimensional reduction is apparently successful in reproducing cognitive compatibility, but the reason for this is unknown
- Determining the appropriate dimensional structure for IR models is still an open question, especially in light of cognitive aspects

Similarity: introductory remarks
- Similarity is central to many aspects of cognition: concept formation (learning), memory and perceptual organization
- Similarity is not an absolute notion but relative to a particular domain (or dimension)
  - “an apple and an orange are similar as they have the same shape”
- Similarity defined in terms of the “number of shared properties” leads to arbitrary similarity
  - “a writing desk is like a raven”
- Similarity is an exponentially decreasing function of distance
N.B.
clustering in IR often uses an “absolute” notion of similarity

Metric spaces
- A real-valued function d(x,y) is said to be a distance function for space S if it satisfies the following conditions for all points x, y and z in S:
  - Minimality: $d(x, y) \geq 0$, and $d(x, y) = 0$ only if $x = y$
  - Symmetry: $d(x, y) = d(y, x)$
  - Triangle inequality: $d(x, y) + d(y, z) \geq d(x, z)$
- A space that has a distance function is called a metric space
- (There is debate about whether distance is symmetric from a psychological viewpoint. E.g. Tversky et al.: “Tel Aviv judged more similar to New York” than vice versa. Gärdenfors accepts the symmetry axiom)

Equi-distance under the Euclidean metric
  $d_E(x, y) = \sqrt{\sum_i (x_i - y_i)^2}$
- The set of points at distance d from a point x forms a circle
- Points between x and y are on a straight line

Equi-distance under the city-block metric
  $d_C(x, y) = \sum_i |x_i - y_i|$
- The set of points at distance d from a point x forms a diamond
- The set of points between x and y is a rectangle generated by x and y and the directions of the axes

Between-ness in the city-block metric
[Figure: the axis-aligned rectangle with x and y at opposite corners]
- All points in the rectangle are considered to be between x and y

Metrics: integral and separable dimensions
- For separable dimensions, calculate the distance using the city-block metric:
  - “If two dimensions are separable, the dissimilarity of two stimuli is obtained by adding the dissimilarity along each of the two dimensions”
- For integral dimensions, calculate the distance using the Euclidean metric:
  - “When two dimensions are integral, the dissimilarity is determined by both dimensions taken together”

Minkowski metrics
- Euclidean and city-block are special cases of Minkowski metrics:
  $d_k(x, y) = \left( \sum_i |x_i - y_i|^r \right)^{1/r}$
  - City-block: r = 1
  - Euclidean: r = 2

Scaling dimensions
- Due to context, the scales of the different dimensions cannot be assumed identical:
  $d_k(x, y) = \left( \sum_i w_i |x_i - y_i|^r \right)^{1/r}$
  - $w_i$: dimensional scaling factor

Similarity as a function of distance
A common assumption in psychological literature is that similarity is an exponentially
decaying function of distance:
  $s(x, y) = e^{-c \cdot d(x, y)}$
- The constant c is a sensitivity parameter.
- The similarity between x and y drops quickly when the distance between the objects is relatively small, while it drops more slowly when the distance is relatively large.
- The formula captures the similarity-based generalization performances of human subjects in a variety of settings

IR-related comments on similarity
- In the vector-space model, similarity is determined by the cosine function, which is not exponentially decaying
- IR models don’t distinguish between integral and separable dimensions, even though this distinction is significant from a cognitive point of view
- Experience so far with computational cognitive models is mixed:
  - LSA uses cosine similarity (not exponentially decaying)!!
  - HAL used Minkowski (r = 1) to measure semantic distance, i.e. a non-Euclidean distance metric was employed
  - (Non-Euclidean metrics should perhaps be explored)

Prototypes and categorical perception: introductory remarks
- Human subjects judge “a robin as a more prototypical bird than a penguin”
- Classifying an object is accomplished by determining its similarity to the prototype:
  - Similarity is judged w.r.t. a reference object/region
  - Similarity is context-sensitive: a robin is a prototypical bird, but a canary is a prototypical pet bird
- Continuous perception: membership to a category is graded

Prototype regions in animal space
[Figure: animal space with prototype regions labelled reptile, mammal and bird; plotted animals: emu, archaeopteryx, robin, bat, penguin, platypus]
- Categorical perception: stimuli between categories are distinguished with more ease and accuracy than within them
Based on Gärdenfors & Williams IJCAI 2001

Computing categories in conceptual space: Voronoi tessellations
Given prototypes $p_1, \ldots, p_n$, require that q be in the same category as its most similar prototype.
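
The categorization rule just stated, together with the scaled Minkowski distance and the exponentially decaying similarity from the earlier slides, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`minkowski`, `similarity`, `categorize`), the default values of `c` and `r`, and the two-dimensional prototype coordinates are assumptions made for the example, not part of the tutorial or of Gärdenfors' text.

```python
import math
from typing import Dict, Optional, Sequence


def minkowski(x: Sequence[float], y: Sequence[float], r: float = 2.0,
              w: Optional[Sequence[float]] = None) -> float:
    """Scaled Minkowski distance: (sum_i w_i * |x_i - y_i|^r)^(1/r).

    r = 1 gives the city-block metric, r = 2 the Euclidean metric.
    With w omitted, every dimension gets scaling factor 1.
    """
    if w is None:
        w = [1.0] * len(x)
    total = sum(wi * abs(xi - yi) ** r for wi, xi, yi in zip(w, x, y))
    return total ** (1.0 / r)


def similarity(x: Sequence[float], y: Sequence[float], c: float = 1.0,
               r: float = 2.0, w: Optional[Sequence[float]] = None) -> float:
    """Similarity s(x, y) = exp(-c * d(x, y)); c is the sensitivity parameter."""
    return math.exp(-c * minkowski(x, y, r, w))


def categorize(q: Sequence[float], prototypes: Dict[str, Sequence[float]],
               c: float = 1.0, r: float = 2.0,
               w: Optional[Sequence[float]] = None) -> str:
    """Return the category whose prototype is most similar to q."""
    return max(prototypes,
               key=lambda name: similarity(q, prototypes[name], c, r, w))


# Hypothetical prototype points in a made-up two-dimensional space.
prototypes = {"bird": (0.9, 0.1), "mammal": (0.1, 0.9)}
print(categorize((0.7, 0.2), prototypes))  # bird
```

Because similarity decreases monotonically with distance, picking the most similar prototype is the same as picking the nearest one; changing `r` or the weights `w` changes which prototype is nearest and therefore the shape of the resulting regions.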
Consequence: partitioning of the space into convex regions

Voronoi tessellations (con’t)
- Much psychological data concords with tessellating conceptual spaces into star-shaped (and sometimes convex) regions around prototypes (e.g., stop consonants in phoneme classification)
- Boundaries produced by Voronoi tessellations provide the threshold of similarity and support a mechanism explaining categorical perception
Gärdenfors & Williams, Reasoning about categories in conceptual spaces, Proceedings IJCAI 2001

Part II
- Concept combination
- Induction
- Semantics
- Non-monotonic aspects of concepts
- Realizing (approximating) conceptual spaces