McClelland CogSci Web Seminar

Semantic Cognition:
A Parallel Distributed Processing
Approach
James L. McClelland
Center for the Neural Basis of Cognition
and
Departments of Psychology and Computer Science,
Carnegie Mellon
Timothy T. Rogers
Center for the Neural Basis of Cognition
and now
MRC Cognition and Brain Sciences Unit, UK
CNBC
A Joint Project of Carnegie Mellon and the University of Pittsburgh
Approaches to Semantic Cognition
• Concepts and their Properties
– Is Socrates Mortal?
• Hierarchical Propositional Models
– Quillian, 1968; Collins and Quillian, 1969
• Theory-Theory and Related Approaches
– Murphy and Medin, 1985; Gopnik and Wellman,
1994; Keil, 1991; Carey, 1985
• Parallel Distributed Processing
– Hinton, 1981; Rumelhart and Todd, 1993; McRae, De
Sa, and Seidenberg, 1997
Plan for This Talk
• Compare a distributed, connectionist model that learns
from exposure to information about the relations
between concepts and their properties to the ‘classical’
Hierarchical Propositional Approach.
• Show how the model accounts for a set of phenomena
that have been introduced in support of ‘Theory Theory’
• Conclude with a brief consideration of where we are in
the development of a theory of semantic cognition.
Initial Motivations for the Model
• Provide a connectionist alternative to
traditional hierarchical propositional models of
conceptual knowledge representation.
• Account for development of conceptual
knowledge as a gradual process involving
progressive differentiation.
Quillian’s
Hierarchical
Propositional
Model
The Parallel Distributed Processing
Approach
• Processing occurs via propagation of activation among
simple processing units.
• Knowledge is stored in the weights on connections
between the simple processing units.
• Propositions are not stored directly.
– The ability to produce complete propositions from
partial probes arises through the activation process,
based on the knowledge stored in the weights.
• Learning occurs via adjustment of the connections.
• Semantic knowledge is gradually acquired through
repeated exposure, mirroring the gradual nature of
cognitive development.
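The points above can be illustrated with a minimal sketch. It is not the model from the talk: it assumes a single layer of logistic units, a cross-entropy form of the delta rule, and two hypothetical items with three hypothetical properties. It shows only that a partial probe (an item) is completed by propagating activation through learned weights, with no proposition stored explicitly.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights, biases, inputs):
    """Propagate activation from the input units to the output units."""
    return [
        sigmoid(b + sum(w * a for w, a in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]

def train_step(weights, biases, inputs, targets, lr=0.5):
    """Adjust each connection in proportion to its output unit's error.

    Returns the summed squared error measured before the update.
    """
    outputs = forward(weights, biases, inputs)
    for j, (out, tgt) in enumerate(zip(outputs, targets)):
        delta = tgt - out  # error signal for output unit j
        biases[j] += lr * delta
        for i, a in enumerate(inputs):
            weights[j][i] += lr * delta * a
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))

# Hypothetical toy environment: one input unit per item, three property units.
patterns = [
    ([1.0, 0.0], [1.0, 1.0, 0.0]),  # item A has properties 0 and 1
    ([0.0, 1.0], [1.0, 0.0, 1.0]),  # item B has properties 0 and 2
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(3)]
biases = [0.0, 0.0, 0.0]

# Gradual acquisition: many small adjustments over repeated exposure.
for epoch in range(500):
    for inputs, targets in patterns:
        train_step(weights, biases, inputs, targets)

# Probe with item A alone; the properties are produced by activation.
completed = forward(weights, biases, [1.0, 0.0])
```

After training, `completed` should be close to item A's property pattern, even though the only stored knowledge is the list of weights and biases.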
The Rumelhart Model
[Figure labels: Activation, Error]
The Training Data:
All propositions true of
items at the bottom level
of the tree, e.g.:
Robin can {grow, move, fly}
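One way to picture the training data is as a set of patterns, each pairing an item and a relation with the properties that complete the proposition. The sketch below is an assumption about the encoding (one-hot inputs, binary targets). Only the "Robin can {grow, move, fly}" entry comes from the slide; the other entries are illustrative.

```python
# Propositions true of leaf-level items. Only the robin/can entry is taken
# from the slide; every other entry is an illustrative assumption.
PROPOSITIONS = {
    ("robin", "can"): {"grow", "move", "fly"},
    ("robin", "isa"): {"living thing", "animal", "bird", "robin"},    # assumed
    ("salmon", "can"): {"grow", "move", "swim"},                      # assumed
    ("salmon", "isa"): {"living thing", "animal", "fish", "salmon"},  # assumed
}

ITEMS = sorted({item for item, _ in PROPOSITIONS})
RELATIONS = sorted({relation for _, relation in PROPOSITIONS})
ATTRIBUTES = sorted(set().union(*PROPOSITIONS.values()))

def one_hot(value, vocabulary):
    """Return a vector with a single 1.0 at the position of value."""
    return [1.0 if v == value else 0.0 for v in vocabulary]

def make_patterns(propositions):
    """Turn each (item, relation) entry into (item input, relation input, target)."""
    patterns = []
    for (item, relation), attrs in sorted(propositions.items()):
        target = [1.0 if a in attrs else 0.0 for a in ATTRIBUTES]
        patterns.append((one_hot(item, ITEMS), one_hot(relation, RELATIONS), target))
    return patterns

patterns = make_patterns(PROPOSITIONS)
```

Each target marks every attribute that is true for that item in that relation, so a single pattern stands for several propositions at once.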
Any Questions?
Differentiation in Development
The Rumelhart Model
Trajectories of Representations Through State
Space over Time
Any Questions?
Tenets of Theory Theory
• Intuitive domain knowledge of relations between items
and their properties is used to decide:
– which categories are ‘good’ ones and which
properties are central to particular concepts
– how properties should be generalized from one
category to another
• Many proponents suggest that some theory-like
knowledge (or constraints on acquiring such knowledge)
must be available ‘initially’.
• Others emphasize reorganization of knowledge through
experience, but provide very little discussion of how
experience leads to reorganization.
Three Phenomena Supporting
Theory Theory
✓ Category goodness and feature importance.
• Differential importance of properties in
different concepts.
• Reorganization of conceptual knowledge.
Effects of Coherent Variation of
Properties on Learning
• Attributes that vary together create the concepts that
populate the taxonomic hierarchy, and determine which
properties are central to a given concept.
• Where sets of attributes vary together, they exert a
strong effect on learning.
– Items with co-varying properties stay together
through semantic space and form the clusters
corresponding to super-ordinate concepts.
• Arbitrary properties (those that do not co-vary with
others) are very difficult to learn, even when frequency
is controlled.
– They control a late stage of differentiation in which
individual items within clusters become conceptually
distinct.
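What "vary together" means can be made concrete with a small measurement. The sketch below does not simulate learning; it only quantifies covariation, using a hypothetical environment of eight items in which every property is true of exactly four items, so frequency is matched. The mean absolute correlation between property columns is one assumed way to score coherence.

```python
from itertools import combinations

def correlation(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def mean_abs_correlation(columns):
    """Average absolute correlation over all pairs of property columns."""
    pairs = list(combinations(columns, 2))
    return sum(abs(correlation(a, b)) for a, b in pairs) / len(pairs)

# Each list is one property; each position is one of eight hypothetical items.
coherent = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
]
incoherent = [
    [1, 0, 1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 1, 1, 0],
]

coherent_score = mean_abs_correlation(coherent)
incoherent_score = mean_abs_correlation(incoherent)
```

The coherent set scores high because its properties pick out nearly the same items, while the incoherent set scores zero even though each property is equally frequent.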
Coherence Training Environment
[Figure: items-by-properties matrix; properties are grouped as Coherent or Incoherent, with axis labels numbered 1 to 16]
No Category Labels are Provided!
Effect of Coherence on Learning
Effect of Coherence on Representation
Extended model
for remaining
simulations
Progressive Differentiation of Category
Structure Without Names
300 Epochs
plants | animals
1200 Epochs
plants | animals
Any Questions?
Three Phenomena Supporting
Theory Theory
✓ Category goodness and feature importance.
✓ Differential importance of properties in
different concepts.
• Reorganization of conceptual knowledge.
Differential Importance
(Macario, 1991)
• 3- to 4-year-old children see a puppet
and are told he likes to eat, or
play with, a certain object (e.g.,
top object at right)
– Children then must choose
another one that will “be the
same kind of thing to eat” or
that will be “the same kind of
thing to play with”.
– In the first case they tend to
choose the object with the
same color.
– In the second case they will
tend to choose the object
with the same shape.
Adjustments to Training
Environment
• To address this we added some new property units and
created clear cases of feature-dependencies in the
model:
• Among the plants:
– All trees are large
– All flowers are small
– Either can be bright or dull
• Among the animals:
– All birds are bright
– All fish are dull
– Either can be small or large
• Though partially counter-factual, these assignments
allow us to explore domain specificity of feature
dependencies in the model.
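The assignments above can be written out as data to check which feature is informative in which domain. This is a sketch under assumptions: the dictionary layout and the helper names `build_items` and `predicts_kind` are hypothetical, and the code inspects the environment only. It does not train a network.

```python
# The feature that is fixed for each kind, following the slide's assignments.
FIXED = {
    "tree": ("plant", "size", "large"),
    "flower": ("plant", "size", "small"),
    "bird": ("animal", "brightness", "bright"),
    "fish": ("animal", "brightness", "dull"),
}
VALUES = {"size": ["small", "large"], "brightness": ["dull", "bright"]}

def build_items():
    """Enumerate items: the fixed feature is set, the free feature takes both values."""
    items = []
    for kind, (domain, fixed_feature, fixed_value) in FIXED.items():
        free_feature = "brightness" if fixed_feature == "size" else "size"
        for free_value in VALUES[free_feature]:
            items.append({
                "domain": domain,
                "kind": kind,
                fixed_feature: fixed_value,
                free_feature: free_value,
            })
    return items

def predicts_kind(items, domain, feature):
    """True if, within the domain, every value of the feature maps to one kind."""
    kinds_by_value = {}
    for item in items:
        if item["domain"] == domain:
            kinds_by_value.setdefault(item[feature], set()).add(item["kind"])
    return all(len(kinds) == 1 for kinds in kinds_by_value.values())

items = build_items()
size_matters_for_plants = predicts_kind(items, "plant", "size")
brightness_matters_for_plants = predicts_kind(items, "plant", "brightness")
brightness_matters_for_animals = predicts_kind(items, "animal", "brightness")
size_matters_for_animals = predicts_kind(items, "animal", "size")
```

In this environment, size separates trees from flowers but says nothing about birds versus fish, and brightness does the reverse. That is the domain-specific dependency the simulation probes.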
Testing Feature Importance
• After partial learning, the model is shown eight test objects:
– Four “Animals”:
• All have skin
• All combinations of bright/dull and large/small
– Four “Plants”:
• All have roots
• All combinations of bright/dull and large/small
• Representations for the test objects are generated by
back-propagation, training only the item-to-representation
weights; all other weights stay fixed.
• Representations are then compared to see which
animals are treated as most similar, and also which
plants are treated as most similar.
(One unit is added
for each test object)
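The idea of training only the item-to-representation weights can be sketched as gradient descent on a representation vector while the remaining weights stay frozen. The sketch assumes linear output units and a hypothetical frozen weight matrix `WEIGHTS`; the real model has more layers and nonlinear units.

```python
import math

# Hypothetical frozen weights: three property units, two representation units.
WEIGHTS = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]

def predict(weights, rep):
    """Linear property outputs for a given representation."""
    return [sum(w * r for w, r in zip(row, rep)) for row in weights]

def infer_representation(weights, target, steps=200, lr=0.1):
    """Find a representation for a new object; the weights are never modified."""
    rep = [0.0] * len(weights[0])
    for _ in range(steps):
        errors = [t - o for t, o in zip(target, predict(weights, rep))]
        for k in range(len(rep)):
            # Error sent back along the frozen weights to representation unit k.
            step_k = sum(e * row[k] for e, row in zip(errors, weights))
            rep[k] += lr * step_k
    return rep

# Two hypothetical test objects described only by their observed properties.
rep_a = infer_representation(WEIGHTS, [1.0, 0.0, 1.0])
rep_b = infer_representation(WEIGHTS, [0.0, 1.0, 1.0])

# Representations can then be compared, for example by Euclidean distance.
distance_ab = math.dist(rep_a, rep_b)
```

Because the weights are fixed, the representation each object settles into reflects what the network has already learned about which properties matter.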
Similarities of Obtained
Representations
Size is relevant
for Plants
Brightness is relevant
for Animals
Differential Feature Importance
• The simulation suggests that domain-general
learning mechanisms can learn that different
features are important for different concepts.
• The network has acquired domain-specific
knowledge of just the sort theory theorists
claim children know about concepts.
• It does so from the distributions of properties
of concepts, without the aid of initial domain
knowledge.
Phenomena Supporting Theory
Theory
✓ Category goodness and feature importance.
✓ Differential importance of properties in
different concepts.
✓ Reorganization of conceptual knowledge.
Conceptual Reorganization (Carey, 1985)
• Carey demonstrates that children ‘discover’ the
unity of plants and animals as living things only around
the age of 10.
• She suggests that the concept of living thing
emerges from the assimilation of different
kinds of information, including:
– Need for nutrients
– What it means to be dead vs. alive
– Reproductive properties
Conceptual Reorganization in the
Model
• Our simulation model provides a vehicle for
exploring how conceptual reorganization can
occur.
– The model is capable of forming initial
representations based on superficial
appearances
– Later, it can discover shared structure that
cuts across several different relational
contexts, and use the emergent common
structure as a basis for a deeper
organization.
Reorganization Simulation
• We consider the coalescence of the superordinate
categories plant and animal, in a situation where the
training data initially supports a superficial organization
based on appearance properties.
• In each training pattern, the input is an item and one of
the three relations: ISA, HAS, or CAN.
• The target includes all of the superficial appearance
properties (IS properties) plus the properties
appropriate for the relation.
• The model quickly learns representations that capture
the superficial IS properties.
• Later, it reorganizes these representations as it learns
the relation-dependent properties.
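The structure of a training pattern in this simulation can be sketched as follows. The item description and its property values are hypothetical; only the scheme comes from the slide: the input is an item plus one relation, and the target is the item's IS properties together with the properties for that relation.

```python
RELATIONS = ("ISA", "HAS", "CAN")

# Hypothetical item description; the property values are illustrative only.
ITEM = {
    "name": "robin",
    "IS": {"red", "small"},
    "ISA": {"bird", "animal"},
    "HAS": {"wings", "feathers"},
    "CAN": {"fly", "move"},
}

def patterns_for(item):
    """Build one ((item, relation), target) pattern per relation.

    The superficial IS properties appear in every target, so they are
    trained in all three relational contexts.
    """
    return [
        ((item["name"], relation), item["IS"] | item[relation])
        for relation in RELATIONS
    ]

patterns = patterns_for(ITEM)
```

Because the IS properties are present in every pattern while each relation-dependent property appears in only one of the three, the appearance properties are seen more often. That is consistent with the slide's statement that they are captured first.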
Organization of Conceptual Knowledge
at Different Points in Development
Phenomena Supporting Theory
Theory
✓ Category goodness and feature importance.
✓ Differential importance of properties in
different concepts.
✓ Reorganization of conceptual knowledge.
Summary
• The model exhibits several characteristics of
human cognition that motivated the appeal to
naïve domain theories.
• The model does these things simply by
adjusting the weights on connections among
simple processing units, and by propagating
signals backward and forward through these
weighted connections.
Relationship between the Model and
Theory Theory
• There is a sense in which the knowledge in the
connections plays the role of the informal domain
theories advocated by theory theorists, and one might
be tempted to suggest that the model is ‘merely an
implementation’ of the theory theory.
• However, it differs from the theory theory in several
very important ways:
– It provides explicit mechanisms indicating how
domain knowledge influences semantic cognition.
– The PDP model avoids bringing in unwanted aspects
of what we generally mean by ‘theory’.
– It offers a learning process that provides a means for
the acquisition of such knowledge.
– It demonstrates that some of the sorts of constraints
theory-theorists have suggested might be innate can
in fact be acquired from experience.
Conclusions
• In our view the ‘theory theory’ should be
viewed as more of a pre-theoretical heuristic
than an actual theory of semantic cognition.
• Our own proposals, built on Hinton’s and
Rumelhart’s, are far from the final word, and
do not constitute a complete theory at this
point.
• Our hope is that they will contribute, along
with the work of many others, to the ongoing
development of an adequate and complete
theory of semantic cognition.