A Probabilistic Approach to
Semantic Representation
Tom Griffiths
Mark Steyvers
Josh Tenenbaum
• How do we store the meanings of words?
– question of representation
– requires efficient abstraction
• Why do we store this information?
– function of semantic memory
– predictive structure
Latent Semantic Analysis
(Landauer & Dumais, 1997)

co-occurrence matrix X (words × documents):

            Doc1   Doc2   Doc3   …
semantic      34      0      3   …
in             0     12      2   …
spaces         5     19      6   …
…             11      6      1   …

X --SVD--> U D V^T

The SVD embeds words (e.g. "semantic", "in", "spaces") as points in a high-dimensional space.
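As a concrete sketch of this pipeline (the counts are the toy values above, with the row-word pairing reconstructed from the slide; the truncation dimension k and the cosine comparison are standard LSA practice, not from the slides):

```python
import numpy as np

# Toy word-document co-occurrence counts (values from the toy matrix above)
# rows: "semantic", "in", "spaces"; columns: Doc1, Doc2, Doc3
X = np.array([[34.,  0.,  3.],
              [ 0., 12.,  2.],
              [ 5., 19.,  6.]])

# SVD: X = U D V^T
U, D, Vt = np.linalg.svd(X, full_matrices=False)

# Truncate to k dimensions to place words in a low-dimensional space
k = 2
word_vectors = U[:, :k] * D[:k]   # one row per word

# LSA measures word similarity as the cosine between word vectors
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(word_vectors[0], word_vectors[2]))  # "semantic" vs. "spaces"
```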
Mechanistic Claim
Some component of word meaning can be
extracted from co-occurrence statistics
But…
– Why should this be true?
– Is the SVD the best way to treat these data?
– What assumptions are we making about meaning?
Mechanism and Function
Some component of word meaning can be
extracted from co-occurrence statistics
Semantic memory is structured to aid retrieval
via context-specific prediction
Functional Claim
Semantic memory is structured to aid retrieval
via context-specific prediction
– Motivates sensitivity to co-occurrence statistics
– Identifies how co-occurrence data should be used
– Allows the role of meaning to be specified exactly,
and finds a meaningful decomposition of language
A Probabilistic Approach
• The function of semantic memory
– The psychological problem of meaning
– One approach to meaning
• Solving the statistical problem of meaning
– Maximum likelihood estimation
– Bayesian statistics
• Comparisons with Latent Semantic Analysis
– Quantitative
– Qualitative
The Function of Semantic Memory
• To predict what concepts are likely to be needed
in a context, and thereby ease their retrieval
• Similar to rational accounts of categorization
and memory (Anderson, 1990)
• Same principle appears in semantic networks
(Collins & Quillian, 1969; Collins & Loftus, 1975)
The Psychological Problem of Meaning
• Simply memorizing the whole word-document
co-occurrence matrix does not help
• Generalization requires abstraction, and this
abstraction identifies the nature of meaning
• Specifying a generative model for documents
allows inference and generalization
One Approach to Meaning
• Each document a mixture of topics
• Each word chosen from a single topic
• Each w chosen from the distribution for its topic,
P(w | z = j) = φ(j), from parameters φ
• Each z chosen from the document's distribution over topics,
θ = {P(z = 1), …, P(z = T)}, from parameters θ
One Approach to Meaning

word          P(w|z = 1) = φ(1)   P(w|z = 2) = φ(2)
              (topic 1)           (topic 2)
HEART              0.2                 0.0
LOVE               0.2                 0.0
SOUL               0.2                 0.0
TEARS              0.2                 0.0
JOY                0.2                 0.0
SCIENTIFIC         0.0                 0.2
KNOWLEDGE          0.0                 0.2
WORK               0.0                 0.2
RESEARCH           0.0                 0.2
MATHEMATICS        0.0                 0.2
One Approach to Meaning

Choose mixture weights for each document, generate a "bag of words":
θ = {P(z = 1), P(z = 2)}

{0, 1}        MATHEMATICS KNOWLEDGE RESEARCH WORK MATHEMATICS RESEARCH WORK SCIENTIFIC MATHEMATICS WORK
{0.25, 0.75}  SCIENTIFIC KNOWLEDGE MATHEMATICS SCIENTIFIC HEART LOVE TEARS KNOWLEDGE HEART
{0.5, 0.5}    MATHEMATICS HEART RESEARCH LOVE MATHEMATICS WORK TEARS SOUL KNOWLEDGE HEART
{0.75, 0.25}  WORK JOY SOUL TEARS MATHEMATICS TEARS LOVE LOVE LOVE SOUL
{1, 0}        TEARS LOVE JOY SOUL LOVE TEARS SOUL SOUL TEARS JOY
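A minimal sketch of this generative process, using the two topic distributions and mixture weights shown above (the sampler itself is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["HEART", "LOVE", "SOUL", "TEARS", "JOY",
         "SCIENTIFIC", "KNOWLEDGE", "WORK", "RESEARCH", "MATHEMATICS"]

# phi[j] = P(w | z = j): topic 1 is "emotion", topic 2 is "science"
phi = np.array([[0.2]*5 + [0.0]*5,
                [0.0]*5 + [0.2]*5])

def generate_document(theta, n_words=10):
    """Generate a bag of words: pick a topic z for each word, then a word from phi[z]."""
    words = []
    for _ in range(n_words):
        z = rng.choice(2, p=theta)      # topic for this token
        w = rng.choice(10, p=phi[z])    # word from that topic's distribution
        words.append(vocab[w])
    return words

for theta in ([0.0, 1.0], [0.5, 0.5], [1.0, 0.0]):
    print(theta, " ".join(generate_document(np.array(theta))))
```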
One Approach to Meaning

[graphical model: θ → z → w]

• Generative model for co-occurrence data
• Introduced by Blei, Ng, and Jordan (2002)
• Clarifies pLSI (Hofmann, 1999)
Matrix Interpretation

C = Φ Θ

normalized co-occurrence matrix C (words × documents)
mixture components Φ (words × topics)
mixture weights Θ (topics × documents)

A form of non-negative matrix factorization
Matrix Interpretation

Topic model:  C (words × documents) = Φ (words × topics) Θ (topics × documents)
LSA:          C (words × documents) = U (words × vectors) D (vectors × vectors) V^T (vectors × documents)
The Function of Semantic Memory
• Prediction of needed concepts aids retrieval
• Generalization aided by a generative model
• One generative model: mixtures of topics
• Gives non-negative, non-orthogonal factorization
of word-document co-occurrence matrix
A Probabilistic Approach
• The function of semantic memory
– The psychological problem of meaning
– One approach to meaning
• Solving the statistical problem of meaning
– Maximum likelihood estimation
– Bayesian statistics
• Comparisons with Latent Semantic Analysis
– Quantitative
– Qualitative
The Statistical Problem of Meaning
• Generating data from parameters is easy
• Learning parameters from data is hard
• Two approaches to this problem
– Maximum likelihood estimation
– Bayesian statistics
Inverting the Generative Model
• Maximum likelihood estimation: WT + DT parameters
• Variational EM (Blei, Ng & Jordan, 2002): WT + T parameters
• Bayesian inference: 0 parameters (the mixture parameters are integrated out)
Bayesian Inference

P(z | w) = P(w | z) P(z) / Σ_z' P(w | z') P(z')

• The sum in the denominator is over T^n terms
• The full posterior is only tractable up to a constant
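To make the intractability concrete, here is a brute-force sketch that enumerates all T^n topic assignments for a tiny example (the distributions and data are illustrative; real documents make this sum impossible):

```python
import itertools
import numpy as np

T, n = 2, 5                           # T topics, n words: T**n = 32 assignments
phi = np.array([[0.2]*5 + [0.0]*5,
                [0.0]*5 + [0.2]*5])   # P(w | z), illustrative
theta = np.array([0.5, 0.5])          # P(z), illustrative
w = [0, 1, 5, 6, 7]                   # observed word indices

# P(w, z) = prod_i P(w_i | z_i) P(z_i); the posterior needs the sum over all z
joint = {}
for z in itertools.product(range(T), repeat=n):
    joint[z] = np.prod([phi[z_i, w_i] * theta[z_i] for z_i, w_i in zip(z, w)])

evidence = sum(joint.values())        # the T**n-term denominator
posterior = {z: p / evidence for z, p in joint.items()}
print(max(posterior, key=posterior.get))  # most probable assignment
```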
Markov Chain Monte Carlo
• Sample from a Markov chain which
converges to the target distribution
• Allows sampling from an unnormalized
posterior distribution
• Can compute approximate statistics
from intractable distributions
(MacKay, 2002)
Gibbs Sampling
For variables x_1, x_2, …, x_n:
draw x_i^(t) from P(x_i | x_-i), where
x_-i = x_1^(t), x_2^(t), …, x_{i-1}^(t), x_{i+1}^(t-1), …, x_n^(t-1)
Gibbs Sampling
[figure illustrating Gibbs sampling, from MacKay (2002)]
Gibbs Sampling
• Need full conditional distributions for the variables
• Since we only sample z, we need

P(z_i = j | z_-i, w) ∝ (n(w_i, j) + β) / (n(·, j) + Wβ) × (n(d_i, j) + α) / (n(d_i, ·) + Tα)

where n(w_i, j) is the number of times word w_i is assigned to topic j, and n(d_i, j) is the number of times topic j is used in document d_i (both counts excluding token i).
Gibbs Sampling

Each token i has a word w_i, a document d_i, and a topic assignment z_i. Starting from a random assignment, each iteration resamples every z_i in turn from its full conditional, given the current values of all the other assignments:

i    w_i            d_i   z_i (iter 1)   z_i (iter 2)   …   z_i (iter 1000)
1    MATHEMATICS     1        2              2          …        2
2    KNOWLEDGE       1        2              1          …        2
3    RESEARCH        1        1              1          …        2
4    WORK            1        2              2          …        1
5    MATHEMATICS     1        1              2          …        2
6    RESEARCH        1        2              2          …        2
7    WORK            1        2              2          …        2
8    SCIENTIFIC      1        1              1          …        1
9    MATHEMATICS     1        2              2          …        2
10   WORK            1        1              2          …        2
11   SCIENTIFIC      2        1              1          …        2
12   KNOWLEDGE       2        1              2          …        2
…    …               …        …              …          …        …
50   JOY             5        2              1          …        1
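A minimal collapsed Gibbs sampler implementing the full conditional above (the hyperparameter values and the function interface are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_lda(words, docs, W, D, T, alpha=1.0, beta=0.01, iters=1000):
    """words[i], docs[i]: word and document index of token i; returns topic assignments z."""
    n = len(words)
    z = rng.integers(T, size=n)                     # random initial assignments
    nwt = np.zeros((W, T)); ndt = np.zeros((D, T))  # word-topic and doc-topic counts
    for i in range(n):
        nwt[words[i], z[i]] += 1
        ndt[docs[i], z[i]] += 1
    for _ in range(iters):
        for i in range(n):
            wi, di = words[i], docs[i]
            nwt[wi, z[i]] -= 1; ndt[di, z[i]] -= 1  # remove token i from the counts
            # full conditional: P(z_i = j | z_-i, w)
            p = ((nwt[wi] + beta) / (nwt.sum(axis=0) + W * beta)
                 * (ndt[di] + alpha) / (ndt[di].sum() + T * alpha))
            z[i] = rng.choice(T, p=p / p.sum())
            nwt[wi, z[i]] += 1; ndt[di, z[i]] += 1  # add it back with its new topic
    return z

# e.g., with words/docs built from a tokenized toy corpus (hypothetical call):
# z = gibbs_lda(words, docs, W=10, D=5, T=2)
```

Each sweep visits every token once, matching the iteration columns in the table above.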
A Visual Example: Bars
sample each pixel from
a mixture of topics
pixel = word
image = document
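A sketch of how such data could be generated, assuming a 5×5 grid whose topics are horizontal and vertical bars (the grid size, pixels per image, and Dirichlet weights are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
side = 5                                   # 5x5 grid: 25 "words" (pixels)

# Topics: each bar is a uniform distribution over the pixels of one row or column
topics = []
for r in range(side):                      # horizontal bars
    t = np.zeros((side, side)); t[r, :] = 1.0 / side
    topics.append(t.ravel())
for c in range(side):                      # vertical bars
    t = np.zeros((side, side)); t[:, c] = 1.0 / side
    topics.append(t.ravel())
topics = np.array(topics)

def generate_image(n_pixels=100):
    """One 'document': mix the bars, then sample pixels from the mixture."""
    theta = rng.dirichlet(np.ones(len(topics)))   # mixture weights for this image
    mixture = theta @ topics                      # P(pixel) for this image
    counts = rng.multinomial(n_pixels, mixture)
    return counts.reshape(side, side)

images = [generate_image() for _ in range(1000)]
```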
A Visual Example: Bars
[figure: topics recovered from 1000 images]
Interpretable Decomposition
• SVD gives a basis for the data, but not an interpretable one
• The true basis is not orthogonal, so rotation does no good
Application to Corpus Data
• TASA corpus: text from first grade to college
• Vocabulary of 26,414 words
• Set of 36,999 documents
• Approximately 6 million words in corpus
A Selection of Topics

(top 20 words per topic)

THEORY SCIENTISTS EXPERIMENT OBSERVATIONS SCIENTIFIC EXPERIMENTS HYPOTHESIS EXPLAIN SCIENTIST OBSERVED EXPLANATION BASED OBSERVATION IDEA EVIDENCE THEORIES BELIEVED DISCOVERED OBSERVE FACTS

SPACE EARTH MOON PLANET ROCKET MARS ORBIT ASTRONAUTS FIRST SPACECRAFT JUPITER SATELLITE SATELLITES ATMOSPHERE SPACESHIP SURFACE SCIENTISTS ASTRONAUT SATURN MILES

ART PAINT ARTIST PAINTING PAINTED ARTISTS MUSEUM WORK PAINTINGS STYLE PICTURES WORKS OWN SCULPTURE PAINTER ARTS BEAUTIFUL DESIGNS PORTRAIT PAINTERS

STUDENTS TEACHER STUDENT TEACHERS TEACHING CLASS CLASSROOM SCHOOL LEARNING PUPILS CONTENT INSTRUCTION TAUGHT GROUP GRADE SHOULD GRADES CLASSES PUPIL GIVEN

BRAIN NERVE SENSE SENSES ARE NERVOUS NERVES BODY SMELL TASTE TOUCH MESSAGES IMPULSES CORD ORGANS SPINAL FIBERS SENSORY PAIN IS

CURRENT ELECTRICITY ELECTRIC CIRCUIT IS ELECTRICAL VOLTAGE FLOW BATTERY WIRE WIRES SWITCH CONNECTED ELECTRONS RESISTANCE POWER CONDUCTORS CIRCUITS TUBE NEGATIVE

NATURE WORLD HUMAN PHILOSOPHY MORAL KNOWLEDGE THOUGHT REASON SENSE OUR TRUTH NATURAL EXISTENCE BEING LIFE MIND ARISTOTLE BELIEVED EXPERIENCE REALITY

THIRD FIRST SECOND THREE FOURTH FOUR GRADE TWO FIFTH SEVENTH SIXTH EIGHTH HALF SEVEN SIX SINGLE NINTH END TENTH ANOTHER
A Selection of Topics

(eight further topics, top 20 words each)

JOB WORK JOBS CAREER EXPERIENCE EMPLOYMENT OPPORTUNITIES WORKING TRAINING SKILLS CAREERS POSITIONS FIND POSITION FIELD OCCUPATIONS REQUIRE OPPORTUNITY EARN ABLE

SCIENCE STUDY SCIENTISTS SCIENTIFIC KNOWLEDGE WORK RESEARCH CHEMISTRY TECHNOLOGY MANY MATHEMATICS BIOLOGY FIELD PHYSICS LABORATORY STUDIES WORLD SCIENTIST STUDYING SCIENCES

BALL GAME TEAM FOOTBALL BASEBALL PLAYERS PLAY FIELD PLAYER BASKETBALL COACH PLAYED PLAYING HIT TENNIS TEAMS GAMES SPORTS BAT TERRY

FIELD MAGNETIC MAGNET WIRE NEEDLE CURRENT COIL POLES IRON COMPASS LINES CORE ELECTRIC DIRECTION FORCE MAGNETS BE MAGNETISM POLE INDUCED

STORY STORIES TELL CHARACTER CHARACTERS AUTHOR READ TOLD SETTING TALES PLOT TELLING SHORT FICTION ACTION TRUE EVENTS TELLS TALE NOVEL

MIND WORLD DREAM DREAMS THOUGHT IMAGINATION MOMENT THOUGHTS OWN REAL LIFE IMAGINE SENSE CONSCIOUSNESS STRANGE FEELING WHOLE BEING MIGHT HOPE

DISEASE BACTERIA DISEASES GERMS FEVER CAUSE CAUSED SPREAD VIRUSES INFECTION VIRUS MICROORGANISMS PERSON INFECTIOUS COMMON CAUSING SMALLPOX BODY INFECTIONS CERTAIN

WATER FISH SEA SWIM SWIMMING POOL LIKE SHELL SHARK TANK SHELLS SHARKS DIVING DOLPHINS SWAM LONG SEAL DIVE DOLPHIN UNDERWATER
A Probabilistic Approach
• The function of semantic memory
– The psychological problem of meaning
– One approach to meaning
• Solving the statistical problem of meaning
– Maximum likelihood estimation
– Bayesian statistics
• Comparisons with Latent Semantic Analysis
– Quantitative
– Qualitative
Probabilistic Queries
• P(w2 | w1) can be computed in different ways
• Fixed topic assumption: P(w2 | w1) = Σ_j P(w2 | z = j) P(z = j | w1)
• Multiple samples: average the fixed-topic computation over samples of the topic assignments
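A sketch of the fixed-topic computation, given an estimated Φ (here a made-up two-topic Φ; the uniform topic prior is an assumption for illustration):

```python
import numpy as np

phi = np.array([[0.2]*5 + [0.0]*5,
                [0.0]*5 + [0.2]*5])   # P(w | z), illustrative
prior = np.array([0.5, 0.5])          # P(z), illustrative

def predict(w1):
    """P(w2 | w1) under the fixed topic assumption:
    P(w2 | w1) = sum_j P(w2 | z = j) P(z = j | w1)."""
    p_z_given_w1 = phi[:, w1] * prior
    p_z_given_w1 /= p_z_given_w1.sum()   # Bayes' rule for P(z | w1)
    return p_z_given_w1 @ phi            # sum over topics

print(predict(0))   # distribution over w2 given word 0
```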
Quantitative Comparisons
• Two types of task
– general semantic tasks: dictionary, thesaurus
– prediction of memory data
• All tests use LSA with 400 vectors, and a
probabilistic model with 100 samples, each
using 500 topics
Fill in the Blank
• 12,856 sentences extracted from WordNet
his cold deprived him of his sense of _
silence broken by dogs barking _
a _ hybrid accent
• Overall performance
– LSA gives median rank of 3393
– Probabilistic model gives median rank of 3344
Fill in the Blank
[results figure]
Synonyms
• 280 sets of five synonyms from WordNet,
ordered by number of senses (in parentheses), e.g.:

BREAK (78), EXPOSE (9), DISCOVER (8), DECLARE (7), REVEAL (3)
CUT (72), REDUCE (19), CONTRACT (12), SHORTEN (5), ABRIDGE (1)
RUN (53), GO (34), WORK (25), FUNCTION (9), OPERATE (7)

• Two tasks:
– Predict the first synonym from the rest
– Predict the last synonym, given an increasing number of synonyms
First Synonym
[results figure]

Last Synonym
[results figure]
Synonyms and Word Frequency
[results figures: performance by word frequency, probabilistic model vs. LSA]

Word Frequency and Filling Blanks
[results figure: performance by word frequency, probabilistic model vs. LSA]
Performance on Semantic Tasks
• Performance comparable, neither great
• Difference in effects of word frequency due
to treatment of co-occurrence data
• Probabilistic approach useful in addressing
psychological data: frequency important
Intrusions in Free Recall

Study list: CHAIR, FOOD, DESK, TOP, LEG, EAT, CLOTH, DISH, WOOD, DINNER, MARBLE, TENNIS
• Intrusion rates from Deese (1959)
• Used average word vectors in LSA,
P(word|list) in probabilistic model
• Favors LSA, since probabilistic
combination can be multimodal
Intrusions in Free Recall
[results figures: intrusion rates under the models and under word frequency alone]
Word Frequency is Not Enough
• An explanation needs to address two questions:
– Why do these words intrude?
– Why do other words not intrude?
• Median word frequency rank: 1698.5
• Median rank in model: 21
Word Association
• Word association norms from Nelson et al. (1998)

Cue: PLANETS

associate number   people     model
1                  EARTH      STARS
2                  STARS      STAR
3                  SPACE      SUN
4                  SUN        EARTH
5                  MARS       SPACE
6                  UNIVERSE   SKY
7                  SATURN     PLANET
8                  GALAXY     UNIVERSE
Word Association
[results figure]
Performance on Memory Tasks
• Outperforms LSA on simple memory tasks;
both models do far better here than on the semantic tasks
• Improvement due to role of word frequency
• Not a complete account, but can form a part
of more complex memory models
Qualitative Comparisons
• Naturally deals with complications for LSA
– Polysemy
– Asymmetry
• Respects natural statistics of language
• Easily extends to other models of meaning
Beyond the Bag of Words

[graphical model: θ generates a topic z for each word w in the sequence]

Beyond the Bag of Words

[graphical model: the same, plus a hidden Markov chain of syntactic classes s over the word sequence]
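One reading of the second diagram, assuming it is the composite topics-and-syntax model (an HMM over classes s in which one designated class emits from the topic model), as a generative sketch with toy parameters (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy parameters, all illustrative: 3 syntactic classes, class 0 is "semantic"
n_classes, n_topics, n_words = 3, 2, 10
trans = rng.dirichlet(np.ones(n_classes), size=n_classes)      # HMM transitions P(s' | s)
class_dist = rng.dirichlet(np.ones(n_words), size=n_classes)   # P(w | s); entry 0 unused
phi = rng.dirichlet(np.ones(n_words), size=n_topics)           # P(w | z)

def generate(theta, length=10):
    """theta: the document's topic weights. Words come from the HMM class,
    except class 0, which defers to the topic model."""
    s, out = 0, []
    for _ in range(length):
        s = rng.choice(n_classes, p=trans[s])
        if s == 0:
            z = rng.choice(n_topics, p=theta)      # semantic class: use a topic
            out.append(rng.choice(n_words, p=phi[z]))
        else:
            out.append(rng.choice(n_words, p=class_dist[s]))
    return out

print(generate(np.array([0.7, 0.3])))
```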
Semantic categories

(top 20 words per class)

FOOD FOODS BODY NUTRIENTS DIET FAT SUGAR ENERGY MILK EATING FRUITS VEGETABLES WEIGHT FATS NEEDS CARBOHYDRATES VITAMINS CALORIES PROTEIN MINERALS

MAP NORTH EARTH SOUTH POLE MAPS EQUATOR WEST LINES EAST AUSTRALIA GLOBE POLES HEMISPHERE LATITUDE PLACES LAND WORLD COMPASS CONTINENTS

DOCTOR PATIENT HEALTH HOSPITAL MEDICAL CARE PATIENTS NURSE DOCTORS MEDICINE NURSING TREATMENT NURSES PHYSICIAN HOSPITALS DR SICK ASSISTANT EMERGENCY PRACTICE

BOOK BOOKS READING INFORMATION LIBRARY REPORT PAGE TITLE SUBJECT PAGES GUIDE WORDS MATERIAL ARTICLE ARTICLES WORD FACTS AUTHOR REFERENCE NOTE

GOLD IRON SILVER COPPER METAL METALS STEEL CLAY LEAD ADAM ORE ALUMINUM MINERAL MINE STONE MINERALS POT MINING MINERS TIN

BEHAVIOR SELF INDIVIDUAL PERSONALITY RESPONSE SOCIAL EMOTIONAL LEARNING FEELINGS PSYCHOLOGISTS INDIVIDUALS PSYCHOLOGICAL EXPERIENCES ENVIRONMENT HUMAN RESPONSES BEHAVIORS ATTITUDES PSYCHOLOGY PERSON

CELLS CELL ORGANISMS ALGAE BACTERIA MICROSCOPE MEMBRANE ORGANISM FOOD LIVING FUNGI MOLD MATERIALS NUCLEUS CELLED STRUCTURES MATERIAL STRUCTURE GREEN MOLDS

PLANTS PLANT LEAVES SEEDS SOIL ROOTS FLOWERS WATER FOOD GREEN SEED STEMS FLOWER STEM LEAF ANIMALS ROOT POLLEN GROWING GROW
Syntactic categories

(top 20 words per class)

SAID ASKED THOUGHT TOLD SAYS MEANS CALLED CRIED SHOWS ANSWERED TELLS REPLIED SHOUTED EXPLAINED LAUGHED MEANT WROTE SHOWED BELIEVED WHISPERED

THE HIS THEIR YOUR HER ITS MY OUR THIS THESE A AN THAT NEW THOSE EACH MR ANY MRS ALL

MORE SUCH LESS MUCH KNOWN JUST BETTER RATHER GREATER HIGHER LARGER LONGER FASTER EXACTLY SMALLER SOMETHING BIGGER FEWER LOWER ALMOST

ON AT INTO FROM WITH THROUGH OVER AROUND AGAINST ACROSS UPON TOWARD UNDER ALONG NEAR BEHIND OFF ABOVE DOWN BEFORE

GOOD SMALL NEW IMPORTANT GREAT LITTLE LARGE * BIG LONG HIGH DIFFERENT SPECIAL OLD STRONG YOUNG COMMON WHITE SINGLE CERTAIN

ONE SOME MANY TWO EACH ALL MOST ANY THREE THIS EVERY SEVERAL FOUR FIVE BOTH TEN SIX MUCH TWENTY EIGHT

HE YOU THEY I SHE WE IT PEOPLE EVERYONE OTHERS SCIENTISTS SOMEONE WHO NOBODY ONE SOMETHING ANYONE EVERYBODY SOME THEN

BE MAKE GET HAVE GO TAKE DO FIND USE SEE HELP KEEP GIVE LOOK COME WORK MOVE LIVE EAT BECOME
Sentence generation
RESEARCH:
[S] THE CHIEF WICKED SELECTION OF RESEARCH IN THE BIG MONTHS
[S] EXPLANATIONS
[S] IN THE PHYSICISTS EXPERIMENTS
[S] HE MUST QUIT THE USE OF THE CONCLUSIONS
[S] ASTRONOMY PEERED UPON YOUR SCIENTISTS DOOR
[S] ANATOMY ESTABLISHED WITH PRINCIPLES EXPECTED IN BIOLOGY
[S] ONCE BUT KNOWLEDGE MAY GROW
[S] HE DECIDED THE MODERATE SCIENCE
LANGUAGE:
[S] RESEARCHERS GIVE THE SPEECH
[S] THE SOUND FEEL NO LISTENERS
[S] WHICH WAS TO BE MEANING
[S] HER VOCABULARIES STOPPED WORDS
[S] HE EXPRESSLY WANTED THAT BETTER VOWEL
Sentence generation
LAW:
[S] BUT THE CRIME HAD BEEN SEVERELY POLITE OR CONFUSED
[S] CUSTODY ON ENFORCEMENT RIGHTS IS PLENTIFUL
CLOTHING:
[S] WEALTHY COTTON PORTFOLIO WAS OUT OF ALL SMALL SUITS
[S] HE IS CONNECTING SNEAKERS
[S] THUS CLOTHING ARE THOSE OF CORDUROY
[S] THE FIRST AMOUNTS OF FASHION IN THE SKIRT
[S] GET TIGHT TO GET THE EXTENT OF THE BELTS
[S] ANY WARDROBE CHOOSES TWO SHOES
THE ARTS:
[S] SHE INFURIATED THE MUSIC
[S] ACTORS WILL MANAGE FLOATING FOR JOY
[S] THEY ARE A SCENE AWAY WITH MY THINKER
[S] IT MEANS A CONCLUSION
Conclusion
Taking a probabilistic approach can clarify some
of the central issues in semantic representation
– Motivates sensitivity to co-occurrence statistics
– Identifies how co-occurrence data should be used
– Allows the role of meaning to be specified exactly,
and finds a meaningful decomposition of language
Probabilities and Inner Products
• Single word: P(w | context) is an inner product between the row of Φ for w and the context's distribution over topics
• List of words: the same inner product, with the topic distribution inferred from the whole list
Model Selection
• How many topics does a language contain?
• Major issue for parametric models
• Not so much for non-parametric models
– Dirichlet process mixtures
– Expect more topics than are tractable
– Choice of number is choice of scale
Gibbs Sampling and EM
• How many topics does a language contain?
• EM finds fixed set of topics, single estimate
• Sampling allows for multiple sets of topics,
and multimodal posterior distributions
Natural Statistics
• Treating co-occurrence data as frequencies
preserves the natural statistics of language
• Word frequency
• Zipf’s Law of Meaning
Natural Statistics
[figures: word frequency distributions and Zipf's law of meaning, data vs. models]
Word Association

Cue: CROWN

people     model
KING       KING
JEWEL      TEETH
QUEEN      HAIR
HEAD       TOOTH
HAT        ENGLAND
TOP        MOUTH
ROYAL      QUEEN
THRONE     PRINCE
Word Association

Cue: SANTA

people      model
CHRISTMAS   MEXICO
TOYS        SPANISH
LIE         CALIFORNIA