The Topology of WordNet: Some Metrics
Ann Devitt and Carl Vogel
Computational Linguistics Group
Trinity College Dublin, Ireland
 WordNet “sub-hierarchies”
 Multiple inheritance
 Branching Factor
 Depth versus Height
 Cluster coefficients
 Specificity pilot study
WordNet as directed acyclic graph
Node and synset interchangeable
Dimensional distribution
Overlap between hierarchies
2072 synsets: more than 1 top hierarchy
35 synsets: more than 2 top hierarchies
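One way to count how many top hierarchies a synset belongs to is to follow its parent links upward and collect the roots reached. The sketch below assumes a toy hypernym map with invented links; the node names and the `top_hierarchies` helper are illustrative, not the authors' code or real WordNet data.

```python
# Toy hypernym map (hypothetical links, not real WordNet data): node -> parents.
PARENTS = {
    "entity": [],
    "group": [],
    "organism": ["entity"],
    "people": ["group"],
    "population": ["organism", "people"],
}


def top_hierarchies(node, parents=PARENTS):
    """Return the set of root nodes reachable from `node` via parent links."""
    if not parents[node]:
        return {node}  # a node without parents is itself a top hierarchy
    roots = set()
    for parent in parents[node]:
        roots |= top_hierarchies(parent, parents)
    return roots


# Nodes that sit under more than one top hierarchy.
overlap = {n for n in PARENTS if len(top_hierarchies(n)) > 1}
print(overlap)
```

In this toy graph only the node with two parents in different hierarchies is counted as overlap.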
Some overlap examples
Abstraction and Event
948 synsets
group action
Entity and Group
250 nodes
Multiple inheritance
2.6% of nodes
 Normal distribution throughout depth
 Significantly different in different sub-hierarchies
 χ² (8, N=75180) = 324.27, p ≤ 0.001
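The reported statistic has 8 degrees of freedom, which would be consistent with a 2 × 9 contingency table (one parent vs. several parents, by sub-hierarchy), although the slides do not state the table layout. Under that assumption, a Pearson chi-square statistic can be sketched as follows; the counts are invented and use only three columns for brevity.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Invented counts: rows = single-parent / multi-parent nodes,
# columns = three hypothetical sub-hierarchies.
toy = [[10, 20, 30], [20, 20, 20]]
stat, df = chi_square(toy)
print(round(stat, 3), df)
```

With nine sub-hierarchies instead of three, the same function would give (2 − 1)(9 − 1) = 8 degrees of freedom.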
Specificity examples
Parents = 1, depth < 3
Parents > 1, depth < 3
 artefact
 office
Parents = 1, depth > 8
Parents > 1, depth > 8
sea bass
 self-condemnation
 bombardon
 palomino
Branching Factor
Number of children + 1
 Including leaf nodes
 Range: 1 – 573
 Average: 2.023
 Excluding leaf nodes:
 Average: 5.793
 97% less than 20
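The definition above (number of children + 1) can be illustrated on a small made-up graph. The `CHILDREN` map and the helper name are assumptions for illustration only.

```python
# Toy hierarchy (invented): node -> list of children.
CHILDREN = {
    "root": ["a", "b", "c"],
    "a": ["d"],
    "b": [],
    "c": [],
    "d": [],
}


def branching_factor(node, children=CHILDREN):
    """Branching factor as defined on the slide: number of children + 1."""
    return len(children[node]) + 1


factors = {n: branching_factor(n) for n in CHILDREN}

# Average over all nodes, leaves included (each leaf contributes 1).
avg_all = sum(factors.values()) / len(factors)

# Average over internal nodes only (leaves excluded).
internal = [f for n, f in factors.items() if CHILDREN[n]]
avg_internal = sum(internal) / len(internal)

print(avg_all, avg_internal)
```

Because every leaf contributes a factor of 1, a large share of leaf nodes pulls the overall average down, which matches the gap between the two averages reported above.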
Branching factor
Overall low branching factor
 Same distribution in all sub-hierarchies
 Large number of nodes in total
 Greater overall depth in paths
 Not a shallow structure
despite 55,000 leaf nodes
Depth vs Height
 Depth:
 Maximum = 18
 Normal distribution
 Height:
 Maximum = 5
 93.6% 1 or 2 nodes from a leaf node
 Zipfian distribution
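The two distance measures can be sketched on a toy graph. The slides do not say how depth and height are resolved when a node has several paths, so this sketch assumes shortest distance from the root for depth and shortest distance to the nearest leaf for height; the graph and function names are invented.

```python
from collections import deque

# Toy DAG (invented): node -> list of children. Node "c" has two parents.
CHILDREN = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d"],
    "c": ["e"],
    "d": [],
    "e": [],
}


def depths(children, root):
    """Shortest distance from the root to every node (breadth-first search)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in dist:
                dist[child] = dist[node] + 1
                queue.append(child)
    return dist


def height(node, children):
    """Shortest distance from `node` down to a leaf (0 for a leaf itself)."""
    if not children[node]:
        return 0
    return 1 + min(height(child, children) for child in children[node])


depth_of = depths(CHILDREN, "root")
height_of = {n: height(n, CHILDREN) for n in CHILDREN}
print(depth_of)
print(height_of)
```

Even in this small example most nodes are within one step of a leaf, while depth spreads over a wider range.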
Depth vs Height
Reported distributions
 the same across the different sub-hierarchies
Depth is a more informative measure
Clustering coefficient
Measure of graph connectivity
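 Ratio:
Number of connections between a node's neighbours
 Possible number of connections between those neighbours
$C_i = \frac{2 E_i}{k_i (k_i - 1)}$
where $k_i$ is the number of neighbours of node $i$ and $E_i$ is the number of connections among them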
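As a minimal sketch of the ratio above, the first-order coefficient of a node can be computed as 2E / (k(k − 1)), with k the number of neighbours and E the number of links among those neighbours (the symbol names are assumed here). The sketch treats links as undirected, which is also an assumption, and the two toy graphs are invented.

```python
from itertools import combinations


def clustering_coefficient(node, neighbours):
    """2 * E / (k * (k - 1)) for an undirected graph given as adjacency sets."""
    nbrs = neighbours[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair can be connected
    # Count pairs of neighbours that are themselves directly linked.
    links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbours[u])
    return 2 * links / (k * (k - 1))


# A pure tree: the children of "p" are not linked to each other.
TREE = {"p": {"x", "y", "z"}, "x": {"p"}, "y": {"p"}, "z": {"p"}}

# Same graph with one extra link between "x" and "y".
LINKED = {"p": {"x", "y", "z"}, "x": {"p", "y"}, "y": {"p", "x"}, "z": {"p"}}

print(clustering_coefficient("p", TREE))
print(clustering_coefficient("p", LINKED))
```

In a strictly tree-shaped hierarchy no two neighbours of a node are directly linked, so the coefficient is zero everywhere; only cross-links such as a shared parent and child can raise it.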
Cluster coefficients
First-order measure
 Not useful for WordNet
 Only 62 nodes have a coefficient > 0
 Does not form clusters readily
Cluster coefficients
Second-order measure
 Average 0.337
 Normal distribution
 May form clusters of wider diameter
Pilot Study Aims
Do people have a notion of
generality/specificity for concepts?
Do people agree on what is more/less
general/specific?
What features of WordNet do these
judgments correlate with?
Sample ranking task I
Axis, axis of rotation – (the center around
which something rotates)
 River boat – (a boat used on rivers or to ply
a river)
 Remains – (any object that is left unused or
still extant; “I threw out the remains of my
Sample ranking task II
rational motive - (a motive that can be
defended by reasoning or logical argument)
 disapproval - (the act of disapproving or
 harmony, concord, concordance (agreement of opinions)
Do people agree on what is
more/less general/specific?
Cochran Q statistic (Cochran 1950)
H0 : that any agreement between respondents is
due to chance
Overall: for 11 respondents
 Cochran's Q = 165.859
 44 degrees of freedom
 Asymptotic significance: p < .001
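For readers unfamiliar with the statistic, Cochran's Q can be sketched for a binary matrix with blocks as rows and treatments as columns. How the ranking responses were coded for the test is not stated in the slides, so the data below are invented and the function is only an illustration of the standard formula.

```python
def cochran_q(data):
    """Cochran's Q and degrees of freedom for a binary matrix.

    Rows are blocks, columns are treatments, entries are 0 or 1.
    """
    k = len(data[0])                                  # number of treatments
    col_totals = [sum(col) for col in zip(*data)]
    row_totals = [sum(row) for row in data]
    n = sum(row_totals)                               # grand total of 1s
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every row is all 0s or all 1s")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1


# Invented responses: 4 blocks, 3 treatments.
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
q, df = cochran_q(toy)
print(round(q, 3), df)
```

Q is compared against a chi-square distribution with k − 1 degrees of freedom, so the 44 degrees of freedom reported above correspond to 45 treatment columns.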
What WN features correlate?
 Less deep = more general
 Inconclusive
 Fewer sisters = more general
 Did not seem to affect judgments
 Did increase the difficulty of the task
WordNet metrics
 Inheritance: Sub-hierarchy and parentage
 Branching Factor
 Distance: depth and height
 Clustering
 Pilot study
 Suggests where to go with a larger study
Multiple Inheritance vs Depth
