Wikitology
Wikipedia as an Ontology
Tim Finin, UMBC
Zareen Syed and Anupam Joshi
University of Maryland, Baltimore County
James Mayfield, Paul McNamee and Christine Piatko
JHU Human Language Technology Center of Excellence
Overview
• Introduction
• Wikipedia as an ontology
• Applications
• Discussion
• Conclusion
Wikis and Knowledge
• Wikis are a great way to collaborate on
knowledge encoding
– Wikipedia is an archetype for this, but there
are many examples
• Ongoing research is exploring how to
integrate this with structured knowledge
– DBpedia, Semantic Media Wiki, Freebase, etc.
• I’ll describe an approach we’ve taken and
experiments in using it
– We came at this from an IR/HLT perspective
Wikipedia data in RDF
Populating Freebase KB
Populating Powerset’s KB
AskWiki uses Wikipedia for QA
With sometimes surprising results
TrueKnowledge mines Wikipedia
Wikipedia pages as tags
Wikitology
We are exploring an approach to
deriving an ontology from Wikipedia
that is useful in a variety of language
processing tasks
Our original problem (2006)
• Problem: describe what an analyst has
been working on to support collaboration
• Idea: track documents she reads and
map these to terms in an ontology,
aggregate to produce a short list of topics
• Approach: use Wikipedia articles as
ontology terms, use document-article
similarity for the mapping, and spreading
activation for aggregation
What’s a document about?
Two common approaches:
(1) Select words and phrases that characterize the document, using TF-IDF
(2) Map the document to a list of terms from a controlled vocabulary or ontology
Approach (1) is flexible and does not require creating and maintaining an ontology; approach (2) can tie documents to a rich knowledge base. (A toy sketch of approach (1) follows.)
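For concreteness, here is a toy sketch of approach (1): rank a document's words by TF-IDF against a small corpus. The corpus, tokenizer and scoring details are illustrative only, not part of the system described in this talk.

    # Toy sketch of approach (1): pick characteristic terms by TF-IDF.
    import math
    import re
    from collections import Counter

    def tokenize(text):
        return re.findall(r"[a-z]+", text.lower())

    def tfidf_keywords(doc, corpus, k=5):
        """Return the top-k terms of `doc` ranked by TF-IDF against `corpus`."""
        tf = Counter(tokenize(doc))
        df = Counter()
        for other in corpus:
            df.update(set(tokenize(other)))
        n_docs = len(corpus)
        scores = {t: tf[t] * math.log((1 + n_docs) / (1 + df[t])) for t in tf}
        return sorted(scores, key=scores.get, reverse=True)[:k]

    corpus = ["wheat and maize crops", "organic farming and crop rotation",
              "stock markets fell sharply"]
    print(tfidf_keywords("crop rotation improves organic farming yields", corpus))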
Wikitology!
• Using Wikipedia as an ontology offers the
best of both approaches
– each article (~3M) is a concept in the ontology
– terms linked via Wikipedia’s category system
(~200k) and inter-article links
– Lots of structured and semi-structured data
• It’s a consensus ontology created and
maintained by a diverse community
• Broad coverage, multilingual, very current
• Overall content quality is high
Wikitology features
• Terms have unique IDs (URLs) and are
“self describing” for people
• Underlying graphs provide structure and
associations: categories, article links,
disambiguation, aliases (redirects), …
• Article history contains useful meta-data
for trust, provenance, controversy, …
• External sources provide more info (e.g.,
Google’s PageRank)
• Annotated with structured data from
DBpedia, Freebase, Geonames & LOD
Problems as an Ontology
Treating Wikipedia as an ontology reveals
many problems
• Uncategorized and miscategorized articles
• Single articles in too many categories
  – George W. Bush is included in about 30 categories
• Links between articles belonging to very different categories
  – John F. Kennedy has a link to “coincidence theory”, which belongs to the Mathematical Analysis / Topology / Fixed Points categories
Problems as an Ontology
• Article links in text are not “typed”
• Uneven category articulation
  – Some categories are underrepresented, whereas others have many articles
• Administrative categories, e.g.
  – Clean up from Sep 2006
  – Articles with unsourced statements
• Over-linking, e.g.
  – A mention of United States linked to the page United_states
  – Mentions of 1949 linked to the year 1949
Problems as an Ontology
Wikipedia’s infobox templates have great potential but also several problems:
• Multiple templates for the same class
• Multiple attribute names for the same property
  – E.g., six attributes for a person’s birth date
• Attributes lack domains or datatypes
  – E.g., a value can be a string or a link
Wikitology 1, 2, 3
• We’ve addressed some of these
problems in developing Wikitology
• The development has been driven by
several use cases and applications
Wikitology Use Cases
• Identifying user context in a collaboration
system from documents viewed (2006)
• Improve IR accuracy by adding
Wikitology tags to documents (2007)
• Cross document co-reference resolution
for named entities in text (2008)
• Knowledge Base population from text
(2009)
• Improve Web search engine by tagging
documents and queries (2009)
Wikitology 1.0 (2007)
• Structured Data
  – Specialized concepts (article titles)
  – Generalized concepts (category titles)
  – Inter-category and inter-article links as relations between concepts
  – Article-category links as relations between specialized and generalized concepts
• Unstructured Data
  – Article text
• Algorithms to remove useless categories and links, infer categories, and select, rank and aggregate concepts using the hybrid knowledge base
(Diagram: text, graphs and human input & editing feed the hybrid knowledge base.)
Experiments
• Goal: given one or more documents, compute
a ranked list of the top Wikipedia articles
and/or categories that describe it.
• Basic metric: document similarity between
Wikipedia article and document(s)
• Variations: role of categories, eliminating
uninteresting articles, use of spreading
activation, using similarity scores, weighing
links, number of spreading activation pulses,
individual or set of query documents, etc, etc.
Method 1
Using Wikipedia article text & categories to predict concepts
(Diagram: the input query document(s) are compared to Wikipedia article text by cosine similarity; the most similar Wikipedia articles, with their similarity scores, are mapped through the Wikipedia category graph; the output is a list of categories ranked by (1) links and (2) cosine similarity.)
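A minimal sketch of the Method 1 scoring step is below. A small in-memory dictionary of article texts and their categories stands in for the Lucene index the real system used, and scikit-learn's TF-IDF vectors and cosine similarity are illustrative choices rather than the exact implementation.

    # Sketch of Method 1: score Wikipedia articles by cosine similarity to the
    # query document, then rank the categories of the top-scoring articles.
    from collections import defaultdict
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def rank_categories(query_doc, articles, categories, top_n=20):
        """`articles` maps article title -> text; `categories` maps title -> category list."""
        names = list(articles)
        vec = TfidfVectorizer(stop_words="english")
        matrix = vec.fit_transform([query_doc] + [articles[n] for n in names])
        sims = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
        top = sorted(zip(names, sims), key=lambda x: -x[1])[:top_n]
        scores = defaultdict(float)
        for name, sim in top:                 # aggregate similarity into categories
            for cat in categories.get(name, []):
                scores[cat] += sim
        return sorted(scores.items(), key=lambda x: -x[1])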
Method 2
Using spreading activation on category link graph to get
aggregated concepts
(Diagram: the query document(s) are matched to similar Wikipedia articles by cosine similarity; the article scores feed spreading activation over the Wikipedia category graph, and the output is a set of concepts ranked by final activation score.)
Input function:  Ij = Σi Oi
Output function: Oj = Aj / (Dj * k)
Method 3
Using spreading activation on article link graph
Input: query document(s), matched to similar Wikipedia articles by cosine similarity
Edge weights: cosine similarity between linked articles
Threshold: ignore spreading activation to articles with a cosine similarity score below 0.4
Spreading activation over the Wikipedia article links graph:
Node input function:  Ij = Σi Oi * wij
Node output function: Oj = Aj / k
Output: concepts ranked by final activation score
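To make the aggregation step concrete, here is a minimal Python sketch of a single spreading-activation pulse using the Method 2 input/output functions above. The graph, the initial activations (article similarity scores) and the constant k are placeholders; the real system tuned the number of pulses and the thresholds experimentally.

    def pulse(graph, activation, k=1.0):
        """One spreading-activation pulse over `graph` (node -> list of neighbours)."""
        # Output function O_j = A_j / (D_j * k), guarding against zero-degree nodes.
        output = {}
        for j, a in activation.items():
            degree = len(graph.get(j, []))
            output[j] = a / (degree * k) if degree else 0.0
        # Input function I_j = sum of O_i over neighbours i of j.
        # Adding the incoming activation to the node's current score is one common choice.
        new_activation = dict(activation)
        for j, neighbours in graph.items():
            incoming = sum(output.get(i, 0.0) for i in neighbours)
            if incoming:
                new_activation[j] = new_activation.get(j, 0.0) + incoming
        return new_activation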
Evaluation
• An initial informal evaluation compared
results against our own judgments
• Used to select promising combinations
of ideas and parameter settings
• Formal evaluation:
– Selected Wikipedia articles for testing; removed them from the Lucene index and graphs
– For each, use methods to predict categories
and linked articles
– Compare results using precision and recall
to known categories and linked articles
Example
Prediction for Set of Test Documents
Test document titles in the set (Wikipedia articles): Crop_rotation, Permaculture, Beneficial_insects, Neem, Lady_Bird, Principles_of_Organic_Agriculture, Rhizobia, Biointensive, Intercropping, Green_manure
(One of the test concepts is marked on the slide as not being in the category hierarchy.)
Method 1 (ranking categories directly): Agriculture, Sustainable_technologies, Crops, Agronomy, Permaculture
Method 2 (2 pulses, spreading activation on the category links graph): Skills, Applied_sciences, Land_management, Food_industry, Agriculture
Method 3 (2 pulses, spreading activation on the article links graph): Organic_farming, Sustainable_agriculture, Organic_gardening, Agriculture, Companion_planting
Category prediction evaluation
(Table: precision, average precision, recall and F-measure for methods M1, SA1 and SA2 at average-similarity thresholds from 0 to 0.8.)
• Spreading activation with two pulses worked best
• Only considering articles with similarity > 0.5 was
a good threshold
Article prediction evaluation
Avg. Similarity Threshold   Precision   Average Precision   Recall   F-Measure
0                           0.28        0.5                 0.53     0.31
0.1                         0.28        0.5                 0.53     0.31
0.2                         0.32        0.56                0.58     0.35
0.3                         0.41        0.69                0.66     0.44
0.4                         0.51        0.85                0.79     0.56
0.5                         0.59        0.94                0.88     0.67
0.6                         0.53        0.91                0.9      0.63
0.7                         0.66        1                   1        0.79
0.8                         0.67        1                   1        0.8
• Spreading activation with one pulse worked best
• Only considering articles with similarity > 0.5
was a good threshold
Improving IR performance (2008-09)
• Improving IR performance for a collection
by adding semantic terms to documents
• Query with blind relevance feedback may
benefit from the semantic terms
• Initial evaluation with NIST TREC 2005
collection in collaboration with Paul
McNamee, JHU HLTCOE
• Ongoing: integration into RiverGlass
MORAG search engine
Improving IR performance
Doc: FT921-4598 (3/9/92)
... Alan Turing, described as a brilliant mathematician and
a key figure in the breaking of the Nazis' Enigma codes.
Prof IJ Good says it is as well that British security was
unaware of Turing's homosexuality, otherwise he might
have been fired 'and we might have lost the war'. In 1950
Turing wrote the seminal paper 'Computing Machinery
And Intelligence', but in 1954 killed himself ...
Turing_machine, Turing_test, Church_Turing_thesis,
Halting_problem, Computable_number, Bombe,
Alan_Turing, Recursion_theory, Formal_methods,
Computational_models, Theory_of_computation,
Theoretical_computer_science, Artificial_Intelligence
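A minimal sketch of how predicted concepts like these could be attached to a document as an extra searchable field before indexing follows; the field name and the `wikitology_tagger` callable are hypothetical stand-ins for the actual TREC indexing pipeline.

    def expand_document(doc_text, wikitology_tagger, max_tags=10):
        """Append the top Wikitology concept tags to a document before indexing.

        `wikitology_tagger` is a placeholder for whatever returns ranked concept
        names for a text (e.g. the Method 1 ranker sketched earlier)."""
        tags = wikitology_tagger(doc_text)[:max_tags]
        return {"text": doc_text,
                "wikitology_concepts": " ".join(tags)}   # extra searchable field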
Evaluation
• Mixed results on NIST evaluation
• Slightly worse on mean average
precision
• Slightly better for precision at 10
               MAP      P@10
Base           0.2076   0.4207
Base + rf      0.2470   0.4480
Concepts + rf  0.2400   0.4553
Information Extraction
• Problem: resolve entities found by a named entity recognition system across documents to KB entries
• ACE 2008, the NIST-run Automatic Content Extraction evaluation, focused on this task
  – We were part of a team led by the JHU Human Language Technology Center of Excellence
  – We used Wikitology to map document entities to KB entities
Wikitology 2.0 (2008)
(Diagram: the hybrid knowledge base combines text, graphs, databases and RDF, drawing on Wikipedia, the Freebase KB, Yago and WordNet, with human input & editing.)
Named Entity Recognition
Timothy F. Geithner, who as president of the
New York Federal Reserve Bank oversaw
many of the nation’s most powerful financial
institutions, stunned the group with the
audacity of his answer. He proposed asking
Congress to give the president broad power
to guarantee all the debt in the banking
system, according to two participants,
including Michele Davis, then an assistant
Treasury secretary.
Open Calais
Free NER service that returns results in RDF
Global Coreference Task
• Start with entities and relations produced by a within document
extraction system
– Produce ‘global’ clusters for PERSON and ORGANIZATION entities, e.g. distinguishing William Wallace (living British Lord) from William Wallace (of Braveheart fame), and merging Abu Abbas, aka Muhammad Zaydan, aka Muhammad Abbas
– Only evaluate over instances of entities with a name
• Challenges:
– Very limited development data
• ACE released 49 files in English, none in Arabic
• MITRE released English ACE05 corpus, but annotation is noisy and data has few
ambiguous entities
– Within document mistakes are propagated to cross-document system
– 10K document evaluation set required work on scalability of
approaches
Global Coreference Resolution Approach
• Serif for intra-document processing
• Entity filtering (sketched in the code below)
  – Collect all pairs of SERIF entities
  – Filter entity pairs with heuristics (e.g., string similarity of mentions) to get a high-recall set of pairs significantly smaller than the n^2 possible pairs
• Feature generation
• Training
  – Train an SVM to identify coreferent pairs
• Entity clustering
  – Cluster predicted pairs
  – Each connected component forms a global entity
• Relation identification
  – Every pair of SERIF-identified relations whose types are identical and whose endpoints are coreferent is deemed coreferent
Example from the slide:
  Document entities:
    E1: Abu Abbas was arrested …
    E2: Palestinian President Mahmoud Abbas ...
    E3: … election of Abu Mazen
    E4: … president George Bush
  Filtered pairs: E1,E2 (shared word); E1,E3 (shared word); E2,E3 (known alias)
  Features: E1,E2: character overlap: 5; E1,E2: distinct Freebase entities: true; E1,E3: character overlap: 3; E1,E3: distinct Freebase entities: false; …
  Entity clusters: {Abu Mazen, Mahmoud Abbas, Palestinian Leader} and {convicted terrorist, Muhammed Abbas, Abu Abbas}
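A minimal sketch of the entity-filtering step referenced above: keep only candidate pairs that share a name word or appear in an alias table, so that far fewer than n^2 pairs reach feature generation. The entity and alias-table structures are hypothetical.

    from itertools import combinations

    def candidate_pairs(entities, aliases):
        """`entities` is a list of dicts with "id" and "name"; `aliases` is a set of name pairs."""
        pairs = []
        for e1, e2 in combinations(entities, 2):
            w1 = set(e1["name"].lower().split())
            w2 = set(e2["name"].lower().split())
            shared_word = bool(w1 & w2)
            known_alias = ((e1["name"], e2["name"]) in aliases or
                           (e2["name"], e1["name"]) in aliases)
            if shared_word or known_alias:
                pairs.append((e1["id"], e2["id"]))
        return pairs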
Wikitology tagging
• Using Serif’s output, we produced an entity document for each entity, containing the entity’s name, its nominal and pronominal mentions, its APF type and subtype, and the words in a window around its mentions (a sketch follows below)
• We tagged the entity documents with Wikitology, producing vectors of (1) article terms and (2) categories for each entity
• We used these vectors to compute features measuring entity-pair similarity and dissimilarity
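A minimal sketch of assembling such an entity document; the `entity` dictionary layout is hypothetical, standing in for SERIF's APF output.

    def entity_document(entity):
        """Build the text that gets tagged by Wikitology for one extracted entity."""
        lines = [entity["name"],
                 entity["apf_type"],
                 entity["apf_subtype"],
                 "NAM: " + " ".join(entity["name_mentions"]),
                 "NOM: " + " ".join(entity["nominal_mentions"]),
                 "PRO: " + " ".join(entity["pronoun_mentions"]),
                 " ".join(entity["context_words"])]   # words in a window around the mentions
        return "\n".join(lines)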
Entity Document & Tags
<DOC>
<DOCNO>ABC19980430.1830.0091.LDC2000T44-E2</DOCNO>
<TEXT>
Webb Hubbell
PER
Individual
NAM: "Hubbell" "Hubbells" "Webb Hubbell" "Webb_Hubbell"
NOM: "Mr ." "friend" "income"
PRO: "he" "him" "his"
, . abc's accountant after again ago all alleges alone also and
arranged attorney avoid been before being betray but came
can cat charges cheating circle clearly close concluded
conspiracy cooperate counsel counsel's department did
disgrace do dog dollars earned eightynine enough evasion
feel financial firm first four friend friends going got grand
happening has he help him his hope house hubbell hubbells
hundred hush income increase independent indict indicted
indictment inner investigating jackie jackie_judd jail jordan
judd jury justice kantor ken knew lady late law left lie little
make many mickey mid money mr my nineteen nineties
ninetyfour not nothing now office other others paying
peter_jennings president's pressure pressured probe
prosecutors questions reported reveal rock saddened said
schemed seen seven since starr statement such tax taxes
tell them they thousand time today ultimately vernon
washington webb webb_hubbell were what's whether which
white whitewater why wife years
</TEXT>
</DOC>
Wikitology article tag vector
Webster_Hubbell 1.000
Hubbell_Trading_Post_National_Historic_Site 0.379
United_States_v._Hubbell 0.377
Hubbell_Center 0.226
Whitewater_controversy 0.222
Wikitology category tag vector
Clinton_administration_controversies 0.204
American_political_scandals 0.204
Living_people 0.201
1949_births 0.167
People_from_Arkansas 0.167
Arkansas_politicians 0.167
American_tax_evaders 0.167
Arkansas_lawyers 0.167
Wikitology derived features
• Seven features measured entity similarity using
cosine similarity of various length article or
category vectors
• Five features measured entity dissimilarity (one is sketched in the code below):
  – two PER entities match different Wikitology persons
  – two entities match Wikitology tags in a disambiguation set
  – two ORG entities match different Wikitology organizations
  – two PER entities match different Wikitology persons, weighted by 1-abs(score1-score2)
  – two ORG entities match different Wikitology orgs, weighted by 1-abs(score1-score2)
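A minimal sketch of two such Wikitology-derived pair features: cosine similarity of tag vectors, and the weighted "different top person" dissimilarity feature. The tag-vector format (dicts of tag name to score) and the `person_tags` set are assumptions for illustration.

    import math

    def cosine(u, v):
        dot = sum(u[t] * v[t] for t in u if t in v)
        norm = (math.sqrt(sum(x * x for x in u.values())) *
                math.sqrt(sum(x * x for x in v.values())))
        return dot / norm if norm else 0.0

    def pair_features(tags1, tags2, person_tags):
        """Compute one similarity and one dissimilarity feature for an entity pair."""
        top1, top2 = max(tags1, key=tags1.get), max(tags2, key=tags2.get)
        different_person = top1 != top2 and top1 in person_tags and top2 in person_tags
        weight = 1 - abs(tags1[top1] - tags2[top2]) if different_person else 0.0
        return {"article_cosine": cosine(tags1, tags2),
                "different_top_person": different_person,
                "different_top_person_weighted": weight}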
COE Features
• Character-level features
  – Exact match of NAM mentions
    • Longest mention exact match
    • Some mention exact match
    • Multiple mention exact match
    • All mention exact match
  – Partial match (the Dice score is sketched below)
    • Dice score, character bigrams
    • Dice score, longest mention character bigrams
    • Last word of longest string match
  – Matching nominals and pronominals
    • Exact match
    • Multiple exact match
    • All match
    • Dice score of mention strings
• Document-level features
  – Words
    • Dice score, words in document
    • Dice score, words around mentions
    • Cosine score, words in document
    • Cosine score, words around mentions
  – Entities
    • Dice score, entities in document
    • Dice score, entities around mentions
• Metadata features
  – Speech/text
  – News/non-news
  – Same document
  – Social context features
    • Heuristic
    • Probabilistic
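A minimal sketch of the character-bigram Dice score used by the partial-match features above: Dice(X, Y) = 2*|X ∩ Y| / (|X| + |Y|) over the sets of character bigrams.

    def char_bigrams(s):
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    def dice(a, b):
        """Dice coefficient over character bigrams of two strings."""
        ba, bb = char_bigrams(a), char_bigrams(b)
        return 2 * len(ba & bb) / (len(ba) + len(bb)) if (ba or bb) else 0.0

    print(dice("Muhammad Abbas", "Abu Abbas"))   # partial overlap between aliases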
More COE Features
• KB features – instances
  – Known alias
    • Also derived aliases from the test collection
  – BBN name match
  – Famous singleton
• KB features – semantic match
  – Entity type match
  – Sex match
  – Number match
  – Occupation match
  – Fuzzy occupation match
  – Nationality match
  – Spouse match
  – Parent match
  – Sibling match
• KB features – ontology
  – Wikitology
    • Top Wikitology category matches
    • Top Wikitology article matches
    • Different top Wikitology person
    • Different top Wikitology organization
    • Top Wikitology categories in disambiguation set
  – Reuters topics
    • Cosine score, words in document
    • Cosine score, words around mentions
  – Thesaurus concepts
    • Cosine score, words in document
    • Cosine score, words around mentions
Clustering
• Approach
– Assign score to each entity pair (SVM or heuristic)
– Eliminate pairs whose score does not exceed
threshold (0.95 for SVM runs)
– Identify connected components in the resulting graph (a minimal sketch follows this slide)
• Large clusters
– AP (good)
– Clinton (bad; conflates William and Hillary)
– Sources of large clusters varied
• Connected components clustering
• SERIF errors
• Insufficient features to distinguish separate entities
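A minimal sketch of the thresholding and connected-components step, using a small union-find; entities that never appear in an above-threshold pair remain singletons and are omitted here for brevity. The real system also had to scale this to the 10K-document evaluation set.

    def cluster(pair_scores, threshold=0.95):
        """`pair_scores` maps (entity1, entity2) -> coreference score."""
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path compression
                x = parent[x]
            return x

        def union(a, b):
            parent[find(a)] = find(b)

        for (e1, e2), score in pair_scores.items():
            if score > threshold:               # keep only confident pairs
                union(e1, e2)
        clusters = {}
        for e in parent:                        # each connected component = one global entity
            clusters.setdefault(find(e), set()).add(e)
        return list(clusters.values())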
Features with High F1 scores
• Recall that F1 = 2*P*R/(P+R)
• Variants of exact name match, in general,
especially: a name mention in one entity
exactly matches one in the other (83.1%)
• Cosine similarity of the vectors of top
Wikitology article matches (75.1%)
• Top Wikitology article for the two entities
matched (38.1%)
• An entity contained a mention that was a
known alias of a mention found in the other
(47.5%)
Feature Ablation
A post hoc feature ablation evaluation showed the contribution of the KB features
High Precision Features
• High precision/low recall features are
useful when applicable
• Features with precision > 95% include:
– A name mentioned by each entity matches
exactly one person in Wikipedia
– The entities have the same parent
– The entities have the same spouse
– All name mentions have an exact match
across the two entities
– Longest named mention has exact match
Knowledge Base Population
• The 2009 NIST Text Analysis Conference (TAC) will include a new Knowledge
Base Population track
• Goal: discover information about named
entities (people, organizations, places) and
incorporate it into a KB
• TAC KBP has two related tasks:
–Entity linking: doc. entity mention -> KB entity
–Slot filling: given a document entity mention,
find missing slot values in large corpus
KBs and IE are Symbiotic
KB info helps interpret text
Knowledge
Base
Information
Extraction
from Text
IE helps populate KBs
Planned Extensions
• Make greater use of data from Linked
Open Data (LOD) resources: DBpedia,
Geonames, Freebase
• Replace ad hoc processing of RDF data
in Lucene with a triple store
• Add additional graphs (e.g., derived from infobox links) and develop algorithms to exploit them
• Develop better hybrid query creation tools
Wikitology 3.0 (2009)
(Diagram: application-specific algorithms sit on top of the Wikitology code, which is backed by an IR collection of articles, a relational database, a triple store with an RDF reasoner, and several graphs, including the category, page link and infobox link graphs, plus linked Semantic Web data & ontologies.)
Challenges
• Wikitology tagging is expensive
– ~3 seconds/document
– ACE English: ~150K entities (~24 hr on Bluegrit)
– A spreading activation algorithm on the underlying
graphs improves accuracy at even more cost
• Exploit the RDF metadata and data and
the underlying graphs
– requires reasoning and graph processing
• Extract entities from Wiki text to find more
relations
– More graph processing
Wikipedia’s social network
• Wikipedia has an implicit ‘social network’
that can help disambiguate PER
mentions
• Resolving the PER mentions in a short document to KB people who are linked to one another in the KB is good evidence for the mapping
• The same can be done for the network
of ORG and GPE entities
WSN Data
• We extracted 213K people from the
DBpedia’s Infobox dataset, ~30K of which
participate in an infobox link to another
person
• We extracted 875K people from Freebase, 616K of which were linked to Wikipedia pages and 431K of which appear in one of 4.8M person-person article links
• Consider a document that mentions two
people: George Bush and Mr. Quayle
Which Bush & which Quayle?
Six George Bushes
Nine Male Quayles
A simple closeness metric (a code sketch follows the examples below)
Let Si = the set of nodes within two hops of person i
Cij = |intersection(Si, Sj)| / |union(Si, Sj)|
Cij > 0 for six of the 54 possible pairs:
0.43 George_H._W._Bush -- Dan_Quayle
0.24 George_W._Bush -- Dan_Quayle
0.18 George_Bush_(biblical_scholar) -- Dan_Quayle
0.02 George_Bush_(biblical_scholar) -- James_C._Quayle
0.02 George_H._W._Bush -- Anthony_Quayle
0.01 George_H._W._Bush -- James_C._Quayle
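A minimal sketch of this closeness metric over a person-person link graph represented as an adjacency dictionary, a stand-in for the graph extracted from DBpedia and Freebase.

    def two_hop(graph, node):
        """Nodes within two hops of `node`, excluding the node itself."""
        hop1 = set(graph.get(node, []))
        hop2 = set()
        for n in hop1:
            hop2.update(graph.get(n, []))
        return (hop1 | hop2) - {node}

    def closeness(graph, a, b):
        """Jaccard overlap of the two-hop neighborhoods of a and b."""
        sa, sb = two_hop(graph, a), two_hop(graph, b)
        return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0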
Application to TAC KBP
• Using entity network data extracted from
Dbpedia and Wikipedia provides
evidence to support KBP tasks:
– Mapping document mentions into
infobox entities
– Mapping potential slot fillers into
infobox entities
– Evaluating the coherence of entities
as potential slot fillers
Next Steps
• Construct a Web-based API and demo
system to facilitate experimentation
• Process Wikitology updates in real-time
• Exploit machine learning to classify
pages and improve performance
• Better use of the compute cluster using Hadoop, etc.
• Exploit cell technology for spreading
activation and other graph-based
algorithms
– e.g., recognize people by the graph of
relations they are part of
DBpedia ontology
• DBpedia 3.2 (Nov 2008) added a manually constructed ontology with
  – 170 classes in a subsumption hierarchy
  – 880K instances
  – 940 properties with domain and range
• Instances per class: Place 248,000; Person 214,000; Work 193,000; Species 90,000; Org. 76,000; Building 23,000
• Properties per class: Person 56; Organisation 50; Place 110
• A partial, manual mapping was constructed from infobox attributes to these terms
• Current domain and range constraints are “loose”
• Namespace: http://dbpedia.org/ontology/
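The ontology namespace above can be queried from DBpedia's public SPARQL endpoint. Below is a minimal sketch using the SPARQLWrapper library; the endpoint URL and the dbo:birthDate property reflect DBpedia's public setup rather than anything stated in this talk, and may change over time.

    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://dbpedia.org/sparql")
    sparql.setQuery("""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        SELECT ?person ?birth WHERE {
            ?person a dbo:Person ;
                    dbo:birthDate ?birth .
        } LIMIT 10
    """)
    sparql.setReturnFormat(JSON)
    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["person"]["value"], row["birth"]["value"])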
Exploiting Linked Data
Conclusion
• Our initial applications show that the Wikitology idea has merit
• Wikipedia is increasingly being used as
a knowledge source of choice
• Easily extendable to other wikis and
collaborative KBs, e.g., Intellipedia
• Serious use may require exploiting
cluster machines and cell processing
• We need to move beyond Wikipedia to
exploit the LOD cloud