Social Network Extraction of
Academic Researchers
Jie Tang, Duo Zhang, and Limin Yao
Tsinghua University
Oct. 29th 2007
1
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
2
Motivation
• More and more online social networks are becoming
available
– e.g., YouTube.com, Facebook.com, etc.
• However, the social networks are usually separated
• A question arises: can we build an integrated social
network from the separated ones automatically?
• As a case study, how can we build a social network
automatically for the academic community?
– ArnetMiner.org
3
Motivating Example
[Figure: the homepage of Ruud Bolle and his DBLP publication list are mapped to a structured researcher profile.]
Homepage text:
– Contact Information: Office: 1S-D58. Letters: IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598 USA. Packages: IBM T.J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532 USA. Email: bolle@us.ibm.com
– Educational history: Ruud M. Bolle was born in Voorburg, The Netherlands. He received the Bachelor's Degree in Analog Electronics in 1977 and the Master's Degree in Electrical Engineering in 1980, both from Delft University of Technology, Delft, The Netherlands. In 1983 he received the Master's Degree in Applied Mathematics and in 1984 the Ph.D. in Electrical Engineering from Brown University, Providence, Rhode Island. In 1984 he became a Research Staff Member at the IBM Thomas J. Watson Research Center in the Artificial Intelligence Department of the Computer Science Department. In 1988 he became manager of the newly formed Exploratory Computer Vision Group which is part of the Math Sciences Department.
– Research interests: Currently, his research interests are focused on video database indexing, video processing, visual human-computer interaction and biometrics applications.
– Academic services: Ruud M. Bolle is a Fellow of the IEEE and the AIPR. He is Area Editor of Computer Vision and Image Understanding and Associate Editor of Pattern Recognition. Ruud M. Bolle is a Member of the IBM Academy of Technology.
DBLP: Ruud Bolle
– Nalini K. Ratha, Jonathan Connell, Ruud M. Bolle, Sharat Chikkerur: Cancelable Biometrics: A Case Study in Fingerprints. ICPR (4) 2006: 370-373
– Sharat Chikkerur, Sharath Pankanti, Alan Jea, Nalini K. Ratha, Ruud M. Bolle: Fingerprint Representation Using Localized Texture Features. ICPR (4) 2006: 521-524
– Andrew Senior, Arun Hampapur, Ying-li Tian, Lisa Brown, Sharath Pankanti, Ruud M. Bolle: Appearance models for occlusion handling. Image Vision Comput. 24(11): 1233-1243 (2006)
– Ruud M. Bolle, Jonathan H. Connell, Sharath Pankanti, Nalini K. Ratha, Andrew W. Senior: The Relation between the ROC Curve and the CMC. AutoID 2005: 15-20
– Sharat Chikkerur, Venu Govindaraju, Sharath Pankanti, Ruud M. Bolle, Nalini K. Ratha: Novel Approaches for Minutiae Verification in Fingerprint Images. WACV 2005: 111-116
– ...
Extracted profile:
– Name: Ruud Bolle
– Photo
– Position: Research Staff
– Affiliation: IBM T.J. Watson Research Center
– Address: IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598 USA
– Address: IBM T.J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532 USA
– Email: bolle@us.ibm.com
– Homepage: http://researchweb.watson.ibm.com/ecvg/people/bolle.html
– Bsdate: 1977; Bsuniv: Delft University of Technology; Bsmajor: Analog Electronics
– Msdate: 1980; Msuniv: Delft University of Technology; Msmajor: Electrical Engineering; Msmajor: Applied Mathematics
– Phddate: 1984; Phduniv: Brown University; Phdmajor: Electrical Engineering
– Research_Interest: video database indexing; video processing; visual human-computer interaction; biometrics applications
– Publication 1#: Title: Cancelable Biometrics: A Case Study in Fingerprints; Venue: ICPR; Date: 2006; Start_page: 370; End_page: 373
– Publication 2#: Title: Fingerprint Representation Using Localized Texture Features; Venue: ICPR; Date: 2006; Start_page: 521; End_page: 524
– Publication #3, ..., Publication #5, ...
– Co-author links lead to other researcher profiles (e.g., affiliation: UIUC, position: Professor)
4
Motivating Example
[Figure: the same example as on the previous slide: the homepage of Ruud Bolle and his DBLP publication list mapped to a structured researcher profile.]
Two key issues:
• How to accurately extract the researcher profile information from the Web?
• How to integrate the information from different sources?
5
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
6
Related Work – Person Profiling
• Profile Information Extraction
– E.g., Yu et al. (2005), resume IE
– Alani et al. (2003), Artequakt system
• Contact Information Extraction
– E.g., Kristjansson et al. (2004), Interactive extraction
– Balog and Rijke (2006), Heuristic rules
• Information Extraction Methods
– E.g., HMM (Ghahramani, 1997),
– MEMM (McCallum, 2000),
– CRFs (Lafferty, 2001)
7
Related Work – Name Disambiguation
• Unsupervised Methods
– Hierarchical clustering, K-way spectral clustering, etc.
– E.g. Han (2005), Mann (2003), Tan (2006)
• Supervised Methods
– Support Vector Machines, Naïve Bayes, etc.
– E.g. Han (2004)
• Graph-based Approach
– Random Walk, etc.
– E.g. Bekkerman (2005), Malin (2005), Minkov (2006)
8
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
9
Researcher Social Network Extraction
[Figure: schema of the researcher network. A Researcher has Name, Photo, Position, Affiliation, Address, Email, Phone, Fax, Homepage, Research_Interest, Bsdate/Bsuniv/Bsmajor, Msdate/Msuniv/Msmajor, and Phddate/Phduniv/Phdmajor. A Publication has Title, Publication_venue, Date, Start_page, and End_page. The relations are Authored and Coauthor.]
• 70.60% of the researchers have at least one homepage or an introducing page
• 85.6% from universities, 14.4% from companies
• 71.9% are homepages, 28.1% are introducing pages
• 40% are in lists and tables, 60% are natural language text
• A large number of person names have the ambiguity problem
• Even 3 "Yi Li" graduated from the author's lab
• 70% moved at least one time
10
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
11
Markov Random Field
[Figure: an undirected graph over the variables Ya, Yb, Yc, Yd, Ye, Yf.]
Special Cases:
- Conditional Random Fields
- Hidden Markov Random Fields
Markov Property:
$P(Y_i \mid Y_j, Y_j \neq Y_i) = P(Y_i \mid Y_j, Y_j \sim Y_i)$
12
CRFs
- Green nodes are hidden vars, - Purple nodes are observations
[Figure: a linear chain over the tokens "He is a Professor at UIUC"; each hidden node ranges over labels such as ADR, AFF, POS, and OTH.]
$p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{e \in E,\, j} \lambda_j\, t_j(e, y|_e, x) + \sum_{v \in V,\, k} \mu_k\, s_k(v, y|_v, x) \Big)$
13
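The CRF distribution can be made concrete on a three-token chain. The following is a minimal sketch, not the authors' implementation: the label set is cut down to three labels, the node and edge weights are hand-picked hypothetical values standing in for the learned λ and μ, and Z(x) is computed by brute-force enumeration, which is only feasible for toy inputs.

```python
import math
from itertools import product

# Reduced label set for illustration (the slide also shows ADR).
LABELS = ("POS", "AFF", "OTH")

# Hypothetical weights, hand-picked for this sketch (not learned values).
# NODE_W plays the role of mu_k * s_k, EDGE_W the role of lambda_j * t_j.
NODE_W = {("Professor", "POS"): 2.0, ("UIUC", "AFF"): 2.0, ("at", "OTH"): 1.0}
EDGE_W = {("POS", "OTH"): 0.5, ("OTH", "AFF"): 0.5}


def score(tokens, labels):
    """Unnormalised log-score: sum of node and edge feature weights."""
    node = sum(NODE_W.get((t, y), 0.0) for t, y in zip(tokens, labels))
    edge = sum(EDGE_W.get(pair, 0.0) for pair in zip(labels, labels[1:]))
    return node + edge


def probability(tokens, labels):
    """p(y | x) with Z(x) obtained by enumerating every labelling."""
    z = sum(math.exp(score(tokens, cand))
            for cand in product(LABELS, repeat=len(tokens)))
    return math.exp(score(tokens, labels)) / z


def best_labeling(tokens):
    """Most probable labelling by exhaustive search (toy sizes only)."""
    return max(product(LABELS, repeat=len(tokens)),
               key=lambda cand: score(tokens, cand))
```

A real tagger would replace the enumeration with the forward algorithm and Viterbi decoding, and would learn the weights from labeled data.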
Processing Flow for Profiling
1 Preprocessing
– Determine tokens: standard word, special word, image token, term, punctuation mark
– Assigning tags: each token receives its possible tags
2 Tagging (test)
– Input docs → a unified tagging model → tagging results
– E.g., document: "Ruud M. Bolle is a Fellow of the IEEE and the AIPR. He is Area Editor of Computer Vision and Image Understanding and Associate Editor of Pattern Recognition. Ruud M. Bolle is a Member of the IBM Academy of Technology..."
3 Model Learning (train)
– Labeling data → feature definitions → learning a CRF model
– E.g., labeled data: "He obtained his BS in Computer Science in 1999..."
[Figure: token sequences such as "Ruud Bolle is a Fellow of the IEEE" and "... obtained his BS in Computer Science", each token shown with a column of candidate tags (AUC, ALC, FUC, AMC, PRV, DEL, RPA, PSB).]
14
Token Definitions
• Standard word: Words in natural language
• Special word: Including several general 'special words', e.g. email address, IP address, URL, date, number, money, percentage, unnecessary tokens (e.g. '===' and '###'), etc.
• Image token: <IMAGE src="defaul3.jpg" alt=""/>
• Term: base NP, like "Computer Science"
• Punctuation marks: Including period, question mark, and exclamation mark
15
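The five token types can be illustrated with a small classifier. This is a rough sketch under stated assumptions: the regular expressions are simplified stand-ins for whatever recognisers the system actually uses, terms (base NPs) are looked up in a caller-supplied set instead of being found by a chunker, and the type names are hypothetical.

```python
import re

# Simplified, hypothetical patterns for a few kinds of "special word".
SPECIAL_PATTERNS = [
    re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"),  # email address
    re.compile(r"^https?://\S+$"),                # URL
    re.compile(r"^\d+([.,]\d+)*%?$"),             # number or percentage
    re.compile(r"^[=#*_-]{3,}$"),                 # separators such as '==='
]
IMAGE_PATTERN = re.compile(r"^<IMAGE\b[^>]*>$", re.IGNORECASE)
PUNCTUATION = {".", "?", "!"}


def token_type(token, terms=frozenset()):
    """Map one token to a token type name used in this sketch.

    `terms` is an assumed lookup set of base noun phrases.
    """
    if IMAGE_PATTERN.match(token):
        return "image"
    if token in PUNCTUATION:
        return "punctuation"
    if token in terms:
        return "term"
    if any(p.match(token) for p in SPECIAL_PATTERNS):
        return "special"
    return "standard"
```

For instance, `token_type("Computer Science", terms={"Computer Science"})` yields the term type, while an ordinary word such as "Fellow" falls through to the standard type.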
Possible Tag Assignment
Token type          Possible tags
Standard word       All possible tags
Special word        Position, Affiliation, Address, Email, Phone, Fax, Phd/Ms/Bs-date
Image token         Photo, Email
Term token          Position, Affiliation, Address, Phd/Ms/Bs-univ, Phd/Ms/Bs-major
Punctuation marks   Position, Affiliation, Address, Email, Phone, Fax, Phd/Ms/Bs-date
16
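Restricting the candidate tags by token type shrinks the label space the tagger has to search. A minimal sketch of such a lookup follows; the dictionary keys are hypothetical type names, "Phd/Ms/Bs-date" is expanded into the attribute names used elsewhere in the slides, and "all possible tags" is approximated by the union of the tags listed here, which is an assumption.

```python
DATES = {"Phddate", "Msdate", "Bsdate"}
UNIVS = {"Phduniv", "Msuniv", "Bsuniv"}
MAJORS = {"Phdmajor", "Msmajor", "Bsmajor"}
CONTACT = {"Position", "Affiliation", "Address", "Email", "Phone", "Fax"}

# Token type -> tags that a token of that type may receive.
POSSIBLE_TAGS = {
    "special": CONTACT | DATES,
    "image": {"Photo", "Email"},
    "term": {"Position", "Affiliation", "Address"} | UNIVS | MAJORS,
    "punctuation": CONTACT | DATES,
}

# Assumption: the full tag set is the union of everything listed above.
ALL_TAGS = set().union(*POSSIBLE_TAGS.values())


def candidate_tags(token_type):
    """Return the tags allowed for a token type (standard words get all)."""
    return POSSIBLE_TAGS.get(token_type, ALL_TAGS)
```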
Feature Definition
• Content features
Standard Word
– Word features: Whether the current token is a word
– Morphological features: Whether the word is capitalized
Image Token
– Image size: The size of the image
– Image height/width ratio: The value of height/width. The value of a person photo is often larger than 1
– Image format: JPG or BMP
– Image color: The number of "unique colors" used in the image and the number of bits used per pixel, i.e. 32, 24, 16, 8, 1
– Face recognition: Whether the current image contains a person's face
– Image filename: Whether the filename contains (partially) the researcher name
– Image "ALT": Whether the "alt" of the image contains (partially) the researcher name
– Image positive keywords: "myself", "biology"
– Image negative keywords: "ads", "banner", "logo"
17
Feature Definition
• Pattern features
– Positive words: "Fax:" for Fax, "director" for Position
– Special tokens: Whether the current token is a special word
• Term features
– Term: Whether the current token is a term
– Dictionary: Whether the current token is included in a dictionary
18
Our Method for Name Disambiguation
• A hidden Markov Random Field model
• Hidden Variables Y represent the labels of publications
• Observable Variables X represent publications
• Constraints define the dependencies over hidden variables
[Figure: publications x1 to x11 with hidden labels y1=1, y2=1, y3=1, y8=1, y4=2, y5=2, y6=2, y7=2, y9=3, y10=3, y11=3; edges between the labels are constraints such as coauthor, cite, co-conference, and τ-coauthor.]
19
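Objective Function
maximize $P(Y \mid X) \propto P(Y)\, P(X \mid Y)$
$P(Y) = \frac{1}{Z_1} \exp(-V(Y)) = \frac{1}{Z_1} \exp\Big(-\sum_{N_i \in N} V_{N_i}(Y)\Big) = \frac{1}{Z_1} \exp\Big(-\sum_i \sum_j V(i, j)\Big)$
$\quad = \frac{1}{Z_1} \exp\Big(-\sum_i \sum_j \big\{ D(x_i, x_j)\, I(y_i \neq y_j) \sum_{c_k \in C} [w_k c_k(y_i, y_j)] \big\}\Big)$
$P(X \mid Y) = \frac{1}{Z_2} \exp\Big(-\sum_{x_i \in X} D(x_i, y_i)\Big)$
minimize $f_{obj} = \sum_i \sum_j \big\{ D(x_i, x_j)\, I(y_i \neq y_j) \sum_{c_k \in C} [w_k c_k(y_i, y_j)] \big\} + \sum_{x_i \in X} D(x_i, y_i) + \log Z$, where $Z = Z_1 Z_2$
20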
Constraint Definition
C    W    Constraint Name   Description
c1   w1   CoOrg             ai(0).affiliation = aj(0).affiliation
c2   w2   CoAuthor          r, s>0, ai(r)=aj(s)
c3   w3   Citation          pi cites pj or pj cites pi
c4   w4   CoEmail           ai(0).email = aj(0).email
c5   w5   Feedback          Constraints from user feedback
c6   w6   τ-CoAuthor        one common author in τ extension
Example: p1: A, B, C; p2: A, B; p3: A, D; p4: C, D
[Figure: 0/1 matrices Mp(0), Mp(1), Mp(2), Mp(3) over the example publications.]
21
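The CoAuthor and τ-CoAuthor constraints can be tried on the example publications p1 to p4. The sketch below rests on an assumed reading of "τ extension" that the slides do not spell out: two papers are linked if they share an author other than the ambiguous (principal) one, and two of the principal author's papers satisfy τ-CoAuthor if they are connected by a chain of at most τ such links. The function names are hypothetical.

```python
def shares_coauthor(authors_a, authors_b, principal):
    """CoAuthor test: a common author other than the principal one."""
    return bool((set(authors_a) - {principal}) & (set(authors_b) - {principal}))


def tau_coauthor_pairs(papers, principal, tau):
    """Pairs of the principal author's papers linked within tau steps.

    `papers` maps a paper id to its author list. The reachability
    reading of tau-extension is an assumption of this sketch.
    """
    ids = list(papers)
    adjacent = {
        p: {q for q in ids
            if q != p and shares_coauthor(papers[p], papers[q], principal)}
        for p in ids
    }
    pairs = set()
    for start in ids:
        if principal not in papers[start]:
            continue
        seen, frontier = {start}, {start}
        for _ in range(tau):  # breadth-first expansion, one link per round
            frontier = {q for p in frontier for q in adjacent[p]} - seen
            seen |= frontier
        for q in seen - {start}:
            if principal in papers[q]:
                pairs.add(frozenset((start, q)))
    return pairs


EXAMPLE = {"p1": ["A", "B", "C"], "p2": ["A", "B"],
           "p3": ["A", "D"], "p4": ["C", "D"]}
```

With A as the principal author, p1 and p2 are linked directly through B, while p1 and p3 only become linked at τ = 2, through p4, which shares C with p1 and D with p3.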
Parameterized Distance Function
• We define our distance function as follows:
$D(x_i, x_j) = 1 - \frac{x_i^T A x_j}{\|x_i\|_A \, \|x_j\|_A}$, where $\|x_i\|_A = \sqrt{x_i^T A x_i}$
• We can see that $\|x_i\|_A$ actually maps each vector $x_i$ into another new space, i.e. $A^{1/2} x_i$
• To simplify the problem, we define A as a diagonal matrix
22
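With a diagonal A the distance is a weighted cosine distance. A minimal sketch, representing A by the list of its diagonal entries (named `a` here for illustration) and assuming non-zero vectors:

```python
import math


def a_norm(x, a):
    """||x||_A = sqrt(x^T A x) for a diagonal A given by its entries a."""
    return math.sqrt(sum(am * xm * xm for am, xm in zip(a, x)))


def distance(xi, xj, a):
    """D(xi, xj) = 1 - xi^T A xj / (||xi||_A ||xj||_A)."""
    inner = sum(am * p * q for am, p, q in zip(a, xi, xj))
    return 1.0 - inner / (a_norm(xi, a) * a_norm(xj, a))
```

Down-weighting a dimension in A makes differences along that dimension count less: `distance([1, 1], [1, 0], [1, 0.01])` is much smaller than `distance([1, 1], [1, 0], [1, 1])`.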
EM Framework
• Initialization
  • use constraints to generate initial k clusters
• E-Step
  $f_{obj}(x_i, y_i) = \sum_j \big\{ D(x_i, x_j)\, I(h \neq l_j) \sum_{c_k \in C} [w_k c_k(p_i, p_j)] \big\} + D(x_i, y_i)$
• M-Step
  • Update cluster centroid: $y_h = \frac{\sum_{i: l_i = h} x_i}{\big\| \sum_{i: l_i = h} x_i \big\|_A}$
  • Update parameter matrix A:
  $\frac{\partial f_{obj}}{\partial a_m} = \sum_i \sum_j \big\{ \frac{\partial D(x_i, x_j)}{\partial a_m}\, I(l_i \neq l_j) \sum_{c_k \in C} [w_k c_k(p_i, p_j)] \big\} + \sum_{x_i \in X} \frac{\partial D(x_i, y_i)}{\partial a_m}$
  $\frac{\partial D(x_i, x_j)}{\partial a_m} = -\frac{ x_{im} x_{jm} \|x_i\|_A \|x_j\|_A - x_i^T A x_j \, \frac{ x_{im}^2 \|x_j\|_A^2 + x_{jm}^2 \|x_i\|_A^2 }{ 2 \|x_i\|_A \|x_j\|_A } }{ \|x_i\|_A^2 \, \|x_j\|_A^2 }$
23
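One E-step sweep and the centroid update can be sketched as follows. This is a simplified illustration, not the authors' code: `penalty[i][j]` is a hypothetical precomputed stand-in for the weighted constraint sum between publications i and j, A is diagonal and given by its entries `a`, points are reassigned greedily one at a time, and the update of A itself is omitted.

```python
import math


def dist(x, y, a):
    """Weighted cosine distance with diagonal A given by entries a."""
    inner = sum(am * p * q for am, p, q in zip(a, x, y))
    nx = math.sqrt(sum(am * p * p for am, p in zip(a, x)))
    ny = math.sqrt(sum(am * q * q for am, q in zip(a, y)))
    return 1.0 - inner / (nx * ny)


def e_step(xs, labels, centroids, a, penalty):
    """Reassign each point to the cluster with the lowest local cost."""
    labels = list(labels)
    for i, xi in enumerate(xs):
        def cost(h):
            # Constraint term: pairs that would end up in different clusters.
            violated = sum(dist(xi, xj, a) * penalty[i][j]
                           for j, xj in enumerate(xs)
                           if j != i and labels[j] != h)
            return violated + dist(xi, centroids[h], a)
        labels[i] = min(range(len(centroids)), key=cost)
    return labels


def update_centroids(xs, labels, centroids, a):
    """M-step centroid update: normalised sum of each cluster's members."""
    updated = []
    for h, old in enumerate(centroids):
        members = [x for x, l in zip(xs, labels) if l == h]
        if not members:  # keep the old centroid for an empty cluster
            updated.append(list(old))
            continue
        total = [sum(col) for col in zip(*members)]
        norm = math.sqrt(sum(am * t * t for am, t in zip(a, total)))
        updated.append([t / norm for t in total])
    return updated
```

With a zero penalty matrix the E-step reduces to nearest-centroid assignment; a large penalty between two publications keeps them in the same cluster even when their vectors differ.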
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
24
Profiling Experiments
• Dataset
– 1K researchers from ArnetMiner.org
• Baseline
– Amilcare
– Support Vector Machines
– Unified_NT (CRFs without transition features)
• Evaluation measures
– Precision, Recall, F1
25
Profiling Results: 5-fold cross validation
Profiling Task   Unified   Unified_NT   SVM     Amilcare
Photo            89.11     88.64        88.86   31.62
Position         69.44     64.70        64.68   56.48
Affiliation      83.52     72.16        73.86   46.65
Phone            91.10     78.72        79.71   83.33
Fax              90.83     64.28        64.17   86.88
Email            80.35     75.47        79.37   78.70
Address          86.34     75.15        77.04   66.24
Bsuniv           67.38     57.56        59.54   47.17
Bsmajor          64.20     59.18        60.75   58.67
Bsdate           53.49     40.59        28.49   52.34
Msuniv           57.55     47.49        49.78   45.00
Msmajor          63.35     61.92        62.10   57.14
Msdate           48.96     41.27        30.07   56.00
Phduniv          63.73     53.11        57.01   59.42
Phdmajor         67.92     59.30        59.67   57.93
Phddate          57.75     42.49        41.44   61.19
Overall          83.37     72.09        73.57   62.30
26
Contribution of Features
[Figure: bar chart of the F1 measure (axis from 68 to 84) for the feature sets content, content+term, content+pattern, and all.]
27
Disambiguation Experiments
• Data Sets:
Abbreviated Name dataset
Name Set    Publications   Name Variations
C. Chang    402            97
G. Wu       152            46
K. Zhang    293            40
J. Li       551            102
B. Liang    55             14
M. Hong     108            30
X. Xie      136            36
P. Xu       39             5
H. Xu       182            60
W. Yang     263            82
Real Name dataset
Name: Jing Zhang (54/25), Yi Li (42/22)
Affiliation                            Publication
Shanghai Jiao Tong Univ.               6
Dept. of Automation, Tsinghua Univ.    3
Alabama Univ.                          8
Univ. of California, Davis             4
Carnegie Mellon University             5
State Univ. of New York at Albany      4
National Univ. of Singapore            6
South China Univ. of Technology        2
George Mason Univ.                     2
Chinese Academy of Sciences            5
Univ. of Washington                    3
Nanjing Normal Univ.                   4
28
Experiment Setup
• Baseline Method
Unsupervised Hierarchical Clustering Method
• Measurement
$\text{Pairwise\_Precision} = \frac{\#\text{PairsCorrectlyPredictedToSameAuthor}}{\#\text{TotalPairsPredictedToSameAuthor}}$
$\text{Pairwise\_Recall} = \frac{\#\text{PairsCorrectlyPredictedToSameAuthor}}{\#\text{TotalPairsToSameAuthor}}$
$\text{Pairwise\_F1-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
29
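The pairwise measures can be computed directly from two label lists, one with the true author of each publication and one with the predicted cluster. A minimal sketch; the function name and the convention of returning 0.0 when a denominator is empty are choices made here, not taken from the slides.

```python
from itertools import combinations


def pairwise_scores(true_labels, pred_labels):
    """Return (precision, recall, f1) over all publication pairs."""
    pairs = list(combinations(range(len(true_labels)), 2))
    pred_same = {(i, j) for i, j in pairs if pred_labels[i] == pred_labels[j]}
    true_same = {(i, j) for i, j in pairs if true_labels[i] == true_labels[j]}
    correct = len(pred_same & true_same)
    precision = correct / len(pred_same) if pred_same else 0.0
    recall = correct / len(true_same) if true_same else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For true labels `[1, 1, 2, 2]` and predicted labels `[1, 1, 1, 2]`, three pairs are predicted to share an author, two pairs truly do, and one pair is in both sets, so precision is 1/3, recall is 1/2, and F1 is 0.4.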
Disambiguation Results
            Unsupervised Hierarchical Clustering    Constraint-based Probabilistic Framework
Name        Precision   Recall   F1                 Precision   Recall   F1
C. Chang    0.65        0.59     0.62               0.73        0.67     0.70
G. Wu       0.71        0.62     0.66               0.75        0.75     0.75
K. Zhang    0.75        0.60     0.67               0.79        0.71     0.75
J. Li       0.62        0.52     0.57               0.66        0.59     0.62
B. Liang    0.82        0.76     0.79               0.85        0.89     0.87
M. Hong     0.79        0.65     0.71               0.82        0.75     0.78
X. Xie      0.77        0.73     0.75               0.83        0.82     0.82
P. Xu       0.89        0.95     0.92               0.94        1.00     0.97
H. Xu       0.65        0.59     0.62               0.73        0.67     0.70
W. Yang     0.71        0.62     0.66               0.75        0.75     0.75
Average     0.75        0.60     0.67               0.79        0.71     0.75
30
Contribution of Different Constraints
[Figure: horizontal bar chart of the Pairwise F1-Measure (0.000 to 1.000) for Yi Li and Jing Zhang under each constraint combination: Baseline, No-Reference, No-k-Author, No-CoOrg, No-CoEmail, No-Coauthor, Reference, k-Author, CoOrg, CoEmail, CoAuthor, and All.]
31
How Profiling and Disambiguation Help
Expert Finding
• Expert finding by using a PageRank-based method
[Figure: bar chart comparing EF, EF+RPE, and EF+RPE+ND (scores from 0 to 1) on P@5, P@10, P@20, P@30, R-prec, MAP, bpref, and MRR.]
32
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
33
Summary
• Investigated the problem of researcher social
network extraction
• Proposed a unified approach to perform
profiling and a constraint-based probabilistic
model for name disambiguation
• Experimental results show that our approaches
outperform the baseline methods
• When applying them to expert finding, we obtain a
significant improvement in performance
34
Thanks!
Q&A
HP: http://keg.cs.tsinghua.edu.cn/persons/tj/
Online Demo: http://arnetminer.org
35