ppt - CLAIR

COMS 6998-06 Network Theory

Week 11

Dragomir R. Radev

Wednesdays, 6:10-8 PM

325 Pupin Terrace

Fall 2010

(29) Bibliometrics

Early work

• The Science Citation Index (1960)

– More than 8,700 journals in the natural and social sciences

• Eugene Garfield

• de Solla Price – study of networks of papers and citation patterns

Recent systems

• Citeseer

• Rexa

• Google Scholar

• ACL Anthology Network

Garfield’s indices

• Journal citation reports

• Impact factor:

– Computed over a three-year period as B/A, where

• First two years: A = number of citable items

• Third year: B = the number of citations to them

• In science (2006)

– Science (30.03)

– Nature (26.68)

– PNAS (9.64)
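A minimal sketch of the B/A ratio described above. The function name and the journal counts are hypothetical, chosen only for illustration; they are not taken from Journal Citation Reports.

```python
def impact_factor(citable_items: int, citations: int) -> float:
    """Impact factor B/A for a three-year window.

    citable_items (A): citable items published in the first two years.
    citations (B): citations received in the third year by those items.
    """
    if citable_items <= 0:
        raise ValueError("A (citable items) must be positive")
    return citations / citable_items


# Hypothetical journal: 400 citable items in years 1-2, cited 1,200 times in year 3.
print(impact_factor(citable_items=400, citations=1200))  # 3.0
```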

Criticism

• Favors certain fields and types of research

• Absolute value is meaningless

• Ignores certain types of scholarly work (e.g., books, software, conference papers)

• Possible to manipulate

• Self-citations

• Ignores citation type (this applies to all other metrics!)

Citation types

[Weinstock 1971]

Networks of scientific papers

(1965)

• In a given year, about 35% of all existing papers are not cited at all. Another 49% are cited only once. The rest (16%) are cited an average of 3.2 times each.

• Exponent of the power-law citation (in-degree) distribution is about 2.5-3.0

• 7% annual growth

• Most papers are obsolete after 10 years

De Solla Price 1965
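The percentages above can be combined into two back-of-the-envelope numbers: the mean number of citations a paper receives per year, and the doubling time implied by 7% annual growth. This is a sketch of the arithmetic only; the variable names are illustrative, and the doubling time assumes steady compound growth.

```python
import math

# Shares of existing papers by citations received in a given year (Price 1965).
uncited, cited_once = 0.35, 0.49
rest = 1.0 - uncited - cited_once            # remaining share of papers
mean_citations = cited_once * 1 + rest * 3.2  # uncited papers contribute 0

# Years for the literature to double at 7% compound annual growth.
doubling_time = math.log(2) / math.log(1.07)

print(round(rest, 2), round(mean_citations, 3), round(doubling_time, 1))
```

Under these figures a paper is cited roughly once per year on average, and the literature doubles in a little over 10 years.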

Miscellaneous metrics

• Citation count

• Impact factor

• Pagerank (e.g., http://www.eigenfactor.org/)

• H-index

H-index

• Proposed by Jorge Hirsch of UCSD in 2005

• Equals the largest number h such that h of your papers have each been cited at least h times.

• For physicists, 12 = tenure, 18 = full prof, 45 = NAS (statement by Hirsch)

• See demo (ACL Anthology Network)

• Also: PoP (guess what it means?)
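A minimal sketch of the h-index computation: sort the citation counts in decreasing order and find the last rank whose paper has at least that many citations. The function name and the sample citation lists are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break
    return h


print(h_index([10, 8, 5, 4, 3]))   # 4
print(h_index([100, 100, 2]))      # 2
print(h_index([2, 2, 2]))          # 2
```

The last two calls give the same score for very different citation distributions, which is one reason two people with equal h can be hard to compare.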

Criticism

• Galois’s h-index is 2 (short career)

• Hard to compare two people with the same score but very different distribution

• Hugely different based on the underlying database

Example

[Table: h-index by database, columns AAN | Google Scholar | Name; row alignment of the numeric columns not preserved]

Names: Ken Church (Google Scholar: 38), Kevin Knight, Ralph Grishman, Aravind Joshi, Hermann Ney, Fernando Pereira, David Yarowsky, Michael Collins, Chris Manning, Daniel Marcu, Kathy McKeown, Robert Mercer, Franz Och, Yves Schabes, Stuart Shieber, Eric Brill, Eugene Charniak, Ido Dagan, Mark Johnson, Philip Resnik

AAN: 12, 12, 12, 12, 12, 12, 12, 15, 14, 14, 14, 14, 13, 12, 11, 11, 11, 11, 11, 16

Google Scholar: 38, 35, 25, 25, 34, 32, 32, 39, 32, 30, 33, 45, 45, 30, 24, 23, 37, 24, 25, 30

88 Hector Garcia-Molina (Stanford), ACM Fellow, Member of the National Academy of Engineering

81 Jeffrey D. Ullman (Stanford), ACM Fellow, Member of the National Academy of Engineering

76 Robert Tarjan (Princeton), Turing Award, ACM Fellow, Member of the National Academy of Engineering

75 Deborah Estrin (UCLA), ACM Fellow, IEEE Fellow

75 Don Towsley (U Mass, Amherst), ACM Fellow, IEEE Fellow

73 Ian Foster (Argonne National Laboratory & U Chicago)

71 Scott Shenker (Berkeley), ACM Fellow, IEEE Fellow

70 David Culler (Berkeley), ACM Fellow, Member of the National Academy of Engineering

68 Takeo Kanade (CMU), ACM Fellow, IEEE Fellow, Member of the National Academy of Engineering

61 Mario Gerla (UCLA), IEEE Fellow

61 Nick Jennings (U Southampton), Fellow of the Royal Academy of Engineering

58 Anil K. Jain (Michigan State U), ACM Fellow, IEEE Fellow

57 Demetri Terzopoulos (UCLA), ACM Fellow, IEEE Fellow, Member of the European Academy of Sciences

56 Randy H. Katz (Berkeley), ACM Fellow, IEEE Fellow, Member of the National Academy of Engineering

56 Steven Salzberg (U Maryland)

55 Jennifer Widom (Stanford), ACM Fellow, Member of the National Academy of Engineering

54 Jack Dongarra (U Tennessee), ACM Fellow, IEEE Fellow, Member of the National Academy of Engineering

54 David E. Goldberg (UIUC)

54 Ken Kennedy (Rice), ACM Fellow, IEEE Fellow, Member of the National Academy of Engineering

54 Amir Pnueli (Weizmann and New York University), Turing Award, ACM Fellow, Member of the National Academy of Engineering

54 Herbert A. Simon (CMU), Turing Award, ACM Fellow, Nobel Laureate

53 Sally Floyd (ICSI), ACM Fellow

53 Tomaso Poggio (MIT)

53 Eduardo Sontag (Rutgers), IEEE Fellow

52 Rakesh Agrawal (Microsoft), ACM Fellow, IEEE Fellow, Member of the National Academy of Engineering

52 Stanley Osher (UCLA), Member of the National Academy of Sciences

52 Christos H. Papadimitriou (Berkeley), ACM Fellow, Member of the National Academy of Engineering

51 Jiawei Han (UIUC), ACM Fellow

51 Richard Karp (Berkeley), Turing Award, ACM Fellow, Member of the National Academy of Engineering

51 Alex Pentland (MIT)

[using PoP; collected by Jens Palsberg (UCLA)] http://www.cs.ucla.edu/~palsberg/h-number.html

Recent study (An et al. 2004)

• 31.5% of the papers have been cited.

• In-degree power law coefficient 1.71

• Diameters:

– Neural networks (n=23,371) d=24, ud=18

– Automata (n=28,168) d=33, ud=19

– Software eng (n=19,018) d=22, ud=16

• Largest connected components:

– NN WCC=79.6%

– Automata WCC=92%

– SE WCC=87.9%
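A sketch of how a WCC percentage like the ones above can be computed from a list of citation edges, using union-find and ignoring edge direction. The function name and the toy graph are hypothetical; this is not the procedure or data of An et al.

```python
from collections import Counter


def largest_wcc_fraction(num_nodes, edges):
    """Fraction of nodes in the largest weakly connected component.

    edges: iterable of (citing, cited) pairs with node ids in range(num_nodes).
    Edge direction is ignored, which is what "weakly connected" means.
    """
    if num_nodes == 0:
        return 0.0
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv               # merge the two components

    sizes = Counter(find(x) for x in range(num_nodes))
    return max(sizes.values()) / num_nodes


# Toy citation graph (hypothetical): papers 0-3 are linked, paper 4 is isolated.
toy_edges = [(1, 0), (2, 0), (3, 2)]
print(largest_wcc_fraction(5, toy_edges))  # 0.8
```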

Collaboration networks

Many reasons why people collaborate:

[Beaver 2001; Glaenzel 2003]

[Paul Erdos]

(23) The Ising model

(24) Percolation on graphs

What is percolation [Grimmett 1999]

• Will water flow through a porous stone?

• Let p be the probability that an edge is open.

• This process is called “bond percolation”

• Paths (percolation) appear at p = 0.5059. This is a quintessential example of a phase transition.

[Figure: q(p) plotted against p, with both axes running to 1 and the curve reaching the point (1,1)]
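Bond percolation can be explored with a small simulation. The sketch below assumes an n x n square grid (the slides do not name the lattice), opens each bond independently with probability p, and uses union-find to test whether the left and right sides are connected. All function names are hypothetical.

```python
import random


def bond_percolates(n, p, rng):
    """Return True if open bonds connect the left and right sides of an n x n grid."""
    parent = list(range(n * n + 2))
    left, right = n * n, n * n + 1        # virtual nodes for the two sides

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for r in range(n):
        union(r * n, left)                 # first column touches the left side
        union(r * n + n - 1, right)        # last column touches the right side
        for c in range(n):
            site = r * n + c
            if c + 1 < n and rng.random() < p:   # bond to the right neighbor
                union(site, site + 1)
            if r + 1 < n and rng.random() < p:   # bond to the neighbor below
                union(site, site + n)
    return find(left) == find(right)


def crossing_frequency(n, p, trials, seed=0):
    """Fraction of random grids in which a left-right crossing exists."""
    rng = random.Random(seed)
    return sum(bond_percolates(n, p, rng) for _ in range(trials)) / trials


for p in (0.3, 0.5, 0.7):
    print(p, crossing_frequency(n=30, p=p, trials=50))
```

On a finite grid the jump from "almost never crosses" to "almost always crosses" is smoothed out; it sharpens as n grows.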

• Example: ferromagnetism. The Curie point is the temperature above which spontaneous magnetization disappears

• Generic example of a magnetic field:

[http://ibiblio.org/e-notes/Perc/ising.htm]

The Ising model

• Given a lattice in D-dimensional space.

• Each vertex can be -1 or 1.

• Configurations: specific assignments of -1 and 1

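• The energy of a configuration S is $E(S) = -J \sum_{\langle i,j \rangle} s_i s_j - H \sum_i s_i$, where the first sum runs over nearest-neighbor pairs, J is the coupling strength, and H is the external magnetic field

• In statistical physics: $P(S) \sim e^{-\beta E(S)}$, with $\beta = 1/(k_B T)$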

[http://ibiblio.org/e-notes/Perc/trans.htm]

Demo

• http://webphysics.davidson.edu/applets/ising/default.html

• http://stp.clarku.edu/simulations/ising/ising2d.html

• http://www.phy.syr.edu/courses/ijmp_c/Ising.html

• Ferromagnetic alignment (J>0)

• Temperature tends to break the alignment: causes the spins to randomly change their values

• External magnetic field tends to support the alignment
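The competition between alignment and temperature can be sketched with a Metropolis simulation. This is an illustration under stated assumptions, not the method of the linked demos: a 2D square lattice with periodic boundaries, the standard energy E(S) = -J Σ s_i s_j - H Σ s_i over nearest-neighbor pairs, and the Metropolis acceptance rule. All function names are hypothetical.

```python
import math
import random


def energy(spins, J=1.0, H=0.0):
    """E(S) = -J * sum over neighbor pairs of s_i*s_j - H * sum_i s_i."""
    n = len(spins)
    total = 0.0
    for r in range(n):
        for c in range(n):
            s = spins[r][c]
            # Count each bond once: right and down neighbors only.
            total -= J * s * (spins[r][(c + 1) % n] + spins[(r + 1) % n][c])
            total -= H * s
    return total


def metropolis_sweep(spins, beta, rng, J=1.0, H=0.0):
    """Attempt n*n single-spin flips, each accepted with prob. min(1, exp(-beta*dE))."""
    n = len(spins)
    for _ in range(n * n):
        r, c = rng.randrange(n), rng.randrange(n)
        s = spins[r][c]
        neighbors = (spins[r][(c + 1) % n] + spins[r][(c - 1) % n]
                     + spins[(r + 1) % n][c] + spins[(r - 1) % n][c])
        dE = 2.0 * s * (J * neighbors + H)   # energy change if s flips to -s
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            spins[r][c] = -s


def magnetization(spins):
    """Mean spin value, between -1 and 1."""
    n = len(spins)
    return sum(map(sum, spins)) / (n * n)


rng = random.Random(0)
for beta in (0.1, 1.0):                      # high vs. low temperature
    spins = [[1] * 16 for _ in range(16)]    # start fully aligned
    for _ in range(200):
        metropolis_sweep(spins, beta, rng)
    print(beta, magnetization(spins))
```

At small beta (high temperature) the initial alignment is destroyed and the magnetization drifts toward 0; at large beta the aligned state persists.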

Site percolation

• The critical value is around 0.59 but has not been derived analytically.
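In site percolation the vertices, rather than the edges, are open with probability p. Since the threshold is only known numerically, simulation is the natural way to probe it. The sketch below assumes an n x n square grid, for which 0.59 is the usual estimate, and checks by breadth-first search whether occupied sites connect the top row to the bottom row. Function names are hypothetical.

```python
import random
from collections import deque


def site_percolates(n, p, rng):
    """Return True if occupied sites connect the top and bottom rows of an n x n grid."""
    occupied = [[rng.random() < p for _ in range(n)] for _ in range(n)]
    queue = deque((0, c) for c in range(n) if occupied[0][c])
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        if r == n - 1:
            return True                    # reached the bottom row
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < n and 0 <= nc < n
                    and occupied[nr][nc] and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


def spanning_frequency(n, p, trials, seed=0):
    """Fraction of random grids that contain a top-to-bottom cluster."""
    rng = random.Random(seed)
    return sum(site_percolates(n, p, rng) for _ in range(trials)) / trials


for p in (0.4, 0.59, 0.8):
    print(p, spanning_frequency(n=40, p=p, trials=100))
```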

Demo

• http://theorie.physik.uni-wuerzburg.de/~reents/ComputationalPhysics/percgr.html

• http://ibiblio.org/e-notes/Perc/perc.htm

• http://ibiblio.org/e-notes/Perc/distr.htm

• http://stp.clarku.edu/simulations/

(15) Diffusion on graphs

Epidemics in small worlds

• Epidemic = in the limit of a large graph, a nonzero fraction is infected.

• Fully mixed networks – everyone is connected to everyone the same way.

• In real life this is not true.

• Let f = average number of shortcuts per vertex.

• Let k = 1: every vertex is connected to at least its one nearest neighbor.

• For large L (#vertices), the prob. that two random vertices have a shortcut is:

$$1 - \left[1 - \frac{2}{L^2}\right]^{kfL} \approx \frac{2kf}{L}$$

Moore and Newman 2000. Epidemics and Percolation in small-world networks.
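A quick numeric check of the large-L approximation for the shortcut probability, comparing 1 - (1 - 2/L^2)^(kfL) with 2kf/L. The function names and the value f = 0.1 are illustrative assumptions; `log1p` and `expm1` are used only to keep the exact expression accurate when 2/L^2 is tiny.

```python
import math


def shortcut_probability(L, k, f):
    """Exact expression 1 - (1 - 2/L^2)^(k*f*L)."""
    return -math.expm1(k * f * L * math.log1p(-2.0 / L**2))


def shortcut_probability_approx(L, k, f):
    """Large-L approximation 2*k*f/L."""
    return 2.0 * k * f / L


for L in (100, 10_000, 1_000_000):
    exact = shortcut_probability(L, k=1, f=0.1)
    approx = shortcut_probability_approx(L, k=1, f=0.1)
    print(L, exact, approx, abs(exact - approx) / exact)
```

The relative error shrinks as L grows, which is why the simpler form 2kf/L is used for large graphs.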

Moore and Newman 2000 cont’d


More recent work

• Newman 2002

– Outbreak size distribution

– Degree of infected individuals

– Bipartite graphs
