LEE_SEWP_Conference_Presentation_10-19

SEWP Research Conference
October 19, 2005
Creating a Longitudinal Research Worker-Establishment Matched Dataset from Patent Data:
Description and Application to Understanding International Knowledge Flows
Jinyoung Kim (SUNY-Buffalo)
Sangjoon John Lee (Alfred University)
Gerald Marschke (SUNY-Albany)
Issues
• Construction of a longitudinal research
worker-establishment matched panel data
• Knowledge flow across national borders
• Policy implications for immigration, the labor
market, and education
• Productivity of scientific researchers
• Transmission mechanism of knowledge
Idea
• Technology spillover appears to be
geographically limited
• Firms access externally located
technology partly through hiring of and
collaboration with researchers from
outside.
We examined:
1. Trends in U.S. firms’ access to
researchers overseas and those with
foreign research experience from the late
1980s through the 1990s
2. Role of research personnel as a pathway
for the diffusion of ideas from foreign
countries to U.S. innovators
3. The firm-level determinants of accessing
innovations developed overseas.
Main findings:
a. In recent years, U.S. innovators have increasingly accessed
researchers residing in foreign countries.
• The fraction of U.S. residents with foreign research experience in
U.S. firms appears to be falling.
• U.S. pharmaceutical and semiconductor firms are increasingly
going to foreign countries to employ such researchers.
b. Retaining researchers with overseas research experience
seems to facilitate access to innovations developed
overseas.
c. In the semiconductor industry, smaller firms and older firms
are more likely to make use of the output of non-U.S. R&D.
d. In the pharmaceutical industry, younger firms are more
likely to make use of the output of non-U.S. R&D.
Outline
• Literature Review
• Data Construction Process
• Empirical findings
• Conclusions
Literature
Various mechanisms for technology and
knowledge transfer across institutional boundaries.
• Informal Contact
• Agrawal, Cockburn, and McHale (2003), Von Hippel
(1988)
• Spillovers
• Henderson, Jaffe, and Trajtenberg (REStat 1998), Jaffe
(AER 1989), Zucker, Darby, and Brewer (AER 1998),
Audretsch and Feldman (AER 1996), Mowery and Ziedonis
(NBER 2001).
• Transmission of Tacit knowledge
Feldman (1994)
• Collaboration and Hiring
Cohen, Nelson, and Walsh (Mgt Science 2002),
Almeida and Kogut (Mgt Science 1999),
Zucker, Darby, and Armstrong (NBER 2001),
Adams, Black, Clemmons, and Stephan
(NBER 2004)
Data
1. Patent Bibliographic data (Patents BIB)
• U.S. utility patents issued between January
1975 and February 2002.
• Patent ID number, patent application and
granting, patent assignee, and geographic
information (country, state, city, address) on
all inventors involved.
• The number of patents during this period is
2,493,610 and the number of inventor records is
5,105,754.
2. ProQuest Digital Dissertations Abstracts
• Author, title of dissertation, degree conferring
institution, date of degree, academic field, and
type of degree
• From over 1,000 North American graduate
schools and European universities.
• For those who earned degrees in all natural
science and engineering fields between 1945 and
2003
• 1,068,551 degree holders.
3. The Compact D/SEC
• 12,000 publicly traded firms
• at least $5 million in assets and at least 500
shareholders
• Information obtained from Annual Reports, 10-K
and 20-F filings, and Proxy Statements for those
companies.
• We select pharmaceutical and semiconductor firms in the
Compact D/SEC data by their primary SIC.
• We use only the years 1989 through 1997 because of the
patent grant lag.
4. Standard & Poor’s Annual Guide to Stocks –
Directory of Obsolete Securities
• Histories of firm ownership changes due to mergers
and acquisitions, bankruptcy, dissolution, and name
changes, updated through December 2002.
5. NBER Patent-Citations
• Collected by Hall, Jaffe and Trajtenberg (2001)
• All citations made and received by patents granted
between 1975 and 1999 (16,522,438 citation records)
6. Thomas Register
• Firm founding year
3 Steps in Data Construction
[Diagram: data sources combined in the construction: Patent BIB, ProQuest, Citation, S&P + Compact D/SEC, Thomas]
1. Identifying the same inventor among
‘same/similar’ names (Patent BIB)
2. Identifying the Ownership Structure of
Subsidiaries (Compact D/SEC, S&P)
3. Combining Patent-Inventor Data with Firm Data
and Patent Citation Data
Front page of patent
Step 1: Identifying the Same Inventor
• Inventor name variants
Adam Smith vs. Adam Smith?
Adam E. Smith vs. Adam Smith?
Adam Smyth vs. Adam Smith?
:
:
• The size of the data (1975-2002)
2,493,610 patents
5,105,754 inventor names
• Name of the inventor (last, first, middle, surname modifier)
• Street address, zip
• City, state, country
Over 16 million patent citations (A. Jaffe)
How to identify?
• Pair each name with other names and
compare
N(N-1)/2 number of unique pairs.
= (5,105,754 x 5,105,753) / 2
≈ 13 trillion pairs
• Trajtenberg (2004)
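The pair count above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `unique_pairs` is ours and is only illustrative.

```python
def unique_pairs(n: int) -> int:
    """Number of unordered pairs among n records: n(n-1)/2."""
    return n * (n - 1) // 2


# Inventor records reported for the Patents BIB data (1975-2002).
N_INVENTOR_RECORDS = 5_105_754

pairs = unique_pairs(N_INVENTOR_RECORDS)
print(f"{pairs:,} pairs (about {pairs / 1e12:.1f} trillion)")
```

A brute-force comparison of every pair is therefore impractical at this scale, which is why the matching rules that follow matter.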
How to Identify?
a. The pair is a ‘Match’ if
• Last names (SOUNDEX coded) and first
names in the pair are the same, and
• at least one of the categories below is the same:
i) Full address: same street address + city + country
ii) Self citation: same name is found in the patent
that is citing
iii) Shared partner(s): two names from the pair share
the same partner
c.f. Strong Criteria (Trajtenberg 2004)
SOUNDEX Coding Method
• Code on the way a last name sounds rather than
the way it is spelled.
• Expand the list of similar last names to overcome
the potential for inconsistent foreign name
translations into English.
PETTIT (P330000), Chang (C520000), Chiang (C520000)
• Giving letters numerical values from 1 to 6
1 for B, F, P, V; 2 for C, G, J, K, Q, S, X, Z; 3 for D, T;
4 for L; 5 for M, N; 6 for R; 0 for punctuation, H, W, Y
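The coding rule can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes the usual Soundex conventions (keep the first letter, drop zero-coded letters, collapse adjacent repeats of the same digit, let vowels and Y separate repeats while H and W do not) and pads to six digits so that it reproduces the codes shown on the slide. The function name `soundex` and the `digits` parameter are ours.

```python
# Letter-to-digit table from the slide; every other character codes to "0".
CODES = {}
for digit, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str, digits: int = 6) -> str:
    """Return the first letter plus `digits` code digits, zero padded."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    out = []
    prev = CODES.get(first, "0")  # code of the previous coded letter
    for ch in letters[1:]:
        code = CODES.get(ch, "0")
        if code == "0":
            # Assumption: vowels and Y separate repeated codes; H and W do not.
            if ch not in "HW":
                prev = "0"
            continue
        if code != prev:  # collapse adjacent letters with the same code
            out.append(code)
        prev = code
    return first + "".join(out)[:digits].ljust(digits, "0")


for surname in ("PETTIT", "Chang", "Chiang", "Smith", "Smyth"):
    print(surname, soundex(surname))
```

Under these assumptions, "Smith" and "Smyth" receive the same code, which is how spelling variants end up in the same candidate set.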
b. The pair is a ‘Match’ if
• Full last names (not SOUNDEX coded) and first names in the
pair are the same, and
• at least one of the categories below is the same:
i) Zip code
ii) Full middle name
c.f. Medium Criteria (Trajtenberg 2004)
c. The pair is a ‘Mismatch’ if the middle name initials
are different.
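Rules a, b, and c can be read as a single pairwise test. The sketch below is one possible reading, not the authors' implementation. The `Record` fields are hypothetical, the SOUNDEX code is assumed to be precomputed, rule c is assumed to override rules a and b, and a "full middle name" is assumed to mean more than one character.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One inventor record on one patent (hypothetical field layout)."""
    patent: str
    first: str
    last: str
    soundex: str                        # SOUNDEX code of the last name
    middle: str = ""
    address: str = ""                   # street address + city + country
    zip_code: str = ""
    coinventors: frozenset = frozenset()
    cited: frozenset = frozenset()      # patent IDs cited by this patent


def same(a: str, b: str) -> bool:
    """Non-empty and equal, ignoring case and surrounding blanks."""
    return bool(a) and a.strip().lower() == b.strip().lower()


def middle_initials_differ(a: Record, b: Record) -> bool:
    return bool(a.middle and b.middle) and a.middle[0].lower() != b.middle[0].lower()


def is_match(a: Record, b: Record) -> bool:
    if middle_initials_differ(a, b):    # rule c (assumed to take precedence)
        return False
    if not same(a.first, b.first):
        return False
    # Rule a: SOUNDEX last name plus address, self citation, or shared partner.
    if a.soundex == b.soundex and (
        same(a.address, b.address)
        or a.patent in b.cited
        or b.patent in a.cited
        or bool(a.coinventors & b.coinventors)
    ):
        return True
    # Rule b: full last name plus zip code or full middle name.
    if same(a.last, b.last):
        full_middle = len(a.middle) > 1 and same(a.middle, b.middle)
        return same(a.zip_code, b.zip_code) or full_middle
    return False
```

Treating missing fields as "not the same" (the `bool(a)` guard in `same`) is a design choice made here so that two blank addresses never count as agreement.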
Impose Transitivity
If A is matched to B and B is matched to C, then A is matched to C.
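Transitivity amounts to taking connected components over the matched pairs. The slides do not say how this was computed; a union-find structure is one common way to do it, sketched here with hypothetical names (`cluster`, `matched_pairs`).

```python
def cluster(n_records: int, matched_pairs):
    """Group records 0..n_records-1 so that matches are transitive.

    Returns one label per record; records with the same label are
    treated as the same inventor.
    """
    parent = list(range(n_records))

    def find(x: int) -> int:
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matched_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller index as the root so labels are stable.
            parent[max(ra, rb)] = min(ra, rb)

    return [find(i) for i in range(n_records)]


# A matched to B, and B matched to C, puts A, B, and C in one group.
labels = cluster(4, [(0, 1), (1, 2)])
print(labels)
```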
An Example
ID  Inventor name  SOUNDEX       Middle name  Co-inventor  ZIP
1   Adam Smith     Adam S530000               John Keynes  20012
2   Adam Smith     Adam S530000  Henry        John Keynes  14228
3   Adam Smith     Adam S530000  H                         14228
4   Adam Smith     Adam S530000  Henry                     14214
5   Adam Smith     Adam S530000  J
6   Adam Smyth     Adam S530000               John Keynes  14228
                                              John Keynes  14228
- Match: 1:2, 1:5, 1:6, 2:3, 2:4, 2:5, 2:6, 5:6, 3:6
- ID 5 is identified as the same inventor through transitivity
• 126 mismatches found after imposing
transitivity
• 3 categories of Mismatches
i) from data error
‘Laszlo Andra Szporny’ vs. ‘Laszlo Eszter Szporny’
ii) Inventor with 2 Middle names
iii) same Last and First names appear in the
same patent
Matching Results
• 2.3 million unique inventors (45%) out of 5.1 million names
c.f. Trajtenberg (2004)
• 1.6 million distinct inventors (37%) out of 4.3 million names.
(Our patent database is larger because it includes additional
years, 2000-2002.)
• Uses a matching criterion of the same assignee, which can bias
measured mobility among inventors.
• Assigns scores for each matching criterion.
• Instead, we apply the criterion that two inventors are not
treated as a match if their middle name initials differ.
• The SOUNDEX coding system sometimes specifies names so loosely
that apparently different last names are considered a
match.
Add Dissertation Abstract Information to Inventor data
• Match degree holders in the Dissertation
Abstract data with the Inventor data.
• The Dissertation Abstract data contains the full name of each
individual author as a single string
• Convert the last, first, middle names in the
inventor data to a string of aggregated names
• 64,507 (3 percent) Ph.D. or equivalent
degree holders out of 2.3 million uniquely
identified inventors
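The name-string step can be sketched as follows. The slides do not give the exact normalization, so everything here is an assumption: the function names, the "Last, First Middle" layout of the dissertation author string, and the choice to compare upper-cased ASCII letters only.

```python
import re


def normalize(name: str) -> str:
    """Upper-case, drop punctuation, and collapse blanks.

    Assumption: names are ASCII; accented letters would be dropped here.
    """
    return " ".join(re.sub(r"[^A-Za-z ]", " ", name).upper().split())


def inventor_key(last: str, first: str, middle: str = "") -> str:
    """Aggregate the separate inventor name fields into one string."""
    return normalize(f"{last} {first} {middle}")


def match_degrees(inventors: dict, authors: list) -> dict:
    """Map each author string to the inventor IDs with the same name key.

    inventors: {inventor_id: (last, first, middle)}
    authors:   full-name strings, assumed to be "Last, First Middle"
    """
    by_key = {}
    for inv_id, (last, first, middle) in inventors.items():
        by_key.setdefault(inventor_key(last, first, middle), []).append(inv_id)
    return {a: by_key.get(normalize(a), []) for a in authors}


# Hypothetical records for illustration only.
inventors = {1: ("Smith", "Adam", "Henry"), 2: ("Keynes", "John", "")}
result = match_degrees(inventors, ["Smith, Adam Henry", "Marx, Karl"])
print(result)
```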
Step 2: Ownership Structure of Subsidiaries
• Necessary when combining firm-level information
with the patent data file
• Patent assignee: either a parent firm or one of its subsidiaries.
• A firm identifier does not exist.
• Frequent changes in firm ownership and corporate
names: between 1989 and 1997, 152 firms were merged,
15 firms were acquired, and 145 firms changed their
names
• Firm ownership structure of subsidiaries, M&A,
and name change history
• Relate each assignee to a firm
• Makes it possible to identify the firm for which each inventor
is innovating
1. Select firms in two industries from the Compact D/SEC
• Primary SIC 2834 (pharmaceutical preparation) or
Primary SIC 3674 (semiconductor and related devices)
2. Use S&P data
• to determine whether a change in an inventor’s firm is due to firm-level M&A and/or corporate name changes.
3. List of subsidiaries in the Compact D/SEC
throughout the period 1989-1997
• not always complete
• if once a subsidiary of the firm, it is treated as a subsidiary
throughout 1989-1997
4. Combine with firms’ founding year
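The rule "once a subsidiary, a subsidiary throughout 1989-1997" can be sketched as a lookup built by pooling the yearly subsidiary lists. This is only an illustration: the input layout, the name normalization, and the firm names are hypothetical, and the sketch does not handle a subsidiary that changes parents (the slides use the S&P histories for ownership changes).

```python
def norm(name: str) -> str:
    """Crude name normalization: upper-case, drop commas and periods."""
    return " ".join(name.upper().replace(",", " ").replace(".", " ").split())


def build_assignee_map(subsidiary_lists: dict) -> dict:
    """Pool yearly subsidiary lists into one assignee-to-parent lookup.

    subsidiary_lists: {year: {parent_name: [subsidiary_name, ...]}}
    A name listed under a parent in any year maps to that parent for
    the whole period. If a name appears under two parents, the earliest
    listing wins (an assumption of this sketch).
    """
    mapping = {}
    for year in sorted(subsidiary_lists):
        for parent, subsidiaries in subsidiary_lists[year].items():
            mapping.setdefault(norm(parent), parent)
            for sub in subsidiaries:
                mapping.setdefault(norm(sub), parent)
    return mapping


# Hypothetical firms for illustration only.
lists = {
    1989: {"Acme Pharma": ["Acme Labs"]},
    1993: {"Acme Pharma": ["Beta Bio, Inc."]},
}
assignee_to_firm = build_assignee_map(lists)
print(assignee_to_firm.get(norm("BETA BIO INC")))
```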
Step 3: Combining Inventor data with firm
data and Patent Citation data
• Combine inventor file with firm-level data
• Patent-inventor-firm matched data
• Link to Hall, Jaffe, and Trajtenberg
citation data (2001)
• 16,522,438 citations for all granted patents
applied for from 1975 through 1999.
Descriptive Statistics
1975 - 2002
• 2,493,610 patents
• 2.05 inventors per patent
• 2,299,579 unique inventors
Descriptive Statistics
                      Total       Pharmaceutical  Semiconductor
Inventors (a)         2,299,579   25,609          33,683
  Total No. Patents   2.22        2.8             2.60
  No. Patents/Year    1.31        1.62            1.72
Degree holders (b)    122,168     3,399           3,941
  Total No. Patents   3.07        3.70            2.95
  No. Patents/Year    1.52        1.84            1.91
(b/a)                 5.3%*       13.3%           11.7%
* 3 percent (64,507) are Ph.D. or equivalent degree holders
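The ratios in the descriptive statistics follow directly from the reported counts. A short check, using only figures given in the slides (the variable names are ours):

```python
# Counts reported in the slides.
inventors = {"Total": 2_299_579, "Pharmaceutical": 25_609, "Semiconductor": 33_683}
degree_holders = {"Total": 122_168, "Pharmaceutical": 3_399, "Semiconductor": 3_941}
phd_holders = 64_507
patents = 2_493_610
inventor_records = 5_105_754

# Share of degree holders among inventors (b/a), in percent.
shares = {k: round(100 * degree_holders[k] / inventors[k], 1) for k in inventors}

# Share of Ph.D. or equivalent holders among all unique inventors, in percent.
phd_share = round(100 * phd_holders / inventors["Total"], 1)

# Average number of inventor records per patent.
inventors_per_patent = round(inventor_records / patents, 2)

print(shares, phd_share, inventors_per_patent)
```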
Number of Patents Granted by Year of Application
[Figure: number of patents granted (vertical axis, 0 to 180,000) by year of application (horizontal axis, 1975 to 2001)]
* Grant lag: 97% of patents are granted within the first 4 years of the
application date (Hall, Griliches, and Hausman 1986)
International Knowledge Flow
1. Trends in U.S. firms’ access to the
researchers with overseas research
experience
2. Role of research personnel as a pathway
for the diffusion of ideas from foreign
countries to the U.S.
3. The firm-level determinants of accessing
innovations developed overseas.
Inventors with Foreign Experience in US Domestic Patents
Fraction of Inventors by Foreign-Experience Type (%)

      Number of Inventors      Current Foreign        Current US Residents   Current US Residents
                               Residents              w/ Foreign Exp. †      w/o Foreign Exp.
Year  All      Pharma  Semi    All    Pharma  Semi    All    Pharma  Semi    All    Pharma  Semi
1985  42,368                   8.15                   0.99                   90.86
1986  44,828                   8.30                   1.07                   90.63
1987  48,810                   8.21                   1.13                   90.66
1988  54,947                   8.49                   1.13                   90.37
1989  59,164   2,143   1,139   8.60   14.47   9.04    1.17   2.01    1.14    90.23  83.53   89.82
1990  63,812   2,259   1,362   8.02   17.35   7.78    1.22   1.51    1.25    90.76  81.14   90.97
1991  67,657   3,332   2,791   7.76   19.09   6.02    1.26   1.23    1.22    90.98  79.68   92.76
1992  73,640   3,876   3,370   7.86   20.38   7.15    1.30   1.21    1.13    90.85  78.41   91.72
1993  80,428   4,505   4,190   8.06   25.88   7.06    1.21   1.31    1.03    90.73  72.81   91.91
1994  90,910   5,320   5,739   8.44   26.86   14.76   1.20   0.98    0.94    90.36  72.16   84.30
1995  104,775  6,629   7,450   8.78   28.87   15.18   1.13   0.87    0.86    90.08  70.25   83.96
1996  104,829  4,894   7,916   9.19   31.55   13.26   1.07   0.90    0.78    89.75  67.55   85.95
1997  119,556  6,093   9,993   9.11   29.71   15.31   1.01   0.75    0.80    89.87  69.54   83.89
† Resided in foreign countries in the previous 10 years
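The three foreign-experience types can be expressed as a simple classification rule. The sketch below is an assumption about how the rule might be applied, not the authors' code: residence is taken from a hypothetical per-inventor history keyed by year, and the 10-year window is read as the ten years before the current one.

```python
def classify(history: dict, year: int, window: int = 10, home: str = "US") -> str:
    """Classify an inventor in `year` by foreign-experience type.

    history: {year: country of residence} as observed on the inventor's
             patent records (hypothetical layout).
    """
    if history[year] != home:
        return "current foreign resident"
    lived_abroad = any(
        country != home
        for past_year, country in history.items()
        if year - window <= past_year < year
    )
    if lived_abroad:
        return "current US resident w/ foreign experience"
    return "current US resident w/o foreign experience"


# Hypothetical inventor: patents filed from Japan in 1988, then from the US.
history = {1988: "JP", 1993: "US", 1997: "US"}
for y in (1988, 1993, 1997):
    print(y, classify(history, y))
```

Because residence is only observed in years with a patent record, experience abroad that left no patent trace would go unnoticed under this reading.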
Patent-Inventor Ratio by Foreign-Experience type
[Figure: patents per inventor (vertical axis, 0.5 to 2.1) by year (horizontal axis, 1985 to 1997) for three groups: current foreign residents; current US residents w/ foreign experience; current US residents w/o foreign experience]
Variable Definition and Sample Statistics
Mean (Standard Deviation)

CITE_FRGN: Fraction of citations to patents that are assigned to foreign assignees
  Pharmaceutical 0.5505 (0.3319); Semiconductor 0.4760 (0.2850)
FRGN_EXP: = 1 if at least one inventor is residing or used to reside in one of the foreign countries where foreign assignees of cited patents are located
  Pharmaceutical 0.0734 (0.2609); Semiconductor 0.0290 (0.1677)
INVENTOR: Number of all inventors in a patent assignee firm
  Pharmaceutical 326.0 (195.7); Semiconductor 923.5 (728.6)
EMPLOYEE: Number of employees in a patent assignee firm
  Pharmaceutical 35,979 (21,833); Semiconductor 41,538 (52,501)
R&D/INV: Real R&D expenditures in 1996 constant dollars over the number of inventors in a patent assignee firm
  Pharmaceutical 31.67 (24.51); Semiconductor 12.04 (27.34)
NSIC: Number of secondary SICs assigned to a patent assignee firm
  Pharmaceutical 3.791 (1.991); Semiconductor 3.154 (1.944)
MEXP: Median experience of all inventors in a patent assignee firm
  Pharmaceutical 5.292 (1.582); Semiconductor 3.832 (1.067)
FIRMAGE: Years elapsed since the founding year of a patent assignee firm
  Pharmaceutical 77.40 (51.51); Semiconductor 36.17 (23.40)
Determinants of Citation to Foreign-Assigned Patents
Dependent variable = logit transform of CITE_FRGN

                Pharmaceutical                  Semiconductor
FRGN_EXP        3.8950    3.3876    4.3832      5.8609    5.5730    6.4162
                (4.95)    (3.92)    (3.87)      (4.18)    (3.66)    (3.75)
Log INVENTOR              1.0813    1.1595                -1.1918   -1.1702
                          (1.10)    (1.19)                (-2.69)   (-2.64)
Log EMPLOYEE              0.2124    0.1885                0.3871    0.3550
                          (0.38)    (0.34)                (1.24)    (1.14)
Log R&D/INV               0.0557    0.0488                0.0658    0.0691
                          (0.66)    (0.59)                (1.14)    (1.18)
Log NSIC                  -0.2723   -0.4079               1.1469    1.1562
                          (-0.38)   (-0.57)               (1.57)    (1.56)
Log MEXP                  -6.5845   -6.4702               -6.8640   -6.8410
                          (-4.41)   (-4.40)               (-2.76)   (-2.66)
Log FIRMAGE               -1.0956   -1.1361               2.3439    2.3771
                          (-1.96)   (-2.06)               (2.88)    (2.83)
Observations    1430      1247      1215        4316      4186      4112
R2              0.0189    0.1462    0.1539      0.0283    0.1280    0.1306

Note: Each entry shows the estimated coefficient with the t statistic in parentheses beneath it. The result for a constant term
is suppressed. The t statistic is based on the Huber-White sandwich estimator of variance.
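The logit transform of the dependent variable maps a fraction in (0, 1) onto the whole real line. A minimal sketch follows; the slides do not say how fractions of exactly 0 or 1 were handled, so this version simply rejects them, and the function names are ours.

```python
import math


def logit(p: float) -> float:
    """Logit transform log(p / (1 - p)) of a fraction strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        # How boundary values were treated in the study is not stated.
        raise ValueError("logit is undefined for fractions of 0 or 1")
    return math.log(p / (1.0 - p))


def inv_logit(z: float) -> float:
    """Inverse transform, mapping a real number back to a fraction."""
    return 1.0 / (1.0 + math.exp(-z))


# Sample means of CITE_FRGN from the variable definition table.
for industry, mean_share in (("Pharmaceutical", 0.5505), ("Semiconductor", 0.4760)):
    print(industry, logit(mean_share))
```

A share above one half gives a positive logit and a share below one half a negative one, so the coefficients in the table are on the log-odds scale rather than in percentage points.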
Conclusion
• An increase in the extent to which U.S. innovators access
researchers with foreign R&D experience in recent years
• An increase in U.S. firms’ employment of foreign-residing
researchers
• The fraction of research-active U.S. residents with foreign
research experience appears to be falling
• Possibly to capture geographically dispersed knowledge
spillovers
• Having researchers with research experience abroad
seems to facilitate access to foreign-produced knowledge.
• In the semiconductor industry, smaller firms and older firms
are more likely to make use of the output of non-U.S. R&D.
• In the pharmaceutical industry, younger firms are more
likely to make use of the output of non-U.S. R&D.
Future Extension
• The consequences of the mobility of R&D
personnel on firm R&D.
• The impact of the arrival of a researcher with a
particular set of R&D experiences on the
character and quantity of R&D done by a firm
• The importance of inter-firm mobility for
technological diffusion.
• How firms organize the R&D enterprise, the
extent of collaboration among scientists
geographically dispersed, and the extent of
interaction among scientists with different
backgrounds.