Estimating Firms’ Research Quotient (RQ)

Anne Marie Knott
The Harvard Business Review article, “The Trillion Dollar R&D Fix” describes a measure
of firms’ R&D effectiveness called RQ (Research Quotient) and explains how firms can
utilize it to determine the optimal level of R&D investment, R*.
At the request of the HBR editors, the article also includes an Excel example capturing the
intuition behind RQ. This is a very simplified approach to estimating RQ, and it leads to
substantial errors. In fact, the raw RQ for the fictitious firm in the Excel example is high by
a factor of two. I included a caution in the article comparing the two estimates, but it didn’t
survive the final edit.
I have received a number of inquiries from firms indicating they want to estimate RQ
internally on a regular basis. Because erroneous estimates of RQ could create worse
problems than the current problem of not knowing RQ, I have created this tutorial.
The tutorial describes how firms can estimate RQ on a regular basis. In addition to
generating top-level (company-wide) estimates for RQ using “real numbers”, the advantage
of internal estimation is the ability (in some cases) to generate division level estimates of
RQ and R*. Such division level estimates 1) inform how R&D should be allocated across
divisions, and 2) support internal benchmarking (by identifying which divisions have
higher/lower R&D effectiveness).
The tutorial has five sections. Section 1 describes the link to theory; Section 2 defines the
fully specified approach to estimation; Section 3 describes the Excel exercise and explains
why it leads to erroneous estimates; Section 4 identifies a commercial means to obtain firm
RQs; Section 5 provides links to other RQ resources.
1. Theory
The formal link between firms’ R&D spending and growth comes from endogenous growth
theory, e.g., Romer 1990. These models rely on a construct of R&D elasticity to define the
R&D spending that maximizes firm market value. The principal limitation of growth
theory from the firm perspective is that it is concerned with macroeconomics and thus
typically treats firms as identical. Accordingly, when the models are tested, R&D
elasticity is estimated at the industry level or higher. More recent growth models, e.g., Lentz
and Mortensen 2008, have accommodated firm heterogeneity, but there hasn’t been a
firm-level measure of R&D elasticity to test them directly. RQ is the missing measure.
Raw RQ is the “firm-specific output elasticity of R&D”. It is the term, γ, in the firm’s final
goods production function (equation 1). Gamma is interpreted as the percentage increase
in revenues associated with a 1% increase in R&D.
Y = K^{\alpha} L^{\beta} R^{\gamma} S^{\delta} A^{\varepsilon}    (1)
where:
Y = output
K = capital
L = labor
R = R&D
S = spillovers (R&D available for free-riding)
A = advertising
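To make the elasticity interpretation concrete, the short sketch below evaluates equation 1 while holding every input other than R&D fixed. The value of γ (0.21) is purely illustrative and the function name is hypothetical; neither is an estimate for any real firm.

```python
def revenue_multiplier(gamma: float, rd_growth: float) -> float:
    """Factor by which output Y changes when R&D grows by rd_growth
    (e.g. 0.01 for 1%), holding K, L, S and A fixed in equation 1."""
    return (1.0 + rd_growth) ** gamma

gamma = 0.21  # illustrative value only (assumption)

# Percentage change in revenue for a 1% and a 100% increase in R&D.
pct_1 = (revenue_multiplier(gamma, 0.01) - 1.0) * 100
pct_100 = (revenue_multiplier(gamma, 1.00) - 1.0) * 100

print(f"1% more R&D   -> {pct_1:.2f}% more revenue")
print(f"100% more R&D -> {pct_100:.1f}% more revenue")
```

For small changes the revenue response is approximately γ percent per 1% of additional R&D; for large changes the response is less than proportional because γ is below one in this example.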
Because γ is normally distributed, firm RQ resembles individual IQ. Both capture
problem-solving capability. For individuals, IQ is captured as the speed and accuracy of
solving problems of increasing difficulty: within any given time constraint, individuals with
higher IQ solve more problems correctly than those with lower IQ. For firms, IQ is efficiency
in solving new problems. For any given level of R&D spending, high-IQ firms will generate
more innovations, or for any given innovation, high-IQ firms will invest less in developing it.
Accordingly, the raw values of γ are mapped onto the IQ scale (mean = 100, standard
deviation = 15) to support intuition.
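One simple way to picture the mapping onto the IQ scale is a linear rescaling of the raw γ values across firms. The sketch below assumes exactly that (a z-score transformation); the tutorial does not spell out the mapping actually used, and the raw values and function name here are hypothetical.

```python
from statistics import mean, pstdev

def to_rq_scale(gammas):
    """Linearly rescale raw elasticities to mean 100, standard deviation 15.
    Assumes a plain z-score mapping across the firms supplied."""
    mu = mean(gammas)
    sigma = pstdev(gammas)  # population SD of the supplied firms (assumption)
    return [100 + 15 * (g - mu) / sigma for g in gammas]

raw = [0.05, 0.10, 0.15, 0.20, 0.25]  # hypothetical raw gammas for five firms
rq = to_rq_scale(raw)

for g, score in zip(raw, rq):
    print(f"raw gamma {g:.2f} -> RQ {score:.1f}")
```

Under this assumption a firm whose γ equals the cross-firm mean receives an RQ of 100, and each standard deviation above or below the mean moves RQ by 15 points.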
2. Fully-specified estimation
2.1 Model
We derive RQ by estimating the production function (equation 1) with a random
coefficients model that allows for heterogeneity in the output elasticity for R&D (as well as
all other inputs). A random coefficients model represents a general functional form model
which treats coefficients as being non-fixed (across members of a cross-section or over
time) and potentially correlated with the error term. Random coefficient models are those
in which each coefficient has two components: 1) the direct effect of the explanatory
variable and 2) the random component that proxies for the effects of omitted variables. The
empirical model (equation 2) is a log transform of equation 1 that models output
(value-added, Y) for firm i in year t with random coefficients for all inputs (capital, K, labor, L,
R&D, R, spillovers, S, and advertising, A) as well as the intercept:
\ln Y_{it} = (\beta_0 + \beta_{0i}) + (\beta_1 + \beta_{1i}) \ln K_{it} + (\beta_2 + \beta_{2i}) \ln L_{it} + (\beta_3 + \beta_{3i}) \ln R_{it}
    + (\beta_4 + \beta_{4i}) \ln S_{it} + (\beta_5 + \beta_{5i}) \ln A_{it} + \varepsilon_{it}    (2)
In this notation, firm i’s R&D elasticity (γ in equation 1) is β3 + β3i.
We estimate Equation 2 using the Stata program, xtmixed. xtmixed fits linear mixed
models (both fixed effects and random effects) using maximum likelihood estimation. The
random effects, β_i, are not directly estimated, but we form best linear unbiased predictions
(BLUPs) of them (and standard errors) using xtmixed postestimation.
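Once the model has been fitted, each firm's raw RQ is the coefficient on ln R in equation 2: the fixed component plus that firm's predicted random component. The sketch below shows only this final combination step; the numbers, firm labels and function name are hypothetical, and the fitting itself (done in Stata in this tutorial) is not reproduced.

```python
# Hypothetical output of a mixed-model fit: the fixed (population-average)
# coefficient on ln R and each firm's BLUP of the random deviation.
beta3_fixed = 0.12
beta3_blup = {"firm_a": 0.05, "firm_b": -0.03, "firm_c": 0.00}

def firm_rd_elasticity(fixed, blups):
    """Raw RQ per firm: fixed coefficient plus the firm's random deviation."""
    return {firm: fixed + dev for firm, dev in blups.items()}

raw_rq = firm_rd_elasticity(beta3_fixed, beta3_blup)
for firm, g in raw_rq.items():
    print(f"{firm}: raw RQ = {g:.2f}")
```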
2.2 Data and variables
We estimate firm RQ using moving 7-year panels of all US publicly traded firms engaged
in R&D. Data for the study come from the Compustat industrial annual file. Firm-level
data items include (in $MM unless otherwise stated): revenues (Yit), capital as net property,
plant and equipment (Kit), labor as full-time equivalent employees (in thousands) (Lit),
advertising (Ait), and R&D (Rit). From these primary data, we derive a secondary measure:
firm-specific spillovers (Sit), computed as the sum of the differences in knowledge
between focal firm i and rival firm j for all firms in the respective industry (2-digit SIC)
with more knowledge than firm i. Spillovers represent the knowledge firms free-ride on in
generating their products/processes. Failure to include them in the estimation leads to
substantial upward bias in gamma.
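The spillover definition above can be sketched in a few lines. The text does not say how each firm's "knowledge" is measured, so the sketch simply takes a per-firm knowledge figure as given; the firm labels, values and function name are hypothetical.

```python
def spillover_pool(knowledge):
    """For each focal firm i, sum the knowledge gaps (k_j - k_i) over every
    rival j in the same industry that has more knowledge than firm i."""
    return {
        i: sum(k_j - k_i for j, k_j in knowledge.items() if j != i and k_j > k_i)
        for i, k_i in knowledge.items()
    }

# Hypothetical knowledge figures ($MM) for three firms in one 2-digit SIC industry.
industry = {"firm_a": 120.0, "firm_b": 80.0, "firm_c": 30.0}
pool = spillover_pool(industry)

for firm, s in pool.items():
    print(f"{firm}: spillover pool = {s:.0f}")
```

The industry leader has no rivals with more knowledge, so its pool is zero, while the firm with the least knowledge has the largest pool to free-ride on.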
2.3 Getting by without the full Compustat dataset
In principle, RQ estimation does not require the full set of firms in Compustat. Non-focal
firms play three roles. First, they “bootstrap” the production function (the estimation exploits
information from all firms to better gauge what the elasticities should be for each input).
Second, they help control for year-to-year changes in economic conditions. Third, they
provide data on the spillover pool.
There should be a subset of firms (most likely the set of firms in the two-digit industry)
sufficient to support reliable estimation. If so, firms could obtain 10-K data for these firms
directly from EDGAR rather than subscribing to Compustat. Since the reliability of these
subset estimates varies across firms and industries, we recommend validating results
against the full dataset (either on your own or by providing subset estimates to Berkeley
Research Group for comparison to the master set of estimates).
Note, however, that subset estimation still requires the use of random coefficients to obtain
the firm (or division) specific coefficients.
3. Spreadsheet estimation (use for intuition only)
As mentioned in the introduction, spreadsheet estimation should only be used for
developing intuition. It does not generate reliable estimates. By contrast, I estimate RQ by
combining data from all US publicly traded firms reporting R&D, which generates precise
estimates that control for spillovers between firms and for economy-wide effects like recessions.
To generate the spreadsheet estimate, you need several years’ data on revenues, and annual
expenditures on inputs: PP&E (property, plant and equipment), labor, R&D and
advertising. We show these for a fictitious firm in Table 1 (columns 1-5).
Table 1
      1        2      3         4            5     6        7        8        9        10
Year  Revenue  PP&E   Employee  Advertising  R&D   ln(Rev)  ln(PPE)  ln(Emp)  ln(Adv)  ln(R&D)
1990  1484     518    6         219          39    7.30     6.25     1.70     5.39     3.67
1991  1646     522    6         244          47    7.41     6.26     1.81     5.50     3.86
1992  1717     581    6         270          45    7.45     6.37     1.76     5.60     3.80
1993  1642     538    5         243          42    7.40     6.29     1.55     5.49     3.75
1994  1846     533    5         287          45    7.52     6.28     1.58     5.66     3.80
1995  1991     525    5         272          45    7.60     6.26     1.55     5.60     3.80
1996  2225     551    5         285          46    7.71     6.31     1.67     5.65     3.82
1997  2541     571    6         349          50    7.84     6.35     1.70     5.85     3.92
1998  2753     596    7         362          56    7.92     6.39     1.89     5.89     4.03
1999  4010     1054   11        474          62    8.30     6.96     2.40     6.16     4.13
2000  4089     1079   11        465          63    8.32     6.98     2.40     6.14     4.14
2001  3903     1046   11        352          67    8.27     6.95     2.40     5.86     4.20
2002  4061     922    10        397          67    8.31     6.83     2.25     5.98     4.20
2003  4144     1072   9         456          76    8.33     6.98     2.19     6.12     4.33
2004  4324     1052   9         429          84    8.37     6.96     2.15     6.06     4.43
2005  4388     999    8         435          88    8.39     6.91     2.03     6.08     4.48
2006  4644     1004   8         450          99    8.44     6.91     2.03     6.11     4.60
2007  4847     976    8         474          108   8.49     6.88     2.05     6.16     4.68
2008  5273     960    8         486          111   8.57     6.87     2.12     6.19     4.71
2009  5450     955    8         499          114   8.60     6.86     2.12     6.21     4.74
2010  5534     979    8         518          119   8.62     6.89     2.12     6.25     4.78
Next, transform each variable into log form (columns 6-10 of the table). To perform the
analysis in Excel, choose “Regression” from the Data Analysis tab.[1] Designate column 6 as
your dependent variable by highlighting rows 1-21 of that column. Designate columns 7-10
as your independent variables. Then choose “Labels” to indicate the first row includes
the variable name. When you have run the analysis, Excel will open a new worksheet with
the regression results. We’ve shown you these results for the fictitious firm in Table 2.
Column 2, marked “Coefficients”, contains the elasticity for each variable in column 1. The
elasticities for PP&E, employees, advertising and R&D for the sample firm are 0.23, 0.15,
0.74 and 0.43, respectively. These coefficients are not accurate for the reasons discussed
previously. (When the fictitious firm data are combined with all publicly traded firms in a
data set that also includes knowledge spillovers, its coefficients are 0.13, 0.51, 0.18 and 0.21,
respectively.) Thus the coefficients on capital and R&D are high by roughly a factor of 2,
advertising is high by a factor of 4, and the labor coefficient is about 70% below the
fully specified estimate.
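The size of these discrepancies can be checked with a few lines of arithmetic, using only the coefficients quoted in the preceding paragraph (the variable names are illustrative):

```python
inputs = ["PP&E", "Employees", "Advertising", "R&D"]
spreadsheet = [0.23, 0.15, 0.74, 0.43]      # single-firm Excel regression
fully_specified = [0.13, 0.51, 0.18, 0.21]  # pooled estimate with spillovers

# Ratio of the spreadsheet coefficient to the fully specified coefficient.
ratios = {name: s / f for name, s, f in zip(inputs, spreadsheet, fully_specified)}

for name, r in ratios.items():
    print(f"{name:<12} spreadsheet / fully specified = {r:.2f}")
```

A ratio near 2 means the spreadsheet coefficient is about twice the fully specified one; a ratio near 0.3 means it is about 70% lower.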
[1] You need to have loaded the “Analysis ToolPak” to get the Data Analysis tab.
Table 2
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.990523
R Square            0.981137
Adjusted R Square   0.976421
Standard Error      0.070398
Observations        21

ANOVA
            df   SS         MS         F          Significance F
Regression  4    4.12428    1.03107    208.0504   1.42E-13
Residual    16   0.079294   0.004956
Total       20   4.203573

                 Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept        0.0705        0.8407          0.0838  0.9342   -1.7117    1.8527     -1.7117      1.8527
ln(PP&E)         0.2322        0.2055          1.1299  0.2752   -0.2035    0.6679     -0.2035      0.6679
ln(Employee)     0.1499        0.1633          0.9176  0.3724   -0.1963    0.4961     -0.1963      0.4961
ln(Advertising)  0.7356        0.1653          4.4498  0.0004   0.3852     1.0861     0.3852       1.0861
ln(R&D)          0.4303        0.1089          3.9509  0.0011   0.1994     0.6612     0.1994       0.6612
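For readers who prefer a script to the Excel dialog, the same single-firm regression can be sketched with an ordinary least squares fit on the log columns of Table 1. This is a minimal standard-library illustration (the function name is hypothetical), not the fully specified method of Section 2. Because the log columns are printed to only two decimals, the fitted values may differ somewhat from those in Table 2.

```python
def ols(y, columns):
    """Least-squares fit of y on an intercept plus the given columns,
    solving the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yt for r, yt in zip(rows, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))  # partial pivoting
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * z for x, z in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Columns 6-10 of Table 1 (1990-2010), as printed.
ln_rev = [7.30, 7.41, 7.45, 7.40, 7.52, 7.60, 7.71, 7.84, 7.92, 8.30, 8.32,
          8.27, 8.31, 8.33, 8.37, 8.39, 8.44, 8.49, 8.57, 8.60, 8.62]
ln_ppe = [6.25, 6.26, 6.37, 6.29, 6.28, 6.26, 6.31, 6.35, 6.39, 6.96, 6.98,
          6.95, 6.83, 6.98, 6.96, 6.91, 6.91, 6.88, 6.87, 6.86, 6.89]
ln_emp = [1.70, 1.81, 1.76, 1.55, 1.58, 1.55, 1.67, 1.70, 1.89, 2.40, 2.40,
          2.40, 2.25, 2.19, 2.15, 2.03, 2.03, 2.05, 2.12, 2.12, 2.12]
ln_adv = [5.39, 5.50, 5.60, 5.49, 5.66, 5.60, 5.65, 5.85, 5.89, 6.16, 6.14,
          5.86, 5.98, 6.12, 6.06, 6.08, 6.11, 6.16, 6.19, 6.21, 6.25]
ln_rd = [3.67, 3.86, 3.80, 3.75, 3.80, 3.80, 3.82, 3.92, 4.03, 4.13, 4.14,
         4.20, 4.20, 4.33, 4.43, 4.48, 4.60, 4.68, 4.71, 4.74, 4.78]

names = ["Intercept", "ln(PP&E)", "ln(Employee)", "ln(Advertising)", "ln(R&D)"]
coef = ols(ln_rev, [ln_ppe, ln_emp, ln_adv, ln_rd])
for name, b in zip(names, coef):
    print(f"{name:<16} {b: .4f}")
```

As with the spreadsheet, the coefficient on ln(R&D) from this fit is only an intuition-building number; it omits spillovers and the information from other firms.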
4. Alternatives to internal estimation
If you have no need to do divisional estimates, then it probably doesn’t make sense to
invest in Compustat and labor to generate RQ annually. Instead, you can obtain RQ and R*
on a regular basis via an RQ data subscription.
5. Other RQ resources
5.1 Academic articles
R&D/Returns Causality: Absorptive Capacity or Organizational IQ
RQ and endogenous firm growth
5.2 NSF grants
0965147: Firm IQ: A Universal, Uniform and Reliable Measure of R&D
Effectiveness
The Impact of R&D Practices on R&D effectiveness (RQ)
5.3 HBR article
5.4 Consulting