Prepared for the MISRC/CRITO Symposium on the Digital Divide Carlson School of
Management, University of Minnesota, August 2004. NOT TO BE QUOTED WITHOUT
PERMISSION OF AUTHORS
DETERMINANTS OF INTERNET UTILIZATION IN THE US:
DEMOGRAPHIC AND SPATIAL-ECONOMIC FACTORS
John B. Horrigan
Pew Internet and American Life Project
1100 Connecticut Ave, NW Suite 710
Washington, DC 20036
202/557-3465 (voice), jhorrigan@pewinternet.org
Chandler Stolp
LBJ School of Public Affairs
University of Texas at Austin
Austin, TX 78712
512/471-8951, stolp@mail.utexas.edu
Robert H. Wilson
LBJ School of Public Affairs
University of Texas at Austin
Austin, TX 78712
512/475-7906 (voice), 475-7909 (fax), rwilson@mail.utexas.edu
Key Words: Internet utilization, Internet users, Geography and the Internet
Abstract
Networked information technologies have become a critical component of today’s society. The Internet
segment of this industry is believed to have reached 945 million users worldwide in 2004 and has been
forecast to reach nearly 1.5 billion users by 2007. Electronic commerce has exceeded expectations, with
$2.4 trillion in business-to-business e-commerce and $95 billion in consumer e-commerce conducted in the
United States in 2003. Given this expanded use of the Internet, a series of questions emerges concerning
who uses the Internet and for what purposes.
This paper utilizes a unique data set of over 25,000 survey responses generated by the Pew Internet and
American Life Project in 2000. Several multivariate logit models of Internet use are estimated as a
function of several demographic characteristics, such as gender, educational attainment, age, income, and
workforce status, and spatial-economic factors. Although the demographic characteristics largely explain
utilization, the spatial-economic factors associated with location of users play a secondary, but
distinguishable, role.
Acknowledgements: The authors wish to thank Ms Mee Young Han, formerly a graduate student at the LBJ
School of Public Affairs, for her valuable assistance in developing the database and statistical analysis. In
addition, the authors acknowledge the Pew Internet and American Life Project for making the data
available and the Mike Hogg Urban Policy Professorship for research support of the project.
Networked information technologies have become a critical component of today’s
society. The Internet segment of this industry is believed to have reached 945 million
users worldwide in 2004 and has been forecast to reach nearly 1.5 billion users by 2007
(ClickZ, 2004). Electronic commerce has exceeded expectations from the heady days of
the dot-com boom, with $2.4 trillion in business-to-business e-commerce and $95 billion
in consumer e-commerce conducted in the United States in 2003 (Business Week, 2003).
The utilization of the Internet is of great interest to many parties.
Internet service providers (ISPs) must understand opportunities for market
expansion. For the remarkable range of businesses utilizing the Internet to reach
customers, knowledge of demographic factors affecting utilization is critical.
Recognizing the fundamental importance of Internet communications in contemporary
society, social scientists and communications researchers are keenly interested in the
reasons for and the consequences of utilization as well as in determining who is not using
the Internet and why.
Public policy decisionmakers also have broad interest in Internet utilization. The
high levels of telephony penetration in the US are the result of a public commitment to
universal service, dating from the 1930s. Today’s policymakers are facing similar
challenges concerning Internet utilization, a debate often framed as the digital divide. In
addition, a wide range of Internet-based applications in the provision of public services
and in management of public infrastructure systems has proven effective, thus raising
additional issues concerning citizen access. Fiscal pressures at all levels of government
have forced decisionmakers to explore possible productivity-enhancing applications of
advanced telecommunication systems. The feasibility of e-government, including
electronic access to services and information, will be at least partially determined by
success in making Internet access universal. In addition, the geography of Internet
deployment and utilization may have important implications for regional development
policy.
This study utilizes a unique data set of over 25,000 responses to a survey in the
US, generated by the Pew Internet and American Life Project (Pew Project) in 2000, to
examine several questions relating to Internet use. The paper first reviews the existing
literature on demographic determinants of Internet usage, such as gender, educational
attainment, age, income, and workforce status. Gross utilization rates by demographic
categories are generated with the Pew data and these are contrasted with the findings in
the literature. A series of logit models is then estimated to provide for more complex
modeling of demographic and spatial-economic determinants of usage. Characteristics of
the 2,576 counties in which respondents live are incorporated into the modeling of
Internet use. The paper concludes with suggestions for further research.
1.0 Determinants of Internet Use: Findings in the Literature
The demographic characteristics of Internet users have been investigated by a
number of researchers, but efforts in this field face several difficulties. The diffusion of
information technology has been quite rapid (Leigh and Atkinson, 2001; Department of
Commerce, 2002) and the rates of utilization are subject to rapid change over time.
Furthermore, given the multiple sites (at home, at work, at school and at other locations)
and multiple purposes of Internet usage (to make purchases, obtain information,
entertainment, social interaction, and others), variation in the effects of demographic
factors across different settings must be considered.1 In addition, the usage at one site
may affect the individual’s usage at other sites (Department of Commerce, 2002). For
example, after exposure to computers and Internet at work, an individual may decide to
use the Internet at home.
Non-demographic factors of Internet access, such as availability of computers and
ISPs, have also been explored. Some researchers (Novak and Hoffman, 1998; Chaudhuri,
Flamm and Horrigan) have chosen to investigate computer ownership and Internet access
as a joint decision. Access from any site requires a computer to be available. In terms of
home usage, purchase of a computer must precede the decision to access the Internet.
Others, adopting a microeconomic approach, attempt to integrate the quality and
cost of Internet access into the analysis of the decision to use the Internet (Kridel,
Rappoport, and Taylor 1999; Chaudhuri, Flamm, Horrigan). It is hypothesized that the
decision of the consumer to acquire access to the Internet will depend on the price as well as
the quality of the service (Government Accounting Office). This question raises the
intriguing issue of whether areas of intense competition among service providers will
benefit from lower prices and, consequently, higher utilization. The Pew Project data do
not include information on infrastructure availability and prices. As a result, this analysis
should be considered exploratory in nature.2
Returning to the literature, education and income of individuals have invariably
been found to affect Internet usage (Novak and Hoffman, 1999; UCLA, 2000; Leigh and
Atkinson, 2001; Government Accounting Office, 2001; Department of Commerce, 2002).
Internet utilization rates increase with levels of income and education. Presumably, the
more highly educated place greater value on services and information provided on the
Internet and possess the intellectual capital to utilize it. Higher levels of income would
certainly be associated with an enhanced ability to purchase computers and access. It has
been noted, however, that more rapid increases in penetration have recently occurred
among lower income groups (Department of Commerce, 2002).
1
Since the Pew Project dataset used in this study does not include individuals below the age of 18 and does
not consider usage at school, this study does not explore how Internet use at school may relate to overall
usage at home or at work.
2
The models estimated in this project are effectively reduced form equations, incorporating both the direct
and indirect effects of a variety of demographic factors on Internet utilization.
Another factor found to be important is the age of the individual; utilization rates
decline with age, but not necessarily with aging (Department of Commerce, 2002). The
results concerning gender are less clear. Some studies have found that women were
significantly less likely than men to use the Internet, but that the gender gap with respect
to use per se had narrowed by 2000, although there remain gender differences in the
intensity of Internet use (Ono and Zavodny, 2003; Bimber, 2000). One large study found
virtually no difference in utilization among men and women (Department of Commerce,
2002). These results suggest distinct diffusion patterns among men and women, but with
women converging on the utilization rate of men.
Researchers have been quite concerned with the impact of race/ethnicity on
Internet utilization. The concept of the digital divide emerged from a concern that differential
rates of utilization by poverty status, race, or other factors were preventing a large
segment of the population from fully participating in a modern, communications-based
society. Several studies produced by U.S. governmental agencies found significant
differences in utilization by race/ethnicity with higher rates among the white population
(Department of Commerce, 2002 and 2000; Government Accounting Office, 2001).
Novak and Hoffman find differences between whites and blacks in computer
ownership and Internet usage in their study using 1997 survey data (Novak and
Hoffman). Computer ownership was higher for whites, holding income constant, but
statistically significant differences between whites and blacks were few. In terms of
Internet usage, once education, gender and income were held constant, the race of users
produced few statistically significant differences. The authors conclude that blacks were
quickly achieving usage profiles similar to whites. The study also found similar
utilization rates for males and females. In later research, with expanded datasets,
Hoffman, Novak, and Schlosser once again found income and education to be key
factors explaining Internet use, but higher rates of use are found among whites than
blacks in similar demographic segments. Differences are more accentuated at similar
levels of education status than at similar levels of income (Hoffman, Novak, and
Schlosser, 1998).
The Internet utilization rate among Hispanics has been estimated to be 50 percent
among individuals over 18 years of age (Spooner, 2001) but rates of utilization vary
substantially by site, with higher usage at home than at work. Internet penetration among
Hispanics grew rapidly between March 2000 and March 2001.
The existing research literature focuses primarily on the demographic
determinants of Internet utilization. Some studies, however, have analyzed variation in
utilization across space. Substantial rural-urban disparities in Internet utilization, for
example, have been found (Leigh and Atkinson, 2001; Department of Commerce, 2002).
It has also been demonstrated that cities on the eastern and western seaboards display higher
densities of the Internet infrastructure than cities in the interior of the country, the so-called “coastal effect” (Gorman and Malecki, 2000). The investigation of the geography
of utilization is warranted for several reasons. The availability of Internet services is not
spatially ubiquitous and this may well lead to variation in prices of access to services and
in rates of utilization. Furthermore, demographic variables themselves are not uniform
over space. The distribution of levels of educational attainment, for example, varies
dramatically across states and regions (Wilson, 1993). It is also well-established that
many of the rapidly growing service sectors, such as finance and insurance, business
services, computer and data processing, real estate, wholesale and retail trade, and hotels
are intensive users of telecommunications services (Wilson, 1993). Firms in these sectors
are not uniformly distributed across the U.S. and several are relatively concentrated in
larger cities such as Atlanta, Chicago, Dallas, Los Angeles and New York. This
concentration of intensive users is leading to the provision of superior
telecommunications infrastructure in these cities (Schmandt, Williams, Wilson, and
Strover, 1990; Greenstein, Lizardo, Spiller, 1997; Moss and Townsend, 2000). As a
result, one might expect Internet utilization to be higher in areas with relatively high
levels of these telecommunications-intensive sectors, independent of the demographic
characteristics of individuals. Therefore, the potential role of spatial characteristics in
explaining Internet utilization will be examined below.
2.0 Gross Utilization Rates in the Pew Project Data
The Pew Project conducts random-digit dial telephone surveys of Americans age
18 and older, focusing on whether people use the Internet, where they gain access to the
Internet, and what they do online. The question that measures Internet use is phrased as
follows: “Do you ever go online to access the Internet or World Wide Web or to send and
receive email?” This captures Internet use at home, work, or other places people may
access the Internet. The survey methodology reflects the research standards of AAPOR,
the American Association for Public Opinion Research.3 Data used in the analysis for this
paper reflect surveys conducted from March 2000 through December 2000; aggregating
the surveys yields a sample size of more than 25,000 individuals in over 2,500 counties
across the United States for the year 2000.4
To initiate the examination, gross utilization rates for various groups of
respondents were calculated without controlling for intervening or inter-related variables
(Table 1). These rates must be interpreted with caution since the relatively low utilization
rate for a particular group of people may not be the result of the variable defining the
3
The telephone sample is provided by Survey Sampling International, LLC (SSI) according to
specifications provided by Princeton Survey Research Associates International, the firm Pew Internet
contracts with to administer surveys. The sample is drawn using standard list-assisted random digit dialing
(RDD) methodology. Active blocks of telephone numbers (area code + exchange + two-digit block number)
that contain three or more residential directory listings are selected with probabilities in proportion to their
share of listed telephone households; after selection, two more digits are added randomly to complete the
number. This method guarantees coverage of every assigned phone number regardless of whether that
number is directory listed, purposely unlisted, or too new to be listed. Selected numbers are called 10 times
in efforts to get responses; response rates for surveys conducted by the Pew Internet Project are typically
about 33%.
4
Data from the Pew Internet Project is available online at: http://www.pewinternet.org/data.asp.
group, such as non-married status, but rather of some other variable associated with non-married status, such as age. Although the analysis below controls for such indirect
influences, gross utilization rates will be briefly reviewed since the literature discussed
above tends to adopt this type of analysis.
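A gross (unconditional) utilization rate of the kind reported in Table 1 is simply the share of users within a group, with no adjustment for other variables. The following is a minimal sketch of that tabulation; the function name, the field names, and the six records are made up for illustration and are not drawn from the Pew Project data.

```python
from collections import defaultdict

def gross_utilization_rates(records, group_key, use_key="user"):
    """Share of Internet users within each group, with no controls."""
    counts = defaultdict(lambda: [0, 0])  # group -> [users, respondents]
    for r in records:
        if r[use_key]:
            counts[r[group_key]][0] += 1
        counts[r[group_key]][1] += 1
    return {g: users / n for g, (users, n) in counts.items()}

# Hypothetical respondents (illustrative only, not survey data).
sample = [
    {"married": True, "user": True},
    {"married": True, "user": True},
    {"married": True, "user": False},
    {"married": False, "user": True},
    {"married": False, "user": False},
    {"married": False, "user": False},
]
rates = gross_utilization_rates(sample, "married")
```

Because nothing is held constant, a gap between two groups in such a table may reflect age, income, or any other characteristic that differs between them.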
[Insert Table 1]
The Pew Project survey respondents demonstrate patterns of utilization similar to
those found in the literature: for example, the higher the level of educational attainment, the
higher the rate of Internet utilization. For those with less than eight years of education,
the (unconditional) probability of utilization is a little over 4 percent, but for those with
graduate level educational attainment the probability is close to 80 percent. Family
income also has a positive effect on utilization: the higher the family income, the greater
the likelihood of Internet utilization. For individuals in families with annual household
incomes of less than $10,000, the rate of utilization of the Internet is about 23 percent but
for those above $100,000 the rate is 80 percent.
The survey also reveals gender differences in utilization, with the rate among men
being 56 percent compared to 49 percent for women (Table 1). The findings in the
literature concerning the effect of gender on utilization, discussed above, were mixed.
Some studies indicated that women have lower rates of utilization than men, while others
reported essentially similar rates. Being a parent is found to have a significant effect on
utilization in the Pew Project survey. Married people show a utilization rate of 57 percent
and respondents with children a rate of 62 percent. Individuals without children and
unmarried individuals have substantially lower rates. Student status has also been
reported in the literature to be a significant determinant, suggesting that having children
at home who are exposed to Internet at school may well have a positive effect on their
parents’ use at home.
The gross utilization rates in the Pew Project survey support the argument that
Internet use differs by race/ethnicity (Table 1). The survey data show 54 percent
utilization for whites, 39 percent for blacks, 47 percent for Hispanics, and 57 percent for
Asian/other. These pronounced differences are similar to those found in several of the
research results discussed above (Department of Commerce, 2002 and 2000; Government
Accounting Office, 2001; Pew, 2001).
The employment status of an individual has a substantial impact on utilization.
Those employed demonstrate a utilization rate more than 30 percentage points higher than
those not employed (Table 1). In addition to providing a measure of Internet use in
general (the "USER" dataset), the Pew Project survey allows for a narrower measure of
utilization that allows us to examine use at work (the "WORK" subset of the USER
dataset) for those individuals who were employed (Table 2). One can hypothesize that the
effect of demographic variables may differ when the Internet is being used in a job-related
activity as compared to its utilization at other locations. For example, having a
child might affect a parent’s utilization at home but not at work.
[Insert Table 2]
The county of residence of each respondent is provided in the Pew Project survey,
permitting the examination of the spatial variation in Internet utilization. Substantial
variation across regions for the two measures of utilization is found (Table 2, Part a). The
Northeast and, especially, the Western regions of the country reveal higher rates than the
Midwest and South with respect to USER. This finding is consistent with the reporting of
a “coastal effect” where cities on the eastern and western seaboards display higher
densities of the Internet infrastructure than cities in the interior of the country (Gorman
and Malecki, 2000). In terms of utilization at work, the Western region shows
distinctively higher utilization than the other three.
The county in which a respondent resides was classified in one of four categories:
counties outside a Metropolitan Statistical Area (MSA, a definition of the U.S. Census
Bureau), counties that constitute a single-county MSA, central counties of multi-county
MSAs, and non-central counties of multi-county MSAs (Table 2, Part b).5 The first
category consists of sparsely populated counties with an average population of around
29,000. The latter two categories include counties within multi-county MSAs but
distinguish between central counties, which normally contain the largest central city of
the metropolitan area, and the surrounding suburban counties. Internet utilization in non-MSA
counties is substantially lower than in other types of counties for the two types of
utilization. The Single-County MSA type has the highest average level of utilization
among all county types for USER. For the WORK variable, the Central County and
Suburban County categories demonstrate the highest average rates of utilization.
To explore the possible effects on utilization of the interaction among variables,
utilization at work is also cross-tabulated by region and county type (Table 3). Non-MSA
counties in the South and the Midwest show particularly low rates of utilization. The
level of utilization among county types in the Northeast varies to a lesser extent than in the
other three regions. Utilization rates in the Single-County MSAs category in the Midwest
and West are relatively high. The utilization rate in the Central County category is the
highest of all county types in the Northeast. Suburban counties have the highest
utilization in the other regions, and the rate is especially high in the West. This somewhat
puzzling result may be explained by the historical pattern of urban development. In the
Northeast, central cities retain their historical importance as a place of work while
metropolitan areas in the West have followed a much more decentralized pattern due to
their growth after the general availability of the automobile and supporting infrastructure.
These results confirm a complex pattern of Internet diffusion across the country noted by
others (Moss and Townsend, 1998; Gorman and Malecki, 2000).
5
In the Northeast, boundaries of townships often do not coincide with MSA designations. Adjustments in
population of the respondent’s county were made in instances where boundaries were not coterminous.
3.0 Modeling Internet Utilization
Building upon the analysis of unconditional, univariate utilization rates, discussed
in the previous section, attention now turns to examining Internet usage as a function of
demographic and spatial-economic characteristics. A series of logistic regression models
are estimated in which the dependent variable is a binary indicator of Internet usage
(usage=1, nonusage=0) expressed as a function of a linear combination of explanatory
variables. Maximum likelihood statistical techniques are employed to estimate the
marginal contribution that each explanatory variable makes to the probability of Internet
usage. Venturing into the world of conditional probabilities of usage (i.e., the probability
of Internet use given a vector of values for the explanatory variables) suggests a causal
relationship between the explanatory variables and usage. The nexus of causality
surrounding Internet use, however, is far more complex than this and remains a rich
territory for future research. The models estimated in this exploratory study are, again,
best understood as reduced-form expressions of more complicated structural relations,
ones that nevertheless serve to illuminate the mechanisms driving Internet use. In
particular, the Pew Project data do not include information on the price of Internet use
incurred by respondents. Although the effects of this important variable may be indirectly
captured by spatial-economic variables, the explicit effect of this potentially important
variable will not be examined in the modeling presented here.
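For reference, the logit specification described in this section can be written in standard textbook notation. The symbols below ($Y_i$, $x_{ik}$, $\beta_k$, $p_i$, $K$) are introduced here for illustration and are not taken from the paper's tables.

```latex
\[
p_i \equiv \Pr(Y_i = 1 \mid \mathbf{x}_i)
    = \frac{\exp\!\bigl(\beta_0 + \sum_{k=1}^{K} \beta_k x_{ik}\bigr)}
           {1 + \exp\!\bigl(\beta_0 + \sum_{k=1}^{K} \beta_k x_{ik}\bigr)},
\qquad
\ln\frac{p_i}{1 - p_i} = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik}.
\]
```

Here $Y_i = 1$ if respondent $i$ uses the Internet and $0$ otherwise. The right-hand expression shows that the model is linear in the log of the odds, so exponentiating a coefficient $\beta_k$ gives the factor by which the odds change for a unit change in $x_{ik}$.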
With this introduction, attention can now turn to the results of the logit modeling
exercise, first for Internet use at any site analyzing the USER dataset (Section 3.1), and
then for Internet use at work with the WORK dataset (Section 3.2). The nonlinear
functional form in which logit models express the relationship between a linear
combination of explanatory variables and the dependent variable makes it difficult to
interpret the estimated coefficients directly. Consequently, the marginal effect of each
explanatory variable on Internet use is, with the exception of Figure 3 (below), expressed
in terms of the partial odds ratio of utilization (i.e., the antilog of the estimated
coefficients) rather than in terms of its impact on the probability of utilization.6 The
partial odds ratio represents the multiplicative change in the odds in favor of usage over
non-usage attributable to a unit change in a given explanatory variable, holding all other
explanatory variables constant. For binary (dummy) indicator variables, like most of the
variables in the Pew dataset, the partial odds ratio is interpreted as the marginal impact on
the odds in favor of usage due to the presence of the indicator (e.g., setting PARENT=1 for a
respondent who is a parent or guardian of a child under 18 years of age, in contrast to
PARENT=0 for a non-parent respondent), ceteris paribus.
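The relationship between a logit coefficient, its partial odds ratio, and the implied probability can be sketched numerically. This is a minimal illustration, not the authors' estimation code: the function names are assumptions, and the coefficient `beta_parent = 0.25` and the 50 percent baseline are hypothetical values, not estimates from Table 4.

```python
import math

def odds(p):
    """Odds in favor of an event that has probability p (0 <= p < 1)."""
    return p / (1.0 - p)

def prob_from_odds(o):
    """Invert odds back into a probability."""
    return o / (1.0 + o)

def partial_odds_ratio(beta):
    """Antilog of an estimated logit coefficient."""
    return math.exp(beta)

def shifted_probability(p_baseline, odds_ratio):
    """Probability after the baseline odds are multiplied by an odds ratio."""
    return prob_from_odds(odds(p_baseline) * odds_ratio)

# Hypothetical coefficient on a dummy such as PARENT (illustrative only).
beta_parent = 0.25
or_parent = partial_odds_ratio(beta_parent)

# Effect of switching the dummy from 0 to 1 for a respondent whose
# baseline probability of use is assumed to be 0.50.
p_parent = shifted_probability(0.50, or_parent)
```

The same odds ratio moves the probability by different amounts at different baselines, which is one reason the results are reported as odds ratios and not as changes in probability.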
3.1 Determinants of Internet Use
6
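“Odds” are calculated as p/(1-p), where p=the probability of Internet use; that is, the odds are the probability
of usage divided by the probability of non-usage. The partial odds ratio is the ratio of the odds after a unit change in an explanatory variable to the odds before it. As such, a partial odds ratio of 1 for a given explanatory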
variable suggests no impact on the choice to use the Internet. Consequently, the null-hypothetical value for
testing the statistical significance of partial odds ratios is 1 rather than 0.
Modeling Internet usage with the dependent variable USER (Y=USER=1 if the
respondent uses the Internet, 0 if not) produced quite strong statistical results (Table 4).
The share of concordant pairs, one measure of the ability of a logit model to predict the
correct outcome, for the models is over 83 percent, indicating strong predictive power.7
The large number of odds ratios that are significant at the α = 0.01 level or better is
another indicator of the explanatory power of the models, but one that is also partially
attributable to the large sample size over which the models were estimated.
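One common definition of the share of concordant pairs compares every user with every non-user and counts the pairs in which the user received the higher predicted probability. The sketch below follows that definition; the function name is an assumption, tied pairs are counted as neither concordant nor discordant, and the exact convention behind the figure reported in Table 4 is not stated in the paper.

```python
def concordance_share(y, p_hat):
    """Share of (user, non-user) pairs in which the user has the
    higher predicted probability. Ties are not counted as concordant."""
    users = [p for yi, p in zip(y, p_hat) if yi == 1]
    nonusers = [p for yi, p in zip(y, p_hat) if yi == 0]
    total = len(users) * len(nonusers)
    if total == 0:
        raise ValueError("need at least one user and one non-user")
    concordant = sum(1 for pu in users for pn in nonusers if pu > pn)
    return concordant / total

# Four hypothetical respondents: observed outcomes and predicted probabilities.
share = concordance_share([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.2])
```

The double loop is quadratic in the sample size, so for a sample of more than 25,000 respondents a rank-based calculation would be the more practical route.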
[Insert Table 4]
The models are also quite robust. The odds ratios for almost all of the variables
are remarkably stable across the four specifications, a noteworthy outcome in light of the
large sample size and of the complex nonlinearity of a logit specification.8 Four
specifications are presented in Table 4: Model 1 explains Internet use solely in terms of
the personal characteristics of the individual; Model 2 adds basic county characteristics
and population to the mix; Model 3 substitutes industry characteristics (discussed in
detail, below) for the county characteristics in Model 2; Model 4 includes both county
and industry characteristics, along with population interactions with both these sets of
spatial-economic variables.
Two summary statistics are offered to evaluate the four models in Table 4, the
Akaike Information Criterion (AIC) and the Schwarz Bayesian Information Criterion
(SBIC). Both are based on the log of the likelihood function at its maximum, but
include a penalty for including an excessive number of parameters. The SBIC is less
forgiving of additional parameters than the AIC. Identifying the "best" model is a
classical unsolved problem in statistics; nevertheless, in the exploratory spirit of this
study, these information statistics provide a convenient way to evaluate these four
reduced-form models. The lower the AIC or SBIC, the "better" the explanatory power of
the model. Accordingly, the AIC suggests that Model 4 provides the best fit to the data,
while the SBIC gives the nod to Model 2. From the AIC perspective, all four models are
statistically distinguishable from one another with p-values in pairwise tests of less than
0.001 for all but a still significant contrast between Models 3 and 4 at α < 0.01.9 The
7
If the models offered no predictive power, the concordance rates would be around 50 percent (i.e.,
predicting usage with these models would be no better than that achieved by flipping a coin). The models,
in other words, improve upon randomly flipping a fair coin to predict Internet usage by more than 33
percentage points.
8
The one striking exception to cross-model stability is the odds ratio for Population, which is 1.00
(rounded) in Models 2 and 3 but jumps to 52.55 in Model 4. This is explained by the presence of
population interaction terms in Model 4, especially those associated with the indicator variables for county
type. Multiplying the odds for the population interaction with county type, all of which round to 0.03, by
the population coefficient of 52.55 yields odds of 1.78, a figure nearly identical to what would be obtained
with Model 4 without the "population by county type" interaction terms (odds=1.75, not shown here).
9
The difference in information statistics (either AIC or SBIC) across models is asymptotically chi-square
distributed with degrees of freedom equal to the difference in the number of parameters of the two models
being compared.
pairwise contrasts are less crisp from the SBIC perspective. Models 1 and 4 and Models
3 and 4 are statistically indistinguishable at the conventional α < 0.05; the remaining pairs
are significantly different, two (Model 1 vs 2 and 2 vs 4) at α < 0.05, one (1 vs 2) at α < 0.01,
and one (2 vs 3) at α < 0.001.
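The way the two criteria can disagree, as they do here between Model 4 and Model 2, follows from their penalty terms. The sketch below uses the standard textbook formulas, AIC = 2k - 2 ln L and SBIC = k ln(n) - 2 ln L; the log-likelihoods and parameter counts are hypothetical and are not the values underlying Table 4.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood

def sbic(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical models fitted to the same 25,000 observations.
n_obs = 25_000
small = {"ll": -12_000.0, "k": 30}   # fewer parameters, lower likelihood
large = {"ll": -11_980.0, "k": 40}   # more parameters, higher likelihood

aic_prefers_large = aic(large["ll"], large["k"]) < aic(small["ll"], small["k"])
sbic_prefers_small = (
    sbic(small["ll"], small["k"], n_obs) < sbic(large["ll"], large["k"], n_obs)
)
```

With n = 25,000 each extra parameter costs about 10.1 points under the SBIC but only 2 under the AIC, so a modest gain in the log-likelihood can be enough for the AIC to favor the larger model while the SBIC still favors the smaller one.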
In the sections that follow, we adopt the more expansive AIC perspective and
generally focus the discussion on Model 4. We turn first, in the section that follows, to
the demographic characteristics of individuals, then in Section 3.1.2 to the spatial-economic variables.
3.1.1 Demographic Determinants
Most of the demographic variables included in the logit models are found to affect
the likelihood of Internet utilization. In particular, the education variables strongly
influence utilization. Holding other explanatory variables constant, utilization increases
significantly with higher levels of education across all four specifications (Table 5).10 The
effect of education can be expressed graphically, with the effects of other variables held
constant (Figure 1, USER dataset). Whereas the partial odds ratio in favor of utilization
by an individual with postgraduate training is over 4.0 (i.e., the probability of usage is
four times as large as the probability of non-usage), for those with less than eight
years of education, the partial odds of Internet use is 0.16-0.17.
[Insert Figure 1]
Income also has a very substantial effect on Internet utilization, holding constant
the effect of other variables. Individuals in households with income over $100,000 are
three times more likely to use the Internet than not, whereas for individuals with household
income of less than $10,000 the partial odds ratio of utilization is 0.46; that is, the odds of
utilization are less than half the odds of non-utilization (Figure 2, USER dataset).11 In
sum, these results confirm the findings in the literature concerning education and income,
but with more compelling evidence in that the effects of other demographic variables are
held constant in determining the unique effects of income and education.
[Insert Figure 2]
The literature finds that Internet utilization rates tend to decline with the age of
the user, as discussed above. In the estimation here, however, the authors hypothesize
that this decline is not linear in age. We test this by incorporating a quadratic term for
age within the logit specifications. This complicates the interpretation of the impact of
age on Internet usage and, although the partial odds associated with the quadratic term are
not statistically significant, the resulting estimation shows a declining probability of
10
The reference category for the dummy indicator variables for education and income is, in both cases,
"Don't Know".
11
See previous footnote.
usage as age increases, although the decline lessens at older ages (Figure 3).12 Further,
the gap in the probability of Internet use between those without college education and
those with at least some college education reaches its maximum at the age of 48. This
may reflect the fact that people around that age group were among the first to be exposed
to the computer revolution that swept colleges and universities earlier and more intensely
than it did many other institutions in the 1970s and 1980s. These results represent a
distinct refinement of the impact of age on utilization reported in the literature.
[Insert Figure 3]
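A minimal sketch of how a quadratic age term produces a decline that flattens with age. The coefficients below are hypothetical and chosen only for illustration; they are not the estimates reported in Table 4, and all other covariates are absorbed into the intercept.

```python
import math


def usage_probability(age: float,
                      intercept: float = 3.0,      # hypothetical
                      b_age: float = -0.08,        # hypothetical linear term
                      b_age_sq: float = 0.0004):   # hypothetical quadratic term
    """Logit probability of use with a linear and a quadratic term in age."""
    log_odds = intercept + b_age * age + b_age_sq * age ** 2
    return 1.0 / (1.0 + math.exp(-log_odds))


ages = (20, 40, 60, 80)
probabilities = [usage_probability(a) for a in ages]
for a, p in zip(ages, probabilities):
    print(a, round(p, 3))
```

With a negative linear coefficient and a small positive quadratic coefficient, the predicted probability falls over the whole 20 to 80 range, but each 20-year step lowers the log-odds by less than the previous one.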
The effect of family structure was also examined. Marriage has a significant
positive effect on the odds of utilization, whereas the presence of children (i.e., being a
parent) has no distinguishable effect on the odds of utilization, holding other variables
constant (Table 4).
The estimation shows significant differences in the likelihood of using the
Internet according to the race/ethnicity of the respondent, confirming results frequently
found in the existing literature (Table 4). The odds of utilization by blacks and Hispanics
are significantly lower than for whites (all compared against the reference group "Asian
and others"), with blacks having almost half the odds of using the Internet that
Hispanics have.
Finally, the respondent’s employment status was found to have a strong effect on
the odds of utilization. The partial odds in favor of Internet usage among those employed
are 1.7 times those of the unemployed. This result clearly establishes the importance of
employment status. The determinants of utilization at work are examined separately in
Section 3.2.
To summarize, the demographic characteristics of individuals have a quite
substantial effect on the propensity to use the Internet. The results from this estimation
largely confirm the findings in the literature while employing a more sophisticated
methodology. The overall predictive power of the models and the stability of the
coefficients of demographic variables across several model specifications confirm the
robustness of the results.
3.1.2 Spatial-Economic Determinants
Having established the importance of various demographic characteristics on
Internet utilization, this section examines the effects of characteristics of the community
12
Probabilities of utilization were calculated from logit model forecasts in light of the characteristics of
each individual, substituting the marginal impact of "some college" (the weighted impact of the first four
education variables) then replacing it with that of "some college +" (the weighted impact of the last three
education variables) for each. The two probability plots represent gentle sixth-order polynomial smoothings
of the outcomes arrayed by age. Ages below 20 and above 80 were dropped due to small-sample
irregularities and problems of interpretation (e.g., an 18-year-old is unlikely to have graduated from college).
in which the respondent resides. As noted earlier in the discussion of Table 3, the
availability of Internet access varies across space. To investigate this dimension of
Internet utilization, communities are measured in three ways: (1) whether the county is in
a metropolitan statistical area (MSA) and if so, how it relates to the MSA; (2) the
population of the county; and (3) the relative size of telecommunications-intensive
sectors of the local economy.
Respondents residing in counties within an MSA have substantially higher odds
of utilization than those in non-MSA counties, the reference category for the dummy
variables accounting for county type (listed under the heading "County Characteristics"
in Table 4). The variables accounting for the MSA status of the county allow for the
consideration of the impacts that differing broad patterns of urbanization may have on
Internet utilization. The three partial odds estimates in Models 2 and 4 fall in the range of
1.25 to 1.40. All are statistically significant with p-values less than 0.0001, and
Single-County MSA is statistically distinguishable from the other two county types in Model 2
(but not in Model 4). Finding greater Internet utilization in non-rural, MSA counties is
consistent with the literature, discussed above, which shows that utilization in rural areas
(roughly consistent with the non-MSA county definition) is lower than in non-rural areas
(Leigh and Atkinson, 2001; Department of Commerce, 2000; Kolko, 1999).
Recent data from the Pew Internet Project suggest that lower Internet use in rural
areas persists. Data from 2003 show that Internet penetration in rural areas is about 10
percentage points lower than in urban and suburban areas. Some of the gap is attributable
to the older population of rural America and some to an interaction effect of living in a
rural area with income, i.e., the effect of income on Internet use varies across geographical
region. In this case, low-income rural residents are much less likely to be online than their
counterparts elsewhere or, put differently, high-income individuals in rural areas are just
as likely to be online as high-income people elsewhere (Bell et al., 2004). The difference
is more pronounced for broadband penetration at home. 2004 data show that rural
Americans are about a third as likely to have high-speed connections at home as urban
Americans; 10% of all rural Americans have broadband connections at home compared
with 29% of urban Americans (Horrigan, 2004).
County population is included in Models 2-4 to capture the network externality
effects that size could conceivably have on the supply of and demand for Internet services
and, consequently, on Internet utilization. Population is not a significant predictor in
Models 2 and 3, but is significant when interacted with county and industry
characteristics in Model 4 (see discussion in Section 3.1.3, below).
The third spatial-economic measure, referred to here as "economic structure", is
defined in terms of the percentage of county employment in each of seven
telecommunications-intensive sectors in the economy: Communications; Finance,
Insurance, and Real Estate ("FIRE"); Professional Services; Education, Health, and
Social Services; Retail Trade; Manufacturing; and Other Telecom-Intensive Sectors
(Distribution Services, Personal Services and Entertainment, and Miscellaneous Business
Services).13 It is expected that the nature of the local economy may well affect the general
environment for Internet usage. If adults in an area are familiar with the Internet because
of its use at work, a culture of usage among adults in the area may emerge. Furthermore,
higher utilization of the Internet in local businesses may generate better
telecommunications infrastructure, and this supply-side factor could influence rates of
utilization in the general community.
Taken individually, these variables help identify sectors of the economy that may
have a particularly strong impact on Internet use. As a set, they can be thought of as
instruments that control for variation in the economic profile of counties across the
United States, thereby providing more refined estimates of the partial odds associated
with other predictors of Internet use.
The results show that the nature of the local economy does affect utilization rates
(see Table 4, Models 3 and 4). In particular, the higher the share of workers in
Communications, Professional Services, Retail Trade and Telecommunications-intensive
Manufacturing, the greater the marginal probability of Internet use. With partial odds
ratios barely greater than 1.0, the effect of economic structure is quite modest compared
to that of the demographic characteristics of respondents. The set of seven economic
structure variables as a whole is nevertheless statistically significant with p<0.0001, and
five of the partial odds are individually statistically significant at α=0.05.
3.1.3 Population Interaction Effects
To allow for differential effects on utilization across space, more complex forms
of urbanization were introduced through the use of interaction terms. These terms were
constructed for population and county type as well as for population and the economic
structure variables. One can interpret the partial odds ratios of the interaction terms as
modifying the partial odds associated with population in ways that depend on county type
or feature of the economic structure. In other words, in Model 4 the effect of county
population on Internet use is allowed to depend on the type of county and/or the nature of
the economic structure of the county.
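The role of the interaction terms can be written out as a short sketch. The notation is an assumption introduced here, not the paper's own: P is county population, D_c is the indicator for county type c, and β and γ are logit coefficients whose exponentials are the reported partial odds ratios.

```latex
\[
\ln\frac{\Pr(\mathrm{USER}=1)}{1-\Pr(\mathrm{USER}=1)}
  = \dots + \beta_P\,P + \sum_{c}\beta_c\,D_c + \sum_{c}\gamma_c\,(P \times D_c) + \dots
\]
\[
\frac{\mathrm{odds}(P+1 \mid D_c = 1)}{\mathrm{odds}(P \mid D_c = 1)}
  = e^{\beta_P + \gamma_c} = e^{\beta_P}\, e^{\gamma_c}
\]
```

Under this specification, the effect of a one-unit increase in population in a county of type c is the product of the partial odds ratio for population and the partial odds ratio for the corresponding interaction term.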
13
Using the input-output table for the United States, Schmandt and Wilson (1990) originally defined
telecommunications-intensive sectors as industries whose purchases from the communication sector, SIC
48, are more than 1 percent of the total value of intermediate inputs. While 1 percent seems small, it is a
relatively large threshold compared to input purchases across all sectors. These activities were translated
here into a series of 2- to 6-digit NAICS (North American Industry Classification System) industry codes
which were, in turn, mapped to the March 2000 U.S. County Business Patterns dataset to identify the
proportion of total county employment attributable to these sectors in each of the 2575 counties represented
in the Pew Survey. The non-trivial problems of data suppression at this level of detail (due to having only
one to three employers in a particular NAICS category in a given county) were resolved by an algorithm
that involved exploiting unsuppressed information on the number of plants within employment categories,
along with an iterative proportional fitting routine to impute county employment within each of the seven
telecommunications-intensive sectors. Details are available from the authors.
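For readers unfamiliar with iterative proportional fitting, a generic two-dimensional sketch follows. It is not the authors' imputation algorithm, whose details are not given here; the seed table and the row and column totals are hypothetical, with rows standing in for counties and columns for sectors.

```python
def iterative_proportional_fit(seed, row_totals, col_totals,
                               max_iterations=100, tolerance=1e-9):
    """Rescale a non-negative seed table until its margins match the targets."""
    table = [row[:] for row in seed]  # work on a copy
    for _ in range(max_iterations):
        # Step 1: scale each row to its target total
        for i, target in enumerate(row_totals):
            current = sum(table[i])
            if current > 0:
                table[i] = [value * target / current for value in table[i]]
        # Step 2: scale each column to its target total
        for j, target in enumerate(col_totals):
            current = sum(row[j] for row in table)
            if current > 0:
                for row in table:
                    row[j] *= target / current
        # Columns are now exact; stop once the rows also match
        row_error = max(abs(sum(table[i]) - row_totals[i])
                        for i in range(len(row_totals)))
        if row_error < tolerance:
            break
    return table


# Hypothetical example: two counties (rows) by two sectors (columns)
fitted = iterative_proportional_fit([[1.0, 1.0], [1.0, 1.0]],
                                    row_totals=[30.0, 70.0],
                                    col_totals=[40.0, 60.0])
print(fitted)
```

In an application like the one described in the footnote, the seed would presumably be built from unsuppressed information such as plant counts, and the margins from published totals; that mapping is an assumption here.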
Without interaction terms, population is found to have no effect on the overall
odds of Internet use (Table 4; the partial odds are approximately equal to 1.00 and
insignificant for Models 2 and 3). In the more complex specification of Model 4,
increasing population is found to have a positive effect on the partial odds of utilization,
holding other variables constant. In other words, the size of the county has an important
and positive effect only when other features of the county are taken into account. The
extraordinarily large partial odds associated with population in Model 4 (52.55) must be
interpreted in conjunction with the partial odds of the interaction terms. From Model 4
(Table 4), all else held constant, a one-unit (one million) increase in population in a
Single-County MSA, for example, increases the odds of Internet use by a factor of 1.79
(almost 80 percent) compared to that of a rural county: 1.79 = 52.55 x 0.034, the product
of the respective partial odds ratios. For Central-County MSAs the partial odds factor is
1.68 (=52.55 x 0.032), and for Suburban-County MSAs 1.47 (=52.55 x 0.028). All three
of these compound partial odds ratios are statistically significant at α=0.001.
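The compound partial odds quoted above can be reproduced directly from the figures in the text. The sketch below uses only those reported values (52.55 and the three interaction odds ratios, as rounded in the text); the variable names are illustrative.

```python
# Partial odds ratio for population in Model 4, as quoted in the text
population_odds = 52.55

# Partial odds ratios for the population-by-county-type interactions, as quoted
interaction_odds = {
    "Single-County MSA": 0.034,
    "Central-County MSA": 0.032,
    "Suburban-County MSA": 0.028,
}

# Compound effect of a one-unit (one million) population increase by county type
compound_odds = {county: round(population_odds * ratio, 2)
                 for county, ratio in interaction_odds.items()}

for county, value in compound_odds.items():
    print(county, value)
```

Multiplying the odds ratios is equivalent to adding the underlying logit coefficients for population and for the interaction term.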
Results for the interaction of population with the economic structure variables in
Model 4 are less remarkable from a statistical point of view. Only the population
interaction with the percent of county employment in professional services is statistically
significant (p=0.004). The rest of the population/economic structure interactions are
singly and collectively insignificant. The interpretation of the estimated partial odds
ratios here is complicated by the fact that population and the economic structure variables
are all continuous rather than discrete indicators like county type. Moreover, the
magnitude of the partial odds for population (=52.55) and those of its interactions with
economic structure yield compound partial odds that are difficult to interpret
substantively. To the extent that any credence is given to these interactions at all, it is
best to think of them as capturing some nonlinear complexity relating to economic
structure which is held constant in interpreting the partial impacts of the remaining
variables in the model.
To summarize the findings resulting from the incorporation of interaction terms,
the odds of Internet utilization for an individual with a particular set of demographic
characteristics can vary by place of residence. In general, counties that have a more
telecommunications-intensive economy, have a larger population, or lie within MSAs have higher
utilization rates, holding constant demographic characteristics. Since this estimation is
essentially a reduced-form specification, it cannot be determined whether these
differences arise due to local Internet infrastructure and cost of the service or to the social
milieu of the particular area. Nevertheless, the results strongly suggest that future
research on utilization should take into account community characteristics in estimating
utilization by individuals.
3.2 Explaining Internet Utilization at Work
Models for exploring the determinants of Internet use at work (binary dependent
variable = WORK) were estimated for those 16,680 respondents in the sample who were
employed at the time of the survey. The overall explanatory power of these models
(Table 5) is somewhat weaker than those estimated for USER, presented above, but the
results are again quite robust across the alternative specifications.14 Model 4 is again the
preferred specification according to the AIC criterion, while Model 3 is favored by the
SBIC; all are pairwise distinguishable by both the AIC and SBIC criteria15 except for
Models 3 and 4 under the AIC. The same general patterns of demographic effects on
Internet utilization found in the USER models are repeated, but with some intriguing
differences with respect to some demographic variables.
The lower likelihood of Internet use among those with lower educational
attainment, discussed earlier for the USER models, is somewhat more accentuated in the
workplace models (Figure 1, WORK dataset). Only at the level of college education and
post-graduate education do significantly higher likelihoods of utilization appear. In
addition, the same positive monotonic trend in the partial odds ratios for income
categories seen in the USER models (Figure 2) is observed in the WORK models, but
with statistically significant differences only appearing once the $40,000 to $50,000
bracket is reached (as opposed to the $20,000 to $30,000 bracket for USER). In the
workplace only individuals with high levels of education, and presumably higher levels
of income, demonstrate markedly higher utilization rates (partial odds ratios greater than
one). The actual process by which education levels affect workplace utilization cannot be
conclusively determined from this dataset, but a hypothesis can be advanced.
Employment for more highly educated individuals probably occurs in occupations
requiring the processing or manipulation of information, the so-called symbolic analyst
(Reich, 1992). Given that information technology facilitates the exchange of information,
such occupations are likely to be comprised of more intensive users of the Internet
(Department of Commerce, 2002).
Several demographic variables that are significant in the USER models are found
to have less influence on Internet use at work. The impact of age on workplace use
is slightly weaker than in the earlier models for USER (compare Tables 4
and 5). Being married has a positive effect on Internet usage in general, but has a small
negative effect in the WORK models, although a statistically insignificant one in Model 4
(Table 5).
Another quite interesting result is found in the race/ethnicity variables. The
differences in the odds of utilization by race found in the estimation of USER models
virtually disappear in the estimation for use at work. There is basically no statistically
significant difference in the odds of use at work among whites, Hispanics, and blacks,
holding constant the effects of other variables (the only exception being a lingering low
partial odds for blacks in Models 1 and 2). The essential absence of effect of
14
An F-test determined that a separate set of regressions for use at work (WORK) is preferred to a single
set of regressions on USER with interaction terms for EMPLOY (p<0.001).
15
P-values are all less than 0.001 with the exception of the contrast between Models 1 and 2 under the
SBIC where p<0.05.
race/ethnicity on workplace use of the Internet is encouraging, but perhaps not surprising
when one considers the employer’s interest. If computers and Internet access at work
were needed in certain jobs, it would make little sense for an employer to distinguish
among workers based on race/ethnicity (or gender and marital status). Data on the
occupation of the respondent would be needed to further explore this hypothesis but,
unfortunately, the Pew Project data do not include occupation.
The effects of economic structure and county type variables are similar in sign to
those found in the USER estimations, but with lower values of the partial odds ratio.
High levels of employment in telecommunications-intensive industries in a county are
associated with higher odds of use at work. Although this result is consistent with the researchers’ expectations,
data on the industry in which a respondent works would provide a sounder basis for
drawing conclusions about the effect of this variable on usage. In terms of county type,
the positive marginal effect of MSA counties over rural counties remains, although the
effects are slightly smaller and, for Model 4, statistically weaker than in the USER
models.
The interaction terms again improve on the explanatory power of the models, but
largely in terms of curve fitting without important interpretative value. That is to say that
the marginal effects of economic structure, county type and population are not constant
but vary in relationship to each other.
The findings concerning workplace utilization reinforce the importance of
demographic characteristics in explaining Internet use but with subtle differences from
the USER models. The effects of education on Internet use are less pronounced in the
workplace than they are more generally. By contrast, several variables included in the USER
estimation (race/ethnicity, marriage, and parental status) have little effect on the likelihood
of utilization at work. We hypothesize that Internet usage at work would be made available
only for those tasks where a need exists, irrespective of these characteristics of individual
workers.
4.0 Summary of Findings and Directions for Future Research
The study has reconfirmed several findings reported in the literature on the impact
of demographic variables on Internet utilization. The estimation of the logit models in
this study, however, represents a distinct methodological improvement over other
approaches used in the existing literature. The gross utilization rates, calculated for single
variables, can mask the true effects of demographic characteristics on utilization.
Although the very powerful effects of education and family income reported in the
literature using bivariate analysis are confirmed in this multivariate approach, the
difference in gross utilization between males and females found in earlier research
disappears in the more complex specifications tested in the logit models.
The research also suggests that the determinants of utilization at work will vary
significantly from utilization in other environments. Utilization at work will largely
depend on whether an individual holds a position that the employer believes requires the
capabilities offered by the Internet whereas utilization outside a work environment will
certainly be a question of personal choice in many circumstances. Future data collecting
efforts might well incorporate the occupation of the individual as well as the industry in
which the individual works. The effect of education on the odds of utilization at work
found in this project may actually derive from the occupation of the individual.
The logit modeling in this research indicates a modest, but statistically significant,
effect of a set of spatial-economic variables on the odds of utilization. A respondent
residing in an urban county (i.e., a county within an MSA) has greater odds of utilizing
the Internet than one in a rural (non-MSA) county. Furthermore, everything else held
constant, the larger the population of the county, the greater the odds of utilization.
Individuals residing in counties with relatively high levels of telecommunications-intensive
employment have higher odds of Internet utilization, holding other individual
demographic characteristics constant. We explored the effects of more complex
relationships among these three spatial-economic variables through the use of interaction
terms. While these results were less robust than those in models without interaction
terms, on the whole, community level effects appear significant.
Although the empirical analysis presented in this paper is exploratory, promising
directions for future research can be identified in terms of public policy evaluation and
alternative approaches to the modeling of utilization. Many state and local governments
have adopted policies, using a variety of mechanisms, to encourage the use of the
Internet. Relying on county-level variables, as this study does, forecloses the
opportunity to investigate impacts of local policies since several local governments
usually exist within a single county. However, the analysis could be used to identify
counties where the odds of Internet utilization are much higher (or lower) than expected.
This set of counties could be further investigated to determine whether innovative or
aggressive Internet policies are being pursued by jurisdictions in the county.
Several more sophisticated modeling approaches should be considered in future
research. Rather than a focus on the users of the Internet in a single reduced form
equation, as adopted in this paper, a simultaneous equation specification could allow for
the estimation of supply and demand functions. The demand function would incorporate
characteristics of individuals and the price of services to estimate demand for Internet
services. Similarly, a supply function would estimate the level and quantity of services
offered by telecommunications companies based on various characteristics of their
provision and prices. Difficult measurement problems will be faced, such as constructing
variables to measure the price and quantity of services. But as new data sources incorporate information
on prices, this methodological approach will become more common since it permits the
calculation of price elasticities (this approach is adopted in Chaudhuri, Flamm and
Horrigan, 2004).
Further investigation of community-level (or spatial-economic) effects on
utilization may be pursued through multilevel modeling. The argument advanced in this
paper is that utilization rates are affected by characteristics of individuals and of the
community in which the individual resides. Despite the relatively crude measures of
"community" used here – spatial-economic variables of county type, population and
economic structure – the empirical results provide sufficient evidence to justify further
investigation and the development of more refined measures of community. Multilevel
modeling in spatial-economic settings can, for example, account for the nesting of
geographic structures such as "suburban counties within an MSA" and do so in ways that
allow for suburban county effects to "borrow statistical strength" from information about
the overall MSA of which they are components. The essential logic of multilevel modeling
can be extended more abstractly to empirical Bayes estimators in ways that may provide
an improved framework for capturing the effect of the structure of
telecommunications-intensive sectors on Internet use. The impact of "professional services" on a particular
county, for example, could be "nested" within an overall "professional services" effect for
the state, subnational region, or nation as a whole. Richly promising as multilevel and
empirical Bayes modeling are in settings like the one studied here, these approaches
typically require strong assumptions about the parametric structure and independence of
the multiple error components that attend them. Future work is needed to identify the
appropriate balance to strike in weighing statistical efficiency against strong a priori
assumptions in settings like this.
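The idea of "borrowing statistical strength" can be sketched with the textbook normal-normal shrinkage estimator. This is a generic illustration under the strong assumption that the within-county and between-county variances are known; it is not a model estimated in this paper, and every number below is hypothetical.

```python
def shrunken_estimate(county_mean: float, county_n: int, group_mean: float,
                      within_var: float, between_var: float) -> float:
    """Partial pooling: weight the county mean by its reliability.

    The weight on the county's own mean rises toward 1 as its sample grows.
    """
    sampling_var = within_var / county_n
    weight = between_var / (between_var + sampling_var)
    return weight * county_mean + (1.0 - weight) * group_mean


# Hypothetical suburban county with a utilization rate of 0.30,
# nested in an MSA whose overall rate is 0.50
small_sample = shrunken_estimate(0.30, 10, 0.50, within_var=0.25, between_var=0.01)
large_sample = shrunken_estimate(0.30, 1000, 0.50, within_var=0.25, between_var=0.01)
print(round(small_sample, 3), round(large_sample, 3))
```

A county with few respondents is pulled strongly toward the MSA-wide rate, while a county with many respondents keeps an estimate close to its own data. The dependence of the result on the assumed variance components is the kind of a priori assumption the paragraph above cautions about.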
A second way to incorporate geographic information into the statistical modeling
is to account for spatial autocorrelation. When present, this form of autocorrelation
implies that, at some level, behavior in a single community is affected by behavior in
proximate communities. For example, unobserved factors operating at the metropolitan
level might have a distinct effect on county level behaviors within the metropolitan area.
Evidence of such an effect would be reflected in spatial autocorrelation, where the error
terms of proximate counties are correlated. Spatial autocorrelation models would provide
a useful framework for examining spillover effects that may spread from the central MSA
counties to suburban counties to exurban non-MSA counties bordering the MSA. While
it is somewhat daunting to specify the covariance structure linking each county in the
survey dataset to its neighbors, modern advances in computing power and in software for
spatial statistics open a large arena for future research aimed at capturing the
spatial-economic structure underlying Internet use.
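One common first diagnostic for spatial autocorrelation is Moran's I, sketched below in plain Python. It is offered as an illustration of the concept rather than as a method used in this paper; the four "counties" arranged along a line, their adjacency weights, and the values (which could stand in for county-level residuals) are all hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for values observed on units linked by a spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Cross-products of deviations for every pair of linked units
    cross = sum(weights[i][j] * deviations[i] * deviations[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in deviations)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical adjacency: county 0 borders 1, 1 borders 2, 2 borders 3
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [1.0, 2.0, 3.0, 4.0]    # neighbors resemble each other
alternating = [1.0, 4.0, 1.0, 4.0]  # neighbors differ sharply

print(morans_i(clustered, adjacency), morans_i(alternating, adjacency))
```

Positive values indicate that neighboring units tend to have similar values, and negative values indicate that neighbors tend to differ. A full spatial autocorrelation model would go further and build the correlation structure into the error terms.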
The academic and policy literature on Internet utilization is expanding. The vital
interests of telecommunications providers, public policy makers and social scientists are
driving this research agenda. It will remain a challenging area of investigation for various
theoretical, methodological and empirical reasons. Nevertheless, the importance of this
new technology in reshaping society requires that the investigation be pursued.
References
Bell, Peter, Pavani Reddy, and Lee Rainie, February 2004. Rural Areas and the Internet.
Pew Internet & American Life Project. Available online at:
http://www.pewinternet.org/PPF/r/112/report_display.asp.
Bimber, Bruce. 2000. Measuring the Gender Gap on the Internet. Social Science
Quarterly 81: 868-76.
Business Week. May 12, 2003. Special Report: The E-Biz Surprise.
Chaudhuri, Anindya, Kenneth Flamm, and John Horrigan. 2004. An Analysis of the
Determinants of Internet Access (Unpublished manuscript).
ClickZ Internet Statistics and Demographics. Online. Available at:
http://www.clickz.com/stats/big_picture/geographics/article.php/5911_151151. Accessed
on July 27, 2004.
Cooper, Mark and Gene Kimmelman. February 1999. The Digital Divide Confronts the
Telecommunications Act of 1996. Online. Available at
http://www.consumersunion.org/pdf/telecom1-0299.pdf. Accessed February 9, 2002.
Cooper, Mark N. , October 2000. Disconnected, Disadvantaged, and Disenfranchised:
Explanations in the Digital Divide. Consumers Union. Online. Available at
http://www.consumersunion.org/pdf/disconnect.pdf. Accessed February 9, 2002.
Devol, Ross C. July 1999. America’s High-Tech Economy. Milken Institute. Online.
Available at http://www.milkeninstitute.org/mod30/ross_report.pdf. Accessed February
8, 2002.
Department of Commerce. February 2002. A Nation Online: How Americans are Expanding
Their Use of the Internet. Online. Available at
http://www.ntia.doc.gov/ntiahome/dn/anationonline2.pdf. Accessed March 10, 2002.
Department of Commerce. October 2000. Falling Through the Net: Toward Digital
Inclusion. Online. Available at http://search.ntia.doc.gov/pdf/fttn00.pdf . Accessed
January 18, 2002.
General Accounting Office. February 2001. Telecommunications: Characteristics
and Choices of Internet Users. Online. Available at
http://www.gao.gov/new.items/d01345.pdf. Accessed February 3, 2002.
Gorman, Sean P., and Edward J. Malecki. March 2000. The Networks of the Internet: An
Analysis of Provider Networks in the USA. Telecommunications Policy 24(2): 113-134.
Greenstein, Shane with Mercedes M. Lizardo and Pablo T. Spiller. February 1997. The
Evolution of Advanced Large Scale Information Infrastructure in the United States,
National Bureau of Economic Research Working Paper No. 5929.
Hoffman, Donna L., Thomas P. Novak and Ann E. Schlosser. March 2000. The Evolution
of the Digital Divide: How Gaps in Internet Access May Impact Electronic Commerce,
Vanderbilt University. Online. Available at
http://www.ascusc.org/jcmc/vol5/issue3/hoffman.html. Accessed February 5, 2002.
Horrigan, John. February 2004. Broadband Penetration on the Upswing. Pew Internet &
American Life Project. Available online at:
http://www.pewinternet.org/PPF/r/121/report_display.asp.
Kolko, Jed. July 1999. The High-Tech Rural Renaissance?: Information Technology,
Firm Size and Rural Employment Growth. Online. Available at
http://www.sba.gov/advo/research/rs201tot.pdf. Accessed February 7, 2002.
Kridel, D. J., P. R. Rappoport, and L. D. Taylor. 1999. An Econometric Analysis of
Internet Access. In The Future of the Telecommunications Industry: Forecasting and
Demand Analysis, eds. David G. Loomis and Lester D. Taylor, p. 21-42. Kluwer
Academic Press.
Leigh, Andrew and Robert D. Atkinson. 2001. Clear Thinking on the Digital Divide.
Washington, D.C.: Progressive Policy Institute. Available online at:
http://www.ppionline.org/ppi_ci.cfm?knlgAreaID=107&subsecid=126&contentid=3490.
Accessed July 27, 2004.
Moss, Mitchell L., and Anthony M. Townsend. 2000. The Internet Backbone and the
American Metropolis. The Information Society 16(1): 35-47.
Novak, Thomas P. and Donna L. Hoffman. February 1998. Bridging the Digital Divide:
The Impact of Race on Computer Access and Internet Use, Project 2000, Vanderbilt
University. Online. Available at
http://ecommerce.vanderbilt.edu/research/papers/html/manuscripts/race/science.html.
Accessed February 5, 2002.
Ono, Hiroshi and Madeline Zavodny. March 2003. Gender and the Internet. Social
Science Quarterly, 84(1):111-121.
Reich, Robert B. 1992. The Work of Nations: Preparing Ourselves for 21st Century
Capitalism. NY: Vintage Books.
Schmandt, Jurgen, Robert H. Wilson, et al. 1990. The New Urban Infrastructure: Cities
and Telecommunications. CT: Praeger.
Spooner, Tom and Lee Rainie. July 2001. Hispanics and the Internet. Pew Internet &
American Life. Online. Available at
http://www.pewinternet.org/PPF/r/38/report_display.asp. Accessed July 27, 2004.
UCLA Center for Communication Policy. November 2000. Surveying the Digital Future.
Online. Available at http://sfpl4.sfpl.org/btdir/ucla-internet.pdf. Accessed March 29,
2002.
Wilson, Robert H. 1993. States and the Economy: Policymaking and Decentralization.
CT: Praeger.
Table 1: Definition of Variables with Gross Utilization Rates

Variable Name   Definition                                           Number of      Gross
                                                                     Observations   Utilization (%)
USER            Use the Internet at home, work, or both (=1)         13,185         100.00
                or not (=0)                                          12,026           0.00
WORK            Employed and use the Internet at work and/or
                home (=1)                                             6,617         100.00
                or not (=0)                                          10,063           0.00

Education level (highest level achieved)
ED-Elem         Grades 1-8 or less                                      616           4.38
ED-NoHS         High school incomplete                                1,941          19.53
ED-HS           High school graduate                                  7,851          34.85
ED-Tech/Voc     Business/technical/vocational school                  1,078          44.53
ED-SomeCol      Some college, no 4-year degree                        6,102          62.77
ED-ColGrad      College graduate                                      4,895          74.38
ED-PostGrad     Post-graduate training                                2,571          79.39
ED-Unknown      Don't know or missing                                   217          29.77

Ethnicity
WHITE           White, not Hispanic origin (=1)                      19,101          54.01
BLACK           Black, not Hispanic origin (=1)                       2,654          39.30
HISPA           Hispanic or Latino origin (=1)                        1,757          46.96
ASIAN/Other     Asian or other ethnicity (=1)                         1,398          57.44

Income
INC-LT10        Household income less than $10,000                    1,498          22.76
INC-10/20       $10,000 to < $20,000                                  2,247          27.90
INC-20/30       $20,000 to < $30,000                                  3,042          41.42
INC-30/40       $30,000 to < $40,000                                  2,923          53.27
INC-40/50       $40,000 to < $50,000                                  2,407          61.32
INC-50/75       $50,000 to < $75,000                                  3,419          71.48
INC-75/100      $75,000 to < $100,000                                 1,935          78.19
INC-100+        $100,000 or more                                      2,019          82.81
INC-Unknown     Don't know or missing                                 5,781          41.34

Other
PARENT          Parent/guardian of child under 18 (=1)                8,880          61.53
                or not (=0)                                          16,391          47.11
MARITAL         Marital status, married or living as such (=1)       13,933          56.74
                or not (=0)                                          11,338          46.57
EMPLOY          Employment status, employed full or part time
                (=1). Used to subset the USER dataset into the
                WORK dataset.                                        16,680          63.35
                or not employed (=0)                                  8,591          30.49
GENDER          Male (=1)                                            11,920          55.82
                Female (=0)                                          13,351          48.92
AGE             Age between 18-98 (continuous variable)                  NA             NA
Source: Pew Internet & American Life Project Survey (March–December 2000).
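The gross utilization rates in Table 1 can be cross-checked against the overall USER rate of 52.2 percent reported in Table 2, because the education categories partition the full sample. The following is a minimal sketch of that arithmetic, not part of the authors' analysis; the function name `pooled_rate` is illustrative, and the rows are copied from the education block of Table 1.

```python
# Education rows of Table 1: (category, number of observations, gross utilization %).
EDUCATION_ROWS = [
    ("ED-Elem", 616, 4.38),
    ("ED-NoHS", 1941, 19.53),
    ("ED-HS", 7851, 34.85),
    ("ED-Tech/Voc", 1078, 44.53),
    ("ED-SomeCol", 6102, 62.77),
    ("ED-ColGrad", 4895, 74.38),
    ("ED-PostGrad", 2571, 79.39),
    ("ED-Unknown", 217, 29.77),
]


def pooled_rate(rows):
    """Observation-weighted mean of the category rates, in percent."""
    total_n = sum(n for _, n, _ in rows)
    implied_users = sum(n * rate / 100.0 for _, n, rate in rows)
    return 100.0 * implied_users / total_n


total_observations = sum(n for _, n, _ in EDUCATION_ROWS)
overall_rate = pooled_rate(EDUCATION_ROWS)
print(total_observations, round(overall_rate, 1))
```

The category counts sum to the full sample size, and the weighted rate agrees with the overall figure in Table 2 to one decimal place.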
Table 2: Gross Utilization Rates by Regions and County Type†

                                                              USER Dataset                WORK Dataset
Variable            Definition                                Total Number   % Internet   Total Number   % Internet
                                                              of Responses   Use          of Responses   Use
a. Region
Region1             Northeast region of the U.S.                     4,713        53.8           3,184        39.3
Region2             Midwest                                          6,119        50.3           4,074        38.3
Region3             South                                            9,639        49.9           6,270        39.1
Region4             West                                             4,800        57.7           3,152        42.9
Overall                                                             25,271        52.2          16,680        39.7

b. County Type
Non-MSA County      County outside an MSA
                    (N=1,756, average pop=28,982)                    5,911        41.9           3,623        30.2
Single-County MSA   MSA comprised of a single county
                    (N=130, average pop=392,094)                     4,153        56.0           2,713        40.2
Central County      Central county of a multi-county MSA
                    (N=235, average pop=440,656)                     9,054        55.7           6,146        42.5
Suburban County     Non-central county within a multicounty MSA
                    (N=455, average pop=152,785)                     6,153        55.7           4,198        43.4
Overall             (N=2,576, average pop=106,730)                  25,271        52.2          16,680        39.7
† The USER dataset contains all usable responses to the survey. The WORK dataset is a subset of the
USER dataset consisting of all respondents who are employed. The binary variable USER is equal to 1 if a
respondent uses the Internet at home, at work, or both, and equal to 0 otherwise. Likewise, WORK=1 if a
respondent uses the Internet at work (including "at work and at home").
Source: Pew Internet & American Life Project Survey (March–December 2000), U.S. Bureau of Census
(Census, 2000).
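The footnote's definitions of the two outcome variables and of the WORK subset can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' processing: the `Respondent` fields and the helper names are hypothetical, and the five sample records are invented solely to exercise the logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Respondent:
    """One survey response; the field names are hypothetical."""
    employed: bool
    uses_at_home: bool
    uses_at_work: bool


def user_flag(r: Respondent) -> int:
    """USER = 1 if the respondent uses the Internet at home, at work, or both."""
    return int(r.uses_at_home or r.uses_at_work)


def work_flag(r: Respondent) -> int:
    """WORK = 1 if the respondent uses the Internet at work (with or without home use)."""
    return int(r.uses_at_work)


def work_dataset(respondents: List[Respondent]) -> List[Respondent]:
    """The WORK dataset keeps only employed respondents."""
    return [r for r in respondents if r.employed]


def utilization_rate(flags: List[int]) -> float:
    """Gross utilization rate in percent."""
    return 100.0 * sum(flags) / len(flags)


# Invented records, for illustration only.
sample = [
    Respondent(employed=True, uses_at_home=True, uses_at_work=True),
    Respondent(employed=True, uses_at_home=True, uses_at_work=False),
    Respondent(employed=True, uses_at_home=False, uses_at_work=False),
    Respondent(employed=False, uses_at_home=True, uses_at_work=False),
    Respondent(employed=False, uses_at_home=False, uses_at_work=False),
]

user_rate = utilization_rate([user_flag(r) for r in sample])
work_rate = utilization_rate([work_flag(r) for r in work_dataset(sample)])
print(user_rate, round(work_rate, 1))
```

Note that the two rates have different denominators: the USER rate is computed over all respondents, while the WORK rate is computed over employed respondents only, which is why the response totals in the two halves of Table 2 differ.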
Table 3: Percent Internet Utilization at Work by Region and Type of County

                                          Type of County
Region                   Non-MSA   Single-County   Central   Suburban   Overall
                         County    MSA             County    County
Northeast                   33.1            31.7      41.4       40.4      39.3
Midwest                     31.5            40.5      40.7       42.0      38.3
South                       27.5            37.7      43.3       45.0      39.1
West                        33.5            42.9      45.0       50.5      42.9
Total                       30.2            40.2      42.5       43.4      39.7
Number of Observations     3,623           2,713     6,146      4,198    16,680
Source: Pew Internet & American Life Project Survey (March–December 2000), U.S. Bureau of Census
(Census, 2000).
Table 4: Impact of Demographic and Spatial-Economic Factors
on Internet Utilization (Y = USER)†

Variable               Model 1     Model 2     Model 3     Model 4
Personal Characteristics
ED-Elem                0.16***     0.16***     0.17***     0.17***
ED-NoHS                0.36***     0.36***     0.37***     0.37***
ED-HS                  0.70        0.70*       0.71        0.71
ED-Tech/Voc            1.15        1.14        1.16        1.15
ED-SomeCol             2.06***     2.00***     2.04***     2.00***
ED-ColGrad             3.25***     3.15***     3.14***     3.09***
ED-PostGrad            4.79***     4.65***     4.64***     4.57***
INC-LT10               0.45***     0.46***     0.46***     0.46***
INC-10/20              0.51***     0.52***     0.52***     0.53***
INC-20/30              0.83**      0.84**      0.85**      0.85**
INC-30/40              1.15**      1.16**      1.16**      1.17**
INC-40/50              1.41***     1.42***     1.42***     1.42***
INC-50/75              2.06***     2.05***     2.03***     2.03***
INC-75/100             2.60***     2.57***     2.56***     2.55***
INC-100+               3.17***     3.10***     3.04***     3.03***
White                  0.94        0.97        0.99        1.00
Black                  0.53***     0.53***     0.54***     0.55***
Hispanic               0.60***     0.59***     0.60***     0.59***
Male                   0.95        0.96        0.95        0.96
Age                    0.92***     0.92***     0.92***     0.92***
AgeSquared             1.00***     1.00***     1.00***     1.00***
Parent                 0.99        1.00        1.00        1.00
Married                1.22***     1.24***     1.24***     1.25***
Employed               1.72***     1.72***     1.73***     1.74***
County Characteristics
SglCty-MSA                         1.40***     1.35***
CtrCty-MSA                         1.27***     1.26**
SubCty-MSA                         1.25***     1.37***
Population
Pop                                1.00        1.00        52.55***
Economic Structure
%Comms                                         1.07**      1.06*
%FIRE                                          1.01        1.02*
%ProfServs                                     1.03***     1.04***
%Educ/Hlth/Social                              1.00        1.00
%Retail                                        1.03***     1.02***
%Mfg                                           1.05***     1.04**
%OtherServices                                 1.01*       1.00
Interaction Effects
Pop*SglCty                                                 0.034***
Pop*CtlCty                                                 0.032***
Pop*SubCty                                                 0.028***
Pop*%Comms                                                 1.03
Pop*%FIRE                                                  0.98
Pop*%ProfServs                                             0.97**
Pop*%Ed/Hlth/Soc                                           0.99
Pop*%Retail                                                0.99
Pop*%Mfg                                                   0.99
Pop*%OtherServices                                         1.00

N = 25271
AIC                    25464.8     25420.1     25351.7     25319.4
SBIC                   25668.2     25656.8     25684.8     25685.6
% Concordant           83.0        83.1        83.2        83.3

*** p < .001   ** p < .01   * p < .05
† Figures reported are partial odds ratios; basis for significance tests is "H0: Partial Odds = 1".
Source: Pew Internet & American Life Project Survey (March–December 2000).
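Because Table 4 reports partial odds ratios rather than probabilities, it can help to see how an odds ratio shifts a probability. The sketch below is illustrative only: the 50 percent baseline is a hypothetical starting point, not an estimate from the paper, and `apply_odds_ratio` is an assumed helper name. The two ratios are the Model 4 values for ED-Elem and ED-PostGrad.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    shifted_odds = baseline_odds * odds_ratio
    return shifted_odds / (1.0 + shifted_odds)


# Hypothetical baseline: a respondent with even odds of using the Internet.
BASELINE = 0.50

# Model 4 partial odds ratios from Table 4 (Y = USER).
MODEL4_RATIOS = {"ED-Elem": 0.17, "ED-PostGrad": 4.57}

shifted = {name: apply_odds_ratio(BASELINE, ratio) for name, ratio in MODEL4_RATIOS.items()}
for name, prob in shifted.items():
    print(name, round(prob, 3))
```

An odds ratio of 1 leaves the probability unchanged, which is why the significance tests in the table are framed against a partial odds ratio of 1. The same ratio produces a smaller change in probability when the baseline is already close to 0 or 1.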
Figure 1: Partial Odds Ratio by Level of Education (Model 4)
[Chart of partial odds ratios (scale 0.00 to 5.00) by education category (Elem, NoHS, HS, Tech/Voc, SomeCol, ColGrad, PostGrad), with separate series for the USER Dataset and the WORK Dataset.]
Figure 2: Partial Odds Ratio by Household Income (Model 4)
[Chart of partial odds ratios (scale 0.00 to 5.00) by household income category (LT10K, 10K/20K, 20K/30K, 30K/40K, 40K/50K, 50K/75K, 75K/100K, 100K+), with separate series for the USER Dataset and the WORK Dataset.]
Figure 3: Probability of Internet Use by Age & Education (Y=User)
[Chart of probability (scale 0 to 0.9) against age (20 to 80), with separate curves labeled "Some College+" and "No College".]
Table 5: Impact of Demographic and Spatial-Economic Factors
on Internet Utilization (Y = WORK)†

Variable               Model 1     Model 2     Model 3     Model 4
Personal Characteristics
ED-Elem                0.13***     0.13***     0.14***     0.14***
ED-NoHS                0.20***     0.20***     0.22***     0.22***
ED-HS                  0.46**      0.46**      0.49*       0.49*
ED-Tech/Voc            0.78        0.78        0.82        0.83
ED-SomeCol             1.23        1.22        1.29        1.29
ED-ColGrad             2.39**      2.35**      2.44**      2.45**
ED-PostGrad            3.76***     3.69***     3.81***     3.83***
INC-LT10               0.42***     0.43***     0.42***     0.43***
INC-10/20              0.54***     0.55***     0.56***     0.55***
INC-20/30              0.78***     0.79**      0.80**      0.80**
INC-30/40              1.02        1.03        1.03        1.03
INC-40/50              1.15*       1.16*       1.17*       1.17*
INC-50/75              1.51***     1.51***     1.50***     1.50***
INC-75/100             2.00***     1.98***     1.98***     1.99***
INC-100+               2.46***     2.42***     2.35***     2.36***
White                  1.00        1.02        1.07        1.07
Black                  0.84*       0.84*       0.87        0.86
Hispanic               0.95        0.95        0.98        0.98
Male                   0.98        0.99        0.98        0.98
Age                    0.98***     0.98***     0.98***     0.98***
AgeSquared             1.00        1.00        1.00        1.00
Parent                 1.05        1.05        1.07        1.07
Married                0.90*       0.91*       0.92*       0.92
County Characteristics
SglCty-MSA                         1.25***     1.22*
CtrCty-MSA                         1.29***     1.09
SubCty-MSA                         1.28***     1.32*
Population
Pop                                0.99        0.97**      29.92**
Economic Structure
%Comms                                         1.09***     1.11***
%FIRE                                          1.01        1.02*
%ProfServs                                     1.03***     1.04***
%Educ/Hlth/Social                              0.99**      0.99*
%Retail                                        1.01        1.02*
%Mfg                                           1.02        1.02
%OtherServices                                 1.01        1.00
Interaction Effects
Pop*SglCty                                                 0.110*
Pop*CtlCty                                                 0.116
Pop*SubCty                                                 0.088*
Pop*%Comms                                                 0.98
Pop*%FIRE                                                  0.97*
Pop*%ProfServs                                             0.97**
Pop*%Ed/Hlth/Soc                                           1.00
Pop*%Retail                                                0.95***
Pop*%Mfg                                                   0.99
Pop*%OtherServices                                         1.00

N = 16680
AIC                    18910.0     18889.5     18809.6     18801.1
SBIC                   19095.3     19105.7     19056.7     19148.5
% Concordant           76.0        76.1        76.5        76.5

*** p < .001   ** p < .01   * p < .05
† Figures reported are partial odds ratios; basis for significance tests is "H0: Partial Odds = 1".
Source: Pew Internet & American Life Project Survey (March–December 2000).