The Determinants of Broadband Availability: Economics, Demographics, & State Policy

Kenneth Flamm
University of Texas at Austin
kflamm@mail.utexas.edu

Motivation
- Very preliminary work presented today
- FCC data on broadband entry now offers opportunity for longitudinal analysis relevant to major telecom policy issues
- Linking to multiple other data sets, have constructed rich data set; sophisticated models with greater range of explanatory variables now possible
- Extends and improves on early work of others, some new approaches to be outlined below
- New results, relevant to policy

Overview of FCC Data
- FCC identifies zip codes where at least 1 high-speed line installed
  - Estimates zip codes where no high-speed lines, to track penetration
  - FCC maps “point” zip codes to “geographic” zip codes
  - Result: remote areas with no regular mail service absorbed into zips with mail delivery
- Census maps remote areas with no regular mail service to post office of boxes/general delivery for remote residents
  - Maps geographic areas to “point” zips (actually ZCTAs)
  - 3245 areas with P.O. Box-only delivery zip codes, no conventional mail delivery in 2000
  - Census the only organization mapping zip codes to people

Implications for FCC-Census Match-up
- FCC BB numbers probably overestimate providers/zip in zips to which “point” zips are mapped
- FCC BB numbers for zips with ANY service probably about right
- “Point” Census zips not showing up on FCC list do NOT necessarily lack broadband service
- Confining analysis to “geographic” zips only probably best fix
  - Probably very few remote areas without mail service but with broadband that adjoin more populated areas with mail service but without BB
  - But understand that remote, sparsely populated rural zips underrepresented in resulting sample
- Issue important for geographic BB coverage, but no longer important for population BB coverage

Rapid Change in US Broadband Penetration, Competition over 4 Years
[Figure: Distribution of Zip Codes by # BB Providers. Axes: percent (0 to 50) by number of BB providers (0, 1-3, 4, 5, ..., 20). Series: Jun-00, Dec-00, Jun-03, Dec-03.]
Note: Census zips not showing up on FCC list credited with 0 BB providers; overestimates true zeros to unknown extent

>99% Population now has at least 1 provider in their zip code
[Figure: Pop-weighted Distribution of Zip Codes by # BB Providers. Axes: percent (0 to 50) by number of BB providers (0, 1-3, 4, 5, ..., 20). Series: % Pop 6/00, % Pop 12/03.]
Note: # providers may be overestimated in geographic zips to which “point” zips have been assigned by FCC

Economic models of broadband penetration
- L-R approach: firms enter markets to make profits
- Market characteristics:
  - Demand side: consumer socioeconomics, demographics
  - Supply/cost sides: technology, geography, regional cost factors
- Approach: estimate “reduced form”
  - “Solve” for number of firms that “fit” into market as function of characteristics
  - Price and quantity “solved for” as functions of exogenous variables, given N players in market and all above characteristics
- Simplest decision, for anyone to enter market, requires few assumptions: just ask whether a hypothetical monopolist would make a profit
- Much more complex decision if we ask how many enter
  - Need to assume oligopoly model
  - Need to deal with asymmetries among players

Ordered Choice Models
- The “natural” way to think about this decision
- Hypothetical monopoly profit > 0, enter; otherwise don’t
  - An unobserved “latent variable”, a function of market characteristics
  - Logit or probit a “natural” solution
- For number of entrants:
  - Profit of least profitable potential entrant > 0, enter
  - Next least profitable entrant ends up with profit < 0, they don’t enter; defines equilibrium
  - Construct function N* giving number of entrants that just makes marginal entrant profit = 0
  - Since N is integer, largest integer N <= N* defines number of entrants in equilibrium
  - N* is a latent variable that gives number of firms, falls below integer N “cut point”
  - Ordered logit or probit marginal the “natural” choice

Data Issues
- Have constructed zip code level longitudinal (2000-2003) panel from 7 sources:
  - FCC high-speed survey
  - FCC CLEC competition survey
  - 1997 Economic Census
  - 2000 Population & Housing Census
  - Universal Service Fund School and Library (“eRate”) and Rural Health Care funding commitments
  - Hydrographic, topological, land cover geophysical databases
  - Plus, various zip code databases
- The bad news: a lot of tedious work
  - Still not done!
  - Still fixing small issues in data
- The good news: a very rich data set

Current Research Road Map
- Any entrant at all (fewest assumptions)
  - Simple logit/probit (single years)
  - Correlated data model (panel data)
  - Bivariate logit/probit (use info on CLEC competition)
- Number of entrants (more assumptions)
  - Ordered logit/probit: fails proportional odds/parallel (||) lines test
  - Non-proportional ordered models:
    - Partial proportional odds
    - Continuation ratio
    - Generalized ordered logit
[Diagram labels, placement not recoverable: Today’s Talk, X]

Initial observations
- Functional form
  - Preliminary work showed logs for selected continuous variables worked marginally better
  - Little difference in coefficient signs, significance
- Years covered, terrain variables
  - Results for 2000 led to investigation of geophysical/terrain variables
  - 2000 known to have data collection problems, FCC revised
  - 2000 results qualitatively different from later years
  - 2000 dropped
- CLEC competition data
  - In principle, could be used to separate telephone competition from other elements of state BB policy
  - Simultaneity, identification issues
  - Stuck with “completely” reduced form

Econometric Approach
- Start with standard binary logit models for 12/00, 12/01, 12/02, 12/03
  - Any statistically significant variable (10% level) in any year goes into “interesting” pool
- Relax statistical assumptions
  - Exploit information in repeated observations on zips over time
  - Error term generalized to entire exponential family
  - Calculate robust standard errors
- Longitudinal panel data structure
  - Allow coefficients to vary over time
  - Allow for correlations in observations over time
- Generalized estimating equation (GEE) estimator
  - A cousin of generalized method of moments (GMM) estimator

What’s Statistically Significant, Inclusive GEE Model (all significant variables in any standard 2001-03 logit included, 10% level), 2001-02
Effect on BBand Penetration:
- Population density
- Geographic size of zip
- Mean CTI “Wetness” index (actually marginal at 11% level)
- MODIS land cover classification (=1, 4, 11, baseline urban)
- NAICS 31, 44, 54, 72, 81 Estabs
- Pct Pop on Farms
- Pct Pop 55-74
- Afro American (2001 only), Native American, Native Hawaiian Pop
- English is 2nd language
- Higher educational attainment
- Share Over 16 in Armed Forces, Civ Labor Force or Not in Labor Force
- Share pop working in education
- Per capita income
- Occupied housing density
- Share of houses occupied
- Share of housing indoor plumbing
- Share living in new building
- Share living in 50yr+ old building
- Share living in pre WWII building
- Average home age
- Average home value
Sign column as extracted (alignment with rows not recoverable): + + + + + + + + + + + + + + +

What’s Statistically Significant, Parsimonious GEE Model (only significant variables from inclusive models, 10% level), 2001-02
Effect on BBand Penetration:
- Population density
- Geographic size of zip
- Mean CTI “Wetness” index (actually marginal at 11% level)
- MODIS land cover classification (=1, 11, baseline urban)
- NAICS 31, 44, 54, 72, 81 Estabs
- Pct Pop on Farms
- Pct Pop 55-74
- Afro American (2001 only), Native American, Native Hawaiian Pop
- English is 2nd language
- Higher educational attainment
- Share Over 16 in Armed Forces, Civ Labor Force or Not in Labor Force
- Share pop working in education
- Per capita income
- Percent pop female
- Occupied housing density
- Share of houses occupied
- Share of housing indoor plumbing
- Share living in 50yr+ old building
- Share living in pre WWII building
- Average home age
- Average home value
Sign column as extracted (alignment with rows not recoverable): + + + + + + + + + + + + + +

Dogs that did not bark: small point estimates, not significant
- Numbers of households
  - For given pop density, zip size is scale variable
  - Similarity of coefficients: pop is scale
- Pop/housing unit
- Household size
- Age variables (except 55-74, over 16)
- eRate, rural health care grants

Role of State Policy/Effects
- Baseline was Texas, impact on BB
  - May not be best baseline, but many zip codes, active state subsidy program (TIF, 1996-2004, $1.5B), relatively competitive market
- Statistically significant + effects: CA, CO, FL, MD, MA, NY, NC, OR, TN
  - MD & TN increasing in 2002 relative to 2001
- Statistically significant - effects: IL, IN, IA, KS, MN, MO, NE, NV, ND, PA, SD, UT, VA, WI
  - IL, IA, KS increasing in 2002 relative to 2001
- Greater in 2001, parity in 2002: CT, ME
- Less in 2001, parity in 2002: HI, MI, WV

First Pass at an Ordered Model
- 4 levels: 0, 1-3, 4-7, 8+
- Score Test for the Proportional Odds Assumption

  Year   Chi-Square   DF    Pr > ChiSq
  2001   2406         218   <.0001
  2002   2665         218   <.0001
  2003   2716         218   <.0001

- More flexible model needed!

Possible endogeneity issues
- Industry and broadband availability
  - But pre-BB industrial mix (’97 econ census)
- e-Rate $ and broadband availability
  - But long lags for e-Rate apps-approvals-commitments-disbursements
- Similar issues for car ownership, home quality?

Conclusions
- Estimated state effects correlate with accounts
- Terrain effects significant in some parts of country
- Terrain effects exciting!
  - Instrumental variables for demand studies
- Income, pop density expected effects
- eRate irrelevant
- Industrial activity very significant (prof & technical services largest)
- Gender, education, farm location as expected
- Age effects generally not supported
- Digital divide ethnicity/gender show up, but small effects and decreasing
- Tests for proportional odds/parallel lines hypothesis critical in ordered logit/probit models
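
Illustrative sketch (not from the talk): the ordered-choice logic above, where a latent N* is compared against cut points and the number of entrants is the largest integer N <= N*, can be written out in a few lines. The function names, the cut-point values, and the index values below are all hypothetical and chosen only for illustration; the four categories follow the 0, 1-3, 4-7, 8+ grouping used in the first-pass ordered model. The sketch assumes a standard proportional odds ordered logit, which is the specification the score test rejected here.

```python
import math

# Category labels follow the talk's first-pass grouping of provider counts.
LEVELS = ["0", "1-3", "4-7", "8+"]


def logistic(z):
    """Numerically stable logistic CDF."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def equilibrium_entrants(n_star):
    """Largest integer N <= N*, floored at zero entrants."""
    return max(0, math.floor(n_star))


def ordered_logit_probs(index, cuts):
    """Category probabilities under a proportional odds ordered logit.

    index: latent index x'beta for one market (hypothetical value).
    cuts:  strictly increasing cut points; len(cuts) + 1 categories.
    P(Y <= j) = logistic(cut_j - index), so a change in the index shifts
    every cumulative log-odds by the same amount (the parallel lines
    restriction).
    """
    if any(b <= a for a, b in zip(cuts, cuts[1:])):
        raise ValueError("cut points must be strictly increasing")
    cumulative = [logistic(c - index) for c in cuts] + [1.0]
    probs, previous = [], 0.0
    for p in cumulative:
        probs.append(p - previous)
        previous = p
    return probs


if __name__ == "__main__":
    hypothetical_cuts = [-1.0, 0.5, 2.0]  # assumed, not estimated
    for hypothetical_index in (0.0, 1.5):
        probs = ordered_logit_probs(hypothetical_index, hypothetical_cuts)
        row = ", ".join(f"{lvl}: {p:.3f}" for lvl, p in zip(LEVELS, probs))
        print(f"index={hypothetical_index}: {row}")
```

Because the same index enters every cumulative probability, this form forces one coefficient vector across all cut points. The non-proportional alternatives on the road map (partial proportional odds, continuation ratio, generalized ordered logit) relax exactly that restriction.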