Using the Dun & Bradstreet (D&B) Database as a Sampling Frame for Company Surveys
Sarah Cotton, Anil Bamezai

U.S. National Computer Security Survey (NCSS)
• Sponsored by:
  – U.S. Department of Justice
  – U.S. Department of Homeland Security
• Focused on collecting data on the nature, extent, and consequences of computer security incidents
  – Goal of producing reliable national estimates of the incidence and prevalence of cybercrime against businesses, and of the resulting losses
• Fielded by the RAND Corporation in 2006

Study Goals
• Provide information about cybersecurity practices and the cost of cybersecurity breaches to the US economy
• Provide statistically meaningful contrasts of the above by broad industrial sector
• Representative sample of 36,000 companies across 36 industry sectors

Previous Effort in 2001
• In 2001, BJS and Census administered the pilot Computer Security Survey to 500 companies. Its “findings…illustrate[d] the feasibility and utility of a data collection program to be initiated in 2004 among some 36,000 businesses…Results of this test demonstrated a need for an increased response rate to produce valid national estimates and a need to refine survey questions.”

Study Goals
• We had to build on the resources available from the 2001 Pilot Survey
• We had limited ability to tweak the existing survey instruments
• We did, however, review the Pilot’s methods and assumptions, and we conducted cognitive interviews to assess how best to implement this survey

Establishment vs. Company
• Cybersecurity practices are generally set at the corporate level, so surveying at the establishment level was considered wasteful
• Census also conducted the 2001 Pilot Survey at the company level
• Census’s Business Register (BR) served as the frame for that survey, but the BR was not available to us

Types of Companies
• Three types of companies
  – Single-location companies (relatively straightforward)
  – Companies with many establishments and one company HQ, not in turn owned by another company (also relatively straightforward)
  – Companies with one or more establishments that are in turn owned by another company (very complicated)

Why Complicated?
• Company ownership patterns are in flux due to M&A activity
• Information security practices may vary across companies even when they are part of the same corporate tree
• Companies that are part of a complex corporate tree may belong to different industrial sectors
  – Our data suggest that approximately 20% of corporate trees span 2 or more industries

We Used D&B as Company Frame
• D&B provides one of the most comprehensive lists of companies operating in the US
• Provides a detailed description of company ownership patterns, which permits construction of complex corporate trees
• Includes SIC and NAICS codes to indicate each company’s primary activity
• Includes a variety of contact information

Surveying Corporate Trees
• Research during the cognitive testing stage suggested that IT practices vary across companies within the same tree
• Not clear whether the “ultimate” entity at the apex of a corporate tree could even report cybersecurity-related information for all subsidiaries below it
• Even if the apex entity could, the burden of breaking out information by subsidiary would likely be excessive and would lower response
• Breakdown by subsidiary was required, however, to permit industry-level estimates

Options Considered
• Send the survey first only to the apex entity in a corporate tree, then to each sampled subsidiary for which the apex was unable to report
  – This two-stage fielding strategy would have required more time and money
• Treat every company within a corporate tree as an independent sampling and reporting unit

Key Survey Design Decisions
• For complex corporate trees, a compromise was struck
  – We decided to treat each company within a tree as an independent sampling and reporting unit
  – However, while each respondent was instructed to report only for his or her own company, the instrument was modified to include space where the respondent could indicate whether this was the case or whether subsidiary data had also been included, and if so, for which subsidiaries
• Each company was assigned to one of 36 broad industrial sectors
• Several types of companies were sampled with certainty

Scale of the Complex Company Challenge
• Dealing with complex companies is a core issue in surveys of this type
• Companies from complex corporate trees comprise roughly 25% of our sample
• High overlap between complex and certainty companies: 85% of complex companies were sampled with certainty
• Certainty companies comprise 32% of total frame employment, but 85% of total sample employment

Challenges Faced in Using D&B File (1)
• Contact information for top leadership was often repeated among companies within a corporate tree
  – 12% had the same address
  – 10% had the same telephone number
  – 8% had the same contact name
• High CSO/CTO turnover
• Data for the smallest companies were also problematic because of known high business turnover

Challenges Faced in Using D&B File (2)
• Obtaining clean contact information required more effort than anticipated (pre-calling, web lookup)
• Wave I Pre-Calling:
  – 2% confirmed company name and address
  – 29% updated their name and/or address
  – 2.5% were flagged for supervisor review
  – Relatively inefficient: ~12 min per case
• Wave II Web Lookup:
  – 67% of company names and addresses confirmed
  – 29% of names and/or addresses updated
  – % flagged for supervisor review
  – Very efficient: less than 1 min per case

Lessons for Future Studies
• Two-stage fielding may generate a higher response rate and cleaner data, but it will also cost more and lengthen the fielding period
• Aiming for a smaller sample, with more intensive prescreening and follow-up spread over a longer surveying window, may represent a better allocation of resources
• A considerable budget needs to be allocated for updating contact information in the D&B file and for identifying companies that have closed
• The smallest firms (fewer than 10 employees) may best be omitted from the frame
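
Illustration: Grouping Frame Records into Corporate Trees
The corporate-tree and contact-duplication issues described in the earlier slides can be made concrete with a minimal Python sketch. It assumes a frame file in which each company record carries a link to its parent company. The field names (id, parent, sector, phone) and the sample records are hypothetical and are not actual D&B fields or NCSS data. The sketch groups companies under their apex entity, then flags trees that span more than one sector and trees in which a telephone number is shared. Each company remains its own sampling and reporting unit, as in the design described above.

```python
from collections import defaultdict

# Hypothetical records; field names are illustrative, not actual D&B fields.
companies = [
    {"id": "A", "parent": None, "sector": "Finance", "phone": "555-0100"},
    {"id": "B", "parent": "A", "sector": "Finance", "phone": "555-0100"},
    {"id": "C", "parent": "B", "sector": "Retail", "phone": "555-0199"},
    {"id": "D", "parent": None, "sector": "Retail", "phone": "555-0142"},
]


def find_apex(company_id, parent_of):
    """Follow parent links upward until reaching a company with no known parent."""
    seen = set()
    while parent_of.get(company_id) is not None:
        if company_id in seen:
            raise ValueError(f"ownership cycle detected at {company_id}")
        seen.add(company_id)
        company_id = parent_of[company_id]
    return company_id


def build_trees(records):
    """Group company records by the apex entity of their corporate tree."""
    parent_of = {r["id"]: r["parent"] for r in records}
    trees = defaultdict(list)
    for r in records:
        trees[find_apex(r["id"], parent_of)].append(r)
    return dict(trees)


def summarize(trees):
    """Flag multi-company trees, multi-sector trees, and shared phone numbers."""
    summary = {}
    for apex, members in trees.items():
        sectors = {m["sector"] for m in members}
        phone_counts = defaultdict(int)
        for m in members:
            phone_counts[m["phone"]] += 1
        summary[apex] = {
            "size": len(members),
            "complex": len(members) > 1,
            "multi_sector": len(sectors) > 1,
            "shared_phones": sorted(p for p, n in phone_counts.items() if n > 1),
        }
    return summary


trees = build_trees(companies)
summary = summarize(trees)
for apex, info in sorted(summary.items()):
    print(apex, info)
```

A real frame would need additional handling that this sketch omits, such as parent identifiers that point to entities outside the file (treated here as if the company were its own apex) and ownership links that change during the fielding period.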