18th International Roundtable on Business Survey Frames
Beijing, China, 18 – 22 October 2004
Session No 4

Edward D. Walker and Ron S. Jarmin
U.S. Census Bureau

Implementing a Major Revision to the Industry Classification System [1]

1 Introduction

1.1 In 1997, the United States adopted the North American Industry Classification System (NAICS) as its new standard for categorizing business establishments by primary activity and for collecting, compiling, disseminating, and comparing industry statistics. Developed in collaboration with North American partners Canada and Mexico, [2] NAICS represented a major structural and conceptual departure from its U.S. predecessor, the Standard Industrial Classification (SIC). Therefore, successful implementation of this largest-ever change in U.S. industry classification required the Census Bureau’s Business Register, [3] its data sources, and client statistical programs to employ extraordinary measures. Those measures included adaptations in administrative records systems, coding schemes that permitted simultaneous classification according to old and new systems, preparatory reclassification surveys, new or expanded inquiries on economic census collections, modified industry coding processes and procedures, and last-resort modeling techniques. Additionally, statistical products eased the transition for data users by providing special data presentations and retrospective NAICS-based estimates for critical time series. This paper describes those measures and concludes with a summary of lessons learned.

2 A Comparison of NAICS and SIC

2.1 Earlier presentations have described the development of NAICS, circumstances that prompted this landmark change in U.S. industrial classifications, the collaborative effort with partners Canada and Mexico, and the conceptual framework upon which NAICS was built. Ambler (1998) and the U.S. Office of Management and Budget (1998, 2002) provide excellent summaries on these subjects.
Further, the NAICS pages of the Census Bureau’s website have links to a series of papers describing the principles that guided the design of NAICS and to U.S. Federal Register notices that presented NAICS proposals for public comment. The present paper will not repeat the ample background material on NAICS development; instead, it will highlight a few key differences between the SIC and NAICS that affected the nature and magnitude of the NAICS implementation effort.

[1] This paper reports the results of research and analysis undertaken by Census Bureau staff. It has undergone a Census Bureau review more limited in scope than that given to official Census Bureau publications. This report is released to inform interested parties of ongoing research and to encourage discussion of work in progress.
[2] Principals in the three-nation collaboration were Statistics Canada; Mexico’s Instituto Nacional de Estadística, Geografía e Informática (INEGI); and the U.S. Office of Management and Budget through its Economic Classification Policy Committee.
[3] The Business Register formerly was known as the Standard Statistical Establishment List or SSEL.

2.2 The Census Bureau’s statistical programs have been through several SIC revisions since that classification system was introduced near the end of the 1930s. [4] Those revisions, however, have been evolutionary, leaving the foundation and structure of the SIC intact. In contrast, the 1997 transition to NAICS was revolutionary.

Table 1. Structural Comparison of NAICS and SIC.
OLD: U.S. Standard Industrial Classification, 1987. NEW: North American Industry Classification System, 1997.

SIC Division [5] | SIC Major Groups | SIC Title | No. of SIC Industries | NAICS Sector [5] | NAICS Title | New U.S. Industries [6] | Total U.S. Industries
A | 01-09 | Agriculture, Forestry, and Fishing | 58 | 11 | Agriculture, Forestry, Fishing and Hunting | 20 | 64
B | 10-14 | Mining | 31 | 21 | Mining | 0 | 29
C | 15-17 | Construction | 26 | 23 | Construction | 3 | 28
D | 20-39 | Manufacturing | 459 | 31-33 | Manufacturing | 79 | 474
E | 40-49 | Transportation, Communications and Public Utilities | 67 | 22 | Utilities | 6 | 10
  |  |  |  | 48-49 | Transportation and Warehousing | 28 | 57
F | 50-51 | Wholesale Trade | 69 | 42 | Wholesale Trade | 0 | 69
H | 60-67 | Finance, Insurance, and Real Estate | 53 | 52 | Finance and Insurance | 23 | 42
  |  |  |  | 53 | Real Estate and Rental and Leasing | 15 | 24
G | 52-59 | Retail Trade | 64 | 44-45 | Retail Trade | 17 | 72
  |  |  |  | 72 | Accommodation and Food Services | 10 | 15
I | 70-89 | Services | 150 | 51 | Information | 20 | 34
  |  |  |  | 54 | Professional, Scientific, and Technical Services | 28 | 47
  |  |  |  | 56 | Administrative and Support and Waste Management and Remediation Services | 29 | 43
  |  |  |  | 61 | Educational Services | 12 | 17
  |  |  |  | 62 | Health Care and Social Assistance | 27 | 39
  |  |  |  | 71 | Arts, Entertainment, and Recreation | 19 | 25
  |  |  |  | 81 | Other Services (except Public Administration) | 19 | 49
  |  | [No SIC counterpart; classified in the divisions of establishments managed.] |  | 55 | Management of Companies and Enterprises | 1 | 3
J | 91-97 | Public Administration | 27 | 92 | Public Administration | 2 | 29
K | 99 | Unclassifiable | – |  | [No NAICS counterpart.] | – | –
Total |  |  | 1,004 | Total |  | 358 | 1,170

[4] The SIC was revised most recently for 1987.
[5] Table 1 reorders SIC Divisions and NAICS Sectors slightly to improve alignment for side-by-side comparison.
[6] “New” industries are those defined separately for the first time by NAICS 1997. The counts shown allocate new industries to NAICS Sectors.

- First, NAICS introduced industrial classifications that are based on a consistent production-oriented (supply-side) concept: specifically, NAICS industries group economic activities that use like processes to produce goods or services. This contrasts with the SIC, which was an ad hoc mixture of production-oriented and market-oriented (demand-side) industries. [7] The distinction between retail trade and wholesale trade illustrates how this difference in underlying concept affects classifications.
According to NAICS, the principal differences between these sectors are defined in terms of production processes: establishments classified in retail trade are open to the public, attract customers through location and advertising, and generally display merchandise; on the other hand, establishments classified in wholesale trade usually operate from a warehouse or office, display little or no merchandise, and do not advertise to the general public. But according to the SIC, the distinction generally was based on class of customer: under the old system, retail establishments sold mostly to household consumers and individuals, whereas wholesale establishments sold mostly to industrial, commercial, professional, institutional, farm (for production), and government customers.

- NAICS also introduced a completely new system of industry codes. As shown by Table 1, above, NAICS took a clean-slate approach, so its code assignments bear no similarity to those of the SIC, even at the highest level. Moreover, NAICS codes differ in length and structure from SIC codes. In NAICS, top-level sectors have 2-digit codes, second-level subsectors have 3-digit codes, third-level industry groups have 4-digit codes, and NAICS industries (the level intended for North American comparability) have 5-digit codes; additionally, NAICS permits each of the North American participating countries to identify more detailed national industries using 6-digit codes. Beyond that, the Census Bureau supplements the NAICS code for U.S. industries with a 2-digit subindustry suffix for internal use only (e.g., to identify emerging industries), giving an extended processing code with a total of 8 digits. This contrasts with the SIC, where top-level divisions had alphabetic codes A – J, second-level major groups had 2-digit codes, third-level industry groups had 3-digit codes, and industries had 4-digit codes.
Here again, the Census Bureau supplemented the basic SIC industry code with a 2-digit subindustry suffix for internal use, giving an extended processing code with a total of 6 digits.

- NAICS and the SIC are further separated by the fundamental structural differences that Table 1 also illustrates. As shown there, NAICS has 20 top-level sectors, whereas the SIC had only 10 top-level divisions (plus a final class for unclassifiable establishments). The 10 sectors added by NAICS represent a disaggregation and reorganization of top-level classes for services-producing industries. Most notably, SIC Division I–Services was subdivided, with its major components going to: NAICS Sector 51–Information; Sector 54–Professional, Scientific, and Technical Services; Sector 56–Administrative and Support and Waste Management and Remediation Services; Sector 61–Educational Services; Sector 62–Health Care and Social Assistance; Sector 71–Arts, Entertainment, and Recreation; Sector 81–Other Services (except Public Administration); and the Accommodation subsector of Sector 72–Accommodation and Food Services.

- Finally, Table 1 shows that NAICS has 1,170 industries overall, and the SIC had 1,004 industries, giving NAICS a net gain of 166. [8] Contributing to this net gain, NAICS has 358 industries [8] that are defined separately for the first time, some 70 percent of which are in the services-producing sectors. Further, NAICS makes substantial revisions to the definitions of some 330 industries [8] compared to SIC counterparts. The NAICS pages of the Census Bureau’s website present comprehensive NAICS-to-SIC and SIC-to-NAICS concordance tables that detail the changes introduced by NAICS.

[7] Market-oriented industry classifications group economic activities based on their production of goods or services that have similarities in use, belong together, or are used together for some purpose.

3 Implementation Strategies

3.1 Mesenbourg (1996) describes U.S.
plans for NAICS implementation as they stood in the latter part of 1996. Generally, the Census Bureau followed the broad outlines of that account, particularly with respect to schedules for phased implementation by administrative records suppliers, other U.S. statistical agencies, and the Census Bureau’s statistical programs. However, it was necessary to revise some of the collection and processing particulars as the funding picture became firmer and as we determined requirements in more detail.

3.2 The Census Bureau’s goals for NAICS implementation were very similar to those outlined by MacDonald (1995). First, we planned to incorporate the new classification system into the Business Register during the course of a single reference period, and we chose the first practical period following the completion and approval of NAICS, i.e., calendar year 1997. Second, we strove to make the transition as complete and accurate as possible. Many of the agency’s economic programs, including sensitive business cycle indicator surveys, rely on the Business Register for sampling frames, and their movement to NAICS depended on clean, high-quality implementation by the register. Third, we needed to minimize cost. We made the change to NAICS during a time of very tight Federal budgets, and our initiatives to secure additional appropriations for the purpose of supporting this effort generally were underfunded or unfunded. Fourth, we needed to minimize response burden on businesses. The U.S. Office of Management and Budget has oversight authority under the U.S. Paperwork Reduction Act and limits the number of burden hours that statistical collections can impose; more important, we do not want to undermine the goodwill that contributes to prompt and accurate response by most businesses. Finally, we thought it was important to ease the transition for data users.
The development of NAICS responded to changes in the economy and evolving user needs by introducing the first fundamental reorganization of U.S. industrial classifications in nearly 60 years. As a result, significant breaks in statistical time series were inevitable, so we made special efforts to assist data users in bridging those breaks. In particular, we published data for the 1997 reference period on both NAICS and SIC bases, we prepared special data presentations that showed the SIC composition of NAICS industries and vice versa, and we have been developing retrospective NAICS-based estimates for critical time series.

[8] These figures are for NAICS 1997. There is a NAICS 2002 revision that adds U.S. industries and brings the total to 1,179.

3.3 A central element in the Census Bureau’s NAICS implementation strategy was its reliance on the 1997 Economic Census as the principal vehicle for collecting information needed to assign new classifications and for introducing NAICS-based industry statistics. This approach followed a tradition that has proved the economic census to be the agency’s most effective means for implementing revisions to the industrial classification system. This is true for at least four reasons. First, the census is our most broad-based and comprehensive collection; it is a complete enumeration of the establishments in covered industries. [9] Second, the economic census and the Business Register are tightly integrated; census data flow through the register and are an important source of updates. Third, the economic census collects a rich variety of specialized content, including attributes that are important for assigning accurate industrial classifications; it provides far more data of this type than the Census Bureau’s other economic collections.
And finally, the economic census produces the Census Bureau’s most extensive and detailed industry statistics, which are optimal for introducing an updated picture of the economy according to a new classification system.

3.4 A second important element in the Census Bureau’s implementation strategy was a classification scheme that permitted us to identify each establishment’s industry according to both the new and the old systems for the 1997 reference period. Such an approach was essential to tabulating economic census industry statistics on both NAICS and SIC bases and to preparing other data presentations that would ease the transition for data users. To develop such a classification scheme, we first needed to establish a set of industry/subindustry classes that could be translated unambiguously to discrete NAICS and SIC industries. Generally, this required us to identify the components of each SIC industry that translated to distinct NAICS industries and the components of each NAICS industry that translated to distinct SIC industries. The result was a set of transitional classifications that had nearly twice as many industry/subindustry classes as were present in either NAICS or the SIC. Second, we needed a coding scheme to represent the transitional classifications. At least two approaches were possible here: we could have maintained two codes for each establishment, one NAICS-based and the other SIC-based, or we could have maintained a single hybrid code for each establishment, adding subindustry detail wherever needed to permit derivation of the corresponding NAICS and SIC industries.
The Census Bureau chose the latter course because we had used it successfully for previous revisions to the industrial classification system and because the Business Register database in place at the time did not make it easy to store a second industry code for each establishment. [10] Further, the transitional codes used during NAICS implementation had a SIC-based root because old classifications were the starting point for the transition to NAICS; because the Census Bureau’s annual, quarterly, and monthly economic surveys required SIC-based classifications from the register until they made their transitions to NAICS between 1998 and 2001; [11] and because existing economic census collection and processing systems were set up to use extended 6-digit SIC-based processing codes that could accommodate the required subindustry detail. Since these modified SIC codes served to bridge the gap between NAICS and SIC, we referred to them as “bridge SICs,” the term used for them in the discussion that follows.

[9] Economic census coverage excludes: NAICS Sector 11–Agriculture, Forestry, Fishing and Hunting (the U.S. Agriculture Department’s National Agricultural Statistics Service conducts a separate Census of Agriculture that covers NAICS 111–Crop Production and NAICS 112–Animal Production); NAICS 481111 Part–Certificated Passenger Air Carriers; NAICS 4821–Rail Transportation; the U.S. Postal Service (in NAICS 491–Postal Service); NAICS 525 except 52593–Funds, Trusts, and Other Financial Vehicles except Real Estate Investment Trusts (REITs; the census does cover REITs); NAICS 6111–Elementary and Secondary Schools; NAICS 6112–Junior Colleges; NAICS 6113–Colleges, Universities, and Professional Schools; NAICS 8131–Religious Organizations; NAICS 81393–Labor Unions and Similar Labor Organizations; NAICS 81394–Political Organizations; NAICS 814–Private Households; and NAICS Sector 92–Public Administration. These industries are out-of-scope to most of the Census Bureau’s other economic programs as well. The Business Register has tax reporting units for businesses and organizations that operate exclusively in out-of-scope industries but generally does not identify establishments associated with those tax reporting units.
[10] We have since completed a Business Register redesign, and the new database has locations for storing both old and new industry codes during a revision to the industrial classification system.
[11] The Business Register maintained bridge SIC codes (which translate directly to NAICS codes) through the end of its 2001 reference period and replaced them with true NAICS codes for 2002 and subsequent reference periods. This change coincided with implementation of a redesigned Business Register database.

Table 2a. An Illustration of Relationships between SIC and NAICS Industries/Subindustries and the Use of Transitional Bridge SICs
OLD: Standard Industrial Classification, 1987. NEW: North American Industry Classification System, 1997.
All rows belong to NAICS Industry Group 5413, Architectural, Engineering, and Related Services.

SIC Code | SIC Title | Bridge SIC Code | NAICS Code | NAICS Title
8712 | Architectural Services | 871200 | 541310 | Architectural Services
0781 (Part) | Landscape Counseling and Planning (except Horticultural Consulting) | 078120 | 541320 | Landscape Architectural Services
8748 (Part) | Business Consulting Services, NEC (Urban Planners and Industrial Development Organizations) | 874820 | 541320 | Landscape Architectural Services
8711 | Engineering Services | 871100 | 541330 | Engineering Services
7389 (Part) | Business Services, NEC (Drafting Services) | 738913 | 541340 | Drafting Services
7389 (Part) | Business Services, NEC (Home and Building Inspection Services) | 738912 | 541350 | Building Inspection Services
8713 (Part) | Surveying Services (Geophysical Surveying) | 871320 | 541360 | Geophysical Surveying and Mapping Services
1081 (Part) | Metal Mining Services (Geophysical Surveying and Mapping) | 108120 | 541360 | Geophysical Surveying and Mapping Services
1382 (Part) | Oil and Gas Field Exploration Services (Geophysical Surveying and Mapping) | 138220 | 541360 | Geophysical Surveying and Mapping Services
1481 (Part) | Nonmetallic Minerals Services, Except Fuels (Geophysical Surveying and Mapping) | 148120 | 541360 | Geophysical Surveying and Mapping Services
7389 (Part) | Business Services, NEC (Map Making Services) | 738909 | 541370 | Surveying and Mapping (except Geophysical) Services
8713 (Part) | Surveying Services (except Geophysical Surveying) | 871310 | 541370 | Surveying and Mapping (except Geophysical) Services
8734 (Part) | Testing Laboratories (except Veterinary Testing Laboratories) | 873410 | 541380 | Testing Laboratories

Table 2b. An Illustration of Relationships between SIC and NAICS Industries/Subindustries and the Use of Transitional Bridge SICs
OLD: Standard Industrial Classification, 1987. NEW: North American Industry Classification System, 1997.
All rows belong to SIC 1081, Metal Mining Services.

SIC Code | SIC Title | Bridge SIC Code | NAICS Code | NAICS Title
1081 (Part) | Metal Mining Services (Except Geophysical Surveying Services) | 108110 | 213114 | Support Activities for Metal Mining
1081 (Part) | Metal Mining Services (Geophysical Surveying Services) | 108120 | 541360 | Geophysical Surveying and Mapping Services

3.5 Table 2a, above, uses NAICS Industry Group 5413–Architectural, Engineering, and Related Services to illustrate the relationships between SIC and NAICS industries/subindustries and the use of bridge SIC codes to provide an unambiguous mapping of those relationships. Table 2b shows similar relationships, this time from the complementary perspective of a whole SIC industry that sent one of its parts to a NAICS industry represented in Table 2a, namely NAICS 541360–Geophysical Surveying and Mapping Services.
Together, these tables illustrate important characteristics of the bridge SIC scheme and associated codes used in the Census Bureau’s NAICS implementation:

- Many SIC industries were divided into subindustry classes, as identified by the “(Part)” designation in the first column of each table, and these parts went to different NAICS industries. For example, Table 2a shows three parts that came from SIC 7389–Business Services, NEC [12] (in fact, SIC 7389 was subdivided into 36 parts; the 33 parts not shown in Table 2a went to NAICS industries outside NAICS Industry Group 5413); also, Table 2b shows two parts that made up SIC 1081–Metal Mining Services.
- Many NAICS industries also consisted of subindustry classes, and these parts came from different SIC industries. For example, Table 2a shows that NAICS 541360–Geophysical Surveying and Mapping Services had four such parts, one for each of its SIC subindustry components.
- In some cases, parts of a SIC industry went to NAICS industries in different sectors. For example, Table 2b shows that SIC 1081–Metal Mining Services had parts going to both NAICS Sector 21–Mining and NAICS Sector 54–Professional, Scientific, and Technical Services.
- In some cases, parts of a NAICS industry came from SIC industries in different divisions. For example, Table 2a shows that NAICS 541360–Geophysical Surveying and Mapping Services had one part that came from SIC Division I–Services and three parts that came from SIC Division B–Mining.
- More generally, it is possible to find examples where the relationship between SIC and NAICS industries was 1:1, 1:many, many:1, or many:many. Therefore, the bridge SIC scheme needed to provide unambiguous translation between NAICS and SIC for all these possibilities.
- Bridge SIC codes shown in the third column of each table had a total of 6 digits. They consisted of a 4-digit root that identified the base 1987 SIC industry and, where applicable, a 2-digit subindustry suffix that uniquely identified parts going to different NAICS industries (where there were no such subindustry parts, the 2-digit suffix was ‘00’; in that case, the entire SIC industry went to a single NAICS industry). Hence, for each industry/subindustry class identified by the bridge SIC scheme, it was always possible to identify the originating SIC industry by examining the code’s 4-digit root and the corresponding NAICS industry by referring the full 6-digit code to a concordance table that converted it to one and only one classification under the new system. For example, Table 2a shows that bridge SIC 874820 translated to SIC 8748–Business Consulting Services, NEC and to NAICS 541320–Landscape Architectural Services.

[12] NEC – Not Elsewhere Classified

3.6 Extending the SIC-to-NAICS relationships shown by the foregoing illustrations to the full breadth of the bridge SIC classification scheme, there were:

- 614 SIC industries that translated to a single NAICS industry (subindustry suffix = ‘00’); establishments in these industries could be converted directly to the corresponding NAICS industry, and no collection of reclassification information was required.
- 390 SIC industries that consisted of two subindustry parts or more, each of which went to a different NAICS industry; these industries required collection of reclassification information to determine subindustry parts that went to different NAICS industries.
- 1,357 subindustry parts, total, across the 390 SIC industries described just above (all these subindustry parts had a subindustry suffix ≠ ‘00’).
- 80 SIC industries of the 390 described above that had subindustry parts going to NAICS industries in different sectors.
- 401 NAICS industries that had subindustry parts coming from two SIC industries or more.
- 1,971 bridge SIC classes, total.
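
The bridge SIC mechanics described in 3.5 and the counts listed in 3.6 can be sketched in a few lines of Python. This is only an illustration, not the Census Bureau's processing system: the lookup table holds just the bridge SIC and NAICS pairs printed in Tables 2a and 2b, and the names BRIDGE_TO_NAICS, BridgeClass, translate, sic_parts_of, and bridge_class_total are hypothetical.

```python
from typing import NamedTuple

# Bridge SIC -> NAICS pairs copied from Tables 2a and 2b. This is only an
# illustrative excerpt; the full scheme described above had 1,971 classes.
BRIDGE_TO_NAICS = {
    "871200": "541310", "078120": "541320", "874820": "541320",
    "871100": "541330", "738913": "541340", "738912": "541350",
    "871320": "541360", "108120": "541360", "138220": "541360",
    "148120": "541360", "738909": "541370", "871310": "541370",
    "873410": "541380", "108110": "213114",
}


class BridgeClass(NamedTuple):
    sic: str              # 4-digit root: the originating 1987 SIC industry
    suffix: str           # 2-digit subindustry suffix ("00" means no split)
    naics: str            # the single NAICS industry this class maps to
    whole_industry: bool  # True when the entire SIC industry went to one NAICS industry


def translate(bridge_sic: str) -> BridgeClass:
    """Split a 6-digit bridge SIC code and look up its NAICS industry."""
    if len(bridge_sic) != 6 or not bridge_sic.isdigit():
        raise ValueError(f"expected a 6-digit bridge SIC code, got {bridge_sic!r}")
    root, suffix = bridge_sic[:4], bridge_sic[4:]
    naics = BRIDGE_TO_NAICS[bridge_sic]  # KeyError if the code is not in this excerpt
    return BridgeClass(root, suffix, naics, suffix == "00")


def sic_parts_of(naics: str) -> list[str]:
    """Return the bridge SIC classes in this excerpt that feed one NAICS industry."""
    return sorted(code for code, target in BRIDGE_TO_NAICS.items() if target == naics)


# Counts reported in paragraph 3.6.
WHOLE_SIC_INDUSTRIES = 614   # suffix '00': converted directly
SPLIT_SIC_INDUSTRIES = 390   # two or more subindustry parts
SUBINDUSTRY_PARTS = 1_357    # parts across the 390 split industries


def bridge_class_total() -> int:
    """Each whole industry is one class; each subindustry part is one class."""
    return WHOLE_SIC_INDUSTRIES + SUBINDUSTRY_PARTS


if __name__ == "__main__":
    print(translate("874820").sic, translate("874820").naics)  # 8748 541320
    print(sic_parts_of("541360"))           # the four parts shown in Table 2a
    print(WHOLE_SIC_INDUSTRIES + SPLIT_SIC_INDUSTRIES)  # 1004, the SIC total in Table 1
    print(bridge_class_total())             # 1971
```

Keeping a single hybrid code means the SIC industry falls out of the code itself (its first four digits), while the NAICS industry needs one table lookup; the two-code alternative mentioned in 3.4 would instead store both values on every establishment record.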
4 Administrative Records

4.1 The Census Bureau relies on the administrative records of other Federal agencies as low-cost, low-burden, high-quality sources of information for maintaining the Business Register and for representing smaller single-establishment enterprises in the economic census. These records are important for identifying new businesses and for providing a stream of updated size and activity measures for continuing businesses. Additionally, administrative records are an important source of industrial classifications, particularly for new businesses and for the smaller single-establishment businesses that do not receive an inquiry from the economic census.

4.2 For the most part, industrial classifications from administrative records did not play a significant role in the Business Register’s implementation of NAICS because supplying agencies’ introduction of the new classification system lagged the Census Bureau’s by a year or more. As a result, the Census Bureau needed to rely almost entirely on classifications derived from its own collections, at least through the 1997 reference period when NAICS implementation occurred. At the same time, it was important for supplying agencies to make the transition as soon as possible because Business Register operations for 1998 and later needed reliable administrative sources of NAICS classifications.

4.3 The U.S. Treasury Department’s Internal Revenue Service (IRS) led the way by introducing a NAICS-based “Principal Business Activity” inquiry on business income tax forms for tax year 1998. This timing meant that the Census Bureau started receiving NAICS classifications from this source in February of 1999. On the positive side, the IRS is by far the largest supplier of administrative records used in Business Register maintenance, so their prompt implementation of NAICS was important.
Less favorably, the IRS’ business income tax returns use a self-coding method by which the taxpayer is referred to a list of short industry descriptions in an instruction booklet (the IRS Forms and Publications web page noted in References at the end of this paper has examples for tax year 2003), is instructed to select the description that best fits the business covered by the tax return, and is asked to record the corresponding code on the tax form. Because it is impossible for the instruction booklet to list descriptions for all 1,170 U.S. NAICS industries and for the taxpayer to locate the best fit from such a large array, the tax forms’ approach is to list a selected group of 300 – 425 NAICS categories, depending on tax form type. The majority of those categories are 4-digit NAICS Industry Groups and, less often, 3-digit NAICS Subsectors; only a relatively small number of 6-digit U.S. NAICS industries appear for activities that occur with relatively high frequency among businesses that use a particular tax form type. As a result of the limitations associated with this self-coding approach, NAICS codes supplied by the IRS often have less than the 6-digit industry detail that the Business Register requires, and the codes sometimes are missing; also, earlier evaluations by the Census Bureau have suggested that taxpayer self-coding (SIC-based) is less accurate than other approaches.

4.4 The Social Security Administration (SSA) also supplies industrial classifications to the Census Bureau. Important primarily for recent entries, these classifications are based on information that businesses report when they file IRS Form SS-4, “Application for Employer Identification Number” [13] (the IRS Forms and Publications web page has an example).
To promote timely NAICS implementation by this key supplier, the Census Bureau licensed a computer-assisted NAICS coding tool known as the Coding Assist and Ruling Request System [14] from Statistics Canada and made it available for use by the staff at the SSA. This made it possible for the agency to make a more timely transition; to begin assigning NAICS codes to Forms SS-4 received after January 1, 1999; and to deliver the first such codes to the Census Bureau during the second quarter of that year. Although this was too late to be helpful for our 1997 NAICS implementation, the SSA’s move to the new classification system has been important for continuing operations. Unlike the industry codes obtained from the IRS’ business income tax returns, those assigned by the SSA staff represent 6-digit U.S. NAICS industries almost exclusively; further, the quality generally is quite high.

4.5 Finally, the U.S. Labor Department’s Bureau of Labor Statistics (BLS) supplies industrial classifications from their Longitudinal Establishment Database, which is based on the administrative records and supplementary statistical collections of the joint Federal/State unemployment insurance system. NAICS implementation for this business list was carried out in phases that spanned three years. The vehicle for collecting required classification information was the program’s Annual Refiling Survey. This regular collection usually divides the universe of business establishments into three annual panels, approximately equal in size, and canvasses the panels in a three-year rotation to verify/update industrial classifications.
The timetable for NAICS implementation in the BLS’ business list was as follows: (i) beginning October 1997, the program used an automated process to recode establishments that could be converted directly from SIC to NAICS; (ii) beginning October 1998, the program directed an Annual Refiling Survey collection to establishments with 50 employees or more, selected smaller establishments, and unclassified establishments, together accounting for about half of those that had not been converted directly; and (iii) beginning October 1999, the agency directed an Annual Refiling Survey collection to all remaining establishments that did not yet have a NAICS code assigned and to auxiliary establishments. Each collection included up to three nonresponse follow-up contacts in an effort to produce high response rates. At the conclusion of all three phases, about 72 percent of the establishments on the BLS’ business list had been converted successfully, and a modeling procedure was used to impute NAICS classifications for the balance (see Mikkelson et al., 2000, for details). Results of each phase in the BLS’ NAICS implementation updated their business list early in the third quarter of the following year, and the newly assigned NAICS codes then became available to the Census Bureau during the regular quarterly exchange of classification information scheduled for October. Hence, NAICS classifications were available for some establishments on the BLS business list as early as October 1998 and for all establishments on the list by October 2000. Here again, the NAICS codes represent 6-digit U.S. NAICS industries, and the quality is quite high.

[13] The Employer Identification Number is the Federal taxpayer identification number for most businesses and other organizations (sole proprietorships excepted); some other administrative records systems also incorporate this identifier.
[14] Since superseded by the Industry Classification Coding System (ICCS).

5 Preparatory Reclassification Collections

5.1 The Census Bureau conducted pre-census reclassification or “refiling” collections to help prepare for the transition to NAICS. Our focus was establishments in the 80 SIC industries that had parts going to NAICS industries in different sectors. Returning to the illustration above, Table 2b gives an example of such an industry: SIC 1081–Metal Mining Services had a part going to NAICS Sector 21–Mining and a part going to NAICS Sector 54–Professional, Scientific, and Technical Services. It was most important to determine NAICS Sector for establishments in industries such as this one because the 1997 Economic Census used 454 questionnaires, each of which had content tailored to a fairly narrow group of related industries. Those questionnaires were more effective when an establishment received one that was appropriate to its principal activity, and since content varied markedly from sector to sector, it was particularly important for an establishment to receive a questionnaire for the correct sector.

5.2 The refiling survey for single units (single-establishment enterprises) was a mail collection directed to 215,829 establishments in December 1996. The collection instrument for the single unit refiling survey was one standard letter-size (8.5- by 11-inch) sheet of paper with an address label block and a short cover letter on the front and classification inquiries on the back. There were seventeen versions of the classification inquiries that were tailored to establishments in as many groupings of related industries from the set of 80 SICs that had been targeted for refiling. These inquiries were designed to determine an establishment’s subindustry class according to the bridge SIC scheme described earlier.
Attachment A shows an example of the collection instrument, in particular the version that was sent to establishments in SIC Industry Group 871–Engineering, Architectural, and Surveying Services (all seventeen versions are available on the Census Bureau’s website as shown in References).

5.3 The refiling of multiunits (establishments of multi-establishment enterprises) was done in connection with the Census Bureau’s annual register proving collection for multi-establishment enterprises, the Company Organization Survey. The usual procedure for constructing this survey’s collection panel always targets large, complex enterprises and additionally targets others that are likely to report changes in establishment make-up or company organization based on an inference from administrative records indications. For the 1996 reference period, this selection procedure was modified; it selected large, complex enterprises, as before, and it additionally selected companies that had establishments in any of the 80 SIC industries that had been targeted for refiling. The result was a survey panel of 50,698 enterprises, which operated some 1.1 million establishments (the total multiunit population of the Business Register had 180,000 enterprises and 1.5 million affiliated establishments). Of the enterprises selected for the 1996 panel, 17,572 were included only because they had establishments in the SIC industries that had been targeted for refiling. Overall, the panel had 67,973 multiunit establishments targeted for refiling.

Figure 1. Illustration of Establishment Inventory Listing from the 1996 Report of Organization
(Sample entry text from the figure; original layout not recoverable: 01-2345678; Enter code from Insert K; XYZ Enterprises; 21; Any Mining Services Company; 123 456 Any Street; Any City; 9876543210 AB 00 108100 10 43210 40 000)

5.4 The Company Organization Survey collection instrument used the format illustrated in Figure 1, above, to list an inventory of establishments for each enterprise to which it was directed.
It then requested updates to the inventory, including additions for openings, acquired establishments, or mergers; deletions for closings, sold establishments, or spinoffs/break-ups; and updates to information listed for each establishment’s Federal employer identification number (EIN), major activity (industrial classification), primary and secondary name, store or plant number, and physical location address. Finally, the instrument collected each establishment’s end-of-year operating status and standard employment and payroll measures.

5.5 For 1996, this inventory list was modified slightly to facilitate collection of updated classifications for establishments that had been targeted for refiling. They were identified by a symbol printed on the left-hand side of the establishment’s entry in the inventory list (just above a sequence number, which is shown as ‘21’ in the illustration), and the usual major activity description was replaced by an instruction that referred the respondent to an insert. There were eleven versions of the “Industry Classification Assignment” insert, and each version listed descriptions of bridge SIC subindustry classes for a grouping of related industries from the set of 80 SICs that had been targeted for refiling; Attachment B shows an example of an insert, in particular the version that was sent to establishments in SIC Industry Group 871–Engineering, Architectural, and Surveying Services. Since it was possible for a heterogeneous enterprise to receive two such inserts or more, the instruction in the inventory listing directed the respondent to a specific insert, identified by an alphabetic designation (A – K), that was appropriate to each targeted establishment’s SIC. The insert then asked the respondent to identify the bridge SIC subindustry description that best fit the establishment’s primary activity and to enter the corresponding code in the space provided on the inventory listing.

5.6 The single unit refiling survey received a response from 88.9 percent of the establishments in its collection panel, whereas the 1996 Company Organization Survey received a response from 86.5 percent of the enterprises in its panel (Company Organization Survey response rates are not available on an establishment basis, nor are they computed separately for the refiling component of the collection). When these collections and associated processing were complete, the Census Bureau applied resulting bridge SIC updates to the Business Register. Additionally, there was a secondary reclassification activity for multiunits whereby industry analysts identified selected homogeneous enterprises in the services-producing sectors and determined bridge SICs for the establishments of those enterprises; a batch process applied resulting bridge SIC code updates to the Business Register as mass corrections. Updated industry codes from these refiling activities were then available to determine appropriate assignment of questionnaires for the upcoming economic census.

6 1997 Economic Census Collection and Processing

6.1 As noted earlier, the 1997 Economic Census was a complete enumeration of establishments in all covered industries, some 6.4 million total. This enumeration consisted of four main components that used differing data acquisition methods, as shown in Table 3 and described below. All census questionnaires are available on the 1997 Economic Census Forms web page noted in References at the end of this paper.

Table 3.
Components of the 1997 Economic Census Collection

                                              Number of Establishments
Economic Census Component and                        Final                          Response         Not
Standard Industrial Classification Division     Statistics      Mailed   Responses      Rate      Mailed

Establishments with Paid Employees               6,430,633   4,929,279   4,244,023     86.1%   1,501,354
  Multiunits and Larger/Selected Single Units
  (Standard Forms)                               6,430,633   3,371,392   2,887,152     85.6%
    Construction                                   639,482     131,313     105,050     80.0%
    Mining                                          25,251      14,785      12,464     84.3%
    Manufacturing                                  377,776     217,193     180,270     83.0%
    Transportation, Communications & Utilities     293,575     117,603     100,433     85.4%
    Wholesale Trade                                521,127     458,496     381,927     83.3%
    Retail Trade                                 1,561,195   1,036,337     891,250     86.0%
    Finance, Insurance & Real Estate               661,388     315,423     273,787     86.8%
    Services                                     2,350,839   1,080,242     941,971     87.2%
  Smaller, Partially Classified Single Units
  (Sector-Specific Classification Forms)                     1,195,151   1,062,692     88.9%
    Mining and Manufacturing                                    85,016      74,049     87.1%
    Transportation, Communications, & Utilities                 32,350      27,562     85.2%
    Retail Trade                                               319,237     269,117     84.3%
    Finance, Insurance, & Real Estate                           96,346      86,711     90.0%
    Services                                                   662,202     605,253     91.4%
  Unclassified Single Units
  (General Classification Form)                                362,736     294,179     81.1%
Establishments without Paid Employees           15,439,609                                    15,439,609

6.2 Multiunits and Larger/Selected Single Unit Employers: The census relied primarily on direct collection to obtain data for multiunit establishments,15 larger single units, and a sample of smaller single units (excluding unclassified single units of any size). The instruments for this component were standard economic census questionnaires that collected basic economic measures and a rich variety of specialized content, including attributes that were important for assigning accurate industrial classifications according to the bridge SIC scheme. Generally, these questionnaires were organized by SIC Division, and each one was tailored to a fairly narrow group of related industries within the division.
Content that was important to assigning a bridge SIC included detail of shipments, sales, receipts, or revenue (i.e., by product, type of construction, merchandise, commodity, or service class); kind-of-business; type-of-operation; materials consumed; selling characteristics; method of selling; class of customer; employees by occupation; and other industry-specific items. In many cases, NAICS implementation required new inquiries or substantially revised inquiries to separately identify bridge SIC subindustry classes that went to different NAICS industries. The Census Bureau used automated processes, known as industry coding edits, to assign an industrial classification to each employer establishment that completed a standard census questionnaire. While details varied somewhat for manufacturing/mining, construction, and the services-producing sectors, editing methods generally applied a relatively complex set of decision rules that considered the classification-relevant data noted above and industry codes from administrative records. The edits for 1997 differed substantially from those for earlier censuses because it was necessary to resolve distinctions among 1,971 bridge SIC classes (as opposed to 1,004 SIC classes) and to consider new content that had been added to census questionnaires for the purpose of supporting those decisions.

15 Some larger multiunit enterprises had the option of reporting electronically.

6.3 Smaller, Partially Classified Single Unit Employers: If a small single unit employer was not selected to receive an economic census standard form and had a complete bridge SIC (i.e., was in a SIC industry that could be converted directly to a NAICS industry), no economic census collection was necessary; the establishment was represented in the census by the available bridge SIC and by data from administrative records.
However, if a small single unit employer had an incomplete bridge SIC, the Census Bureau sent the establishment a shorter, specialized classification form designed for businesses in a particular SIC Division (divisions in the services-producing sectors had several such classification forms each, and the forms were tailored for rather broad groupings of related industries). Generally, these forms listed bridge SIC subindustry descriptions, and the recipient was asked to mark the checkbox associated with the one description that best fit the establishment’s primary activity. A variation on this approach for partially classified mining and manufacturing establishments used one generic questionnaire format with print-on-demand bridge SIC descriptions that varied depending on the incomplete classification available from the Business Register. Further, many of these sector-specific classification forms had additional classification inquiries, such as method of selling and selling characteristics items on forms for partially classified establishments in retail trade and wholesale trade. Once this specialized classification form was returned and a classification was assigned, the establishment was represented in the census by data from administrative records (plus the bridge SIC determined by means of the specialized classification form); no further collection was necessary.

6.4 Unclassified Single Unit Employers: The census directed a general classification form to unclassified single unit employers of any size, mostly entries that started operation shortly before or during the census reference period. This questionnaire presented a large array of bridge SIC industry/subindustry descriptions organized by SIC Division, and recipients were asked to mark the checkbox associated with the one description that best fit the establishment’s primary activity.
Additional inquiries requested information on principal sources of sales, receipts, or revenue; principal materials consumed (for manufacturers); and class of customer. Once this general classification form was returned and a classification was assigned, the Census Bureau examined administrative data for the establishment to determine whether it was large enough to require completion of an economic census standard form; if so, we assigned the industry-appropriate standard form and asked the respondent to complete this second questionnaire. If the establishment was too small to require completion of a standard form, it was represented in the census by data from administrative records (plus the bridge SIC determined by means of the general classification form); no further collection was necessary.

6.5 Nonemployers: The Census Bureau did not send a collection instrument of any kind to single-establishment enterprises without paid employees. More than 15 million small businesses of this type (average annual receipts of U.S. $38,000), mostly self-employed sole proprietorships, were represented in the 1997 Economic Census exclusively by data from administrative records. It should be noted that nonemployers were an exception to the use of bridge SICs; they were classified only according to NAICS, and a separate Nonemployer Statistics publication summarized data only on that basis, generally for sectors and subsectors, selected industry groups, and only a few 6-digit industries (i.e., classifications for nonemployers had considerably less industry detail than classifications for employers). The nonemployer component of the census presented a particular challenge for NAICS implementation because administrative records suppliers did not make the transition to the new classification system until the 1998 reference period or later.
As a result, the Census Bureau had to apply the following special procedures to nonemployers:

• For sole proprietorships (the great majority of nonemployers) that did not have a SIC code on their 1997 tax return, we attempted to assign a classification based on a principal activity description reported only on the tax form type that applied to these businesses. First we used an automated procedure that parsed keywords in the industry description, referred them to a computerized coding dictionary, and assigned a SIC-based classification16 when there was a reliable match. If this automated procedure did not provide a code, we referred the description to an Industry Coding Unit (described below), where the clerical staff attempted to assign a classification.
• We then used a concordance table to convert SIC-based codes to a NAICS basis for establishments in SIC industries that could be translated directly.
• The IRS started using NAICS-based industrial classifications on 1998 business income tax forms, and by December 1999, the Census Bureau had received 1998 tax data that were essentially complete. Following the premise that industrial classifications for these very small businesses were reasonably stable from year to year, we matched those 1997 nonemployer records that did not have a NAICS code from the direct conversion procedure to their 1998 counterparts and applied NAICS industry codes from the latter to the former.
• Finally, for those residual nonemployers that still did not have a NAICS code, we used a modeling procedure to impute one. This last-resort measure referred the establishment’s SIC-based industry code to a table that gave the relative SIC-to-NAICS distribution observed for the population of nonemployer establishments for which we had classifications of both types and selected a NAICS code from that distribution (more on modeling procedures below).
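
The four steps above form a fallback cascade, which can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Census Bureau's actual system: the function and field names, the keyword dictionary entries, and the rule that a "reliable match" means exactly one distinct SIC among the matched keywords are all hypothetical. The two concordance entries follow the 100 percent rows of Table 4a (SIC 8711 to NAICS 54133, SIC 8712 to NAICS 54131). Clerical coding is not modeled; unresolved records are simply flagged for referral.

```python
from typing import Callable, Optional


def code_from_description(description: str, keyword_dictionary: dict) -> Optional[str]:
    """Return a SIC code only when the keywords point to exactly one SIC (assumed rule)."""
    words = description.lower().split()
    matches = {keyword_dictionary[w] for w in words if w in keyword_dictionary}
    return matches.pop() if len(matches) == 1 else None


def assign_nonemployer_naics(
    record: dict,
    keyword_dictionary: dict,      # hypothetical: keyword -> SIC code
    direct_concordance: dict,      # SIC -> NAICS, only for directly convertible SICs
    naics_1998: dict,              # record id -> NAICS reported on 1998 tax data
    impute: Callable[[str], str],  # last-resort model, given a SIC-based code
):
    """Return (naics_code, method) following the order of the four steps."""
    # Step 1: use the reported SIC, or try automated coding of the write-in text.
    sic = record.get("sic") or code_from_description(
        record.get("activity_description", ""), keyword_dictionary
    )
    # Step 2: direct conversion through the concordance table.
    if sic in direct_concordance:
        return direct_concordance[sic], "direct conversion"
    # Step 3: borrow the NAICS code from the matching 1998 tax record.
    if record["id"] in naics_1998:
        return naics_1998[record["id"]], "matched to 1998 tax record"
    # Step 4: model-based imputation needs a SIC-based code as input.
    if sic is not None:
        return impute(sic), "imputed"
    return None, "refer to Industry Coding Unit"


# Illustrative inputs (made up for this sketch).
KEYWORDS = {"architect": "8712", "surveying": "8713"}
CONCORDANCE = {"8711": "54133", "8712": "54131"}
NAICS_1998 = {"B": "54137"}
stub_model = lambda sic: "54137"  # placeholder for the modeling procedure

print(assign_nonemployer_naics({"id": "A", "sic": "8712"},
                               KEYWORDS, CONCORDANCE, NAICS_1998, stub_model))
```

The ordering matters: each later step is applied only to records that the earlier, more reliable steps could not resolve, which mirrors the "last-resort" role the text gives to modeling.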

16 The result was SIC based because we used an existing process and coding dictionary, developed for previous years’ data, that assigned classifications according to the old system.

6.6 In some cases, it was necessary for members of the Census Bureau staff to assist with industry code assignment. Generally, this intervention occurred at the Census Bureau’s National Processing Center (NPC) in Indiana, where the staff performed a variety of economic census data collection and processing tasks.

• The single unit refiling survey mailed at the end of 1996 and part of the main census collection mailed at the end of 1997 used classification forms to collect industry coding information for selected establishments. Respondents returned these forms and other economic census questionnaires to the NPC. If the response to a classification form unambiguously identified a single bridge SIC class, our processes captured that fact and updated the Business Register with the corresponding industry code. If, on the other hand, the response identified two bridge SIC classes or more, consisted of a write-in industry description, or otherwise failed to provide a straightforward answer, the staff of an NPC Industry Coding Unit examined the form, evaluated available classification information, and assigned the bridge SIC code that was judged to be the best fit. The staff of this unit had two components: (i) Clerical industry coding specialists resolved easier cases that could be classified by following fairly explicit procedures, and their work was subject to quality assurance evaluation. (ii) Professional industry coding specialists classified cases that were referred to them as being too difficult for clerical resolution. Both clerical and professional components of the staff used industry coding references to support their bridge SIC assignments and an interactive database application to update the Business Register.
• When a sole proprietorship business without paid employees did not report a SIC-based principal business activity code on its 1997 tax form but instead wrote in a description, we first applied an automated process that attempted to assign a classification based on the description. If the automated process could not make a reliable assignment, the nonemployer establishment was referred to the NPC Industry Coding Unit, where members of the clerical staff examined the description and assigned a classification whenever possible.
• Some employer establishments that were reclassified across sector boundaries and all employer establishments that were reclassified from an industry that was in-scope to the census to one that was out-of-scope were subject to review and final adjudication by the professional staff of the NPC Industry Coding Unit.
• When industry coding edits could not assign a bridge SIC code, the establishment was referred to the staff of a sector-specific NPC Problem-Solving Unit that specialized in resolving data questions from the edits (industry coding and others). There were several such units, and details of their operation varied somewhat from sector to sector. Generally, they examined the census questionnaire and other sources of classification information, considered their knowledge of the industry, and applied expert judgment in order to assign the bridge SIC code that fit best. In the case of larger establishments or establishments with other difficult data questions, these problem-solving specialists might contact the business in order to seek clarification. Here again, the units usually had a clerical component for easier cases and a professional component for cases that were referred as being too difficult for clerical resolution. This work was supported by industry coding references, secondary data sources (e.g., company websites), and an interactive database application for correcting census data.
• Industry statistics from the 1997 Economic Census were subject to careful review by the professional analytical staff at Census Bureau headquarters. Generally the members of this staff are experts on a particular sector or smaller industry group, and they are thoroughly familiar with larger companies that operate in their areas of expertise. When these industry analysts prepared 1997 data for publication, their examination sometimes identified errors in industrial classification. They resolved those errors by correcting the bridge SIC code for each misclassified establishment. This work was supported by historical data, macro-edits, macroanalytical review tools, secondary data sources (e.g., company websites), and an interactive database application for correcting census data.

6.7 Statistical collections generally have some residual component for which complete observations cannot be obtained, and the 1997 Economic Census was no exception. Mandatory authority and repeated follow-up contacts produced an 86.1 percent final response rate, which left a gap in the data needed to assign complete industrial classifications to all establishments. The Census Bureau filled this gap by applying a simple statistical modeling procedure to impute missing industry/subindustry detail. To support this procedure, first we constructed a parameter table by isolating the subpopulation of fully classified establishments and determining the relative distribution of units from each 2-, 3-, 4-, or 6-digit SIC17 to the corresponding set of valid 6-digit bridge SICs. The modeling procedure then referred each establishment’s SIC to a set of matching rows in the parameter table and selected one such row by comparing a uniformly distributed 3-digit index18 to a set of cumulative distribution parameters arranged in ascending sequence. The first row in which the distribution parameter exceeded the 3-digit index gave the bridge SIC assigned to the establishment.
Thus, the resulting imputations assumed similar bridge SIC distributions for the subpopulations of fully- and partially-classified establishments and preserved the distributions observed for fully-classified establishments in the employer population as a whole.

6.8 Table 4a presents an example of the parameters, and Table 4b illustrates their use in the modeling procedure.

• The establishment in Example 1 has a 2-digit incoming code, SIC 870000. The modeling procedure refers this SIC to the set of 5 matching rows in the parameter table and selects the first such row where the cumulative distribution parameter shown in the table’s second column exceeds the value represented by digits 7 – 9 of the establishment’s employer identification number (i.e., 248). Accordingly, the establishment is assigned bridge SIC 871310.
• The establishment in Example 2 has a 3-digit incoming code, SIC 871000, and the procedure assigns bridge SIC 871200 (SIC 8712 has no subindustry detail, so the bridge SIC’s subindustry suffix is ‘00’).
• The establishment in Example 3 has a complete 6-digit incoming code, SIC 871310, and there is only one matching row in the parameter table. Accordingly, the procedure assigns the same code as the establishment’s bridge SIC.

17 As noted earlier, administrative records, particularly business income tax returns, may not have provided a complete 4-digit SIC, so some single units that were classified on that basis had 2- or 3-digit codes. There were no 5-digit SICs. The parameter table also included complete 6-digit codes, which converted 100% to the bridge SIC with an identical code.
18 This index was based on positions 7 – 9 of the establishment’s Federal employer identification number.

Table 4a.
Parameters for Bridge SIC Modeling Procedure

                                                Distribution within
           SIC Code  Cumulative     Bridge SIC  SIC Code (In)           Corresponding
           (In)      Distribution   Code (Out)  Percent19  Cumulative   NAICS Code19
                     Parameter                             Percent19
                     (> Index)

           870000    178            871100      17.8%      17.8%        54133
           870000    248            871200      7.0%       24.8%        54131
Example 1  870000    249            871310      0.1%       24.9%        54136
           870000    278            871320      2.9%       27.8%        54137
           870000    1000           Etc.20      72.2%      100.0%       Etc.20
           871000    639            871100      63.9%      63.9%        54133
Example 2  871000    890            871200      25.1%      89.0%        54131
           871000    894            871310      0.4%       89.4%        54136
           871000    1000           871320      10.6%      100.0%       54137
           871100    1000           871100      100.0%     100.0%       54133
           871200    1000           871200      100.0%     100.0%       54131
           871300    37             871310      3.7%       3.7%         54136
           871300    1000           871320      96.3%      100.0%       54137
Example 3  871310    1000           871310      100.0%     100.0%       54136
           871320    1000           871320      100.0%     100.0%       54137
           Etc.

Table 4b. Illustrations of the Bridge SIC Modeling Procedure

                     EIN (Index based     SIC Code  Bridge SIC
Example  ID          on positions 7 – 9)  (In)      Code (Out)

1        0123456248  123456248            870000    871310
2        0123456789  123456789            871000    871200
3        0123459876  123459876            871310    871310

6.9 A variation on this modeling procedure applied to nonemployer establishments. In this case, however, the parameter table’s third column contained NAICS codes, and the distributions were based on the subpopulation of nonemployers that had both SIC and NAICS codes, the latter obtained from 1998 tax data. As noted earlier, the 1997 Economic Census presented nonemployer statistics only on a NAICS basis.

19 This column does not exist in the actual parameter table. It is added here to provide context and improve the illustration.
20 This row represents a group of discrete classes in SIC 87 except 871 and accounts for 72.2 percent of the establishments in SIC 87. It is collapsed to a single category to simplify the illustration and conserve space.
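
The modeling procedure of paragraphs 6.7 and 6.8 can be sketched in a few lines. This is a minimal illustration, not the production code: the function names are hypothetical, the rows are copied from the Table 4a excerpt (with the collapsed "Etc." row kept as a placeholder string), and the rounding of cumulative shares to parts per thousand in `build_rows` is an assumption about how the parameters were derived.

```python
from bisect import bisect_right

# Cumulative distribution parameters from Table 4a: (parameter, bridge SIC out).
PARAMETER_TABLE = {
    "870000": [(178, "871100"), (248, "871200"), (249, "871310"),
               (278, "871320"), (1000, "ETC")],  # "ETC" stands for the collapsed row
    "871000": [(639, "871100"), (890, "871200"), (894, "871310"), (1000, "871320")],
    "871300": [(37, "871310"), (1000, "871320")],
    "871310": [(1000, "871310")],
}


def build_rows(bridge_counts: dict) -> list:
    """Turn counts of fully classified establishments per bridge SIC into
    cumulative parameters in parts per thousand (assumed rounding rule)."""
    total = sum(bridge_counts.values())
    rows, running = [], 0
    for code in sorted(bridge_counts):
        running += bridge_counts[code]
        rows.append((round(1000 * running / total), code))
    return rows


def ein_index(ein: str) -> int:
    """Return the 3-digit index from positions 7-9 of a 9-digit EIN."""
    digits = ein.replace("-", "")
    if len(digits) != 9 or not digits.isdigit():
        raise ValueError("expected a 9-digit EIN")
    return int(digits[6:9])


def impute_bridge_sic(sic_in: str, ein: str) -> str:
    """Select the first row whose cumulative parameter exceeds the EIN index."""
    rows = PARAMETER_TABLE[sic_in]
    thresholds = [parameter for parameter, _ in rows]
    position = bisect_right(thresholds, ein_index(ein))
    return rows[position][1]


# The three illustrations from Table 4b.
for sic_in, ein in [("870000", "123456248"), ("871000", "123456789"),
                    ("871310", "123459876")]:
    print(sic_in, ein, impute_bridge_sic(sic_in, ein))
```

Because the index is at most 999 and the last parameter for every incoming SIC is 1000, a row is always selected. Using EIN digits as the index makes the assignment repeatable for a given establishment, which a freshly drawn random number would not be.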

7 1997 Economic Census Data Presentation

7.1 A major revision to the industrial classification system, such as the Census Bureau’s implementation of NAICS, necessarily causes a break in statistical time series. To mitigate the effects of this break, the Census Bureau prepared special presentations of 1997 Economic Census data that were designed to assist data users in making the transition. These presentations depended on the bridge SIC scheme that allowed us to classify each establishment according to both NAICS and SIC. The statistical publications described briefly below are available on the Census Bureau’s website, as listed in References at the end of this paper.

• Comparative Statistics: This report allowed data users to compare 1997 and 1992 Economic Census data on a 1987 SIC basis. Specifically, it presented detailed industry statistics for both reference periods in a side-by-side format with separate summaries for the United States, each of the 50 States, and the District of Columbia. Attachment C provides an example of this presentation for selected industries used in earlier illustrations. Since microdata records from the 1992 Economic Census did not include content needed to reclassify those establishments according to the bridge SIC scheme, it was not possible for us to prepare a complementary presentation that compared 1997 and 1992 census data on a NAICS basis (but see the section below on Time Series Data for a discussion of efforts to recast pre-1997 data on a NAICS basis).
• Bridge Between NAICS and SIC: This report detailed the relationships between NAICS and SIC industries. One tabulation presented industry statistics on a NAICS basis and showed the composition of each NAICS industry in terms of 1987 SIC industries. Conversely, a second tabulation presented industry statistics on a 1987 SIC basis and showed the composition of each SIC industry in terms of NAICS industries.
Both tabulations summarized data for the United States; no further geographic detail was available. Attachment D provides an example of this data presentation for selected industries used in earlier illustrations.

8 Time Series Data

8.1 As mentioned earlier, the change from SIC to NAICS created serious concern among data users. A diverse community of users, including the Bureau of Economic Analysis (BEA), the Federal Reserve Board of Governors (FRB), and the National Association of Business Economists (NABE), makes extensive use of economic time series data that are either directly sourced from the Census Bureau or are constructed from our data. These series include industry output, employment, productivity, investment, capital utilization, and industrial production indices, to name a few. In addition to these users of publicly available industry time series data, there is a growing cadre of researchers from government and academic institutions that utilize confidential Census Bureau microdata. This class of users is particularly concerned with longitudinal analysis and, therefore, needs consistent industrial classification schemes over time.

8.2 The Census Bureau published the 1997 Economic Census data under both classification schemes and provided a detailed SIC-NAICS concordance. While useful, these measures offer little comfort to users of industry level time series. Both SIC and NAICS are used to classify establishments, whose data are then aggregated up to form industry level statistics. There is not a one-to-one mapping of SIC codes to NAICS codes, except for special codes developed for the 1997 Economic Census. Thus, for any 4-digit SIC, some of its constituent establishments will be coded to one 6-digit NAICS industry and some to another. Many establishments engage in multiple activities that can further complicate industry coding, especially over time.
Thus, even with a detailed concordance one is not ensured of being able to produce accurate NAICS industry level time series from SIC based historical data. One way of getting around these problems for industry data is to use the 1997 SIC-NAICS bridge to compute the shares of 1997 activity (using a measure such as sales, employment, or payroll) in a given SIC that are reclassified into various NAICS codes. These shares can then be used to “re-weight” SIC based industry level statistics. While this strategy may work going back a few years from 1997, it becomes more questionable for longer time series. Consider the following example taken from Bayard and Klimek (2003).

Table 5. Sample Bridge Between SIC 3578 and NAICS

                                                          Value of
                                                         Shipments   Share of       Paid    Share of
NAICS    Description                                      ($1,000)  Shipments  Employees  Employment

         SIC 3578 – Calculating & Accounting Machines,
         Except Electronic Computers                     2,014,806       100%      7,683        100%
333313   Office Machinery Manufacturing                    144,380         7%        966         13%
334119   Other Computer Peripheral Equipment
         Manufacturing                                   1,870,426        93%      6,717         87%

Source: Bayard and Klimek (2003)

8.3 Clearly, these shares become less appropriate the farther back in time one applies them to SIC 3578 data. Namely, we would expect NAICS 333313, which includes typewriters and mail handling equipment, to be much more prominent in 1975 than it is in 1997. Likewise, we would expect NAICS 334119, which includes computer printers, monitors, keyboards, etc., to be less prominent in 1975 relative to 1997.

8.4 It is possible, however, to use the original microdata files to more accurately recode historical data on a NAICS basis. An ongoing project at the Census Bureau’s Center for Economic Studies (CES) does just that. Bayard and Klimek (2003) utilize several techniques to assign each establishment record in historical (pre-1997) economic censuses the best NAICS code possible.
Their approach has been used on manufacturing data in order to recast the FRB’s Indices of Industrial Production and Capacity Utilization on a NAICS basis. These are both intensely watched “Leading Economic Indicators” that utilize data from the Census Bureau and other sources.

8.5 The methodology employed by Bayard and Klimek uses the most detailed and accurate information first. They make use of the 1997 SIC-NAICS bridge and longitudinal data files maintained at CES. They reclassify all establishments in the Censuses of Manufactures going back to 1963. They use a recursive procedure. Namely, they use 1997 information to reclassify 1992 establishments. Likewise, they use 1992 information to reclassify 1987 and so on.

8.6 In manufacturing, SIC and now NAICS codes are assigned by the Census Bureau using detailed information on the products produced at an establishment. Bayard and Klimek, therefore, first attempt to assign establishments to NAICS codes using detailed product data available each economic census year. This is the most accurate method of assigning manufacturing establishments a NAICS code. Unfortunately, not all establishment records in historical files meet the information requirements for this method. For these establishments, the next preferred solution is to take the subset in SIC industries that have a one-to-one mapping to a NAICS industry and apply the corresponding NAICS code. For establishments that are not in these one-to-one SIC-NAICS industries, longitudinal establishment data prove very useful. In cases where establishments in the census year being reclassified (say 1987) still exist in 1997 and do not change SIC codes over that period, Bayard and Klimek assign the NAICS code given those establishments in the 1997 bridge. A statistical model was used to assign NAICS codes to the remaining establishment records.
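
The backward, census-to-census logic of paragraphs 8.5 and 8.6 might be sketched as below. This is a simplified illustration under stated assumptions and is not the Bayard and Klimek implementation: the function name and data layout are hypothetical, the product-data step and the final statistical model are left out, the "unchanged SIC" test is applied between adjacent census years rather than against 1997 directly, and the sample records and the one-to-one concordance entry (`9999` to `999999`) are made up. Only the SIC 3578 and NAICS 334119 and 333313 codes come from Table 5.

```python
def carry_back_naics(sic_by_year: dict, naics_latest: dict,
                     one_to_one: dict, years: list) -> dict:
    """Assign NAICS codes to earlier census years, newest year first.

    sic_by_year:  {year: {establishment_id: sic_code}}
    naics_latest: {establishment_id: naics_code} for years[0] (the bridge year)
    one_to_one:   {sic_code: naics_code} for SICs that map to a single NAICS
    years:        census years in descending order, e.g. [1997, 1992, 1987]
    """
    naics_by_year = {years[0]: dict(naics_latest)}
    for later, earlier in zip(years, years[1:]):
        assigned = {}
        for estab_id, sic in sic_by_year[earlier].items():
            if sic in one_to_one:
                # One-to-one SIC-NAICS industries convert directly.
                assigned[estab_id] = one_to_one[sic]
            elif (sic_by_year[later].get(estab_id) == sic
                  and estab_id in naics_by_year[later]):
                # Same establishment, same SIC in the later census:
                # reuse the NAICS code already assigned there.
                assigned[estab_id] = naics_by_year[later][estab_id]
            # Otherwise the record is left for the statistical model (not shown).
        naics_by_year[earlier] = assigned
    return naics_by_year


# Made-up establishment histories for illustration.
SIC_BY_YEAR = {
    1997: {"A": "3578", "C": "3578"},
    1992: {"A": "3578", "C": "3579"},
    1987: {"A": "3578", "B": "3578", "C": "9999"},
}
result = carry_back_naics(SIC_BY_YEAR, {"A": "334119", "C": "333313"},
                          {"9999": "999999"}, [1997, 1992, 1987])
print(result[1992], result[1987])
```

Establishment A keeps its 1997 code all the way back, while B (absent from 1992) and C in 1992 (SIC changed by 1997) remain unassigned, which is the residual group the text says was handled by a statistical model.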

8.7 The Bayard and Klimek methodology worked very well for the manufacturing sector, and data products that rely on it, such as the FRB’s re-cast Industrial Production Index, have been well received by data users. Work is beginning now to re-classify other sectors of the economy using this methodology.21 The hope is that all employer establishments that have operated in the U.S. private non-farm economy since 1975 will eventually have NAICS codes assigned to them for each year of operation.22 Unfortunately, most sectors lack the detailed product data that proved so useful in re-coding the manufacturing sector back to 1963. Thus, sectors such as services, retail and wholesale trade, and others will be re-coded with less precision.

9 Lessons Learned

9.1 The Census Bureau’s implementation of NAICS was quite successful. As noted earlier, this transition was a much larger challenge than earlier SIC revisions had been. Nevertheless, the agency achieved the goals it had set for completely incorporating the new classification system into the Business Register for the 1997 and subsequent reference periods and for introducing NAICS to data users by means of comprehensive and detailed statistical products from the 1997 Economic Census. Overall, our implementation strategy and specific methods worked very well; however, there was room for improvement in at least two respects.

9.2 First, there were relatively small imperfections in our introduction of NAICS. The most significant were minor discrepancies between the definition and interpretation of some industries according to the final NAICS 1997 specification and classifications that 1997 Economic Census collections could support.
These discrepancies occurred because the magnitude of the changes introduced by NAICS required additional time and care for specifying the new system, completing detailed industry definitions, making decisions about the correct industry placement of some 35,000 specific business activities, and completing final NAICS documentation. As a result, NAICS development teams continued this work after an April 1997 U.S. Federal Register notice announced the decision to adopt NAICS, and an August 1998 notice followed with late changes to the NAICS 1997 specification. At the same time, the Census Bureau’s schedule for preparing, printing, and labeling questionnaires for the December 1997 census mailout required us to determine the instruments’ final content during the first half of 1997. Since NAICS refinements continued after the questionnaires were final, there were some cases where the instruments’ classification content did not align precisely with final industry definitions, and it was necessary to implement some NAICS industries on an “as collected” rather than an “as defined” basis. Drawing from this experience, the lessons are: expect last-minute changes, particularly when there is a major revision to the industrial classification system; insofar as possible, allow time for the new system to stabilize before implementing it; and best efforts notwithstanding, be prepared to adapt implementation plans to accommodate some remaining degree of definitional uncertainty.
21 A variant of this methodology was used to reclassify establishments in the 1992 Censuses of Wholesale and Retail Trade.
22 Namely, the plan is to put NAICS codes on all establishment records in the Census Bureau Longitudinal Business Database (LBD). The LBD is a longitudinal version of the Census Bureau’s Business Register that is invaluable for examining producer dynamics, job flows, structural change and regional economic growth. See Jarmin and Miranda (2002) for more details.
9.3 Second, the 1996 refiling collection for multiunits did not work as well as we needed it to. It was designed to obtain classification updates while causing minimal change to the reporting procedures with which companies were familiar and to existing collection and processing systems. As a result, the Report of Organization collection instrument was modified only slightly, and we addressed the refiling requirement by using a self-coding procedure that referred the respondent to classification inserts (separate enclosures; an example is shown in Attachment B). Unfortunately, companies did not respond well to this design, and we initially received classification updates for only 47 percent of the multiunit establishments that had been targeted for refiling. A substantial amount of costly follow-up contact and other supplementary effort was needed to improve the multiunit reclassification rate to an acceptable level. Drawing from this experience, the lessons are: it was ineffective to refile multiunit establishments by adapting our regular register proving survey and/or by using a self-coding procedure that relied on classification inserts; we needed a better design than the one used for the 1996 Company Organization Survey.
References
“1997 Economic Census Classification Report” (instruments used for the single unit refiling survey done in December 1996). U.S. Census Bureau. Undated (date of access September 2004). <http://www.census.gov/epcd/www/pdf/97nc/nc9922.pdf>.
“1997 Economic Census Forms” web page. U.S. Census Bureau. Undated (date of access September 2004). <http://www.census.gov/epcd/www/ec97form.html>.
Ambler, Carole A. “NAICS and U.S. Statistics.” Joint Statistical Meetings. Dallas, Texas. August 1998. Available at <http://www.census.gov/epcd/www/asambler.htm>.
Bayard, Kimberly N., and Shawn D. Klimek. “Creating a Historical Bridge for Manufacturing Between the Standard Industrial Classification System and the North American Industry Classification System.” Proceedings of the Annual Meeting of the American Statistical Association. San Francisco, California. August 2003.
“Bridge Between NAICS and SIC—1997.” U.S. Census Bureau. June 2000 (date of access September 2004). <http://www.census.gov/epcd/ec97brdg/97x-cs3.pdf>.
“Comparative Statistics—1997.” U.S. Census Bureau. June 2000 (date of access September 2004). <http://www.census.gov/epcd/ec97sic/97x-cs2.pdf>.
Internal Revenue Service Forms and Publications web pages. U.S. Department of the Treasury, Internal Revenue Service. Undated (date of access September 2004).
  Instructions for IRS Form 1120, “U.S. Corporation Income Tax Return,” and IRS Form 1120A, “U.S. Corporation Short-Form Income Tax Return” (both 2003), pp. 21-22: <http://www.irs.gov/pub/irs-pdf/i1120_a.pdf>.
  Instructions for IRS Form 1065, “U.S. Return of Partnership Income” (2003), pp. 32-34: <http://www.irs.gov/pub/irs-pdf/i1065.pdf>.
  Instructions for IRS Form 1040, “U.S. Individual Income Tax Return” (2003; sole proprietorships file Schedule C, “Profit or Loss from Business”), pp. C-7 – C-9: <http://www.irs.gov/pub/irs-pdf/i1040.pdf>.
  IRS Form SS-4, “Application for Employer Identification Number”: <http://www.irs.gov/pub/irs-pdf/fss4.pdf>.
MacDonald, Brian. “Implementing a Standard Industrial Classification (SIC) System Revision.” Business Survey Methods. Ed. Brenda G. Cox et al. New York. John Wiley & Sons, Inc., 1995.
Mesenbourg, Thomas L. “NAICS Implementation Plan for the United States.” 10th International Roundtable on Business Survey Frames. Quebec, Quebec. October 1996.
Mikkelson, Gordon, Teresa L. Morisi, and George Stamas. “Implementing the NAICS for Business Surveys at BLS.” International Conference on Establishment Surveys II. Buffalo, New York. June 2000. Available at <http://www.bls.gov/ore/pdf/st000070.pdf>.
“North American Industry Classification System (NAICS)” web page. U.S. Census Bureau. January 14, 2004 (date of access September 2004). <http://www.census.gov/epcd/www/naics.html>.
United States Office of Management and Budget. North American Industry Classification System—United States, 1997. Lanham, Maryland: Bernan Press, 1998.
United States Office of Management and Budget. North American Industry Classification System—United States, 2002. Lanham, Maryland: Bernan Press, 2002.
Attachment A (2 pages)
1997 Economic Census Classification Report
Instrument Used for 1996 Refiling Survey of Single Units (17 Versions)
Attachment B (2 pages)
1996 Report of Organization Industry Classification Insert (11 Versions)
Attachment C Page 1 of 1
Comparative Statistics on a 1987 SIC Basis—1997 and 1992
[Data are shown for selected SIC industries from earlier illustrations.]
Table 1. Comparative Statistics for the United States—1997
[Includes only establishments with payroll. Data are in current dollars and have not been adjusted for inflation.]
Establishments
SIC         1997        1992   % chg  Description
1081         203         251   -19.1  Metal mining services
1382       1,197       1,473   -18.7  Oil and gas exploration services
1481         172           N       N  Nonmetallic minerals services, except fuels
7389      69,376      52,375    32.5  Business services, not elsewhere classified
8711      52,526      41,834    25.6  Engineering services
8712      20,602      17,875    15.3  Architectural services
8713       9,025       8,418     7.2  Surveying services
8734       5,488       4,540    20.9  Testing laboratories
8748      17,853      12,628    41.4  Business consulting services, not elsewhere classified

Receipts ($1,000)
SIC         1997        1992   % chg
1081     341,888     350,441    -2.4
1382     818,607     964,629   -15.1
1481     190,942     188,932     1.1
7389  62,276,780  32,885,901    89.4
8711  88,180,688  65,245,236    35.2
8712  16,988,338  11,244,379    51.1
8713   3,453,489   2,280,177    51.5
8734   6,442,964   4,763,614    35.3
8748   8,687,728   4,573,223    90.0

Paid employees
SIC         1997        1992   % chg
1081       3,066       2,973     3.1
1382       7,039      12,930   -45.6
1481       1,874           N       N
7389     867,462     523,650    65.7
8711     730,048     657,609    11.0
8712     146,702     121,675    20.6
8713      56,880      45,324    25.5
8734      82,024      70,462    16.4
8748      77,341      52,456    47.4

Annual payroll ($1,000)
SIC         1997        1992   % chg
1081     110,070     104,612     5.2
1382     215,996     423,687   -49.0
1481      63,551           N       N
7389  17,597,943   9,783,317    79.9
8711  35,337,890  27,246,839    29.7
8712   6,468,524   4,408,064    46.7
8713   1,712,316   1,089,694    57.1
8734   2,708,782   1,998,829    35.5
8748   3,191,884   1,766,156    80.7

N – Comparable data not available
Note: This is an HTML version of the Comparative Statistics tabulation similar to the one available on the Census Bureau’s website. Printed and Adobe Portable Document Format (pdf) versions also are available.
Attachment D (2 pages)
Bridge Between NAICS and SIC—1997
[Data are shown for selected industries from earlier illustrations.]
Table 1. NAICS-Based Industry Statistics for the United States with Distribution Among 1987 SIC-Based Industries—1997
[Includes only establishments with payroll. The figures to the left of each SIC code give the percentage of the SIC industry’s receipts that are represented by the part shown and provide a hyperlink to Table 1 data for other parts of the NAICS industry. The symbol provides a hyperlink to Comparative Statistics data for the whole SIC industry.]

NAICS   SIC           Pt  Establishments     Receipts       Paid   Annual payroll  Description
                                             ($1,000)  employees         ($1,000)
5413                              92,710  116,986,061  1,038,317       46,942,816  Architectural, engineering, & related services
54131                             20,602   16,988,338    146,702        6,468,524  Architectural services
541310                            20,602   16,988,338    146,702        6,468,524  Architectural services
        8712                      20,602   16,988,338    146,702        6,468,524  Architectural services
54133                             52,526   88,180,688    730,048       35,337,890  Engineering services
541330                            52,526   88,180,688    730,048       35,337,890  Engineering services
        8711                      52,526   88,180,688    730,048       35,337,890  Engineering services
54134                              1,872      605,362      9,150          310,342  Drafting services
541340                             1,872      605,362      9,150          310,342  Drafting services
        1% of 7389    13           1,872      605,362      9,150          310,342  Drafting services
54135                              2,771      639,041      8,674          240,080  Building inspection services
541350                             2,771      639,041      8,674          240,080  Building inspection services
        1% of 7389    12           2,771      639,041      8,674          240,080  Building inspection services
54136                                587    1,087,786      9,905          445,595  Geophysical surveying & mapping services
541360                               587    1,087,786      9,905          445,595  Geophysical surveying & mapping services
        1% of 1081    20              21        3,783         41            1,101  Geophysical surveying services only for metal mining, contract basis
        63% of 1382   20             213      518,667      2,907          104,681  Geophysical surveying services for oil & gas fields, contract basis
        4% of 1481    20              17        8,313         62            2,877  Geophysical surveying services for nonmetallic minerals (excluding fuels)
        16% of 8713   20             336      557,023      6,895          336,936  Geophysical surveying
54137                              8,864    3,041,882     51,814        1,431,603  Surveying & mapping (except geophysical) services
541370                             8,864    3,041,882     51,814        1,431,603  Surveying & mapping (except geophysical) services
        0% of 7389    09             175      145,416      1,829           56,223  Map making services
        84% of 8713   10           8,689    2,896,466     49,985        1,375,380  Surveying services
54138                              5,488    6,442,964     82,024        2,708,782  Testing laboratories
541380                             5,488    6,442,964     82,024        2,708,782  Testing laboratories
        100% of 8734  10           5,488    6,442,964     82,024        2,708,782  Testing laboratories

Pt – Part
The symbol is used as a link to the 1992 figures shown in Comparative Statistics. Note that there are links only for SIC industries, not for NAICS industries.
Table 2. 1987 SIC-Based Industry Statistics for the United States with Distribution Among NAICS-Based Industries—1997
[Includes only establishments with payroll. The figures to the left of each SIC code give the percentage of the SIC industry’s receipts that are represented by the part shown and provide a hyperlink to Table 1 data for other parts of the NAICS industry. The symbol provides a hyperlink to Comparative Statistics data for the whole SIC industry.]

SIC    NAICS          Pt  Establishments     Receipts       Paid   Annual payroll  Description
                                             ($1,000)  employees         ($1,000)
871                               82,153  108,622,515    933,630       43,518,730  Engineering, architectural, and surveying services
8711                              52,526   88,180,688    730,048       35,337,890  Engineering services
       541330                     52,526   88,180,688    730,048       35,337,890  Engineering services
8712                              20,602   16,988,338    146,702        6,468,524  Architectural services
       541310                     20,602   16,988,338    146,702        6,468,524  Architectural services
8713                               9,025    3,453,489     56,880        1,712,316  Surveying services
       51% of 541360  10             336      557,023      6,895          336,936  Geophysical surveying & mapping services (pt)
       95% of 541370  10           8,689    2,896,466     49,985        1,375,380  Surveying services

Pt – Part
Bridge symbols shown indicate the degree of comparability between SIC and NAICS categories.
Comparable (bridge complete): Table 1—NAICS derivable from SIC data. Table 2—SIC derivable from NAICS data.
Almost comparable (drawbridge slightly open): Table 1—Sales or receipts from SIC are within 3% of sales or receipts from NAICS. Table 2—Sales or receipts from NAICS are within 3% of sales or receipts from SIC.
Not comparable (drawbridge open): Table 1—NAICS sales or receipts cannot be estimated within 3% from SIC data. Table 2—SIC sales or receipts cannot be estimated within 3% from NAICS data.
Note: These are HTML versions of Bridge Statistics tabulations similar to the ones available on the Census Bureau’s website. Printed and Adobe Portable Document Format (pdf) versions also are available.
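The percentage shares in the bridge tables and the 3% comparability threshold come down to simple arithmetic. The Python sketch below reproduces two shares from Table 1 and applies the threshold to receipts totals. It is an illustration under assumptions, not the Census Bureau's procedure: the function names are hypothetical, the relative difference is measured against the target classification's total, and exactly equal totals are treated as "comparable" even though the legend defines that category by derivability rather than by equality of receipts.

```python
def receipts_share(part_receipts: int, whole_receipts: int) -> int:
    """Whole-number percentage of an industry's receipts represented by one part."""
    return round(100 * part_receipts / whole_receipts)


def comparability(source_receipts: int, target_receipts: int,
                  tolerance: float = 0.03) -> str:
    """Classify a bridge between two industry totals using the 3% rule.

    Assumption: the difference is taken relative to the target total.
    """
    if source_receipts == target_receipts:
        return "comparable"
    relative_diff = abs(source_receipts - target_receipts) / target_receipts
    return "almost comparable" if relative_diff <= tolerance else "not comparable"


# Receipts ($1,000) taken from the attachment tables.
SIC_8713 = 3_453_489      # Surveying services, whole SIC industry
NAICS_541370 = 3_041_882  # Surveying & mapping (except geophysical) services

geophysical_share = receipts_share(557_023, SIC_8713)   # part moved to NAICS 541360
surveying_share = receipts_share(2_896_466, SIC_8713)   # part moved to NAICS 541370

engineering = comparability(88_180_688, 88_180_688)     # SIC 8711 vs NAICS 541330
surveying = comparability(SIC_8713, NAICS_541370)       # SIC 8713 vs NAICS 541370
```

With these inputs the two shares come out to 16 and 84, matching the "16% of 8713" and "84% of 8713" entries in Table 1, and the surveying totals differ by well over 3%.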