Collecting Electronic Data From the Carriers: the Key to Success in the Canadian Trucking Commodity Origin and Destination Survey
François Gagnon and Krista Cook
Statistics Canada
ICES III, Montreal, June 2007

PRESENTATION
Outline
1. Background
2. Methodology of the Redesigned Survey
3. Advantages/Disadvantages of the Canadian Approach
4. Challenges of Collecting Electronic Data
5. Conclusion

1. BACKGROUND
Commodity Flow Surveys in Canada
Shipments:
- Ship: from admin data (census)
- Rail: from admin data (census)
- Truck: TCOD

1. BACKGROUND
What is TCOD?
- Purpose: to measure trucking commodity movements
- Unit of interest: shipments
- Variables collected for each shipment:
  - commodity carried, tonnage
  - origin and destination of shipment
  - distance, transportation revenues
- Outputs: estimates and CVs, microdata file
- Input to: System of National Accounts
- Main user & co-sponsor: Transport Canada

1. BACKGROUND
Why a redesign?
- TCOD was developed in the early 1970s
- In 2000, Statistics Canada approved a multiyear project to redesign the survey
  - To improve data quality
  - To better meet the new requirements of the users
- Constraint: no additional production costs

1. BACKGROUND
Addressing data coverage needs
Needs identified and decisions made:
- Trucking industry
  - Long-distance & local
  - ≥ $1M (in terms of company revenue)
  - < $1M (in terms of company revenue)
- Trucking activity in non-trucking businesses (private trucking)
- Foreign companies: no frame for now

1. BACKGROUND
Addressing other needs
- Annual data
- Provincial & Territorial estimates
- Improve precision
- Other variables such as "value of shipment": not available on shipping documents
=> Improve coverage + precision + detail AT NO ADDITIONAL COST: a good challenge!

2.
REDESIGNED TCOD
Coverage of the Old and New TCODs (Number of Companies)
[Figure: coverage of the old and new TCODs, in number of companies. Labels: Trucking companies; Non-trucking companies; Revenue ($1M threshold); Long Distance; Hhld goods moving; Local; Other trucking activity; Canadian Companies; Foreign Companies. Legend: Old TCOD Coverage; Added Coverage in the new TCOD. Company counts shown: 1,828; 351; 1,462. Source: BR - 2004]

2. REDESIGNED TCOD
Key estimates to be produced
Key domains:
- Matrix: Origin x Destination x Commodity
- Origins and destinations each range over NFLD, P.E.I., ..., B.C.; each origin/destination cell is broken down by commodity code (051, 061, ..., 991)
=> Sample size in each cell of the matrix is random
Key variables of interest:
=> Tonnage, Distance, Revenue

2. REDESIGNED TCOD
Need for a larger sample size
Main challenge of commodity flow surveys:
- No efficient stratification possible to control sample size by estimation domain (O/D/Commodity cells)
=> random sample size in O/D/Commodity cells
=> poor precision in many estimation domains
One solution: increase sample size
- Old TCOD: 0.5 M shipments (sampling fraction: 0.8%)
- New TCOD: 7.4 M shipments (sampling fraction: 11.2%)

2. REDESIGNED TCOD
Data Collection
A) Personal on-site visits
- Similar process to the old TCOD
- Improved CAPI application
- 79% of the sampled companies (was 91%)
- Reduction of the overall collection costs (since this collection method is expensive)
- 0.2 M shipments (comparable to the old TCOD)

2. REDESIGNED TCOD
Data Collection
B) Profiling using CATI
- Used for all companies with < 50 combinations of Origin/Destination/Type of commodity
- 21% of the sampled companies (was 9%)
- 3.7 M shipments in the sample (49% of the sample)
=> Profiling makes it possible to:
- Reduce collection costs
- Improve precision (through an increased sample size)

2.
REDESIGNED TCOD
Data Collection
C) Electronic Data Reporting (EDR)
► First years of the new TCOD:
- for the same 7 large companies
- 100% of their data (only 5% in the old TCOD)
- 3.6 M shipments (48% of the total sample)
- automation of coding + imputation
► Future years:
- potentially 200+ companies
=> EDR will make it possible to:
- Reduce collection costs
- Improve precision (through an increased sample size)

2. REDESIGNED TCOD
Sample Design
4-Stage Design:
- 1st stage: Stratified SRSWOR of companies
  - Must-take strata for Profile & EDR companies
- 2nd stage: Sample of a period of time (e.g., a 6-month period)
- 3rd stage: Systematic sample of shipping documents
- 4th stage: Systematic sample of shipments

2. REDESIGNED TCOD
Domain Estimation
\[
\hat{Y}(d) = \sum_{h=1}^{H} \sum_{i=1}^{n_h} w_{1hi} \sum_{j=1}^{r_{hit}} \sum_{k=1}^{m_{hitj}} w_{2hit} \, w_{3hitj} \, w_{4hitjk} \, y_{hitjk}(d)
\]
where:
- \(y_{hitjk}\) = value of the variable of interest for the shipment k on shipping document j from the survey period t of company i in stratum h
- d = domain of interest
\[
y_{hitjk}(d) =
\begin{cases}
y_{hitjk} & \text{if } hitjk \in d \\
0 & \text{elsewhere}
\end{cases}
\]
>> Variance estimation: Jackknife method

3. CANADIAN APPROACH vs. Other Commodity Flow Surveys
- Most other commodity flow surveys: collect shipment information from the shippers
- Canadian TCOD: collects shipment information from the carriers

3. CANADIAN APPROACH
Advantages
- Survey population clearly defined: no subjective decision on which industries (NAICS) to include
- Collection via EDR & profiles
  - large increase of sample size at a minimal cost
  - reduces sampling errors
  - estimates at a more detailed level
- On-site collection
  - reduces non-sampling errors
  - higher response rate => reduces nonresponse bias

3. CANADIAN APPROACH
Disadvantages
- Incomplete coverage of trucking activity
- On-site collection is very expensive
- Variable "value of commodity" cannot be collected

4. COLLECTING ELECTRONIC DATA
Challenges
- Companies' data vs.
TCOD variables
  - file formats + concepts
- Security of electronic data
- Automation of the processing
  - coding of commodities and origin/destination
  - imputation of commodities

5. CONCLUSION
Canadian Approach
- Collection from the carriers:
  - Larger sampling fraction => reduces sampling errors
- On-site collection:
  => reduces non-sampling errors
  => higher response rate
- Electronic data collection: huge potential to be developed in future years!

For more information, please contact:
François Gagnon, Francois.Gagnon@statcan.ca
Krista Cook, Krista.Cook@statcan.ca
www.statcan.ca
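
Supplementary note on the Domain Estimation slide: the estimator shown there is a weighted sum of shipment values, restricted to the shipments that fall in the domain d. The Python sketch below illustrates only that arithmetic. It assumes that w1 to w4 are the weights attached to the four sampling stages (company, period, shipping document, shipment), which the slides do not state explicitly. The `Shipment` record, the `domain_total` helper, and all sample values are hypothetical and are not TCOD data or Statistics Canada code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Shipment:
    """One sampled shipment (hypothetical record layout)."""
    w1: float         # assumed stage-1 weight (company i in stratum h)
    w2: float         # assumed stage-2 weight (survey period t)
    w3: float         # assumed stage-3 weight (shipping document j)
    w4: float         # assumed stage-4 weight (shipment k)
    y: float          # variable of interest, e.g. tonnage
    origin: str
    destination: str
    commodity: str


def domain_total(shipments: Iterable[Shipment],
                 in_domain: Callable[[Shipment], bool]) -> float:
    """Weighted domain total: shipments outside the domain contribute 0."""
    return sum(s.w1 * s.w2 * s.w3 * s.w4 * s.y
               for s in shipments if in_domain(s))


# Made-up sample of three shipments, for illustration only.
sample = [
    Shipment(2.0, 2.0, 5.0, 1.0, 10.0, "ON", "QC", "051"),
    Shipment(1.0, 2.0, 10.0, 2.0, 5.0, "ON", "QC", "051"),
    Shipment(1.0, 1.0, 1.0, 1.0, 7.0, "ON", "BC", "061"),
]

# Domain d: commodity 051 moving from ON to QC.
estimate = domain_total(
    sample,
    lambda s: (s.origin, s.destination, s.commodity) == ("ON", "QC", "051"),
)

# Domain covering every shipment gives the overall weighted total.
overall = domain_total(sample, lambda s: True)
```

Because the domain indicator only switches individual shipments on or off, the same weighted records can serve any Origin x Destination x Commodity cell. The number of sampled shipments that land in a given cell is not controlled by the design, which is the random cell sample size noted earlier in the presentation.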