Data & Information Integration Framework for Highway Projects
Mid-Continent Transportation Symposium
Asregedew Woldesenbet, David H. Jeong (Ph.D.), Michael P. Lewis (Ph.D., P.E.)
August 15, 2013

Outline
◦ Research Question
◦ Lessons Learned
◦ Methodology
◦ Evolution
◦ Integration Framework
◦ Case Study
◦ Gap Analysis
◦ Conclusion/Future Work

Research Question
Does the data currently being collected provide the information needed for decision-making?
◦ Minimal recognition of, or interest in, using these data
◦ Lack of in-house resources and capabilities to analyze data
◦ Insufficient data for any meaningful analysis
◦ Nonstandard/non-digital data formats
◦ Poorly defined procedures/mechanisms

Lessons Learned
◦ Strategic decisions supported by statistically reliable information
  - Credit card industry
  - Retail industry
  - Healthcare industry
◦ Big Data
◦ System/Tools
  - KM tools and KDD approaches: DM, AI, DSS, ML, BI
  - Management philosophies: BPR, TQM, SCM, CE, LC
  - Database systems: ontology frameworks, cloud computing

Generations of Data & Information Management: Transportation Industry
[Figure: three generations described by data collection, approach, and system.]
◦ 1st Generation: manual/paper-based data collection; expert judgment; file cabinet, PC
◦ 2nd Generation: semi-automated data collection; Excel/statistics; database/data warehouse
◦ 3rd Generation: automated data collection; KDD/DM; knowledge portal

Evolution of Data and Information Integration for Highway Agencies
[Figure labels: various databases and data collection efforts; Integrated Data & Information Framework; active information & knowledge extraction to support decision making.]

| Generation | Data Collection | Approach | System |
|---|---|---|---|
| 1st Generation | Manual/paper-based | Expert judgment | File cabinet (e.g., contract documents); PC (e.g., cost data); database (e.g., road inventory); other databases |
| 2nd Generation | Semi-automated/automated | Statistical tools; artificial intelligence | Project management system; database (e.g., SiteManager); data warehouse (DW) |
| 3rd Generation | Automated; standard data collection procedure | Pattern recognition; knowledge discovery in databases (KDD); data mining (DM) | Ontology-based knowledge management system; big data analytics algorithm; knowledge portal (e.g., cloud-based system) |

Data & Information Integration
[Figure: a context graph (input, processor, output) linking data D1...Dn, information I1...In, and decisions DM1...DMn, shown next to an input/output matrix in row form, column form, and element form.]

Data & Information Integration Framework
[Figure: Three-Tiered Hierarchical Framework. Data (D11...Dmn, held in Database I through Database N) feed information (I11...Imn), which feeds decisions (DMA, DMB, ..., DMN) across the planning, design, bidding, construction, and operation phases. Legend: active path, inactive path, non-existing path.]

Case Study
◦ Daily Work Reports (DWR)
◦ Preconstruction Cost Data
◦ Pavement Condition Data

Case Study

| Division/Source | Database | Type of Data | Sub-Elements | Collection Method |
|---|---|---|---|---|
| Planning/Research | Grip lite/Highway Inventory System | Roadway Inventory | Functional class, right-of-way, route classification, terrain area type, railroad crossing, etc. | Manual/Semi-Automated |
| | | Traffic | Average Annual Daily Traffic (AADT), signals, lighting, traffic control, crash statistics, etc. | |
| | | Bridge Inventory | Bridge span, width, length, load limit, inspection reports, etc. | |
| Preconstruction | In-house spreadsheets | Preliminary Engineering Data | Engineering hours, number of sheets, etc. | Manual |
| Construction Division | SiteManager | Construction Data | Daily work report, reported quantity, material, change order, contractor payment, etc. | Manual |
| Pavement Management | Pavement Management System (PMS) | Pavement History | Pavement surface type, thickness, composition, etc. | In-house - Automated |
| | | Distress Data | Longitudinal cracking, transverse cracking, patching, spalling, fatigue, etc. | Consultant - Roadware |
| | | Friction Data | Average roughness, ride, average rut, etc. | In-house |
| | | Other (structural) | Deflectometer (FWD), ESAL | In-house Roadrater |

Current Data Utilization
DWR data attributes, with current use recorded as either No Use or one of the information items I1-I4 (contractor payment, percentage completion, reporting, dispute resolution).

| Data | Attribute | Description | Data Type |
|---|---|---|---|
| D1 | Contractor ID | ID 000001-100000 | Numeric: Ordinal |
| D2 | Inspector Name | Last and first name | Character: Nominal |
| D3 | Date | xx/xx/xxxx | Numeric: Ordinal |
| D4 | Low Temperature | Temp. °F | Numeric: Interval |
| D5 | High Temperature | Temp. °F | Numeric: Interval |
| D6 | AM Condition | Sunny, windy, cloudy, etc. | Character: Nominal |
| D7 | PM Condition | Sunny, windy, cloudy, etc. | Character: Nominal |
| D8 | Work Suspended Time | Time AM/PM | Numeric: Ordinal |
| D9 | Work Resumed Time | Time AM/PM | Numeric: Ordinal |
| D10 | Humidity | | |
| D11 | Precipitation | | |
| D12 | Contractor | Name | Character: Nominal |
| D13 | Subcontractor | Name | Character: Nominal |
| D14 | Supervisor | Foreman, superintendent, etc. | Character: Nominal |
| D15 | Personnel | Laborer, concrete finisher, etc. | Character: Nominal |
| D16 | Supervisor Hourly Work | Number of hours | Numeric: Interval |
| D17 | Personnel Hourly Work | Number of hours | Numeric: Interval |
| D18 | Supervisor Number | Count | Numeric: Interval |

[The per-attribute use marks (X) under No Use and I1-I4 are not recoverable from this text.]

Ideal Data, Information & Decision-Making Framework
[Figure: Three-Tiered Framework for construction data held in SiteManager, spanning the planning, design, and construction phases.]
◦ Data: contractor (D1), inspector (D2), date (D3), precipitation (D11), project type (D12), distance (D29), accidents (D30), supervisor remark (D36), prime contractor work (D37), type of day (D39), etc.
◦ Information: contractor payment, production rate, accident analysis, etc.
◦ Decisions: resource allocation, contract time determination, maintenance, roadway design, bridge design, traffic & safety design, cost tracking, project management, performance measure

Gap Analysis
◦ Current data vs. ideal data
  - Missing data (D1 - D3): humidity, precipitation, etc.
  - Unstructured data (D1 - D3): remarks (D33-D41)
  - Unused data (D1 - D3): accidents (D30), delays (D31), etc.
◦ Current information vs. ideal information
  - Missing information (I5 - I9): production rate, accident analysis, as-built information, etc.
◦ Current decisions vs. ideal decisions
  - Missing decisions (DM1 - DM9): resource allocation, contract time determination, maintenance, etc.

Gap Analysis

| Criteria | Gap |
|---|---|
| Staff | Need for a data analyst or data scientist; need for a responsible party in data collection, information generation, and decision-making |
| Function | Need to define decision-maker requirements and to identify characteristics and use |
| Time | Need for data and information to reach the user or decision-maker in a timely manner |
| Availability | Missing data and information |
| Format/Structure | Need to change textual or linguistic data types; lack of a standard |
| Individuality | Divisions operate standalone units that match only their own particular needs |
| Technology | Need for appropriate tools and technology to extract information |

Conclusion
Summary
◦ DWRs are often used for reporting and for preparing legal disputes.
◦ Reported quantity and work item are the primary data used in contractor payments and in tracking project progress.
◦ More than 35% of the DWR data are linguistic in nature.
Conclusion
◦ Lack of skilled data analysts and experts to analyze data
◦ Lack of well-developed requirement analysis and performance measures
◦ Individual divisions and business processes focus on their own needs rather than on developing an integrated system

Conclusion
Data, Information & Decision-Making Guideline
◦ Requirement Analysis
  - Identify key decisions: strategic & network level decisions; program & project selection level decisions; project level decisions
  - Identify data, information & knowledge (DIK): quality function deployment; identify key performance indicators; identify database & decision tools
◦ Evaluation/Assessment
  - Check availability of data, information & knowledge
  - Assess current level of use & quality
◦ Data Generation Scheme
  - Define data, information & knowledge
  - Develop data, information, decision-making path
  - Develop new module/database
  - Perform cost/benefit analysis
◦ Data Process/Manipulation
  - Clean up data
  - Convert data into information
  - Apply appropriate tool or decision support system

Conclusion
Contribution
◦ Shows the types of data that should be collected and the information & knowledge they can generate
◦ Provides a general guide for highway agencies on making active use of existing databases
◦ Helps develop new data collection and information & knowledge generation plans to support key decisions
Future Study
◦ Emphasis on developing an enterprise-wide ontology-based framework
◦ Application of big data analytics to justify the return on investment of data collection efforts and to make effective use of the increasing amount of data
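
The gap analysis in the case study compares current and ideal data, information, and decisions across the three tiers. A minimal Python sketch of that comparison is given here as an illustration only. It assumes each tier can be treated as a set and each path as a "needs" relation; the attribute names are taken from the slides, but the specific paths (which data feed which information, and which information feeds which decision) are hypothetical and are not the study's actual matrix.

```python
from __future__ import annotations

# Hypothetical ideal framework: data needed by each information item.
IDEAL_INFO: dict[str, set[str]] = {
    "contractor payment": {"reported quantity", "work item"},
    "production rate": {"reported quantity", "personnel hours", "precipitation"},
    "accident analysis": {"accidents", "type of day"},
}

# Hypothetical ideal framework: information needed by each decision.
IDEAL_DECISIONS: dict[str, set[str]] = {
    "cost tracking": {"contractor payment"},
    "contract time determination": {"production rate"},
    "resource allocation": {"production rate", "accident analysis"},
}

# Hypothetical current practice: data collected and information generated.
COLLECTED = {"reported quantity", "work item", "personnel hours", "accidents"}
CURRENT_INFO = {"contractor payment"}


def gap_analysis(ideal_info, ideal_decisions, collected, current_info):
    """Compare current practice with the ideal three-tiered framework."""
    # All data the ideal framework would need.
    ideal_data = set().union(*ideal_info.values())
    # Data that feed at least one information item generated today.
    used_data = set().union(*(ideal_info[i] for i in current_info))
    # A decision counts as supported only if all of its information exists.
    missing_decisions = {
        d for d, needs in ideal_decisions.items() if not needs <= current_info
    }
    return {
        "missing_data": sorted(ideal_data - collected),
        "unused_data": sorted(collected - used_data),
        "missing_information": sorted(set(ideal_info) - current_info),
        "missing_decisions": sorted(missing_decisions),
    }


if __name__ == "__main__":
    gaps = gap_analysis(IDEAL_INFO, IDEAL_DECISIONS, COLLECTED, CURRENT_INFO)
    for kind, items in gaps.items():
        print(f"{kind}: {', '.join(items)}")
```

Plain sets keep the sketch close to the input/output matrix view: each dictionary entry is one row, and an active path exists only when every input in that row is available. Unstructured data such as remarks are not modeled here and would need a separate flag per attribute.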