Learning From the Past and the Present
Those who refuse to learn from the past are doomed to repeat its failures.
Presented by D.E. Moon, CDT Core Decision Technologies, Inc.
© 2001 CDT Core Decision Technologies Inc.

Let's Put Inventory Data Into Perspective
(Diagram: the Knowledge Progression - Data → Information → Prediction/Knowledge → Decision, linked by the processes of organization, interpretation/analysis, integration/modelling, and decision criteria.)

Slide #2: A Brief History of Land Resource Inventory in Canada
- 1950s: Traditional agricultural soil inventories.
- 1960s: Canada Land Inventory (external paying client).
  - Systematic, 1:250K nation-wide land capability: Forestry, Agriculture, Recreation, Wildlife.
  - Based on climatic zonation, distribution of landscapes within zones, and soils within landscapes.
  - Criteria, procedures, data, and products were clearly defined before the inventory was started.
  - It was very well done in only 10 years.

Slide #3: A Brief History (cont.)
- 1970s: Upgrade of CLI maps to "1:100K, Multi-purpose Soil Inventories" (sold on the success of the CLI).
  - Sold as a one-time effort because soils are stable!
  - Able to answer a wide range of suitability questions!
  - Landscape / hydrology / genetic bias, soil associations.
  - Modal data, many attributes, few sites.
  - Beautiful, well-edited technical reports and maps.
  - When potential clients came to us, we could not answer their questions. And so closed the 70s.

Slide #4: A Brief History (cont.)
- 1980s: We told potential clients that we could not answer their questions because we needed more detail.
  - We moved from 1:100K to 1:20K and we tried to answer everything!
  - At 10-20x the cost of answering the client's question, and the extra questions were answered poorly.
- 1987 brought the closure of the B.C. Soil Survey program, the largest provincial soil inventory program in Canada.

Slide #5
(Diagram: two pyramids, each Data → Information → Knowledge/Prediction → Decision.)
We had moved from a few decisions based on sound data to many decisions based on unsound/inappropriate data.

Slide #6: A Brief History (cont.)
- 1990s: Reverse direction.
  - 1:1 million Soil Landscapes of Canada: minimum data set but complete national coverage.
  - But we had large gaps in coverage and data:
    - missing coverage mapped with no verification;
    - missing data estimated by expert systems or regional regressions applied nationally.
  - We now claimed to have complete national coverage and were ready to answer questions!

Slide #7: A Brief History (the final chapter)
- 1994: We got a real, internal client. It needed indicators of environmental sustainability.
  - Indicator procedures were developed by our scientists.
  - Our "data" was evaluated but found wanting.
  - "Pedo-transfer" functions were developed by inventory experts to infer required parameters from our previous estimates,
  - resulting in an internally contradictory, dialectical disunity (e.g. we had more soil water than voids to hold it).

Slide #8: A Brief History (the epilogue)
- 1995: National Soil Database supported by a National Research Centre: 120 FTEs, 11 regional offices, annual budget $13 million.
- 1997: Supported by a section of a regional research centre: 11 FTEs, annual budget $0.5 million.
Slide #9: The final irony
- 1997: Statistics Canada launched a program to recover the digital database for the 1970s Canada Land Inventory. Why?
  - It provides complete national coverage.
  - It allows comparison of resource values.
  - It is consistent, non-contradictory, and does what it was designed to do.
  - It is better than anything produced since!

Slide #10: So were we stupid or what?
- The closure of the Province of B.C.'s soil survey program in 1987 was a wake-up call for some, but not many.
- We tried a number of things. We looked at our clients, our mandate, how we did our jobs, how we packaged the results, and how we promoted our products.

Slide #11: Our clients
- Until the end, the inventory community was our only paying client.
- Therefore, until the end, we defined our own needs, defined our own procedures, set our own priorities, performed our own QA/QC, and did our own evaluations.
- Boy, did we look good!

Slide #12: Our mandate
- Our job was to collect and interpret land resource data (we decided for what purpose).
- We convinced senior management that our products were widely needed.
- In the end, we did not deliver and had no real clients.
- We believed that we knew what potential clients needed better than they did. We were, after all, the land resource experts!

Slide #13: How we did our job
- We actually did evaluate some of our inventory procedures:
  - map units (what we drew lines around);
  - inventory procedures (cost effectiveness);
  - data management procedures and systems;
  - reliability.
- The results failed to inspire confidence, so they were deemed non-representative and irrelevant.

Slide #14: How we packaged it
- We moved to "productization":
  - standard map and report formats;
  - standard interpretations;
  - standard packaging;
  - computer automation;
  - electronic distribution;
  - promotion and advertising.
- But we still had inappropriate, inaccurate, and incomplete data that could not answer clients' questions reliably.

Slide #15: So what was missing?
- Paying clients (who could hold us accountable).
- With real problems and needs to provide direction!
- When we finally accepted a client (1994):
  - We let them define the problem.
  - We adopted or developed procedures to solve it.
  - We did a functional analysis to determine data needs, and we discovered that we did not have the required data.
  - Too late. We had told everyone that we had the data, so we made it up.
- So was it all inevitable? No!

Slide #16: What should we have done?
- Determined what we could reliably map, not what we wanted or hoped to map.
- Found a formal client with clear objectives, one to whom we would be accountable.
- Involved the client in defining deliverables.
- Determined the degree to which we could meet the client's needs, and at what cost.

Slide #17
- Been honest about what we could and could not do, and what was really needed to answer the question.
- Imposed rigorous internal and external QA/QC.
- Put the client's needs and aspirations first, not our own.
- Conducted follow-up: How well were the client's needs being met? How could the needs have been better met?

Slide #18: Why Did We Fail?
- Management reasons
- Technical reasons
- Political reasons
- Human resource reasons
- Human nature reasons

Slide #19: Management Causes
- Mandate was inventory, not decision support.
- Not tied to departmental business functions.
- Project approval was based on internal priorities.
- No external clients or accountability.
- Inappropriate performance criteria.
- No executive commitment to needed change.

Slide #20: Technical Causes
- Interpretations and decision support were secondary and, therefore, poorly served.
- We had spent a bundle in the '60s and '70s on:
  - Soil classification: the only national taxonomy in Canada, and we were damn well going to map it.
  - Soil interpretation: everything from crop suitability to suitability for septic fields, based on site data and at scales of 1:15K to 1:1 million.

Slide #21
- Legacy systems constrained progress.
- They were designed to do the wrong thing, but since the systems were available we continued to use them.
- Data which the legacy systems did not handle were ignored, generalized, or forced to fit.
- Decision support procedures which the systems could not handle were rejected.

Slide #22: Political causes
- Line and middle management power politics (aided by senior management's lack of commitment to change):
  - managers feared the success of subordinates who were allowed to use new skills and techniques;
  - managers feared loss of control if they could not develop or master the new procedures;
  - managers feared that the introduction of new procedures would imply that the previous procedures were wrong.

Slide #23: Human Resources
- Demographics:
  - last significant staffing was late 1970s;
  - academically qualified people not available;
  - staffed rather than lose the position;
  - indoctrinated new staff into the current system;
  - present managers were the product of 1970s indoctrination;
  - attitudes and positions of the 1970s and 1980s became entrenched and then retrenched.
- Note similar conditions in information technology today.

Slide #24: Human nature
- Fear of no longer being able to do their job.
- Fear of losing hard-won intellectual equity if it is made obsolete by new procedures.
- Fear of losing professional status and influence if no longer "the expert".
- Resistance to learning new ways of doing what you already know how to do.

Slide #25: So what else should we have done?
- Commit to adaptive change (senior and middle management).
- Develop a client-centric mode of operation:
  - let them define their problems;
  - jointly develop procedures to solve the problems.
- Develop relevant performance measures:
  - reward innovation and adaptive change;
  - incorporate client satisfaction;
  - require rigorous problem analysis.

Slide #26: Learning from the Present?
- Your present is looking a lot like our past.
- High credibility based on the success of BEC zones and sub-zones and the identification of site-series.
- You have a tremendous volume of legacy data. They are at best suspect, at worst wrong, and most probably inadequate.
- You are being pressured to use it to answer today's site management questions at low cost.
- You are selling an untested product. Wanting and believing does not make it so.
Slide #27: Highlights of a Problem Analysis of Input Data Quality for PEM
- An initial assessment of input data quality identified potentially serious problems with:
  - spatial accuracy and resolution;
  - thematic accuracy and resolution.

Slide #28: Spatial Issues
(Diagram: spatial overlay of a terrain polygon, A7B3, and a soil polygon, X5Y5 - alluvial fans that should correspond do not coincide.)

Slide #29: Thematic Issues
- Compound map unit overlay (terrain unit A7B3 overlaid on soil unit X5Y5) yields combined components whose proportions are known only as ranges:
  - AX: from 21% to 50%
  - AY: from 21% to 50%
  - BX: from 0% to 30%
  - BY: from 0% to 30%

Slide #30: Thematic Reliability
Estimate and 95% confidence interval for the area of defined error classes:

  Class         Area   95% Confidence Interval
  Correct       17%    8-27
  Similar       18%    9-21
  Dissimilar     7%    2-12
  Contrasting   61%    48-73
  Diss + Cont   68%    57-78

Slide #31: Approach to an Input Data Evaluation Framework
- Documented the mapping concepts used in inventories: map units and reliability.
- Documented the elements of data quality: spatial and thematic.
- Defined the effect of mapping procedures on data quality.
- Specified the meta-data required to evaluate data quality.
- And we then developed a framework and criteria for evaluating data inputs to PEM.

Slide #32: Conclusion
- Input data quality is highly variable.
- PEM, for all its strong appeal, will require a tightly reasoned, widely accepted input data evaluation framework if it is to achieve wide acceptance and application.
- The framework developed in the Input Data Quality Report provides an effective basis for evaluating both PEM input and PEM output data quality.

Slide #33: Knowledge-based systems
- Knowledge-based systems incorporate data and information with rules, relationships, probabilities, and logic or models to:
  - predict outcomes;
  - support decisions;
  - classify unknowns;
  - identify new relationships and patterns.

Slide #34: Examples of older and current knowledge-based systems
- Taxonomic models (intuitive, empirical, no causality)
- Statistical models (empirical, no causality)
- Process models (causal, no feedback)
- Systems models (causal, with feedback mechanisms)
- Expert systems (heuristic, no causality), e.g. Eldar
- Artificial neural networks (non-rigorous statistics)
- Belief matrices and decision trees (heuristic), e.g. Ecogen

Slide #35: The new knowledge paradigm
- Ontology: a representational vocabulary for a shared domain of discourse – definitions of classes, relations, functions, processes and other objects.
- Ontologies will be the basis for knowledge integration, analogous to data dictionaries and data models for data management and integration.

Slide #36: Knowledge management
- The definition and maintenance of knowledge standards and protocols is a necessary precursor to knowledge management.
- The management of knowledge-based systems will require:
  - a knowledge syntax and semantic lexicon;
  - domain ontologies for the knowledge area (a minimal sketch of such an ontology follows Slide #37).

Slide #37: The current situation
- Knowledge management is following an evolution similar to data and information management.
- Many organizations are building knowledge bases in an uncoordinated and ad hoc manner.
- There is growing recognition of the need for and value of knowledge sharing and reuse.
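To make the ontology idea on Slides #35-36 concrete, here is a minimal sketch of how a shared vocabulary of classes and relations for ecosystem map units might be encoded, using Python as the notation. The class names, relations, and the example rule are hypothetical illustrations chosen for this sketch; they are not drawn from any actual TEM or PEM ontology.

# A minimal, hypothetical sketch of a domain ontology for ecosystem mapping.
# All class names, relations, and the example rule are illustrative only.
from dataclasses import dataclass, field

@dataclass
class OntologyClass:
    name: str                                    # e.g. "SiteSeries"
    definition: str                              # the shared, agreed definition
    parents: list = field(default_factory=list)  # is-a relations to other classes

@dataclass
class Relation:
    name: str        # e.g. "occurs_on"
    domain: str      # class the relation applies to
    range: str       # class the relation points to

# The shared vocabulary: classes and the relations that connect them.
ontology = {
    "classes": [
        OntologyClass("MapUnit", "An area delineated on a map and labelled with one or more classes"),
        OntologyClass("SiteSeries", "An ecological site unit defined by climate, moisture and nutrient regime", ["MapUnit"]),
        OntologyClass("TerrainUnit", "A surficial material and landform class", ["MapUnit"]),
        OntologyClass("MoistureRegime", "A relative soil water supply class"),
    ],
    "relations": [
        Relation("occurs_on", "SiteSeries", "TerrainUnit"),
        Relation("has_moisture_regime", "SiteSeries", "MoistureRegime"),
    ],
}

# A PEM-style inference expressed against the shared vocabulary; because the
# terms are defined once, rules from different knowledge bases can be stored,
# compared, and reused.
example_rule = {
    "if":   {"TerrainUnit": "fluvial fan", "MoistureRegime": "subhygric"},
    "then": {"SiteSeries": "example_site_series_07"},
}

The point is only that once classes and relations are declared in one agreed structure, knowledge bases built by different groups can be stored, retrieved, and compared against the same vocabulary, which is exactly the sharing and reuse the following slides call for.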
Slide #38: The current situation
- As with data management, the major impediments to knowledge sharing and reuse are:
  - inconsistent concepts, definitions, terminology, structures, and formats;
  - in addition, there is no standard syntax or semantics of inference to support communication;
  - standards and protocols to enable knowledge sharing and reuse are only just emerging.

Slide #39: Conclusions
- It is feasible to establish a generic knowledge structure that will accommodate most, if not all, evolving knowledge models, including those used in TEM and PEM.
  - This structure would be able to store, retrieve, and share disparate knowledge bases.
- Integration into a true knowledge management system, with a common syntax and semantics of knowledge inference and retrieval, is also feasible but would be much more difficult and costly.

Slide #40: Conclusions
- The generic knowledge structure would accommodate the ontology of the TEM and PEM approaches.
- The ontology would include the definitions of classes, concepts, relations, functions, and processes assumed to produce the classes recognized in the TEM classification, and would also accommodate the inferences used in PEM.

Slide #41: Our end, your beginning?

Slide #42: When interpretation was attempted we discovered that:
- Although the maps were reasonably accurate,
  - the necessary data had not been collected, or
  - the data was in the wrong format, or
  - we had used the wrong method of analysis, or class limits or precision were inappropriate, or
  - the map units and scale severely limited the site specificity of the interpretation (e.g. a wide range of response could be expected in most polygons).

Slide #43: Agriculture Land Reserve
- Survey cost: $3-10 million; reliability of predicting reserve status: 92%.
- Post completion, we did a pilot project using air-photo-interpreted land use as an indicator of reserve status:
  - the reliability of predicting reserve status was 96%;
  - the estimated cost of a full survey was $0.3 million.

Slide #44: Now when interpretation was attempted we discovered that:
- The necessary data had been collected,
- the data was in the correct format,
- we had used the appropriate method of analysis, and
- we had used appropriate soil definitions, class limits, and precision for the required interpretations, but

Slide #45
- map units and scale were still inappropriate:
  - Compound map units (TEM convention) gave conflicting interpretations.
  - The maps were unreliable: thematic reliability < 30%.
  - Mappers were restricted to naming 3 soils; maps had a modal value of 7 soils per polygon.
  - Interpretations required greater precision than could be mapped at 1:20K.

Slide #46: Minimum data set
- Review of the National Soil Database (a sketch of this kind of completeness check follows below):
  - an average of 2,500 attributes per site, for about 12,000 sites;
  - only 10 attributes were used routinely in interpretations;
  - only 30% of data sets had all 10;
  - collection of these 10 takes about 1/8 - 1/4 of the field time and 1/20 of the lab costs;
  - it was deemed inappropriate to collect the missing data!
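The kind of review described on Slide #46, checking how many sites actually carry the handful of attributes that interpretations rely on, is straightforward to automate. Below is a minimal sketch assuming a simple site table keyed by site ID; the ten attribute names are hypothetical placeholders, not the ten attributes the National Soil Database review actually identified.

# Hypothetical completeness audit of a site database against a minimum data set.
# Attribute names below are illustrative placeholders only.

MINIMUM_DATA_SET = [
    "texture", "coarse_fragments", "depth_to_restricting_layer", "drainage_class",
    "organic_carbon", "ph", "bulk_density", "slope", "stoniness", "rooting_depth",
]

def completeness(sites: dict[str, dict]) -> tuple[float, dict[str, int]]:
    """Return the fraction of sites with every minimum attribute present,
    plus a per-attribute count of missing values."""
    missing = {attr: 0 for attr in MINIMUM_DATA_SET}
    complete_sites = 0
    for attributes in sites.values():
        site_complete = True
        for attr in MINIMUM_DATA_SET:
            if attributes.get(attr) is None:
                missing[attr] += 1
                site_complete = False
        complete_sites += site_complete
    return complete_sites / max(len(sites), 1), missing

# Example with two fabricated sites: one complete, one missing bulk density.
sites = {
    "S001": {attr: 1.0 for attr in MINIMUM_DATA_SET},
    "S002": {attr: (None if attr == "bulk_density" else 1.0) for attr in MINIMUM_DATA_SET},
}
fraction, gaps = completeness(sites)
print(f"{fraction:.0%} of sites have the full minimum data set")  # 50%
print({k: v for k, v in gaps.items() if v})                       # {'bulk_density': 1}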
Slide #47: Selling the Product
- We used models developed at the plot level (10 m²) to make predictions at 1:1 million (polygons of 10-50 km²).
- The average polygon had 10-30 major soils.
- The algorithm accessed one detailed soil description and extrapolated the results to tens of polygons and hundreds to thousands of hectares.
- The required data was most frequently estimated.
- One enterprising fellow sold Health Canada on a 1:1 million map of suitability for carcass burial.

Slide #48: Data Inventory Requirements
- Understand the questions to be answered.
- Determine the data needed to answer the questions.
- Evaluate existing data for adequacy (see the sketch below):
  - It may be that a sub-optimal but adequate procedure can be developed for existing data.
  - It may be that adequate data does not exist and must be collected.
  - If the needed data cannot be collected at appropriate cost and reliability, the requirement cannot be met. Look for alternative approaches.
- It did not matter how badly Reagan wanted Star Wars; it just could not be built.
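The decision flow on Slide #48 can be written down directly. The sketch below is a hypothetical rendering of that logic in Python; the reliability threshold, the affordability flag, and the idea of measuring adequacy as a comparison of required versus available attributes are assumptions made for this illustration, not criteria given in the presentation.

# Hypothetical sketch of the Slide #48 decision flow:
# understand the question -> determine data needs -> evaluate existing data ->
# use, adapt, collect, or declare the requirement unmeetable.
from dataclasses import dataclass

@dataclass
class Requirement:
    question: str
    needed_attributes: set[str]

@dataclass
class ExistingData:
    attributes: set[str]
    reliability: float          # 0-1, from an independent check

def evaluate(req: Requirement, data: ExistingData,
             min_reliability: float = 0.8,
             collection_affordable: bool = False) -> str:
    """Return one of: 'use existing data', 'adapt procedure',
    'collect new data', or 'requirement cannot be met'."""
    missing = req.needed_attributes - data.attributes
    if not missing and data.reliability >= min_reliability:
        return "use existing data"
    if not missing:
        # data exist but are weak: a sub-optimal, adapted procedure may still be adequate
        return "adapt procedure"
    if collection_affordable:
        return "collect new data"
    # needed data cannot be collected at appropriate cost and reliability
    return "requirement cannot be met - look for alternative approaches"

# Example: a site-suitability question where depth data were never collected.
req = Requirement("Is this polygon suitable for X?", {"texture", "drainage", "depth"})
data = ExistingData({"texture", "drainage"}, reliability=0.3)
print(evaluate(req, data))   # requirement cannot be met - look for alternative approaches

Trivial as it is, writing the flow down forces the question the slide insists on: what, exactly, is needed to answer the client's question, and can the existing data supply it?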