Challenges in Information Fusion Technology Capabilities for Modern Intelligence and Security Problems

Speaker: Prof. Sten F. Andler
Director, Infofusion Research Program
University of Skövde, Skövde, Sweden

(*) Author: Dr. James Llinas
Center for Multisource Information Fusion
University at Buffalo, Buffalo, New York, USA
llinas@buffalo.edu
(*)

Key Information Fusion Challenges Driven by Operational Problems and Modern IT
• Heterogeneity of Data, Information
• Common Referencing and Data Association Impacts
• Dealing with Semantics
• The Entry of Graphical Methods
• Architecting Systems and Analytic Frameworks

Heterogeneity of Data/Information
Heterogeneity from modern IT capabilities/problems and networked systems
Lack of reliable a priori knowledge to support dynamic deductively-based reasoning
“Weak Knowledge” problems
• Observational
  – “Hard” Sensor Data and “Soft” linguistic/reported/unstructured Data
• Open-source & Social Media
  – Issues: Mostly in linguistic form; Trust, Volume, Formats, Modalities
• Contextual differences
  – Issues: Format, Middleware requirement, dynamics, relevance
• Ontological differences
  – Issues: Multiple-ontology cases, semantics, dynamics, relevance
• Learned knowledge
  – Issues: integrating inductive and other inferencing procedures

Some Impacts due to Data Heterogeneity
• Soft (linguistic) data: New preprocessing Front Ends; requirement for semantically robust Text Extraction/NLP processes
  – Marginally available today
  – If not extracted, properly labeled entities never enter the Fusion process
  – If not tagged with some level of (reliable) uncertainty/confidence, entity uncertainty not considered
    • Confounds both Common Referencing and Data Association
• Exploiting Contextual Data requires Middleware to condition data in a form usable by the Fusion process (native form to usable form)
  – Can also require hybrid algorithms, e.g., context-aided Kalman Filter designs
• In networked systems, there can be multiple Ontological versions being used
  – Creates a need for ontological normalization (Common Referencing function)
  – Also impacts Data Association; inconsistent nomenclature will prevent feasible associations
• Information learned in real-time creates a Level 4 Knowledge Management functional requirement, and real-time adaptation that can include dealing with out-of-sequence evidence (retrospective adaptation)

Some further Impacts regarding Common Referencing and Data Association
• Common Referencing
  – Temporal alignment within streaming Soft data feeds is challenging
    • Dealing with linguistic tense: past/present/future
      – Impacts correct Temporal Reasoning
        » Creates a need for agile Temporal Reasoning
  – Networked environments open the possibility for inconsistent forms of uncertainty representation
    • Creates a need for uncertainty transforms, normalization methods
• Data Association
  – Major impact due to Soft (linguistic) data and availability of Relational links
    • Association now of higher dimension: Entities/attributes and inter-entity Relations; becomes a Graph Association problem
    • New scoring functions required; e.g., Relational similarity

Representative Impacts regarding Common Referencing and Data Association, cont.
G. Tauer, R. Nagi, M. Sudit, "The graph association problem: Mathematical models and a Lagrangian heuristic," Naval Research Logistics (NRL), Volume 60, Issue 3, pages 251–268, April 2013

Representative Impacts regarding Graphical Forms and Operations
• Graphs as a Representational Form
  – The standard for language representation
  – Deals with Entities and Relations
  – Quantitatively-based; visually manageable
• Graph-based Analytics
  – Framework for Data Association as shown
  – Evidential searching/matching (supports query-based, discovery-based analysis)
• Variety of Graph-Matching paradigms, issues
  – Stochastic due to tagged uncertainties in graph elements
  – Incremental to handle streaming real-time data
  – Large scale to handle “Big Data”; e.g., Cloud-based

Some further Impacts regarding Semantics
• Optimal strategies for semantic “control”: control of semantic complexities
  – Rigorous control of Ontologies
  – Controlled vs Uncontrolled Languages
    • E.g., Battle Management Language
  – Robust Text Extraction, NLP
  – Role of Human Mediators in system architecture
    • Speed (automation) vs semantic accuracy
• Semantic Uncertainty
    • Vague predicates; issue of Truth, which leads to 3-valued forms of Uncertainty Representation

Some Impacts regarding System Architectures and Analytical Frameworks
• Many problems are “Weak Knowledge” problems wherein the extent of reliable a priori dynamic knowledge about the domain is limited
• This motivates an approach that must combine deductive and inductive (or abductive) methods in an effective way
  – These tend to require technologies that support discovery and learning-based hypothesis-formulation strategies
• Methods such as Complex Event Processing, Probabilistic Argumentation, and Graph-based Relational Learning are some of the new inferencing methods being studied.
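
Illustration (not from the original slides): the Data Association slides note that association becomes a graph problem once inter-entity Relations are scored alongside entity attributes. The sketch below is a minimal toy version of that idea, assuming two tiny evidence graphs and a brute-force search over one-to-one node mappings. It is not the Lagrangian heuristic of the cited Tauer, Nagi, and Sudit paper. All names (graph_a, graph_b, attribute_similarity, relational_similarity, associate, relation_weight) and the sample data are hypothetical, and the scoring functions are deliberately simple stand-ins.

```python
from itertools import permutations

# Hypothetical toy data: two small evidence graphs, each with node
# attributes (entities) and labeled directed edges (relations).
graph_a = {
    "nodes": {
        "a1": {"type": "person", "name": "J. Smith"},
        "a2": {"type": "vehicle", "name": "white van"},
    },
    "edges": {("a1", "a2"): "drives"},
}
graph_b = {
    "nodes": {
        "b1": {"type": "vehicle", "name": "white van"},
        "b2": {"type": "person", "name": "John Smith"},
    },
    "edges": {("b2", "b1"): "drives"},
}


def attribute_similarity(x, y):
    """Fraction of attribute keys on which two nodes agree exactly."""
    keys = set(x) | set(y)
    if not keys:
        return 0.0
    return sum(x.get(k) == y.get(k) for k in keys) / len(keys)


def relational_similarity(ga, gb, mapping):
    """Fraction of ga's edges whose mapped endpoints carry the same label in gb."""
    if not ga["edges"]:
        return 0.0
    hits = sum(
        gb["edges"].get((mapping[u], mapping[v])) == label
        for (u, v), label in ga["edges"].items()
    )
    return hits / len(ga["edges"])


def associate(ga, gb, relation_weight=0.5):
    """Brute-force search for the best one-to-one mapping of ga's nodes into gb.

    Assumes gb has at least as many nodes as ga; only feasible for tiny graphs.
    """
    a_nodes = list(ga["nodes"])
    if not a_nodes:
        return {}, 0.0
    best_mapping, best_score = None, -1.0
    for perm in permutations(gb["nodes"], len(a_nodes)):
        mapping = dict(zip(a_nodes, perm))
        node_score = sum(
            attribute_similarity(ga["nodes"][a], gb["nodes"][b])
            for a, b in mapping.items()
        ) / len(a_nodes)
        score = ((1 - relation_weight) * node_score
                 + relation_weight * relational_similarity(ga, gb, mapping))
        if score > best_score:
            best_mapping, best_score = mapping, score
    return best_mapping, best_score


mapping, score = associate(graph_a, graph_b)
print(mapping, score)
```

In this toy case the name strings for the person differ ("J. Smith" vs "John Smith"), so attributes alone give only partial support; the matching "drives" relation raises the score of the correct pairing. A realistic system would replace the exact-match attribute test with an uncertainty-aware similarity and the exhaustive search with a scalable solver.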

Representative Architectures: Inductive + Deductive
Earliest Thoughts on Combining Inductive and Deductive Inferencing for Fusion*
* Integrating the Data Fusion and Data Mining Processes, Ed Waltz, Natl Symp on Sensor and Data Fusion, 2004

Representative Architectures: Hard and Soft Fusion Processes; Disparate Analytic Tools
[Figure: service-oriented architecture. Recoverable labels: Intel Cell or Company Opns Intell Support Team; Evidence and Entity-estimate Foraging Services; Sensemaking Services; Analytic Support Services; Enterprise Service Bus; Hard (sensor) fusion; Soft (intel) fusion; Information (Evidence) Services; (Sensor) Data and Computational Services; Core Enterprise Services]

Summary
• Requirements for Data and Information Fusion Processes and Systems have gone far beyond the goal of estimating properties and geometries of entities
  – Dealing with complex Semantics, inter-entity Relations, Social Media and other Contextual effects, complex Temporal dynamics, and Heterogeneous Data has made the design of IF Systems a markedly new challenge.
• Incremental advances and accomplishments are being realized, but there is much to be done.
• Major advances are needed in dealing with more complex inferencing challenges to support efficient learning and discovery processes.
• New partnerships are needed across various multidisciplinary areas in order to address these new complexities.