It is well beyond the scope of this course to provide a general discussion of systems engineering issues
related to data fusion systems. After all, data fusion systems may involve a distributed information
technology system with sensors, communications networks, central information fusion software,
human-computer interactions and many other aspects. So our intent is not to provide a systems
engineering course captured in a single lesson. However, we will discuss some systems engineering
concepts as they relate to the design and implementation of a data fusion system.
The objective of this topic is to discuss some issues associated with the definition, design and
implementation of a data fusion system.
What is the environment that we face in implementing a data fusion system? Often, we need to think
through six key issues:
i) A combined hardware and software environment – We may need to develop, procure and integrate
both hardware and software to implement a fusion system. The hardware may involve new types of
sensors and specialized hardware for image or signal processing, while the software may involve the
development of large-scale software operating in a real-time environment that seeks to keep up with
the pace of the observations and required decision-making.
ii) Complex system architecture – The system architecture may involve distributed sensors and
processors with a connecting communications network, multi-processing with some algorithms
operating within and on the sensors, and some at intermediate processing sites or a central site. The
data associated with the fusion system is typically distributed and much of the data may be external to
the central processing.
iii) Stringent and changing environment – In some cases the environment for the fusion system may be
quite difficult and ever changing. Sensors may be located on aircraft, automobiles, within a machine
environment, or situations in which weather, water, humidity and other ambient environmental factors
may affect the sensing process as well as the communications between the sensors and the fusion
processing. There will almost certainly be throughput or response-time requirements (e.g., the time from observing an entity to reporting its location and identity may be very short). The
deployment environment of the sensors and processing may require special protection of the
equipment. Finally, the system may be required to be highly reliable and available 24 hours per day.
iv) Multiple users and developers – In large scale system development there will likely be multiple
individuals and organizations who are involved in the design and development process. Moreover,
there are often multiple types of users – each with their own perspectives and priorities.
v) Security requirements - When gathering, processing and disseminating critical data, there are often
security requirements. Almost every application domain, whether military, medical, industrial or environmental, has specific requirements to protect the collection and dissemination of data.
vi) Enterprise services environment – In modern environments, we are usually not dealing with a standalone, self-contained environment. Instead, we may have to rely on services (e.g., data collection,
processing, communications) provided by a larger enterprise. Such an environment requires
understanding the larger enterprise and how to access the requisite services. Emerging service
oriented architectures and web services environments provide both opportunities and challenges for
system design.
The development of a fusion system in a real setting entails a lot of software and processing beyond the
basic fusion processes. This chart shows a conceptual system involving multiple sensors, a
communications system, a message/data processing system to ingest the data from the sensors and
sources, a database management system, and human-computer interaction system. These functions
and systems interact with functions such as sensor and source management, decision planning and
cognitive aids, and single sensor/source processing. All of these in turn interact with a multi-source
fusion system. Thus, the fusion part of this larger system is only one component. In fact it may be a
relatively small part of such an overall system.
Several years ago, the large-scale U. S. Army Tactical Command and Control (TCAC) fusion system was
surveyed to identify the number of lines of source code associated with the different functions of the
overall command and control system. This was a system involving many hundreds of thousands of
lines of source code. The code associated with the data fusion functions accounted for only 12% of the
overall system. Thus, while we are focused in this course on the data fusion functions and techniques,
such techniques may constitute a relatively small part of an overall system involving sensors,
communications, databases, human-computer interactions and sensor tasking functions.
When selecting sensors or sources of information, designers may typically begin with an assessment of
“what’s available” (e.g., what sensors and sources of information do I have available to use). Instead,
however, the system designer should begin the process by determining what is the mission to be
supported by the fusion system and what information or decisions are required. What must be known,
at what level of specificity, and at what times? What is the operational and observing environment? What entities, activities or targets must be observed? How must the fusion system perform in order to be effective? Once these questions are addressed, one can begin to consider sensor technologies and an analysis of the observing environment, signal propagation and uncertainties. Finally, these analyses
drive the specification of the sensor/source requirements. One can map sensors or sources to system
performance requirements to ultimately drive the selection and design of sensors. Hence, we need to
start at the user end and define what information is really needed rather than start at the sensor or
source end (what do I have lying around).
We previously considered a formal approach for establishing requirements for sensors for a fusion
system. We can also consider a formal analysis approach for all subsystems. This chart shows the conceptual relationship between requirements for sensor systems design, processing systems design,
communications systems, and the display system.
We start by understanding the general requirements for the fusion system – what must be inferred,
what decisions must be supported, what is the operational environment, who are the users, what is the
observational environment and other factors. The previous chart described concepts for establishing
requirements for and selecting sensors for the fusion system. As part of this we need to determine what
the performance goals are for the system. This involves understanding factors such as decision
timelines, processing constraints, accuracy and specificity needs, etc. Concurrently, the test and
evaluation requirements are established to determine how we will evaluate a fusion system design –
that is, how can we determine if our system design will meet the performance requirements
established.
Ultimately, the requirements flow down analysis leads to specific requirements for sensors,
communications, fusion processes, and displays and human system interaction.
Marty Liggins and his colleagues have described three different distributed fusion architectures (see
“Distributed fusion architectures and algorithms for target tracking”, by M. E. Liggins, C-Y Chong, I.
Kadar, M. G. Alford, V. Vannicola and S. Thomopoulos, in Proceedings of the IEEE, vol 85, No. 1, January
1997). In the centralized architecture, sensor nodes feed their collected data into a central fusion node
which fuses all of the data and provides this to an information consumer node. In the hierarchical
architecture, sensors may feed several fusion nodes, which in turn interact with other fusion nodes,
finally providing information to one or more consumers. Finally, in a distributed architecture, the
sensors, fusion nodes, and consumer nodes all interact over a network environment. The selection of
a specific architecture depends upon the specific application, the selection and deployment of
sensors, and the needs of one or more users.
In this, and the next two charts, we provide an example of three generic processing architectures for
level-1 fusion; i) centralized fusion (shown here), ii) autonomous or distributed fusion shown in the next
chart, and iii) hybrid fusion shown in the third chart. Here we illustrate centralized fusion. In this case
multiple sensors each provide data directly to a centralized fusion process. The raw data must be
processed using data alignment, association and correlation methods. Then these data streams are
input to a composite tracking (or state estimation) algorithm with subsequent classification. The
output is a combined or fusion estimate of an entity’s state and identity or classification.
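To make the centralized case concrete, the sketch below combines aligned, associated measurements of a single entity by inverse-variance weighting, a hypothetical stand-in for a full composite tracking filter; all values are illustrative, not from any fielded system.

```python
# Sketch of centralized level-1 fusion: after alignment and association,
# measurements of the same entity from several sensors are combined by
# inverse-variance weighting (a minimal stand-in for composite tracking).
def fuse_measurements(measurements):
    """measurements: list of (value, variance) pairs from different sensors."""
    fused_var = 1.0 / sum(1.0 / var for _, var in measurements)
    fused_value = fused_var * sum(val / var for val, var in measurements)
    return fused_value, fused_var

# Example: two range measurements of one target; the more accurate sensor
# (variance 1.0) dominates the fused estimate.
est, var = fuse_measurements([(100.0, 4.0), (103.0, 1.0)])  # est = 102.4
```

Note that the fused variance (0.8) is smaller than either sensor's alone, which is the basic payoff of fusing the raw data centrally.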
The second architecture illustrated here involves autonomous or distributed fusion. In this case we
perform preprocessing and tracking and classification on each input sensor or source. Each sensor and
associated process provides an independent estimate of the observed entity’s state and identity. These
estimates are input into a fusion process which seeks to combine the data at a state vector/identity
level. However, in order to do that, it is still necessary to perform data alignment, association and
correlation and finally composite filtering and classification. Notice that the functions of alignment,
association, correlation, filtering and classification are still performed (as they were in the previous
architecture), however, here they are performed on state vector/identity data rather than on the raw
sensor data.
Finally, we present a hybrid architecture that combines BOTH centralized and distributed fusion. This
allows most of the processing to be performed in a distributed manner, but allows the system to “reach
back” to the sensors to access raw sensor data when needed.
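A hedged sketch of this reach-back logic is shown below; the confidence threshold, report fields and placeholder fusion functions are all invented for illustration.

```python
# Sketch of hybrid fusion: use distributed (state-vector) fusion normally,
# but "reach back" to raw sensor data when the result is not confident
# enough. Threshold, fields and fusion logic are placeholders.
CONFIDENCE_THRESHOLD = 0.9

def fuse_track_reports(reports):
    """Placeholder distributed fusion: average states, keep worst confidence."""
    state = sum(r["state"] for r in reports) / len(reports)
    confidence = min(r["confidence"] for r in reports)
    return {"state": state, "confidence": confidence}

def hybrid_fuse(reports, fuse_from_raw):
    fused = fuse_track_reports(reports)        # normal distributed path
    if fused["confidence"] < CONFIDENCE_THRESHOLD:
        fused = fuse_from_raw()                # reach back to the sensors
    return fused

# Two confident track reports: no reach-back is triggered here.
result = hybrid_fuse(
    [{"state": 5.0, "confidence": 0.95}, {"state": 5.2, "confidence": 0.92}],
    fuse_from_raw=lambda: {"state": 5.1, "confidence": 0.99})
```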
A summary of these alternative architectures is provided in this table. We summarize briefly below.
Centralized architecture - The centralized architecture fuses raw data obtained directly from the
sensors. In principle this is the most accurate type of fusion because all data are available at the
central fusion function – that is, there is no information loss from the sensors to the fusion process.
However, it also requires the maximum amount of communications, since all data (such as image or
video data) must be transferred from the sensors to the fusion process. The fusion of the raw data
can cause challenges to the association and correlation process because we are trying to
associate/correlate heterogeneous data – e.g., signal data from a radar to image data from an
infrared sensor and reports from a human observation. The techniques for the data fusion involve
physical models, pattern recognition and estimation techniques.
Distributed/autonomous architecture – In the distributed/autonomous architecture, feature vectors
or state vectors are passed from the sensors to the fusion process. The fusion is conducted at a
decision-level. Identity estimation techniques include Bayesian inference, the Dempster-Shafer method, logical templates and voting methods. There may be a loss of information from the
sensors to the fusion process because we are not sending the raw sensor data to the fusion process
but rather only a representation of the data via the state vectors. However, this does minimize the
communication requirements. The association and correlation process may be simplified because
we are comparing information about state vectors, rather than raw sensor data. This simplification
may induce issues with hidden statistical interdependencies between the data sources.
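As an illustration of decision-level identity fusion, the sketch below applies Bayes’ rule to combine sensor identity declarations, assuming (as the caveat above warns) that the declarations are conditionally independent given the true class; the classes and likelihood values are invented for the example.

```python
# Sketch of decision-level identity fusion via Bayes' rule. The conditional
# independence assumption is exactly where hidden interdependencies between
# sources can bite. All numbers are illustrative.
def fuse_identity(prior, likelihoods):
    """prior: {class: prob}; likelihoods: list of {class: P(report | class)}."""
    posterior = dict(prior)
    for lk in likelihoods:
        posterior = {c: posterior[c] * lk[c] for c in posterior}
        total = sum(posterior.values())
        posterior = {c: p / total for c, p in posterior.items()}
    return posterior

# Two sensors both lean toward "truck"; the fused posterior sharpens.
post = fuse_identity({"truck": 0.5, "car": 0.5},
                     [{"truck": 0.8, "car": 0.3},
                      {"truck": 0.7, "car": 0.4}])
```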
Hybrid architecture – The hybrid architecture seeks to obtain the best of both the centralized
architecture and the distributed architecture. The hybrid architecture would ordinarily operate in a
distributed or autonomous manner until more detailed information is required from the sensors.
This may happen for example in cases in which the raw data are required for improved entity
identification or characterization.
We turn now to issues related to test and evaluation of fusion systems. There are several issues shown
here.
Numeric/symbolic components – A challenge with testing data fusion systems is that we are dealing
with both numerical data as well as symbolic information. For numerical systems we can utilize a
number of well known evaluation methods for determining how the fusion system performs
including accuracy, probability of correct identification, false alarm rates, timeliness, computational
efficiency, etc. When we incorporate symbolic components such as the identification or naming of
an entity by a human observer or pattern recognition technique, the evaluation becomes more
complex.
Acid tests – with and without fusion – In order to strictly evaluate a fusion system, we should
compare the accuracy of estimation of target or entity locations and identities both WITH and WITHOUT data fusion. That is, what happens to our assessment of a situation when we “turn on” the data fusion processing? How does data fusion assist in complex or difficult observing
environments?
Measuring global optimality – How can we establish the global optimality of a fusion system as part
of a larger operational system? For example, if a fusion system is part of a monitoring system for a
mechanical system such as an aircraft, does the value added of the fusion system overcome possible
issues related to added maintenance of the fusion system itself? Do false alarms of the fusion
system actually induce more maintenance actions over what would be done without the fusion
system?
Stochastic details – Fusion processing involves working with noisy observations and often nondeterministic processes. In some cases, we are seeking to identify rare events. For example, for a
system monitoring the health of a mechanical system, we are seeking to identify potential failure
conditions that rarely if ever occur. How can we test the accuracy of our fusion system in observing
and identifying such events? This imposes challenges in test and evaluation.
Lack of Gold Standard – A final issue involves what to compare our fusion system with. What
constitutes the ideal or “gold standard” of performance? Do we want the fusion system to
perform as well as a human operator or analyst? How can we “grade” the performance of our
fusion system?
Jim Llinas in his original text, Multisensor Data Fusion, co-authored with Ed Waltz in 1990, discussed the
issue of measures of performance or measures of merit for fusion systems. He identified a hierarchy of
measures including: i) dimensional parameters, ii) measures of performance, iii) measures of
effectiveness, and iv) measures of force effectiveness (for military systems). These are summarized in
this chart and described briefly here.
Dimensional parameters – refer to the properties or characteristics inherent in the physical entities
whose values determine system behavior. These include the characteristics of the physical sensors,
communications systems and computing resources. Examples of dimensional parameters include
signal-to-noise ratio, number of operations per second, aperture dimensions, resolution, sample rates
and others.
Measures of performance (MOP) – measures, built from the dimensional parameters, that characterize the behavior of the fusion system. Examples include detection probability for entities or targets, false alarm rates, location estimation accuracy, identification range, etc.
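As a hedged sketch, two such MOPs can be computed from counts gathered during a test run; the counts and time window below are invented for illustration.

```python
# Sketch of two common MOPs computed from test-run counts; the counts and
# observation window are hypothetical, not from any real evaluation.
def detection_probability(detected_targets, total_targets):
    """Fraction of ground-truth targets the system detected."""
    return detected_targets / total_targets

def false_alarm_rate(false_alarms, observation_time_s):
    """False alarms per hour of observation."""
    return false_alarms * 3600.0 / observation_time_s

pd = detection_probability(47, 50)   # 47 of 50 targets detected -> 0.94
far = false_alarm_rate(3, 7200.0)    # 3 false alarms in 2 hours -> 1.5/hour
```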
Measures of effectiveness (MOE) – measure how the fusion system performs its function within an
operational environment. For a command and control system, this would include measures such as
target nomination rate, timeliness of information, information accuracy, warning times, etc. There are
clear analogies for other applications such as medical systems, environmental monitoring systems, etc.
Measures of force effectiveness (MOFE) – At the highest level of evaluation, we are concerned with how
our overall mission is affected by the performance of the fusion system. For a military command and
control system, this would include understanding how the fusion system affects the outcome of a battle,
survivability, attrition rates, etc. Again, there are analogs of these measures for other applications.
This chart shows an example of a test and evaluation framework for a fusion system. This is from a
system developed by the MITRE Corporation. The system utilizes a scenario generation function to
allow a test engineer to create simulated target files, platform files (for the platforms or entities which
contain sensors), administration data and so forth. The test and evaluation system generates synthetic
data for input to a data fusion system or fusion algorithm under test. In addition, one could use live
recorded data from field exercises as input to the fusion algorithm to be evaluated. The results of the
fusion algorithm are compared against known “ground truth” to provide output measures of
effectiveness. Finally, post processing tools can be applied to establish statistical comparisons among
different fusion algorithms.
It should be noted that the generation of such a test and evaluation framework may involve as much work as, or more than, the creation of the fusion system itself. Even the collection of useful test data on live
test ranges can be challenging. However, it is important to develop or access such data sets to fairly
evaluate selected algorithms and sensors.
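The ground-truth comparison step can be sketched as a simple RMSE score of fused estimates against known truth, including the with/without-fusion comparison discussed earlier; the data below are illustrative placeholders, not from the MITRE system.

```python
# Sketch of scoring a fusion algorithm against ground truth with RMSE.
# All positions are invented one-dimensional placeholders.
import math

def rmse(estimates, truth):
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth))
                     / len(truth))

truth    = [0.0, 10.0, 20.0, 30.0]   # known target positions
fused    = [0.5,  9.8, 20.3, 29.6]   # fusion-algorithm estimates
baseline = [1.5,  8.0, 22.0, 28.0]   # single best sensor, no fusion

improvement = rmse(baseline, truth) - rmse(fused, truth)  # > 0: fusion helps
```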
If we want to predict the computational performance of a fusion system, we can start at the data end to
determine how much processing is required to ingest the data. For example, we may hypothesize the
number of targets to be observed, the number of sensors, and the data collection rate by each sensor
for each target. For each observation processed, we can estimate the computational requirements for
functions such as data association and tracking, sensor management, and other computations. These
can be combined using queueing theory to determine what the computational and response time
implications are.
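As a hedged back-of-envelope example of this kind of sizing, an M/M/1 queue gives the mean response time for a hypothesized observation arrival rate and processing capacity; every rate below is an assumption, not a measurement.

```python
# Back-of-envelope sizing with an M/M/1 queue: observations arrive at some
# aggregate rate and the fusion processor services them at another.
def mm1_response_time(arrival_rate, service_rate):
    """Mean seconds an observation spends queued plus in processing."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate exceeds service rate")
    return 1.0 / (service_rate - arrival_rate)

# Hypothesis: 20 targets x 4 sensors x 0.5 observations/s = 40 obs/s,
# against a processor that can handle 50 obs/s.
arrival = 20 * 4 * 0.5
t = mm1_response_time(arrival, service_rate=50.0)   # 0.1 s mean response
```

Even this crude model exposes the nonlinearity that matters in sizing: at 45 obs/s the same processor's mean response time doubles to 0.2 s.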
Some of the key parameters for performance estimation are shown here. They include:
Application parameters such as the number of sensors, the sensor spatial resolution, target density,
track and identification response times, measurement bandwidth and others,
Algorithm factors such as the computational resources for tracking and correlation, recursiveness,
degree of parallelism for algorithms, etc.
Processing parameters include the size of the track storage database, decision rates, decision rule
storage, tracking recursion, and symbolic processing.
Computational parameters involve issues such as the required operations per second, input/output
transfer rates, database size and capacity, and other issues.
These parameters are shown here as examples of the types of factors that may be considered in
estimating the performance and processing needs for a fusion system.
One technique that can be used for analysis is transaction analysis. In this approach, we look at each
way that a fusion system can be activated – that is, how any processing is initiated in a fusion system.
Typically, this would involve factors such as, the receipt of a new observation from a sensor or source,
and the interaction with human users (such as a data base query, hypothesis generation, etc.). The
transaction analysis approach considers each of these input actions and follows the transaction or
thread of processing from the initiation through some final action such as the update of a database. For
each of these transactions, we estimate the required computing and communications requirements.
Then using queueing theory one can develop a profile of transaction rates and subsequent demands on
computing and communications for a fusion system.
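A minimal sketch of such a transaction-analysis load estimate follows; the transaction types, rates and per-transaction CPU costs are all hypothetical.

```python
# Sketch of transaction analysis: each way the system can be activated gets
# an estimated rate and CPU cost; their products sum to the offered load.
transactions = {
    # name: (transactions per second, CPU-seconds per transaction)
    "new_sensor_observation": (40.0, 0.010),
    "operator_db_query":      (0.5,  0.200),
    "hypothesis_update":      (2.0,  0.050),
}

def processor_utilization(txns):
    """Fraction of one CPU consumed by the given transaction profile."""
    return sum(rate * cost for rate, cost in txns.values())

util = processor_utilization(transactions)   # 0.6 -> 60% of one processor
```

The resulting utilization is what feeds the queueing analysis above: a profile approaching 1.0 signals unbounded response times.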
A major component of any data fusion system is the database system. While there are many commercial DBMS systems available (e.g., ORACLE), these are generally unable to provide all of the services required
for a fusion system. This is because the data in a fusion system often includes textual information,
scalar and vector data, image data, and both numerical and symbolic information. In this and the next
chart we provide a partial list of the types of data in a fusion system. The categories include:
Model parameters provide information such as the characteristics of the sensors, sensor locations, physical constants, models to allow prediction of target or entity dynamics, and prediction of
observations, given an estimate of an entity state vector.
Sensor data is the collection of observed data from the sensors or sources.
External databases may include information from external sources, other fusion systems or nodes, and
general information from the web.
Human inputs include control information, requests for data or processes to be performed, annotations
to the evolving fusion inferences, and collaboration information.
Environmental data includes geography, topography, hydrology (for underwater systems), weather
information, environmental information, information about man-made objects or entities, and other
information about the surrounding environment.
This continues the list of data categories. They include:
The situation database involves the evolving characterization of the situation under consideration.
Examples of information include the location, identity and characteristics of observed entities, the
relationships among entities such as communications, hierarchy, sequential information, etc. This also
includes information about the observed entities and their relationship with the surrounding
environment.
The threat/consequence database contains the predicted consequences of current situation
assessments. This might include the location, identity and characteristics of impending or threatening
conditions, entities or anticipated events, probable courses of action, and the interpretation of possible
consequences of actions.
Performance data may be stored related to the ongoing performance of the data fusion system.
A priori data may involve knowledge bases such as rules, scripts, frames, about entity behavior and
interaction, technical data about the sensors, environment and entities, and finally “mission” data that
guide the application and use of the data fusion system.
The concept of using a service oriented architecture (SOA) approach for distributed sensing and fusion
processes has been introduced for many applications. This chart shows the concept of enterprise
services for the Global Information Grid (GIG), adapted from a presentation by Rob Walker of the
Defense Information Systems Agency (DISA) in 2004. The concept illustrates a web enabled
infrastructure including data services, messaging services, transformation services and a service registry to connect consumers of data and information fusion products to service providers.
In general, a Service-Oriented Architecture is a component model that inter-relates the different
functional units of an application, called services, through well-defined interfaces and contracts between
these services. The interface is defined in a neutral manner that is intended to be independent of the
hardware platform, the operating system, and the programming language the service is implemented in.
This allows services, built on a variety of such systems, to interact with each other in a uniform and
universal manner.
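A toy sketch of this idea in code: services are reached through a registry and called only through a stable, documented contract, independent of how each service is implemented. The service name, contract and placeholder fusion logic are invented for this example.

```python
# Toy sketch of a service-oriented arrangement: consumers discover providers
# in a registry and interact through a neutral request/response contract.
class TrackFusionService:
    """One service, exposed behind a stable contract rather than its internals."""

    def fuse_tracks(self, request):
        """Contract: request = {"tracks": [floats]} -> {"fused_track": float}."""
        tracks = request["tracks"]
        return {"fused_track": sum(tracks) / len(tracks)}  # placeholder logic

# The registry decouples consumers from where and how a service runs.
registry = {"track-fusion": TrackFusionService()}
response = registry["track-fusion"].fuse_tracks({"tracks": [10.0, 12.0, 11.0]})
```

In a real SOA, the registry and contract would be network-visible artifacts (e.g., a service registry with machine-readable interface descriptions) rather than an in-process dictionary.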
From the same briefing by Rob Walker, this chart shows a hierarchy of anticipated services for a military
situation awareness application. These range from high-level services such as tools for planning and
conducting operations (e.g., force projection services) to sense-making and battle-space understanding
services, to middle level services such as organizing and managing workflow and command and control
functions, to data level and security services. While these are defined for a military situation
awareness application, analogs exist for other applications such as crisis management or environmental
monitoring.
There are a number of anticipated advantages to utilizing a service oriented architecture approach.
These include:
Leveraging existing assets such as existing sensing, communications and data fusion systems
Support for all types or styles of integration, including user interaction, connectivity among multiple
applications, integration of processes and information integration
SOA allows for incremental integrations and moving or migrating of computing and sensing assets
SOA includes a development environment that promotes better reuse of modules and systems, allows
legacy assets to be migrated into the framework and supports the timely implementation of new and
emerging technologies.
Finally, SOA allows implementation of new models including grid computing, on-demand computing and
new portal-based client models.
While there are anticipated advantages to service oriented architectural approaches, there are potential
operational and implementation issues. Operational issues include: i) how will real users utilize the dynamic link capabilities to discover new sources and functions that support their processing, ii) what is the role of human-in-the-loop processing in which humans dynamically change or augment automated processing, iii) how to support hierarchical, heterogeneous users with different needs and priorities, iv) how
to handle evolution and changes to components and make those transparent to all system users, v)
adjustment and adjudication of resource demands, and finally, vi) the impact on legacy systems.
For implementation, issues include: i) when and why to develop new application-level components and services, ii) the amount of imposed structure (e.g., a “free market” type of approach versus a federated approach), iii) who gets to demand changes in services and components, iv) how to specify and provide for an orderly evolution of core services, v) the development of standards, ontologies and templates for data, processes and methods, vi) security across all of the services and components, and finally vii) inclusion – who gets to use what services and information sources.
It is beyond the scope of this course to discuss security issues for fusion system development.
However, there are formal security engineering processes which provide guidelines for understanding
and specifying a security concept of operations (CONOPS), system security guidance policies, software
security models, and ultimately how to develop and assure secure operations. Even if one is working
in a non-military application, such as a medical application, there are numerous issues of privacy and
security to be concerned about.
There are two key questions (and sub-questions) to be answered before you proceed to design a data
fusion system. These seem simple, namely, i) do you understand the application and ii) do you
understand the inference process? The typical immediate response to both questions is, “of course”.
However, when developers who know one application, such as military situation awareness, try to address another application such as medical informatics or environmental monitoring, they can make serious errors in the design process because of unstated assumptions. So, when addressing a specific application, a designer or design team should explicitly address the following
questions.
Do you understand the application? Specifically, can you address the following sub-questions; What
decisions/inferences are required?, What are the timeframe/rates for decisions?, Who are the system
users?, How do decisions affect the mission?, What is the typical situation/threat environment?, What is
the appropriate MOE/MOP?, What are the platform/system constraints?, What are the other data
sources?, Is real data available?, Do we have (former) users/analysts available?, and How does the mission environment affect system architectures?
Do you understand the inference process? Specifically, can you address the following sub-questions?
What is the processing chain from energy detection to inferences?, Do you understand the cognitive process
for effective inferences?, Can you characterize effective versus ineffective inferences?, What are the
barriers to correct inferences (human limitations, countermeasures, etc.)?, What is the use of negative
information?, What is the decision environment (stress, decision styles, doctrine, etc.)?, and What are
the effective/applicable processing techniques?
The process of designing and implementing a data fusion system may seem overwhelming and painful –
indeed numerous data fusion systems don’t actually fuse anything (they develop infrastructure software
and run out of time/funding before implementing the fusion software). On the positive side there are
significant resources available to support a successful implementation including COTS software.