VERIFICATION, VALIDATION, AND ACCREDITATION (VV&A) OF COMPLEX
SOCIETAL MODELS & SIMULATIONS (M&S)
Robert Clemence, Jimmie McEver, David Signori, EBR
Dean Hartley, Hartley Consultants
Al Sciarretta, CNS Technologies
Stuart Starr, Barcroft Research Institute
ABSTRACT
The goal of this research program was to explore innovative approaches to performing
verification, validation, and accreditation (VV&A) of composable political, military, economic,
social, information, and infrastructure (PMESII) models. Consistent with that goal, the program
focused on two objectives. First, it developed a “baseline” VV&A process that is well-suited to
the development and evolution of DARPA’s Conflict Modeling, Planning and Outcomes
Experimentation (COMPOEX) program. Second, it developed highly accelerated VV&A
processes for two conditions: compressed and hyper-compressed VV&A.
To establish the context for these objectives, this paper briefly characterizes the nature of the
problem. That is followed by a brief description of the “baseline” VV&A approach that the study
team has developed. The bulk of this paper focuses on innovative techniques for performing
compressed and hyper-compressed VV&A. It describes the activities that must be performed
during “baseline” VV&A to prepare the VV&A Team for compressed and hyper-compressed
approaches. That leads to the formulation of several V&V techniques that are well-suited to
these circumstances: experimental design, logic traceback, and model comparison. These
methods are described and residual challenges are identified. The paper concludes by identifying
residual issues that should be addressed in future research efforts.
A. NATURE OF THE PROBLEM
The VV&A of tools that model PMESII behaviors is very challenging under the best of
circumstances. The study team began the research activity by performing an in-depth review of
the VV&A literature. That review confirmed that the history of VV&A of PMESII models is
limited and the level of V&V has generally been inadequate. Furthermore, that review revealed
that a PMESII model requires integrating multiple domains that often are based on competing
theories of varying maturity.
The VV&A of DARPA’s COMPOEX initiative poses additional challenges. COMPOEX is
designed to be augmented and reconfigured rapidly to address new problems over a spectrum of
changing environments. This requires methods and tools that permit rapid VV&A. A brief
description of COMPOEX is provided in Appendix C.
Conventional VV&A techniques, while useful, are not adequate for rapid model reconfiguration.
The study team’s research revealed that “baseline” VV&A must be performed periodically,
during and right after development. These techniques are based mainly on the use of subject
matter experts (SMEs) who require time to review models. Note, however, that these SMEs are
not always available where and when needed.
The study team was challenged to develop highly accelerated VV&A processes for two
conditions: compressed and hyper-compressed VV&A. In the compressed scenario, the events
triggering the need for the changes allow several weeks for the model changes. In the hyper-compressed scenario, the time available is on the order of several hours. Note, however, that the
available time will be employed to perform research, implement code changes, document the
changes, and perform VV&A. Thus the time available for VV&A is measured in days for the
compressed situation and in tens of minutes for the hyper-compressed situation.
Thus, there is a need for a means of rapidly establishing credibility with the users of such models
shortly after changing components or configuration. Developing V&V processes that are well suited to that problem will help ensure that COMPOEX technology is transitioned into a truly useful capability. In addition, it will minimize the possibility that users will unwittingly do harm.
These factors require novel thinking and approaches to VV&A.
B. BASELINE VV&A
As a context for addressing the compressed and hyper-compressed V&V approaches, the study
team developed a “baseline” VV&A process and assessment methodology for a PMESII model.
This “baseline” VV&A process is based on extensive literature review and prior experiences
with PMESII modeling (e.g., DMSO’s Flexible Asymmetric Simulation Technologies [FAST]
Tool Box). Furthermore, the “baseline” VV&A process was subjected to extensive peer review,
drawing on world-class VV&A experts.
In developing the “baseline” VV&A process, the study team built on the latest thinking in
VV&A approaches. This included a process embedded throughout the
life cycle of the model, the adaptation of a VV&A maturity model that was developed by the
Defense Modeling & Simulation Office (DMSO), and the employment of a risk assessment and
mitigation process to enhance communications with the user of the model.
Furthermore, the study team developed a spreadsheet model that provided a systematic way of
organizing V&V tests and capturing their results. These results capture the degree of success
achieved in each test. However, with hundreds of tests, covering the many areas of the system,
simple combinations of the results tend to obscure, rather than elucidate, problems. To enhance the transparency of those tests, the study team linked a set of “spider” diagrams to the spreadsheet (see Figure 1). The “spider” diagrams support visualization of multiple dimensions in a
single chart and support an overview and segmentation by each individual model. Those
diagrams provide the V&V Team with deep insight into the model’s strengths and weaknesses
that emerge from the “baseline” VV&A process.
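To make the linkage between the test-result spreadsheet and the “spider” diagrams concrete, the sketch below shows how aggregated 0-5 validation scores might be rendered as a radar chart. The dimension names and scores are hypothetical, and the NumPy/Matplotlib recipe is an illustration rather than the tool actually used by the study team.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical aggregated V&V scores (0 = untested/failed, 5 = fully validated)
# for one PMESII model; in practice these would be rolled up from the
# spreadsheet of individual test results.
scores = {
    "Political": 3.5, "Military": 4.0, "Economic": 2.5, "Social": 3.0,
    "Information": 2.0, "Infrastructure": 3.5, "DIME": 4.0, "User Issues": 3.0,
}

labels = list(scores)
values = list(scores.values())
angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False).tolist()
values += values[:1]          # close the polygon by repeating the first point
angles += angles[:1]

ax = plt.subplot(polar=True)
ax.plot(angles, values, marker="o")
ax.fill(angles, values, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(labels)
ax.set_ylim(0, 5)             # scores are on a 0-5 scale, as in Figure 1
ax.set_title("Model Validation Metrics (illustrative)")
plt.show()
```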
[Figure 1 presents three illustrative “spider” diagrams on a 0-5 scale: system validation metrics and model validation metrics plotted across the Political, Military, Economic, Social, Information, Infrastructure, DIME, and User Issues dimensions, and an inter-model connections chart plotted across the pairwise model connections.]
Figure 1. Illustrative “Spider” Diagrams
C. COMPRESSED AND HYPER-COMPRESSED VV&A
C.1 Preparation
To set the stage for the discussion of compressed/hyper-compressed VV&A, it is useful to
introduce a metaphor: the “two-minute drill” in football. All football teams anticipate a situation
where they must score when they have only two minutes in which to operate (typically at the end
of a game). In order to prepare for this contingency, they perform a complex set of actions long
before the game begins. First, they know the conditions that they must contend with (e.g., their
adversary; his strengths and weaknesses). Consistent with that knowledge, they formulate a
constrained “play book” that reflects the adversary’s vulnerabilities. Second, they decide on the
team that will be used to implement the “play book”. Typically, that team will deviate from
normal practice by not forming a huddle during that period of time (wherein decisions are made
and plays are selected) to take maximum advantage of the limited time. Furthermore, that team
practices and refines the “two-minute drill” intensely so that they will be effective and efficient
when they have to execute it.
Although the “two-minute drill” is a useful conceptual model for planning compressed or hyper-compressed V&V, it is still an imperfect analogue. Note that an operational V&V task will occur
with limited warning and there will be significant ambiguity about the appropriate size and
composition of the V&V Team.
In order to perform compressed and/or hyper-compressed VV&A, it is vital that critical steps be
taken while performing “baseline” VV&A. Four steps are of particular importance. First,
because of the complexity of the task, it is necessary to anticipate likely Areas of
Responsibility (AoRs) and take anticipatory steps. These steps would include, inter alia,
acquiring minimum essential information, designating the members of the V&V Team, and
recruiting and socializing the appropriate SMEs. Second, it is necessary to plan and implement a
VV&A process that is consistent with anticipated resource constraints. This requires
consideration of the needed people, processes, and tools. Third, it is vital to exercise and evaluate
compressed and/or hyper-compressed VV&A processes periodically. Finally, it is important to
develop and implement best practices based on results from exercises and experiences with other
programs. The “bottom line” is that preparation is vital if compressed and hyper-compressed
VV&A are to be performed efficiently and effectively.
To deal with the “curses of complexity and time compression”, several automated tools will be
needed to support compressed and hyper-compressed V&V. First, as noted above, techniques are
needed to identify a small set of impending AoRs of interest. Second, ontologies are needed to
support the automated collection and conversion of needed data. Third, automated expert
elicitation techniques would be valuable to generate a “SME-in-a-box” and to create needed models “just in time”. Finally, techniques are needed to generate meta-model(s) of the multi-resolution model to support compressed and hyper-compressed VV&A. This latter issue is
discussed below.
C.2 Approaches to Compressed and Hyper-Compressed VV&A
The authors of this paper developed several possible approaches to the problem of performing
useful V&V within the compressed and hyper-compressed scenarios. The team grouped these
approaches into three categories: exploratory design, logic tracing, and comparison (Figure 2).
The initial approaches were refined, reduced, and combined.
[Figure 2 groups the candidate approaches, all building on the baseline VV&A: exploratory design approaches (preset data and preset tests; exploratory space analyses), logic tracing approaches (risk analyses; “flight recorder”), and automatic comparison approaches (comparison with a generic model such as the ISSM, with a domain-specific model such as Senturion, or with SMEs).]
Figure 2. Approaches to Compressed & Hyper-Compressed VV&A
Compressed and hyper-compressed situations occur during use of the model. Time and other
resource compression are important factors. Triggers include introduction of a new tool,
modifications to an old model, and a changed environment. A changed environment can be a
new scenario with new data categories (for example, the real world has changed, as Iraq changed
after the bombing of the Golden Mosque in Samarra). A changed environment can also be a different
way of using the model.
The three test systems described below are all potentially valuable elements of developmental
and periodic testing. However, they may be the only elements that can be used during the
compressed or hyper-compressed time lines of triggered testing. The model comparison and
embedded results-traceback systems are likely to be the only extra test-oriented elements that
will be used during ordinary model use (as opposed to during testing). The model comparison
system will support calibrating the PMESII model to maximize confidence in using it to simulate
the future and the results-traceback system will be available to examine any unusual results.
The experimental design system must be run during the uncompressed time lines of
developmental and/or periodic testing to create the pre-set tests and results that are needed for
later testing. The model comparison and embedded results-traceback systems do not have this
absolute requirement for running during developmental and periodic testing; however, without
this use, they are less likely to be available, ready, and supported by trained users during the
critical times. Just as with the two-minute drill for the final two minutes of a football game,
compressed and hyper-compressed validation techniques must be developed and practiced before
those final critical moments.
For the selected approaches, we will characterize the system design and system operation.
Furthermore, we will provide additional comments on the approaches.
C.2.1 Experimental Design System
At any stage of testing, there are always more tests that can be thought of than there is time to
accomplish. Of particular concern is the concept of region(s) of validity. Because models, by
definition, are abstractions, they cannot give exact answers to all possible questions. In general
(and especially in the case of PMESII models), they cannot give exact answers to any question,
but may give answers that are close enough to be useful. Sometimes the region in which the
answers are “close enough” encompasses the entire question space; however, in many instances
the region is a subset of the entire space and it is important to understand the limits of the region
of validity (and whether it is a connected region or is composed of non-intersecting sub-regions).
System Design. Experimental design is a technique for producing a (relatively) small set of
experiments (sets of data inputs) for exploring the space and discovering the shape of the region
of validity. Even though the set of experiments is relatively small, for PMESII models it will still
involve a large number of tests. However, the results of the tests may yield a still smaller set of
preset tests that can be used in later tests to determine whether changes in the model or
environment (scenario or model use) have caused changes in the region of validity. During
system design, the team should define a process for creating the experimental design, analyzing
and storing its results, and later using these results.
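As an illustration of what such an experimental design might look like in practice, the sketch below uses a Latin hypercube design from SciPy to generate a space-filling set of candidate input vectors. The factor names and ranges are purely hypothetical, and the study did not prescribe this particular design technique.

```python
from scipy.stats import qmc

# Hypothetical input factors for a PMESII model, each with a plausible range.
factors = {
    "unemployment_rate":    (0.05, 0.60),
    "security_force_size":  (1_000, 50_000),
    "media_freedom_index":  (0.0, 1.0),
    "reconstruction_funds": (0.0, 5.0e8),
}

# A space-filling Latin hypercube covers the input space broadly with far
# fewer runs than a full factorial design.
sampler = qmc.LatinHypercube(d=len(factors), seed=42)
unit_sample = sampler.random(n=200)            # 200 candidate experiments
lows, highs = zip(*factors.values())
design = qmc.scale(unit_sample, lows, highs)

experiments = [dict(zip(factors, row)) for row in design]
print(experiments[0])                          # one candidate input vector
```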
System Operation. Figure 3 illustrates the process. In this figure, “Now” represents the time at
which the question of changes arises. If these changes occur during development, there may be
time to run the experimental design to determine possible changes in the region of validity.
However, at other times the VV&A Team must rely on having run the experimental design in the
past and having available the preset inputs and expected outputs to test for changes in validity.
Therefore, the system operation has two components, set-up and use.
The set-up consists of exploratory analyses and risk analyses that are performed during the
model’s stable period. The time that is most likely to be conducive to the set-up is the period
following a developmental cycle. Both the exploratory and risk analyses would provide excellent
support to the VV&A process that should occur at that time. The purpose of the exploratory
analyses is to characterize the total response surface of the model, permitting definition of the
“interesting” parts of the surface. The risk analyses define the areas that have the most variability
and appear most susceptible to validity problems. Together, these analyses support the definition
of a filter that will support selection of a reduced set of tests.
Figure 3. Experimental Design
Unfortunately, the dimensionality of PMESII problems is too great for simple characterization.
During the set-up, an experimental design must be created to support disciplined analyses.
Depending on the nature of the model and the resources available, the experimental design may
range from millions of tests down to tens of tests. It is likely that all but the smallest designs will require data farming on
high performance computers. The immediate purpose of these tests is to characterize the
response surface – which includes evaluation of the validity of the model at each part of the
response surface. In cases where data farming is used, this will almost certainly mean that the
data farming will merely identify areas that look questionable. Because of constrained resources,
only a select few areas can be further tested with conventional V&V techniques.
After the experimental design tests have been run through the model and the results have been
analyzed, the team must create a reduced set of tests and expected outputs for later use. The filter
that is developed should consider the “interesting” parts of the response surface (perhaps where
outliers are likely or where the surface bends more sharply than elsewhere) and the riskiest parts
of the surface. While the preset tests should concentrate on areas where validity seems most
likely to be challenged by any changes to the model, they should also include a few
uncontroversial areas to anchor the results.
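A minimal sketch of such a filter is shown below; the interest and risk scores are assumed to come from the exploratory and risk analyses described above, and the cutoff counts are arbitrary illustrations.

```python
def select_preset_tests(candidates, interest_scores, risk_scores,
                        n_focus=40, n_anchor=5):
    """Filter the full experimental design down to a reduced preset test set.

    candidates:      test identifiers from the full experimental design
    interest_scores: how "interesting" the response surface is near each test
                     (outliers, sharp bends), from the exploratory analyses
    risk_scores:     variability / susceptibility to validity problems,
                     from the risk analyses
    Returns the riskiest and most interesting tests plus a few stable anchors.
    """
    ranked = sorted(
        candidates,
        key=lambda t: interest_scores.get(t, 0.0) + risk_scores.get(t, 0.0),
        reverse=True,
    )
    focus = ranked[:n_focus]        # most likely to reveal changes in validity
    anchors = ranked[-n_anchor:]    # uncontroversial areas to anchor the results
    return focus + anchors
```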
The plan is to run the preset tests following any change to the model and compare the outputs of
the modified model with the preset outputs. Naturally, changes to the model should lead to
changes in results; however, the output changes should be along expected lines. Excessive
differences or differences in unpredicted areas must be investigated.
Some of the investigation may be through SME evaluation, which should include rating on a risk
scale, with the riskiest being subject to the most rigorous review.
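The comparison of preset outputs against the modified model’s outputs lends itself to automation. The sketch below (hypothetical test identifiers, measures, and tolerances) flags the tests whose differences exceed tolerance, or whose measures are missing, so the SMEs can review the riskiest cases first.

```python
def flag_regressions(preset_outputs, new_outputs, tolerances):
    """Compare modified-model outputs against stored preset outputs.

    preset_outputs / new_outputs: {test_id: {measure: value}}
    tolerances: {measure: maximum acceptable absolute difference}
    Returns the tests (and measures) whose differences exceed tolerance or
    whose measures are missing, for SME review.
    """
    suspects = {}
    for test_id, expected in preset_outputs.items():
        actual = new_outputs.get(test_id, {})
        bad = {
            m: (v, actual.get(m))
            for m, v in expected.items()
            if m not in actual or abs(actual[m] - v) > tolerances.get(m, 0.0)
        }
        if bad:
            suspects[test_id] = bad
    return suspects
```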
Another approach follows the characterization of the response surface. This approach consists of
a comparison of the characteristics of the new response surface to the old one. Areas of change
would be subjected to the most rigorous review. Naturally, this approach is only feasible in
situations where the response surface was previously well-characterized.
The details of the use of the preset tests will depend on whether the situation is compressed or
hyper-compressed. In compressed situations, the preset tests will be further filtered to stress the
riskiest tools, interactions, and data. In hyper-compressed situations, the filter will be for the
most important inputs for the riskiest outputs and for the changed structures.
Although outliers produced by the preset tests in the modified PMESII model deserve extra
attention, the fact that they are outliers does not prove that they are invalid or that the changed
model is invalid.
The changed model may require new data for variables that were not present in the preset data, or it may not use all of the preset data because old variables have been dropped.
These changes present problems both in execution and in interpretation.
The SME assessments will require time because the humans involved must work at human
speeds. Methods to automate portions of this assessment can be major multipliers in increasing
the speed and value of the validation process.
Comments. The exploratory design system will have to be tailored to the PMESII model or
system. Naturally, the time it takes for a single model run, and for the accompanying V&V analysis, will impact the overall system design.
The types of inputs will also impact the system design. Standard inputs (e.g., situational state
variables and planned actions) are relatively simple to design variations for. However, there are
also possible “model definition” inputs (e.g., characteristics that differentiate one city from
another or one leader from another) and inputs that define how one module interacts with
another. These model definition inputs must also be considered in the experimental design. Both
types of input have uncertain values and need to be varied and validated in testing.
C.2.2 Embedded Results-Traceback System
During the use of the PMESII model, and especially during testing, there will be occasions when
the outputs seem wrong. They may in fact be wrong or they may be shedding insight that can be
used to correct misunderstandings of the real world. One useful technique for resolving the
ambiguity is that of tracing the results back to their origins; however, doing this manually in a
complex system can be difficult, if not impossible.
System Design. If the PMESII model permits the installation of intermediate data capture
instruments, an automated system can be created to help in determining the validity of suspicious
results. Figure 4 illustrates such a system. If such a system is created, it will support the
validation of results at the time they are discovered (“Now” in the figure). An embedded results-traceback system can be important in developmental and periodic testing and may be critical in
triggered testing and model use evaluation.
Figure 4. Embedded Results-Traceback System
The embedded data capture instruments illustrated in the figure are designed to capture the
transformation of data from inputs to outputs. Ideally, the values for each variable would be
captured as it changes along with the time of change. However, the sheer volume of data would
make any practical use of such a data capture regime difficult, if not impossible. A more
practical regime involves the capture of selected variables that are likely to be important in
understanding final output results and only capturing the data at periodic (simulated) intervals to
further reduce the volume of data.
COMPOEX writes its significant variables (basically those that are passed from one internal
model to another) to a “backplane,” or vector of variable values. It also uses a time-stepped
architecture, yielding a natural period for data collection. Instrumenting other models may be
more difficult, requiring the development of a “flight recorder” and the creation and insertion of code
blocks in each module to capture and send the variable values to the flight recorder.
The second requirement for an embedded traceback system is the need for a database to store and
serve up the traceback data as needed.
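A minimal sketch of the first two requirements is shown below, assuming a time-stepped model that exposes its backplane as a dictionary of variable values; the watched-variable scheme and the storage choice (SQLite) are illustrative assumptions only.

```python
import sqlite3

class FlightRecorder:
    """Minimal data-capture instrument plus trace database: records selected
    backplane variables at the end of each (simulated) time step."""

    def __init__(self, db_path, watched_variables):
        self.watched = set(watched_variables)   # variables posted to the backplane
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS trace (step INTEGER, variable TEXT, value REAL)"
        )

    def record(self, step, backplane):
        # backplane: {variable_name: value} as published by the component models
        rows = [(step, name, float(backplane[name]))
                for name in self.watched if name in backplane]
        self.db.executemany("INSERT INTO trace VALUES (?, ?, ?)", rows)
        self.db.commit()

    def trace(self, variable):
        # Time series for one variable, used when tracing a suspicious
        # output back toward its origins.
        return self.db.execute(
            "SELECT step, value FROM trace WHERE variable = ? ORDER BY step",
            (variable,),
        ).fetchall()
```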
The third requirement is for a program to support the analysis of the traceback data. This
program and a simulated display are also illustrated in Figure 4. Automatic causal tracing would
be desirable; however, in PMESII situations, complexity is the rule. Cause and effect
relationships are actually causal networks, with multiple causes of multiple effects. The tracing
combinatorics require significant amounts of time and preclude much depth of tracing.
Fortunately, the pattern recognition abilities of biological beings are available. Human beings
can simply look at current and “past” traces of variable values and identify and trace back
“humps” and “valleys.” This provides a first order estimate of the source and likely validity of
possible anomalies in output results.
System Operation. The embedded results-traceback system naturally requires an upfront
investment; however, no additional resource drain is absolutely required subsequently. This
system should be operational at all times. It will be used at any time that possible anomalies are
spotted; however, it should also be exercised periodically to ensure rapid response during critical
times.
Its use in compressed and hyper-compressed situations to check the validity of modified versions
of the model clearly depends on the time available. In compressed situations, all major results
should be reviewed to investigate their causal networks. In addition, several minor results should
be similarly investigated, as spot checks. In hyper-compressed situations, any major results that
look questionable should be investigated. In either case, if nothing looks out of line, then the user
would assume that the validity level is no worse than it was before the changes.
Comments. Because of the architecture of COMPOEX, the creation of an embedded results-traceback system is relatively simple. In systems that require the building of a “flight recorder” and internal code blocks, this creation is more difficult: it requires extra time and effort and introduces new sources of errors. In either case, organizing the traces in meaningful ways will
be difficult.
C.2.3 Model Comparison System
The model comparison system produces validation evidence by comparing the outputs of the
PMESII model to other reasonable sources. Three sources are considered: real world results,
opinion from area experts, and a much simpler, fast running model that uses different logic.
The first source, real world results, would be sufficient in itself were it not for the belief that real world results from a given starting point are not deterministic; that is, we cannot be confident that no other result could have occurred. It may be that the real world is truly stochastic or that we
simply cannot observe enough of it at sufficient levels of granularity to understand how the
results are determined. Thus divergent results from a model may not be an indication of lack of
validity, but rather an indication of alternate results that could occur.
The opinions of true area experts might well be the best source; however, this source has its own
problems. First, the experts might not be universally available. Second, the determination of the
level of expertise of putative experts is problematic. Third, creating expert opinions on future
results by humans is a slow process. Fourth, for practical reasons, the experts should be required
to make their predictions prior to the actual unfolding of events.
The last source, a simpler model, has its own pros and cons. The validity of that model is almost
certainly inadequately known. Despite this indeterminate validity, a model that has some
credibility and uses entirely different logic from the model in question can provide independent
support for the PMESII model by providing similar results for similar starting situations.
Alternatively, by disagreeing with the PMESII model, the simpler model can suggest a need for
further investigation and provide some diagnostic information concerning areas of maximal
disagreement and maximal agreement.
The model comparison system also provides a calibration mechanism, as explained below.
System Design. Figure 5 uses an icon for the Interim Semi-Static Stability Model (ISSM) to
represent the simpler model. The ISSM is used because it is known to have the needed attributes
for the task; however, there are other possibilities (e.g., Senturion from Sentia Group, Inc.).
Figure 5. Model Comparisons
The system for running the simpler model with the same data and with comparable outputs must
be designed and set up prior to V&V testing or model use. For maximal ease of use, the data for
the simpler model should be automatically created from the data needed to run the PMESII
system in question. As shown in the figure, the simpler model will also need data feeds of any
inputs that are exogenous to the PMESII system that are introduced during the course of
simulated time within the system (such as interventions and externally modified state variables).
The simpler model should also produce outputs that are as closely comparable to the PMESII
system outputs as possible.
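The sketch below illustrates one way such a comparison harness might be organized; the model objects, the input/output mapping functions, and the 20 percent agreement band are all hypothetical placeholders rather than features of COMPOEX or the ISSM.

```python
def compare_models(pmesii_state, interventions, pmesii_model, simple_model,
                   map_inputs, map_outputs, rel_band=0.2):
    """Run the PMESII system and a simpler, independent model on the same
    situation and report where their projections diverge.

    map_inputs:  converts PMESII data into the simpler model's input format
    map_outputs: converts the simpler model's outputs into comparable measures
    All model objects and mapping functions are hypothetical placeholders.
    """
    pmesii_results = pmesii_model.run(pmesii_state, interventions)
    simple_results = map_outputs(
        simple_model.run(map_inputs(pmesii_state), map_inputs(interventions))
    )
    divergences = {
        m: (pmesii_results[m], simple_results[m])
        for m in pmesii_results
        if m in simple_results
        and abs(pmesii_results[m] - simple_results[m])
            > rel_band * max(abs(simple_results[m]), 1e-9)
    }
    return divergences   # areas of maximal disagreement, for SME review
```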
System Operation. In Figure 5, “Now” represents the time when the PMESII system is to be
tested or used. The events shown as being in the past, unlike those shown in the past in Figure 3, are not things that need to have been done by the testing team prior to “now.” The past in Figure 5 represents real world past events that should be simulated prior to attempting to simulate the
future. The concept is to start the simulation at some time in the past, simultaneously keep track
of events with another, simpler model, and compare the results of the PMESII simulation, the
simpler model, SME projected results, and the current situation. Rough agreement should be
considered a successful validation event.
If the PMESII simulation did not achieve comparable results, rerun the simulation with changed
parameters. Continue adjusting the parameters until the simulation is calibrated to achieve
comparable results, only then proceeding with the simulation of future events. If the parameters cannot be adjusted to achieve comparable results, this would be a serious non-validation event.
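The calibration loop described above might be sketched as follows; the candidate parameter sets and the comparability test are assumed to be supplied by SMEs or by a search procedure, and all names are illustrative.

```python
def calibrate(pmesii_model, historical_start, observed_outcomes,
              parameter_sets, comparable):
    """Rerun the historical period with candidate parameter sets until the
    simulated outcomes are judged comparable to what actually happened.

    parameter_sets: candidate parameter dictionaries (from SMEs or a search
    procedure); comparable() encodes the SME judgement of "close enough".
    All names are illustrative.
    """
    for params in parameter_sets:
        simulated = pmesii_model.run(historical_start, parameters=params)
        if comparable(simulated, observed_outcomes):
            return params   # calibrated; proceed to simulate the future
    return None             # a serious non-validation event: report and stop
```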
The determination of the comparability of the results will need to be made by SMEs. Some of the
results will not be easily quantifiable (especially real world status), the significance of the
differences in quantified variables will not be clear, and the possibility of alternative futures will
need to be considered.
Comments. A model comparison system can be important in developmental and periodic testing
and may be critical in triggered testing. It may be the best means of creating confidence in the
model during use through the use of known calibration points.
C.3 Comments on the Novel Validation Approaches
The exploratory design approach may require extremes in computing power in situations where
data farming methods are required. It should be adequate in compressed situations; however, it
may be too slow for use in hyper-compressed situations. Its use during baseline VV&A will
provide useful information, both for the baseline VV&A and for later compressed and (possibly
for) hyper-compressed situations.
The embedded results-traceback approach should be usable in compressed and hyper-compressed situations. Its use during baseline VV&A will provide useful information, both for
the baseline VV&A and for later compressed and hyper-compressed situations. Further, the
embedded nature of this approach means that it can be used at any time during the use of the
PMESII system to investigate any questionable results.
The model comparison approach should be usable in compressed and hyper-compressed
situations. Its use with projected SME results during hyper-compressed situations may not be
possible because of time constraints. Its use during baseline VV&A will provide useful
information, both for the baseline VV&A and for later compressed and hyper-compressed
situations. Further, this approach can be used at all times
during the use of the PMESII system to produce calibrated results. This approach is known to be
feasible using the ISSM as the simpler model.
Using combinations of these approaches would be beneficial in all situations: baseline,
compressed, and hyper-compressed. The approaches to the problem of validating a modified
model are orthogonal and thus likely to yield the most information with the least effort.
D. SUMMARY
The study team has achieved the following major accomplishments during the course of this
project. First, it has developed a peer-reviewed, “baseline” VV&A methodology for a PMESII
multi-resolution model. That methodology draws on state-of-the-art thinking and includes
novel tools to implement it.
Second, the project has served to clarify the linkage between the “baseline” VV&A methodology
and the proposed compressed and hyper-compressed approaches to V&V (i.e., the “two minute
drill”). In particular, it has stressed the importance of taking anticipatory steps during the
“baseline” process, planning a compressed and hyper-compressed V&V process that is consistent
with time and resource constraints, exercising and evaluating the compressed and hyper-compressed approaches systematically, and developing and implementing “best practices” to
apply to the process. Within that context, the study team has developed three approaches that are
suitable for compressed and hyper-compressed VV&A: exploratory design, logic tracing, and
comparison with higher-level models. However, there are key residual challenges that
should be addressed in follow-on research.
APPENDIX A: REFERENCES
A.1 PRESCRIPTIONS AND THEORETICAL DESCRIPTIONS OF VV&A (SORTED BY DATE)
1. Naylor, Thomas H. and J. M. Finger. "Verification of Computer Simulation Models."
Management Science 14, 2 (Oct 1967): B92-B106.
2. McKenney, James L. "Critique of: 'Verification of Computer Simulation Models'."
Management Science 14, 2 (Oct 1967): B102-B103.
3. Schrank, William E. and Charles C. Holt. "Critique of 'Verification of Computer
Simulation Models'." Management Science 14, 2 (Oct 1967): B104-B106.
4. Kleijnen, Jack P. C. Statistical Techniques in Simulation Part I. New York: Marcel
Dekker, Inc., 1974.
5. Kapper, Francis B. "Wargaming I: The Simulation of Crisis." Defense/81. Arlington, VA:
American Forces Information Service, May 1981, 14-21.
6. Hoeber, Francis P. Military Applications of Modeling: Selected Case Studies, s. New
York: Gordon and Breach Science Publisher, 1981.
7. Law, Averill M. and W. David Kelton. Simulation Modeling and Analysis. New York:
McGraw-Hill Book Company, 1982.
8. Balci, Osman. "Credibility Assessment of Simulation Results." Proceedings of the 1986
Winter Simulation Conference. Ed. J Wilson J Henriksen S Roberts. 1986, 38-44.
9. Schmidt, J. W. "Introduction to Systems Analysis, Modeling and Simulation."
Proceedings of the 1986 Winter Simulation Conference. Ed. J Wilson J Henriksen S
Roberts. 1986, 5-16.
10. Sargent, Robert G. "Overview of Verification and Validation of Simulation Models, An."
Proceedings of the 1987 Winter Simulation Conference. Ed. A Thesen; H Grant; W D
Kelton. Atlanta, GA, 1987, 33-39.
11. Balci, Osman. "How to Assess the Acceptability and Credibility of Simulation Results."
Proceedings of the 1989 Winter Simulation Conference. Ed. E MacNair, K Musselman, P
Heidelberger. Washington, DC, 1989, 62-71.
12. Williams, Marion L., and James Sikora (1991), "SIMVAL Minisymposium-A Report,"
Phalanx, Vol. 24, No. 2, June.
13. Hodges, James S. (1991). Six (or So) Things You Can Do with a Bad Model. RAND,
Santa Monica, CA.
14. Davis, Paul K. 1992. Generalizing Concepts and Methods of Verification, Validation, and
Accreditation (VV&A) for Military Simulations. RAND, R-4249-ACQ.
15. Ritchie, Adelia E. (1992). Simulation Validation Workshop Proceedings (SIMVAL II).
MORS, Alexandria, VA.
16. Knepell, Peter L. and Deborah C. Arangno (1993). Simulation Validation: A Confidence
Assessment Methodology. IEEE Computer Society Press, Los Alamitos, CA.
17. Anderson, Robert H. and Paul Davis (1993). Proceedings of a Workshop on Methods and
Tools for Verification, Validation, and Accreditation October 5-6, 1993. RAND, PM179-DMSO.
18. Hartley, Dean S., III. and Howard Whitley (1996). “Verification, Validation, and
Certification (VV&C).” Simulation Data and Its Management Mini-Symposium
(SIMDATAM ’95). Ed. C E Gettig Jr. and Julian I Palmore. MORS, Alexandria, VA,
1996.
19. Balci, Osman. "Verification, Validation and Accreditation of Simulation Models."
Proceedings of the 1997 Winter Simulation Conference. Ed. S Andradottir, K J Healy, D
H Withers, and B L Nelson. Washington, DC, 1997, 135-141.
20. Hartley, Dean S., III. "Verification & Validation in Military Simulations." Proceedings of
the 1997 Winter Simulation Conference. Ed. S Andradottir, K J Healy, D H Withers, and
B L Nelson. Washington, DC, 1997, 925-932.
21. Verification, Validation, and Accreditation of Army Models and Simulations (1999). DA
PAM 5-11, Washington, DC.
22. Verification, Validation, and Accreditation (VV&A) of Models and Simulations (1999).
SECNAV Instruction 5200.40, Washington, DC.
23. Conwell, Candace L., et al. "Capability Maturity Models Support of Modeling and
Simulation Verification, Validation, and Accreditation." Proceedings of the 2000 Winter
Simulation Conference. Ed. J A Joines, R R Barton, K Kang, and P A Fishwick.
Washington, DC, 2000, 819-828.
24. Balci, Osman. “Verification, validation and testing of models,” Encyclopedia of
Operations Research and Management Science. Ed. Saul I. Gass and Carl M. Harris.
Kluwer Academic Publishers, Boston, MA. 2001.
25. Balci, Osman, et al. "A Collaborative Evaluation Environment for Credibility Assessment
of Modeling and Simulation Applications." Proceedings of the 2002 Winter Simulation
Conference. Ed. E Yucesan, C-H Chen, J L Snowdon, and J M Charnes. Washington,
DC, 2002, 214-220.
26. Youngblood, Simone (2004), “DoDI 5000.61 and the VV&A RPG”, DMSO,
Washington, DC.
27. Sargent, Robert G. "Validation and Verification of Simulation Models." Proceedings of
the 2004 Winter Simulation Conference. Ed. R G Ingalls, M D Rossetti, J S Smith, and
BA Peters. Washington, DC, 2004.
28. Modeling and Simulation Verification, Validation, and Accreditation Implementation
Handbook: Volume I VV&A Framework (2004). Department of the Navy, Washington,
DC.
29. Simulation Verification, Validation and Accreditation Guide (2005). Australian Defence
Simulation Office, Department of Defence, Canberra, Australia.
30. Youngblood, Simone (2005), “VV&A Standards Initiative”, DMSO, Washington, DC.
31. Management of Army Models and Simulations (2006). DA Regulation 5-11, Washington,
DC.
32. Waltz, Ed (2007). “Integrated Battle Command Program: Model Validation.” BAE
Systems.
33. Draft ATEC Reg 73-21 (Accreditation of M&S for Test and Evaluation).
34. DMSO VV&A website, http://vva.dmso.mil/.
35. “Agent Based Simulation (ABS) VV&A Workshop,” sponsored by Mike Bailey, USMC,
held at Northrop Grumman, Fair Lakes, VA, 1-2 May 2007.
A.2 PRACTICE OF VV&A (SORTED BY DATE)
36. Hartley, Dean S., III (1975), An Examination of a Distribution of TAC CONTENDER
Solutions, NMCSSC, Washington, DC, TM 101-75.
37. Vector Research (1981), Verification Analysis of VECTOR-2 with the 1973 Arab-Israeli
War and Analysis of Related Military Balance Issues, VRI-NGIC-1 FR 81-1. VRI., Ann
Arbor, MI.
38. Hartley, Dean S., III, et al. (1989), Sensitivity Analysis of the Joint Theater Level
Simulation I, Martin Marietta Energy Systems, Inc., Oak Ridge, TN, K/DSRD-70.
39. Hartley, Dean S., III, John D. Quillinan, and Kara L. Kruse (1990), Phase I Verification
and Validation of SIMNET-T, Martin Marietta Energy Systems, Inc., Oak Ridge, TN,
K/DSRD-116.
40. Hartley, Dean S., III, Charles Radford, and Cathrine E. Snyder (1991), Evaluation of the
Advanced Battle Simulation in WAREX 3-90, Martin Marietta Energy Systems, Inc., Oak
Ridge, TN, K/DSRD-597.
41. Hartley, Dean S., III (1991), Confirming the Lanchestrian Linear-Logarithmic Model of
Attrition, Martin Marietta Energy Systems, Inc., Oak Ridge, TN, K/DSRD-263/R2.
42. Hartley, Dean S., III, et al. (1994), An Independent Verification And Validation of The
Future Theater Level Model Conceptual Model, Martin Marietta Energy Systems, Inc.,
Oak Ridge, TN, K/DSRD-1637.
43. Bauman, Walter (1995), Ardennes Campaign Simulation (ARCAS), CAA-SR-95-8. US
Army Concepts Analysis Agency, Bethesda, MD.
44. Metz, Michael L. (2000), "Joint Warfare System (JWARS) Verification and Validation
Lessons Learned." Proceedings of the 2000 Winter Simulation Conference. Ed. J A
Joines, R R Barton, K Kang, and P A Fishwick. Washington, DC, 2000, 855-858.
45. Hartley, Dean S., III. (2001), Predicting Combat Effects, INFORMS, Linthicum, MD.
46. Wait, Patience (2001). “Project Protection Comes at a Price.” Washington Technology,
8/28/01. http://www.washingtontechnology.com/news/1_1/daily_news/17071-1.html.
47. Chaturvedi, Alok R. (2003). “SEAS-UV 2004: The Theoretical Foundation of a
Comprehensive PMESII/DIME Agent-Based Synthetic Environment,” Simulex, Inc.
48. Hartley, Dean S., III (2004). FAST for the Warfighter: Test Strategy & Plan (Revision 3).
DRC, Vienna, VA.
49. Hartley, Dean S., III (2005). MOOTW FAST Prototype Toolbox: FY04 Validation
Strategy & Plan. DRC, Vienna, VA.
50. Hartley, Dean S., III (2005). MOOTW FAST Prototype Toolbox: FY05 Validation
Strategy & Plan. DRC, Vienna, VA.
51. Senko, Robert M. (2005). Flexible Asymmetric Simulation Technologies (FAST) for the
Warfighter: FY05 Final Verification Test Report. DRC, Vienna, VA.
52. COAST GUARD: Changes to Deepwater Plan Appear Sound, and Program Management
Has Improved, but Continued Monitoring Is Warranted (2006). GAO, Washington, DC.
53. McDonald, C. S. (2006). Verification and Validation Report for the Synthetic
Environment for Analysis and Simulation (SEAS). The Johns Hopkins University –
Applied Physics Laboratory, Laurel, MD.
54. Sheldon, Bob (2006). “Memorandum for the Record: V&V Report for the Synthetic
Environment for Analysis and Simulation (SEAS), 27 October 2006.”
55. Davis, D. F. (2006). “Consolidated Report Consisting of three Research and
Development Project Summary Reports,” Contract #W15P7T-06-T-P238. Peace
Operations Policy Program, George Mason University, Arlington, VA.
A.3 PMESII CONSIDERATIONS
56. Hartley, Dean S., III (1996). Operations Other Than War: Requirements for Analysis
Tools Research Report. Lockheed Martin Energy Systems, Inc., Oak Ridge, TN,
K/DSRD-2098.
57. Hayes, Bradd C. and Jeffrey I. Sands, 1997. Doing Windows: Non-Traditional Military
Responses to Complex Emergencies, CCRP, Washington, DC.
58. Staniec, Cyrus and Dean S. Hartley III. (1999), OOTW Analysis and Modeling
Techniques (OOTWAMT) Workshop Proceedings, January 1997. MORS, Alexandria,
VA.
59. Hartley, Dean S., III (2001). OOTW Toolbox Specification. Hartley Consulting, Oak
Ridge, TN.
60. DMSO (2001). “Human Behavior Representation (HBR) Literature Review,” RPG
Reference Document, http://vva.dmso.mil/.
61. DMSO (2001). “Validation of Human Behavior Representations,” RPG Reference
Document, http://vva.dmso.mil/.
62. Allen, John G. (2006). “Modeling PMESII Systems.” DARPA, Washington, DC.
63. Hartley, Dean S. III (2006). Operations Other Than War (MOOTW) Flexible
Asymmetric Simulation Technologies (FAST) Prototype Toolbox: ISSM v4.00 Analysts'
Guide, Dynamics Research Corporation. 2006.
64. Wagenhals, Lee W. and Alexander H. Levis (2006), “Dynamic Cultural Preparation of
the Environment: Course of Action Analysis Using Timed Influence Nets”, George
Mason University, Fairfax, VA.
65. HQ DA, “Systems Perspective of the Operational Environment,” The Operations
Process, FM 5-0.1, 2006.
66. Hartley, Dean S., III, Dave Holdsworth, and Christian Farrell (2006). OOTW FAST
Prototype Toolbox: Analysis Process. DRC, Orlando, FL.
A.4 SIMULATION TECHNOLOGY
67. Starr, Stuart (2000). SIMTECH 2007 Mini-Symposium and Workshop. MORS,
Alexandria, VA.
68. Davis, Paul and Robert H. Anderson (2003). Improving the Composability of Department
of Defense Models and Simulations. RAND, Santa Monica, CA.
APPENDIX B. ABBREVIATIONS AND ACRONYMS
Acronym     Definition/Description
AoR         Area of Responsibility
COMPOEX     Conflict Modeling, Planning and Outcomes Experimentation
DARPA       Defense Advanced Research Projects Agency
DIME        Diplomatic, Informational, Military, Economic
DMSO        Defense Modeling and Simulation Office (now M&S CO)
FAST        Flexible Asymmetric Simulation Technologies
ISSM        Interim Semi-Static Stability Model
M&S         Modeling & Simulation
MoM         Measure of Merit
PMESII      Political, Military, Economic, Social, Information and Infrastructure
RoI         Return on Investment
SME         Subject Matter Expert
V&V         Verification & Validation
VV&A        Verification, Validation & Accreditation
Appendix C. Major Characteristics of the COMPOEX Multi-Resolution Model
DARPA’s COMPOEX program is developing a decision aid to support leaders in
designing and conducting future coalition-oriented, multi-agency, intervention campaigns
employing unified actions, or a whole of government approach to operations. It generates
a distribution of “plausible outcomes” rather than precise predictions.
It provides a comprehensive family of interacting models that span the relevant
DIME/PMESII domain. The execution of COMPOEX:
- Automatically forces models to interact to suggest plausible activities and outcomes;
- Allows the user to modify or create models and to import models;
- Allows multi-sided analysis; and
- Allows the user to visualize the interactions (normally in the form of graphical data).
COMPOEX’s components include:
- Conflict Space Tool: provides leaders and staff the ability to explore and map sources of instability, relationships, and centers of power to develop their theory of conflict.
- Campaign Planning Tool: a framework to develop, visualize, and manage a comprehensive campaign plan in a complex environment.
- Family of Models: a set of models instantiated for the current problem, with additional models added as needed to more accurately represent the operational environment.
- Option Exploration Tool: enables a staff to explore multiple series of actions in different environments to see the range of possible outcomes in all environments.
Figure C-1 illustrates the structure of the COMPOEX Option Exploration Tool. Each
PMESII area is represented by a set of models, divided by level of resolution. In general,
the models communicate with each other by posting variable results to the backplane at
the end of each time step and reading variable values at the beginning of the next time
step. The models are tailored to the Area of Responsibility (AOR) and, perhaps, to the
problem at hand.
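The following sketch illustrates the time-stepped, backplane-mediated interaction described above; it is not COMPOEX code, and the model interface shown is an assumption made for illustration.

```python
def run_time_stepped(models, backplane, n_steps):
    """Illustrative backplane-mediated loop: each component model reads the
    shared backplane at the start of a time step and posts its results at
    the end of the step, where the next step's models can read them."""
    for step in range(n_steps):
        snapshot = dict(backplane)        # every model sees the same prior state
        updates = {}
        for model in models:
            updates.update(model.step(step, snapshot))   # hypothetical interface
        backplane.update(updates)         # post results for the next time step
    return backplane
```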
[Figure C-1 depicts the COMPOEX structure: political-social, economic, rule of law, infrastructure, information, and military areas, each populated by component models (e.g., national government and regional influences, sector power structures, national and meso-level economics, employment, black market, health and sanitation, food distribution, media, STORM, JCAT) at national, sector, and district levels of resolution, all interacting through a shared backplane.]
Figure C-1. COMPOEX Structure
From a V&V perspective, each model must be addressed, and the nature of the assumptions built into the communications links must be examined as part of the V&V of the entire system. The fact that multiple simulation paradigms (system dynamics models, SOAR technology, and possibly other agent-based modeling systems) are used
within the system complicates the process. However, the most challenging aspect from a
VV&A perspective is the proposed use of different sets of models, depending on the
situation, with models being custom built or modified immediately prior to use.