Until the next of the formal release of the UCODE... can be maintained by inserting the following sections in the... ADDITIONS TO THE UCODE MANUAL APPLICABLE TO UCODE 3.0

advertisement
ADDITIONS TO THE UCODE MANUAL APPLICABLE TO UCODE 3.0
Until the next of the formal release of the UCODE manual, a working copy of the manual
can be maintained by inserting the following sections in the original copy of the manual.
A hard copy can be ordered by contacting igwmc@mines.edu.
1
THIS NEW DISCUSSION OF WEIGHTING (with particularly useful application
to concentration observations) IN UCODE SHOULD FOLLOW TABLE 1, ON
PAGE 12 OF THE ORIGINAL MANUAL
Weighting Observations
Weights are applied to the residuals to reflect their relative certainty, and thus
importance to obtaining a close fit between the simulated value and the field observation.
Generally, the inverse of the variance of the measurement is used to weight squaredresiduals. This can be determined for practical applications by converting approximate
knowledge of the accuracy of a measurement into more exact terms through a statement
such as: “I am 95% confident that this measurement is within 1 meter.” From a table
describing the normal distribution, one determines that 95% confidence is associated with
a standard deviation of 1.96. Consequently, 1 meter represents 1.96 standard deviations.
It follows that one standard deviation is 1/1.96 meters, or 0.51 meters. The variance is
the square of the standard deviation, or 0.26 meters squared. Finally the weight applied to
the squared residual is the inverse of the variance, or 3.84 square-meters-1. If an
observation is the result of the difference between two measurements (e.g. seepage to a
stream is the difference between the measured stream flow at two locations), the variance
is calculated for each measurement and summed to determine the variance on the
difference.
Errors associated with some types of measurements (e.g. hydraulic head in a
ground-water system) are independent of the magnitude of the observation. While for
some types of measurements, the error is related to the magnitude of the true value in the
field setting (e.g. concentration in a ground-water system). Consequently, it is
appropriate to allow weights to be proportional the value. Such an approach facilitates
attainment of equally desirable fit to both large and small values. The true concentrations
are not known, so either the observed or the simulated values need to be used to
determine the weight.
Researchers including Barlebo and others, 1998; Sonnenborg and others, 1996;
Gailey and others, 1991; and Wagner and Gorelick, 1987, experimented with using the
simulated concentration to determine the weight. These researchers used the same
2
relative weight for any given type of observation. In UCODE the user defines a
dimensionless factor, cv j , that reflects the confidence associated with the simulated
value, ĉ j , thus the weight is:
wc j =
1
(cv j cˆ j )2
(3)
Keidser and Rosbjerg (1991) and Van Rooy and others (1989) used the observed values,
c~ , to calculate weights. Keidser and Rosbjerg (1991) added a constant, η , to prevent
j
large differences between simulated and observed values at low concentrations from
dominating the regression, thus the weight is:
wc j =
1
~
(cv j c j + η )2
(4)
The estimated parameter values are sensitive to the value of the constant (Van Rooy and
others, 1989). If the constant is too small then large deviations at the periphery of the
plume will dominate. Van Rooy and others, suggest setting η to 10-percent of the source
strength.
Use of observed concentration to determine the weight can result in biased
estimated parameter values (Anderman and Hill, 1999). Calculation of weights using
observed concentration produced biased parameter estimates (average parameter
estimates were significantly, as much as 35 percent, different than the true values).
Calculation of weights using simulated concentrations produced unbiased parameter
estimates. Anderman and Hill (1999) demonstrated that this deviation is due to the fact
that the weighted residuals based on observed concentrations are not normally
distributed, even when the true errors are normal, while the weighted residuals based on
the simulated concentrations are normally distributed. They further indicated that using
simulated values to calculate the weights was problematic when parameter values are not
close to the optimal parameter values because the simulated values and, thus, the weights,
are unrealistic. They improved the regression by using observed concentrations to
calculate weights for early parameter iterations and using simulated concentrations when
estimated parameter values changed less than 10 percent between parameter-estimation
3
iterations. UCODE makes this approach available and uses the scaling parameter
described below to calculate the weights. This is described in more detail in the section
entitled: Universal Input File—filename.uni.
Gailey and others (1991) introduced a scaling parameter to the weight equations
to ensure that the hydraulic-head and concentration residuals form a single population
with a uniform variance and that both types of observations contribute equally to the
calibration. It is initially set to 1.0 and is subsequently calculated at each parameterestimation iteration as,
S h (b r )
λ r = nh
S c (b r )
nc
(5)
which is, the ratio of the average weighted squared residual for hydraulic heads and the
average weighted squared residual for concentrations. λr replaces the numerator in the
equations 3 and 4. UCODE allows the user to choose whether to use this alternative
weighting scheme and to define: the individual weighting factors for each observation;
which observations receive the alternative weighting, which observations are included in
the numerator of the scaling parameter, the constant η , and the threshold level for using
the simulated rather than the observed value for calculating the weight.
4
THIS NEW DISCUSSION OF A THE UCODE RUN COMMAND, REPLACES
THE PREVIOUS DISCUSSION ON PAGES 23 & 24 OF THE ORIGINAL
MANUAL
Run Command
The run command needs to be executed in the directory containing the UCODE
input files. An example run command is:
perl
ucode fn [optional open window flag]
where: "perl" is the command necessary to invoke the Perl program on the computer
where the code is installed; and "ucode" is the pathname for the UCODE Perl script;
filename is a file name prefix that needs to be replaced using any name that conforms
with file name restrictions of the operating system being used and an open window flag
of 1 keeps the DOS window open to allow viewing of error messages until the user
presses the “enter” key, while a 0 or absence of a flag causes the window to close upon
termination. The only exception is that spaces are not allowed in fn, even on operating
systems that allow spaces in file names.
5
THIS NEW DISCUSSION OF A THE UCODE TOL PARAMETER, REPLACES
THE PREVIOUS DISCUSSION ON PAGE 26 OF THE ORIGINAL MANUAL
TOL [ optional UCODE version 3.0 and higher MARQUARDT_DIRECTION FACTOR
INCREMENT]
When parameter values change less than the fractional amount define by
the value of TOL between regression iterations, parameter estimation
converges (0.01 is recommended). The Marquardt parameter is utilized
during the regression when difficult mathematical conditions arise,
typically resulting from a poorly posed problem. The
MARQUARDT_DIRECTION, and FACTOR and INCREMENT are
optional input items and can be controlled by the user in UCODE version
3.0 and higher. The direction is the angle (in degrees) between the down
gradient direction and the direction of the parameter update vector. If this
angle approaches 90o, the regression will not make progress towards the
minimum sum-of-weighted-squared residuals. The default is 85 o. The
Marquardt parameter, ur, starts at zero and for iterations in which the
condition described by Cooley and Naff (1990, p.71-72) is met using ur =
0, ur is increased as:
unewr = FACTOR uold r + INCREMENT
If FACTOR and INCREMENT are not specified, the defaults are 1.5 and 0.001
respectively.
6
AN OPTIONAL INPUT ITEM CAN BE INCLUDED BEFORE THE LIST OF
OBSERVATIONS IN THE *.UNI FILE, THIS SHOULD BE PLACED BETWEEN
THE EXPLANATION OF THE “NUMBER-RESIDUAL-SETS” AND “LIST OF
OBSERVATIONS” ON PAGE 28 OF THE ORIGINAL MANUAL
OPTIONAL FLAG - If the user has an extensive list of observations, the time required
to compare each observation name to every previously entered name can
become excessive. To avoid this check include an additional line before
the observation list with the following flag beginning in column 1,
otherwise omit this line entirely:
DO_NOT_CHECK_OBSERVATION_NAMES
*DO NOT CHECK OBSERVATION NAMES OPTION *UCODE version 3 and
higher OPTIONAL LINE OF INPUT: - If the user has an extensive list
of observations, the time required to compare each observation name to
every previously entered name can become excessive. To avoid this check
include the following:
DO_NOT_CHECK_OBSERVATION_NAMES
on a line by itself immediately before the first observation in the uni file.
7
THIS NEW DISCUSSION OF A THE STAT-FLAG FOR OBSERVATIONS IN
THE *.UNI FILE, REPLACES THE PREVIOUS DISCUSSION ON PAGE 29 OF
THE ORIGINAL MANUAL
STAT-FLAG - A flag indicating whether STATISTIC is a variance,
standard deviation, or coefficient of variation.
STAT-FLAG
-1
STATISTIC___________________________
cv j of alternate weighting scheme, described under
options below, UCODE version 3 and higher
0
variance
1
standard deviation
2
coefficient of
variation________________________
If any observations are assigned a negative stat-flag, then the
ALTERNATE WEIGHTING OPTION must be used and the
CONSTANT and THRESHOLD must be specified at the end of the
list of observations as discussed in the section Universal Input File
UCODE Version 3.0 Additions—filename.uni below.
8
OBSERVATIONS CAN NOW BE ASSOCIATED WITH X,Y,Z COORDINATES
AS DESCRIBED IN THIS DISCUSSION WHICH SHOULD APPEAR AFTER
THE PLOT-SYMBOL DISCUSSION ON PAGE 29 OF THE ORIGINAL
MANUAL
If the user wishes to relate the observations to x, y, z, locations and/or to
points in time then fn.xyz file is created by the user with the format:
observation-ID x-coordinate y-coordinate t-coordinate t-coordinate
The observations may be listed in any order, but their names must exactly
match the names in the fn.uni file and all observations listed in fn.uni must
be included in fn.xyz. If the file fn.xyz exists in the directory where
UCODE is launched, UCODE will place “x, y, z, t” at the end of the line
in each plot file for which a link to an x,y,z,t is appropriate. In files where
data related to prior information is included, the x, y, z, t values will be
printed as 0, 0, 0, 0.
9
THIS NEW DISCUSSION OF THE END COMMAND IN THE *.UNI FILE,
REPLACES THE PREVIOUS DISCUSSION ON PAGE 29 OF THE ORIGINAL
MANUAL
END - Indicates the end of the data input for this file. UCODE is not case-sensitive to the
word “END”. If the modeler chooses to use one of UCODE’s optional features, then
END will not appear at this point in the uni file, but will instead appear at the end of
the file after all the optional commands. The description of flags for optional features
follows. The flags are not case-sensitive. Because some computers are sensitive to eth
presence of a carriage return at the end of the final line of input, it is good practice to
include a blank line at the end of all UCODE input files.
10
THERE ARE MANY NEW, OPTIONAL, INPUT ITEMS FOR UCODE3.0 THAT
CAN BE INCLUDED BEFORE THE END COMMAND OF THE *.UNI FILE,
THE FOLLOWING PAGES SHOULD BE PLACED AFTER THE DISCUSSION
OF THE END COMMAND ON PAGE 29 OF THE ORIGINAL MANUAL
Universal Input File UCODE Version 3.0 Additions—filename.uni
New items related to UCODE version 3.0 follow. UCODE version 3.0 is completely back
compatible with previous UCODE input files.
ALTERNATE WEIGHTING OPTION – Unique weighting allows the user to vary the
weight throughout the regression for some of the observations. It was
designed particularly with the idea of adjusting the weight of
concentration observations in a ground-water model, but may be
applicable to situations in other disciplines. If the user specifies unique
weighting by a negative stat-flag for any observation, the first line
following the last observation in the filename.uni file needs to contain the
flag UNIQUE_WEIGHT (not case-sensitive) followed by the values for
CONSTANT and THRESHOLD:
UNIQUE_WEIGHT CONSTANT THRESHOLD
where, the non-zero, positive constant, η , is used to calculate weights
using observed values for observations with negative stat-flag values
during early parameter iterations as follows:
wc~j =
λ r −1
(cv c~
j
+η)
2
j
(6)
where:
cv j
is a dimensionless user-defined weight factor, calculated from the
c~j
“STAT” value
is the jth observed concentration
η
is a non-zero, positive CONSTANT. Keidser and Rosbjerg (1991)
suggested adding the constant to prevent large relative differences
at low concentrations from dominating the regression. Van Rooy
and others (1989) found that the estimated parameter values are
sensitive to the value of the constant that is chosen. A small value
for the constant caused large residuals at the periphery of the
11
plume to dominate the regression. They suggested using a value
equal to 10-percent of the source strength for η .
During later parameter iterations, when the estimated parameter values are
changing by less than the THRESHOLD value (0.1 is a reasonable choice
when working with a tolerance of 0.01), weights are calculated using
simulated values for observations with negative stat-flag values as
follows:
wc~ j =
λ r −1
(7)
(cv cˆ )
2
j
j
or, if the simulated value is less than or equal to zero, then:
wc~ j =
λr −1
(η )2
(8)
where:
ĉ j
is the jth simulated concentration
λ r −1
is the scaling parameter initially set to 1, then calculated from
iteration r-1:
S h (b r −1 )
λr −1 = nh
S c (b r −1 )
nc
(9)
where:
nh
(
)
2
S h (b r ) is ∑ whi hi − hˆi (b r ) ,
i =1
whi
hi
ĥi (b r )
nh
is the weight of the ith observation that is not receiving alternate
weighting, i.e. those with stat-flags equal to or greater than 0 (in
ground-water applications this is likely a hydraulic head),
is the ith observed value not receiving alternate weighting, i.e. statflags equal to or greater than 0 (in ground-water applications this is
likely a hydraulic head),
is the ith simulated value, from iteration r, not receiving alternate
weighting, i.e. stat-flags equal to or greater than 0 (in ground-water
applications this is likely a hydraulic head),
is the number of values not receiving alternate weighting, i.e. those
with stat-flags equal to or greater than 0 (in ground-water
applications this is likely the number of hydraulic heads),
12
S c (b r ) is
∑ w (c
2
nc
j =1
cj
j
− cˆ j (b r )) ,
wc j
is the weight of the jth observation to receive alternate weighting,
cj
i.e. those with stat-flags equal to -1 (in ground-water applications
this is likely a concentration),
is the jth observed value to receive alternate weighting, i.e. stat-
ĉi (b r )
nc
flags equal to -1 (in ground-water applications this is likely a
concentration),
is the jth simulated value, from iteration r, to receive alternate
weighting, i.e. stat-flags equal to -1 (in ground-water applications
this is likely to be a concentration), and
is the number of values to receive alternate weighting, i.e. those
with stat-flags equal to -1 (in ground-water applications this is
likely the number of concentrations).
To prevent difficulty converging the weights are held constant at the last
calculated weight once the parameter values change by an amount
less than the “final threshold” which is calculated as:
final threshold = 0.1 (THRESHOLD − TOL) + TOL
(10)
ALTERNATE PRINT FILES OPTION - If the user would like to have the extensive
lists of observations, residuals, and sensitivities printed to alternate files so
that it is easier to read the primary output files, the following flag is
specified:
ALTERNATE_PRINT (not case-sensitive)
When this specification is made the alternate output for phases 1 through 3
is written to fn._alt, alternate output for phase 33 is written to fn._3a, for
phase 44 is written to fn._4a, and for 45 is written to fn._5a. If it is
desired that the intermediate information be printed to these alternate files,
the intermediate printing option will need to be turned on. Otherwise the
alternate files will hold this information for the initial and the final
parameter values.
13
PRINT GRAPH FILE HEADER OPTION - If the user would like to have labels
printed as the first line in the output files that are used for graphing, the
following flag is specified:
HEADER_PRINT (not case-sensitive)
When this specification is made a header will appear as the first line of the
following files: fn._cp, fn._c4, fn._c5, fn._nm, fn._os, fn._r, fn._rd, fn._rg,
fn._s1, fn._s2, fn._s3, fn._sp, fn._w, fn._ws, fn._ww.
LINEAR EXTRACTION OPTION - If the user would like to extract each observation
in its order of occurrence in the filename.uni file from an output file that
includes only the simulated equivalents, one on each line, in the proper
order, the following flag is specified:
LINEAR_EXRTACT (not case-sensitive)
In this case the extract file contains only the codes <filename and END,
and may appear as follows:
<head_flow.out
<concentration.out
END
Blank lines and comments should not occur in either the filename.ext file
or the application output files that contain the values to be extracted.
GROUP STATISTICS REPORTING OPTION - If the user would like to calculate
statistics by observation type then plot-flags need to define the groups of
interest in the observation list of the filename.uni file and specify the
following flag after the observation list and before the END flag in the
filename.uni file.
GROUP_STATS (not case-sensitive)
In this case, composite scaled sensitivities will be printed for each group,
for each iteration, and the final residual statistics (minimum, average,
maximum, #positive, #negative, number of runs, and the sum of squared
weighted residuals) will be printed for each group for the optimal values.
14
RESTART OPTION - If the user would like to restart the regression program at a
previous iteration with different regression controls or omitting insensitive
parameters without repeating the time-consuming executions of the
forward model, the restart option can be selected by specifying:
RESTART_SAVE (not case-sensitive)
W hen this specification is made scratch files are written to a subdirectory
within the directory where UCODE was executed and labeled with the
iteration number. Each directory will contain the following files:
filename.uni
u9_zscr.pre
u9_zscr.2
u9_zscr.3
u9_zscr.5
u9_zscr.6
u9_zscr.8
u9_zscr.12
u9_zscr.13
and if central differences were utilized:
u9_zscr.1
u9_zscr.11
If the user decides to restart the regression from an intermediate stage they
need to:
•
copy all of these files from the selected iteration to the parent
directory
•
edit the filename.uni file to:
specify any desired alternative regression controls, such as
varying MAX-CHANGE or NOPT etc. NOTE: the
observations cannot be changed at this time because only
the extracted values are saved and these are not available
for other observations
specify RESTART_RUN (not case-sensitive) in the
position where RESTART_SAVE had been
•
edit the u9_zscr.pre file by simply removing the asterisk in front of
the parameter number that should be fixed at the value it had at
that point in the regression.
The u9_zscr.pre file contains information such as:
15
56
F yes
<bcf.tpl
>bcf.sub
/!k1,,,,,,,,,,! *0 3.0e-5 3.0e-3 0.03 %14.7e 1 1
/!kv,,,,,,,,,,! *1 2.0e-7 1.0e-6 0.03 %14.7e 1 1
/!k2m,,,,,,,,,! !2 4.0e-6 4.0e-4 0.03 %14.7e 1 0
<riv.tpl
>riv.sub
/!krb,,,,,,,,,! *3 1.2e-4 1.2e-2 0.03 %14.7e 1 1
<rcharr.tpl
>rcharr.sub
/!rch1,,,,,,,,,! *4 0.5e-8 4e-8 0.1 %15.7e 0 1
/!rch2,,,,,,,,,! *5 0.5e-8 4e-8 0.1 %15.7e 0 1
END
Notice that the initial parameter values have been replaced with either a *
followed by the parameter number, if the parameter was being estimated
or, an ! followed by the parameter number, if it was not estimated. The
only option you have is to remove the * indicating that you would like to
proceed from that point in the regression with the parameter fixed at the
value at the start of the iteration. The parameter values are listed in
u9_zsrc.8 in order of occurrence in the filename.pre file with all of the
parameters to be estimated list first and the parameters that have a “do not
estimate flag second.
For example, the first 5 values in the following list are k1, kv, krb, rch1,
rch2, and the last value is k2m which is not estimated.
0.4031901E-03
0.1817285E-06
0.1134773E-02
0.1050016E-07
0.1449825E-07
0.4000000E-04
16
PARALLEL PROCESSING OPTION – To use the parallel option, the companion
Master-Tasker code is needed and must be operational on the master and
slave computers in order to use the parallel option in UCODE. If the parallel
option is invoked, rather than indicating END at this location in the filename.uni
file, a PARALLEL flag is input followed by the information required to undertake
the parallel solution as delineated below. The Master-Tasker code manages a
“master” computer and “slave” computers that have been setup to accept parallel
executions of the application codes. All batch files that are to be executed on
slave machines must include a final line as follows:
echo process_complete>%1
PARALLEL (not case-sensitive) - Indicates the parallel option of UCODE will be
invoked.
If “PARALLEL” is specified, each the following items must be specified in the
following order, starting a new line for each item delineated in bold
below:
PATH_to_MASTER - path to the directory where the subdirectories to hold the
information for the parallel executions will be created, including a
slash at the end. (e.g. D:\master\)
PATH_to_DYNAMIC_FILES - path to the directory where files that change
throughout the parallel execution are stored the application codes,
batch files and input files are held including a slash at the end.
Everything in this directory will be copied to the slave machines.
These files need to be all that is needed to conduct a forward
model run when the proper command is given, except the input
files that are created by UCODE substitution of values into
template files.
PATH_to_STATIC FILES - path to the directory where files that will not
change throughout the parallel execution are located. Everything in
this directory will be copied to the slave machines only once at the
beginning of the parallel operation. The setup of the application
must reference these files with the correct relative directory
references.
REPEAT_TIME - frequency with which to check “slave” machines where one
forward execution of the application model(s) is performed to
obtain information for calculation of sensitivities.
17
REPEAT_TIME_UNIT - time unit for REPEAT_time (s=second, m = minutes,
h = hours)
DEFAULT_SPACE - minimum amount of space required to hold the application
software, data files, and output
SPACE_UNIT - unit for default_SPACE (M = megabytes, G = gigabytes)
DEFAULT_SPEED - speed assumed for slave computer to perform one forward
execution of the application model(s) in the default_time (see
below)
SPEED_UNIT - unit for default_SPEED (Mhz = megahertz)
DEFALT_TIME - estimated time required to perform one forward execution of
the application model(s) on a machine with the default speed
TIME_UNIT - time unit for default_TIME (s=seconds, m = minutes, h = hours)
DEFAULT_LAUNCH - command to launch one forward execution of the
application model(s) when executed within the directory indicated
as PATH_to_APPLICATION_FILES
TIME_OUT - time limit for waiting for SHORT TERM responses (e.g.
acknowledgement that requests have been made or status has been
updated) from the Master (secs) before terminating
ABORT_CONDITION
-
condition to leave slave machines in after the program terminates
unnaturally
specify, either:
clean
leave
AFTER ALL OPTIONAL FEATURES HAVE BEEN SPECIFIED, the
fn.uni file ends with:
END (not case-sensitive) - Indicates the end of the data input for this file.
18
ADDITIONAL REFERENCES FOR UCODE3.0 ON PAGE 61 & 62 OF THE
ORIGINAL MANUAL
Anderman, E. R. and Hill, M. C., 1999, a New Multistage Groundwater Transport Inverse
Method: Presentation, Evaluation, and Implications, Water Res. Research, 35(4), p. 10531063.
Barlebo, H. C., Hill, M. C., Rosbjerg, D., and Jensen, K.H., 1998, Concentration Data
and Dimensionality in Groundwater Models: Evaluation Using Inverse Modeling,
Nordic Hydrology, v. 29, p. 149-178.
.
.
Gailey, R. M., Gorelick, S. M., and Crowe, A. S., 1991, Coupled Process Parameter
Estimation and Prediction Uncertainty Using Hydraulic Head and Concentration
Data, Adv. in Water Res., 14(5), p. 301-314.
.
.
Keidser, A. and Rosbjerg, D., 1991, A Comparison of Four Inverse Approaches to
Groundwater Flow and Transport Parameter Identification, Water Res. Research,
27(9), p. 2219-2232.
.
.
Sonnenborg, T. O., Engesgaard, P., and Rosbjerg, D., 1996, Contaminant Transport at a
Waste Residue Deposit, 1. Inverse Flow and Nonreactive Transport Modeling,
Water Res. Research, 32(4) p. 925-938.
Van Rooy, D., Keidser, A., and Rosbjerg, D., 1989, Inverse Modelling of Flow and
Transport, in Groundwater Contamination (Proceedings of the Symposium held
during the Third IAHS Scientific Assembly, Baltimore, MD, May 1989), Linda
Abriola ed., IAHS Publ. no. 185, p. 11-23.
Wagner, B. J. and Gorelick, S. M., 1987, Optimal Groundwater Quality Management
Under Parameter Uncertainty, Water Res. Research, 23(7), p. 1162-1174.
19
Download