Dimensional Analysis

It can be argued that much of what statisticians talk about in most experimental design courses (beyond the basic principles of randomization, blocking, and replication) is, for
practical purposes, largely about details. One much larger issue, which generally receives
little attention, is the selection of independent variables or factors to be included in a study.
Statistical discussion of this topic often centers on generic power analysis, e.g. the consequences of how many factors can be accommodated in an experiment operationally constrained to a given number of runs when interactions of a specified order cannot be ignored. Some papers published in statistical journals offer practical advice on how to interact with investigators in selecting the experimental factors to be used in an experiment; many of these discussions are valuable, even though most are essentially based on experience and common sense rather than more formal arguments.
Physical scientists (perhaps especially physicists and engineers dealing with mechanical
and hydrodynamic systems) use dimensional analysis as part of their approach to modeling
systems and designing experimental studies. Dimensional analysis (DA) is not rooted in
statistical ideas, but is built on principles that are generally accepted as necessary axioms
in any modeling exercise that represents a physical process. Perhaps the most remarkable
aspect of DA is that it can, in some cases, help an investigator reduce the number of experimental factors to be considered in an experimental study. “Optimal” application of DA in
complex applications may require substantially more subject-matter knowledge than most statisticians possess. However, because it can be a very powerful experimental design tool from the investigator's perspective, some knowledge of its basic principles should be helpful to statisticians involved in experimental design.
A foundational paper on the topic discussed here was written by Buckingham (1914), for whom the Buckingham Pi Theorem is named. Palmer's (2007) monograph is very accessible for those with a basic background in physical systems, and contains a large number of good exercises. Albrecht et al. (2013) wrote a general introduction to this topic for statisticians, which is followed by discussion from others.
Physical Quantities
Any physical quantity has both a value and units. For example, X = 6 ft. has the value
of “6” and units of “feet,” a specific unit of the dimension “length.” It is often convenient to
think about values and units separately. We will use v(−) to denote the value of a physical quantity, and [−] to denote its units; e.g. in the above example, v(X) = 6 and [X] = feet.
Systems of units are built around the idea that there are a few “fundamental” physical
dimensions that require the definition of units, and that the units of all physical quantities
can be expressed as monomials of these. For example, length and time are ordinarily regarded
as fundamental dimensions, where specific units may be assigned to be (for example) feet and
seconds, respectively. But velocity, while a physical quantity, would be given units of feet × seconds^{-1} in this system, while acceleration would be assigned units of feet × seconds^{-2}. Systems can obviously be defined using different specific units for any one dimension, e.g. length may be given units of meters or miles rather than feet, but derived units (e.g. for velocity and acceleration) would also be modified to match.
Absolute units, or systems of them, are defined so that their corresponding values are on ratio scales with meaningful zeroes. Units of length and mass generally have this property; the most common exception is temperature, where the centigrade and Fahrenheit unit scales are not absolute, but the Kelvin scale is. Conversion between absolute systems of units involves only multiplication by a constant, e.g. X and Y are physically equivalent if v(X) = 1 and [X] = miles, and v(Y) = 5280 and [Y] = feet. Note that this does require the units to be absolute to eliminate an "intercept" from the transformation.
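As a very small illustration (a sketch only, using the familiar mile-to-foot factor of 5280), such a conversion is a pure rescaling with no intercept:

miles_to_feet <- function(v_miles) v_miles * 5280   # conversion between absolute units is a constant multiple
miles_to_feet(1)   # 5280: v(X) = 1 with [X] = miles corresponds to v(Y) = 5280 with [Y] = feet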
More particularly, every physical quantity X has units expressed as a product of integer powers of r fundamental units:

[X] = F_1^{a_1} F_2^{a_2} F_3^{a_3} \cdots F_r^{a_r}    (1)

where the F's denote entities like "feet", "seconds", et cetera. Much of physical modeling can be done with a system of r = 4 fundamental physical units defining length, mass, time, and electric charge. For example, denoting generic units of these by L, M, T, and Q, respectively:
[velocity] = LT^{-1}
[acceleration] = LT^{-2}
[force] = MLT^{-2}
[mass density] = ML^{-3}
[pressure] = ML^{-1}T^{-2}
[energy] = ML^{2}T^{-2}
[electric current] = QT^{-1}
[electric field] = MLQ^{-1}T^{-2}
It should be noted, however, that there is no “universally accepted” system of dimensions/units. For example, the SI system is based on r = 7 fundamental units, while some
physicists argue that systems of 3 or fewer fundamental dimensions should be adequate.
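This bookkeeping can be made concrete by storing the exponents (a_1, ..., a_r) of equation (1) as an integer vector, one entry per fundamental unit. The short R sketch below (illustrative only, using the L, M, T, Q system above, with hypothetical helper names) builds several of the derived units listed above by adding and subtracting exponent vectors:

# Represent [X] by its exponents on the fundamental units (L, M, T, Q).
unit <- function(L = 0, M = 0, T = 0, Q = 0) c(L = L, M = M, T = T, Q = Q)
length_u <- unit(L = 1)
mass_u   <- unit(M = 1)
time_u   <- unit(T = 1)
# Products of quantities add exponent vectors; quotients subtract them.
velocity <- length_u - time_u        # L T^-1
accel    <- length_u - 2 * time_u    # L T^-2
force    <- mass_u + accel           # M L T^-2
pressure <- force - 2 * length_u     # M L^-1 T^-2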
Physical Equations
Physical scientists like to express the deterministic relationship among a collection of
physical quantities through a physical equation of general form:
f(X_1, X_2, X_3, \ldots, X_n) = 0.    (2)
This equation can be regarded (in the usual way) as a relationship among variables for which
the numerical values must be related:
f(v(X_1), v(X_2), v(X_3), \ldots, v(X_n)) = 0.    (3)
But it can also be regarded as a statement about the physical units involved:
f([X_1], [X_2], [X_3], \ldots, [X_n]) = [f]    (4)
where the right side identifies the physical units attached to the quantity for which the value
must be zero. For example, if X1 is the length and X2 the width of a rectangular region,
each expressed in units of feet, and X3 is the area of that rectangle in units of square-feet,
f (X1 , X2 , X3 ) can be evaluated numerically or with units as:
X_1 \times X_2 - X_3 = 0
\text{feet} \times \text{feet} - \text{feet}^2 = \text{feet}^2    (5)
The Dimensional Homogeneity Principle (apparently discussed first in some form by
Fourier) actually says two things:
1. The terms of a physical equation must have units that are monomials of fundamental
physical units, e.g.
[\text{any term}] = F_1^{b_1} F_2^{b_2} F_3^{b_3} \cdots F_r^{b_r}    (6)
with one factor for each fundamental unit. If b1 = b2 = b3 = ... = br = 0, this means
the term is unitless, and for such terms we write [-]=1.
2. All terms in a physical equation must have the same units.
The second of these statements is perhaps the more immediately intuitive; it says you cannot add or subtract quantities that express fundamentally different physical entities, e.g. no addition of length to area, or area to mass. The first is a more fundamental statement
about what is required of a physical quantity; it essentially implies that every term in a
physical equation must be comprised of products and quotients of integer powers of physical
quantities, perhaps multiplied by a constant. Taken together, they imply that every physical
equation can be written in terms that are unitless, because statement (1) requires that
f(X_1, X_2, X_3, \ldots, X_n) = \sum_{i} \alpha_i X_1^{c_{i,1}} X_2^{c_{i,2}} X_3^{c_{i,3}} \cdots X_n^{c_{i,n}} = 0    (7)
for some number of terms, and statement (2) implies that the equation can be made unitless
by dividing each term by any one of them, e.g.
f'(X_1, X_2, X_3, \ldots, X_n) = 1 + \sum_{i \ge 2} \frac{\alpha_i}{\alpha_1} X_1^{c_{i,1}-c_{1,1}} X_2^{c_{i,2}-c_{1,2}} X_3^{c_{i,3}-c_{1,3}} \cdots X_n^{c_{i,n}-c_{1,n}} = 0.    (8)
Physical equations with unitless terms have the nice property that they are true for any
(absolute) system of physical units. For example, if length is involved in such an equation,
expressions of length in each term must be raised to the same power in both the numerator
and denominator, so that the units of length "cancel out." This being the case, the equation can be written or evaluated with either "feet" or "yards" used as the fundamental unit of length, because the factor of 3 used for conversion between them also "cancels out" of each term. Given that one writes unitless equations, the units of individual quantities can
then be expressed in generic form (e.g. L, M, T, and Q, et cetera).
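The homogeneity requirement can also be checked mechanically: if each term of an equation carries its own exponent vector, those vectors must all agree. The following sketch (an illustration added here, reusing the hypothetical exponent-vector helper from above) checks the rectangle example of equation (5):

# Dimensional homogeneity check: every term must have the same exponent vector.
unit <- function(L = 0, M = 0, T = 0, Q = 0) c(L = L, M = M, T = T, Q = Q)
term1 <- unit(L = 1) + unit(L = 1)   # [X1 * X2] = feet x feet  -> L^2
term2 <- unit(L = 2)                 # [X3]      = square feet  -> L^2
all(term1 == term2)                  # TRUE: every term of X1*X2 - X3 = 0 has the same units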
Buckingham’s Pi Theorem
Consider a physical equation expressed in absolute units:
f(X_1, X_2, X_3, \ldots, X_n) = 0,    (9)
where
[X_j] = F_1^{a_{1,j}} F_2^{a_{2,j}} F_3^{a_{3,j}} \cdots F_r^{a_{r,j}}.    (10)
Now define an r-by-n matrix A with integer elements {A}i,j = ai,j . We assume for the
moment that n > r. Let the rank of A be denoted r0 , and note that r0 ≤ r. Suppose that we
have assigned subscripts to the X’s so that the first r0 columns of A are linearly independent.
As a result, columns r0 + 1 through n of A can each be written as a linear combination of
columns 1 through r0 . It follows that f can be rewritten as
F(X_1, X_2, X_3, \ldots, X_{r_0}, \pi_1, \pi_2, \pi_3, \ldots, \pi_{n-r_0}) = 0,    (11)
where each π is a unitless product/quotient of integer powers of the X's, i.e. [π] = 1. For example, if r_0 = 2, X_1 has units of F_1, X_2 has units of F_2, and X_3 has units of F_1 F_2^{-2}, then π_1 can be defined as X_1^{-1} X_2^{2} X_3, and F defined by multiplying π_1 by X_1 X_2^{-2} where X_3 appears in f.
Buckingham’s “Pi Theorem” (Buckingham, 1914) states that, in fact, F can be written
as an equation in only π1 through πn−r0 , i.e. that X1 through Xr0 can be eliminated. To see
why this is true, consider rewriting F for every specific set of values of X1 through Xn in
the following way. For any given situation, invent a new system of absolute units for which
v(X1 ) = v(X2 ) = v(X3 ) = ... = v(Xr0 ) = 1 (or any other constant). In each case, this means
X1 through Xr0 are irrelevant, but π1 through πn−r0 are unaffected. As a result, the physical
function can be rewritten as:
F(\pi_1, \pi_2, \pi_3, \ldots, \pi_{n-r_0}) = 0.    (12)
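In matrix terms, a product X_1^{e_1} \cdots X_n^{e_n} is unitless exactly when its exponent vector e lies in the null space of A, so the number of independent π's is n − r_0. A minimal base-R sketch (illustrative only; the function names are hypothetical):

# Number of unitless pi's implied by Buckingham's theorem: n - r0, where r0 = rank(A).
n_pis <- function(A) ncol(A) - qr(A)$rank
# A product X1^e1 * ... * Xn^en is unitless exactly when A %*% e = 0.
is_unitless <- function(A, e) all(A %*% e == 0)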
Physical scientists have long used this result as a “dimension reduction” technique, to
reduce the number of variables that must be simultaneously analyzed in an experiment
to determine the underlying physical law. A popular example is the relationship between
physical quantities involved in describing the motion of a simple pendulum:
period of the pendulum t:        [t] = T
length of the pendulum l:        [l] = L
mass of the pendulum m:          [m] = M
acceleration due to gravity g:   [g] = LT^{-2}
(horizontal) amplitude a:        [a] = L
where in this simple modeling exercise, the mass is assumed to be concentrated at the bottom
of the pendulum. Writing the matrix A with rows corresponding to T, L, and M (Q isn't involved), and columns corresponding to t, l, m, g, and a, we have:


A = \begin{pmatrix} 1 & 0 & 0 & -2 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix}    (13)
Using Buckingham’s result, we begin by stating that the relation governing these physical
quantities can be written as
F(t, l, m, \pi_1, \pi_2) = 0

and in this case we can quickly determine that:

\pi_1 = t^2 l^{-1} g, \qquad \pi_2 = a l^{-1}.
Note that m, the mass of the pendulum, is not involved, and therefore not needed in the final form of the equation:

F(t^2 l^{-1} g, \; a l^{-1}) = 0,

that is, the physical equation reduces to a statement that these two arguments must constitute a zero of some unknown functional form. If a pendulum of a given length can have only one period, there can be only one zero of F, and experimentation can be used to establish that unless al^{-1} is large its effect is negligible, and that t^2 l^{-1} g must be 4\pi^2 for any simple pendulum. The most important thing to realize is that this experiment would be performed in two independent variables, rather than the five variables initially noted. (In any case, g would be very hard to manage experimentally, but \pi_1 can be controlled through changes in t and/or l.) This requires much less effort than investigating the general relationship that might exist among the four original variables.
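This calculation is easy to verify numerically; the sketch below (a check added for illustration, not part of the original analysis) builds the matrix A of equation (13) and confirms that the exponent vectors of \pi_1 = t^2 l^{-1} g and \pi_2 = a l^{-1} lie in its null space:

# Columns of A: t, l, m, g, a; rows: T, L, M, as in equation (13).
A <- rbind(T = c(1, 0, 0, -2, 0),
           L = c(0, 1, 0,  1, 1),
           M = c(0, 0, 1,  0, 0))
qr(A)$rank                   # 3, so there are 5 - 3 = 2 unitless pi's
A %*% c(2, -1, 0, 1, 0)      # exponents of t^2 l^-1 g : all zeros, so pi1 is unitless
A %*% c(0, -1, 0, 0, 1)      # exponents of a l^-1     : all zeros, so pi2 is unitless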
Statistical Modeling
Think now about a "standard" set-up for a regression experiment. We have a response variable y, thought of as a "noisy" measurement of a "true" value Y, and predictors X_1, X_2, \ldots, X_p. Thinking of the noiseless law relating Y to the X's, we are in search of how
these variables must be related so as to satisfy
f(Y, X_1, X_2, \ldots, X_p) = 0    (14)
for an unknown function f . It is convenient here to label independent variables so that the
units of X1 through Xr0 constitute a “basis” for all p + 1 physical quantities, i.e. that the
units of all variables can be expressed as products/quotients of integer powers of these. (r0
again represents the rank of the matrix A.) Then Buckingham’s theorem implies that the
physical law can be written as
F(X_1, \ldots, X_{r_0}, \pi_1, \ldots, \pi_{p+1-r_0}) = 0.    (15)
But we can exert a “choice” at this point that helps us toward our regression model. Specifically, let π1 be a function only of X1 through Xr0 and Y , π2 be a function only of X1 through
Xr0 and Xr0 +1 , et cetera. That is, arrange variables so that each of Y , Xr0 +1 , Xr0 +2 , ... , Xp
is involved in exactly one dimensionless π. Buckingham’s theorem then allows us to write
an equivalent expression:
F(\pi_1, \ldots, \pi_{p+1-r_0}) = 0.    (16)
If we insist that only one value of Y (noiseless response) can follow from any specific values of
X1 through Xp , this implies that there can be only one value of π1 that satisfies the equation
once π2 through πp+1−r0 are specified, i.e. that the relationship can be inverted to the form:
\pi_1 = G(\pi_2, \ldots, \pi_{p+1-r_0})    (17)
Finally, let a be the integer power to which Y is raised in \pi_1, and let \rho_1 denote the collection of all factors in \pi_1 except for Y^a, i.e. \pi_1 = \rho_1 Y^a. Then:
Y = [\rho_1^{-1} G(\pi_2, \ldots, \pi_{p+1-r_0})]^{1/a}    (18)
leading to a regression model
y = \rho_1^{-1/a} G'(\pi_2, \ldots, \pi_{p+1-r_0}) + \varepsilon    (19)
That is, we have a regression model in r0 fewer predictors than would have been used in the
original formulation.
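A small synthetic sketch of this recipe may help fix ideas; everything below (the three predictors, their assumed units, and the "law" generating Y) is hypothetical and chosen only to show the mechanics of fitting on the π scale and transforming back, as in equations (17)-(19):

set.seed(1)
n  <- 50
X1 <- runif(n, 1, 2)                 # hypothetical predictor, units L
X2 <- runif(n, 1, 2)                 # hypothetical predictor, units T
X3 <- runif(n, 1, 2)                 # hypothetical predictor, units L T^-1
# Hypothetical noisy "law" for Y (units L T^-1), expressible through unitless ratios:
Y  <- (X1 / X2) * (1 + 0.5 * X3 * X2 / X1) + rnorm(n, sd = 0.05)
pi1  <- Y  * X2 / X1                 # unitless; contains the response (a = 1, rho1 = X2/X1)
pi2  <- X3 * X2 / X1                 # unitless predictor
fit  <- lm(pi1 ~ pi2)                # regression on the unitless scale, as in (17)
Yhat <- predict(fit) * X1 / X2       # back-transform to the original units, as in (18)-(19)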
Example
Péan et al. (1998) described experiments to investigate the effects of several process
variables on the encapsulation yield of nerve growth factor in poly(D,L-lactide-co-glycolide)
(abbreviated PLGA) biodegradable microparticles, which may have substantial potential
in the development of drug delivery systems for specific target cell populations. In one
experiment, a unitless response variable reflecting encapsulation efficiency was measured over
16 experimental trials in which 10 controlled variables were varied according to a regular,
2-level fractional factorial design. The two values (and corresponding units) used for each
controlled variable are listed in Table 1. While the authors reported these values in commonly
used laboratory units, some of them have been converted here for consistency, using seconds
(s) as the basic unit of time, milliliters (ml) of volume, and milligrams (mg) of weight or
mass. The experimental design (in coded controlled variables) and responses are listed in
Table 2. Note that 5 of the 16 values recorded for the response are 5.0 (reported by Péan
et al as ‘5’, but extended here to match the precision given for other data values); while no
comment about this is given by the authors, the most likely explanations suggest that the
standard assumptions generally made in regression analysis may not exactly hold.
The authors used these data to screen the ten controlled variables by fitting a first-order
(main effects) regression model. They note that by their analysis, controlled variables 4, 6,
1, and 2 have the most important effects; these are the terms for which the associated t-test
p-values are less than 0.05. The authors also note that variables 3 and 5 may have smaller
effects; these terms have associated p-values of between 0.05 and 0.10. Three of the remaining
4 terms have p-values of less than 0.20, so it is difficult to make a firm conclusion regarding
which variables should be ignored in further experimentation. Using the step command of
R, a stepwise regression was performed on these data beginning with the null model, limited
to ten main effects terms and intercept in the full main effects model, and using the direction
= “both” option so that terms can be iteratively added or removed so as to minimize the
value of the Akaike Information Criterion (AIC); details are given in the Appendix. The
resulting model contained the intercept and all 10 main effects (11 parameters, AIC = 78.42, \sqrt{MSE} = 10.43, R^2 = 0.9557, adjusted R^2 = 0.8671), again suggesting that while p-values
for some individual terms are greater than 0.05, it is difficult to confidently screen out any
of the terms in this model. Depending on “where the line is drawn,” a follow-up experiment
in anywhere from 4 controlled variables (if everything not significant at the 0.05 level is eliminated) to 6 controlled variables (if a p-value of 0.10 were used as a cut-off), or even more (to include "suggestive" variables), might be necessary.
As an alternative to this analysis, the foregoing discussion suggests that an analysis based on 7 unitless controlled variables (10 original variables, minus the 3 fundamental units of time, length, and mass) might be considered. Table 3 displays seven independent, unitless ratios of the original 10 controlled variables; X4 was the only original variable defined on a unitless scale and it is retained as π1. Note that the selection of these unitless variables is not unique (and is in fact quite arbitrary in the case of this demonstration). For example, any product or ratio of these quantities is also a unitless function of the X's. With this in mind, we again used stepwise regression (in the same manner described above, see the Appendix) to model the response variable as a linear regression of π1 through π7, beginning the fit with a null model,
but allowing the algorithm to consider any term defined as the product of any number of π’s
(e.g. interactions through order 7). While this is not all possible unitless functions of the X’s
(again, because ratios of π’s are also unitless), it is a very large collection of terms relative to
the size of the data set. In this case, the algorithm stopped when 6 unitless terms had been
added. (Despite the fact that the direction = “both” option was used, the algorithm did
not delete terms at any iteration; once admitted to the model, no terms were subsequently
removed.) Even though this model contains fewer terms than the full first-order polynomial
in the original variables, the quality of fit to the 16 data values is comparable (7 parameters, AIC = 77.94, \sqrt{MSE} = 9.834, R^2 = 0.9291, adjusted R^2 = 0.8819). Furthermore, this
model is expressed as a function of only 4 unitless variables, including main effects for π1 , π3 ,
π4 , and π5 ; and “two-factor interactions”, or products representing other unitless variables
for π1 × π4 and π3 × π5 . Because the fit of the regression to these 4 unitless variables is
essentially as good as that for the full main-effects model in the 10 original X’s, it might be
argued that a reasonable follow-up experiment could be conducted in 4 dimensions. Note
that this involves varying 6 of the original variables (2, 3, 4, 5, 6, and 8, which include all but one, X1, of the variables that appeared to be most effective as predictors in the first analysis), but only in combinations chosen so as to construct a good experimental design in π1, π3, π4, and π5.
An additional interesting point about the regression model in unitless variables just described is that the p-value of the intercept, computed in the context of the 7-parameter model, is 0.19, suggesting that this model term may be unnecessary. Exploring this further, the stepwise regression exercise was repeated for the 7 unitless π's and all their interactions, but beginning with a null model and including the intercept among the collection of terms that could be added. In this case, the algorithm continued to add terms until the model was saturated (16 terms), which is perhaps not surprising since there are so many candidate terms available. But the 7-parameter model identified in this run fit the data values even better (7 parameters, AIC = 65.89, \sqrt{MSE} = 6.749), and included terms for π1, π3, π4, π5, π7, π1 × π4, and π5 × π7.
Conclusion
The initial response of some to the basic idea of DA is that it seems too much like "magic", because it appears to offer dimension reduction without problem-specific information or data. In fact, as with a priori simplification of a statistical model, DA is explicitly based on assumptions, in this case the fundamental assumption of dimensional homogeneity. These principles are firmly established in physical theory and experience, but it should be remembered that they provide the starting point for Buckingham's theorem.
There is one other point that should be explicitly made. The “axioms” that lead to
DA are themselves tacitly built on the idea that the relationship to be modeled is complete,
i.e. that all relevant variables are included in the development. If this is not true, it is
hard to understand how a “physical law” can be contemplated. This, in turn, suggests that
DA is primarily relevant in what might be called “closed systems”, where the influence of
uncontrolled factors is eliminated or at least minimized.
In principle, DA is applicable in any application for which the dimensional homogeneity
principle holds, and which is “closed” in the above sense. These conditions (especially the
second) are probably most often applicable when studies are done under tight experimental
control (i.e. laboratory work).
References

Albrecht, M.C., C.J. Nachtsheim, T.A. Albrecht, and R.D. Cook (2013). "Experimental Design for Engineering Dimensional Analysis," with discussion, Technometrics 55, 257-295.

Buckingham, E. (1914). "On Physically Similar Systems: Illustrations of the Use of Dimensional Equations," Phys. Rev. 4, 345-376.

Palmer, A.C. (2007). Dimensional Analysis and Intelligent Experimentation, World Scientific, New Jersey.

Péan, J.M., M.C. Venier-Julienne, R. Filmon, M. Sergent, R. Phan-Tan-Luu, and J.P. Benoit (1998). "Optimization of HSA and NGF Encapsulation Yields in PLGA Microparticles," International Journal of Pharmaceutics 166, 105-115.
Table 1: Controlled Variables in the Experiment of Péan et al.

      Controlled Variable                                  Units         Lower Value   Upper Value
X1    Volume of internal aqueous phase                     mg                0.2           0.6
X2    Concentration of HSA in internal aqueous phase       mg/ml            12.5          25.0
X3    Quantity of PLGA in organic phase                    ml               50            100
X4    Concentration of CMC Na in internal aqueous phase    %, unitless       0             1
X5    Ultrasonic time                                      s                 5             15
X6    Mannitol in internal aqueous phase                   mg/ml             0             10
X7    Volume of external aqueous phase                     ml               30             70
X8    Emulsification time                                  s                60            300
X9    Volume of extracting aqueous phase                   ml              150            400
X10   Time of extraction                                   s               120            600
Table 2: Experimental Design and Unitless Response in the Experiment of Péan et al.

Controlled Variables (coded)          Response
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
−  −  −  −  −  +  −  +  −  −            48.6
+  −  −  +  −  −  +  +  −  +             5.0
−  +  −  +  −  +  −  −  +  +             5.0
+  +  −  −  −  −  +  −  +  −            36.7
−  −  +  +  −  +  +  +  +  −             9.6
+  −  +  −  −  −  −  +  +  +            61.1
−  +  +  −  −  +  +  −  −  +            32.8
+  +  +  +  −  −  −  −  −  −             5.0
−  −  −  −  +  −  +  −  +  +            85.4
+  −  −  +  +  +  −  −  +  −             5.0
−  +  −  +  +  −  +  +  −  −             9.7
+  +  −  −  +  +  −  +  −  +            16.5
−  −  +  +  +  −  −  −  −  +            77.0
+  −  +  −  +  +  +  −  −  −            39.2
−  +  +  −  +  −  −  +  +  −            68.4
+  +  +  +  +  +  +  +  +  +             5.0
Table 3: Unitless Re-expression of Controlled Variables for the Experiment of Péan et al.

π1 = X4
π2 = X1/(X2 × X3)
π3 = X6/X2
π4 = X3/X7
π5 = X5/X8
π6 = X7/X9
π7 = X8/X10
Appendix
# Data from Pean et al:
# 16 runs; columns 1-10 are the controlled variables in laboratory units,
# columns 11-12 are the two unitless responses (Y1, Y2).
D1 <- matrix(c(
  200, 1.25,  50, 0,  5, 1, 30, 5, 150,  2, 48.6, 22,
  600, 1.25,  50, 1,  5, 0, 70, 5, 150, 10,  5.0, 31,
  200, 2.50,  50, 1,  5, 1, 30, 1, 400, 10,  5.0, 24,
  600, 2.50,  50, 0,  5, 0, 70, 1, 400,  2, 36.7, 38,
  200, 1.25, 100, 1,  5, 1, 70, 5, 400,  2,  9.6, 38,
  600, 1.25, 100, 0,  5, 0, 30, 5, 400, 10, 61.1, 26,
  200, 2.50, 100, 0,  5, 1, 70, 1, 150, 10, 32.8, 34,
  600, 2.50, 100, 1,  5, 0, 30, 1, 150,  2,  5.0, 28,
  200, 1.25,  50, 0, 15, 0, 70, 1, 400, 10, 85.4, 29,
  600, 1.25,  50, 1, 15, 1, 30, 1, 400,  2,  5.0, 19,
  200, 2.50,  50, 1, 15, 0, 70, 5, 150,  2,  9.7, 27,
  600, 2.50,  50, 0, 15, 1, 30, 5, 150, 10, 16.5, 21,
  200, 1.25, 100, 1, 15, 0, 30, 1, 150, 10, 77.0, 24,
  600, 1.25, 100, 0, 15, 1, 70, 1, 150,  2, 39.2, 51,
  200, 2.50, 100, 0, 15, 0, 30, 5, 400,  2, 68.4, 25,
  600, 2.50, 100, 1, 15, 1, 70, 5, 400, 10,  5.0, 43
), ncol = 12, byrow = TRUE)
# Changed to homogeneous units of time (s), mass (mg), and volume (ml)
U1  <- D1[,1]/(10^3)   # recorded in micro (mu) grams, converted to mg
U2  <- D1[,2]*10       # recorded in w/v (grams per 100 ml), converted to mg/ml
U3  <- D1[,3]          # recorded in ml
U4  <- D1[,4]          # unitless
U5  <- D1[,5]          # recorded in seconds
U6  <- D1[,6]*10       # recorded in w/v (grams per 100 ml), converted to mg/ml
U7  <- D1[,7]          # recorded in ml
U8  <- D1[,8]*60       # recorded in minutes, converted to seconds
U9  <- D1[,9]          # recorded in ml
U10 <- D1[,10]*60      # recorded in minutes, converted to seconds
Y1  <- D1[,11]         # unitless
Y2  <- D1[,12]         # unitless
# Stepwise modeling using original (scaled) variables:
fit1lower<-lm(Y1~1)
fit1upper<-lm(Y1~U1+U2+U3+U4+U5+U6+U7+U8+U9+U10)
summary(fit1upper)
step(fit1lower,
scope=list(lower=fit1lower,upper=fit1upper),
direction="forward",trace=1)
#Transformation to unitless Pi’s
Pi1 <- U4
Pi2 <- U1/(U2*U3)
Pi3 <- U6/U2
Pi4 <- U3/U7
Pi5 <- U5/U8
Pi6 <- U7/U9
Pi7 <- U8/U10
# Stepwise modeling using unitless variables:
fit2lower<-lm(Y1~1)
fit2upper<-lm(Y1~Pi1*Pi2*Pi3*Pi4*Pi5*Pi6*Pi7)
summary(fit2upper)
step(fit2lower,
scope=list(lower=fit2lower,upper=fit2upper),
direction="both",trace=1)