MEASUREMENT THEORY AND ITS APPLICATIONS FRED S. ROBERTS RUTGERS UNIVERSITY 1 MEASUREMENT Measurement has something to do with numbers. Representational theory of measurement: Assign numbers to “objects” being measured in such a way that certain empirical relations are “preserved.” In measurement of temperature, we preserve a relation “warmer than.” In measurement of mass, we preserve a relation “heavier than.” 2 A: set of objects R : binary relation on A aRb $ a is warmer t han b aRb $ a is heavier t han b: f :A ! R aRb $ f (a) > f (b): R could be preference. Then f is a utility function (ordinal utility function). In mass, there is more going on: There is an operation of a±b combination of objects and mass is additive: means a combined with b: f (a ± b) = f (a) + f (b): 3 Homomorphisms Relational System: A = (A; R 1 ; R 2 ; :::; R p ; ±1 ; ±2 ; :::; ±q ) A set R±ii relation on A (not necessarily binary) operation on A (binary) A Homomorphism from 0 ; ±0 ; ±0 into B = (B ; R 0 ; R 0 ; :::; R ; :::; ±0 ) 1 2 p 1 2 q f :A ! B is a function such that (a1 ; a2 ; :::; am ) 2 R i $ i (f (a1 ); f (a2 ); :::; f (am )) 2 R i0 i f (a ±j b) = f (a) ±0j f (b): 4 (This only makes sense if Ri and R0 i are both m i ¡ ar y:) ) Examples of Homomorphisms Example 1: a´ b Let a mod 3 be the number b in {0,1,2} such that (mod 3). Let A = {0,1,2,…,26} and aRb $ a mod 3 > b mod 3. (R;>): Then f(a) = a mod 3 is a homomorphism from (A,R) into f(11) = 2, f(6) = 0, ... 5 Example 2: A = {a,b,c,d} R = {(a,b),(b,c),(a,c),(a,d),(b,d),(c,d)} f(a) = 4, f(b) = 3, f(c) = 2, f(d) = 1. (R;>): f is a homomorphism from (A, R) into Second homomorphism: g(a) = 10, g(b) = 4, g(c) = 2, g(d) = 0. Example 3: A = {a,b,c}, R = {(a,b),(b,c),(c,a)} f : (A;R) ! (R;>): There is no homomorphism aRb! f (a) > f (b); bRc! f (b) > f (c) Therefore, f(a) > f(c): But cRa f(c) > f(a). 6 Example 4: A = {0,1,2,…}, B = {0,2,4,…} A homomorphism f : (A;>;+) ! (B;>;+) : f(a) = 2a a > b $ f (a) > f (b) f(a + b) = f(a) + f(b). Second homomorphism: g(a) = 4a: A homomorphism does not have to be onto. Example 5: f : (R;>;+) ! (R+;>;£) : A homomorphism f(a) = e!a f (a) > f (b) a > bà f (a + b) = ea+ b = ea eb = f (a) £ f (b): 7 Temperature Measurement: homomorphism (A;R) ! (R;>) Mass Measurement: (A;R;±) ! (R;>;+): homomorphism Since negative numbers do not arise(R in+the case of mass, we can think ;>;+): of the homomorphism as being into The Two Problems of Representational Measurement Theory A Representation Problem: Find conditions on (necessary) and A sufficient for the existence of a homomorphism from into . Subproblem: Find an algorithm for obtaining the homomorphism. 8 A A If f is a homomorphism from into , we call ( , ,f) a scale. A representation theorem is a solution to the representation problem. A Uniqueness Problem: Given a homomorphism from unique is it? into , how A uniqueness theorem is a solution to the uniqueness problem. A representation theorem gives conditions under which measurement can take place. A uniqueness theorem is used to classify scales and to determine what assertions arising from scales are meaningful in a sense we shall make precise. 9 The Theory of Uniqueness Admissible Transformations An admissible transformation sends one acceptable scale into another. Cent igrade ! F ahr enhei t K ilograms ! P ounds In most cases one can think of an admissible transformation as defined on the range of a homomorphism. A Suppose f is a homomorphism from into . ' ±f ' : f (A) ! B is called an admissible transformation of f if is A again a homomorphism from into . 10 Cent igrade ! F ahr enhei t : ' (x) = 9 x + 32 5 K ilograms ! P ounds : ' (x) = 2:2x Example: (R;>) ! (R;>) by f (x) = 2x: It is a homomorphism: a > b $ f (a) > f (b): ' : f (R ) ! R by ' (x) = 2x + 10: is admissible: ' ± f (x) = 4x + 10: a > b $ ' ± f (a) > ' ± f (b): 11 Can every homomorphism g be obtained from a homomorphism f by an admissible transformation of scale? No. (Semiorders) A = f 0; 1=2; 10g; aRb $ a > b+ 1: B = f 0; 1=2; 10g; aSb $ a > b+ 1: Two homomorphisms (A,R) into (B,S): f(0) = f(1/2) = 0, f(10) = 10 g(0) = 0, g(1/2) = 1/2, g(10) = 10. g = ' ±f : ' : f (A) ! B There is no such that A homomorphism f is called irregular if there is a homomorphism g not attainable from f by an admissible transformation. 12 Irregular scales are unusual and annoying. We shall assume (until later) that all scales are regular. Most common scales of measurement are. In the case where all scales are regular, we can develop the theory of uniqueness by studying admissible transformations. A classification of scales is obtained by studying the class of admissible transformations associated with the scale. This defines the scale type. (S.S. Stevens) 13 Some Common Scale Types Class of Adm. Transfs. ' (x) = ®x, ® > 0 Scale Type Example ratio Mass Temp. (Kelvin) Time (intervals) Loudness (sones)? Brightness (brils)? interval Temp(F,C) Time (calendar) IQ tests (standard scores?) ' (x) = ®x + ¯, ® > 0 14 Class of Adm. Transfs. x ¸ y $ ' (x) ¸ ' (y) Scale Type Example ordinal Preference? Hardness Grades of leather, wool, etc. IQ tests (raw scores)? absolute Counting strictly increasing ' (x) = x 15 Theorem (Roberts and Franke 1976) If every homomorphism from A into is regular and f and g are homomorphisms, then f is a ratio scale iff g is a ratio scale. Same for interval, ordinal, absolute, etc. However: This fails in the irregular case. 16 Meaningful Statements In measurement theory, we speak of a statement as being meaningful if its truth or falsity is not an artifact of the particular scale values used. Assuming all scales involved are regular, one can use the following definition (due to Suppes 1959 and Suppes and Zinnes 1963). Definition: A statement involving numerical scales is meaningful if its truth or falsity is unchanged after any (or all) of the scales is transformed (independently?) by an admissible transformation. 17 If some scales are irregular, we have to modify the definition: Alternate Definition: A statement involving numerical scales is meaningful if its truth or falsity is unchanged after any (or all) of the scales is (independently?) replaced by another acceptable scale. We shall usually disregard the complication. (But it does arise in practical examples, for example those involving preference judgments under the semiorder model.) Even this alternative definition is not always applicable in some situations. However, it is a useful definition for a lot of situations and we shall adopt it. 18 “This talk will be three times as long as the previous talk.” Is this meaningful? We have a ratio scale (time intervals). (1) f(a) = 3f(b). This is meaningful' if f is a ratio scale. For, an admissible (x) = ®x, ® > 0 transformation is . We want (1) to hold iff (' ± f )(a) = 3(' ± f )(b): (2) But (2) becomes ®f (a) = 3®f (b): (3) (1) à ! (3) since®> 0: 19 “The high temperature today was three times the high temperature yesterday.” Meaningless. It could be true with Fahrenheit and false with Centigrade, or vice versa. In general: For ratio scales, it is meaningful to compare ratios: f(a)/f(b) > f(c)/f(d) For interval scales, it is meaningful to compare intervals: f(a) - f(b) > f(c) - f(d) For ordinal scales, it is meaningful to compare size: f(a) > f(b) 20 “I weigh 1000 times what the Statue of Liberty weighs.” Meaningful. It involves ratio scales. It is false no matter what the unit. Meaningfulness is different from truth. It has to do with what kinds of assertions it makes sense to make, which assertions are not accidents of the particular choice of scale (units, zero points) in use. 21 Average Performance Study two groups of individuals, machines, algorithms, etc. under different conditions. f(a) is the time required by a to finish a task. Data suggests that the average performance of individuals in the first group is better than the average performance of individuals in the second group. a1 , a2 , ..., an b1 , b2 , ..., bm (1) individuals in first group individuals in second group. 1 Xn 1 Xm f (ai ) > f (bi ) n m i= 1 i= 1 We are comparing arithmetic means. 22 Statement (1) is meaningful iff for all admissible transformations of scale , (1) holds iff 1 Xn 1 Xm ' ± f (ai ) > ' ± f (bi ) n m (2) i= 1 i= 1 Performance = length of time, a ratio scale. Thus, so (2) becomes Xn Xm 1 n (3) 1 ®f (ai ) > m i= 1 ®> 0 Then implies (1) $ (3) ' (x) = ®x, ® > 0 ®f (bi ) i= 1 . Hence, (1) is meaningful. 23 , Note: (1) is still meaningful if f is an interval scale. For example, the task could be to heat something a certain number of degrees and f(a) = temperature increase produced by a. Here, (4) ' (x) = ®x + ¯, ® > 0 . Then (2) becomes 1 Xn 1 Xm [®f (ai ) + ¯] > [®f (bi ) + ¯] n m i= 1 i= 1 This readily reduces to (1). However, (1) is meaningless if f is just an ordinal scale. 24 To show that comparison of arithmetic means can be meaningless for ordinal scales: Suppose f(a) is measured on an ordinal scale, e.g., 5-point scale: 5=excellent, 4=very good, 3=good, 2=fair, 1=poor. In such a scale, the numbers don't mean anything, only their order matters. Group 1: 5, 3, 1 average 3 Group 2: 4, 4, 2 average 3.33 (greater) 5! 100, 4! 75, 3! 65, 2! 40, 1! 30 Admissible transformation: . New scale conveys the same information. New scores: Group 1: 100, 65, 30 average 65 (greater) Group 2: 75, 75, 40 average 63.33 25 Thus, comparison of arithmetic means can be meaningless for ordinal data. Of course, you may argue that in the 5-point scale, at least equal spacing between scale values is an inherent property of the scale. In that case, the scale is not ordinal and this example does not apply. On the other hand, comparing medians is meaningful with ordinal scales: To say that one group has a higher median than another group is preserved under admissible transformations. 26 Importance Ratings/Performance Ratings Suppose each of n individuals is asked to rate each of a collection of alternative variables, people, machines, policies as to their relative importance. Or we rate the falternatives on different criteria or against i (a) different benchmarks. Let be the rating of alternative a by individual i (under criterion i). Is it meaningful to assert that the average rating of alternative a is higher than the average rating of alternative b? A similar question arises in performance ratings, loudness ratings, brightness ratings, confidence ratings, etc. (1) 1 Xn 1 Xn f i (a) > f i (b) n n i= 1 i= 1 27 If each (2) fi is a ratio scale, then we consider for > 0, 1 Xn 1 Xn ®f i (a) > ®f i (b) n n i= 1 i= 1 (1) à ! (2) Clearly, , so (1) is meaningful. f 1 , f 2 , ..., f n Problem: might have independent units. In thisf case, we i want to allow independent admissible transformations of the . Thus, we must consider (3) 1 Xn 1 Xn ®i f i (a) > ®i f i (b) n n i= 1 It is easy to see that there are (1) is meaningless. i= 1 ®i so that (1) holds and (3) fails. Thus, 28 Motivation for considering different f 1 (a) = wei ght ®i : f 2 (a) = hei ght n = 2, of a, of a. Then (1) says that the average of a's weight and height is greater than the average of b's weight and height. This could be true with one combination of weight and height scales and false with another. Conclusion: Be careful when comparing arithmetic mean importance or performance ratings. 29 In p this context, p it is safer to compare geometric means (Dalkey). n ¦ p n ¦ all n f (a) i= 1 i n i= 1 > n ¦ n f i (b) $ i= 1 p ®i f i (a) > n ¦ n ®i f i (b) i= 1 , ®i > 0 fi Thus, if each is a ratio scale, if individuals can change importance rating scales (performance rating scales) independently, then comparison of geometric means is meaningful while comparison of arithmetic means is not. 30 Application Roberts (1972, 1973). A panel of experts each estimated the relative importance of variables relevant to energy demand using the magnitude estimation procedure. If magnitude estimation leads to a ratio scale -- Stevens presumes this -- then comparison of geometric mean importance ratings is meaningful. However, comparison of arithmetic means may not be. Geometric means were used. 31 Magnitude Estimation by One Expert of Relative Importance for Energy Demand of Variables Related to Commuter Bus Transportation in a Given Region Variable Rel. Import. Rating 1. No. bus passenger mi. annually 80 2. No. trips annually 3. No. miles of bus routes 4. No. miles special bus lanes 5. Average time home to office 100 50 50 70 6. Average distance home to office 7. Average speed 8. Average no. passengers per bus 65 10 20 9. Distance to bus stop from home 10. No. buses in the region 11. No. stops, home to office 50 20 20 32 Performance Evaluation Fleming and Wallace (1986) make a similar point about performance evaluation of computer systems. A number of systems are tested on different benchmarks. Their scores on each benchmark are normalized relative to the score of one of the systems. The normalized scores of a system are combined by some averaging procedure. If the averaging is the arithmetic mean, then the statement that one system has a higher arithmetic mean normalized score than another system is meaningless: The system to which scores are normalized can determine which has the higher arithmetic mean. 33 However, if geometric mean is used, this problem does not arise (under reasonable assumptions about scores). f i (u) = performance of system u on benchmark i. F is a “combined score” such as arithmetic mean. F ( f 1 ( u ) , ..., f 1 (x) f n (u) ) f n (x) > F ( f 1 ( v) , ..., f 1 (x ) f n (v) ) f n (x) (normalized score relative to system x) F ( f 1 ( u ) , ..., f 1 (y) f n (u) ) f n (y) < F ( f 1 ( v ) , ..., f 1 (y) f n (v) ) f n (y) (normalized score relative to system y) 34 System BenchMark R M Z 1 417 244 134 2 83 70 70 3 66 153 135 4 39449 33527 66000 5 72 368 369 35 Scores Normalized to System R R M Z 1 1.00 0.59 0.32 2 1.00 0.84 0.85 3 1.00 2.32 2.05 4 1.00 0.85 1.67 5 1.00 0.48 0.45 Arith. Mean 1.00 1.01 1.07 36 Scores Normalized to System M R M Z 1 1.71 1.00 0.55 2 1.19 1.00 1.00 3 0.43 1.00 0.88 4 1.18 1.00 1.97 5 2.10 1.00 1.00 Arith. Mean 1.32 1.00 1.08 37 Geometric Means Geom. Mean of Scores Normalized Relative to R R 1.00 M 0.86 Z 0.84 Geom. Mean of Scores Normalized Relative to M R 1.17 M 1.00 Z 0.99 Similar situation if we test students on different benchmarks 38 How Should we “Average” or “Merge” Scores? Sometimes arithmetic means are not a good idea. Sometimes geometric means are. Are there situations where the opposite is the case? Or some other method is better? Can we lay down some guidelines about when to use what merging procedure? u = F (a1 ; a2 ; :::; an ) a1 , a2 , ..., an are scores. F is an unknown merging function. Approaches to finding acceptable merging functions: (1) axiomatic (2) scale types (3) meaningfulness 39 An Axiomatic Approach Theorem (Fleming and Wallace). Suppose following properties: F : Rn ! R+ + has the (1). Reflexivity: F(a,a,...,a) = a F (a1 ; a2 ; :::; an ) = F (a¼( 1) ; a¼( 2) ; :::; a¼( n ) ) (2). Symmetry: for all ¼ of f 1; 2; :::; ng permutations (3). Multiplicativity: F (a1 b1 ; a2 b2 ; :::; an bn ) = F (a1 ; a2 ; :::; an )F (b1 ; b2 ; :::; bn ): Then F is the geometric mean. And conversely. 40 A Functional Equations Approach Using Scale Type or Meaningfulness Assumptions Unknown function u = F (a1 ; a2 ; :::; an ): ai Luce's idea: If you know the scale types of the and the scale type of ai u and you assume that an admissible transformation of each of the leads to an admissible transformation of u, you can derive the form of F. Example: scale. a1 , a2 , ..., an are independent ratio scales, u is a ratio F : Rn ! R+ + 41 F (a1 ; a2 ; :::; an ) = u ! F (®1 a1 ; ®2 a2 ; :::; ®n an ) = ®u: ®1 > 0, ®2 > 0, ..., ®n > 0, ® > 0 and ® depends on ®1 , ®2 , ..., ®n . Thus, we get the functional equation: (*) F (®1 a1 ; ®2 a2 ; :::; ®n an ) = A(®1 ; ®2 ; :::; ®n )F (a1 ; a2 ; :::; an ); A(®1 ; ®2 ; :::; ®n ) > 0 F : Rn ¡ ! R+ + Theorem (Luce 1964): If is continuous and satisfies ¸ > 0, c1 , c2 , ..., cn (*), then there are so that F (a1 ; a2 ; :::; an ) = ¸ ac1 ac2 :::acn : 1 2 n 42 (Aczél, Roberts, Rosenbaum 1986): It is easy to see that the assumption of continuity can be weakened to continuity at a point, monotonicity, or boundedness on an (arbitrarily small, open) ndimensional interval or on a set of positive measure. Call any of these assumptions regularity. Theorem (Aczél and Roberts 1989): If in caddition F =satisfies = c = ::. cn = 1=n 1 2 reflexivity and symmetry, then = 1 and , so F is the geometric mean. Aside: Solution to (*) without Regularity (Aczél, Roberts, and Rosenbaum): F (a1 ; a2 ; :::; an ) = ¸ ¦ n M i (ai ) i= 1 M i > 0, M i (xy) = M i (x)M i (y), ¸ > 0 43 a1 , a2 , ..., an Second Example: unit; u is a ratio scale. are ratio scales, but have the same New Functional Equation: (¤¤)F (®a1 ; ®a2 ; :::; ®an ) = A(®)F (a1 ; a2 ; :::; an ); A(®) > 0 Regular Solutions to (**) (Aczél, Roberts, and Rosenbaum): F (a1 ; a2 ; :::; an ) = ac f ( 1 f > 0, arbitrary regular function on c arbitrary constant. a2 a , ..., n ) a1 a1 R n¡ 1 (n = 1: f is a positive constant) 44 Solutions to (**) without Regularity (Aczél, Roberts, and Rosenbaum): a2 an F (a1 ; a2 ; :::; an ) = M (a1 )f ( , ..., ) a1 a1 f > 0, arbit rary funct ion M > 0, M (xy) = M (x)M (y): Third Example: interval scale a1 , a2 , ..., an are independent ratio scales, u is an New Functional Equation: (***) F (®1 a1 ; ®2 a2 ; :::; ®n an ) = A(®1 ; ®2 ; :::; ®n )F (a1 ; a2 ; :::; an )+ B (®1 ; ®2 ; :::; ®n ); A(®1 ; ®2 ; :::; ®n ) > 0 45 Regular solutions to (***) (Aczél, Roberts, and Rosenbaum): F (a1 ; a2 ; :::; an ) = ¸ ¦ n ac i i= 1 i + b ¸ , b, c1 , c2 , ..., cn arbit rary const ant s OR Xn F (a1 ; a2 ; :::; an ) = ci loga + b i i= 1 b, c1 , c2 , ..., cn arbit rary const ant s. 46 Solutions to (***) without Regularity (Aczél, Roberts, and Rosenbaum): F (a1 ; a2 ; :::; an ) = ¸ ¦ n M i (ai ) + b i= 1 M i > 0, M i (xy) = M i (x)M i (y) ¸ , b arbit rary const ant s OR Xn F (a1 ; a2 ; :::; an ) = L i (ai ) + b i= 1 L i (xy) = L i (x) + L i (y) b arbit rary const ant . 47 Sometimes You Get the Arithmetic Mean a1 , a2 , ..., an Fourth Example: interval scales with the same unit and independent zero points; u an interval scale. Functional Equation: (****) F (®a1 + ¯1 ; ®a2 + ¯2 ; :::; ®an + ¯n ) = A(®; ¯1 ; ¯2 :::; ¯n )F (a1 ; a2 ; :::; an )+ B (®; ¯1 ; ¯2 ; :::; ¯n ); A(®; ¯1 ; ¯2 :::; ¯n ) > 0 Solutions to (****) without Regularity Assumed (Aczél, Roberts, and Xn Rosenbaum): F (a1 ; a2 ; :::; an ) = ¸ 1 , ¸ 2 , ..., ¸ n , b ¸ i ai + b i= 1 arbitrary constants 48 Note that all solutions are regular. Theorem (Aczél and Roberts): (1). P If in addition F satisfies reflexivity, then n i= 1 ¸ i = 1, b = 0: (2). in addition F satisfies reflexivity and symmetry, then ¸ i = If 1=n for all i, b = 0, i.e., F is the arithmetic mean. 49 Meaningfulness Approach While it is often reasonable to assume you know the scale type of the a1 , a2 , ..., an independent variables , it is not so often reasonable to assume that you know the scale type of the dependent variable u. However, it turns out that you can replace the assumption that the scale type of u is xxxxxxx by the assumption that a certain statement involving u is meaningful. 50 a1 , a2 , ..., an Back to First Example: are independent ratio scales. Instead of assuming u is a ratio scale, assume that the statement F (a1 ; a2 ; :::; an ) = kF (b1 ; b2 ; :::; bn ) a1 , a2 , ..., an , b1 , b2 , ..., bn is meaningful for all get the same results as before: and k > 0. Then we Theorem (Roberts and Rosenbaum 1986): Under these hypotheses and regularity, F (a1 ; a2 ; :::; an ) = ¸ ac1 ac2 :::acn : 1 2 n If in addition F satisfies reflexivity and symmetry, then F is the geometric mean. 51 MEASUREMENT OF AIR POLLUTION Various pollutants are present in the air: Carbon monoxide (CO), hydrocarbons (HC), nitrogen oxides (NOX), sulfur oxides (SOX), particulate matter (PM). Also damaging: Products of chemical reactions among pollutants. E.g.: Oxidants such as ozone produced by HC and NOX reacting in presence of sunlight. Some pollutants are more serious in presence of others, e.g., SOX are more harmful in presence of PM. Can we measure pollution with one overall measure? 52 To compare pollution control policies, need to compare effects of different pollutants. We might allow increase of some pollutants to achieve decrease of others. One single measure could give indication of how bad pollution level is and might help us determine if we have made progress. Combining Weight of Pollutants: Measure total weight of emissions of pollutant i over fixed period of time and sum over i. e(i,t,k) = total weight of emissions of pollutant i (per cubic meter) over tth time period and due to kth source or measured in kth location. X A(t; k) = e(i ; t; k) i 53 Early uses of this simple index A in the early 1970s led to the conclusion that transportation is the largest source of air pollution, accounting for over 50% of all pollution, with stationary fuel combustion (especially by electric power plants) second largest. Also: CO accounts for over half of all emitted air pollution. Are these meaningful conclusions? A(t; k) > A(t; k 0) X A(t; kr ) > X A(t; k) k6 = kr X X e(i ; t; k) > t ;k e(j ; t; k) t ;k j 6 =i All are meaningful if we measure all e(i,t,k) in same units of mass (e.g., milligrams per cubic meter) and so admissible transformation means multiply e(i,t,k) by same constant. 54 These comparisons are meaningful in the technical sense. But: Are they meaningful comparisons of pollution level in a practical sense? A unit of mass of CO is far less harmful than a unit of mass of NOX. EPA standards based on health effects for 24 hour period allow 7800 units of CO to 330 units of NOX. These are Minimum acute toxicity effluent tolerance factors (MATE criteria). Tolerance factor is level at which adverse effects are known. Let t(i) be tolerance factor for ith pollutant. Severity factor: t(CO)/t(i) or 1/t(i) 55 One idea (Babcock and Nagda, Walther, Caretto and Sawyer): Weight the emission levels (in mass) by severity factor and get a weighted sum. This amounts to using the indices Degree of hazard: 1 e(i ; t; k) t (i ) and the combined index Pindex: B (t; k) = P 1 e(i ; t; k) i t (i ) Under pindex, transportation is still the largest source of pollutants, but now accounting for less than 50%. Stationary sources fall to fourth place. CO drops to bottom of list of pollutants, accounting for just over 2% of the total. 56 These conclusions are again meaningful if all emission weights are measured in same units. For an admissible transformation multiplies t and e by the same constant and thus leaves the degree of hazard unchanged and pindex unchanged. Pindex was introduced in the San Francisco Bay Area in the 1960s. But, are comparisons using pindex meaningful in the practical sense? Pindex amounts to: for a given pollutant, take the percentage of a given harmful level of emissions that is reached in a given period of time, and add up these percentages over all pollutants. (Sum can be greater than 100% as a result.) 57 If 100% of the CO tolerance level is reached, this is known to have some damaging effects. Pindex implies that the effects are equally severe if levels of five major pollutants are relatively low, say 20% of their known harmful levels. Severity tonnage of pollutant i due to a given source is actual tonnage times the severity factor 1/t(i). Data from Walther 1972 suggests the following. Interesting exercise to decide which of these conclusions are meaningful. 58 1. HC emissions are more severe (have greater severity tonnage) than NOX emissions. 2. Effects of HC emissions from transportation are more severe than those of HC emissions from industry. (Same for NOX.). 3. Effects of HC emissions from transportation are more severe than those of CO emissions from industry. 4. Effects of HC emissions from transportation are more than 20 times as severe as effects of CO emissions from transportation. 5. The total effect of HC emissions due to all sources is more than 8 times as severe as total effect of NOX emissions due to all sources. 59 Social Networks and Blockmodeling In studying M = social (m i j ) networks, m i ja common approach is to start withi t ha matrix measures the friendship of the th jwhere individual for the . M is used to derive such conclusions as: * the degree to which friendship is propagated through the group (if k is a friend of j and j is a friend of i, is k a friend of i?) * the degree to which friendship is reciprocated (if j is a friend of i, is i a friend of j?) * grouping of individuals into friendship cliques or clusters or blocks. 60 Batchelder: m i j Unless attention is paid to the types of scales used to measure , conclusions about propagation, reciprocation, and grouping can be meaningless. Simple Example: -Transitivity The social network is called -transitive ( > 0) if [m i j ¸ ±]& [m j k ¸ ±] ! m i k ¸ ±, i 6 = k: 1 M = 1 2 3 2 3 10 8 4 12 1 2 is 5-transitive. 61 Suppose each individual makes judgments of friendship on a ratio scale. Even if all the ratio scales are the same, the conclusion of transitivity is meaningless: 1 ½M = 1 2 3 2 .5 2 3 5 1 4 6 is not 5-transitive. 62 How we present data can m i jdetermine the meaningfulness of conclusions. Suppose is not the numerical estimate of i's friendship for j, but the relative ranking by i of j among the other individuals. In the above example M, suppose we use only the ranks: n for first place, n-1 for second place, etc. We get 1 M' = 1 2 3 2 2 2 3 3 3 2 3 This is 2-transitive. Even if the scales are only ordinal, this M' is unchanged even when M changes. Thus, -transitivity of M' is a meaningful conclusion. (Batchelder) 63 Blockmodeling The process of grouping individuals into blocks is called blockmodeling. A widely used algorithm in blockmodeling is the CONCOR algorithm of Breiger, Boorman, and Arabie and Arabie, Boorman, and Levitt. In CONCOR, the matrix M is used to obtain a similarity measure between individuals i and j, using either column by column or row by row product moment correlations. The partition into blocks is based on the similarities. Batchelder: If the friendship rating scales are independent interval scales, then the column by column product moment correlation obtained from M can change. Thus, the blocks can also change. 64 Hence, the blocking conclusions from a procedure such as CONCOR can be meaningless. ci j E.g.: Suppose is the Pearson column by column c0j product moment correlation obtained from matrix M and suppose i is the same correlation obtained from M', where m 0j = ®i m i j + ¯i , ®i > 0: i Let 0 2 2 1 2 0 6 3 M= 2 4 0 5 1 1 3 0 ®1 = ®2 = ®3 = 1 ®4 = 3 ¯1 = ¯2 = ¯3 = ¯4 = 0 c12 = 1 If , , , then c0 2 = ¡ 1 and 1 . This is as dramatic a change as possible. 65 rij Note: Batchelder shows that if is the product moment correlation t h i j th between the and rows of M, then under the transformation m 0j = ®i m i j + ¯i , ®i > 0; i rij is invariant. Thus, conclusions from M based on row by row correlations are meaningful. 66 Meaningfulness of Statistical Tests (Marcus-Roberts and Roberts) For more than 40 years, there has been considerable disagreement on the limitations that scales of measurement impose on statistical procedures we may apply. The controversy stems from the foundational work of Stevens (1946, 1951, 1959, ...), who developed the classification of scales of measurement we have given and then provided rules for the use of statistical procedures which provided that certain statistics were inappropriate at certain levels of measurement. The application of Stevens' ideas to descriptive statistics has been widely accepted, but the application to inferential statistics has been labeled by some a misconception. 67 Descriptive Statistics P = population whose distribution we would like to describe We try to capture some of the properties of P by finding some descriptive statistic for P or by taking a sample S from P and finding a descriptive statistic for S. Our previous examples suggest that certain descriptive statistics are appropriate only for certain measurement situations. This idea was originally due to Stevens and was popularized by Siegel in his wellknown book Nonparametric Statistics (1956). 68 Our examples suggest the principle that arithmetic means are “appropriate” statistics for interval scales, medians for ordinal scales. The other side of the coin: It is argued that it is always appropriate to calculate means, medians, and other descriptive statistics, no matter what the scale of measurement. Frederic Lord: Famous football player example. “The numbers don't remember where they came from.” I agree: It is always appropriate to calculate means, medians, ... But: Is it appropriate to make certain statements using these descriptive statistics? 69 My position: It is usually appropriate to make a statement using descriptive statistics if and only if the statement is meaningful. Why? A statement which is true but meaningless gives information which is an accident of the scale of measurement used, not information which describes the population in some fundamental way. So, it is appropriate to calculate the mean of ordinal data, it is just not appropriate to say that the mean of one group is higher than the mean of another group. 70 Inferential Statistics Over the years, Stevens' ideas have also come to be applied to inferential statistics -- inferences about an unknown population P. They have led to such principles as the following: (1). Classical parametric tests (e.g., t-test, Pearson correlation, analysis of variance) are inappropriate for ordinal data. They should be applied only to data which define an interval or ratio scale. (2). For ordinal scales, non-parametric tests (e.g., Mann-Whitney U, Kruskal-Wallis, Kendall's tau) can be used. Not everyone agrees. Thus: Controversy 71 My View: The validity of a statistical test depends on a statistical model, which includes information about the distribution of the population and about the sampling procedure. The validity of the test does not depend on a measurement model, which is concerned with the admissible transformations and scale type. The scale type enters in deciding whether the hypothesis is worth testing at all -- is it a meaningful hypothesis? The issue is: If we perform admissible transformations of scale, is the truth or falsity of the hypothesis unchanged? 72 A Brief Analysis of These Principles Inferences about an unknown population P. Measurement model for P defines the scale type, i.e., tells admissible transformations. Null hypothesis H0 about P. Suppose an admissible transformation transforms scale values. ' (P ) = f ' (x) : x 2 P g: We get a new population We also get a transformed Hnull hypothesis : 0 all scales and scale values in ' (H 0 ) by applying to 73 Example: H0 : mean of P is 0. Measurement ' (x) = exmodel: P ordinal Then is admissible. ' (H 0 ) : ' (0) = e0 = 1: mean of (P) is Note that if Also, P = f ¡ 2, -1, 0, 1, 2g , then ' (P ) = f e¡ 2 , e¡ 1 , e0 , e1 , e2 g H0 and so is true. ' (H 0 ) is false. Thus, a null hypothesis can be meaningless. Summary of my position: The major restriction measurement theory places on inferential statistics is on what hypotheses are meaningful. We should be primarily concerned with testing meaningful hypotheses. 74 Can we test meaningless hypotheses? Sure. But I question what information we get outside of information about the population as measured. More details: Testing H0 about P : 1). Draw a random sample S from P. 2). Calculate a test statistic based on S. 3). Calculate probability that the test statistic is what was observed H0 given is true. H0 4). Accept or reject on the basis of the test. Note: Calculation of probability depends on a statistical model, which includes information about the distribution of P and about the sampling procedure. But, the validity of the test depends only on the statistical model, not on the measurement model. 75 Thus, you can apply parametric tests to ordinal data, provided the statistical model is satisfied, in particular if the data is normally distributed. (Thomas showed that ordinal data can be normally distributed; it is a misconception that it cannot be.) Where does the scale type enter? In determining if the hypothesis is worth testing at all. i.e., if it is meaningful. For instance, consider ordinal data and H 0 : T he mean is 0. The hypothesis is meaningless. But, if the data meets certain distributional requirements such as normality, we can apply a parametric test, such as the t-test, to check if the mean is 0. 76 The result gives us information about the population as we have measured it, but no “intrinsic” information about the population since its truth or falsity can depend on the particular choice of scale. So, what is wrong with using the t-test on ordinal data? Nothing. The test is o.k. However, the hypothesis is meaningless. Some Observations on Tests of Meaningful Hypotheses H0 meaningful. We can test it in various ways. We cannot expect H0 that two different tests of will lead to the same conclusion in all cases, even forgetting about change of scale. For example, the t-test can lead to rejection and the Wilcoxon test to acceptance. 77 But what happens if we apply the same test to data both before and after admissible transformation? The best situation is if the results of the test are invariantHunder admissible changes of scale. More precisely: Suppose H0 is a meaningful hypothesis about population P. Test 0by drawing a random sample S from P. Let (S) be the set of all (x) for x in S. We hope for a test T for H0 with the following properties: (a). The statistical model for T is satisfied by P and also by (P) for all admissible transformations as defined by the measurement model for P. H0 (b) T accepts with S at level of significance if and ' (H 0sample ) only if T accepts with sample (S) at level of significance . 78 There are many examples in the statistical literature where conditions (a) and (b) hold. We give two examples. Example 1: t-test Measurement model: interval scale H0 mean is ¹0 Statistical model: normal distribution Admissible transformations: (x) = kx+c, k>0 (a). Such take normal distributions into normal distributions 79 (b) The test statistic is p t = [X ¡ ¹ 0 ]=(s= n) where n = sample size, X = sample mean, s2 = sample variance. Transform by (x) = kx+c. New sample mean is ' (H 0 ) : mean is kX + c ; new sample variance is k 2 s2 : ' (¹ 0 ) = k¹ 0 + c: New test statistic for ' (H 0 ) : p [kX + c ¡ (k¹ 0 + c)]=(ks= n) = t: The test statistic is the same, so (b) follows. 80 Example 2: Sign Test Measurement model: ordinal scale H0 : median is M0 (meaningful) Statistical model: Literature of nonparametric statistics is unclear. Most books say continuity of probability density is needed. P r (x = M 0 )function = 0 In fact, it is only necessary to assume . (a) P r (x = M 0 ) = 0 $ P r (' (x) = ' (M 0 )) = 0 ( is strictly increasing) (Note that strictly increasing also take continuous data into continuous data, since there are only countably many gaps) 81 M0 (b) Test statistic: Number of sample points above or number M0 below , whichever is smaller. This test statisticMdoes not change 0 when is replaced by ' (M 0 ) a strictly increasing is applied to S and . Thus, (b) follows. What if Conditions (a) or (b) Fail? What if (a) holds, (b) fails? We might still find test T useful, just as we sometimes find it useful to apply different tests with possibly different conclusions to the same data. However, it is a little disturbing that seemingly innocuous changes in how we measure things could lead to different conclusions under the “same” test. 82 What if H0 is meaningful but (a) fails? This could happen Hif P violates the statistical model for T. We might 0 still be able to test using T: Try to find admissible such that ' (P ) satisfies the apply T to test H 0 statistical model H 0for T and 'then ' (H 0 ) (H 0 ) . Since is meaningful, holds iff , so T applied to H0 ' (P ) gives a test for . H0 For instance, if is meaningful, then even for ordinal data, we can seek a transformation which normalizes the data and allows us to apply a parametric test. 83 What if conditions (a) or (b) fail and we still apply test T to test H0 ? There is much empirical work on this, especially emphasizing the situation where (a) and (b) fail because we have an ordinal scale but we treat it like an interval scale. I don't think measurement theory has much to say about this situation. But good theoretical work does, and more is needed. 84 Summary of My Views on the Limitations on Statistical Procedures Imposed by Scales of Measurement (1). It is always “appropriate” to calculate means, medians, and other descriptive statistics. However, the key point is whether or not it is “appropriate” to make certain statements using these statistics. (2). If we want to make statements which capture something fundamental about the population being described, it is appropriate to make statements using descriptive statistics if and only if the statements are meaningful. 85 (3). The appropriateness of a statistical test of a hypothesis is just a matter of whether the population and sampling procedure satisfy the appropriate statistical model, and is not influenced by the properties of the measurement scale used. (4). However, if we want to draw conclusions about a population which say something basic about the population, rather than something which is an accident of the particular scale of measurement used, then we should only test meaningful hypotheses, and meaningfulness is determined by the properties of the measurement scale. 86 (5). A meaningful hypothesis can be tested by several different tests, which may or may not give the same conclusion even on the same data. It can also be tested by applying the same test to both given data and data transformed by an admissible transformation of scale, and in the best situation the two applications of the test will give the same conclusion. So long as the hypothesis is meaningful, it can also be tested by transforming the data by an admissible transformation and applying a test which might not be appropriate for the data in its initial form. 87 The First Representation Problem: Ordinal Utility Functions (R ; > ): Homomorphism from (A,R) into f :A ! R aRb $ f (a) > f (b) R = warmer than, louder than, preferred to, ... If preference, then f is an ordinal utility function. Definition: We say that (A,R) is a strict weak order if it satisfies the following two conditions: (1). Asymmetry: aRb ! ~bra ! (2). Negative Transitivity: ~aRb & ~bRc ~aRc 88 Example: $ aRb a is to the right of b Representation Theorem: Suppose A is a finite set. Then (A,R) is (R ; > ) homomorphic to iff (A,R) is a strict weak order. This says that our example with the 's is the most general strict weak order on a finite set. 89 Application: Preferences Among Composers i ; j ent ry = 1 i® i is preferred t o j Mo H Br Be W Ba Ma S Mo 0 1 1 0 0 0 1 1 H 0 0 1 0 0 0 1 0 Br 0 0 0 0 0 0 0 0 Be 1 1 1 0 1 0 1 1 W 0 1 1 0 0 0 1 1 Ba 1 1 1 0 1 0 1 1 Ma 0 1 1 0 0 0 0 0 S 0 1 1 0 0 0 1 0 Mo = Mozart, H = Haydn, Br = Brahms, Be = Beethoven, W = Wagner, Ba = Bach, Ma = Mahler, S = Strauss Is there an ordinal utility function? How does one tell if (A,R) is strict weak? 90 Proof of the Representation Theorem for A Finite: Necessity: straightforward Sufficiency: Let f (x) = jf y : xRygj: If (A,R) is strict weak, then f is a homomorphism. Suppose aRb. Then ~bRa. Hence, ~aRy implies ~bRy, i.e., bRy ¸ implies aRy. It follows that f(a) f(b). By asymmetry, ~bRb. Since aRb, it follows that f(a) > f(b). Suppose ~aRb. Then ~bRy implies ~aRy, so aRy implies bRy, so ¸ f(b) f(a), i.e., ~f(a) > f(b). 91 Comment: Given any (A,R), we can define f this way. Then, if there is a homomorphism at all, f is one. Application to our Example: Calculate row sums. Then f(x) = row sum of row x. If the matrix is rearranged so that the alternatives are listed in descending order of row sums (with arbitrary ordering in case of ties), then we can test if f is a homomorphism by checking that there are 1's in row x for all those y with f(y) < f(x). This should give a block of 1's in each row from some point to the end. 92 Mo H Br Be W Ba Ma S Row Sum Mo 0 1 1 0 0 0 1 1 4 H 0 0 1 0 0 0 1 0 2 Br 0 0 0 0 0 0 0 0 0 Be 1 1 1 0 1 0 1 1 6 W 0 1 1 0 0 0 1 1 4 Ba 1 1 1 0 1 0 1 1 6 Ma 0 1 1 0 0 0 0 0 1 S 0 1 1 0 0 0 1 0 3 Mo = Mozart, H = Haydn, Br = Brahms, Be = Beethoven, W = Wagner, Ba = Bach, Ma = Mahler, S = Strauss Be Rearranged Matrix Ba W Mo S H Ma Br Be 0 0 1 1 1 1 1 1 Ba 0 0 1 1 1 1 1 1 W 0 0 0 0 1 1 1 1 Mo 0 0 0 0 1 1 1 1 S 0 0 0 0 0 1 1 1 H 0 0 0 0 0 0 1 1 Ma 0 0 0 0 0 0 0 1 Br 0 0 0 0 0 0 0 0 93 Are the axioms reasonable? Prescriptive vs. descriptive. As descriptive axioms for warmer than or preferred to: Asymmetry: Seems ok for both. Negative transitivity: Seems ok for warmer than. However, for preference, consider the following case. We choose on the basis of price, unless prices are close, in which case we choose by quality. Suppose a is higher quality than b, which is higher quality than c, with a higher in price than b and b higher in price than c. If a and b are close in price and so are b and c, but a and c are not, then we would choose a over b and b over c, but c over a. Thus, ~cRb and ~bRa, but cRa. 94 Strict Simple Orders Let (A,R) be a strict weak order. Define E on A by $ aEb ~aRb & ~bRa (A,R) strict weak implies (A,E) is an equivalence relation. If R is interpreted as preference, then E is interpreted as indifference. We are indifferent between a and b iff we prefer neither to the other. Let A* be the set of equivalence classes. Define R* on A* by $ aR*b* aRb 95 Then, R* is well-defined and is strict weak. (A*,R*) is called the reduction of (A,R) by the equivalence relation E. (A*,R*) has no ties. A strict weak order with no ties allowed is called strict simple. (A*,R*) tells us how to order the equivalence classes. It can be thought of as the relation “strictly to the right of” on a set of points on a line: 96 Example: A = f 0; 1; 2; :::, 10g; $ aRb a mod 3 > b mod 3. Then (A*,R*) is given by: f 2; 5; 8gR ¤ f 1; 4; 7; 10gR ¤ f 0; 3; 6; 9g Formally, we say that a binary relation (A,R) is a strict simple order if it satisfies the following conditions: (1). Asymmetry ! (2). Transitivity: aRb aRc (8a&= 6 bRc b) (3). Completeness: (aRb or bRa) Theorem: If (A,R) is a strict weak order, then (A*,R*) is a strict simple order. 97 Theorem: If A is a finite set, then there is a 1-1 homomorphism (R ; > ) from (A,R) into iff (A,R) is a strict simple order. Weak Orders (R ; ¸ ) is not a strict simple order. It violates asymmetry: We can have aRb and bRa. Similarly, “weakly to the right of” on a set of points is not a strict weak order for the same reason. 98 This suggests that there should be concepts of weak order and simple order analogous to the concepts of strict weak order and strict simple order and bearing the same relation to these concepts as does to >. Here are the formal definitions: (A,R) is called a weak order if it satisfies the following conditions: (1). Transitivity (8a; b) (2). Strong Completeness: (aRb or bRa) Think of “weakly preferred to” or “at least as warm as”. 99 (A,R) is called a strict simple order if it satisfies: (1). Transitivity (2). Strong Completeness ! (3). Antisymmetry: aRb & bRa a = b. Theorem: If A is finite, then there is a homomorphism from (A,R) (R ; ¸ ) into iff (A,R) is a weak order. Theorem: If (R ;A¸ )is finite, then there is a 1-1 homomorphism from (A,R) into iff (A,R) is a simple order. 100 If (A,R) is a weak order, define a binary relation E on A by: aEb $ aRb & bRa Then (A,E) is an equivalence relation. Again, we can define a binary relation R* on the set A* of equivalence classes, with a*R*b* $ aRb hen (A*,R*) is well-defined and is a simple order. 101 Generalizations of the Representation Theorem to Infinite Sets of Alternatives. Cantor's Theorem(R (1895): If A is a countable set, then (A,R) is ;>) homomorphic to iff (A,R) is a strict weak order. This theorem fails for arbitrary infinite sets. Use the lexicographic ordering of the plane: A= R£ R (a; b)R(s; t) $ [a > s or (a = s& b > t)]: This is strict weak, but there is no homomorphism: f(a,1) > f(a,0). rational number g(a) with f(a,1) > g(a) > f(a,0). g is 1-1 function from! reals into rationals: a>b g(a) > f(a,0) > f(b,1) > g(b) 102 Birkhoff-Milgram Theorem (A,R) binary relation, B A. Say B is order-dense in A if a,b A-B, aRb and ~bRa imply that c B such that aRcRb. (R ; > ) Theorem (Birkhoff-Milgram): (A,R) iff ¤ ) homomorphic to (A ¤ ; Ris (A,R) is a strict weak order and has a countable set which is order-dense. 103 A = [0; 1] [ [2; 3], R = > (A ¤ ; R ¤ ) = (A; R) Q A is countable, order-dense A = [0; ¼] [ [2¼; 3¼], R = > (A ¤ ; R ¤ ) = (A; R) Q QA\ isAnot order-dense [ f ¼; 2¼g But is. A ¤=; R {0,1}, R= ¤ ) = (A; (A R)> A is countable, order-dense A = R £ R (a; b)R(s; t) $ a > s , (a; b)E (s; t) $ a = s: Note: The set of equivalence classes with rational first component is countable and order-dense. 104 Uniqueness Results (R ; > ) Uniqueness Theorem: Every homomorphism from (A,R) into is regular and every homomorphism defines an ordinal scale. Meaningfulness: f(a) > f(b) is meaningful. f(a) = 3f(b) is not meaningful. f(a) = f(b) > f(c) - f(d) is not meaningful. We do not get an interval scale of temperature as we might have wanted. We'll return to this point. 105 What Representations Lead to Ordinal Scales? (x 1 ; x 2 ; :::; x M ) Let M be a positive integer. An M-tuple M¹ = f 1; 2; :::; M g from : R corresponds to a strict weakxorder of rank first > x i , i , ..., i j i 1 2 p all i such that we never have ; having ranked , xj > xi : j 6 = i , i , ..., i 1 2 p rank next all i such that for no , is Thus, (3,9,3,2,8) has 2 ranked first, 5 ranked second, 1,3 tied for third, 4 last. M¹ TM R Suppose (x ; is a:::; strict weak order on . Let on consist of all x ; x ) M M-tuples 1 2 whose corresponding ¼= f (1; 2)g, T 2 = f (x; y) :ranking x > yg (strict weak © ¼ order) is y) .: xE.g.: . If = , T 2 = f (x; = yg: If ¼ ¼ 106 R Theorem (Roberts 1984): Suppose S is an M-ary relation on and (R ; S) f is a homomorphism from which defines a regular jf (A)j(A,R) ¸ M into ordinal scale. Suppose . Then S is of the form S = (T M ) 0 [ (T M ) 0 [ :::; ¼1 ¼2 M¹ (T M ) 0 ¼1 , ¼2 ¼i where , ..., are all the strict weak orders of and is ;: TM ¼i or For example, if M = 2, then ¼1 = f (1; 2)g ¼2 = f (2; 1)g , , ¼3 = ; . So S = T M [ T M i s ¸ on R ¼1 ¼3 S = TM [ TM is 6 = on R : ¼1 ¼2 107 TM [ TM Note that the condition is necessary but not sufficient: not lead to an ordinal scale. ¼1 ¼2 does It is an open question to characterize those S so that (Rcompletely ; S) homomorphisms into lead to ordinal scales. It is also an open question to obtain similar results for homomorphisms into (R ; S1 ; S2 ; :::; Sp ; ±1 ; ±2 ; :::; ±q ): A ;related open question: If [f TisM a homomorphism from (A,R) into (R S) ¼ i and S is of the form , when is the statement f(a) > f(b) meaningful? (Partial results: Harvey and Roberts 1989; Roberts and Rosenbaum 1994.) 108 Second Representation Problem: Extensive Measurement (A; R; ±) (R ; > ; + ) Consider the homomorphisms from into representation is known as extensive measurement. Motivation: R is “heavier than”, “preferred to” and combination. ± . This is combination. Also: Historically, scales were called extensive if they were at least additive and a scale was not considered acceptable unless it was at least extensive. Representation Theorem (Hölder Every Archimedean ordered (R ; > ; +1901): ): group is homomorphic to 109 Axioms for an Archimedean Ordered Group (A; R; ±) (A; ±) (1). is a group. (2). (A,R) is a strict simple order. (3). Monotonicity: aRb $ (a ± c)R(b± c) $ (c ± a)R(c ± b) (4). Archimedean: If e is the group identity and aRe, then there is a positive integer n a ± a ± a ± :: such that naRb. (Here, na is . n times) Which of the axioms are necessary? (1). Associativity, identity, inverse are not. (2). Strict simple order is not; there could be ties. (3). Monotonicity is. (4). The Archimedean axiom as stated is not; it doesn't even make sense without e. 110 (A; R; ±) Definition: is an extensive structure if it satisfies the following conditions: (1). Weak Associativity: [a ± (b± c)R(a ± b) ± c] ~ [(a ± b) ± cRa ± (b± c)] & ~ (2). (A,R) is a strict weak order. (3). Monotonicity. (4). Modified Archimedean Axiom: (na ± c)R(nb± d): If aRb, then there is a positive integer n such that (For the latter axiom, think of n(a-b) > (d-c).) 111 (A; R; ±) Theorem (Roberts and Luce 1968): is homomorphic to (R ; > ; + ) (A; R; ±) iff is an extensive structure. Are the axioms reasonable? Descriptive? For mass: OK For preference: Strict weak could be questioned. (Recall price-quality example.) Weak associativity could be questioned if combination allows physical interaction: (a = flame, b = cloth, c = fire retardant) 112 Monotonicity could be questioned: a = black coffee, b = candy bar, c = sugar a ± cRb± c bRa and Modified Archimedean: ana=±$1, lifed?as a cripple, d is a long and healthy life. Is c b = $0, c isnb± ever R to Uniqueness Theorem (Roberts and Luce 1968): Every (A; R; ±) (R ; > ; + ) homomorphism from into is regular and defines a ratio scale. Note: This gives the desired scale type for mass. Note: f(a) = 3f(b) is meaningful. 113 Question: What representations A into give rise to ratio scales? (R ; S) Roberts (1984) showed that (A,R) into never gives rise to a ratio scale. Thus, requires at least two relations or at least one operation. Can we get a ratio scale without any operations in ? 114 Third Representation Problem: Difference Measurement Define on R by ¢ (x; y; u; v) $ x ¡ y > u ¡ v: LetDD relation on A. Consider the representation (A; ) ! be(Ra; quaternary ¢ ): A homomorphism f satisfies D (a; b; c; d) $ f (a) ¡ f (b) > f (c) ¡ f (d): This representation is called algebraic difference measurement. Motivation: Temperature measurement. D is comparison of differences in warmth. Related motivation: Utility with comparison of differences in value. 115 Representation Theorems: Sufficient conditions were given by Debreu (1958), Scott and Suppes (1958), Suppes and Zinnes (1963), Kristof (1967), Krantz, Luce, Suppes, and Tversky (1971), ... Necessary and sufficient conditions when A is finite were given by Scott (1964). A certain solvability condition plays an important role in the representation theorem of Krantz, Luce, Suppes, and Tversky and is basic for the corresponding uniqueness theorem. Definition: (A,D) satisfies solvability if whenever ~D(s,t,a,b) and ~D(x,x,s,t), there are u,v A such that ~D(a,u,s,t), ~D(s,t,a,u), ~D(v,b,s,t) and ~D(s,t,v,b). 116 What this says: f (a) ¡ f (b) ¸ f (s) ¡ f (t) ¸ 0 9u; v 2 A implies such that f (a) ¡ f (u) = f (s) ¡ f (t) and f (v) ¡ f (b) = f (s) ¡ f (t): Solvability is not a necessary condition for the representation. Uniqueness Theorem (Suppes and Zinnes 1963, Krantz, Luce, (R ; ¢ Suppes, ) Tversky 1971). Every homomorphism from (A,D) into is regular. Moreover, if (A,D) satisfies solvability, then every homomorphism defines an interval scale. Note: This gives temperature as an interval scale. 117 Example to Show Hypothesis of Solvability is Needed: A = f 0; 1; 3g D (x; y; u; v) $ x ¡ y > u ¡ v f(x) = x, all x; g(0) = 1, g(1) = 2, g(3) = 8 (R ; ¢ ) Then f and g are homomorphisms from (A,D) into . g = ' ±f If , then (0) = 1, (1) = 2, (3) = 8 and we cannot find >0, so that (x) = x+. Thus, f is not an interval scale. Comment: Solvability fails. Open Question: What representations scales? A into lead to interval 118 Fourth Representation Problem: Conjoint Measurement Very important in the history of measurement theory was the introduction of conjoint measurement as an example of a fundamental measurement structure different from extensive measurement. Conjoint measurement is concerned with multidimensional A = A 1 £ A 2 £ ::. £ A n : alternatives. The set A has a product structure (a1 ; a2 ; :::; an ) In economics, we can think of an alternative as a market A1 basket. In perception, could be intensities of sounds presented to A2 the left ear and could be intensities of sounds presented to the right ear. 119 A1 A2 In testing, set could be a set of subjects and set a set of test items. In studying response strength, we could take a combination of drive and incentive. And so on. We have a binary relation on A interpreted as “preferred f i : A ito”, ! R“sounds louder”, “performs better”, etc. We seek functions so that (a1 ; a2 ; :::; an )R(b1 ; b2 ; :::; bn ) $ f 1 (a1 ) + f 2 (a2 ) + ::: + f n (an ) > f 1 (b1 ) + f 2 (b2 ) + ::: + f n (bn ): Sufficient conditions for the representation were given by Debreu [1960] and Luce & Tukey [1964]; necessary and sufficient conditions Ai by Scott [1964] in case each is finite. 120 Semiorders Suppose R is preference. Recall that indifference corresponds to the relation: $ aIb ~aRb & ~bRa Suppose f is an ordinal utility function: $ aRb f(a) > f(b) Then aIb $ f(a) = f(b) This implies that indifference is transitive. 121 Arguments against transitivity of indifference (or more generally of “tying” in a measurement context) go back to Armstrong in the 1930's, Wiener in the 1920's, even Poincaré (in the 19th century). One argument against transitivity of indifference: The Coffee-Sugar example (Luce 1956). Let 0* = cup of coffee with no sugar, 5* = cup of coffee with 5 spoons of sugar. 0* 5* We have a preference between 0* and 5* -- we are not indifferent. 122 Arguments against transitivity of indifference (or more generally of “tying” in a measurement context) go back to Armstrong in the 1930's, Wiener in the 1920's, even Poincaré (in the 19th century). One argument against transitivity of indifference: The Coffee-Sugar example (Luce 1956). Let 0* = cup of coffee with no sugar, 5* = cup of coffee with 5 spoons of sugar. 0* 5* We have a preference between 0* and 5* -- we are not indifferent. Add sugar one grain at a time. We are indifferent between the first and the last. 123 Second argument against transitivity of indifference: Pony-bicycle Example (Armstrong 1939): pony I bicycle pony I bicycle+bell not bicycle I bicycle+bell The semiorder model we shall discuss is intended to account for the first kind of problem, but not the second. 124 The Idea: Find a function f so that aRb $ f(a) is “sufficiently larger” than f(b) Let > 0 be a threshold or just noticeable difference. Find a function f on A so that aRb Define > on R $ f(a) > f(b) + by x > ± y $ x > y + ±: We are seeking a homomorphism from (A,R) into (R ; > ± ): 125 Note that if such a homomorphism f exists for some > 0, then another f ' exists for all ' > 0: Take f ' = ( '/ )f. Definition (Luce 1956): (A,R) is a semiorder if it satisfies the following conditions: (1). Irreflexivity: ! (2). aRb & cRd ! (3). aRb & bRc ~aRa aRd or cRb aRd or dRc. Representation Theorem (Scott and Suppes 1958). Suppose (A,R) is a binary relation(R on; >a finite set A. Then there is a homomorphism ) ± from (A,R) to iff (A,R) is a semiorder. 126 Necessity of the conditions: (1). trivial ( ) (3). ( ) f(c) f(b) f(a) f(d) f(c) f(a) f(b) f(a) f(c) (2). If f(a) f(c): ( ) If f(a) f(c): ( ) 127 Idea of the Proof A = {1,2,3,4,5} R = {(1,3), (1,4), (1,5), (2,4), (2,5), (3,5)} Is this a semiorder? Not easy to check by the axioms. However: It is easy to construct f. First, find the order of embedding. Finding the Order: 1R3, not 2R3 f(1) > f(2). 1R3, not 1R2 f(2) > f(3). 3R5, not 4R5 f(3) > f(4). 3R5, not 3R4 f(4) > f(5). f(5) 0 f(4) .5 f(3) 1.2 f(2) 1.7 f(1) 2.3 with = 1 128 Uniqueness Problem for Semiorders There can be irregular R = > 1 homomorphisms from (A,R) to A = {0,1/2,10}, . (R ; > 1 ) : Two homomorphisms from (A,R) into f(0) = 0, f(1/2) = 0, f(10) = 10 g(x) = x, all x. There g =transformation ' ±f : ' : f (A)is! noRadmissible so that (R ; > ± ) . So, the “usual” theory of scale type does not apply. There are various ways around this. 129 The Method of Perfect Substitutes Define an equivalence relation E on A by: aE b $ (8c)(aRc $ bRc & cRa $ cRb): In this case, a and b are perfectRsubstitutes for each other with ¤ respect Ato¤ the relation R. Define on the collection of equivalence classes by: a¤ R ¤ b¤ $ aRb: R¤ If (A,R) is a semiorder, is well-defined and is again a semiorder. Moreover, every homomorphism is now 1-1 and regular. 130 A similar trick of canceling out the perfect substitutes relation E always leads to regular representations for equivalence classes. We is the scale type of a homomorphism from (A ¤ ;can R ¤ ) now ask: (R ; >What )? ± into None of the common scale types applies and no succinct characterization of admissible transformations is known. A Uniqueness Result Suppose Define W on A by: aRb $ f (a) > f (b) + ±: aW b $ f (a) > f (b). Then (A,W) is a strict weak order. 131 Theorem (Roberts 1971): The order (A,W) is essentially unique. The only changes allowed are to permute elements a and b which are equivalent under the perfect substitutes relation. Meaningfulness One still ask, given a homomorphism f from (A,R) into (R ; >can ) ± , if the conclusion f(a) > f(b) is meaningful in the broader sense of invariance under all possible homomorphisms. It isn't always. In the example above, we have g(1/2) > g(0), but not f(1/2) > f(0). However, it is easy to see that if f is a homomorphism, f(a) > f(b) is a meaningful conclusion for all a, b if for all x y, ~xEy. 132 Indifference Graphs Consider the indifference representation corresponding to the semiorder representation: aI b $ jf (a) ¡ f (b)j · ±: Define J± on R by: xJ ± y $ jx ¡ yj · ±: We seek a homomorphism from (A,I) into (R ; J ±) : Graph-theoretic formulation: G = (V,E), V = A, edge a to b iff aIb. Assign numbers to vertices of G so that two vertices are adjacent iff their corresponding numbers are close (within ). 133 Here is an example of such a representation on a graph: 1.1 =1 0 .7 1.6 2.5 Definition: We say that is an indifference graph if there is a (R ; J(A,I) ): ± homomorphism into 134 Examples of Graphs that are Not Indifference Graphs C4 Similarly, C5, C6, ... 135 Representation Theorem (Roberts 1969). G = (A,I) is an indifference graph iff it has none of the above graphs as an induced subgraph. Uniqueness: Irregularity can again be a problem. Now, the perfect substitutes relation is equivalent to: aE b $ (8c)(aI c $ bI c) (Assuming that aIa for all a, we have aEb iff a and b have the same closed neighborhoods.) Define (A ¤ ; I ¤ ) I¤ a¤ I ¤ b¤ $ aI b A¤ on(R ; Jby . Then all homomorphisms from ) ± into are regular, but the class of admissible transformations is not known. 136 aW b $ f (a) > f (b) Again, semiorders. is essentially unique, as in the case of Meaningfulness f(a) > f(b) is never meaningful for indifference graphs since -f is again a homomorphism. However, some ordinal conclusions are meaningful. Say y is between x and z if x<y<z or z<y<x. Betweenness is a significant observation in psychophysical and perceptual experiments and in judgments of worth or value. When is the conclusion that f(b) is between f(a) and f(c) meaningful? 137 Theorem (Roberts 1984). Suppose (R ;AJ ±is ) finite and f is a homomorphism from (A,I) into . Then the conclusion that f(b) is between f(a) and f(c) is meaningful for all a,b,c in A iff (A,I) is connected as a graph and for all x y in A, ~xEy. It is an open question to generalize this result beyond indifference graphs and determine for what representations the conclusion that f(b) is between f(a) and f(c) is meaningful for all a,b,c. 138 Interval Orders and Measurement without Numbers Let f satisfy the semiorder representation: aRb $ f (a) > f (b) + ±: Let J (a) = [f (a) ¡ ±=2, f (a) + ±=2]: Think of J(a) as a range of possible values (utilities) for a. Let  J J' mean that x > y for all x J, y J'. We have aRb $ J (a)  J (b): (*) The intervals J(a) all have the same length. What if we allow different lengths? When does there exist an assignment of an interval J(a) to each a A satisfying (*)? 139 This is a representation problem in the spirit we have been discussing. It is measurement without numbers. Definition: (A,R) is an interval order if it satifies the first two conditions in the definition of a semiorder, i.e., (1). Irreflexivity (2). aRb & cRd aRd or cRb. Representation Theorem (Fishburn 1970): Suppose (A,R) is a binary relation and A is a finite set. Then there is an assignment J of real intervals satisfying (*) iff (A,R) is an interval order. 140 Uniqueness Results No one has formulated a theory of admissible transformations of interval assignments. However, there are some uniqueness results. There are four types of relations among intersecting intervals J(a) and J(b) that we would like to distinguish: (1) J(a) ____________ J(b) _______ (2) J(b) ______________ J(a) _____ (3) J(a) ___________ (4) J(b) ___________ J(b) ____________ J(a) _________ (1) (2) (3) (4) J(a) J(b) J(a) J(b) “overreaches” J(b) “overreaches” J(a) “weakly precedes” J(b) “weakly precedes” J(a). 141 One interesting question is to try to determine when the conclusion that J(a) overreaches J(b) is meaningful. These questions arise in economics when the intervals are ranges of possible values of a commodity. They also arise in archaeological seriation when these are intervals of time during which a particular type of artifact was in use. In the latter context, Skrien (1980,1984) developed an O(n3) algorithm for determining if these conclusions are meaningful. 142 Interval Graphs Consider the indifference version of the representation (*) for preference: aI b $ J (a) \ J (b) 6 = ;: The graph-theoretic formulation of this is to assign a real interval to each vertex of a graph so that two vertices are adjacent iff their intervals overlap. If we can find such an assignment, we say that the graph is an interval graph. Early representation theorems for interval graphs were due to Lekkerkerker and Boland (1962), Gilmore and Hoffman (1964), and Fulkerson and Gross (1965). 143 Interval graphs have numerous applications, in archaeology, developmental psychology, molecular biology, scheduling, traffic phasing, ecology, computer science, etc. Uniqueness Questions Uniqueness questions similar to those about interval orders arise. For example, when are conclusions about overreaching and weak  precedence meaningful? WeÂhave an additional question here: How unique are conclusions J(a) J(b)? These are never meaningful,  since we can always “reverse” all intervals. However, induces a transitive orientation on the complement of the graph in question, and the uniqueness of transitive orientations has been widely studied. 144 Higher Dimensional Intersection Graphs Consider again aI b $ J (a) \ J (b) 6 = ;: What if the J(a)'s are other objects, such as boxes, cubes, spheres, arcs on a circle, etc.? One can ask representation and uniqueness questions about these situations. For the most part, these questions are unsolved. For instance, we still don't know a representation theorem if the J(a) are rectangles in the plane (with sides parallel to the coordinate axes), or squares. 145 Partial Orders Every semiorder and every interval order is asymmetric and transitive. In general, a binary relation (A,R) that satisfies these two conditions is called a strict partial order. Proper containment on a set of sets is an example of a strict partial order. Strict partial orders often arise when we state preferences among alternatives that have several dimensions or aspects. i tWe might let f i (a) h measure the value of alternative a on the dimension, i= 1,2,...,n. We might prefer a to b iff a rates higher on every dimension, i.e., ($$) aRb $ [f 1 (a) > f 1 (b)]& [f 2 (a) > f 2 (b)]& ... & [f n (a) > f n (b)]: 146 If R is defined this way for real-valued functions fi, then (A,R) defines a strict partial order. The converse problem is important in utility theory and elsewhere. Suppose we are given a strict partial order (A,R). For Rgiven n, can we find functions f1, f2, ..., fn, each mapping A into , so that for all a,b, ($$) holds? This is a representation problem in the spirit of measurement theory. It is not hard to show that for all strict partial orders on finite sets A, we can find such functions for n sufficiently large. The smallest such n is called the dimension of the strict partial order. 147 There is a second notion of dimension that is used in the literature (more commonly than this one). Szpilrajn [1930] showed that for every strict partial order (A,R) and for every a,b in A so that ~aRb and ~bRa, there is a strict simple order (A,S) so that R S and so that aSb. It follows that R can be written as the intersection of strict simple orders on A. The smallest number of such strict simple orders is called the dimension of the strict partial order. For every d > 2, the two notions of dimension agree. However, strict weak orders have dimension one by the first definition and two by the second. 148 Example: A = f a; b; c; dg, R = f (a; b), (b; c), (a; c), (a; d)g This is a strict partial order. It is the intersection of two strict simple orders: the order ranking a over b over c over d and the order ranking a over d over b over c. We can also find two functions f1 and f2 satisfying ($$): f1(a) = 4, f1(b) = 3, f1(c) = 2, f1(d) = 1, f2(a) = 4, f2(d) = 3, f2(b) = 2, f2(c) = 1. 149 Partial Orders vs. Strict Partial Orders We shall use the term partial order for the binary relations related to strict partial orders in the same way that weak orders are related to strict weak orders. Thus, proper containment on a set of sets defines a strict partial order, while containment defines a partial order. Formally, we say that (A,R) is a partial order if it is reflexive, transitive, and antisymmetric. 150 Subjective Probability When we say that one event seems more probable than another, we are making subjective comparisons of probabilities. One can ask when such comparisons are consistent with a probability measure. (X ; E) Formalization of this question: We say that is an E E algebra of events if elements of are subsets of X and is closed under union and complementation. We say that a function p on an algebra of events is a finitely E additive probability measure if p is a real-valued function from so that (1). p(A) 0 (2). p(X) p(A [ =B1) = p(A) + p(B ) if A \ B = ; : (3). 151  (X ; E) Suppose  is an algebra of events and is a binary relation on E , with A B interpreted to mean that A is (subjectively) judged more probable than B. Does there exist a finitely additive probability (X ; E)  measure which “preserves” , i.e., so that for all B 2 E p on A, , (%) A  B $ p(A) > p(B ): Such a measure p is called a subjective probability measure. We can ask for representation and uniqueness theorems in the spirit of representational measurement theory. 152  The study of the comparative probability relation goes back at least to Bernstein (1917). The following necessary conditions for the representation (%) go back to Bruno de Finetti (1937). de Finetti Axioms (E;  ) (1).  X Â; (2). is a strict weak order. and Aº ; (3). Monotonicity: If , where A B means that ~B A\ B = A\ C= ;  A. , then B  C $ A [ B  A [ C: 153 The axioms are clearly necessary. That they are not sufficient was E shown by Savage in 1954 for infinite and by Kraft, Pratt, and E Seidenberg in 1959 for finite. Representation Theorems Scott (1964) gave necessary and sufficient conditions in the case E where is finite and Suppes and Zanotti (1976) gave necessary and E sufficient conditions for arbitrary . Perhaps historically one of the most interesting non-necessary axioms is the following: 154 (X ; E) We say that has an n-fold almost uniformEpartition if there is a collection of disjoint subsets X1, X2, ..., Xn in whose union is X and such that whenever 1 k < n, the union of no k of the Xi is more probable than the union of any k+1 of them. We say that A» B $ ~ [A  B ] &~ [B  A] (X ; E) An n-fold uniform partition of is a collection of disjoint E Xi » Xj subsets X1, X2, ..., Xn in whose union is X and such that for all i, j. Every n-fold uniform partition is almost uniform. 155 (X ; E) Representation Theorem (Roberts 1979). Suppose is an  E algebra of sets and is a binary relation on . The following axioms are sufficient to prove the existence of a measure of subjective probability: (1). (E The ¤ ; Âde ¤ ) Finetti axioms. (2). has a countable, order-dense subset. E (3). For all n, there is an n-fold almost uniform partition of . E These sufficient conditions imply that is infinite. Luce (1967) E gave important sufficient conditions that do not require that be infinite. 156 Uniqueness Results (X ; E)  Theorem (Roberts 1979). Suppose is an algebra of sets and E is binary relation on . Suppose that for all positive integers n, (X a; E) has an n-fold almost uniform partition. Then if p and p' are (X ; E;  ) two measures of subjective probability on , p = p'. In other words, p is an absolute scale. Intuition: Finite Additivity plays to additivity in ((A; R; ±) a! role (R ;analogous > ; + )) extensive measurement . The uniqueness theorem for extensive measurement can be modified to show that p' = p, > 0. But p(X) = p'(X) = 1, so = 1. 157 Unfortunately, the theorem fails without an added hypothesis like existence of n-fold almost uniform partitions for all n. E = 2X Example: X = {a,b}, , f a; bg  f ag  f bg  ; : ; Then p({a,b}) = 1, p( ) = 0, p({a}) = , p({b}) = is a measure of subjective probability so long as > , = 1- . Other conditions sufficient for getting an absolute scale were given for X finite by Fishburn and Odlyzko [1989] and Van Lier [1989]; necessary and sufficient conditions by Fishburn and Roberts [1989]. Other conditions for absolute scales, based on the additional concept of probabilistic independence, were given by Suppes and Alechina [1994]. 158 Probabilistic Consistency If judgments are made repeatedly, individuals are often inconsistent. For example, a subject might prefer a to b once, b to a later; or say that a is warmer than b once, b is warmer than a later. Let pab represent the proportion of times that a is preferred to b, or a is judged warmer than b, etc. Then p is a function from A into [0,1]. Call (A,p) a forced choice pair comparison system (fcpcs) if for all a,b in A, pab + pba = 1. We try to capture the idea that the individual, while being inconsistent, is being consistent in a probabilistic sense. 159 Definition: A fcpcs satisfies the weak utility model if there is f :A ! R so that for all a,b in A, pab > pba $ f (a) > f (b): Representation and Uniqueness Theorems: Define W on A by aW b $ pab > pba : Then (A,p) satisfies (R ; >the ) weak utility model iff (A,W) is homomorphic to . Moreover, if f is a function satisfying the weak utility model, then f defines a (regular) ordinal scale. 160 Definition: A fcpcs satisfies the strong utility model if there is f :A ! R so that for all a,b,c,d in A, pab > pcd $ f (a) ¡ f (b) > f (c) ¡ f (d): Representation and Uniqueness Theorems: Define D on A by D (a; b; c; d) $ pab > pcd : Then (A,p) satisfies (R ; ¢the ) strong utility model iff (A,D) is homomorphic to . Moreover, if (A,D) satisfies the solvability axiom, then f is a (regular) interval scale. The open uniqueness questions for difference measurement, the (A; D ) ! (R ; ¢ ) representation , are of equal interest for the strong utility model. 161 Stochastic Transitivity Conditions We now state some conditions on the pab values that are related to the weak and strong utility models. They are probabilistic versions of transitivity. Weak Stochastic[pTransitivity (WST): ¸ 1=2]& [p ¸ 1=2] ! pac ¸ 1=2: ab bc Moderate Stochastic Transitivity (MST): [p ¸ 1=2]& [p ¸ 1=2] ! pac ¸ minf pab; pbc g ab bc Strong Stochastic Transitivity (SST): [pab ¸ 1=2]& [pbc ¸ 1=2] ! pac ¸ maxf pab; pbc g 162 Theorem: WST is equivalent to the weak utility model. Theorem: The Strong Utility Model implies SST, but they are not equivalent. The pony-bicycle example mentioned in connection with semiorders gives an argument against SST and hence against the strong utility model: Suppose We have But pbi cy cl e;pon y ¼ 1=2 . pbi cy cl e+ bel l ;bi cy cl e = 1 pbi cy cl e+ bel l ;pon y ¼ 1=2 . . 163 Homogeneous Families of Semiorders In experiments with loudness judgments, we often take a to be louder than b (or beyond threshold) if it is judged to be louder a sufficiently high percentage of the time, like 3/4. Given a number [1/2,1), we can define a binary relation R on A by: aR ¸ b $ pab > ¸ : Luce [1958,1959] showed that under certain conditions, each (A,R) is a semiorder. For instance, SST implies this conclusion. Thus, F = f (A; R ¸ ) : ¸ 2 [1=2; 1)g under SST, is a family of semiorders. 164 We have noted that for a semiorder (A,R), the order in which the(R ; > ) ± elements are mapped into the reals under a homomorphism into is essentially unique. This order is sometimes called the compatible weak order. We shall call a family of semiorders on the same underlying set homogeneous if there is one weak order compatible with all of the F semiorders in the family. It turns out that under SST, is a homogeneous family of semiorders. This is another condition of probabilistic consistency. The converse is also true under one additional assumption. 165 Theorem (Roberts 1971): Suppose (A,p) Fis a fcpcs and pab 1/2 for all a,b. Then (A,p) satisfies SST iff is a homogeneous family of semiorders. Actually, the result holds under the weaker hypothesis: pab = 1=2 ! (8c)(pac = pbc ): 166 Meaningfulness of Conclusions from Combinatorial Optimization (Roberts 1990) What conclusions about optimal solutions or algorithms are meaningful? Shortest Path Problem 1 a x 2 y 20 What is the shortest path from x to y? 167 Meaningfulness of Conclusions from Combinatorial Optimization (Roberts 1990) What conclusions about optimal solutions or algorithms are meaningful? Shortest Path Problem 1 a x 2 y 20 What is the shortest path from x to y? Solution: x, a, y 168 But what if weights are measured on an interval scale? An admissible transformation takes the form x x + , > 0. Take = 2, = 100. 102 a x 104 y 140 169 But what if weights are measured on an interval scale? An admissible transformation takes the form x x + , > 0. Take = 2, = 100. 102 a x 104 y 140 Now the shortest path is x, y. The original conclusion was meaningless. It is meaningful if weights are measured on a ratio scale. 170 Linear Programming The shortest path problem can be formulated as a linear programming problem. The above discussion illustrates the following point: The conclusion that z is an optimal solution in a LPP can be meaningless if the cost parameters are measured on an interval scale. 171 Minimum Spanning Tree Given a connected weighted graph, what is the spanning tree with a total sum of weights as small as possible? 8 a b 70 54 63 22 c 53 e 12 10 d 172 a b 8 70 54 63 22 c 53 e 12 10 d optimal solution 173 What if we change weights, say by an admissible transformation for interval scales? Say we take x x + with = 2, = 100. We get the following weighted graph. 174 What if we change weights, say by an admissible transformation for interval scales? Say we take x x + with = 2, = 100. We get the following weighted graph. 116 a b 240 226 208 144 c 206 e 124 120 d 175 a 240 226 208 206 e 144 c b 116 124 120 d optimal solution 176 The optimal solution is the same. Is this an accident? 177 The optimal solution is the same. Is this an accident? No. It is a corollary of the result that Kruskal's algorithm (the greedy algorithm) gives a minimum spanning tree. Kruskal's Algorithm 1. Order edges in increasing order of weight. 2. Examine edges in this order. Include an edge if it does not form a cycle with edges previously included. 3. Stop when all vertices are included. Corollary: The conclusion that a given spanning tree is minimum is meaningful even for ordinal scales. 178 Note that analysis of meaningfulness is something like sensitivity analysis. However, the perturbations studied do not have to be small. We are investigating the sensitivity (or invariance) of conclusions under certain kinds of changes. 179 Frequency Assignment and T-Coloring The following problem arises from questions of channel assignment. Let G = (V,E) be a graph and T be a set of nonnegative integers. Find an assignment f of a positive integer f(x) to each vertex x in G so that f x; yg 2 E ! jf (x) ¡ f (y)j 2= T: Such an f is called a T-coloring of G. In the application: V = set of transmitters; E means conflict; f(x) is channel assigned to transmitter x; T is a set of disallowed separations between channels assigned to conflicting transmitters. E.g.., T = {0}, T = {0,1}. 180 T = {0,1,4,5} Find a “greedy” solution by always choosing the lowest possible channel. 181 T = {0,1,4,5} 1 Find a “greedy” solution by always choosing the lowest possible channel. 182 T = {0,1,4,5} 1 3 Find a “greedy” solution by always choosing the lowest possible channel. 183 1 T = {0,1,4,5} 9 3 Find a “greedy” solution by always choosing the lowest possible channel. 184 Here is a “better” solution: 1 7 4 185 In what sense is this better? In the sense that S(f ) = max jf (x) ¡ f (y)j x ;y 2 V is smaller. S(f) is called the span of f. The minimum S(f) over all T-colorings f of G is called the span of G. (Note that it is NP-complete to compute the optimal span.) Widely studied question: For what graphs does the greedy algorithm obtain the optimal span? 186 What if numbers in T change by a scale transformation? Is the answer to this widely studied question meaningful? Consider the case of ratio scales. Then we can have transformations t t, > 0. In this case, even for complete graphs Kn, the conclusion that the greedy algorithm obtains the optimal span is meaningless. Take the above example with = 2. Then T = {0,2,8,10} The greedy algorithm does obtain the optimal span: 187 T = {0,2,8,10} 1 188 T = {0,2,8,10} 1 2 189 T = {0,2,8,10} 1 5 2 190 Aside It is not as easy to give an example of a complete graph Kn and set T and number such that the greedy algorithm obtains the optimal span for Kn with T but not with T. One example is K6 with T = f 0; 1; 2; 3; 4g [ f 8g [ f 10; 11; 12g [ f 31; 32g and = 2. The greedy T-coloring uses the channels 1, 6, 15, 20, 29, 34. 191 Proof of optimality: Consider any T-coloring g with g(1) < g(2) < g(3) < g(4) < g(5) < g(6). Without loss of generality, take g(1) = 1. f 0; 1; 2; 3; 4g ½ T ! g(j + 1) ¸ g(j ) + 5 Then g(j + 2) ¸ g(j ) + 10 , . f 10; 11; 12g ½ T ! g(j + 2) ¸ g(j ) + 13 But . g(3) ¸ 14, g(5) ¸ 27, g(6) ¸ 32 Thus, . But 32-1 = 31, 33-1 = 32, so g(6) 34. QED 192 With = 2, ®T = f 0; 2; 4; 6; 8g [ f 16g [ f 20; 22; 24g [ f 62; 64g: The greedy T-coloring uses the channels 1, 2, 11, 12, 29, 30. A better T-coloring uses the channels 1, 2, 13, 14, 27, 28. 193 What is meaningful? Theorem (Cozzens and Roberts 1991): The conclusion that the greedy algorithm obtains the optimal span for all complete graphs is meaningful if elements of T are measured on a ratio scale. For example, if T = {0,1,2, ... , r}, this conclusion is true for all complete graphs. Even true if T = {0,1,2, ..., r} S where S contains no multiple of r+1. 194 Comment: The theory of meaningfulness needs to be applied with care. In the T-coloring application, limiting the set T to integers could be problematical in the interpretation. Suppose for example T = {0,1} and frequencies are measured in kilohertz (kHz). Then frequencies at distance 1 kHz are not allowed on conflicting transmitters. Neither, of course, are frequencies at distance 1/2 kHz. However, if we take = 1000, then the new units are Hertz (Hz) and now frequencies at distance 500 Hz = 1/2 kHz are allowed on conflicting channels. This changes the problem. 195 Scheduling with Priorities and Lateness/Earliness Penalties (Roberts 1995, Mahadev, Pekec, and Roberts 1997, 1998) The following problem came from the Military Airlift Command. n items (equipment, people) are to be transported from an origin to a destination. Item i has a desired arrival time di. (For simplicity, di is a positive integer.) Transportation time is constant. For a given arrival time, there is available a certain capacity c for transportation (# of seats, cargo space, etc.) Note constant c. 196 au i A schedule u assigns a positive integer arrival time to each au item, subject to the constraint that the number of i that can equal a given integer is at most c. Penalties are applied -- for late arrivals and perhaps for early arrivals. Penalties depend upon desired and scheduled arrival times and upon priorities. Different items have different status or priority. Let ti be a measure of the priority of item i. 197 Some Penalty Functions Pe(u;t,d,c) is the penalty with schedule u if t = (t1,t2,...,tn), d = (d1,d2,...,dn) and c is the capacity. P1 (u; t; d; c) = P n t jau i= 1 i i ¡ di j . This penalty is summable in Xthe sense that n Pe = ge (t i ; au ; di ): i i= 1 It is also separable in the sense that ³ ge (t; a; d) = h e ( t ) f e ( a;d) 1 h e ( t ) f e ( a;d) if if 2 Here, f 1 (a; d) = ja ¡ dj , h1 (t) = h1 (t) = t 1 2 a¸ d a< d . 198 Other examples of summable, separable penalty functions have he (t) = he (t) = t 1 2 and fe(a,d) defined as follows: P2 (u; t; d; c) : f 2 (a; d) = ja ¡ dj P3 (u; t; d; c) : f 3 (a; d) = 1 if a > d and 0 otherwise. if a d and 0 otherwise. P4 (u; t; d; c) : f 4 (a; d) = ° ja ¡ dj if a d and |a - d| otherwise P5 (u; t; d; c) : f 5 (a; d) = ja ¡ dj 2 A summable example is P6(u;t,d,c): g6(t,a,d) = 1 if a d + F(t) and 0 otherwise. 199 Problem: Find an optimal schedule, i.e., one which minimizes the penalty. Analogous Problem in Manufacturing n tasks to perform; task i has desired completion time di; c processors; constant processing time; tasks have different priorities This problem arises in "just-in-time" production: Toyota, Boeing, etc. 200 I will not discuss algorithms for solving the optimization problem: minimize P . (Many things are known. For example if he (t) = he (t) e= 1 1 2 and fe(a,d) = |a - d| and c = 1, Garey, Tarjan, and Wilfong [1988] show that the problem is NP-complete if transportation times can vary but polynomial if they are constant as in our problem.) I will ask: Is the conclusion that schedule u is optimal a meaningful conclusion? That is, does optimality still follow if we change how we measure priorities? 201 Example 1 Suppose priorities are measured on a ratio scale. Consider the statement: (1) Pe(u;t,d,c) Pe(v;t,d,c) for all schedules v for the scheduling problem t,d,c Let > 0. Does (1) imply (2) Pe(u;t,d,c) Pe(v;t,d,c) for all schedules v for the scheduling problem t,d,c Here, t = (t1,t2,...,tn) means (t1, t2,..., tn). 202 Suppose that Pe is summable and separable. Suppose that he (®t) = K (®)he (t) for i (For example, i he (t) = t i K (®) > 0: satisfies this condition.) Then Pe(w;t,d,c) = K()Pe(w;t,d,c). Since K() > 0, (1) (2). We conclude that under our assumptions, the assertion that schedule u is optimal is meaningful under ratio scale priority measurement. 203 Example 2: he (t) = he (t) = 2t ¡ 1 Pe summable and separable, fe(a,d) = |a-d|, Xn Thus, P (u; t; d; c) = 2t i ¡ 1 jau ¡ d j e i 2 1 . i i= 1 Then meaningfulness can fail under ratio scale priority measurement. t = (1,2,2,2), d = (1,2,2,2), c = 1. Consider u defined by au = (1,2,3,4). Then Pe(u;t,d,c) = h1(2) + 2h1(2) = 6. It is easy to check that any other schedule v has penalty 6, so u is optimal. 204 However, consider = 2. Then t = (2,4,4,4). Pe(u;2t,d,c) = h1(4) + 2h1(4) = 24. Let av be given by (4,1,2,3). Then Pe(v;2t,d,c) = 3h1(2) + h2(4) + h1(4) = 22. Thus, u is no longer optimal after a change of unit. 205 Constant Arrival Times d = (d,d,...,d) Observation: Suppose P is summable and separable he (t)and symmetric he (t) = hee (t) 2 in the sense that 1 . Suppose also that i is increasing in t and fe(a,d) = |a - d|. Then a greedy algorithm gives the only optimal solution and this remains the case after any ordinal scale (strictly increasing) transformation -- i.e., the conclusion of optimality is meaningful under ratio, interval, and ordinal scales. 206 Example 3 The conclusion we have stated is false without symmetry. fe(a,d) = |a - d| p he (t) = t 2 , he (t) = 1 2 t t = (1/4,1/9,1), d = (2,2,2), c = 1. Then the optimal schedule u is given by au = (3,4,2). Pe (u; t; d; c) = he (1=4) + 2he (1=9) = 1=16 + 2=81 1 1 . 207 In comparison, if av = (3,1,2), then Pe (v; t; d; c) = he (1=4) + he (1=9) = 1=16 + 1=3 1 2 . However, let = 36. Then t = (9,4,36) and Pe (u; ®t; d; c) = he (9) + 2he (4) = 81 + 32 1 1 Pe (v; ®t; d; c) = he (9) + he (4) = 81 + 2 1 2 . . Thus, u is no longer optimal. 208 Nonconstant Arrival Times Observation: Suppose he (t) = he (t) = t 1 2 , fe(a,d) = |a - d|. As noted above in Example 1, optimality is meaningful for ratio scale priority measurement with this penalty function. However, we note that it fails for interval scale priority measurement. Are interval scales relevant to priority measurement in scheduling? Yes. Utility or worth or loss of goodwill are not necessarily ratio scales and could very well be interval or ordinal scales. t = (9,9,9,1), d = (2,2,2,1), c = 1. au = (2,1,3,4); Pe(u;t,d,c) = 21. It is easy to check that u is optimal. 209 However, let = 1/8, = 7/8. Then t+ = (2,2,2,1) and Pe(u; t+,d,c) = 7. If av = (2,3,4,1), then Pe(v; t+,d,c) = 6. 210 A Special Case of Interval Scale Priority Measurement and Nonconstant d Assumptions: Let d = (d,d,...,d,k), with 0 < d < k n. Let c = 1. he (t) Suppose Pe is summable and separable with 1,2, and with fe(a,d) = D|a - d|, D > 0. Assume that i increasing in t, i = he (®t + ¯) = K (®; ¯)he (t) + L (®; ¯); i i K (®; ¯) > 0: Theorem: Then the conclusion that u is an optimal schedule is meaningful if priorities are measured on an interval scale. 211