Chapter 3
Rescaling

What is Rescaling?
• Conversion of one measurement scale to another
• Common technique used in quantitative biology

Measurement scales (less detail to more detail)
Scale      Example
Nominal    [0,0,1,0,0,0]
Ordinal    [1st, 2nd, 3rd]
Interval   [90,180,45]°
Ratio      [0,1.4,3.2] m
• 12 rescaling options

Why bother?
• Logical rescaling has many applications
  – Simplify analyses
  – Even out datasets
  – Help reveal patterns
  – Consolidate explanatory variables
  – Run non-parametric statistics

Simplify
• Ratio → Nominal
[Figure: categories labelled Gravel and Other]

Even out datasets
[Figure: fish census timeline, 2005 to 2012. Labels: net census, # by species (10 brown trout, 5 smelt, 1 Arctic char); "Harper cuts"; total values (15, 6); Nominal; Ratio]
• Can no longer compare
• Salvage long-term analysis: total values

Help reveal patterns
• Rescaling to a less detailed quantity sometimes makes it easier to see patterns
• Is the presence/absence of storms associated with the number of vagrant birds observed?
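As a rough illustration of rescaling to a less detailed scale, the sketch below takes the ratio-scale example from the scale slide, [0, 1.4, 3.2] m, and reduces it to an ordinal and a nominal quantity. The function names are hypothetical, the choice of rank 1 for the largest value is an assumption, and reading "12 rescaling options" as the ordered pairs of distinct scales is also an assumption rather than something the slides state.

```python
from itertools import permutations

# The four measurement scales, listed from less detail to more detail.
SCALES = ["nominal", "ordinal", "interval", "ratio"]


def to_nominal(values):
    """Ratio -> nominal: 1 where the quantity is present (non-zero), else 0."""
    return [1 if v != 0 else 0 for v in values]


def to_ordinal(values):
    """Ratio -> ordinal: rank 1 goes to the largest value (assumed convention).

    Tied values share the same rank in this simple version.
    """
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


lengths_m = [0, 1.4, 3.2]           # ratio-scale example from the slide
presence = to_nominal(lengths_m)    # presence/absence
ranks = to_ordinal(lengths_m)       # ranks, largest first

# Assumed reading of "12 rescaling options": every ordered (from, to) pair
# of two different scales.
options = list(permutations(SCALES, 2))

print(presence, ranks, len(options))
```

The same `to_nominal` step is what turns a storm count into storm presence/absence before looking for an association with vagrant bird numbers.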
Consolidate explanatory variables
• Nominal: forest/barrens, berries present/absent, wet/dry
• Ordinal: habitat rank …

Run non-parametric statistics
• Interval & Ratio → Rank
  [10.5, 15.6, 19.1, 9.8] ml → [3, 2, 1, 4]
• Non-parametrics can be useful when parametric test assumptions are violated
  – Wilcoxon rank-sum test (≈ t-test)
• But… these tests are not a staple these days
  – Generalized linear models can deal with various error distributions

Normalization
• Another common technique used in quantitative biology
• Conversion of a quantity to a ratio with no units: $(Q / Q_{ref})^{\beta}$
  – Q and Qref have the same units
• Common Qref values: Qmax, Qmin, Qsum, Qmean, Qrange, Qsd

Scope
$\mathrm{Scope}(Q) = Q_{max} / Q_{min}$
• We can use scope to
  – compare the capacity of measurement instruments,
  – compare the information content of graphs,
  – compare the variability of physical or biological systems

Physical quantities – larger scope
• Quantity = mass
• Scope = mass(Earth) / mass(H2) = $5.98 \cdot 10^{27}\,\mathrm{g} \,/\, 3.3 \cdot 10^{-24}\,\mathrm{g} \approx 1.8 \cdot 10^{51}$

Biological quantities – smaller scope
• Quantity = body mass
• Scope = mass(Blue Whale) / mass(bacteria) = $1.8 \cdot 10^{8}\,\mathrm{g} \,/\, 9.5 \cdot 10^{-13}\,\mathrm{g} = 1.89 \cdot 10^{20}$

Scope of measurement instruments
• Defined as the max over min reading
  – Scope(meter stick) = 1 m / 1 cm = 100 (if marked in cm)
  – Scope(meter stick) = 1 m / 1 mm = 1000 (if marked in mm)
  – Scope(?) = 1 kg / 1 μg = $10^{9}$

Survey scope
Set by:
1. Defining the sample unit
2. Listing all possible units (the frame)
3. Surveying all possible units (complete census) or sampling units at random

Salmon survey
• Unit: 100 km transects
• Frame: 700 km
• Scope: 7 transects (numbered 1 to 7 in the figure)

Survey scope
• Unit: 100 km transects
• Frame: sum(rivers)
• Scope: # possible transects

Experiment scope
• Unit depends on quantity measured or sampling interval
  – Sampling livers: scope = volume(liver) / volume(sample)
  – Census bacteria each day: scope = 10 days / 1 day = 10
[Figure: millions of bacteria recorded each day, days 1 to 10]
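The scope calculations above all reduce to one max-over-min ratio taken in matching units. The following is a minimal sketch of that arithmetic; the function name `scope` and the variable names are illustrative, and each pair of values is first written in a common unit by hand.

```python
def scope(q_max, q_min):
    """Scope of a quantity: largest value over smallest, both in the same units."""
    if q_min <= 0:
        raise ValueError("q_min must be positive for the ratio to be meaningful")
    return q_max / q_min


# Measurement instruments (values expressed in the smaller unit).
meter_stick_cm = scope(100, 1)     # 1 m = 100 cm, stick marked in cm
meter_stick_mm = scope(1000, 1)    # 1 m = 1000 mm, stick marked in mm
kg_over_ug = scope(1e9, 1)         # 1 kg = 1e9 micrograms

# Biological example from the slide: blue whale vs. bacterium, both in grams.
body_mass = scope(1.8e8, 9.5e-13)

# Survey and experiment scope: frame over unit.
salmon_transects = scope(700, 100)  # 700 km frame, 100 km transects
census_days = scope(10, 1)          # 10 days of record, 1 day per census

print(meter_stick_cm, meter_stick_mm, kg_over_ug)
print(f"{body_mass:.2e}", salmon_transects, census_days)
```

Because scope is a ratio of like units, the result is a plain number, which is what makes a meter stick, a salmon survey, and a bacterial census comparable.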
Normalization
• Relative to a statistic: Qsum, Qmean, Qrange, Qsd

Normalization to a sum
• Taking a percentage: $\% = Q_i \,/\, \sum_{i=1}^{n} Q_i$
  – e.g. Mendel's experiments, with counts of 705 and 224: $224 / (705 + 224) = 0.24$

Normalization to the mean
• Useful for assessing deviations from the mean
  – e.g. Number of plant species on the Canary Islands
Nplant = [ 366 348 763 1079 539 575 391 ] · sp/island
mean(Nplant) = $n^{-1} \sum N_{plant}$
mean(Nplant) = $7^{-1}$ · 4061 · species/island = 580
dev(Nplant) = Nplant − mean(Nplant)
dev(Nplant) = [ −214 −232 +183 +499 −41 −5 −189 ] · sp/island

Normalization to the mean
[Figure: SST (°C) and SST anomaly (°C) plotted against date]

Coefficient of Variation
$CV = \mathrm{stdev}(Q) \,/\, \mathrm{mean}(Q)$
• Unitless ratio that allows comparisons of two quantities, free of various confounds
  – e.g. We can use the CV to compare morphological variability in mice and elephants

Normalization to a range
• The range is defined as the largest minus the smallest value
• Ranging uses both the minimum and maximum value to reduce the quantity to the range 0 to 1
$Q' \equiv (Q - Q_{min}) \,/\, (Q_{max} - Q_{min})$

Normalization to the stdev
• This is a common form of normalization in statistical treatments of data
• Returning to the example of the number of plant species on 7 Canary Islands

Rigid Rescaling
• Rigid rescaling replaces one unit with another
$700\ \mathrm{yards} \cdot \frac{0.9144\ \mathrm{m}}{\mathrm{yard}} \cdot \frac{1\ \mathrm{km}}{1000\ \mathrm{m}} \Rightarrow 0.64\ \mathrm{km}$
• Units disappear because any unit scaled to itself = 1 (no units)
  – m/m is notation for metre/metre = 1
  – kcal/Joules is a number with no units
  – $\mathrm{km}^{1.2}/\mathrm{m}^{1.2}$ has no units: it is the number of crooked m per crooked km

Convert units
• Generic procedure – Three steps
1. Write the quantity to be rescaled: $Q_{old}$
2. Apply rigid conversion factors so units cancel: $Q_{old} \cdot \frac{k_1\ \text{new units}}{\text{old unit}} \cdot \frac{k_2\ \text{newer units}}{\text{new units}} \Rightarrow Q_{newer}$
3. Calculate: $700\ \mathrm{yards} \cdot \frac{0.9144\ \mathrm{m}}{\mathrm{yard}} \cdot \frac{1\ \mathrm{km}}{1000\ \mathrm{m}} \Rightarrow 0.64\ \mathrm{km}$
• Figure out how much Phelps eats in a day (in lbs)
$12{,}000\ \mathrm{kcal} \cdot \frac{1\ \mathrm{hamburger}}{670\ \mathrm{kcal}} \cdot \frac{290\ \mathrm{g}}{\mathrm{hamburger}} \cdot \frac{1\ \mathrm{kg}}{1000\ \mathrm{g}} \cdot \frac{2.2\ \mathrm{lbs}}{\mathrm{kg}} \Rightarrow 11.43\ \mathrm{lbs}$
$11.43\ \mathrm{lbs} \,/\, 195\ \mathrm{lbs} = 5.9\ \%$

How much do you eat in a day, as a % of body weight?
• 2000 kcal/day for women not in training
• 2200 kcal/day for men not in training
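The three-step conversion procedure can be sketched as a small chain of factors in which each factor must cancel the current unit. This is only an illustration: the `convert` helper and its tuple layout are hypothetical, and the unit check is a plain string comparison rather than a real units library. The yard factor uses the exact definition 1 yard = 0.9144 m.

```python
def convert(value, unit, factors):
    """Apply rigid conversion factors in order, checking that units cancel.

    Each factor is (amount, new_unit, per_amount, old_unit), read as
    "amount new_unit per per_amount old_unit".
    """
    for amount, new_unit, per_amount, old_unit in factors:
        if old_unit != unit:
            raise ValueError(f"cannot cancel {unit!r} with {old_unit!r}")
        value = value * amount / per_amount
        unit = new_unit
    return value, unit


# Step 1: write the quantity (700 yards). Step 2: list factors. Step 3: calculate.
distance, distance_unit = convert(
    700, "yard",
    [(0.9144, "m", 1, "yard"),   # 1 yard = 0.9144 m by definition
     (1, "km", 1000, "m")],
)

# Daily intake from the slide: 12,000 kcal expressed as pounds of hamburger.
food, food_unit = convert(
    12000, "kcal",
    [(1, "hamburger", 670, "kcal"),
     (290, "g", 1, "hamburger"),
     (1, "kg", 1000, "g"),
     (2.2, "lb", 1, "kg")],
)
percent_body_weight = 100 * food / 195   # body weight of 195 lbs from the slide

print(f"{distance:.2f} {distance_unit}")
print(f"{food:.2f} {food_unit}, {percent_body_weight:.1f} % of body weight")
```

Replacing 12000 with 2000 or 2200 in the second call gives a starting point for the closing exercise, once your own body weight is substituted for 195.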
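The normalization formulas from earlier in the chapter (to a sum, to the mean, to a range, and the coefficient of variation) can be checked numerically with the Canary Islands counts. The sketch below uses only the Python standard library; the function names are illustrative. Two assumptions are made: the sample standard deviation (n − 1 denominator) is used, and normalization to the stdev is taken to mean the usual z-score, (Q − mean) / stdev, since the slides do not show the worked example.

```python
from statistics import mean, stdev

# Number of plant species on 7 Canary Islands (from the slide).
n_plant = [366, 348, 763, 1079, 539, 575, 391]


def normalize_to_sum(q):
    """Each value as a proportion of the total."""
    total = sum(q)
    return [x / total for x in q]


def deviations_from_mean(q):
    """Each value minus the mean; units are kept."""
    m = mean(q)
    return [x - m for x in q]


def normalize_to_range(q):
    """Ranging: (Q - Qmin) / (Qmax - Qmin), giving values from 0 to 1."""
    lo, hi = min(q), max(q)
    return [(x - lo) / (hi - lo) for x in q]


def normalize_to_stdev(q):
    """Assumed z-score form: (Q - mean) / sample stdev."""
    m, s = mean(q), stdev(q)
    return [(x - m) / s for x in q]


def coefficient_of_variation(q):
    """Unitless ratio stdev(Q) / mean(Q), using the sample stdev."""
    return stdev(q) / mean(q)


mendel = normalize_to_sum([705, 224])   # Mendel's counts from the slide
devs = [round(d) for d in deviations_from_mean(n_plant)]
ranged = normalize_to_range(n_plant)
z = normalize_to_stdev(n_plant)
cv = coefficient_of_variation(n_plant)

print(round(mendel[1], 2))   # proportion for the count of 224
print(round(mean(n_plant)))  # mean species per island
print(devs)                  # deviations, rounded to whole species
print(round(cv, 2))
```

Note that the deviations still carry units (species per island), whereas the sum, range, stdev, and CV forms are all unitless ratios.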