Name that tune. Song title? Performer(s)? | | R.G. Bias

Name that tune. Song title? Performer(s)?

R.G. Bias | rbias@ischool.utexas.edu

Descriptive Statistics
“Finding New Information”
4/5/2010

Standard Deviation
σ = SQRT(Σ(X - µ)²/N)
(Does that give you a headache?)

• USA Today has come out with a new survey - apparently, three out of every four people make up 75% of the population. – David Letterman

• Statistics: The only science that enables different experts using the same figures to draw different conclusions. – Evan Esar (1899 - 1995), US humorist

Scales
• The data we collect can be represented on one of FOUR types of scales:
  – Nominal
  – Ordinal
  – Interval
  – Ratio
• “Scale” in the sense that an individual score is placed at some point along a continuum.

Nominal Scale
• Describe something by giving it a name. (Name – Nominal. Get it?)
• Mutually exclusive categories.
• For example:
  – Gender: 1 = Female, 2 = Male
  – Marital status: 1 = single, 2 = married, 3 = divorced, 4 = widowed
  – Make of car: 1 = Ford, 2 = Chevy . . .
• The numbers are just names.

Ordinal Scale
• An ordered set of objects.
• But no implication about the relative SIZE of the steps.
• Example:
  – The 50 states in order of population:
    • 1 = California
    • 2 = Texas
    • 3 = New York
    • . . . 50 = Wyoming

Interval Scale
• Ordered, like an ordinal scale.
• Plus there are equal intervals between each pair of scores.
• With Interval data, we can calculate means (averages).
• However, the zero point is arbitrary.
• Examples:
  – Temperature in Fahrenheit or Centigrade.
  – IQ scores

Ratio Scale
• Interval scale, plus an absolute zero.
• Examples:
  – Distance, weight, height, time (but not years – e.g., the year 2002 isn’t “twice” 1001).
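To make the interval-versus-ratio distinction concrete, here is a minimal Python sketch. It is only an illustration: the helper names (c_to_f, inches_to_cm) and the sample values are assumptions chosen for the example, not something from the slides. It shows that equal steps survive a change of temperature units, but "twice as much" does not, whereas a ratio-scale measure such as length keeps its ratios.

```python
import math


def c_to_f(celsius: float) -> float:
    """Convert a Centigrade (Celsius) temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


def inches_to_cm(inches: float) -> float:
    """Convert a length in inches to centimeters."""
    return inches * 2.54


# Interval scale: equal steps stay equal after changing units...
diff_f_low = c_to_f(20) - c_to_f(10)    # the 10 -> 20 degree step, in Fahrenheit
diff_f_high = c_to_f(30) - c_to_f(20)   # the 20 -> 30 degree step, in Fahrenheit
steps_stay_equal = math.isclose(diff_f_low, diff_f_high)

# ...but "twice as hot" does not survive, because the zero point is arbitrary.
ratio_c = 20 / 10                       # looks like "twice" in Centigrade
ratio_f = c_to_f(20) / c_to_f(10)       # same two temperatures in Fahrenheit

# Ratio scale: zero means "none", so ratios survive a change of units.
ratio_in = 20 / 10
ratio_cm = inches_to_cm(20) / inches_to_cm(10)

print("Equal temperature steps preserved:", steps_stay_equal)
print("Temperature ratio, Centigrade vs. Fahrenheit:", ratio_c, round(ratio_f, 2))
print("Length ratio, inches vs. centimeters:", ratio_in, round(ratio_cm, 2))
```

The same reasoning is behind the "year 2002 isn't twice 1001" remark: calendar years have an arbitrary zero, so their ratios carry no meaning.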
Scales (cont’d.)
It’s possible to measure the same attribute on different scales. Say, for instance, your midterm test. I could:
• Give you a “1” if you don’t finish, and a “2” if you finish.
• “1” for highest grade in class, “2” for second highest grade, . . .
• “1” for first quarter of the class, “2” for second quarter of the class, . . .
• Raw test score (100, 99, . . .).
  – (NOTE: A score of 100 doesn’t mean the person “knows” twice as much as a person who scores 50; he/she just gets twice the score.)

Scales (cont’d.)
                     Nominal          Ordinal                   Interval             Ratio
Name                 yes              yes                       yes                  yes
Mutually exclusive   yes              yes                       yes                  yes
Ordered              no               yes                       yes                  yes
Equal interval       no               no                        yes                  yes
Absolute zero        no               no                        no                   yes
Examples             Gender, Yes/No   Class rank, Survey ans.   Days of wk., Temp.   Inches, Dollars

Earlier . . .
• We learned about frequency distributions.
• I asserted that a frequency distribution, and/or a histogram (a graphical representation of a frequency distribution), was a good way to summarize a collection of data.
• There’s another, even shorter-hand way.

Measures of Central Tendency
• Mode
  – Most frequent score (or scores – a distribution can have multiple modes)
• Median
  – “Middle score”
  – 50th percentile
• Mean - µ (“mu”)
  – “Arithmetic average”
  – ΣX/N

More quiz questions about measures of central tendency
4 – True or false: In a normal distribution (bell curve), the mode, median, and mean are all the same. __True __False
5 – (This one is tricky.) True or false: If the mode = mean = median, then the distribution is necessarily a bell curve. __True __False
6 – I have a distribution of 10 scores. There was an error, and really the highest score is 5 points HIGHER than previously thought.
  a) What does this do to the mode?
     __Increases it __Decreases it __Nothing __Can’t tell
  b) What does this do to the median?
     __Increases it __Decreases it __Nothing __Can’t tell
  c) What does this do to the mean?
     __Increases it __Decreases it __Nothing __Can’t tell
7 – Which of the following must be an actual score from the distribution?
  a) Mean
  b) Median
  c) Mode
  d) None of the above

OK, so which do we use?
• Means allow further arithmetic/statistical manipulation. But . . .
• It depends on:
  – The type of scale of your data
    • Can’t use means with nominal or ordinal scale data
    • With nominal data, must use mode
  – The distribution of your data
    • Tend to use medians with distributions bounded at one end but not the other (e.g., salary).
  – The question you want to answer
    • “Most popular score” vs. “middle score” vs. “middle of the see-saw”
    • “Statistics can tell us which measures are technically correct. It cannot tell us which are ‘meaningful’” (Tal, 2001, p. 52).

Mean – “see saw” (from Tal, 2001)

Have sidled up to SHAPES of distributions
• Symmetrical
• Skewed – positive and negative
• Flat

“Pulling up the mean”

Why . . .
• . . . isn’t a “measure of central tendency” all we need to characterize a distribution of scores/numbers/data/stuff?
• “The price for using measures of central tendency is loss of information” (Tal, 2001, p. 49).

Didja hear the one about . . .
• the Aggies who were on a march and came to a river? The Aggie captain asked the farmer how deep the river was.
• “Oh, it averages two feet deep.”
• All the Aggies drowned.

Note . . .
• We started with a bunch of specific scores.
• We put them in order.
• We drew their distribution.
• Now we can report their central tendency.
• So, we’ve moved AWAY from specifics, to a summary.
  But with Central Tendency, alone, we’ve ignored the specifics altogether.
  – Note MANY distributions could have a particular central tendency!
• If we went back to ALL the specifics, we’d be back at square one.

Measures of Dispersion
• Range
• Semi-interquartile range
• Standard deviation
  – σ (sigma)

Range
• Highest score minus the lowest score.
• Like the mode . . .
  – Easy to calculate
  – Potentially misleading
  – Doesn’t take EVERY score into account.
• What we need to do is calculate one number that will capture HOW spread out our numbers are from that measure of Central Tendency.
  – ’Cause MANY different distributions of scores can have the same central tendency!
  – “Standard Deviation”: σ = SQRT(Σ(X - µ)²/N)

Let’s do a short example
• What if I asked four undergraduates how many cars they’ve owned in their lives and I got the following answers:
    1 1 1 1
• There would be NO variance. σ = 0.
• But what if the answers were
    0 0 1 3
  What’s the mode? Median? Mean?
• Go with mean.
• So, how much do the actual scores deviate from the mean?

So . . .
• Add up all the deviations and we should have a feel for how dispersed, how spread, how deviant, our distribution is.
• Let’s calculate the Standard Deviation.
• As always, start inside the parentheses.
• Σ(X - µ)

Standard Deviation
Score (X)   Mean (µ)   X - µ
0           1          -1
0           1          -1
1           1           0
3           1           2
Total                   0 (damn)

Damn!
• OK, let’s try it on another set of numbers.
  X
  2
  3
  5
  6

Damn! (cont’d.)
• OK, let’s try it on another set of numbers.
X        X - µ
2        -2
3        -1
5         1
6         2
Σ = 16   Σ = 0
µ = 4
Hmm.

OK . . .
• . . . so mathematicians at this point do one of two things.
• Take the absolute value or square ’em.
• We square ’em.
  Σ(X - µ)²

X        X - µ    (X - µ)²
2        -2       4
3        -1       1
5         1       1
6         2       4
Σ = 16   Σ = 0    Σ = 10
µ = 4

Standard Deviation (cont’d.)
• Then take the average of the squared deviations. Σ(X - µ)²/N
  – Remember, dividing by N was the way we took the average of the original scores.
  – 10/4 = 2.5.
• But this number is so BIG!

OK . . .
• . . . take the square root (to make up for squaring the deviations earlier).
• σ = SQRT(Σ(X - µ)²/N)
• SQRT(2.5) = 1.58
• Now this doesn’t give you a headache, right?
• I said “right”?

Hmmm . . .
Mode     Range
Median   ?????
Mean     Standard Deviation

We need . . .
• A measure of spread that is NOT sensitive to every little score, just as median is not.
• SIQR: Semi-interquartile range.
• (Q3 – Q1)/2

To summarize
Mode       Range    - Easy to calculate.
                    - May be misleading.
Median     SIQR     - Capture the center.
                    - Not influenced by extreme scores.
Mean (µ)   SD (σ)   - Take every score into account.
                    - Allow later manipulations.

Practice Problems
• I’ll send you some, tonight.

• http://highered.mcgrawhill.com/sites/0072494468/student_view0/statistics_primer.html
• Click on Statistics Primer.

References
• Hinton, P. R. Statistics explained.
• Shaughnessy, Zechmeister, and Zechmeister. Experimental methods in psychology.
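For anyone who wants to check the standard deviation walkthrough from earlier in the deck, here is a minimal Python sketch. It assumes Python 3 and only the standard library; the variable names are illustrative and not from the slides. It retraces the steps on the scores 2, 3, 5, 6 (mean, deviations, squared deviations, average, square root) and also answers the mode/median/mean question for the car-ownership answers 0, 0, 1, 3.

```python
import math
import statistics

# Car-ownership answers from the short example: mode, median, mean.
cars = [0, 0, 1, 3]
cars_mode = statistics.mode(cars)      # most frequent score
cars_median = statistics.median(cars)  # middle score (average of the two middle values)
cars_mean = statistics.mean(cars)      # arithmetic average, sum(X)/N

# Standard deviation, step by step, on the second set of scores.
scores = [2, 3, 5, 6]
n = len(scores)
mu = sum(scores) / n                        # mean of the scores
deviations = [x - mu for x in scores]       # X - mu for each score
sum_of_deviations = sum(deviations)         # always 0, which is why we square
squared = [d ** 2 for d in deviations]      # (X - mu) squared
variance = sum(squared) / n                 # average squared deviation
sigma = math.sqrt(variance)                 # undo the squaring

print("mode, median, mean of cars:", cars_mode, cars_median, cars_mean)
print("mean of scores:", mu)
print("sum of deviations:", sum_of_deviations)
print("sum of squared deviations:", sum(squared))
print("variance:", variance)
print("standard deviation:", round(sigma, 2))

# Cross-check against the library's population standard deviation.
assert math.isclose(sigma, statistics.pstdev(scores))
```

Note that this divides by N, as the slides do, which matches statistics.pstdev (the population formula); statistics.stdev divides by N - 1 instead and would give a different number.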
