Name that tune. Song title? Performer(s)? | R.G. Bias

Descriptive Statistics
“Finding New Information”
3/23/2011
R.G. Bias | rbias@ischool.utexas.edu

Standard Deviation

σ = SQRT(Σ(X - µ)²/N)

(Does that give you a headache?)

• Statistics: The only science that enables different experts using the same figures to draw different conclusions.
  – Evan Esar (1899 - 1995), US humorist

• USA Today has come out with a new survey - apparently, three out of every four people make up 75% of the population.
  – David Letterman

The last 2 lectures . . .

• . . . we’ve been talking about the scientific method.
• When you conduct an experiment, at some point you’ll have some data.
• “Statistics” is the field of study that addresses how we deal with, manipulate, and interpret those data.

How to talk about a set of numbers

• We can list ’em.
  – Can get WAY unwieldy.
  – Plus hard to make any sense out of them.
• First step – put ’em in order.
• Second step –
  – Graph ’em, and/or
  – Calculate percentiles/deciles

Frequency Distributions, Histograms

• # of pets ever owned
  – 13 2 1 4 0 1 3 0 5 1
• Put ’em in order
  – 0 0 1 1 1 2 3 4 5 13

Freq Dist

• Raw Scores (in order)
  – 0 0 1 1 1 2 3 4 5 13

Raw Score   Freq   Cumu Freq
0           2      2
1           3      5
2           1      6
3           1      7
4           1      8
5           1      9
13          1      10

Histogram

[Histogram of # of pets: x-axis = raw score (0, 1, 2, 3, 4, 5, 13); y-axis = frequency (0 to 3)]

Percentiles

• LOCATION of 25th percentile:
  – X.25 = (N+1)(.25)
• LOCATION of 50th percentile:
  – X.50 = (N+1)(.50)
• LOCATION of 75th percentile:
  – X.75 = (N+1)(.75)
• Example: If we had 10 scores,
  – the 25th percentile would be the (11)(.25) = 2.75th score, or part way (three quarters of the way)
    between the 2nd and 3rd scores.
  – The 50th percentile would be the (11)(.50) = 5.5th score, or half way between the 5th and 6th scores.

Note . . .

• With an odd number of scores, the 50th percentile will be an actual score:
• Raw Scores (in order)
  – 0 0 1 1 1 2 3 4 5 13 100
• 50th percentile = (N+1)(.50) = (12)(.5) = 6th score = 2.

Earlier . . .

• We learned about frequency distributions.
• I asserted that a frequency distribution, and/or a histogram (a graphical representation of a frequency distribution), was a good way to summarize a collection of data.
• There’s another, even shorter-hand way.

Measures of Central Tendency

• Mode
  – Most frequent score (or scores – a distribution can have multiple modes)
• Median
  – “Middle score”
  – 50th percentile
• Mean - µ (“mu”)
  – “Arithmetic average”
  – ΣX/N

Let’s calculate some “averages”

Here’s a distribution of scores: 2, 2, 5

Measures of Central Tendency
• Mode?
• Median?
• Mean?

Let’s calculate some “averages”

Here’s a distribution of scores: 0, 0, 0, 1, 1, 10

Measures of Central Tendency
• Mode?
• Median?
• Mean?

A quiz about averages

1 – If one score in a distribution changes, will the mode change?
    __Yes __No __Maybe
2 – How about the median?
    __Yes __No __Maybe
3 – How about the mean?
    __Yes __No __Maybe
4 – True or false: In a normal distribution (bell curve), the mode, median, and mean are all the same?
    __True __False

More quiz questions about measures of central tendency

5 – (This one is tricky.) If the mode=mean=median, then the distribution is necessarily a bell curve?
    __True __False
6 – I have a distribution of 10 scores.
    There was an error, and really the highest score is 5 points HIGHER than previously thought.
    a) What does this do to the mode?
       __Increases it __Decreases it __Nothing __Can’t tell
    b) What does this do to the median?
       __Increases it __Decreases it __Nothing __Can’t tell
    c) What does this do to the mean?
       __Increases it __Decreases it __Nothing __Can’t tell
7 – Which of the following must be an actual score from the distribution?
    a) Mean
    b) Median
    c) Mode
    d) None of the above

OK, so which do we use?

• Means allow further arithmetic/statistical manipulation. But . . .
• It depends on:
  – The type of data
    • Can’t use means with nominal or ordinal scale data (more on this Monday)
    • With nominal data, must use mode
  – The distribution of your data
    • Tend to use medians with distributions bounded at one end but not the other (e.g., salary).
  – The question you want to answer
    • “Most popular score” vs. “middle score” vs. “middle of the see-saw”
    • “Statistics can tell us which measures are technically correct. It cannot tell us which are ‘meaningful’” (Tal, 2001, p. 52).

Mean – “see saw” (from Tal, 2001)

Have sidled up to SHAPES of distributions

• Symmetrical
• Skewed – positive and negative
• Flat

“Pulling up the mean”

Why . . .

• . . . isn’t a “measure of central tendency” all we need to characterize a distribution of scores/numbers/data/stuff?
• “The price for using measures of central tendency is loss of information” (Tal, 2001, p. 49).

Didja hear the one about . . .

• the Aggies who were on a march and came to a river? The Aggie captain asked the farmer how deep the river was.
• “Oh, it averages two feet deep.”
• All the Aggies drowned.

Note . . .
• We started with a bunch of specific scores.
• We put them in order.
• We drew their distribution.
• Now we can report their central tendency.
• So, we’ve moved AWAY from specifics, to a summary. But with Central Tendency, alone, we’ve ignored the specifics altogether.
  – Why isn’t a Measure of Central Tendency, alone, satisfactory?
  – Note MANY distributions could have a particular central tendency!
• If we went back to ALL the specifics, we’d be back at square one.

Measures of Dispersion (or Spread)

• Range
• Semi-interquartile range
• Standard deviation
  – σ (sigma)

Range

• Highest score minus the lowest score.
• Like the mode . . .
  – Easy to calculate
  – Potentially misleading
  – Doesn’t take EVERY score into account.
• What we need to do is calculate one number that will capture HOW spread out our numbers are from that measure of Central Tendency.
  – ’Cause MANY different distributions of scores can have the same central tendency!
  – “Standard Deviation”: σ = SQRT(Σ(X - µ)²/N)

Let’s do a short example

• What if I asked four undergraduates how many cars they’ve owned in their lives and I got the following answers:
    1 1 1 1
• There would be NO variance. σ = 0.
• But what if the answers were
    0 0 1 3
  What’s the mode? Median? Mean?
• Go with mean.
• So, how much do the actual scores deviate from the mean?

So . . .

• Add up all the deviations and we should have a feel for how dispersed, how spread, how deviant, our distribution is.
• Let’s calculate the Standard Deviation.
• As always, start inside the parentheses.
• Σ(X - µ)

Standard Deviation

Score (X)   Mean (µ)   X - µ
0           1          -1
0           1          -1
1           1          0
3           1          2
Total                  0 (damn)

Damn!

• OK, let’s try it on another set of numbers.

X
2
3
5
6

Damn!
(cont’d.)

• OK, let’s try it on another set of numbers.

X        X - µ
2        -2
3        -1
5        1
6        2
Σ = 16   Σ = 0
µ = 4

Hmm.

OK . . .

• . . . so mathematicians at this point do one of two things.
• Take the absolute value or square ’em.
• We square ’em. Σ(X - µ)²

X        X - µ    (X - µ)²
2        -2       4
3        -1       1
5        1        1
6        2        4
Σ = 16   Σ = 0    Σ = 10
µ = 4

Standard Deviation (cont’d.)

• Then take the average of the squared deviations. Σ(X - µ)²/N
  – Remember, dividing by N was the way we took the average of the original scores.
  – 10/4 = 2.5.
• But this number is so BIG!

OK . . .

• . . . take the square root (to make up for squaring the deviations earlier).
• σ = SQRT(Σ(X - µ)²/N)
• SQRT(2.5) = 1.58
• Now this doesn’t give you a headache, right?
• I said, “right?”

Hmmm . . .

Mode     Range
Median   ?????
Mean     Standard Deviation

We need . . .

• A measure of spread that is NOT sensitive to every little score, just as median is not.
• SIQR: Semi-interquartile range.
• (Q3 – Q1)/2

To summarize

Mode       Range     -Easy to calculate.
                     -May be misleading.
Median     SIQR      -Capture the center.
                     -Not influenced by extreme scores.
Mean (µ)   SD (σ)    -Take every score into account.
                     -Allow later manipulations.

Practice Problems

• I’ll send you some, tonight.

• http://highered.mcgrawhill.com/sites/0072494468/student_view0/statistics_primer.html
• Click on Statistics Primer.

References

• Hinton, P. R. Statistics explained.
• Shaughnessy, Zechmeister, and Zechmeister. Experimental methods in psychology.
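
Checking the arithmetic

The hand calculations in these slides can be double-checked with a few lines of Python. This is only a sketch under stated assumptions, not part of the original deck: the function names are made up for illustration, the percentile helper assumes that a fractional location such as the 2.75th score means interpolating linearly between the neighbouring scores, and it clamps locations that fall outside the data. The standard deviation divides by N, as in the slides.

```python
from collections import Counter
from math import sqrt


def modes(scores):
    """Return every score tied for the highest frequency."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(x for x, c in counts.items() if c == top)


def percentile(scores, p):
    """Score at location (N + 1) * p in the ordered list.

    Assumption: fractional locations are resolved by linear
    interpolation between the two neighbouring scores.
    """
    ordered = sorted(scores)
    loc = (len(ordered) + 1) * p           # 1-based, may be fractional
    loc = min(max(loc, 1), len(ordered))   # assumption: clamp to the data
    lower = int(loc)                       # whole part of the location
    frac = loc - lower
    if lower == len(ordered):
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


def mean(scores):
    """Arithmetic average: sum of X divided by N."""
    return sum(scores) / len(scores)


def std_dev(scores):
    """Population standard deviation: SQRT(sum((X - mu)^2) / N)."""
    mu = mean(scores)
    return sqrt(sum((x - mu) ** 2 for x in scores) / len(scores))


def siqr(scores):
    """Semi-interquartile range: (Q3 - Q1) / 2."""
    return (percentile(scores, 0.75) - percentile(scores, 0.25)) / 2


pets = [13, 2, 1, 4, 0, 1, 3, 0, 5, 1]   # "# of pets ever owned"
small = [2, 3, 5, 6]                      # the standard deviation example

print("mode(s):", modes(pets))
print("median:", percentile(pets, 0.50))
print("mean:", mean(pets))
print("range:", max(pets) - min(pets))
print("SIQR:", siqr(pets))
print("sigma for 2 3 5 6:", round(std_dev(small), 2))  # 1.58, as on the slide
```

Running it on the 11-score list from the “Note . . .” slide (0 0 1 1 1 2 3 4 5 13 100) gives a 50th percentile of 2, the 6th score, matching the hand calculation.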
