DESCRIPTIVE STATISTICS
BORAM KANG

STATISTICS
The only science that enables different experts using the same figures to draw different conclusions.
Evan Esar (1899 - 1995), US humorist

STATISTICS (SINGULAR) AND STATISTICS (PLURAL)
• Statistics (singular): a way of reasoning, along with a collection of tools and methods, designed to help us understand the world.
• Statistics (plural): calculations made from data.

DATA
Values along with their context.
The data we collect can be represented on one of FOUR types of scales:
• Nominal
• Ordinal
• Interval
• Ratio

NOMINAL SCALE
Numbers are names: describe something by giving it a name.
Example question: Female or male?
Gender: 1 = Female, 2 = Male

ORDINAL SCALE
An ordered set of objects, but with no implication about the relative SIZE of the steps.
Example question: How do you feel about this?
Good / Medium / Bad

INTERVAL SCALE
Ordered, like an ordinal scale, plus there are equal intervals between each pair of adjacent scores.
Example: Temperature (Fahrenheit)
• Addition and subtraction are meaningful (+ -): 100° is 10° warmer than 90°.
• Multiplication and division are not (* /): 100° is not twice as warm as 50°.

RATIO SCALE
An interval scale, plus an absolute zero.
Examples:
• Distance, weight, height, time (but not calendar years: e.g., the year 2002 isn't "twice" the year 1001).
Example question: Age. 30 years old is twice 15 years old.

MEASURES OF CENTRAL TENDENCY
• Mode: most frequent score (or scores; a distribution can have multiple modes)
• Median: "middle score", the 50th percentile
• Mean, µ ("mu"): "arithmetic average", ΣX/N
• Range: Max - Min (strictly a measure of spread, not of center)

So, how much do the actual scores deviate from the mean? We need the Standard Deviation.
Simply adding up all the deviations from the mean always gives zero, because the positive and negative deviations cancel. Instead, square each deviation, average the squares, and take the square root: σ = √(Σ(X - µ)² / N). The result gives a feel for how dispersed, how spread out, how deviant our distribution is. That's the Standard Deviation.

MEAN: "SEE SAW" (FROM TAL, 2001)

SHAPES OF DISTRIBUTIONS

EXAMPLE
I asked 12 people how many cars they had owned in their lives. Here are the answers I got:
10 1 4 8 8 2 3 6 3 5 9 7
Sorted: <1,2,3,3,4,5,6,7,8,8,9,10>
[Histogram: frequency (y-axis) against # of cars ever owned (x-axis)]

What is the value of "N"?
12

What's the mode of this distribution?
3 and 8

Median?
5.5

Mean?
5.5

Range?
10 - 1 = 9

IQR?
<1,2,3,3,4,5,6,7,8,8,9,10>
8 - 3 = 5

Standard Deviation?
2.81

SO WHICH DO WE USE?
It depends on:
• The type of scale of your data
  Can't use means with nominal or ordinal scale data (female/male, good/medium/bad).
  With nominal data, must use the mode.
• The distribution of your data
  Tend to use medians with distributions bounded at one end but not the other (e.g., salary).
• The question you want to answer
  "Most popular score" vs. "middle score" vs. "middle of the see-saw"

"Statistics can tell us which measures are technically correct. It cannot tell us which are 'meaningful'" (Tal, 2001, p. 52).

CITATION
http://highered.mcgrawhill.com/sites/0072494468/student_view0/statistics_primer.html
http://easycalculation.com/statistics/standarddeviation.php
http://www.uoguelph.ca/htm/MJResearch/ResearchProcess/IntervalScale.htm
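
CHECKING THE EXAMPLE
The answers in the cars-owned example can be recomputed with a few lines of Python. This is only a minimal sketch using the standard library; the variable names (cars, value_range, and so on) are illustrative and not part of the slides. Two assumptions are made so that the results line up with the slide answers: the IQR is taken as the median of the upper half minus the median of the lower half (other quartile conventions give slightly different values), and the standard deviation is the population version (divide by N), which matches the µ and N notation used for the mean. The sample version (divide by N - 1) would come out nearer 2.94.

```python
import statistics

# Number of cars ever owned, as reported by the 12 respondents in the example.
cars = [10, 1, 4, 8, 8, 2, 3, 6, 3, 5, 9, 7]

n = len(cars)

# multimode returns every most-frequent value, so ties are not hidden.
modes = sorted(statistics.multimode(cars))

median = statistics.median(cars)
mean = statistics.mean(cars)
value_range = max(cars) - min(cars)

# IQR, assuming the "median of each half" convention for the quartiles.
ordered = sorted(cars)
half = n // 2
lower_half = ordered[:half]
upper_half = ordered[n - half:]
iqr = statistics.median(upper_half) - statistics.median(lower_half)

# Population standard deviation: sqrt(sum((x - mean)^2) / N).
sd = statistics.pstdev(cars)

print("N      =", n)
print("Modes  =", modes)
print("Median =", median)
print("Mean   =", mean)
print("Range  =", value_range)
print("IQR    =", iqr)
print("SD     =", round(sd, 2))
```

Swapping statistics.pstdev for statistics.stdev gives the sample standard deviation instead, which is the usual choice when the 12 people are treated as a sample from a larger population.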