Index Cards
1. Name
2. Major
3. Favorite Class Ever, & Why
4. Areas of Interest in Psychology
5. Unique/Bizarre/Little-Known fact about you
6. Most exciting event over vacation
7. Favorite TV show ever
8. Stupidest thing you’ve ever done

Small Group Questions
Name, where you’re from
Best class ever & why
Stupidest thing you’ve ever done
Bizarre facts/tricks you can do

Pop Quiz #1
1. Your instructor is from…
a. Nevada
b. New York
c. Nebraska
d. Minnesota
e. East-central Tibet

2. Your instructor has taught statistics
a. Never
b. About 10 times
c. About 20 times
d. About 30 times
e. About 40 times
f. Way, way too many times

3. Your instructor was once bitten by a …
a. Rattle Snake
b. Polar Bear
c. South American Malting Meek Mouse
d. Snapping Turtle
e. An oversized freshman
f. His wife, after refusing to mow the lawn

4. Your instructor can’t get enough…
a. Chocolate
b. Schlitz Malt Liquor
c. Diet Pepsi
d. Diet Coke
e. Diet Schlitz Malt Liquor
f. Prune Juice
g. Red Bull

5. Your instructor’s 2nd favorite TV show is …
a. Married with Children
b. The Simpsons
c. Survivor #5: Downtown Rockhill
d. The Daily Show
e. Space Ghost
f. Seinfeld
g. NOVA – Deadly Snapping Turtles

Stats Basics: 1st Week Overview
Course Tips
Types of Data
Graphing Distributions
The Normal Curve
Graphing Sample Means
Practicing with SPSS

Secret Course Tips
Bulldog Tactics
Syllabus
Office hours
Engagement & Attendance
Quizzes
Request for leniency
Notebook
Course Packs
Organization
Homework, Labs, & Reading
Class time
Set-up first
Please avoid surfing
Note-taking
  • Write & Process
Ask questions!
Slow me down!
Homework
** Studying **
  • Often
  • Active
  • Self-Explanation
Practicing SPSS
Laugh at my jokes!!

Make Friends Quickly!!
Option A: Solo
  Every Penguin For Herself!
  Keep the competition down.
Option B: Teamwork!!!
  Ask questions of peers
  Answer questions
  Form study groups
  Practice explaining

Terminology: Samples vs.
Populations
Samples & Populations
Statistics: refer to characteristics of samples
  • e.g., xbar or M
  • always regular alphabet symbols
Parameters: refer to characteristics of populations
  • e.g., μ
  • always Greek symbols
Self-check:
  height of several students in class to represent class
  height of class to represent height of typical undergraduates

Qualitative vs. Quantitative Data
Quantitative: can be ranked
  • shoe size, height, self-esteem score on scale, airplane lift
Qualitative: can’t be ranked
  • gender, political affiliation, major, car maker
Check:
  Gender
  region
  weight
  depression
  steps
  Social Security Number
  Letter Grade: A, B, C, D

Scales of measurement
Nominal – classify data into categories (religion)
Ordinal – classify and rank (Olympic Medals)
Interval – classify and rank with equal intervals (Celsius)
Ratio – classify, rank with equal intervals, true zero (Kelvin)
Practice:
  your residence hall
  batting average
  your rank on mom’s love list
  height
  IQ
  weight
  Self-esteem (7 point Likert Scale)
  SAT score
  Grade: A, B, C, D
  distance
  gender
  GPA
  number of close friends
  social security number
  region of country
  level of depression

Experimental terms
Empirical Method: Experimental Method
Question: Why do airplanes fly?
Theory: Wings create lift
Operational Definitions
  IV: Wing position (straight, bent up) – the ‘levels’
  DV: Lift
Gathering data
  Careful observation; quantification
  Level of measurement – use highest possible
Controlling Extraneous Variables
Drawing Conclusions

Experimental terms (2)
Experimental Terminology
Independent Variable (e.g., Wing Position)
  • Variable you manipulate
  • Variable you think will impact the DV
Dependent Variable (e.g., Change in Vertical Position)
  • Variable that might be affected by the IV
  • Variable you measure
Extraneous Variable (e.g., drafts, throwing style)
  • Any factor that affects the DV other than the IV
  • Sources of “error” – we want to STANDARDIZE conditions to minimize the amount of error
Quasi-Experimental Design
  No manipulation of IV

Experimental terms (3)
Practice
Can fat people eat more bacon than skinny people?
Does B.O. significantly decrease attractiveness?
Do kids who get “hooked on phonics” have more problems with addiction later in life?
Do people who study more do better on tests?

Frequency Distributions
Definitions
  The values taken on by a given variable
  All the actual data points you obtained for a given variable
  Most basic way to look at study outcomes
Quantitative Examples:
  • The SAT scores for all Winthrop students
  • The reaction times for all study participants
  • Grades on the first test: #’s of As, Bs, Cs, & Ds
  • The starting salaries of graduates
Qualitative Examples:
  • Favorite TV shows of students in this class
  • Residence halls occupied by students in this class

Grade    Freq
A        4
B        7
C        3
D        2
[Figure: bar graph “Test 1” showing the frequency of each grade, A to D]

Representing Frequency Distributions
Table: List possible values, and indicate the number of times each value occurred.
Graphs
  X-axis: possible values
  Y-axis: # of times that value occurred

R. Time  Freq.
0-10     4
11-20    8
21-30    12
31-40    6
41-50    4
51-60    3
61-70    1
[Figure: graph “Reac Time” showing the frequency of each reaction-time interval, 0-10 to 61-70]

Graphing Distributions
Quantitative Data
  Line graphs or Histograms (columns touching)
Qualitative Data
  Pie charts & Bar graphs (columns not touching)
See SPSS Guide for examples
Also, you can practice with these datasets on the website…
  city sprawl
  bogus winthrop data
  employee data

A Graph of the Normal Curve
Hypothetical Frequency Distribution (Line Graph)
Shows distribution of infinitely large sample (theoretical)
Symmetrical
Shows common and uncommon (extreme) scores
Basis for testing hypotheses
Percentiles
[Figure: normal curve of SAT Scores, μ = 500]

Normal Curve (with raw and standard scores)
[Figure: normal curve centered on μ, with “Few Extreme Scores” in each tail]
SAT                   200    300    400    500    600    700    800
Female Height         4’4”   4’8”   5’0”   5’4”   5’8”   6’0”   6’4”
Anxiety               20     30     40     50     60     70     80
Stand. Normal Curve   -3     -2     -1     0      +1     +2     +3

Deviations from Normality
Ways in which distribution can be non-normal
Skew
  Positive Skew
  Negative Skew
Kurtosis
  Platykurtic
  Mesokurtic
  Leptokurtic
Modality
  Unimodal
  Bimodal (etc.)

Graphing Sample Means
One IV: Typically use bar-graph
Two IV: Typically use line-graph
[Figure: bar graph of Damage ($0 to $400) for Rock, Anvil, Tomato]

Math Review
Preparation for Calculating Standard Deviation
Learn the differences between…
  Σx
  Σx²
  (Σx)²

Problem #1
x    x²
2    4
3    9
2    4
“Sum of x squared” ??
“Sum of x-quantity squared” ??
Σx = ??    Σx² = ??    (Σx)² = ??

Problem #1 Answer
x    x²
2    4
3    9
2    4
“Sum of x squared”: Σx²
“Sum of x-quantity squared”: (Σx)²
Σx = 7    Σx² = 17    (Σx)² = 49

Problem #2
x    x²
1    ??
2    ??
2    ??
\[ \hat{s}_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} \]
Σx = ??    Σx² = ??    (Σx)² = ??
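The three sums from the Math Review are easy to mix up, so here is a minimal Python sketch of the difference, using the scores from Problem #1 (2, 3, 2). The variable names are illustrative only and are not part of the course materials.

```python
# Scores from Problem #1 in the Math Review
x = [2, 3, 2]

# Σx: add up the raw scores
sum_x = sum(x)

# Σx² ("sum of x squared"): square each score first, then add
sum_x_squared = sum(v ** 2 for v in x)

# (Σx)² ("sum of x, quantity squared"): add the scores first, then square the total
sum_x_quantity_squared = sum(x) ** 2

print(sum_x, sum_x_squared, sum_x_quantity_squared)  # 7 17 49
```

The only difference between the last two is the order of the operations: square then add, or add then square.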
Problem #2: Answer-a
x    x²
1    1
2    4
2    4
Σx = 5    Σx² = 9    (Σx)² = 25
\[ \hat{s}_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{9 - \frac{25}{3}}{3 - 1}} \]

Problem #2: Answer-b
x    x²
1    1
2    4
2    4
Σx = 5    Σx² = 9    (Σx)² = 25
\[ \hat{s}_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{9 - \frac{25}{3}}{3 - 1}} = \sqrt{\frac{.6666}{2}} = \sqrt{.3333} = .5774 \]

Problem #3
x
2
3
5
\[ \hat{s}_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} \]

Problem #3: Answer
x    x²
2    4
3    9
5    25
Σx = 10    Σx² = 38    (Σx)² = 100
\[ \hat{s}_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{38 - \frac{100}{3}}{3 - 1}} = \sqrt{\frac{4.6666}{3 - 1}} = \sqrt{2.3333} = 1.5275 \]

Descriptive Statistics
Measures of Central Tendency
  Where does the center of the distribution fall?
  Where are most of the scores?
Measures of Variability
  How spread out is the distribution?
  How dispersed are the scores?
Importance: To determine whether IV affects DV, we consider:
  • The difference between the means
  • The amount of variability

Imaginary Study with 2 Outcomes
Purpose: See why variability is important
Research Question: Imagine a business where customers are routinely offended:
  • comments about their mothers
  • misc. name calling
Does social skills training for clerks improve customer satisfaction scores?
IV: Social Skills training (training, no training)
DV: Customer Satisfaction
Imagine two worlds where we get two different outcomes

Training Study Outcomes
Version 1
  Control:       2 5 4 3 2 2    M = 3    SD = 1.26
  Experimental:  5 4 4 2 4 4    M = 4    SD = 1.10
Version 2
  Control:       1 6 4 4 1 2    M = 3    SD = 2.10
  Experimental:  1 6 7 4 3 3    M = 4    SD = 2.19

Measures of Central Tendency
Data: # of close friends: 1 2 3 3 3 4 4 5 5 6 12
Note: Use “frequencies” in SPSS
Mean
  arithmetic mean: sum of all scores divided by “n”
  Sample: xbar or M
  Population: μ (“mu”)
  most arithmetically sophisticated
  best predictor if no other info available
  used in deviation score calculation
  M = 4.36
Median
  Score at 50th percentile – middle score
  less influenced by skew
  Md = 4
Mode
  most frequent score
  used with qualitative data
  Mo = 3

Choosing Measures of Central Tendency
Data: # of close friends
A:  2 3 4 2 3 4 5 3 2 3 1
B:  2 3 4 2 1 4 6 2 27 3 5
C:  3 4 3 1 2 12 15 12 14 16 10
What’s best for A?
What’s best for B?
What’s best for C?

SPSS – Setting up Frequencies Analysis

SPSS Frequencies Output (partial)
Note: Need to select mean, median, & mode
Statistics
                 A        B        C
N    Valid       11       11       11
     Missing     0        0        0
Mean             2.91     5.36     8.36
Median           3.00     3.00     10.00
Mode             3        2        3a
a. Multiple modes exist. The smallest value is shown

Measures of Variability
What is Variability?
  dispersion; spread; distance between scores
  “Some people did really well, some did really poorly”
  “My tips are always about the same, between $30 and $35”
  “Some students study only a few minutes a day, some put in 30 hours per week.”
Range
  simplest measure
  High Score – Low Score
  Problems:
    only uses two scores – not good for summarizing the entire distribution
    unduly affected by extreme scores

The Big Daddy: Standard Deviation
Standard Deviation
  The typical deviation of a score from the mean of the distribution
  Most scores (68%) fall between +1 and –1 SD.
Four Steps to Standard Deviation
  1. Deviation Score
  2. Sum of Squares
  3. Variance
  4. Standard Deviation
\[ \hat{s}_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} \]

1. Deviation Scores
idea: consider deviation of every score and add up
distance from mean of a given score: x – xbar
positive/negative deviation scores fall to the ____ of the mean
problem: why can’t we just add up the deviation scores?
consider distribution of: 1, 2, 3
x    – xbar    deviation
1    – 2       -1
2    – 2       0
3    – 2       +1
               Sum: 0

2. Sum of Squares (SS)
Means “Sum of the Squared Deviation Scores”
Square each deviation score, then add up
Conceptual Formula (how we think about it)
\[ SS = \sum (x - \bar{x})^2 \]
Computational Formula (how we calculate by hand)
\[ SS = \sum x^2 - \frac{(\sum x)^2}{n} \]
  Σx² is the “Sum of x Squared”; (Σx)² is the “Sum of x Quantity Squared”
• Problem
  – Biased by sample size – bigger samples have bigger SS

3. Variance
Sum of Squares – no control for size of sample
Think of relation between sum and average – divide sum by n
… with sum of sq. and variance – divide sum of sq. by n
Variance: Average of the Squared Deviation Scores
\[ \sigma_x^2 = \frac{SS}{n} = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n} \]

4. Standard Deviation
Want measure in metric of raw scores
Remember??
We used Sum of the SQUARED Deviations
So…we take the square root of the variance
Note, subscript “x” is optional
\[ \sigma_x = \sqrt{\frac{SS}{n}} = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n}} \]
note that σ is no longer squared

SD: Bridge Building Example
How high should the bridge be?
Truck Height: 7, 6, 8, 5, 6, 5, 6, 7
average: 6.25
Can we build it 6.25?
• Calculation Tip:
  – Think anal retentive!!

SD: Bridge Building II
\[ \sigma_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n}} = \sqrt{\frac{320 - \frac{2500}{8}}{8}} = \sqrt{\frac{7.5000}{8}} = \sqrt{.9375} = .9682 \]
So we’d expect the truck height to range between about 6.25 ± .9682
Roughly 5.25 to 7.25.
But… What if we missed some extremely tall trucks???
Should actually calculate ŝ – Standard Deviation as a population estimate

SD: Typical Formula
Standard Deviation as a Population Parameter
SD as a Population Parameter Estimate
  corrects for bias of smaller samples – missing of extreme scores
\[ \hat{s}_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} \]

SD: Different Forms

SD: Bridge Building Revisited
\[ \hat{s}_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{320 - \frac{2500}{8}}{8 - 1}} = \sqrt{\frac{7.5000}{8 - 1}} = \sqrt{1.0714} = 1.0351 \]
So…
  σ = 0.9682
  ŝ = 1.0351
SD calculated as estimate will always be larger.

What type of Standard Deviation?
A manager wants to know the variability in shift productivity for planning future projects.
A teacher calculates the variability of reading scores for just her class of 25 students, and only applies it to her sample.
The Educational Testing Service calculates the variability among SAT scores for all the students that took the SAT.
A researcher determines the variability in reaction time in a perception study.
Your statistics professor calculates test score variability with 25 students to know how much variability to expect on that sort of test.
A researcher on anxiety collects data from 1000 participants in order to develop norms for a new anxiety instrument.
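To see the two forms of standard deviation side by side, here is a minimal Python sketch that reruns the bridge example with the computational formula. The helper name `sum_of_squares` is an illustrative assumption, not something from the course packs; the cross-check uses `pstdev` and `stdev` from Python's standard `statistics` module, which divide by n and n – 1 respectively.

```python
import math
import statistics

# Truck heights from the bridge building example
heights = [7, 6, 8, 5, 6, 5, 6, 7]


def sum_of_squares(scores):
    """Computational formula: SS = Σx² - (Σx)² / n (hypothetical helper)."""
    n = len(scores)
    return sum(v ** 2 for v in scores) - sum(scores) ** 2 / n


n = len(heights)
ss = sum_of_squares(heights)      # 320 - 2500/8 = 7.5

sigma = math.sqrt(ss / n)         # divide by n: describes only these scores
s_hat = math.sqrt(ss / (n - 1))   # divide by n - 1: estimate for the population

print(round(sigma, 4), round(s_hat, 4))  # 0.9682 1.0351

# Cross-check against the standard library
assert math.isclose(sigma, statistics.pstdev(heights))
assert math.isclose(s_hat, statistics.stdev(heights))
```

Because the estimate divides the same SS by a smaller number, ŝ comes out larger than σ for the same scores.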
Practice
Problem: Calculate σ for 4, 2, 3
\[ \sigma_x = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n}} = \sqrt{\frac{29 - \frac{81}{3}}{3}} = \sqrt{\frac{2}{3}} = \sqrt{.6667} = .8165 \]

$$ Practice Calculations I

$$ Practice Calculations II

Confidence Intervals
Combines mean with standard deviation
68% CI = M ± 1SD
We can be 68% certain that a given score will fall between one SD below the mean and one SD above.
Example: Bob took the history test after the rest of the class. The class scored 70 on average (μ = 70) with a standard deviation of 10 (σ). What score do you expect Bob to get?
  68% CI = M ± 1SD
  68% CI = 70 ± 10
  68% CI = 60, 80
That is, we’d expect Bob to get between 60 and 80. We’ll be right about 68% of the time.

Error Bars Graph in SPSS
Shows mean ± 1 SD.
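The mean ± 1 SD range used for the 68% CI and the error bars can be sketched in a few lines of Python. The function name `ci68` is a hypothetical label for illustration; it simply applies the slide's rule 68% CI = M ± 1SD to Bob's history test.

```python
def ci68(mean, sd):
    """Return (lower, upper) for the slide's rule: 68% CI = M ± 1 SD."""
    return (mean - sd, mean + sd)


# Bob's history test: class mean of 70, standard deviation of 10
low, high = ci68(70, 10)
print(low, high)  # 60 80
```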