Statistics Refresher
Research Methods PSYC362

Three Stages of Data Analysis
1) Getting to know the data
2) Summarizing the data
3) Confirming what the data reveal

Three Stages of Data Analysis
1) Getting to know the data
• Exploratory or investigative stage
    • What is going on with the data?
    • Are there errors in the data?
    • Does the data require transformation?
• Frequency distribution of data
    • Histogram, frequency polygon, etc.
• Stem-and-Leaf displays

Frequency Distribution
• Frequency distribution
    • An organized tabulation of the number of individuals located in each category on the scale of measurement
    • Presented as either tables or a graph
• Two elements of Frequency Distributions
    • The set of categories that make up the original measurement scale.
    • A record of the number of individuals in each category.

Frequency Distribution Tables
Scores: 8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8

X     f
10    2
9     5
8     7
7     3
6     2
5     0
4     1

Proportions and Percentages
• Proportion/relative frequency: $p = f / N$
• Percentage: $p(100) = (f / N)(100)$

Proportions & Percentages
X     f     p      %
10    2     .10    10%
9     5     .25    25%
8     7     .35    35%
7     3     .15    15%
6     2     .10    10%
5     0     0      0%
4     1     .05    5%

Frequency Distribution Graphs
• Histograms
    • Vertical bars above each score
    • Height of bar corresponds to frequency
    • Width extends to real limits of the score
• Bar graphs
    • Vertical bars above each score with space between each bar
    • Designates separate distinct categories
• Frequency Distribution Polygon (line graph)
    • A dot is centered above the score with height corresponding to frequency
    • Dots are connected with a continuous line

[Figure: Histogram (vertical axis: Frequency)]
[Figure: Bar graph (vertical axis: Frequency)]
[Figure: Line graph, frequency distribution polygon (vertical axis: Frequency)]

Shape of a distribution
• Symmetrical distribution
    • Line at midpoint will give identical halves
• Skewed distribution
    • Scores pile up at one end and taper off at the other
    • Positively skewed: tail points toward the positive end of the scale
    • Negatively skewed: tail points toward the negative end of the scale

[Figure: Symmetrical Distribution (vertical axis: Frequency)]
[Figure: Negative Skew (vertical axis: Frequency)]
[Figure: Positive Skew (vertical axis: Frequency)]

Stem and Leaf Display: An alternative to traditional frequency distribution
• Lowest scores appear at the top.
• Presents the exact value of each score
• Shows the shape of the distribution (viewed from the side)
• One of the techniques in Exploratory Data Analysis

6|12
6|7889
7|000223
7|5677888899
8|00112222344
8|5566666667788999
9|01
9|6

Three Stages of Data Analysis
2) Summarizing the data
Measures of Central Tendency:
• Mean: the sum of the scores divided by the number of scores contributing to the sum
• Median: the middle point of a frequency distribution, determined by ranking all of the scores from lowest to highest
• Mode: the most frequently occurring score

Dispersion or Variability:
• Range: the lowest score in the distribution subtracted from the highest
• Standard Deviation: the square root of the average squared deviation of scores about the mean. Tells roughly how far, on average, a score is from the mean
• Standard Error of the Mean: population standard deviation divided by the square root of the sample size
• Estimated Standard Error of the Mean: sample standard deviation divided by the square root of the sample size

STANDARD DEVIATION
Raw Scores Method (the method we will use for class)
• First calculate the Sum of Squares:
    $SS = \sum X^2 - \frac{(\sum X)^2}{N}$
• Then calculate the standard deviation:
    $s = \sqrt{\frac{SS}{N - 1}}$

Sampling Distribution of the Mean
Central Limit Theorem
• The standard deviation of the sampling distribution of the mean is equal to the standard deviation of the raw score population divided by the square root of n:
    $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
    where n = the size of the sample
• Thus, the sampling distribution of means is narrower than the population distribution
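
Worked example (a minimal sketch, not part of the original slides): the Python below applies the frequency-table, raw-scores Sum of Squares, standard deviation, and estimated standard error definitions to the 20 scores from the frequency distribution table example. The function names are hypothetical, chosen only for illustration, and the frequency table helper assumes integer scores.

```python
import math
from collections import Counter

# The 20 scores from the frequency distribution table example.
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8]


def frequency_table(xs):
    """Return (X, f, p, %) rows from the highest score to the lowest.

    Assumes integer scores, so every category between the minimum and
    maximum is listed, including categories with f = 0.
    """
    counts = Counter(xs)
    n = len(xs)
    rows = []
    for x in range(max(xs), min(xs) - 1, -1):
        f = counts.get(x, 0)
        p = f / n                      # proportion: p = f / N
        rows.append((x, f, p, p * 100))
    return rows


def sum_of_squares(xs):
    """Raw-scores method: SS = sum(X^2) - (sum(X))^2 / N."""
    n = len(xs)
    return sum(x * x for x in xs) - sum(xs) ** 2 / n


def sample_sd(xs):
    """Sample standard deviation: s = sqrt(SS / (N - 1))."""
    return math.sqrt(sum_of_squares(xs) / (len(xs) - 1))


def estimated_standard_error(xs):
    """Estimated standard error of the mean: s / sqrt(N)."""
    return sample_sd(xs) / math.sqrt(len(xs))


if __name__ == "__main__":
    for x, f, p, pct in frequency_table(scores):
        print(f"{x:>2}  {f:>2}  {p:.2f}  {pct:.0f}%")
    print("SS =", round(sum_of_squares(scores), 2))
    print("s  =", round(sample_sd(scores), 2))
    print("SE =", round(estimated_standard_error(scores), 2))
```

For these scores, ΣX = 158 and ΣX² = 1288, so SS = 1288 − 158²/20 = 39.8, s is about 1.45, and the estimated standard error is about 0.32.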