Shavelson – Descriptive Statistics: Variability, Range, Variance, SD

Shavelson Chapter 5 S5-1. Define, and be able to create and recognize graphic representations of, a normal distribution (115-121).
Normal distribution: provides a good model of the relative frequency distributions found in behavioral research.

Shavelson Chapter 5 S5-2. Know the four properties of the normal distribution (120-121).
1. Unimodal: the greater the distance a score lies from the mean, the lower the frequency of that score.
2. Symmetrical.
3. Mean, mode, and median are all the same.
4. Asymptotic: the curve never touches the abscissa.
Note that the mean and variance can differ, thus “a family of normal distributions.”

Shavelson Chapter 5 S5-3. You should know what is meant by the phrase “a family of normal distributions” (121,3). I will also cover in class the general issues of “distributions,” which are frequently used in statistical analyses.
From: http://www.gifted.uconn.edu/siegle/research/Normal/instructornotes.html

Clicker question (figure of two distributions not reproduced):
1. These distributions have the same mode, different median and SD
2. These distributions have different mode, same median, different SD
3. These distributions have different means, modes and variances
4. These distributions have the same mode, mean and median, but different SDs

Clicker question (figure of two distributions not reproduced):
1. These distributions have the same mean, but different SDs
2. These distributions have different means and medians, but the same modes and SDs
3. These distributions have different means, modes, and
4. Nothing is the same with these two!

Shavelson Chapter 5 S5-4. Know the areas under the curve of a normal distribution (roughly, e.g.
34.13%, 13.59%, 2.14%, and 0.13% on either side of the mean)
From: http://www.gifted.uconn.edu/siegle/research/Normal/instructornotes.html

Shavelson Chapter 5 S5-5a. What is a standard score (z-score) (123,3)? Be able to calculate the z-score, given a raw score, mean, and standard deviation.
Z score = (X - mean) / S
X = raw score
Mean = mean of the distribution
S = standard deviation
Notice that to calculate the Z score you need the mean and S of a distribution of scores.

Shavelson Chapter 5 S5-5b. What two bits of information does the z-score provide us (125, 1-2)?
Z scores provide the following information:
1. The size of the Z score indicates the number of standard deviations the raw score is from the mean.
2. The sign (+ or -) indicates whether the raw score is above the mean (+) or below the mean (-).

Clicker question: A z score of -1.8 means…
1. The mean of the distribution is 1.8
2. The distribution is skewed
3. The raw score lies 1.8 means above the mean
4. The raw score lies 1.8 standard deviations below the mean
5. The raw score lies 1.8 standard deviations above the mean

Clicker question: Mean = 10, X = 18, S = 4. What is the Z score?
Answer choices: 4; -4; 2; Not listed; -2

Shavelson Chapter 5 S5-6. Know what a standard(ized) distribution is.
Convert all raw scores of a distribution into Z scores, and put them into a frequency distribution.
– Mean = 0
– Std. dev. and variance = 1
(Figure: standardized distribution with the horizontal axis marked at Z = -2, -1, 0, +1, +2.)

Shavelson Chapter 5 S5-8. Know how to calculate the proportion of scores that lie above or below a given raw score.
1. Convert the raw score to a Z score.
2. Do a rough estimate on a standard normal distribution.
3. Look up in table B (swap labels on 3 & 4 if it is a negative Z value).
Example: Mean = 80, S = 5, X = 69.
Let's do a few more! Mean = 18, X = 5, S = 7.1. What percentile was the person who scored X in?
Answer choices: 1.83; 96.64; 3.44; 46.64

Shavelson Chapter 8 S8-1. Know the definition of a statistic, parameter, and estimator.
Statistic: describes a characteristic of a sample, e.g. the sample mean x-bar as opposed to the population mean mu (μ).
Parameter: describes a characteristic of a population.
Estimator: a statistic that estimates a population parameter.

Clicker question: The mean is an example of…
1. Parameter
2. Statistic
3. Estimator
4. All of the above

Shavelson Chapter 8 S8-2. Know the role of statistics, as well as the difference between inferential and descriptive statistics.
Role of stats:
• Guidelines for summarizing/describing data
• Method for drawing inferences from sample to population
• Help set effective methodology
Descriptive stats:
• Organize/summarize/depict/describe collections of data
Inferential stats:
• Draw inferences about a population from a sample

Shavelson Chapter 8 S8-3. Know and be able to recognize and provide examples of the two types of questions asked about a population (Case I and Case II research) (217).
Case I research: Was a particular sample of observations drawn from a particular (known) population?
Example: all students in the US took the GRE on the same day, and we have the mean of all scores. Look at one state in particular: if its mean is higher, its students may be from a different population with a higher mean.
Take one sample from the population (get its mean) and compare it to the overall population mean.
Actual question answered: what is the probability that a sample was drawn from a particular (known) population?

Shavelson Chapter 8 S8-5. Know the general approach for conducting case I and case II hypothesis testing. That is, you should be able to list and briefly describe the steps your author lists at the end of each section (case I, 220-221, 4 steps; case II 223-224, 5 steps).
Be able to describe the various alternative hypotheses (step two of each).
Case II research: Are the observations from two different samples drawn from the same population?
• (Do observations on two groups of subjects differ from one another?)
• Actual question answered: given that a difference exists between two samples (e.g. between the means), what is the probability that this difference is caused by chance alone? If it is not from chance alone, they must be from different populations, e.g. our treatment changed them!

Shavelson Chapter 8 S8-5. Case I research steps:
1. Set your hypotheses
• Ho: µ = some specific value
• H1: µ ≠ some specific value, usually the pop. mean (two tailed)
• H1: µ > some specific value (one tailed)
• H1: µ < some specific value (one tailed)
2. Randomly select participants for your study
3. Decide to reject the null or not, based on the comparison of the sample mean to the population mean
– Rejecting the null means that the difference between the population mean and the sample mean is not likely to have occurred by chance (it was probably due to whatever you were studying!)
– Failing to reject the null means there is a fairly good chance that the difference between the sample mean and the population mean could have occurred simply by chance (not due to whatever you were studying)

Shavelson Chapter 8. Case II research steps:
1. Set your hypotheses
• Ho: µe = µc
• µe = experimental group
• µc = control group
• H1: µe ≠ µc (two tailed)
• H1: µe > µc (one tailed)
• H1: µe < µc (one tailed)
2. Randomly select, then randomly assign, participants to experimental and control groups.
3. Perform the experiment: apply the IV and measure the DV.
4. Decide to reject the null hypothesis or not
– Rejecting the null means that the difference between the experimental and control group is not likely to have occurred by chance (thus it was probably your IV!)
– Failing to reject the null means that it is likely that the difference between the control group and the experimental group was due to chance and not your IV.
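
Step 2 of the Case II procedure (randomly select, then randomly assign) can be sketched in Python. This is only an illustration under stated assumptions: the pool of 100 participant IDs, the sample size of 20, the even split, and the helper name `select_and_assign` are all made up for the example and do not come from Shavelson.

```python
import random

def select_and_assign(population, sample_size, seed=None):
    """Randomly select participants, then randomly assign them to two groups.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Random selection: each member of the population is equally likely to be drawn.
    sample = rng.sample(population, sample_size)
    # Random assignment: shuffle the selected participants, then split them in half.
    rng.shuffle(sample)
    half = sample_size // 2
    return sample[:half], sample[half:]

# Hypothetical pool of 100 participant IDs.
pool = list(range(1, 101))
experimental, control = select_and_assign(pool, sample_size=20, seed=1)
print("experimental group size:", len(experimental))
print("control group size:", len(control))
```

Selection and assignment are kept as two separate random steps because they serve different purposes: selection concerns generalizing from the sample to the population, while assignment concerns making the two groups comparable before the IV is applied.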
Shavelson Chapter 8 S8-6. Know the two types of statistical errors: Type 1 and Type 2. Be able to provide and recognize examples of each.
Types of errors in statistical inference:
Type I: Reject the null when it is true (say there is a treatment effect when there is not).
Type II: Fail to reject the null when it is false (say there is no treatment effect when there is one).

                                        The way it really is
Your decision                           Vit A has no effect    Vit A has an effect
Reject null (Vit A had an effect)       Type 1 error           Correct decision
Accept null (Vit A had no effect)       Correct decision       Type 2 error

Shavelson Chapter 9 Probability
Event: any specified outcome
Outcome space: all possible outcomes
• P(e) = the probability of some event
• P(e) = number of outcomes in the event / number of outcomes in the outcome space
• Ex. Dice
  Outcome space = {1,2,3,4,5,6} (= six items)
  E = {2} (= 1 item)
  Probability of getting a 2 = 1/6 = .17
Probability: what is the probability of getting a two by chance alone?

(Tree diagram: possible outcomes on a three-question test)
First question    Second question    Third question    Score on test
Correct           Correct            Correct           3
Correct           Correct            Incorrect         2
Correct           Incorrect          Correct           2
Correct           Incorrect          Incorrect         1
Incorrect         Correct            Correct           2
Incorrect         Correct            Incorrect         1
Incorrect         Incorrect          Correct           1
Incorrect         Incorrect          Incorrect         0

Shavelson Chapter 10 10-1. Two fundamental ideas of conducting case I research:
1. The null hypothesis is assumed to be true.
• (That is, the difference between the sample and population mean is assumed to be due to chance alone.)
2. A sampling distribution is used to determine the probability of obtaining a particular sample mean.
• In this case the sampling distribution is composed of group means.

Shavelson Chapter 10 10-2. What is the central limit theorem?
The Central Limit Theorem is a statement about the characteristics of the sampling distribution of means of random samples from a given population.
That is, it describes the characteristics of the distribution of values we would obtain if we were able to draw an infinite number of random samples of a given size from a given population and calculated the mean of each sample.
The Central Limit Theorem consists of three statements:
[1] The mean of the sampling distribution of means is equal to the mean of the population from which the samples were drawn.
[2] The variance of the sampling distribution of means is equal to the variance of the population from which the samples were drawn divided by the size of the samples. (Equivalently, its standard deviation equals the population standard deviation divided by the square root of the sample size.)
[3] If the original population is distributed normally (i.e. it is bell shaped), the sampling distribution of means will also be normal. If the original population is not normally distributed, the sampling distribution of means will increasingly approximate a normal distribution as sample size increases (i.e. when increasingly large samples are drawn).

Shavelson Chapter 10 10-3. Know the characteristics of a sampling distribution of means.
Characteristics of the sampling distribution of means:
1. Normally distributed (even if the population is skewed, if N = 30 or more)
2. Mean of the sampling distribution = population mean
3. Standard deviation (standard error of the mean) = Pop. S.D. / √N

Shavelson Chapter 10 10-4. Know what happens to the SEM as sample size increases.
SEM decreases as N increases.
SEM = Pop. S.D. / √N, i.e. σx = σ / √N

Shavelson Chapter 10 10-5. Know how one could create a sampling distribution of means.
Sampling distribution of means: a distribution composed of sample means.
How to construct one:
1. Pull a sample of size N from the population.
2. Find the mean of the sample.
3. Repeat this many times (all samples of size N).
4. Create a frequency distribution of the means (actually, convert it to relative frequencies = proportions!).

Shavelson Chapter 10 10-5. What is the function of a sampling distribution of means?
Used as a probability distribution to determine the likelihood of obtaining a particular sample mean, given that the null hypothesis is true.
“Null hypothesis is true” = same thing as “by chance alone.”

Shavelson Chapter 10 S10-6. As your author does, be able to calculate the probability of obtaining a particular sample mean, given the appropriate data (e.g. the mean of the sampling distribution and the standard error). If I ask for this on the test I will either supply table B or will have the Zx fall on a whole value (e.g. 1, 2, or 3). You should thus review the probabilities under the normal curve, as you will be expected to be able to apply this information (260-262).
μ = 100 (mean of the population and of the sampling distribution)
σx = 25 (standard error of the mean)
X = 175 (mean of the sample we used in our study)
What is the probability of obtaining a sample mean of 175 by chance alone (i.e. when the null is true, Ho: μ = 100)?
Zx = (mean of the sample - pop. mean) / SEM = (X - μ) / σx = (175 - 100) / 25
Use table B if needed!

Shavelson Chapter 10 S10-7. What is meant by the terms "unlikely" and "likely"? You should be able to answer this in terms of accepting or rejecting the null hypothesis, or in terms of what is meant by "significance level" (263-264).
Level of significance = what we consider to be “unlikely.”
Generally set at a 5% or 1% chance of obtaining a sample mean by chance alone: alpha = .05 or alpha = .01.
Thus: decisions to reject the null are based on your alpha level.
Reject the null if the probability of obtaining your sample mean by chance alone is equal to or less than your alpha level.

Clicker question: You get all the scores of the folks in CA who took the GRE and find that their average score is 675 (for verbal). The overall (entire population) mean is 500 and the SEM is 100. Is the California mean statistically significant (i.e. significantly different from the pop. mean)? Alpha = .05
1. Yes
2. No
3. Huh?
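
The Zx calculation and the comparison against alpha can be sketched in Python using the standard library's normal distribution in place of table B. The function names are hypothetical, and because the notes do not say whether the California question is meant as a one-tailed or two-tailed test, the sketch reports both rather than asserting the intended answer.

```python
from statistics import NormalDist

def z_for_sample_mean(sample_mean, pop_mean, sem):
    """Zx = (sample mean - population mean) / standard error of the mean."""
    return (sample_mean - pop_mean) / sem

def upper_tail_probability(z):
    """Area under the standard normal curve at or above z."""
    return 1.0 - NormalDist().cdf(z)

# First example from the notes: mu = 100, SEM = 25, sample mean = 175.
z_first = z_for_sample_mean(175, 100, 25)

# California GRE question: mu = 500, SEM = 100, sample mean = 675.
z_ca = z_for_sample_mean(675, 500, 100)
p_one_tailed = upper_tail_probability(z_ca)
p_two_tailed = 2 * p_one_tailed  # valid here because z_ca is positive

alpha = 0.05
print(f"first example: Zx = {z_first:.2f}")
print(f"California: Zx = {z_ca:.2f}")
print(f"one-tailed p = {p_one_tailed:.4f}, reject Ho: {p_one_tailed <= alpha}")
print(f"two-tailed p = {p_two_tailed:.4f}, reject Ho: {p_two_tailed <= alpha}")
```

Reporting both probabilities makes the role of the alternative hypothesis visible: the same Zx can lead to different decisions depending on whether H1 is directional.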
Shavelson Chapter 10 S10-7. Decisions to reject the null are based on your alpha level.
“Reject the null hypothesis if the probability of obtaining a sample mean is less than or equal to .05 (.01); otherwise, don’t reject the null hypothesis.”

Shavelson Chapter 10 10-8. Calculating Zx (critical)
(The Zx score at which we say it is “unlikely” to obtain this value by chance alone.)
At the α = .05 level of significance, Zx (critical) = 1.65 (one-tailed; from table B).
At the α = .01 level of significance, Zx (critical) = 2.33 (one-tailed; from table B).
Example: μ = 42, σx = 8, X = 30. Reject the Ho or not at the .05 level of significance? (Translate the alpha level into a z-score.)

Two ways to reject the null: find the probability of obtaining the Z score (obtained), or find the Z score that lies at the alpha level (critical). Then:
Either compare the probability of getting the Zobtained (e.g. .03) to the alpha level (e.g. .05). In this case you would reject the null: we show statistical significance.
Or compare the Zobtained to the Zcritical, in this case 1.88 (obtained) and 1.65 (critical). Since the Zobtained is greater than the Zcritical, we reject the null: we show statistical significance.

Shavelson Chapter 10 10-9. Know the difference between directional and non-directional tests, and when to use each!
1. A one-tailed test may be supported by previous research or theory.
2. When in doubt, choose two-tailed!
Tails are specified by the alternative hypothesis.
Ho: xbar = μ
H1: xbar ≠ μ (2 tailed: both)
or H1: xbar < μ (1 tailed: left)
or H1: xbar > μ (1 tailed: right)
It is easier to show statistical significance with a 1-tailed test.

Directional vs.
non-directional tests
Directional uses only one tail of the sampling distribution.
Non-directional uses both tails.
Thus:
If alpha = .05 and one-tailed, all .05 is in one tail (Z critical = 1.65, or -1.65).
If alpha = .05 and two-tailed, .025 is in one tail (1.96) and .025 is in the other (-1.96).

Shavelson Chapter 10
• If conducting case II research, how could you determine the probability of getting a particular difference between 2 means?
• (Which is what we are looking at for case II.)

The sampling distribution of differences between means gives the probability of obtaining a particular difference between means (Case II).
Theoretically you could: make a sampling distribution of differences between means, then find a z-score, compare to the alpha level, and accept or reject the null hypothesis.
Case II:
xbar1 - xbar2 = 2
xbar1 - xbar2 = 3
xbar1 - xbar2 = -1
Graph the frequency of each difference to make a frequency distribution of differences between sample means.
You can also calculate the SD and determine the likelihood of obtaining a difference between means by chance alone.

• Characteristics of the sampling distribution of differences between means
1. Normally distributed
2. Mean = 0
3. Standard deviation (called the standard error of the difference between means) is equal to:
σx1-x2 = √(σx1² + σx2²)
Note: variance = sigma squared

Calculate a Z score for the difference between means:
Zx1-x2 = (Xe - Xc) / σx1-x2
Example: Xe = 24, Xc = 30, σx1-x2 = 2.8, H1: Xe ≠ Xc. Z crit? Z obs?

Shavelson Chapter 11 S11-1. Know the definition and recognize/generate examples of the two types of errors (Type I and Type II) (also see table 11-1). This is similar to what we did last unit. How does one adjust the probability of making a type I error? (313)

                                        The way it really is
Your decision                           Vit A has no effect    Vit A has an effect
Reject null (Vit A had an effect)       Type 1 error           Correct decision
Accept null (Vit A had no effect)       Correct decision       Type 2 error

Shavelson Chapter 11 S11-2.
Know the definition of "power" and how it is calculated. (314)
Power = 1 - Beta
The probability of correctly rejecting a false null hypothesis.
OR: Power is the probability of detecting a true treatment effect. (What researchers are really interested in: detecting a true difference if it exists.)
Power = .27 (27%) is very low. We want higher power, i.e. a higher number.

Shavelson Chapter 12 S12-1. What is the purpose of a t test in general (334,3)? Also, how is a t test used for case I research (that is, what question does it answer)? (334,3) As in previous chapters, the function of the t test is to determine the probability of observing a particular sample mean, given that the null hypothesis is true. You should know this point. You should also know how the standard deviation is estimated for the population when using the t distribution (334).
The t test is…
A. Used to determine the probability that a sample was drawn from a hypothesized population (given a true Ho)
B. Used when the population standard deviation is not known
C. Calculated with an estimated standard deviation (SEM). How would one go about doing this?
sx = standard dev. of sample / sq. root of sample size = s / √N

Shavelson Chapter 12 S12-2. You should be able to describe the t distribution and what it is used for (determining the probability of obtaining a particular sample mean) (335-336). Know the important differences between the t distribution and the normal distribution (335,5 - 335,7) (there are three points made).
A. t(observed) = (X - μ) / sx (the number of standard errors that the sample mean lies from the hypothesized mean)
The t distribution is created from numerous same-sized samples from the population, just like a sampling distribution!
The t(observed) can be compared to the t distribution to determine the probability of obtaining that particular sample mean (given that the Ho is true).

Shavelson Chapter 12 T-distribution vs. normal distribution:
1. t has a different distribution for every sample size (N)
2.
More values lie in the tails of t; thus critical values for t are higher than for Z.
3. As sample size increases, t becomes closer and closer to the normal distribution.
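
As a rough sketch of the Case I t test described above, the following Python computes sx = s / √N and t(observed) = (X - μ) / sx. The ten scores and the hypothesized μ of 100 are invented for illustration, and s is assumed to be computed with N - 1 in the denominator. Python's standard library has no t distribution, so the t critical value is hard-coded from a standard t table (df = 9, one-tailed, alpha = .05) rather than computed.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of N = 10 scores (made up for illustration).
scores = [104, 98, 110, 101, 95, 107, 103, 99, 108, 105]
hypothesized_mu = 100  # assumed value of mu under Ho

n = len(scores)
sample_mean = mean(scores)
s = stdev(scores)   # sample standard deviation (N - 1 in the denominator)
sx = s / sqrt(n)    # estimated standard error of the mean
t_observed = (sample_mean - hypothesized_mu) / sx

# One-tailed critical values at alpha = .05.
z_critical = NormalDist().inv_cdf(0.95)  # the notes round this to 1.65
t_critical = 1.833                       # from a standard t table, df = N - 1 = 9

print(f"sample mean = {sample_mean}, sx = {sx:.3f}")
print(f"t(observed) = {t_observed:.3f}")
print(f"Z critical = {z_critical:.3f}, t critical (df = 9) = {t_critical}")
print("reject Ho (one-tailed):", t_observed >= t_critical)
```

Printing the Z and t critical values side by side illustrates point 2 of the comparison: with a small sample, the t critical value is larger than the corresponding Z critical value.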