Week 2 Exercises Deb Davis - Pg 1

Statistics for Those Who (Think They) Hate Statistics

Chapter 4 – Questions 1-5

C4-Q1-P72 Data set on web -- complete the following:

1a: Frequency Distribution & Histogram

Score   Frequency
2       1
6       1
7       1
8       1
11      1
12      2
14      1
15      2
16      2
21      2
22      2
25      1
26      1
27      1
29      2
31      1
33      1
34      1
36      1
38      1
41      1
42      2
43      2
44      2
45      2
47      2
49      1
51      1
53      1
54      3
55      1
56      3
57      2
59      1

Class Interval   Frequency
0 - 5            1
6 - 10           3
11 - 15          6
16 - 20          2
21 - 25          5
26 - 30          4
31 - 35          3
36 - 40          2
41 - 45          9
46 - 50          3
51 - 55          6
56 - 60          6

1b: Why the class interval you selected?

I selected intervals of 5 because they make for a reasonable number of groups (12 classes for 50 scores).

1c: Is this distribution skewed? How would you know?

This distribution is skewed. This is visually obvious in the Cross Validation from the lower limit of the first bin. It is also apparent from the bar graph above that the distribution is not symmetrical: the upper intervals (41 - 60) hold 24 of the 50 scores, while the lowest intervals (0 - 20) hold only 12, so the longer tail runs toward the low scores and the distribution is mildly negatively skewed.

C4-Q2-P72 Frequency distribution given:

Class      Frequency
90 - 100   12
80 - 89    14
70 - 79    20
60 - 69    24
50 - 59    28
40 - 49    29
30 - 39    21
20 - 29    15
10 - 19    17
0 - 9      12

Create a histogram:

C4-Q3-P73 Identify these distributions as negatively skewed, positively skewed, or not skewed at all, and why.

3a. This talented group of athletes scored very high on the vertical jump task.

This distribution is negatively skewed: they are a talented group and apparently all scored well, so the scores bunch at the high end and the tail runs toward the lower scores.

3b. On this incredibly crummy test, everyone received the same score.

This distribution is not skewed at all because, despite the “crumminess” of the test, all scores were equal.

3c. On the most difficult spelling test of the year, the third graders wept as the scores were delivered.

It is impossible to tell if this is a skewed distribution because the third graders may have wept for joy, for pity, or for both. There is no quantifier to indicate what the scores were.
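The class-interval counts in C4-Q1 can be double-checked with a short script. This is only a sketch: it assumes the scores and frequencies pair up in the order listed in the data set, and the function name class_frequencies is made up for illustration. It also prints the mean and median, which offers one rough way to look at the skew question in 1c.

```python
from statistics import mean, median

# Scores and frequencies from C4-Q1 (pairing assumed from the listed order).
scores = [2, 6, 7, 8, 11, 12, 14, 15, 16, 21, 22, 25, 26, 27, 29, 31, 33,
          34, 36, 38, 41, 42, 43, 44, 45, 47, 49, 51, 53, 54, 55, 56, 57, 59]
freqs = [1, 1, 1, 1, 1, 2, 1, 2, 2, 2, 2, 1, 1, 1, 2, 1, 1,
         1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 3, 1, 3, 2, 1]

# Class intervals used in the answer: 0-5, then 6-10, 11-15, ..., 56-60.
intervals = [(0, 5)] + [(low, low + 4) for low in range(6, 60, 5)]


def class_frequencies(scores, freqs, intervals):
    """Add up the frequency of every score that falls inside each interval."""
    counts = {interval: 0 for interval in intervals}
    for score, freq in zip(scores, freqs):
        for low, high in intervals:
            if low <= score <= high:
                counts[(low, high)] += freq
                break
    return counts


counts = class_frequencies(scores, freqs, intervals)

# Expand the table back into the 50 individual scores.
raw = [score for score, freq in zip(scores, freqs) for _ in range(freq)]

# Text histogram: one asterisk per score in the interval.
for (low, high), count in counts.items():
    print(f"{low:>2} - {high:<2}  {count:>2}  {'*' * count}")

print("n =", len(raw), " mean =", mean(raw), " median =", median(raw))
```

A mean below the median is one informal sign that the longer tail is on the low side.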
C4-Q4-P73 For each of the following, indicate whether you would use a pie, line, or bar chart, and why.

4a. Proportion of freshmen, sophomores, juniors, and seniors in a particular university would easily lend itself to a pie chart. Visualizing these groups as pieces of a pie is very straightforward.

4b. Change in GPA over four semesters would likely render best in a bar chart, as the bars could be color coded by term, which would make the change in grades easy to track visually.

4c. Number of applicants for four summer jobs would again render best in a bar chart, for the same reasons.

4d. Reaction time to different stimuli would probably render best in a line chart, as the details could cloud an otherwise clear pie or bar chart.

4e. Number of scores in each of 10 categories could be rendered with any method, but I would probably use a bar chart because of the clarity of the image.

C4-Q5-P73 Provide an example for each of the below and then draw the chart accordingly.

5a. A line graph is well suited to large groups of numbers from which trends may be gathered. For my area, I would use a line graph to chart the scores on midterms taken over a term of teaching. For example, with a possible score of 200, the following scores were received over the last two terms.

5b. A bar graph gives excellent comparisons, such as midterm paper grades against final paper grades.

5c. A pie graph would give great information for group totals.

Chapter 5 – Questions 1-8

C5-Q1-P93 Using the following data:

Number Correct   Attitude
17               94
13               73
12               59
15               80
16               93
14               85
16               66
16               79
18               77
19               91

1a: Compute the Pearson product-moment correlation coefficient by hand and show work.
Sum of all correct (X) is 156
Sum of all Attitude (Y) is 797
Sum of each X-squared is 2476
Sum of each Y-squared is 64727
Sum of Products of X and Y is 12568

THEREFORE:

r = [(10 x 12568) - (156 x 797)] / sqrt{[(10 x 2476) - 156^2][(10 x 64727) - 797^2]}
  = (125680 - 124332) / sqrt{[24760 - 24336][647270 - 635209]}
  = 1348 / sqrt{[424][12061]}
  = 1348 / sqrt(5113864)
  = 1348 / 2261.39
  = 0.59609

1b. Construct a scatter plot for these 10 values by hand. Would you expect the correlation to be direct or indirect?

Direct correlation: r is positive (about .60), so higher numbers correct tend to go with higher attitude scores. The relationship is moderate.

C5-Q2-P94 Use the below data for 2a and 2b.

Speed   Strength
21.6    135
23.4    213
26.5    243
25.5    167
20.8    120
19.5    134
20.9    209
18.7    176
29.8    156
28.7    177

Sum of all speed (X) is 235.4
Sum of all strength (Y) is 1730
Sum of each X-squared is 5677.74
Sum of each Y-squared is 313210
Sum of Products of X and Y is 41095.2

THEREFORE:

r = [(10 x 41095.2) - (235.4 x 1730)] / sqrt{[(10 x 5677.74) - 235.4^2][(10 x 313210) - 1730^2]}
  = (410952 - 407242) / sqrt{[56777.4 - 55413.16][3132100 - 2992900]}
  = 3710 / sqrt{[1364.24][139200]}
  = 3710 / sqrt(189902208)
  = 3710 / 13780.5
  = 0.26922

2b. A low correlation (.27) indicates only a weak relationship, so the contributing factors may not be a large influence.
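The hand calculations for C5-Q1 and C5-Q2 can be checked with a few lines of Python. This is a minimal sketch of the same raw-score formula used above; the function name pearson_r is illustrative, and n is taken from the number of listed pairs.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from raw scores: the same sums used in the hand calculation."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# C5-Q1 data
correct = [17, 13, 12, 15, 16, 14, 16, 16, 18, 19]
attitude = [94, 73, 59, 80, 93, 85, 66, 79, 77, 91]

# C5-Q2 data
speed = [21.6, 23.4, 26.5, 25.5, 20.8, 19.5, 20.9, 18.7, 29.8, 28.7]
strength = [135, 213, 243, 167, 120, 134, 209, 176, 156, 177]

r_q1 = pearson_r(correct, attitude)
r_q2 = pearson_r(speed, strength)

print("C5-Q1 r =", round(r_q1, 4))
print("C5-Q2 r =", round(r_q2, 4))
```

Counting the pairs with len() rather than typing n by hand keeps the formula in step with the data actually listed.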
C5-Q3-P94

Budget + (X)   Acht (Y)   x sq   y sq   xy
7              11         49     121    77
3              14         9      196    42
5              13         25     169    65
7              26         49     676    182
2              8          4      64     16
1              3          1      9      3
5              6          25     36     30
4              12         16     144    48
4              11         16     121    44
SUMS: 38       104        194    1536   507

1a: Compute the Pearson product-moment correlation coefficient.

Sum of all Increase (X) is 38
Sum of all Acht (Y) is 104
Sum of each X-squared is 194
Sum of each Y-squared is 1536
Sum of Products of X and Y is 507

THEREFORE, with the 9 pairs of scores listed (n = 9):

r = [(9 x 507) - (38 x 104)] / sqrt{[(9 x 194) - 38^2][(9 x 1536) - 104^2]}
  = (4563 - 3952) / sqrt{[1746 - 1444][13824 - 10816]}
  = 611 / sqrt{[302][3008]}
  = 611 / sqrt(908416)
  = 611 / 953.11
  = 0.6411

The correlation is positive and fairly strong, indicating a relationship between increased budget and increased scores.
C5-Q4-P95

Hours (x)   GPA (y)   x sq   y sq      xy
23          3.95      529    15.6025   90.85
12          3.9       144    15.21     46.8
15          4         225    16        60
14          3.76      196    14.1376   52.64
16          3.97      256    15.7609   63.52
21          3.89      441    15.1321   81.69
14          3.66      196    13.3956   51.24
11          3.91      121    15.2881   43.01
18          3.8       324    14.44     68.4
9           3.89      81     15.1321   35.01
SUMS: 153   38.73     2513   150.099   593.16

Sum of all hours (X) is 153
Sum of all GPA (Y) is 38.73
Sum of each X-squared is 2513
Sum of each Y-squared is 150.099
Sum of Products of X and Y is 593.16

THEREFORE:

r = [(10 x 593.16) - (153 x 38.73)] / sqrt{[(10 x 2513) - 153^2][(10 x 150.099) - 38.73^2]}
  = (5931.6 - 5925.69) / sqrt{[25130 - 23409][1500.99 - 1500.0129]}
  = 5.91 / sqrt{[1721][0.9771]}
  = 5.91 / sqrt(1681.5891)
  = 5.91 / 41.007
  = 0.1441

A low correlation such as this indicates little or no relationship between hours and GPA. Accordingly, the points in the plot look random.

C5-Q5-P05 - A coefficient of determination between two variables is 0.64.

The Pearson correlation is .8 (the square root of .64); the relationship is quite strong, and the variance unaccounted for is .36 (1 - .64).

Chapter 6 - Questions 2-5

C6-Q2-P118 Provide an example of when you would want to establish test-retest and parallel forms reliability.

C6-Q3-P118 You are developing an instrument that measures vocational preferences and you need to administer the test several times during the year while students are attending a vocational program. You need to assess the test-retest reliability of the test and the data from two administrations (Ch6 data set 1) -- one fall and one spring. Would you call this a reliable test? Why or why not?

C6-Q4-P118 How can a test be reliable and not valid, and not valid unless it is reliable?
C6-Q5-P118 When testing any experimental hypothesis, why is it important that the test you use to measure the outcome be both reliable and valid?

Chapter 7 - Questions 1-7 (Note: Teacher will provide the articles for #1)

C7-Q1-P113 Select five empirical research articles and detail the following information:
a-What is the null hypothesis?
b-What is the research hypothesis?
c-Create a null and research hypothesis for own area.
d-Identify articles with no clear stated or implied hypothesis. Can a research hypothesis be crafted?

C7-Q2-P113 Why does the scientific method work?

Steps:
Observe
Question
Hypothesize
Experiment
Accept or Reject
Change Hypothesis?
Experiment
Accept or Reject
Etc.

-- The scientific method generally works because of its circular perspective.

C7-Q3-P113 Why do good samples make for good tests of research hypotheses?

Good samples make for good tests of research hypotheses because good samples are directed to incorporate specifics of a directed hypothesis (an educated guess).

C7-Q4-P113 For the following, create one null hypothesis, one directional research hypothesis, and one nondirectional research hypothesis.

a-What are the effects of attention on out-of-seat classroom behavior?
-Diagnostically Severe ADHD students would have the same out-of-seat frequency as those determined to be not ADHD-Severe.
-Diagnostically Severe ADHD students would have more out-of-seat frequency than those determined to be not ADHD-Severe.
-Diagnostically Severe ADHD students will differ in out-of-seat frequency from those determined to be not ADHD-Severe.

b-What is the relationship between the quality of a marriage and the quality of the spouses' relationships with their siblings?
-Those with a strong quality of marriage will always have a weak quality of sibling relationships.
-Those with a strong quality of marriage will always have a strong quality of sibling relationships.
-Those with a strong quality of marriage will have varying quality of sibling relationships.

c-What’s the best way to treat an eating disorder?
- The best way to treat an eating disorder is always calories-in-calories-out.
- The best way to treat an eating disorder is never calories-in-calories-out.
- The best way to treat an eating disorder is completely dependent upon the cause of the disorder, and even then, treatment may or may not be effective.

C7-Q5-P113 What do we mean when we say that the null hypothesis acts as a starting point?

Starting at the null hypothesis allows for all possibilities. When there are a number of unknowns, starting by eliminating as many variables as possible allows for individual test methods.

C7-Q6-P113 Evaluate the hypotheses from C7-Q1 in terms of the five criteria discussed at the end of the chapter.

Hypotheses should:
Be stated in a declarative form
Posit a relationship between variables
Reflect a theory or a body of literature on which they are based
Be brief and to the point, and
Be testable!

C7-Q7-P113 Why does the null hypothesis presume no relationship between variables?

That defines “null” – having no relationship!

Chapter 8 - Questions 1-9

C8-Q1-P151 What are the characteristics of the normal curve?

The three characteristics of a bell curve are: 1) it is not skewed; 2) it is perfectly symmetrical about the mean; 3) the tails are asymptotic (they come close to the axis but never quite reach it).

What human behavior is distributed normally?

Generally, height and weight are distributed normally in a population. In my classroom, grades turn from a reverse bell to a bell through the course of the term.

C8-Q2-P151 Standard scores, such as z scores, allow us to make comparisons across different samples. Why?

A z score is the result of dividing the amount that a raw score differs from the mean of the distribution by the standard deviation.
So, scores below the mean will have negative z scores, and scores above the mean will have positive z scores. Positive z scores always fall to the right of the mean, and negative z scores always fall to the left. Remember that z scores across different distributions are comparable.

C8-Q3-P151 Why is a z score a standard score, and why can standard scores be used to compare scores from different distributions with one another?

A z score is a standard score because it is based on the degree of variability within its distribution.

C8-Q4-P151 Compute the z scores for the following raw scores where the X-bar is 50 and the standard deviation is 5.

z = (raw score – mean) / standard deviation

a. 55     (55 - 50)/5 = 5/5 = 1
b. 50     (50 - 50)/5 = 0/5 = 0
c. 60     (60 - 50)/5 = 10/5 = 2
d. 57.5   (57.5 - 50)/5 = 7.5/5 = 1.5
e. 46     (46 - 50)/5 = -4/5 = -.8

5. For the following set of scores, fill in the cells. The mean is 70 and the standard deviation is 8.

z = (raw score – mean) / standard deviation

Raw Score   z score
68.0        (68 - 70)/8 = -2/8 = -.25
57.2        (x - 70)/8 = -1.6
82.0        (82 - 70)/8 = 1.5
84.4        (x - 70)/8 = 1.8
69.0        (69 - 70)/8 = -0.125
66.0        (x - 70)/8 = -0.5
85.0        (85.0 - 70)/8 = 1.875
83.6        (x - 70)/8 = 1.7
72.0        (72.0 - 70)/8 = 0.25

6. Questions 6a through 6d are based on a distribution of scores with a mean of 75 and a standard deviation of 6.38.

z = (raw score – mean) / standard deviation

a. What is the probability of a score falling between a raw score of 70 and 80?
b. What is the probability of a score falling above a raw score of 80?
c. What is the probability of a score falling between a raw score of 81 and 81?
d. What is the probability of a score falling below a raw score of 63?

7. Jake needs to score in the top 10% in order to earn a physical fitness certificate. The class mean is 78 and the standard deviation is 5.5. What raw score does he need to get that valuable piece of paper?

The top 10% of a normal curve begins at a z score of about 1.28, so (x - 78)/5.5 = 1.28 and x = 78 + (1.28 x 5.5) = 85.04 minimum required.

8.
So, why doesn’t it make sense to simply combine, for example, course grades across different topics – just take an average and call it a day?

Each raw score is relative to a different distribution, which makes all the difference.

9. Who is the better student, relative to his or her classmates? Here’s all the information you ever needed to know . . . .

          Class Mean   Class Standard Deviation
MATH      81           2
READING   87           10

z = (raw score – mean) / standard deviation

         RAW   Mean   SD   z
math-n   85    81     2    2
math-t   87    81     2    3
rdg-n    88    87     10   0.1
rdg-t    81    87     10   -0.6

            avg-n   avg-t
math z      2       3
rdg z       0.1     -0.6
sum of z    2.1     2.4
average z   1.05    1.2

Talya is the better student, with the higher average z score (1.2 versus 1.05).
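The Chapter 8 calculations can be sketched in Python using the standard library's statistics.NormalDist in place of a printed z table. This is an illustration rather than the textbook's method: the helper names z_score and raw_from_z are made up here, and the student labels "n" and "t" follow the worksheet above. Normal-curve probabilities (Q6) and the top-10% cutoff (Q7) assume the scores are normally distributed.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Standard score: distance from the mean in standard-deviation units."""
    return (raw - mean) / sd


def raw_from_z(z, mean, sd):
    """Invert the z formula to recover a raw score."""
    return mean + z * sd


# C8-Q4: mean 50, standard deviation 5
q4 = [z_score(raw, 50, 5) for raw in (55, 50, 60, 57.5, 46)]

# Q6: areas under a normal curve with mean 75 and standard deviation 6.38
dist = NormalDist(mu=75, sigma=6.38)
p_between = dist.cdf(80) - dist.cdf(70)   # 6a: between 70 and 80
p_above = 1 - dist.cdf(80)                # 6b: above 80

# Q7: the top 10% starts at the 90th percentile of the standard normal curve
cutoff_z = NormalDist().inv_cdf(0.90)
cutoff_raw = raw_from_z(cutoff_z, 78, 5.5)

# Q9: average each student's z scores across the two classes
classes = {"math": (81, 2), "reading": (87, 10)}   # (class mean, class SD)
raw_scores = {
    "n": {"math": 85, "reading": 88},
    "t": {"math": 87, "reading": 81},
}
averages = {}
for student, scores in raw_scores.items():
    zs = [z_score(scores[subject], *classes[subject]) for subject in classes]
    averages[student] = sum(zs) / len(zs)

print("Q4 z scores:", q4)
print("Q6a:", round(p_between, 4), " Q6b:", round(p_above, 4))
print("Q7 cutoff z:", round(cutoff_z, 2), " raw:", round(cutoff_raw, 2))
print("Q9 average z:", averages)
```

Because each z score is expressed in units of its own class's standard deviation, averaging them is meaningful in a way that averaging the raw math and reading scores is not.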