Statistics, Part 9


What’s with all those numbers?

What are Statistics?

Collecting Data

1. State the goal of your study precisely.
2. Choose a representative sample from the population.
3. Collect raw data from the sample and summarize.
4. Use the sample statistics to make inferences about the population.
5. Draw conclusions.

Samples

 A representative sample is a sample in which the relevant characteristics of the sample members match those of the desired population.

Sample Types

1. Simple Random Sampling
2. Systematic Sampling
3. Convenience Sampling
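As a rough illustration of how the three sample types differ, here is a minimal Python sketch. The population of 100 numbered members, the sample size of 10, and the fixed seed are all assumptions made up for the example; they do not come from the slides.

```python
import random

# Hypothetical population: 100 members identified by the numbers 1..100.
population = list(range(1, 101))
sample_size = 10

# Fixed seed only so the sketch gives the same result on every run.
rng = random.Random(0)

# Simple random sampling: every member is equally likely to be chosen.
simple_random = rng.sample(population, sample_size)

# Systematic sampling: choose a random start, then take every k-th member.
k = len(population) // sample_size
start = rng.randrange(k)
systematic = population[start::k]

# Convenience sampling: take whoever is easiest to reach (here, the first 10).
# This is quick, but the sample may not be representative.
convenience = population[:sample_size]

print(simple_random)
print(systematic)
print(convenience)
```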

Mean, Median, Mode

 To find a mean, you sum all the data and divide by the total number of values.

 To find a median, you list all the data in order of smallest to largest (or largest to smallest) and choose the middle value. If the data set has an even number of values, find the average of the two middle values.

 The mode is the value that occurs most often in a data set.

Example

 On a certain exam, students earned the following scores: 98, 96, 96, 93, 87, 84, 76, 67, 67, 67, 62, 51, and 17. Find the mean, median and mode.

Mean – Add all the values and divide by 13, the total number of values.

Answer = ?

Median – Since this data set has an odd number of values, take the number that is in the middle.

Mode – Which number occurs most often?
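One way to check your answers to this example is with a short Python sketch like the one below. It follows the three definitions given earlier; the variable names are just illustrative choices.

```python
import statistics

scores = [98, 96, 96, 93, 87, 84, 76, 67, 67, 67, 62, 51, 17]

# Mean: sum all the data and divide by the total number of values.
mean = sum(scores) / len(scores)

# Median: put the data in order and take the middle value
# (or the average of the two middle values if the count is even).
ordered = sorted(scores)
middle = len(ordered) // 2
if len(ordered) % 2 == 1:
    median = ordered[middle]
else:
    median = (ordered[middle - 1] + ordered[middle]) / 2

# Mode: the value that occurs most often.
mode = statistics.mode(scores)

print("mean:", round(mean, 1))
print("median:", median)
print("mode:", mode)
```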

Range

 The range of a data set is the difference between the largest and smallest values in the data set.

 Example: On a certain exam, students earned the following scores: 98, 96, 96, 93, 87, 84, 76, 67, 67, 67, 62, 51, and 17.

Range: 98 – 17 = 81
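The same calculation as a minimal Python sketch (the name data_range is only an illustrative choice):

```python
scores = [98, 96, 96, 93, 87, 84, 76, 67, 67, 67, 62, 51, 17]

# Range: largest value minus smallest value.
data_range = max(scores) - min(scores)
print(data_range)
```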

Standard Deviation

 Standard deviation is a measure of dispersion; that is, how spread out the data is.

 Standard deviation gets all sorts of mathy with this formula: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{n}}$.

Example

 On a certain exam, students earned the following scores: 98, 96, 96, 93, 87, 84, 76, 67, 67, 67, 62, 51, and 17. Find the standard deviation.

1. Find the mean μ.
2. Construct a table as follows.

Data Point   Difference from μ = 73.9   Difference squared
98           98 – 73.9 = 24.1           (24.1)² = 580.81
96           96 – 73.9 = 22.1           (22.1)² = 488.41
96           96 – 73.9 = 22.1           (22.1)² = 488.41
93           93 – 73.9 = 19.1           (19.1)² = 364.81
87           87 – 73.9 = 13.1           (13.1)² = 171.61
84           84 – 73.9 = 10.1           (10.1)² = 102.01
76           76 – 73.9 = 2.1            (2.1)² = 4.41
67           67 – 73.9 = -6.9           (-6.9)² = 47.61
67           67 – 73.9 = -6.9           (-6.9)² = 47.61
67           67 – 73.9 = -6.9           (-6.9)² = 47.61
62           62 – 73.9 = -11.9          (-11.9)² = 141.61
51           51 – 73.9 = -22.9          (-22.9)² = 524.41
17           17 – 73.9 = -56.9          (-56.9)² = 3237.61
                                        Sum: 6246.93

 We take the sum, 6246.93, and divide by the number of people that took the exam to get $\frac{6246.93}{13} = 480.533$.

 The final step to find the standard deviation is to take the square root of this value. That is, $\sigma = \sqrt{480.533} \approx 21.92$.

 A large standard deviation like this tells us that the exam scores were really spread out.
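The steps of this example can be sketched in Python as follows. This is only a minimal sketch of the formula as used here (dividing by n, the number of scores); it uses the unrounded mean rather than 73.9, which gives the same result to two decimal places. The cross-check uses statistics.pstdev from the Python standard library.

```python
import math
import statistics

scores = [98, 96, 96, 93, 87, 84, 76, 67, 67, 67, 62, 51, 17]
n = len(scores)

# Step 1: find the mean.
mu = sum(scores) / n

# Step 2: for each data point, find the difference from the mean and square it.
squared_diffs = [(x - mu) ** 2 for x in scores]

# Step 3: sum the squared differences and divide by the number of scores.
variance = sum(squared_diffs) / n

# Step 4: take the square root.
sigma = math.sqrt(variance)

print(round(sigma, 2))

# Cross-check against the standard library's population standard deviation.
print(math.isclose(sigma, statistics.pstdev(scores)))
```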

Quartiles and the 5-Number Summary

 Quartiles divide the data into quarters. You will have five values:

 The low value.

 The lower quartile, the median of the lower half of data.

 The median.

 The upper quartile, the median of the upper half of data.

 The high value.

Example

 On a certain exam, students earned the following scores: 98, 96, 96, 93, 87, 84, 76, 67, 67, 67, 62, 51, and 17. Find the five number summary.

Low   Lower Quartile   Median   Upper Quartile   High
17    67               76       93               98
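Here is a minimal Python sketch of the five number summary. Textbooks differ on how to split the data into halves when the number of values is odd; this sketch assumes the median is counted in both halves, which is the convention that reproduces the quartiles 67 and 93 in this example. The function name five_number_summary is just an illustrative choice.

```python
import statistics

def five_number_summary(values):
    """Return (low, lower quartile, median, upper quartile, high)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    # Assumed convention: with an odd count, the middle value
    # belongs to both the lower half and the upper half.
    lower_half = ordered[: half + (n % 2)]
    upper_half = ordered[half:]
    return (
        ordered[0],
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
        ordered[-1],
    )

scores = [98, 96, 96, 93, 87, 84, 76, 67, 67, 67, 62, 51, 17]
summary = five_number_summary(scores)
print(summary)
```

Other conventions (for example, leaving the median out of both halves) can give slightly different quartiles for the same data.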

Box Plots

 Box plots are a visual representation of the five number summary.

 Make a vertical mark for each of the five values.

 Connect the middle three with the use of a box.

 The high and low are connected to the box by a line.

Normal Distributions

 Normal distributions turn up in many of the things we measure about humans.

Z-Scores

 For normally distributed data, we can use a table to convert a z-score into a percentile. The z-score is a standard score that tells us how many standard deviations from the mean any particular piece of data lies.

 In order to find a z-score, you must know the score, the mean, and the standard deviation: $z = \frac{x - \mu}{\sigma}$

Example

 The mean score for a student taking the SAT math exam in 1990 was 500 with a standard deviation of 113. If I scored a 570, find my z-score and use it to find the percentile.

 Answer: $z = \frac{570 - 500}{113} = \frac{70}{113} \approx 0.619$

 Answer: To find the percentile, round z to 0.62, then go to 0.6 on the left of the z-score table and 0.02 on the top and find the value where they intersect. The value is 0.7324, so the score is at the 73rd percentile.
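The table lookup can also be approximated in code. The sketch below assumes the scores are normally distributed and uses NormalDist from Python's standard statistics module in place of a printed table; the helper name z_score is an illustrative choice.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

z = z_score(570, 500, 113)

# Fraction of a standard normal distribution that lies below z.
percentile = NormalDist().cdf(z)

# A printed table only lists z to two decimal places, so rounding z first
# mimics the table lookup.
table_value = NormalDist().cdf(round(z, 2))

print(round(z, 3))
print(round(percentile, 4))
print(round(table_value, 4))
```

The unrounded z gives a value very slightly below the table entry, but both land at the 73rd percentile.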
