SIA Unit 3

SCIENTIFIC INQUIRY AND

ANALYSIS

UNIT 2

STATISTICAL DATA ANALYSIS

SCIENTIFIC DATA ANALYSIS 1


OBJECTIVES:

The student will be able to:

• Create a frequency table from a set of data. (CCCS.HSS.ID.A.1)
• Compute, interpret, and analyze the measures of central tendency (mean, median, and mode) of a set of data. (CCCS.HSS.ID.A.2)
• Compute measures of spread (variance, standard deviation, quartiles, and interquartile range). (CCCS.HSS.ID.2)
• Graph one variable by hand (histogram, boxplot). (CCCS.HSS.ID.A.1)



OBJECTIVES:

• Identify outliers informally and recognize their effect on a set of data. (CCCS.HSS.ID.A.3)
• Define the characteristics of the Normal distribution by examining a histogram. (CCCS.HSS.ID.4)
• Explain how a histogram, which is a discrete probability distribution, is related to the Normal distribution curve, a continuous probability distribution. (CCCS.HSS.ID.A.4)
• Determine if a given set of data is approximately Normal using the empirical rule (68 - 95 - 99.7 rule). (CCCS.HSS.ID.A.4)
• Estimate areas under the Normal curve using the empirical rule.



OBJECTIVES:

• Graph two variables by hand (scatterplot). (CCCS.HSS.ID.B.6)
• Describe a scatterplot in terms of form, direction, strength, and the presence of outliers. (CCCS.HSS.ID.B.6)
• Find equations of lines of best fit by fitting a line by hand and using technology (TI-84 regression function and/or Excel). (CCCS.HSS.ID.B.6.A)
• Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data. (CCCS.HSS.ID.C.7)



OBJECTIVES:

• Compute the correlation coefficient using technology (TI-84 or Excel) and interpret it in the context of the data. (CCCS.HSS.ID.C.8)
• Informally assess the fit of a function by plotting and analyzing residuals. (CCCS.HSS.ID.B.6.B)
• Make predictions based upon analysis of data. (5.2.12.A.3)
• Distinguish between correlation and causation. (CCCS.HSS.ID.C.9)



• Statistics – a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions from the data.



• Measures of Central Tendency – a method to describe the entire sample or population with a single number known as an average (mode, median, and mean).

• Mode – the value that occurs most frequently in the data.
– Example 1: What is the mode of the following data: (2, 5, 3, 2, 1, 6, 4, 10, 44, 2, 4, 1, 10, 3, 2, 5)?



• Mode
– Example 2: What is the mode of the following data: (2, 5, 3, 10, 1, 4, 4, 10, 1, 2, 3, 4, 1, 10, 3, 2, 5, 5)?
– Mode is not a stable average, but it gives you the most common value in a distribution if that is the information desired.
– There can sometimes be more than one mode in a given set of data.
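The two mode examples can be checked with a short Python sketch. It assumes Python 3.8 or later, where the standard library offers `statistics.multimode`; the variable names are illustrative only.

```python
from statistics import multimode

example_1 = [2, 5, 3, 2, 1, 6, 4, 10, 44, 2, 4, 1, 10, 3, 2, 5]
example_2 = [2, 5, 3, 10, 1, 4, 4, 10, 1, 2, 3, 4, 1, 10, 3, 2, 5, 5]

# multimode returns every value tied for the highest count,
# in the order each value first appears in the data.
modes_1 = multimode(example_1)
modes_2 = multimode(example_2)

print("Example 1 mode(s):", modes_1)
print("Example 2 mode(s):", modes_2)
```

A list with more than one entry means several values tie for most frequent, which is the "more than one mode" case.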



• Median – the central value in an ordered distribution of data.
– If there is an odd number of data values, it is the center value.
– If there is an even number of data values, there are two center values; therefore:
Median = (sum of the two middle values) / 2



• Median
– Example 1: What is the median of the following data: (62, 3, 5, 28, 67, 33, 22, 2, 10)?
– Example 2: What is the median of the following data: (62, 3, 5, 28, 67, 33, 22, 2, 10, 120)?
– Median is a more stable average than the mode, but it does not indicate the range of values above or below it.
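A minimal sketch of the odd/even median rule, applied to the two examples. The function name `median_by_hand` is made up for illustration; the standard library's `statistics.median` follows the same rule.

```python
def median_by_hand(values):
    """Median by the slide's rule: order the data, then take the center."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: the single center value.
        return ordered[middle]
    # Even number of values: mean of the two center values.
    return (ordered[middle - 1] + ordered[middle]) / 2

example_1 = [62, 3, 5, 28, 67, 33, 22, 2, 10]   # 9 values (odd)
example_2 = example_1 + [120]                   # 10 values (even)

print(median_by_hand(example_1))
print(median_by_hand(example_2))
```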



• Mean – adds all values of a distribution of data and divides by the number of values.

$\text{Sample mean} = \bar{x} = \frac{\sum x}{n} = \frac{\text{sum of all \#'s}}{\text{the amt. of \#'s}}$

$\text{Population mean} = \mu = \frac{\sum x}{N}$



• Mean
– Trimmed Mean: removes the highest and lowest values of a group of data before taking a mean. The typical trim amounts are either 5% or 10%.
– 5% Trimmed Mean: take 5% of the number of data points, round the result to a whole number, remove that many values from the top and from the bottom, and then take the average.



• Mean
– Example: Given the following data take the 5% trimmed mean: 34, 56, 72, 74, 78, 82, 85, 85, 88, 90, 90, 92, 95, 95, 99, 100.
• 5% of 16 values is 0.8; therefore round up to 1 and remove the top and bottom scores.
• Remove 34 and 100; the remaining 14 values add up to 1181, and 1181 / 14 = 84.4%.
• If no trimming is done, then the mean would be 82.2%.
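The trimmed-mean example can be reproduced with a short sketch. The helper `trimmed_mean` is a hypothetical name, and it assumes the trim count is rounded to the nearest whole number (0.8 becomes 1), which matches the worked example.

```python
def trimmed_mean(values, trim_fraction):
    """Drop the same number of values from each end, then average the rest."""
    ordered = sorted(values)
    # Assumption: round the trim count to the nearest whole number (half rounds up).
    k = int(trim_fraction * len(ordered) + 0.5)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

scores = [34, 56, 72, 74, 78, 82, 85, 85, 88, 90, 90, 92, 95, 95, 99, 100]

trimmed = trimmed_mean(scores, 0.05)     # removes 34 and 100
untrimmed = sum(scores) / len(scores)

print(round(trimmed, 1), round(untrimmed, 1))
```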



• Measures of Variation – measures that describe the spread of the data.

• Range – the difference between the largest and smallest values of a distribution.
– Example 1: What is the range of the following data: (2, 5, 3, 2, 1, 6, 4, 10, 44, 2, 4, 1, 10, 3, 2, 5)?
– Range fails to tell how much the values vary from one another.



• Sample Standard Deviation – a measurement that gives you a better idea of how the data entries differ from the mean.

$\text{Sample std. deviation} = s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

$\text{Sample variance} = s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

– $x$ = a value in the distribution
– $\bar{x}$ = the sample mean value of the distribution
– $n$ = the total number of values in a sample distribution



• Population Standard Deviation – this is the same as the sample standard deviation, except that it includes the complete population that you are studying, not just a sample set. NOTE: the symbol is different and you divide by the size of the whole population ($N$).

$\text{Population std. deviation} = \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

– $x$ = a value in the distribution
– $\mu$ = the population mean value of the distribution
– $N$ = the total number of values in the population



• Standard Deviation
– Example: Find the standard deviation of the following values: (1, 2, 7, 9, 10, 10).

$x$      $x - \bar{x}$          $(x - \bar{x})^2$
1      1 – 6.5 = -5.5      30.3
2
7
9
10
10

Mean = $\bar{x}$ =
$\sum (x - \bar{x})^2$ =
$s^2$ =
$s$ =
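The worksheet above can be checked with a minimal Python sketch that follows the definition step by step and compares against the standard library's `statistics.stdev` (sample) and `statistics.pstdev` (population). Hand tables that round each squared deviation to one decimal may differ slightly in the last digit.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

values = [1, 2, 7, 9, 10, 10]

x_bar = mean(values)                                   # sample mean
squared_devs = [(x - x_bar) ** 2 for x in values]      # (x - x_bar)^2 column
sum_sq = sum(squared_devs)                             # unrounded column total

sample_variance = sum_sq / (len(values) - 1)           # divide by n - 1
s = sqrt(sample_variance)                              # sample std. deviation
sigma = sqrt(sum_sq / len(values))                     # population version: divide by N

print(x_bar, sum_sq, round(sample_variance, 1), round(s, 2), round(sigma, 2))
```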



– The following is an alternate way to calculate the sample std. deviation.

$\text{Sample std. deviation} = s = \sqrt{\frac{SS_x}{n - 1}}$ where $SS_x = \sum (x^2) - \frac{(\sum x)^2}{n}$



– Previous example: Find the standard deviation of the following values: (1, 2, 7, 9, 10, 10) using the alternate method.

$x$      $x^2$
1      1
2      4
7
9
10
10

$\sum x$ =
$\sum x^2$ =
$SS_x$ =
$s$ =
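A short sketch of the alternate (sum-of-squares) method for the same six values. The variable names are illustrative; only the formula $SS_x = \sum x^2 - (\sum x)^2 / n$ comes from the slide.

```python
from math import sqrt

values = [1, 2, 7, 9, 10, 10]
n = len(values)

sum_x = sum(values)                     # sum of the x column
sum_x2 = sum(x * x for x in values)     # sum of the x^2 column

ss_x = sum_x2 - sum_x ** 2 / n          # SS_x = sum(x^2) - (sum x)^2 / n
s = sqrt(ss_x / (n - 1))                # sample standard deviation

print(sum_x, sum_x2, ss_x, round(s, 2))
```

This route needs only two running totals, which is why it suits a calculator or spreadsheet.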



• Coefficient of Variation – while the standard deviation gives a value that indicates the spread of the data around the mean, the coefficient of variation (CV) expresses that spread as a percentage of the mean.

$CV \text{ for a sample} = \frac{s}{\bar{x}} \times 100$

$CV \text{ for a population} = \frac{\sigma}{\mu} \times 100$

– $s$ = sample standard deviation
– $\bar{x}$ = the sample mean value of the distribution
– $\sigma$ = population standard deviation
– $\mu$ = the population mean value of the distribution
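As an illustration, the sample CV can be computed for the values used in the standard deviation example, (1, 2, 7, 9, 10, 10). Reusing that data set here is an assumption for demonstration; the slides do not give a CV example.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample CV: standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

values = [1, 2, 7, 9, 10, 10]
cv = coefficient_of_variation(values)

print(f"CV = {cv:.1f}%")
```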



• Histograms
– Sometimes it is difficult to see how data is distributed by just looking at the numbers. To see how data is distributed, a histogram is used.
– A histogram is a type of bar graph in which all of the bars touch and the width of each bar represents a class (a range of data values).



• Histograms

[Figure: histogram titled "Probability Test". Horizontal axis: Test Scores, in classes 59.5 - 65.5, 65.5 - 71.5, 71.5 - 77.5, 77.5 - 83.5, 83.5 - 89.5, 89.5 - 95.5, 95.5 - 101.5. Vertical axis: 0 to 10.]



• Histograms Procedure
1. Decide how many classes (bars) you want. It will be given by the problem.
2. To figure out the width of the bars, divide the range by the # of bars and then round up to the next whole number. (NOTE: Always round up, even if the decimal part is less than 0.5, e.g. 5.41 rounds up to 6.)

$\text{Bar Width} = \frac{\text{highest value} - \text{lowest value}}{\text{\# of bars}}$

3. Take the bar width and add it to the lowest value to get the range of the first bar, then add the bar width to the upper limit of that bar to get the range of the next bar. Keep going until you get all of your bar ranges. (e.g. if the lowest value was 60 and your bar width was 6, then the first bar would be 60 – 66, the second bar would be 66 – 72, etc.)



• Histograms Procedure (continued)
A problem occurs if a data point falls exactly on a limit shared by two bars, such as 66 in the example: it could belong to either bar.
In order to alleviate this problem, a boundary is calculated for the bars.
4. Calculate the boundaries of each bar:
a. Find the interval of the data. Is the data given down to whole numbers, tenths, hundredths, etc.? (Note: the data will always have the same interval.)
b. Take the interval and divide by 2. This is the boundary adjustment. (e.g. whole numbers means intervals of 1, so 1/2 = 0.5)
c. For each bar range calculated previously in step 3, subtract the boundary adjustment value from both the lower and upper limits. These will be your new bar ranges or boundaries. (e.g. 60 – 0.5 = 59.5 and 66 – 0.5 = 65.5; first bar 59.5 – 65.5)



5. Calculate the midpoint of each bar:
a. Take the upper and lower limits of a bar, add them together, and divide by 2. This will be the midpoint. (e.g. (59.5 + 65.5) / 2 = 62.5)

$\text{bar midpoint} = \frac{\text{bar upper limit} + \text{bar lower limit}}{2}$

b. Do this for all of the rest of the bars.
c. The midpoint is sometimes used instead of the boundaries to graph the bars.



6. Construct a frequency table by using tally marks.

Bar boundaries      Tally
59.5 – 65.5         ||
65.5 – 71.5         |
71.5 – 77.5         ||
77.5 – 83.5         |
83.5 – 89.5         ||
89.5 – 95.5         |||| ||
95.5 – 101.5        |||| ||||

7. Graph the frequency table using a bar graph arrangement.



• Draw the histogram for the following data. Put it into 5 classes. The data is the number of passing touchdowns for the top 20 rated quarterbacks in the 2011 season.

45, 41, 17, 13, 9, 46, 15, 21, 18, 20,
39, 29, 27, 21, 13, 31, 29, 16, 18, 20
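The histogram procedure (bar width, boundaries, midpoints, frequency table) can be sketched in Python and applied to the touchdown data. The function `histogram_classes` and its `interval` parameter are hypothetical names. The sketch assumes `math.ceil` is an acceptable stand-in for "always round up", so a quotient that is already a whole number is left unchanged.

```python
import math

def histogram_classes(data, num_classes, interval=1):
    """Return the bar width and a (lower, upper, midpoint, frequency) row per bar."""
    low, high = min(data), max(data)
    # Step 2: bar width = range / number of bars, rounded up.
    width = math.ceil((high - low) / num_classes)
    # Step 4: the boundary adjustment is half the data interval.
    start = low - interval / 2
    classes = []
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width
        midpoint = (lower + upper) / 2                        # Step 5
        frequency = sum(1 for x in data if lower < x < upper)  # Step 6
        classes.append((lower, upper, midpoint, frequency))
    return width, classes

touchdowns = [45, 41, 17, 13, 9, 46, 15, 21, 18, 20,
              39, 29, 27, 21, 13, 31, 29, 16, 18, 20]

width, classes = histogram_classes(touchdowns, 5)
for lower, upper, midpoint, frequency in classes:
    print(f"{lower} - {upper}  (midpoint {midpoint})  {'|' * frequency}")
```

Because each boundary sits half an interval away from any possible data value, no value can land exactly on a boundary, so the strict comparisons are safe.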



• Histograms
– If the midpoint of each class is plotted at the height of its bar, the points can be connected with straight line segments.
– This line graph of the midpoints is known as a Frequency Polygon.



• Histograms
– What did the histogram indicate?
– Histograms can be used as a means of predicting outcome or probability. These are known as probability distributions.
– One of the famous probability distributions is the normal distribution, also known as the normal curve or bell curve.



• Normal Distribution
– The graph to the right is an example of a normal distribution. Not only does it indicate the results of the scores, but it can also be used for probability or predictions.



• Normal Distribution Properties
– The curve is bell shaped with the highest point at the mean value.
– It is symmetrical about a vertical line through the mean value.
– The curve approaches the horizontal axis but never touches it.
– The transition points (between cup down and cup up) occur at (mean + standard deviation) and (mean – standard deviation).



• Empirical Rule
– For a normal distribution the following can be said about the data:
• 68.2% of the data will lie within 1 standard deviation on either side of the mean.
• 95.4% of the data will lie within 2 standard deviations on either side of the mean.
• 99.7% of the data will lie within 3 standard deviations on either side of the mean.



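• Normal Distribution Properties
– Share of the data in each band, on each side of the mean:
• Between the mean and 1$\sigma$ = 34.1%
• Between 1$\sigma$ and 2$\sigma$ = 13.6%
• Between 2$\sigma$ and 3$\sigma$ = 2.15%
• Beyond 3$\sigma$ = 0.15%
• These %'s are used to indicate probabilities.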



• Example: Assume the heights of college women are normally distributed, with a mean of 65 inches and a SD of 2.5 inches.
– What % of women are taller than 65 inches, OR what is the probability that one woman selected at random is taller than 65 inches?
– Shorter than 65 inches?
– Between 62.5 and 67.5 inches?
– Between 60 and 70 inches?
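The empirical-rule answers for the height example can be compared with exact areas under the Normal curve. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative. Note that 62.5 to 67.5 is the mean plus or minus 1 SD, and 60 to 70 is the mean plus or minus 2 SD.

```python
from statistics import NormalDist

heights = NormalDist(mu=65, sigma=2.5)   # mean 65 in, SD 2.5 in

# cdf(h) is the area to the left of h, i.e. P(height <= h).
shorter_than_65 = heights.cdf(65)
taller_than_65 = 1 - heights.cdf(65)
within_1_sd = heights.cdf(67.5) - heights.cdf(62.5)
within_2_sd = heights.cdf(70) - heights.cdf(60)

for label, p in [("taller than 65", taller_than_65),
                 ("shorter than 65", shorter_than_65),
                 ("62.5 to 67.5", within_1_sd),
                 ("60 to 70", within_2_sd)]:
    print(f"{label}: {p:.1%}")
```

The exact areas agree with the empirical rule to within a fraction of a percent, which is why the rule is adequate for estimates made by hand.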



• Percentiles
– Sometimes it is more important to see the relative position of a piece of data rather than its exact value.
– Percentile refers to where data lies relative to the other data in the distribution. A data point at the n th percentile means n% of the data falls at or below that point and (100 – n)% falls at or above that point.
– Example: You scored in the 85 th percentile; therefore 85% of the people who took the test scored at or below you while 15% scored at or above you. Note: this does NOT mean you scored 85% on the test.



• Percentiles
– The median is a type of percentile. It is the middle data point in the distribution; therefore it is at the 50th percentile.
– A special type of percentile known as the quartile is also used to evaluate the position of data.
– Quartiles split data into fourths.
– The 1st quartile ($Q_1$) is the 25th percentile, the 2nd quartile ($Q_2$) is the median, and the 3rd quartile ($Q_3$) is the 75th percentile.



• Quartiles

[Figure: ordered data divided into fourths at $Q_1$, $Q_2$, and $Q_3$.]

– Interquartile Range (IQR) = $Q_3 - Q_1$



• Quartiles
– Procedure to compute quartiles:
1. Order the data from smallest to largest.
2. Find the median; this is the 2nd quartile, $Q_2$.
3. The first quartile $Q_1$ is then the median of the lower half of the data. It is the median of the data falling below $Q_2$ and not including $Q_2$.
4. The third quartile $Q_3$ is then the median of the upper half of the data. It is the median of the data falling above $Q_2$ and not including $Q_2$.



• Quartiles
– Example (even # of data):
– Find Q1, Q2, Q3 and the IQR for the following data: (3, 4, 9, 13, 20, 24)
1. Find Q2. Find the median of all of the data. There is no center data point, so take the mean of the two center data points. (9 + 13) / 2 = 11.
2. Find Q1. Find the median of the first half of the data not including Q2. Q1 = 4
3. Find Q3. Find the median of the second half of the data not including Q2. Q3 = 20
4. IQR = Q3 – Q1 = 20 – 4 = 16.
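The quartile procedure can be sketched as a small function. The name `quartiles` is illustrative, and the function follows the median-of-halves rule from the slides, which can differ from the interpolation methods used by some calculators and spreadsheet functions.

```python
from statistics import median

def quartiles(values):
    """Q1, Q2, Q3 by the median-of-halves rule (Q2 itself is excluded from both halves)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q2 = median(ordered)
    lower = ordered[:half]
    # With an odd count, skip the middle value so it is in neither half.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), q2, median(upper)

q1, q2, q3 = quartiles([3, 4, 9, 13, 20, 24])
iqr = q3 - q1

print(q1, q2, q3, iqr)
```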



• Quartiles
– Example: A study of ice cream bars was done. Twenty-seven bars tested were rated as tasting "fair." The cost per bar is listed below. Find the quartiles and the IQR.

0.99, 1.07, 1.00, 0.50, 0.37, 1.03, 1.07, 1.07, 0.97,
0.63, 0.33, 0.50, 0.97, 1.08, 0.47, 0.84, 1.23, 0.25,
0.50, 0.40, 0.33, 0.35, 0.17, 0.38, 0.20, 0.18, 0.16



• Quartiles
– The lowest value, $Q_1$, $Q_2$, $Q_3$, and the highest value of a set of data are together known as the Five-Number Summary.
– In order to graphically represent the five-number summary, a Box-and-Whisker Plot will be used.



• Quartiles
– Box-and-Whisker Plot (shown vertically but can be done horizontally as well)

[Figure: vertical box-and-whisker plot labeled, from top to bottom: Highest Value, $Q_3$, $Q_2$, $Q_1$, Lowest Value.]



• Quartiles
– Procedure to make a Box-and-Whisker Plot:
• Draw a vertical scale to include the lowest and highest data values.
• To the right of the scale draw a box from $Q_1$ to $Q_3$.
• Include a solid line through the box at the median level.
• Draw solid lines called whiskers from $Q_1$ to the lowest value and from $Q_3$ to the highest value.
– EXAMPLE: Go back to the ice cream problem and create a box-and-whisker plot.



• Outliers
– Sometimes an extreme data value can skew the average of a set of data.
– When a data value lies more than 1.5 times the interquartile range (the difference between the 1st and 3rd quartiles) below $Q_1$ or above $Q_3$, then it may be considered an outlier.
– Outliers are sometimes removed from the data so that they do not skew the results.
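A minimal sketch of the 1.5 × IQR check, applied to the data set from the earlier mode and range examples (the one containing 44). It assumes the common reading of the rule: a value is flagged when it falls below Q1 – 1.5 × IQR or above Q3 + 1.5 × IQR. The function name `find_outliers` is hypothetical.

```python
from statistics import mean, median

def find_outliers(values):
    """Flag values outside the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + 1:] if len(ordered) % 2 else ordered[half:])
    iqr = q3 - q1
    fences = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outliers = [x for x in ordered if x < fences[0] or x > fences[1]]
    return outliers, fences

data = [2, 5, 3, 2, 1, 6, 4, 10, 44, 2, 4, 1, 10, 3, 2, 5]
outliers, fences = find_outliers(data)

# Compare the mean with and without the flagged values.
mean_with = mean(data)
mean_without = mean(x for x in data if x not in outliers)

print(outliers, fences, mean_with, mean_without)
```

Comparing the two means shows how much a single extreme value can pull the average.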



• Scatter Plots
– Remember from last unit that data can be plotted as a series of x and y points known as a scatter plot.
– We estimated a line of best fit. In doing this, we were finding a linear correlation that exists between the x and y points.
– We shall analyze the data of a scatter plot more closely in the next couple of slides.



Time (seconds)    Position (meters)
0.7               3.8
1.8               3.2
2.6               2.8
3.4               2.2
3.8               1.8
4.1               1.4
4.9               0.8
6.0               0.2
6.5               0



• Scatter Plots
– The y-distance that a data point is away from the line of best fit is known as a Residual.
– The optimal line of best fit occurs when the sum of the squares of all of the residual values is the smallest. This is known as finding the line of best fit through the Least Squares method.



• Least Squares Method
– Recall that the equation of a line in slope-intercept form is: $y = mx + b$
– This method will allow us to find the optimal slope ($m$) and the y-intercept ($b$) based on the data.
– We will use a similar method here as we did for calculating standard deviation.



• Least Squares Method: $y = mx + b$
– To find the slope m, the following equation is used:

$m = \frac{SS_{xy}}{SS_x}$ where $SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$ and $SS_x = \sum x^2 - \frac{(\sum x)^2}{n}$

– To find the y-intercept b, the following equation is used:

$b = \bar{y} - m\bar{x}$ where $\bar{y}$ is the mean of y and $\bar{x}$ is the mean of x



X-data: Time (seconds)    Y-data: Position (meters)    $x^2$    $xy$
0.7                       3.8
1.8                       3.2
2.6                       2.8
3.4                       2.2
3.8                       1.8
4.1                       1.4
4.9                       0.8
6.0                       0.2
6.5                       0

$\sum x$ =      $\sum y$ =      $\sum x^2$ =      $\sum xy$ =
$\bar{x}$ =      $\bar{y}$ =



• Example:
1. From the example on the previous page find the slope:

$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} =$

$SS_x = \sum x^2 - \frac{(\sum x)^2}{n} =$

$m = \frac{SS_{xy}}{SS_x} =$

2. From the example on the previous page find the y-intercept:

$b = \bar{y} - m\bar{x} =$

3. Write the equation for the line of least squares.

$y = mx + b$
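The least-squares worksheet can be checked with a short Python sketch using the time and position data. The function name `least_squares_line` is illustrative; the formulas are the $SS_{xy}$ and $SS_x$ expressions from the slides, together with the standard intercept formula $b = \bar{y} - m\bar{x}$.

```python
def least_squares_line(xs, ys):
    """Slope m and intercept b of the least-squares line y = mx + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)                 # sum of the x^2 column
    sum_xy = sum(x * y for x, y in zip(xs, ys))     # sum of the xy column
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_x = sum_x2 - sum_x ** 2 / n
    m = ss_xy / ss_x
    b = sum_y / n - m * (sum_x / n)                 # b = y_bar - m * x_bar
    return m, b

time_s = [0.7, 1.8, 2.6, 3.4, 3.8, 4.1, 4.9, 6.0, 6.5]
position_m = [3.8, 3.2, 2.8, 2.2, 1.8, 1.4, 0.8, 0.2, 0.0]

m, b = least_squares_line(time_s, position_m)
print(f"y = {m:.4f}x + {b:.3f}")
```

The result can be compared with the trend line a spreadsheet or a TI-84 regression produces for the same data.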



[Figure: Graph 1: Movement of a Car. Scatter plot of position against Time (seconds), with time from 0 to 7 and position from -0.5 to 4.5, and the fitted trend line y = -0.6974x + 4.419, R² = 0.9886.]



• Measuring the Spread of Data
– There are three methods for measuring the spread of the data around the line of least squares:
• Standard Error of Estimate
• Coefficient of Correlation
• Coefficient of Determination



• Standard Error of Estimate
– In order to do this we look at how far the y value of each data point is from the least squares line.
– This method will calculate a value that is representative of the spread of all of the data.
– We will use values that were already calculated in figuring out the least squares line.



• Standard Error of Estimate

$S_e = \sqrt{\frac{SS_y - m\,SS_{xy}}{n - 2}}$

– Why would it be n – 2? (In other words, why does n have to be > 2?)
– Use the same method as before to find $m$, $SS_{xy}$, and $SS_x$, and:

$SS_y = \sum y^2 - \frac{(\sum y)^2}{n}$



X-data: Time (seconds)    Y-data: Position (meters)    $y^2$
0.7                       3.8
1.8                       3.2
2.6                       2.8
3.4                       2.2
3.8                       1.8
4.1                       1.4
4.9                       0.8
6.0                       0.2
6.5                       0

$\sum x$ =      $\sum y$ =      $\sum y^2$ =

Previously Calculated Data
$SS_{xy}$ =
$m$ =



• Example:
1. From the example on the previous page find the following:

$SS_y = \sum y^2 - \frac{(\sum y)^2}{n} =$

2. From the above calculation and previously calculated data find the standard error of estimate:

$S_e = \sqrt{\frac{SS_y - m\,SS_{xy}}{n - 2}} =$
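A sketch of the standard error of estimate for the time and position data. It uses the shortcut formula from the slide and, as a cross-check, the direct definition based on the residuals $y - y_p$. All function and variable names are illustrative.

```python
from math import sqrt

time_s = [0.7, 1.8, 2.6, 3.4, 3.8, 4.1, 4.9, 6.0, 6.5]
position_m = [3.8, 3.2, 2.8, 2.2, 1.8, 1.4, 0.8, 0.2, 0.0]
n = len(time_s)

def ss(us, vs):
    """Sum of products minus (sum u)(sum v)/n; ss(x, x) gives SS_x."""
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / len(us)

ss_x = ss(time_s, time_s)
ss_y = ss(position_m, position_m)
ss_xy = ss(time_s, position_m)

m = ss_xy / ss_x
b = sum(position_m) / n - m * sum(time_s) / n

# Shortcut formula from the slide.
s_e = sqrt((ss_y - m * ss_xy) / (n - 2))

# Direct definition: residual = actual y minus predicted y on the line.
residuals = [y - (m * x + b) for x, y in zip(time_s, position_m)]
s_e_direct = sqrt(sum(r * r for r in residuals) / (n - 2))

print(round(ss_y, 2), round(s_e, 4), round(s_e_direct, 4))
```

The result carries the units of y (meters here), which is the limitation that motivates the unitless correlation coefficient.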



• Linear Correlation Coefficient, r
– So far, we have been able to figure the line of best fit by using the line of least squares (which is also known as the "least squares regression line of y on x").
– We then wanted to determine the quality of our line by using the standard error of estimate.
– The problem with the standard error of estimate is that it has units of y; therefore, when looking at two different sets of data, you cannot say that one graph is better than the other because the units may skew the result.
– The linear correlation coefficient helps to alleviate this problem by calculating a number that is unitless and therefore independent of the units.



• Linear Correlation Coefficient, r

$r = \frac{SS_{xy}}{\sqrt{SS_x\,SS_y}}$

– The value of r:

r                   Indication
0                   There is no linear relationship of the data points.
1 or -1             There is a perfect linear relationship between the x and y data points; all points lie on the least-squares line.
Between 0 and 1     The x and y data points have a positive correlation (+ slope).
Between 0 and -1    The x and y data points have a negative correlation (- slope).



X-data: Time (seconds)    Y-data: Position (meters)
0.7                       3.8
1.8                       3.2
2.6                       2.8
3.4                       2.2
3.8                       1.8
4.1                       1.4
4.9                       0.8
6.0                       0.2
6.5                       0

Previously Calculated Data
$SS_{xy}$ =
$SS_x$ =
$SS_y$ =



• Example:
1. From the example on the previous page find the following:

$r = \frac{SS_{xy}}{\sqrt{SS_x\,SS_y}} =$

2. What does the value of r indicate about the correlation of the data points?
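The correlation coefficient for the time and position data can be computed from the three sums of squares, as in this minimal sketch. The helper name `ss` is illustrative.

```python
from math import sqrt

time_s = [0.7, 1.8, 2.6, 3.4, 3.8, 4.1, 4.9, 6.0, 6.5]
position_m = [3.8, 3.2, 2.8, 2.2, 1.8, 1.4, 0.8, 0.2, 0.0]

def ss(us, vs):
    """Sum of products minus (sum u)(sum v)/n."""
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / len(us)

ss_x = ss(time_s, time_s)
ss_y = ss(position_m, position_m)
ss_xy = ss(time_s, position_m)

r = ss_xy / sqrt(ss_x * ss_y)
print(round(ss_xy, 2), round(ss_x, 2), round(ss_y, 2), round(r, 4))
```

The sign of r always matches the sign of the slope, because both take their sign from $SS_{xy}$.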



• Coefficient of Determination, $r^2$
– Another way of looking at the quality of your data is to look at how far away some y-data point ($y$) is from the mean of the y-data ($\bar{y}$). This is simply the deviation, $y - \bar{y}$.
– The deviation is made up of two parts:
• The first part indicates how far away the least squares line ($y_p$) is from the mean of the y-data ($\bar{y}$). This is simply $y_p - \bar{y}$, and this is known as the explained portion of the deviation.
• The second part indicates how far away a particular y-data point ($y$) is from the least squares line ($y_p$). This is simply $y - y_p$, and this is known as the unexplained portion of the deviation.



– Recall that when the deviation is squared we get the variance or variation. Based on the explanation before, the variation has two parts: the explained variation and the unexplained variation.
– The Coefficient of Determination is a ratio of the explained variation to the total variation and is simply calculated by taking the Correlation Coefficient ($r$) and squaring it.



– So what does $r^2$ indicate?
– Change $r^2$ into a %.
– The % indicates what % of the variation of the y data is explained by the variation of the x data if we use the least squares line.
– 100% – $r^2$ (with $r^2$ written as a %) indicates what % of the variation of the y data is due to random chance or some other variable besides x that may influence y.



• Example:
1. From the previous example find the Coefficient of Determination, $r^2$:

$r^2 = \frac{(SS_{xy})^2}{SS_x\,SS_y} =$

2. What does the value of $r^2$ indicate about the explained and unexplained portions of the variation?
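A sketch that computes $r^2$ two ways for the time and position data: from the sums of squares, and as the ratio of explained variation to total variation. The two should agree. All names are illustrative.

```python
time_s = [0.7, 1.8, 2.6, 3.4, 3.8, 4.1, 4.9, 6.0, 6.5]
position_m = [3.8, 3.2, 2.8, 2.2, 1.8, 1.4, 0.8, 0.2, 0.0]
n = len(time_s)

def ss(us, vs):
    """Sum of products minus (sum u)(sum v)/n."""
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / len(us)

ss_x, ss_y, ss_xy = ss(time_s, time_s), ss(position_m, position_m), ss(time_s, position_m)
r_squared = ss_xy ** 2 / (ss_x * ss_y)

# Same quantity from the definition: explained variation / total variation.
m = ss_xy / ss_x
y_bar = sum(position_m) / n
b = y_bar - m * sum(time_s) / n
predicted = [m * x + b for x in time_s]                           # y_p values
explained = sum((yp - y_bar) ** 2 for yp in predicted)            # sum of (y_p - y_bar)^2
unexplained = sum((y - yp) ** 2 for y, yp in zip(position_m, predicted))
total = sum((y - y_bar) ** 2 for y in position_m)                 # sum of (y - y_bar)^2

print(f"r^2 = {r_squared:.4f}; explained = {explained / total:.1%}; "
      f"unexplained = {unexplained / total:.1%}")
```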



• Correlation vs Causation
– Correlation refers to one variable changing as another variable changes.
– Causation refers to one variable changing because of another variable changing. (Cause & Effect)
– Just because there is a correlation between two variables does not mean there is causation.


DO NOW / HW Unit 2-1 Check

• Have out your homework and do the following: Find the mode, median, mean and standard deviation.

60%, 63%, 66%, 74%, 74%,
77%, 86%, 89%, 89%, 91%,
91%, 94%, 94%, 94%, 94%,
94%, 97%, 97%, 100%, 100%,
100%, 100%, 100%, 100%, 100%


HW Assignment 2-1 Check

• 10, 12, 14, 18, 36, 37, pg. 449 – 50

10. 8.33; 9; 9
12. 85.625; 85.5; 91
14. 2.77; 2.9; 2.9
18. 14; 4
36. $233,071.43; $142,000; none
37. $645,000; $213,242.66



• Have out your homework and do the following: Make a histogram of the following data in 7 classes. These were the top 32 quarterback ratings in the NFL in 2012.

108.0, 99.1, 90.7, 87.4, 83.3, 81.2, 77.4, 72.6,
105.8, 98.7, 90.5, 87.2, 82.6, 79.8, 76.5, 72.2,
102.4, 97.0, 88.6, 86.2, 81.6, 79.1, 76.1, 66.9,
100.0, 96.3, 87.7, 85.3, 81.3, 78.1, 74.0, 66.7



RANGE: 41.3
CLASSES: 7
BAR WIDTH: 6
BAR STARTING POINT: 66.7
UPPER BAR RANGES: 72.7, 78.7, 84.7, 90.7, 96.7, 102.7, 108.7
INTERVAL: 0.1
BOUNDARY ADJUSTMENT: 0.05
BOUNDARIES STARTING POINT: 66.65
BOUNDARY RANGES: 72.65, 78.65, 84.65, 90.65, 96.65, 102.65, 108.65

Bar boundaries      # OF QB'S
66.65 - 72.65       4
72.65 - 78.65       5
78.65 - 84.65       7
84.65 - 90.65       7
90.65 - 96.65       2
96.65 - 102.65      5
102.65 - 108.65     2


HW Assignment 2-2


EXPERIMENTAL DESIGN

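– Example: Find the standard deviation of the following values: (1, 2, 7, 9, 10, 10).

$x$      $x - \bar{x}$           $(x - \bar{x})^2$
1      1 – 6.5 = -5.5      30.3
2      2 – 6.5 = -4.5      20.3
7      7 – 6.5 = 0.5       0.3
9      9 – 6.5 = 2.5       6.3
10     10 – 6.5 = 3.5      12.3
10     10 – 6.5 = 3.5      12.3
                           $\sum$ = 81.8

Mean = 39/6 = 6.5
– $s^2$ = 81.8 / 5 = 16.4
$s$ = 4.05

Note: each squared deviation above is rounded to one decimal place, so the total (81.8) is slightly larger than the unrounded total of 81.5.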


EXPERIMENTAL DESIGN

– Previous example: Find the standard deviation of the following values: (1, 2, 7, 9, 10, 10) using the alternate method.

$x$      $x^2$
1      1
2      4
7      49
9      81
10     100
10     100

$\sum x$ = 39      $\sum x^2$ = 335

– $SS_x$ = 335 – 39²/6 = 81.5
$s$ = 4.04

