section 2 2018

2nd English Descriptive Statistics

REVIEW OF CH. 3 AND 4

Each measure is given for raw data and for grouped data. In the grouped-data formulas, \(X_i\) is the midpoint of the \(i\)-th class, \(f_i\) is the frequency of that class, and sums run over the classes.

Mean

Raw data:
\[ \bar{X} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Grouped data:
\[ \bar{X} = \frac{\sum_{i} f_i X_i}{\sum_{i} f_i} \]

Median

Raw data:
1- Sort the data.
2- Calculate:
- odd sample: the \(\left[\frac{n+1}{2}\right]\)th value.
- even sample: (the \(\left[\frac{n}{2}\right]\)th value + the \(\left[\frac{n}{2}+1\right]\)th value)/2.
n: the sample size.

Grouped data:
\[ MD = A + \frac{\frac{n}{2} - f_1}{f_2 - f_1}\, L \]
A is the lower limit (boundary) of the class of the median.
\(f_1\) is the cumulative frequency of all the classes before the class of the median.
\(f_2\) is the cumulative frequency up to and including the class of the median, so \(f_2 - f_1\) is the frequency of that class.
L is the width of the class.

Mode

Raw data: the value that occurs most often.

Grouped data:
\[ \text{Mode} = A + \frac{f - f_1}{2f - f_1 - f_2}\, L \]
A is the lower limit (boundary) of the class containing the mode.
\(f\) is the largest frequency (the frequency of the class containing the mode).
\(f_1\) is the frequency of the class preceding the class containing the mode.
\(f_2\) is the frequency of the class following the class containing the mode.
L is the width of the class.

1st quartile (Q1)

Raw data:
1- Sort the data.
2- Calculate \(\frac{n+1}{4}\):
- if \(\frac{n+1}{4}\) is an integer, Q1 = the \(\left[\frac{n+1}{4}\right]\)th value.
- otherwise, Q1 = L + F×(U−L).
L: the value at the position given by the integer part of \(\frac{n+1}{4}\).
U: the value at the next position (\(\frac{n+1}{4}\) rounded up).
F: the fractional part of \(\frac{n+1}{4}\).
Ex: 2; 5; 7; 8; 9; 11
\(\frac{n+1}{4} = 1.75\), so Q1 lies between the 1st and 2nd values: Q1 = 2 + 0.75×(5−2) = 4.25.

Grouped data:
\[ Q_1 = A + \frac{\frac{n}{4} - f_1}{f_2 - f_1}\, L \]
A is the lower limit (boundary) of the class of Q1 (the class containing position \(\frac{n}{4}\)).
\(f_1\) is the cumulative frequency of all the classes before the class of Q1.
\(f_2\) is the cumulative frequency up to and including the class of Q1.
L is the width of the class.

3rd quartile (Q3)

Raw data:
1- Sort the data.
2- Calculate \(\frac{3(n+1)}{4}\):
- if \(\frac{3(n+1)}{4}\) is an integer, Q3 = the \(\left[\frac{3(n+1)}{4}\right]\)th value.
- otherwise, Q3 = L + F×(U−L), with L, U and F defined as for Q1 but using \(\frac{3(n+1)}{4}\).
Ex: 2; 5; 7; 8; 9; 11
\(\frac{3(n+1)}{4} = 5.25\), so Q3 lies between the 5th and 6th values: Q3 = 9 + 0.25×(11−9) = 9.5.

Grouped data:
\[ Q_3 = A + \frac{\frac{3n}{4} - f_1}{f_2 - f_1}\, L \]
A is the lower limit (boundary) of the class of Q3 (the class containing position \(\frac{3n}{4}\)).
\(f_1\) is the cumulative frequency of all the classes before the class of Q3.
\(f_2\) is the cumulative frequency up to and including the class of Q3.
L is the width of the class.

Midrange

Raw data: (Lowest + Highest)/2
Grouped data: (Lowest boundary + Highest boundary)/2

Sample variance

Raw data:
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

Grouped data:
\[ S^2 = \frac{\sum_{i} (X_i - \bar{X})^2 f_i}{\sum_{i} f_i - 1} \]

Range

Raw data: (Highest − Lowest)
Grouped data: (Highest boundary − Lowest boundary)

Interquartile range
\[ Q = Q_3 - Q_1 \]

Semi-interquartile range
\[ \frac{Q_3 - Q_1}{2} \]

Coefficient of variation (relative measure)
\[ \frac{S}{\bar{X}} \times 100\%, \qquad S = \sqrt{S^2} \]

Range Rule of Thumb
\[ S \approx \frac{\text{Range}}{4} \]
This corresponds to Chebyshev's Theorem when k = 2.

Important Notes

1-

2- Properties of the Mean
 Uses all data values.
 Varies less than the median or mode.
 Used in computing other statistics, such as the variance.
 Unique; it need not be one of the data values.
 Affected by extremely high or low values, called outliers.
 Cannot be used for nominal or ordinal data.

Properties of the Median
 Does not use all data values.
 Affected less than the mean by extremely high or extremely low values.
 Cannot be used for nominal data.

Properties of the Mode
 Easiest measure to compute.
 Can be used with nominal data.
 Not always unique, and may not exist.

Properties of the Midrange
 Easy to compute.
 Affected by extremely high or low values in a data set.

3- Chebyshev's Theorem and the Empirical Rule

Chebyshev's Theorem:
\[ P(\mu - k\sigma < x < \mu + k\sigma) \ge 1 - \frac{1}{k^2}, \qquad k > 1. \]

Number of standard    Minimum proportion within     Minimum percentage within
deviations, k         k standard deviations         k standard deviations
2                     1 − 1/4 = 3/4                 75%
3                     1 − 1/9 = 8/9                 88.89%
4                     1 − 1/16 = 15/16              93.75%

EX: The mean price of houses in a certain neighborhood is $50,000, and the standard deviation is $10,000.
1- Find the price range for which at least 55% of the houses will sell.
2- Find the price range for which at least 75% of the houses will sell.

1- Chebyshev's Theorem states that at least 55% of a data set will fall within 1.5 standard deviations of the mean (1 − 1/1.5² ≈ 0.556).
Lowest value = 50000 − 1.5 × 10000 = 35000
Highest value = 50000 + 1.5 × 10000 = 65000

2- Chebyshev's Theorem states that at least 75% of a data set will fall within 2 standard deviations of the mean.
Lowest value = 50000 − 2 × 10000 = 30000
Highest value = 50000 + 2 × 10000 = 70000

Note: there is a negative relation between accuracy and the estimated range: the larger the guaranteed percentage, the wider (less precise) the range.

In the case that the shape of the distribution of the data is roughly bell-shaped, the Empirical Rule states that:
The interval (μ − σ, μ + σ) will contain approximately 68% of all the measurements.
The interval (μ − 2σ, μ + 2σ) will contain approximately 95% of all the measurements.
The interval (μ − 3σ, μ + 3σ) will contain approximately 99.7% of all the measurements.

EX: A survey of local companies found that the mean expenditure on travel per individual was $0.25 per month. The standard deviation was $0.025.
1- Using Chebyshev's theorem, find the minimum percentage of the individuals' expenditures that will fall between $0.20 and $0.30.
2- Assuming the distribution of the individuals' expenditures is bell-shaped, find the approximate percentage of the expenditures that will fall between $0.20 and $0.30.

1- Compute the value of k:
\[ k = \frac{0.30 - 0.25}{0.025} = 2 \quad \text{or} \quad k = \frac{0.25 - 0.20}{0.025} = 2 \]
At least 75% of the individuals' expenditures will fall between $0.20 and $0.30.

2- By the Empirical Rule with k = 2, approximately 95% of the individuals' expenditures will fall between $0.20 and $0.30.
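The raw-data quartile rule from the review (position q(n+1)/4, then L + F×(U−L) when the position is not an integer) can be sketched in Python as follows. The function name `quartile_raw` and its error handling are assumptions made for this illustration; they are not part of the handout.

```python
import math

def quartile_raw(data, q):
    """Return the q-th quartile (q = 1, 2 or 3) using the (n+1)-position rule.

    Hypothetical helper for illustration: position = q*(n+1)/4, then
    L + F*(U - L) when the position is not an integer.
    """
    values = sorted(data)                 # step 1: sort the data
    n = len(values)
    pos = q * (n + 1) / 4                 # 1-based position of the quartile
    if pos < 1 or pos > n:
        raise ValueError("sample too small for this position rule")
    lower_pos = math.floor(pos)           # integer part of the position
    frac = pos - lower_pos                # F: fractional part of the position
    lower = values[lower_pos - 1]         # L: value at the integer-part position
    if frac == 0:
        return lower                      # integer position: take that value
    upper = values[lower_pos]             # U: value at the next position
    return lower + frac * (upper - lower)

sample = [2, 5, 7, 8, 9, 11]              # the handout's example data
q1 = quartile_raw(sample, 1)
q3 = quartile_raw(sample, 3)
print(q1, q3, q3 - q1)                    # Q1, Q3 and the interquartile range
```

For the handout's data this reproduces Q1 = 4.25 and Q3 = 9.5. With q = 2 the same rule gives the median. Note that statistical libraries often use other quartile conventions, so their results may differ slightly from this rule.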
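The grouped-data formulas for the median, Q1, Q3 and the mode share one interpolation pattern: A + (fraction of the class) × L. The sketch below assumes equal class widths and takes the second cumulative frequency as the count up to and including the target class, so that f2 − f1 is the frequency of that class. The frequency table and the function names are hypothetical and chosen only for illustration.

```python
def grouped_quantile(lower_boundaries, width, freqs, p):
    """Interpolated position p*n (p = 0.5 median, 0.25 Q1, 0.75 Q3).

    Implements A + (p*n - f1) / (f2 - f1) * L, where f1 is the cumulative
    frequency before the class and f2 the cumulative frequency through it.
    """
    n = sum(freqs)
    target = p * n
    cum_before = 0                               # f1 for the current class
    for a, f in zip(lower_boundaries, freqs):
        cum_through = cum_before + f             # f2 for the current class
        if f > 0 and cum_through >= target:      # first class reaching p*n
            return a + (target - cum_before) / (cum_through - cum_before) * width
        cum_before = cum_through
    raise ValueError("no class contains the requested position")

def grouped_mode(lower_boundaries, width, freqs):
    """Mode = A + (f - f1) / (2f - f1 - f2) * L for the first modal class."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])
    f = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                # frequency of preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # frequency of following class
    return lower_boundaries[i] + (f - f1) / (2 * f - f1 - f2) * width

# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40
lower = [0, 10, 20, 30]
freqs = [5, 8, 12, 5]
print(grouped_quantile(lower, 10, freqs, 0.50))  # median
print(grouped_quantile(lower, 10, freqs, 0.25))  # Q1
print(grouped_quantile(lower, 10, freqs, 0.75))  # Q3
print(grouped_mode(lower, 10, freqs))            # mode
```

For this made-up table the median class is 20-30 (cumulative frequencies 13 before and 25 through it), so the median is 20 + (15 − 13)/12 × 10.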
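As a rough check on the grouped-data mean, sample variance and coefficient of variation, the following sketch evaluates the formulas from class midpoints and frequencies. The midpoints and frequencies are invented for the example, and `grouped_mean_variance` is a hypothetical helper name.

```python
import math

def grouped_mean_variance(midpoints, freqs):
    """Grouped mean and sample variance from class midpoints X_i and frequencies f_i."""
    n = sum(freqs)                                            # sum of f_i = sample size
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n   # sum(f_i X_i) / sum(f_i)
    var = sum(f * (x - mean) ** 2
              for x, f in zip(midpoints, freqs)) / (n - 1)    # divide by sum(f_i) - 1
    return mean, var

# Hypothetical table: midpoints of classes 0-10, 10-20, 20-30, 30-40
midpoints = [5, 15, 25, 35]
freqs = [5, 8, 12, 5]

mean, var = grouped_mean_variance(midpoints, freqs)
std = math.sqrt(var)               # S = sqrt(S^2)
cv = std / mean * 100              # coefficient of variation, in percent
print(mean, var, std, cv)
```

Because the coefficient of variation divides by the mean, it is only meaningful when the mean is not zero.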
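The two Chebyshev examples can be checked numerically. This is a minimal sketch; the function names `chebyshev_min_proportion` and `chebyshev_interval` are introduced here for illustration only.

```python
def chebyshev_min_proportion(k):
    """Minimum proportion within k standard deviations: 1 - 1/k^2, for k > 1."""
    if k <= 1:
        raise ValueError("Chebyshev's bound requires k > 1")
    return 1 - 1 / k ** 2

def chebyshev_interval(mean, std, k):
    """Interval (mean - k*std, mean + k*std)."""
    return mean - k * std, mean + k * std

# House prices: mean 50,000 and standard deviation 10,000
print(chebyshev_min_proportion(1.5), chebyshev_interval(50000, 10000, 1.5))
print(chebyshev_min_proportion(2), chebyshev_interval(50000, 10000, 2))

# Travel expenditures: mean 0.25, standard deviation 0.025, interval 0.20 to 0.30
k = (0.30 - 0.25) / 0.025          # number of standard deviations from the mean
print(k, chebyshev_min_proportion(k))
```

With k = 1.5 the bound is about 0.556, which is why the first house-price answer is stated as "at least 55%". The bound holds for any distribution; the 95% figure in the second expenditure answer comes from the Empirical Rule and needs the bell-shape assumption.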
