Course: D0722 - Statistika dan Aplikasinya (Statistics and Its Applications)
Year: 2010

Introduction
Session 1

Learning Outcomes
• By the end of this session, students are expected to be able to:
  1. define measurement scales, sample, population, data, and data collection
  2. explain descriptive statistics

Complete Business Statistics, 5th edition
McGraw-Hill/Irwin, Aczel/Sounderpandian, © The McGraw-Hill Companies, Inc., 2002

Using Statistics (Two Categories)
Descriptive Statistics
• Collect
• Organize
• Summarize
• Display
• Analyze
Inferential Statistics
• Predict and forecast values of population parameters
• Test hypotheses about values of population parameters
• Make decisions

Types of Data - Two Types
Qualitative
• Categorical or Nominal
• Examples: Color, Gender, Nationality
Quantitative
• Measurable or Countable
• Examples: Temperatures, Salaries, Number of points scored on a 100-point exam

Scales of Measurement
• Nominal Scale - groups or classes. Example: Gender
• Ordinal Scale - order matters. Example: Ranks
• Interval Scale - difference or distance matters; has an arbitrary zero value. Example: Temperatures
• Ratio Scale - ratio matters; has a natural zero value. Example: Salaries

Samples and Populations
• A population consists of the set of all measurements in which the investigator is interested.
• A sample is a subset of the measurements selected from the population.
• A census is a complete enumeration of every item in a population.

Why Sample?
Census of a population may be:
• Impossible
• Impractical
• Too costly

12-6 Index Numbers
An index number is a number that measures the relative change in a set of measurements over time. For example: the Dow Jones Industrial Average (DJIA), the Consumer Price Index (CPI), the New York Stock Exchange (NYSE) Index.

Index number in period i:
\[ \text{Index}_i = 100 \times \frac{\text{Value in period } i}{\text{Value in base period}} \]

Changing the base period of an index:
\[ \text{New index value} = 100 \times \frac{\text{Old index value}}{\text{Index value of new base}} \]

Index Numbers

Year   Price   Index (1984 base)   Index (1991 base)
1984   121     100.0                64.7
1985   121     100.0                64.7
1986   133     109.9                71.1
1987   146     120.7                78.1
1988   162     133.9                86.6
1989   164     135.5                87.7
1990   172     142.1                92.0
1991   187     154.5               100.0
1992   197     162.8               105.3
1993   224     185.1               119.8
1994   255     210.7               136.4
1995   247     204.1               132.1
1996   238     196.7               127.3
1997   222     183.5               118.7

[Figure: Price and Index (1982=100) of Natural Gas. y-axis: Price; x-axis: Year; series labels: Original, Index (1984), Index (1991)]

Summary Measures: Population Parameters, Sample Statistics
Measures of Central Tendency
• Median
• Mode
• Mean
Measures of Variability
• Range
• Interquartile range
• Variance
• Standard Deviation
Other summary measures:
• Skewness
• Kurtosis

Measures of Central Tendency or Location
• Median: middle value when sorted in order of magnitude; the 50th percentile
• Mode: most frequently occurring value
• Mean: average

Arithmetic Mean or Average
The mean of a set of observations is their average: the sum of the observed values divided by the number of observations.

Population Mean
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]

Sample Mean
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Percentiles and Quartiles
Given any set of numerical observations, order them according to magnitude.
The Pth percentile in the ordered set is that value below which lie P% (P percent) of the observations in the set.
The position of the Pth percentile is given by (n + 1)P/100, where n is the number of observations in the set.

Quartiles - Special Percentiles
Quartiles are the percentage points that break down the ordered data set into quarters.
• The first quartile is the 25th percentile. It is the point below which lie 1/4 of the data.
• The second quartile is the 50th percentile. It is the point below which lie 1/2 of the data. This is also called the median.
• The third quartile is the 75th percentile. It is the point below which lie 3/4 of the data.

Measures of Variability or Dispersion
• Range: difference between maximum and minimum values
• Interquartile Range: difference between third and first quartile (Q3 - Q1)
• Variance: average* of the squared deviations from the mean
• Standard Deviation: square root of the variance
* Definitions of population variance and sample variance differ slightly.
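The definitions above can be tried out with a short Python sketch. It uses the 20 sales observations from this deck's range and interquartile range example. The function names `percentile` and `variance` are hypothetical, and the linear interpolation between neighbouring ranks is an assumption of the sketch (it matches the example's arithmetic, e.g. 13 + (.25)(1)); the sample variance is taken to divide by n - 1 and the population variance by N.

```python
from math import sqrt

# The 20 sales observations from the deck's range / interquartile range example.
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]


def percentile(values, p):
    """Return the Pth percentile using the position (n + 1) * P / 100.

    Interpolating between neighbouring ranks and clamping the position
    to 1..n are assumptions of this sketch.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) * p / 100           # 1-based position in the ordered set
    position = min(max(position, 1), n)    # keep the position inside 1..n
    rank = int(position)                   # whole part of the position
    fraction = position - rank             # fractional part of the position
    if rank == n:                          # already at the largest observation
        return float(ordered[-1])
    below, above = ordered[rank - 1], ordered[rank]
    return below + fraction * (above - below)


def variance(values, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 or N."""
    mean = sum(values) / len(values)
    squared_deviations = sum((x - mean) ** 2 for x in values)
    divisor = len(values) - 1 if sample else len(values)
    return squared_deviations / divisor


q1 = percentile(sales, 25)
median = percentile(sales, 50)
q3 = percentile(sales, 75)
data_range = max(sales) - min(sales)
iqr = q3 - q1
pop_var = variance(sales, sample=False)
samp_var = variance(sales, sample=True)
samp_std = sqrt(samp_var)

print(q1, median, q3)    # 13.25 16.0 18.75
print(data_range, iqr)   # 18 5.5
```

The quartiles, range, and interquartile range agree with the values worked out by hand in the example; other software may use a different percentile convention and give slightly different quartiles.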

Example - Range and Interquartile Range
(Data is used from Example )

Sales (original order): 9 6 12 10 13 15 16 14 14 16 17 16 24 21 22 18 19 18 20 17

Rank   Sorted Sales
 1      6   (Minimum)
 2      9
 3     10
 4     12
 5     13
 6     14
 7     14
 8     15
 9     16
10     16
11     16
12     17
13     17
14     18
15     18
16     19
17     20
18     21
19     22
20     24   (Maximum)

Range = Maximum - Minimum = 24 - 6 = 18
First Quartile: Q1 = 13 + (.25)(1) = 13.25
Third Quartile: Q3 = 18 + (.75)(1) = 18.75
Interquartile Range = Q3 - Q1 = 18.75 - 13.25 = 5.5
See slide # 19 for the template output.

Variance and Standard Deviation

Population Variance
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N}, \qquad \sigma = \sqrt{\sigma^2} \]

Sample Variance
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}, \qquad s = \sqrt{s^2} \]

Group Data and the Histogram
Dividing data into groups or classes or intervals
Groups should be:
Mutually exclusive
• Not overlapping - every observation is assigned to only one group
Exhaustive
• Every observation is assigned to a group
Equal-width (if possible)
• First or last group may be open-ended

Frequency Distribution
Table with two columns listing:
Each and every group or class or interval of values
Associated frequency of each group
• Number of observations assigned to each group
• Sum of frequencies is number of observations
  – N for population
  – n for sample
Class midpoint is the middle value of a group or class or interval
Relative frequency is the proportion of total observations in each class
• Sum of relative frequencies = 1

Cumulative Frequency Distribution

x: Spending Class ($)    F(x): Cumulative Frequency    F(x)/n: Cumulative Relative Frequency
0 to less than 100        30                            0.163
100 to less than 200      68                            0.370
200 to less than 300     118                            0.641
300 to less than 400     149                            0.810
400 to less than 500     171                            0.929
500 to less than 600     184                            1.000

The cumulative frequency of each group is the sum of the frequencies of that and all preceding groups.

Histogram
A histogram is a chart made of bars of different heights.
• Widths and locations of bars correspond to widths and locations of data groupings
• Heights of bars correspond to frequencies or relative frequencies of data groupings

Histogram Example
[Figure: Frequency Histogram]

Skewness and Kurtosis
Skewness
– Measure of asymmetry of a frequency distribution
• Skewed to left
• Symmetric or unskewed
• Skewed to right
Kurtosis
– Measure of flatness or peakedness of a frequency distribution
• Platykurtic (relatively flat)
• Mesokurtic (normal)
• Leptokurtic (relatively peaked)

Skewness
[Figure: Skewed to left]
[Figure: Symmetric]
[Figure: Skewed to right]

Kurtosis
[Figure: Platykurtic - flat distribution]
[Figure: Mesokurtic - not too flat and not too peaked]
[Figure: Leptokurtic - peaked distribution]

Methods of Displaying Data
• Pie Charts: categories represented as percentages of total
• Bar Graphs: heights of rectangles represent group frequencies
• Frequency Polygons: height of line represents frequency
• Ogives: height of line represents cumulative frequency
• Time Plots: represent values over time
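As a rough illustration of how frequencies, cumulative frequencies, and the ogive relate, the Python sketch below starts from the cumulative frequencies in the spending-class table and works backwards to the class frequencies. The variable names are hypothetical, and plotting each ogive point at the upper boundary of its class is an assumption of the sketch rather than something the slides state.

```python
from itertools import accumulate

# Upper class boundaries ($) and cumulative frequencies from the spending table.
upper_bounds = [100, 200, 300, 400, 500, 600]
cumulative = [30, 68, 118, 149, 171, 184]

# The frequency of a class is the increase in cumulative frequency over the
# preceding class (the first class is compared with zero).
frequencies = [c - prev for c, prev in zip(cumulative, [0] + cumulative[:-1])]

n = cumulative[-1]                                # total number of observations
relative = [f / n for f in frequencies]           # relative frequency per class
rebuilt = list(accumulate(frequencies))           # running sum gives F(x) back
cumulative_relative = [round(c / n, 3) for c in rebuilt]

# Ogive points: (upper class boundary, cumulative relative frequency),
# starting from zero at the lower boundary of the first class (assumption).
ogive_points = [(0, 0.0)] + list(zip(upper_bounds, cumulative_relative))

print(frequencies)           # [30, 38, 50, 31, 22, 13]
print(cumulative_relative)   # [0.163, 0.37, 0.641, 0.81, 0.929, 1.0]
```

The recomputed cumulative relative frequencies match the F(x)/n column of the table, and the relative frequencies sum to 1 as the frequency distribution slide requires.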

Pie Chart
[Figure: pie chart]

Bar Chart
[Figure 1-11: Airline Operating Expenses and Revenues. Bars for Average Revenues and Average Expenses by Airline: American, Continental, Delta, Northwest, Southwest, United, USAir]

Frequency Polygon and Ogive
[Figure: two panels, Relative Frequency Polygon and Ogive; x-axis: Sales]

Time Plot
[Figure: Monthly Steel Production (Problem 1-46). y-axis: Millions of Tons; x-axis: Month]

Exploratory Data Analysis - EDA
Techniques to determine relationships and trends, identify outliers and influential observations, and quickly describe or summarize data sets.
Stem-and-Leaf Displays
• Quick-and-dirty listing of all observations
• Conveys some of the same information as a histogram
Box Plots
• Median
• Lower and upper quartiles
• Maximum and minimum

Example Stem-and-Leaf Display

1 | 122355567
2 | 0111222346777899
3 | 012457
4 | 11257
5 | 0236
6 | 02

Box Plot
Elements of a Box Plot
• Box from Q1 to Q3 (the Interquartile Range), with the Median marked inside
• Whisker ends (X): smallest data point not below the inner fence, and largest data point not exceeding the inner fence
• Inner Fences: Q1 - 1.5(IQR) and Q3 + 1.5(IQR)
• Outer Fences: Q1 - 3(IQR) and Q3 + 3(IQR)
• Suspected outlier (*)
• Outlier (o)

Example: Box Plot
[Figure: box plot]

Summary
• Measurement scales: nominal, ordinal, interval, ratio
• Data presentation: frequency histogram
• Index numbers
• Descriptive statistics: measures of central tendency and dispersion
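
The inner and outer fences of the box plot can be sketched in Python as follows. The quartiles are the Q1 = 13.25 and Q3 = 18.75 from the sales example earlier in the deck. The function names `fences` and `classify` are hypothetical, and the classification rule (a point between the inner and outer fences is a suspected outlier, a point beyond an outer fence is an outlier) is an assumption based on a common reading of the box plot diagram. The test values 30 and 40 are made up for illustration and are not part of the sales data.

```python
def fences(q1, q3):
    """Return the inner and outer fences as (lower, upper) pairs."""
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3 * iqr, q3 + 3 * iqr),
    }


def classify(x, q1, q3):
    """Label a data point relative to the fences (assumed rule, see lead-in)."""
    limits = fences(q1, q3)
    inner_low, inner_high = limits["inner"]
    outer_low, outer_high = limits["outer"]
    if x < outer_low or x > outer_high:
        return "outlier"
    if x < inner_low or x > inner_high:
        return "suspected outlier"
    return "within inner fences"


# Quartiles from the sales example: Q1 = 13.25, Q3 = 18.75, so IQR = 5.5.
sales_fences = fences(13.25, 18.75)
print(sales_fences["inner"])   # (5.0, 27.0)
print(sales_fences["outer"])   # (-3.25, 35.25)

# 24 is the largest sales value; 30 and 40 are hypothetical extra points.
labels = [classify(x, 13.25, 18.75) for x in (24, 30, 40)]
print(labels)
```

With these fences every one of the 20 sales values (6 through 24) falls inside the inner fences, so a box plot of that data would show no outliers.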