Chapter 1.1_0

Chapter 1 EQT 271 SEM II 2014/2015 DESCRIPTIVE AND INFERENTIAL STATISTICS 1. Statistics is the sciences of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data. 2. A variable is a characteristic of attributes that can assume different values. 3. A population consists of all subjects (human or otherwise) that are being studied. 4. A sample is a group of subjects selected from a population. 5. In a Census, the researchers collect data from all subjects in the population 6. Descriptive statistics consists of the collection, organization, summarization, and presentation of data. 7. Inferential statistics consist of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables and making predictions. VARIABLES AND TYPES OF DATA 1. Qualitative variables are variables that have distinct categories according to some characteristic or attribute. (e.g : Genders (male or female), Race (Malay, Chinese, Indian)) 2. Quantitative variables are variables that can be counted or measured. 3. Discrete variables assume values that can be counted. (e.g: Number of children in a family, number of students in a classroom) 4. Continuous variables can assume an infinite number of values between any two specific values. They are obtained by measuring. They often include fractions and decimals. Khatijahhusna Abd Rani Page 1 Chapter 1 EQT 271 SEM II 2014/2015 5. Level of measurements: a. Nominal: classifies data into mutually exclusive (nonoverlapping) categories in which no order or ranking can be imposed on the data. b. Ordinal: classifies data into categories that can be ranked; however, precise differences between the ranks do not exist. c. Interval: ranks data and precise differences between units of measure do exist; however, there is no meaningful zero d. Ratio: possesses all the characteristics of interval measurement, and there exists a true zero. In addition, true ratios exist when the same variable is measured on two different members of the population. FREQUENCY DISTRIBUTIONS AND GRAPHS 1. A frequency distribution is the organization of raw data in table form, using classes and frequencies. 2. A Grouped frequency distribution is used if variables take a large number of values or the variable is continuous. 3. Class Width: subtracting lower class limit of one class from the lower class limit of the next class. The class width also can be found by subtracting lower boundary from the upper boundary for any given class. Class limits Class boundaries 100-104 99.5-104.5 105-109 104.5-109.5 Class Width = 105-100 =5 Class Width OR = 109.5-104.5 =5 4. The Histogram is a graph that displays the data by using contiguous vertical bars (unless the frequency of a class is 0) of various heights to represent the frequencies of the classes. Khatijahhusna Abd Rani Page 2 Chapter 1 EQT 271 SEM II 2014/2015 Measures of central tendency 1. The Mean ( x ) is the sum of the values, divided by total number of values 2. The Median ( ~ x ) is the midpoint of the data array. Steps: I. Arrange the data values in ascending order II. Determine the number of values in the data set III. a. If n is odd, select the middle data as the median b. If n is even, find the mean of the two middle values. 3. The Mode ( x̂ ) is the value that occurs most often (higher frequency) in a data set 4. Best Measure of central tendency: Type of Variable Best measure of central tendency Nominal Mode Ordinal Median Interval/Ratio (not skewed) Mean Interval/ Ratio (Skewed) Median Measures of dispersion 1. The Range is the highest value minus the lowest value. The Symbol R is used for the range. R = highest value-lowest value 2. The Variance is the average of the squares of the distance each value is from the mean. 3. The Standard deviation is the square root of variance. 4. A statistical rule stating that for normal distribution (bell shaped), is almost all data will fall within three standard deviations of the mean. The Empirical Rule shows that 68% will fall within the 1 standard deviation, 95% within 2 two standard deviations, and 99.7% will fall within the 3 standard deviations of the mean. Khatijahhusna Abd Rani Page 3 Chapter 1 EQT 271 SEM II 2014/2015 Example: Suppose the age distribution of a sample of 1000 persons is bell shaped (normally distributed) with a mean of 40 years and a standard deviation of 12 years. s s Approximately 68% of the sample is between ages 28 (40-12) years old and 52(40+12) years old. 28 x  40 52 Measures of position 1. A z score or Standard score for a value is obtained by subtracting the mean from the value dividing the result by the standard deviation. Z value - mean standard deviation The major purpose of standard scores is to place scores for any individual on any variable having any mean and standard deviation on the same standard scale so that comparisons can be made. When all data for a variable are transformed into z scores, the resulting distribution will have a mean of 0 and standard deviation of 1. z score for samples, z xx s z score for populations, z Khatijahhusna Abd Rani x  Page 4 Chapter 1 EQT 271 SEM II 2014/2015 2. Quartiles divide the distribution into four groups, denoted by Q1 Q2 Q3 3. An Outlier is an extremely high or an extremely low data value when compared with the rest of the data values. The five number summary and Boxplots 1. A boxplot can be used to graphically represent the data set. These plots involve five specific values: i. The lowest value of the data sets (i.e., minimum) ii. Q1 iii. Q2 iv. Q3 v. The highest value of the data set (i.e., maximum) STAY FOCUSED AND NEVER GIVE UP!! ^_^ Khatijahhusna Abd Rani Page 5

Chapter 1.1_0

Related documents

Products

Support

Chapter 1.1_0

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib