Chapter 1 Introduction to Statistics

Statistics
The practice or science of collecting and analysing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.

Characteristics of Statistics (read notes)
● Statistics are numerically expressed.
● Statistics are an aggregate of facts.
● Data are collected in a systematic order.
● Data should be comparable to each other.
● Data are collected for a planned purpose.

Limitations of Statistics (read notes)
● Statistics does not deal with individual measurements.
● Statistics deals only with quantitative characteristics.
● Statistical results are true only on average.
● Statistics is only one of the methods of studying a problem.
● Statistics can be misused.

Descriptive Statistics
Descriptive statistics are used to summarize a set of data. Suppose you were teaching a class of 25 students and wanted to know the average score on the test you just gave. You would use descriptive statistics, because you are interested in the performance of that particular set of students. One limitation of descriptive statistics is that they do not allow us to make any inferences about the population at large. This type of statistics simply describes the data set that has been collected.

Inferential Statistics
Now, suppose you need to collect data on a very large population. For example, suppose you want to know the average height of all the men in a city with several million residents. It isn't practical to measure the height of each man. This is where inferential statistics comes into play. Inferential statistics makes inferences about a population using data drawn from that population. Instead of gathering data from the entire population, the statistician collects a sample or samples from the millions of residents and uses them to make inferences about the entire population.
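The two ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 25 test scores are made-up values, and the "population" of heights is simulated from a normal distribution with an assumed mean of 175 cm and standard deviation of 7 cm. Neither comes from real data.

```python
import random
import statistics

# Descriptive statistics: summarize exactly the data in hand.
# Hypothetical test scores for a class of 25 students (made-up values).
scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 77,
          69, 84, 91, 73, 80, 86, 75, 92, 68, 79,
          83, 87, 74, 89, 76]
class_mean = statistics.mean(scores)

# Inferential statistics: estimate a population value from a sample.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
# Simulated population of men's heights in cm (assumed normal, mean 175, sd 7).
population = [rng.gauss(175, 7) for _ in range(100_000)]
sample = rng.sample(population, 500)  # 500 men drawn without replacement

sample_mean = statistics.mean(sample)          # the estimate we can afford
population_mean = statistics.mean(population)  # normally unknown in practice

print(f"Class mean score: {class_mean}")
print(f"Sample mean height: {sample_mean:.2f} cm")
print(f"Population mean height: {population_mean:.2f} cm")
```

The class mean describes only those 25 students. The sample mean, by contrast, is used as an estimate of the population mean, which in a real study could not be computed directly.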
The sample is a set of data taken from the population to represent the population. Probability distributions, hypothesis testing, correlation testing and regression analysis all fall under the category of inferential statistics.

Population
In statistics, a population is the entire set of items from which you draw data for a statistical study. It can be a group of individuals, a set of items, etc. It makes up the data pool for a study.

What is a Sample?
A sample is a smaller and more manageable representation of a larger group: a subset of a larger population that contains the characteristics of that population. A sample is used in statistical testing when the population is too large for all members or observations to be included in the test.

Data
Data are raw, unorganized facts that need to be processed. Data can seem simple, random and useless until they are organized.

Primary Data
Primary data refers to first-hand data gathered by the researcher.
Examples of sources: surveys, observations, experiments, questionnaires, personal interviews, etc.

Secondary Data
Secondary data means data collected earlier by someone else.
Examples of sources: government publications, websites, books, journal articles, internal records, etc.

Information
When data is processed, organized, structured or presented in a given context so as to make it useful, it is called information. Good quality information is:
● Relevant. Information obtained and used should be needed for decision-making; it doesn't matter how interesting it is.
● Up-to-date. Information needs to be timely if it is to be acted on.
● Accurate.
● Meeting user needs.
● Easy to use and understand.
● Worth the cost.
● Reliable.
● Complete.

Constant*

Variable* (or data)
Not consistent, or liable to change.

Numerical (or Quantitative) Variables
Variables whose values result from counting or measuring something.
Examples: height, weight, time in the 100-yard dash, number of items sold to a shopper

Categorical (or Qualitative) Variables
Variables that are not measurement variables. Their values do not result from measuring or counting.
Examples: hair color, religion, political party, profession

Discrete Variables
A discrete variable is a variable that takes on distinct, countable values. In theory, you should always be able to count the values of a discrete variable.
Examples
● Years of schooling
● Number of goals scored in a soccer match
● Number of red M&M's in a candy jar
● Votes for a particular politician
● Number of times a coin lands on heads in ten coin tosses

Continuous Variables
A continuous variable is a variable that can take on any value within a range, so it has an infinite number of possible values within that range. Because the possible values are infinite, we measure continuous variables rather than count them, often using a measuring device such as a ruler or stopwatch. Continuous variables include all the fractional or decimal values within a range.
Examples
● The time it takes sprinters to run 100 meters
● The size of real estate lots in a city
● The weight of baby elephants
● The body temperature of patients with the flu
● The deployment altitude of skydivers

QUIZ Question