Chapter 1

Sec. 1.1 Uncertainty, Randomness and Data

What is Statistics?
Statistics is the science of gathering, describing and analyzing data in order to draw meaningful conclusions about the unknown universe from which the data were obtained.

Gathering data
    Experiment: an action that produces data.
    Population: the complete collection of elements (all subjects to be studied).
    Sample: a subcollection of elements drawn from a population.
Describing data
    Describe the center and the variation (spread) of the data numerically, graphically, or in words.
Analyzing data
    Random variable
    Probability
    Distribution
    Testing
Drawing meaningful conclusions
    What do the data imply?
    Predictions

Sec. 1.2 The Need to Model Uncertainty

Deterministic: predictable outcomes. (e.g. car…)
Uncertain or random but not chaotic. (e.g. flip a fair coin….)

Sec. 1.3 Random Variables and Their Distributions

Random Variable: a function that assigns a value to each outcome of an experiment, so its value is uncertain until the experiment is carried out.
Examples:
    i) Number of people in a waiting line at a certain time.
    ii) Number of heads shown when flipping three coins.
The distribution of a random variable: the collection of the values of the random variable together with how often they occur.

Sec. 1.4 Types of Data

1. Quantitative (numerical)
    (1). Discrete (e.g. counting numbers)
    (2). Continuous (e.g. real numbers in an interval)
2. Qualitative (attribute / categorical, e.g. SSN)

Sec. 1.5 Data Production and Random Sampling

Data: values of a random variable.

Common Sampling Methods:
1. Random Sampling
    Each member of the population has an equal chance of being selected. (e.g. computers are often used to generate random telephone numbers.)
2. Stratified Sampling
    Classify the population into at least two strata, and then draw a sample from each.
3. Systematic Sampling
    Select every kth member.
4. Cluster Sampling
    Divide the population area into sections, randomly select a few of those sections, and then choose all members in them.
5. Convenience Sampling
    Use results that are readily available, such as asking people on the street.

Example 1. Identify the type of sampling used in each case.
a) A quality control analyst inspects every 50th compact disk from the assembly line.
    Answer: Systematic
b) A sample of 12 juniors is selected by first placing all names on cards (one name per card) that are mixed in a drum.
    Answer: Random
c) A television news team samples reactions to an election by selecting adults who pass by the news studio entrance.
    Answer: Convenience
d) A political analyst first identifies three different economic groups, then selects 200 subjects from each of them.
    Answer: Stratified
e) News coverage includes exit polls of everyone from each of the 80 randomly selected election precincts.
    Answer: Cluster

Example 2. Classify each situation as deterministic or random. If it is random, then
a) Use a complete sentence to describe the random variable.
b) Give three descriptive examples of the possible outcomes.
c) Describe the range of the random variable.
d) State if the random variable is discrete, continuous or qualitative.

(1). An urn contains 3 red marbles and 5 yellow marbles. Five marbles are selected one at a time without replacement. Of interest is the number of red marbles selected.
    Answer: Random.
    a) X = the number of red marbles selected
    b) X = 0 (YYYYY); X = 1 (YRYYY); or X = 2 (RYYRY)
    c) {0, 1, 2, 3}
    d) Discrete

(2). Repeat (1), but with replacement.
    Answer: Random.
    a) X = the number of red marbles selected
    b) X = 0; X = 1 (the same as above); or X = 5 (RRRRR)
    c) {0, 1, 2, 3, 4, 5}
    d) Discrete

(3). The area of a rectangle whose dimensions are random positive real numbers.
    Answer: Random.
    a) X = the area of the rectangle
    b) X = 6 (width W = 2, length L = 3); X = 2.8 (W = 0.7, L = 4); or X = 14 (W = 2, L = 7)
    c) X > 0
    d) Continuous
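The ranges given in parts (1) and (2) of Example 2 can be checked by brute-force enumeration. The following is a minimal sketch in Python, not part of the original notes; the names URN, red_counts_without_replacement and red_counts_with_replacement are hypothetical and chosen only for illustration.

```python
from itertools import permutations, product

# Urn from Example 2: 3 red (R) and 5 yellow (Y) marbles.
URN = ["R"] * 3 + ["Y"] * 5


def red_counts_without_replacement(draws=5):
    """Possible values of X when marbles are drawn one at a time without replacement."""
    # Each ordered selection of distinct urn positions is one possible outcome.
    outcomes = permutations(range(len(URN)), draws)
    return sorted({sum(URN[i] == "R" for i in seq) for seq in outcomes})


def red_counts_with_replacement(draws=5):
    """Possible values of X when each marble is put back before the next draw."""
    # Both colors are available on every draw, so every color sequence can occur.
    outcomes = product("RY", repeat=draws)
    return sorted({seq.count("R") for seq in outcomes})


print(red_counts_without_replacement())  # range of X in part (1)
print(red_counts_with_replacement())     # range of X in part (2)
```

Without replacement the count of red marbles cannot exceed the 3 red marbles in the urn, while with replacement all five draws can be red, which is why the two ranges differ.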
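The first four sampling methods of Sec. 1.5 can also be sketched in code. This is an illustrative sketch only, using Python's standard random module; all function names and parameters are hypothetical. The random starting point in systematic_sample is an added assumption (the notes only say "select every kth member"). Convenience sampling is omitted because it involves no random mechanism.

```python
import random


def simple_random_sample(population, n, rng):
    # Every member has the same chance of being selected.
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    # Classify members into strata, then draw a random sample from each stratum.
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {label: rng.sample(members, n_per_stratum)
            for label, members in strata.items()}


def systematic_sample(population, k, rng):
    # Select every kth member; the random start is an assumption of this sketch.
    start = rng.randrange(k)
    return population[start::k]


def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters, then keep all members of the chosen ones.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for label in chosen for member in clusters[label]]


rng = random.Random(0)          # fixed seed so the run is repeatable
population = list(range(100))   # stand-in population of 100 numbered subjects
print(simple_random_sample(population, 12, rng))
print(systematic_sample(population, 50, rng))
```

In this sketch the difference between stratified and cluster sampling shows up in the code: stratified sampling draws some members from every group, while cluster sampling draws every member from some groups.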