MAT 102: Intro to Statistics

The Greatest Last-Place Finish Ever!
Mexico, 1968
John Stephen Akhwari of Tanzania

Out of the cold darkness he came. John Stephen Akhwari of Tanzania entered at the far end of the stadium, pain hobbling his every step, his leg bloody and bandaged. The winner of the marathon had been declared over an hour earlier. Only a few spectators remained. But the lone runner pressed on. As he crossed the finish line, the small crowd roared out its appreciation. Afterward, a reporter asked the runner why he had not retired from the race, since he had no chance of winning. He seemed confused by the question. Finally, he answered: "My country did not send me to Mexico City to start the race. They sent me to finish."

A Statistical Short Story!
Have you heard the story about Malcolm Forbes, who once got lost floating for miles in one of his famous balloons, and finally landed in the middle of a cornfield? He spotted a man coming toward him and asked, “Sir, can you tell me where I am?” The man said, “Certainly, you are in a basket in a field of corn.” Forbes said, “You must be a statistician.” The man replied, “That’s amazing, how did you know that?” “Easy,” said Forbes, “your information is concise, precise, and absolutely useless!”

The purpose of this course is to convince you that information resulting from a good statistical analysis is always concise, often precise, but NEVER USELESS!

Why Study Statistics?
So why is statistics required in so many majors? One reason is that numerical information is everywhere. Look in newspapers (USA Today), news magazines (Time, Newsweek, U.S. News and World Report), business magazines (BusinessWeek, Forbes), general interest magazines (People), women’s magazines (Ladies’ Home Journal or Elle), or sports magazines (Sports Illustrated, ESPN The Magazine), and you will be bombarded with numerical information! Yes, data are everywhere!
Before we answer the question “What is statistics?”, let’s take a look at some interesting statistics I found on the Internet!

Internet Statistical Examples
40% of women have hurled footwear at a man.
27% admit to cheating on an exam or quiz.
When nobody else is around, 47% drink straight from the carton.
10% of us claim to have seen a ghost.
Last semester, my students had the same reaction the first day of stat class!
The average 2008 Opening Day salary for Major League Baseball players is $3.15 million!

Average Major League Baseball Career Is 5.6 Years!
The average career of a Major League Baseball player is 5.6 years, according to a new study by a University of Colorado at Boulder research team. The study examined the career statistics of baseball players who started their careers between 1902 and 1993. Pitchers were excluded because of their unique positions, career volatility and propensity for injuries. Between 1902 and 1993, 5,989 position players started their careers and played 33,272 person years of Major League Baseball. Using voluminous baseball statistics, the authors then developed an average career length for the players.

Chapter One

Definition: Statistics
Statistics involves the procedures associated with the data collection process, the summarizing and interpretation of data, and the drawing of inferences or conclusions based upon the analysis of the data.

Branches of Statistics
Essentially, the study of statistics involves describing the main characteristics of raw data and drawing conclusions from the data based upon its analysis. From this point of view, statistics can be subdivided into two branches:
1) Descriptive Statistics
2) Inferential Statistics

Descriptive Statistics
Descriptive Statistics uses numerical and/or visual techniques to summarize or describe the data in a clear and effective manner.

Let’s look at an example of Descriptive Statistics.
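As a quick warm-up, the baseball study quoted above already contains a descriptive statistic: thousands of careers reduced to one number. The short Python sketch below assumes the 5.6-year figure is simply total person years divided by the number of players. The study's actual method may have been more involved, and the variable names are ours.

```python
# Figures quoted from the career-length study (1902 to 1993, position players only).
players = 5_989
person_years = 33_272

# Assumption: the average is total person years divided by number of players.
average_career_years = person_years / players

print(f"Average career length: {average_career_years:.1f} years")
```

Rounded to one decimal place, this simple division agrees with the 5.6 years reported in the headline.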
Descriptive Statistics Example
Descriptive Statistics describes the basic features of a data set. In the “NHL: a growth industry” example, the large amount of data pertaining to the height and weight of every NHL player was described in a sensible way. The descriptive statistics reduced lots of data into a simpler summary: one number representing the height and one number representing the weight of the NHL players.

Some Examples of Descriptive Statistics
Batting Average: a simple number used to summarize how well a batter is performing in baseball. This single number is simply the number of hits divided by the number of times at bat (reported to three decimal places). A batter who is hitting .333 is getting a hit one time in every three at bats. One batting .250 is hitting one time in four. The single number describes a large number of discrete events.
Grade Point Average (GPA): this single number describes the general performance of a student across a potentially wide range of course experiences.

Caution!
Describing a large set of observations with a single statistic runs the risk of distorting the original data or losing important detail. For example:
The batting average doesn't tell you whether the batter is hitting home runs or singles. It doesn't tell you whether she's been in a slump or on a streak.
The GPA doesn't tell you whether the student was in difficult courses or easy ones, or whether they were courses in the student's major field or in other disciplines.
Even with these limitations, descriptive statistics provide a powerful summary that may enable comparisons across people or other units.

Inferential Statistics
Inferential Statistics is the branch of statistics that involves drawing conclusions about a large group, called a population, based on the analysis of a smaller group of data, called a sample, collected from the population.

Let’s look at an example of Inferential Statistics.
Inferential Statistics Example
According to a CNN/Gallup Poll nationwide phone survey of 1,000 adults regarding leisure time:

Inferential statistics is used to infer from a portion of a group what the entire group might think. In the CNN/Gallup Poll, the pollsters are inferring that 52% of ALL adults living in the U.S. (the population) believe they do not have enough leisure time. This inference is based on the opinions of the 1,000 adults in the phone poll (the sample). This is the essence of inferential statistics. That is, the pollsters are drawing a conclusion or inference about ALL adults (the population) using the opinions of a portion of the population (the sample).

Population and Sample
Definition: Population
The Population is the entire collection of all individuals or objects of interest.
Definition: Sample
The Sample is the portion of the population that is selected for study.

Researchers, pollsters and/or decision makers use Inferential Statistics to draw conclusions or inferences about a population.
Examples:
A medical researcher wants to determine if large doses of vitamin C are effective in combating colds.
A market researcher wants to know in what quantities the American consumer is willing to purchase a new product.
Pollsters want to estimate what percentage of the American public approves of the death penalty to curb crime.

Researchers use a sample because it is usually impractical or impossible to obtain all the population observations or measurements.
Example: an electrical company wants to determine the average life of its new 40 watt light bulb.
Population: All the 40 watt bulbs the company manufactures.
Objective: To obtain the average life of the population of light bulbs.
Problem: Computing the population average would mean burning out every bulb to measure its life, which is impractical since the company would not have any bulbs left to sell!
Solution: It is necessary to select a representative sample.
Representative Sample
Selecting a representative sample is similar to the "toothpick" technique used to determine if a cake is completely baked. A toothpick is inserted in several areas of the cake, and if the toothpick comes out free of cake batter each time, it is concluded that the entire cake is done. Thus, the bulb manufacturer needs to select a sample that has similar characteristics to all the new 40 watt light bulbs and can be used to make inferences about the entire population.

Objective of Inferential Statistics
To use sample information to estimate a population characteristic. Thus it is imperative that the researcher design a procedure to select a sample which is representative of the population.
Definition: Representative Sample
A representative sample is a sample that has the pertinent characteristics of the population in the same proportion as they occur in that population.

Let’s Summarize The Differences
With descriptive statistics you are simply describing the data set. With inferential statistics, you are trying to reach conclusions that extend beyond the immediate data set. Thus, we use inferential statistics to make inferences from our data to more general conditions; we use descriptive statistics simply to describe what's going on in a data set.

Is this an example of Descriptive or Inferential Statistics?

Time for our own survey!
Are you enjoying this Stat class? Before you answer, keep this in mind!

Parameter and Statistic
In the light bulb example, a sample was selected and the sample average was used to estimate the average life for all the new 40 watt light bulbs. When a number is used to describe a characteristic of a sample, such as the sample average, this number is called a statistic. The population average is an example of a parameter since it describes a population characteristic.
Definition: Statistic
A statistic is a number that describes a characteristic of a sample.
Definition: Parameter
A parameter is a number that describes a characteristic of a population.
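To make the two definitions concrete, here is a minimal Python sketch built on the light bulb example. The population of bulb lifetimes is simulated: the 10,000 bulbs, the 1,000-hour mean, the 100-hour spread and the sample size of 50 are all made-up numbers chosen for illustration, not data from the course.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: lifetimes (in hours) of 10,000 bulbs.
# The mean of 1,000 and spread of 100 are invented for illustration.
population = [random.gauss(1000, 100) for _ in range(10_000)]

# Test a small random sample instead of burning out every bulb.
sample = random.sample(population, 50)

# Statistic: describes the sample. This is what the company can measure.
sample_mean = statistics.mean(sample)

# Parameter: describes the population. Computable here only because
# the population is simulated; in practice it is unknown.
population_mean = statistics.mean(population)

print(f"Statistic (sample mean life):     {sample_mean:.1f} hours")
print(f"Parameter (population mean life): {population_mean:.1f} hours")
```

The two numbers will usually be close but not identical, which is exactly why the sample mean is called an estimate of the population mean.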
Thus, we can say that the concept of inferential statistics is to use a sample statistic (the sample average of the bulbs) to make inferences about a population parameter (the population average of the bulbs).

Example
An opinion poll of 1,500 potential voters was taken to estimate how all 20,000 voters in an election district will vote in the upcoming election. The opinion poll results indicate that 52% will vote for Politician A in the upcoming election.
Population: The 20,000 voters.
Sample: The 1,500 potential voters polled.
Statistic: 52% is a statistic since it describes a characteristic of the sample.

Bus Tour Traveler
NFL Weigh In
Money for big kids!
Who teens turn to with pressures and problems!
Infertility Incidence
Women vote more!
A study in the Journal of the AMA states that teenage girls who supplement their diet with an extra serving of calcium daily built stronger bones!
Soyjoy Ad
Eating jelly beans causes an unhealthy spike and drop in blood sugar!
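The voter poll example can be sketched the same way in Python. The population size (20,000) and sample size (1,500) come from the example; the "true" level of support is a made-up value of 50%, since in a real election that parameter is unknown until the votes are counted. All names in the code are ours.

```python
import random

random.seed(1)  # fixed seed so the sketch gives the same numbers each run

DISTRICT_SIZE = 20_000  # population size from the example
SAMPLE_SIZE = 1_500     # sample size from the example
TRUE_SUPPORT = 0.50     # hypothetical parameter, unknown in a real election

# Build a hypothetical population: True means "will vote for Politician A".
supporters = int(DISTRICT_SIZE * TRUE_SUPPORT)
voters = [True] * supporters + [False] * (DISTRICT_SIZE - supporters)

# Poll a simple random sample of the voters.
polled = random.sample(voters, SAMPLE_SIZE)

parameter = sum(voters) / len(voters)  # describes the population
statistic = sum(polled) / len(polled)  # describes the sample

print(f"Parameter (all {DISTRICT_SIZE:,} voters): {parameter:.1%}")
print(f"Statistic ({SAMPLE_SIZE:,} polled):      {statistic:.1%}")
```

Changing the seed gives a slightly different statistic each time while the parameter stays fixed, which mirrors how a reported 52% describes the poll rather than the district itself.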