Statistics
An Intro to Our World of Data
by S. Middleton, M.A.

There are lots of reasons one takes a course in school, but…

WHY STUDY STATS?

Statistics is Historical . . .
Statistical census taking goes back to Babylonia and ancient Egypt; however, the Roman Empire gives us its name. The word "statistics" is derived from the Latin word status, which means “state.”

Example of a Census:
“In those days Caesar Augustus issued a decree that a census should be taken of the entire Roman world. This was the first census that took place while Quirinius was governor of Syria. And everyone went to his own town to register.”
Recorded in the Bible by the physician Luke, chapter 2, verses 1-3.

Statistics is Current . . .
Modern-day executives use statistics to shape decision making. Take, for example, Moneyball.
Moneyball is the story of Billy Beane, a Mt. Carmel graduate, who changed the way Major League Baseball evaluates its prospects by relying on statistical information.

Why Study Stats?
As students, we study statistics:
• To be able to read and understand the various statistical studies we will meet in our careers. Statistical procedures are basic to research in all fields.
• To become better citizens and consumers.

In this presentation and in the Stats course…

WHAT WILL I LEARN?

Goals for our study…
We will answer these questions:
• What are the branches of Statistics?
• What are data?
• Where do we get data?

Two Branches of Statistics
• Descriptive Statistics, which uses numerical and graphical methods to look for patterns in a data set and to summarize the information in a convenient form.
• Inferential Statistics, which uses sample data to make predictions, estimates, decisions, or other generalizations about a larger set of data.

DESCRIPTIVE STATISTICS
• The U.S. market for credit cards. The Nilson company collected data on credit or debit purchases recorded in the U.S. during the first six months of 1998.
US Market Share for Credit Cards
• DISCOVER 6%
• DINERS CLUB 1%
• AMERICAN EXPRESS 18%
• VISA 50%
• MASTERCARD 25%

DESCRIPTIVE STATISTICS
With descriptive statistics, the statistician tries to DESCRIBE a situation. Often the data are presented in some meaningful form, such as charts, graphs, or tables.

A second branch is…

INFERENTIAL STATISTICS
Family Home Journal Study
A group of 1,017 men aged 48 was studied for 18 years. It was found that 60% to 70% of the unmarried men were still alive at age 65, while 90% of the married men were alive at age 65. The researchers concluded that marriage contributed to the length of one’s life.

What are data?
• The KINDS of data.
• The SOURCE of data.
• TECHNIQUES for obtaining data.
• The MEASUREMENT classifications.

Data can be … (classified by Kind)
QUANTITATIVE: involves numbers
OR
QUALITATIVE: sometimes called “categorical”

DATA . . . Classified for you
• Quantitative (Measured)
    • Discrete (can be counted): 1 or 2, but not 1.2
    • Continuous (can take any value in a range): 1, 1.01, 1.07, 1.1, etc.
• Qualitative (Categorical)

Data has a source …
A POPULATION: “All of the Observations or Measurements”
A SAMPLE: “A Portion of the Population”

An Example from Hawaii
The Department of Agriculture wants to know if this crop of pineapples is undersized. They take the individual weights of a sample of 100 pineapples from an experimental field of pineapples for study.
What’s the Population? The weights of all the pineapples in the field.
What’s the Sample? The weights of the 100 pineapples.

TECHNIQUES to Produce data
• Observation
• Experiment
• Simulation
• Survey

Data can be Produced by…
Observation: the researcher merely observes what is happening or what has happened in the past.
Example: The Motorcycle Industry Council collected data on the ages and incomes of motorcycle owners in 1980 and then again in 1998. The researchers merely stated that motorcycle owners were getting older (USA Today). There was no research intervention.
Data can be Produced by…
Experimentation: the researcher manipulates one of the variables and tries to determine how the manipulation influences the other variables.
Example: Researchers at Virginia Polytechnic Institute (Psychology Today) divided female undergraduate students into two groups and had them do as many sit-ups as possible in 90 seconds. The first group was told only “to do your best,” while the second group was told to try to increase their best by 10% each day. They were measured again after 4 days to see what happened.

Data can be Produced by…
Simulation: a researcher may use probability experiments to mimic real-life situations that might be too costly, dangerous, or time-consuming to study directly.
Example: NASA space shuttle pilots are trained using a simulator rather than learning on the real shuttle.

Data can be Produced by…
Survey: one of the most common methods for obtaining information is a survey. There are many types, but 4 common methods are:
• Telephone
• Mailed Questionnaire
• Personal Interview
• Surveying Records
Example: In 1932, Literary Digest conducted a survey by mailing questionnaires to subscribers, asking questions about the upcoming election.

Which Technique is Best?
In the following slides, choose from the techniques just presented (Observation, Experimentation, Simulation, or Survey) the one that seems most appropriate for each scenario.

A study of the effect of stopping the cooling process of a nuclear reactor.
Probably SIMULATION. I don’t think you want a meltdown!

Study the effect of calcium supplements given to young girls on their bone mass.
EXPERIMENTATION
This is very similar to studies the AMA performs regularly. In fact, this one was done by Tom Lloyd, who used 94 girls, half of whom were given the calcium and half of whom were given a placebo. He found that girls using the calcium treatment gained 1.3% more bone mass. He published his findings in the “Journal of the American Medical Association.”
Study the credits earned by each student enrolled at MCHS.
SURVEYING RECORDS
The registrar keeps these records for every student; you could request a report of this data.

Data can be CLASSIFIED by how it is measured.
There are 4 levels of measurement:
• Nominal Level
• Ordinal Level
• Interval Level
• Ratio Level

Nominal Level
Nominal means “in name only.” It refers to data that cannot be ordered or ranked; no value is greater than another.
Examples include: color, names of cities, ski areas, etc.

Ordinal Level
Ordinal Level includes categorical data that can be ranked or placed in an order, but the actual differences between data values cannot be determined or are meaningless.
Examples include: rankings of NBA teams, AAA ratings of motels, ratings of digital cameras, etc.

Interval Level
Interval means that data can be ranked and that the differences between values are meaningful. There is NO true ZERO, however, so ratios are not meaningful.
Examples include: temperature readings (°F or °C), years when Democrats won elections, etc.

Ratio Level
Ratio Level allows for ranking, taking differences, and finding a “ratio” between the data values. It makes sense to say that one data value is “twice” as large as another.
Examples include: length, time lapse between checks at a bank, temperature measured in kelvins (K).

Levels of Measurement
Classify each of the following data values as Nominal, Ordinal, Interval, or Ratio.
• The Senator’s name is Sam Wilson. (Nominal)
• The Senator is 58 yrs old. (Ratio)
• He was elected to the Senate in 1980, 1986, and 1992. (Interval)
• His income is $878,314. (Ratio)
• A leading magazine claims he is ranked 7th based on his voting record on bills regarding schools. (Ordinal)

In Summary, Here’s What YOU should know.
• The KINDS of data: Quantitative & Qualitative
• The SOURCE of data: Population or Sample
• Techniques for PRODUCING data: Observation, Survey, Experiment, Simulation
• CLASSIFICATION by measurement level: Nominal, Ordinal, Interval, or Ratio

It’s YOUR turn!
Take each of the questions we answered in the survey at the beginning of this unit and classify the level of measurement of the data collected for each one.
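
If you would like to try the classification on a computer, here is a minimal Python sketch of the four levels of measurement, using the Senator example from the slides. It is only an illustration, not part of the course materials: the names Level, allowed_operations, and senator_facts are made up for this sketch, and the short operation labels ("group by category", "rank", and so on) are my own shorthand for what each slide says you can do with data at that level.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four levels of measurement, ordered from least to most informative."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def allowed_operations(level):
    """Return the operations that make sense at a given level.

    Each level keeps everything the level below it allows and adds one more.
    The labels are illustrative shorthand, not standard terminology.
    """
    ops = ["group by category"]           # nominal: "in name only"
    if level >= Level.ORDINAL:
        ops.append("rank")                # ordinal: order is meaningful
    if level >= Level.INTERVAL:
        ops.append("take differences")    # interval: differences are meaningful
    if level >= Level.RATIO:
        ops.append("form ratios")         # ratio: a true zero exists
    return ops


# The Senator example from the slides, classified by hand (hypothetical dict name).
senator_facts = {
    "Name is Sam Wilson": Level.NOMINAL,
    "Age is 58 years": Level.RATIO,
    "Elected in 1980, 1986, 1992": Level.INTERVAL,
    "Income is $878,314": Level.RATIO,
    "Ranked 7th on school bills": Level.ORDINAL,
}

for fact, level in senator_facts.items():
    print(f"{fact:<30} {level.name:<9} {', '.join(allowed_operations(level))}")
```

To use it on the survey exercise, you could replace the entries in senator_facts with your own survey questions and the level you chose for each, then check that the operations printed for each one actually make sense for that data.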