Statistics for Psychology
Patrick Murphy
Department of Statistics
Room L548, 5th Floor, Library Building
Patrick.Murphy@UCD.IE

12 Lectures
2.00 pm Tuesdays, Theatre L

Textbook
Seeing Through Statistics by Jessica Utts, Duxbury Press

CLASS WEBPAGE
1. Go to the Statistics Department Website WWW.UCD.IE/~Statdept/
2. Then click on ClassPages in the left frame
3. Finally click on Statistics for Psychology

What do you know about statistics?
It’s boring…
Frogs and Princesses

There are three kinds of lies: lies, damned lies, and statistics.
- Benjamin Disraeli

A single death is a tragedy, a million deaths is a statistic.
- Joseph Stalin (1879-1953)

The weaker the data available upon which to base one's conclusion, the greater the precision which should be quoted in order to give the data authenticity.
- Norman R. Augustine

Simpsons episode: Homer is questioned about his newly formed vigilante group
Newscaster: Since your group started up, petty crime is down 20%, but other crimes are up. Such as heavy sack beating, which is up 800%. So you’re actually increasing crime.
Homer: You can make up statistics to prove anything. 43% of people know that.

Misuse of Statistics
The Great Meryl Streep Apple Juice Cancer Scare
Asbestos is really bad for you so we need to eradicate it from our buildings
Aeroplanes: 1/1,000,000 chance of a bomb on a plane
Aeroplane Engines

What about Probability?
The foundation of Probability theory lies in problems associated with gambling and games of chance.
The Romans played a game with ASTRAGALI - heel bones of animals.

DICE
Dice as we know them were invented around 300 BC.

“I lied, cheated and stole to become a millionaire. Now anybody at all can win the lottery and become a millionaire.”

LOTTO 6/42
What are the chances of winning with one selection of 6 numbers?
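The question above can be answered by counting combinations. The sketch below is only an illustration, not part of the original slides: it assumes a plain 6-from-42 draw, ignores the bonus number, and uses a hypothetical helper name `lotto_odds`. The chance of matching exactly k numbers is the number of ways to pick k of the 6 winning numbers and 6 - k of the 36 losing numbers, divided by the number of possible selections.

```python
from math import comb


def lotto_odds(matches, pool=42, picks=6):
    """Return N such that the chance of matching exactly `matches`
    of the drawn numbers with one selection is 1 in N.

    Assumes a simple draw of `picks` numbers from `pool`, with no bonus number.
    """
    total = comb(pool, picks)  # all possible selections of 6 from 42
    # Ways to hit `matches` winning numbers and miss with the rest.
    ways = comb(picks, matches) * comb(pool - picks, picks - matches)
    return total / ways


for m in (6, 5, 4):
    print(f"Match {m}: 1 in {lotto_odds(m):,.0f}")
```

Under these assumptions the Match 6 figure is exactly the number of ways of choosing 6 numbers from 42.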
Matches   Odds
6         1 in 5,245,786
5         1 in 24,286
4         1 in 555

The average time to win each of the prizes is given by:
Match 3 with Bonus    2 Years, 6 Weeks
Match 4               2 Years, 8 Months
Match 5               116 Years, 9 Months
Match 5 with Bonus    4323 Years, 5 Months
Share in Jackpot      25,220 Years

Why do people still play the lottery?
If you’re not in you can’t win!
You never know your luck until you try!
My chances of winning a million are better than my chances of earning a million.
The lottery is a tax on the statistically challenged.

Lincoln & Kennedy
Abraham Lincoln was elected to Congress in 1846.
John F. Kennedy was elected to Congress in 1946.
Abraham Lincoln was elected President in 1860.
John F. Kennedy was elected President in 1960.
The names Lincoln and Kennedy each contain seven letters.
Both were particularly concerned with civil rights.
Both wives lost a child while living in the White House.
Both Presidents were shot on a Friday.
Both Presidents were shot in the head.
Lincoln's secretary was named Kennedy.
Kennedy's secretary was named Lincoln.
Both were assassinated by Southerners.
Both were succeeded by Southerners named Johnson.
Andrew Johnson, who succeeded Lincoln, was born in 1808.
Lyndon Johnson, who succeeded Kennedy, was born in 1908.
John Wilkes Booth, who assassinated Lincoln, was born in 1839.
Lee Harvey Oswald, who assassinated Kennedy, was born in 1939.
Both assassins were known by their three names.
Both names are composed of fifteen letters.
Lincoln was shot at the theatre named 'Ford.'
Kennedy was shot in a car called 'Lincoln.'
Booth ran from the theatre and was caught in a warehouse.
Oswald ran from a warehouse and was caught in a theatre.
Booth and Oswald were assassinated before their trials.
And here's the clincher.
A week before Lincoln was shot, he was in Monroe, Maryland.
A week before Kennedy was shot, he was in Marilyn Monroe.
Oh…and on the day he died Lincoln pardoned a man named… Patrick Murphy

Election: Which parties have most power?
Party A - 45%
Party B - 44%
Party C - 7%
Party D - 4%

We’re ready to play some games…

An Example
Experiment: Roll Two Dice
Possible Outcomes: Any number from 1 to 6 can appear on each die. There are 36 possible outcomes.
Each Outcome in the Sample Space is equally probable, so the probability of each outcome is 1/36.
What is the probability of the Event - “get combined total of 7 on the dice”?

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

Six of the 36 outcomes give a total of 7:
(1,6) (2,5) (3,4) (4,3) (5,2) (6,1)
So the probability of the Event is 6/36 = 1/6.

A more interesting example
Game Show “Who wants to win a Ferrari?”
3 doors: 1 Car & 2 Goats
You pick a door - e.g. #1
Host knows what’s behind all the doors and he opens another door, say #3, and shows you a goat.
He then asks if you want to stick with your original choice #1, or change to door #2?

Ask Marilyn.
Marilyn vos Savant: Guinness Book of Records, Highest IQ
“Yes you should switch. The first door has a 1/3 chance of winning while the second has a 2/3 chance of winning.”
Ph.D.s - Now two doors, 1 goat & 1 car, so chances of winning are 1/2 for door #1 and 1/2 for door #2.
“You are the goat” - Western State University.

Who’s right?
At the start, the sample space is: {CGG, GCG, GGC}
Pick a door, e.g. #1: 1 in 3 chance of winning.
Host shows you a goat. Whichever goat he shows, sticking with #1 wins only in the case CGG (1 in 3), while switching wins in the cases GCG and GGC (2 in 3).
So Marilyn was right, you should switch.

Chapter 1 The Beginning
Statistics is the science of data. This involves collecting, analysing and interpreting information.
Descriptive Statistics uses graphical and numerical techniques to summarise and display the information contained in a dataset.
Inferential Statistics uses sample data to make decisions or predictions about a larger population of data.

More Definitions
Population: The entire collection of individuals or objects about which information is desired.
Sample: A part (subset) of the population selected in some prescribed manner.
Variable: A characteristic or property of an individual unit in the population.
Representative Sample: A selection of data chosen from the target population which exhibits characteristics typical of the population. Representative samples should give unbiased estimates.
The most common way to select a Representative Sample is to choose a Random Sample.
A Random Sample is a sample selected so that each different possible sample of the desired size has an equal chance of being the one chosen. This implies that each member of the original population has an equal chance of being selected in any random sample.

Descriptive vs Inferential Statistics
Descriptive Statistics is only interested in describing a dataset, whereas Inferential Statistics seeks to make a decision based on the data.

An Example of Descriptive Statistics - UCD Faculties

Faculty         # Students   # Degrees   # PG Degrees
Arts                 4,438       1,153            342
Commerce             2,129         424            395
Law                    463         120             43
Science              1,868         327            106
Engineering          1,142         229             88
Medicine             1,185         218             63
Architecture           289          79             22
Agriculture            950         130              0

[Figure: bar chart by faculty, vertical scale 0 to 5,000]
[Figure: bar chart of # Degrees by faculty, vertical scale 0 to 1,500]
[Figure: bar chart of Degrees/Student by faculty, vertical scale 0 to 0.3]

By using Descriptive Statistics to display the data in this manner we can now analyse the data more easily to find trends or patterns which were not immediately obvious in the original dataset.
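As a rough illustration of this kind of summary, the Degrees/Student figures can be recomputed from the table. The sketch below is not from the original slides; it assumes that "Degrees/Student" means # Degrees divided by # Students for each faculty, and the variable names are hypothetical.

```python
# Figures copied from the UCD Faculties table.
students = {
    "Arts": 4438, "Commerce": 2129, "Law": 463, "Science": 1868,
    "Engineering": 1142, "Medicine": 1185, "Architecture": 289,
    "Agriculture": 950,
}
degrees = {
    "Arts": 1153, "Commerce": 424, "Law": 120, "Science": 327,
    "Engineering": 229, "Medicine": 218, "Architecture": 79,
    "Agriculture": 130,
}

# Assumption: Degrees/Student = # Degrees / # Students for each faculty.
ratio = {faculty: degrees[faculty] / students[faculty] for faculty in students}

# List faculties from the highest ratio to the lowest.
for faculty, r in sorted(ratio.items(), key=lambda item: item[1], reverse=True):
    print(f"{faculty:<13}{r:.3f}")
```

Sorting by the ratio shows a pattern that is hard to see in the raw counts: Arts has by far the most students, yet under this assumption the small Architecture faculty has the highest ratio of degrees to students.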
The Basics of Inferential Statistics - An Example
A Newspaper wants to know whether people are happy with the performance of the Government. They hire a company to conduct an opinion poll.
The pollsters select 1000 people and ask them the question: “Are you happy with the performance of the Government?”
The Newspaper prints a headline like the following:
“70% want the Government to go”
or
“Government achieves record popularity among voters”

How can the newspaper publish things like this? They have only got the opinions of fewer than 1000 people (remember the “don’t knows”).
1000/2.3 million ≈ 0.00043, or 0.043%
Before the end of this course we will find out in great detail whether we should believe these polls.

For the moment let's examine the procedure carried out in this example.
The newspaper is interested in a certain population. What is this Population?
The newspaper wants to measure some variable for each unit of the population. What variable do they want to measure?
The opinion pollsters decide to select a sample from the population. What is the sample? And what is so special about the sample chosen?
Is the result reliable?

How to collect data.
Before we can begin making inferences about the data we need to collect the data itself. Usually one gets data in one of 4 different ways.

Data from a published source
The data has already been collected and the results published; all we need to do is draw conclusions from the data. This is where politicians and economists get most of their data. A boring way to get data!!!

Data from a designed experiment
Here you design and conduct an experiment to measure some characteristic of a population. You have strict control over how the experiment is carried out. This is the way scientists collect their data and it is the method which should provide the most accurate results.

Data from a survey
Here you select a representative sample of people from the population you are interested in.
You ask each person some questions and record their answers. This method is used by polling companies, government statisticians etc. It has certain obvious drawbacks relating to the truthfulness of responses.

Data collected observationally
Here one observes the sample in its normal environment and records the variables of interest. Used by biologists and psychologists.