Statistical Analysis
Construction Engineering 221
Decision Making Under Uncertainty

Statistics

• “Statistics” has several meanings in the general public
  – The data on performance (the game stats)
  – A particular calculation made from the data (mean, median, deviation, etc.)
  – A field of study whose objective is to improve decision making under conditions of imperfect information

• Statistics is poorly understood by the general public, misused by government agencies and academic researchers, and abused by special interest groups:
  – A researcher counts every observation twice to increase the sample size and reach statistically significant results
  – The often-quoted “10% of the population is gay” is based on a study in which 10% of a prison population was found to have engaged in homosexual behaviors
  – A radical feminist researcher at a university published a report stating that the majority of married women were unhappy and married men were happy, thereby “proving” that marriage was an institution of oppression
  – A report commissioned under the Reagan Administration found that men who view pornography and adult magazines are more likely to commit rape, so Playboy was banned from military bases and federal prisons

• What’s wrong with these studies?
  – Statistical significance is not the same as practical significance. With a large enough sample, almost any difference will test as significant, even when it is too small to matter. Also, counting every observation twice adds no information; it only inflates the apparent sample size and so overstates significance.
  – Results from a prison population cannot be generalized to the general public because prisoners are not a representative sample
  – The feminist researcher measured unhappiness by how often a married person cried. Crying under stress is much more common among women than men. When anger or withdrawal (male reactions to stress) is included, unhappiness with marriage is equal.
    Measurement must be free from bias.
  – In the Meese study, men were randomly sampled, but age was not controlled for. Most rapes are committed by young people, and most adult magazines are read by young people. It is being young that creates the risk. The relationships were poorly understood.

• Good statistical studies:
  – Are done right the first time (don’t rework the data until you get the result you want)
  – Have representative samples of the population under study
  – Use unbiased measurement of the characteristics under investigation
  – Understand the relationships (use theory to set up the statistical analysis)

• Statistics was first developed for agricultural applications (animal breeding, plant cross-fertilization, etc.)
• Even today, most top statistics departments are in state ag schools (ISU, Texas A&M, Colorado State)
• Current important uses are in medical research, risk management and decision theory, quality control, market research, advertising, and insurance

• For our class, statistics will be used as a method for improved decision making in situations of incomplete information; in other words, as a tool for risk management

• Two types or uses of statistics
  – Descriptive: describe the data or observations found in the sample
    • Pictorial descriptions: histogram, bar or pie chart, line graph, frequency distribution, classification
    • Numerical descriptions: mean, median, mode, range, variance, standard deviation

• What can descriptive statistics tell us? (How something is different or the same)

• In a normal distribution, successive ½ standard deviation bands moving away from the mean capture about 19%, 15%, 9%, and 5% of the observations on each side.
  (rule of 9’s and 5’s)
  Cumulative percentage of observations, from 2 standard deviations below the mean to 2 above, in ½ standard deviation steps:
  2%  7%  16%  31%  50%  69%  84%  93%  98%

  – Inferential: decisions made about a population based on sample (incomplete) data
    • Random samples representative of the population
    • Hypothesis tests and the null hypothesis
    • Confidence intervals
    • Shape of distribution

• What could we infer from the Exam 3 scores in this class?
  – What population does the Con E 221 class “represent”? (sampling)
  – What relationships could we build? (time spent studying for the test, class attendance, pages read in the text, homework completed on time)
  – How would we measure our variables?
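
The half-standard-deviation percentages quoted for the normal distribution (the rule of 9’s and 5’s) can be checked numerically. The sketch below is not part of the original slides; the names `normal_cdf`, `cut_points`, `cumulative_pct`, `band_pct`, and `exact_band_pct` are illustrative choices. It uses only the Python standard library and assumes a standard normal distribution (mean 0, standard deviation 1).

```python
import math


def normal_cdf(z: float) -> float:
    """Share of a standard normal distribution that lies below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Cut points every half standard deviation, from -2 to +2.
cut_points = [k / 2 for k in range(-4, 5)]

# Cumulative share of observations below each cut point, in whole percent.
cumulative_pct = [round(100 * normal_cdf(z)) for z in cut_points]

# Bands above the mean, taken as differences of the rounded cumulative
# values (index 4 is the mean itself).
upper = cumulative_pct[4:]
band_pct = [hi - lo for lo, hi in zip(upper, upper[1:])]

# The same bands computed without intermediate rounding, to one decimal.
upper_cuts = cut_points[4:]
exact_band_pct = [
    round(100 * (normal_cdf(hi) - normal_cdf(lo)), 1)
    for lo, hi in zip(upper_cuts, upper_cuts[1:])
]

print("cumulative %:", cumulative_pct)
print("bands from rounded cumulatives %:", band_pct)
print("bands without rounding %:", exact_band_pct)
```

The rounded cumulative values reproduce the 2% through 98% sequence, and their differences give 19, 15, 9, 5. Without the intermediate rounding, the outermost band (1.5 to 2 standard deviations) comes out closer to 4.4% than 5%, so the rule is best read as a memory aid rather than an exact result.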