Statistics 301 Project #2 Description
Fall 2015

The projects allow you to use your skills in a practical problem. Statistics provides tools to make justifiable conclusions from data with variation. The project includes description of data, inference about answers to questions, and evaluating assumptions.

You will work in a team of 2 to 4 people, although please talk with me if you prefer to work as an individual. Your team will choose one of the two topics described below. I will provide you your data set. Your team will write a technical report of not more than 8 typed (11 point font or larger) double-spaced pages. Reports will not be returned, so make a copy if you want one for yourself.

This project allows you to apply all the modelling skills you have learnt over the semester. Short summaries of the two topics are given below. Additional information about your chosen topic and the data set will be sent to you next week. You will have time in lab to work on the project.

Your report should include:
- An executive summary (one or two paragraphs) that briefly but completely summarizes your approach and the results of your analysis. Although this will appear at the beginning of your report, it is likely to be the last thing you write.
- A summary of relevant aspects of the study design.
- A description of the relationship between your explanatory and response variables. Include any graphs, summary statistics, and analysis that support your description. You may need to transform one or more variables if the initial analysis has problems with assumptions.
- An evaluation of assumptions for the "final" analysis.
- Answers to topic-specific questions.
- Predictions and any related information for the specified X values.

Write your report with a minimum of statistical jargon. Jargon is especially inappropriate in the executive summary.
If you use statistical phrases, e.g., saying something is statistically significant, you need to explain what you mean by that.

Turning in a wad of JMP output is not an appropriate statistical analysis. The body of your report should include only those results that support your description or inference. You should cut/paste important figures into your report. Most numeric results will be better presented in a form other than the JMP output. Remember that JMP usually gives you more than you need, so do not include graphs or output that is not needed. Including unnecessary JMP output will result in deductions from the final project score.

The project will be graded with a total of 30 points. If everyone on the team contributes approximately equally, each team member will receive the same score. When the report is submitted, any team member can include a separate sealed envelope giving her/his assessment of the relative effort contributed by each team member. This should be expressed as a percentage of the total team effort for each team member. If effort was wildly unequal, I will discuss the issue with team members and, if appropriate, adjust individual scores. If no envelopes are turned in, I will assume equal contributions.

The two topics available are:

Topic 1: Understanding reproduction in wild horses (part 2).
The wild horse is a Western icon but large numbers of wild horses create ecological problems. The response variable is fecundity, the number of foals per adult in the group. This is measured on 50 groups of wild horses. Data were collected over multiple years in two locations. As it turns out, the treatment was expected to have no effect in the first year of the study (details provided if you choose this project). There are various nuisance variables that might need to be adjusted for: location, number of adults in the group, observation month, and year post-treatment.
It is also not clear how best to denote the treatment: as 1/0 (the group has a sterilized male / does not) or as the proportion of sterilized males in the group (0 or some number > 0).

The folks managing these herds want to know:
1. What is the effect of the treatment on foal production (fecundity)?
2. Is the response to the treatment different in the two locations?
3. What is the predicted fecundity for three specified conditions?

Your group will need to decide (among other things): how to deal with the nuisance variables, how to use data from the first year of the study, and how best to use the treatment information.

Topic 2: Investigation of possible sex discrimination in a bank.
These data are derived from those provided in support of a 1980s-era lawsuit against a Texas bank. The plaintiffs claim that female employees were hired at a lower starting salary and given smaller pay raises. To support these claims, they obtained data on 100 employees hired between 1965 and 1975. The data set contains information on one job category (skilled, entry-level clerical). It includes the starting salary, the salary in 1977, the average annual pay raise, the number of months each employee had been employed when the data were collected, the year of hire, and some descriptive information about each individual: their age, sex, number of years of education, and number of months of work experience prior to being hired at the bank.

The plaintiffs' lawyers have asked you to:
1. Estimate the average difference between (or ratio of) female and male starting salaries.
2. Estimate the average difference between (or ratio of) female and male starting salaries, after adjusting for appropriate individual characteristics.
3. Estimate the average difference between (or ratio of) female and male pay raises, after adjusting if appropriate.
4. Predict the pay raise for two specified individuals.
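
The questions above ask for a difference between (or ratio of) two group averages. The two are natural on different scales: a difference comes from comparing averages in the original units, while a ratio comes from comparing averages of log-transformed values and back-transforming. The sketch below illustrates that idea only. It is written in Python rather than JMP, the function names are hypothetical, and the salary values are made up for illustration and are not the project data.

```python
import math
from statistics import mean

# Made-up salaries for illustration only; NOT the project data.
female = [4800.0, 5100.0, 5400.0]
male = [5400.0, 6000.0, 6600.0]


def mean_difference(a, b):
    """Average of a minus average of b, in the original units."""
    return mean(a) - mean(b)


def geometric_mean_ratio(a, b):
    """Ratio of geometric means of a and b.

    Computed as exp(difference in mean log values), which is how a
    difference estimated on the log scale is turned back into a ratio.
    """
    log_diff = mean(math.log(x) for x in a) - mean(math.log(x) for x in b)
    return math.exp(log_diff)


diff = mean_difference(female, male)
ratio = geometric_mean_ratio(female, male)
print(f"difference in means: {diff:.0f}")
print(f"ratio of geometric means: {ratio:.3f}")
```

A ratio below 1 here means the first group's typical value is lower than the second's. Which scale is more appropriate for your data depends on the assumption checks, so treat this as one way to think about the choice, not as the required analysis.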
Your group will need to decide (among other things): how to appropriately adjust for individual characteristics.

Due dates:
- End of lecture, 20 Nov: Each team submits a list of team members and its choice of topic. If no team member will be in class that day, please e-mail me that information.
- End of lecture, Friday, 11 Dec: Turn in your report.