PRE-ASSIGNMENT: GENDER DISCRIMINATION IN FACULTY SALARIES AT A JUNIOR COLLEGE
Fall14

A number of years ago, women junior college faculty members in Massachusetts sued state junior colleges for gender discrimination in salaries. After a court case lasting many years, the state junior colleges were found guilty of gender discrimination. The attached data set “juniorcol short” is a subset of the type of data used to analyze the validity of claims of gender discrimination. This data set has 5 variables for 81 junior college faculty at one institution: rank (instructor, assistant professor, associate professor, full professor), gender, age, number of years at the college, and salary.

The assignment is to use Pivot Tables (and some Excel functions) to understand the extent to which the data suggest gender discrimination in salaries. There are more sophisticated analyses, including hypothesis tests and regression models, that could be used to analyze these data. Later in the QM course we will use these approaches to re-examine this data set. However, it is useful to develop the type of understanding of the data that is possible through Pivot Tables even when more sophisticated analyses will be done later. Also, at the end of the Pivot Table assignment, we walk you through a particularly useful type of graphical analysis for this situation. In addition to Pivot Tables, it would be useful to review absolute and relative cell addressing, If and If(And …) statements, and the sumproduct function.

Use Pivot Tables to do the following. (Each of the analyses that follow should be on a separate worksheet. You can do this by going back to the database and inserting a new Pivot Table for each numbered analysis. Make sure to indicate that you want the new Pivot Table on a new worksheet; you do not want the Pivot Table on the same worksheet as the data.)
In the material that follows, sometimes there are specific instructions to guide your analysis; other times, there is more of a burden on you to figure out how to do things. This was done on purpose.

1. Examine average salary by gender. Include both average salary and a count of the number of men (M) and women (W) in the data set in the cells of a single Pivot Table. You can get “count” by dragging any variable to the Values box in the Pivot Table Field List, clicking on the down arrow, going to Value Field Settings and clicking on “count”. To format the $ values in this table, click on a cell, then right click, then click Number Format and eliminate the values to the right of the decimal. (NOTE: one problem with Excel is that it often gives you a number with too many decimals, which keeps you from seeing clearly what is going on. It is very useful to get into the practice of reformatting numbers in a way that facilitates “telling the story” you want to tell.) How much lower are W faculty salaries than M faculty salaries?

There are a number of factors other than gender discrimination that might explain why W and M salaries differ. For example, faculty with higher ranks may be paid more, and there may be more M with higher ranks. In the analyses that follow, you will examine these factors.

2. Examine average salary by rank and by gender. First, create a Pivot Table with rank in the rows, gender in the columns and average salary in the cells (remember to reformat $s). Copy this Pivot Table, click on cell I3 and click on paste. Click on a cell in the original Pivot Table and the Pivot Table Field List will come up. Change average of salary to count of salary and you will get a new Pivot Table showing the number of M and W at each rank. When you compare M and W salaries for those at the same rank, you are “controlling for” rank. In cell M5, calculate M average salary minus W average salary and drag this formula down thru cell M8.
Notice that for associate and full professors, the difference in salaries looks very large. This is because there are no women in these ranks. When you do the subtraction and one of the cells is blank, Excel treats the blank cell as having a zero value.

We are interested in examining, for the ranks that have both M and W, how their salaries compare. From the table, you can see how they compare for each rank, but we want to summarize these differences across ranks. If there were the same number of instructors and assistant professors, we could just average $4105 and $4993. However, since the numbers differ, we want to take a weighted average, using the proportion of faculty at each rank as the weights (for those ranks that have both M and W faculty).

The following describes a way to calculate the weighted average. In cell E5, use the If(And …) function to put in the cell the total number of faculty at that rank if there are both M and W at that rank, and a 0 otherwise. In the formula, highlight cell B5 and, using the F4 key, make the column an absolute cell address; do the same thing for cell C5. Drag the formula down to row 8. In row 9, calculate the sum of the 4 numbers. In cell A12, type “salary difference controlling for rank”. In cell D12, calculate a weighted average of the differences in column M by using the sumproduct function and then dividing the result by the total number of faculty used in this analysis (cell E9). (Remember to fix the $ formats, this time using the Format Cells option.)

You would like to know whether the salary difference controlling for rank is smaller than the original salary difference. This would suggest that differences in rank explain part of the difference in average salaries between M and W. The problem is that you need to calculate the average salary difference for the 69 faculty used in this analysis (those at ranks that have both M and W), not for the 81 total faculty. Drag cell E5 across to G5.
(The reason you used absolute cell addressing for the columns in this formula is so that the dragging would work.) In the formula in cell F5, change E5 to C5 and in the formula in cell G5, change F5 to B5. Then, drag the formulas in F5 and G5 down thru row 8 and drag the sum formula in cell E9 across columns F and G. The numbers in column F should be the number of M faculty for those ranks with both M and W faculty; the numbers in column G should be the same thing for W faculty.

In cell A13, type “average salary of M”. In cell D13, calculate the weighted average of M salaries for those ranks that have both M and W (similar to how you calculated cell D12). In row 14, repeat this for W salaries. In cell A15, type “average M salary minus average W salary”. In cell D15, calculate this difference. In cell A16, type “% of salary difference explained by rank” and in cell D16 calculate this as (D15-D12)/D15. Format this cell as a percentage (Excel's percentage format multiplies the value by 100 for display).

Why does rank explain so little of the difference between M and W faculty salaries? If assistant professor salaries were much larger than instructor salaries, do you think rank would explain a lot more of the difference between M and W salaries?

3. Examine average salary by years at the institution and by gender. This analysis is similar to that in step 2, with one exception. When the Pivot Table is created, there are a lot of individual values for “years” with very few cases. You would like to group “years” into categories. To do this, right click on one of the numbers in the years column, click on Group and group years into 5-year categories. Then, repeat the analysis in step 2. What % of the salary difference between M and W is explained by differences in years at the institution? Why is this % so much higher than that for rank?

4. Examine average salary by age and by gender. This analysis is the same as that in step 3, but group ages into 10-year age categories starting at age 26.
You can just copy all of the formulas from the step 3 worksheet and paste them into this worksheet. What % of the salary difference between M and W is explained by differences in their age?

5. Examine average salary by years and age and gender, i.e., controlling for both age and years. Drag both years and age to the Row box and gender to the Column box. Go thru the same steps as in the analyses above. However, to facilitate use of the “dragging formulas”, you should get rid of the subtotals in the table. Click on a cell inside the table, then click on the Design tab at the top of the screen, then Subtotals (at the far left of the ribbon), then “Do not show subtotals”. What % of the salary difference between M and W is explained by differences in their age and years? Why is this percent much lower than the sum of the percents you found in step 3 and step 4?

SUMMARY OF RESULTS: Name each worksheet to indicate the analysis on that worksheet (e.g., rank, age, etc.). Create a worksheet called Summary and put on that worksheet your answers to the questions asked above. Having done this analysis, how strong is the gender discrimination case?

A Useful Graphical Analysis: Scatter Plots

Scatter plots are a very useful way to examine the gender discrimination data. However, it can require some care to get what you want when doing them in Excel. The question we will examine is the relationship between years at the institution and salary for men and women. Go back to the gender discrimination data sheet, copy columns B-E and paste them starting in column H. Then delete the age column. (This gives a slightly simpler data set to work with.) Sort the data by gender so that females are listed first and then males.

To create a scatter plot, do the following:

1) Highlight a cell in the data set. Then do Insert > Scatter > select the chart in the upper left. Excel’s default scatter plot is a mess, usually not at all what you want.
It is a plot of each variable in the order in which the data are listed. What you want is a plot that has years on the X (horizontal) axis and salary on the Y (vertical) axis. NOTE: It was not a random decision to put years on the X axis. We have an underlying hypothesis that salary is caused in part by how long a person has been at the institution, that is, years is a CAUSE and salary is an EFFECT. When you have this type of implied relationship, it is standard practice to put the CAUSE on the X axis and the EFFECT on the Y axis.

2) Click on the Design tab > Select Data. You will get the following screen (first screen next page). You can use this dialogue box to redesign the scatter plot. First, remove years. Then edit Salary. You will get a new dialogue box (second box, next page). Delete what is in Series name and type in females. Delete what is in Series X values and highlight I2:I38. Delete what is in Series Y values and highlight J2:J38. Now you have a scatter plot of just female salaries.

3) Now click on Add in the Select Data Source screen and you will get a new Edit Series dialogue box. Type in males for the series name and highlight the relevant data ranges for the male X values and the male Y values. Now you have a scatter plot that allows you to see separately the relationship between years and salary for men and women.

In a few sentences, how would you summarize the relationships that you see? Type your answer to the right of the graph.
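
The weighted-average arithmetic used in steps 2 through 5 can also be checked outside Excel. The following is a minimal Python sketch and is not part of the assignment. The function name, the record layout and the toy salaries are all made up for illustration; none of them come from the juniorcol short data. The sketch keeps only the groups that contain both M and W, weights each within-group salary difference by the number of faculty in that group, and reports the percent of the raw difference that the grouping variable explains.

```python
from collections import defaultdict
from statistics import mean


def percent_explained(records):
    """records: list of (group, gender, salary) with gender "M" or "W".

    Returns (raw_diff, controlled_diff, pct_explained), computed only over
    groups that contain both M and W. These play the roles of the worksheet
    cells D15, D12 and D16 (hypothetical mapping, for illustration only).
    """
    by_group = defaultdict(lambda: {"M": [], "W": []})
    for group, gender, salary in records:
        by_group[group][gender].append(salary)

    # Keep only groups with at least one M and one W (the If(And ...) step).
    kept = [g for g in by_group.values() if g["M"] and g["W"]]
    if not kept:
        raise ValueError("no group contains both M and W")

    n_total = sum(len(g["M"]) + len(g["W"]) for g in kept)

    # Weighted average of within-group differences (the sumproduct step).
    controlled = sum(
        (mean(g["M"]) - mean(g["W"])) * (len(g["M"]) + len(g["W"]))
        for g in kept
    ) / n_total

    # Raw M minus W difference over the same faculty, ignoring the grouping.
    raw = mean(s for g in kept for s in g["M"]) - mean(
        s for g in kept for s in g["W"]
    )
    if raw == 0:
        raise ValueError("raw difference is zero; percent is undefined")

    return raw, controlled, 100 * (raw - controlled) / raw


# Toy data, invented for this sketch (not the real data set).
toy = [
    ("instructor", "M", 30000),
    ("instructor", "W", 28000),
    ("instructor", "W", 28000),
    ("assistant", "M", 40000),
    ("assistant", "M", 40000),
    ("assistant", "W", 36000),
    ("full", "M", 60000),  # dropped: no W at this rank
]

raw, controlled, pct = percent_explained(toy)
print(round(raw), round(controlled), round(pct))
```

In the toy data, women are concentrated in the lower-paid group, so part of the raw gap disappears once the comparison is made within groups. The same function works for years or age categories if the first element of each record is the category label.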