UC Berkeley Admissions Queries to extract knowledge from the data set: • Is there a gender bias in the graduate admissions process? Output: First looking at total percentages of admitted males vs. females, it looks unfair (44.5% for males vs. 30.4% for female). Second consider these numbers by departments. Departments A and F have much lower acceptance rates for males, B and D slightly lower for males, and C and E slightly higher for males. Third, look at the overall acceptance rates per department. A and B ave higher acceptance rates, F has a very low acceptance rate and C, D, and E have moderate acceptance rate. Finally, look at applicants by department. Notice that the departments A and B with high acceptance rates have predominantly male applicants. C and E (departments with middle value acceptance rates) have an applicant pool with roughly two female applicants for every male applicant. The genders are roughly balanced in the applicant pools of Departments D and F. In summary, “the apparent association between admission and sex stems from differences in the tendency of males and females to apply to the individual departments (females used to apply more to departments with higher rejection rates).” [data set help manual in R] This plot shows that a smaller % of females were admitted.Of the male applicants 44.5% were admitted, whereas for the female applicants, only 30.4% were admitted. As Page 74 of the R book notes, “barplots can be used effectively to show the data in a two-way UC Berkeley graduate admissions 0 500 1000 2000 Rejected Admitted Male Female table. One variable is chosen to be the categories of the barplot. Then, the bars for each level of the category are segmented to indicate the proportions of the other variable.” Also side-by-side plots are possible. In the above example, gender category is the chosen variable and proportions of the admit category (admitted/rejected) are shown on the y-axis. Thus females are admitted at a higher rate in Dept A, B, D, F and at a lower rate in Dept C and E. The preference for males is quite Rejected Admitted Rejected Admitted Male 200 Male 100 200 300 Rejected Admitted Dept F Rejected Admitted 0 100 200 300 Dept E 0 100 0 0 Male Dept D Rejected Admitted 400 400 Dept C 0 0 Male 300 Dept B 200 Rejected Admitted 200 600 Dept A Male Male slight in C and E whereas the preference for females is quite large in A and F. In B and D, females are admitted at a slightly higher rate than males. For each possible value of components 2 (Gender) and 3 (Dept), form proportions over the remaining variables (Admit). The proportions are plotted in this next figure. 0.8 Dept F Rejected Admitted 0.0 0.4 0.8 Rejected Admitted 0.4 Male Male Dept E 0.0 0.0 0.4 Rejected Admitted 0.8 Male Dept D Rejected Admitted 0.0 0.0 0.0 Male 0.8 Rejected Admitted Dept C 0.4 0.8 Dept B 0.4 Rejected Admitted 0.4 0.8 Dept A Male Male Below is a mosaic plot of the admit and gender variables summed across all departments. Student admissions at UC Berkeley Rejected Male Female Gender Admitted Admit Per department mosaic plots: Student admissions at UC Berkeley Dept A Dept B Rejected Admitted Dept C Rejected Admitted Rejected Gender Female Female Admit Admit Admit Dept D Dept E Dept F Rejected Admitted Gender Gender Female Male Female Admit Admit Rejected Male Admitted Female Rejected Male Admitted Gender Male Gender Male Female Gender Male Admitted Admit Transposed per-department mosaic plots: Student admissions at UC Berkeley Dept B Admitted Admit Rejected Female Gender Dept D Dept E Dept F Gender Female Admit Male Gender Male Female Rejected Admitted Female Admit Male Admitted Gender Rejected Admitted Male Gender Rejected Admit Dept C Female Admitted Male Admit Admitted Female Rejected Admit Male Rejected Dept A Gender