UC Berkeley Admissions

advertisement
UC Berkeley Admissions
Queries to extract knowledge from the data set:
•
Is there a gender bias in the graduate admissions process?
Output:
First looking at total percentages of admitted males vs. females, it looks unfair (44.5% for males vs. 30.4% for female). Second consider these numbers by departments. Departments A and F have much lower acceptance rates for males, B and D slightly lower for
males, and C and E slightly higher for males. Third, look at the overall acceptance rates per department. A and B ave higher acceptance
rates, F has a very low acceptance rate and C, D, and E have moderate acceptance rate. Finally, look at applicants by department.
Notice that the departments A and B with high acceptance rates have predominantly male applicants. C and E (departments with middle
value acceptance rates) have an applicant pool with roughly two female applicants for every male applicant. The genders are roughly
balanced in the applicant pools of Departments D and F. In summary, “the apparent association between admission and sex stems from
differences in the tendency of males and females to apply to the individual departments (females used to apply more to departments
with higher rejection rates).” [data set help manual in R]
This plot shows that a smaller % of females were admitted.Of the male applicants 44.5% were admitted, whereas for the female applicants, only 30.4% were admitted. As Page 74 of the R book notes, “barplots can be used effectively to show the data in a two-way
UC Berkeley graduate admissions
0
500
1000
2000
Rejected
Admitted
Male
Female
table. One variable is chosen to be the categories of the barplot. Then, the bars for each level of the category are segmented to indicate
the proportions of the other variable.” Also side-by-side plots are possible. In the above example, gender category is the chosen variable
and proportions of the admit category (admitted/rejected) are shown on the y-axis.
Thus females are admitted at a higher rate in Dept A, B, D, F and at a lower rate in Dept C and E. The preference for males is quite
Rejected
Admitted
Rejected
Admitted
Male
200
Male
100 200 300
Rejected
Admitted
Dept F
Rejected
Admitted
0
100 200 300
Dept E
0
100
0
0
Male
Dept D
Rejected
Admitted
400
400
Dept C
0
0
Male
300
Dept B
200
Rejected
Admitted
200
600
Dept A
Male
Male
slight in C and E whereas the preference for females is quite large in A and F. In B and D, females are admitted at a slightly higher rate
than males. For each possible value of components 2 (Gender) and 3 (Dept), form proportions over the remaining variables (Admit).
The proportions are plotted in this next figure.
0.8
Dept F
Rejected
Admitted
0.0
0.4
0.8
Rejected
Admitted
0.4
Male
Male
Dept E
0.0
0.0
0.4
Rejected
Admitted
0.8
Male
Dept D
Rejected
Admitted
0.0
0.0
0.0
Male
0.8
Rejected
Admitted
Dept C
0.4
0.8
Dept B
0.4
Rejected
Admitted
0.4
0.8
Dept A
Male
Male
Below is a mosaic plot of the admit and gender variables summed across all departments.
Student admissions at UC Berkeley
Rejected
Male
Female
Gender
Admitted
Admit
Per department mosaic plots:
Student admissions at UC Berkeley
Dept A
Dept B
Rejected
Admitted
Dept C
Rejected
Admitted
Rejected
Gender
Female
Female
Admit
Admit
Admit
Dept D
Dept E
Dept F
Rejected
Admitted
Gender
Gender
Female
Male
Female
Admit
Admit
Rejected
Male
Admitted
Female
Rejected
Male
Admitted
Gender
Male
Gender
Male
Female
Gender
Male
Admitted
Admit
Transposed per-department mosaic plots:
Student admissions at UC Berkeley
Dept B
Admitted
Admit
Rejected
Female
Gender
Dept D
Dept E
Dept F
Gender
Female
Admit
Male
Gender
Male
Female
Rejected
Admitted
Female
Admit
Male
Admitted
Gender
Rejected
Admitted
Male
Gender
Rejected
Admit
Dept C
Female
Admitted
Male
Admit
Admitted
Female
Rejected
Admit
Male
Rejected
Dept A
Gender
Download