Jerry Alan Fails March 01, 2005 Application CS 838S – Ben Shneiderman Page 1 of 11

US News College Information

Every year, US News publishes a composite ranking of all undergraduate colleges and universities. These rankings combine, and can even camouflage, numerous features behind an opaque composite formula. By accessing the individual data attributes, however, students and faculty may be able to use the data to better guide decisions about where to study or teach. Two data sets were joined to allow exploring these two perspectives: the US News College Data from 1995 and the Faculty/Staff Salary Data from the same year. (The data sets were found at: http://www.amstat.org/publications/jse/jse_data_archive.html.) To give a geographical perspective of the data, the final data set also included regional data for each college as specified by the US Census Bureau (see http://www.census.gov/geo/www/us_regdiv.pdf). The tools used to explore the data were Treemap and Spotfire.

I looked at the data from two main perspectives: that of a future college student and that of a future college professor. Questions common to prospective students and teachers include: where do people go to college, where are the largest colleges, and what do they cost? Questions a prospective student might ask are: which schools can I afford, which school is ranked highest, and where will I get the most for my money? A prospective professor might ask: where are the highest salaries, where is the most funding, and where are the “smartest” students?

Common to both perspectives is an understanding of where institutions are, how many full-time students are enrolled, and how much tuition costs. Treemap is well suited to viewing this hierarchical data (see Figure 1).

Figure 1: General layout of enrollment in colleges and universities. Size indicates number of students enrolled.
Color represents the price of out-of-state tuition (green, high; black, low).

Note that with the treemap representation one can identify the number of full-time students in both the regions and the states. For example, this representation illustrates that the regions with the most students, in descending order, are: East North Central (OH, IL, MI, IN, WI), South Atlantic (NC, VA, FL, GA, MD, SC, WV, DC, DE), Middle Atlantic (NY, PA, NJ), West South Central (TX, LA, OK, AR), Pacific (CA, WA, OR, HI, AK), West North Central (MO, MN, IA, KS, NE, SD, ND), East South Central (TN, AL, KY, MS), Mountain (CO, UT, AZ, ID, NM, MT, NV, WY), and New England (MA, CT, NH, RI, ME, VT). The top four states are obvious as well: CA, NY, TX, and PA.

Analyzing the same data with Spotfire shows the distribution in the charts below. These charts add the ability to visualize the ratio between public and private institutions.

[Figure 2 charts: "Number of Full-time Students by region" and "Number of Institutions by region" (x-axis: Region Name), plus a chart by state (x-axis: State Full Name).]

Figure 2: Spotfire representations of the distributions of number of full-time student enrollment (upper left, in decreasing order), number of institutions (upper right, same order as upper left), and by state (lower middle, in decreasing order by state). Color represents public (blue) and private (red) institutions.
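
The join and the per-region aggregation behind charts like those in Figure 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Excel/Access/Spotfire workflow actually used: the record layout, the field names (`control`, `full_time`), the institution names, and all numbers are hypothetical placeholders, and the state-to-region table is a two-state subset of the Census Bureau divisions.

```python
from collections import defaultdict

# Hypothetical toy records: names, field names, and values are illustrative
# only and are NOT taken from the US News or salary data sets.
colleges = [
    {"name": "College A", "state": "NY", "control": "private", "full_time": 4000},
    {"name": "College B", "state": "NY", "control": "public", "full_time": 9000},
    {"name": "College C", "state": "CA", "control": "public", "full_time": 20000},
]
salaries = {"College A": 520, "College B": 480}  # hypothetical average salaries

# Subset of the Census Bureau state-to-region mapping.
REGION_OF_STATE = {"NY": "Middle Atlantic", "CA": "Pacific"}


def join_records(colleges, salaries, region_of_state):
    """Attach a region and, when a match exists, an average salary to each college."""
    joined = []
    for row in colleges:
        merged = dict(row)
        merged["region"] = region_of_state.get(row["state"], "Unknown")
        merged["avg_salary"] = salaries.get(row["name"])  # None when unmatched
        joined.append(merged)
    return joined


def enrollment_by_region(joined):
    """Sum full-time enrollment per region, split into public and private."""
    totals = defaultdict(lambda: {"public": 0, "private": 0})
    for row in joined:
        totals[row["region"]][row["control"]] += row["full_time"]
    return {region: dict(split) for region, split in totals.items()}


def ranked_regions(totals):
    """Order regions by total enrollment, largest first."""
    return sorted(totals, key=lambda r: sum(totals[r].values()), reverse=True)


joined = join_records(colleges, salaries, REGION_OF_STATE)
totals = enrollment_by_region(joined)
print(ranked_regions(totals))
```

Keeping the public/private split inside each regional total mirrors the color coding of the charts: the ordering uses the combined total, while the split supplies the ratio.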
Analysis of these Spotfire charts reveals that NY not only enrolls many undergraduates but also has many private universities, many more than CA, which enrolls many students but has mostly public schools. Like NY, PA also has a high ratio of private to public institutions, although not quite as high. TX is more like CA, with fewer private schools while enrolling many students. At first one might attribute this to a difference between western and eastern states arising from variations in population, population density, and geography (both CA and TX are large). However, Figure 2, upper left, shows that while most regions have a large percentage of public schools, the Middle Atlantic and New England regions have a large percentage of private schools. Figure 2, upper right, reveals that the West North Central and New England regions have smaller institutions: the regions are ordered by number of full-time enrolled undergraduates, yet these two regions have more institutions than their place in that ordering would suggest.

Another overview of the data yields further insights. Figure 3 simply plots in-state tuition against number of full-time undergraduate students. As shown, these features sharply separate the public and private schools. Also highlighted is a cluster of institutions with enrollment of 2,500 to 12,500 and tuition of $13,000 to $20,000. Looking at the names of the colleges reveals many famous, prestigious universities. The graph also shows a private school amid the public schools, namely Brigham Young University, a large private school with very low tuition.

[Figure 3 plot: "Number of Full-time Students vs. In-state Tuition"; x-axis: Number of full-time undergraduates. Labeled institutions: Yale University, Dartmouth College, Stanford University, Cornell Univ.-Statutory Clls, Harvard University, Brown University, Washington University, Univ. of Southern California, Emory University, University of Puget Sound, New York University, Northwestern University, Villanova University, Syracuse University, Northeastern University, Brigham Young University.]

Figure 3: Number of full-time students vs. in-state tuition. Color represents public (blue) and private (red) institutions. Size of square indicates average faculty salary. As can be seen, there is a cluster of famous private universities with about the same in-state tuition and the same number of full-time undergraduates. There is also an outlier private school with large enrollment and low tuition (Brigham Young University).

Figure 4, below, shows how two similarly structured treemaps reveal this same relationship between private and public schools and their tuition costs. Both treemaps use a region/state hierarchy. The treemap on the left is color-coded by public (blue) and private (red) schools. The treemap on the right is color-coded by in-state tuition, with black meaning low tuition and green signifying high tuition. As can be seen, the blue on the left corresponds visually to the black on the right, and the red on the left to the green on the right.

Figure 4: Left, the region/state hierarchy, coloring public schools blue and private schools red. Right, the region/state hierarchy, with a color distribution over in-state tuition (black is low in-state tuition, green is high tuition). Notice the visual correlation between high tuition and private schools. (The trend for out-of-state tuition is similar, although slightly blurred.)

Addressing the perspective of a prospective professor, the Spotfire charts in Figure 5 provide interesting new insights.
[Figure 5 charts: "Average Faculty Salaries by Region", "Average Public Institution Faculty Salaries by Region", and "Average Private Institution Faculty Salaries by Region"; x-axis: Region Name.]

Figure 5: Average faculty salaries by region and type of institution. Color represents public (blue) and private (red) institutions. Surprisingly, average public faculty salaries are higher than average private faculty salaries.

As the graphs in Figure 5 display, a surprising insight emerges: averaged by region, private school faculty salaries are lower than public school salaries. This is surprising because private and public schools are generally distinguishable by the amount of tuition (recall Figures 3 and 4). The somewhat unexpected difference between public and private school faculty salaries can be partly reconciled by the fact that public schools receive public funding.

Figure 6 explores supply/demand and the cost benefits for professors. The treemap below uses the region/state hierarchy, with number of faculty for sizing and average faculty salary for coloring (low salary being black [$23200], high salary being green [$86600]). This graph shows high demand with a correspondingly high average salary in the Middle Atlantic region as well as in CA and MA.
Somewhat surprisingly, the regions with the highest demand (East North Central and South Atlantic) do not seem to pay as much as the third and fourth most needy regions (Middle Atlantic and Pacific).

Figure 6: The general hierarchical distribution of number of faculty (size) and average salary (high salary green, low salary black). Middle Atlantic seems to be a high-paying region, as do CA and MA.

Continuing the exploration of the data, looking for correlates of high faculty salaries, Figure 7 displays out-of-state tuition vs. instructional expenditure per student. The size of each square indicates the average faculty salary for that institution. As can be seen, there is a strong correlation between tuition cost and instructional expenditure, except for the highlighted institutions, whose instructional expenditure per student exceeds their tuition. This is probably due to outside funding sources. For example, it is not too surprising that Gallaudet University has a lot of funding (as it is focused on sign language and teaching the deaf and hard of hearing). Cal Tech receives a lot of technical funding. Johns Hopkins, of course, is known for its medical research.

[Figure 7 plot: "Out-of-state Tuition vs. Instruction Expenditure per Student (Size = Faculty Salary)"; x-axis: Out-of-state tuition. Labeled institutions: California Inst. of Tech., Johns Hopkins University, Washington University, Antioch University, Wake Forest University, Yale University, University of Chicago, Harvard University, Emory University, University of Pennsylvania, Stanford University, Massachusetts Inst. of Tech., Columbia University, Dartmouth College, Princeton University, Northwestern University, University of Rochester, Vanderbilt University, Swarthmore College, Wheaton College, Creighton University, Wellesley College, Pomona College, Univ.of Calif.-Los Angeles, Cornell Univ.-Statutory Clls, Brown University, New York University, Harvey Mudd College, Carnegie-Mellon University, Rice University, Gallaudet University, Case Western Reserve Univ.]

Figure 7: Out-of-state tuition vs. instruction expenditure per student. Color represents public (blue) and private (red) institutions. Size of square indicates average faculty salary. As can be seen, many “top” universities spend much more on instruction per student than they receive in tuition. This is likely due to outside funding sources. As the sizes of the squares show, there is an obvious correlation between instructional expenditure per student and average faculty salaries.

It is also interesting to note that the highlighted universities (those whose instructional expenditure is greater than their tuition) are mostly “prestigious” institutions. As indicated by the size of the square for each institution, these top universities also pay their faculty higher salaries. Of course, this is not too surprising since one of the axes is instructional expenditure per student, but this per-student value also depends on the student/faculty ratio.

Application Evaluation

Spotfire and Treemap were effective tools for visualizing this multi-dimensional data set. To view information, however, the user of either application must prune and define the specific data attributes to view.
Spotfire attempted to suggest attributes that were highly correlated; however, since these suggested views do not include any of the color mappings available on the main visualization panel, it is difficult to see what the graphs mean. Also, because the ‘View Tip’ box cannot be expanded, it is impossible to read long comparison names to help in the pruning process. Spotfire does allow visualization of 4D data in a 2D space via the x-axis, y-axis, node size, and color. This is a nice feature for multi-dimensional data. I also like the aggregate functions Spotfire allows when creating histograms. Treemap, on the other hand, allows visualization of more levels if the data include hierarchies, although in most cases it only allows a 2D representation based on square size and color, grouped by an imposed hierarchy. HCE might also have yielded interesting results, but it crashed fatally upon viewing this data.

This project took a lot of data massaging using Excel and Access prior to visualization. A visualization tool that integrated better data manipulation and control would be very desirable.

Other Interesting Graphs

[Figure 8 plot: "Instruct Expenditure per Student vs. Average Salary"; x-axis: Average salary - all ranks. Labeled institutions: California Inst. of Tech., Johns Hopkins University, Washington University, Antioch University, Wake Forest University, Yale University, U.S. Naval Academy, University of Chicago, Harvard University, Massachusetts Inst. of Tech., Stanford University, Dartmouth College, Emory University, Columbia University, Northwestern University, University of Rochester, Duke University, University of Pennsylvania.]

Figure 8: This illustrates average salary vs. instructional expenditure. There is a general correlation, with many outliers.
The highlighted outliers have high instructional expenditure relative to salary, indicating that these universities probably have more funding that does not necessarily go to the professors. The US Naval Academy is a “public” outlier among the private schools. This is not surprising since the Naval Academy likely receives a lot of federal funding.

[Figure 9 plot: "Out-of-state Tuition vs. Instruction Expenditure per Out-of-State Tuition Dollar"; x-axis: Out-of-state tuition. Labeled institutions: Gallaudet University, Grambling State University, Univ. of Hawaii at Hilo, California Inst. of Tech., Univ. Alabama at Birmingham, Brigham Young University, Wake Forest University, Antioch University, Boise State University, Univ.of Calif.-Los Angeles, University of South Alabama, Eastern Oregon State College, Adams State College, Univ.Alaska-Fairbanks, Lincoln University, Univ.of Calif.-San Diego, Johns Hopkins University, Univ. of Hawaii at Manoa, Bellevue College, William Paterson College, Washington University, Rice University, Creighton University, University of Washington, Harvard University, University of Chicago, Stanford University, Yale University, Northwestern University, Dartmouth College, Duke University, Massachusetts Inst. of Tech., Columbia University, Univ.of Calif.-Davis, University of Pennsylvania, University of Rochester, Carnegie-Mellon University.]

Figure 9: This graph highlights the universities where students get the “biggest bang for their buck”, in other words, how much instructional expenditure they get per tuition dollar. This puts expensive and inexpensive universities on a level playing field, as their instructional spending is normalized by their out-of-state tuition.

[Figure 10 plot: "Regions vs. Instructional expenditure per Out-of-State Tuition Dollar"; x-axis: Region Name (East North Central, East South C..., Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North ..., West South Central). Labeled institutions: Gallaudet University, Grambling State University, Univ. of Hawaii at Hilo, California Inst. of Tech., Univ. Alabama at Birmingham, Brigham Young University, Wake Forest University, Antioch University, Boise State University, Texas Woman's University, University of South Alabama, Washington University, Adams State College, Jersey City State College, Lincoln University, Univ. of Illinois-Chicago, Howard University, Bellevue College, SUNY at Stony Brook, William Paterson College, Kean College of New Jersey, University of Chicago, Harvard University, Creighton University, Yale University, Massachusetts Inst. of Tech., Dartmouth College.]

Figure 10: This shows which universities, by region, have the “biggest bang for their buck”.
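
The normalization used in Figures 9 and 10 is a simple ratio: instructional expenditure per student divided by out-of-state tuition. The Python sketch below shows one way that derived attribute could be computed and ranked. The function names, field names, school names, and numbers are all hypothetical, and skipping records with non-positive tuition is an assumption of this sketch, not something the original analysis states.

```python
def expenditure_per_tuition_dollar(expenditure_per_student, out_of_state_tuition):
    """Return instructional spending per tuition dollar, or None if tuition is not positive."""
    if out_of_state_tuition <= 0:
        return None  # assumption: such records are skipped rather than plotted
    return expenditure_per_student / out_of_state_tuition


def rank_by_ratio(records):
    """Return (name, ratio) pairs sorted from the highest ratio to the lowest."""
    scored = []
    for rec in records:
        ratio = expenditure_per_tuition_dollar(rec["expenditure"], rec["tuition"])
        if ratio is not None:
            scored.append((rec["name"], ratio))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical placeholder records, NOT values from the 1995 data sets.
records = [
    {"name": "School X", "expenditure": 30000, "tuition": 6000},
    {"name": "School Y", "expenditure": 20000, "tuition": 20000},
    {"name": "School Z", "expenditure": 9000, "tuition": 0},
]

ranking = rank_by_ratio(records)
print(ranking)
```

A ratio above 1 corresponds to the institutions highlighted in Figure 7, where instructional spending per student exceeds tuition. Grouping the ranked pairs by region would give the per-region view of Figure 10.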