Producing Better Institutional Performance Metrics More Efficiently
Results of a 2014 On-line Survey of Institutional Research Departments, Offices and Related Units
January 23, 2015
Institutional Research, Sheridan College, Ontario, Canada
Financial support for this research was provided by the Ontario Ministry of Training, Colleges and Universities from the Productivity and Innovation Fund.

Summary

The incredible rate at which post-secondary institutions have been accumulating data over the past decade should mean that they are now overflowing with useful information to monitor institutional performance and make strategic decisions. However, since raw data has to go through a complex and expensive intermediate process to become useful information, simply accumulating more and more data does not guarantee that more or better information is being produced. In fact, the strength and efficiency of this intermediate process has much more bearing on the amount of useful information available to senior decision makers than the quantity of raw data available.

A main output of this intermediate information-generating process is institutional performance metrics. These metrics convey strategic- or performance-relevant information in a form that senior decision makers can use. The purpose of this study is to help those charged with building and improving their institutions' performance metrics systems with empirically grounded research based on an extensive survey of the leaders of Institutional Research (IR) Departments and Offices throughout the United States and Canada.1

This study was financially supported by the Ontario Ministry of Training, Colleges and Universities as part of a large Ontario-wide fund known as the Productivity and Innovation Fund ("PIF"), which was made available to post-secondary institutions in Ontario to foster innovation, improve the quality of learning and promote greater efficiency. The research, led by Sheridan's Department of Institutional Research with research support from Academica Group (a research firm specializing in higher education), included in-depth telephone interviews of 38 key administrators closely associated with the production and use of performance metrics, in addition to a survey sent to 1,307 IR Departments or related units (192 responses, 152 complete responses) in Canada, the United States, Australia and New Zealand.2 All respondents, whether they fully completed the survey or not, were promised a copy of the results of the survey. This report contains those results.

Simplified Model of Metric Production and Net Benefits

The analysis of the survey data directly informs two important choices: 1) which metrics to focus on because they provide a higher net benefit; and 2) what mix of resources, environment and data infrastructure results in higher IR productivity. A simple framework is used to model institutional metric production.3 The framework places the IR Department at its center, pulling data from databases, transforming it into metrics and delivering it to senior decision makers and governments. Raw data, mostly from institutional administrative database systems and surveys, is accessed, cleaned and prepared via computer algorithms and manual processing, resulting in an intermediate data input that is further processed into performance metrics. Validation checks are done at various points and the resulting metrics are distributed by some reporting mechanism to senior decision makers. The collective value of the individual metrics produced determines the overall value of the system of performance metrics. The direct costs of producing these metrics are the labour costs of the IR staff to understand business rules, write computer code, design and administer surveys, etc. Creating and maintaining a data warehouse, data marts and data dictionaries (data infrastructure) that can be used for metric production is another possible cost, but one that also lowers or eliminates the cost to produce the intermediate data input.

Survey Results

While an oversimplification, the survey results map quite well to this model. In particular, the survey results provide information about the value of individual metric outputs, their collective value and how changes in the inputs (IR staff, data infrastructure and the operating environment) correlate with the outputs.

The Value of Metrics

Respondents were presented with up to 40 typical institutional performance metrics. For each metric, respondents were asked to identify whether the potential metric is produced, the quality of the metric at their institution and their view of the metric's theoretical or potential importance, independent of its current importance, which may be affected by lower quality at their institution (i.e. what they would tell a new director of an IR department at another similar type of institution). The most important broad metric categories are Student Success, Enrolment Management and Student Satisfaction.4 Within each category, there is considerable variation in the individual metric scores. Thus, ignoring cost considerations, potential synergies and highly institution-specific missions and goals, carefully choosing metrics within each broad metric category can potentially yield a much higher overall metric score and a more comprehensive view of the institution's performance. See Table 1 below for a detailed list of ranked metrics.

1 For expositional convenience, we refer to the primary unit responsible for performance metric production as the Institutional Research (IR) Department, though that unit can have many different names, and institutional metric production in some institutions is quite decentralized and thus not the primary responsibility of any one department. Also for expositional convenience, we refer to the key administrator of the IR Department or related unit, that is, the person who typically completed the survey, as the "director", even though there are many other possible titles and positions for this role.

2 Of the 1,307 institutions contacted, 152 completed the full survey and an additional 40 completed some of the survey (response rate of 11.6% for completed surveys, 14.7% for completed and partially completed surveys). See Table 2.1 in the main text for additional summary statistics about response rates.

3 We focus on a centralized model of metric production because the survey data best matches this model. That is, the survey targeted one respondent per institution, the person who was the most involved with directing the production of institutional metrics. Institutions with highly decentralized models for producing metrics should interpret the results with this in mind.

4 An alternative way to rank metric categories is by their primary use and, assuming that strategic or quality assurance uses are the most important, this produces the same top rankings.
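Table 1, which follows, ranks metrics by a score built from the respondents' importance and quality ratings; the scoring method is described in Section 2.1. The minimal sketch below shows one way such a score could be computed. The data layout and the example ratings are hypothetical, not the survey's actual data, and whether ratings are summed or averaged across respondents only changes the scale of the score, not the ranking, provided response counts per metric are similar.

import pandas as pd

# Hypothetical long-format ratings: one row per respondent x metric, 1-5 Likert values.
ratings = pd.DataFrame({
    "metric":     ["Graduation rate", "Graduation rate",
                   "Student alumni awards", "Student alumni awards"],
    "category":   ["Student Success"] * 4,
    "importance": [5, 4, 3, 3],
    "quality":    [4, 4, 2, 3],
})

# Per-metric score: importance + quality, combined across respondents (treating the
# ordinal Likert scales as if they were cardinal, the simplification noted in Section 6.1).
ratings["score"] = ratings["importance"] + ratings["quality"]
metric_scores = ratings.groupby(["category", "metric"])["score"].mean()

# Category subtotals give a ranking of the broad metric categories, as in Table 1.
category_scores = metric_scores.groupby(level="category").sum().sort_values(ascending=False)
print(metric_scores, category_scores, sep="\n\n")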
Table 1: Metric Ranking by Metric Score, Grouped by Metric Category

Metrics can also be considered along the dimensions of quality and importance. Chart 1 below plots each metric's quality (horizontal axis) and importance (vertical axis); each metric falls into one of four quadrants:

- Key Metrics: high importance and high quality
- Potentially Distracting Metrics: low importance and high quality
- Challenging Key Metrics: high importance and low quality; and,
- Low Value Metrics: low importance and low quality.

Retention and graduation rate metrics are closely tied to arguably the main broad purpose of almost any post-secondary institution and are ranked highest on both importance and quality (top right).5 One interesting quadrant is the Challenging Key Metrics (top left), those of lower quality but high importance. Student learning or skill gain and graduate employment outcomes stand out. Their importance is self-evident, but the fact that their quality is relatively low suggests that it is difficult or expensive to create metrics that accurately measure these vital functions and outputs of an institution.

5 Certainly, there are some institutions, programs and situations where students who transfer to another institution for more advanced studies before graduating represent a successful outcome, but that does not diminish the importance of retention and graduation metrics; it only means that more information, such as separating out transfers, is needed.

Chart 1: Four Quadrant Diagram of Metrics

IR Resources, Characteristics and Operating Environment

Institutions in the sample range from as small as 432 students to as large as 60,000. The largest IR Department has 15 full-time equivalent (FTE) staff. University IR Departments are about twice as large as those in colleges. Larger institutions have, on average, larger IR Departments, but there are considerable economies of scale. Chart 3 below plots the IR FTE staff levels (vertical axis) against the reported institution headcount (horizontal axis, in base 10 logarithm). For every doubling of institution size, the IR Department does not also double but adds about 1.1 new staff (about 25% of the average IR Department size). Significant variation exists between institutions. Based on eight open-ended comments relevant to institution size (paraphrased in the chart), larger-than-expected departments have responsibilities beyond a traditional IR Department, while smaller-than-expected departments seem to indicate more limited capabilities and possibly some frustration with senior leadership.

Chart 3: Relationship between Institution Size and Number of FTE IR Staff

The levels of education of IR staff are high, with a majority (62%) having a graduate degree (19% hold a Ph.D.). IR key administrators ("directors") have diverse educational backgrounds, with the largest share having a background in Education (28.2%), followed by Social Sciences (24.8%) and Business/Economics (17.4%). The IR Department's position in the organization reporting structure varies substantially; a large share of departments (44%) falls under the Vice-president Academic/Provost, while a nearly equal share (45%) is positioned under either the President or a non-academic executive. Respondents indicated that a lack of IR staff was the largest barrier to improving performance metrics, though the skills of existing IR staff were not seen as lacking (ranked as the lowest barrier).
The availability of raw data is also not considered a barrier; rather, the lack of necessary infrastructure (e.g. data warehouses, data dictionaries) and support to create the infrastructure systems (e.g. data governance, senior decision maker support) are seen as more important problems. IR Departments typically use tools and data sources that have been around for decades and only a few are routinely using the latest data-driven approaches, though a large minority use more sophisticated statistical techniques routinely. Respondents flagged a lack of perceived buy-in among senior decision makers as a barrier, though much of the impetus for creating metrics comes from senior executives, with much less demand from mid-level management, Faculties or even the Board of Governors.6

Relationship between Inputs, Operating Environment and Performance Metric Outputs

To measure the correlation between metric outputs, IR inputs and the operating environment, we construct an index of metric output and correlate this to reported input levels, operating conditions and interactions between inputs and operating conditions. The following results are found (a stylized version of the underlying specification is sketched at the end of this Summary):

- For every additional IR employee, if there is a good data infrastructure, 0.9 (3.2%) additional metrics are produced and the share of Challenging Key Metrics is higher. With poor data infrastructure, adding more IR staff results in no additional metrics being produced.
- A perceived lack of support from senior administrators had no impact on the number of metrics produced. Counterintuitively, it is weakly associated with a higher share of Challenging Key Metrics. (It may be that more ambitious IR administrators perceive themselves as being more constrained by limited resources.)
- Age of the IR Department is not correlated with the number of metrics produced but is strongly positively correlated with the average metric importance of the metrics produced, independent of other factors, perhaps reflecting a learning-by-doing process.
- The more metrics that are produced, the lower the average metric importance (diminishing returns to metric importance).
- If government is seen as a main initiator of the production of metrics, the total number of metrics produced is lower, the share of Key Metrics is higher and the shares of Challenging Key Metrics and Potentially Distracting Metrics are lower. One interpretation is that governments typically request metrics that are categorized as Key Metrics, and this comes at an opportunity cost of fewer other types of metrics produced.
- Directors with a social science or education background lead departments or offices that produce about 14% more metrics, for reasons unrelated to institution type, size, department size or country.
- There are no differences between countries or institution types in the number of metrics produced once other differences have been taken into account.
- Average metric importance for colleges is slightly higher than for universities.
- IR Departments that report to the President's Office produce a mix of metrics that is slightly less important, which may be due to a choice of greater breadth of measurement at the expense of not producing some metrics generally considered to be more important.

The results can be summarized within the structure of the simple model described above. The production of performance metrics by IR Departments does depend on the number of staff, but staff work with a research data infrastructure, and if the data infrastructure is lacking, the ability of staff to produce new metrics will be greatly reduced, possibly to zero.7 If the department is significantly understaffed relative to a typical size for the same institution type, metric output is also likely to be low or of poorer quality. IR Departments focus on core metrics (Key Metrics) when resources are limited (or when mandated by government), but as additional resources become available they tend to develop metrics that are important but more difficult to create at a high level of quality (Challenging Key Metrics). As more metrics are produced, the value of new metrics becomes smaller, since directors and their institutions rationally have focused on producing the most valuable metrics first. Some of the productivity of an IR Department comes with the maturity of the department; the size of the department, quality of data infrastructure or level of support from senior management are unlikely to fully replace this maturation process. Furthermore, changing where the IR Department is positioned in the institution is unlikely to have a significant impact on metric production.

6 The interview results are inconsistent with the survey results in that they differ markedly in how insufficient raw data inputs and lack of resources (lack of staff) are ranked as barriers: the interviewees ranked "lack of input data" much higher than "lack of resources", whereas survey respondents ranked "lack of input data" much lower than "lack of resources". One possible explanation for the incongruity is that the survey was mostly answered by those directly responsible for IR (over 75% were at the director level or lower) while the interviewees were mostly at a more senior level (only 45% were at the director level or lower), and these two levels of administration may, in an uncoordinated fashion, be using different definitions of resources and data inputs. For instance, senior administrators, further away from the information-producing process, may consider a research-ready dataset (the one generated in Step 1 of the simple model described above, easily extracted from a data warehouse) to be part of the raw data inputs or resources for IR, while directors and managers consider the underlying operational raw data as the data inputs, and compiling and cleaning this data into data warehouses and useable datasets to be part of the research process itself, and thus requiring staff time to process. As explained, the statistical analysis shows that, other things being the same, departments with more IR staff do not produce more metrics when the data infrastructure is inadequate. Thus, in some ways, both definitions could be correct, depending on whether there is a good research data infrastructure and where the cost to create and maintain such an infrastructure lies.

7 It could be that new staff have so much trouble producing additional metrics that they work instead on non-metric related activities, or they start to work on improving the data infrastructure. In either case, no new metrics are created in the short run.
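To make the structure behind the findings listed above concrete, the two-equation specification described in Section 2.1 can be written in stylized notation as follows. This is a simplified sketch, not the report's exact specification (the actual variable lists and controls differ; the system was estimated with reg3 in Stata):

N_i = \alpha_0 + \alpha_1\,Staff_i + \alpha_2\,Infra_i + \alpha_3\,(Staff_i \times Infra_i) + \gamma' x_i + \varepsilon_i

\bar{V}_i = \beta_0 + \beta_1\,N_i + \delta' z_i + u_i

where N_i is the number of metrics institution i produces at fair-or-better quality, \bar{V}_i is the average (group-based) importance of those metrics, Staff_i is IR FTE staffing, Infra_i indicates adequate data infrastructure, and x_i and z_i are overlapping sets of controls (department age, director background, institution type and size, etc.). A coefficient \alpha_1 near zero together with a positive \alpha_3 corresponds to the finding that additional staff translate into additional metrics only when the data infrastructure is good, while \beta_1 < 0 captures diminishing returns to metric importance. Because N_i appears on the right-hand side of the second equation, the two equations are estimated jointly.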
1. Introduction

In order to build up and enhance its own suite of performance metrics, in the Winter of 2013 Sheridan College, working with Academica Group (a research firm specializing in higher education), collected survey data from 192 Institutional Research (IR) Departments, offices and related units8 from publicly funded post-secondary institutions throughout the United States of America, Canada, Australia and New Zealand. This study was financially supported by the Ontario Ministry of Training, Colleges and Universities ("the Ministry") as part of a large Ontario-wide fund known as the Productivity and Innovation Fund ("PIF"), which was made available to post-secondary institutions in Ontario to improve efficiency and foster innovation.

The research began in December 2013, led by Sheridan's Department of Institutional Research with research support from Academica Group. The background research used to develop the survey instrument involved a review of the literature and in-depth telephone interviews of 38 key administrators closely associated with the production and use of performance metrics. The survey was sent to 1,307 IR Departments or related units in Canada, the United States, Australia and New Zealand. Of these, 192 responded and 152 fully completed the survey and were included in the analysis presented in this report.9 All respondents who participated in either an interview or the survey were promised a copy of the results of the survey. This report contains those results.

8 For expositional convenience, we refer to all these units simply as IR departments in this report.

9 Of the 1,307 institutions contacted, 152 completed the full survey and an additional 40 completed some of the survey (response rate of 11.6% for completed surveys, 14.7% for completed and partially completed surveys). See Table 2.1 in the main text for additional summary statistics about response rates.

1.1. Definitions

Throughout the report, we refer to institutional performance metrics, sometimes calling them performance metrics or simply metrics. This study does not deal with detailed operational metrics but instead focuses on the types of metrics that would be of interest to senior decision makers, the Board of Governors and governments. We also refer to IR Departments and offices even though at some institutions included in our sample there is no department or office referred to as Institutional Research. Moreover, even where an institution has an IR Department or office, some of the important metrics may be produced by other units. Thus, when we refer to IR Departments or offices, it should be taken within context to mean the unit or units that are primarily responsible for the production of performance metrics.

1.2. Organization of Report

The rest of the report is organized as follows. Section 2 briefly describes the survey methodology. Sections 3 through 5 present summaries of key parts of the survey: Section 3 deals with direct inputs in IR Departments and offices, Section 4 examines the operating environment and barriers, and Section 5 documents recent changes in the development and production of institutional metrics. Section 6 examines the institutional performance metrics themselves; readers mostly or only interested in metric ranking and categorization may want to skip to this section. Section 7 pulls the other sections together by using multiple equation regression analysis to estimate associations of importance-adjusted metric production, and of the choice of which metrics to produce, with potential drivers, including IR staff resources, the operating environment and institutional characteristics. Section 8 concludes.

2. Methodology

The main data source for this study is a survey of the key administrators of Institutional Research Departments and related units. In order to construct the survey instrument, a set of telephone interviews of key administrators10 of post-secondary IR departments was done. A total of 150 colleges and universities were invited to participate in a telephone survey. Thirty-eight of these completed the interview, which lasted from 18 to 56 minutes, with an average of 38 minutes (see Appendix 2 for the interview questions). All interviews were done by telephone and conducted by the lead researcher at Academica, guided by a fixed list of interview questions but using a conversational style. Each interview was recorded and transcribed by a third party and then anonymized by the researcher at Academica. Interviewees were asked for permission at the beginning of the interview to have the interview recorded and transcribed, and were assured that all identifying information would be removed from transcripts to maintain anonymity and confidentiality. Sheridan's lead IR researcher was provided with these anonymized interview transcripts as well as a summary of the main findings prepared by the Academica researcher. The information from the interviews was used to help develop most of the main survey sections.

10 Includes Provosts, Associate Provosts, Vice-presidents, Associate Vice-presidents, Directors, and Managers. For the survey, the person who appeared to be the most directly responsible for leading the IR department, IR office or closely related unit was contacted.

The survey instrument itself asked respondents for information about the department or office (e.g. resources, staff composition, position in the reporting structure, history), the institution (e.g. size, degree of admission selectivity), and performance metrics (e.g. opinions about importance and quality, recent developments, barriers to production, initiators). There were also two open-ended questions allowing respondents to provide any other information to help with understanding the context of some of the answers and to provide feedback about the survey itself. See Appendix 1 for the full survey instrument.

Numerous questions in the survey instrument ask the respondent for his or her opinion or evaluation. Likert scales were used for many of these questions. For a few of the questions involving at least some subjective judgment, the survey asked the respondent for related judgements in multiple non-consecutive questions using different prompts. This allows for some testing of the internal validity of the instrument, but for the most part, the length of the survey precluded a design that would allow extensive internal validity testing of individual questions. Notwithstanding this, the strength and direction of statistical association based on variation in responses across respondents provides a strong test of the validity of the key results.

In order to conduct a robust statistical analysis, it was important to have a large enough sample. One objective of the study was to examine potential differences between various subgroups of institutions. This further increased the need to have a large sample in order to ensure there were enough observations in each relevant subgroup.
The main grouping variables used were country, institution type (college or university) 11, institution size and open versus selective admission policy. As the subsequent results show, the final useable sample of 152 observations was indeed large enough to be able to draw meaningful conclusions from the statistical analysis. To build a contact list of potential respondents, a list of all publicly funded post-secondary institutions in Canada, the United States, Australia and New Zealand was created. From this list, we categorized institutions by country, institution type, and size (small, medium and large).12 For Canada (excluding Quebec), Australia and New Zealand, all post-secondary public institutions in the population for which contact information could be obtained were included and for the United States, where the cost of collecting contact information for all public post-secondary institutions was higher, 1,500 institutions with a minimum of 1,000 students enrolled were selected at random and from these, contact information for 1,181 (79%) was located. The contact person was initially notified by email in early April that there would be a request to participate in the survey and was also provided information about the survey’s purpose. The contact was then emailed a link to the survey about one to two weeks later. Up to five reminder emails were sent during the second half of April. A second brief follow up survey was sent to all respondents in the United States in order to collect additional information on the importance and quality of specific metrics. The initial survey took longer to complete than expected, with respondents taking an average of 40 minutes. Of the 1,307 respondents contacted, 152 completed the full survey and an additional 40 completed some of the survey (response rate of 11.6% for completed surveys, 14.7% for completed and partially completed surveys). Table 2.1 below shows the breakdown of responses by country, size and institution type. Three quarters of the complete surveys (115) were from American institutions while only three were from Australia and New Zealand. Given the very small number of respondents for Australia and New Zealand, those responses are excluded from the analysis.13 Canadian institutions were somewhat more likely to complete the full survey than American institutions (91.9% versus 77.2%, the difference being statistically significant at the 5% level). 11 There are country differences in what is defined as a college and university. For the purposes of this report, for the United States we refer to community colleges and two-year colleges as colleges and four-year and longer institutions as universities; for Canada, we take the college and university designation based on contact list sources that are separate for colleges and universities. 12 For American institutions, the standard Integrated Postsecondary Education Data System (IPEDS) classification was used: 0 - 4,999 students enrolled (small); 5,000 - 9,999 (medium); 10,000 and above (large). For all other respondents we used the Ontario Ministry of Training, Colleges and Universities standard categories: 0 – 3,000 (small); 3,000 – 8.000 (medium); 8,000 and above (large). 13 However, all the Australian and New Zealand institutions that participated were also sent a copy of this research and we believe that most of the results will be of direct relevance to these institutions as well. 
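As a quick check on the arithmetic above, the overall response rates and the Canada-United States completion-rate comparison can be reproduced along the following lines. This is an illustrative sketch only: the country-level counts are back-calculated from the percentages quoted above and should be confirmed against Table 2.1.

from statsmodels.stats.proportion import proportions_ztest

contacted = 1307
completed_full = 152
completed_partial = 40

print(f"full completion rate: {completed_full / contacted:.1%}")                        # ~11.6%
print(f"any-response rate:    {(completed_full + completed_partial) / contacted:.1%}")  # ~14.7%

# Two-proportion z-test of the completion-rate gap among respondents (91.9% vs 77.2%).
# Placeholder counts, back-calculated from the quoted percentages; replace from Table 2.1.
full_ca, respondents_ca = 34, 37
full_us, respondents_us = 115, 149
z, p = proportions_ztest(count=[full_ca, full_us], nobs=[respondents_ca, respondents_us])
print(f"z = {z:.2f}, p = {p:.3f}")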
Page 12 of 71 Table 2.1 Number of Respondents by Country and Completion Status 2.1 Methodology for the Survey Analysis A number of different methodologies were used to analyse the survey data. The analyses for Sections 3, 4 and 5 employ only basic summary statistics and some graphical representation of relationships between variables. These approaches alone are sufficient to provide meaningful insights. However, more complex inferences are only possible using more advanced statistical techniques that go beyond basic multivariate regression analysis. These are taken up in Sections 6 and 7. However, this paper’s focus is on presenting the results and interpretations, and thus the statistical techniques used are not discussed in detail. The ranking and categorization of performance metrics is the subject of Section 6. In order to rank metrics relative to each other, the numeric value of importance and quality, as defined using five-point Likert scales, are summed together across all respondents to create an overall quality-importance value for that metric, thus treating the ordinal importance and quality scales as if they were equal cardinal scales. The shortcomings of this approach are discussed in that section as well as checks of the robustness of the results to alternative specifications. The more complex analysis involving multivariate and multi-equation statistical techniques is taken up in Section 7. In this section, the focus is not on the metrics but on the factors that cause the set of metrics produced by IR Departments to differ across institutions. We initially constructed an index of IR metric productivity by again summing together quality and importance of all metrics the respondent rated. Some limitations were found with this approach. In particular, we found some evidence of a possible rater bias that induced a potentially spurious correlation between metric quality and importance. In order to mitigate the potential bias a different approach was used that involved two steps. The first step was to construct an average metric importance for each metric by averaging across all respondents that rated the importance of that metric. The second step was to, for each respondent, add together the average metric importance value computed in the first step for only those metrics that the respondent indicated were produced at his or her institution at a “fair” quality level or better. Thus this approach converts the metric quality to a dichotomous scale (fair-and-above or poor-or-not-produced) and uses average importance values to attenuate any potential unwanted, and unconscious, respondent bias in reported metric quality and importance. Page 13 of 71 The resulting IR productivity index is decomposed into its two constituent components: 1) the total number of minimum quality metrics produced, and 2) the group-based importance-weighted value of those metrics. Both components are correlated to various IR inputs (e.g. staff levels, director educational background) and characteristics of the operating environment using a two-equation system. Specifically, the first equation in this system relates the number of fair-and-above quality metrics with inputs and operating environment, and the second equation relates the average importance value of these fair-and-above quality metrics to a different but overlapping set of explanatory variables. 
The number of fair-and-above metrics is included in the second equation to capture diminishing returns in the importance-value of metrics; if institutions build the set of metrics rationally, starting with more important metrics and then less and less important metrics, the data should show that the more metrics that are produced, the lower should be the importance of the last metric produced. (We do not impose a diminishing returns assumption on the specification but allow it to be estimated freely.)14 The final analysis returns to the four-quadrant diagram introduced in Section 6 and places it in a multiequation setting. For each respondent, we first calculate the share of metrics that fall into each of the four quadrants described in Section 6. We then examine how various factors influence these shares, including staffing levels, the role of government, barriers and institutional characteristics. (Since shares lie between zero and one and sum to 100% this creates certain restrictions on the error term that are not valid using ordinary least squares regression techniques. We use a quasi-maximum likelihood fractional multinomial logit model proposed by Papke and Wooldridge (1996) 15 and developed for Stata by Buis (fmlogit).16) 3. IR Departments, Offices and Their Resources This section discusses the characteristics of the institutions, the IR Departments and resource levels. 3.1 Institution and IR Department Size The variation in the institutions included in the analysis is large. Institution size ranges from as small as 432 students to as large as 60,000 students, as shown in Table 3.1 below (based on reported full-time students enrolled in the fall term (September 2013). The median enrolment is smaller than the mean, indicating that the distribution is skewed toward smaller institutions, which is not surprising. This skew is also apparent in the institution size histograms shown in Chart 3.1. There are both small and large institutions represented for each country/institution type combination. The large variation in institution size, as we will see shortly, is very useful for understanding the degree to which institution size drives IR Department size and, in turn, the number of metrics produced. 14 Since this approach includes the dependent variable in the first-equation as an endogenous right-hand side variable in the second, we use a three-stage simultaneous equation technique (reg3 in Stata) as the assumptions for ordinary least squares in this specification are not valid. 15 Econometric Methods for Fractional Response Variables with an Application to 401(k) Plan Participation Rates. Journal of Applied Econometrics, 11(6):619-632. 16 http://maartenbuis.nl/software/fmlogit.html , Maarten L. Buis, accessed September 3, 2014. Page 14 of 71 Table 3.1: Institution Size by Country and Institution Type Chart 3.1: Institution Size Distribution by Country and Institution Type Page 15 of 71 There is also considerable range in the size of IR Departments (see Table 3.2 below). The size of the IR Departments varies from zero full-time equivalent (FTE) staff to 15.17 University IR Departments are about twice as large as those of colleges, with American two-year/community colleges having the smallest IR Departments, on average. 
The difference between average Canada and the United States sizes is explained by differences in average institution size and type in this sample.18 Table 3.2: Number of Staff in Institutional Research Department (and Related Units) by Institution Type and Country We expect that larger institutions will on average have larger IR Departments. However, because much of IR data preparation and report generation is based on scalable (and extensible) computer routines, significant economies of scale in metric production and other IR activities should exist; thus, doubling the size of an institution should result in the IR department expanding by less than double. We can see this clearly in Chart 3.2 below, which plots the IR FTE staff levels against the reported institution headcount (in base 10 logarithm). As the slope of the regression line through the data shows, for every doubling of the institution size, about 1.1 new IR staff are added. Chart 3.2 also shows the underlying data points, and the significant variation between institutions is easy to see. Eight of the respondents provided open-ended comments that are relevant to institution size, and are paraphrased in the chart (comments are located near the corresponding institution data point but are not indicated precisely to protect confidentiality). Some respondents’ comments indicate that they feel that their departments had responsibilities beyond a traditional IR Department and this explained why they were larger than would be expected. These comments were all located above the regression line, confirming that these departments or offices are indeed larger than what would be expected. One of the larger IR offices of a large institution indicated that they were using in-house 17 The number of full-time equivalent staff was calculated as the number of reported full time staff plus 0.5 multiplied by the reported number of part time staff. 18 Based on an unreported ordinary least squares regression of the number of FTE staff on the natural logarithm of institution headcount, a college dummy variable (equal to 1 for a college and 0 for a university) and a country dummy variable (equal to 1 for Canada and 0 for the United States of America), the coefficient multiplying the country dummy variable was found to be statistically insignificant. Page 16 of 71 reporting tools, though positioning to move to an outside vendor, which we speculate may explain additional staff needed to maintain these reporting tools. On the other hand, smaller than average departments for the institution size made comments that seem to indicate more limited abilities and possibly some frustration with senior leadership. Chart 3.2: Relationship between Institution Size and Number of FTE IR Staff 3.2 IR Department Characteristics The length of time an IR Department has existed varies considerably, with many departments being quite young while others have been established for decades. Chart 3.3 below shows the distribution of the age of the department (responses were missing for 40 institutions). The tenure of the department turns out to be closely correlated with a measure of productivity that is examined in Section 8. Page 17 of 71 Chart 3.3: Distribution in the Number of Years the IR Department Has Existed In addition to institution size, the size of the IR Department is related to institution type (e.g. college versus university) and the age of department. These relationships are shown in Chart 3.4 below. 
The scatter plots are similar to Chart 3.2 above, with each graph showing headcount (base 10 logarithmic scale) on the horizontal axis and IR FTE staff counts on the vertical axis. The left-hand graph shows the data for colleges and the right-hand for universities, while within each graph the orange markers (and orange trend lines) correspond to IR Departments that have existed for 10 or more years and the blue markers (and blue trend lines) correspond to IR Departments that have existed for less than 10 years. Readily apparent is the difference in the slopes of the lines between colleges and universities, with the university trend line being much steeper, suggesting that university IR Departments and offices vary more with institution size than those in colleges. It is also evident that there is a significant difference between older and younger IR Departments (differences in the slopes of the orange and blue lines) for universities. Older IR Departments in universities are on average larger than their younger counterparts at similarly sized universities. This may be evidence of a ratcheting effect whereby, as IR Departments develop more metrics and take on additional responsibilities, existing responsibilities remain, necessitating department growth. However, it is not obvious why this would not also apply to colleges, yet the age of the IR Department or office does not seem to matter for colleges.19

19 The difference in slopes between older and younger IR Departments among universities is marginally statistically significant, so it may be that there is no actual difference even among universities and the results are simply due to this particular sample. On the other hand, the difference in slope of the trend lines between colleges and universities is highly statistically significant, so it is very unlikely that the college-university difference in slopes is due to chance. Also, the results of the regression analysis presented in Section 7 show that, independent of its size, the department's age is associated with the choice of which metrics to produce, with older departments and offices tending to produce metrics considered collectively as more important.

Chart 3.4: Relationship between Institution Size and Number of FTE IR Staff, By Institution Type and Age of the IR Department

The levels of education of IR staff vary quite a bit, reflecting the wide range of skills needed in an IR Department or office. A majority (62%) of IR staff have a graduate degree (19% hold a Ph.D.), and as Table 3.3 below shows, this does not vary by country or institution type (the differences in the distributions are not statistically significant).

Table 3.3: Distribution of IR Staff Credentials by Country and Institution Type

IR key administrators ("directors") have diverse educational backgrounds, with the largest share having a background in Education (28.2%), followed by Social Sciences (24.8%) and Business/Economics (17.4%). Other educational backgrounds, such as Engineering, Computer Science and Public Administration, are also represented. Chart 3.5 shows that there are some country differences, with IR administrators at Canadian institutions more often having educational backgrounds in business or economics compared to their American-based colleagues, who are more likely to have backgrounds in education. In terms of institution type, there are no statistically significant differences between colleges and universities in terms of the director's background.
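For readers who want to reproduce this kind of comparison, a distributional difference such as the one summarized in Table 3.3 (or the country differences in Chart 3.5) is typically assessed with a chi-square test of independence. The counts in the sketch below are invented for illustration only; the survey's actual cell counts are in the tables.

import numpy as np
from scipy.stats import chi2_contingency

# Rows: credential level (Bachelor's or less, Master's, Ph.D.); columns: colleges, universities.
# Made-up counts for illustration; substitute the actual Table 3.3 cells.
observed = np.array([
    [30, 28],
    [55, 60],
    [15, 22],
])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")  # p > 0.05 -> no significant difference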
Does the director’s background matter to the production of metrics? In the later statistical analysis section we present evidence that it does a bit, with those having an Education or Social Science background producing more metrics with a given number of IR staff, which could reflect a difference in how tradeoffs on allocating staff to metric and non-metric related activities is influenced by a directors background, or possibly how the hiring criteria for the IR director is influenced by varying institution needs. Chart 3.5: Distribution of Educational Background of the IR Key Administrator by Area of Study and Country We asked respondents how long they were at the institution and how long they have been in the role as the key administrator of Institutional Research. From these responses we infer that those who were at the institution longer than in their current role were internal hires (or transfers) and the others were Page 20 of 71 external hires. By this definition, roughly equal shares were hired externally (53.5%) versus internally (46.5%). Internal hires occur after about eight to ten years working at the institution, even for younger IR departments (see Chart 3.6) and the number of years of experience in the IR leadership role is quite similar between internal and external hires. Despite these similarities, directors’ total years of experience in the institution varies considerably by institution (standard deviation of 5.67 years), even when normalized by the number of years the IR Department or office has existed.20 Chart 3.6: Relationship between Key Administrator’s Previous Role, Tenure at Institution and IR Department’s Age 4. Operating Environment 4.1 Reporting Position of the IR Department IR Departments and offices, which typically have a wide institutional mandate, have a position in the organization reporting structure that varies substantially. While the largest share (44%) falls under the 20 Years the IR department or office has existed explains less than 7% of the variation in tenure time as the key IR administrator in a regression of the latter on the former (regression results not shown). Page 21 of 71 Vice-President Academic/Provost, a nearly equal share (45%) is positioned under either the President or a non-academic executive (see Chart 4.1). One interesting question is whether the organizational position of IR matters to the production of metrics. While many arguments can be advanced for why it should or should not matter, it seems that this question is best addressed empirically. We will return to this question later. Chart 4.1: IR Department’s Position in the Reporting Structure 4.2 Barriers to Metric Production Of course, there are many other factors that can affect an IR Department’s production of performance metrics. While the survey is not able to capture all of these, it does provide a number of measures of potentially relevant operating conditions, ranging from the availability of technical tools, to data resources, to the support of senior leadership. Chart 4.2 shows a ranking of potential barriers based on the number of respondents that thought the item was a barrier to a large or very large extent. A lack of IR staff was considered the largest barrier to improving performance metrics, though the skills of existing IR staff were not seen as lacking (ranked as the lowest barrier). 
It is possible that IR Departments have generally been staffed with highly qualified staff (as evidenced by the large share with graduate level credentials) but that this is also expensive and thus limits the number of total staff that could be employed. Whether an average IR Department or office would be better off moving to a greater balance between the level of training and the number of staff is a question that is beyond the scope of this research. Chart 4.2 also reveals that the availability of data is not a key issue; rather, the lack of necessary infrastructure (e.g. data warehouses, data dictionaries) and support to create the infrastructure systems (e.g. data governance, senior decision maker support) are seen to be the main barriers. To summarize, IR Departments and offices have a few highly qualified staff with lots of potentially useful data to analyse, but lack a sufficient number of staff or the required data infrastructure to efficiently turn that data into useful information. Page 22 of 71 Chart 4.2: Extent a Barrier to Effective Development and Management of Metrics Chart 4.3 below summarizes data on the top three challenges to performance metric production. Consistent with the above results, insufficient resources is seen as the top challenge. The rest of the responses are spread roughly equally among the remaining barriers, with the exception of a lack of benchmarking data, which was chosen significantly less often. The lack of perceived buy-in among senior decision makers was confirmed in the interviews, and is particularly problematic.21 21 The interview and survey results are inconsistent, however, in an interesting way. In particular, they differ markedly in how insufficient raw data inputs and lack of resources (lack of staff) are ranked: the interviewees ranked “lack of input data” much higher than “lack of resources”, whereas survey respondents ranked “lack of input data” much lower than “lack of resources”. One possible explanation for the incongruity is that the survey was mostly answered by those directly responsible for IR (over 75% were at the director level or lower) while the interviewees were mostly at a more senior level (only 45% were at the director level or lower), and these two levels of administration may, in an uncoordinated fashion, be using different definitions of resources and data inputs. For instance, senior administrators, further away from the information-producing process, may consider a research-ready dataset that is easily extracted from a data warehouse to be part of the raw data inputs or resources, while directors and managers consider the underlying operational raw form as the raw data inputs, and compiling and cleaning this data into data warehouses and useable datasets to be part of the research process itself and thus requiring staff time to process. The statistical analysis supports this latter view. Anticipating the results, additional staff does seem to result in more metric production (the IR director view) but only when the data infrastructure (e.g. data marts, data dictionaries) are not significant barriers (the senior administrator view, when considering data infrastructure as part of the inputs). 
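For concreteness, the staff-by-infrastructure interaction anticipated in the footnote above (and estimated in Section 7) could be specified along the following lines. This is an illustrative sketch only: the variable names, the data file and the abbreviated control set are hypothetical placeholders, and the report's own estimation was carried out in Stata rather than Python.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")  # placeholder: one row per responding institution

model = smf.ols(
    "n_metrics ~ fte_staff * good_infrastructure"      # '*' adds both main effects and the interaction
    " + np.log(headcount) + C(institution_type) + dept_age",
    data=df,
).fit(cov_type="HC1")  # heteroskedasticity-robust standard errors

# The coefficient on fte_staff is the marginal effect of staff when infrastructure is poor;
# fte_staff plus the fte_staff:good_infrastructure term gives the effect when it is adequate.
print(model.summary())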
Chart 4.3: Top Three Challenges to Producing New Metrics

It is interesting to note that, despite issues with perceived buy-in from senior decision makers, much of the demand for initiating metric development originates with senior executives (Chart 4.4 below), with much less pressure coming from mid-level management, faculties or even the board of governors. In Section 7 we examine if buy-in or perceived buy-in from senior leaders matters for the production of performance metrics.

Chart 4.4: Initiate or Influence Development of Performance Metrics

4.3 Data Collection, Analysis and Reporting Technologies and Methodologies

When asked about which data and information sharing tools exist, the large majority of respondents have online survey and statistical software, but only about a third have a data dictionary and less than a quarter have a data governance framework, consistent with the above results. It is possible that data dictionaries and data governance frameworks, key components of data infrastructure, are not particularly important for producing a high-level representation of an institution of the type envisioned by a Balanced Scorecard. In fact, we find that only the use of statistical software is positively correlated with the production of a Balanced Scorecard.22 On the other hand, it could be that institutions misallocate resources, working too early on developing Balanced Scorecards before having in place the good data infrastructure that would support efficient metric production and ultimately provide the main inputs into such a scorecard.

Chart 4.5: Information Tools Existing at Institution

Probing technologies, data and some of the main statistical tools further, in Chart 4.6 we see that IR Departments and offices are typically using tools and data sources that have been around for decades (e.g. flat files, spreadsheets, and student information systems) and only a few are routinely using the latest data-driven approaches, such as those related to big data (e.g. HADOOP, Learning Management System data, text mining). Sophisticated statistical analyses (e.g. measuring value-added of services, complex patterns in time-series data, or multivariate correlations) are used routinely by about a third of institutions. While we expect more institutions to move towards using more recent and sophisticated analytical techniques and tools, and the fact that around a third are using these more sophisticated approaches can be taken as a positive indication, such approaches are nevertheless typically very resource intensive and may be a challenge for all but the largest of institutions. Indeed, where large institutions (or large IR Departments and offices with more than 6 employees) stand out in how they benefit from their size is in their use of enterprise systems: they are two times more likely to have and routinely use these systems (chart not shown).

22 In a simple linear multivariate regression of a dummy variable indicating whether an institution produces a balanced scorecard on dummy variables for the presence or absence of a data dictionary, data governance framework, online survey tools, statistical software and data visualization software, only the coefficient on the statistical software dummy variable was statistically significant and positive.

Chart 4.6: Use of Data Collection and Information Tools

The use of technologies by IR Departments comes with its own set of barriers.
Chart 4.7 shows the main difficulties that hinder the adoption and use of a technology. All the responses cluster around the support and cost of the technology rather than the technology per se. Financial cost, training and IT support are mentioned about twice as often as barriers relative to the technologies. This is fully consistent with the results above that show financial barriers, which translate into insufficient staff, being a key limiting factor. As we see in the next section, it is not as if senior decision makers have ignored requests for more resources, but it may be that what has been given is still far from enough to fully deliver information decision support at the level of what is expected. Chart 4.7: Main Difficulties Using Data Management and Analysis Technologies Finally, we see some validation of these general patterns by looking at a question that asked respondents the extent to which they agreed with a number of different characterizations of their institutions and departments. Chart 4.8 below summarizes the responses and shows some interesting Page 26 of 71 patterns. First, the statement that respondents most agreed with is that they emphasize past performance and the one they least agree with is an emphasis on future performance. Future performance requires past data, but goes a step further and requires the ability to forecast the future using more sophisticated statistical techniques (e.g. time series analysis), broader information gathering efforts about the environment and an ability to incorporate less tangible information about the environment into the forecast or prediction. Respondents also indicated weaker agreement that their institution has a culture of evidence-based decision making. Evidence for scientific enquiry requires the ability to conduct controlled experiments or implementing other techniques to control for multiple factors to isolate their impacts and also the recognition of the need to distinguish between correlation and causation. A culture of such decision making must certainly entail an expectation that hypotheses that are more rigorously supported by empirical evidence and scientific methods are more likely to be true and thus relied upon. The fact that IR Departments are not heavily relying on statistical techniques, measuring value added or comparing relative to benchmarks is consistent with a culture that relies less on evidence and scientific methods as these are common tools in the social sciences for gathering and evaluating evidence. Of course, even when there is a strong culture of evidence-based decision making, conducting experiments, gathering benchmark data, and running complex statistical algorithms on large datasets are expensive and time consuming activities, and thus possibly too slow and costly to provide the information senior administration needs to make current decisions.23 Chart 4.8: Agreement with Statements about Institution and Department 5. What Is Changing? We asked respondents questions about what has changed in the past three to five years or is currently being changed in their Departments and offices. Did their mandate change? Have they been hiring? Have they been producing new metrics and, if so, which ones? 
Have they been putting effort into improving reporting, such as developing dashboards, or addressing some of the gaps identified above, such as a lack of a data dictionary or data governance framework? IR Departments and offices are indeed growing and are working on addressing some of the gaps identified above.

23 The one large difference between colleges and universities in how they responded to the profile questions (results not shown) is that a much larger share of respondents at universities (50%) agreed or strongly agreed with having a culture of evidence-based decision making than at colleges (18%).

5.1 IR Department Mandate and Staff Changes in the Past Five Years

Evidence of growth is shown in Table 5.1 below, which shows changes in the net number of employees in the last three years broken down by whether the Department's core mandate has increased, mostly stayed the same or decreased. Looking at the last column, 88.6% of respondents reported an increase in their core mandate and only 1.3% reported a decrease. Of those with an increased mandate, only 11.4% (second last column) reported a decrease in employees and 56% reported an increase. Compare this with the 8.1% of Departments (last column) with no change in core mandate: only 8.3% of these (second last column) added any new staff. And none of those reporting a decrease in mandate reported new staff. Thus we see that growth in IR is largely tied to an increase in mandate. This is far from a foregone conclusion and has implications for performance metrics. For example, in many areas of an institution, employees are added to meet growth in the size of the institution, such as hiring more professors to teach more students, without any change in mandate (other than to teach more students). Of course, what constitutes a change in mandate is subject to interpretation, but it seems reasonable that these data suggest that new IR staff are not hired, for example, simply to improve the quality of existing performance metrics. Consistent with this perspective, we find that the number of performance metrics produced increases, but not by very much, when new IR staff are added (results are found in Section 7), suggesting that a significant amount of the additional staff time is used to handle new responsibilities. Unfortunately, our data do not allow us to explore the more interesting question of whether IR Departments become spread too thin when they expand or whether they maintain an efficient allocation of resources.

Table 5.1 below also shows how much IR Departments are expanding. Among those with an increase in core mandate (first row), the average increase is two net FTE staff. This is considerable when considering that the average size of an expanding Department is 4.64 FTE, implying a more than doubling of staff over the last three years. And this growth is occurring in an environment of fiscal restraint precipitated by the 2008 financial crisis.24

24 Indeed, post-secondary institutions in the United States generally experienced more and larger budgetary cutbacks over this period than Canada, but when examining the data by country (not shown), there is no convincing evidence that IR department growth in the U.S. was any less than in Canada.

Table 5.1: Department Mandate and Staffing Changes in the Past Five Years

5.2 Areas of Development

In terms of what infrastructure gaps IR Departments have been working on, Chart 5.1 shows that getting information to stakeholders has been given the most attention.
A significant majority (60%) of respondents are working on creating or improving dashboards.25 25 One might think of a pyramid of information reporting, with data collection and processing at the bottom, data infrastructure, such as data warehouses and dictionaries on the second level, reports and then dashboards to summarize and provide analysis on the third level and finally a balanced scorecard report on the top. A balanced scorecard brings many areas of data together to provide an overview of an institution. Perhaps, once dashboards have been mostly completed, senior management will feel overwhelmed with information and then the demand for balanced scorecards at the top of the information pyramid will increase. In the meantime, significant effort to build the middle layers of the pyramid continues. Page 29 of 71 Chart 5.1: Information Tools Currently Being Developed or Significantly Improved Interestingly, if we look at institutions that have added staff (Chart 5.2), the one area that stands out in getting more attention is to create or improve the data dictionary. This makes some sense in that it is both a labour intensive process and, unlike improving data governance or adding tools and dashboards, it is largely the work of those involved directly with producing information from data (e.g. IT and IR professionals) rather than information consumers (e.g. senior decision makers).26 In other words, while senior decision makers might redirect some resources to IR Departments in terms of new staff and funds, senior decision makers’ own time and attention is necessary to produce something like a 26 We cannot make a strong inference here. The highlighted difference shown in Chart 5.2 is only statistically significant at the 10% level so these results do not pass the usual 5% significance test and thus are more likely to have arisen by chance rather than reflecting a real difference. Page 30 of 71 Balanced Scorecard and given very large workloads and demands on these senior administrators, that time and attention may simply not be available. Chart 5.2: Relationship between Changes in IR Staffing Levels and Tools Developed or Significantly Improved 5.3 Metric Areas with the Most Development Activity The final chart that we consider in this section summarizes responses to a series of questions that asked respondents to list metrics that in the past three years were developed, newly produced, modified, consolidated or eliminated. Respondents could list any metrics and we coded them into 10 categories as shown in Chart 5.3 below (the tenth “other” category is not shown).27 The metric areas that show the most activity – Student Success Metrics, Enrolment Management and Financial Metrics – are the same metric categories that we show in Section 6 to be the most important. It is nevertheless revealing how much more activity there is with metrics related to student success compared to other areas. One might expect that since these metrics reflect core functions of any postsecondary institution that they would be already in place, and this view is partially supported by the data as the share of metrics modified in this category relative to all activity is much higher than with Financial Metrics, for example, yet there is also considerable new development and production of new metrics in this area. As we will 27 As very few metrics were eliminated or consolidated, the chart excludes these activities. 
Page 31 of 71 see in Section 6, this is an area that institutions naturally rate as very important, but where quality is below average.28 Chart 5.3: Focus of Changes to Metrics in Past Three Years by Metric Category 6. Performance Metrics The potential number of institutional performance metrics is considerable. A compilation of metrics from Ontario universities showed 882 distinct metrics and related data used and most of these metrics are highly institution specific, used by only one or two institutions.29 Some of the explanation for the plethora of metrics is simply that institutions take slightly different approaches for measuring what is essentially the same underlying institutional characteristic due to the lack of industry standards for performance metrics. There are also metrics that are highly idiosyncratic to an institution, arising from some distinct aspect of its mission or environment (as noted above in Chart 4.8, over 50% of respondents agreed that their metrics reflect a unique aspect of their institution’s mission). However, for this analysis we need some level of standardization to be able to compare institutions and draw inferences. We attempted to strike a balance between being specific enough that respondents would largely understand and agree on what the metric means and keeping the number of metrics for 28 There are clear inherent difficulties of creating valid institutional measures of critical outputs such as the amount of learning or impact on labour market outcomes. Ideal measures require not only data that is difficult to collect but a measure of the counterfactual; what learning or labour market outcomes would have been realized if students did not first go to college or university. And even if such ideal measures could be created, attributing the cause of good or poor results, the presumed reason for creating such metrics in the first place, would be neither easy nor uncontroversial. That despite these problems institutions nevertheless push on trying to create and improve metrics in this area is laudable as the potential benefits for students and graduates are significant. 29 Ken Norrie, Higher Education Quality Council of Ontario, Performance Indicators Workshop for Universities Report. Retrieved from http://www.heqco.ca/SiteCollectionDocuments/PI%20Workshop%20Final%20Report%20EN.pdf Page 32 of 71 respondents to evaluate manageable. To this end, the metrics included in the survey instrument covered a broad range of areas of an institution, from student success to employment outcomes. 6.1 Metric Rankings Table 6.1 below shows a list of each of the 40 metrics, grouped together into ten metric categories.30 We ensured that there were at least two metrics in each metric category. (Many of the individual metrics are probably best understood as categories themselves, each made up of still a finer set of submetrics.) The subtotals represent the maximum score for each metric category and the categories are rank ordered, from highest to lowest score.31 The order of categories makes sense; student success and satisfaction are at the heart of any institution’s mission and all the other activities support these two, with enrolment management being arguable the most fundamental. Within each metric category, there is considerable range in individual metric scores. Take, for instance, Student Success. Graduation rate is given a score of 8.1 compared to Student Alumni Awards which has a score of only 5.5. 
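To make the scoring concrete, the short sketch below computes metric scores from invented quality and importance ratings (each on a 1 to 5 scale and summed per respondent, as described in footnote 31 below); the metric names and values are hypothetical and simply illustrate how an average score such as 8.1 arises.

```python
# Illustrative sketch only: a metric's score is its quality rating (1-5) plus its
# importance rating (1-5), averaged over the respondents who rated the metric.
import pandas as pd

ratings = pd.DataFrame({
    "metric":     ["graduation_rate"] * 3 + ["student_alumni_awards"] * 3,
    "quality":    [4, 5, 3, 2, 3, 3],      # invented ratings
    "importance": [5, 4, 4, 3, 3, 2],
})

ratings["score"] = ratings["quality"] + ratings["importance"]
metric_scores = ratings.groupby("metric")["score"].mean().sort_values(ascending=False)
print(metric_scores)  # graduation_rate 8.33, student_alumni_awards 5.33
```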
Generally speaking the data show that being selective allows the same resources to produce a much better overall measure of an institution. For example, an IR Department that selects the top metric in each category (nine metrics) would produce a much better overall perspective of an institution than one that also produced only nine metrics, but selected multiple metrics from some categories and no metrics from others. Of course, not all metrics require the same amount of resources (time, data, effort, skill) to produce at a given quality level and we did not collect data on the costs of producing a metric as this can be very difficult to measure given that it depends on many conditions (e.g. the state of the student information systems, enterprise reporting systems, definitions) and there can be large number of synergies between the production of metrics (e.g. a good data dictionary can reduce the cost of producing many of the metrics). Nevertheless, the collective wisdom from respondents captured in Table 6.1 may at the very least help to steer senior decision makers away from metrics that are generally not very useful toward those that can deliver more information. 30 To keep the number of metrics manageable for respondents, we initially randomly divided respondents into two groups, A and B, numbered each metric in sequence and then presented respondents in group A with only even number metrics and respondents in group B with odd number metrics. After a preliminary analysis, we noticed that there were insufficient observations for some of the metrics to draw robust inferences and so resurveyed some respondents with the group of metrics that they did not see in the first survey. 31 A metric score is equal to the individual metric quality (where 1 = poor and 5 = excellent) and importance scores (where 1 = very low and 5 = very high) summed together. Thus, a metric rated as both excellent quality (5) and very high importance (5) would have a metric score of 10; a metric rated as poor quality (1) and very low importance (1), would have a metric score of 2. Other types of ways of combining quality and importance were considered in the statistical analysis, including using differential weighting or multiplying a transformation of the quality index with a transformation of the importance metrics. It turned out that the addition of the two indices produced the most useful insights so we use this definition throughout. Page 33 of 71 Table 6.1: Metric Ranking by Metric Score, Grouped by Metric Category Page 34 of 71 In terms of the number of metrics produced by an institution, the average, rated at a “fair” quality level or better, is 28, about three quarters of the total number listed.32 The incremental number of metrics added as the IR Department adds staff is very small: only about 1.4 new metrics are added per staff member so that institutions with just one IR staff member still produce 27 metrics, on average. There are many ways to explain this result. One possible explanation is that for small IR Departments, many of the metrics are produced by or at least with more help from other areas, such as the registrar or the information technology unit. A second possible explanation for the low level of incremental metric output is that new IR staff members are tasked with other activities, such as data analysis for strategic decision making, environmental scans or surveys, and these other activities may only contribute one or two additional metrics. 
Finally, incremental metrics may just be much more difficult to produce and thus require significantly more staff time (increasing marginal cost). 6.2 Four-Quadrant Diagram of Metric Quality and Importance One limitation of the above ranked list of metrics is that it combines metric quality and importance together, thereby masking how quality and importance are related to each other. We can separate out these two dimensions: in Chart 6.1 the vertical axis corresponds to Metric Importance and the horizontal axis corresponds to Metric Quality. Each metric is plotted at the coordinates of its average importance and quality scores based on respondents that produced the metric at a "fair" quality level or better.33 We have also divided the space into four areas or quadrants, where the dividing lines show metric quality and importance at their respective averages across all metrics. We have categorized the four quadrants as follows:
Key Metrics: high importance and high quality
Potentially Distracting Metrics: low importance and high quality
Challenging Key Metrics: high importance and low quality; and,
Low Value Metrics: low importance and low quality.
Clearly, the retention and graduation rate metrics reflect the core purpose of any postsecondary institution and are ranked as having both the highest importance and quality (top right).34 One interesting quadrant is the Challenging Key Metrics, those of lower quality but high importance. Student learning or skill gain and graduate employment outcomes stand out. Their importance is self-evident, but the fact that their quality is relatively low suggests that this is an area needing more attention. 32 The results presented here adjust for the fact that some institutions were only presented with about half the total list of 40 metrics. 33 For metrics of poor quality level it may be that the respondent is not very informed about the potential importance of the metric and, since it may take very little effort to produce a metric at a poor quality level, we generally treat these as metrics that are not produced at all. 34 Certainly, there are some institutions, programs and situations where students who transfer to another institution for more advanced studies before graduating represent a successful outcome, but that does not diminish the importance of retention and graduation metrics, only that more information, such as separating out transfers, is needed to interpret their meaning. Chart 6.1: Four Quadrant Diagram of Metrics 6.3 Metric Areas Ranked Based on Main Use An alternative way to understand metric importance is to look at what metrics are primarily used for. For each of the ten metric categories, we asked respondents to select the primary purpose(s) of that metric category: Strategic, Quality, Describe, Regulate or Prioritize. Chart 6.2 below shows the percentage of times each purpose (the rows) was selected for each metric category (the columns). We have ordered the metric purposes based on an assumed set of internal institutional priorities, putting Strategic purpose ahead of Quality and Quality ahead of Describe, etc. (Prioritize we placed last mostly because it was one of the least commonly selected purposes.) As intuition would suggest, and consistent with Table 6.1 above, Student Success, Enrolment Management, Financials and Student Engagement/Satisfaction are the most important and Faculty Characteristics, Research and Libraries the least, based on this ranking of purposes. Chart 6.2: Main Use of Metric Categories 7.
Statistical Analysis: Linking Inputs and Operating Environment with Metric Output So far we have described variations in direct IR inputs, operating environments and how performance metrics compare with each other. However, we have not shown how IR Departments compare with each other in terms of their metric productivity and have only touched on how metric productivity might relate to inputs and operating environments. Ultimately, understanding how much, if at all, any particular input or environmental condition contributes to producing a more valuable suite of metrics, or what synergies exist between them, should lead to more productive IR Departments. 7.1 Index of Metric Productivity A significant challenge is to define a measure of overall metric system quality that is amenable to statistical analysis. There is no extensive literature that we know of that describes how to construct such a measure for post-secondary institutional performance metrics. Our approach is relatively straightforward, but we need to explain the rationale. In particular, since we are using self-reported measures of metric quality and importance, there is risk of unintentional and unconscious bias entering the data. For example, there may be an unconscious bias toward assessing a metric that is viewed as high quality to be also of high importance (the incongruity between producing a high quality metric of little importance may be unconsciously resisted). While we cannot completely avoid such biases, we take an approach that we believe attenuates the more obvious ones. A simple way to create an index is to multiply metric quality by metric importance and sum these together for each institution. The larger the score, or index value, the more productive the IR Department is measured to be. There are some potential limitations using this approach. The numerical scales used to translate qualitative evaluation of quality and importance are ordinal rather than cardinal scales and thus there is some arbitrariness in combining them and there is no particular reason that the ordinal scales of importance and quality should correspond one-for-one with each other Page 37 of 71 or that respondents would agree on the implicit ordinal scale meanings.35 Also, it could be that a better way to combine importance and quality is to sum them together or use some other functional form. We start by examining differences in metric quality indices between colleges and universities and consider several different possible indices of the quality of the set of metrics produced. The first index (Index 1) uses all responses to calculate a group average importance per metric and multiplies this by the individual institution’s reported quality. Thus importance value is the average importance calculated over all respondents rating the metric. This is done in order to reduce the potential of unwanted correlation between a respondent’s assessment of a metric’s quality and importance. The first row of Table 7.1 shows the average value for Index 1. Universities have a higher overall index value (normalized to a maximum of 100) than colleges and this difference is highly statistically significant. One explanation for the difference is that universities and colleges may differ on which metrics they consider important (colleges are less interested in research metrics, for example,) and thus the difference between college and universities reflect this difference. 
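As a rough sketch of how an index of this kind can be computed (the data, column names and the choice to scale by the maximum attainable value are assumptions for illustration, not the study's own procedure):

```python
# Illustrative Index 1-style calculation with invented data: each metric's
# importance is averaged over all respondents rating it, multiplied by the
# institution's own quality rating, summed by institution, then scaled to 100.
import pandas as pd

ratings = pd.DataFrame({
    "institution": ["A", "A", "B", "B", "B"],
    "metric":      ["graduation_rate", "retention_rate",
                    "graduation_rate", "retention_rate", "publications"],
    "quality":     [4, 3, 5, 4, 2],
    "importance":  [5, 4, 4, 5, 3],
})

# Group-average importance per metric, used instead of the respondent's own
# importance rating to reduce any built-in quality-importance correlation.
group_importance = ratings.groupby("metric")["importance"].mean()

ratings["weighted"] = ratings["quality"] * ratings["metric"].map(group_importance)
index1 = ratings.groupby("institution")["weighted"].sum()

# One plausible normalization: the maximum attainable value (every metric
# produced at quality 5) is set to 100.
index1_normalized = 100 * index1 / (5 * group_importance.sum())
print(index1_normalized)
```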
Index 2 uses an importance value that is based on averages calculated separately for colleges and universities. The difference between colleges and universities persists and is even slightly larger, which is inconsistent with the notion that colleges and universities view the fundamental importance of various metrics differently. Index 3 goes a step further and uses self-reported metric importance and the results are essentially unchanged in that universities continue to have a higher overall average index value. It is interesting that the more institution specific the importance ranking, the lower the overall average score. This may be a sign of a reporting bias as discussed above, but in the opposite direction, whereby a respondent is more critical of the quality of the metrics produced at his or her institution that he or she feels are the most important. However, if there is a reporting bias, it does not seem to be much different between colleges and universities as the differences in the average quality index between the two groups does not change much. (Below, we investigate if this is a real difference between institution types or if other factors correlated with institution type, such as number of IR staff, are driving the result.) Since the index is made up of three components, metric importance, metric quality and number of metrics produced, we can break down the difference between colleges and universities by looking only at one of the components, average metric importance. Index 4 is the per-metric importance-quality index value; while colleges still have a lower average than universities, the difference is no longer statistically significant, suggesting that the reason for the differences seen above is that colleges simply produce fewer metrics. 35 For example, it may be the case that an increase of 1 unit in importance from switching to a new metric is equivalent, in the sense that a director would be indifferent if faced with a choice, to a 2 unit increase in quality rather than to a 1 unit increase. And two different directors may have different views on how they would compare scales and relative scales. Table 7.1: IR Quality Index Index 6 calculates the average per-metric importance value (based on the importance for each metric averaged over respondents) for those metrics that a respondent produces at a quality level of at least "fair". We exclude poor quality metrics to take into account the fact that such metrics may not be much different than a metric that is not produced at all. Colleges have a statistically significant lower average per-metric importance index; since this measure excludes quality, other than requiring a minimum quality level, when combined with the result for Index 5, there is some evidence that colleges produce a set of slightly less important metrics but at a slightly higher self-reported quality level. Finally, Index 7 uses self-reported metric importance, and the difference between colleges and universities is still marginally statistically significant. Thus differing views between institutions on what is and is not important are not driving the difference between colleges and universities. 7.2 Regression Analysis of Metric Production and Importance Ideally we would like to use regression analysis to determine which factors affect the choice of metrics, the quality and the number produced. However, we face several challenges. First, there are two dependent variables (number of metrics and the average metric quality) rather than just one.
Second, there is a real potential for a subjective reporting bias introducing a correlation between quality and importance that we would like to remove. There was some evidence of this above. However, we do not have a good alternative measure of metric quality in the survey data. As an alternative, we restrict the analysis to only look at the number of metrics and the choice of metrics in terms of a group-based importance scale, as was done above. In particular, a metric is defined as being produced for the purposes of this section if its quality is at least "fair". Thus "poor" quality metrics are treated as if they were not produced at all. Decomposing quality into a dichotomous variable – fair or better and poor or not produced – produces the most robust results in terms of statistical significance and direction of effects that are intuitive (e.g. more IR staff is positively correlated with more metrics being produced). We also treat the index as a continuous variable rather than an ordinal variable. Ordered dependent variable regressions might be used profitably on this data, but extensive experimentation with alternative regression functional forms and error structures was beyond the scope of the project. Another reason we limit how much we use the quality responses is a possible omitted variable bias that would introduce a spurious correlation between metric quality and quantity. In particular, for a fixed level of inputs and operating environment, we would expect to see a quality-quantity tradeoff so that higher average quality is associated with fewer metrics and vice versa. However, we do not see this in the data (results not shown). In fact, we observe the opposite (a positive correlation between quality and quantity), and the likely explanation is that the data do not permit us to fully control for all significant factors that affect metric production and quality. For example, the director's innate ability at managing staff, influencing more senior decision makers and creating Department strategy would be expected to have a significant impact on IR productivity, and could positively affect both the quality and quantity of metrics. Since we cannot measure this innate ability without error (director tenure or background are imperfect measures), this is a classic errors-in-variables problem and we would expect it to show up as an induced positive correlation between metric quality and quantity even after controlling for other observables. Given these limitations, the first specification estimates a two-equation simultaneous model of metric count and metric importance. The first equation determines the number of metrics to produce at or above a "fair" quality level and a second equation determines the average importance of the fair-or-better quality metrics. (Below, as a second specification, we consider the selection of the suite of metrics that achieve the average metric importance level.) The main link between the two equations is a rationality assumption: IR Departments agree to some extent on which metrics are important and, given a finite number of metrics and resources, are more likely to choose to produce metrics that are more important, so that as additional metrics are produced, they are necessarily at a lower average importance level. This type of diminishing returns hypothesis is not imposed on the model, but is testable in the model specification.
The two-equation model is specified as follows:

num_metrics = β0 + β1 report_pres + β2 report_provost + β3 ir_staff + β4 ir_staff × data_barriers + β5 dir_social_educ + β6 govt_initiate + β7 new_metrics_produced + β8 lack_support + β9 ir_years_existed + β10 log_head_count + β11 college + β12 canada + β13 group_a + [resurveyed interactions] + ε1

avg_importance = β14 + β15 num_metrics + β16 dir_social_educ + β17 ir_years_existed + β18 college + β19 canada + β20 group_a + [resurveyed interactions] + ε2

where:
num_metrics: the number of metrics the respondent indicated were produced at a quality level of "fair" or better
avg_importance: the average importance value of the metrics included in equation 1, where the importance weight of a metric is the importance value averaged over all respondents producing that metric at the "fair" quality level or higher
report_pres: a dummy variable equal to 1 if the IR unit reports to the President's office and 0 otherwise
report_provost: a dummy variable equal to 1 if the IR unit reports to the Provost or Vice-President Academic and 0 otherwise
ir_staff: the number of full-time equivalent IR staff members with a graduate degree (Masters or Ph.D.)
data_barriers: a dummy variable equal to 1 if the respondent answered ____ to questions ---- and 0 otherwise
dir_social_educ: a dummy variable equal to 1 if the director's background is in social science or education and 0 otherwise
govt_initiate: a dummy variable equal to 1 if the respondent answered ____ to question ---- and 0 otherwise
new_metrics_produced: a dummy variable equal to 1 if the respondent indicated that they had created new metrics in the past three years (question x) and 0 otherwise
ir_years_existed: the number of years the IR Department has existed (0 if unknown, in which case the dummy variable ir_years_existed_miss is included to indicate the missing value)
log_head_count: the natural logarithm of the reported institution enrolment (question X)
college: a dummy variable equal to 1 if the institution is a college and 0 if the institution is a university
canada: a dummy variable equal to 1 if the institution is located in Canada and 0 if it is located in the United States of America
lack_support: a dummy variable equal to 1 if the respondent indicated that support from senior decision makers to build the required data infrastructure was a barrier to a large or very large extent (survey question 16, Table 4.2) and 0 otherwise36
resurveyed interactions: the above variables all interacted with a dummy variable equal to 1 if the respondent responded to the second survey for more information on metrics and 0 otherwise (the coefficients on these interaction effects are not shown in the tables below)

36 We also considered an alternative indicator of support, namely whether a lack of support from academic leaders or executive was chosen as a top three barrier to performance metric production (question 22, Table 4.3). However, in all specifications this variable showed no statistically significant correlation. Moreover, we believe the specificity of question 16, which asks about support for a specific initiative (building data infrastructure), makes it a better gauge of support.

Thus, in this two-equation system, the dependent variable in the first equation (the number of metrics produced, or num_metrics) is included as an explanatory variable in the second to capture the fact that as more metrics are produced, diminishing returns should cause the average metric importance value to decrease. The coefficient on num_metrics in the second equation should be negative if there are diminishing returns.
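To show the mechanics of the system, the sketch below estimates a simplified version of the two equations on synthetic data using a two-stage least-squares approach with numpy; the estimation actually reported below uses three-stage least squares, and all variable names, coefficient values and data here are invented for illustration only.

```python
# Illustrative sketch (synthetic data, invented coefficients): a two-stage
# least-squares treatment of the simultaneous system described above. The
# study's own estimation uses three-stage least squares instead.
import numpy as np

rng = np.random.default_rng(0)
n = 150

# Synthetic stand-ins for a few of the survey variables
ir_staff = rng.poisson(3, n).astype(float)
data_barriers = rng.integers(0, 2, n).astype(float)
dir_social_educ = rng.integers(0, 2, n).astype(float)
college = rng.integers(0, 2, n).astype(float)

# Structural equations with made-up parameter values
num_metrics = (20 + 0.9 * ir_staff - 0.8 * ir_staff * data_barriers
               + 2.0 * dir_social_educ - 1.0 * college + rng.normal(0, 3, n))
avg_importance = 4.0 - 0.005 * num_metrics + 0.05 * college + rng.normal(0, 0.1, n)

def ols(y, X):
    """Ordinary least squares coefficients for y on X (X already has a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

const = np.ones(n)

# Equation 1: number of metrics on its exogenous determinants
X1 = np.column_stack([const, ir_staff, ir_staff * data_barriers,
                      dir_social_educ, college])
b1 = ols(num_metrics, X1)

# Equation 2: replace the endogenous num_metrics with its first-stage fitted
# values (ir_staff and its interaction act as the excluded instruments).
X2 = np.column_stack([const, X1 @ b1, dir_social_educ, college])
b2 = ols(avg_importance, X2)

print("Equation 1 coefficients:", np.round(b1, 3))
print("Equation 2 coefficients:", np.round(b2, 3))
```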
Obviously, the error or disturbance term in the second equation cannot be assumed to be uncorrelated with the number of metrics, and we should also expect that the disturbance terms in each equation will be correlated with each other. We use three-stage least squares to estimate the system.37 37 Stata's reg3 is used to fit the model. Focusing on the first equation, we expect that more staff should result in more metrics produced. However, the existence of data barriers may reduce the number of metrics produced or the ability of staff to create and maintain metrics. We have no particular reason to expect the director's background to matter, but as 53% of directors have a background in social science or education, arguably the closest match to an IR role, we have sufficient variation in the data to examine if director background matters. Governments are important consumers of metrics and the institutional metrics they usually ask for, such as enrolment and graduation rates, are typically metrics that institutions also consider important. Thus we do not expect government demands to increase average metric importance, but they may affect productivity (i.e. the number of metrics produced) since providing metrics to a third party requires additional time and effort to conform to definitional requirements, deal with unusual results, track down reasons for data errors and set up and use methods to communicate and transfer data. As a result, we expect that the more the government is involved, the fewer metrics may be produced. We also include dummy variables to indicate if the IR Department or office reports to the President's Office or reports to the Provost or Vice-President Academic. While it could matter where the IR unit is positioned in the organization, we have no expectations on how. If new metrics have been produced in the last few years, we might expect that these institutions should have more metrics. Of course, it could be that institutions that have a low number of metrics are only catching up (a regression to the mean effect). Thus the sign of the coefficient on this variable is difficult to predict. We expect that more support from senior decision makers should lead to production of more metrics. However, it is possible that directors who are the most ambitious also feel the most constrained by limited resources and limited attention, which, if the case, could counter-intuitively result in lower perceived support being associated with the production of more metrics. We have no way to disentangle these two possibilities, and though they may both exist, we only measure the net effect. Nevertheless, seeing if there is a relationship at all is a starting point. An IR Department or office that is older might be expected to produce more metrics, all other things equal. However, to the extent that department age is positively associated with department size (older departments are, in fact, generally larger), some of the effect of department age on the number of metrics could be captured by the number of IR staff. Age of department or office might also be correlated with a reduction in data infrastructure barriers as data warehouses are built and data dictionaries created.
Still, we might expect some residual effect of department age on the number of metrics that captures intangible improvements, such as learning-by-doing, that are not captured by other explanatory variables.38 We control for a few additional institutional characteristics: institution size (the natural logarithm of head count, or log_head_count), college or university, and Canada or the United States of America. We have no expectations for the direction of association with these variables, but would like to see if there are any remaining effects once other factors have been controlled for. In particular, we saw above that colleges tended to produce fewer metrics than universities, but this could be because colleges generally have smaller IR Departments. We included a dummy variable indicating whether the respondent's questions were from the version A survey or the version B survey, as the set of metrics was different for each version.39 Finally, we include interactions of all variables with a dummy to indicate whether an institution responded to a second request for additional survey data or not. As the conditions for evaluating the second set of performance metrics were different for the second survey (in particular, the other contextual questions were not shown), we want to remove any possible effect on the results. There is no particularly interesting interpretation of the coefficient estimates for these interaction terms, so we suppress them to make the results easier to read. The second equation relates the average importance of the metrics produced to the number of metrics produced, the background of the director, and the age of the Department. The main identifying exclusion (at least one right-hand side variable in the first equation needs to be excluded from the second equation) is the number of staff. We expect that the number of staff is important for determining the number of metrics produced, but to have much less impact on which metrics to produce. That decision would be more a function of senior decision makers' priorities and the mission of the institution. The director's background might have some influence, especially to the degree that the director influences senior decision makers as to which metrics are likely to provide the most value. Support from senior decision makers may or may not matter for the same reasons stated above – namely, this is perceived support and respondents that believe they have less support may actually just be more ambitious and thus feel more constrained. 38 Experience of the key IR administrator might also be thought to be important. We did not ask respondents for the total number of years as director, but we did ask respondents for the number of years at their institution as an administrator of IR and as an employee, which should be good proxies for experience. We find that the key administrator's time at the institution is too closely correlated with the age of the IR department or office to be able to measure a separate effect with the sample size that we have when both are included as explanatory variables. Since age of the department provides a better fit, we include only it as a regressor, but one can think of it as also embodying the director's and staff's experience. 39 It turns out that Group A respondents had lower average importance scores, which is simply an artifact of how the questions were ordered. As the respondents were assigned at random, including this dummy variable is sufficient to remove all the effects of the two different versions from the results.
Finally, country and institution type are included to see if any remaining differences exist between these two important groupings.40 Table 7.2 below shows the coefficient estimates for the two-equation model. Looking at the results for the first equation of the system, the number of metrics increases by about 0.9 per employee (fte_staff_master_phd, coefficient of 0.45 multiplied by 2 as only half the possible metrics were included for each respondent). The average number of metrics is about 28, so an increase of 0.9 metrics per additional employee is relatively small, and would be consistent with additional labour being used to work on non-metric-producing activities and also decreasing returns to metric production. Interestingly, however, the coefficient on the interaction of staff with the data barrier dummy variable (fte_staff_master_phdXdata_barrier) is statistically significant and negative. That is, when poor data infrastructure is seen as a problem, adding additional IR staff does not result in any additional metrics being produced. The fact that data infrastructure and labour are complements in metric production should be of no surprise to anyone working in institutional research, but what is interesting is the large size of the effect. In light of the above discussion that the perceptions of key barriers to metric production were contradictory in the interviews compared to the survey – more senior positions, and thus those more involved with resource allocation, may attribute more of the problems to poor data inputs while a director who is closer to the mechanics of creating and managing metric production and infrastructure points to a lack of resources – we have a potential resolution: senior decision makers are right in the sense that better quality data inputs (or the data infrastructure to improve the quality of raw data inputs) are critical and IR directors are correct in that more metric production requires additional staff. Recognizing that both are right can potentially lead to a more productive discussion between those asking for metrics and those responsible for producing them. There are two other results of particular interest in the first equation. One is that if the government is seen as a main initiator of the production of metrics, the total number of metrics produced is lower (govt_initiate). This is consistent with the hypothesis that reporting metrics to government requires more time and this has an opportunity cost, at least partially realized as fewer institutional metrics being produced. An alternative explanation is that the metrics the government seeks are ones that cost more to produce in terms of staff and data resources, and an institution not required to produce some of these government-driven metrics might choose to produce a larger number of other metrics with the same IR staff and data. The other interesting result is that directors with a social science or education background lead Departments or offices that produce more metrics (director_social_educ). The coefficient is 2.05 (which implies 4.1 additional metrics taking into account that respondents only evaluated half the available metrics), or about 14% more than the average number of metrics, and is highly statistically significant. 40 We also tried including a dummy variable to indicate whether a college was highly selective in its admission policy or had a more open access philosophy. We did not find any differences between these two groups once the other variables were included.
Thus the effect size is quite large. We have controlled for Department age, staffing levels, institution size and institution type, so the explanation cannot be related to directors with this background being disproportionately employed in older IR Departments, universities or larger institutions.41 The data do not tell us why this should be the case. One might speculate that such directors, because of their backgrounds, put relatively more value on a broader range of metrics than a director with a background in business or economics, who might put more emphasis on strategic information that is not part of performance metrics per se. Since we do not measure all the activities of an IR Department, this result cannot be interpreted in terms of overall IR productivity. There are several coefficients that turned out not to be statistically significant that are worth highlighting. A perceived lack of support from senior administrators had no impact on the number of metrics. Not only is the coefficient not statistically significant, the parameter estimate is very small. The area that IR reports to also does not seem to matter for metric production. Institution size (log_head_count) is statistically insignificant, though this is partially due to multicollinearity with IR Department staff levels and, intuitively, the number of staff is more important than institution size, so only the coefficient on staff levels is statistically significant. The age of the Department also does not matter. This could be because age or experience of the IR group drives quality rather than the number of metrics or, as with institution size, any age effects that result in larger IR Departments are already explained by the number of staff. Finally, there are no country differences or differences between colleges and universities. The coefficient on the college dummy is negative, consistent with the results reported above that colleges produce fewer metrics, but the difference is not statistically significant with the other controls included.42 Turning to the second equation, which relates the average importance of all fair-quality or better metrics produced at an institution to the number of metrics produced and other potential factors, we see that the coefficient on the number of metrics produced (num_fairplus_metrics) is negative and highly statistically significant (-0.0053, p-value = 0.009), consistent with the hypothesis of diminishing returns to metric production and that IR Departments tend to prioritize more important metrics first, so that the average metric importance drops as more metrics are produced. (While this is an obvious result, it nonetheless helps validate the model specification, especially given that the importance level is based on the group average rather than being institution specific.) 41 Controlling for the director's years at the institution or years as an administrator was never statistically significant and had little effect on the other coefficients. Thus we exclude these variables from the table, but these unpublished results mostly rule out differences in experience driving the higher productivity of directors with social science or education backgrounds. 42 This is partially due to another multicollinearity problem where the dummy variable admit_type (whether the college is open admission or selective) is correlated with the institution type.
If the admit_type dummy is excluded from the estimation, then the coefficient on the college dummy is marginally statistically significant (results not shown), suggesting colleges may produce slightly fewer metrics from the list of metrics presented in the survey. There are three other noteworthy results. The first is that a key factor associated with higher average importance is the age of the IR Department or office (log_ir_years_existed, 0.25, p-value = 0.005). Since the model controls for staff levels, institution type, and the number of metrics, it appears that age has a separate important effect on which metrics are produced. Department age could capture learning-by-doing or some other process in which metrics are produced, ineffective ones are discarded, and new metrics are considered.43 The second interesting result is that the average metric importance for colleges is higher than for universities (college, 0.0565, equivalent to 7.5% of the mean dependent variable, p-value = 0.004). The reason cannot be that universities produce more metrics and therefore diminishing returns to metric importance are stronger for universities, as the second equation controls for the number of metrics produced. It could be that colleges rank the importance of the metrics they most commonly produce higher than universities do, in which case the result does not reflect a genuine difference in metric selection. We find evidence to support this explanation: if importance weights are based solely on the responses from universities (results not shown), the coefficient on the college dummy in the second equation becomes statistically insignificant while the rest of the statistically significant coefficients retain their signs and statistical significance. Finally, we note that there is some weak evidence that IR Departments or offices that report to the President's Office produce a set of metrics that are less important. It may be that, because the president needs to have a broader overall perspective and influence various constituencies, presidents request that IR Departments produce a broader range of metrics, but at an opportunity cost of not producing some metrics that IR administrators consider more important. This could thus reflect a difference in opinion, so that if we were to solicit presidents for their opinions on which metrics are the most important, the rankings may be different. Alternatively, it could be that having a breadth of metrics has an intrinsic value separate from the importance of the individual metrics and therefore is not captured in how the dependent variable is constructed. This suggests it may be useful to consider other ways to measure the choice of which metrics to produce, as in the four-quadrant diagram, which we turn to next. 43 An alternative explanation is a survivorship bias whereby IR departments that produce more important metrics are more likely to survive for a longer period of time. This seems implausible to us as some institutional metrics are quasi-obligatory, and it would require that a low-performing IR department or office is eliminated altogether for some period of time, and then a new one started in its place some years later, rather than simply replacing the key administrator.

Table 7.2: Two-Equation Regression Model Explaining Number of Metrics Produced (Equation 1) and Average Metric Quality (Equation 2)

Equation 1: Number of Metrics Produced (Number of Obs. = 143, R2 = 0.407)
Variable                               Coef. Est.   Std. Err.   P-value
reports_pres                               .734        1.36       .590
reports_acad                               .385        1.08       .722
lack_support_e                             .143        1.41       .919
director_social_educ                       1.96        .846       .021
govt_initiate_e                           -3.36        1.03       .001
group_a                                    1.22        1.24       .328
admit_type                                 .360        .374       .336
metrics_produced_e                        -.081        .918       .930
log_ir_years_existed                      -.037        .661       .955
fte_staff_master_phd                       .421        .228       .065
fte_staff_master_phd X data_barrier       -.782        .408       .055
log_head_count                             .331        .520       .524
college                                   -2.28        1.58       .148
canada                                    -1.09        .910       .230
constant                                   14.4        5.08       .005

Equation 2: Average Metric Quality (Number of Obs. = 143, R2 = 0.570)
Variable                               Coef. Est.   Std. Err.   P-value
reports_pres                              -.035       .0170       .043
reports_acad                              -.003       .0123       .784
lack_support_e                             .026       .0167       .118
num_lowplus_metrics_by_id                 -.007       .0018       .000
director_social_educ                       .004       .0111       .710
log_ir_years_existed                       .020       .0081       .013
group_a                                   -.131       .0151       .000
college                                    .056       .0178       .002
canada                                    -.014       .0112       .203
constant                                   .838       .0373       .000

7.3 Regression Analysis Applied to the Four-quadrant Diagram As discussed above, the four-quadrant diagram places each of the forty metrics into one of four categories, reproduced here for reference:
Key Metrics: high importance and high quality
Potentially Distracting Metrics: low importance and high quality
Challenging Key Metrics: high importance and low quality; and,
Low Value Metrics: low importance and low quality.
For each institution, we can calculate the number of metrics in each category that the institution produces at a "fair" quality level or better and use this to create quadrant shares that sum to 100 percent by institution. Note that the quadrant a metric belongs to is not institution specific. That is, we use the positions of the metrics in Chart 6.1, which are based on the average quality and average importance values of all respondents in the sample. The institutions' shares in each category then vary between institutions only because of the choice of which metrics are produced (at the fair quality level or better), and not because of the institution's own assessment of metric quality (beyond being at least of fair quality) or importance. The main reason to essentially average out some of the institution-specific information is to reduce potential bias in the relationship of quality and importance that can arise from subjective assessments of these characteristics. Our goal is to examine what observable IR inputs and operating environment variables correlate with the composition of metrics produced. As mentioned in the section above on methodology, we use a regression technique that takes into account that the dependent variables, the shares, sum to 100% and lie between 0 and 1.
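As an illustration of how these quadrant shares can be constructed, the sketch below uses invented ratings; the column names and cut-offs (sample-average quality and importance) are assumptions that mirror the description above rather than the study's actual code.

```python
# Illustrative sketch (assumed column names): assign each metric to a quadrant
# using sample-average importance and quality, then compute each institution's
# quadrant shares among its "fair"-or-better metrics.
import pandas as pd

# Long format: one row per institution-metric pair that was rated (invented data).
ratings = pd.DataFrame({
    "institution": ["A", "A", "A", "B", "B"],
    "metric":      ["graduation_rate", "retention_rate", "library_holdings",
                    "graduation_rate", "library_holdings"],
    "quality":     [4, 3, 2, 5, 4],
    "importance":  [5, 5, 2, 4, 3],
})

# Keep only metrics produced at a "fair" (2) quality level or better.
produced = ratings[ratings["quality"] >= 2].copy()

# A metric's quadrant is based on sample averages, not on any one institution's
# own ratings.
metric_avg = produced.groupby("metric")[["quality", "importance"]].mean()
q_cut, i_cut = metric_avg["quality"].mean(), metric_avg["importance"].mean()

def quadrant(row):
    if row["importance"] >= i_cut:
        return "key" if row["quality"] >= q_cut else "challenging_key"
    return "potentially_distracting" if row["quality"] >= q_cut else "low_value"

metric_avg["quadrant"] = metric_avg.apply(quadrant, axis=1)

# Each institution's shares across quadrants (rows sum to 1).
produced["quadrant"] = produced["metric"].map(metric_avg["quadrant"])
shares = pd.crosstab(produced["institution"], produced["quadrant"], normalize="index")
print(shares)
```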
As the shares sum to 100%, we have three equations (the key metrics share equation is the omitted equation):

distracting_metrics_share = β0 + β1 lack_support + β2 director_social_educ + β3 govt_initiate + β4 selective_admit + β5 metrics_produced + β6 log_years_existed + β7 college + β8 canada + β9 fte_staff + β10 fte_staff × data_barriers + [interactions with a resurveyed dummy variable] + η1

challenging_metrics_share = β11 + β12 lack_support + β13 director_social_educ + β14 govt_initiate + β15 selective_admit + β16 metrics_produced + β17 log_years_existed + β18 college + β19 canada + β20 fte_staff + β21 fte_staff × data_barriers + [interactions with a resurveyed dummy variable] + η2

low_value_metrics_share = β22 + β23 lack_support + β24 director_social_educ + β25 govt_initiate + β26 selective_admit + β27 metrics_produced + β28 log_years_existed + β29 college + β30 canada + β31 fte_staff + β32 fte_staff × data_barriers + [interactions with a resurveyed dummy variable] + η3

key_metric_share = 1 - distracting_metrics_share - challenging_metrics_share - low_value_metrics_share

The coding for the right-hand side variable names is the same as with the previous regression (Section 7.2). (The ηi denote the error terms.) We generally expect that the more full-time staff an IR Department has, the lower the share of metrics produced in the Key Metrics quadrant. This simply reflects a rationality assumption where IR Departments will tend to use limited resources to produce the most valuable metrics first and then shift to other metric categories as more resources become available. The more interesting question is whether additional IR staff is used to increase the share of challenging metrics or whether marginal resources are instead directed toward low value or distracting metrics. We do not have strong hypotheses about the other variables and instead use the regression in an investigative manner. Table 7.3 shows the regression results. Other than the random survey selection, none of the variables explain variation in the share of Low Value Metrics that are produced. However, additional IR staff results in an increased share of Challenging Key Metrics and possibly, though only at marginal statistical significance, an increased share of Potentially Distracting Metrics. Thus, if we equate lower average quality with higher difficulty to produce, then it appears that additional resources are targeted at the production of metrics that are either easy to produce but less important, or hard to produce but more important. This is a natural tradeoff that any IR administrator would have to make. Another key driver of metric category shares is the degree to which government drives metric production (govt_initiate). When government is considered to be more involved in initiating metrics, the share of Key Metrics increases and the shares of Challenging Key Metrics and Potentially Distracting Metrics fall. One interpretation is that governments typically request metrics that are categorized as Key Metrics and this comes at an opportunity cost of fewer other types of metrics produced. To the extent that government involvement reduces the production of Potentially Distracting Metrics, it may improve overall value, but it may also reduce the production of Challenging Key Metrics. As we saw in the above specification, the net effect seems to be slightly negative.
Interestingly, a lack of support from senior leaders to build necessary data infrastructure is associated with a higher share of Challenging Key Metrics, though this is of marginal statistical significance. As mentioned above, it may be that institutions with more ambitious IR administrators, who are working on these areas of metrics, are also the ones that feel most constrained, and thus the measured effect in the model does not indicate a causal relationship. The other two noteworthy statistically significant results are both related to the share of Potentially Distracting Metrics produced. Colleges and Canadian institutions (coefficients for the college and canada dummy variables in the third equation) appear to produce a lower share of these less desirable metrics than other institutions.44 The regressions control for staff size, government involvement, and whether the institution has a highly selective admission policy or not, so it must be some other difference or differences in institution types that explain the differences in the metric share distributions. 44 The discrete estimated effects are that colleges have a 10.0% lower share of Potentially Distracting Metrics and Canadian institutions have a 3.1% lower share of these metrics.

Table 7.3: Three-Equation Regression Explaining Four Quadrant Shares (Effect on Shares Relative to Key Metric Quadrant)

Equation 1: Low Value Metrics Share (Number of Obs. = 143, R2 = 0.407)
Variable                               Coef. Est.   Std. Err.   P-value
lack_support_e                            -.020        .129       0.878
director_social_educ                       .078        .080       0.327
govt_initiate_e                           -.140        .156       0.370
group_a                                    1.33        .133       0.000
selective                                 -.057        .041       0.174
metrics_produced_e                         .092        .086       0.284
log_ir_years_existed                       .034        .034       0.317
college                                   -.172        .158       0.276
canada                                    -.091        .067       0.175
fte_staff_master_phd                       .007        .013       0.580
fte_staff_master_phd X data_barrier        .012        .038       0.755
constant                                  -.584        .243       0.016

Equation 2: Challenging Key Metric Share (Number of Obs. = 143, R2 = 0.570)
Variable                               Coef. Est.   Std. Err.   P-value
lack_support_e                             .353        .187       0.058
director_social_educ                      -.010        .094       0.916
govt_initiate_e                           -.548        .141       0.000
group_a                                    .196        .164       0.233
selective                                  .014        .058       0.806
metrics_produced_e                         .010        .108       0.925
log_ir_years_existed                       .0197       .052       0.706
college                                    .384        .234       0.100
canada                                    -.040        .093       0.667
fte_staff_master_phd                       .038        .015       0.014
fte_staff_master_phd X data_barrier        .082        .050       0.099
constant                                  -.598        .368       0.104

Equation 3: Potentially Distracting Metrics Share (Number of Obs. = 143, R2 = 0.570)
Variable                               Coef. Est.   Std. Err.   P-value
lack_support_e                            -.047        .183       0.797
director_social_educ                       .003        .083       0.974
govt_initiate_e                           -.374        .137       0.006
group_a                                    1.49        .173       0.000
selective                                 -.044        .042       0.303
metrics_produced_e                        -.127        .075       0.092
log_ir_years_existed                      -.053        .039       0.172
college                                   -.507        .223       0.023
canada                                    -.201        .090       0.026
fte_staff_master_phd                       .031        .016       0.050
fte_staff_master_phd X data_barrier        .026        .045       0.566
constant                                   .203        .259       0.432

Note that this is not a linear regression and thus the effect sizes cannot be read directly from the coefficients. The direction (sign) of the effect and measure of statistical significance (p-values), however, can be read directly from the table. 8. Conclusion Despite budget cutbacks in the higher education sector in recent years, IR Departments, offices and related units have been given more resources, especially additional staff, yet a lack of staff is still the number one perceived barrier for IR directors. Current IR staff are generally well educated and the available technologies are considered adequate, so some of this perception of a lack of resources is likely related more to the ambitions of IR directors than to an actual under-resourcing relative to other peer institutions.
Still, the finding that adding a staff member yields only one or two additional performance metrics, and then only when good data infrastructure is in place, speaks to how additional responsibilities typically get added to IR Departments when they grow, as well as to the increasing cost of producing ever more useful institutional metrics. Moreover, even with the added staff, IR Departments still tend to rely on basic survey and analytic technologies, focusing on reporting historical data, though a large minority do use more sophisticated analytical techniques. We suspect that there has been underinvestment in data infrastructure at many institutions over a long period of time owing to the high cost and long delivery period of developing this type of infrastructure; as a result, many IR Departments are not as efficient at metric production as they could be and there is less staff time to do more sophisticated analyses to tackle important but challenging types of postsecondary measurement, like the amount of learning taking place. Thus while there seems to be plenty of scope for moving toward more rigorous research methodologies and supporting a stronger culture of evidence-based decision making, going beyond basic reporting of historical data is still very labour-intensive and IR Departments' appetites for more resources will be hard to satiate. Finally, it is interesting that we have found no real differences between Canada and the United States once other factors have been taken into account. There are differences between colleges (community, mostly diploma granting institutions) and universities (mostly four-year and upper degree granting institutions), with colleges typically having smaller IR Departments than universities of the same size, but there is little evidence of substantial differences in metric productivity between colleges and universities once the number of IR staff has been taken into account. Appendix 1 The survey questionnaire is contained below. The respondents filled in the questionnaire online and thus the presentation format was different, showing questions one at a time and allowing for conditional branching on some questions, depending on a previous answer. Also, respondents were only presented with half the detailed metrics to assess in the first survey. (Some respondents were asked to complete a follow-up survey to assess the metrics that they did not assess in the first survey. The follow-up survey is not included here.) Developing Metrics for Internal Decision Making in Higher Education The focus of the survey is on performance metrics that are related to strategic decision making at your institution. Operational-level metrics that support the day-to-day operations of individual Departments within the institution are not within the scope of this research. Your answers will be used for research purposes only. Your participation in this survey is entirely voluntary. All survey data is being collected confidentially and securely stored by Academica Group. No individual responses will be reported on or in any way linked to personally identifying information. This section asks for a bit of information about your department and staff. 1. How long have you been the key administrator of Institutional Research in this institution? Please enter numeric response only. _______________ 2. What is your academic background (based on highest credential achieved)? Please select all that apply.
Business / Economics
Statistics / Mathematics
Computer Science / Technology
Social Sciences
Humanities
Natural Sciences
Education
Other _______________________

3. How long have you been an employee at your current institution?
Please enter numeric response only. _______________

4. Including yourself, the total number of full-time staff in the Institutional Research department is:
Please enter numeric response only. Please exclude co-op students or intern students. _______________

5. The total number of part-time staff in the Institutional Research department is:
Please enter numeric response only. Please exclude co-op students or intern students. _______________

6. Including yourself, please indicate the highest level of education of IR staff (full- and part-time).
Please exclude co-op students or intern students.
Number of staff with less than a Bachelor's Degree: _________________
Number of staff with a Bachelor's Degree: ________________
Number of staff with a Master's Degree: __________________
Number of staff with a Doctorate Degree: ________________

7. In the past 5 years, how would you describe the change in the department when it comes to its core mandate or responsibilities?
Increased its core mandate or responsibilities
Reduced its core mandate or responsibilities
Did not change its core mandate or responsibilities

8. In the past 5 years, what has been the net change in the total number of employees in the IR department (including full- and part-time staff)? Please exclude co-op students or intern students.
Increased (Branch to Q9)
Decreased (Branch to Q10)
Stayed the same

9. The total number of employees in the IR department increased by…
1 2 3 4 5 6 7 8 9 10

10. The total number of employees in the IR department decreased by…
1 2 3 4 5 6 7 8 9 10

11. Approximately, how many full-time students were enrolled in Fall (September) 2013 at your institution? Please select one response only.
Headcount _________________
Data not readily available
Prefer not to answer

The next section will ask you about institutional performance metrics and related processes.

12. Please indicate the main use or uses for each category of metrics at your institution.
Response options (columns): Describe institution and its characteristics; Regulatory compliance (government, accreditation); Measure progress of strategic plan/direction; Quality Assurance; Prioritize programs for resource allocation; Not applicable (Do not collect / Don't know).
I. Enrolment Management (Applications/conversion, Enrolment counts, Market share, Student demographics, etc.)
II. Student Success (Graduation rate, Retention rate, Student and alumni awards, etc.)
III. Student Engagement and Satisfaction (Students' use of college facilities and services, Student satisfaction, etc.)
IV. Staff Engagement and Demographics (Employee diversity, Employee engagement, Faculty and staff demographics, etc.)
V. Research (Publications, Citations, Inventions, etc.)
VI. Libraries (Holdings, Acquisitions, Expenditures, etc.)
VII. Facilities (Safety, Environmental sustainability, Space utilization, etc.)
VIII. Financial (Budget surplus, Income/net contribution, Endowment, etc.)
IX. Instructional Productivity (Student-to-faculty ratio, Class size, etc.)
X. Faculty Characteristics (Faculty awards, % Faculty with PhD, % Faculty tenure track, etc.)

13. This section has a list of selected metrics from the previous question. The list excludes the metrics that you may have indicated as not applicable to your institution.
Imagine you're asked to provide advice about the relative importance of various metrics to another IR administrator at an institution similar to yours. How would you rate the importance of the following metrics in making strategic decisions?
Scale: Very Low Importance (1); Low importance (2); Moderate (3); High importance (4); Very High Importance (5); Unsure/No opinion.
Enrolment Management: Market share; Student geographic origin
Student Success: Graduation rate; Student and alumni awards; Student learning or skill gain
Student Satisfaction: Student satisfaction; Students' evaluation of courses/program/faculty
Staff Engagement: Employee engagement
Research: Publications; Inventions; Research expenditures
Libraries: Holdings; Expenditures
Facilities: Environmental sustainability; Facilities condition indices
Financial: Income/net contribution; Net assets; Capital investment
Instructional Productivity: Student-to-faculty ratio
Faculty Characteristics: Faculty awards; % Faculty tenure track

14. On a scale of 1-5, how would you rate the degree of success in implementing each of the following metrics at your institution?
Scale: Poor (1); Fair (2); Good (3); Very good (4); Excellent (5); Unsure/No opinion.
Enrolment Management: Market share; Students' geographic origin
Student Success: Graduation rate; Students and alumni awards; Student learning or skill gain
Student Satisfaction: Student satisfaction; Students' evaluation of courses/program/faculty
Staff Engagement: Employee engagement
Research: Publications; Inventions; Research expenditures
Libraries: Holdings; Expenditures
Facilities: Environmental sustainability; Facilities condition indices
Financial: Income/net contribution; Net assets
Instructional Productivity: Student-to-faculty ratio
Faculty Characteristics: Faculty awards; % Faculty tenure track

15. To what extent does each of the following initiate or influence the development of performance metrics at your institution?
Scale: To no extent at all (1); To a very small extent (2); To a moderate extent (3); To a large extent (4); To a very large extent (5); Not applicable; Don't know.
Board of Governors or Senate
State Board (U.S. only)
Provincial government (Canada only)
National government (Australia and New Zealand only)
President
Academic executives (VP-Academic, Provost, Deputy or Vice Provost)
Non-academic executives (Vice President, Associate or Assistant Vice President)
Mid-level non-academic management (managers, directors)
Faculty
Non-teaching staff
Mid-level academic managers (chairs, department heads, associate deans, managers, directors)
Data consortium

15-1. Are there any other groups or individuals who initiate or influence the development of performance metrics at your institution?
Yes
No

15-2. Please list any other group(s) or individual(s):
1. ____________
2. ____________
3. ____________

15-3. Please rate the extent to which the other groups or individuals initiate or influence the development of performance metrics at your institution.
Scale: To no extent at all (1); To a very small extent (2); To a moderate extent (3); To a large extent (4); To a very large extent (5); Not applicable.
Item #1
Item #2
Item #3

16. On a scale of 1 to 5, to what extent do you consider the following to be barriers to the effective development and management of metrics at your institution?
Scale: To no extent at all (1); To a very small extent (2); To a moderate extent (3); To a large extent (4); To a very large extent (5); Unsure/No opinion; Not applicable.
1. Timeliness of data from external databases
2. Timeliness of data from internal databases
3. Inadequate number of staff in institutional research
4. Insufficient skills of existing staff in institutional research in effectively analyzing data
5. Insufficient staff and resources from I.T.
6. Difficulty integrating/combining data from different data sources
7. Data are mainly snapshots from a 'live' information system and are not stored in a static/frozen state to maintain analysis at a fixed reporting period
8. Lack of required operational or raw data from internal databases
9. Lack of required operational or raw data from external databases
10. Lack of reliable data to make accurate measurements
11. Complexity of existing information management systems
12. The level of support from senior decision makers to build the required data infrastructure
13. Lack of a data governance framework with clear responsibilities for data accountability/stewardship
14. Gaps in existing data governance framework
15. Lack of a useful data dictionary (for standardized definitions and business rules)
16. Lack of a useful data warehouse/data marts
17. Low level of trust (internal data leaders' inhibition to release data)

17. Please indicate your level of agreement with each of the following statements.
Scale: Strongly disagree (1); Disagree (2); Neither agree nor disagree (3); Agree (4); Strongly agree (5); Unsure/No opinion.
1. Our institution has a comprehensive strategic performance measurement framework (e.g. Balanced Scorecard, Business Intelligence tool).
2. Our suite of metrics includes specific measures that relate to a highly unique aspect of the overall mission of the institution.
3. Our institution puts emphasis on measures that predict future performance.
4. Our institution puts emphasis on measures of past performance.
5. Our institution has a strong culture of evidence-based decision making.
6. Our department is under-resourced at my institution.
7. Our department has a very collaborative relationship with the Information Technology department.

18. Within the past 3 years, what best describes your key initiatives regarding metrics? Please select all that apply.
We produced a new metric or a new set of metrics for our institution. (Branch to Q19-1)
We consolidated some of our existing metrics. (Branch to Q19-2)
We modified existing metrics used at our institution. (Branch to Q19-3)
We eliminated specific metrics which we were regularly producing in the past three years. (Branch to Q19-4)
We made plans for producing new metrics. (Branch to Q19-5)
There were no changes in the types of metrics that we have been monitoring. (Branch to Q20)

19-1 Metrics Produce
Please list any new metrics that you have produced in the past 3 years. Please be as specific as possible.
1 __________________
2 __________________
3 __________________
4 __________________
5 __________________

19-2 Metrics Modify
Please list any metrics that you have modified in the past 3 years. Please be as specific as possible.
1 __________________
2 __________________
3 __________________
4 __________________
5 __________________

19-3 Metrics Eliminate
Please list any metrics that you have eliminated in the past 3 years. Please be as specific as possible.
1 __________________
2 __________________
3 __________________
4 __________________
5 __________________

19-4 Metrics Consolidate
Please list any existing metrics that you have consolidated in the past 3 years. Please be as specific as possible.
1 __________________
2 __________________
3 __________________
4 __________________
5 __________________

19-5 Metrics Plan
Please list any new metrics that you have planned to produce in the near future (please list up to five).
1 ____________________
2 ____________________
3 ____________________
4 ____________________
5 ____________________
Discussions are on-going (not certain yet at this point about the specific metrics)

20. Which of the following currently exists at your institution? Please select all that apply.
A dashboard
A balanced scorecard
A new business intelligence tool
Data visualization software
Data governance framework
Statistical software
Online survey software
A data dictionary and/or data library
None of the above

21. Which of the following are currently being developed or significantly improved at your institution? Please select all that apply.
Dashboard
Balanced Scorecard
New Business Intelligence tool
Data governance framework
Data dictionary and/or data library
None of the above

22. What would you say are the top three challenges in producing new metrics for your institution? Please select up to three items.
Buy-in from academic leaders
Buy-in from the executive management
The integrity (accuracy, completeness, consistency) of the data to be used in developing metrics
The concern that other institutions are not collecting the new metric that we want to produce, so benchmarking will not be possible
Lack of data needed (internal or external to our institution)
The tendency to select metrics that are easy to quantify instead of those that are harder to develop but are more meaningful
Lack of resources to develop, maintain, or report metrics
Other _______________

23. How often do you use the following in producing and analyzing your metrics?
Scale for all items: Not at all (1); Seldom/rarely (2); Sometimes (3); Usually (4); Always (5); Not applicable.

DATA COLLECTION PLATFORM AND TOOLS
Student Information System
Learning Management System
Other enterprise systems (e.g. HR, financial)
Survey data
External data (e.g. Application Center, Census Bureau, Statistics Canada, IPEDS, uCube (Australia), Education Counts (New Zealand), etc.)

DATA STORAGE
Flat files (e.g. MS Excel, text files)
Relational Databases (e.g. MS Access)
Data warehouse or data marts
Cloud
Hadoop

ANALYTIC/REPORTING TOOLS
Spreadsheets (e.g. Excel)
Statistical Packages (e.g. SAS, SPSS, Stata, STATISTICA)
Content Analysis / Text Mining Packages (e.g. Provalis, SPSS Text Miner, SAS Text Analyser)
Enterprise Business Intelligence (e.g. Cognos, Crystal Reports, Essbase, SQL Server Analysis Services/Excel BI)
Visualization software (e.g. Tableau, Qlikview, Microsoft PowerPivot, SAS JMP)

ANALYTICAL METHODS
Basic reporting/descriptive statistics
Statistical measurement of associations and causal relationships
Value-added approach, i.e. adjustment of inputs to isolate real impacts
Time-series analysis
Comparison of programs within the institution
Comparison of own programs to similar programs outside the institution
Comparison of own institutional performance against peer institutions

24. What are the main difficulties you find with using data management / data analysis technologies? Please select all that apply.
High skills and training requirements for IR staff
Complexity of use of existing technology even for trained users
Financial cost of acquiring and maintaining technology
Limitation in the capabilities of the technology (e.g. automation)
Significant IT support/involvement required for implementation or use of technologies
Overall poor quality (e.g. inflexible, low quality graphics, buggy, lack of technical support)

25. What is the overall average for new first-year students in all programs in academic year 2013 at your institution?
High school average (%) _____
SAT ______
ACT ______
Other, please specify __________
Data not readily available
Prefer not to answer

26. Which statement best describes the type of admission at your institution?
All / almost all programs are open admission
A majority of programs are open admission
There is roughly an equal number of programs with open and selective entry
A majority of programs are selective entry
All / almost all programs are selective entry
Other, please specify _____________

27. How long has the department (Institutional Research) existed at your institution?
Please enter numeric response only. _________________
...or check this box if you are not sure: Don't know

28. The department (Institutional Research) currently reports directly to:
President
Academic executives (VP-Academic, Provost, Deputy or Vice Provost)
Non-academic executives (Vice President, Associate or Assistant Vice President)
Registrar
Student Services
Information Technology
Chief Financial Officer
Other __________________

29. Please provide any comments on metrics (processes, development, reporting, etc.) at your institution that you feel are important but were not included in this survey.

30. Please let us know if you have any questions or concerns about this survey.

Thank you for your participation in this study. A copy of the survey results will be emailed to you. Academica Group programmed this instrument and can be contacted at surveys@academicagroup.com

Appendix 2

This appendix contains the questions that were used to guide the interviews. Interviews followed a conversational style, and some questions may have been skipped if the interviewer thought they had already been addressed in the course of the conversation or were not considered relevant.

Measurement and Processes
What primary metrics do you monitor to track the status of your institutional performance against your institutional mission and goals?
What makes a metric valuable and meaningful to your institution?
If your department has developed (or is currently developing) any new metrics, please describe the process you've undertaken. What were some of the successes and challenges?
Are there any metrics you are not currently using that you think would be particularly valuable to your institution?
What statistical techniques do you use or would recommend for building more valid and useful metrics?

Data Collection and Information Management Systems
Please describe your processes for collecting and managing data that are used to create metrics.
What challenges, if any, have you experienced in using any internal and external databases that you use to produce metrics?
Regarding your internal or external collaborations pertaining to data collection and management, please describe one example you would consider to be particularly successful and another that was not as successful as you would have liked.
Please describe any changes that your institution implemented (in the past 2-3 years) regarding data collection instruments and processes.

Reporting & Dissemination
Please describe your communication/dissemination processes regarding metrics. What key resources are required to support these processes? Are there changes you would like to see regarding these processes?

Program Prioritization
Do you use a program prioritization process? Does the process lead to developing new metrics or enhancing existing ones? Please tell us about the most and least useful metrics that you use for the prioritization process. What are some of the successes and challenges?

External Ranking Systems
What is your opinion about external rankings of your institution or its programs? Have the criteria used for external institutional rankings affected the choice of metrics you collect? If so, in what way? Does your institution undertake (or has it undertaken) any action based on such rankings?