Producing Better Institutional Performance Metrics More Efficiently
Results of a 2014 On-line Survey of Institutional Research Departments, Offices and Related Units
January 23, 2015
Institutional Research
Sheridan College
Ontario, Canada
Financial support for this research was provided by the Ontario Ministry of Training, Colleges and Universities from the Productivity and
Innovation Fund.
Summary
The incredible rate at which post-secondary institutions have been accumulating data over the past decade
should mean that they are now overflowing with useful information to monitor institutional
performance and make strategic decisions. However, since raw data has to go through a complex and
expensive intermediate process to become useful information, simply accumulating more and more
data does not guarantee that more or better information is being produced. In fact, it is the strength
and efficiency of this intermediate process that will have much more bearing on the amount of useful
information available to senior decision makers than the quantity of raw data available.
A main output of this intermediate information-generating process is institutional performance metrics.
These metrics are used to convey strategic- or performance-relevant information in a form that senior
decision-makers can use. The purpose of this study is to help those charged with building and improving
their institutions’ performance metrics system with empirically grounded research based on an
extensive survey of the leaders of Institutional Research (IR) Departments and Offices throughout the
United States and Canada.1
This study was financially supported by the Ontario Ministry of Training, Colleges and Universities as
part of a large Ontario-wide fund known as the Productivity and Innovation Fund (“PIF”), which was
made available to post-secondary institutions in Ontario to foster innovation, improve the quality of
learning and promote greater efficiency. The research, led by Sheridan’s Department of Institutional
Research with research support from Academica Group (a research firm specializing in higher
education), included in-depth telephone interviews of 38 key administrators closely associated with the
production and use of performance metrics in addition to a survey sent to 1,307 IR Departments (192
responses, 152 complete responses) or related units in Canada, the United States, Australia and New
Zealand.2 All respondents, whether they fully completed the survey or not, were promised a copy of the
results of the survey. This report contains those results.
Simplified Model of Metric Production and Net Benefits
The analysis of the survey data directly informs two important choices: 1) which metrics to focus on
because they provide a higher net benefit; and, 2) what mix of resources, environment and data
infrastructure results in higher IR productivity. A simple framework is used to model institutional metric
production.3 The framework places the IR Department at its center, pulling data from databases,
transforming it into metrics and delivering it to senior decision makers and governments. Raw data, mostly from institutional administrative database systems and surveys, is accessed, cleaned and prepared via computer algorithms and manual processing, resulting in an intermediate data input that is further processed into performance metrics. Validation checks are done at various points and the resulting metrics are distributed by some reporting mechanism to senior decision makers. The collective value of the individual metrics produced determines the overall value of the system of performance metrics. Direct costs of producing these metrics are the labour costs of the IR staff to understand business rules, write computer code, design and administer surveys, etc. Creating and maintaining a data warehouse, data marts and data dictionaries (data infrastructure) that can be used for metric production is another possible cost, but it also lowers or eliminates the cost of producing the intermediate data input.

1 For expositional convenience, we refer to the primary unit responsible for performance metric production as the Institutional Research (IR) Department, though that unit can have many different names and institutional metric production in some institutions is quite decentralized and thus not the primary responsibility of any one department. Also for expositional convenience, we refer to the key administrator of the IR Department or related unit, that is, the person who typically completed the survey, as the “director”, even though there are many other possible titles and positions for this role.

2 Of the 1,307 institutions contacted, 152 completed the full survey and an additional 40 completed some of the survey (response rate of 11.6% for completed surveys, 14.7% for completed and partially completed surveys). See Table 2.1 in the main text for additional summary statistics about response rates.

3 We focus on a centralized model of metric production because the survey data best matches this model. That is, the survey targeted one respondent per institution, the person who was the most involved with directing the production of institutional metrics. Institutions with highly decentralized models for producing metrics should interpret the results with this in mind.
Survey Results
While an oversimplification, the survey results map quite well to this model. In particular, the survey
results provide information about the value of individual metric outputs, their collective value and how
changes in the inputs (IR staff, data infrastructure and the operating environment) correlate with the
outputs.
The Value of Metrics
Respondents were presented with up to 40 typical institutional performance metrics; for each metric,
respondents were asked to identify whether the metric is produced, the quality of the metric at their
institution and their view of the metric’s theoretical or potential importance, independent of its current
importance, which may be affected by lower quality at their institution (i.e. what they would tell a new
director of an IR department at another similar type of institution). The most important broad metric
categories are Student Success, Enrolment Management and Student Satisfaction.4 Within each
category, there is considerable variation in the individual metric scores. Thus, ignoring cost
considerations, potential synergies and highly institution-specific missions and goals, carefully choosing
metrics within each broad metric category can potentially yield a much higher overall metric score and a
more comprehensive view of the institution’s performance. See Table 1 below for a detailed list of ranked metrics.

4 An alternative way to rank metric categories is by their primary use; assuming that strategic use or quality assurance are the most important uses, this produces the same top rankings.
Table 1: Metric Ranking by Metric Score, Grouped by Metric Category
Metrics can also be considered along the dimensions of quality and importance. Chart 1 below plots each metric’s quality (horizontal axis) and importance (vertical axis); each metric falls into one of four quadrants:

• Key Metrics: high importance and high quality;
• Potentially Distracting Metrics: low importance and high quality;
• Challenging Key Metrics: high importance and low quality; and
• Low Value Metrics: low importance and low quality.
Retention and graduation rate metrics are closely tied to arguably the main broad purpose of almost any
post-secondary institution and are ranked as having both the highest importance and quality (top right).5
One interesting quadrant is the Challenging Key Metrics (top left), those of lower quality but high
importance. Student learning or skill gain and graduate employment outcomes stand out. Their
importance is self-evident, but the fact that their quality is relatively low suggests that it is difficult or
expensive to create metrics that measure these vital functions and outputs of an institution accurately.
5 Certainly, there are some institutions, programs and situations where students who transfer to another
institution for more advanced studies before graduating represent a successful outcome, but that does not
diminish the importance of retention and graduation metrics, only that more information, such as separating out
transfers, is needed.
Chart 1: Four Quadrant Diagram of Metrics
IR Resources, Characteristics and Operating Environment
Institutions in the sample range from as small as 432 students to as large as 60,000. The largest IR
Department has 15 full-time equivalent (FTE) staff. University IR Departments are about twice as large
as those in colleges. Larger institutions have, on average, larger IR Departments but there are
considerable economies of scale. Chart 3 below plots the IR FTE staff levels (vertical axis) against the
reported institution headcount (horizontal axis, in base 10 logarithm). For every doubling of the
institution size, the IR Department does not also double but adds about 1.1 new staff (about 25% of the
average IR size). Significant variation exists between institutions. Based on eight open-ended comments relevant to institution size (paraphrased in the chart), larger-than-expected departments have responsibilities beyond a traditional IR Department, while smaller-than-expected departments seem to indicate more limited capabilities and possibly some frustration with senior leadership.
Chart 3: Relationship between Institution Size and Number of FTE IR Staff
The levels of education of IR staff are high, with a majority (62%) having a graduate degree (19% hold a
Ph.D). IR key administrators (“directors”) have diverse educational backgrounds, with the largest share
having a background in Education (28.2%), followed by Social Sciences (24.8%) and Business/Economics
(17.4%). The IR Department’s position in the organization reporting structure varies substantially; a
large share of departments (44%) falls under the Vice-President Academic/Provost, while a nearly equal share (45%) is positioned under either the President or a non-academic executive.
Respondents indicated that a lack of IR staff was the largest barrier to improving performance metrics,
though the skills of existing IR staff were not seen as lacking (ranked as the lowest barrier). The
availability of raw data is also not considered a barrier; rather, the lack of necessary infrastructure (e.g.
data warehouses, data dictionaries) and support to create the infrastructure systems (e.g. data
governance, senior decision maker support) are seen as more important problems. IR Departments
typically use tools and data sources that have been around for decades and only a few are routinely
using the latest data-driven approaches, though a large minority use more sophisticated statistical
techniques routinely. Respondents flagged a lack of perceived buy-in among senior decision makers as a
barrier, though much of the impetus for creating metrics comes from senior executives with much less
demand from mid-level management, Faculties or even the Board of Governors. 6
Relationship between Inputs, Operating Environment and Performance Metric
Outputs
To measure the correlation between metric outputs, IR inputs and the operating environment, we
construct an index of metric output and correlate this to reported input levels, operating conditions and
interactions between inputs and operating conditions.
The following results are found:

• For every additional IR employee, if there is a good data infrastructure, 0.9 (3.2%) additional metrics are produced and the share of Challenging Key Metrics is higher.
• With poor data infrastructure, adding more IR staff results in no additional metrics being produced.
• A perceived lack of support from senior administrators had no impact on the number of metrics produced. Counterintuitively, it is weakly associated with a higher share of Challenging Key Metrics. (It may be that more ambitious IR administrators perceive themselves as more constrained by limited resources.)
• Age of the IR Department is not correlated with the number of metrics produced but is strongly positively correlated with the average importance of the metrics produced, independent of other factors, perhaps reflecting a learning-by-doing process.
• The more metrics that are produced, the lower the average metric importance (diminishing returns to metric importance).
• If government is seen as a main initiator of the production of metrics, the total number of metrics produced is lower, the share of Key Metrics is higher and the shares of Challenging Key Metrics and Potentially Distracting Metrics are lower. One interpretation is that governments typically request metrics that are categorized as Key Metrics and this comes at an opportunity cost of fewer other types of metrics being produced.
• Directors with a social science or education background lead departments or offices that produce about 14% more metrics, for reasons unrelated to institution type, size, department size or country.
• There are no differences between countries or institution types in the number of metrics produced once other differences have been taken into account. Average metric importance for colleges is slightly higher than for universities.
• IR Departments that report to the President’s Office produce a mix of metrics that is slightly less important, which may be due to a choice of a greater breadth of measurement at the expense of not producing some metrics generally considered to be more important.

6 The results from the interviews are inconsistent with the survey results in that they differ markedly in how insufficient raw data inputs and lack of resources (lack of staff) are ranked as barriers: the interviewees ranked “lack of input data” much higher than “lack of resources”, whereas survey respondents ranked “lack of input data” much lower than “lack of resources”. One possible explanation for the incongruity is that the survey was mostly answered by those directly responsible for IR (over 75% were at the director level or lower) while the interviewees were mostly at a more senior level (only 45% were at the director level or lower), and these two levels of administration may, in an uncoordinated fashion, be using different definitions of resources and data inputs. For instance, senior administrators, further away from the information-producing process, may consider a research-ready dataset (that generated in Step 1 of the simple model described above) that is easily extracted from a data warehouse to be part of the raw data inputs or resources for IR, while directors and managers consider the underlying operational raw data as the data inputs, and compiling and cleaning this data into data warehouses and useable datasets to be part of the research process itself, thus requiring staff time to process. As explained, the statistical analysis shows that, other things being the same, departments with more IR staff do not produce more metrics when the data infrastructure is inadequate. Thus, in some ways, both definitions could be correct depending on whether there is a good research data infrastructure and where the cost to create and maintain such an infrastructure lies.
The results can be summarized within the structure of the simple model described above. The
production of performance metrics by IR Departments does depend on the number of staff, but staff work within a research data infrastructure, and if that infrastructure is lacking, the ability of staff to produce new metrics will be greatly reduced, possibly to zero.7 If the department is significantly
understaffed relative to a typical size for the same institution type, metric output is also likely to be low
or of poorer quality. IR Departments focus on core metrics (Key Metrics) when resources are limited (or when mandated by government), but as additional resources become available they tend to develop metrics that are important but more difficult to create at a high level of quality (Challenging Key Metrics). As
more metrics are produced, the value of new metrics becomes smaller since directors and their
institutions rationally have focused on producing the most valuable metrics first. Some of the
productivity of an IR Department comes with maturity of the department; the size of the department,
quality of data infrastructure or level of support from senior management are unlikely to fully replace
this maturation process. Furthermore, changing where the IR Department is positioned in the
institution is unlikely to have a significant impact on metric production.
7 It could be that new staff have so much trouble producing additional metrics that they work instead on non-metric-related activities, or they start to work on improving the data infrastructure. In either case, no new metrics are created in the short run.
1. Introduction
In order to build up and enhance its own suite of performance metrics, in the Winter of 2013 Sheridan
College, working with Academica Group (a research firm specializing in higher education), collected
survey data from 192 Institutional Research (IR) Departments, offices and related units8 from publicly
funded post-secondary institutions throughout the United States of America, Canada, Australia and New
Zealand. This study was financially supported by the Ontario Ministry of Training, Colleges and
Universities (“the Ministry”) as part of a large Ontario-wide fund known as the Productivity and
Innovation Fund (“PIF”), which was made available to post-secondary institutions in Ontario to improve
efficiency and foster innovation. The research began in December 2013, led by Sheridan’s Department
of Institutional Research with research support from Academica Group. The background research used
to develop the survey instrument involved a review of the literature and in-depth telephone interviews
of 38 key administrators closely associated with the production and use of performance metrics. The
survey was sent to 1,307 IR Departments or related units in Canada, the United States, Australia and
New Zealand. Of these, 192 responded and 152 fully completed the survey and were included in the
analysis presented in this report.9 All respondents who participated in either an interview or the survey
were promised a copy of the results of the survey. This report contains those results.
1.1 Definitions
Throughout the report, we refer to institutional performance metrics, sometimes calling them
performance metrics or simply metrics. This study does not deal with detailed operational metrics but
instead focuses on the types of metrics that would be of interest to senior decision makers, the Board of
Governors and governments. We also refer to IR Departments and offices even though at some
institutions included in our sample there is no department or office referred to as Institutional Research.
Moreover, even where an institution has an IR Department or office some of the important metrics may
be produced by other units. Thus when we refer to IR Departments or offices it should be taken within
context to mean the unit or units that are primarily responsible for the production of performance
metrics.
1.2 Organization of Report
The rest of the report is organized as follows. Section 2 briefly describes the survey methodology.
Sections 3 through 5 present summaries of key parts of the survey. Section 3 deals with direct inputs in
IR Departments and offices. Section 4 examines the operating environment and barriers, and Section 5
documents the recent changes in terms of the development and production of institutional metrics.
Section 6 examines the institutional performance metrics themselves. Readers mostly or only interested
in metric ranking and categorization may want to skip to this section. Section 7 pulls the other sections
together by using multiple equation regression analysis to estimate associations of importance-adjusted
metric production and the choice of metric production with potential drivers, including IR staff resources, the operating environment and institutional characteristics. Section 8 concludes.

8 For expositional convenience, we refer to all these units simply as IR Departments in this report.

9 Of the 1,307 institutions contacted, 152 completed the full survey and an additional 40 completed some of the survey (response rate of 11.6% for completed surveys, 14.7% for completed and partially completed surveys). See Table 2.1 in the main text for additional summary statistics about response rates.
2. Methodology
The main data source for this study is a survey of the key administrators of Institutional Research
Departments and related units. In order to construct the survey instrument, a set of telephone
interviews of key administrators10 of post-secondary IR departments was done. A total of 150 colleges
and universities were invited to participate in a telephone survey. Thirty-eight of these completed the
interview, which lasted from 18 to 56 minutes, with an average of 38 minutes (see Appendix 2 for the
interview questions).
All interviews were done by telephone and conducted by the lead researcher at Academica guided by a
fixed list of interview questions, but using a conversational style. Each interview was recorded and
transcribed by a third party and then anonymized by the researcher at Academica. Interviewees were
asked for permission at the beginning of the interview to have the interview recorded and transcribed,
and were assured that all identifying information would be removed from transcripts to maintain
anonymity and confidentiality. Sheridan’s lead IR researcher was provided with these anonymized
interview transcripts as well as a summary of the main findings prepared by the Academica researcher.
The information from the interviews was used to help develop most of the main survey sections. The
survey instrument itself asked respondents for information about the department or office (e.g.
resources, staff composition, position in the reporting structure, history), the institution (e.g. size,
degree of admission selectivity), and performance metrics (e.g. opinions about the importance and
quality, recent developments, barriers to production, initiators). There were also two open ended
questions allowing respondents to provide any other information to help with understanding the
context of some of the answers and to provide feedback about the survey itself. See Appendix 1 for the
full survey instrument.
Numerous questions in the survey instrument ask the respondent for his or her opinion or evaluation.
Likert scales were used for many of these questions. For a few of the questions involving at least some subjective judgment, the survey asked the respondent for related judgements in multiple non-consecutive questions using different prompts. This allows for some testing of the internal validity of the
instrument, but for the most part, the length of the survey precluded a design that would allow
extensive internal validity testing of individual questions. Notwithstanding this, the strength and
direction of statistical association based on variation in responses across respondents provides a strong
test of the validity of the key results.
In order to conduct a robust statistical analysis, it was important to have a large enough sample. One
objective of the study was to examine potential differences between various subgroups of institutions.
This further increased the need to have a large sample in order to ensure there were enough observations in each relevant subgroup. The main grouping variables used were country, institution type (college or university)11, institution size and open versus selective admission policy. As the subsequent results show, the final useable sample of 152 observations was indeed large enough to be able to draw meaningful conclusions from the statistical analysis.

10 This includes Provosts, Associate Provosts, Vice-Presidents, Associate Vice-Presidents, Directors, and Managers. For the survey, the person who appeared to be the most directly responsible for leading the IR department, IR office or closely related unit was contacted.
To build a contact list of potential respondents, a list of all publicly funded post-secondary institutions in
Canada, the United States, Australia and New Zealand was created. From this list, we categorized
institutions by country, institution type, and size (small, medium and large).12 For Canada (excluding
Quebec), Australia and New Zealand, all post-secondary public institutions in the population for which
contact information could be obtained were included. For the United States, where the cost of collecting contact information for all public post-secondary institutions was higher, 1,500 institutions with a minimum of 1,000 students enrolled were selected at random; from these, contact information for 1,181 (79%) was located. The contact person was initially notified by email in early April
that there would be a request to participate in the survey and was also provided information about the
survey’s purpose. The contact was then emailed a link to the survey about one to two weeks later. Up
to five reminder emails were sent during the second half of April. A second brief follow up survey was
sent to all respondents in the United States in order to collect additional information on the importance
and quality of specific metrics. The initial survey took longer to complete than expected, with
respondents taking an average of 40 minutes.
Of the 1,307 institutions contacted, 152 completed the full survey and an additional 40 completed
some of the survey (response rate of 11.6% for completed surveys, 14.7% for completed and partially
completed surveys). Table 2.1 below shows the breakdown of responses by country, size and institution
type. Three quarters of the complete surveys (115) were from American institutions while only three
were from Australia and New Zealand. Given the very small number of respondents for Australia and
New Zealand, those responses are excluded from the analysis.13 Canadian institutions were somewhat
more likely to complete the full survey than American institutions (91.9% versus 77.2%, the difference
being statistically significant at the 5% level).
11 There are country differences in what is defined as a college and a university. For the purposes of this report, for the United States we refer to community colleges and two-year colleges as colleges and four-year and longer institutions as universities; for Canada, we take the college and university designation from contact list sources that are separate for colleges and universities.

12 For American institutions, the standard Integrated Postsecondary Education Data System (IPEDS) classification was used: 0 - 4,999 students enrolled (small); 5,000 - 9,999 (medium); 10,000 and above (large). For all other respondents we used the Ontario Ministry of Training, Colleges and Universities standard categories: 0 – 3,000 (small); 3,000 – 8,000 (medium); 8,000 and above (large).

13 However, all the Australian and New Zealand institutions that participated were also sent a copy of this research, and we believe that most of the results will be of direct relevance to these institutions as well.
Table 2.1 Number of Respondents by Country and Completion Status
2.1 Methodology for the Survey Analysis
A number of different methodologies were used to analyse the survey data. The analyses for Sections 3,
4 and 5 employ only basic summary statistics and some graphical representation of relationships
between variables. These approaches alone are sufficient to provide meaningful insights. However, more complex inferences are only possible using more advanced statistical techniques that go beyond basic multivariate regression analysis; these are taken up in Sections 6 and 7. Since this paper’s focus is on presenting the results and interpretations, the statistical techniques used are not discussed in detail.
The ranking and categorization of performance metrics is the subject of Section 6. In order to rank
metrics relative to each other, the numeric values of importance and quality, as defined using five-point
Likert scales, are summed together across all respondents to create an overall quality-importance value
for that metric, thus treating the ordinal importance and quality scales as if they were equal cardinal
scales. The shortcomings of this approach are discussed in that section as well as checks of the
robustness of the results to alternative specifications.
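To make this scoring rule concrete, the sketch below (illustrative only; the data and column names are hypothetical, not the survey's actual variables) sums the five-point importance and quality ratings across respondents to produce an overall score per metric:

```python
import pandas as pd

# Hypothetical long-format extract: one row per respondent-metric pair,
# with five-point Likert ratings (1 = lowest, 5 = highest).
ratings = pd.DataFrame({
    "metric":     ["Retention rate", "Retention rate", "Graduate employment", "Graduate employment"],
    "importance": [5, 4, 5, 5],
    "quality":    [5, 5, 2, 3],
})

# Overall quality-importance score: sum importance and quality across all respondents,
# treating the ordinal Likert scales as if they were cardinal (as described above).
scores = (
    ratings
    .assign(score=ratings["importance"] + ratings["quality"])
    .groupby("metric")["score"]
    .sum()
    .sort_values(ascending=False)
)
print(scores)
```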
The more complex analysis involving multivariate and multi-equation statistical techniques is taken up in
Section 7. In this section, the focus is not on the metrics but on the factors that cause the set of metrics
produced by IR Departments to differ across institutions. We initially constructed an index of IR metric
productivity by again summing together quality and importance of all metrics the respondent rated.
Some limitations were found with this approach. In particular, we found some evidence of a possible
rater bias that induced a potentially spurious correlation between metric quality and importance. In
order to mitigate the potential bias a different approach was used that involved two steps. The first
step was to construct an average metric importance for each metric by averaging across all respondents
that rated the importance of that metric. The second step was, for each respondent, to add together
the average metric importance value computed in the first step for only those metrics that the
respondent indicated were produced at his or her institution at a “fair” quality level or better. Thus this
approach converts the metric quality to a dichotomous scale (fair-and-above or poor-or-not-produced)
and uses average importance values to attenuate any potential unwanted, and unconscious, respondent
bias in reported metric quality and importance.
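A minimal sketch of this two-step construction, using made-up ratings and illustrative variable names rather than the actual survey data, might look as follows:

```python
import pandas as pd

# Hypothetical long-format data: one row per respondent-metric pair.
# 'quality' uses the five-point scale, with 3 corresponding to "fair";
# metrics not produced would simply be absent or rated below 3.
ratings = pd.DataFrame({
    "respondent": [1, 1, 2, 2],
    "metric":     ["Retention rate", "Skill gain", "Retention rate", "Skill gain"],
    "importance": [5, 5, 4, 5],
    "quality":    [5, 2, 4, 3],
})

# Step 1: average importance of each metric across all respondents who rated it.
avg_importance = ratings.groupby("metric")["importance"].mean()

# Step 2: for each respondent, sum the Step-1 averages over the metrics produced
# at "fair" quality or better (dichotomizing the quality scale).
produced = ratings[ratings["quality"] >= 3].copy()
produced["avg_importance"] = produced["metric"].map(avg_importance)

productivity = produced.groupby("respondent").agg(
    n_metrics=("metric", "count"),                   # count of fair-or-better metrics
    importance_weighted=("avg_importance", "sum"),   # importance-weighted index
)
productivity["avg_metric_importance"] = (
    productivity["importance_weighted"] / productivity["n_metrics"]
)
print(productivity)
```

The last two columns correspond to the two components into which the index is decomposed below: the number of fair-and-above quality metrics and their average importance value.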
The resulting IR productivity index is decomposed into its two constituent components: 1) the total
number of minimum quality metrics produced, and 2) the group-based importance-weighted value of
those metrics. Both components are correlated to various IR inputs (e.g. staff levels, director
educational background) and characteristics of the operating environment using a two-equation system.
Specifically, the first equation in this system relates the number of fair-and-above quality metrics with
inputs and operating environment, and the second equation relates the average importance value of
these fair-and-above quality metrics to a different but overlapping set of explanatory variables. The
number of fair-and-above metrics is included in the second equation to capture diminishing returns in
the importance-value of metrics; if institutions build the set of metrics rationally, starting with more
important metrics and then less and less important metrics, the data should show that the more metrics
that are produced, the lower should be the importance of the last metric produced. (We do not impose
a diminishing returns assumption on the specification but allow it to be estimated freely.)14
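As a rough illustration of this two-equation structure, the sketch below fits each equation separately by ordinary least squares on synthetic data (all variable names and values are hypothetical). The actual analysis treats the metric count in the second equation as endogenous and uses a three-stage simultaneous-equation estimator (reg3 in Stata), as noted in footnote 14, so this is a simplification of the estimation, not a reproduction of it:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic institution-level data standing in for the survey variables (hypothetical names).
rng = np.random.default_rng(0)
n = 150
df = pd.DataFrame({
    "ir_fte": rng.integers(1, 15, n),
    "good_infrastructure": rng.integers(0, 2, n),
    "govt_initiator": rng.integers(0, 2, n),
    "dept_age": rng.integers(1, 40, n),
    "college": rng.integers(0, 2, n),
})
df["n_metrics"] = 10 + 0.9 * df["ir_fte"] * df["good_infrastructure"] + rng.normal(0, 3, n)
df["avg_metric_importance"] = 4.5 - 0.02 * df["n_metrics"] + 0.01 * df["dept_age"] + rng.normal(0, 0.2, n)

# Equation 1: count of fair-and-above metrics on inputs and operating environment,
# including the staff-by-infrastructure interaction discussed in the text.
eq1 = smf.ols("n_metrics ~ ir_fte * good_infrastructure + govt_initiator + dept_age", data=df).fit()

# Equation 2: average importance of those metrics; the metric count is included so that
# diminishing returns, if present, show up as a negative coefficient on n_metrics.
eq2 = smf.ols("avg_metric_importance ~ n_metrics + dept_age + college", data=df).fit()

print(eq1.params)
print(eq2.params)
```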
The final analysis returns to the four-quadrant diagram introduced in Section 6 and places it in a multi-equation setting. For each respondent, we first calculate the share of metrics that fall into each of the four quadrants described in Section 6. We then examine how various factors influence these shares, including staffing levels, the role of government, barriers and institutional characteristics. (Since shares lie between zero and one and sum to 100%, the assumptions on the error term required by ordinary least squares regression are not valid. We instead use the quasi-maximum likelihood fractional multinomial logit model proposed by Papke and Wooldridge (1996)15 and implemented for Stata by Buis (fmlogit).16)
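The dependent variables for this final analysis, the per-respondent quadrant shares, can be sketched as follows (hypothetical ratings; the scale midpoint is assumed here as the high/low cut-off, which may differ from the classification actually used in Section 6):

```python
import pandas as pd

# Hypothetical respondent-metric ratings (column names illustrative).
ratings = pd.DataFrame({
    "respondent": [1, 1, 1, 2, 2, 2],
    "metric":     ["Retention", "Skill gain", "Alumni giving"] * 2,
    "importance": [5, 5, 2, 4, 5, 2],
    "quality":    [5, 2, 4, 4, 3, 4],
})

# Assign each rating to one of the four quadrants using an assumed midpoint cut-off.
hi_imp = ratings["importance"] > 3
hi_qual = ratings["quality"] > 3
ratings["quadrant"] = "Low Value"
ratings.loc[hi_imp & hi_qual, "quadrant"] = "Key"
ratings.loc[hi_imp & ~hi_qual, "quadrant"] = "Challenging Key"
ratings.loc[~hi_imp & hi_qual, "quadrant"] = "Potentially Distracting"

# Per-respondent shares of metrics in each quadrant; they lie in [0, 1] and sum to one,
# which is why a fractional multinomial logit is used rather than ordinary least squares.
shares = pd.crosstab(ratings["respondent"], ratings["quadrant"], normalize="index")
print(shares)
```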
3. IR Departments, Offices and Their Resources
This section discusses the characteristics of the institutions, the IR Departments and resource levels.
3.1 Institution and IR Department Size
The variation in the institutions included in the analysis is large. Institution size ranges from as small as
432 students to as large as 60,000 students, as shown in Table 3.1 below (based on reported full-time students enrolled in the fall term, September 2013). The median enrolment is smaller than the mean,
indicating that the distribution is skewed toward smaller institutions, which is not surprising. This skew
is also apparent in the institution size histograms shown in Chart 3.1. There are both small and large
institutions represented for each country/institution type combination. The large variation in institution
size, as we will see shortly, is very useful for understanding the degree to which institution size drives IR
Department size and, in turn, the number of metrics produced.
14 Since this approach includes the dependent variable of the first equation as an endogenous right-hand-side variable in the second, we use a three-stage simultaneous equation technique (reg3 in Stata), as the assumptions for ordinary least squares in this specification are not valid.

15 Papke, L. E. and Wooldridge, J. M. (1996). Econometric Methods for Fractional Response Variables with an Application to 401(k) Plan Participation Rates. Journal of Applied Econometrics, 11(6):619-632.

16 http://maartenbuis.nl/software/fmlogit.html , Maarten L. Buis, accessed September 3, 2014.
Table 3.1: Institution Size by Country and Institution Type
Chart 3.1: Institution Size Distribution by Country and Institution Type
There is also considerable range in the size of IR Departments (see Table 3.2 below). The size of the IR
Departments varies from zero full-time equivalent (FTE) staff to 15.17 University IR Departments are
about twice as large as those of colleges, with American two-year/community colleges having the
smallest IR Departments, on average. The difference in average size between Canada and the United States is explained by differences in average institution size and type in this sample.18
Table 3.2: Number of Staff in Institutional Research Department (and Related Units) by Institution
Type and Country
We expect that larger institutions will on average have larger IR Departments. However, because much
of IR data preparation and report generation is based on scalable (and extensible) computer routines,
significant economies of scale in metric production and other IR activities should exist; thus, doubling
the size of an institution should result in the IR department expanding by less than double. We can see
this clearly in Chart 3.2 below, which plots the IR FTE staff levels against the reported institution
headcount (in base 10 logarithm). As the slope of the regression line through the data shows, for every
doubling of the institution size, about 1.1 new IR staff are added.
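The conversion from the slope of that regression line to staff added per doubling can be illustrated with a short sketch on synthetic data (the numbers below are made up to roughly mimic the reported relationship; only the arithmetic of the conversion is the point):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data standing in for reported headcount and IR FTE staff levels.
rng = np.random.default_rng(1)
headcount = rng.integers(500, 60000, 150)
ir_fte = 1.0 + 3.6 * np.log10(headcount / 1000) + rng.normal(0, 1.5, 150)
df = pd.DataFrame({"headcount": headcount, "ir_fte": ir_fte})

# Regress IR FTE staff on the base-10 logarithm of headcount, as in Chart 3.2.
fit = smf.ols("ir_fte ~ np.log10(headcount)", data=df).fit()

# The slope is staff added per tenfold increase in enrolment; multiplying by log10(2)
# converts it to staff added per doubling of institution size (about 1.1 in the survey data).
per_doubling = fit.params["np.log10(headcount)"] * np.log10(2)
print(round(per_doubling, 2))
```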
Chart 3.2 also shows the underlying data points, and the significant variation between institutions is
easy to see. Eight of the respondents provided open-ended comments that are relevant to institution
size, and are paraphrased in the chart (comments are located near the corresponding institution data
point but are not indicated precisely to protect confidentiality). Some respondents’ comments indicate
that they feel that their departments had responsibilities beyond a traditional IR Department and this
explained why they were larger than would be expected. These comments were all located above the
regression line, confirming that these departments or offices are indeed larger than what would be
expected. One of the larger IR offices of a large institution indicated that they were using in-house
reporting tools, though preparing to move to an outside vendor, which we speculate may explain the additional staff needed to maintain these reporting tools. On the other hand, departments that are smaller than average for their institution size made comments that seem to indicate more limited abilities and possibly some frustration with senior leadership.

17 The number of full-time equivalent staff was calculated as the number of reported full-time staff plus 0.5 multiplied by the reported number of part-time staff.

18 Based on an unreported ordinary least squares regression of the number of FTE staff on the natural logarithm of institution headcount, a college dummy variable (equal to 1 for a college and 0 for a university) and a country dummy variable (equal to 1 for Canada and 0 for the United States of America), the coefficient multiplying the country dummy variable was found to be statistically insignificant.
Chart 3.2: Relationship between Institution Size and Number of FTE IR Staff
3.2 IR Department Characteristics
The length of time an IR Department has existed varies considerably, with many departments being
quite young while others have been established for decades. Chart 3.3 below shows the distribution of
the age of the department (responses were missing for 40 institutions). The age of the department turns out to be closely correlated with a measure of productivity that is examined in Section 7.
Chart 3.3: Distribution in the Number of Years the IR Department Has Existed
In addition to institution size, the size of the IR Department is related to institution type (e.g. college
versus university) and the age of the department. These relationships are shown in Chart 3.4 below. The
scatter plots are similar to Chart 3.2 above, with each graph showing headcount (base 10 logarithmic
scale) on the horizontal axis and IR FTE staff counts on the vertical axis. The left-hand graph shows the
data for colleges and the right-hand for universities, while within each graph the orange markers (and
orange trend lines) correspond to IR Departments that have existed for 10 or more years and the blue
markers (and blue trend lines) correspond to IR departments that have existed for less than 10 years.
Readily apparent is the difference in the slopes of the lines between colleges and universities with the
university trend line being much steeper, suggesting that university IR Departments and offices vary
more with institution size than those in colleges. It is also evident that there is a significant difference
between older and younger IR Departments (differences in the slopes of the orange and blue lines) for
universities. Older IR Departments in universities are on average larger than their younger counterparts
at similarly sized universities. This may be evidence of a ratcheting effect whereby as IR Departments
develop more metrics and take on additional responsibilities, existing responsibilities remain,
necessitating department growth. However, it is not obvious why this would not also apply to colleges,
yet age of the IR Department or office does not seem to matter for colleges.19
19 The difference in slopes between older and younger IR Departments among universities is only marginally statistically significant, so it may be that there is no actual difference even among universities and the results are simply due to this particular sample. On the other hand, the difference in the slope of the trend lines between colleges and universities is highly statistically significant, so it is very unlikely that the college-university difference in slopes is due to chance. Also, the results of the regression analysis presented in Section 7 show that, independent of its size, the department’s age is associated with the choice of which metrics to produce, with older departments and offices tending to produce metrics considered collectively as more important.
Chart 3.4: Relationship between Institution Size and Number of FTE IR Staff, By Institution Type and
Age of the IR Department
The levels of education of IR staff vary quite a bit, reflecting the wide range of skills needed in an IR
Department or office. A majority (62%) of IR staff have a graduate degree (19% hold a Ph.D.), and as
Table 3.3 below shows, this does not vary by country or institution type (the differences in the
distributions are not statistically significant).
Table 3.3: Distribution of IR Staff Credentials by Country and Institution Type
IR key administrators (“directors”) have diverse educational backgrounds, with the largest share having
a background in Education (28.2%), followed by Social Sciences (24.8%) and Business/Economics
(17.4%). Other educational backgrounds, such as Engineering, Computer Science and Public Administration, are also represented. Chart 3.5 shows that there are some country differences, with IR administrators
of Canadian institutions more often having educational backgrounds in business or economics compared
to their American-based colleagues who are more likely to have backgrounds in education. In terms of
institution type, there are no statistically significant differences between colleges and universities in
terms of the director’s background. Does the director’s background matter to the production of
metrics? In the later statistical analysis section we present evidence that it does somewhat, with those having an Education or Social Science background producing more metrics with a given number of IR staff. This could reflect a difference in how a director’s background influences the tradeoffs made in allocating staff to metric-related and non-metric-related activities, or possibly how varying institution needs influence the hiring criteria for the IR director.
Chart 3.5: Distribution of Educational Background of the IR Key Administrator by Area of Study and
Country
We asked respondents how long they had been at the institution and how long they had been in the role as the key administrator of Institutional Research. From these responses we infer that those who were at
the institution longer than in their current role were internal hires (or transfers) and the others were
external hires. By this definition, roughly equal shares were hired externally (53.5%) versus internally
(46.5%). Internal hires occur after about eight to ten years working at the institution, even for younger
IR departments (see Chart 3.6) and the number of years of experience in the IR leadership role is quite
similar between internal and external hires. Despite these similarities, directors’ total years of
experience in the institution varies considerably by institution (standard deviation of 5.67 years), even
when normalized by the number of years the IR Department or office has existed.20
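A minimal sketch of this classification rule, using made-up tenure figures and illustrative column names:

```python
import pandas as pd

# Hypothetical director tenure data, in years.
directors = pd.DataFrame({
    "years_at_institution": [12, 4, 9],
    "years_in_ir_role":     [3, 4, 9],
})

# As described above: directors who have been at the institution longer than they have held
# the IR leadership role are inferred to be internal hires (or transfers); the rest external.
directors["hire_type"] = (
    directors["years_at_institution"] > directors["years_in_ir_role"]
).map({True: "internal", False: "external"})
print(directors)
```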
Chart 3.6: Relationship between Key Administrator’s Previous Role, Tenure at Institution and IR
Department’s Age
4. Operating Environment
4.1 Reporting Position of the IR Department
IR Departments and offices, which typically have a wide institutional mandate, have a position in the
organization reporting structure that varies substantially. While the largest share (44%) falls under the
Vice-President Academic/Provost, while a nearly equal share (45%) is positioned under either the President or a non-academic executive (see Chart 4.1). One interesting question is whether the organizational position of IR matters to the production of metrics. While many arguments can be advanced for why it should or should not matter, it seems that this question is best addressed empirically. We will return to this question later.

20 The number of years the IR Department or office has existed explains less than 7% of the variation in tenure time as the key IR administrator in a regression of the latter on the former (regression results not shown).
Chart 4.1: IR Department’s Position in the Reporting Structure
4.2 Barriers to Metric Production
Of course, there are many other factors that can affect an IR Department’s production of performance
metrics. While the survey is not able to capture all of these, it does provide a number of measures of
potentially relevant operating conditions, ranging from the availability of technical tools, to data
resources, to the support of senior leadership. Chart 4.2 shows a ranking of potential barriers based on
the number of respondents that thought the item was a barrier to a large or very large extent. A lack of
IR staff was considered the largest barrier to improving performance metrics, though the skills of
existing IR staff were not seen as lacking (ranked as the lowest barrier). It is possible that IR
Departments have generally been staffed with highly qualified staff (as evidenced by the large share
with graduate level credentials) but that this is also expensive and thus limits the number of total staff
that could be employed. Whether an average IR Department or office would be better off moving to a
greater balance between the level of training and the number of staff is a question that is beyond the
scope of this research. Chart 4.2 also reveals that the availability of data is not a key issue; rather, the
lack of necessary infrastructure (e.g. data warehouses, data dictionaries) and support to create the
infrastructure systems (e.g. data governance, senior decision maker support) are seen to be the main
barriers. To summarize, IR Departments and offices have a few highly qualified staff with lots of
potentially useful data to analyse, but lack a sufficient number of staff or the required data
infrastructure to efficiently turn that data into useful information.
Chart 4.2: Extent a Barrier to Effective Development and Management of Metrics
Chart 4.3 below summarizes data on the top three challenges to performance metric production.
Consistent with the above results, insufficient resources is seen as the top challenge. The rest of the
responses are spread roughly equally among the remaining barriers, with the exception of a lack of
benchmarking data, which was chosen significantly less often. The lack of perceived buy-in among
senior decision makers was confirmed in the interviews, and is particularly problematic.21
21 The interview and survey results are inconsistent, however, in an interesting way. In particular, they differ
markedly in how insufficient raw data inputs and lack of resources (lack of staff) are ranked: the interviewees
ranked “lack of input data” much higher than “lack of resources”, whereas survey respondents ranked “lack of
input data” much lower than “lack of resources”. One possible explanation for the incongruity is that the survey
was mostly answered by those directly responsible for IR (over 75% were at the director level or lower) while the
interviewees were mostly at a more senior level (only 45% were at the director level or lower), and these two
levels of administration may, in an uncoordinated fashion, be using different definitions of resources and data
inputs. For instance, senior administrators, further away from the information-producing process, may consider a
research-ready dataset that is easily extracted from a data warehouse to be part of the raw data inputs or
resources, while directors and managers consider the underlying operational raw form as the raw data inputs, and
compiling and cleaning this data into data warehouses and useable datasets to be part of the research process
itself and thus requiring staff time to process. The statistical analysis supports this latter view. Anticipating the
results, additional staff does seem to result in more metric production (the IR director view) but only when the
data infrastructure (e.g. data marts, data dictionaries) are not significant barriers (the senior administrator view,
when considering data infrastructure as part of the inputs).
Chart 4.3: Top Three Challenges to Producing New Metrics
It is interesting to note that, despite issues with perceived buy-in from senior decision makers, much of the demand for initiating metric development originates with senior executives (Chart 4.4
below), with much less pressure coming from mid-level management, faculties or even the board of
governors. In Section 7 we examine if buy-in or perceived buy-in from senior leaders matters for the
production of performance metrics.
Chart 4.4: Initiate or Influence Development of Performance Metrics
4.3 Data Collection, Analysis and Reporting Technologies and Methodologies
When asked about which data and information sharing tools exist, the large majority of respondents
have online survey and statistical software, but only about a third have a data dictionary and less than a
quarter have a data governance framework, consistent with the above results. It is possible that data
dictionaries and data governance frameworks, key components of data infrastructure, are not
particularly important for producing a high-level representation of an institution of the type envisioned
by a Balanced Scorecard. In fact, we find that only the use of statistical software is positively correlated
with the production of a Balanced Scorecard.22 On the other hand, it could be that institutions misallocate resources, working too early on developing Balanced Scorecards before having in place the data infrastructure that would support efficient metric production and ultimately provide the main inputs into such a scorecard.
Chart 4.5: Information Tools Existing at Institution
Probing technologies, data and some of the main statistical tools further, in Chart 4.6 we see that IR
Departments and offices are typically using tools and data sources that have been around for decades
(e.g. flat files, spreadsheets, and student information systems) and only a few are routinely using the
latest data-driven approaches, such as those related to big data (e.g. HADOOP, Learning Management
System data, text mining). Sophisticated statistical analysis (e.g. measuring the value-added of services, complex patterns in time-series data, or multivariate correlations) is used routinely by about a third of institutions. We expect more institutions to move towards using more recent and sophisticated analytical techniques and tools, and the fact that around a third are already using these approaches can be taken as a positive indication; nevertheless, such approaches are typically very resource intensive and may be a challenge for all but the largest institutions. Indeed, where large institutions (or large IR Departments and offices with more than 6 employees) stand out in how they benefit from their size is in their use of enterprise systems: they are twice as likely to have, and to routinely use, these systems (chart not shown).
22 In a simple linear multivariate regression of a dummy variable indicating whether an institution produces a Balanced Scorecard on dummy variables for the presence or absence of a data dictionary, data governance framework, online survey tools, statistical software and data visualization software, only the coefficient on the statistical software dummy variable was statistically significant and positive.
Chart 4.6: Use of Data Collection and Information Tools
The use of technologies by IR Departments comes with its own set of barriers. Chart 4.7 shows the main difficulties that hinder the adoption and use of a technology. The responses cluster around the support and cost of the technology rather than the technology per se: financial cost, training and IT support are mentioned as barriers about twice as often as the technologies themselves. This is fully
consistent with the results above that show financial barriers, which translate into insufficient staff,
being a key limiting factor. As we see in the next section, it is not as if senior decision makers have
ignored requests for more resources, but it may be that what has been given is still far from enough to fully deliver decision-support information at the expected level.
Chart 4.7: Main Difficulties Using Data Management and Analysis Technologies
Finally, we see some validation of these general patterns by looking at a question that asked
respondents the extent to which they agreed with a number of different characterizations of their
institutions and departments. Chart 4.8 below summarizes the responses and shows some interesting
patterns. First, the statement that respondents most agreed with is that they emphasize past performance, and the one they least agreed with is an emphasis on future performance. Future
performance requires past data, but goes a step further and requires the ability to forecast the future
using more sophisticated statistical techniques (e.g. time series analysis), broader information gathering
efforts about the environment and an ability to incorporate less tangible information about the
environment into the forecast or prediction. Respondents also indicated weaker agreement that their
institution has a culture of evidence-based decision making. Evidence for scientific enquiry requires the ability to conduct controlled experiments or to implement other techniques that control for multiple factors and isolate their impacts, as well as recognition of the need to distinguish between correlation
and causation. A culture of such decision making must certainly entail an expectation that hypotheses
that are more rigorously supported by empirical evidence and scientific methods are more likely to be
true and thus relied upon. The fact that IR Departments are not heavily relying on statistical techniques,
measuring value added or comparing relative to benchmarks is consistent with a culture that relies less
on evidence and scientific methods as these are common tools in the social sciences for gathering and
evaluating evidence. Of course, even when there is a strong culture of evidence-based decision making,
conducting experiments, gathering benchmark data, and running complex statistical algorithms on large
datasets are expensive and time consuming activities, and thus possibly too slow and costly to provide
the information senior administration needs to make current decisions.23
Chart 4.8: Agreement with Statements about Institution and Department
5. What Is Changing?
We asked respondents questions about what has changed in the past three to five years or is currently
being changed in their Departments and offices. Did their mandate change? Have they been hiring?
Have they been producing new metrics and, if so, which ones? Have they been putting in effort into
improving reporting, such as developing dashboards or addressing some of the gaps identified above,
such as a lack of a data dictionary or data governance framework? IR Departments and offices are indeed growing and are working on addressing some of these gaps.

23 The one large difference between colleges and universities in how they responded to the profile questions (results not shown) is that a much larger share of respondents at universities (50%) agreed or strongly agreed with having a culture of evidence-based decision making than at colleges (18%).
5.1 IR Department Mandate and Staff Changes in the Past Five Years
Evidence of growth is shown in Table 5.1 below, which shows changes in the net number of employees
in the last three years broken down by whether the Department’s core mandate has increased, mostly
stayed the same or decreased. Looking at the last column, 88.6% of respondents reported an increase
in their core mandate and only 1.3% reported a decrease. Of those with an increased mandate, only
11.4% (second last column) reported a decrease in employees and 56% reported an increase. Compare
this with the 8.1% of Departments (last column) with no change in core mandate. Only 8.3% of these
(second last column) added any new staff. And none of those reporting a decrease in mandate reported
new staff. Thus we see that growth in IR is largely tied to an increase in mandate. This is far from a foregone result and has implications for performance metrics. For example, in many areas of an
institution, employees are added to meet growth in the size of the institution, such as hiring more
professors to teach more students, without any change in mandate (other than to teach more students).
Of course, what constitutes a change in mandate is subject to interpretation, but it seems reasonable
that these data suggest that new IR staff are not hired, for example, to simply improve the quality of
existing performance metrics. Consistent with this perspective, we find that the number of performance
metrics produced increases but not by very much when new IR staff are added (results are found in
Section 7), suggesting that a significant amount of the additional staff time is used to handle new
responsibilities. Unfortunately, our data do not allow us to explore the more interesting question of whether IR Departments become spread too thin when they expand or whether they maintain an efficient allocation of resources.
Table 5.1 below shows how much IR Departments are expanding. Among those with an increase in core
mandate (first row), the average increase is two net FTE staff. This is considerable when considering
that the average size of an expanding Department is 4.64 FTE, implying a more than doubling of staff
over the last three years. And this growth is occurring in an environment of fiscal restraint precipitated
by the 2008 financial crisis.24
24
Indeed, post-secondary institutions in the United States generally experienced more and larger budgetary cutbacks over this period than those in Canada, but when examining the data by country (not shown), there is no convincing evidence that IR department growth in the U.S. was any less than in Canada.
Table 5.1: Department Mandate and Staffing Changes in the Past Five Years
5.2 Areas of Development
In terms of the infrastructure gaps IR Departments have been working on, Chart 5.1 shows that getting information to stakeholders has been given the most attention. A significant majority (60%) of respondents are working on creating or improving dashboards.25
25
One might think of a pyramid of information reporting, with data collection and processing at the bottom; data infrastructure, such as data warehouses and dictionaries, on the second level; reports and dashboards that summarize and provide analysis on the third level; and finally a balanced scorecard report at the top. A balanced scorecard brings many areas of data together to provide an overview of an institution. Perhaps, once dashboards have been mostly completed, senior management will feel overwhelmed with information and the demand for balanced scorecards at the top of the information pyramid will increase. In the meantime, significant effort to build the middle layers of the pyramid continues.
Chart 5.1: Information Tools Currently Being Developed or Significantly Improved
Interestingly, if we look at institutions that have added staff (Chart 5.2), the one area that stands out in
getting more attention is creating or improving the data dictionary. This makes some sense in that it is both a labour-intensive process and, unlike improving data governance or adding tools and dashboards,
it is largely the work of those involved directly with producing information from data (e.g. IT and IR
professionals) rather than information consumers (e.g. senior decision makers).26 In other words, while
senior decision makers might redirect some resources to IR Departments in terms of new staff and
funds, senior decision makers' own time and attention are necessary to produce something like a Balanced Scorecard, and given the very large workloads and demands on these senior administrators, that time and attention may simply not be available.
26
We cannot make a strong inference here. The highlighted difference shown in Chart 5.2 is only statistically significant at the 10% level, so these results do not pass the usual 5% significance test and may reflect chance rather than a real difference.
Chart 5.2: Relationship between Changes in IR Staffing Levels and Tools Developed or Significantly
Improved
5.3 Metric Areas with the Most Development Activity
The final chart that we consider in this section summarizes responses to a series of questions that asked
respondents to list metrics that in the past three years were developed, newly produced, modified,
consolidated or eliminated. Respondents could list any metrics and we coded them into 10 categories
as shown in Chart 5.3 below (the tenth “other” category is not shown).27 The metric areas that show
the most activity – Student Success Metrics, Enrolment Management and Financial Metrics – are the
same metric categories that we show in Section 6 to be the most important. It is nevertheless revealing
how much more activity there is with metrics related to student success compared to other areas. One
might expect that, since these metrics reflect core functions of any post-secondary institution, they would already be in place. This view is partially supported by the data: the share of activity devoted to modifying existing metrics is much higher in this category than for Financial Metrics, for example. Yet there is also considerable development and production of new metrics in this area. As we will see in Section 6, this is an area that institutions naturally rate as very important, but where quality is below average.28
27
As very few metrics were eliminated or consolidated, the chart excludes these activities.
Chart 5.3: Focus of Changes to Metrics in Past Three Years by Metric Category
6. Performance Metrics
The potential number of institutional performance metrics is considerable. A compilation of metrics from Ontario universities showed 882 distinct metrics and related data in use, and most of these metrics are highly institution specific, used by only one or two institutions.29 Some of the explanation for the plethora of metrics is simply that, lacking industry standards for performance metrics, institutions take slightly different approaches to measuring what is essentially the same underlying institutional characteristic. There are also metrics that are highly idiosyncratic to an institution, arising from
some distinct aspect of its mission or environment (as noted above in Chart 4.8, over 50% of
respondents agreed that their metrics reflect a unique aspect of their institution’s mission). However,
for this analysis we need some level of standardization to be able to compare institutions and draw
inferences. We attempted to strike a balance between being specific enough that respondents would
largely understand and agree on what the metric means and keeping the number of metrics for
respondents to evaluate manageable. To this end, the metrics included in the survey instrument covered a broad range of areas of an institution, from student success to employment outcomes.
28
There are clear inherent difficulties in creating valid institutional measures of critical outputs such as the amount of learning or the impact on labour market outcomes. Ideal measures require not only data that is difficult to collect but also a measure of the counterfactual: what learning or labour market outcomes would have been realized if students had not first gone to college or university. And even if such ideal measures could be created, attributing the cause of good or poor results, the presumed reason for creating such metrics in the first place, would be neither easy nor uncontroversial. That institutions nevertheless push on trying to create and improve metrics in this area despite these problems is laudable, as the potential benefits for students and graduates are significant.
29
Ken Norrie, Higher Education Quality Council of Ontario, Performance Indicators Workshop for Universities Report. Retrieved from http://www.heqco.ca/SiteCollectionDocuments/PI%20Workshop%20Final%20Report%20EN.pdf
6.1 Metric Rankings
Table 6.1 below shows a list of each of the 40 metrics, grouped together into ten metric categories.30
We ensured that there were at least two metrics in each metric category. (Many of the individual metrics are probably best understood as categories themselves, each made up of a still finer set of sub-metrics.) The subtotals represent the maximum score for each metric category, and the categories are rank ordered from highest to lowest score.31 The order of categories makes sense; student success and satisfaction are at the heart of any institution's mission and all the other activities support these two, with enrolment management being arguably the most fundamental.
Within each metric category, there is considerable range in individual metric scores. Take, for instance,
Student Success. Graduation rate is given a score of 8.1 compared to Student Alumni Awards which has
a score of only 5.5. Generally speaking, the data show that being selective allows the same resources to produce a much better overall measure of an institution. For example, an IR Department that selects the top metric in each category would produce a much better overall perspective of an institution than one that produced the same number of metrics but selected multiple metrics from some categories and none from others. Of course, not all metrics require the same amount of resources (time, data, effort, skill) to produce at a given quality level. We did not collect data on the costs of producing a metric, as cost is very difficult to measure: it depends on many conditions (e.g. the state of the student information systems, enterprise reporting systems, and definitions), and there can be a large number of synergies between the production of metrics (e.g. a good data dictionary can reduce the cost of producing many of the metrics). Nevertheless, the collective wisdom from respondents captured in Table 6.1 may at the very least help to steer senior decision makers away from metrics that are generally not very useful toward those that can deliver more information.
30
To keep the number of metrics manageable for respondents, we initially divided respondents at random into two groups, A and B, numbered each metric in sequence, and then presented respondents in group A with only the even-numbered metrics and respondents in group B with the odd-numbered metrics. After a preliminary analysis, we noticed that there were insufficient observations for some of the metrics to draw robust inferences, and so we resurveyed some respondents with the group of metrics that they did not see in the first survey.
31
A metric score is equal to the individual metric quality score (where 1 = poor and 5 = excellent) and importance score (where 1 = very low and 5 = very high) summed together. Thus, a metric rated as both excellent quality (5) and very high importance (5) would have a metric score of 10; a metric rated as poor quality (1) and very low importance (1) would have a metric score of 2. Other ways of combining quality and importance were considered in the statistical analysis, including using differential weighting or multiplying a transformation of the quality index with a transformation of the importance index. It turned out that the addition of the two indices produced the most useful insights, so we use this definition throughout.
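To make the scoring arithmetic explicit, the following minimal Python sketch computes metric scores as described in this footnote. The metric names and ratings used here are hypothetical, and this is an illustration rather than the authors' actual processing code.

# Minimal sketch of the metric score used in Table 6.1:
# score = quality (1 = poor ... 5 = excellent) + importance (1 = very low ... 5 = very high).

def metric_score(quality: int, importance: int) -> int:
    """Combine a 1-5 quality rating and a 1-5 importance rating by simple addition."""
    assert 1 <= quality <= 5 and 1 <= importance <= 5
    return quality + importance

# Hypothetical ratings (quality, importance) for two metrics.
ratings = {
    "graduation_rate": (4, 5),
    "student_alumni_awards": (3, 2),
}
for metric, (q, i) in ratings.items():
    # Scores range from 2 (poor, very low) to 10 (excellent, very high).
    print(metric, metric_score(q, i))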
Table 6.1: Metric Ranking by Metric Score, Grouped by Metric Category
In terms of the number of metrics produced by an institution, the average number rated at a "fair" quality level or better is 28, about three quarters of the total number listed.32 The incremental number of metrics added as the IR Department adds staff is very small: only about 1.4 new metrics are added per staff member, so that institutions with just one IR staff member still produce 27 metrics, on average. There
are many ways to explain this result. One possible explanation is that for small IR Departments, many of
the metrics are produced by or at least with more help from other areas, such as the registrar or the
information technology unit. A second possible explanation for the low level of incremental metric
output is that new IR staff members are tasked with other activities, such as data analysis for strategic
decision making, environmental scans or surveys, and these other activities may only contribute one or
two additional metrics. Finally, incremental metrics may just be much more difficult to produce and
thus require significantly more staff time (increasing marginal cost).
6.2 Four-Quadrant Diagram of Metric Quality and Importance
One limitation of the above ranked list of metrics is that it combines metric quality and importance
together, thereby masking how quality and importance are related to each other. We can separate out
these two dimensions: in Chart 6.1 the vertical axis corresponds to Metric Importance and the
horizontal axis corresponds to Metric Quality. Each metric is plotted at the coordinates of its average
importance and quality scores based on respondents that produced the metric at a “fair” quality level or
better.33 We have also divided the space into four areas or quadrants, where the dividing lines show
metric quality and importance at their respective averages across all metrics. We have categorized the
four quadrants as follows:
• Key Metrics: high importance and high quality
• Potentially Distracting Metrics: low importance and high quality
• Challenging Key Metrics: high importance and low quality
• Low Value Metrics: low importance and low quality
Clearly, the retention and graduation rate metrics reflect the clearest purpose of any post-secondary institution and are rated to have both the highest importance and quality (top right).34 One interesting quadrant is the Challenging Key Metrics, those of lower quality but high importance. Student learning or skill gain and graduate employment outcomes stand out. Their importance is self-evident, but the fact that their quality is relatively low suggests that these are areas needing more attention.
32
The results presented here adjust for the fact that some institutions were only presented with about half the
total list of 40 metrics.
33
For metrics produced at a poor quality level, it may be that the respondent is not very informed about the potential importance of the metric, and since it may take very little effort to produce a metric at a poor quality level, we generally treat these as metrics that are not produced at all.
34
Certainly, there are some institutions, programs and situations where a student transferring to another institution for more advanced studies before graduating represents a successful outcome, but that does not diminish the importance of retention and graduation metrics; it only means that more information, such as separating out transfers, is needed to interpret them.
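To illustrate how the quadrant labels above are assigned, the short Python sketch below compares each metric's sample-average importance and quality against the overall averages that define the dividing lines in Chart 6.1. The metric names and values are invented for illustration; this is a sketch of the classification logic, not the study's code.

# Hypothetical sketch: assign metrics to the four quadrants of Chart 6.1.
# Each metric carries its average importance and quality across respondents (values invented).
metrics = {
    "graduation_rate":        {"importance": 4.6, "quality": 4.2},
    "student_learning_gain":  {"importance": 4.4, "quality": 2.9},
    "library_usage":          {"importance": 2.8, "quality": 3.9},
    "student_alumni_awards":  {"importance": 2.5, "quality": 2.6},
}

# The dividing lines are the averages across all metrics.
avg_imp  = sum(m["importance"] for m in metrics.values()) / len(metrics)
avg_qual = sum(m["quality"] for m in metrics.values()) / len(metrics)

def quadrant(importance, quality):
    if importance >= avg_imp and quality >= avg_qual:
        return "Key Metric"
    if importance < avg_imp and quality >= avg_qual:
        return "Potentially Distracting Metric"
    if importance >= avg_imp and quality < avg_qual:
        return "Challenging Key Metric"
    return "Low Value Metric"

for name, m in metrics.items():
    print(name, "->", quadrant(m["importance"], m["quality"]))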
Chart 6.1: Four Quadrant Diagram of Metrics
6.3 Metric Areas Ranked Based on Main Use
An alternative way to understand metric importance is to look at what they are primarily used for. For
each of the ten metric categories, we asked respondents to select the primary purpose(s) of that metric
category: Strategic, Quality, Describe, Regulate or Prioritize. Chart 6.2 below shows the percentage of
times each purpose (the rows) was selected for each metric category (the columns). We have ordered the metric purposes based on an assumed ranking of internal institutional priorities, putting the Strategic purpose ahead of Quality and Quality ahead of Describe, etc. (Prioritize we placed last, mostly because it was one of the least commonly selected purposes.) As intuition would suggest, and consistent with Table 6.1
above, Student Success, Enrolment Management, Financials and Student Engagement/Satisfaction are
the most important and Faculty Characteristics, Research and Libraries the least, based on this ranking
of purposes.
Chart 6.2: Main Use of Metric Categories
7. Statistical Analysis: Linking Inputs and Operating Environment with
Metric Output
So far we have described variations in direct IR inputs, operating environments and how performance
metrics compare with each other. However, we have not shown how IR Departments compare with
each other in terms of their metric productivity and have only touched on how metric productivity might
relate to inputs and operating environments. Ultimately, understanding how much, if at all, any
particular input or environmental condition contributes to producing a more valuable suite of metrics,
or what synergies exist between them, should lead to more productive IR Departments.
7.1 Index of Metric Productivity
A significant challenge is to define a measure of overall metric system quality that is amenable to
statistical analysis. There is no extensive literature that we know of that describes how to construct
such a measure for post-secondary institutional performance metrics. Our approach is relatively
straightforward, but we need to explain the rationale. In particular, since we are using self-reported
measures of metric quality and importance, there is risk of unintentional and unconscious bias entering
the data. For example, there may be an unconscious bias toward assessing a metric that is viewed as
high quality to be also of high importance (the incongruity between producing a high quality metric of
little importance may be unconsciously resisted). While we cannot completely avoid such biases, we
take an approach that we believe attenuates the more obvious ones.
A simple way to create an index is to multiply metric quality by metric importance and sum these
together for each institution. The larger the score, or index value, the more productive the IR
Department is measured to be. There are some potential limitations to this approach. The numerical scales used to translate the qualitative evaluations of quality and importance are ordinal rather than cardinal, so there is some arbitrariness in combining them: there is no particular reason that the ordinal scales of importance and quality should correspond one-for-one with each other, or that respondents would agree on the implicit meanings of the scale points.35 Also, it could be that a better way to combine importance and quality is to sum them together or to use some other functional form.
We start by examining differences in metric quality indices between colleges and universities and consider several different possible indices of the quality of the set of metrics produced. The first index (Index 1) uses all responses to calculate a group-average importance per metric and multiplies this by the individual institution's reported quality. Thus the importance value is the average importance calculated over all respondents rating the metric. This is done in order to reduce the potential for unwanted correlation between a respondent's assessment of a metric's quality and importance. The first row of Table 7.1 shows the average value for Index 1. Universities have a higher overall index value (normalized to a maximum of 100) than colleges, and this difference is highly statistically significant. One explanation for the difference is that universities and colleges may differ on which metrics they consider important (colleges are less interested in research metrics, for example), and the difference between colleges and universities reflects this. Index 2 uses an importance value that is based on averages calculated separately for colleges and universities. The difference between colleges and universities persists and is even slightly larger, which is inconsistent with the notion that colleges and universities view the fundamental importance of various metrics differently. Index 3 goes a step further and uses self-reported metric importance, and the results are essentially unchanged in that universities continue to have a higher overall average index value. It is interesting that the more institution specific the importance ranking, the lower the overall average score. This may be a sign of a reporting bias as discussed above, but in the opposite direction, whereby a respondent is more critical of the quality of the metrics produced at his or her institution that he or she feels are the most important. However, if there is a reporting bias, it does not seem to differ much between colleges and universities, as the differences in the average quality index between the two groups do not change much. (Below, we investigate if this is a real difference between institution types or if other factors correlated with institution type, such as the number of IR staff, are driving the result.)
Since the index is made up of three components (metric importance, metric quality and the number of metrics produced), we can break down the difference between colleges and universities by looking at only one of the components, average metric importance. Index 4 is the per-metric importance-quality index value; while colleges still have a lower average than universities, the difference is no longer statistically significant, suggesting that the reason for the differences seen above is that colleges simply produce fewer metrics.
35
For example, it may be the case that an increase of 1 unit in importance from switching to a new metric is equivalent, in the sense that a director would be indifferent if faced with the choice, to a 2-unit increase in quality rather than to a 1-unit increase. And two different directors may have different views on how they would compare the scales and their relative values.
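As a concrete reading of how Index 1 is built (group-average importance per metric multiplied by each institution's own quality rating, summed over metrics and normalized to a maximum of 100), the Python sketch below works through the calculation on invented toy data. The data structure and names are hypothetical; the actual indices were computed from the survey responses.

# Hypothetical sketch of Index 1. responses[(institution, metric)] = (quality 1-5, importance 1-5).
responses = {
    ("inst_A", "graduation_rate"): (5, 5),
    ("inst_A", "retention_rate"):  (4, 5),
    ("inst_B", "graduation_rate"): (3, 4),
    ("inst_B", "retention_rate"):  (2, 3),
}

# Group-average importance per metric, taken over all respondents rating that metric.
metric_names = {m for (_, m) in responses}
avg_importance = {
    m: sum(imp for (_, mm), (_, imp) in responses.items() if mm == m)
       / sum(1 for (_, mm) in responses if mm == m)
    for m in metric_names
}

# Index 1: sum of (group-average importance x own quality), scaled so the best case equals 100.
max_raw = 5 * sum(avg_importance.values())
index1 = {}
for (inst, m), (quality, _) in responses.items():
    index1[inst] = index1.get(inst, 0.0) + avg_importance[m] * quality
index1 = {inst: 100 * raw / max_raw for inst, raw in index1.items()}
print(index1)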
Table 7.1: IR Quality Index
Index 6 calculates the average per-metric importance value (based on the importance for each metric averaged over respondents) for those metrics that a respondent produces at a quality level of at least "fair". We exclude poor quality metrics to take into account the fact that such metrics may not be much different from metrics that are not produced at all. Colleges have a statistically significantly lower average on this per-metric importance measure; since the measure excludes quality, other than requiring a minimum quality level, when combined with the result for Index 5 there is some evidence that colleges produce a set of slightly less important metrics but at a slightly higher self-reported quality level. Finally, Index 7 uses self-reported metric importance, and the difference between colleges and universities is still marginally statistically significant. Thus differing views between institutions on what is and is not important are not driving the difference between colleges and universities.
7.2 Regression Analysis of Metric Production and Importance
Ideally we would like to use regression analysis to determine which factors affect the choice of metrics,
the quality and the number produced. However, we face several challenges. First, there are two
dependent variables (number of metrics and the average metric quality) rather than just one. Second,
there is a real potential for a subjective reporting bias introducing a correlation between quality and
importance that we would like to remove. There was some evidence of this above. However, we do not
have a good alternative measure of metric quality in the survey data. As an alternative, we restrict the
analysis to only look at the number of metrics and the choice of metrics in terms of a group-based
importance scale, as was done above. In particular, a metric is defined as being produced for the
purposes of this section if its quality is at least “fair”. Thus “poor” quality metrics are treated as if they
were not produced at all. Decomposing quality into a dichotomous variable – fair or better and poor or
not produced – produces the most robust results in terms of statistical significance and direction of
effects that are intuitive (e.g. more IR staff is positively correlated with more metrics being produced).
We also treat the index as a continuous variable rather than an ordinal variable. Ordered dependent
variable regressions might be used profitably on this data, but extensive experimentation with
alternative regression functional forms and error structures was beyond the scope of the project.
Another reason we limit how much we use the quality responses is a possible omitted variable bias that would introduce a spurious correlation between metric quality and quantity. In particular, for a fixed level of inputs and operating environment, we would expect to see a quality-quantity tradeoff, so that higher average quality is associated with fewer metrics and vice versa. However, we do not see this in the data (results not shown). In fact, we observe the opposite (a positive correlation between quality and quantity), and the likely explanation is that the data do not permit us to fully control for all significant factors that affect metric production and quality. For example, the director's innate ability at managing staff, influencing more senior decision makers and creating Department strategy would be expected to have a significant impact on IR productivity, and could positively affect both the quality and quantity of metrics. Since we cannot measure this innate ability without error (director tenure or background are imperfect proxies), this is a classic errors-in-variables problem, and we would expect it to show up as an induced positive correlation between metric quality and quantity even after controlling for other observables.
Given these limitations, the first specification estimates a two-equation simultaneous model of metric count and metric importance. The first equation determines the number of metrics produced at or above a "fair" quality level, and the second equation determines the average importance of the fair-or-better quality metrics. (Below, as a second specification, we consider the selection of the suite of metrics that achieves the average metric importance level.) The main link between the two equations is a rationality assumption: IR Departments agree to some extent on which metrics are important and, given a finite number of metrics and resources, are more likely to choose to produce the more important metrics first, so that as additional metrics are produced, they are necessarily at a lower average importance level. This type of diminishing returns hypothesis is not imposed on the model, but is testable in the model specification.
The two-equation model is specified as follows:
num_metrics = β0 + β1 report_pres + β2 report_provost + β3 ir_staff + β4 ir_staff × data_barriers + β5 dir_social_educ + β6 govt_initiate + β7 new_metrics_produced + β8 lack_support + β9 ir_years_existed + β10 log_head_count + β11 college + β12 canada + β13 group_a + [resurveyed interactions] + ε1

avg_importance = β14 + β15 num_metrics + β16 dir_social_educ + β17 ir_years_existed + β18 college + β19 canada + β20 group_a + [resurveyed interactions] + ε2
Where
num_metrics is the number of metrics the respondent indicated were produced at a quality level of
“fair” or better
avg_importance is the average importance value of the metrics included in equation 1, where the importance value of a metric is the importance averaged over all respondents producing that metric at the "fair" quality level or higher
report_pres is a dummy variable equal to 1 if the IR unit reports to the President’s office and 0 otherwise
report_provost is a dummy variable equal to 1 if the IR unit reports to the Provost or Vice-president
Academic and 0 otherwise
ir_staff is the number of full-time equivalent IR staff members with a graduate degree (Masters or Ph.D.)
data_barriers is a dummy equal to 1 if the respondent answered ____ to questions ---- and 0 otherwise
dir_social_educ is a dummy variable equal to 1 if the director’s background is in social science or
education and 0 otherwise
govt_initiate is a dummy variable equal to 1 if the respondent answered ____ to question ---- and 0
otherwise
new_metrics_produced is a dummy variable equal to 1 if the respondent indicated that they had created
new metrics in the past three years (question x) and 0 otherwise
ir_years_existed is the number of years the IR Department has existed (0 if unknown, in which case a dummy variable, ir_years_existed_miss, is included to indicate the missing value)
log_head_count is the natural logarithm of the reported institution enrolment (question X)
college is a dummy variable equal to 1 if the institution is a college and 0 if the institution is a university
canada is a dummy variable equal to 1 if the institution is located in Canada and 0 if it is located in the United States of America
lack_support is a dummy variable equal to 1 if the respondent indicated that support from senior decision makers to build the required data infrastructure was a barrier to a large or very large extent (survey question 16, Table 4.2) and 0 otherwise36
36
We also considered an alternative indicator of support: whether a lack of support from academic leaders or executive was chosen as a top-three barrier to performance metric production (question 22, Table 4.3). However, in all specifications this variable showed no statistically significant correlation. Moreover, we believe the specificity of question 16, which asks about support for a specific initiative (building data infrastructure), makes it a better gauge of support.
resurveyed interactions are the above variables, each interacted with a dummy variable equal to 1 if the respondent responded to the second survey for more information on metrics and 0 otherwise (the coefficients on these interaction effects are not shown in the tables below)
Thus, in this two-equation system, the dependent variable in the first equation (the number of metrics
produced, or num_metrics) is included as an explanatory variable in the second to capture the fact that
as more metrics are produced, diminishing returns should cause the average metric importance value to
decrease. The coefficient on num_metrics in the second equation should be negative if there are
diminishing returns.
Obviously, the error or disturbance term in the second equation cannot be assumed to be uncorrelated with the number of metrics, and we should also expect that the disturbance terms in the two equations will be correlated with each other. We use three-stage least squares to estimate the system.37
37
Stata's reg3 is used to fit the model.
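The system itself is estimated with three-stage least squares (Stata's reg3, as noted in the footnote). As a rough, simplified illustration of the identification logic only, the Python sketch below estimates the second equation by manual two-stage least squares on simulated data, using ir_staff (excluded from the second equation) as the instrument for num_metrics. Everything here, from the data to the reduced set of regressors, is hypothetical; it is not the report's estimation code.

import numpy as np

rng = np.random.default_rng(0)
n = 143  # sample size used in the report's regressions

# Simulated stand-ins for a few survey variables (hypothetical data).
ir_staff   = rng.poisson(4, n).astype(float)        # instrument excluded from equation 2
dir_social = rng.integers(0, 2, n).astype(float)
college    = rng.integers(0, 2, n).astype(float)
e1, e2     = rng.normal(0, 1.0, n), rng.normal(0, 0.1, n)

num_metrics    = 14 + 0.45 * ir_staff + 1.0 * dir_social - 1.0 * college + e1
avg_importance = 0.8 - 0.005 * num_metrics + 0.02 * dir_social + 0.03 * college + e2

def ols(y, X):
    # Ordinary least squares coefficients via a least-squares solve.
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Stage 1: project the endogenous regressor on all exogenous variables, including the instrument.
Z = np.column_stack([const, ir_staff, dir_social, college])
num_metrics_hat = Z @ ols(num_metrics, Z)

# Stage 2: regress avg_importance on the fitted values plus the included exogenous variables.
X2 = np.column_stack([const, num_metrics_hat, dir_social, college])
beta2 = ols(avg_importance, X2)
print("2SLS estimates (const, num_metrics, dir_social_educ, college):", beta2.round(4))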
Focusing on the first equation, we expect that more staff should result in more metrics produced.
However, the existence of data barriers may reduce the number of metrics produced or the ability of
staff to create and maintain metrics. We have no particular reason to expect the director’s background
to matter, but as 53% of directors have a background in social science or education, arguably the closest match to an IR role, we have sufficient variation in the data to examine whether director background matters. Governments are important consumers of metrics, and the institutional metrics they usually ask for, such as enrolment and graduation rates, are typically metrics that institutions also consider important. Thus we do not expect government demands to increase average metric importance, but they may affect productivity (i.e. the number of metrics produced), since providing metrics to a third party requires additional time and effort to conform to definitional requirements, deal with unusual results, track down reasons for data errors, and set up and use methods to communicate and transfer data. As a result, we expect that the more the government is involved, the fewer metrics may be produced. We
also include dummy variables to indicate if the IR Department or office reports to the President’s Office
or reports to the Provost or Vice-President Academic. While it could matter where the IR unit is
positioned in the organization, we have no expectations on how.
If new metrics have been produced in the last few years, we might expect these institutions to have more metrics. Of course, it could be that institutions that have a low number of metrics are only
catching up (a regression to the mean effect). Thus the sign of the coefficient on this variable is difficult
to predict.
We expect that more support from senior decision makers should lead to production of more metrics.
However, it is possible that directors that are the most ambitious also feel the most constrained by
limited resources and limited attention, which, if the case, could counter-intuitively result in lower
perceived support being associated with the production of more metrics. We have no way to disentangle these two possibilities, and though they may both exist, we only measure the net effect. Nevertheless, seeing if there is a relationship at all is a starting point.
An IR Department or office that is older might be expected to produce more metrics, all other things
equal. However, to the extent that department age is positively associated with department size (older
departments are, in fact, generally larger), some of the effect of department age on the number of
metrics could be captured by number of IR staff. Age of department or office might also be correlated
with a reduction in data infrastructure barriers as data warehouses are built and data dictionaries
created. Still, we might expect some residual effect of department age on the number of metrics that
captures intangible improvements, such as learning-by-doing that are not captured by other explanatory
variables.38
We control for a few additional institutional characteristics: institution size (the natural logarithm of
head count or log_head_count), college or university, Canada or the United States of America. We have
no expectations for direction of association with these variables, but would like to see if there are any
remaining effects once other factors have been controlled for. In particular, we saw above that colleges
tended to produce fewer metrics than universities, but this could be because colleges generally have
smaller IR Departments.
We included a dummy variable indicating whether the respondent’s questions were from the version A
survey or the version B survey as the set of metrics were different for each version.39 Finally, we include
interactions of all variables with a dummy to indicate whether an institution responded to a second
request for additional survey data or not. As the conditions for evaluating the second set of
performance metrics were different for the second survey (in particular, the other contextual questions
were not shown), we want to remove any possible effect on the results. There is no particularly
interesting interpretation of the coefficient estimates for these interaction terms, so we suppress them to
make the results easier to read.
The second equation relates average metric importance to the number of metrics produced, the background of the director, and the age of the department. The main identifying exclusion (at least one right-hand-side variable in the first equation needs to be excluded from the second equation) is the number of staff. We expect that the number of staff is important for determining the number of metrics produced, but to have much less impact on which metrics to produce. That decision would be more a function of
senior decision makers' priorities and the mission of the institution. The director's background might have some influence, especially to the degree that the director influences senior decision makers as to which metrics are likely to provide the most value. Support from senior decision makers may or may not matter for the same reasons stated above: this is perceived support, and respondents who believe they have less support may actually just be more ambitious and thus feel more constrained. Finally, country and institution type are included to see if any remaining differences exist between these two important groupings.40
38
Experience of the key IR administrator might also be thought to be important. We did not ask respondents for the total number of years as director, but we did ask for the number of years at their institution as an administrator of IR and as an employee, which should be good proxies for experience. We find that the key administrator's time at the institution is too closely correlated with the age of the IR department or office to be able to measure a separate effect with the sample size that we have when both are included as explanatory variables. Since age of the department provides a better fit, we include only it as a regressor, but one can think of it as also embodying the director's and staff's experience.
39
It turns out that Group A respondents had lower average importance scores, which is simply an artifact of how the questions were ordered. As the respondents were assigned at random, including this dummy variable is sufficient to remove all the effects of the two different versions from the results.
40
We also tried including a dummy variable to indicate whether a college was highly selective in its admission policy or had a more open-access philosophy. We did not find any differences between these two groups once the other variables were included.
Table 7.2 below shows the coefficient estimates for the two-equation model. Looking at the results for
the first equation of the system, the number of metrics increases by about 0.9 per employee (fte_staff_master_phd; the coefficient of 0.45 is multiplied by 2 because only half the possible metrics were included for each respondent). The average number of metrics is about 28, so an increase of 0.9 metrics per additional employee is relatively small, and would be consistent with additional labour being used for non-metric-producing activities and with decreasing returns to metric production. Interestingly,
however, the coefficient on the interaction of staff with the data barrier dummy variable
(fte_staff_master_phdXdata_barrier) is statistically significant and negative. That is, when poor data
infrastructure is seen as a problem, adding additional IR staff does not result in any additional metrics
being produced. The fact that data infrastructure and labour are complements in metric production
should be of no surprise to anyone working in institutional research, but what is interesting is the large
size of the effect. Recall from the discussion above that perceptions of the key barriers to metric production differed between the interviews and the survey: those in more senior positions, and thus more involved with resource allocation, tended to attribute the problems to poor data inputs, while directors, who are closer to the mechanics of creating and managing metric production and infrastructure, pointed to a lack of resources. This result offers a potential resolution: senior decision makers are right in the sense that better quality data inputs (or the data infrastructure to improve the quality of raw data inputs) are critical, and IR directors are correct that more metric production requires additional staff. Recognizing that both are right can potentially lead to a more productive discussion between those asking for metrics and those responsible for producing them.
There are two other results of particular interest in the first equation. One is that if the government is
seen as a main initiator of the production of metrics, the total number of metrics produced is lower
(govt_initiate). This is consistent with the hypothesis that reporting metrics to government requires
more time and this has an opportunity cost, at least partially realized as fewer institutional metrics being
produced. An alternative explanation is that the metrics the government seeks are ones that cost more
to produce in terms of staff and data resources and an institution not required to produce some of these
government-driven metrics might choose to produce a larger number of other metrics with the same IR
staff and data.
The other interesting result is that directors with a social science or education background lead
Departments or offices that produce more metrics (director_social_educ). The coefficient is 2.05 (which implies 4.1 additional metrics, taking into account that respondents only evaluated half the available metrics), or about 14% more than the average number of metrics, and is highly statistically significant. Thus the effect size is quite large. We have controlled for Department age, staffing levels, institution size and institution type, so the explanation cannot be that directors with this background are disproportionately employed in older IR Departments, universities or larger institutions.41 The data do not tell us why this should be the case. One might speculate that such directors, because of their backgrounds, put relatively more value on a broader range of metrics than a director with a background in business or economics, who might put more emphasis on strategic information that is not part of
performance metrics per se. Since we do not measure all the activities of an IR Department, this result
cannot be interpreted in terms of overall IR productivity.
There are several coefficients that turned out not to be statistically significant that are worth highlighting. A perceived lack of support from senior administrators shows no association with the number of metrics: not only is the coefficient not statistically significant, the parameter estimate is very small. The area that IR reports to also does not seem to matter for metric production. Institution size (log_head_count) is statistically insignificant, though this is partially due to multicollinearity with IR Department staff levels; intuitively, the number of staff matters more than institution size, so only the coefficient on staff levels is statistically significant. The age of the Department also does not matter. This could be because the age or experience of the IR group drives quality rather than the number of metrics or, as with institution size, any age effects that result in larger IR Departments are already captured by the number of staff. Finally, there are no country differences or differences between
colleges and universities. The coefficient on the college dummy is negative, consistent with the results
reported above that colleges produce fewer metrics, but the difference is not statistically significant
with the other controls included.42
Turning to the second equation, which relates the average importance of all fair-quality-or-better metrics produced at an institution to the number of metrics produced and other potential factors, we see that the coefficient on the number of metrics produced (num_fairplus_metrics) is negative and highly statistically significant (-0.0053, p-value = 0.009), consistent with the hypothesis of diminishing returns to metric production and with IR Departments tending to prioritize more important metrics first, so that the average metric importance drops as more metrics are produced. (While this is an obvious result, it nonetheless helps validate the model specification, especially given that the importance level is based on the group average rather than being institution specific.)
41
The director's years at the institution and years as an administrator were never statistically significant when included and had little effect on the other coefficients. Thus we exclude them from the table, but these unpublished results mostly rule out differences in experience as the explanation for the higher productivity of directors with social science or education backgrounds.
42
This is partially due to another multicollinearity problem where the dummy variable admit_type (whether the
college is open admission or selective) is correlated with the institution type. If the admit_type dummy is excluded
from the estimation, then the coefficient on the college dummy is marginally statistically significant (results not
shown) suggesting colleges may produce slightly fewer metrics from the list of metrics presented in the survey.
There are three other noteworthy results. The first is that a key factor associated with higher average importance is the age of the IR Department or office (log_ir_years_existed, 0.25, p-value = 0.005). Since the model controls for staff levels, institution type, and the number of metrics, it appears that age has a separate important effect on which metrics are produced. Department age could capture learning-by-doing or
some other process in which metrics are produced, ineffective ones are discarded, and new metrics are
considered.43
The second interesting result is that the average metric importance for colleges is higher than for universities (college, 0.0565, equivalent to 7.5% of the mean dependent variable, p-value = 0.004). The reason cannot be that universities produce more metrics and therefore face stronger diminishing returns to metric importance, as the second equation controls for the number of metrics produced. It could instead be that colleges rank the importance of their most-produced metrics higher than universities do, in which case the coefficient would not reflect a genuine difference in metric selection. We find
evidence to support this explanation; if importance weights are based solely on the responses from
universities (results not shown), the coefficient on the college dummy in the second equation becomes
statistically insignificant while the rest of the statistically significant coefficients retain their signs and
statistical significance.
Finally, we note that there is some weak evidence that IR Departments or offices that report to the
President's Office produce a set of metrics that are less important. It may be that, because presidents need a broader overall perspective and must influence various constituencies, they request that IR Departments produce a broader range of metrics, at the opportunity cost of not producing some metrics that IR administrators consider more important. This could thus reflect a difference of opinion, so that if we were to solicit presidents' views on which metrics are the most important, the rankings might be different. Alternatively, it could be that having a breadth of metrics has an intrinsic value separate from the importance of the individual metrics and therefore is not captured in how the dependent variable is constructed. This suggests it may be useful to consider other ways to measure the
choice of which metrics to produce, as in the four-quadrant diagram, which we turn to next.
43
An alternative explanation is a survivorship bias whereby IR departments that produce more important metrics are more likely to survive for a longer period of time. This seems implausible to us, as some institutional metrics are quasi-obligatory, and it would require that a low-performing IR department or office be eliminated altogether for some period of time and then a new one started in its place some years later, rather than simply replacing the key administrator.
Table 7.2: Two-Equation Regression Model Explaining Number of Metrics Produced (Equation 1) and Average Metric Importance (Equation 2)

Equation 1: Number of Metrics Produced (Number of Obs. = 143, R2 = 0.407)

Variable                               Coef. Est.   Std. Err.   P-value
reports_pres                             .734         1.36        .590
reports_acad                             .385         1.08        .722
lack_support_e                           .143         1.41        .919
director_social_educ                     1.96         .846        .021
govt_initiate_e                         -3.36         1.03        .001
group_a                                  1.22         1.24        .328
admit_type                               .360         .374        .336
metrics_produced_e                      -.081         .918        .930
log_ir_years_existed                    -.037         .661        .955
fte_staff_master_phd                     .421         .228        .065
fte_staff_master_phd X data_barrier     -.782         .408        .055
log_head_count                           .331         .520        .524
college                                 -2.28         1.58        .148
canada                                  -1.09         .910        .230
constant                                 14.4         5.08        .005

Equation 2: Average Metric Importance (Number of Obs. = 143, R2 = 0.570)

Variable                               Coef. Est.   Std. Err.   P-value
reports_pres                            -.035         .0170       .043
reports_acad                            -.003         .0123       .784
lack_support_e                           .026         .0167       .118
num_lowplus_metrics_by_id               -.007         .0018       .000
director_social_educ                     .004         .0111       .710
log_ir_years_existed                     .020         .0081       .013
group_a                                 -.131         .0151       .000
college                                  .056         .0178       .002
canada                                  -.014         .0112       .203
constant                                 .838         .0373       .000
7.3 Regression Analysis Applied to the Four-quadrant Diagram
As discussed above, the four-quadrant diagram places each of the forty metrics into one of four
categories reproduced here for reference:
• Key Metrics: high importance and high quality
• Potentially Distracting Metrics: low importance and high quality
• Challenging Key Metrics: high importance and low quality
• Low Value Metrics: low importance and low quality
For each institution, we can calculate the number of metrics in each category that the institution
produces at a “fair” quality level or better and use this to create quadrant shares that sum to 100
percent by institution. Note that the quadrant a metric belongs to is not institution specific. That is, we
use the positions of the metrics in Chart 6.1, which are based on the average quality and average
importance values of all respondents in the sample. The institutions' shares in each category then vary between institutions only because of the choice of which metrics are produced (at the fair quality level or better), and not because of the institution's own assessment of metric quality (beyond being at least fair) or importance. The main reason to essentially average out some of the institution-specific information is to reduce potential bias in the relationship between quality and importance that can arise from subjective assessments of these characteristics.
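The construction of the quadrant shares can be sketched as follows in Python. Each metric's quadrant is fixed for all institutions (taken from the sample-wide averages behind Chart 6.1), and an institution's shares are computed over the metrics it produces at "fair" quality or better. The metric names, quadrant assignments and the numeric cut-off for "fair" below are hypothetical illustrations, not values from the study.

# Hypothetical sketch: compute one institution's four quadrant shares.
quadrant_of = {                      # fixed assignment based on sample-wide averages (Chart 6.1)
    "graduation_rate":       "key",
    "retention_rate":        "key",
    "student_learning_gain": "challenging",
    "library_usage":         "distracting",
    "student_alumni_awards": "low_value",
}

# Metrics this institution reports, with self-rated quality (1 = poor ... 5 = excellent).
produced_quality = {
    "graduation_rate": 5,
    "retention_rate": 4,
    "student_learning_gain": 2,
    "library_usage": 1,              # rated "poor": treated as not produced at all
}

FAIR = 2                             # assumed numeric position of "fair" on the 1-5 scale
counts = {"key": 0, "challenging": 0, "distracting": 0, "low_value": 0}
for metric, quality in produced_quality.items():
    if quality >= FAIR:
        counts[quadrant_of[metric]] += 1

total = sum(counts.values())
shares = {q: c / total for q, c in counts.items()}   # shares sum to 1 (100%) by construction
print(shares)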
Our goal is to examine what observable IR inputs and operating environment variables correlate with
the composition of metrics produced. As mentioned in the section above on methodology, we use a
regression technique that takes into account that the dependent variables, the shares, sum to 100% and
lie between 0 and 1. As the shares sum to 100%, we have three equations (the key metrics share
equation is the omitted equation):
distracting_metrics_share = β0 + β1 lack_support + β2 director_social_educ + β3 govt_initiate + β4 selective_admit + β5 metrics_produced + β6 log_years_existed + β7 college + β8 canada + β9 fte_staff + β10 fte_staff × data_barriers + [interactions with a resurveyed dummy variable] + η1

challenging_metrics_share = β11 + β12 lack_support + β13 director_social_educ + β14 govt_initiate + β15 selective_admit + β16 metrics_produced + β17 log_years_existed + β18 college + β19 canada + β20 fte_staff + β21 fte_staff × data_barriers + [interactions with a resurveyed dummy variable] + η2

low_value_metrics_share = β22 + β23 lack_support + β24 director_social_educ + β25 govt_initiate + β26 selective_admit + β27 metrics_produced + β28 log_years_existed + β29 college + β30 canada + β31 fte_staff + β32 fte_staff × data_barriers + [interactions with a resurveyed dummy variable] + η3

key_metric_share = 1 - distracting_metrics_share - challenging_metrics_share - low_value_metrics_share
The coding for the right-hand side variable names is the same as in the previous regression (Section 7.2). (The η terms denote the equation errors.) We generally expect that the more full-time staff, the lower the share of produced metrics that fall in the Key Metrics quadrant. This simply reflects a rationality assumption: IR Departments will tend to use limited resources to produce the most valuable metrics first and then shift to other metric categories as more resources become available. The more interesting question is whether additional IR staff are used to increase the share of challenging metrics or whether marginal resources are instead directed toward low value or distracting metrics. We do not have strong hypotheses about the other variables and instead use the regression as exploratory.
Table 7.3 shows the regression results. Other than the random survey selection, none of the variables
explain variation in the share of Low Value Metrics that are produced. However, additional IR staff result in an increased share of Challenging Key Metrics and possibly, though only at marginal statistical significance, an increased share of Potentially Distracting Metrics. Thus, if we equate lower average quality with
higher difficulty to produce, then it appears that additional resources are targeted at either the
production of metrics that are easy to produce but less important, or that are hard to produce but more
important. This is a natural tradeoff that any IR administrator would have to make.
Another key driver of metric category shares is the degree to which government drives metric
production (govt_initiate). When government is considered to be more involved in initiating metrics,
the share of Key Metrics increases and the shares of Challenging Key Metrics and Potentially Distracting Metrics fall. One interpretation is that governments typically request metrics that are categorized as Key Metrics, and this comes at an opportunity cost of producing fewer other types of metrics. To the extent that government involvement reduces the production of Potentially Distracting Metrics, it may improve overall value, but it may also reduce the production of Challenging Key Metrics. As we saw in the specification above, the net effect seems to be slightly negative.
Interestingly, a lack of support from senior leaders to build necessary data infrastructure is associated
with a higher share of Challenging Key Metrics, though this is of marginal statistical significance. As mentioned above, it may be that institutions with more ambitious IR administrators, who are working on these challenging metrics, are also the ones that feel most constrained, and thus the measured effect in the model does not indicate a causal relationship.
The other two noteworthy statistically significant results are both related to the share of Potentially Distracting Metrics produced. Colleges and Canadian institutions (coefficients for the college and canada dummy
variables in the third equation) appear to produce a lower share of these less desirable metrics than
other institutions.44 The regressions control for staff size, government involvement, and whether the
institution has a highly selective admission policy or not, so it must be some other difference or
differences in institution types that explain the differences in the metric share distributions.
44
The discrete estimated effects are that colleges have a 10.0% lower share of Potentially Distracting Metrics and
Canadian institutions have a 3.1% lower share of these metrics.
Table 7.3: Three-Equation Regression Explaining Four Quadrant Shares (Effect on Shares Relative to Key Metric Quadrant)

Equation 1: Low Value Metrics Share (Number of Obs. = 143, R2 = 0.407)

Variable                               Coef. Est.   Std. Err.   P-value
lack_support_e                          -.020         .129        .878
director_social_educ                     .078         .080        .327
govt_initiate_e                         -.140         .156        .370
group_a                                  1.33         .133        .000
selective                               -.057         .041        .174
metrics_produced_e                       .092         .086        .284
log_ir_years_existed                     .034         .034        .317
college                                 -.172         .158        .276
canada                                  -.091         .067        .175
fte_staff_master_phd                     .007         .013        .580
fte_staff_master_phd X data_barrier      .012         .038        .755
constant                                -.584         .243        .016

Equation 2: Challenging Key Metric Share (Number of Obs. = 143, R2 = 0.570)

Variable                               Coef. Est.   Std. Err.   P-value
lack_support_e                           .353         .187        .058
director_social_educ                    -.010         .094        .916
govt_initiate_e                         -.548         .141        .000
group_a                                  .196         .164        .233
selective                                .014         .058        .806
metrics_produced_e                       .010         .108        .925
log_ir_years_existed                     .0197        .052        .706
college                                  .384         .234        .100
canada                                  -.040         .093        .667
fte_staff_master_phd                     .038         .015        .014
fte_staff_master_phd X data_barrier      .082         .050        .099
constant                                -.598         .368        .104

Equation 3: Potentially Distracting Metrics Share (Number of Obs. = 143, R2 = 0.570)

Variable                               Coef. Est.   Std. Err.   P-value
lack_support_e                          -.047         .183        .797
director_social_educ                     .003         .083        .974
govt_initiate_e                         -.374         .137        .006
group_a                                  1.49         .173        .000
selective                               -.044         .042        .303
metrics_produced_e                      -.127         .075        .092
log_ir_years_existed                    -.053         .039        .172
college                                 -.507         .223        .023
canada                                  -.201         .090        .026
fte_staff_master_phd                     .031         .016        .050
fte_staff_master_phd X data_barrier      .026         .045        .566
constant                                 .203         .259        .432
Note that this is not a linear regression and thus the effect sizes cannot be read directly from the
coefficients. The direction (sign) of the effect and measure of statistical significance (p-values), however,
can be read directly from the table.
8. Conclusion
Despite budget cutbacks in the higher education sector in recent years, IR Departments, offices and related units have been given more resources, especially additional staff, yet a lack of staff is still the number one perceived barrier for IR directors. Current IR staff are generally well educated and the available technologies are considered adequate, so some of this perception of a lack of resources is likely related more to the ambitions of IR directors than to actual under-resourcing relative to peer institutions. Still, the finding that adding a staff member yields only one or two additional performance metrics, and then only when good data infrastructure is in place, speaks to how additional responsibilities typically get added to IR Departments when they grow, as well as to the increasing cost of producing ever more useful institutional metrics. Moreover, even with the added staff, IR Departments still tend to rely on basic survey and analytic technologies, focusing on reporting historical data, though a large minority do use more sophisticated analytical techniques. We suspect that there has been underinvestment in data infrastructure at many institutions over a long period of time, owing to the high cost and long delivery period of developing this type of infrastructure. As a result, many IR Departments are not as efficient at metric production as they could be, leaving less staff time for the more sophisticated analyses needed to tackle important but challenging types of post-secondary measurement, like the amount of learning taking place. Thus while there seems to be plenty of scope for moving toward more rigorous research methodologies and supporting a stronger culture of evidence-based decision making, going beyond basic reporting of historical data is still very labour-intensive and IR Departments' appetites for more resources will be hard to satiate.
Finally, it is interesting that we have found no real differences between Canada and the United States
once other factors have been taken into account. There are differences between colleges (community,
mostly diploma granting institutions) and universities (mostly four-year and upper degree granting
institutions), with colleges typically have smaller IR Departments than universities of the same size, but
there is little evidence of substantial differences in metric productivity between colleges and universities
once the number of IR staff has been taken into account.
Appendix 1
The survey questionnaire is contained below. The respondents filled in the questionnaire online and
thus the presentation format was different, showing questions one at a time and allowing for
conditional branching on some questions, depending on a previous answer. Also, respondents were
only presented with half the detailed metrics to assess in the first survey. (Some respondents were
asked to complete a follow up survey to assess the metrics that they did not assess in the first survey.
The follow up survey is not included here.)
Developing Metrics for Internal Decision Making in Higher Education
The focus of the survey is on performance metrics that are related to strategic decision making at your
institution. Operational-level metrics that support the day-to-day operations of individual Departments
within the institution are not within the scope of this research.
Your answers will be used for research purposes only. Your participation in this survey is entirely
voluntary. All survey data is being collected confidentially and securely stored by Academica Group. No
individual responses will be reported on or in any way linked to personally identifying information.
This section asks for a bit of information about your department and staff.
1. How long have you been the key administrator of Institutional Research in this institution?
Please enter numeric response only.
_______________
2. What is your academic background (based on highest credential achieved)?
Please select all that apply.
Business / Economics
Statistics / Mathematics
Computer Science / Technology
Social Sciences
Humanities
Natural Sciences
Education
Other_______________________
3. How long have you been an employee at your current institution?
Please enter numeric response only.
_______________
4. Including yourself, the total number of full-time staff in the Institutional Research department
is:
Please enter numeric response only. Please exclude co-op students or intern students.
_______________
5. The total number of part-time staff in the Institutional Research department is:
Please enter numeric response only. Please exclude co-op students or intern students.
_______________
6. Including yourself, please indicate the highest level of education of IR staff (full- and part-time)
Please exclude co-op students or intern students.
Number of staff with less than a Bachelor's Degree: _________________
Number of staff with a Bachelor's Degree: ________________
Number of staff with a Master’s Degree: __________________
Number of staff with a Doctorate Degree: ________________
7. In the past 5 years, how would you describe the change in the department when it comes to
its core mandate or responsibilities?
Increased its core mandate or responsibilities
Reduced its core mandate or responsibilities
Did not change its core mandate or responsibilities
8. In the past 5 years, what has been the net change in the total number of employees in the IR
department (including full- and part-time staff)?
Please exclude co-op students or intern students.
Increased (Branch to Q9)
Decreased (Branch to Q10)
Stayed the same
9. The total number of employees in the IR department increased by…
1
2
3
4
5
6
7
8
9
10
10. The total number of employees in the IR department decreased by…
1
2
3
4
5
6
7
8
9
10
11. Approximately, how many full-time students were enrolled in Fall (September) 2013 at your
institution? Please select one response only.
Headcount _________________
Data not readily available
Prefer not to answer
The next section will ask you about institutional performance metrics and related processes.
12. Please indicate the main use or uses for each category of metrics at your institution.
Response options for each category: Describe institution and its characteristics; Regulatory compliance (government, accreditation); Measure progress of strategic plan/direction; Quality Assurance; Prioritize programs for resource allocation; Not applicable (Do not collect / Don't know)
I. Enrolment
Management
(Applications/conversion,
Enrolment counts,
Market share, Student
demographics, etc.)
II. Student Success
(Graduation rate,
Retention rate, Student
and alumni awards, etc.)
III. Student Engagement
and Satisfaction
(Students’ use of college
facilities and services,
Student satisfaction, etc.)
IV. Staff Engagement
and Demographics
(Employee diversity,
Employee engagement,
Faculty and staff
demographics, etc.)
V. Research
(Publications, Citations,
Inventions, etc.)
VI. Libraries (Holdings,
Acquisitions,
Expenditures, etc.)
VII. Facilities (Safety,
Environmental
sustainability, Space
utilization, etc.)
VIII. Financial (Budget
surplus, Income/net
contribution,
Endowment, etc.)
IX. Instructional Productivity (Student-to-faculty ratio, Class size, etc.)
X. Faculty Characteristics (Faculty awards, % Faculty with PhD, % Faculty tenure track, etc.)
13. This section has a list of selected metrics from the previous question. The list excludes the
metrics that you may have indicated as not applicable to your institution. Imagine you’re asked
to provide advice about the relative importance of various metrics to another IR administrator
at an institution similar to yours. How would you rate the importance of the following metrics
in making strategic decisions?
Rating scale: Very Low Importance (1), Low Importance (2), Moderate (3), High Importance (4), Very High Importance (5), Unsure/No opinion
Enrolment Management
Market share
Student geographic origin
Student Success
Graduation rate
Student and alumni awards
Student learning or skill gain
Student Satisfaction
Student satisfaction
Students’ evaluation of
courses/program/faculty
Staff Engagement
Employee engagement
Research
Publications
Inventions
Research expenditures
Libraries
Holdings
Expenditures
Facilities
Environmental sustainability
Facilities condition indices
Financial
Income/net contribution
Net assets
Capital investment
Instructional Productivity
Student-to-faculty ratio
Faculty Characteristics
Faculty awards
% Faculty tenure track
14. On a scale of 1-5, how would you rate the degree of success in implementing each of the
following metrics at your institution?
Rating scale: Poor (1), Fair (2), Good (3), Very good (4), Excellent (5), Unsure/No opinion
Enrolment Management
Market share
Students’ geographic origin
Student Success
Graduation rate
Students and alumni awards
Student learning or skill gain
Student Satisfaction
Student satisfaction
Students’ evaluation of
courses/program/faculty
Staff Engagement
Employee engagement
Research
Publications
Inventions
Research expenditures
Libraries
Holdings
Expenditures
Facilities
Environmental sustainability
Facilities condition indices
Financial
Income/net contribution
Net assets
Instructional Productivity
Student-to-faculty ratio
Faculty Characteristics
Faculty awards
% Faculty tenure track
15. To what extent does each of the following initiate or influence the development of performance
metrics at your institution?
Rating scale: To no extent at all (1), To a very small extent (2), To a moderate extent (3), To a large extent (4), To a very large extent (5), Not applicable, Don't know
Board of Governors or
Senate
State Board (U.S. only)
Provincial government
(Canada only)
National government
(Australia and New Zealand
only)
President
Academic executives (VP-Academic, Provost, Deputy or Vice Provost)
Non-academic executives
(Vice President, Associate or
Assistant Vice President)
Mid-level non-academic
management (managers,
directors)
Faculty
Non-teaching staff
Mid-level academic
managers (chairs,
department heads,
associate deans, managers,
directors)
Data consortium
15-1. Are there any other groups or individuals who initiate or influence the development of
performance metrics at your institution?
Yes
No
15-2. Please list any other group(s) or individual(s):
1. ____________
2. ____________
3. ____________
15-3 Please rate the extent the other groups or individuals initiate or influence the development
of performance metrics at your institution.
Rating scale: To no extent at all (1), To a very small extent (2), To a moderate extent (3), To a large extent (4), To a very large extent (5), Not applicable
Item #1
Item #2
Item #3
16. On a scale of 1 to 5, to what extent do you consider the following to be barriers to the effective
development and management of metrics at your institution?
Rating scale: To no extent at all (1), To a very small extent (2), To a moderate extent (3), To a large extent (4), To a very large extent (5), Not applicable, Unsure/No opinion
1. Timeliness of data from
external databases
2. Timeliness of data from
internal databases
3. Inadequate number of
staff in institutional research
4. Insufficient skills of
existing staff in institutional
research in effectively
analyzing data
5. Insufficient staff and
resources from I.T.
6. Difficulty
integrating/combining data
from different data sources
7. Data are mainly snapshots
from a ‘live’ information
system and are not stored in
a static / frozen state to
maintain analysis at a fixed
reporting period
8. Lack of required
operational or raw data
from internal databases
9. Lack of required
operational or raw data
from external databases
10. Lack of reliable data to
make accurate
measurements
11. Complexity of existing
information management
systems
12. The level of support from
senior decision makers to
build the required data
infrastructure
13. Lack of a data
governance framework with
clear responsibilities for data
accountability/stewardship
14. Gaps in existing data
governance framework
15. Lack of a useful data
dictionary (for standardized
definitions and business
rules)
16. Lack of a useful data
warehouse/data marts
17. Low level of trust
(internal data leaders’
inhibition to release data)
17. Please indicate your level of agreement with each of the following statements.
Rating scale: Strongly disagree (1), Disagree (2), Neither agree nor disagree (3), Agree (4), Strongly agree (5), Unsure/No opinion
1. Our institution has a
comprehensive strategic
performance measurement
framework (e.g. Balanced
Scorecard, Business Intelligence
tool).
2. Our suite of metrics includes
specific measures that relate to
a highly unique aspect of the
overall mission of the institution.
3. Our institution puts emphasis
on measures that predict future
performance.
4. Our institution puts emphasis
on measures of past
performance.
5. Our institution has a strong
culture of evidence-based
decision making.
6. Our department is under-resourced at my institution.
7. Our department has a very
collaborative relationship with
the Information Technology
department.
18. Within the past 3 years, what best describes your key initiatives regarding metrics?
Please select all that apply.
We produced a new metric or a new set of metrics for our institution. (Branch to Q19-1)
We consolidated some of our existing metrics. (Branch to Q19-2)
We modified existing metrics used at our institution. (Branch to Q19-3)
We eliminated specific metrics which we were regularly producing in the past three years.
(Branch to Q19-4)
We made plans for producing new metrics. (Branch to Q19-5)
There were no changes in the types of metrics that we have been monitoring. (Branch to Q20)
19-1 Metrics Produce
Please list any new metrics that you have produced in the past 3 years.
Please be as specific as possible.
1 __________________
2 __________________
3 __________________
4 __________________
5 __________________
19-2 Metrics Modify
Please list any metrics that you have modified in the past 3 years.
Please be as specific as possible.
1 __________________
2 __________________
3 __________________
4 __________________
5 __________________
19-3 Metrics Eliminate
Please list any metrics that you have eliminated in the past 3 years.
Please be as specific as possible.
1 __________________
2 __________________
3 __________________
4 __________________
5 __________________
19-4 Metrics Consolidate
Please list any existing metrics that you have consolidated in the past 3 years.
Please be as specific as possible.
1 __________________
2 __________________
3 __________________
4 __________________
5 __________________
19-5 Metrics Plan
Please list any new metrics that you have planned to produce in the near future (please list up to
five).
1 ____________________
2 ____________________
3 ____________________
4 ____________________
5 ____________________
Discussions are on-going (not certain yet at this point about the specific metrics)
20. Which of the following currently exists at your institution?
Please select all that apply.
A dashboard
A balanced scorecard
A new business intelligence tool
Data visualization software
Data governance framework
Statistical software
Online survey software
A data dictionary and / or data library
None of the above
21. Which of the following are currently being developed or significantly improved at your
institution?
Please select all that apply.
Dashboard
Balanced Scorecard
New Business Intelligence tool
Data governance framework
Data dictionary and/or data library
None of the above
22. What would you say are the top three challenges in producing new metrics for your
institution?
Please select up to three items.
Buy-in from academic leaders
Buy-in from the executive management
The integrity (accuracy, completeness, consistency) of the data to be used in developing metrics
The concern that other institutions are not collecting the new metric that we want to produce,
so benchmarking will not be possible
Lack of data needed (internal or external to our institution)
The tendency to select metrics that are easy to quantify instead of those that are harder to
develop but are more meaningful.
Lack of resources to develop, maintain, or report metrics.
Other_______________
23. How often do you use the following in producing and analyzing your metrics?
Rating scale for all items in this question: Not at all (1), Seldom/rarely (2), Sometimes (3), Usually (4), Always (5), Not applicable
DATA COLLECTION PLATFORM AND TOOLS
Student Information System
Learning Management System
Other enterprise systems (e.g.
HR, financial)
Survey data
External data (e.g. Application Center, Census Bureau, Statistics Canada, IPEDS, uCube-Australia, Education Counts-New Zealand, etc.)
DATA STORAGE
Flat files (e.g. MS Excel, text
files)
Relational Databases (e.g. MS
Access)
Data warehouse or data marts
Cloud
Hadoop
ANALYTIC/REPORTING TOOLS
Spreadsheets (e.g. Excel)
Statistical Packages (e.g. SAS,
SPSS, Stata, STATISTICA)
Content Analysis / Text Mining
Packages (e.g. Provalis, SPSS
Text Miner, SAS Text Analyser)
Enterprise Business Intelligence
(e.g. Cognos, Crystal Reports,
Essbase, SQL Server Analysis
Services/Excel BI)
Visualization software (e.g.
Tableau, Qlikview, Microsoft
PowerPivot, SAS JMP)
ANALYTICAL METHODS
Basic reporting/descriptive
statistics
Statistical measurement of
associations and causal
relationships
Value-added approach - i.e.
adjustment of inputs to isolate
real impacts
Time-series analysis
Comparison of programs within
the institution
Comparison of own programs to
similar programs outside the
institution
Comparison of own institutional
performance against peer
institutions
24. What are the main difficulties you find with using data management / data analysis
technologies?
Please select all that apply.
High skills and training requirements for IR staff
Complexity of use of existing technology even for trained users
Financial cost of acquiring and maintaining technology
Limitation in the capabilities of the technology (e.g. automation)
Significant IT support/involvement required for implementation or use of technologies
Overall poor quality (e.g. inflexible, low quality graphics, buggy, lack of technical support)
25. What is the overall average for new first year students in all programs in academic year
2013 at your institution?
High school average (%)_____
SAT ______
ACT ______
Other, please specify__________
Data not readily available
Prefer not to answer
26. Which statement best describes the type of admission at your institution?
All / almost all programs are open admission
A majority of programs are open admission
There is roughly an equal number of programs with open and selective entry
A majority of programs are selective entry
All / almost all programs are selective entry
Other, please specify_____________
27. How long has the department (Institutional Research) existed at your institution?
Please enter numeric response only.
_________________
...or check this box if you are not sure
Don't know
28. The department (Institutional Research) currently reports directly to:
President
Academic executives (VP-Academic, Provost, Deputy or Vice Provost)
Non-academic executives (Vice President, Associate or Assistant Vice President)
Registrar
Students Services
Information Technology
Chief Financial Officer
Other__________________
29. Please provide any comments on metrics (processes, development, reporting, etc.) at
your institution that you feel are important but were not included in this survey.
30. Please let us know if you have any questions or concerns about this survey.
Thank you for your participation in this study. A copy of the survey results will be emailed to you.
Academica Group programmed this instrument, and can be contacted at
surveys@academicagroup.com
Appendix 2
This appendix contains the questions that were used to guide the interviews. Interviews followed a
conversational style and some questions may have been skipped if the interviewer thought they had
already been addressed in the course of the conversation or were not considered relevant.
Measurement and Processes
• What primary metrics do you monitor to track the status of your institutional performance
  against your institutional mission and goals?
• What makes a metric valuable and meaningful to your institution?
• If your department has developed (or is currently developing) any new metrics, please
  describe the process you’ve undertaken. What were some of the successes and challenges?
• Are there any metrics you are not currently using that you think would be particularly
  valuable to your institution?
• What statistical techniques do you use or would recommend for building more valid and
  useful metrics?
Data Collection and Information Management Systems
• Please describe your processes for collecting and managing data that are used to create
  metrics.
• What challenges, if any, have you experienced in using any internal and external databases
  that you use to produce metrics?
• Regarding your internal or external collaborations pertaining to data collection and
  management, please describe one example you would consider to be particularly successful
  and another that is not as successful as you would have liked.
• Please describe any changes that your institution implemented (in the past 2-3 years)
  regarding data collection instruments and processes.
Reporting & Dissemination
• Please describe your communication/dissemination processes regarding metrics.
• What key resources are required to support these processes? Are there changes you would
  like to see regarding these processes?
Program Prioritization
• Do you use a program prioritization process? Does the process lead to developing new
  metrics or enhancing existing ones? Please tell us about the most and least useful metrics
  that you use for the prioritization process.
• What are some of the successes and challenges?
External Ranking Systems
• What is your opinion about external ranking of your institution or its programs? Have the
  criteria used for external institutional rankings affected the choice of metrics you collect? If
  so, in what way?
• Does your institution undertake (or has it undertaken) any action based on such rankings?