Project_alunmi_Donations_FinalProposal

advertisement
AGSM 672
Proposal: Alumni Donations
Prepared by:
Alex Swerzbin, Eric Shin, Rohit Jain
Executive Summary
This project illustrates the process used to better understand important variables in
determining, classifying and seeking alumni donations for Willamette University. The Office of
Annual Giving provided a data set in excel format for this analysis. The data set used mostly
demographic variables with 11,000 rows and 31 variables. This is the current data set they
maintain to assist them in seeking donations.
To assist the Office of Annual Giving in seeking donations, two different data mining modeling
techniques (predictive & cluster analysis) were utilized. This was done to better understand and
to provide insight on the importance of variables currently in use at the university’s Office of
Annual Giving, provide ideas and recommendations on variables they should track, and
recommend steps to improve their model and data collection strategies.
One of the major insights gained from this analysis is that the variables the Office of Annual
Giving tracks are poor in providing insight into the dollar amount to ask for from alumni. After
completing this modeling process the best value add we could provide was a set of
recommendations for the Office of Annual Giving to move forward with. These
recommendations are mostly comprised of ways to track variables and thoughts on which
variables to track to enhance the model creation process.
Purpose
One of the major issues all Universities face is how to ask for donations from their alumni. More
specifically, understanding who to ask and how much to ask. Willamette University prides itself
in having an excellent relationship with their alumni and receiving quality donations from them.
Every year millions of dollars are received in grants and donations to the university, which is a
lifeline for the University. Ninety-percent of the students enrolled at Willamette University
receive some form of financial aid, which is a huge number. The alumni network, with limited
disposable income and various options to donate their money, creates a situation where it is
crucial for the university to be careful in who and how much to ask for from alumni.
The ongoing business challenge for the university is ask for the right dollar amount, depending
on the individual alumnus’ capacity without violating the strong relationship they have created.
Project Goal
In order to address the business problem identified above, our team will work to determine and
evaluate two things. First the best predictive model to assist in determining what the asking
dollar amount to a prospective donor should be. Second perform a cluster analysis to get a
better sense of important variables and characteristics of those clusters.
Once we come up with these models, we would compare them to the existing model that the
Office of Annual Giving currently uses; provide recommendations on how to improve that
model and provide any additional knowledge we gain from this process.
Literature review
This is not a new problem; the Office of Annual Giving at Willamette University has been facing
this issue for years. They are in fact the ones who have provided the dataset to us. The data set
contains variables that help determine the annual giving of alumni and has eleven-thousand
rows of data. In the data we will have to deal with a lot of null values and demographic
variables. We will also face the challenge of poor variable selection currently tracked by the
Office of Annual Giving? This project can add value in providing the Office of Annual Giving
insights and recommendations on better variables to track moving forward so that they may
further refine the model we present to them.
To perform this project we will utilize tools such as XLMiner and SPSS modeler to create our
model and analyze the situation. Our model will be finalized in SPSS Modeler. Our team will
create a predictive model based on numerous prediction techniques (MLR, K-NN, Regression
trees, Neural Nets) with the potential to dive further into our understanding of the issue with
classification techniques (Classification trees, Neural Nets, Naïve Bayes).
Current Model
The Office of Annual Giving currently has a model in place for determining the right ask amount
from the donors. After getting the first donation from a donor, the Office of Annual Giving uses
a formula to ask for money in the coming years. The first donation is totally dependent on
donors and no asking amount is provided by the office of annual giving.
Last Gift
< $100
Gift 1
$200
> $100
See Below
> $2500
$ (blank)
Gift 2
$150
Gift 3
$100
Gift 4
$ (blank)
If the last gift is more than $100 and less than $2499, then they use the following criteria
Last Gift
>$100
<=2499
Gift 1
Last Gift *
3 times
Gift 2
Gift 1 *
2 times
Gift 3
Gift 2 *
1.5
times
Gift 4
$
(blank)
If the last gift is more than $ 2500 and under $ 3333, then they us the following criteria
Last Gift
>$2500
<=3333
Gift 1
Last Gift *
3 times
Gift 2
Gift 1 *
2 times
Gift 3
Gift 2 *
1.5
times
Gift 4
$
(blank)
The Office of Annual Giving also has other criterions. For instance, if a donor is categorized into
upper management, the ask amounts are then calculated with different formulas. (To date, we
have not yet received information on this method.)
Data Understanding
The current project will focus on determining what dollar amount could be asked from
prospective or existing donors, based on the following variables:
Variable
Explanation
Job Title
Job title of donor ex: Lawyer, Director etc.
City and State
City and State in which the donor lives
CLA (College of Liberal Arts)
Year in which donor Graduated from CLA
LAW
Year in which donor Graduated from the Law School
AGSM (Atkinson Graduate school
Year in which donor Graduated from AGSM
of Management)
FY05-06
Donation made in the year 05 – 06
FY06-07
Donation made in the year 06 – 07
FY07-08
Donation made in the year 07 – 08
FY08-09
Donation made in the year 08 - 09
FY09-10
Donation made in the year 09 – 10
FY10-11
Donation made in the year 10 – 11
Life time
Total donation made
Hypothesis
Our team hypothesizes that the right dollar amount to ask from donors would be dependent
mainly on predictors (Job Title, school of graduation and previous donations). However, since
these will be predictive and classifying models, a lot of the insights gained from modeling the
data will be exploratory in nature and will give analysts insights on what indicators they should
consider when deciding the right dollar amount to ask from alumni.
Project Procedure

Data collection: the team collected data from the Office of Annual Giving. The dataset
contains 11,000 rows of data. The dataset contains 7 variables to determine the annual
giving of an alumnus.

Gain an understanding of the current model that is being used by the Office of Annual
Giving to determine the right dollar amount to ask from alumni.

Visualize the data by computing statistics and viewing histograms of each variable

Identify outliers and research why they are outliers. If the outlier is a data entry error,
replace it with the three-sigma control limit value. If the outlier is legitimate, keep it in
the dataset.

Evaluate initial hypothesis and change it if needed.

Data pre-processes: partition the data into 40% training and 40% testing and 20%
validation. Transform the variables if needed. Remove any missing values in the data set.

Determine variable importance

Classify the data under a variety of Prediction Models

Evaluate the models using accuracy, lift charts, confusion matrices, and mean absolute
error.

Select the best model based on the above criteria.

Interpret model and identify any implications

Compare the model with the model being used by the annual giving staff

Present findings to other peers and the Office of Annual Giving staff
Data Preparation
In order to accurately create models from the excel data received from the Office of Annual
Giving; a certain level of data preparation was necessary to clean the data. The first step
involved eliminating redundancy within the data (example reunion class, spouse reunion class,
spouse gender). Another variable to be cleaned was the graduate school attended by alumni;
we had been given the year in which they had graduated and from which school. This was
presented to us in 4 different rows with a lot of null values. We converted those values to
categorical values (ordinal) assigning numerical values to different schools (1 = college of liberal
arts, 2 = law, 3 = AGSM). Next we adjusted whether his/her spouse is an alum (0 = no, 1 = yes).
This project looked at two separate techniques, predictive and cluster analysis; both utilized the
same data set and data prep methods.
Before cleaning the data
After cleaning the data
Model Development
We tried to follow the process of modeling that had been described in the class. First we
performed a data audit and found a small number of outliers and extremes in the data, which
were discarded and a super node was generated to improve model effectiveness. We had
enough rows of data to allow us to discard the outliers and extremes. The next step was to flag
variables which were nominal in nature; we used a “Set to Flag” node for
VAR.COUPLE.LAST.DESIGN.DESC variable, which stated which department alumni donated
money to. We did not convert gender as it was already flagged.
Once we had flagged the required variables the next step was to use auto data prep to
standardize required data for improved performance of models.
Once the data was standardized, due to a large number of variables, we decided to use feature
selection to reduce the number of variables for the analysis. We also reduced the number of
variables from 41 to 14. In order ensure that the model would not over-fit the data we created
a 60 – 40 partition in the data set; 60% being training and 40% testing.
Results
Predictive model: We tried a variety of predictive models like Regression analysis, Chaid
analysis, Neural set analysis, C&R analysis, but because the target field was numerical in nature
we could not use logistic regression. All the models yielded poor results, in terms of accuracy
and adjusted R-square value. One of the models we ran was a linear regression model. The best
R-Square we could get for the model 0.232, which shows how poor the model is.
The lift charts re-emphasized our results; that models predicting the life time donations are
very poor.
Cluster model: The clustering model was used to understand if we could gain insight in
patterns, in regards to clusters among donor groups based on demographics and historical gifts
and donations. First we used the 2-Step clustering model because this model self-selects the
number of clusters without the use having to input them. It gave us 2 clusters with a fairly good
cluster quality. The larger cluster (i.e. cluster 1) was almost 89% in size and was comprised
mostly of CLA graduates who were males. To confirm the cluster quality, we used the K-means
clustering model in which the user specifies the number of clusters. We ran the model
specifying 2 clusters and the resulting model yielded similar results with a cluster quality of
more than 0.5, which is considered to be good.
Once the clusters were formed, we wanted to understand how alumni of different graduate
schools gave as lifetime donations. This was to provide insight to the Office of Annual Giving to
help them better allocate their resources accordingly. So we did a scatter plot with “Lifetime
Donation” on the Y axis and “Graduate School Attended” on the X axis.
The resulting scatter plot showed that the maximum donation was given by College of Liberal
Art graduates, which was no surprise as the number of undergraduates graduating each year is
much higher than the law, business or education school graduates.
Conclusion & Recommendation
The problem of understanding who to ask, how much to ask and why is a difficult task for any
university. After conducting our analysis using both methods we have found that the data you
collect is just as important, if not more important, than the way you analyze it. By undertaking
this project and understanding the process behind annual donation seeking, we found that
demographic variables alone do not provide very good insight. From this we have come up with
a set of recommendations for the Office of Annual Giving to improve their data collection and
process:

First, invest more time and energy in building relationships with Willamette University’s
College of Liberal Arts alumni. Our cluster analysis revealed that they donate more and
more frequently compared to alumni from other schools at Willamette. This makes
sense because these alumni, in most cases, have spent more time at Willamette
University and do not have other undergraduate universities asking for donations as
well.

In our analysis, we also found that historical giving was a strong indicator of future
giving or aggregate giving. Because of this the Office of Annual Giving should track all
donations not just the past five fiscal years, if possible. They should continue to track
future giving and maintain that data in their continuing data collection for future
analysis. By doing this they will be able to better understand how much to ask for in the
future at certain points in time.

They should begin classifying job titles as categorical variables to allow for more ease in
future analysis. One simple way to do it would be classify all jobs in 4 different
categories entry level jobs, managerial jobs, senior level jobs, entrepreneurs or selfemployed. This will provide them with the ability to understand how job position plays a
part in a donors ability to give, making it easier for the Office of Annual Giving to
understand again how much to ask and who to build strong relationships with.

The Office of Annual Giving should begin tracking preferred methods of communication
from alumni, whether it is email, mail, telephone calls or personal visits. Response rate
should be measured if preferred method is not indicated by the alumni. This would
provide insight into behavioral actions associated with giving donations and could
provide efficiency of time in future donation seeking for the staff of the office of annual
giving.

Track the behavior of alumni in terms of Willamette University function attendance.
Knowing which alumni continues to be active as an alumnus would provide potential
insight into who would be a good candidate to ask large amounts from.

Lastly create an alumni satisfaction survey to gain a better understanding of who is
pleased or not with Willamette University. This would provide a quantified behavioral
data-set, which is better in creating both types of models.
Download