Modeling and Money:
The Two DO Mix
TAIR
February 1, 2006
Baylor University
Located in Waco, Texas
Affiliated with Baptist General
Convention of Texas
Bachelor's/Master's/Doctoral degrees
Seminary MDiv and DMin
Law
Fall Enrollment approximately 14,000
Nuggets
“If you’ve got terabytes of data, and
you’re relying on
data mining to find
interesting things
in there for you,
you’ve lost before
you’ve even begun.”
— Herb Edelstein
Predictive Modeling at BU
Enrollment Management
Inquiry to Net Deposit
Accept to Enroll
– Applications of model
• Moving from one stage to another
• Classification of students: new freshmen,
new transfers, graduate, etc.
• Texas and non-Texas students
Enrollment Management Stages
Inquiry
Applied
Accepted
Deposit
Net Deposit
Enroll
Retention
Graduation
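The stage-to-stage movement that the enrollment models target can be summarized as a conversion rate between consecutive stages. The sketch below is only an illustration: the function name and the counts are made up for the example and are not Baylor figures.

```python
# Enrollment funnel stages, in the order listed on the slide
# (Retention and Graduation are left out of this small example).
STAGES = ["Inquiry", "Applied", "Accepted", "Deposit", "Net Deposit", "Enroll"]

def stage_conversion(counts):
    """Return the share of students moving from each stage to the next."""
    rates = {}
    for prev, nxt in zip(STAGES, STAGES[1:]):
        rates[f"{prev} -> {nxt}"] = counts[nxt] / counts[prev]
    return rates

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_counts = {
    "Inquiry": 1000,
    "Applied": 400,
    "Accepted": 300,
    "Deposit": 150,
    "Net Deposit": 120,
    "Enroll": 100,
}
conversion = stage_conversion(example_counts)
```

A predictive model for one transition (for example, Accept to Enroll) estimates this rate per student rather than for the whole group.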
Student Retention
– Applications
• Fall to Spring Retention
• Fall to Fall Retention
• Enroll to Graduation
Donor Management
– Annual Gift
– Major Donor
– Planned Gift
– Retention/Upgrade
– New Donors
Business Questions
How can we identify potential major
donors?
How can we predict propensity of a
donor to make an annual gift?
How can we identify potential planned
giving donors?
How can we identify current donors
that can move to next level of giving?
How can we identify non-donor
constituents with characteristics of a
donor?
How can we predict expected value of
a gift?
Required Expertise
– Domain
– Data
– Analytical Methods
Project Team
Representatives from University
Development
Representatives from Institutional
Research
SAS Consultants
Process/Steps
Explore Development data
Build datasets for descriptive models
Validate datasets
Create profiles for analysis
Build datasets for predictive modeling/mining
Mine the data
Create predictive models
Apply the models
Test the models
Data Exploration
New database for IR
– Learn and learn more!
– Edit reports and data cleansing
Profiles
Donor
Non-donor
Alumnae donor
Hispanic donor
African-American donor
– More data cleansing!
Indicator Score
Creation of indicator variables with
yes/no (1/0) values
For Single households
-- 18 indicators
For Two-person households
-- 25 indicators (7 indicators could be
duplicated)
Indicator Variables
DOB_50_ind – over 50 years of age?
Married-Widowed_ind - married or
widowed?
Children_ind – any info on children?
Alumni_ind – an alum?
Contact_ind – any contact info for donor?
Executive_ind – executive job code?
Leader_ind – Baylor relationship?
gift count – has donor made 15 gifts over
lifetime?
gift_5k – total cum gifts >= $5,000?
gift_25k – total cum gifts >= $25,000?
gift_100k – total cum gifts >= $100,000?
year5_ind – has donor made $250 gift in
EACH of last 5 years?
year2_ind – has donor made ANY gift in
EITHER of last 2 years?
Rating_ind – does donor have Echelon
rating?
Athletic_gift_ind – has donor made gift to
Athletic Department?
Alumn_assoc_ind – has donor made gift to
Alumni Association?
Spouse_alum_ind - is spouse coded as an
alum?
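A few of the gift-based indicators above can be sketched in code. This is a minimal illustration, not the project's implementation: the record layout, the function names, the reading of "last 5 years" as the five calendar years before the as-of year, and the idea that the indicator score is the simple count of 1-valued indicators are all assumptions.

```python
def indicator_flags(donor, as_of_year):
    """Return 1/0 indicator values for one donor record.

    donor["gifts"] maps a calendar year to the amount given that year
    (hypothetical layout).
    """
    gifts = donor["gifts"]
    total = sum(gifts.values())
    last5 = range(as_of_year - 5, as_of_year)  # five years before as_of_year
    last2 = range(as_of_year - 2, as_of_year)  # two years before as_of_year
    return {
        "DOB_50_ind": int(as_of_year - donor["birth_year"] > 50),
        "gift_5k": int(total >= 5_000),
        "gift_25k": int(total >= 25_000),
        "gift_100k": int(total >= 100_000),
        # At least $250 in EACH of the last five years.
        "year5_ind": int(all(gifts.get(y, 0) >= 250 for y in last5)),
        # ANY gift in EITHER of the last two years.
        "year2_ind": int(any(gifts.get(y, 0) > 0 for y in last2)),
    }

def indicator_score(flags):
    """Assumed scoring rule: count the indicators that are set to 1."""
    return sum(flags.values())

# Made-up donor record for illustration.
donor = {
    "birth_year": 1950,
    "gifts": {2000: 300, 2001: 250, 2002: 500, 2003: 250, 2004: 5000},
}
flags = indicator_flags(donor, as_of_year=2005)
score = indicator_score(flags)
```

For this made-up donor, four of the six sketched indicators are set, so the score is 4.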
Indicator Score Distribution
Average Cumulative Gift
Donor Household Profile
64,000+ Households
72% One donor in household
50% Alumni
60% Males
57% Married
19% indicate Baptist religion
58% indicate Texas residences
Non-Donor Household Profile
77,000+ Households
Most data fields have a large percent
of missing values
Donor Model for 2004
Use donors for previous 10 years
Create target variable
Identify predictor variables
Build model
Apply to 2005 donors
Categories of Predictors
Biographical/demographic - 20
Contact information - 12
Degree data – 9
Activities - 15
Gift information - 31
External rating information - 5
Research data - 4
Building Model
Target variable – gift in 2004
– 1 for household with 2004 donation
– 0 for household with no donation in 2004
Predictors constructed from donors
in 1994-2003 time period
Tools -- SAS Enterprise Miner
– Used to build, validate, and score
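The split between target and predictors can be illustrated with a short sketch. The project used SAS Enterprise Miner. The Python below is only a stand-in to show the idea, and the function name, record layout, and the three example predictors are assumptions. The point it illustrates is that predictors draw only on 1994-2003 gifts, so nothing from the 2004 target year leaks into them.

```python
def build_modeling_row(gift_years, target_year=2004, window=(1994, 2003)):
    """Build one modeling row for a household.

    gift_years maps a calendar year to the household's total gifts that year.
    """
    start, end = window
    # Keep only gifts inside the predictor window.
    hist = {y: amt for y, amt in gift_years.items()
            if start <= y <= end and amt > 0}
    return {
        # 1 for a household with a donation in the target year, else 0.
        "target": int(gift_years.get(target_year, 0) > 0),
        # Example predictors (hypothetical), built from the window only.
        "n_gift_years": len(hist),
        "total_given": sum(hist.values()),
        "last_gift_year": max(hist) if hist else None,
    }

# Made-up households for illustration.
households = {
    "H1": {1999: 100, 2003: 50, 2004: 75},
    "H2": {1995: 500},
}
rows = {hid: build_modeling_row(g) for hid, g in households.items()}
```

Here H1 gets target 1 and H2 gets target 0, and H1's 2004 gift is excluded from its predictor totals.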
ROC Model Comparison
Lift Chart
Distribution of Scores
Model Comparisons
ROC curves and Lift charts indicate
all models are performing well
Misclassification rates for the
models are all close to 16%
Very little difference between
average profit for the models
Logistic regression was chosen as the
model to employ
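The comparison measures named above can be computed from a list of actual outcomes and model scores. The following is a minimal sketch with made-up data. The function names and the 0.5 cutoff are assumptions, and Enterprise Miner's own calculations may differ in detail.

```python
def misclassification_rate(y_true, scores, cutoff=0.5):
    """Share of records whose predicted class (score >= cutoff) is wrong."""
    wrong = sum(int(s >= cutoff) != y for y, s in zip(y_true, scores))
    return wrong / len(y_true)

def lift_at(y_true, scores, fraction):
    """Response rate in the top-scored fraction divided by the overall rate."""
    ranked = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate

def auc(y_true, scores):
    """Area under the ROC curve: chance a donor outscores a non-donor."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up outcomes (1 = gave) and model scores for ten households.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.55, 0.3, 0.2, 0.1, 0.05]

error = misclassification_rate(y_true, scores)
top20_lift = lift_at(y_true, scores, 0.2)
area = auc(y_true, scores)
```

When several models score about the same on all of these, as reported here, a simpler and more interpretable model such as logistic regression is a common choice.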
Model Application
Analyze 2004 donors at the end of
June 2005
Determine those who have not made a
donation
Use probability scores to target
those most likely to make a gift
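The application step amounts to filtering out households that have already given and ranking the rest by score. A minimal sketch, assuming scores are held in a dictionary keyed by a hypothetical household ID:

```python
def target_list(scores, already_donated, top_n):
    """Return the top_n household IDs most likely to give, skipping donors.

    scores maps household ID -> predicted probability of making a gift.
    already_donated is the set of IDs with a gift on record this year.
    """
    pending = {h: p for h, p in scores.items() if h not in already_donated}
    return sorted(pending, key=pending.get, reverse=True)[:top_n]

# Made-up scores for illustration.
example_scores = {"H1": 0.82, "H2": 0.35, "H3": 0.91, "H4": 0.67}
contact_first = target_list(example_scores, already_donated={"H3"}, top_n=2)
```

H3 has the highest score but has already donated, so the list starts with H1 and H4.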
Future Work
Application of general model
– Annual gifts
– Major gifts
– Planned gifts
Non-donor model
Gift amount model
Lifetime value model
Thanks!
Questions or Comments?