Modeling and Money:
The Two DO Mix
TAIR
February 1, 2006
Baylor University
 Located in Waco, Texas
 Affiliated with Baptist General
Convention of Texas
 Bachelor's/Master's/Doctoral degrees
 Seminary: MDiv and DMin
 Law
 Fall Enrollment approximately 14,000
Nuggets
“If you’ve got terabytes of data, and
you’re relying on
data mining to find
interesting things
in there for you,
you’ve lost before
you’ve even begun.”
— Herb Edelstein
Predictive Modeling at BU
 Enrollment Management
Inquiry to Net Deposit
Accept to Enroll
– Applications of model
• Moving from one stage to another
• Classification of students: new freshmen,
new transfers, graduate, etc.
• Texas and non-Texas students
Enrollment Management Stages
 Inquiry
 Applied
 Accepted
 Deposit
 Net Deposit
 Enroll
 Retention
 Graduation
 Student Retention
– Applications --
• Fall to Spring Retention
• Fall to Fall Retention
• Enroll to Graduation
 Donor Management
– Annual Gift
– Major Donor
– Planned Gift
– Retention/Upgrade
– New Donors
Business Questions
 How can we identify potential major
donors?
 How can we predict propensity of a
donor to make an annual gift?
 How can we identify potential planned
giving donors?
 How can we identify current donors
that can move to next level of giving?
 How can we identify non-donor
constituents with characteristics of a
donor?
 How can we predict expected value of
a gift?
Required Expertise
– Domain
– Data
– Analytical Methods
Project Team
 Representatives from University
Development
 Representatives from Institutional
Research
 SAS Consultants
Process/Steps
 Explore Development data
 Build datasets for descriptive models
 Validate datasets
 Create profiles for analysis
 Build datasets for predictive modeling/mining
 Mine the data
 Create predictive models
 Apply the models
 Test the models
Data Exploration
New database for IR
– Learn and learn more!
– Edit reports and data cleansing
Profiles
 Donor
 Non-donor
 Alumnae donor
 Hispanic donor
 African-American donor
– More data cleansing!
Indicator Score
 Creation of indicator variables with
yes/no (1/0) values
 For Single households
-- 18 indicators
 For Two-person households
-- 25 indicators (7 indicators could be
duplicated)
Indicator Variables
 DOB_50_ind – over 50 years of age?
 Married-Widowed_ind - married or
widowed?
 Children_ind – any info on children?
 Alumni_ind – an alumnus or alumna?
 Contact_ind – any contact info for donor?
 Executive_ind – executive job code?
 Leader_ind – Baylor relationship?
 gift count – has donor made 15 gifts over
lifetime?
 gift_5k – total cum gifts >= $5,000?
 gift_25k – total cum gifts >= $25,000?
 gift_100k – total cum gifts >= $100,000?
 year5_ind – has donor made $250 gift in
EACH of last 5 years?
 year2_ind – has donor made ANY gift in
EITHER of last 2 years?
 Rating_ind – does donor have Echelon
rating?
 Athletic_gift_ind – has donor made gift to
Athletic Department?
 Alumn_assoc_ind – has donor made gift to
Alumni Association?
 Spouse_alum_ind - is spouse coded an
alum?
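A minimal sketch of how a few of the indicators above could be derived and combined. The record layout (`birth_year`, `gifts_by_year`) is hypothetical, "$250 gift" is read as "at least $250", and the score is assumed to be the simple count of indicators equal to 1; the slides do not confirm any of these details.

```python
def indicator_flags(donor, as_of_year):
    """Return 1/0 flags for a subset of the indicators listed above.

    `donor` is a hypothetical dict; its field names are assumptions.
    """
    gifts = donor["gifts_by_year"]  # {year: total dollars given that year}
    cumulative = sum(gifts.values())
    last5 = range(as_of_year - 4, as_of_year + 1)
    last2 = range(as_of_year - 1, as_of_year + 1)
    return {
        "DOB_50_ind": int(as_of_year - donor["birth_year"] > 50),
        "gift_5k": int(cumulative >= 5_000),
        "gift_25k": int(cumulative >= 25_000),
        "gift_100k": int(cumulative >= 100_000),
        # At least $250 in EACH of the last five years
        "year5_ind": int(all(gifts.get(y, 0) >= 250 for y in last5)),
        # ANY gift in EITHER of the last two years
        "year2_ind": int(any(gifts.get(y, 0) > 0 for y in last2)),
    }


def indicator_score(flags):
    """Assumed scoring rule: count the indicators that are set to 1."""
    return sum(flags.values())


example = {
    "birth_year": 1950,
    "gifts_by_year": {2000: 300, 2001: 250, 2002: 5000, 2003: 250, 2004: 400},
}
flags = indicator_flags(example, as_of_year=2004)
score = indicator_score(flags)
```

For this example household the flags for age, gift_5k, year5_ind, and year2_ind are set, so the assumed score is 4 out of the 6 indicators sketched here.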
Indicator Score Distribution
Average Cumulative Gift
Donor Household Profile
 64,000+ Households
 72% One donor in household
 50% Alumni
 60% Males
 57% Married
 19% indicate Baptist religion
 58% indicate Texas residences
Non-Donor Household Profile
 77,000+ Households
 Most data fields have a large percent
of missing values
Donor Model for 2004
 Use donors for previous 10 years
 Create target variable
 Identify predictor variables
 Build model
 Apply to 2005 donors
Categories of Predictors
 Biographical/demographic - 20
 Contact information - 12
 Degree data – 9
 Activities - 15
 Gift information - 31
 External rating information - 5
 Research data - 4
Building Model
 Target variable – gift in 2004
– 1 for household with 2004 donation
– 0 for household with no donation in 2004
 Predictors constructed from donors
in 1994-2003 time period
 Tools -- SAS Enterprise Miner
– Used to build, validate, and score
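The target and predictor construction described above might look like the following sketch. The model itself was built in SAS Enterprise Miner; this Python version only illustrates the idea, and the gift layout and predictor names are hypothetical. The point it shows is that predictors draw only on 1994-2003 activity, so nothing from the target year leaks into them.

```python
def build_modeling_row(household_id, gifts):
    """Build one modeling record for a household.

    `gifts` is a list of (year, amount) tuples (hypothetical layout).
    """
    # Predictors use only the 1994-2003 history window.
    history = [(y, a) for y, a in gifts if 1994 <= y <= 2003]
    # Target: 1 if the household made any donation in 2004, else 0.
    target = int(any(y == 2004 and a > 0 for y, a in gifts))
    return {
        "household_id": household_id,
        "n_gifts_1994_2003": len(history),
        "total_1994_2003": sum(a for _, a in history),
        "last_gift_year": max((y for y, _ in history), default=None),
        "target_2004": target,
    }


row_a = build_modeling_row("H1", [(1998, 100.0), (2003, 50.0), (2004, 75.0)])
row_b = build_modeling_row("H2", [(1995, 500.0)])
```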
ROC Model Comparison
Lift Chart
Distribution of Scores
Model Comparisons
 ROC curves and Lift charts indicate
all models are performing well
 Misclassification rates for the
models are all close to 16%
 Very little difference between
average profit for the models
 Logistic regression was chosen as the
model to employ
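For readers unfamiliar with the comparison measures named above, here is a small sketch of a misclassification rate and of lift for the top-scored fraction of households. The 0.5 cutoff, the function names, and the toy data are assumptions for illustration only; they are not the settings or results from the Baylor models.

```python
def misclassification_rate(y_true, scores, cutoff=0.5):
    """Share of households whose predicted class differs from the actual one."""
    wrong = sum(int(s >= cutoff) != y for y, s in zip(y_true, scores))
    return wrong / len(y_true)


def lift_at(y_true, scores, fraction):
    """Response rate in the top-scored fraction divided by the overall rate."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)  # assumes at least one actual donor
    return top_rate / base_rate


# Toy data: 1 = gave in the target year, 0 = did not.
y_true = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.2, 0.1, 0.05]

error = misclassification_rate(y_true, scores)
lift_top20 = lift_at(y_true, scores, fraction=0.2)
```

A lift above 1 for the top fraction means the model concentrates actual donors among its highest scores, which is what the lift chart displays across all fractions.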
Model Application
 Analyze 2004 donors at the end of
June 2005
 Determine those who have not made a
donation
 Use probability scores to target
those most likely to make a gift
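The application steps above can be sketched as a simple filter and sort. The household IDs, probability values, and the `top_n` parameter are hypothetical; in practice the scores would come from the fitted model.

```python
def target_list(scored_households, donated_this_year, top_n):
    """Rank households that have not yet given by their model probability.

    `scored_households` maps household ID to predicted probability of a gift;
    `donated_this_year` is the set of IDs that have already donated.
    """
    candidates = [
        (hid, p) for hid, p in scored_households.items()
        if hid not in donated_this_year
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_n]


scores = {"H1": 0.91, "H2": 0.35, "H3": 0.78, "H4": 0.66}
already_gave = {"H1"}
targets = target_list(scores, already_gave, top_n=2)
```

Here H1 is excluded because it has already donated, leaving H3 and H4 as the two most likely remaining prospects.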
Future Work
 Application of general model
– Annual gifts
– Major gifts
– Planned gifts
 Non-donor model
 Gift amount model
 Lifetime value model
Thanks!
Questions or Comments?