Transcript Data Mining

Predicting Blood Donor’s Attrition
with Data Mining Methods
Xing Wan
Introduction

Blood supply in the U.S. is always
inadequate

Recruitment and retention of blood donors have
become a top priority

RESEARCH QUESTIONS


What kinds of blood donors are more likely to
drop out?
What critical factors may lead to donor
attrition?

BRIEF LITERATURE REVIEW

Current situation of blood supply in the U.S.

Prior survey-based research found that some
factors (such as donation experience and a
convenient donation location) may influence a
donor’s decision to return

A recent study on freshman student attrition
used data mining techniques
Datasets

Donation records
Disaster relief appeals and incentive information
Possible survey data
Combined into the whole dataset for modeling
Methodology and Tools

CRISP-DM

Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation (10-fold cross-validation approach)
Deployment
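
The 10-fold cross-validation named in the Evaluation step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the SAS workflow listed under Tools; the function names, the sample size of 25, the toy labels, and the majority-class baseline are all hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Average the score returned by fit_and_score over the k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)

# Hypothetical toy labels: 1 = dropout, 0 = returned.
labels = [1 if i % 5 == 0 else 0 for i in range(25)]

def majority_baseline(train_idx, test_idx):
    """Predict the majority class of the training fold; return test accuracy."""
    ones = sum(labels[i] for i in train_idx)
    majority = 1 if 2 * ones > len(train_idx) else 0
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)

splits = list(k_fold_indices(len(labels), k=10))
mean_accuracy = cross_validate(len(labels), majority_baseline, k=10)
```

Any of the classifiers considered in the Modeling step could take the place of the baseline, as long as it is fitted only on the training indices of each fold.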
Tools:

SAS DM
SAS EG
Base SAS
Procedure

Data preprocessing

Dependent Variable

Target_dropout: coded as “1” if a donor did not
return after his/her first donation, and “0” otherwise
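
As a rough illustration of how Target_dropout could be derived from donation records, the sketch below counts donations per donor. The record layout (donor ID, donation date) and the function name are assumptions; the slides do not define a follow-up window, so a real implementation would also need to fix an observation period before labeling a donor as a dropout.

```python
from collections import defaultdict

def code_target_dropout(donations):
    """Return {donor_id: 1 or 0}; 1 means only one donation is on record."""
    counts = defaultdict(int)
    for donor_id, _date in donations:
        counts[donor_id] += 1
    return {donor_id: 1 if n == 1 else 0 for donor_id, n in counts.items()}

# Hypothetical donation records: (donor_id, donation_date)
records = [
    ("A", "2009-01-10"),
    ("A", "2009-06-02"),
    ("B", "2009-03-15"),
    ("C", "2009-04-20"),
    ("C", "2010-01-05"),
]
target_dropout = code_target_dropout(records)
```

In this toy data only donor B never returned after the first donation, so only B is coded as 1.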

Modeling

Popular classification methods


Logistic Regression; Decision Tree; ANN; SVM
Ensemble techniques

Bagging, boosting, and information fusion
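
The combining step shared by these ensemble techniques can be sketched as below. This shows only how predictions from several already-trained models might be merged: an unweighted majority vote (as commonly used in bagging) and a weighted average of predicted probabilities (one plausible reading of information fusion). The model outputs and weights are hypothetical, and the resampling and reweighting used to train bagged or boosted models are not shown.

```python
def majority_vote(predictions):
    """predictions: one list of 0/1 labels per model; ties resolve to 1."""
    n_models = len(predictions)
    return [1 if 2 * sum(votes) >= n_models else 0 for votes in zip(*predictions)]

def weighted_fusion(probabilities, weights, threshold=0.5):
    """Combine per-model dropout probabilities with a weighted average."""
    total = sum(weights)
    fused = []
    for probs in zip(*probabilities):
        score = sum(w * p for w, p in zip(weights, probs)) / total
        fused.append(1 if score >= threshold else 0)
    return fused

# Hypothetical outputs of three models for four donors.
model_labels = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0]]
model_probs = [[0.9, 0.2, 0.6, 0.1], [0.7, 0.6, 0.4, 0.3], [0.4, 0.7, 0.8, 0.2]]
model_weights = [0.5, 0.3, 0.2]  # assumed, e.g. proportional to validation accuracy

voted = majority_vote(model_labels)
fused = weighted_fusion(model_probs, model_weights)
```

The two rules can disagree: for the second donor the vote predicts dropout, while the weighted average does not, because the most heavily weighted model assigns a low probability.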

Sensitivity analysis


Measures the importance of an independent
variable based on the change in model
performance that occurs when this variable is
excluded from the model.
The greater the performance decrease, the
greater the variable's importance ratio.
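
The leave-one-variable-out idea described above can be sketched as follows. Several simplifications are assumed: a nearest-centroid classifier stands in for the actual models, accuracy is measured on the training data rather than by cross-validation, and importance is reported as a plain accuracy drop rather than a ratio. The feature names and data are hypothetical.

```python
def centroid(rows):
    """Column-wise mean of a list of feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_predict(X_train, y_train, X_test):
    """Nearest-centroid classifier: a deliberately simple stand-in model."""
    centroids = {
        label: centroid([x for x, y in zip(X_train, y_train) if y == label])
        for label in sorted(set(y_train))
    }
    def nearest(x):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))
    return [nearest(x) for x in X_test]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def sensitivity(X, y, names):
    """Accuracy drop when each variable is left out of the model."""
    full = accuracy(y, fit_predict(X, y, X))
    drops = {}
    for j, name in enumerate(names):
        X_reduced = [row[:j] + row[j + 1:] for row in X]
        drops[name] = full - accuracy(y, fit_predict(X_reduced, y, X_reduced))
    return drops

# Hypothetical data: the first feature separates the classes, the second does not.
X = [[1, 5], [1, 3], [1, 4], [0, 5], [0, 3], [0, 4]]
y = [1, 1, 1, 0, 0, 0]
drops = sensitivity(X, y, ["adverse_reaction", "noise_feature"])
```

Here removing the informative feature lowers accuracy from 1.0 to 0.5, while removing the uninformative one changes nothing, so the first variable would be ranked as more important.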
Results

Identify potential dropouts by using a
data mining model
Identify important independent
variables or predictors
Develop and deploy retention
strategies
References
1. Cheng, E., Chan, C., & Chau, M. (2010). Data Analysis for Healthcare: A
Case Study in Blood Donation Center Analysis. Americas Conference on
Information Systems.
2. Delen, D. (2010). A comparative analysis of machine learning techniques for
student retention management. Decision Support Systems, 49, 498-506.
3. Masser, B., White, K., Hyde, M., & Terry, D. (2008). The Psychology of Blood
Donation: Current Research and Future Directions. Transfusion Medicine
Reviews, 22(3), 215-233.
4. Saltelli, A. (2002). Making best use of model evaluations to compute
sensitivity indices. Computer Physics Communications, 145, 280-297.
5. Schlumpf, K., Glynn, S., Schreiber, G., & Wright, D. (2008). Factors
influencing donor return. Transfusion, 48, 264-272.
6. Schreiber, G., Sanchez, A., Glynn, S., & Wright, D. (2003). Increasing blood
availability by changing donation patterns. Transfusion, 43, 590-597.
7. SPSS. (2010). SPSS PASW Modeler (formerly Clementine) User Manual: A
Comprehensive Data Mining Toolkit.
8. Yu, P., Chung, K., Lin, C., Chan, J., & Lee, C. (2007). Predicting potential
drop-out and future commitment for first-time donors based on first 1.5-year
donation patterns: the case in Hong Kong Chinese donors. Vox Sanguinis, 93,
57–63.