Data Cleaning

Transcript Data Cleaning

Introduction to ANALYSIS
Data Editing
Your survey is done…
Now what?
There are two things to do before going any further.
A. Edit your forms (surveys)
B. Compose
Edit:
Look for bias.
Edit:
Lack of cooperation
The “Screw You Effect”
“The negative-participant role (also known as the screw-you
effect) in which the participant attempts to discern the
experimenter's hypotheses, but only in order to destroy the
credibility of the study.”
Edit:
Problematic responses
Non-responses
Illegible responses
Illogical responses
10. If you served in the military, in which conflict did you serve?
a) WWII b) Korea c) Viet Nam d) Gulf War e) Iraq/Afghanistan
30. Your age? __36___
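The pairing of question 10 with question 30 shows what an illogical response looks like: a respondent aged 36 cannot plausibly have served in the earliest conflicts listed. A minimal sketch of such a consistency check in Python follows. The survey year of 2010, the minimum service age of 17, and the names `CONFLICT_END_YEAR`, `MIN_SERVICE_AGE`, and `is_illogical` are assumptions made for illustration and do not come from the slides. The end years are the commonly cited ones.

```python
# End year of each conflict listed in question 10 (commonly cited dates).
# None means "no end year assumed", so the check never flags that answer.
CONFLICT_END_YEAR = {
    "WWII": 1945,
    "Korea": 1953,
    "Viet Nam": 1975,
    "Gulf War": 1991,
    "Iraq/Afghanistan": None,
}

MIN_SERVICE_AGE = 17  # assumption: youngest plausible age while serving


def is_illogical(age, conflict, survey_year):
    """Return True if the respondent was too young to serve before the conflict ended."""
    end_year = CONFLICT_END_YEAR[conflict]
    if end_year is None:
        return False
    age_at_end = age - (survey_year - end_year)
    return age_at_end < MIN_SERVICE_AGE


# Hypothetical survey year of 2010; the age of 36 comes from question 30.
for conflict in CONFLICT_END_YEAR:
    print(conflict, is_illogical(36, conflict, survey_year=2010))
```

A flagged case still needs a decision from the editor: eliminate it, or use the other answers to estimate which of the two responses is wrong.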
Edit:
Problematic responses
Non-responses
(could be a case, a variable, or a data point (value))
1. Ignore them
2. Eliminate case
3. Input a value
a. random number
b. the mean
c. a value calculated (usually by regression)
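The three ways of filling in a value for a non-response can be sketched in Python as below. This is only an illustration under stated assumptions: the data are made up, the function names are hypothetical, "random number" is interpreted here as a random draw from the values other respondents gave, and the regression is a simple least-squares line with one predictor.

```python
import random
import statistics

# Made-up example data: None marks a non-response for income.
ages = [25, 32, 41, 38, 29, 50]
incomes = [30, 38, None, 45, 34, None]  # hypothetical units


def impute_random(values, rng):
    """Option a: replace each missing value with a random draw from the observed values."""
    seen = [v for v in values if v is not None]
    return [rng.choice(seen) if v is None else v for v in values]


def impute_mean(values):
    """Option b: replace each missing value with the mean of the observed values."""
    m = statistics.mean(v for v in values if v is not None)
    return [m if v is None else v for v in values]


def impute_regression(xs, ys):
    """Option c: predict each missing y from x using a least-squares line
    fitted to the complete (x, y) pairs."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    mean_x = statistics.mean(x for x, _ in pairs)
    mean_y = statistics.mean(y for _, y in pairs)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in pairs)
             / sum((x - mean_x) ** 2 for x, _ in pairs))
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x if y is None else y for x, y in zip(xs, ys)]


print(impute_random(incomes, random.Random(0)))
print(impute_mean(incomes))
print(impute_regression(ages, incomes))
```

Whichever option is used, it helps to record which values were filled in so they can be told apart from real answers later.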
Edit:
Problematic responses
Illegible responses
1. Guess and input a value
2. Eliminate case
3. Use other answers to estimate answer
Edit:
Problematic responses
Illogical responses
1. Eliminate case
2. Use other answers to estimate answer
Create a codebook:
1. Case code (on each form)
2. Variable codes and labels
3. Scale of variables
4. Open-ended questions (?)
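One way to hold the first three codebook items in machine-readable form is sketched below in Python. The `Variable` class, the `decode` helper, the variable codes `q10` and `q30`, and the numeric codes 1 to 5 are all hypothetical choices made for illustration; the slides do not prescribe a format. Open-ended questions are left out because their coding scheme depends on the answers received.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    code: str    # short variable code used in the data file
    label: str   # human-readable label
    scale: str   # nominal, ordinal, interval, or ratio
    value_labels: dict = field(default_factory=dict)  # numeric code -> answer text


# Hypothetical codebook entries based on survey questions 10 and 30.
codebook = {
    "case_id": Variable("case_id", "Case code written on each form", "nominal"),
    "q10": Variable(
        "q10", "Conflict served in", "nominal",
        {1: "WWII", 2: "Korea", 3: "Viet Nam", 4: "Gulf War", 5: "Iraq/Afghanistan"},
    ),
    "q30": Variable("q30", "Age in years", "ratio"),
}


def decode(variable_code, value):
    """Translate a stored numeric code back into its answer text, if one is defined."""
    return codebook[variable_code].value_labels.get(value, value)


print(decode("q10", 3))   # Viet Nam
print(decode("q30", 36))  # 36 (no value labels, so the number is returned as is)
```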
Now input your data!
Clean up your data.
1. Print out a frequency table for each variable.
2. Look for:
a. Wrong or impossible numbers
b. Outliers (deviance)
c. Missing data
3. Look at cross-tabs for relationships
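The three clean-up steps can be sketched with the Python standard library as follows. The data set, the variable codes, and the valid ranges in `VALID` are made up for illustration, and the function names are hypothetical. Outlier screening is not shown, since the slides do not say which rule to use.

```python
from collections import Counter

# Made-up data set: one dict per case; None marks missing data.
cases = [
    {"case_id": 1, "q10": 4, "q30": 36},
    {"case_id": 2, "q10": 9, "q30": 41},    # 9 is not a valid code for q10
    {"case_id": 3, "q10": 5, "q30": None},  # missing age
    {"case_id": 4, "q10": 3, "q30": 630},   # likely a typing error
    {"case_id": 5, "q10": 4, "q30": 29},
]

# Assumed valid values for each variable (hypothetical limits).
VALID = {"q10": set(range(1, 6)), "q30": set(range(18, 111))}


def frequency_table(rows, var):
    """Step 1: count how often each value of a variable occurs."""
    return Counter(row[var] for row in rows)


def problems(rows, var):
    """Step 2: list the case codes with missing or impossible values."""
    missing = [r["case_id"] for r in rows if r[var] is None]
    invalid = [r["case_id"] for r in rows
               if r[var] is not None and r[var] not in VALID[var]]
    return {"missing": missing, "invalid": invalid}


def crosstab(rows, row_var, col_var):
    """Step 3: count cases for each pair of values of two variables."""
    return Counter((r[row_var], r[col_var]) for r in rows)


for var in ("q10", "q30"):
    print(var, dict(frequency_table(cases, var)), problems(cases, var))
print(dict(crosstab(cases, "q10", "q30")))
```

Because each problem is reported by case code, the original form can be pulled and checked before a value is changed or a case is eliminated.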
Whether you are this guy…
Work out the stats for your project.
Or this guy…
Compose!!