Handout Two: Describing/Explaining
Quantitative Data and Introduction to
SPSS
EPSE 592
Experimental Designs and Analysis in Educational
Research
Instructor: Dr. Amery Wu
1
About Analysis of Variance Designs
• Measurement of the data: quantitative
• Type of statistical inference: descriptive and inferential
• Type of Modeling: summative/descriptive and
explanatory/predictive.
Analysis of Variance Design
Type of Inference | Measurement of Data: Quantitative | Measurement of Data: Categorical
Descriptive | Summative/Descriptive; Explanatory/Predictive | Summative/Descriptive; Explanatory/Predictive
Inferential | Summative/Descriptive; Explanatory/Predictive | Summative/Descriptive; Explanatory/Predictive
2
Goals of Today’s Class
Review of “Describing and Explaining Quantitative Data”
Brushing up Your SPSS
3
Computing the Standard Deviation of a Sample
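- D2CAR
D: Deviation (subtract the mean from each score)
2: Square (square each deviation)
C: Collection (sum the squared deviations)
A: Average (divide the sum; the result is the variance)
R: Square Root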
Lab Activity-See Excel File “Mean, SD, Z
Scores, & Pearson’s Correlation”
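The D2CAR steps can be sketched in Python as follows. The scores are hypothetical, and the function name `sd_d2car` is made up for illustration. Dividing by n - 1 in the Average step is an assumption that follows the usual sample-SD convention; pass `sample=False` to divide by n instead.

```python
import math

def sd_d2car(scores, sample=True):
    """Standard deviation computed with the D2CAR steps."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]   # D: deviation from the mean
    squared = [d ** 2 for d in deviations]    # 2: square each deviation
    collected = sum(squared)                  # C: collect (sum) the squares
    divisor = n - 1 if sample else n
    average = collected / divisor             # A: average (the variance)
    return math.sqrt(average)                 # R: square root

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

print(sd_d2car(scores, sample=False))  # divides by n
print(sd_d2car(scores))                # divides by n - 1
```

With these scores the squared deviations sum to 32, so the two results are the square roots of 32/8 and 32/7.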
4
Interpretation of the Standard Deviation of a
Sample
• Individuals in a sample differ in their values
of the DV. These differences are what we are
interested in studying.
• Standard deviation (SD) is a summative
measure of the extent to which individuals in
a sample differ.
• Within a sample, some individuals have
values close to the mean, others far away
from the mean. The standard deviation is,
roughly, the average distance from the mean
across the n individuals in a given sample.
5
Transforming the Raw Scores to the Z Scores
Lab Activity-See Excel File “Mean, SD, Z Scores, &
Pearson’s Correlation”
6
Interpretations of the Z Scores
• The Z-score transformation rescales the data to have a
center of 0 and a unit of 1; this is called standardization.
That is, the mean of the Z scores of a sample is 0,
and the SD is 1.
• A person’s Z score indicates how many standard
units he/she is away from the mean, on either side.
• A Z score shows a person’s standing relative to others
in the sample, on a scale from -∞ to ∞.
• For example, if Mary’s Z score is -1.75, she is 1.75
standard units below the mean (on its left-hand
side).
• Note that the Z-score transformation does not normalize
a skewed raw score distribution.
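A minimal sketch of the points above, using hypothetical scores. The helper names `z_scores` and `skewness` are made up for illustration, the sample SD (n - 1) is an assumption, and the skewness index used here (the average cubed Z score) is only one of several common definitions.

```python
import statistics

def z_scores(scores):
    """Standardize scores: subtract the mean, divide by the SD."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1); an assumption
    return [(x - mean) / sd for x in scores]

def skewness(scores):
    """A simple skewness index: the average cubed Z score."""
    z = z_scores(scores)
    return sum(v ** 3 for v in z) / len(z)

raw = [1, 2, 2, 3, 3, 3, 4, 20]  # hypothetical, positively skewed scores
z = z_scores(raw)

# The Z scores have a mean of 0 and an SD of 1 (up to rounding error).
print(round(statistics.mean(z), 6), round(statistics.stdev(z), 6))

# The skewness index is the same before and after the transformation.
print(round(skewness(raw), 6), round(skewness(z), 6))
```

The shape of the distribution is unchanged; only the center and the unit are rescaled.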
7
Computing the Pearson’s Correlation r
Lab Activity-See Excel File “Mean, SD, Z Scores &
Pearson’s Correlation”
8
Use & Interpretation of Pearson’s r
• At least one of X and Y should be quantitative data.
• X and Y are assumed to be linearly related.
• It is a standardized measure (-1 to 1) that quantifies the
covariation between the two variables.
• A positive r indicates that when people’s X scores are high (low),
their Y scores tend to be high (low). A negative r
indicates that when people’s X scores are high, their Y scores
tend to be low. If there is no linear trend between the scores of
X and Y, then r is zero.
• The square of r, the coefficient of determination,
provides an estimate of the proportion of overlapping
variance between X and Y (i.e., the degree to which the
two sets of numbers vary together).
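One way to see r as a standardized measure of covariation is to compute it from Z scores. This is a sketch with hypothetical X and Y scores; the function name `pearson_r` is made up for illustration, and the n - 1 divisor matches the sample SD used for the Z scores.

```python
import statistics

def pearson_r(x, y):
    """Pearson's r as the (n - 1)-averaged product of paired Z scores."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
    zx = [(v - mean_x) / sd_x for v in x]
    zy = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

x = [1, 2, 3, 4, 5]   # hypothetical X scores
y = [2, 4, 5, 4, 5]   # hypothetical Y scores

r = pearson_r(x, y)
print(round(r, 4))       # positive: high X tends to go with high Y
print(round(r ** 2, 4))  # coefficient of determination

# A perfectly decreasing linear relation gives r = -1 (up to rounding).
print(round(pearson_r(x, [10, 8, 6, 4, 2]), 4))
```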
9
Use of Data in the Course
Data Source
Special thanks to Professor Susan J. Henly of the
School of Nursing, University of Minnesota,
for the SPSS data file presented in today’s class.
Ethics for Data Use
Under the guidelines of the Behavioral Research
Ethics Board (BREB) at UBC, data circulated in this
course cannot be used for purposes other than
the learning activities required by this course,
unless they are open to public use.
10
Description of Professor Susan J. Henly’s Data
This data set includes 40 participants (20 boys) who were
randomly assigned to the treatment group (a new method to reduce
injection pain) or the control group (just do it quickly!).
Immediately after the injection, the children were asked to
rate their pain on a 0-100 scale, while a nurse observer
who could not hear their responses also rated their pain
based on their behavioural cues.
The dependent variable (i.e., data) we are modeling
(describing/summarizing or explaining/predicting) today is
the level of pain reported by the kid, “kidrate”.
Q: Judging by the above description, what was the research
question? What type of design was used? What type of data
were collected? And what kind of inference could be made?
11
Quantitative Methodology Network
Research Question
Design: Experimental; Observational
Inference: Descriptive vs. Inferential; Relational vs. Causal
Model: Descriptive/Summative; Explanatory/Predictive
Data: Continuous; Categorical
12
This Is Where We will be Today (A, Blue Cell)
Type of the Inference | Measurement of Data: Continuous | Measurement of Data: Categorical
Descriptive | A | B
Inferential | C | D
• Remember, what we are doing is modeling the data by
1. Describing/Summarizing
2. Explaining/Predicting
Data = Model + Residual
• Note that the inferences remain at the sample level, with no
intention to generalize to the population. That is, neither
C nor D is covered today.
13
Describing/Summarizing Central Tendency by Using
Numbers
Statistics: kidrate
N Valid | 40
N Missing | 0
Mean | 65.3250
Median | 64.5000
Mode | 77.00
Q1: In your opinion, which statistic
best characterizes the central tendency
of kidrate, and why?
Q2: Can you tell approximately whether
the distribution of kidrate is normal,
positively skewed, or negatively skewed,
and how?
kidrate | Frequency | Percent | Valid Percent | Cumulative Percent
35.00 | 1 | 2.5 | 2.5 | 2.5
36.00 | 1 | 2.5 | 2.5 | 5.0
38.00 | 1 | 2.5 | 2.5 | 7.5
41.00 | 1 | 2.5 | 2.5 | 10.0
46.00 | 2 | 5.0 | 5.0 | 15.0
47.00 | 1 | 2.5 | 2.5 | 17.5
50.00 | 2 | 5.0 | 5.0 | 22.5
51.00 | 1 | 2.5 | 2.5 | 25.0
52.00 | 1 | 2.5 | 2.5 | 27.5
54.00 | 1 | 2.5 | 2.5 | 30.0
57.00 | 1 | 2.5 | 2.5 | 32.5
59.00 | 1 | 2.5 | 2.5 | 35.0
60.00 | 2 | 5.0 | 5.0 | 40.0
61.00 | 2 | 5.0 | 5.0 | 45.0
63.00 | 1 | 2.5 | 2.5 | 47.5
64.00 | 1 | 2.5 | 2.5 | 50.0
65.00 | 2 | 5.0 | 5.0 | 55.0
69.00 | 2 | 5.0 | 5.0 | 60.0
70.00 | 1 | 2.5 | 2.5 | 62.5
71.00 | 1 | 2.5 | 2.5 | 65.0
72.00 | 1 | 2.5 | 2.5 | 67.5
73.00 | 1 | 2.5 | 2.5 | 70.0
74.00 | 1 | 2.5 | 2.5 | 72.5
75.00 | 1 | 2.5 | 2.5 | 75.0
77.00 | 3 | 7.5 | 7.5 | 82.5
83.00 | 1 | 2.5 | 2.5 | 85.0
84.00 | 1 | 2.5 | 2.5 | 87.5
86.00 | 1 | 2.5 | 2.5 | 90.0
95.00 | 1 | 2.5 | 2.5 | 92.5
100.00 | 3 | 7.5 | 7.5 | 100.0
Total | 40 | 100.0 | 100.0 |
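As a check on the SPSS output, the central tendency statistics can be recomputed from the frequency table. This is a sketch in Python rather than SPSS; the dictionary `FREQ` simply copies the value and frequency columns of the table above.

```python
from statistics import mean, median, multimode

# kidrate value -> frequency, copied from the frequency table
FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}

# Expand the table back into the 40 individual ratings.
kidrate = [value for value, count in FREQ.items() for _ in range(count)]

print(len(kidrate))        # 40
print(mean(kidrate))       # 65.325
print(median(kidrate))     # 64.5
print(multimode(kidrate))  # [77, 100]
```

Two values tie for the mode. SPSS reports only the smaller one (77.00) in the Statistics table.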
14
Describing/Summarizing Dispersion by Using
Numbers
Descriptives: kidrate
Measure | Statistic | Std. Error
Mean | 65.3250 | 2.74221
95% Confidence Interval for Mean, Lower Bound | 59.7784 |
95% Confidence Interval for Mean, Upper Bound | 70.8716 |
5% Trimmed Mean | 65.0556 |
Median | 64.5000 |
Variance | 300.789 |
Std. Deviation | 17.34327 |
Minimum | 35.00 |
Maximum | 100.00 |
Range | 65.00 |
Interquartile Range | 25.25 |
Skewness | .289 | .374
Kurtosis | -.374 | .733
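Several of the dispersion statistics can be reproduced from the kidrate frequency table. This is a sketch in Python rather than SPSS; `FREQ` copies the value and frequency columns of that table, and the variance and SD use n - 1 in the denominator.

```python
import math
from statistics import stdev, variance

# kidrate value -> frequency, copied from the frequency table
FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = [value for value, count in FREQ.items() for _ in range(count)]

n = len(kidrate)
var = variance(kidrate)                    # SPSS reports 300.789
sd = stdev(kidrate)                        # SPSS reports 17.34327
se_mean = sd / math.sqrt(n)                # SPSS reports 2.74221
value_range = max(kidrate) - min(kidrate)  # SPSS reports 65.00

print(round(var, 3), round(sd, 5), round(se_mean, 5), value_range)
```

The standard error of the mean is the SD divided by the square root of n, which is why it is much smaller than the SD itself.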
Q1: How can the minimum and maximum help detect aberrant data
points?
Q2: By looking at the mean and SD, can you tell whether the data are
normally distributed, positively skewed, or negatively skewed?
15
Describing & Summarizing Distribution by
Using Pictures
[Figure: Histogram of kidrate. X-axis: kidrate (30.00 to 100.00); y-axis: Frequency (0 to 10). Mean = 65.325, Std. Dev. = 17.34327, N = 40]
Q: What are the advantages and disadvantages of displaying
data using a histogram?
16
Describing & Summarizing Distribution by
Using Pictures
Q: What are the advantages and disadvantages of
displaying data using a stem and leaf plot?
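For reference, a stem and leaf plot of kidrate can be sketched from the frequency table. This is a Python illustration, not SPSS output; the layout (tens as stems, units as leaves, one row per stem) is an assumption, and SPSS may split or label the stems differently. `FREQ` copies the value and frequency columns of the kidrate frequency table, and `stem_and_leaf` is a made-up helper name.

```python
from collections import defaultdict

# kidrate value -> frequency, copied from the frequency table
FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = [value for value, count in FREQ.items() for _ in range(count)]

def stem_and_leaf(values):
    """Return {stem: [leaves]} with tens as stems and units as leaves."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

for stem, leaves in stem_and_leaf(kidrate).items():
    print(f"{stem:>2} | {''.join(str(leaf) for leaf in leaves)}")
```

Unlike a histogram, each row keeps the individual values, so the raw data can be read back from the plot.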
17
Describing & Summarizing Distribution by
Using Pictures
[Figure: Boxplot of kidrate. Y-axis: kidrate (30.00 to 100.00)]
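The numbers behind a boxplot can be sketched from the kidrate frequency table. This is a Python illustration under stated assumptions: the quartiles use the default "exclusive" method of `statistics.quantiles`, and outliers are flagged with the common 1.5 x IQR rule. SPSS may draw the box from slightly different hinge values.

```python
from statistics import quantiles

# kidrate value -> frequency, copied from the frequency table
FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = sorted(value for value, count in FREQ.items() for _ in range(count))

q1, q2, q3 = quantiles(kidrate, n=4)  # lower quartile, median, upper quartile
iqr = q3 - q1                         # length of the box

# Assumed convention: points beyond 1.5 * IQR from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in kidrate if v < lower_fence or v > upper_fence]

print(min(kidrate), q1, q2, q3, max(kidrate))  # 35 51.25 64.5 76.5 100
print(iqr, outliers)                           # 25.25 []
```

Under these assumptions the interquartile range agrees with the 25.25 in the Descriptives output, and no rating is flagged as an outlier.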
18
Explaining/Predicting the data (kidrate) by
Gender- Using Numbers
Statistics: kidrate
Statistic | Boy | Girl
Mean | 62.5000 | 68.1500
Median | 61.5000 | 71.5000
Mode | 60.00a | 77.00a
Std. Deviation | 14.89790 | 19.45920
Variance | 221.947 | 378.661
Skewness | .249 | .131
Std. Error of Skewness | .512 | .512
Range | 59.00 | 65.00
Minimum | 36.00 | 35.00
Maximum | 95.00 | 100.00
a. Multiple modes exist. The smallest value is shown.
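To get a rough sense of how much gender explains, the variation in kidrate can be split into a between-group part and a within-group part using only the summary statistics above. This is a sketch, not part of the SPSS output; it assumes 20 boys and 20 girls, as stated in the data description, and the variable names are made up for illustration.

```python
# Group summaries copied from the Statistics table (n = 20 per group).
n_boy = n_girl = 20
mean_boy, mean_girl = 62.50, 68.15
var_boy, var_girl = 221.947, 378.661

n_total = n_boy + n_girl
grand_mean = (n_boy * mean_boy + n_girl * mean_girl) / n_total

# Variation of the group means around the grand mean (explained by gender).
ss_between = (n_boy * (mean_boy - grand_mean) ** 2
              + n_girl * (mean_girl - grand_mean) ** 2)

# Variation of individuals around their own group mean (residual).
ss_within = (n_boy - 1) * var_boy + (n_girl - 1) * var_girl

ss_total = ss_between + ss_within
proportion_explained = ss_between / ss_total

print(round(grand_mean, 3))
print(round(ss_between, 3), round(ss_within, 3), round(ss_total, 3))
print(round(proportion_explained, 4))
```

In this sample, gender accounts for only a small share (under 3%) of the variation in kidrate.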
19
Explaining/Predicting the data (kidrate) by
Gender - Using Pictures
20
Revisiting the Concept of Statistical Modeling Using Mean
Data = Model + Res.
Kidrate = Mean + Res.
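The idea that Data = Model + Residual can be sketched with the mean as the model. This is a Python illustration; `FREQ` copies the value and frequency columns of the kidrate frequency table.

```python
from statistics import mean

# kidrate value -> frequency, copied from the frequency table
FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = [value for value, count in FREQ.items() for _ in range(count)]

model = mean(kidrate)                      # the model: one number for everyone
residuals = [x - model for x in kidrate]   # what the model leaves unexplained

# Every rating is recovered as model + residual.
rebuilt = [model + res for res in residuals]
print(all(abs(a - b) < 1e-9 for a, b in zip(rebuilt, kidrate)))

# Residuals around the mean cancel out (sum to 0, up to rounding error).
print(round(sum(residuals), 6))
```

Replacing the single mean with a separate mean for boys and girls gives the explanatory version of the same decomposition.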
21
Variable View of SPSS Data Editor
- To specify the format of the spreadsheet
22
Data View of SPSS Data Editor
- To enter and view the raw data
23
Lab Activity- Hands on SPSS (Statistics)
Please report the following statistics for the variable
“Nurse-rated Pain”
Instruction: Analyze/Descriptive Statistics/Frequencies/
Variables (Enter Nurse-rated Pain)/Statistics…
Central Tendency
Mean
Median
Mode
Dispersion
Minimum/Maximum
Range
Quartiles/Interquartile Range
SD/Variance
Alternatively, you can use
Instruction: Analyze/Descriptive Statistics/
Descriptives/ Variables (Enter Nurse-rated Pain)
/Options
24
Lab Activity- Hands on SPSS (Graphs)
Please report the histogram for the variable
“Nurse-rated Pain”
Instruction: Analyze/Descriptive Statistics/Frequencies/
Variables(Enter Nurse-rated Pain)/Charts/Histograms
Alternatively, you can use Graphs menu
Instruction: Graphs/Histogram/Variable
(Enter Nurse-rated Pain)
25
Lab Activity- Hands on SPSS (Explore)
My personal preference for describing a continuous
variable is to use the following command, which gives
comprehensive output of the key information in both
numbers and pictures.
Instruction: Analyze/Descriptive Statistics/Explore/
Dependent List (Enter Nurse-rated Pain)
26
Becoming A Competent User Of SPSS
How do I remember all these commands & paths?
1. There is no need to memorize them! Explore
the drop-down menus.
2. Your navigation of SPSS should be guided by
the conceptual frameworks and the statistical
methods you learned in this or previous stats
courses.
3. SPSS is just a tool, not a brain! Be a clever
user!
27
Supplemental Learning Resources for SPSS
You can find very useful YouTube tutorials on various SPSS
tools and analyses. They are often less time-consuming
than reading texts.
As necessary, read the following chapters from the website
of the Social Science Research and Instructional Council
(SSRIC):
http://www.csubak.edu/ssric-trd/spss/spsfirst.htm
Chapter One: Getting Started With SPSS for Windows
Chapter Two: Creating a Data File
Chapter Three: Transforming Data
Chapter Four: Univariate Statistics
28
This Is Where We Have Been Today
Type of the Inference | Measurement of Data: Continuous | Measurement of Data: Categorical
Descriptive | Summative/Descriptive; Explanatory/Predictive | Summative/Descriptive; Explanatory/Predictive
Inferential | Summative/Descriptive; Explanatory/Predictive | Summative/Descriptive; Explanatory/Predictive
29