Chapter 5 PPT
Statistical Analysis of Data
Graziano and Raulin
Research Methods: Chapter 5
Graziano & Raulin (2000)
Individual Differences
A fact of life
– People differ from one another
– People differ from one occasion to another
Most psychological variables have effects
that are small compared to individual
differences
Statistics give us a way to detect such subtle
effects in a sea of individual differences
Descriptive Statistics
Are used to describe the data
Many types of descriptive statistics
– Frequency distributions
– Summary measures such as measures of central tendency, variability, and relationship
– Graphical representations of the data
A way to visualize the data
The first step in any statistical analysis
Frequency Distributions
First step in organization of data
– Can see how the scores are distributed
Used with all types of data
Illustrate relationships between variables in
a cross-tabulation
Simplify distributions with a large range by
using a grouped frequency distribution
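The three kinds of frequency distribution named on this slide can be sketched in a few lines of Python. All of the data below are hypothetical, and the helper name `grouped_frequencies` is made up for illustration. The grouping helper assumes whole-number scores.

```python
from collections import Counter

# Hypothetical scores on a 1-6 scale (illustrative data, not from the slides).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6]

# Simple frequency distribution: how many times each score occurs.
freq = Counter(scores)
for score in sorted(freq):
    print(score, freq[score])


def grouped_frequencies(values, width):
    """Count whole-number values in equal-width intervals (hypothetical helper)."""
    groups = Counter((v // width) * width for v in values)
    return {(low, low + width - 1): n for low, n in sorted(groups.items())}


# Grouped frequency distribution for a variable with a large range.
ages = [18, 19, 21, 22, 25, 31, 34, 38, 42, 47, 55, 63]
grouped = grouped_frequencies(ages, 10)
print(grouped)

# Cross-tabulation: joint frequencies of two categorical variables.
# Each pair is (group, outcome) for one hypothetical participant.
pairs = [
    ("treatment", "improved"), ("treatment", "improved"),
    ("treatment", "not improved"), ("control", "improved"),
    ("control", "not improved"), ("control", "not improved"),
]
crosstab = Counter(pairs)
print(crosstab[("treatment", "improved")], crosstab[("control", "improved")])
```

Counting with `Counter` keeps the sketch dependency-free; a data-analysis library would normally be used for larger tables.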
Histograms
A bar graph, as shown
at the right
Can be used to graph either
– Data representing discrete categories
– Data representing scores from a continuous variable
[Figure: Sample Histogram. x-axis: Scores (1 to 6); y-axis: Freq (0 to 60)]
Histograms (2 distributions)
Possible to graph two
or more distributions
on the same histogram
to see how they
compare
Note that one of the
two groups in this
histogram was the
same group graphed
previously
[Figure: Sample Histogram, two distributions. x-axis: Scores (1 to 6); y-axis: Freq (0 to 80)]
Frequency Polygon (1 group)
Like a histogram
except that, instead of
a bar representing the
mean score or
frequency, a dot is
used, with the dots
connected as shown
Data could be either
discrete or continuous
[Figure: Sample Frequency Polygon. x-axis: Scores (1 to 6); y-axis ticks: 0 to 60]
Frequency Polygon (2 groups)
Can compare two or
more frequency
polygons on the same
scale as shown
Easier to compare
frequency polygons
because the graph
appears less cluttered
than multiple
histograms
[Figure: Sample Frequency Polygon, two groups. x-axis: Scores (1 to 6); y-axis ticks: 0 to 80]
Shapes of Distributions
Many variables in
psychology are
distributed normally
The distribution is
skewed if scores
bunch up at one end
Illustrated on the left
are symmetric and
skewed distributions
Measures of Central Tendency
Mode: the most frequently occurring score
– Easy to compute from frequency distribution
Median: the middle score in a distribution
– Less affected than the mean by a few deviant scores
Mean: the arithmetic average
– Most commonly used central tendency measure
– Used in later inferential statistics
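As a rough illustration of the three measures, the sketch below uses Python's standard `statistics` module on a hypothetical set of scores. The last score is deliberately deviant, to show why the median is described as less affected than the mean.

```python
import statistics

# Hypothetical scores; the last value is a deliberately deviant score.
scores = [2, 3, 3, 4, 5, 5, 5, 6, 40]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # middle score of the sorted list
mean = statistics.mean(scores)      # arithmetic average
print(mode, median, mean)

# Dropping the deviant score shifts the mean far more than the median.
trimmed = scores[:-1]
print(statistics.median(trimmed), statistics.mean(trimmed))
```

With the deviant score included, the mean is pulled well above most of the scores while the median stays near the center of the distribution.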
Computing the Mean
Compute the mean of
3, 4, 2, 5, 7, & 5
Sum the numbers
– 26
Count the numbers
– 6
Plug these values into the equation:
$$\bar{X} = \frac{\sum X}{N} = \frac{26}{6} = 4.33$$
Measuring Variability
Range: lowest to highest score
Average Deviation: average distance from
the mean
Variance: average squared distance from
the mean
– Used in later inferential statistics
Standard Deviation: square root of variance
– Expressed on the same scale as the mean
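The four variability measures can be worked through on the six scores from the Computing the Mean slide (3, 4, 2, 5, 7, 5). This is a minimal sketch that follows the slide's wording and divides by N. The N - 1 version shown at the end is an assumption about common practice when estimating from a sample, not something stated on the slide.

```python
import math

scores = [3, 4, 2, 5, 7, 5]  # the six scores from the mean example
n = len(scores)
mean = sum(scores) / n

# Range: distance from the lowest to the highest score.
score_range = max(scores) - min(scores)

# Average deviation: mean absolute distance from the mean.
average_deviation = sum(abs(x - mean) for x in scores) / n

# Variance: mean squared distance from the mean (divides by N, as the slide words it).
variance = sum((x - mean) ** 2 for x in scores) / n

# Standard deviation: square root of the variance, back on the original scale.
standard_deviation = math.sqrt(variance)

# Assumption: when estimating a population variance from a sample,
# dividing by N - 1 is the usual convention.
sample_variance = sum((x - mean) ** 2 for x in scores) / (n - 1)

print(score_range, round(average_deviation, 2), round(variance, 2),
      round(standard_deviation, 2), round(sample_variance, 2))
```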
Measures of Relationship
Pearson product-moment correlation
– Used with interval or ratio data
Spearman rank-order correlation
– Used when one variable is ordinal and the second is at least ordinal
Scatter plots
– Visual representation of a correlation
– Helps to identify nonlinear relationships
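A small sketch of the two correlation coefficients, written in plain Python so the arithmetic is visible. The function names and data are hypothetical. The Spearman version here is simply the Pearson formula applied to ranks, with tied scores sharing their average rank.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1 for the smallest value; ties share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Because the relationship is perfectly ordered but curved, the rank-order coefficient reaches 1 while the Pearson coefficient falls slightly short of it, which is the kind of nonlinearity a scatter plot makes visible.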
Regression
Using a correlation (relationship between
variables) to predict one variable from
knowing the score on the other variable
Usually a linear regression (finding the best
fitting straight line for the data)
Best illustrated in a scatter plot with the
regression line also plotted (see Figure 5.6)
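Finding the best-fitting straight line can be sketched with the usual least-squares formulas. The variable names and data below are hypothetical and are not taken from Figure 5.6.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
exam = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, exam)

# Predict the exam score of a student who studied 6 hours.
predicted = intercept + slope * 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

Predicting for 6 hours extends the line slightly beyond the observed data, which is only reasonable if the relationship stays linear there.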
Reliability Indices
Test-retest reliability and interrater
reliability are indexed with a Pearson
product-moment correlation
Internal consistency reliability is indexed
with coefficient alpha
Details on these computations are included
on the CD supplement
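For readers without the CD supplement, coefficient alpha can be sketched from its standard textbook formula: k / (k - 1) times one minus the ratio of summed item variances to the variance of total scores. The data layout and function name below are assumptions for illustration and may differ from the supplement's presentation.

```python
import statistics


def coefficient_alpha(items):
    """Coefficient alpha for a list of k items.

    `items` holds one list per test item; each inner list contains that
    item's scores for the same participants in the same order.
    """
    k = len(items)
    summed_item_variances = sum(statistics.variance(item) for item in items)
    totals = [sum(person) for person in zip(*items)]  # total score per participant
    return (k / (k - 1)) * (1 - summed_item_variances / statistics.variance(totals))


# Hypothetical responses: 3 items answered by 4 participants.
items = [
    [2, 3, 4, 5],
    [3, 3, 5, 5],
    [1, 3, 4, 4],
]
alpha = coefficient_alpha(items)
print(round(alpha, 3))
```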
Standard Scores (Z-scores)
A way to put scores on a common scale
Computed by subtracting the mean from the
score and dividing by the standard deviation
Interpreting the Z-score
– Positive Z-scores are above the mean; negative Z-scores are below the mean
– The larger the absolute value of the Z-score, the further the score is from the mean
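The computation described above (subtract the mean, divide by the standard deviation) looks like this in Python. The exam scores are hypothetical. Using the N-denominator standard deviation is an assumption, since the slide does not say which version is intended.

```python
import statistics


def z_scores(scores):
    """Convert raw scores to Z-scores: (score - mean) / standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # assumption: SD computed with N in the denominator
    return [(x - mean) / sd for x in scores]


# Hypothetical exam scores.
exam = [60, 70, 80, 90, 100]
z = z_scores(exam)
print([round(value, 2) for value in z])
```

The score equal to the mean gets a Z-score of 0, scores below the mean come out negative, and the most extreme scores have the largest absolute values.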
Inferential Statistics
Used to draw inferences about populations
on the basis of samples from the
populations
The “statistical tests” that we perform on
our data are inferential statistics
Provide an objective way of quantifying the
strength of the evidence for our hypothesis
Populations and Samples
Population: the larger group of all
participants of interest to the researcher
Sample: a subset of the population
Samples almost never represent populations
perfectly (termed “sampling error”)
– Not really an error; just the natural variability that you can expect from one sample to another
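Sampling error is easy to see in a small simulation. The sketch below builds a hypothetical population, draws several random samples from it, and compares each sample mean with the population mean. The population values, sample size, and seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores, roughly normal (mean 100, SD 15).
population = [random.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw five random samples of 25 scores and compute each sample's mean.
sample_means = [statistics.mean(random.sample(population, 25)) for _ in range(5)]

for m in sample_means:
    print(round(m, 2), "differs from the population mean by", round(m - population_mean, 2))
```

No sample mean matches the population mean exactly, and the samples also differ from one another, even though nothing went wrong in drawing them.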
The Null Hypothesis
States that there is NO difference between
the population means
Compare sample means to test the null
hypothesis
Population parameters & sample statistics
– A population parameter is a descriptive statistic computed from everyone in the population
– A sample statistic is a descriptive statistic computed from everyone in your sample
Statistical Decisions
We can either Reject or Fail to Reject the
null hypothesis
– Rejecting the null hypothesis suggests that there is a difference in the populations sampled
– Failing to reject means the evidence is not strong enough to conclude that a difference exists; it does not show that no difference exists
– Decision is based on probability (reject if the observed results would be unlikely were the null hypothesis true)
Alpha: the statistical decision criterion used
Traditionally alpha is set to small values (.05 or .01)
Always a chance for error in our decision
Statistical Decision Process
                          Reject Null Hypothesis    Retain Null Hypothesis
Null Hypothesis is True   Type I Error              Correct Decision
Null Hypothesis is False  Correct Decision          Type II Error
Testing for Mean Differences
t-test for independent groups: tests mean
difference of two independent groups
Correlated t-test: tests mean difference of
two correlated groups
Analysis of Variance: tests mean
differences in two or more groups
– Groups may or may not be independent
– Also capable of evaluating factorial designs
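As an illustration of the first test on this slide, here is a minimal sketch of the t statistic for two independent groups. It uses the pooled-variance form, which assumes the two populations have equal variances. The data and function name are hypothetical.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """t statistic and degrees of freedom for two independent groups (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


# Hypothetical scores for two independent groups of five participants.
treatment = [5, 7, 6, 9, 8]
control = [3, 4, 6, 5, 2]

t, df = independent_t(treatment, control)
print(round(t, 2), df)

# A t table gives 2.306 as the two-tailed critical value for alpha = .05 with df = 8.
reject_null = abs(t) > 2.306
print("Reject null hypothesis:", reject_null)
```

The sketch stops at comparing t with a tabled critical value; statistical packages report an exact probability instead.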
Power of a Statistical Test
Sensitivity of the procedure to detect real
differences between the populations
Not just a function of the statistical test, but
also a function of the precision of the
research design and execution
Increasing the sample size increases the
power because larger samples estimate the
population parameters more precisely
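The link between sample size and power can be sketched with a normal approximation for a two-tailed comparison of two equal-sized groups. This is an approximation rather than an exact t-test power calculation, and the function name, effect size, and sample sizes are assumptions chosen for illustration.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed, two-group test (normal approximation).

    `effect_size` is the population mean difference in standard deviation units.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the test statistic lands beyond either critical value.
    return (1 - z.cdf(critical - shift)) + z.cdf(-critical - shift)


# Same hypothetical effect (half a standard deviation), increasing sample sizes.
for n in (10, 20, 50, 100):
    print(n, round(approx_power(0.5, n), 2))
```

With the effect held constant, the printed power climbs as the groups get larger, which is the pattern the slide describes.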
Statistical versus Practical Significance
Statistical significance means that the
observed mean differences are not likely
due to sampling error
– Can get statistical significance, even with very small population differences, if the sample size is large enough
Practical significance looks at whether the
difference is large enough to be of value in a
practical sense
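A short worked example shows how a trivially small difference becomes statistically significant once the samples are large enough. For two equal-sized groups, the pooled-variance t statistic equals the standardized mean difference times the square root of n / 2. The sketch assumes the observed difference stays fixed at a hypothetical 0.05 standard deviations.

```python
import math


def t_for_effect(d, n_per_group):
    """t statistic implied by a standardized difference d with n scores per group."""
    return d * math.sqrt(n_per_group / 2)


tiny_effect = 0.05  # hypothetical: groups differ by 5% of a standard deviation

for n in (100, 1_000, 10_000):
    print(n, round(t_for_effect(tiny_effect, n), 2))
```

With very large samples the two-tailed .05 critical value is about 1.96, so the same tiny difference is nonsignificant at 100 per group and significant at 10,000 per group, although its practical value has not changed.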
Effect Size
Gives an indication of the size of the
difference between groups
Unlike the statistical test, the effect size is
NOT affected by the size of the sample
More details on effect size
– In Chapter 15
– On the CD supplement
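One widely used effect size for a two-group difference is the standardized mean difference, often called Cohen's d. The slides do not say which index the text uses, so treating d as the effect size here is an assumption, and the data are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference: (mean A - mean B) / pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * statistics.variance(group_a)
                       + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_variance)


# Hypothetical scores for two groups of five participants.
treatment = [5, 7, 6, 9, 8]
control = [3, 4, 6, 5, 2]

d = cohens_d(treatment, control)
print(round(d, 2))
```

The formula divides by a standard deviation rather than a standard error, so the sample size does not enter it directly. That is why the index does not grow just because more participants are tested.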
Meta-Analysis
A relatively new statistical technique
Allows researchers to statistically combine
the results of several studies of the same
phenomenon to get a sense of how powerful
the effect is
Discussed in more detail in Chapter 15
Summary
Statistics allow us to detect and evaluate
group differences that are small compared
to individual differences
Descriptive versus inferential statistics
– Descriptive statistics describe the data
– Inferential statistics are used to draw inferences about population parameters on the basis of sample statistics
Statistics objectify evaluations, but do not
guarantee correct decisions every time