PPA 502 – Program Evaluation
Lecture 6a – Using Statistics Appropriately
Descriptive and Inferential Statistics
Introduction
– Any phenomenon that can be counted can be summarized. When such summaries are used to describe a group of items, the figures presented are descriptive statistics.
– When statistics are computed from a probability sample with the intention of generalizing from the sample to the population, they are referred to as inferential statistics.
Generalizing from samples.
– The population of interest must be reasonably known and identifiable.
– A sampling technique should be used in which the probability of selecting any unit in the population can be calculated.
– A sample should be drawn that is of appropriate size relative to the size of the population to which generalization is desired.
– Even when probability sampling is applied, evaluators should examine the sample to ensure that it is truly representative of the population to which they hope to generalize.
– Without randomization, evaluators must take great care to assure representativeness. With randomization, tests of statistical significance are used to judge whether sample findings generalize to the population.
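The sampling requirements above can be sketched in a few lines. This is a minimal illustration, not a method the lecture prescribes: the sample-size rule is the common proportion-based formula with a finite population correction, and the population of 2,000 participants, the 5-point margin, and the function names are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist


def required_sample_size(population_size, margin=0.05, confidence=0.95, p=0.5):
    """Sample size for estimating a proportion, with finite population correction.

    Assumption: proportion-based formula; p=0.5 is the most conservative guess.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    n0 = z ** 2 * p * (1 - p) / margin ** 2              # size for a very large population
    n = n0 / (1 + (n0 - 1) / population_size)            # shrink for a finite population
    return math.ceil(n)


def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; each unit's selection probability is n/N."""
    rng = random.Random(seed)
    return rng.sample(units, n), n / len(units)


# Hypothetical population of 2,000 program participants.
participants = list(range(2000))
n = required_sample_size(len(participants))
sample, selection_probability = simple_random_sample(participants, n, seed=1)
print(n, selection_probability)
```

Because every unit has the same known chance of selection, the sample supports inferential statistics; a convenience sample of the same size would not.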
Estimating the strength of relationships.
How are the variables measured?   An appropriate coefficient   Range
Nominal                           Lambda, G&K tau b            0 to +1
Ordinal                           Gamma, Somers’ d             -1 to +1
Interval                          Pearson’s r                  -1 to +1
Interval                          Bivariate: r-squared         0-100%
Interval                          Multiple: R-squared          0-100%
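Two of the coefficients in the table can be computed by hand, which makes their ranges easier to see. The sketch below is illustrative only: the data are invented, and the function names are assumptions rather than part of any library.

```python
import math


def pearson_r(x, y):
    """Pearson's r for two interval-level variables (ranges from -1 to +1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gamma(x, y):
    """Goodman and Kruskal's gamma for two ordinal variables (ranges from -1 to +1)."""
    concordant = discordant = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            direction = (x[i] - x[j]) * (y[i] - y[j])
            if direction > 0:
                concordant += 1      # pair ordered the same way on both variables
            elif direction < 0:
                discordant += 1      # pair ordered in opposite ways
            # tied pairs are ignored by gamma
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical interval data: hours of training and test score.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(f"r = {r:.2f}, r-squared = {r ** 2:.0%}")

# Hypothetical ordinal data: satisfaction rating and outcome rating.
satisfaction = [1, 1, 2, 2, 3, 3]
outcome = [1, 2, 1, 2, 2, 3]
print(f"gamma = {gamma(satisfaction, outcome):.2f}")
```

Squaring r gives the share of variance in one variable associated with the other, which is why r-squared is reported on a 0-100% scale.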
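Statistical hypothesis testing.
– Null hypothesis vs. research hypothesis.
– Discrepancies between the true situation and test results.
• Type I error – false positive (concluding there is an effect when there is none).
• Type II error – false negative (concluding there is no effect when there is one).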
Design feature                       False positives   False negatives
1. Threats to validity
   Sample volunteers                        *
   Identical pre and post-test              *
   Experimental mortality                   *
   Hawthorne effect                         *
   Zealot effect                            *
   Overcompensating control group                             *
   Staff overcompensation                                     *
   Unreliable measurement                                     *
2. Other design characteristics
   Sample size too small                                      *
   Time period too short                                      *
   Treatment contamination                                    *
   Incomplete implementation                                  *
– Selecting a statistical confidence level.
• 95% is the standard, but:
• 80-90% may be more appropriate when avoiding Type II errors matters more.
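The trade-off behind the choice of confidence level can be illustrated numerically. The sketch below uses a normal approximation for a two-group comparison of means; the effect size, standard deviation, and group size are invented for the example, and `type_ii_error` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_error(effect, sd, n_per_group, confidence):
    """Approximate chance of missing a true difference in means (two-sided z-test)."""
    alpha = 1 - confidence                       # accepted Type I error risk
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = sd * math.sqrt(2 / n_per_group)         # standard error of the difference
    shift = effect / se                          # true effect in standard-error units
    power = (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)
    return 1 - power


for confidence in (0.95, 0.90, 0.80):
    beta = type_ii_error(effect=3.0, sd=10.0, n_per_group=50, confidence=confidence)
    print(f"confidence {confidence:.0%}: Type I risk {1 - confidence:.2f}, "
          f"Type II risk {beta:.2f}")
```

With the sample size held fixed, lowering the confidence level raises the risk of a false positive and lowers the risk of a false negative, which is the reasoning behind accepting 80-90% in some evaluations.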
Practical significance.
– Statistical significance indicates whether findings from the sample can be generalized to the population.
– Practical significance evaluates the size of the program effect: slight, moderate, or strong.
• Unfortunately, there are no hard and fast standards.
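A short sketch shows how the two kinds of significance can point in different directions. The numbers are invented, the test is a simple z-test on summary statistics, and the standardized difference (difference divided by the standard deviation) is only one possible yardstick for effect size, not one the lecture prescribes.

```python
import math


def compare_means(mean_a, mean_b, sd, n_per_group):
    """Return (difference, standardized difference, two-sided p-value) from a z-test."""
    diff = mean_a - mean_b
    se = sd * math.sqrt(2 / n_per_group)          # standard error of the difference
    z = diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2))    # two-sided normal tail probability
    return diff, diff / sd, p_value


# Hypothetical evaluations of test scores (standard deviation of 10 points).
large_study = compare_means(50.5, 50.0, sd=10.0, n_per_group=20000)
small_study = compare_means(56.0, 50.0, sd=10.0, n_per_group=20)

for label, (diff, std_diff, p) in (("large study", large_study),
                                   ("small study", small_study)):
    print(f"{label}: difference {diff:.1f} points "
          f"({std_diff:.2f} SD), p = {p:.4f}")
```

In this example the large study finds a half-point gain that is statistically significant but slight, while the small study finds a six-point gain that does not reach the 95% standard.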
Selecting Appropriate Statistics
Criteria for selecting appropriate data analysis techniques.
– Question-related criteria.
• Generalization?
• Causal? Impact?
• Quantitative standards?
– Measurement-related criteria.
• Level of measurement?
• Multiple indicators?
• Sample sizes?
• Multiple observations over time?
• Independent or related samples?
• Variable distributions?
• Measurement precision?
• Outliers?
– Audience-related criteria.
• Audience knowledge of sophisticated techniques?
• Graphics versus tables?
• Precision level for audience?
• Graphs versus regressions?
• Statistical versus practical significance?
Purpose                     Measurement       Appropriate technique              Appropriate significance test   Appropriate magnitude measure
Generalization              Nominal/Ordinal   Frequency counts                   Chi-square                      NA
                            Interval          Means/medians/SD/IQR               Chi-square                      NA
Bivariate relationship      Nominal/Ordinal   Contingency tables                 Chi-square                      Percentage difference
                            Interval          Difference of means                Chi-square or t                 Difference in means
Identify factors            Nominal/Ordinal   NA                                 NA                              NA
                            Interval          Factor analysis                    t                               Pearson’s r
Group cases                 Nominal/Ordinal   NA                                 NA                              NA
                            Interval          Cluster or discriminant analysis   t                               Pseudo R-squared
Estimate impact             Nominal/Ordinal   Loglinear regression               t and F                         R-squared
                            Interval          Regression                         t and F                         Beta weights
Describe or predict trend   Nominal/Ordinal   Regression                         t and F                         R-squared, beta weights
                            Interval          Regression                         t and F                         Same as above
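As an illustration of one row of the table (bivariate relationship, nominal measurement), the sketch below runs a chi-square test on a contingency table and reports the percentage difference as the magnitude measure. The counts are invented, the functions are hypothetical helpers, and the p-value shortcut is valid only for a 2x2 table (one degree of freedom).

```python
import math


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return chi2, p_value


def percentage_difference(table):
    """Difference between the two rows in the percentage falling in the first column."""
    pct = [100 * row[0] / sum(row) for row in table]
    return pct[0] - pct[1]


# Hypothetical counts. Rows: program group, comparison group.
# Columns: employed, not employed.
counts = [[60, 40], [45, 55]]
chi2, p_value = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}, "
      f"percentage difference = {percentage_difference(counts):.0f} points")
```

The significance test addresses whether the relationship generalizes; the percentage difference tells the reader how large it is.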
Applying regression.
– Dependent variable.
– Linear model (but curvilinear can be modeled).
– Used to estimate changes in behavior or impacts.
– Best fitting line.
– Coefficient of determination.
– Unstandardized regression coefficients (slopes).
– Standardized regression coefficients (betas).
– Significance.
• t-test for each individual coefficient.
• F-test for the coefficients collectively (the model as a whole).
– Confidence intervals.
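The regression quantities listed above can be computed for the simplest case, one independent variable, using only arithmetic. This is a sketch under stated assumptions: the data are invented, `simple_regression` is a hypothetical helper, and the critical t value must be supplied by the caller (3.182 is the two-sided 95% value for 3 degrees of freedom).

```python
import math


def simple_regression(x, y, t_crit):
    """Least-squares fit of y on x with the usual summary statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    slope = sxy / sxx                              # unstandardized coefficient
    intercept = mean_y - slope * mean_x            # best fitting line
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    mse = sse / (n - 2)                            # residual variance
    se_slope = math.sqrt(mse / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - sse / syy,                # coefficient of determination
        "beta": slope * math.sqrt(sxx / syy),      # standardized coefficient
        "t": slope / se_slope,                     # test of the individual slope
        "F": (syy - sse) / mse,                    # test of the model as a whole
        "ci": (slope - t_crit * se_slope, slope + t_crit * se_slope),
    }


# Hypothetical data: months in the program (x) and outcome score (y).
months = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]
fit = simple_regression(months, outcome, t_crit=3.182)
for name, value in fit.items():
    print(name, value)
```

With a single independent variable the F statistic equals the square of the slope's t statistic; with several independent variables the two tests answer different questions.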
Selecting techniques to sort measures or units.
– Techniques.
  • Aggregation.
    – Summative index.
  • Analytical techniques.
    – Measures.
      • Factor analysis.
    – Groups.
      • Discriminant analysis.
      • Cluster analysis.
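Of the techniques listed, the summative index is simple enough to show directly. The sketch below assumes equally weighted survey items on a 1-5 scale and reverse-codes negatively worded items before adding; the item names, the scale, and the reverse-coding step are assumptions made for the example.

```python
def summative_index(responses, reverse_items=(), scale_min=1, scale_max=5):
    """Add item scores into one index, flipping negatively worded items first."""
    total = 0
    for item, score in responses.items():
        if item in reverse_items:
            score = scale_min + scale_max - score   # e.g. 2 becomes 4 on a 1-5 scale
        total += score
    return total


# Hypothetical respondent: q2 is negatively worded, so it is reverse-coded.
respondent = {"q1": 4, "q2": 2, "q3": 5}
print(summative_index(respondent, reverse_items={"q2"}))
```

Factor analysis, by contrast, lets the data suggest which items belong together and how heavily each should count.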
Other factors affecting selection of statistical techniques.
– Sample size.
– Number of observations over time.
– Variable distributions.
– Implied level of measurement.
– Outliers.
– Level of sophistication of users.
Reporting statistics appropriately.
– Identify contents of all tables and figures clearly.
– Indicate use of decision rules in analysis.
– Consolidate analyses whenever possible.
– Do not abbreviate.
– Provide basic information about measurement of variables.
– Present appropriate percentages.
– Present information on statistical significance clearly.
– Present information on magnitude of relationships clearly.
– Use graphics to present analytical findings clearly.
Reporting statistical results to high-level public officials.
– Dilemma: how do you present less-than-certain data without excessive hedging?
– Prepare decision-makers for less-than-certain answers.
• Ranges of uncertainty (confidence intervals) work well because officials are familiar with them from polling.
– Present only findings of practical importance.
– Graphics are better than tables.
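A polling-style range of uncertainty is straightforward to produce for a percentage. The sketch below uses the normal approximation for a proportion; the survey counts are invented and `proportion_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Return (proportion, margin of error, (low, high)) by normal approximation."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin, (p - margin, p + margin)


# Hypothetical survey: 540 of 1,000 sampled participants report being satisfied.
p, margin, (low, high) = proportion_interval(540, 1000)
print(f"{p:.0%} of participants, plus or minus "
      f"{margin * 100:.1f} percentage points")
```

Stating the result as a percentage with a margin of error gives decision-makers the uncertainty in a form they already know from opinion polls.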