Transcript Lecture 8

The Assumptions
Fundamental Concepts of Statistics
Measurement - any result from any procedure that assigns a value to an
observable phenomenon. Problems - our observations are based on our
ability to observe, count, etc. Accuracy is always an issue. It is very
difficult to achieve the same measurement twice.
Variation - this brings us to the idea of variation. Statistics is based on the
idea that almost everything varies in some way.
Two reasons for variation:
1. measurement inaccuracies or random error
2. true differences between observations, measurements, and groups
Probabilistic causation - because everything varies, we can only deal with
probabilities of being correct or incorrect when we determine whether
differences exist (for example, differences in crime rates).
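The two reasons for variation listed above can be illustrated with a small simulation. This is only a sketch under assumed values: the `measure` helper, the true values of 100 and 105, and the error size are all hypothetical and chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable


def measure(true_value, error_sd=2.0):
    """Return one observation: the true value plus random measurement error."""
    return true_value + random.gauss(0, error_sd)


# Reason 1: measuring the same thing five times gives five different results.
repeated = [measure(100.0) for _ in range(5)]

# Reason 2: two groups whose true values really differ (100 vs. 105, assumed).
group_a = [measure(100.0) for _ in range(200)]
group_b = [measure(105.0) for _ in range(200)]

# The observed gap mixes the true difference with random error,
# so it will be close to 5 but not exactly 5.
observed_gap = statistics.fmean(group_b) - statistics.fmean(group_a)
print(repeated)
print(observed_gap)
```

Because the observed gap is never exactly the true difference, any conclusion about a difference can only be stated with some probability of being wrong.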
Three Types of Statistics
• Descriptive - Techniques employed in the presentation of
collected data: tables, charts, graphs, and the formulation
of quantities that indicate concise information about our
data.
• Inferential - Linked with the concept of probability.
Statistical methods that permit us to infer (probabilistically)
something about the real world and about the "true"
population from knowledge derived from only part of that
population. Methods that allow us to specify how likely we
are to be in error.
• Predictive - Deals with relationships and the idea that
knowing information about one characteristic or variable can
help us predict the behavior of another variable. Methods
and tools that help predict future observations in other
populations or time periods.
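As a rough illustration of the inferential idea (learning about a population from only part of it, and stating how likely we are to be in error), the sketch below estimates a population mean from a sample. The population is simulated, and every number in it (mean 50, spread 10, sample size 100) is an assumption made for the example. The interval uses the common normal approximation of 1.96 standard errors for roughly 95% confidence.

```python
import math
import random
import statistics

random.seed(8)  # fixed seed so the sketch is repeatable

# A simulated "true" population (hypothetical values).
population = [random.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.fmean(population)

# In practice we only see part of the population: a random sample.
sample = random.sample(population, 100)
sample_mean = statistics.fmean(sample)

# Standard error: how much sample means are expected to vary.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval (normal approximation).
margin = 1.96 * standard_error
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"sample mean = {sample_mean:.2f}")
print(f"approx. 95% interval = ({ci_low:.2f}, {ci_high:.2f})")
print(f"true mean (normally unknown) = {true_mean:.2f}")
```

Intervals built this way miss the true mean in about 1 of every 20 samples, which is the sense in which the method specifies our chance of error.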
Determining Causation
1. TIME-ORDER: the presumed cause must
always precede the presumed effect
2. COVARIATION: the presumed cause and
effect must vary with each other
3. ELIMINATION OF ALTERNATIVE
EXPLANATIONS: there must be no equally
plausible explanations for the presumed
effect
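Of the three criteria, covariation is the one most directly checked with a statistic. One common choice (an assumption here, since the slide does not name a measure) is the Pearson correlation coefficient. The sketch below computes it by hand; the `pearson_r` helper and the unemployment and crime figures are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: how closely two variables vary together (-1 to 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(ss_x * ss_y)


# Hypothetical data for five cities (made up for illustration).
unemployment = [3.0, 4.0, 5.0, 6.0, 7.0]
crime_rate = [20.0, 24.0, 25.0, 31.0, 35.0]

r = pearson_r(unemployment, crime_rate)
print(round(r, 3))
```

A high value of r shows covariation only. It says nothing about time order and does not rule out alternative explanations, so by itself it does not establish causation.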
Descriptive: Central Tendency
• Mode - The most frequent observation. Usually used with
nominal data to describe data. Limitation - limited
information - could be multi-modal. Cannot be
arithmetically manipulated
• Median - the middle observation. Usually used with
ordinal level data. Relatively stable. Limitations - must
have ordinal data or higher. Cannot be arithmetically
manipulated
• Mean - Most widely used measure in statistics (i.e., most
statistical tests are built around the mean). Can be
arithmetically manipulated (calculated). Limitations - must
have either interval or ratio data; sensitive to outliers.
Formula: ∑x / n
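The three measures can be compared on a small data set. The sketch below uses Python's standard `statistics` module. The arrest counts are hypothetical and include one deliberate outlier to show the mean's sensitivity.

```python
import statistics

# Hypothetical data: prior arrests for nine individuals (made up for illustration).
arrests = [0, 1, 1, 2, 2, 2, 3, 4, 30]

mode = statistics.mode(arrests)      # most frequent observation
median = statistics.median(arrests)  # middle observation once sorted
mean = sum(arrests) / len(arrests)   # the formula above: sum of x divided by n

# Drop the outlier (30) to see how much it pulls the mean.
trimmed = arrests[:-1]
mean_without_outlier = sum(trimmed) / len(trimmed)
median_without_outlier = statistics.median(trimmed)

print(mode, median, mean)                            # 2 2 5.0
print(mean_without_outlier, median_without_outlier)  # 1.875 2.0

# A data set can have more than one mode.
print(statistics.multimode([1, 1, 2, 2, 3]))         # [1, 2]
```

One extreme value moves the mean from 1.875 to 5.0, while the median stays at 2. This is why the median is described as relatively stable.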
Measures of Variability or Dispersion
Range - the difference between the highest and lowest observations.
– Limitations: Based on only two extreme observations
Interquartile range - measures variability based on percentiles.
Q3 (75th percentile) - Q1 (25th percentile)
Limitations: Leaves out many observations
Mean Deviation – the average of the absolute deviations.
∑|x-µ| / n
Limitations: Less sensitive to deviations in the distribution
Variance - Based on distances from the mean (X - mean).
Takes the square of each deviation from the average and then
averages the squares.
∑(x-µ)² / n
Standard Deviation - the square root of the variance
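Each of these measures can be computed directly from its definition. The sketch below does so for a small hypothetical data set, using the population formulas above (dividing by n). Quartile conventions differ between textbooks and software, so the interquartile range from `statistics.quantiles` may not match a hand calculation exactly.

```python
import math
import statistics

# Hypothetical scores (made up for illustration); their mean is 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mu = sum(scores) / n

# Range: highest minus lowest observation.
value_range = max(scores) - min(scores)

# Interquartile range: Q3 - Q1 (uses the library's default quartile method).
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Mean deviation: average of the absolute deviations from the mean.
mean_deviation = sum(abs(x - mu) for x in scores) / n

# Variance: average of the squared deviations from the mean.
variance = sum((x - mu) ** 2 for x in scores) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(value_range, mean_deviation, variance, std_dev)  # 7 1.5 4.0 2.0
print(iqr)
```

Squaring gives large deviations more weight than the mean deviation does, and taking the square root returns the result to the original units of measurement.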
Why analyze statistics?
• Much of what policy is interested in is
increasing or reducing some phenomenon.
– Increase employment
– Reduce crime
– Reduce abortions
– Reduce auto fatalities
– Increase graduation rates
• “One common way to define a policy
problem is to measure it.” (Stone, 2002)
Not everyone is convinced of the positivist
approach
Rational choice and cost/benefit analysis
are said to miss out on a lot of the
subjectivity of politics and policy analysis
Even those who support the scientific method
are skeptical of being able to quantify
social and political phenomena.
What are some problems with applying
statistical, quantitative methods to the
social sciences?
Choosing how to measure
• Inclusion vs. Exclusion
• Numbers tell a story: of decline and decay,
or of bigger and worse.
• The goal is to create a sense of
helplessness and control.
• How much is too much or too little?
• If counting is used for evaluation, there is an
incentive to manipulate the numbers