DataAnalysis_revised


QUANTITATIVE RESEARCH METHODS
A reminder:
In your data collection you probably gathered various
types of information (numbers, interviews, work
samples, etc.) from different sources (student records,
students, observations, journals, etc.).
The types of data gathered probably fall under what
can be called ‘levels of measurement scales.’
Levels of Measurement Scales
(Stevens, 1946)
Source: Stevens, S. S. (1946). On the theory of scales of measurement.
Science, 103(2684), 677-680.
Levels of Measurement Scales
• Nominal (example: Gender)
• Ordinal (example: Likert Scale)
• Interval (example: Temp. scales)
• Ratio (example: Test scores)
The type of data that you have will tend to fall into one of these groups.
Before we examine each one of these in some detail, let’s look at some
simple descriptive statistical terms that you will probably use in your
written report.
Simple Descriptive Statistics
Mode
The most frequently occurring observation or measured value for
a construct or phenomenon
Median
Middle observation or measured value of a construct in a
distribution; 50th percentile; the middle value of a set of data.
Mean
Average score; typically the arithmetic average, but not always
You will examine descriptive statistics in more detail later, but now
back to types of data.
Nominal data
EXAMPLE: Gender, social security #, cultural factor
Nominal data is used purely as labels
• No origin
• No order associated
• Distance between units is not equal
Nominal: In research and in measurement
• Used to identify or put into categories, particularly in some
database applications or in multiple choice options for
demographic data
• Examples:
Gender, ethnicity
• Statistic Used:
Mode
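As a rough illustration of why the mode is the statistic that fits nominal data, here is a minimal Python sketch. The list of labels is hypothetical sample data, not taken from any study.

```python
from collections import Counter
from statistics import mode

# Hypothetical nominal data: labels only, with no order and no distance.
gender = ["female", "male", "female", "female", "male"]

counts = Counter(gender)      # how often each label occurs
most_common = mode(gender)    # the most frequently occurring label

print(counts)
print(most_common)
```

Counting labels is meaningful here; adding or averaging them would not be.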
Ordinal data
Examples: Consumer preference, Likert scale, and intelligence scores
Expresses a rank order, but relations among the measures are relative
• No origin
• Order associated
• Distance between units is not equal
Ordinal: In research and in measurement
• Widely used in research and in measurement…
particularly prevalent in survey research
• Examples:
Likert scales, Thurstone scales
• The U.S. should sign the Kyoto Treaty.
Strongly agree
Agree
Neutral
Disagree
Strongly disagree
• Statistic Used:
Mode, Median
Interval
Examples: Fahrenheit and Celsius temperature
Expresses a rank order, and the distances between and among units are equal
• No origin
• Order associated
• Distance between units is equal
Example: Temperature
• When we think of a thermometer we can see a great
example of interval data.
• The degree readings on the thermometer are laid
out in an orderly and equal fashion. We understand
that if the value is low (15°), it is cold.
• With interval data, we can go below 0. Living in the Midwest,
we have all seen the thermometer register that!
Interval: In research and in measurement
• Numerals express a rank order and relations among the
measures are equidistant between and among units
• Example: Temperature readings in Fahrenheit or Celsius
• Statistic Used: Mode, Median, Mean
Ratio
Examples: Physical measurements (rulers) and test scores
Represents mathematical properties of number lines
• Origin: absolute zero
• Order associated
• Distance between units is equal
Consider the test scores that you may use as
part of your study
• A student scores 3 out of 5 on a test
• The units between the scores are equal, like interval
• With ratio we add absolute 0
• A student who does not get any correct on a spelling test is given
a 0. She cannot be given a score lower than 0.
• A percentage is a ratio n:100
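The test-score example above can be sketched in Python. The helper name `percentage` is made up for this illustration; it simply expresses a score as a ratio out of 100.

```python
def percentage(correct: int, total: int) -> float:
    """Express a score as a ratio out of 100 (n:100)."""
    if total <= 0:
        raise ValueError("total must be a positive number of items")
    return 100 * correct / total

score = percentage(3, 5)    # the student who scores 3 out of 5
lowest = percentage(0, 5)   # absolute zero: no items correct

print(score, lowest)
```

Because the scale has a true zero, statements such as "4 correct is twice as many as 2 correct" are meaningful, which is not the case for interval data such as temperature.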
Ratio: In research and in measurement
• Widely used in natural sciences…less common in social
sciences, education, and in psychometric measurement
• However, ratio data are often an element of outcome
measures/general outcome measures
• Examples: Number of words spelled
correctly, number of words read in a minute,
number of multiplication facts correctly
calculated in a minute
• Statistic Used: Mode, Median and Mean
Additional definitions
• Nominal: Used purely as a label; there is no middle;
Mode (can you remember what this is?)
• Ordinal: Ranks the data; a Median may be used to find the middle
• Interval: Ranks the data, and the distance between scores is the
same; a Mean may be used to find the middle (average);
Median and Mode may also be used
• Ratio: Ranks the data, the distance between scores is the
same, and there is an absolute 0;
Mean, Median and Mode
Beginning Your Analysis
• So what do you do with your data to begin analysis?
• Decide what is quantitative (generally numbers) and what is qualitative
(based on words; this will be dealt with in a later PowerPoint).
• Quantitative analysis will use some form of descriptive
statistics
• Remember Mode, Median, Mean… this is where these come
into play
• Qualitative can also use some descriptive statistics, but more
on that later
Descriptive statistics
• Why use descriptive statistics?
• Summarizes (describes) the data and makes it easier to
find connections between data
• Also, displaying data visually makes them easier to
understand. This will be dealt with later.
• Let’s examine the basis of simple descriptive statistics
Simple descriptive statistics
Descriptive statistics have two parts:
• Central tendency: Measure of typicality; typical score in a distribution
• Dispersion: Spread of the distribution
Central Tendency: three descriptors
Mode
The most frequently occurring observation or measured value for
a construct or phenomenon
Median
Middle observation or measured value of a construct in a
distribution; 50th percentile; the middle value of a set of data.
Mean
Average score; typically the arithmetic average, but not always
So, how are these calculated? The next slides show this, but Excel
has automatic functions for this in the ‘Analysis ToolPak’.
Median
• The median is, just as the name implies,
the middle of the data
• Data are listed in order or placed on a number line
• If you have an even number of responses, the
midpoint (average) of the two middle values is the median
Calculating the median
Responses from your survey item #1:
1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree
Participant    Item #1 response
#1             3 (neutral)
#2             2 (disagree)
#3             5 (strongly agree)
#4             3 (neutral)
#5             4 (agree)
Responses (in order): 2, 3, 3, 4, 5
Median: 3
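The same calculation can be sketched in Python with the five responses from the slide. The four-response list at the end is a hypothetical addition that shows what happens when there are two middle values.

```python
from statistics import median, median_low

# Item #1 responses from participants #1 to #5 (coded 1 to 5).
responses = [3, 2, 5, 3, 4]

ordered = sorted(responses)   # list the data in order first
middle = median(responses)    # the middle value of the ordered list

# Hypothetical even-sized set: two middle values (3 and 4).
even_responses = [2, 3, 4, 5]
midpoint = median(even_responses)        # average of the two middle values
lower_middle = median_low(even_responses)  # the lower of the two middle values

print(ordered, middle, midpoint, lower_middle)
```

For Likert codes, a midpoint such as 3.5 is not a response anyone could give, so `median_low` (which returns an actual data value) may be the safer choice to report.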
Mean
• The mean is the arithmetic average and can be found by a mathematical
process commonly referred to as “averaging.”
• Add all the values together (Total)
• Divide the sum of the values by the number of items (n)
• This answer represents the mean or average
Example: 4, 5, 4, 12, and 5 (n = 5, the number of responses)
Add: 4 + 5 + 4 + 12 + 5 = 30 (Total)
Divide: 30 / 5 = 6 (Total/n)
6 is the mathematical mean of the group
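The averaging steps above, sketched in Python with the same five values. The variable names are chosen for this illustration only.

```python
from statistics import mean

values = [4, 5, 4, 12, 5]

total = sum(values)       # add all the values together
n = len(values)           # number of responses
average = total / n       # Total / n

# The standard library gives the same result in one call.
library_average = mean(values)

print(total, n, average)
```

Note that only one of the five values (12) is above the mean of 6, which is one reason the median is often reported alongside the mean.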
Synthesis: scales of measurement and descriptive
statistics
• The properties of scales of measurement affect the types
of mathematical operations that can be performed on the
values in a meaningful manner
• For example, does anything meaningful result from
dividing gender (a nominal measurement)?
Male = 1; 1/2 does not yield a meaningful solution
• For example, first place + second does not equal third
(ordinal measurement)
Reporting measures of central tendency
Scale of measurement        Central tendency
Nominal (Gender)            Mode
Ordinal (Likert Scales)     Mode; Median
Interval (Temp)             Mode; Median; Mean
Ratio (Test Scores)         Mode; Median; Mean
Mode: Most frequently occurring observation or measured value for a
construct or phenomenon
Median: Middle observation or measured value of a construct in a
distribution; 50th percentile
Mean: Average score; typically the arithmetic average, but not always
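One way to keep this pairing straight is to encode it as a lookup. The sketch below is only an illustration; the names `ALLOWED_STATISTICS` and `central_tendency` are invented here, and the sample data are hypothetical.

```python
from statistics import mean, median, mode

# Which measures of central tendency are reported for each scale.
ALLOWED_STATISTICS = {
    "nominal": ("mode",),
    "ordinal": ("mode", "median"),
    "interval": ("mode", "median", "mean"),
    "ratio": ("mode", "median", "mean"),
}

FUNCTIONS = {"mode": mode, "median": median, "mean": mean}

def central_tendency(values, scale):
    """Compute only the statistics that suit the given scale."""
    return {name: FUNCTIONS[name](values) for name in ALLOWED_STATISTICS[scale]}

likert = central_tendency([3, 2, 5, 3, 4], "ordinal")
labels = central_tendency(["female", "male", "female"], "nominal")

print(likert)
print(labels)
```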
Dispersion
Now, let’s consider another important aspect to descriptive statistics:
dispersion.
Measures of dispersion:
• Range
• Interquartile range
• Variance
• Standard deviation
We will focus on 2 measures: Range and Standard Deviation (SD)
Range
• The difference between the highest and lowest
observation or measured value of a construct
• Exclusive range: The difference between the largest and smallest
observation without correcting for rounding error in measurements
(highest score – lowest score)
• Inclusive range: The difference between the largest and smallest
observation but correcting for rounding error in measurements
(highest score – lowest score + 1)
RANGE
• Difference between the largest and smallest data values
• Listing of the smallest value (#) to the largest value (#)
Ex: 4, 3, 2, 9, 7, 1, 6
In order: 1, 2, 3, 4, 6, 7, 9
What is the range of these data?
Range is 8 (9 – 1)
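Both versions of the range can be sketched in Python using the seven values from the example. The function names are made up for this illustration, and the "+ 1" correction assumes scores recorded in whole-number units.

```python
def exclusive_range(values):
    """Highest score minus lowest score."""
    return max(values) - min(values)

def inclusive_range(values):
    """Highest score minus lowest score, plus 1 (whole-number units assumed)."""
    return max(values) - min(values) + 1

data = [4, 3, 2, 9, 7, 1, 6]

ordered = sorted(data)
exclusive = exclusive_range(data)
inclusive = inclusive_range(data)

print(ordered, exclusive, inclusive)
```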
Standard deviation
• Most common measure of dispersion/variability/spread
• SD is the square root of the average squared deviation from the
mean
Standard Deviation: Conceptualizing
What does it mean to deviate?
• To differ from typical or average
• The standard deviation is kind of the “mean of the mean,”
and often can help you find the story behind the data.
• To understand this concept, it can help to learn about what
statisticians call normal distribution of data.
• A normal distribution of data means that most of the
examples in a set of data are close to the "average," while
relatively few examples tend to one extreme or the other.
Standard Deviation:
Conceptualizing statistically
What does it mean to be normally distributed?
Standard Deviation:
Conceptualizing statistically
What does it mean to deviate?
• Statistically typical or normal takes the form of a measure of
central tendency
• In this case, the mean (arithmetic average)
• Deviation: $X - \bar{X}$
• Where $X$ is an individual score and $\bar{X}$ is the mean
• Standard deviation: $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$
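The formula above (the square root of the sum of squared deviations from the mean, divided by n − 1) can be worked through step by step. This is a sketch only; the scores are reused from the earlier mean example, and `sample_sd` is a name invented for this illustration.

```python
import math
from statistics import stdev

def sample_sd(scores):
    """Standard deviation with n - 1 in the denominator."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed")
    mean = sum(scores) / n
    squared_deviations = [(x - mean) ** 2 for x in scores]  # (X - mean)^2
    return math.sqrt(sum(squared_deviations) / (n - 1))

scores = [4, 5, 4, 12, 5]
sd = sample_sd(scores)

# The standard library's stdev uses the same n - 1 formula.
library_sd = stdev(scores)

print(round(sd, 2), round(library_sd, 2))
```

For these scores the mean is 6, the squared deviations sum to 46, and 46 / 4 = 11.5, so the SD is the square root of 11.5.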
Standard Deviation:
Conceptualizing statistically
The standard deviation is a statistic that tells you how tightly
all the various examples are clustered around the mean in a
set of data.
Computing the value of a standard deviation is complicated.
But let me show you graphically what a standard deviation
represents...
Standard Deviation
One standard deviation away from the mean in either direction on the
horizontal axis (the red area) accounts for somewhere around 68 percent
of the people in this group.
Two standard deviations away from the mean (the red and green areas)
account for roughly 95 percent of the people.
And three standard deviations (the red, green and blue areas) account for
about 99.7 percent of the people.
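These percentages can be checked for an idealized normal distribution. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available; `share_within` is a helper name made up for this illustration. Real data sets only approximate these figures.

```python
from statistics import NormalDist

# A standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1), "percent")
```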
Standard Deviation
So what kind of standard deviation would these graphs give you?
When the examples are pretty tightly bunched together and the
bell-shaped curve is steep, the standard deviation is small.
When the examples are spread apart and the bell curve is relatively flat,
that tells you that there is a relatively large standard deviation.
Standard Deviation
So, what’s the big deal with SD?
If you are comparing test scores for different schools, the standard
deviation will tell you how diverse the test scores are for each school.
For example, let’s say Springfield Elementary has a higher mean
test score than Shelbyville Elementary.
Your first reaction might be to say that the students at Springfield are
smarter.
Standard Deviation
So, what’s the big deal with SD?
But a bigger SD for one school tells you that there are relatively more
students at that school scoring toward one extreme or the other.
By asking a few follow-up questions you might find that
Springfield’s mean was skewed up because the school district sends all
of the gifted education students to Springfield.
Or that Shelbyville’s scores were lowered because students who recently
have been “mainstreamed” from special education classes have all been
sent to Shelbyville.
Why report standard deviation in your
research?
• A standard deviation is used with the mean
• It helps us understand the mean better:
DATA SET A: n=2
S1=3
S2=3
The mean in this case is 3 (3+3=6, 6/2)
DATA SET B: n=2
S1=5
S2=1
The mean in this case is also 3 (5+1=6, 6/2)
• Although the mean is the same, it does not accurately reflect
the deviation found in DATA SET B. We need more
information.
Why report standard deviation in your
research?
DATA SET A: n=2
S1=3
S2=3
The mean in this case is 3 (3+3=6, 6/2)
What is the SD here?
• 0
DATA SET B: n=2
S1=5
S2=1
The mean in this case is also 3 (5+1=6, 6/2)
What is the SD here?
• About 2.83 (using n – 1 in the denominator)
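The two data sets above can be compared in Python. Note that the value near 2.83 for DATA SET B corresponds to the sample formula (n − 1 in the denominator), which is what `statistics.stdev` computes; `statistics.pstdev` divides by n instead and is shown only for contrast.

```python
from statistics import mean, pstdev, stdev

data_set_a = [3, 3]
data_set_b = [5, 1]

for name, data in (("A", data_set_a), ("B", data_set_b)):
    print(
        name,
        "mean:", mean(data),
        "sample SD:", round(stdev(data), 2),       # divides by n - 1
        "population SD:", round(pstdev(data), 2),  # divides by n
    )
```

Both sets share a mean of 3, but only the SD reveals that the scores in set B are spread apart.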
Standard Deviation: Conceptualizing Statistically
• If a person scores above
the mean, the deviation
yields a positive number
• If a person scores below
the mean, the deviation
yields a negative number
Reporting measures of dispersion
Scale of measurement        Dispersion
Nominal (Gender)            Index of diversity; index of qualitative variation
Ordinal (Likert Scales)     Range; average deviation from the median
Interval (Temp)             Standard deviation
Ratio (Test Scores)         Standard deviation
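The slides do not define the index of qualitative variation, so the sketch below uses one common formulation as an assumption: IQV = (K / (K − 1)) × (1 − sum of squared category proportions), where K is the number of categories. It ranges from 0 (everyone in one category) to 1 (cases spread evenly). The function name and sample labels are hypothetical.

```python
from collections import Counter

def index_of_qualitative_variation(labels, n_categories=None):
    """IQV for nominal data; assumes the K/(K-1) * (1 - sum p^2) form."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    # K defaults to the number of categories actually observed.
    k = n_categories if n_categories is not None else len(counts)
    if k < 2:
        return 0.0  # a single category has no variation
    n = len(labels)
    sum_squared_proportions = sum((c / n) ** 2 for c in counts.values())
    return (k / (k - 1)) * (1 - sum_squared_proportions)

evenly_split = index_of_qualitative_variation(["female", "male", "female", "male"])
no_variation = index_of_qualitative_variation(["female"] * 4, n_categories=2)

print(evenly_split, no_variation)
```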