Descriptive statistics 2012_13
Descriptive statistics
What do we need to run an
experiment?
Hypothesis (Linguistic)
Participants
Task (stimuli = questions, responses = answers)
Results
Conclusions
Key terms: stimulus design, response measure
Example
Show me the cat that bit the dog
Show me the cat that the dog bit
Picture from: Friedmann & Novogrodsky (2001)
Design
Number of conditions
Within subject / between subject
How many items to each participant
Order of items
Response measure
Variables
Scales
Analysis
Descriptive
Inferential
Variables
Any experimental category that has a
value that can vary.
Anything that is not constant and can
change over time, or be different in
different people is a variable
Variables can take many forms
Variables can be manipulated and
observed
Properties of Variables
Continuous variable – along a continuum
with equal intervals (e.g., age, height,
weight, grade in a test)
Ordinal variables – rating along a
continuum with estimated intervals (e.g.,
evaluation)
Discrete variables (categorical, nominal) – divide into categories (e.g., language, yes/no, correct/incorrect)
Types of Variables
Independent variables –
Characteristics of the subject (Participant
variable)
Conditions chosen by the experimenter
Dependent variables – what the experiment
measures (e.g., degree of success)
Intervening variables – variables which are
not measured or manipulated, but could
influence the results (e.g., concentration,
intelligence)
Scales
Nominal
Ordinal
Interval
Ratio
Nominal – two things with the same number are similar (same name)
Ordinal – four is more than three (but the step from four to three is not the same as from three to two)
Interval – four is more than two (but not twice as much)
Ratio – four is more than three, the step from four to three is the same as from three to two, and four is twice two
Which scale are the following
variables rated on?
Height
Celsius degrees
TV channel number
Grades in an exam (1-100)
Psychological rating (anxiety on a scale of 1-10)
Time (13:00, 14:00)
Time (one hour, two hours, three hours)
Phone number
Rating places in a race
Variables and Scales: summary
Choose an appropriate task
Measure responses
Be aware of the variables and their
properties
Choose the mathematical operations
appropriate for the scale
Factorial design
Tests all possible combinations, e.g., a 2x2 design – one participant variable (TLD vs. SLI) and one linguistic variable (subject vs. object relatives), each with two levels.

        Subject relatives   Object relatives
TLD
SLI
Practical questions for offline
tasks
How many subjects? At least 25
How many categories? 2x2
How many items? More subjects >> fewer
items.
For 25 – 6 items per category
For 50 – 3 is enough
For case studies and within subject analysis
at least 10.
SIMPLE NUMERICAL COMPUTATIONS
Ratio
The relation between two nominal variables.

              N
Nouns         80
Verbs         60
Other words   50
Total         190

V/N ratio: 60/80 = 3/4
N/V ratio: 80/60 = 4/3
Example
Goofy said that the
Troll had to put two
hoops on the pole to
win.
Does the Troll win?
Musolino (2004)
Ratio
                 N
Yes              8
No               12
Didn't answer    10
Total            37

Yes/No ratio: 8/12 = 2/3
Proportion
Relation between a group and its part (Verbs/Words, Pronouns/Subject positions).
A ratio out of the total.
Verb/Word proportion: 60/190 ≈ 1/3 ≈ 0.32
Percentage (%)
Relative proportion out of a hundred.
Verb percentage (out of all words): 100 × (60/190) ≈ 32%
Rate
The relative frequency in the population, usually expressed per 1,000.
7% of children have SLI >> 0.07 × 1000 = 70
70 children out of 1,000 have SLI.
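A minimal Python sketch of these simple computations, using the counts from the noun/verb table above (variable names are illustrative, not from the lecture):

```python
# Ratio, proportion, percentage and rate, using the word counts above.
nouns, verbs, other = 80, 60, 50
total = nouns + verbs + other            # 190

v_n_ratio = verbs / nouns                # 60/80 = 0.75  (3/4)
n_v_ratio = nouns / verbs                # 80/60 ≈ 1.33  (4/3)

verb_proportion = verbs / total          # 60/190 ≈ 0.32
verb_percentage = 100 * verb_proportion  # ≈ 32%

sli_rate_per_1000 = 0.07 * 1000          # 7% of children -> 70 per 1,000

print(v_n_ratio, n_v_ratio, round(verb_proportion, 2),
      round(verb_percentage, 1), sli_rate_per_1000)
```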
Frequency
Count the number of times a score occurs.
How many times does a value of a variable occur?
Example
Show 10 pictures and check the number of "correct" responses.
Is every bunny eating a carrot?
Roeper, Strauss and Zurer Pearson (2004)

Picture   Correct
1         1
2         1
3         0
4         0
5         0
6         0
7         1
8         1
9         1
10        1
Total     6
Frequency
Count the number of times a score occurs.

Child   Score
1       8
2       8
3       6
4       6
5       6
6       6
7       2
8       2
Frequency
The raw scores (the Child/Score table above) turned into a frequency table:

Score   Frequency
2       2
6       4
8       2

Frequency = how many children got this score.
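A small Python sketch of building such a frequency table from raw scores (the score list mirrors the Child/Score table above):

```python
# Count how many children got each score, using the standard library.
from collections import Counter

scores = [8, 8, 6, 6, 6, 6, 2, 2]       # one score per child
frequency = Counter(scores)              # score -> number of children

for score in sorted(frequency):
    print(score, frequency[score])       # 2 2 / 6 4 / 8 2
```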
Frequency graph
The score on the test is on the horizontal axis (X-axis); frequency is on the vertical axis (Y-axis).
Percentile

Grade     Frequency   Cumulative frequency   Percentile
100       2           30                     100%
90        5           28                     93%
80        10          23                     77%
70        8           13                     43%
60        4           5                      17%
50        1           1                      3%
Total N   30

The cumulative frequency – how many scores are at or below a particular point in the distribution.
Percentile = 100 × (Cumulative frequency / Total N)
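A small Python sketch of the cumulative-frequency and percentile computation from the grade table above:

```python
# Cumulative frequency and percentile for each grade (total N = 30).
grades = {50: 1, 60: 4, 70: 8, 80: 10, 90: 5, 100: 2}
total_n = sum(grades.values())           # 30

cumulative = 0
for grade in sorted(grades):             # from the lowest grade up
    cumulative += grades[grade]          # scores at or below this grade
    percentile = 100 * cumulative / total_n
    print(grade, cumulative, round(percentile))   # e.g. 80 -> 23 -> 77
```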
Frequency polygon (the curve)
[Frequency distribution chart: Grade (50-100) on the X-axis, number of students (0-12) on the Y-axis]
The frequency polygon (the curve) is a picture of the data.
Types of distributions (Figs. 4.3 & 4.4, pp. 113-116)
Key features of a curve: its peak and its tails.
A bell-shaped curve – a symmetric, unimodal distribution (one midpoint, one peak): the normal distribution.
Pointy distribution (leptokurtic)
Flat distribution (platykurtic)
In a skewed distribution the tail is stretched in one direction:
Positively skewed distribution – most scores are low; the tail is directed towards the high (positive) scores, which skew the distribution.
Negatively skewed distribution – most scores are high; the tail is directed towards the low (negative) scores, which skew the distribution.
Bimodal distribution – a double-peaked curve.
Descriptive Statistics - Some
definitions
Min (the lowest score) and Max (the
highest score)
Range – the range of observed values: Range = Max − Min.
But the range changes with extreme scores (an unstable but useful informal measure).
Mode – the most frequently obtained score.
Mean (average) – the sum of the scores divided by the number of scores.
Median – the middle score of an ordered group (when the number of scores is odd) or the average of the two middle scores (when it is even).
In a bell-curve (normal) distribution the mode, mean and median are the same.
Mode

Grade   Frequency
50      1
60      4
70      8
80      10
90      5
100     2
Total   30

Which grade is most frequent? The one with the highest value in the "Frequency" column – here, 80.
Mean (average)

Grade   Frequency
50      1
60      4
70      8
80      10
90      5
100     2
Total   30

Compute the sum of all grades, then divide by the number of grades.
Mean (average)

Grade × frequency
50 × 1       50
60 × 4       240
70 × 8       560
80 × 10      800
90 × 5       450
100 × 2      200
Total        2300
Mean         2300/30 ≈ 76.67
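A small Python sketch of the same mean computation:

```python
# Multiply each grade by its frequency, sum, and divide by the number of grades.
grades = {50: 1, 60: 4, 70: 8, 80: 10, 90: 5, 100: 2}

total_sum = sum(grade * freq for grade, freq in grades.items())   # 2300
n = sum(grades.values())                                          # 30
mean = total_sum / n
print(mean)   # 76.666..., i.e. ≈ 76.67
```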
Median

Grade   Frequency
50      1
60      4
70      8
80      10
90      5
100     2
Total   30

Order all grades in a row according to value; the grade in "the middle" of the row is the median.
We have a row of 30 grades: 50, 60, 60, 60, 60, 70, …
Half of 30 is 15, so the median lies around the 15th position.
Slight complication: we need 15 grades on each side of the median, so we compute the mean of the grades in the 15th and 16th positions (here both are 80, so the median is 80).
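A small Python sketch of the median computation, expanding the frequency table into the ordered row of 30 grades:

```python
# Build the ordered row of grades and take the middle value(s).
import statistics

grades = {50: 1, 60: 4, 70: 8, 80: 10, 90: 5, 100: 2}
row = [grade for grade, freq in sorted(grades.items()) for _ in range(freq)]

# For an even-length row, statistics.median averages the 15th and 16th values.
print(statistics.median(row))   # 80
```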
Questions:
Are both curves the same? In what way?
Are they different? In what way?
We need to measure the accuracy of the mean.
(Figure from Hatch & Farhady 1982, p. 56)
Variability
Coming attractions
How do we draw valid statistical inferences? We have to look at the relation between our sample and the population.
Today we looked at where the 'center' of the data is – the big picture.
Next we look at variance: how the data are distributed.
Deviation
The distance between a score and the Mean (see Table 4.2, p. 125): how much a score deviates from the average.
Sum of squared errors (SS) – the sum of the squared deviations.
Variance
The average error in the sample, and the average error in the population.
Variance in the sample = SS/N = 33.7143/7 = 4.8163
Variance in the population = SS/(N−1) = 33.7143/6 = 5.6191
Why N−1? Degrees of freedom (read Box 4.5, page 129).
Standard deviation (SD)
The average distance between a score and the
Mean (square root of the Variance)
SD = √5.6191 ≈ 2.37
What can SD tell us about the distribution (pointy
distribution vs. flat distribution)?
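A minimal Python sketch of the variance and SD formulas, plugging in the SS and N given above for Table 4.2 (the raw scores themselves are not reproduced here):

```python
# Variance (two versions) and standard deviation from SS and N.
import math

SS, N = 33.7143, 7

variance_sample = SS / N              # 4.8163...
variance_population = SS / (N - 1)    # 5.6190...
SD = math.sqrt(variance_population)   # ≈ 2.37

print(round(variance_sample, 4), round(variance_population, 4), round(SD, 2))
```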
Standard Error (SE)
How well does the sample represent the population?
Different samples from the population might yield different means. The SE is the standard deviation of the means of several samples: a large value means the sample means differ a lot, a small value means they differ little.
SE = SD/√N
Confidence Interval
The limits within which 95% or 99% of the
samples fall
Lower boundary = Mean-2SE
Upper boundary = Mean+2SE
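A short Python sketch of the SE and confidence-interval formulas, assuming the Table 4.2 values used above (SD ≈ 2.37, N = 7) and the mean of 3.57 that appears in the z-score computation below:

```python
# Standard error and the Mean ± 2SE confidence interval.
import math

mean, SD, N = 3.57, 2.37, 7

SE = SD / math.sqrt(N)      # ≈ 0.90
lower = mean - 2 * SE       # lower boundary of the confidence interval
upper = mean + 2 * SE       # upper boundary

print(round(SE, 2), round(lower, 2), round(upper, 2))
```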
Inferential statistics
z-score and T-score
How can we use the standard deviation
(SD) to compare two samples? two
exams? two tests?
We translate the raw scores into distance in SD
from the mean, by subtracting the mean from the
raw score and dividing by the SD.
So for Table 4.2:
(1 − 3.57) / 2.37 = −1.08
(8 − 3.57) / 2.37 = 1.86
These scores are z-scores. Some z-scores are negative and some are positive. Why?
If you prefer a scale with only positive
numbers, you can use the T-score
T-score = 10 × z-score + 50
10 × (−1.08) + 50 = 39.2
10 × 1.86 + 50 = 68.6
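A sketch of the same z-score and T-score computations in Python (mean 3.57, SD 2.37 as above):

```python
# Convert raw scores into z-scores and T-scores.
mean, SD = 3.57, 2.37

for raw in (1, 8):
    z = (raw - mean) / SD    # distance from the mean in SD units
    t = 10 * z + 50          # T-score: a positive-only rescaling
    print(raw, round(z, 2), round(t, 1))
# 1 -> z ≈ -1.08, T ≈ 39.2
# 8 -> z ≈ 1.87 (the slide truncates to 1.86), T ≈ 68.7
```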
A few words on Covariance and Pearson correlation
Covariance – how much do two variables co-vary?
For one pair of scores: Cov = (X − X̄)(Y − Ȳ)
But we are interested in sets of scores, so we sum up all the individual covariances and divide, as always, by N − 1:
COVxy = Σ(X − X̄)(Y − Ȳ) / (N − 1)
What do we need covariance for? To measure
correlations (Pearson correlation coefficient is
considered the best way to estimate correlation
between X & Y).
Since the two samples do not have the same SD, we must adjust the covariance for the amount of variation:
r = COVxy / (SDx × SDy)
What does r mean?
Positive r – positive correlation
Negative r – negative correlation
Small r – weak correlation
Large r – strong correlation
inferential statistics.xls
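A sketch of the covariance and Pearson r formulas in Python, on a small made-up pair of score lists (X and Y are illustrative, not data from the lecture):

```python
# Covariance and Pearson correlation coefficient, computed from the formulas above.
import math

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 6]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

# COVxy = sum of (X - mean_x)(Y - mean_y), divided by N - 1
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n - 1)

# SDs computed with N - 1, to match the covariance
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in X) / (n - 1))
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in Y) / (n - 1))

r = cov_xy / (sd_x * sd_y)            # Pearson correlation coefficient
print(round(r, 2), round(r ** 2, 2))  # r, and r squared (used as effect size below)
```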
Effect size
We can use correlations to measure experimental effect size.
r² – the coefficient of determination – is the fraction of the variance that is accounted for by a linear correlation.
r = 0.1 (small effect) – only 1% of the variance is accounted for by our task (1% = .01 = r²)
r = 0.3 (medium effect) – 9% of the variance is accounted for by our task (9% = .09 = r²)
r = 0.5 (large effect) – 25% of the variance is accounted for by our task (25% = 0.25 = r²)
r = 1 – a perfect effect
Probability
How probable it is to get a certain correlation?
How probable is it to get a certain score?
How probable is it to get a certain mean?
How probable is it that two samples are the
same/different?
Playing "Head or
Throwing a dice.
tails?"
Probability can be calculated by dividing the
number of desired events by the number of
possible outcomes.
Or by relaying on SD
What is the probability of getting a score above the mean?
What is the probability of getting a score which is up to 1SD
above the mean? up to 1SD from the mean? (For every zscore there is a probability)
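A sketch of both ways of getting a probability, using Python's standard-library normal distribution for the SD-based questions:

```python
# Probabilities by counting outcomes, and by reading the normal distribution.
from statistics import NormalDist

p_heads = 1 / 2    # desired outcomes / possible outcomes for a coin toss
p_six = 1 / 6      # rolling a six with one die

standard_normal = NormalDist(mu=0, sigma=1)

p_above_mean = 1 - standard_normal.cdf(0)                              # 0.5
p_up_to_1sd_above = standard_normal.cdf(1) - standard_normal.cdf(0)    # ≈ 0.34
p_within_1sd = standard_normal.cdf(1) - standard_normal.cdf(-1)        # ≈ 0.68

print(p_heads, round(p_six, 2), round(p_above_mean, 2),
      round(p_up_to_1sd_above, 2), round(p_within_1sd, 2))
```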
Confidence Interval
The limits within which 95% of the
samples fall
Lower boundary = Mean-2SE
Upper boundary = Mean+2SE
Hypothesis testing
How likely is it (how probable is it) that our
hypothesis is right?
The probability that some results could
happen by chance is less than 5% (or 1%)
p<0.05 (or p<0.01) - the level of
significance
Null hypothesis – there is no difference between our sample and the population.
Positive hypothesis – the sample does better than the population.
Negative hypothesis – the sample does worse than the population.
Alternative hypothesis – the sample is different, but no direction is specified.
p < 0.05 vs. p > 0.05 (Figures from Hatch & Farhady 1982, p. 87)
If the data falls in the shaded area of Figure 8.5, the null hypothesis is confirmed.
If the data falls in the shaded area of Figure 8.6, the null hypothesis is rejected.
If the data falls in the shaded higher tail of Figure 8.6, the scores are higher than the population and the null hypothesis is rejected.
If the data falls in the shaded negative tail of Figure 8.6, the scores are lower than the population and the null hypothesis is rejected.
Since the hypothesis specifies no direction, we must consider both tails – thus we use a two-tailed test (with .025 in each tail).
If we test a directional hypothesis, the level of significance applies to one tail only.
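A sketch of the one-tailed vs. two-tailed difference in Python, turning an illustrative z-score (1.86, the value from the T-score example above) into p-values with the standard normal distribution:

```python
# One-tailed vs. two-tailed p-value for a given z-score.
from statistics import NormalDist

z = 1.86
standard_normal = NormalDist()

p_one_tailed = 1 - standard_normal.cdf(z)   # directional hypothesis: one tail
p_two_tailed = 2 * p_one_tailed             # no direction: both tails

print(round(p_one_tailed, 3), round(p_two_tailed, 3))
# one-tailed ≈ 0.031 (< .05), two-tailed ≈ 0.063 (> .05)
```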
(Figures from Hatch & Farhady 1982, p.88)
A score in the shaded area in 8.7 confirms the
_____________ hypothesis
A score in the shaded area in 8.8 confirms the
_____________ hypothesis