Transcript Covariance

Statistics
or
What’s normal about the normal curve,
and what’s standard about the standard deviation,
and what co-relates in a correlation
Statistics: Intro
Overview
• What’s normal about the normal curve?
– The nature of the confusion
– One formal answer
– An intuitive answer (real-time demo)
• What’s standard about a standard deviation?
– Z-scores
• [ What co-relates in a correlation? ]
What’s normal about the normal curve(s)?
• The normal curve is not a single curve but a class of
curves (probability distributions) that share a set of
properties in common
• There are a number of ways of mathematically defining
and estimating the normal distribution
• The actual definition (which you don’t need to know) is:
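For reference, the standard textbook form of the normal density, presumably what the slide displayed, is given below. The symbols μ (the mean) and σ (the standard deviation) are the usual convention and are assumed here rather than taken from the slide.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
```

Each choice of μ and σ gives a different member of the class of normal curves.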
What’s normal about the normal curve(s)?
• The main questions I want to address today are:
– What does that math mean?
– Why are so many things normally distributed?
– What makes sure that those things stay distributed
normally?
– What stops other things from being normally
distributed?
What is the normal curve?
• The normal curve has the following properties:
– It is bell-shaped
– It is symmetric
– The total area under the curve is 1.
– The normal curve extends indefinitely in both
directions, getting ever closer to zero in
either direction without ever reaching it.
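These properties can be checked numerically. The sketch below assumes the standard textbook density for the normal curve (the slides do not reproduce the formula) and uses a simple trapezoid rule for the area; the function names are illustrative only.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x (standard textbook density, assumed here)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under_curve(lo, hi, steps=100_000):
    """Approximate the area between lo and hi with the trapezoid rule."""
    width = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo) + normal_pdf(hi))
    for i in range(1, steps):
        total += normal_pdf(lo + i * width)
    return total * width

# Bell-shaped: highest at the mean, falling away on either side.
bell = normal_pdf(0.0) > normal_pdf(1.0) > normal_pdf(2.0)
# Symmetric: the same height at equal distances left and right of the mean.
symmetric = all(math.isclose(normal_pdf(-x), normal_pdf(x)) for x in (0.5, 1.0, 2.5))
# Total area: integrating far out into both tails gives (very nearly) 1.
area = area_under_curve(-10.0, 10.0)
# Tails: tiny far from the mean, but still above zero.
tail = normal_pdf(8.0)

print(bell, symmetric, round(area, 6), tail)
```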
From: Wilensky, U. (1997). What is Normal Anyway? Therapy for
Epistemological Anxiety. Educational Studies in Mathematics, 33(2), 171-202.
Special Issue on Computational Environments in Mathematics Education,
Noss, R. (Ed.).
U: Why do you think height is distributed normally?
L: Come again? (sarcastic)
U: Why is it that women's height can be graphed using a normal curve?
L: That's a strange question.
U: Strange?
L: No one's ever asked me that before..... (thinking to herself for a while) I guess
there are 2 possible theories: Either it's just a fact about the world, some guy
collected a lot of height data and noticed that it fell into a normal shape.....
U: Or?
L: Or maybe it's just a mathematical trick.
U: A trick? How could it be a trick?
L: Well... Maybe some mathematician somewhere just concocted this crazy
function, you know, and decided to say that height fit it.
U: You mean...
L: You know the height data could probably be graphed with lots of different
functions and the normal curve was just applied to it by this one guy and now
everybody has to use his function.
U: So you’re saying that in the one case, it's a fact about the world that height is
distributed in a certain way, and in the other case, it's a fact about our
descriptions but not about height?
L: Yeah.
U: Well, if you had to commit to one of these theories, which would it be?
L: If I had to choose just one?
U: Yeah.
L: I don't know. That's really interesting. Which theory do I really believe? I guess
I've always been uncertain which to believe and it's been there in the
background you know, but I don't know. I guess if I had to choose, if I have to
choose one, I believe it's a mathematical trick, a mathematician's game. ....What
possible reason could there be for height, ....for nature, to follow some weird
bizarro function?
Formal answer 1: The binomial distribution I
The chance of an event of probability p happening r times out
of n tries:
$P(r) = \frac{n!}{r!\,(n-r)!}\, p^{r} (1-p)^{n-r}$
(Recall: We wondered about this generalization last class.)
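The formula translates almost directly into code. This is a minimal sketch: `binomial_pmf` is an illustrative name, and the coin-flip numbers are a made-up example, not one from class.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Chance of exactly r successes in n independent tries, each with probability p."""
    # comb(n, r) is n! / (r! (n - r)!): the number of ways to place r successes among n tries.
    return comb(n, r) * p**r * (1 - p) ** (n - r)

# Hypothetical illustration: exactly 3 heads in 5 flips of a fair coin.
print(binomial_pmf(3, 5, 0.5))  # 0.3125

# The chances of every possible count (0 to n) add up to 1.
print(sum(binomial_pmf(r, 10, 0.3) for r in range(11)))
```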
Formal answer 1: The binomial distribution II
Why is it called the binomial distribution?
Bi = 2
Nom = thing
= the two-thing distribution
It can be used wherever:
– 1. Each trial has two possible outcomes (say, success
and failure; or heads and tails)
– 2. The trials are independent = the outcome of one trial
has no influence over the outcome of another trial.
– 3. The outcomes are mutually exclusive
– 4. The events are randomly selected
Let’s try it out (Example 6.3 from our first probability class)
• What are the odds of there being exactly one seven
out of two rolls?
• One way is to roll 7 first, but not second
- the odds of this are 1/6 * 5/6 (independent
events) = 5/36 ≈ 0.139
• The other way is to roll 7 second, but not first
- the odds of this are 5/6 * 1/6
(independent events) = 5/36 ≈ 0.139
- since these two outcomes are mutually
exclusive, we can add them to get 5/36 + 5/36 = 10/36 ≈ 0.278
The generalization (Example 6.3 from last class)
• What are the odds of there being exactly one seven
out of two rolls?
An event of probability p happens r times out of n tries:
$P(r) = \frac{n!}{r!\,(n-r)!}\, p^{r} (1-p)^{n-r}$
p = 1/6; n = 2; r = 1
$P(1) = \frac{2!}{1!\,1!} \left(\tfrac{1}{6}\right)^{1} \left(\tfrac{5}{6}\right)^{1} = \tfrac{10}{36} \approx 0.278$
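As a cross-check that does not use the formula at all, we can simply list every outcome and count. The sketch assumes each "roll" is a throw of two six-sided dice, which is what makes the chance of a seven equal to 1/6; that reading is an assumption, since the slide does not spell it out.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
# One "roll" = a pair of dice (assumed); list all 36 equally likely pairs.
rolls = list(product(faces, repeat=2))
p_seven = Fraction(sum(1 for a, b in rolls if a + b == 7), len(rolls))

# Two rolls: go through all 36 * 36 combinations and count those with exactly one seven.
hits = sum(
    1
    for first, second in product(rolls, repeat=2)
    if (sum(first) == 7) + (sum(second) == 7) == 1
)
p_exactly_one = Fraction(hits, len(rolls) ** 2)

print(p_seven)        # 1/6
print(p_exactly_one)  # 5/18, the same value as 10/36
```

Counting gives the same answer as the formula, which is the point: the binomial formula is just a shortcut for this bookkeeping.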
What does this have to do with the normal distribution?
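One standard connection, and presumably the one intended here, is that as the number of tries n grows, the binomial bar heights fall closer and closer to a normal curve with mean n*p and SD sqrt(n*p*(1-p)). The sketch below measures the largest gap between the two for growing n; the function names and the choice p = 0.5 are illustrative assumptions.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(r, n, p):
    """Chance of exactly r successes in n independent tries."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and SD sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def largest_gap(n, p):
    """Largest difference between a binomial bar and the matching normal curve."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(r, n, p) - normal_pdf(r, mu, sigma)) for r in range(n + 1)
    )

gaps = [largest_gap(n, 0.5) for n in (4, 16, 64, 256)]
for n, gap in zip((4, 16, 64, 256), gaps):
    print(n, round(gap, 5))
```

The printed gap shrinks as n grows: many small two-outcome events added together pile up into the bell shape.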
Why does this normal distribution happen?
[See http://ccl.northwestern.edu/cm/index.html
for the StarLogoT demo used in class.
Can you understand:
What effect changing the probabilities of each event has?
What has to change to skew a normal curve?]
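If you cannot run the demo, the two questions can be explored with a small simulation. This is not the StarLogoT model from class, only a rough stand-in: each "ball" makes a fixed number of independent left/right moves, and we record how many went right. All names, the seed, and the numbers of steps and balls are assumptions chosen for illustration.

```python
import random
from collections import Counter

def simulate(n_steps, p_right, n_balls, seed=0):
    """Each ball makes n_steps independent moves; return how many went right, per ball."""
    rng = random.Random(seed)
    return [sum(rng.random() < p_right for _ in range(n_steps)) for _ in range(n_balls)]

def skewness(values):
    """About 0 for a symmetric pile; negative when the long tail is on the left."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2**1.5

def text_histogram(values, width=50):
    """Print one row of '#' marks per outcome, scaled to the tallest bar."""
    counts = Counter(values)
    tallest = max(counts.values())
    for k in range(min(counts), max(counts) + 1):
        print(f"{k:3d} {'#' * round(width * counts[k] / tallest)}")

fair = simulate(10, 0.5, 20_000)    # left and right equally likely
biased = simulate(10, 0.9, 20_000)  # right strongly favoured

text_histogram(fair)
text_histogram(biased)
print(round(skewness(fair), 2), round(skewness(biased), 2))
```

With equal probabilities the pile is symmetric; pushing the probability toward one side, with only a few steps, squeezes the pile against that edge and leaves a long tail on the other.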
The standard deviation
From: http://www.psychstat.smsu.edu/introbook/sbk00.htm
• Given the non-linear shape of the normal distribution, one
has two choices:
– A.) Keep the amount of variation in each division
constant, but vary the size of the divisions
– B.) Keep the size of each division constant, but vary the
amount of variation in each division
The standard deviation (SD)
• The definition of SD takes the second approach: it
keeps the size of each division constant, but it
varies the amount of variation in each division
• The SD is a measure of average deviation
(difference) from the mean
• It is the square root of the variance, which is
the average squared difference from the
mean. [Why do we square the difference?]
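The definitions above can be written out step by step. The scores are made up for illustration, and the code divides by n (a plain average, as in the definition above); many textbooks divide by n - 1 when working from a sample. The first printed number hints at one common answer to the bracketed question.

```python
from math import sqrt

def variance(values):
    """Average squared difference from the mean (dividing by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def standard_deviation(values):
    """Square root of the variance."""
    return sqrt(variance(values))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
mean = sum(scores) / len(scores)

# Raw differences from the mean always cancel out, so their average tells us nothing.
print(sum(v - mean for v in scores))  # 0.0
# Squared differences cannot cancel, so their average measures spread.
print(variance(scores))               # 4.0
print(standard_deviation(scores))     # 2.0
```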
Z-scores
• If we express differences from the mean by dividing them by the SD, we have z-scores: standard units of difference from the mean, i.e. z = (x - mean) / SD
• THESE Z-SCORES WILL COME IN EXTREMELY
USEFUL!
– For example, we might want to know:
• If a 12-foot elephant is taller (compared to the height of
average elephants) than a 230-pound man is heavy
(compared to the weight of average men)
• If Wayne Gretzky was a better hockey player than Tiger Woods
is a golfer (a prize for the person who proves one or the
other!)
• If a person with a WAIS IQ of 140 is rarer (= less probable)
than a person with a GPA of 3.9
– Etc.
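Here is how the elephant question could be settled with z-scores. The means and SDs below are invented purely for illustration (they are not real elephant or human statistics), so only the method should be taken seriously, not the verdict.

```python
def z_score(value, mean, sd):
    """How many SDs a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# Hypothetical population figures, made up for this example.
elephant_z = z_score(12.0, mean=10.0, sd=1.0)   # height in feet
man_z = z_score(230.0, mean=180.0, sd=30.0)     # weight in pounds

print(elephant_z)       # 2.0
print(round(man_z, 2))  # 1.67
print("elephant is more extreme" if elephant_z > man_z else "man is more extreme")
```

Feet and pounds cannot be compared directly, but once both are expressed in SD units the comparison is straightforward.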
What co-relates in a correlation?
• In a correlation, we want to find the equation for the (one
and only) line (the line of regression) which describes the
relation between variables with the least error.
– This is done mathematically, but the idea is simply that
we draw the line for which the sum of the squared distances
of the points (on two or more dimensions) from the line is
as small as possible: no other line would give a smaller total
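A small sketch of "the line with the least squared error" for two variables. It assumes the usual convention of measuring each point's distance vertically (along Y) and uses the standard closed-form solution for that case; the data and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Slope and intercept of the line that minimises the summed squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def squared_error(xs, ys, slope, intercept):
    """Sum of squared vertical distances of the points from a given line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
best = squared_error(xs, ys, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(best, 3))

# Nudging the line in any direction only makes the total error larger.
for d_slope, d_intercept in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.3), (0.0, -0.3)]:
    print(round(squared_error(xs, ys, slope + d_slope, intercept + d_intercept), 3))
```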
We need first to know: What is covariance?
• Covariance is closely related to variance (which is, recall,
the average of the squared deviations from a mean)
• The covariance of two features X and Y measures their
tendency to vary together, i.e., to co-vary.
• It is defined as the average of (differences from the mean
for X multiplied by the differences from the mean for Y)
– That is: the average of the products of the deviations
from the mean of X and Y
– In variance (one variable), we square the differences
from the mean
– In covariance (two variables), we multiply one
difference from the mean by the other difference from
the mean
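The parallel between variance and covariance is easy to see in code: variance is just the covariance of a variable with itself. A minimal sketch with made-up data, dividing by n as in the "average" definition above:

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Average of (difference from mean of X) * (difference from mean of Y), pair by pair."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    """Multiplying each difference by itself is squaring it."""
    return covariance(xs, xs)

hours = [1, 2, 3, 4, 5]  # made-up data
marks = [2, 4, 5, 4, 5]

print(covariance(hours, marks))  # 1.2
print(variance(hours))           # 2.0
```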
We need first to know: What is covariance?
• Covariance is the average of the products of the deviations
from the mean of X and Y
• Properties:
– If X and Y tend to increase together, then c(X,Y) > 0
– If one tends to decrease when the other increases, then
c(X,Y) < 0
– If X and Y are independent, then c(X,Y) = 0
– | c(X,Y) | <= the product of the standard deviations of
X and Y
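Each property can be seen on small made-up data sets. In the "unrelated" set every X value is paired equally often with every Y value, a tidy stand-in for independence; with real samples of independent variables the covariance is only approximately zero. All names and numbers are illustrative.

```python
from math import sqrt

def covariance(xs, ys):
    """Average product of paired differences from the two means (dividing by n)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def sd(xs):
    return sqrt(covariance(xs, xs))

examples = {
    "rising":    ([1, 2, 3, 4, 5, 6], [2, 3, 5, 4, 7, 8]),
    "falling":   ([1, 2, 3, 4, 5, 6], [9, 7, 8, 5, 4, 2]),
    "unrelated": ([1, 1, 2, 2, 3, 3], [10, 20, 10, 20, 10, 20]),
}

results = {}
for name, (xs, ys) in examples.items():
    c = covariance(xs, ys)
    bound = sd(xs) * sd(ys)
    results[name] = (c, bound)
    # The last column checks |c(X,Y)| <= SD(X) * SD(Y).
    print(name, round(c, 3), round(bound, 3), abs(c) <= bound)
```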
What is a correlation?
• R = the covariance of X and Y / the product of the SDs of
X and Y, i.e. $R = c(X,Y) / (SD(X) \cdot SD(Y))$
• It is a measure of how (the product of) item-by-item
differences from the two means relates to (the product of)
their overall average differences
• When X and Y are strongly related, the covariance is close
in size to the product of the SDs of X and Y, so R will be
close to 1 (or to -1 if one falls as the other rises).
• When X and Y are unrelated, the item-by-item differences
from the two means do not line up, so their products tend
to cancel: |c(X,Y)| is much smaller than SD(X) * SD(Y),
and R is close to 0
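Putting the pieces together, R is the covariance rescaled by the two SDs. The sketch below follows that definition directly; the data sets are made up to show the extreme cases and one in-between case.

```python
from math import sqrt

def correlation(xs, ys):
    """R = covariance of X and Y divided by the product of their SDs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sd_x = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

xs = [1, 2, 3, 4, 5]
r_line_up = correlation(xs, [3, 5, 7, 9, 11])    # points exactly on a rising line
r_line_down = correlation(xs, [11, 9, 7, 5, 3])  # points exactly on a falling line
r_noisy = correlation(xs, [2, 4, 5, 4, 5])       # related, but not perfectly
r_none = correlation([1, 1, 2, 2, 3, 3], [10, 20, 10, 20, 10, 20])  # balanced, unrelated

print(round(r_line_up, 3), round(r_line_down, 3), round(r_noisy, 3), round(r_none, 3))
```

Because of the bound |c(X,Y)| <= SD(X) * SD(Y), R always lands between -1 and 1, whatever units X and Y were measured in.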
Visual help
• Check out the normal curve and correlation real-time
demos (as well as infinite 2-dice problems!) at:
http://noppa5.pc.helsinki.fi/koe/