Transcript Authority 2

HW 8: AGAIN
HW 8
I wanted to bring up a couple of issues from
grading HW 8.
Even people who got problem #1 exactly right
didn’t think about it enough. The problem was
supposed to be easy, and it is… if you think
about it.
Definition of Conditional Probability
Some of you knew the definition of conditional
probability, and tried to apply that:
P(C/ W) = P(C & W) ÷ P(W)
What’s important here is that the instructions
did not tell you the prior (unconditional)
probability of winning P(W).
A Common Mistake
The instructions explained the probability of
winning by chance alone, as if you were betting
on 10 coin flips: P(W/ not-C), the probability of
winning fairly.
The most common mistake on the homework
was taking P(W) to be equal to 1 ÷ 1024 and
using that number to calculate P(C/ W).
Cheating Increases Winning
But remember: that’s the probability that
someone who isn’t cheating will win. Once you
add in cheaters along with the non-cheaters the
probability of winning increases. If the chances
were still 1 ÷ 1024 even when people started
cheating, nobody would bother to cheat!
P(W)?
But how do you find out the probability of
winning, P(W)?
You know the probability of winning given that
you cheated: P(W/ C) = 100%. Here, the law of
total probability is useful.
Law of Total Probability
This is the law for the special case where we
have a binary variable like C:
P(W) = P(W/ C)P(C) + P(W/ not-C)P(not-C)
The first term is going to be 100% times the
probability of being a cheater, which is 250 ÷
1024000.
Second Term
P(W) = P(W/ C)P(C) + P(W/ not-C)P(not-C)
The second term is going to be (1 in 1024) times
the probability that a randomly selected player
is not a cheater:
P(not-C) = 1 – P(C) = 1 – (250 ÷ 1024000) =
1023750 ÷ 1024000.
P(W)
So we get:
P(W) = (250 ÷ 1024000) + [(1 ÷ 1024) x (1023750
÷ 1024000)]
= (250 + 999.75) ÷ 1024000
= 1249.75 ÷ 1024000
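For anyone who wants to check the arithmetic, here is a minimal Python sketch of the law of total probability for a binary variable, using only the numbers quoted above. The function name `total_probability` and the variable names are illustrative, not part of the homework. Working in exact fractions shows that the second term is about 999.756 ÷ 1024000, which the slide writes as 999.75.

```python
from fractions import Fraction

# Numbers quoted in the lecture.
p_cheat = Fraction(250, 1_024_000)    # P(C)
p_win_given_cheat = Fraction(1)       # P(W/ C) = 100%
p_win_given_fair = Fraction(1, 1024)  # P(W/ not-C)


def total_probability(p_c, p_w_given_c, p_w_given_not_c):
    """P(W) = P(W/ C)P(C) + P(W/ not-C)P(not-C) for a binary C."""
    return p_w_given_c * p_c + p_w_given_not_c * (1 - p_c)


p_win = total_probability(p_cheat, p_win_given_cheat, p_win_given_fair)

# Express P(W) over the same denominator the slide uses (1024000).
numerator_over_population = p_win * 1_024_000
print(float(numerator_over_population))
print(p_win > p_win_given_fair)  # cheaters push P(W) above 1 in 1024
```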
P(C & W)
Now we just need the probability that a
randomly selected individual will be a winner
and a cheater, P(C & W). Every cheater wins, so
everyone who cheats is both a winner and a
cheater: C & W = C. So
P(C & W) = P(C) = 250 ÷ 1024000.
20.004%
Now we can use the definition of conditional
probability:
P(C/ W)
= P(C & W) ÷ P(W)
= (250 ÷ 1024000) ÷ (1249.75 ÷ 1024000)
= 250 ÷ 1249.75
= 20.004%
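The whole calculation can be sketched in a few lines of Python. This is an illustration under the lecture's stated numbers, and the helper name `posterior_cheat` is made up for the example. It also shows what the common mistake gives: plugging in 1 ÷ 1024 for P(W) yields 25% rather than roughly 20%.

```python
from fractions import Fraction


def posterior_cheat(p_c, p_w_given_c, p_w_given_not_c):
    """P(C/ W) = P(C & W) / P(W), with P(W) from the law of total probability."""
    p_c_and_w = p_w_given_c * p_c
    p_w = p_c_and_w + p_w_given_not_c * (1 - p_c)
    return p_c_and_w / p_w


p_cheat = Fraction(250, 1_024_000)

# Correct answer: P(W) includes both cheaters and fair winners.
answer = posterior_cheat(p_cheat, Fraction(1), Fraction(1, 1024))

# Common mistake: treating P(W/ not-C) = 1/1024 as if it were P(W).
naive = p_cheat / Fraction(1, 1024)

print(f"correct: {float(answer):.5%}")
print(f"mistaken: {float(naive):.2%}")
```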
An Easier Way?
But that was hard and it involved a lot of
numbers. How can we make it easier? The first
thing we need to observe is that both the
numerator and the denominator share a
common factor:
Remove Common Factors
P(C/ W) = P(C & W) ÷ P(W)
P(C & W) is going to be the number of cheaters
divided by the number of people who play
roulette. And P(W) is going to be the number of
winners divided by the number of people who
play roulette. So you can forget about the
common factor: 1 ÷ 1024000.
#C ÷ #W
P(C/ W) = P(C & W) ÷ P(W) = #(C & W) ÷ #W
The probability that someone who won was a
cheater is just the number of people who cheat
and win out of the number of people who win.
And as we noted before, #(C & W) = #C. So:
P(C/ W) = #C ÷ #W
Think about It
But even that wasn’t very smart, and needed us
to remember the definition of P(C/ W). What if
we’re no good at probability?
Well, think about the question. We have
someone who is a winner. Did he cheat? If the
percentage of cheaters among winners is very
low, probably not. How many cheaters are there
out of the total number of winners?
#C? #W?
Now we’ve reduced the problem to two
questions: how many people cheat and how
many people win? If we take those two numbers
and divide the first by the second, we get the
probability that someone who wins has cheated.
How many cheaters are there out of the total
number of winners?
#C
I made the answer to the question “how many
cheaters are there?” very easy. The probability
of being a cheater is 250 ÷ 1024000, and the
population of players is 1024000, so
#C = (250 ÷ 1024000) x 1024000 = 250.
No calculator required!
#W?
The second question, “how many winners are
there?” is a little bit tougher. It should be the
number of cheaters, #C, the number we just
calculated– 250– plus the people who won
without cheating.
#W = #C + #(W & not-C)
#(W & not-C)
How do we figure out the # of winners who did
not cheat,
#(W & not-C) = [P(W/ not-C) x #not-C]?
We know the # of non-cheaters: it’s 1024000 –
250, or 1023750. And we know the probability that
a non-cheater will win: it’s 1 in 1024. So:
#(W & not-C) = 1023750 ÷ 1024 ≈ 999.75
P(C/ W)
So here’s our answer:
P(C/ W)
= #C ÷ [#C + #(W & not-C)]
= 250 ÷ (250 + 999.75)
= 20.004%
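The counting version is short enough to sketch directly. The variable names below are illustrative, and the number of fair winners is an expected count, which is why it is not a whole number.

```python
# Counting approach: how many cheaters are there among all winners?
population = 1_024_000
cheaters = 250                        # #C, every one of whom wins
fair_players = population - cheaters  # #not-C = 1,023,750

# Expected number of fair winners: each fair player wins 1 time in 1024.
fair_winners = fair_players / 1024    # #(W & not-C)

winners = cheaters + fair_winners     # #W
share_of_cheaters = cheaters / winners  # P(C/ W) = #C / #W

print(f"fair winners: {fair_winners:.2f}")
print(f"P(C/ W): {share_of_cheaters:.3%}")
```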
999.75 ≈ 1000
If you realized that 999.75 was almost the same
as 1000 (there’s only a .25 difference), you could
solve the problem in your head:
P(C/ W)
= 250 ÷ (250 + 1000)
= 250 ÷ [250 + (250 x 4)]
= 250 ÷ (250 x 5)
= 1 ÷ 5 = 20%
Problem #2
In the second problem, you were supposed to
give three potential explanations for an as-yet
unexplained correlation: the positive correlation
between the car accident rates in areas of
Chicago, and the rates of street crime in those
areas. If an area has more car accidents, it has
more street crime and vice versa.
Correlation ≠ Causation
There were two problems that I saw show up
frequently in your “car crashes cause street
crime” and “street crime causes car crashes”
explanations.
The first was simply a failure to recognize that
correlation is not the same as causation, and
pointing out a correlation is not the same as
explaining it.
Sample Answer
Here’s a sample answer that I got: “According to
the article, areas that had higher rates of street
crime had more car accidents. So street crime
causes car accidents.”
The claim “areas that had higher rates of street
crime had more car accidents” is just a
description of the correlation.
Causal Explanation
If you want to provide an explanation of a
phenomenon, you have to do two things: (a)
make a causal claim and (b) provide a
mechanism.
To explain why car accidents and street crime
are correlated you might say (a) car accidents
cause street crime and (b) the way this happens
is that car accidents distract people and make
them vulnerable to robbery.
Second Problem
The second problem with some of your “A
causes B” and “B causes A” explanations was
that they were really common cause
explanations in disguise.
For example: “Here’s how car accidents cause
street crime: people with a deviant tendency
often drive recklessly. This tendency also causes
them to commit crimes.”
Really a Common Cause
Here, the deviant tendency is what is causing
both the car accidents and the street crime.
If you stopped the car accidents (for example, by
removing everyone’s driver’s license) you
wouldn’t reduce the street crime, because it’s
the deviant tendency and not the car accidents
that causes the crime.
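One way to see the point is a toy simulation. Everything in it is an assumption made for illustration: the variable names, the noise levels, and the rule that crime depends only on the deviant tendency. Under that rule, setting accidents to zero leaves crime exactly where it was.

```python
import random

rng = random.Random(2)
n = 1000

# Hypothetical "deviant tendency" per person, plus fixed noise terms.
tendency = [rng.random() for _ in range(n)]
accident_noise = [rng.gauss(0, 0.1) for _ in range(n)]
crime_noise = [rng.gauss(0, 0.1) for _ in range(n)]


def world(licenses_revoked):
    """Return (accidents, crime) levels in arbitrary units."""
    # Accidents depend on the tendency, unless nobody is allowed to drive.
    accidents = [
        0.0 if licenses_revoked else t + e
        for t, e in zip(tendency, accident_noise)
    ]
    # Crime depends only on the tendency, never on accidents.
    crime = [t + e for t, e in zip(tendency, crime_noise)]
    return accidents, crime


accidents_before, crime_before = world(licenses_revoked=False)
accidents_after, crime_after = world(licenses_revoked=True)

print(sum(accidents_before) > 0, sum(accidents_after) == 0)
print(crime_before == crime_after)  # the intervention did not touch crime
```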
Common Cause
There were also some misconceptions about
common cause explanations.
To explain a correlation between A and B by a
common cause C, you make two causal claims (C
causes A and C causes B) and provide two
mechanisms: say why C causes A and why it
causes B too.
C Must Be One Thing
Importantly, C can’t be two different things, one
of which explains A and the other of which
explains B.
For example, you can’t say C = “people who
don’t pay attention and need money” and then
go on to explain: “If people don’t pay attention,
then they get in car accidents; if they need
money, they commit robberies.”
X and Y Correlated?
Unless you had an independent reason to think
that there was a strong correlation between
not paying attention and needing money, this
couldn’t possibly explain the correlation
between car accidents and street crime. This
explanation is really of the form “X = not paying
attention causes car crashes” and “Y = needing
money causes street crime.”
Bad Explanation
And you can’t argue like this:
X causes A
Y causes B___________________
Therefore A and B are correlated.
Compare:
Broken bones cause extreme pain.
Smiling babies cause happiness.______
Therefore, extreme pain is correlated with
happiness.
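A small simulation makes the contrast concrete. This is a sketch under made-up assumptions (normally distributed causes, arbitrary noise levels, a hand-written `pearson` helper): when A and B have separate, unrelated causes X and Y, their correlation comes out near zero, whereas a single shared cause C produces a strong one.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)
n = 5000

# Bad explanation: X causes A, Y causes B, and X and Y are unrelated.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
a = [xi + rng.gauss(0, 0.5) for xi in x]
b = [yi + rng.gauss(0, 0.5) for yi in y]
r_two_causes = pearson(a, b)

# Common cause: one variable C drives both A and B.
c = [rng.gauss(0, 1) for _ in range(n)]
a2 = [ci + rng.gauss(0, 0.5) for ci in c]
b2 = [ci + rng.gauss(0, 0.5) for ci in c]
r_common_cause = pearson(a2, b2)

print(f"two separate causes: r = {r_two_causes:.2f}")
print(f"one common cause:    r = {r_common_cause:.2f}")
```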
Correlated In the Right Way
It’s also important that the common cause in
your explanation is correlated in the right way
with the variables A and B.
Here’s an example of the wrong way: “C =
residents having more money. People who have
more money will buy more cars, and more cars
means more accidents. People who have more
money will turn to street crime less often.”
Positive vs. Negative Correlations
This proposed explanation provides a variable C,
“how much money residents have” that causes
the values of the variables A, “number of car
accidents” and B, “amount of street crime.”
However, it predicts that we should see areas of
Chicago with more accidents and less crime, and
areas with more crime and fewer accidents. It
predicts that crime and crashes are negatively
correlated.
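The same kind of toy simulation shows the sign problem. The effect sizes and noise levels here are invented for illustration. If money pushes accidents up and crime down, the correlation it induces between accidents and crime is negative, the opposite of what was observed.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)
n = 5000

# C = how much money residents have (arbitrary units, one value per area).
money = [rng.gauss(0, 1) for _ in range(n)]

# More money -> more cars -> more accidents (positive effect).
accidents = [m + rng.gauss(0, 0.5) for m in money]
# More money -> less street crime (negative effect).
crime = [-m + rng.gauss(0, 0.5) for m in money]

r = pearson(accidents, crime)
print(f"accidents vs. crime: r = {r:.2f}")  # negative, not positive
```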
Goal of Causal Models
Finally, remember what the goal of finding
various causal models for discovered
correlations is. We want to know the best
explanation for the correlation. This requires
that any proposed explanation, like a common
cause explanation, be reasonable.
Unreasonable Explanation
I got a lot of examples like this: “night time/ bad
weather/ natural disasters are the common
cause that explains the correlation between car
accidents and street crime. When it’s dark
outside, more crime happens. Also when it’s
dark, it’s tougher to see and easier to crash.
So darkness causes both increased crime and
increased car accidents.”
How Dark It Is
But “how dark it is outside” is a variable that has
different values at different times. I’m sure there
is a positive correlation between street crime at
a certain time of day and car accidents at that
time of day– for exactly this reason.
But the correlation we wanted to explain was
not between the values of “street crime” and
“car crashes” at different times of day; it was a
positive correlation across different parts of the city!
Observed Correlation
Here’s the observed correlation: neighborhoods
in Chicago that have higher rates of car crashes
have higher rates of street crime.
But it gets dark at the same time in every
neighborhood in Chicago. No neighborhood has
more night-time than any other neighborhood.
So the correlation between street crime and
car accidents at night cannot explain why
neighborhoods differ in these ways.
Bad Weather & Natural Disasters
The same thing is true for bad weather and
earthquakes. All the areas of Chicago have the
same amount of good weather and the same
amount of bad weather. So even if there’s more
crime and more accidents during bad weather,
this can’t explain why some areas of Chicago
have more crime and more accidents than other
areas.
Variables
This confusion is partly my fault. I should have
explained variables more carefully.
Here’s how I introduced variables: a variable is
something that takes on different values. So for
example, “height” is a variable because different
people have different heights, and “daylight” is a
variable, because different times have different
amounts of daylight.
Domain of a Variable
But there is an important concept here that I did
not stress: the domain of the variable. Different
people have different heights, but different
times do not. So people (and other things that
can have heights, like buildings) are in the
domain of the variable “height.” Different times
can have different amounts of daylight, but
different people (at least people at the same
latitude) cannot, so times, not people, are in the
domain of “amount of daylight.”
Subtly Different Variables
Some variables can seem the same even though
they are different, because they have different
domains.
So there is one variable “crime” that has as its
domain time. At different times, there are
different amounts of crime. And there is another
variable “crime” whose domain is places:
different places have different amounts of
crime.
Keep the Domain Fixed
When you want to explain a correlation
between two variables with domain D, and you
are proposing a common cause variable C, C
must also have D as its domain.
Since the variables we’re looking at are varying
rates of crime and car accidents by location, we
need to look at potential common causes that
vary by location, not daylight or bad weather.
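Here is a hedged sketch of why the domain matters. All the numbers are invented: 400 hypothetical neighborhoods, 24 hours, and a darkness effect that is identical everywhere. Darkness produces a strong correlation between crime and crashes across hours, but essentially none across neighborhoods, because every neighborhood gets the same amount of darkness.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(1)
n_places, n_hours = 400, 24

# Darkness has time as its domain: the same value in every neighborhood.
dark = [1.0 if (h < 6 or h >= 18) else 0.0 for h in range(n_hours)]

# Levels in arbitrary units: darkness effect plus independent noise.
crime = [[2.0 * dark[h] + rng.gauss(0, 1) for h in range(n_hours)]
         for _ in range(n_places)]
crashes = [[1.5 * dark[h] + rng.gauss(0, 1) for h in range(n_hours)]
           for _ in range(n_places)]

# Variables with domain "time": totals per hour, summed over places.
crime_by_hour = [sum(crime[p][h] for p in range(n_places)) for h in range(n_hours)]
crashes_by_hour = [sum(crashes[p][h] for p in range(n_places)) for h in range(n_hours)]

# Variables with domain "place": totals per neighborhood, summed over hours.
crime_by_place = [sum(row) for row in crime]
crashes_by_place = [sum(row) for row in crashes]

r_over_time = pearson(crime_by_hour, crashes_by_hour)
r_over_place = pearson(crime_by_place, crashes_by_place)

print(f"across hours:         r = {r_over_time:.2f}")
print(f"across neighborhoods: r = {r_over_place:.2f}")
```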
SUMMARY
Think
The answers to HW 8 were mostly good. Still, I
can’t teach you everything: there’s only so much
time in class.
You need to think about the problems. What’s
the probability that someone who won was a
cheater? The answer is a lot easier if you think
first instead of breaking out the calculator and
the statistics textbooks.
Think
What is a common cause explanation for
correlated levels of car accidents and street
crime over different locations in Chicago? Some
answers to this question (“earthquakes”) don’t
make any sense. I can explain why they don’t
make sense, but you don’t need that
explanation to see that they don’t work. Think
about how the causal model is supposed to run,
and you will see that it doesn’t.