Probability and Induction
Probability
Probability is a measure of the chances that
something will happen.
• A fair coin has a 50% probability of landing
heads.
• There is a 20% probability that it will rain
tomorrow.
• There is a 37.6% chance that Horacio Cartes
will win the presidency of Paraguay.
Frequency
Probability is not the same as the frequency with
which something happens.
Frequency of coin landing heads
=
[# of heads ÷ # of total flips]
Frequency
The frequency of heads over a large number of
coin flips will be close to the probability of a coin
landing heads. But:
• Frequency over a small number of flips may
be very different from probability.
• Some events only happen once, like the 2013
election in Paraguay.
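The gap between frequency and probability can be seen in a small simulation. This is only a sketch: the function name heads_frequency, the fixed seed, and the flip counts are choices made for illustration, not part of the lecture.

```python
import random

def heads_frequency(n_flips, p_heads=0.5, seed=0):
    """Fraction of heads in n_flips simulated flips of a coin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips

# Few flips can land far from 0.5; many flips land close to it.
for n in (10, 100, 100_000):
    print(n, heads_frequency(n))
```

With only 10 flips the printed frequency can be well away from 0.5, while the 100,000-flip frequency should sit very close to it.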
What Is Probability?
People disagree on this question. One idea is
that it’s a type of uncertainty.
• A 50% probability of a coin landing heads is
where I’ll only bet $10 that it will land heads if
the bet pays $10 or more if I win.
• A 20% probability of it raining tomorrow is
where I’ll only bet $10 that it will rain if the
bet pays $40 or more.
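The two betting figures above follow from a break-even calculation. The sketch below assumes "pays" means the winnings received on top of getting the stake back; min_fair_payout is a hypothetical helper name.

```python
def min_fair_payout(stake, probability):
    """Smallest winnings that make a bet break even on average.

    Expected gain = probability * payout - (1 - probability) * stake,
    which is zero when payout = stake * (1 - probability) / probability.
    """
    return stake * (1 - probability) / probability

print(min_fair_payout(10, 0.5))  # coin landing heads
print(min_fair_payout(10, 0.2))  # rain tomorrow
```

Under that reading, a $10 bet at 50% needs to pay about $10, and a $10 bet at 20% needs to pay about $40, matching the examples.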
Formalizing Probability
We can use Pr(-) and our earlier logic to
formalize probability claims:
Pr(A) = the probability of A happening
Pr(A & B) = probability of A and B
Pr(A v B) = probability of A or B
Pr(~B) = probability of B not happening
Probability Axioms
Probability theory is one of the simplest
mathematical theories there is. There are only
three basic laws (axioms): For all A, B:
1 ≥ Pr(A) ≥ 0
Pr(A v ~A) = 1
Pr(A v B) = Pr(A) + Pr(B) – Pr(A & B)
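One way to get a feel for the three axioms is to check them on a tiny example. The sketch below assumes a single fair six-sided die and treats an event as a set of faces; the events A and B are made up for illustration.

```python
from fractions import Fraction

OUTCOMES = frozenset(range(1, 7))  # faces of one fair die

def pr(event):
    """Probability of an event (a set of faces) on a fair die."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

A = {2, 4, 6}          # "the roll is even"
B = {4, 5, 6}          # "the roll is greater than 3"
not_A = OUTCOMES - A   # "the roll is not even"

assert 0 <= pr(A) <= 1                          # 1 >= Pr(A) >= 0
assert pr(A | not_A) == 1                       # Pr(A v ~A) = 1
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)   # Pr(A v B) rule
print(pr(A | B))
```

Here set union plays the role of "v" and set intersection plays the role of "&".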
Conjunction
I won’t ask you to do any math, but you do need
to remember a couple of facts for the final:
Pr(A & B) ≤ Pr(A)
Pr(A & B) ≤ Pr(B)
Note: “≤” not “<”
Conjunction Fallacy
We met this fact when we learned about the
conjunction fallacy. (A & B) is never more
probable than A, and never more probable than
B. It can be less probable or equally probable.
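The rule can be checked by brute force on a small case. This is a sketch, assuming a fair six-sided die and trying every possible pair of events; it illustrates the rule rather than proving it in general.

```python
from fractions import Fraction
from itertools import combinations

FACES = (1, 2, 3, 4, 5, 6)

def pr(event):
    """Probability of an event (a set of faces) on a fair die."""
    return Fraction(len(event), len(FACES))

# Every possible event: all 64 subsets of the six faces.
EVENTS = [frozenset(c) for r in range(len(FACES) + 1)
          for c in combinations(FACES, r)]

def conjunction_rule_holds():
    """True if Pr(A & B) <= Pr(A) and Pr(A & B) <= Pr(B) for all pairs."""
    return all(pr(a & b) <= pr(a) and pr(a & b) <= pr(b)
               for a in EVENTS for b in EVENTS)

print(conjunction_rule_holds())
```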
Question #2
2. Which of the following is most likely to happen?
a. There will not be a final exam in this class.
b. There will not be a final exam in this class,
because the instructor has to leave the country.
c. Lingnan University closes and there will not be a
final exam in this class.
d. There is not enough information to answer this
question.
Special Cases
Normally, (A & B) is less probable than A, as in
the question on the exam. But sometimes they
are equal, for instance when (A → B) or when B
is always true.
The probability that a thing x is a dog AND an
animal is the same as the probability that it is a
dog, because “x is a dog → x is an animal.”
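The dog example can be made concrete with a small made-up collection of things. The list THINGS and the helper names below are hypothetical and only there to show the equal case next to the strictly-less case.

```python
from fractions import Fraction

# Hypothetical collection of things, for illustration only.
THINGS = ["dog", "dog", "cat", "penguin", "rock", "chair"]
ANIMALS = {"dog", "cat", "penguin"}

def pr(condition):
    """Share of THINGS that satisfy the condition."""
    return Fraction(sum(1 for t in THINGS if condition(t)), len(THINGS))

def is_dog(t):
    return t == "dog"

def is_animal(t):
    return t in ANIMALS

def is_dog_and_animal(t):
    return is_dog(t) and is_animal(t)

# Equal, because every dog is an animal.
assert pr(is_dog_and_animal) == pr(is_dog)
# Strictly smaller, because not every animal is a dog.
assert pr(is_dog_and_animal) < pr(is_animal)
```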
For the Final
So remember for the final:
Pr(A & B) ≤ Pr(A)
Pr(A & B) ≤ Pr(B)
There is more than one question that tests these
facts!
INDUCTION
Kinds of Inference
C.S. Peirce (1839-1914), an American pragmatist
philosopher, was the first to divide inferences
into three types:
• Deductive
• Abductive
• Inductive
Jars of Balls
Imagine that we are reasoning about a jar full of
a mix of red and black balls. For any ball it can
have one of three features (or their opposites):
• J = it is in the jar
• S = it is part of a random sample of balls taken
from the jar
• R = it is red
Deductive Arguments
A deductive argument (also known as a logically
valid argument) is an argument such that if the
premises are true, then the conclusion must be
true; the premises can’t all be true while the
conclusion is false; if the conclusion is false, at
least one of the premises must also be false.
Deduction Example
Everything in the jar is red; this sample is taken
from the jar; so everything in this sample is red.
[Rule] All J’s are R’s.
[Case] All S’s are J’s.
Therefore,
[Result] All S’s are R’s.
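The deductive pattern can be sketched with sets, reading "All X's are Y's" as "X is a subset of Y". The ball labels are hypothetical.

```python
# Hypothetical ball labels, for illustration only.
R = {"b1", "b2", "b3", "b4", "b5"}   # red things
J = {"b1", "b2", "b3", "b4"}         # things in the jar
S = {"b2", "b3"}                     # things in the sample

def all_are(xs, ys):
    """True when every member of xs is also a member of ys."""
    return xs <= ys

rule = all_are(J, R)      # All J's are R's.
case = all_are(S, J)      # All S's are J's.
result = all_are(S, R)    # All S's are R's.

# Validity: the premises cannot both hold while the conclusion fails.
assert not (rule and case) or result
```

Because the subset relation is transitive, no choice of R, J, and S can make the rule and the case true while the result is false.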
Notes on Terminology
We have done a fair amount of deductive logic so far.
The Final contains language like “construct a
natural deduction from the premises ________
to the conclusion ________.”
This just means prove. All the proof questions
are optional.
Abductive Arguments
An abductive argument, unlike a deductive
argument, is ‘ampliative’: the conclusion goes
beyond what is contained in the premises.
Abductive Arguments
Abductive arguments are also called ‘inferences
to the best explanation.’
We observe some phenomenon, think about all
the different ways it could have been produced,
and conclude that it was produced in the way
that is most plausible, best fits with our other
theories, etc.
Abduction Example
Everything in the jar is red; everything in the
sample is red; so the sample came from the jar.
[Rule] All J’s are R’s.
[Result] All S’s are R’s.
Therefore,
[Case] All S’s are J’s.
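Unlike deduction, this pattern can have true premises and a false conclusion. A sketch with hypothetical ball labels, again reading "All X's are Y's" as a subset test:

```python
# Hypothetical ball labels, for illustration only.
R = {"b1", "b2", "b3", "b4", "b5"}   # red things
J = {"b1", "b2", "b3"}               # things in the jar
S = {"b4", "b5"}                     # a red sample taken from somewhere else

rule = J <= R      # All J's are R's.
result = S <= R    # All S's are R's.
case = S <= J      # All S's are J's.

# Both premises are true, yet the conclusion is false.
print(rule, result, case)
```

The conclusion may still be the best explanation in a given situation, but the premises alone do not guarantee it.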
Generalized Abduction
X% of what’s in the jar is red; Y% of what’s in the
sample is red; so the sample came from the jar.
[Rule] X% of J’s are R’s.
[Result] Y% of S’s are R’s.
Therefore,
[Case] All S’s are J’s.
Inductive Arguments
Inductive arguments are also ‘ampliative’ in that
the truth of their premises does not guarantee
the truth of their conclusions.
Inductive Arguments
Inductive arguments reason from what’s true of
a sample, to what’s true of the population as a
whole.
Polling is an example of induction, as are
inferences from past experience to future
experience (since what’s past is only a sample of
what happens).
Induction Example
This sample is taken from the jar; everything in
the sample is red; so everything in the jar is red.
[Case] All S’s are J’s.
[Result] All S’s are R’s.
Therefore,
[Rule] All J’s are R’s.
Generalized Induction
This sample is taken from the jar; Y% of what’s
in the sample is red; so X% of what’s in the jar is
red.
[Case] All S’s are J’s.
[Result] Y% of S’s are R’s.
Therefore,
[Rule] X% of J’s are R’s.
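Generalized induction can be sketched as estimating the share of red balls in the jar from a random sample. The jar contents (70% red), the sample size, and the function name estimate_red_share are assumptions for illustration.

```python
import random

def estimate_red_share(jar, sample_size, seed=0):
    """Share of red balls in a random sample drawn from the jar."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample = rng.sample(jar, sample_size)  # draw without replacement
    return sample.count("red") / sample_size

# Hypothetical jar: X = 70% red.
jar = ["red"] * 700 + ["black"] * 300
print(estimate_red_share(jar, 100))
```

The sample share Y% will usually be near the jar's X%, but it need not equal it, which is why the inference is ampliative.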
Enumerative Induction
A common form induction takes is what is
known as enumerative induction:
a1 is F and G
a2 is F and G
…
an is F and G
Therefore, everything that is F is G
Example of Enumerative Induction
Fruit #1 is a durian that smells bad.
Fruit #2 is a durian that smells bad.
…
Fruit #563 is a durian that smells bad.
Fruit #564 is a durian that smells bad.
Therefore,
All durians smell bad.
Need Not Infer to a Rule
Fruit #1 is a durian that smells bad.
Fruit #2 is a durian that smells bad.
…
Fruit #563 is a durian that smells bad.
Fruit #564 is a durian that smells bad.
Therefore,
Fruit #565, if it’s a durian, will smell bad.
Strength and Weakness
We often do not use the word “valid” to
describe inductive arguments.
An argument is deductively valid when IF the
premises are true, THEN the conclusion must be true.
Inductive arguments are not deductively valid.
Strength and Weakness
Instead, we can say that some inductive
arguments are strong and others are weak.
An inductive argument A1, A2, …; therefore C is
strong if the probability of its conclusion, given
its premises, is very high, that is, close to 1:
Pr(C / A1 & A2 & …) ≈ 1.
Strength
Strength: assuming the premises are true, the
probability of the conclusion is very high.
This does not mean that any argument with a
high-probability conclusion is inductively strong.
Sometimes Pr(C) >> Pr(C / A)
Example
For example, it’s highly probable that I will not
win Mark 6.
But this is not a strong inductive argument: I
correctly guessed the first 5 numbers
announced in Mark 6; therefore, I will not win
Mark 6.
A Note on Terminology
On the final, you will encounter both the
phrases “inductively strong” and “inductively
valid.” These phrases mean the same thing. It’s a
personal preference to use one or the other.
“Generalized Deduction”
Another form that induction can take is not
often discussed by philosophers: it’s the
‘generalized’ form of the deductive argument:
[Rule] X% of J’s are R’s.
[Case] All S’s are J’s.
Therefore,
[Result] Y% of S’s are R’s.
Example
90% of French people like to eat snails.
Pierre is a French person.
Therefore, Pierre likes to eat snails.
Maybe this isn’t induction? It’s definitely neither
deduction nor abduction.
SAMPLES
Sample
In statistics, the people who we are studying are
called the sample. (Or if I’m studying the
outcomes of coin flips, my sample is the coin
flips that I’ve looked at. Or if I’m studying
penguins, it’s the penguins I’ve studied.)
Induction on Samples
Our goal is to use induction to infer from the
statistical properties of the sample, to the
statistical properties of the population.
For example, we want to infer from the fact that
in a sample of 100, those who drank wine lived
on average 2 years longer, to the conclusion that
on average, people in the population as a whole
who drink wine live 2 years longer.
Statistical Claims
Claims about the population are usually made
like this:
“We are 90% confident that people who drink
wine live between 1.8 and 2.1 years longer than
people who don’t drink wine.”
They present a 90 (or 95 or 99) percent
confidence interval.
Margin of Error
Usually, we want a 95% confidence interval.
Thus, we have a special name for half the width
of a 95% confidence interval: the “margin of error.”
If someone says: “37% will vote for Cartes with a
3% margin of error” what they mean is that they
are 95% sure that between 34% and 40% of
people will vote for Cartes.
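A rough sketch of where a figure like "3%" can come from, using the standard normal approximation for a proportion. The sample size of 1000 is an assumption; the lecture does not say how many people were polled.

```python
import math

Z_95 = 1.96  # normal-approximation multiplier for 95% confidence

def margin_of_error(share, sample_size, z=Z_95):
    """Approximate margin of error for a share measured in a random sample."""
    return z * math.sqrt(share * (1 - share) / sample_size)

def confidence_interval(share, sample_size, z=Z_95):
    """Interval: share minus and plus the margin of error."""
    moe = margin_of_error(share, sample_size, z)
    return share - moe, share + moe

# Hypothetical poll: 37% for Cartes among 1000 randomly chosen people.
low, high = confidence_interval(0.37, 1000)
print(round(low, 3), round(high, 3))
```

With these assumed numbers the margin of error comes out at roughly 3 percentage points, so the interval runs from about 34% to about 40%.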
Sample Size Determination
One question we should be able to answer: how
many people do we need in the sample to
determine the statistical properties of the
population with a low margin of error?
Law of Large Numbers
Luckily, we do know that more is always better.
The “Law of Large Numbers” says that if you
make a large number of observations, the
results should be close to the expected value.
(There is no “Law of Small Numbers”)
Average of Dice Rolls
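The Law of Large Numbers can be illustrated by simulating rolls of a fair die, whose expected value is 3.5. This is a sketch; the function name average_roll, the seed, and the roll counts are choices made for illustration.

```python
import random

def average_roll(n_rolls, seed=0):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

# The average drifts toward the expected value 3.5 as rolls accumulate.
for n in (5, 50, 5_000, 200_000):
    print(n, average_roll(n))
```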
Example
Let’s think about a particular problem.
Suppose we are having an election between
Alegre and Cartes and we want to know how
many people in the population plan to vote for
Alegre.
How many people do we need to ask?
Non-Random Samples
The first thing we should realize is that it’s not
going to do us any good to ask a non-random
group of people.
Suppose everyone who goes to ILoveAlegre.com
is voting for Alegre. If I ask them, it will seem like
100% of the population will vote for Alegre,
even if only 3% will really vote for him.
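The ILoveAlegre.com problem can be sketched in a few lines. The electorate below is hypothetical: 100,000 voters, 3% for Alegre, and, as a simplification, everyone else for Cartes.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical electorate: 3% plan to vote for Alegre.
population = ["Alegre"] * 3_000 + ["Cartes"] * 97_000

def alegre_share(respondents):
    """Share of respondents who plan to vote for Alegre."""
    return sum(r == "Alegre" for r in respondents) / len(respondents)

# Biased poll: only visitors to a pro-Alegre site answer.
site_visitors = [v for v in population if v == "Alegre"][:500]
# Random poll: every voter is equally likely to be asked.
random_sample = rng.sample(population, 500)

print(alegre_share(site_visitors))
print(alegre_share(random_sample))
```

Both polls ask 500 people, but only the random one lands anywhere near the true 3%.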
Internet Polls
(Important Critical Thinking Lesson:
Internet polls are not trustworthy. They are
biased toward people who have the internet,
people who visit the site that the poll is on, and
people who care enough to vote on a useless
internet poll.)
Selection Bias
Why do internet polls exist, if they aren’t
accurate or trustworthy? Often because the
people putting up the poll do not want accurate
results.
If I put up a poll on my blog about whether it’s
wrong to deny the right of abode to helpers,
then people who agree with me (the only
people who read my blog) will vote. Then I will
have fake “evidence” that I’m right.
Representative Samples
The opposite of a biased sample is a
representative sample.
A perfectly representative sample is one where
if n% of the population is X, then n% of the
sample is X, for every X.
For example, if 10% of the population smokes,
10% of the sample smokes.
Random Sampling
One way to get a representative sample is to
randomly select people from the population, so
that each has a fair and equal chance of ending up
in the sample.
For example, when we randomize our experiments,
we randomly sample the participants to obtain our
experimental group. (Ideally our participants are
randomly sampled from the population at large.)
Problems with Random Sampling
Random sampling isn’t a cure-all, however.
For example, if I randomly select 10 people from
a (Western) country, on average I’ll get 5 men
and 5 women. On average.
But, on any particular occasion, I might select
(randomly) 7 men and 3 women, or 4 men and 6
women.
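How often does a random sample of 10 miss the even split? A sketch using the binomial formula, assuming each pick is independently a man with probability 0.5 (a large population that is half men):

```python
from math import comb

def pr_men(k, n=10, p=0.5):
    """Binomial probability of exactly k men among n random picks."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(pr_men(5))        # exactly 5 men and 5 women
print(1 - pr_men(5))    # any other split
print(pr_men(7))        # 7 men and 3 women
```

Under these assumptions the exact 5-5 split turns up in only about a quarter of samples (252 out of 1024), so an uneven split is the more common outcome.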
Stratified Sampling
One way to fix these problems would be to
randomly sample 5 women and randomly
sample 5 men. Then I would always have an
even split between men and women, and my
men would be randomly drawn from the group
of men, while my women were randomly drawn
from the group of women.
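A minimal sketch of stratified sampling: split the population into groups, then sample randomly inside each group. The population records and the helper name stratified_sample are hypothetical.

```python
import random

def stratified_sample(population, key, per_stratum, seed=0):
    """Randomly draw per_stratum members from each group defined by key."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    strata = {}
    for person in population:
        strata.setdefault(key(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

# Hypothetical population of (id, sex) records.
people = [(i, "F" if i % 2 else "M") for i in range(1000)]
sample = stratified_sample(people, key=lambda p: p[1], per_stratum=5)
print(sample)
```

Every run gives exactly 5 women and 5 men, while who gets picked within each group is still left to chance.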
How Many People?
So to return to the question: how many people
do we need to include in a poll or an experiment
before we can infer to the population at large
(with a high degree of confidence that an effect
is in a narrow range)?
It depends on the population size!
How Many People
(Table: the number of randomly sampled
respondents one needs, in a population of N, to
get an answer with a 3%, 5%, or 10% margin of
error.)
The important thing to notice is that as the
population gets bigger and bigger, the
corresponding samples don’t get that much
bigger.
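One common way to compute such sample sizes is the normal-approximation formula with a finite population correction. This is a sketch under stated assumptions: 95% confidence (z = 1.96) and the worst-case share of 50%. It is not necessarily the method behind the lecture's own figures.

```python
import math

def required_sample_size(population, margin, z=1.96, share=0.5):
    """Approximate number of random respondents needed for a given margin."""
    # Size needed if the population were effectively infinite.
    n0 = z**2 * share * (1 - share) / margin**2
    # Finite population correction: small populations need fewer people.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

for N in (1_000, 100_000, 10_000_000):
    print(N, [required_sample_size(N, m) for m in (0.03, 0.05, 0.10)])
```

Under these assumptions the required sample levels off as N grows: it never exceeds the infinite-population size n0, however large the population gets.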