Probability and Risk


LSP 121
Introduction to Probability
and Risk
A Question
• With terrorism, homicides, and traffic
accidents, is it safer to stay home and
take a college course online rather than
head downtown to class?
• We’ll come back to this later
Number of Possible Outcomes
• Suppose there are M possible outcomes for one
process and N possible outcomes for a second
process. The total number of possible outcomes
for the two processes combined is M x N.
• How many possible outcomes are possible when
you roll two dice?
– 6*6
• How many possible outcomes for having three
children?
– 2*2*2
Possible Outcomes Continued
• A restaurant menu offers two choices for an
appetizer, five choices for a main course, and
three choices for a dessert. How many different
outcomes (i.e., how many different three-course
meals)?
– 2*5*3
• A college offers 12 natural science classes, 15
social science classes, 10 English classes, and 8
fine arts classes. How many choices?
– 12*15*10*8 = 14,400
Possible Outcomes Continued
• A license plate has 7 digits, each digit being 0-9. How many possible outcomes?
• What if the license plate allows digits 0-9 and letters
A-Z?
• How many area codes in the US?
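A minimal Python sketch of the M x N counting rule from these slides. The helper name count_outcomes is made up for illustration, and the license plate lines assume every one of the 7 positions can independently be any allowed symbol.

```python
from math import prod

def count_outcomes(*choices_per_step):
    """Multiply the number of choices at each step (the M x N rule)."""
    return prod(choices_per_step)

two_dice = count_outcomes(6, 6)            # 36
three_children = count_outcomes(2, 2, 2)   # 8
meals = count_outcomes(2, 5, 3)            # 30
courses = count_outcomes(12, 15, 10, 8)    # 14400

# Assumption: 7 positions, each filled independently.
digit_plates = count_outcomes(*[10] * 7)   # digits 0-9 only: 10,000,000
alnum_plates = count_outcomes(*[36] * 7)   # digits 0-9 plus letters A-Z: 36**7

print(two_dice, three_children, meals, courses, digit_plates, alnum_plates)
```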
Three Types of Probability
1. Theoretical (aka “a priori”) probability – based on a model
in which all outcomes are equally likely.
• Probability of a die landing on a 2 = 1/6
• Probability of a coin coming up tails = 1/2
2. Empirical probability – based on the results of
observations or experiments.
• If it rains an average of 100 days a year, we might say the probability of rain
on any one day is 100/365.
3. Subjective (personal) probability – based on personal judgment or
intuition.
• If you go to college today, you will be more successful in the future.
• The Blackhawks have a 45% chance of winning the cup again next year.
Theoretical Probability
• P(A) = (number of ways A can occur) / (total number of outcomes)
– Denominator = # of outcomes discussed in previous slides
• Probability of a coin toss landing heads?
– Numerator: head can occur 1 way
– Denominator: 2 possible outcomes
• = 1/2
• Probability of rolling a 7 using two dice?
– Num: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1) (6 ways)
– Denom: 36 outcomes
• = 1/6
• Probability that a family with three children will have two boys and one girl?
– Num: 3 possible ways of having 2 boys and 1 girl (BBG, BGB, GBB)
– Denom: 8 possible outcomes (BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG)
• = 3/8
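The counting in these examples can be checked by listing every outcome. This is only a sketch: theoretical_probability is an illustrative name, and it assumes all outcomes are equally likely (fair dice, boys and girls equally likely).

```python
from fractions import Fraction
from itertools import product

def theoretical_probability(outcomes, is_event):
    """P(A) = (number of ways A can occur) / (total number of outcomes)."""
    outcomes = list(outcomes)
    favorable = sum(1 for outcome in outcomes if is_event(outcome))
    return Fraction(favorable, len(outcomes))

# Two dice: 36 equally likely (first die, second die) pairs.
dice = product(range(1, 7), repeat=2)
p_seven = theoretical_probability(dice, lambda roll: sum(roll) == 7)

# Three children: 8 equally likely birth orders such as ('B', 'B', 'G').
children = product("BG", repeat=3)
p_two_boys = theoretical_probability(children, lambda kids: kids.count("B") == 2)

print(p_seven, p_two_boys)  # 1/6 3/8
```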
Empirical Probability
• Recall: Empirical probability is a probability based
on observations or experiments
• Example: Records indicate that a river has crested
above flood level just four times in the past 2000
years. What is the empirical probability that the
river will crest above flood level next year?
– 4 times in previous 2000 years = 1 time every 500
years
– So, probability = 1/500 = 0.002
Probability of an Event Not Occurring
• P(not A) = 1 - P(A)
• Seems simple, but turns out to be very useful,
so don’t forget this rule
• If the probability of rolling a 7 with two dice is
6/36, then the probability of not rolling a 7
with two dice is 30/36
Combining Probabilities
(for Independent Events)
• Two events are independent if the outcome of
one does not affect the outcome of the next
– We will contrast this with combining probabilities
for events that are not independent
• For independent events, the probability of A
and B occurring together, P(A and B), = P(A) x
P(B)
Combining Probabilities
(for Independent Events)
• For example, suppose you toss three coins. What is the
probability of getting three tails?
– (1/2) x (1/2) x (1/2) = 1/8
• Find the probability that a 100-year flood will strike a
city in two consecutive years
– (1 in 100) x (1 in 100) = 0.01 x 0.01 = 0.0001
• What is the probability of drawing the ace of diamonds
and then the ace of clubs from the deck?
– 1/52 x 1/52? No: These events are not independent. Once the first
card is drawn, only 51 cards remain, so the probability is 1/52 x 1/51.
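A small Python sketch of the multiplication rule for these examples. The name prob_all is illustrative. The card line assumes the first card is not put back in the deck, which is why the second draw is out of 51 cards.

```python
from fractions import Fraction

def prob_all(*probabilities):
    """P(A and B and ...) for independent events: multiply the probabilities."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result

half = Fraction(1, 2)
three_tails = prob_all(half, half, half)                      # 1/8
two_floods = prob_all(Fraction(1, 100), Fraction(1, 100))     # 1/10000 = 0.0001

# Not independent: after the ace of diamonds is drawn (and kept out),
# the ace of clubs is 1 of the 51 remaining cards.
two_aces = Fraction(1, 52) * Fraction(1, 51)                  # 1/2652

print(three_tails, two_floods, two_aces)
```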
Combining Probabilities
(for Independent Events)
• You are playing craps in Vegas. You have had a
string of bad luck. But you figure since your
luck has been so bad, it has to balance out
and turn good
• Bad assumption! Each event is independent
of the others and has nothing to do with the
previous run. This is especially true in the short run (as
we will see in a few slides)
• This is called Gambler’s Fallacy
• Is this the same for playing Blackjack?
“OR” Probabilities
(for Non-Overlapping Events)
• If you ask what is the probability of either this
happening or that happening, and the two
events don’t overlap:
P(A or B) = P(A) + P(B)
• Suppose you roll a single die. What is the
probability of rolling either a 2 or a 3?
P(roll 2 or 3) = P(2) + P(3) = 1/6 + 1/6 = 2/6
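As a quick check of the addition rule, the sketch below counts die faces directly and compares the result with P(2) + P(3). It assumes a fair six-sided die; prob is an illustrative helper name.

```python
from fractions import Fraction

FACES = range(1, 7)  # a fair six-sided die (assumption)

def prob(event):
    """P(event) = favorable faces / 6, by direct counting."""
    favorable = [face for face in FACES if event(face)]
    return Fraction(len(favorable), len(FACES))

p_two = prob(lambda face: face == 2)
p_three = prob(lambda face: face == 3)
p_two_or_three = prob(lambda face: face in (2, 3))

# For non-overlapping events, counting directly agrees with adding.
print(p_two + p_three, p_two_or_three)  # 1/3 1/3
```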
Probability of At Least Once
• What is the probability of something
happening at least once?
• P(at least one event ‘A’ in ‘n’ trials)
= 1 - [P(not A in one trial)]^n
Example
• What is the probability that a region will experience at
least one 100-year flood during the next 20 years?
• Recall: P(at least one event ‘A’ in ‘n’ trials)
= 1 - [P(not A in one trial)]^n
• Probability of a flood in one “trial” (i.e. one year) is
1/100. So, the probability of no flood is 99/100.
• P(at least one flood in 20 years)
= 1 - [P(no flood in one year)]^20
= 1 - (0.99)^20
= 1 - 0.818
= 0.18
Another Example
• You purchase 10 lottery tickets, for which the probability
of winning some prize on a single ticket is 1 in 10. What
is the probability that you will have at least one winning
ticket?
• P(at least one event ‘A’ in ‘n’ trials)
= 1 - [P(not A in one trial)]^n
• P(at least one winner in 10 tickets)
= 1 - (1 - 0.1)^10
= 1 - 0.9^10
= 0.65
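A minimal Python sketch of the at-least-once rule. It assumes the trials are independent and that the probability is the same in every trial; the function name at_least_once is illustrative.

```python
def at_least_once(p_single, n_trials):
    """P(at least one A in n trials) = 1 - [P(not A in one trial)]^n."""
    p_not = 1 - p_single
    return 1 - p_not ** n_trials

flood = at_least_once(1 / 100, 20)    # 100-year flood over 20 years: about 0.182
lottery = at_least_once(1 / 10, 10)   # 10 tickets, each 1 in 10: about 0.651

print(round(flood, 3), round(lottery, 3))
```

Working it through in code is a handy way to catch slips with the exponent.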
Law of Large Numbers
• Law of large numbers: the observed fraction of outcomes is only expected
to match the probability over a large number of trials.
• The probability of tossing a coin and landing tails is 0.5. But
what if you toss it 5 times and you get HHHHH?
– Could this happen? Of course! (The chance is (1/2)^5 = 1/32.)
– But if you flipped 500 times, do you think you’ll get 500 heads?
Almost impossible
• The “law of large numbers” tells you that if you toss the
coin many times, you should get approximately 50% tails.
– The more times you flip the coin, the closer the fraction of tails
is likely to be to 50%
– The fewer times you flip the coin, the further the fraction of tails
can stray from 50%
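One way to see the law of large numbers is to simulate coin flips. The sketch below uses Python's built-in random module with a fixed seed (an arbitrary choice, so the run is repeatable); tails_fraction is an illustrative name.

```python
import random

def tails_fraction(n_flips, seed=0):
    """Flip a simulated fair coin n_flips times; return the fraction of tails."""
    rng = random.Random(seed)
    tails = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return tails / n_flips

# Small runs can land far from 0.5; large runs tend to settle near it.
for n in (5, 50, 500, 50_000):
    print(n, tails_fraction(n))

# Chance of 500 heads in a row with a fair coin: vanishingly small.
print(0.5 ** 500)
```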
Expected Value
• Furthermore, what if you have multiple
related events – each of which has its own
probability? What is the expected value from
the set of all possible events?
• We call this the ‘expected value’
• Expected value = (event 1 value x prob of
event 1) + (event 2 value x prob of event 2) +
(event 3 value * prob event 3) + etc …
Expected Value - Example
• Suppose that $1 lottery tickets have the
following probabilities and values:
– 1 in 5 win a free $1 ticket
– 1 in 100 win $5
– 1 in 100,000 to win $1000
– 1 in 10 million to win $1 million
• What is the expected value of a lottery ticket?
Expected Value - Solution
– Ticket purchase: value -$1, prob 1
– Win free ticket: value $1, prob 1/5
– Win $5: value $5, prob 1/100
– Win $1000: value $1000, prob 1/100,000
– Win $1 million: value $1,000,000, prob 1/10,000,000
• Expected value = (-1*1) + (1*1/5) + (5*1/100) +
(1000*1/100000) + (1000000*1/10000000)
• = - $0.64
• That is, every ticket costs you an average of 64 cents
– A positive value refers to money we gain
– A negative value comes from money we spend. (e.g. The minus
1 in the very first term comes from the fact that a ticket costs
us 1 dollar)
– Don’t forget that negative number!!!
Solution Continued
• Now sum all the products:
-$1 + 0.20 + 0.05 + 0.01 + 0.10 =
-$0.64
So, averaged over many tickets, you should expect to
lose $0.64 for each lottery ticket that you
buy. If you buy, say, 1000 tickets, you will win with
some of them and you will lose with some of them.
However, over 1000 tickets, you should expect to lose
about $640.
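The lottery calculation can be sketched in Python as a sum of value x probability pairs. Exact fractions are used to avoid rounding; expected_value is an illustrative name, and the ticket price is entered as a certain loss of $1.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Sum of (value x probability) over all (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)

lottery = [
    (-1, Fraction(1)),                         # ticket purchase (money spent)
    (1, Fraction(1, 5)),                       # free $1 ticket
    (5, Fraction(1, 100)),                     # $5 prize
    (1000, Fraction(1, 100_000)),              # $1000 prize
    (1_000_000, Fraction(1, 10_000_000)),      # $1 million prize
]

per_ticket = expected_value(lottery)
print(float(per_ticket))          # -0.64
print(float(per_ticket * 1000))   # -640.0
```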
Expected Value - Another Example
• Suppose an insurance company sells policies
for $500 each.
• The company knows that about 10% of customers will
submit a claim that year and that claims
average $1500 each.
• Does the company make or lose money on
average? How much?
Another Example – Expected Value
• Company makes $500 100% of the time (when a
policy is sold)
• Company loses $1500 10% of the time
• ($500 x 1.0) + (-$1500 x 0.1) = $500 – $150 = $350
• Company gains an average of $350 from each customer
• The company needs to have a lot of customers to
ensure this works (Law of large numbers)
– Recall that the law of large numbers tells us that
results can be counted on to approach their expected
values only when averaged over many trials.
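A rough simulation of why the insurer needs many customers. This is a sketch under simplifying assumptions that are not in the slides: every claim costs exactly the $1500 average, customers are independent, and the random seed is arbitrary. The names are illustrative.

```python
import random

PREMIUM = 500       # paid by every customer
CLAIM_COST = 1500   # assumption: every claim costs exactly the average
CLAIM_RATE = 0.10   # about 10% of customers submit a claim

# Expected value per customer: 500 - 150 = 350
expected_profit = PREMIUM * 1.0 - CLAIM_COST * CLAIM_RATE

def average_profit(n_customers, seed=1):
    """Simulated profit per customer for a book of n_customers policies."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_customers):
        total += PREMIUM
        if rng.random() < CLAIM_RATE:
            total -= CLAIM_COST
    return total / n_customers

# With few customers the average can be far from 350; with many it settles.
for n in (10, 1_000, 100_000):
    print(n, average_profit(n))
```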
Part II - RISK
• Uses probabilities…
Do You Take Risks?
• Are you safer in a small car or a sport utility
vehicle?
• Are cars today safer than those 30 years ago?
• If you need to travel across country, are you
safer flying or driving?
The Risk of Driving
• In 1966, there were 51,000 deaths related to
driving, and people drove 9 x 10^11 miles
• In 2000, there were 42,000 deaths related to
driving, and people drove 2.75 x 10^12 miles
• Was driving safer in 2000?
The Risk of Driving
• 51,000 deaths / 9 x 10^11 miles = 5.7 x 10^-8
deaths per mile
• 42,000 deaths / 2.75 x 10^12 miles = 1.5 x 10^-8
deaths per mile
• Driving has gotten safer! Why?
Driving vs. Flying
• Over the last 20 years, airline travel has averaged 100
deaths per year
• Airlines have averaged 7 billion (7 x 10^9) miles in the air
• 100 deaths / 7 x 10^9 miles = 1.4 x 10^-8 deaths per mile
• How does this compare to driving (1.5 x 10^-8 deaths per
mile)?
• Is it fair to compare miles driven to miles flown?
– Might be more accurate to compare deaths per trip
– Key point: Even when you come up with a nice statistical
number, it is no substitute for thinking. This is where many
people (even those who should know better) drop the ball.
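The per-mile rates on these slides can be reproduced with a few lines of Python. The figures are the ones quoted in the slides; deaths_per_mile is an illustrative helper name.

```python
def deaths_per_mile(deaths, miles):
    """Risk rate: deaths divided by miles traveled."""
    return deaths / miles

driving_1966 = deaths_per_mile(51_000, 9e11)
driving_2000 = deaths_per_mile(42_000, 2.75e12)
flying = deaths_per_mile(100, 7e9)

for label, rate in [("driving 1966", driving_1966),
                    ("driving 2000", driving_2000),
                    ("flying", flying)]:
    print(f"{label}: {rate:.1e} deaths per mile")
```

Swapping the denominator (trips instead of miles, for example) would change the comparison, which is the point of the last bullet.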
The Certainty Effect
• Suppose you are buying a new car. For an
additional $200 you can add a device that will
reduce your chances of death in a highway
accident from 50% to 45%. Interested?
• What if the salesman told you it could reduce
your chances of death from 5% to 0%.
Interested now? Why?
The Certainty Effect
• Suppose you can purchase an extended
warranty plan which covers 33% of the items
completely but remaining items not at all
• Or you can purchase an extended warranty
plan which covers all items at 33% coverage
• Which would you choose?
The Availability Heuristic
• Which do you think caused more deaths in the
US in 2000, homicide or diabetes?
• Homicide: 6.0 deaths per 100,000
• Diabetes: 24.6 deaths per 100,000
Which Has More Risk?
• Which is safer – staying home for the day or going to
school/work?
• In 2003, one in 37 people was disabled for a day or more by
an injury at home – more than in the workplace and car
crashes combined
– Shave with razor – 33,532 injuries
– Hot water – 42,077 injuries
– Slice a grapefruit with a knife – 441,250 injuries
Which Has More Risk?
• What if you run down two flights of stairs to
fetch the morning paper?
• 28% of the 30,000 accidental home deaths
each year are caused by falls (poisoning and
fires are the other top killers)
Which Has More Risk?
• Ratio of people killed every year by lightning
strikes versus number of people killed in shark
attacks: 4000:1
• Average number of people killed worldwide
each year by sharks: 6
• Average number of Americans who die every
year from the flu: 36,000
What Should We Do?
• Hide in a cave?
• Know the data – be aware!
• Now, let’s start our first med school lecture
Tumors and Cancer
• Welcome to the DePaul School of Medicine!
• Most people associate tumors with cancers,
but not all tumors are cancerous
• Tumors caused by cancer are referred to as
malignant
• Non-cancerous tumors are referred to as
benign
Tumors and Cancer
• We can calculate the chances of getting a
tumor and/or cancer. Our probability data is
based on empirical research studies.
• If you don’t know how to calculate simple
probabilities, you will misinform your patient
and cause undue stress
Mammograms
• Suppose your patient has a breast tumor.
Is it cancerous?
– Probably not
– Studies have shown that only about 1 in 100
breast tumors turn out to be malignant
– Nonetheless, you order a mammogram
– Suppose the mammogram comes back
positive. Now does the patient have cancer?
Accuracy
• Key Question: What is meant by a “positive”
test?
• Earlier mammogram screening was 85%
accurate
– This might lead you to think that if you tested
positive, there is a pretty good chance that you
have cancer.
• But this is not true!
Actual Results
• Consider a study in which mammograms are
given to 10,000 women with breast tumors
• Assume that 1% (1 in 100) of the tumors are
malignant (100 women actually have cancer,
9900 have benign tumors)
Actual Results
                     Tumor is Malignant   Tumor is Benign   Totals
Positive Mammogram
Negative Mammogram
Total                100                  9900              10,000
Malignant tumors are 1/100th of the total 10,000.
Actual Results
• Mammogram screening correctly identifies
85% of the 100 malignant tumors as
malignant
• These are called true positives
• The other 15% had negative results even
though they actually have cancer
• These are called false negatives
• There is a corresponding number of false
positives and true negatives
Possible Results from a Medical Test
• True Positive: patient has the disease, and test is
positive
• False Positive: patient doesn’t have the disease,
and test is positive
• True Negative: patient doesn’t have the disease,
and test is negative
• False Negative: patient has the disease, and test
is negative
• Pop-Quiz: Of these four, which is the most
dangerous possibility?
Mammogram Results
                     Tumor is Malignant   Tumor is Benign   Totals
Positive Mammogram   85 True Positives    ???
Negative Mammogram   15 False Negatives   ???
Total                100                  9900              10,000
Which labels (TP, FP, TN, FN) would be applied under the benign category?
Actual Results
• Mammogram screening correctly identifies 85% of the
9900 benign tumors (8415) as benign: these are the “true negatives”
• The other 15% of the 9900 (1485) get positive results in
which the mammogram incorrectly suggests their
tumors are malignant. These are called false positives.
• Key point: One of the most common false assumptions
made by non-medical folks, is to assume that tests are
always accurate.
– They are not
– A good test has close to 100% accuracy, but very few
actually do
Actual Results
                     Tumor is Malignant   Tumor is Benign        Totals
Positive Mammogram   85 True Positives    1485 False Positives
Negative Mammogram   15 False Negatives   8415 True Negatives
Total                100                  9900                   10,000
This is what a mammogram would ideally show: True Positives and True
Negatives. Unfortunately, all tests have some error in them.
Actual Results
                     Tumor is Malignant   Tumor is Benign        Totals
Positive Mammogram   85 True Positives    1485 False Positives   1570
Negative Mammogram   15 False Negatives   8415 True Negatives    8430
Total                100                  9900                   10,000
Now compute the row totals.
Results
• Overall, the mammogram screening gives
positive results to 85 women who actually
have cancer and to 1485 women who do not
have cancer
• The total number of positive results is 1570
• Only 85 of these are true positives,
a fraction of 85/1570, or 0.054
• In other words, the chance that a positive
result really means cancer is only 5.4% !!
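The whole table can be rebuilt from three numbers: 10,000 patients, 1% malignant, 85% accuracy. The Python sketch below does that and then computes the chance that a positive result really means cancer. The function name screening_table is illustrative, and it assumes the same 85% accuracy applies to malignant and benign tumors, as in the slides.

```python
def screening_table(n_patients, prevalence, accuracy):
    """Return the four cell counts of the test-result table."""
    malignant = round(n_patients * prevalence)
    benign = n_patients - malignant
    true_pos = round(malignant * accuracy)   # malignant, test positive
    false_neg = malignant - true_pos         # malignant, test negative
    true_neg = round(benign * accuracy)      # benign, test negative
    false_pos = benign - true_neg            # benign, test positive
    return {"TP": true_pos, "FP": false_pos, "FN": false_neg, "TN": true_neg}

table = screening_table(10_000, 0.01, 0.85)
all_positives = table["TP"] + table["FP"]
p_cancer_given_positive = table["TP"] / all_positives

print(table)                              # TP 85, FP 1485, FN 15, TN 8415
print(round(p_cancer_given_positive, 3))  # 0.054
```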
Teach your student doctor:
• When your patient’s mammogram comes back
positive, you should reassure her that there’s
still only a small chance that she has cancer
– Although further tests are probably necessary
Another Question
• Suppose you are a doctor seeing a patient
with a breast tumor. Her mammogram comes
back negative. Based on the numbers above,
what is the chance that she has cancer?
– In other words, what is the probability that this is
a false negative?
• Scary – but it happens! However, it is quite rare.
Actual Results
                     Tumor is Malignant   Tumor is Benign        Totals
Positive Mammogram   85 True Positives    1485 False Positives   1570
Negative Mammogram   15 False Negatives   8415 True Negatives    8430
Total                100                  9900                   10,000
15/8430, or 0.0018, or slightly less than 2 in 1000.
This is a dangerous position. Now what do you do?
Answer: Go to medical school to find out!
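For reference, the false-negative calculation can be sketched in Python using the counts from the Negative Mammogram row of the table. The variable names are illustrative.

```python
from fractions import Fraction

# Counts from the "Negative Mammogram" row of the table.
false_negatives = 15     # malignant tumor, negative test
true_negatives = 8415    # benign tumor, negative test
negative_results = false_negatives + true_negatives   # 8430

# Chance that a patient with a negative mammogram actually has cancer.
p_cancer_given_negative = Fraction(false_negatives, negative_results)

print(p_cancer_given_negative)                    # 1/562
print(round(float(p_cancer_given_negative), 4))   # 0.0018
```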