Bayesian Probability


Processing physical evidence
• discovering, recognizing and examining it;
• collecting, recording and identifying it;
• packaging, conveying and storing it;
• exhibiting it in court;
• disposing of it when the case is closed.
Lecture: Forensic Evidence and Probability
Characteristics of evidence
• Class characteristics: features that place the item into a specific category
• Individual characteristics: features that distinguish one item from another of the same type
The arithmetic mean is the "standard" average, often simply called the "mean".
The standard deviation (SD) quantifies variability.
If the data follow a bell-shaped Gaussian
distribution, then 68% of the values lie within one
SD of the mean (on either side) and 95% of the
values lie within two SD of the mean. The SD is
expressed in the same units as your data.
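The 68% and 95% figures can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name fraction_within is an assumption made for illustration, not something from the lecture.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
# The fractions below are the same for any Gaussian distribution.
z = NormalDist(mu=0.0, sigma=1.0)


def fraction_within(k: float) -> float:
    """Fraction of a Gaussian population lying within k SDs of the mean."""
    return z.cdf(k) - z.cdf(-k)


within_one_sd = fraction_within(1)
within_two_sd = fraction_within(2)

print(f"within 1 SD of the mean: {within_one_sd:.1%}")
print(f"within 2 SD of the mean: {within_two_sd:.1%}")
```

The results are roughly 68% and 95%, matching the rule of thumb quoted above.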
1% of women at age forty who participate in routine screening have breast
cancer. 80% of women with breast cancer will get positive mammographies. 9.6%
of women without breast cancer will also get positive mammographies. A woman
in this age group had a positive mammography in a routine screening. What is the
probability that she actually has breast cancer?
STATISTICAL SOLUTION
Consider 10,000 women who take part in routine screening. Before the mammography, they can
be divided into two groups:
• Group 1: 100 women with breast cancer.
• Group 2: 9,900 women without breast cancer.
After the mammography, one gets:
• 80 women with breast cancer and a positive mammography, i.e. 80% of 100.
• 950 women without breast cancer and a positive mammography, i.e. 9.6% of 9,900
(950.4, rounded to 950).
The probability that a patient with a positive mammogram has breast cancer is:
#(breast cancer + positive mammography) / #(positive mammography)
= 80 / (80 + 950) = 7.8%
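The counting argument above can be reproduced in a few lines. This is only a sketch of the same arithmetic; the variable names are illustrative assumptions, and the group of 10,000 is the same imagined screening population used in the solution.

```python
# Figures taken from the mammography problem statement.
population = 10_000          # imagined screening group
prevalence = 0.01            # 1% of the women have breast cancer
sensitivity = 0.80           # 80% of women with cancer test positive
false_positive_rate = 0.096  # 9.6% of women without cancer test positive

with_cancer = population * prevalence           # Group 1
without_cancer = population - with_cancer       # Group 2

true_positives = with_cancer * sensitivity
false_positives = without_cancer * false_positive_rate

# Share of all positive mammographies that belong to women with cancer.
p_cancer_given_positive = true_positives / (true_positives + false_positives)

print(f"women with cancer and a positive result:    {true_positives:.0f}")
print(f"women without cancer and a positive result: {false_positives:.0f}")
print(f"P(cancer | positive) = {p_cancer_given_positive:.1%}")
```

Working with the unrounded 950.4 instead of 950 changes the result only in the second decimal place; it still rounds to 7.8%.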
BAYESIAN SOLUTION
The original proportion of patients with breast cancer is known as the prior
probability:
P(C) = 1% and P(~C) = 99%
The chance of a patient having a positive mammography given that she has cancer,
and the chance of a patient having a positive mammography given that she does
not have cancer, are known as the two conditional probabilities. Their ratio,
P(M | C) / P(M | ~C), is often termed the likelihood ratio:
P(M | C) = 80%, i.e. the probability of a positive mammogram given that she has cancer
P(M | ~C) = 9.6%, i.e. the probability of a positive mammogram given that she does not
have cancer
The final answer - the estimated probability that a patient has breast cancer given
that we know she has a positive result on her mammography - is known as the
revised probability or the posterior probability.
prior odds x likelihood ratio = posterior odds
[P(C) / P(~C)] x [P(M | C) / P(M | ~C)] = P(C | M) / P(~C | M)
(0.01 / 0.99) x (0.8 / 0.096) = 0.008 / 0.095 = 80 / 950
The estimated odds that a patient has breast cancer given that we know she has a
positive result on her mammography are 80 to 950.
The estimated probability that a patient has breast cancer given that we know she
has a positive result on her mammography is 80 / (80 + 950) = 7.8%.
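The odds calculation above can be written as two small helper functions. This is a hedged sketch: posterior_odds and odds_to_probability are names chosen for illustration, and the inputs are the figures from the mammography problem.

```python
def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' theorem: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: float) -> float:
    """Convert odds of a to b (given as the single number a / b) to a probability."""
    return odds / (1.0 + odds)


prior = 0.01 / 0.99              # P(C) / P(~C)
likelihood_ratio = 0.80 / 0.096  # P(M | C) / P(M | ~C)

odds_cancer = posterior_odds(prior, likelihood_ratio)  # P(C | M) / P(~C | M)
p_cancer = odds_to_probability(odds_cancer)

print(f"posterior odds: {odds_cancer:.4f} to 1")
print(f"posterior probability: {p_cancer:.1%}")
```

Posterior odds of about 0.084 to 1 are the same as 80 to 950, and they convert to the same 7.8% probability.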
Prior probability: P(C) / P(~C)
The probability that the suspect is or is not guilty prior to presenting this evidence.
Conditional probability: P(M | C) / P(M | ~C)
Also called the Likelihood Ratio (LR); it represents the probability that this
evidence would be present if the suspect is or is not guilty.
Posterior probability: P(C | M) / P(~C | M)
The probability that the suspect is or is not guilty given the evidence presented.
Bayesian Probability
• Problem #1
A suspect is seen fleeing the crime scene. The suspect is positively identified as being at least six feet tall
and wearing a nurse’s uniform. Exactly 5% of the male population is at least six feet tall, while 0.5% of the
female population is at least six feet tall, and 98% of all nurses are female. What are the odds that the suspect
is male?
• Problem #2
1 million people in America have HIV/AIDS. HIV tests correctly identify an HIV-infected person with a
positive result 97.7% of the time. HIV tests correctly identify a non-HIV-infected person with a negative
result 92.6% of the time. If an American gets a positive HIV test result, what are the odds that they are
infected with HIV? (Assume an American population of 260 million.)
• Problem #3
Suppose that a barrel contains many small plastic eggs. Some eggs are painted red and some are painted
blue. 40% of the eggs in the barrel contain pearls, and 60% contain nothing. 30% of eggs containing pearls
are painted blue, and 10% of eggs containing nothing are painted blue. What is the probability that a blue
egg contains a pearl?
• Problem #4
There are 100 people in a room, 20 women and 80 men. 80% of the women are blonde, while 30% of the men
are blonde. The suspect has blonde hair and is definitely one of the people in the room. What are the odds
that the suspect is female?
• Problem #5
The investigator on the case informs you that the odds that the suspect committed the crime are 2 to 1.
Your DNA fingerprint analysis of the suspect’s blood gives a 1 in a million probability that it is a random
match to the blood found at the crime scene. You also know that your lab has a 1 in 1,000 chance of a
false positive. What are the odds that the blood found at the crime scene came from your suspect?
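One way to check answers to problems of this kind is a small function that applies Bayes' theorem to a hypothesis and its complement. The sketch below is an illustration rather than an official solution set; the function name posterior is an assumption, and it is applied only to Problems #3 and #4, whose wording leaves no room for interpretation.

```python
from fractions import Fraction


def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """P(H | E) for a hypothesis H, its complement ~H, and evidence E."""
    joint_h = prior_h * p_e_given_h                # P(H and E)
    joint_not_h = (1 - prior_h) * p_e_given_not_h  # P(~H and E)
    return joint_h / (joint_h + joint_not_h)


# Problem #3: H = egg contains a pearl, E = egg is painted blue.
p_pearl_given_blue = posterior(
    Fraction(40, 100), Fraction(30, 100), Fraction(10, 100)
)

# Problem #4: H = suspect is a woman, E = suspect is blonde.
p_woman_given_blonde = posterior(
    Fraction(20, 100), Fraction(80, 100), Fraction(30, 100)
)
# Odds are the probability divided by its complement.
odds_woman = p_woman_given_blonde / (1 - p_woman_given_blonde)

print(f"Problem #3: P(pearl | blue) = {p_pearl_given_blue}")
print(f"Problem #4: odds the suspect is female = "
      f"{odds_woman.numerator} to {odds_woman.denominator}")
```

Exact fractions are used so that the results are not blurred by floating-point rounding. For Problem #4 the same answer follows from counting: 16 blonde women against 24 blonde men.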
Defender’s Fallacy:
P(S | M) = “odds of it being my client”
= 1 / [P(M | ~S) x sample population]
Prosecutor’s Fallacy:
P(S | M) = “odds of it being anyone other than the suspect”
= 1 / P(M | ~S)
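To see how far these shortcuts can stray from a Bayesian answer, the sketch below applies both formulas to concrete numbers. The prosecutor's shortcut uses the mammography figures, with "has cancer" standing in for "is the source". The match frequency of 5% and the population of 10,000 used for the defender's shortcut are hypothetical values chosen only for illustration.

```python
# Mammography figures, with C ("has cancer") playing the role of S.
prior_odds = 0.01 / 0.99  # P(C) / P(~C)
p_m_given_c = 0.80        # P(M | C)
p_m_given_not_c = 0.096   # P(M | ~C)

# Prosecutor's shortcut: 1 / P(M | ~S), ignoring the prior entirely.
prosecutor_odds = 1 / p_m_given_not_c

# Bayesian answer: prior odds x likelihood ratio.
bayes_odds = prior_odds * p_m_given_c / p_m_given_not_c

# Defender's shortcut: 1 / [P(M | ~S) x sample population].
match_frequency = 0.05      # hypothetical P(M | ~S)
sample_population = 10_000  # hypothetical number of possible sources
defender_probability = 1 / (match_frequency * sample_population)

print(f"prosecutor's shortcut: {prosecutor_odds:.1f} to 1")
print(f"Bayesian odds:         {bayes_odds:.3f} to 1")
print(f"defender's shortcut:   {defender_probability:.3f}")
```

With these inputs the prosecutor's shortcut overstates the Bayesian odds by more than a hundredfold, because it leaves out the 1% prior. The defender's shortcut treats every matching person in the population as an equally likely source, whatever other evidence exists.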
A crime has been committed, and a blood sample has been found at the
crime scene. The blood is typed as A-, a blood type found in 5% of the
population. A suspect is identified, who also happens to have the A- blood
type. In addition, a DNA profile of the suspect gives odds against a
random match of his blood to the blood found at the crime scene of
10^9 to 1.
What are the odds that this suspect was present at the crime scene?
What is the probability that this suspect was present at the crime
scene?
If the odds of a false positive for the DNA profile are one in a
thousand, what are the odds that this suspect was present at the crime
scene? What is the probability that this suspect was present at the
crime scene?
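The effect of a laboratory false positive on the strength of DNA evidence can be sketched as follows. The model rests on stated assumptions rather than anything in the lecture: the random-match odds are taken as 10^9 to 1 (an assumption about the intended figure), a true source is always reported as a match, and a non-source is reported as a match either by coincidence or by lab error. The function name effective_likelihood_ratio is hypothetical. No prior odds are given in the problem, so only the likelihood ratio is computed.

```python
def effective_likelihood_ratio(random_match_probability: float,
                               false_positive_rate: float) -> float:
    """Likelihood ratio of a reported match, allowing for lab error.

    Assumes P(reported match | suspect is the source) = 1 and that a
    non-source is reported as a match through a coincidental match, or
    through a false positive when there is no coincidental match.
    """
    p_match_given_not_source = (
        random_match_probability
        + (1 - random_match_probability) * false_positive_rate
    )
    return 1 / p_match_given_not_source


rmp = 1e-9  # assumed reading of the DNA random-match figure
fpr = 1e-3  # one in a thousand false positives

lr_without_error = effective_likelihood_ratio(rmp, 0.0)
lr_with_error = effective_likelihood_ratio(rmp, fpr)

print(f"likelihood ratio, no lab error:   {lr_without_error:.3g}")
print(f"likelihood ratio, with lab error: {lr_with_error:.3g}")
```

Under these assumptions the false positive rate dominates: the likelihood ratio falls from about a billion to about a thousand, however rare a coincidental match is.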