chap14 - Computer Science


Uncertainty


Problem with the logical approach: we do not
always know the complete truth about the
environment.

Example:

Leave(t) = leave for airport t minutes
before flight
Query: $\exists t \; \mathit{Leave}(t) \Rightarrow \mathit{ArriveOnTime}$
Problems

Why can’t we determine t exactly?

Partial observability: road state, other
drivers’ plans
Uncertainty in action outcomes: flat tire
Immense complexity of modelling and
predicting traffic
Problems

Three specific issues:

Laziness: too much work to list all
antecedents or consequents
Theoretical ignorance: not enough
information on how the world works
Practical ignorance: even if we know all the
“physics”, we may not have all the facts
What happens with a purely
logical approach?

Either it risks falsehood:

“Leave(45) will get me there on time”

Or it leads to conclusions too weak to do
anything with:

“Leave(45) will get me there on time if there’s no
snow and there’s no train crossing Route 19 and
my tires remain intact and...”

Leave(1440) might work fine, but then I’d
have to spend the night in the airport
Solution: Probability

Given the available evidence, Leave(35) will
get me there on time with probability 0.04
Probability addresses uncertainty, not degree
of truth

Degree of truth is handled by fuzzy logic:
IsSnowing is true to degree 0.2

Probabilities summarize the effects of laziness
and ignorance
We will use a combination of probabilities and
utilities to make decisions
Subjective or Bayesian
probability

We make probability estimates based on
knowledge about the world

These are not assertions about the world,
but assessments of probability if the world
were a certain way

Probabilities change with new information:

P(Leave(45) | No Snow) = 0.55
P(Leave(45) | No Snow, 5 AM) = 0.75

Analogous to entailment, not truth
Making decisions under
uncertainty

Suppose I believe the following:





P(Leave(35) gets me there on time | ...) = 0.04
P(Leave(45) gets me there on time | ...) = 0.55
P(Leave(60) gets me there on time | ...) = 0.95
P(Leave(1440) gets me there on time | ...) = 0.9999
Which action do I choose?



Depends on my preferences for missing the
flight vs. eating in the airport, etc.
Utility theory is used to represent preferences
Decision theory combines utilities and
probabilities
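
To make that choice concrete, here is a minimal sketch of the
decision-theoretic computation in Python. The success probabilities
are from the slide; the utility numbers and waiting cost are
hypothetical, invented only for illustration.

```python
# Expected-utility choice among departure times.
# Probabilities from the slide; utilities are assumed values.
p_on_time = {35: 0.04, 45: 0.55, 60: 0.95, 1440: 0.9999}

U_ON_TIME = 100    # utility of making the flight (hypothetical)
U_MISSED = -500    # utility of missing it (hypothetical)
WAIT_COST = -0.2   # utility per minute of waiting (hypothetical)

def expected_utility(t):
    p = p_on_time[t]
    return p * U_ON_TIME + (1 - p) * U_MISSED + WAIT_COST * t

for t in sorted(p_on_time):
    print(f"Leave({t}): EU = {expected_utility(t):.1f}")
print("Best:", max(p_on_time, key=expected_utility))  # Leave(60)
```

With these utilities Leave(60) wins: Leave(1440) almost guarantees
arrival but pays a large waiting penalty, which is exactly the
trade-off decision theory is meant to capture.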
Axioms of Probability

For any propositions A and B:
$0 \le P(A) \le 1$
$P(\mathit{True}) = 1, \; P(\mathit{False}) = 0$
$P(A \lor B) = P(A) + P(B) - P(A \land B)$

Example:


A = computer science major
B = born in Minnesota
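
As a quick numeric check of the third axiom, here is a minimal
sketch using a toy joint distribution for this example; the four
numbers are invented for illustration.

```python
# Toy joint distribution: A = CS major, B = born in Minnesota.
# Hypothetical values; they sum to 1.
joint = {
    (True, True): 0.01, (True, False): 0.04,
    (False, True): 0.09, (False, False): 0.86,
}

p_a = sum(p for (a, _), p in joint.items() if a)            # P(A) = 0.05
p_b = sum(p for (_, b), p in joint.items() if b)            # P(B) = 0.10
p_a_and_b = joint[(True, True)]                             # P(A and B)
p_a_or_b = sum(p for (a, b), p in joint.items() if a or b)  # P(A or B)

# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)
assert abs(p_a_or_b - (p_a + p_b - p_a_and_b)) < 1e-12
```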
Notation and Concepts

Unconditional probability or prior
probability:

P(Cavity) = 0.1
P(Weather = Sunny) = 0.55

These correspond to beliefs prior to the
arrival of any new evidence

Weather is a multivalued random variable

Could be one of <Sunny, Rain, Cloudy, Snow>
P(Cavity) is shorthand for P(Cavity = true)
Probability Distributions

A Probability Distribution gives probability
values for all possible values of a random
variable

P(Weather) = <0.55, 0.05, 0.2, 0.2>
Must be normalized: the values sum to 1

A Joint Probability Distribution gives
probability values for all combinations of
random variables

P(Weather, Cavity) = 4 x 2 matrix of values
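
A minimal sketch of these two objects as NumPy arrays. The
P(Weather) values are from the slide; the joint entries are
hypothetical, chosen so the weather marginal comes out right.

```python
import numpy as np

# P(Weather) over <Sunny, Rain, Cloudy, Snow>, normalized to 1.
p_weather = np.array([0.55, 0.05, 0.2, 0.2])
assert np.isclose(p_weather.sum(), 1.0)

# P(Weather, Cavity): 4 x 2 matrix, rows = weather values,
# columns = Cavity in (true, false). Hypothetical entries.
p_weather_cavity = np.array([
    [0.055, 0.495],   # Sunny
    [0.005, 0.045],   # Rain
    [0.020, 0.180],   # Cloudy
    [0.020, 0.180],   # Snow
])
assert np.isclose(p_weather_cavity.sum(), 1.0)

# Summing out Cavity recovers the marginal P(Weather).
assert np.allclose(p_weather_cavity.sum(axis=1), p_weather)
```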
Posterior Probabilities

Conditional or Posterior probability:


P(Cavity | Toothache) = 0.8
For conditional distributions:

[Table: P(Weather | Earthquake), one column of
weather probabilities for Earthquake = false and
one for Earthquake = true]
Posterior Probabilities

More evidence does not invalidate previous
knowledge, but may render old estimates
unnecessary:

P(Cavity | Toothache, Cavity) = 1

New evidence may also be irrelevant:

P(Cavity | Toothache, Schiller in Mexico) = 0.8
Definition of Conditional
Probability

Two ways to think about it
$P(A \mid B) = \frac{P(A \land B)}{P(B)}$, if $P(B) \ne 0$
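
A small numeric sketch of this definition, reusing the hypothetical
joint table from the axioms example above:

```python
# P(A | B) = P(A and B) / P(B), computed from a joint distribution.
# Hypothetical numbers: A = CS major, B = born in Minnesota.
joint = {
    (True, True): 0.01, (True, False): 0.04,
    (False, True): 0.09, (False, False): 0.86,
}

p_b = sum(p for (_, b), p in joint.items() if b)  # P(B) = 0.10
p_a_given_b = joint[(True, True)] / p_b           # 0.01 / 0.10 = 0.1
print(f"P(A | B) = {p_a_given_b:.2f}")
```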
Definition of Conditional
Probability

Another way to think about it:
$P(A \land B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$

Sanity check: why isn’t it just
$P(A \land B) = P(A)\,P(B)$?
(That would hold only if A and B were independent.)

The general version holds for probability
distributions:
$P(\mathit{Weather}, \mathit{Cavity}) = P(\mathit{Weather} \mid \mathit{Cavity})\,P(\mathit{Cavity})$

This is a 4 x 2 set of equations
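
A sketch of that 4 x 2 set of equations in NumPy; the conditional
table below is hypothetical, and P(Cavity) = 0.1 as in the earlier
slide.

```python
import numpy as np

# P(Cavity): Cavity in (true, false).
p_cavity = np.array([0.1, 0.9])

# P(Weather | Cavity): 4 x 2, each column is a distribution over
# <Sunny, Rain, Cloudy, Snow> and sums to 1. Hypothetical values.
p_weather_given_cavity = np.array([
    [0.55, 0.55],
    [0.05, 0.05],
    [0.20, 0.20],
    [0.20, 0.20],
])
assert np.allclose(p_weather_given_cavity.sum(axis=0), 1.0)

# Product rule applied entrywise: P(w, c) = P(w | c) P(c).
# Broadcasting scales each column by its P(Cavity) entry,
# computing all 8 scalar equations at once.
p_weather_cavity = p_weather_given_cavity * p_cavity
assert np.isclose(p_weather_cavity.sum(), 1.0)
```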
Bayes’ Rule

The product rule gives
$P(A \land B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$

Dividing through by P(B) gives Bayes’ Rule:
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$

Bayes’ rule is extremely useful for inferring the
probability of a diagnosis when the probability of
the symptom given the cause is known.
Bayes’ Rule example

Does my car need a new drive axle?

If a car needs a new drive axle, then with 30%
probability the car jerks around

Unconditional probabilities:

P(car jerks) = 1/1000
P(needs axle) = 1/10,000

Then:

P(jerks | needs axle) = 0.3
$P(\text{needs axle} \mid \text{jerks}) = \frac{P(\text{jerks} \mid \text{needs axle})\,P(\text{needs axle})}{P(\text{jerks})}$
$= (0.3 \times 1/10{,}000) \,/\, (1/1000) = 0.03$

Conclusion: 3 of every 100 cars that jerk need an axle
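
The same computation as a few lines of Python, with the numbers
taken straight from the slide:

```python
# Bayes' rule on the drive-axle example.
p_jerks_given_axle = 0.3   # causal: P(jerks | needs axle)
p_axle = 1 / 10_000        # prior: P(needs axle)
p_jerks = 1 / 1_000        # evidence: P(car jerks)

p_axle_given_jerks = p_jerks_given_axle * p_axle / p_jerks
print(p_axle_given_jerks)  # 0.03
```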
Not dumb question
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$

Question:


Why should I be able to provide an
estimate of P(B|A) to get P(A|B)?
Why not just estimate P(A|B) and be done
with the whole thing?
Not dumb question

Answer:

Diagnostic knowledge is often more tenuous
than causal knowledge
Suppose drive axles start to go bad in an
“epidemic”

e.g. poor construction in a major drive axle
brand two years ago is now haunting us

P(needs axle) goes way up, and is easy to
measure
P(needs axle | jerks) should (and does) go up
accordingly – but how do we estimate it?
P(jerks | needs axle) is based on causal
information, and doesn’t change
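
A sketch of this point in Python: hold the causal likelihoods fixed,
raise the prior, and the posterior follows. The tenfold “epidemic”
prior below is a hypothetical number.

```python
# Causal likelihood: stable knowledge that does not change.
p_jerks_given_axle = 0.3

# Back out P(jerks | no axle) from the pre-epidemic numbers, using
# P(jerks) = P(jerks|axle)P(axle) + P(jerks|no axle)P(no axle).
p_axle_old, p_jerks_old = 1 / 10_000, 1 / 1_000
p_jerks_given_no_axle = (
    (p_jerks_old - p_jerks_given_axle * p_axle_old) / (1 - p_axle_old)
)

def posterior(p_axle):
    """P(needs axle | jerks) by Bayes' rule, recomputing P(jerks)."""
    p_jerks = (p_jerks_given_axle * p_axle
               + p_jerks_given_no_axle * (1 - p_axle))
    return p_jerks_given_axle * p_axle / p_jerks

print(posterior(1 / 10_000))  # 0.03, before the epidemic
print(posterior(1 / 1_000))   # ~0.24: prior up tenfold, posterior follows
```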