Probabilistic Inference
Reading: Chapter 13
Next time: How should we define artificial
intelligence?
Reading for next time (see Links, Reading for
Retrospective Class):
Turing paper
Mind, Brain and Behavior, John Searle
Prepare discussion points by midnight Wed night
(see end of slides)
Transition to empirical AI
Add in
Ability to infer new facts from old
Ability to generalize
Ability to learn based on past observation
Key:
Observation of the world
Best decision given what is known
Overview of Probabilistic Inference
Some terminology
Inference by enumeration
Bayesian Networks
Probability Basics
Sample space
Atomic event
Probability model
An event A
Random Variables
Random variable
Probability for a random variable
Logical Propositions and Probability
Proposition = event (set of sample points)
Given Boolean random variables A and B:
Event a = set of sample points where A(ω)=true
Event ⌐a=set of sample points where A(ω)=false
Event aΛb=points where A(ω)=true and B(ω)=true
Often the sample space is the Cartesian product of the
range of variables
Proposition=disjunction of atomic events in which it is true
(aVb) = (⌐aΛb)V(aΛ⌐b)V(aΛb)
P(aVb)= P(⌐aΛb)+P(aΛ⌐b)+P(aΛb)
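A minimal sketch of the idea that a proposition is the disjunction of the atomic events in which it is true. The four probabilities in `joint` are made-up illustrative values, not taken from the slides, and `prob` is a hypothetical helper name.

```python
# Hypothetical probabilities for the four atomic events (A, B); they sum to 1.
joint = {
    (True, True): 0.2,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.4,
}

def prob(event):
    """Sum the probabilities of the atomic events in which `event` is true."""
    return sum(p for (a, b), p in joint.items() if event(a, b))

# P(a V b) by enumerating every atomic event where the proposition holds.
p_a_or_b = prob(lambda a, b: a or b)

# The same value from the three atomic events listed above.
by_atoms = joint[(False, True)] + joint[(True, False)] + joint[(True, True)]

# The same value from P(a) + P(b) - P(a and b).
by_rule = prob(lambda a, b: a) + prob(lambda a, b: b) - prob(lambda a, b: a and b)
```

All three quantities agree up to floating-point rounding, because the atomic events are mutually exclusive and their probabilities simply add.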
Axioms of Probability
All probabilities are between 0 and 1
Necessarily true propositions have
probability 1. Necessarily false
propositions have probability 0
The probability of a disjunction is
P(aVb)=P(a)+P(b)-P(aΛb)
P(⌐a)=1-P(a)
The definitions imply that certain logically
related events must have related probabilities
P(aVb)= P(a)+P(b)-P(aΛb)
Prior Probability
Prior or unconditional probabilities of propositions
Probability distribution gives values for all
possible assignments
P(female=true)=.5 corresponds to belief prior to
arrival of any new evidence
P(color) = (color = green, color=blue, color=purple)
P(color)=<.6,.3,.1> (normalized: sums to 1)
Joint probability distribution for a set of r.v.s gives
the probability of every atomic event on those
r.v.s (i.e., every sample point)
P(color,gender) = a 3×2 matrix
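As a rough illustration, a joint distribution over color and gender can be stored as a 3×2 table, and the priors recovered by summing rows or columns. The six entries below are hypothetical; they are chosen only so that the marginals match P(color)=<.6,.3,.1> and P(female=true)=.5.

```python
colors = ["green", "blue", "purple"]
genders = ["female", "male"]

# Hypothetical joint P(color, gender): one row per color, one column per gender.
joint = [
    [0.30, 0.30],  # green
    [0.15, 0.15],  # blue
    [0.05, 0.05],  # purple
]

# Sum each row (marginalize out gender) to recover P(color).
p_color = [sum(row) for row in joint]

# Sum each column (marginalize out color) to recover P(gender).
p_gender = [sum(col) for col in zip(*joint)]

# A valid joint distribution sums to 1 over all atomic events.
total = sum(p_color)
```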
Inference by enumeration
Start with the joint distribution
Inference by enumeration
P(HasTeeth)=.06+.12+.02=.2
Inference by enumeration
P(HasTeethVColor=Green)=.06+.12+.02+.24=.44
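The two sums above can be reproduced with a short sketch of inference by enumeration. The joint table itself is not in this transcript, so parts of it are assumptions: the values .06, .12, .02 and .24 come from the slides, but which colors .12 and .02 belong to, and the two remaining entries, are guesses chosen so the table sums to 1.

```python
# Joint P(HasTeeth, Color), keyed by (has_teeth, color).
joint = {
    (True, "green"): 0.06,
    (True, "blue"): 0.12,     # color assignment is an assumption
    (True, "purple"): 0.02,   # color assignment is an assumption
    (False, "green"): 0.24,
    (False, "blue"): 0.48,    # assumed value
    (False, "purple"): 0.08,  # assumed value
}

def prob(event):
    """Sum the atomic events (joint entries) in which `event` is true."""
    return sum(p for (teeth, color), p in joint.items() if event(teeth, color))

# Marginal: add every entry where HasTeeth is true.
p_has_teeth = prob(lambda teeth, color: teeth)

# Disjunction: add every entry where HasTeeth is true or the color is green.
p_teeth_or_green = prob(lambda teeth, color: teeth or color == "green")
```

Any proposition over these variables can be answered the same way, by passing a different predicate to `prob`.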
Conditional Probability
Conditional or posterior probabilities
E.g., P(PlayerWins|HostOpenDoor=1 and
PlayerPickDoor2 and Door1=goat) = .5
If we know more (e.g., HostOpenDoor=3 and
Door3=goat), the posterior conditioned on all of the evidence becomes:
P(PlayerWins | all evidence) = 1
Note: the less specific belief remains valid after
more evidence arrives, but is not always useful
New evidence may be irrelevant, allowing
simplification:
P(PlayerWins|Californiaearthquake)=P(PlayerWins)=.3
Conditional Probability
A general version holds for joint distributions:
P(PlayerWins,HostOpensDoor1) = P(PlayerWins|HostOpensDoor1) * P(HostOpensDoor1)
Inference by enumeration
Compute conditional probabilities:
P(⌐Hasteeth|color=green) = P(⌐HasteethΛcolor=green) / P(color=green)
= 0.24 / (0.06+0.24) = 0.8
Normalization
Denominator can be viewed as a normalization constant α
P(Hasteeth|color=green) = α P(Hasteeth,color=green)
=α[P(Hasteeth,color=green,female)+
P(Hasteeth,color=green,⌐female)]
=α[<0.03,0.12>+<0.03,0.12>]=α<0.06,0.24>
=<0.2,0.8>
Compute distribution on query
variable by fixing evidence variables
and summing over hidden variables
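The normalization step can be sketched as follows: fix the evidence (color=green), sum out the hidden variable (female), then rescale so the result sums to 1. The four entries are the ones used on the slide, reading the last one as 0.12. That reading is an assumption, but it is the value needed to reach the slide's sum of 0.24.

```python
# Joint entries with color=green fixed, keyed by (has_teeth, female).
joint_green = {
    (True, True): 0.03,    # HasTeeth, female
    (False, True): 0.12,   # no teeth, female
    (True, False): 0.03,   # HasTeeth, not female
    (False, False): 0.12,  # no teeth, not female (assumed reading)
}

# Sum out the hidden variable Female for each value of the query variable.
unnormalized = [
    sum(p for (teeth, _female), p in joint_green.items() if teeth == value)
    for value in (True, False)
]

# alpha is 1 / P(color=green), which equals one over the sum of the entries.
alpha = 1.0 / sum(unnormalized)

# Posterior distribution <P(HasTeeth | green), P(no teeth | green)>.
posterior = [alpha * p for p in unnormalized]
```

The evidence probability P(color=green) is never computed separately; it falls out as the sum of the unnormalized entries.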
Inference by enumeration
Independence
A and B are independent iff
P(A|B)=P(A) or P(B|A)=P(B) or
P(A,B)=P(A)P(B)
32 entries reduced to 12; for n
independent biased coins, 2^n -> n
Absolute independence is powerful but rare
Most domains are large, with hundreds of
variables, none of which are independent
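To make the saving concrete, the sketch below rebuilds the full joint over n independent coins from only n bias parameters. The three bias values and the function name `joint_of_independent_coins` are hypothetical.

```python
from itertools import product

def joint_of_independent_coins(biases):
    """Build the full joint over n independent coins from n bias parameters."""
    joint = {}
    for outcome in product([True, False], repeat=len(biases)):
        p = 1.0
        for heads, bias in zip(outcome, biases):
            # Independence: the joint probability is a product of per-coin terms.
            p *= bias if heads else 1.0 - bias
        joint[outcome] = p
    return joint

biases = [0.5, 0.7, 0.9]  # hypothetical P(heads) for three coins
joint = joint_of_independent_coins(biases)

n_entries = len(joint)  # 2**n entries in the explicit joint table
n_params = len(biases)  # n numbers are enough to specify it
```

With three coins the table has 8 entries but only 3 free parameters, and the gap grows exponentially with n.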
Conditional Independence
If I have length <=.2, the probability that I
am female doesn’t depend on whether or not
I have teeth:
P(female|length<=.2,hasteeth)=P(female|length<=.2)
The same independence holds if my length is >.2:
P(male|length>.2,hasteeth)=P(male|length>.2)
Gender is conditionally independent of hasteeth
given length
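A small numerical check of this conditional independence, under stated assumptions: every probability below is hypothetical, `short` stands for length<=.2, and the joint is built as P(short) * P(female|short) * P(hasteeth|short), which is exactly the factorization that conditional independence permits.

```python
from itertools import product

# Hypothetical parameters; `short` means length <= .2.
p_short = {True: 0.4, False: 0.6}
p_female_given_short = {True: 0.7, False: 0.3}
p_teeth_given_short = {True: 0.1, False: 0.5}

# Joint P(Female, HasTeeth, Short), keyed by (female, teeth, short).
joint = {}
for female, teeth, short in product([True, False], repeat=3):
    pf = p_female_given_short[short] if female else 1 - p_female_given_short[short]
    pt = p_teeth_given_short[short] if teeth else 1 - p_teeth_given_short[short]
    joint[(female, teeth, short)] = p_short[short] * pf * pt

def prob(event):
    """Sum the joint entries in which `event` is true."""
    return sum(p for key, p in joint.items() if event(*key))

def cond(event, given):
    """P(event | given), computed from the joint by enumeration."""
    return prob(lambda f, t, s: event(f, t, s) and given(f, t, s)) / prob(given)

# P(female | short, hasteeth) versus P(female | short).
with_teeth = cond(lambda f, t, s: f, lambda f, t, s: s and t)
without_teeth_info = cond(lambda f, t, s: f, lambda f, t, s: s)
```

Both values come out equal (0.7 here), and the joint is specified by 5 numbers (one for P(short), two for each conditional table) rather than the 7 free entries of an unconstrained 8-entry table.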
In most cases, the use of
conditional independence reduces
the size of the representation of the
joint distribution from exponential
in n to linear in n
Conditional independence is our
most basic and robust form of
knowledge about uncertain
environments
Next Class: Turing Paper
A discussion class
Graduate students and non-degree students:
Anyone beyond a bachelor’s:
Prepare a short statement on the paper. Can be
your reaction, your position, a place where you
disagree, an explication of a point.
Undergraduates: Be prepared with questions for
the graduate students
All: Submit your statement or your question by
midnight Wed night.
All statements and questions will be printed and
distributed in class on Wednesday.