
Chapter 5
STOCHASTIC METHODS
Contents
• The Elements of Counting
• Elements of Probability Theory
• Applications of the Stochastic Methodology
• Bayes’ Theorem
Application Areas
Diagnostic Reasoning. In medical diagnosis, for example, there
is not always an obvious cause/effect relationship between the set
of symptoms presented by the patient and the causes of these
symptoms. In fact, the same sets of symptoms often suggest
multiple possible causes.
Natural language understanding. If a computer is to
understand and use a human language, that computer must be
able to characterize how humans themselves use that language.
Words, expressions, and metaphors are learned, but also change
and evolve as they are used over time.
Planning and scheduling. When an agent forms a plan, for
example, a vacation trip by automobile, it is often the case that
no deterministic sequence of operations is guaranteed to succeed.
What happens if the car breaks down, if the car ferry is cancelled
on a specific day, if a hotel is fully booked, even though a
reservation was made?
Learning. The three previous areas mentioned for stochastic
technology can also be seen as domains for automated learning.
An important component of many stochastic systems is that they
have the ability to sample situations and learn over time.
Set Operations
Let A and B be two sets, and let U be the universe:
– Cardinality |A|: the number of elements in A
– Complement Ā: all elements in U that are not in A
– Subset: A ⊆ B
– Empty set: ∅
– Union: A ∪ B
– Intersection: A ∩ B
– Difference: A - B
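As a quick illustration (a sketch, not from the slides), these operations map directly onto Python's built-in set type; the sets below are arbitrary examples.

    # Arbitrary illustrative sets; U plays the role of the universe.
    U = {1, 2, 3, 4, 5, 6}
    A = {1, 2, 3}
    B = {3, 4}

    print(len(A))    # cardinality |A|          -> 3
    print(U - A)     # complement of A in U     -> {4, 5, 6}
    print(A <= U)    # subset test A ⊆ U        -> True
    print(A | B)     # union A ∪ B              -> {1, 2, 3, 4}
    print(A & B)     # intersection A ∩ B       -> {3}
    print(A - B)     # difference A - B         -> {1, 2}
    print(set())     # the empty set ∅          -> set()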
Addition Rules
The Addition rule for combining two sets:
The Addition rule for combining three sets:
This Addition rule may be generalized to any
finite number of sets
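Presumably the rules referred to here are the standard inclusion-exclusion identities:

    |A \cup B| = |A| + |B| - |A \cap B|
    |A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|

Subtracting the pairwise intersections corrects for elements counted twice, and adding back the triple intersection restores the elements that were then removed too many times.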
Multiplication Rules
• The Cartesian Product of two sets A and B
• The multiplication principle of counting, for
two sets
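Presumably the missing formulas are the standard ones:

    A \times B = \{ (a, b) : a \in A,\ b \in B \}
    |A \times B| = |A| \cdot |B|

That is, if a first choice can be made in |A| ways and a second, independent choice in |B| ways, there are |A| · |B| combined choices.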
Permutations and Combinations
• The permutations of a set of n elements taken
r at a time
• The combinations of a set of n elements
taken r at a time
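Presumably the missing formulas are the usual counting formulas:

    {}_nP_r = \frac{n!}{(n - r)!}
    {}_nC_r = \binom{n}{r} = \frac{n!}{r!\,(n - r)!}

For example, {}_5P_2 = 20 ordered selections, while {}_5C_2 = 10 unordered selections.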
Events and Probability
Probability Properties
• The probability of any event E from the sample
space S is:
• The sum of the probabilities of all possible
outcomes is 1
• The probability of the complement of an event is
• The probability of the contradictory or false
outcome of an event
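Stated as formulas (in the order listed above), these properties are:

    0 \le p(E) \le 1
    \sum_{e_i \in S} p(e_i) = 1
    p(\bar{E}) = 1 - p(E)
    p(\emptyset) = 0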
Independent Events
The Kolmogorov Axioms
Three Kolmogorov Axioms:
From these three Kolmogorov axioms, all of
probability theory can be constructed.
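One common statement of the three axioms (the slide's exact wording may differ) is:

    1.\quad 0 \le p(E) \le 1 \ \text{for any event } E \subseteq S
    2.\quad p(S) = 1
    3.\quad p(E_1 \cup E_2) = p(E_1) + p(E_2) \ \text{whenever } E_1 \cap E_2 = \emptyset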
Traffic Example
Problem description
A driver notices a gradual slowdown in traffic and searches for
possible explanations by means of a car-based download system:
– Road construction?
– Accident?
Three Boolean parameters:
– S: whether there is a traffic slowdown
– A: whether there is an accident
– C: whether there is road construction
Download data – next page
• Download data:
The joint probability distribution for the traffic slowdown (S),
accident (A), and construction (C) variables of the example.
A Venn diagram representation of these probability distributions;
S is traffic slowdown, A is accident, C is construction.
Variables
Expectation
Prior and Posterior Probability
Conditional Probability
A Venn diagram illustrating the calculation of p(d|s) as
a function of p(s|d).
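The relationship the figure illustrates is presumably the standard one:

    p(d \mid s) = \frac{p(d \cap s)}{p(s)}, \qquad p(s) > 0
    p(d \mid s) = \frac{p(s \mid d)\, p(d)}{p(s)}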
Chain Rules
The chain rule for two sets:
The generalization of the chain rule to multiple sets:
We make an inductive argument to prove the chain rule;
consider the nth case:
We apply the rule for the intersection of two sets to get:
And then reduce again, considering that:
until the two-set case is reached, the base case, which we have
already demonstrated.
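For reference, the chain-rule identities being proved are presumably:

    p(A \cap B) = p(A \mid B)\, p(B)
    p(X_1 \cap X_2 \cap \cdots \cap X_n) = p(X_1)\, p(X_2 \mid X_1)\, p(X_3 \mid X_1 \cap X_2) \cdots p(X_n \mid X_1 \cap \cdots \cap X_{n-1})
    p(X_1 \cap \cdots \cap X_n) = p(X_n \mid X_1 \cap \cdots \cap X_{n-1})\, p(X_1 \cap \cdots \cap X_{n-1})

Applying the two-set rule to the remaining joint term repeatedly peels off one conditional factor at a time until only the two-set base case is left.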
Independent Events
Probabilistic FSM
Probabilistic Finite State Acceptor
A probabilistic finite state acceptor for the pronunciation of
“tomato”.
The ni words
The ni words with their frequencies and probabilities from the
Brown and Switchboard corpora of 2.5M words.
The ni phone/word probabilities from the Brown and Switchboard
corpora.
Bayes’ Rules
• Given a set of evidence E and a set of hypotheses H = {hi},
the conditional probability of hi given E is:
p(hi|E) = (p(E|hi) × p(hi)) / p(E)
• Maximum a posteriori (most probable) hypothesis: since p(E) is a
constant for all hypotheses,
arg max(hi) p(E|hi) p(hi)
• E is partitioned by the set of hypotheses, thus
p(E) = Σi p(E|hi) p(hi)
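A minimal Python sketch of these formulas, with made-up priors and likelihoods for three hypothetical hypotheses (the numbers are illustrative only):

    # Hypothetical priors p(h_i) and likelihoods p(E|h_i), for illustration only.
    priors      = {"h1": 0.5, "h2": 0.3, "h3": 0.2}
    likelihoods = {"h1": 0.4, "h2": 0.8, "h3": 0.1}

    # p(E) = sum_i p(E|h_i) p(h_i), using the partition assumption.
    p_E = sum(likelihoods[h] * priors[h] for h in priors)

    # Bayes' rule: p(h_i|E) = p(E|h_i) p(h_i) / p(E)
    posteriors = {h: likelihoods[h] * priors[h] / p_E for h in priors}

    # MAP hypothesis: p(E) is constant, so maximizing p(E|h_i) p(h_i)
    # gives the same answer as maximizing p(h_i|E).
    h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])
    print(posteriors, h_map)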
General Form of Bayes’ Theorem
The general form of Bayes’ theorem, where we assume that the set
of hypotheses H partitions the evidence set E:
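Presumably the formula is:

    p(h_i \mid E) = \frac{p(E \mid h_i)\, p(h_i)}{\sum_{k=1}^{n} p(E \mid h_k)\, p(h_k)}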
Applications of Bayes’ Theorem
• Used in PROSPECTOR
• A simple example: suppose you want to purchase an automobile:
Dealer   Go-to probability   Purchase-a1 probability
1        d1 = 0.2            p1 = 0.2
2        d2 = 0.4            p2 = 0.4
3        d3 = 0.4            p3 = 0.3
• The application of Bayes’ rule to the car purchase problem:
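A sketch of the calculation in Python, assuming the table is read as d_i = p(go to dealer i) and p_i = p(purchase a1 | dealer i):

    # Interpretation (assumption): d_i = p(dealer i), p_i = p(a1 | dealer i).
    go_to    = {1: 0.2, 2: 0.4, 3: 0.4}   # d1, d2, d3
    purchase = {1: 0.2, 2: 0.4, 3: 0.3}   # p1, p2, p3

    # Total probability of purchasing a1: p(a1) = sum_i p(a1|dealer i) p(dealer i)
    p_a1 = sum(purchase[i] * go_to[i] for i in go_to)          # 0.32

    # Bayes' rule: p(dealer i | a1) = p(a1 | dealer i) p(dealer i) / p(a1)
    posterior = {i: purchase[i] * go_to[i] / p_a1 for i in go_to}
    print(posterior)    # approximately {1: 0.125, 2: 0.5, 3: 0.375}

Under this reading, dealer 2 is the most likely source of the a1 purchase.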
Bayes Classifier
• Naïve Bayes, or the Bayes classifier, uses the partition
assumption, even when it is not justified.
• Assume all pieces of evidence are independent, given a
particular hypothesis.
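Under that independence assumption, the posterior factors as:

    p(h_i \mid e_1, \ldots, e_n) \propto p(h_i) \prod_{j=1}^{n} p(e_j \mid h_i)

The classifier then returns the hypothesis hi that maximizes this product.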
The Traffic Problem
The Bayesian representation of the traffic problem with potential
explanations.
The joint probability distribution for the traffic and construction
variables
Given bad traffic, what is the probability of road construction?
p(C|T) = p(C=t, T=t) / (p(C=t, T=t) + p(C=f, T=t)) = .3/(.3 + .1) = .75