Uncertainty-AI
DEALING WITH UNCERTAINTY (1)
WEEK 5
CHAPTER 3
Introduction
• The world is not a well-defined place.
• There is uncertainty in the facts we know:
– What’s the temperature? Imprecise measures
– Is X a good president? Imprecise definitions
– Where are the potholes? Imprecise knowledge
• There is uncertainty in our inferences
– If I have red spots, an itchy rash, and was gardening all
weekend, I probably have poison ivy
• People make successful decisions all the time
anyhow.
Sources of Uncertainty
• Uncertain data
– missing data, unreliable, ambiguous, imprecise representation,
inconsistent, subjective, derived from defaults, noisy…
• Uncertain knowledge
– Multiple causes lead to multiple effects
– Incomplete knowledge of causality in the domain
– Probabilistic/stochastic effects
• Uncertain knowledge representation
– restricted model of the real system
– limited expressiveness of the representation mechanism
• inference process
– Derived result is formally correct, but wrong in the real world
– New conclusions are not well-founded (eg, inductive reasoning)
– Incomplete, default reasoning methods
Reasoning Under Uncertainty
• So how do we do reasoning under uncertainty and
with inexact knowledge?
– heuristics
• ways to mimic heuristic knowledge processing methods used by
experts (limit the search for a solution)
– empirical associations
• experiential reasoning based on limited observations
• Verifiable or provable by means of observation or experiment.
• Guided by practical experience and not theory, as in medicine.
– probabilities
• objective (frequency counting)
• subjective (human experience)
Decision making with uncertainty
• Rational behavior:
– For each possible action, identify the possible
outcomes
– Compute the probability of each outcome
– Compute the utility of each outcome
– Compute the probability-weighted (expected)
utility over possible outcomes for each action
– Select the action with the highest expected utility
(principle of Maximum Expected Utility)
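A minimal sketch of this MEU loop in Python; the actions, outcome probabilities, and utilities below are invented for illustration and are not from the slides:

```python
# Maximum Expected Utility: score each action by its probability-weighted
# utility over possible outcomes, then pick the best-scoring action.
# All numbers here are made-up example values.

actions = {
    "take_umbrella":  [(0.3, 70), (0.7, 60)],   # (P(outcome), utility) pairs
    "leave_umbrella": [(0.3, 10), (0.7, 90)],
}

def expected_utility(outcomes):
    """Probability-weighted (expected) utility over the possible outcomes."""
    return sum(p * u for p, u in outcomes)

# Principle of Maximum Expected Utility: select the action with the highest EU.
best_action = max(actions, key=lambda a: expected_utility(actions[a]))
print(best_action, expected_utility(actions[best_action]))
```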
Some Relevant Factors
• expressiveness
– can concepts used by humans be represented adequately?
– can the confidence of experts in their decisions be expressed?
• comprehensibility
– representation of uncertainty
– utilization in reasoning methods
• correctness
– probabilities
– relevance ranking
– long inference chains
• computational complexity
– feasibility of calculations for practical purposes
• reproducibility
– Do the observations deliver the same results when repeated?
Basic Probability
• Probability theory enables us to make rational
decisions.
• Which mode of transportation is safer:
– Car or Plane?
– What is the probability of an accident?
Basic Probability Theory
• An experiment has a set of potential outcomes, e.g., rolling a die.
• The sample space of an experiment is the set of all possible
outcomes, e.g., {1, 2, 3, 4, 5, 6}.
• An event is a subset of the sample space.
– {2}
– {3, 6}
– even = {2, 4, 6}
– odd = {1, 3, 5}
Probability as Relative Frequency
• An event has a probability.
• Consider a long sequence of experiments. If we look at the
number of times a particular event occurs in that sequence, and
compare it to the total number of experiments,
we can compute a ratio.
• This ratio is one way of estimating the probability of the event.
• P(E) = (# of times E occurred)/(total # of trials)
– 100 attempts are made to swim a length in 30 secs. The
swimmer succeeds on 20 occasions; therefore the
probability that a swimmer can complete the length in 30 secs is:
• 20/100 = 0.2
• Failure = (1 - 0.2) = 0.8 (see the sketch after this slide)
• The experiments, the sample space and the events must
be defined clearly for probability to be meaningful
– What is the probability of an accident?
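As a small illustration, the relative-frequency estimate from the swimmer example can be computed directly (a sketch; the counts are the ones given above):

```python
# Relative-frequency estimate: P(E) = (# of times E occurred) / (total # of trials)
successes = 20     # lengths completed within 30 secs
trials = 100       # attempts made

p_success = successes / trials   # 0.2
p_failure = 1 - p_success        # 0.8
print(p_success, p_failure)
```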
Theoretical Probability
• Principle of Indifference - Alternatives are always to be judged
equiprobable if we have no reason to expect or prefer one over the
other.
• Each outcome in the sample space is assigned equal probability.
• Example: roll a die
– P({1})=P({2})= ... =P({6})=1/6
Law of Large Numbers
• As the number of experiments increases the relative frequency
of an event more closely approximates the theoretical probability
of the event.
– if the theoretical assumptions hold.
• Buffon’s Needle for Computing π
– Draw parallel lines 1 inch apart on a plane
– Throw a 1-inch needle on the plane
– P(needle crossing a line) = 2/π
– π ≈ 2 × (number of throws) / (number of crossings)
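A minimal Monte Carlo sketch of Buffon's needle, assuming unit needle length and unit line spacing; the throw count is arbitrary:

```python
import math
import random

def estimate_pi(throws=100_000):
    """Estimate pi from Buffon's needle: P(crossing) = 2/pi."""
    crossings = 0
    for _ in range(throws):
        d = random.uniform(0.0, 0.5)          # distance from needle centre to nearest line
        theta = random.uniform(0.0, math.pi)  # angle between needle and the lines
        if d <= 0.5 * math.sin(theta):        # the needle crosses a line
            crossings += 1
    # pi ≈ 2 * (number of throws) / (number of crossings)
    return 2 * throws / crossings

print(estimate_pi())
```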
Large Number Reveals Untruth in Assumptions
Why?
Results of 1,000,000 rolls of a die:

Number:    1      2      3      4      5      6
Fraction:  .155   .159   .164   .169   .174   .179

For a fair die each fraction should approach 1/6 ≈ .167; the steady drift from .155 up to .179 suggests the die is loaded, so the equiprobability assumption does not hold.
Axioms of Probability Theory
• Suppose P(.) is a probability function, then
1. For any event E, 0 ≤ P(E) ≤ 1. (How?)
2. P(S) = 1, where S is the sample space.
3. For any two mutually exclusive events E1 and E2,
P(E1 ∪ E2) = P(E1) + P(E2)
• Any function that satisfies the above three
axioms is a probability function.
Joint Probability
• Let A, B be two events, the joint probability of both A and B
being true is denoted by P(A, B).
• Example:
P(spade) is the probability of the top card being a spade.
P(king) is the probability of the top card being a king.
P(spade, king) is the probability of the top card being both a
spade and a king, i.e., the king of spades.
P(king, spade) = P(spade, king)? (Yes: both denote the same joint event.)
Properties of Probability
1. P(¬E) = 1 − P(E)
2. If E1 and E2 are logically equivalent, then
P(E1)=P(E2).
– E1: Not all philosophers are more than six feet tall.
– E2: Some philosopher is not more than six feet tall.
Then P(E1)=P(E2).
3. P(E1, E2) ≤ P(E1).
Conditional Probability
• The probability of an event may change after
knowing another event.
The probability of A given B is denoted by
P(A|B).
• Example
– P( W=space ) the probability of a randomly
selected word from an English text is ‘space’
– P( W=space | W’=outer) the probability of
‘space’ if the previous word is ‘outer’
Example
A: the top card of a deck of poker cards is the king of spades
P(A) = 1/52
However, if we know
B: the top card is a king
then the probability of A given that B is true is
P(A|B) = 1/4.
How to Compute P(A|B)?
Business Students
Of 100 students completing a course, 20 were business
majors. Ten students received an A in the course, and three
of these were business majors.
Suppose A is the event that a randomly selected student
got an A in the course, and B is the event that a randomly
selected student is a business major. What is the
probability of A? What is the probability of A after
knowing B is true?
          B     not B
A         3     7
Total     20    80
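A minimal sketch of the two requested probabilities, computed straight from the counts in the table:

```python
n_total   = 100   # students completing the course
n_b       = 20    # business majors
n_a       = 10    # students who received an A
n_a_and_b = 3     # business majors who received an A

p_a         = n_a / n_total        # P(A)   = 10/100 = 0.10
p_a_given_b = n_a_and_b / n_b      # P(A|B) = 3/20   = 0.15
print(p_a, p_a_given_b)
```

Knowing B raises the probability of A from 0.10 to 0.15.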
Probabilistic Reasoning
• Evidence
– What we know about a situation.
• Hypothesis
– What we want to conclude.
• Compute
– P( Hypothesis | Evidence )
Credit Card Authorization
• E is the data about the applicant's age, job,
education, income, credit history, etc.
• H is the hypothesis that the credit card will
provide positive return.
• The decision of whether to issue the credit
card to the applicant is based on the
probability P(H|E).
Medical Diagnosis
• E is a set of symptoms, such as coughing,
sneezing, headache, ...
• H is a disorder, e.g., common cold, SARS, flu.
• The diagnosis problem is to find an H
(disorder) such that P(H|E) is maximum.
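A minimal sketch of this "most probable disorder" rule; the P(H|E) values below are invented purely for illustration:

```python
# Diagnosis as argmax over hypotheses: pick the H that maximizes P(H|E).
# The probabilities are hypothetical, not clinical data.
p_h_given_e = {
    "common cold": 0.55,
    "flu":         0.35,
    "SARS":        0.10,
}

diagnosis = max(p_h_given_e, key=p_h_given_e.get)
print(diagnosis)   # the hypothesis with maximum P(H|E)
```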
Basics of Probability Theory
• mathematical approach for processing uncertain information
– sample space set
X = {x1, x2, …, xn}
• collection of all possible outcomes
• can be discrete or continuous
– probability number P(xi): likelihood of an event xi to occur
• non-negative value in [0,1]
• total probability of the sample space is 1
• for mutually exclusive events, the probability for at least one of them
is the sum of their individual probabilities
• experimental probability
– based on the frequency of events
• subjective probability
– based on expert assessment
Compound Probabilities
• describes independent events
– do not affect each other in any way
• joint probability of two independent events A
and B
P(A ∩ B) = P(A) * P(B)
• union probability of two independent events A
and B
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
         = P(A) + P(B) - P(A) * P(B)
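A minimal numeric sketch of these two formulas, with made-up probabilities for two independent events:

```python
p_a = 0.3
p_b = 0.5

p_joint = p_a * p_b               # P(A ∩ B) = P(A) * P(B)            -> 0.15
p_union = p_a + p_b - p_joint     # P(A ∪ B) = P(A) + P(B) - P(A ∩ B) -> 0.65
print(p_joint, p_union)
```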
Probability theory
• Random variables
– Domain, e.g., Alarm, Burglary, Earthquake
– Boolean (like these), discrete, or continuous
• Atomic event: complete specification of state
– e.g., Alarm=True ∧ Burglary=True ∧ Earthquake=False
(written alarm ∧ burglary ∧ ¬earthquake)
• Prior probability: degree of belief without any other evidence
– e.g., P(Burglary) = .1
• Joint probability: matrix of combined probabilities of a set of variables
– e.g., P(Alarm, Burglary) =

              alarm    ¬alarm
burglary      .09      .01
¬burglary     .10      .80
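A minimal sketch showing that the prior P(Burglary) = .1 falls out of the joint table by summing over the values of Alarm (the numbers are the ones in the table above):

```python
# Joint distribution P(Alarm, Burglary); keys are (alarm, burglary) truth values.
p_joint = {
    (True,  True):  0.09,   # alarm,  burglary
    (False, True):  0.01,   # ¬alarm, burglary
    (True,  False): 0.10,   # alarm,  ¬burglary
    (False, False): 0.80,   # ¬alarm, ¬burglary
}

# Marginalize out Alarm: P(Burglary) = sum over alarm values of P(alarm, Burglary)
p_burglary = sum(p for (alarm, burglary), p in p_joint.items() if burglary)
print(p_burglary)   # 0.1
```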
Independence
• When two sets of propositions do not affect each other's probabilities,
we call them independent, and can easily compute their joint and
conditional probability:
– Independent(A, B) iff P(A ∧ B) = P(A) P(B), i.e., P(A | B) = P(A)
• For example, {moon-phase, light-level} might be independent of
{burglary, alarm, earthquake}
– Then again, it might not: Burglars might be more likely to burglarize
houses when there’s a new moon (and hence little light)
– But if we know the light level, the moon phase doesn’t affect whether
we are burglarized
– Once we’re burglarized, light level doesn’t affect whether the alarm
goes off
• We need a more complex notion of independence, and methods for
reasoning about these kinds of relationships
Conditional independence
• Absolute independence:
– A and B are independent if P(A ∧ B) = P(A) P(B); equivalently,
P(A) = P(A | B) and P(B) = P(B | A)
• A and B are conditionally independent given C if
– P(A ∧ B | C) = P(A | C) P(B | C)
• This lets us decompose the joint distribution:
– P(A ∧ B ∧ C) = P(A | C) P(B | C) P(C)
• Moon-Phase and Burglary are conditionally independent
given Light-Level
• Conditional independence is weaker than absolute
independence, but still useful in decomposing the full joint
probability distribution
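A minimal sketch of this decomposition on a toy distribution; the variables are meant to echo Light-Level (C), Burglary (A), and Moon-Phase (B), and all numbers are invented:

```python
from itertools import product

p_c = {True: 0.4, False: 0.6}            # P(C)
p_a_given_c = {True: 0.7, False: 0.2}    # P(A=true | C)
p_b_given_c = {True: 0.5, False: 0.1}    # P(B=true | C)

def p_joint(a, b, c):
    """P(A ∧ B ∧ C) = P(A|C) P(B|C) P(C) under conditional independence."""
    pa = p_a_given_c[c] if a else 1 - p_a_given_c[c]
    pb = p_b_given_c[c] if b else 1 - p_b_given_c[c]
    return pa * pb * p_c[c]

# The decomposed joint is still a proper distribution (sums to 1).
print(sum(p_joint(a, b, c) for a, b, c in product([True, False], repeat=3)))
print(p_joint(True, True, True))   # 0.7 * 0.5 * 0.4 = 0.14
```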
Conditional Probabilities
• describes dependent events
– affect each other in some way
• conditional probability of event A given that
event B has already occurred
P(A | B) = P(A ∩ B) / P(B)
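A minimal numeric sketch of this definition, reusing the alarm/burglary joint table from the earlier slide:

```python
p_alarm_and_burglary = 0.09   # P(Alarm ∩ Burglary) from the joint table
p_burglary = 0.10             # P(Burglary)

p_alarm_given_burglary = p_alarm_and_burglary / p_burglary
print(p_alarm_given_burglary)   # 0.9
```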
Q&A