Bayesian Decision Theory

Foundations for a unified theory
1
What is it?
• Bayesian decision theories are formal models of
rational agency, typically comprising a theory of:
– Consistency of belief, desire and preference
– Optimal choice
• Lots of common ground…
– Ontology: Agents; states of the world;
actions/options; consequences
– Form: Two-variable quantitative models; centrality of
representation theorems
– Content: The principle that rational action maximises
expected benefit.
2
It seems natural therefore to speak of plain Decision
Theory. But there are differences too ...
e.g. Savage versus Jeffrey.
– Structure of the set of prospects
– The representation of actions
– SEU versus CEU.
Are they offering rival theories or different expressions of
the same theory?
Thesis: Ramsey, Savage, Jeffrey (and others) are all
special cases of a single Bayesian Decision Theory
(obtained by restriction of the domain of prospects).
3
Plan
• Introductory remarks
– Prospects
– Basic Bayesian hypotheses
– Representation theorems
• A short history
– Ramsey’s solution to the measurement problem
– Ramsey versus Savage
– Jeffrey
• Conditionals
– Lewis-Stalnaker semantics
– The Ramsey-Adams Hypothesis
– A common logic
• Conditional algebras
• A Unified Theory (2nd lecture)
4
Types of prospects
• Usual factual possibilities e.g. it will rain tomorrow; UK
inflation is 3%; etc.
– Denoted by P, Q, etc.
– Assumed to be closed under Boolean compounding
• Conjunction: PQ
• Negation: ¬P
• Disjunction: P v Q
• Logical truth/falsehood: T, ⊥
• Plus derived conditional possibilities e.g. If it rains
tomorrow our trip will be cancelled; if the war in Iraq
continues, inflation will rise.
– The prospect of X if P and Y if Q will be represented as
(P→X)(Q→Y)
5
Main Claims
• Probability Hypothesis: Rational degrees of belief in
factual possibilities are probabilities.
• SEU Hypothesis: The desirability of (P→X)(¬P→Y) is
an average of the desirabilities of PX and ¬PY,
respectively weighted by the probability that P or that ¬P.
• CEU Hypothesis: The desirability of the prospect of X is
an average of the desirabilities of XY and X¬Y,
respectively weighted by the conditional probability,
given X, of XY and of X¬Y.
• Adams Thesis: The rational degree of belief to have in
P→X is the conditional probability of X given that P.
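The SEU and CEU hypotheses can be illustrated with a small numerical sketch. The Python below is an illustration only, not part of the original slides; the function names `seu` and `ceu` and all the numbers are hypothetical.

```python
def seu(p_P, des_PX, des_notP_Y):
    """SEU: desirability of (P→X)(¬P→Y) as an average of the desirabilities
    of PX and ¬PY, weighted by the probabilities of P and ¬P."""
    return p_P * des_PX + (1 - p_P) * des_notP_Y


def ceu(p_XY, p_X_notY, des_XY, des_X_notY):
    """CEU: desirability of X as an average of the desirabilities of XY and
    X¬Y, weighted by their conditional probabilities given X."""
    p_X = p_XY + p_X_notY
    return (p_XY / p_X) * des_XY + (p_X_notY / p_X) * des_X_notY


# Hypothetical inputs: Pr(P) = 0.25, desirabilities 8 and 4.
seu_value = seu(p_P=0.25, des_PX=8.0, des_notP_Y=4.0)

# Hypothetical inputs: Pr(XY) = 0.125 and Pr(X¬Y) = 0.375, so Pr(Y | X) = 0.25.
ceu_value = ceu(p_XY=0.125, p_X_notY=0.375, des_XY=8.0, des_X_notY=4.0)
```

With these inputs both averages use the weights 0.25 and 0.75, so the two values coincide. The difference between the hypotheses lies in which probabilities supply the weights: unconditional probabilities of the state P for SEU, conditional probabilities given the prospect X for CEU.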
6
Representation Theorems
• Two problems; one kind of solution!
– Problem of measurement
– Problem of justification
• Scientific application: Representation theorems show
that specific conditions on (revealed) preferences suffice
to determine a measure of belief and desire.
• Normative application: Theorems show that commitment
to conditions on (rational) preference implies commitment
to properties of rational belief and desire.
7
Ramsey-Savage Framework
1. Worlds / consequences: ω1, ω2, ω3, …
2. Propositions / events: P, Q, R, …
3. Conditional Prospects / Actions: (P→ω1)(Q→ω2), …
                  Good egg          Rotten egg
Break egg         6-egg omelette    Nothing to eat
Throw egg away    5-egg omelette    5-egg omelette
4. Preferences are over worlds and conditional
prospects.
“If we had the power of the almighty … we could by offering
him options discover how he placed them in order of merit
…”
8
Ramsey’s Solution to the Measurement
Problem
1. Ethically neutral propositions
• Problem of definition
• An ethically neutral proposition P has probability one-half iff for all ω1 and ω2:
(P→ω1)(¬P→ω2) ≈ (¬P→ω1)(P→ω2)
2. Differences in value
• Values are sets of equi-preferred prospects
• α − β ≥ γ − δ iff (P→α)(¬P→δ) ≽ (P→β)(¬P→γ),
for an ethically neutral P of probability one-half
9
3. Existence of utility
Axiomatic characterisation of a value difference
structure implies the existence of a mapping U from
values to real numbers such that:
α − β = γ − δ iff U(α) − U(β) = U(γ) − U(δ)
4. Derivation of probability
Suppose δ ≈ (α if P)(β if ¬P). Then:
Pr(P) = (U(δ) − U(β)) / (U(α) − U(β))
10
Evaluation
• The Justification problem
– Why should measurement axioms hold?
– Sure-Thing Principle versus P4 and Impartiality
• Jeffrey’s objection
– Fanciful causal hypotheses and artifacts of attribution.
– Behaviourism in decision theory
• Ethical neutrality versus state dependence
– Desirabilistic dependence
– Constant acts
11
Utility Dependence
                  Good egg                         Rotten egg
Break egg         6-egg omelette, none wasted      Nothing to eat, 5 eggs wasted
Throw egg away    5-egg omelette, 1 egg wasted     5-egg omelette, none wasted

                  Good egg                         Rotten egg
Miracle           6-egg omelette, none wasted      6-egg omelette, none wasted
Topsy Turvy       Nothing to eat, 5 eggs wasted    6-egg omelette, none wasted
12
Probability Dependence
                   Republican                             Democrat
Dodgy land deal    Low taxes, unrestricted development    High taxes, restricted development
No deal            No development                         No development
Miracle deal       High taxes, restricted development     Low taxes, unrestricted development
13
Jeffrey
• Advantages
– A simple ontology of propositions
– State dependent utility
– Partition independence (CEU)
• Measurement
– Under-determination of quantitative representations
– The inseparability of belief and desire?
– Solutions: More axioms, more relations or more
prospects?
• The logical status of conditionals
14
Conditionals
• Two types of conditional?
– Counterfactual: If Oswald hadn’t killed Kennedy
then someone else would have.
– Indicative: If Oswald didn’t kill Kennedy then
someone else did.
• Two types of supposition
– Evidential: If it’s true that …
– Interventional: If I make it true that …
[Lewis, Joyce, Pearl versus Stalnaker, Adams,
Edgington]
15
Lewis-Stalnaker semantics
Intuitive idea: A□→B is true iff B is true in those worlds
most like the actual one in which A is true.
Formally: A□→B is true at a world w iff for every
A¬B-world there is a closer AB-world (relative to an
ordering on worlds).
1. Limit assumption: There is at least one closest A-world.
2. Uniqueness assumption: There is at most one
closest A-world.
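The formal truth condition can be sketched over a finite set of worlds. The Python below is an illustration only: the three worlds, the facts true at each, and the numeric distances from the actual world are all hypothetical, and closeness is simplified to a single number per world.

```python
def counterfactual(worlds, distance, antecedent, consequent):
    """A□→B at the actual world: every A¬B-world has a strictly closer AB-world."""
    ab_worlds = [w for w, facts in worlds.items()
                 if antecedent(facts) and consequent(facts)]
    a_not_b_worlds = [w for w, facts in worlds.items()
                      if antecedent(facts) and not consequent(facts)]
    return all(
        any(distance[v] < distance[w] for v in ab_worlds)
        for w in a_not_b_worlds
    )


# Hypothetical model: each world is the set of atomic facts true at it.
worlds = {
    "actual": {"oswald_shot", "kennedy_killed"},
    "w1": set(),                                  # nobody shoots, Kennedy survives
    "w2": {"other_shot", "kennedy_killed"},       # someone else shoots
}
distance = {"actual": 0, "w1": 1, "w2": 2}        # assumed closeness to the actual world

oswald_did_not = lambda facts: "oswald_shot" not in facts
someone_else_did = lambda facts: "other_shot" in facts
kennedy_survives = lambda facts: "kennedy_killed" not in facts

someone_else_would_have = counterfactual(worlds, distance, oswald_did_not, someone_else_did)
kennedy_would_have_survived = counterfactual(worlds, distance, oswald_did_not, kennedy_survives)
```

On this assumed ordering the closest world where Oswald does not shoot is one where nobody does, so "if Oswald hadn't killed Kennedy then someone else would have" comes out false. The function returns true vacuously when there are no antecedent worlds.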
16
The Ramsey-Adams Hypothesis
• General Idea: Rational belief in conditionals goes by
conditional belief for their consequents on the
assumption that their antecedent is true.
• Adams Thesis: The probability of an (indicative)
conditional is the conditional probability of its
consequent given its antecedent:
(AT) p(A→B) = p(B | A)
• Logic from belief: A sentence Y can be validly inferred
from a set of premises iff the high probability of the
premises guarantees the high probability of Y.
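To see what the Adams Thesis commits one to, it helps to compare the conditional probability p(B | A) with the probability of the material conditional ¬A v B under the same distribution. The Python below is a minimal sketch; the joint distribution over A and B is hypothetical.

```python
from fractions import Fraction

# Hypothetical joint distribution over the four (A, B) truth-value cells.
joint = {
    (True, True): Fraction(1, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(4, 10),
    (False, False): Fraction(4, 10),
}


def prob(event):
    """Probability of an event given as a predicate on (a, b)."""
    return sum((p for cell, p in joint.items() if event(*cell)), Fraction(0))


def conditional(consequent, antecedent):
    """p(consequent | antecedent); assumes the antecedent has positive probability."""
    both = prob(lambda a, b: antecedent(a, b) and consequent(a, b))
    return both / prob(antecedent)


# Adams Thesis value for A→B: the conditional probability of B given A.
p_adams = conditional(lambda a, b: b, lambda a, b: a)

# Probability of the material conditional ¬A v B, for comparison.
p_material = prob(lambda a, b: (not a) or b)
```

Here A is improbable, so the material conditional is highly probable (9/10) merely because its antecedent is probably false, while the Adams Thesis assigns the conditional only 1/2.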
17
A Common Logic
1. AB ⊨ A→B ⊨ A⊃B
2. ⊤→A ≡ A
3. A→A ≡ ⊤
4. A→¬A ≡ ⊥
5. A→B ≡ A→AB
6. (A→B)(A→C) ≡ A→BC
7. (A→B) v (A→C) ≡ A→(B v C)
8. ¬(A→B) ≡ A→¬B
18
The Bombshell
• Question: What must the truth-conditions of A→B be, for
the Ramsey-Adams hypothesis to be satisfied?
• Answer: The question cannot be answered.
Lewis, Edgington, Hájek, Gärdenfors, Döring, …: There is no non-trivial
assignment of truth-conditions to the conditional
consistent with the Ramsey-Adams hypothesis.
• Conclusion:
1. “few philosophical theses that have been more decisively
refuted” – Joyce (1999, p.191)
2. Ditch bivalence!
19

Boolean algebra
[Figure: lattice diagram of a Boolean algebra on A, B and C]
20

Conditional Algebras (1)
[Figure: lattice diagram of the Boolean algebra on A, B and C extended with conditional elements]
(X→Y)(X→Z) ≡ X→YZ
(X→Y) v (X→Z) ≡ X→(Y v Z)
21

Conditional algebras (2)
[Figure: lattice diagram of the Boolean algebra on A, B and C extended with conditional elements]
XY ⊨ X→Y
22

Conditional algebras (3)
[Figure: lattice diagram of the Boolean algebra on A, B and C extended with conditional elements]
X→Y ⊨ X⊃Y
23

Normally bounded algebras (1)
[Figure: lattice diagram of the Boolean algebra on A, B and C extended with conditional elements]
X→X ≡ ⊤
X→Y ≡ X→XY
24

Material Conditional
[Figure: lattice diagram of the Boolean algebra on A, B and C extended with conditional elements]
X→⊥ ≡ ¬X
25

Normally bounded algebras (2)
[Figure: lattice diagram of the Boolean algebra on A, B and C extended with conditional elements]
X→¬X ≡ ⊥
¬(X→Y) ≡ X→¬Y
26

Conditional algebras (3)
[Figure: lattice diagram of the Boolean algebra on A, B and C extended with conditional elements]
27