Basic Axioms of Probability


Lecture II
Using the example from Bierens, Chapter 1:
Assume we are interested in the game Texas
lotto (similar to Florida lotto).


In this game, players choose a set of 6 numbers
out of the first 50. Note that the ordering does
not count, so that 35,20,15,1,5,45 is the same as
35,5,15,20,1,45.
How many different sets of numbers can be
drawn?

First, we note that we could draw any one of 50
numbers in the first draw.
However for the second draw we can only draw
49 possible numbers (one of the numbers has
been eliminated). Thus, there are 50 x 49
different ways to draw two numbers.
Again, for the third draw, we only have 48
possible numbers left. Therefore, the total
number of possible ways to draw 6 numbers in order out
of 50 is

$$\prod_{j=0}^{5} (50 - j) = \prod_{k=45}^{50} k = \frac{\prod_{k=1}^{50} k}{\prod_{k=1}^{50-6} k} = \frac{50!}{(50-6)!}$$

Finally, note that there are 6! ways to draw a set of 6
numbers (you could draw 35 first, or 20 first, …). Thus,
the total number of ways to draw an unordered set of
6 numbers out of 50 is

$$\binom{50}{6} = \frac{50!}{6!\,(50-6)!} = 15{,}890{,}700$$

This quantity is a binomial coefficient (a combination). It is also useful for binomial
arithmetic:

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$$
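To see the binomial expansion at work, the sketch below compares both sides of the identity for one set of integers. The values of `a`, `b`, and `n` are arbitrary choices made here for illustration, and the helper name `binomial_expansion` is hypothetical.

```python
import math

def binomial_expansion(a, b, n):
    """Sum of C(n, k) * a**k * b**(n - k) for k = 0..n."""
    return sum(math.comb(n, k) * a**k * b**(n - k) for k in range(n + 1))

a, b, n = 2, 3, 6                    # arbitrary illustrative integers
lhs = (a + b) ** n                   # direct evaluation of (a + b)^n
rhs = binomial_expansion(a, b, n)    # term-by-term expansion

print(lhs, rhs)                      # 15625 15625
```

Setting a = b = 1 gives the sum of all binomial coefficients, which equals 2^n.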
Basic definitions:

Sample Space: The set of all possible outcomes. In the Texas lotto scenario, the sample space is all
15,890,700 possible sets of 6 numbers which could be
drawn.
Event: A subset of the sample space. In the Texas lotto
scenario, possible events include single draws such as
{35,20,15,1,5,45} or complex draws such as all
possible lotto tickets including {35,20,15}. Note that
this could be {35,20,15,1,2,3}, {35,20,15,1,2,4},….

Simple Event: An event which cannot be a union
of other events. In
the Texas lotto scenario, this is a single draw such as
{35,20,15,1,5,45}.
Composite Event: An event which is not a simple
event.
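The distinction between simple and composite events can be made concrete with sets. The sketch below assumes a scaled-down lotto (3 numbers drawn out of 6) so that the whole sample space can be enumerated; this smaller game is an assumption made for illustration, not the Texas lotto itself.

```python
from itertools import combinations

# Scaled-down lotto (illustrative assumption): draw 3 numbers out of 6.
sample_space = {frozenset(c) for c in combinations(range(1, 7), 3)}

# Simple event: a single draw.
simple_event = {frozenset({1, 2, 3})}

# Composite event: every draw that contains both 1 and 2.
composite_event = {draw for draw in sample_space if {1, 2} <= draw}

print(len(sample_space))      # 20
print(len(composite_event))   # 4
```

The composite event is the union of four simple events, one for each possible third number.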
A set $\{\omega_{j_1}, \ldots, \omega_{j_k}\}$ of different combinations of outcomes is
called an event. These events could be
simple events or composite events. In the
Texas lotto case, the important aspect is
that the event is something you could bet
on (for example, you could bet on three
numbers in the draw {35,20,15}).
A collection of events F is called a family of
subsets of the sample space Ω. This family
consists of all possible subsets of Ω,
including Ω itself and the empty set ∅.

Following the betting line, you could bet on all
possible numbers (covering the board), so that
Ω is a valid bet.
Alternatively, you could bet on nothing, so that ∅ is a
valid bet.
Next, we will examine a variety of
closure conditions. These are conditions
that guarantee that if one set is
contained in a family, another related
set must also be contained in that
family.

First, we note that the family is closed
under complementarity:
If $A \in F$ then $\tilde{A} = \Omega \setminus A \in F$

Second, we note that the family is closed
under union:
If $A, B \in F$ then $A \cup B \in F$
Definition 1.1 (Bierens): A collection F of
subsets of a nonempty set Ω satisfying
closure under complementarity and closure
under union is called an algebra.
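For a finite Ω the two closure conditions can be checked by brute force. The sketch below is one way to do so, assuming a three-element Ω chosen for illustration; the helper names `power_set` and `is_algebra` are hypothetical.

```python
from itertools import combinations

def power_set(omega):
    """All subsets of omega, as frozensets."""
    items = list(omega)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def is_algebra(family, omega):
    """Check closure under complement and under pairwise union."""
    omega = frozenset(omega)
    closed_complement = all(omega - a in family for a in family)
    closed_union = all(a | b in family for a in family for b in family)
    return closed_complement and closed_union

omega = {1, 2, 3}
full_family = power_set(omega)                                    # all 8 subsets
trivial_family = {frozenset(), frozenset(omega)}                  # only the empty set and omega
broken_family = {frozenset(), frozenset({1}), frozenset(omega)}   # lacks {2, 3}

print(is_algebra(full_family, omega))      # True
print(is_algebra(trivial_family, omega))   # True
print(is_algebra(broken_family, omega))    # False
```

The last family fails because it contains {1} but not its complement {2, 3}.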
We next add closure under countably infinite union, defined
as

If $A_j \in F$ for $j = 1, 2, 3, \ldots$ then $\bigcup_{j=1}^{\infty} A_j \in F$

Definition 1.2 (Bierens): A collection F of subsets
of a nonempty set Ω satisfying closure under
complementarity and countably infinite union is called a σ-algebra (sigma-algebra) or a Borel Field.
$$P(A): A \to x \in [0, 1]$$

We typically think of this as an odds
function (i.e., what are the odds of a
winning lotto ticket? 1/15,890,700).

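To be mathematically precise, suppose we
define an event
$$A = \{\omega_1, \ldots, \omega_n\},$$
say that we choose n different sets of numbers. The
probability of winning the lotto is
$$P(A) = P(\{\omega_1, \ldots, \omega_n\}) = \frac{n}{N}$$
where N is the total number of possible draws (15,890,700 here).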

Our intuition would indicate that
$$P(\Omega) = 1,$$
or the probability of winning given that you have
covered the board is equal to one (a certainty).
Further, if you don't bet, the probability of
winning is zero, or
$$P(\emptyset) = 0$$
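The rule that an event made of n equally likely draws has probability n/N can be evaluated exactly with rational arithmetic. The sketch below applies it to a single ticket, to the event "the draw contains 35, 20 and 15", to covering the board, and to betting on nothing. The function name `prob` is hypothetical.

```python
import math
from fractions import Fraction

N = math.comb(50, 6)   # total number of possible draws

def prob(n_outcomes):
    """P(A) = n / N for an event made of n equally likely draws."""
    return Fraction(n_outcomes, N)

# Event: the draw contains 35, 20 and 15. The other three numbers
# come from the remaining 47, so the event holds C(47, 3) draws.
n_contains_triple = math.comb(47, 3)

p_single = prob(1)                   # one ticket
p_triple = prob(n_contains_triple)   # any draw containing {35, 20, 15}
p_board = prob(N)                    # covering the board
p_nothing = prob(0)                  # betting on nothing

print(p_single)    # 1/15890700
print(p_triple)    # 1/980
print(p_board)     # 1
print(p_nothing)   # 0
```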

Definition 1.2.2 (Casella and Berger): Given a
sample space Ω and an associated Borel field B, a
probability function is a function P with domain
B that satisfies

$P(A) \ge 0$ for all $A \in B$.
$P(\Omega) = 1$.
If $A_1, A_2, \ldots \in B$ are pairwise disjoint, then
$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$.
Axioms of Probability:

$P(A) \ge 0$ for any event A.
$P(S) = 1$, where S is the sample space.
If $\{A_i\}$, $i = 1, 2, \ldots$, are mutually exclusive (that is,
$A_i \cap A_j = \emptyset$ for all $i \ne j$), then
$P(A_1 \cup A_2 \cup \ldots) = P(A_1) + P(A_2) + \ldots$
In a little more detail from Casella and
Berger:


Definition 1.1.1: The set, S, of all possible
outcomes of a particular experiment is called the
sample space for the experiment.
Definition 1.1.2: An event is any collection of
possible outcomes of an experiment, that is, any
subset of S (including S itself).

Defining the subset relationship:

$A \subset B \Leftrightarrow (x \in A \Rightarrow x \in B)$
$A = B \Leftrightarrow A \subset B$ and $B \subset A$

Union: The union of A and B, written $A \cup B$, is the set
of elements that belong to either A or B.
$A \cup B = \{x : x \in A \text{ or } x \in B\}$

Intersection: The intersection of A and B, written $A \cap B$,
is the set of elements that belong to both A and B.
$A \cap B = \{x : x \in A \text{ and } x \in B\}$

Complementation: The complement of A, written $A^c$,
is the set of all elements that are not in A.
$A^c = \{x : x \notin A\}$

Theorem 1.1.1: For any three events A, B, and C
defined on a sample space S,

Commutativity: $A \cup B = B \cup A$, $A \cap B = B \cap A$.
Associativity: $A \cup (B \cup C) = (A \cup B) \cup C$,
$A \cap (B \cap C) = (A \cap B) \cap C$.
Distributive Laws:
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$,
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
DeMorgan's Laws: $(A \cup B)^c = A^c \cap B^c$,
$(A \cap B)^c = A^c \cup B^c$.
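These identities can be spot-checked with Python's built-in set type, where `|` is union and `&` is intersection. The sample space and the three events below are arbitrary choices made for illustration; a check on one example is not a proof.

```python
S = set(range(1, 11))    # illustrative sample space
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {4, 6, 8, 10}

def complement(event):
    """Complement relative to the sample space S."""
    return S - event

commutative = ((A | B) == (B | A)) and ((A & B) == (B & A))
associative = ((A | (B | C)) == ((A | B) | C)) and ((A & (B & C)) == ((A & B) & C))
distributive = ((A & (B | C)) == ((A & B) | (A & C))) and \
               ((A | (B & C)) == ((A | B) & (A | C)))
de_morgan = (complement(A | B) == (complement(A) & complement(B))) and \
            (complement(A & B) == (complement(A) | complement(B)))

print(commutative, associative, distributive, de_morgan)   # True True True True
```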
Simple Events with Equal Probabilities

$$P(A) = \frac{n(A)}{n(S)}$$

The probability of event A is simply the
number of possible occurrences of A divided by
the number of possible occurrences in the
sample space.
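As a quick check, the counting rule can be tested against the three axioms of probability on a small sample space. The sketch below assumes a fair six-sided die (an illustrative choice) and enumerates every event. It only verifies additivity for pairs of disjoint events, which is all that can be checked by enumeration.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset(range(1, 7))   # faces of a fair die (illustrative assumption)

def P(event):
    """Equal-probability rule: P(A) = n(A) / n(S)."""
    return Fraction(len(event), len(S))

# Every subset of S is an event: 2**6 = 64 of them.
events = [frozenset(c) for r in range(len(S) + 1)
          for c in combinations(sorted(S), r)]

axiom1 = all(P(a) >= 0 for a in events)          # non-negativity
axiom2 = P(S) == 1                               # total probability is one
axiom3 = all(P(a | b) == P(a) + P(b)             # additivity for disjoint pairs
             for a in events for b in events if not (a & b))

print(axiom1, axiom2, axiom3)   # True True True
```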
Definition 2.3.1: The number of permutations
of taking r elements from n elements is the
number of distinct ordered sets consisting of
r distinct elements which can be formed out
of a set of n distinct elements and is
denoted $P^n_r$.

The first point to consider is that of factorials.
For example, if you have two objects A and B,
how many different ways are there to order the
objects? Two:
{A, B} or {B, A}

If you have three objects, how many ways are
there to order the objects? Six:
{A, B, C}, {A, C, B}, {B, A, C}, {B, C, A}, {C, A, B}, or {C,
B, A}


The sequence then becomes: two objects can be
ordered in two sequences, three objects can be
ordered in six sequences (2 x 3). By induction,
four objects can be ordered in 24 sequences
(6 x 4).
The total possible number of sequences
for n objects is then n!, defined as:
$n! = n(n-1)(n-2) \cdots 1$


Theorem 2.3.1: $P^n_r = n!/(n-r)!$.
Definition 2.3.2: The number of combinations of
taking r elements from n elements is the number
of distinct sets consisting of r distinct elements
which can be formed out of a set of n distinct
elements and is denoted $C^n_r$.

$$C^n_r = \frac{n!}{(n-r)!\,r!}$$
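The two formulas can be compared against a direct enumeration. The sketch below assumes small values n = 5 and r = 3 (chosen here so the lists stay short) and uses `itertools` to list the ordered and unordered selections.

```python
import math
from itertools import combinations, permutations

n, r = 5, 3            # small illustrative values
items = range(n)

ordered = list(permutations(items, r))     # distinct ordered selections
unordered = list(combinations(items, r))   # distinct unordered selections

# Closed-form counts: n!/(n-r)! and n!/((n-r)! r!)
perm_formula = math.factorial(n) // math.factorial(n - r)
comb_formula = math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

print(len(ordered), perm_formula)     # 60 60
print(len(unordered), comb_formula)   # 10 10
```

Each unordered selection appears r! = 6 times among the ordered ones, which is why the two counts differ by that factor.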
In order to define the concept of a
conditional probability it is necessary to
discuss joint probabilities and marginal
probabilities.

A joint probability is the probability of two
random events occurring together. For example, consider the draw of two
cards from a deck of cards. There are
52 x 51 = 2652 different ordered pairs for the first
two cards from the deck.


The marginal probability is the overall probability of
a single event, such as the probability of drawing a
given card.
The conditional probability of an event is the
probability of that event given that some other
event has occurred.
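The three notions can be illustrated on the two-card draw by enumerating all ordered pairs. In the sketch below the cards are labeled 0 to 51 (the labels are arbitrary and introduced here), and the conditional probability is computed as the joint probability divided by the marginal.

```python
from fractions import Fraction
from itertools import permutations

deck = range(52)                       # cards labeled 0..51 (arbitrary labels)
draws = list(permutations(deck, 2))    # ordered (first, second) pairs

def P(condition):
    """Probability that a draw satisfies condition, all draws equally likely."""
    hits = sum(1 for d in draws if condition(d))
    return Fraction(hits, len(draws))

joint = P(lambda d: d == (0, 1))           # first card is 0 AND second card is 1
marginal_first = P(lambda d: d[0] == 0)    # first card is 0, second unrestricted
marginal_second = P(lambda d: d[1] == 1)   # second card is 1, first unrestricted
conditional = joint / marginal_first       # P(second is 1 | first is 0)

print(len(draws))        # 2652
print(joint)             # 1/2652
print(marginal_first)    # 1/52
print(conditional)       # 1/51
```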


In the textbook example, what is the probability of the die
showing a one if you know that the face number is odd?
(1/3).
Note that if you know that the roll of the die is a one, then
the probability of the roll being odd is 1.
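The die example can be reproduced by counting outcomes inside the conditioning event. This is a minimal sketch assuming a fair six-sided die; the helper name `P_given` is hypothetical.

```python
from fractions import Fraction

faces = {1, 2, 3, 4, 5, 6}
odd = {f for f in faces if f % 2 == 1}
one = {1}

def P_given(a, b):
    """Share of the outcomes in b that are also in a (equally likely outcomes)."""
    return Fraction(len(a & b), len(b))

p_one_given_odd = P_given(one, odd)   # die shows one, given the face is odd
p_odd_given_one = P_given(odd, one)   # face is odd, given the die shows one

print(p_one_given_odd)   # 1/3
print(p_odd_given_one)   # 1
```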
Axioms of Conditional Probability:

$P(A \mid B) \ge 0$ for any event A.
$P(A \mid B) = 1$ for any event $A \supset B$.
If $\{A_i \cap B\}$, $i = 1, 2, \ldots$, are mutually exclusive, then
$P(A_1 \cup A_2 \cup \ldots \mid B) = P(A_1 \mid B) + P(A_2 \mid B) + \ldots$
If $B \supset H$ and $B \supset G$ and $P(G) \ne 0$, then
$$\frac{P(H \mid B)}{P(G \mid B)} = \frac{P(H)}{P(G)}$$


Theorem 2.4.1: $P(A \mid B) = P(A \cap B)/P(B)$ for any
pair of events A and B such that $P(B) > 0$.
Theorem 2.4.2 (Bayes Theorem): Let events $A_1, A_2, \ldots, A_n$
be mutually exclusive such that
$P(A_1 \cup A_2 \cup \ldots \cup A_n) = 1$ and $P(A_i) > 0$ for each i. Let E
be an arbitrary event such that $P(E) > 0$. Then

$$P(A_i \mid E) = \frac{P(E \mid A_i) P(A_i)}{\sum_{j=1}^{n} P(E \mid A_j) P(A_j)}$$

Another manifestation of this theorem is from
the joint distribution function:
$$P(E, A_i) = P(E \cap A_i) = P(E \mid A_i) P(A_i)$$

The denominator of Bayes' theorem reduces to the marginal
probability of event E:
$$P(E) = \sum_{i=1}^{n} P(E \mid A_i) P(A_i)$$

This yields a friendlier version of Bayes theorem
based on the ratio between the joint and
marginal distribution function:
$$P(A_i \mid E) = \frac{P(E, A_i)}{P(E)}$$
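The joint-over-marginal form can be worked through on the die example. In the sketch below, the events A_i are "the die shows face i" and E is "the face is odd". Treating the faces as the partition is a choice made here for illustration, and the dictionary names are hypothetical.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# P(A_i): each face of a fair die is equally likely.
prior = {i: Fraction(1, 6) for i in faces}

# P(E | A_i): E is "the face is odd", so it is 1 for odd faces and 0 otherwise.
likelihood = {i: Fraction(1 if i % 2 else 0) for i in faces}

joint = {i: likelihood[i] * prior[i] for i in faces}   # P(E, A_i)
marginal = sum(joint.values())                          # P(E)
posterior = {i: joint[i] / marginal for i in faces}     # P(A_i | E)

print(marginal)       # 1/2
print(posterior[1])   # 1/3
print(posterior[2])   # 0
```

The posterior for face 1 matches the 1/3 obtained by conditioning directly on an odd roll.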
Statistical independence is when the
probability of one random event does
not depend on the occurrence of another
random event.
Definition 2.4.1: Events A and B are said to
be independent if $P(A) = P(A \mid B)$.
Definition 2.4.2: Events A, B, and C are said
to be mutually independent if the following
equalities hold:

$P(A \cap B) = P(A) P(B)$
$P(A \cap C) = P(A) P(C)$
$P(B \cap C) = P(B) P(C)$
$P(A \cap B \cap C) = P(A) P(B) P(C)$
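The fourth equality is not implied by the first three. The sketch below uses a standard illustration that is not from the lecture: two fair coin tosses, with A = "first toss is heads", B = "second toss is heads", and C = "both tosses match". The three events are pairwise independent but not mutually independent.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))   # two fair coin tosses, 4 outcomes

def P(event):
    """Equal-probability rule over the four outcomes."""
    return Fraction(len(event), len(outcomes))

A = {o for o in outcomes if o[0] == "H"}    # first toss is heads
B = {o for o in outcomes if o[1] == "H"}    # second toss is heads
C = {o for o in outcomes if o[0] == o[1]}   # both tosses match

pairwise = (P(A & B) == P(A) * P(B)
            and P(A & C) == P(A) * P(C)
            and P(B & C) == P(B) * P(C))
triple = P(A & B & C) == P(A) * P(B) * P(C)

print(pairwise)   # True
print(triple)     # False
```

Here P(A ∩ B ∩ C) is 1/4, while the product of the three probabilities is 1/8.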