Probability, Discrete Random Variables
Chapter 3 Probability
The Concept of Probability
Sample Spaces and Events
Some Elementary Probability
Rules
Conditional Probability and
Independence
Section 3.1 The Concept of
Probability
An experiment is any process of observation with an uncertain
outcome.
--- On any single trial of the experiment, one and only one of the
possible outcomes will occur.
The possible outcomes for an experiment are called the
experimental outcomes
Probability is a measure of the chance that an experimental
outcome will occur when an experiment is carried out
Example 3.1
Roll a die. The experimental outcomes are 1, 2, 3, 4, 5,
and 6.
An Outcome is the
particular result of
an experiment.
An Event is the
collection of one or
more outcomes of
an experiment.
Possible outcomes: The
numbers 1, 2, 3, 4, 5, 6
One possible event: The
occurrence of an even
number. That is, we collect
the outcomes 2, 4, and 6.
Regardless of the method used, probabilities must be
assigned to the experimental outcomes so that two
conditions are met:
Conditions
1. 0 ≤ P(E) ≤ 1
such that:
If E can never occur, then P(E) = 0
If E is certain to occur, then P(E) = 1
2. The probabilities of all the experimental outcomes must
sum to 1
Section 3.2 Sample Spaces
and Events
Sample space (S):
The sample space is defined as the set of all possible
outcomes of an experiment.
e.g. All 6 faces of a die
e.g. All 52 cards of a bridge deck
Example 3.2
Genders of Two Children
Let: B be the outcome that a child is a boy.
G be the outcome that a child is a girl.
Sample space S:
S = {BB, BG, GB, GG}
If B and G are equally
likely , then
P(B) = P(G) = ½
and
P(BB) = P(BG) = P(GB) =
P(GG) = ¼
Recall example 3.2: Genders of Two Children
An event is a set of sample space outcomes.
Events
P(one boy and one girl) =
P(BG) + P(GB) = ¼ + ¼ = ½.
P(at least one girl) =P(BG) +
P(GB) + P(GG) = ¼ + ¼ + ¼
= ¾.
Note: Experimental Outcomes: BB, BG, GB, GG
All outcomes equally likely: P(BB) = … = P(GG) = ¼
Example 3.3
Answering Three True-False Questions
A student takes a quiz that consists of three true-false
questions. Let C and I denote answering a question correctly
and incorrectly, respectively.
The graph on the next slide shows the sample space outcomes
for the experiment. The sample space consists of 8 outcomes:
CCC CCI CIC CII ICC ICI IIC III
Suppose the student is totally unprepared for the quiz and has
to blindly guess the answers. That is, the student has a 50-50
chance of correctly answering each question.
So, each of the 8 outcomes is equally likely to occur.
P(CCC)=P(CCI)= ... = P(III)=1/8.
Probabilities: Equally Likely Outcomes
If the sample space outcomes (or experimental
outcomes) are all equally likely, then the
probability that an event will occur is equal to
the ratio:
(number of outcomes that correspond to the event) / (total number of outcomes)
Basic Computation of Probabilities
The probability of an event also equals the sum of the
probabilities of the sample space outcomes that correspond to
the event.
Example 3.4
The probability that the student will get exactly two questions
correct is
P(CCI) + P(CIC) + P(ICC) = 1/8 + 1/8 + 1/8 = 3/8.
The probability that the student will get at least two questions
correct is
P(CCC) + P(CCI) + P(CIC) + P(ICC) = 1/8 + 1/8 + 1/8 + 1/8
= 1/2.
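The two calculations in Example 3.4 can be checked by listing the sample space and counting outcomes. The sketch below assumes the 8 outcomes are equally likely, as stated for the unprepared student; the helper name `probability` is illustrative and not part of the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space for three true-false questions: C = correct, I = incorrect.
sample_space = ["".join(outcome) for outcome in product("CI", repeat=3)]

def probability(event):
    """P(event) = favorable outcomes / total outcomes (equally likely case)."""
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

p_exactly_two = probability(lambda o: o.count("C") == 2)
p_at_least_two = probability(lambda o: o.count("C") >= 2)

print(sample_space)
print("P(exactly two correct) =", p_exactly_two)
print("P(at least two correct) =", p_at_least_two)
```

Using `Fraction` keeps the results exact, so they can be compared directly with 3/8 and 1/2.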
Relative Frequency Method (the frequency interpretation of probability)
Let E be an outcome of an experiment.
If the experiment is performed many times, P(E) is the
relative frequency of E.
P(E) is the percentage of times E occurs in many repetitions
of the experiment.
Use sampled or historical data to calculate probabilities.
Example 3.5
Suppose that of 1000 randomly selected consumers, 140
preferred brand X.
The probability of randomly picking a person who prefers
brand X is
140/1000 = 0.14 or 14%.
Long-Run Relative Frequency Method: Examples
1. An accounts receivable manager knows from
past data that about 70 of 1000 accounts became
uncollectible.
The manager would estimate the probability of bad
debts as 70/1000 = .07 or 7%.
2. Tossing a fair coin 3000 times, we can see that
although the proportion of heads was far from 0.5 in
the first 100 tosses, it seemed to stabilize and
approach 0.5 as the number of tosses increased.
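The stabilizing behavior described in the coin example can be reproduced with a small simulation. This is a sketch only: the pseudo-random generator, the seed, and the function name `running_proportion_of_heads` are assumptions made for illustration, and a different seed gives a different path toward 0.5.

```python
import random

def running_proportion_of_heads(n_tosses, seed=0):
    """Simulate fair coin tosses and record the proportion of heads after each toss."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        if rng.random() < 0.5:  # treat values below 0.5 as heads
            heads += 1
        proportions.append(heads / toss)
    return proportions

proportions = running_proportion_of_heads(3000)
for n in (10, 100, 1000, 3000):
    print(n, "tosses: proportion of heads =", round(proportions[n - 1], 3))
```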
Long-Run Relative Frequency Method:
application
Often we determine the probability from a random
sample (Long-Run Relative Frequency Method) and
apply it to the population.
Of 5528 Zhuhai residents randomly sampled,
445 prefer to watch CCTV-1
Estimated Share P(CCTV-1) = 445 / 5528 = 0.0805
So the probability that any Zhuhai resident chosen at
random
prefers CCTV-1 is 0.0805
Assuming total population in Zhuhai is 1,000,000 :
Size of audience in the city = Population x Share
so 1,000,000 x 0.0805 = 80,500
Subjective Probability
Using experience, intuitive judgment, or personal
expertise to assess/derive a probability
May or may not have relative frequency
interpretation (Some events cannot be repeated many
times)
Contains a high degree of personal bias.
What is the probability that your favorite basketball
or football team will win the next game? (e.g. sports
betting)
Subjective Probability and Betting
The odds in betting reflect the subjective
probability held by the betting public.
How much would you be willing to pay for a ticket
that is worth $10 if there is life on Mars and
nothing if there is not?
Subjective probability usually reflects
opinion more than reality.
Sometimes it is used to gauge public
opinion.
Section 3.3 Some Elementary
Probability Rules
The complement Ā of an event A is
the set of all sample space outcomes
not in A. Further, P(Ā) = 1 − P(A).
These figures are “Venn diagrams”.
Union of A and B, A ∪ B
Is an event consisting of the outcomes that
belong to either A or B (or both).
Intersection of A and B, A ∩ B
Is an event consisting of the outcomes that
belong to both A and B.
The Addition Rule
The probability that A or B (the union of A and B) will
occur is P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
where P(A ∩ B) is the “joint” probability of A and B,
i.e., both occurring.
A and B are mutually exclusive
if they have no sample space outcomes in
common, or equivalently, if P(A ∩ B) = 0.
If A and B are mutually exclusive, then
P(A ∪ B) = P(A) + P(B).
Example 3.7
Newspaper Subscribers #1
Define events:
A = event that a randomly selected household subscribes
to the Atlantic Journal.
B = event that a randomly selected household subscribes
to the Beacon News.
Given:
total number in city, N = 1,000,000
number subscribing to A, N(A) = 650,000
number subscribing to B, N(B) = 500,000
number subscribing to both, N(A∩B) = 250,000
Newspaper Subscribers #2
Use the relative frequency method to assign
probabilities
P(A) = 650,000 / 1,000,000 = 0.65
P(B) = 500,000 / 1,000,000 = 0.50
P(A ∩ B) = 250,000 / 1,000,000 = 0.25
Table 3.1 A Contingency Table of Subscription Data
for the Atlantic Journal and the Beacon News
Events                                      Subscribes to     Does Not Subscribe to    Total
                                            Beacon News, B    Beacon News, B̄
Subscribes to Atlantic Journal, A           250,000           400,000                  650,000
Does Not Subscribe to Atlantic Journal, Ā   250,000           100,000                  350,000
Total                                       500,000           500,000                  1,000,000
Newspaper Subscribers #3
Refer to the contingency table in Table 3.1 for
all probabilities
For example, the chance that a household does
not subscribe to either newspaper
Calculate P(Ā ∩ B̄), so from middle row
and middle column of Table 3.1,
P(Ā ∩ B̄) = 100,000 / 1,000,000 = 0.10.
Newspaper Subscribers #4
The chance that a household subscribes to either newspaper:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= 0.65 + 0.50 − 0.25
= 0.90.
Note that if the joint probability was not subtracted, then
we would have gotten 1.15, greater than 1, which is
absurd.
Note: The subtraction avoids double counting the joint
probability.
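The newspaper calculation can be reproduced from the raw counts in Example 3.7. The sketch below applies the addition rule and the complement rule; the variable names are illustrative.

```python
from fractions import Fraction

# Counts given in Example 3.7.
N = 1_000_000          # households in the city
n_A = 650_000          # subscribe to the Atlantic Journal
n_B = 500_000          # subscribe to the Beacon News
n_A_and_B = 250_000    # subscribe to both

p_A = Fraction(n_A, N)
p_B = Fraction(n_B, N)
p_A_and_B = Fraction(n_A_and_B, N)

# Addition rule: subtract the joint probability to avoid double counting.
p_A_or_B = p_A + p_B - p_A_and_B

# Complement rule: subscribing to neither paper is the complement of A or B.
p_neither = 1 - p_A_or_B

print("P(A or B) =", float(p_A_or_B))
print("P(neither) =", float(p_neither))
```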
A Mutually Exclusive Case
Recall the radio station example. The percentages of LA
residents who favor each of the top 10 stations are listed in the
figure. Let the name of each station, for example KPWR,
represent the event that the station is the
favorite radio station of a randomly selected resident.
Since the survey asked each resident to name the single station
that he/she listens to most, the 10 events are mutually exclusive.
Therefore, the probability that a randomly selected LA resident
would favor one of the top 10 stations is
P(KPWR ∪ KLAX ∪ … ∪ KCBS-FM)
= P(KPWR) + P(KLAX) + … + P(KCBS-FM)
= 0.08 + 0.064 + … + 0.036 = 0.508.
Section 3.4 Conditional
Probability and Independence
The probability of an event A, given that the event B
has occurred, is called the “conditional probability
of A given B” and is denoted as P(A|B).
Further,
P(A|B) = P(A ∩ B) / P(B)
Assume that P(B) is greater than 0.
Interpretation: Restrict the sample space to just event
B. The conditional probability P(A|B) is the chance
of event A occurring in this new sample space.
Similarly, if A occurred, then what is the chance of
B occurring?
To answer this question, we need to introduce the
probability of event B, given that the event A has
occurred, i.e., the conditional probability of B
given A, denoted by P(B|A).
P(B|A) = P(A ∩ B) / P(A)
Assume that P(A) is greater than 0.
Newspaper Subscribers
Of the households that subscribe to the
Atlantic Journal, what is the chance that they also
subscribe to the Beacon News?
Calculate P(B|A), where
P(B|A) = P(A ∩ B) / P(A)
= 0.25 / 0.65 = 0.3846.
Independence of Events
Two events A and B are said to be independent
if and only if P(A|B) = P(A) or, equivalently,
P(B|A) = P(B).
That is, if the chance of event A occurring is not
influenced by whether the event B occurs and
vice versa; or if the occurrences of the events A
and B have nothing to do with each other, then A
and B are independent.
In fact, if one of the above two equations holds,
so does the other. Why?
Newspaper Subscribers
Of the households that subscribe to the Atlantic
Journal, what is the chance that they also
subscribe to the Beacon News?
If independent, then P(B|A) = P(B).
Is P(B|A) = P(B)?
Know that P(B) = 0.50.
Just calculated that P(B|A) = 0.3846.
0.50 ≠ 0.3846, so P(B|A) ≠ P(B).
B is not independent of A.
A and B are said to be dependent.
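The conditional probability and the independence check for the newspaper example can be sketched as follows. The probabilities are the ones computed earlier in the example; the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities from the newspaper example.
p_A = Fraction(65, 100)        # subscribes to the Atlantic Journal
p_B = Fraction(50, 100)        # subscribes to the Beacon News
p_A_and_B = Fraction(25, 100)  # subscribes to both

# Conditional probabilities (both denominators are greater than 0).
p_B_given_A = p_A_and_B / p_A
p_A_given_B = p_A_and_B / p_B

# A and B are independent if and only if P(B|A) = P(B).
independent = (p_B_given_A == p_B)

print("P(B|A) =", round(float(p_B_given_A), 4))
print("P(A|B) =", float(p_A_given_B))
print("Independent?", independent)
```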
The Multiplication Rule
The joint probability that A and B (the intersection
of A and B) will occur is
P(A ∩ B) = P(A) P(B|A)
= P(B) P(A|B).
If A and B are independent, then the probability
that A and B (the intersection of A and B) will
occur is
P(A ∩ B) = P(A) P(B) = P(B) P(A).
A Question
Suppose in the following contingency table, where the
numbers represent probabilities, some data are lost.
1.Can you recover the missing data?
2.Are events R and C independent?
           R      R̄      Total
C          .4     ?      ?
C̄          ?      .3     .5
Total      .6     ?      1.00
Contingency Tables
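           R      R̄      Total
C          .4     .1     .5
C̄          .2     .3     .5
Total      .6     .4     1.00
Marginal probabilities such as P(R) and P(C) appear in the Total row and
column; joint probabilities such as P(R ∩ C) appear in the interior cells.
As P(R ∩ C) = 0.4
and P(R) P(C) = 0.6 × 0.5 = 0.3,
P(R ∩ C) ≠ P(R) P(C), so
the events R and C are dependent.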
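One way to recover the missing cells is to use the fact that each row and column of a contingency table must add up to its total. The sketch below starts from the four known entries (.4, .3, .5, .6) and fills in the rest; the variable names are illustrative.

```python
from fractions import Fraction as F

# Known entries of the table.
p_R_and_C = F(4, 10)        # P(R and C)
p_notR_and_notC = F(3, 10)  # P(not R and not C)
p_notC = F(5, 10)           # row total for not C
p_R = F(6, 10)              # column total for R

# Complement rule gives the other two totals.
p_C = 1 - p_notC
p_notR = 1 - p_R

# Each row or column total is the sum of its joint probabilities.
p_notR_and_C = p_C - p_R_and_C
p_R_and_notC = p_R - p_R_and_C

# Independence would require P(R and C) = P(R) P(C).
independent = (p_R_and_C == p_R * p_C)

print("P(not R and C) =", float(p_notR_and_C))
print("P(R and not C) =", float(p_R_and_notC))
print("P(C) =", float(p_C), " P(not R) =", float(p_notR))
print("Independent?", independent)
```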
Chapter 4 Discrete Random Variables
Two Types of Random Variables
Discrete Probability Distributions
The Binomial Distribution
The Poisson Distribution
Random Variables
A random variable is a variable that assumes numerical
values that are determined by the outcome of an
experiment.
Example 4.1 Consider a random experiment in which a coin
is tossed three times. Let X be the number of heads. Let H
represent the outcome of a head and T the outcome of a tail.
The possible outcomes for such an experiment:
TTT, TTH, THT, THH, HTT, HTH, HHT, HHH
Thus the possible values of X (number of heads) are 0,1,2,3.
From the definition of a random variable, X as defined in this
experiment, is a random variable.
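A random variable can be viewed as a function that maps each outcome to a number. The sketch below lists the outcomes of Example 4.1 and applies X to each one. The probabilities at the end assume a fair coin, which the example does not state explicitly, so they are an added assumption.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 8 outcomes of tossing a coin three times: T = tail, H = head.
outcomes = ["".join(tosses) for tosses in product("TH", repeat=3)]

def X(outcome):
    """The random variable X: number of heads in the outcome."""
    return outcome.count("H")

possible_values = sorted({X(o) for o in outcomes})

# Assumption: a fair coin, so all 8 outcomes are equally likely.
counts = Counter(X(o) for o in outcomes)
distribution = {x: Fraction(counts[x], len(outcomes)) for x in possible_values}

print(outcomes)
print("Possible values of X:", possible_values)
print("Distribution of X (fair coin):", distribution)
```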
Section 4.1 Two Types of Random Variables
Discrete random variable: Possible
values can be counted or listed
- For example, the number of TV sets sold at the store in
one day. Here x could be 0, 1, 2, 3, 4 and so forth.
Continuous random variable: May
assume any numerical value in one or more intervals
- For example, the waiting time for a credit card
authorization, the interest rate charged on a business
loan
Example: Two Types of Random Variables
Question                      Random Variable x                          Type
Family size                   x = Number of people in family             Discrete
                              reported on tax return
Distance from home to store   x = Distance in miles from home            Continuous
                              to a store
Own dog or cat                x = 1 if own no pet;                       Discrete
                                = 2 if own dog(s) only;
                                = 3 if own cat(s) only;
                                = 4 if own dog(s) and cat(s)
Section 4.2 Discrete Probability Distributions
The probability distribution of a discrete random
variable is a table, graph, or formula that gives the
probability associated with each possible value that the
variable can assume
Notation: Denote the values of the random variable by x
and the value’s associated probability by p(x)
Properties
1. For any value x of the random variable, p(x) ≥ 0
2. The probabilities of all the possible values of the random variable must
sum to 1, that is,
Σ p(x) = 1, summed over all x
Example 4.2 Number of Radios (Sold at Sound City in a
Week)
Let x be the random variable of the number of radios sold per
week; x has values x = 0, 1, 2, 3, 4, 5
Given sales history over past 100 weeks
Let f(x) be the number of weeks (of the past 100) during which
x radios were sold
Records tell us that
f(0) = 3: No radios were sold in 3 of the weeks
f(1) = 20: One radio was sold in 20 of the weeks
f(2) = 50: Two radios were sold in 50 of the weeks
f(3) = 20: Three radios were sold in 20 of the weeks
f(4) = 5: Four radios were sold in 5 of the weeks
f(5) = 2: Five radios were sold in 2 of the weeks
No more than five radios were sold in any of the past 100 weeks
Frequency distribution of sales history over past 100 weeks
# Radios, x   Frequency    Relative Frequency   Probability, p(x)
0             f(0) = 3     3/100 = 0.03         p(0) = 0.03
1             f(1) = 20    20/100 = 0.20        p(1) = 0.20
2             f(2) = 50    0.50                 p(2) = 0.50
3             f(3) = 20    0.20                 p(3) = 0.20
4             f(4) = 5     0.05                 p(4) = 0.05
5             f(5) = 2     0.02                 p(5) = 0.02
Total         100          1.00                 1.00
Interpret the relative frequencies
as probabilities, assuming that sales
remain stable over time
So for any value x, f(x)/n = p(x)
What is the chance that two radios will be sold in a week?
p(x = 2) = 0.50
What is the chance that fewer than 2 radios will be sold in a
week?
p(x < 2) = p(x = 0 or x = 1)
= p(x = 0) + p(x = 1)
= 0.03 + 0.20 = 0.23
Using the addition rule for the mutually exclusive values of
the random variable.
What is the chance that three or more radios will be sold in a
week?
p(x ≥ 3) = p(x = 3, 4, or 5)
= p(x = 3) + p(x = 4) + p(x = 5)
= 0.20 + 0.05 + 0.02 = 0.27
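The three questions above can be answered directly from the weekly sales frequencies. The sketch below turns the frequencies into probabilities and sums them over the values in each event; the helper name `prob` is illustrative.

```python
from fractions import Fraction

# Weekly sales frequencies over the past 100 weeks (Example 4.2).
frequencies = {0: 3, 1: 20, 2: 50, 3: 20, 4: 5, 5: 2}
n_weeks = sum(frequencies.values())

# Relative frequencies interpreted as probabilities: p(x) = f(x) / n.
p = {x: Fraction(f, n_weeks) for x, f in frequencies.items()}

def prob(event):
    """Sum p(x) over the mutually exclusive values x that satisfy the event."""
    return sum(p[x] for x in p if event(x))

p_two = prob(lambda x: x == 2)
p_fewer_than_two = prob(lambda x: x < 2)
p_three_or_more = prob(lambda x: x >= 3)

print("p(x = 2)  =", float(p_two))
print("p(x < 2)  =", float(p_fewer_than_two))
print("p(x >= 3) =", float(p_three_or_more))
```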
Expected Value of a Discrete Random Variable
The mean or expected value of a discrete
random variable X is:
μ_X = Σ x p(x), summed over all x
μ_X is the value expected to occur in the long run
and on average
Example 4.3
Number of Radios
How many radios should be expected to be sold in a week?
Calculate the expected value of the number of radios
sold, μ_X
Radios, x   Probability, p(x)   x p(x)
0           p(0) = 0.03         0 × 0.03 = 0.00
1           p(1) = 0.20         1 × 0.20 = 0.20
2           p(2) = 0.50         2 × 0.50 = 1.00
3           p(3) = 0.20         3 × 0.20 = 0.60
4           p(4) = 0.05         4 × 0.05 = 0.20
5           p(5) = 0.02         5 × 0.02 = 0.10
Total       1.00                μ_X = 2.10
• On average, expect to sell 2.1 radios per week
Variance and Standard Deviation
The variance of a discrete random variable is:
σ_X^2 = Σ (x − μ_X)^2 p(x), summed over all x
• The variance is the average of the squared deviations of the
different values of the random variable from the expected
value
The standard deviation is the square root of the variance
σ_X = √(σ_X^2)
• The variance and standard deviation measure the spread of the
values of the random variable from their expected value
Example 4.4
Number of Radios
Calculate the variance and standard deviation of the number
of radios sold at Sound City in a week
Radios, x   Probability, p(x)   (x − μ_X)^2 p(x)
0           p(0) = 0.03         (0 – 2.1)^2 (0.03) = 0.1323
1           p(1) = 0.20         (1 – 2.1)^2 (0.20) = 0.2420
2           p(2) = 0.50         (2 – 2.1)^2 (0.50) = 0.0050
3           p(3) = 0.20         (3 – 2.1)^2 (0.20) = 0.1620
4           p(4) = 0.05         (4 – 2.1)^2 (0.05) = 0.1805
5           p(5) = 0.02         (5 – 2.1)^2 (0.02) = 0.1682
Total       1.00                0.8900
Variance: σ_X^2 = 0.89
Standard deviation: σ_X = √0.89 = 0.9434
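The mean, variance, and standard deviation for the radio example can be computed straight from the definitions. This is a small sketch using floating-point arithmetic, so the results are compared after rounding.

```python
import math

# Probability distribution of weekly radio sales (Example 4.2).
p = {0: 0.03, 1: 0.20, 2: 0.50, 3: 0.20, 4: 0.05, 5: 0.02}

# Expected value: sum of x * p(x) over all x.
mean = sum(x * px for x, px in p.items())

# Variance: sum of squared deviations from the mean, weighted by p(x).
variance = sum((x - mean) ** 2 * px for x, px in p.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print("mean =", round(mean, 2))
print("variance =", round(variance, 4))
print("standard deviation =", round(std_dev, 4))
```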
The Binomial Distribution
The Binomial Experiment:
1. Experiment consists of n identical trials
2. Each trial results in either “success” or “failure”
3. Probability of success, p, is constant from trial to trial
4. Trials are independent
Note: The probability of failure, q, is 1 – p and is constant
from trial to trial
If x is the total number of successes in n trials of a binomial
experiment, then x is a binomial random variable
The Binomial Distribution #2
For a binomial random variable x, the probability of x
successes in n trials is given by the binomial distribution:
p(x) = [n! / (x!(n − x)!)] p^x q^(n−x)
• Note: n! is read as “n factorial” and n! = n × (n-1) × (n-2)
× ... × 1
– For example, 5! = 5 × 4 × 3 × 2 × 1 = 120
• Also, 0! =1
• Factorials are not defined for negative numbers or fractions
The Binomial Distribution #3
• What does the equation mean?
– The equation for the binomial distribution consists of
the product of two factors
p(x) = [n! / (x!(n − x)!)] × p^x q^(n−x)
n! / (x!(n − x)!): Number of ways to get x successes and
(n–x) failures in n trials
p^x q^(n−x): The chance of getting x successes and (n–x)
failures in a particular arrangement
Example 4.5
Incidence of Nausea
The company claims that, at most, 10 percent of all
patients treated with Phe-Mycin would experience nausea
as a side effect of taking the drug.
x = number of patients who will experience nausea
following treatment with Phe-Mycin out of the 4 patients
tested
Find the probability that 2 of the 4 patients treated will
experience nausea
Given: n = 4, p = 0.1, with x = 2
Then: q = 1 – p = 1 – 0.1 = 0.9
p(x = 2) = [4! / (2!(4 − 2)!)] (0.1)^2 (0.9)^(4−2)
= 6 (0.1)^2 (0.9)^2 = 0.0486
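The binomial formula can be written as a short function. The sketch below uses `math.comb` for the number of arrangements and reproduces the probability that 2 of the 4 patients experience nausea; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    ways = comb(n, x)                      # n! / (x! (n - x)!)
    one_arrangement = p**x * q**(n - x)    # chance of one particular arrangement
    return ways * one_arrangement

n, p = 4, 0.1
print("p(x = 2) =", round(binomial_pmf(2, n, p), 4))

# The probabilities over x = 0, ..., n should sum to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print("sum of all probabilities =", round(total, 10))
```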
Binomial Distribution (n = 4, p = 0.1)
Binomial Probability Table
(Appendix Table A.1, P817)
Table 4.7(a) for n = 4, with x = 2 and p = 0.1
values of p (.05 to .50)
x     0.05     0.10     0.15     …    0.50
0     0.8145   0.6561   0.5220   …    0.0625    4
1     0.1715   0.2916   0.3685   …    0.2500    3
2     0.0135   0.0486   0.0975   …    0.3750    2
3     0.0005   0.0036   0.0115   …    0.2500    1
4     0.0000   0.0001   0.0005   …    0.0625    0
      0.95     0.90     0.85     …    0.50      x
values of p
P(x = 2) = 0.0486
Example 4.5
Incidence of Nausea(after Treatment)
x = number of patients who will experience nausea
following treatment with Phe-Mycin out of the 4
patients tested
Find the probability that at least 3 of the 4 patients treated
will experience nausea
Given: n = 4, p = 0.1, so q = 1 – p = 1 – 0.1 = 0.9
Then:
p(x ≥ 3) = p(x = 3 or 4)
= p(x = 3) + p(x = 4)
= 0.0036 + 0.0001 = 0.0037
Using the addition rule for the mutually exclusive values of
the binomial random variable
Rare Events
Suppose at least three of four sampled patients
actually did experience nausea following treatment
If p = 0.1 is believed, then there is a chance of only 37 in
10,000 of observing this result
So this is very unlikely!
But it actually occurred
So, this is very strong evidence that p does not equal 0.1
There is very strong evidence that p is actually greater
than 0.1
Several Binomial Distributions
Mean and Variance of a Binomial Random
Variable
If x is a binomial random variable with
parameters n and p (so q = 1 – p), then
mean μ_X = np
variance σ_X^2 = npq
standard deviation σ_X = √(npq)
Back to Example 4.5
Of 4 randomly selected patients, how many should be
expected to experience nausea after treatment?
Given: n = 4, p = 0.1
Then μ_X = np = 4 × 0.1 = 0.4
So expect 0.4 of the 4 patients to experience nausea
If at least three of four patients experienced nausea,
this would be many more than the 0.4 that are
expected
Binomial Distribution EXAMPLE:
Pat Statsdud is registered in a statistics course and intends to
rely on luck to pass the next quiz.
The quiz consists of 10 multiple choice questions with 5
possible choices for each question, only one of which is the
correct answer.
Pat will guess the answer to each question
Find the following probabilities
Pat gets no answer correct
Pat gets two answers correct
Pat fails the quiz
If all the students in Pat’s class intend to guess the
answers to the quiz, what are the mean and the standard
deviation of the quiz mark?
Solution
Checking the conditions
An answer can be either correct or incorrect.
There is a fixed finite number of trials (n=10)
Each answer is independent of the others.
The probability p of a correct answer (.20) does not
change from question to question.
Determining the binomial probabilities:
Let X = the number of correct answers
P(X = 0) = [10! / (0!(10 − 0)!)] (.20)^0 (.80)^(10−0) = .1074
P(X = 2) = [10! / (2!(10 − 2)!)] (.20)^2 (.80)^(10−2) = .3020
Determining the binomial probabilities:
Pat fails the test if the number of correct answers is less
than 5, which means less than or equal to 4.
P(X ≤ 4) = p(0) + p(1) + p(2) + p(3) + p(4)
= .1074 + .2684 + .3020 + .2013 + .0881
= .9672
The mean and the standard deviation of the quiz mark:
μ = np = 10(.2) = 2.
σ = [np(1 − p)]^(1/2) = [10(.2)(.8)]^(1/2) = 1.26
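The quiz results can be reproduced in a few lines. The sketch below assumes, as the solution does, that failing means fewer than 5 correct answers; the names `pmf`, `p_none`, `p_two`, and `p_fail` are illustrative.

```python
from math import comb, sqrt

n, p = 10, 0.20  # 10 questions, 1 correct choice out of 5

def pmf(x):
    """Binomial probability of exactly x correct answers."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_none = pmf(0)
p_two = pmf(2)

# Failing means 4 or fewer correct: add the mutually exclusive cases.
p_fail = sum(pmf(x) for x in range(5))

mean = n * p
std_dev = sqrt(n * p * (1 - p))

print("P(X = 0)  =", round(p_none, 4))
print("P(X = 2)  =", round(p_two, 4))
print("P(X <= 4) =", round(p_fail, 4))
print("mean =", mean, " standard deviation =", round(std_dev, 2))
```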
The Poisson Distribution
Consider the number of times an event occurs over
an interval of time or space, and assume that
1. The probability of occurrence is the same for any
intervals of equal length
2. The occurrence in any interval is independent of an
occurrence in any non-overlapping interval
If x = the number of occurrences in a specified
interval, then x is a Poisson random variable
The Poisson Distribution Continued
Suppose μ is the mean or expected number of
occurrences during a specified interval
The probability of x occurrences in the interval when
μ are expected is described by the Poisson distribution:
p(x) = e^(−μ) μ^x / x!
where x can take any of the values x = 0, 1, 2, 3, …
and e = 2.71828… (e is the base of the natural logs)
Example 4.6
ATC Center Errors
Suppose that an air traffic control (ATC) center has been
averaging 20.8 errors per year and lately the center
experiences 3 errors in a week.
Let x be the number of errors made by the ATC center
during one week
Given: μ = 20.8 errors per year
Then: μ = 0.4 errors per week
• Because there are 52 weeks per year, μ for a week is:
μ = (20.8 errors/year) / (52 weeks/year) = 0.4
errors/week
ATC Center Errors Continued
Find the probability that 3 errors (x =3) will occur in a
week
– Want p(x = 3) when μ = 0.4
p(x = 3) = e^(−0.4) (0.4)^3 / 3! = 0.0072
Find the probability that no errors (x = 0) will occur in
a week
– Want p(x = 0) when μ = 0.4
p(x = 0) = e^(−0.4) (0.4)^0 / 0! = 0.6703
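The Poisson probabilities for the ATC example can be checked with a direct implementation of the formula. In this sketch the function name `poisson_pmf` is illustrative, and the weekly mean is obtained by dividing the yearly mean by 52 as in the example.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Probability of x occurrences when mu occurrences are expected."""
    return exp(-mu) * mu**x / factorial(x)

errors_per_year = 20.8
mu_week = errors_per_year / 52  # mean number of errors per week

p_three = poisson_pmf(3, mu_week)
p_zero = poisson_pmf(0, mu_week)

print("mu per week =", round(mu_week, 4))
print("p(x = 3) =", round(p_three, 4))
print("p(x = 0) =", round(p_zero, 4))
```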
Poisson Probability Table
(Appendix Table A.2, P821)
μ, Mean Number of Occurrences
x     0.1      0.2      …    0.4      …    1.00
0     0.9048   0.8187   …    0.6703   …    0.3679
1     0.0905   0.1637   …    0.2681   …    0.3679
2     0.0045   0.0164   …    0.0536   …    0.1839
3     0.0002   0.0011   …    0.0072   …    0.0613
4     0.0000   0.0001   …    0.0007   …    0.0153
5     0.0000   0.0000   …    0.0001   …    0.0031
For μ = 0.4: p(x = 3) = e^(−0.4) (0.4)^3 / 3! = 0.0072
Poisson Distribution (μ = 0.4)
Mean and Variance of a Poisson Random Variable
If x is a Poisson random variable with parameter
μ, then
mean μ_X = μ
variance σ_X^2 = μ
standard deviation σ_X = √μ
Several Poisson Distributions
Back to Example 4.6
In the ATC center situation, 20.8 errors occurred on
average per year
Assume that the number x of errors during any span of
time follows a Poisson distribution for that time span
Per week, the parameters of the Poisson distribution
are:
• mean μ = 0.4 errors/week
• standard deviation σ = 0.6325 errors/week
• Because σ = √μ = √0.4 = 0.6325
Poisson Distribution Example
Customers arrive at a rate of 72 per
hour. What is the probability of 4
customers arriving in 3 minutes?
Solution:
72 per hr. = 1.2 per min.
= 3.6 per 3 mins.
p(x) = e^(−μ) μ^x / x!
p(x = 4) = e^(−3.6) (3.6)^4 / 4! = 0.1912
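The key step in the customer example is rescaling the hourly rate to the 3-minute interval before applying the Poisson formula. The sketch below makes that step explicit; the function name `prob_arrivals` and its parameters are illustrative.

```python
from math import exp, factorial

def prob_arrivals(x, rate_per_hour, minutes):
    """Poisson probability of x arrivals in an interval of the given length."""
    mu = rate_per_hour * minutes / 60  # expected arrivals in the interval
    return exp(-mu) * mu**x / factorial(x)

# 72 customers per hour is 3.6 customers per 3 minutes.
p_four = prob_arrivals(4, rate_per_hour=72, minutes=3)
print("p(x = 4) =", round(p_four, 4))
```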