X - markymaths


International Baccalaureate Higher Level

Discrete random variables

Learning outcomes
This work will help you to learn
• about probability distributions for discrete random variables
• how to calculate and use E(X), the expectation (mean)
• how to calculate and use E[g(X)], the expectation of a simple function of X
• how to calculate and use Var(X), the variance of X
• about the cumulative distribution function F(x)
• about the binomial and Poisson distributions

When a variable is discrete, it is possible to specify or describe all its
possible numerical values, for example
• the number of females in a group of four students: the possible values
are 0, 1, 2, 3, 4,
• the amount gained, in pence, in a game where the entry fee is 10 p
and the prizes are 50 p and £1: the possible values are −10, 40, 90,
• the number of times you throw a die until a six appears: the possible
values are 1, 2, 3, 4, 5, … to infinity.
Probability Distributions
Consider this situation:
By mistake, three faulty fuses are put into a box containing two good
fuses. The faulty and good fuses become mixed up and are
indistinguishable by sight. You take two fuses from the box. What is
the probability that you take
a. no faulty fuses,
b. one faulty fuse,
c. two faulty fuses.
[Tree diagram: first fuse, then second fuse; F = faulty, F' = good]

Outcome   Probability                                               Number faulty
F, F      $P(F \cap F) = \frac{3}{5} \times \frac{2}{4} = 0.3$       2
F, F'     $P(F \cap F') = \frac{3}{5} \times \frac{2}{4} = 0.3$      1
F', F     $P(F' \cap F) = \frac{2}{5} \times \frac{3}{4} = 0.3$      1
F', F'    $P(F' \cap F') = \frac{2}{5} \times \frac{1}{4} = 0.1$     0

a. P(no faulty fuses) = 0.1
b. P(one faulty fuse) = 0.3 + 0.3 = 0.6
c. P(two faulty fuses) = 0.3
The variable being considered here is ‘the number of faulty fuses’ and
it can be denoted by X. Then the answers to the previous questions can
be written as
a. $P(X = 0) = 0.1$
b. $P(X = 1) = 0.6$
c. $P(X = 2) = 0.3$
or placed in a table

x          0     1     2
P(X = x)   0.1   0.6   0.3

Since $P(X = 0) + P(X = 1) + P(X = 2) = 1$,
X is a discrete random variable.
For a discrete random variable, the sum of the probabilities is 1:
$\sum_{\text{all } x} P(X = x) = 1$
The function that allocates the probabilities, P(X = x), is
known as the probability density function of X (p.d.f. of X).
Two tetrahedral dice, each with faces labelled 1, 2, 3 and 4, are
thrown and the score is noted, where the score is the sum of the two
numbers on which the dice land. Find the probability density function
(p.d.f.) of X, where X is the random variable ‘the score when the two
dice are thrown’.
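One way to check a hand-built table for this question is to enumerate the 16 equally likely outcomes. The sketch below assumes both dice are fair; the names `faces`, `counts` and `pdf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4]

# Count how many of the 16 equally likely (first die, second die) pairs give each score.
counts = Counter(a + b for a, b in product(faces, repeat=2))
total = len(faces) ** 2

# p.d.f. of X as exact fractions, keyed by score.
pdf = {score: Fraction(n, total) for score, n in sorted(counts.items())}

for score, prob in pdf.items():
    print(f"P(X = {score}) = {prob}")
```

The probabilities rise from the score 2 to the score 5 and then fall symmetrically to the score 8, and they sum to 1 as required.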
The p.d.f. of a discrete random variable Y is given by $P(Y = y) = cy^2$,
for y = 0, 1, 2, 3, 4. Given that c is a constant, find the value of c.
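Since the probabilities must sum to 1, c is the reciprocal of the sum of the y² values. A minimal sketch of that step, using exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

ys = [0, 1, 2, 3, 4]

# The probabilities c*y^2 must sum to 1, so c = 1 / (sum of y^2).
c = Fraction(1, sum(y**2 for y in ys))

# The resulting distribution of Y.
probs = {y: c * y**2 for y in ys}

print("c =", c)
print(probs)
```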
The discrete random variable W has probability distribution as shown

w          -3    -2     -1    0      1
P(W = w)   0.1   0.25   0.3   0.15   d

Find
a. the value of d
b. $P(-3 \le W < 0)$
c. $P(W > -1)$
d. $P(-1 < W < 1)$
e. the mode
Expectation of X, E(X)
E(X) is read as ‘E of X’ and it gives an average or typical value of X, known
as the expected value or expectation of X. This is comparable with the
mean in descriptive statistics.
Experimental approach
The frequency distribution shows the results when an unbiased die
is thrown 120 times.
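Score, x       1    2    3    4    5    6    Total
Frequency, f   15   22   23   19   23   18   120

The mean score
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1 \times 15 + 2 \times 22 + 3 \times 23 + 4 \times 19 + 5 \times 23 + 6 \times 18}{120} = 3.6$ (2 s.f.)
You could write this out in a different way
$\bar{x} = 1 \times \frac{15}{120} + 2 \times \frac{22}{120} + 3 \times \frac{23}{120} + 4 \times \frac{19}{120} + 5 \times \frac{23}{120} + 6 \times \frac{18}{120}$
The fractions $\frac{15}{120}, \frac{22}{120}, \ldots, \frac{18}{120}$ are the relative frequencies of the scores
of 1, 2, 3, 4, 5, 6 respectively. Notice that they
are close to $\frac{20}{120} = \frac{1}{6}$.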
Theoretical approach
When an unbiased die is thrown the probability of obtaining a particular
value is $\frac{1}{6}$.
The probability distribution is $P(X = x) = \frac{1}{6}$ for $x = 1, 2, 3, 4, 5, 6$.

Score, x   1     2     3     4     5     6
P(X = x)   1/6   1/6   1/6   1/6   1/6   1/6

The expected mean or expectation of X is obtained by multiplying each
score by its probability, then summing. It is written E(X), so
$E(X) = \sum_{\text{all } x} x P(X = x)$
$E(X) = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} = 3.5$
and $\mu = E(X)$.
A random variable X has probability distribution as shown. Find the
expectation, E(X)

x          -2    -1    0      1     2
P(X = x)   0.3   0.1   0.15   0.4   0.05

Find the expected number of sixes when three fair dice are thrown.
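For the three-dice question, a brute-force check is possible because there are only 6³ = 216 equally likely outcomes. The sketch below builds the distribution of X, the number of sixes, and then applies E(X) = Σ x P(X = x); it assumes fair, independent dice and the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# X = number of sixes among three fair dice; enumerate all 6**3 outcomes.
outcomes = list(product(range(1, 7), repeat=3))

pdf = {}
for rolls in outcomes:
    x = rolls.count(6)
    pdf[x] = pdf.get(x, 0) + Fraction(1, len(outcomes))

# E(X) = sum over all x of x * P(X = x)
expected = sum(x * p for x, p in pdf.items())

print(pdf)
print("E(X) =", expected)
```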
Find the expectation, E(X)

x          1     2     3     4     5
P(X = x)   0.1   0.2   0.4   0.2   0.1

A fruit machine consists of three windows which operate independently.
Each window shows pictures of fruits: lemons, apples, cherries or
bananas. The probability that a window shows a particular fruit is as
follows.
P(lemon) = 0.4    P(cherries) = 0.2    P(apple) = 0.1    P(banana) = 0.3
The rules for playing the game on the fruit machine are:
[Figure: the winning combinations of fruits, some counted in any order, with prizes of £1, 50p, 40p and 80p]
Find the expected gain or loss if you play the game.
Expectation of any function of X, E[g(X)]
The definition of expectation can be extended to any function of X,
such as $10X$, $X^2$, $\frac{1}{X}$, $X - 4$, etc.
In general, if g(X) is any function of the discrete random variable
X, then
$E[g(X)] = \sum_{\text{all } x} g(x) P(X = x)$
For example
$E(10X) = \sum_{\text{all } x} 10x \, P(X = x)$
$E(X^2) = \sum_{\text{all } x} x^2 P(X = x)$
$E\left(\frac{1}{X}\right) = \sum_{\text{all } x} \frac{1}{x} P(X = x)$
$E(X - 4) = \sum_{\text{all } x} (x - 4) P(X = x)$
The random variable X has p.d.f. P(X = x) for x = 1, 2, 3 as shown.

x          1     2     3
P(X = x)   0.1   0.6   0.3

Calculate
a. E(X),
b. E(3),
c. E(5X)
d. E(5X+3)
In general, for constants a and b,
$E(a) = a$
$E(aX) = aE(X)$
$E(aX + b) = aE(X) + b$
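These rules can be checked numerically on the distribution used in the exercise above (x = 1, 2, 3 with probabilities 0.1, 0.6, 0.3). The helper `expectation` is a hypothetical name introduced here; it simply applies E[g(X)] = Σ g(x) P(X = x).

```python
from fractions import Fraction

def expectation(g, pdf):
    """Return E[g(X)] = sum of g(x) * P(X = x) over all x."""
    return sum(g(x) * p for x, p in pdf.items())

# Distribution from the exercise: x = 1, 2, 3 with probabilities 0.1, 0.6, 0.3.
pdf = {1: Fraction(1, 10), 2: Fraction(6, 10), 3: Fraction(3, 10)}

e_x = expectation(lambda x: x, pdf)                  # a. E(X)
e_3 = expectation(lambda x: 3, pdf)                  # b. E(3), a constant
e_5x = expectation(lambda x: 5 * x, pdf)             # c. E(5X)
e_5x_plus_3 = expectation(lambda x: 5 * x + 3, pdf)  # d. E(5X + 3)

print(e_x, e_3, e_5x, e_5x_plus_3)
```

Comparing the values shows E(3) = 3, E(5X) = 5E(X) and E(5X + 3) = 5E(X) + 3, as the general results predict.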
A six-sided die has faces marked with numbers 1, 3, 5, 7, 9 and 11. It
is biased so that the probability of obtaining the number R in a single
roll of the die is proportional to R.
a. Show that the probability distribution of R is given by
$P(R = r) = \frac{r}{36}$, $r = 1, 3, 5, 7, 9, 11$.
b. The die is to be rolled and a rectangle drawn with sides of length 6
cm and R cm. Calculate the expected value of the area of the
rectangle.
c. The die is to be rolled again and a square drawn with sides of length
$24R^{-1}$ cm. Calculate the expected value of the perimeter of the square.

r          1      3      5      7      9      11
P(R = r)   k      3k     5k     7k     9k     11k

Since the probabilities sum to 1, 36k = 1, so k = 1/36.

r          1      3      5      7      9      11
P(R = r)   1/36   3/36   5/36   7/36   9/36   11/36

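A sketch of parts b and c using exact fractions. It reads the side of the square as 24/R cm (that is, 24R⁻¹), which is an interpretation of the flattened text, so the perimeter is 96/R; the variable names are illustrative.

```python
from fractions import Fraction

values = [1, 3, 5, 7, 9, 11]

# P(R = r) = k*r, and the probabilities must sum to 1.
k = Fraction(1, sum(values))
pdf = {r: k * r for r in values}

# b. Area of the rectangle is 6R, so E(area) = sum of 6r * P(R = r).
expected_area = sum(6 * r * p for r, p in pdf.items())

# c. Assuming side length 24/R, the perimeter is 4 * 24 / R.
expected_perimeter = sum(4 * Fraction(24, r) * p for r, p in pdf.items())

print(k, expected_area, expected_perimeter)
```

In part c each term (96/r)(r/36) has the same value, which is why the sum is so simple.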
X is the number of heads obtained when two coins are tossed. Find
a. the expected number of heads,
b. $E(X^2)$,
c. $E(X^2 - X)$.

x          0     1     2
P(X = x)   1/4   1/2   1/4

In general, for two functions of X, g(X) and h(X),
$E[g(X) + h(X)] = E[g(X)] + E[h(X)]$
Variance of X, Var(X)
Remember that variance = (standard deviation)²
Experimental approach
For a frequency distribution with mean $\bar{x}$ the variance $s^2$ is given by
$s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$ or $s^2 = \frac{\sum fx^2}{\sum f} - \bar{x}^2$
Theoretical approach
For a discrete random variable X, with $E(X) = \mu$, the variance is
defined as follows:
$\mathrm{Var}(X) = E\left[(X - \mu)^2\right]$
$\mathrm{Var}(X) = E(X^2 - 2\mu X + \mu^2)$
$= E(X^2) - 2\mu E(X) + E(\mu^2)$
$= E(X^2) - 2\mu^2 + \mu^2$
$= E(X^2) - \mu^2$
The random variable X has probability distribution as shown in the
table:

x          1     2     3     4     5
P(X = x)   0.1   0.3   0.2   0.3   0.1

Find
a. $\mu = E(X)$,
b. $E(X^2)$,
c. $\mathrm{Var}(X)$,
d. $\sigma$, the standard deviation of X.
Two boxes each contain three cards. The first box contains cards
labelled 1, 3 and 5; the second box contains cards labelled 2, 6 and 8.
In a game, a player draws one card at random from each box and his
score, X, is the sum of the numbers on the two cards.
a. Obtain the six possible values of X and find the corresponding
probabilities.
b. Calculate E(X), E(X²) and the variance of X.

                 Second box
                 2     6     8
First box   1    3     7     9
            3    5     9     11
            5    7     11    13

x          3     5     7     9     11    13
P(X = x)   1/9   1/9   2/9   2/9   2/9   1/9

The following results relating to variance are useful.
If a and b are any constants,
$\mathrm{Var}(a) = 0$
$\mathrm{Var}(aX) = a^2 \mathrm{Var}(X)$
$\mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X)$
For example
$\mathrm{Var}(2X) = 2^2 \mathrm{Var}(X) = 4\mathrm{Var}(X)$
$\mathrm{Var}(2X + 3) = 2^2 \mathrm{Var}(X) = 4\mathrm{Var}(X)$
$\mathrm{Var}(5 - X) = (-1)^2 \mathrm{Var}(X) = \mathrm{Var}(X)$
The cumulative distribution function, F(x)
In a frequency distribution, the cumulative frequencies are obtained by
summing all the frequencies up to a particular value.
In the same way, in a probability distribution, the probabilities up to
a certain value are summed to give the cumulative probability. The
cumulative probability function is written F(x).
Consider the following probability distribution.

x          1      2     3     4      5
P(X = x)   0.05   0.4   0.3   0.15   0.1

$F(1) = P(X \le 1) = 0.05$
$F(2) = P(X \le 2) = P(X = 1) + P(X = 2) = 0.05 + 0.4 = 0.45$
$F(3) = P(X \le 3) = 0.75$
$F(4) = P(X \le 4) = 0.9$
$F(5) = P(X \le 5) = 1$
The cumulative distribution function is

x      1      2      3      4     5
F(x)   0.05   0.45   0.75   0.9   1

In general, for the discrete random variable X,
the cumulative distribution function is F(x), where
$F(x) = P(X \le x)$
The discrete random variable X has cumulative distribution function
$F(x) = \frac{x}{6}$ for $x = 1, 2, \ldots, 6$
Write out the probability distribution and suggest what X represents.

x      1     2     3     4     5     6
F(x)   1/6   2/6   3/6   4/6   5/6   1

x          1     2     3     4     5     6
P(X = x)   1/6   1/6   1/6   1/6   1/6   1/6

For a discrete random variable X the cumulative distribution function
F(x) is shown

x      1     2      3      4     5
F(x)   0.2   0.32   0.67   0.9   1

Find
a. $P(X = 3)$,
b. $P(X > 2)$
The binomial distribution
In a particular population, 10% of people have blood type B. If three
people are selected at random from the population, what is the
probability that exactly two of them have blood type B?
[Tree diagram: three people selected in turn; B = type B with probability 0.1, B' = not type B with probability 0.9]
$P(\text{exactly two type B}) = P(B \cap B \cap B') + P(B \cap B' \cap B) + P(B' \cap B \cap B)$
$= 3 \times 0.9 \times 0.1^2$
$= 0.027$
Now consider the situation when eight people are selected. What is the
probability that exactly two of the eight people will have blood type B?
Can you find the probability that exactly two have blood type B in a
randomly selected group of 12 people?
Conditions for binomial model
For a situation to be described using a binomial model
• a finite number, n, of trials is carried out,
• the trials are independent,
• the outcome of each trial is deemed either a success or a failure,
• the probability, p, of a successful outcome is the same for each trial.
The discrete random variable X is the number of successful
outcomes in n trials. Then X is said to follow a binomial distribution
$X \sim B(n, p)$ or $X \sim \mathrm{Bin}(n, p)$
NOTE: The number of trials n and the probability of success p are
both needed in order to describe the distribution completely.
We write P(failure) as q, where $q = 1 - p$.
If $X \sim B(n, p)$, the probability of obtaining r successes in n trials is
$P(X = r) = \binom{n}{r} q^{n-r} p^r$ for $r = 0, 1, 2, 3, \ldots, n$.
At Sellitall Supermarket, 60% of the customers pay by credit card. Find
the probability that, in a randomly selected sample of ten customers,
a. exactly two pay by credit card,
b. more than seven pay by credit card.
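The binomial formula translates directly into code. The sketch below defines a hypothetical helper `binomial_pmf`, checks it against the blood-type tree diagram result (0.027), and then applies it to the supermarket question with X ~ B(10, 0.6), assuming customers pay independently of one another.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): nCr * q^(n-r) * p^r with q = 1 - p."""
    return comb(n, r) * (1 - p) ** (n - r) * p**r

# Blood-type example: exactly two of three people have type B.
p_blood = binomial_pmf(2, 3, 0.1)

# Supermarket example: X ~ B(10, 0.6).
p_exactly_two = binomial_pmf(2, 10, 0.6)
p_more_than_seven = sum(binomial_pmf(r, 10, 0.6) for r in range(8, 11))

print(round(p_blood, 3), round(p_exactly_two, 4), round(p_more_than_seven, 4))
```

"More than seven" means r = 8, 9 or 10, so the three probabilities are added.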
Five independent trials of an experiment are carried out. The
probability of a successful outcome is p and the probability of failure is
1 − p = q.
Write out the probability distribution of X, where X is the number of
successful outcomes in five trials. Comment on your answer.
The random variable X is distributed B(7, 0.2). Find correct to three
decimal places
a. P(X = 3),
b. P(1 < X ≤ 4)
c. P(X > 1)
A box contains a large number of pens. The probability that a pen is
faulty is 0.1. How many pens would you need to select to be more
than 95% certain of picking at least one faulty one?
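One way to approach the pens question is to use P(at least one faulty) = 1 − 0.9ⁿ and increase n until this exceeds 0.95. A minimal sketch, assuming pens are faulty independently of one another:

```python
p_faulty = 0.1

# Increase n until P(at least one faulty) = 1 - 0.9**n is more than 0.95.
n = 1
while 1 - (1 - p_faulty) ** n <= 0.95:
    n += 1

print(n, 1 - (1 - p_faulty) ** n)
```

The same answer can be reached with logarithms by solving 0.9ⁿ < 0.05 and rounding n up to the next whole number.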
Expectation and variance of the binomial distribution
It can be shown that
If $X \sim B(n, p)$ then
$E(X) = np$
and $\mathrm{Var}(X) = npq$
where $q = 1 - p$
The random variable X is B(4, 0.8). Construct the probability
distribution for X and find the expectation and variance.
Verify that $E(X) = np$ and $\mathrm{Var}(X) = npq$.

x          0        1        2        3        4
P(X = x)   0.0016   0.0256   0.1536   0.4096   0.4096

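The verification for B(4, 0.8) can be sketched as follows: build the distribution from the binomial formula, compute E(X) and Var(X) from the table, and compare with np and npq. Variable names are illustrative.

```python
from math import comb

n, p = 4, 0.8
q = 1 - p

# Probability distribution of X ~ B(4, 0.8).
pdf = {r: comb(n, r) * q ** (n - r) * p**r for r in range(n + 1)}

mean = sum(r * pr for r, pr in pdf.items())                    # E(X)
variance = sum(r**2 * pr for r, pr in pdf.items()) - mean**2   # E(X^2) - [E(X)]^2

print({r: round(pr, 4) for r, pr in pdf.items()})
print(round(mean, 4), n * p)
print(round(variance, 4), round(n * p * q, 4))
```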
The probability that it will be a fine day is 0.4. Find the expected
number of fine days in a week and also the standard deviation.
X is B(n, p) with mean 5 and standard deviation 2. Find the values
of n and p.
The mode of the binomial distribution
The mode is the value of X that is most likely to occur. Consider
the following probability sketches.
[Figure: probability sketches of P(X = x) against x for B(7, 0.35), B(4, 0.5), B(9, 0.5), B(6, 0.8) and B(20, 0.25). For B(20, 0.25) the probabilities for x = 13 to 20 are too small to illustrate.]
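• when (n + 1)p is an integer, there are two modes, (n + 1)p − 1 and (n + 1)p; this includes the case p = 0.5 with n odd, and also B(4, 0.8), where P(X = 3) = P(X = 4) = 0.4096,
• otherwise the distribution has one mode, the largest integer below (n + 1)p.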
The mode can be found by calculating all the probabilities and finding the
value of X with the highest probability. Without a GDC this can be very
tedious; it is usually only necessary to consider the probabilities of
values of X close to the mean np.
The probability that a student is awarded a distinction in the
Mathematics examination is 0.05. In a randomly selected group of 50
students, what is the most likely number of students awarded a
distinction?
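A sketch of the "calculate the probabilities and pick the largest" approach for this question, taking X ~ B(50, 0.05) and assuming students are awarded distinctions independently. The mean is np = 2.5, so only values near 2.5 really need checking; the code searches every value for safety.

```python
from math import comb

n, p = 50, 0.05

def pmf(r):
    """P(X = r) for X ~ B(50, 0.05)."""
    return comb(n, r) * (1 - p) ** (n - r) * p**r

mean = n * p

# Probabilities of the values close to the mean.
for r in range(0, 6):
    print(r, round(pmf(r), 4))

# The mode is the value with the highest probability.
mode = max(range(n + 1), key=pmf)
print("mean =", mean, "mode =", mode)
```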
The Poisson distribution
Consider these random variables
• the number of emergency calls received by an ambulance
control in an hour,
• the number of vehicles approaching an expressway toll bridge
in a five minute interval,
• the number of flaws in a metre length of material
• the number of white corpuscles on a slide.
Assuming these occur randomly, they are all examples of variables
that can be modelled using a Poisson distribution.
Conditions for Poisson model
• Events occur singly at random in a given interval of time or space.
• λ, the mean number of occurrences in the given interval, is known and
is finite.
The variable X is the number of occurrences in the given interval.
If the above conditions are satisfied, X is said to follow a Poisson
distribution, written
$X \sim \mathrm{Po}(\lambda)$ where $P(X = x) = e^{-\lambda} \frac{\lambda^x}{x!}$ for $x = 0, 1, 2, 3, \ldots$ to infinity
A student finds that the average number of amoebas in 10 ml of pond
water from a particular pond is four. Assuming that the number of
amoebas follows a Poisson distribution, find the probability that in a 10
ml sample
a. there are exactly five amoebas,
b. there are no amoebas,
c. there are fewer than three amoebas.
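The Poisson probabilities for the amoeba question can be sketched with a small helper. `poisson_pmf` is a hypothetical name; it implements P(X = x) = e^(−λ) λ^x / x! with λ = 4 for a 10 ml sample.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 4  # mean number of amoebas in 10 ml

p_exactly_five = poisson_pmf(5, lam)                            # a.
p_none = poisson_pmf(0, lam)                                    # b. equals e^(-4)
p_fewer_than_three = sum(poisson_pmf(x, lam) for x in range(3)) # c. x = 0, 1, 2

print(round(p_exactly_five, 4), round(p_none, 4), round(p_fewer_than_three, 4))
```

"Fewer than three" covers x = 0, 1 and 2 only.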
These two results are useful in general
If $X \sim \mathrm{Po}(\lambda)$, then $P(X = 0) = e^{-\lambda}$ and $P(X = 1) = \lambda e^{-\lambda}$
Unit interval
Care must be taken to specify the interval being considered.
In the previous example the mean number of amoebas in 10 ml of pond
water from a particular pond is 4, so the number in 10 ml is distributed Po(4).
Now suppose you want to find a probability relating to the number of
amoebas in 5 ml of water from the same pond. The mean number of
amoebas in 5 ml is two, so the number in 5 ml is distributed Po(2).
Similarly, the number of amoebas in 1 ml of pond water is distributed Po(0.4).
On average the school photocopier breaks down eight times during
the school week (Monday to Friday). Assuming that the number of
breakdowns can be modelled by a Poisson distribution, find the
probability that it breaks down
a. five times in a given week,
b. once on Monday,
c. eight times in a fortnight.
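The photocopier question is mainly about rescaling λ to the interval in question. The sketch below assumes the five school days share the weekly rate equally (so Monday has mean 8/5 = 1.6) and that a fortnight means two school weeks (mean 16); both are modelling assumptions, and `poisson_pmf` is an illustrative helper.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

weekly_rate = 8                    # breakdowns per school week (from the question)
daily_rate = weekly_rate / 5       # assumption: five school days share the rate equally
fortnight_rate = weekly_rate * 2   # assumption: a fortnight is two school weeks

p_five_in_week = poisson_pmf(5, weekly_rate)           # a. Po(8)
p_once_on_monday = poisson_pmf(1, daily_rate)          # b. Po(1.6)
p_eight_in_fortnight = poisson_pmf(8, fortnight_rate)  # c. Po(16)

print(round(p_five_in_week, 4), round(p_once_on_monday, 4), round(p_eight_in_fortnight, 4))
```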
Mean and variance of the Poisson distribution
If $X \sim \mathrm{Po}(\lambda)$ then $E(X) = \lambda$
and $\mathrm{Var}(X) = \lambda$
X follows a Poisson distribution with standard deviation 1.5.
Find P(X ≥ 3)
[Figure: probability sketches of P(X = x) against x for Po(1), Po(1.6), Po(2), Po(2.2), Po(3), Po(3.8), Po(5) and Po(10)]
Notice that for small values of λ the distribution is very skew, but it
becomes more symmetrical as λ increases.
The mode of the Poisson distribution
The mode is the value of X that is the most likely to occur, i.e. with
the greatest probability.
From the diagrams, we can see that
when λ = 1, there are two modes, 0 and 1,
when λ = 2, there are two modes, 1 and 2,
when λ = 3, there are two modes, 2 and 3,
In general, if λ is an integer, there are two modes, λ – 1 and λ.
For example, if X ~ Po(8), the modes are 7 and 8.
Notice also that
when λ = 1.6, the mode is 1,
when λ = 2.2, the mode is 2,
when λ = 3.8, the mode is 3,
In general, if λ is not an integer, the mode is the integer below λ.
For example, if X ~ Po(4.9), the mode is 4 .
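These observations about the mode can be checked numerically. The sketch below computes Poisson probabilities for x = 0 to 50 and returns every x whose probability ties for the largest. The helper names and the search limit of 50 are assumptions made for illustration, and a small tolerance is used so that floating-point rounding does not hide a genuine tie.

```python
from math import exp, factorial, isclose

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_modes(lam, search_up_to=50):
    """Return all x in 0..search_up_to whose probability equals the maximum."""
    probs = [poisson_pmf(x, lam) for x in range(search_up_to + 1)]
    top = max(probs)
    return [x for x, p in enumerate(probs) if isclose(p, top, rel_tol=1e-9)]

for lam in [1, 2, 3, 8, 1.6, 2.2, 3.8, 4.9]:
    print(lam, poisson_modes(lam))
```

The tie for integer λ follows from P(X = λ) / P(X = λ − 1) = λ/λ = 1.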