Transcript Document

The Binomial Distribution
This distribution is useful for
modelling situations in which the
random variable (representing the
outcome of an experiment) may take
one of only two possible values.
For example, when sitting a test you
can either succeed or fail;
if a coin is tossed, the outcome is
either a Head or a Tail.
The following additional features
characterise the Binomial:
1. The number of trials n is a known
constant and it is not too large.
2. For each trial, the probability of
success p is a known constant and is the
same for each trial.
[Figure: histogram (Frequency against Bin) for Binomial (25, 0.1)]
[Figure: histogram (Frequency against Bin) for Binomial (25, 0.9)]
Summary statistics for Binomial (40, 0.6):
Mean                 24.19
Standard Error       0.281301
Median               24
Mode                 24
Standard Deviation   2.813011
Sample Variance      7.91303
Kurtosis             -0.45478
Skewness             0.107326
Range                13
Minimum              19
Maximum              32
Sum                  2419
Count                100
Mean = np = 24
S.D. = $\sqrt{np(1-p)}$ = 3.09
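The comparison above between sample statistics and the theoretical mean and S.D. can be reproduced with a short simulation. This is only a sketch using Python's standard library: the function names, the seed and the sample size of 100 (taken from the Count row) are assumptions made for illustration, and the simulated figures will not match the table exactly.

```python
import math
import random
import statistics

def binomial_mean_sd(n, p):
    """Theoretical mean np and standard deviation sqrt(np(1-p))."""
    return n * p, math.sqrt(n * p * (1 - p))

def simulate_binomial(n, p, size, seed=0):
    """Draw `size` Binomial(n, p) values by counting successes in n trials."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) for _ in range(size)]

mean, sd = binomial_mean_sd(40, 0.6)
sample = simulate_binomial(40, 0.6, size=100)

print(f"theoretical mean = {mean:.2f}, S.D. = {sd:.2f}")
print(f"sample mean      = {statistics.mean(sample):.2f}, "
      f"S.D. = {statistics.stdev(sample):.2f}")
```

The same two functions can be called with n = 2500 and p = 0.001 for the other Binomial example in these notes.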
[Figure: histogram (Frequency against Bin) for Binomial (40, 0.6)]
[Figure: histogram (Frequency against Bin) for Binomial (2500, 0.001)]
Summary statistics for Binomial (2500, 0.001):
Mean                 2.43
Standard Error       0.159706
Median               2
Mode                 3
Standard Deviation   1.597062
Sample Variance      2.550606
Kurtosis             -0.16406
Skewness             0.494198
Range                7
Minimum              0
Maximum              7
Sum                  243
Count                100
Mean = np = 2.5
S.D. = $\sqrt{np(1-p)}$ = 1.58
The Poisson Distribution:
Consider the following situation.
• Customers come to a shop at a
steady rate.
• The average rate of arrival of
customers (λ) is given, but the total
number of customers to arrive is
unknown and could be very large.
• The probability of getting r customers
in any given interval is then given by:
• $P(X = r) = \dfrac{e^{-\lambda} \lambda^{r}}{r!}$
• where e = 2.718...
• Theory:
• If X ~ Binomial (n, p), n is large, yet
np is not large, then X has an
approximate Poisson distribution with
λ = np
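To see how close the approximation is, the following sketch tabulates both probability functions for Binomial (2500, 0.001), the example used earlier, against Poisson with λ = np = 2.5. The function names `binomial_pmf` and `poisson_pmf` are illustrative choices, not part of the original material.

```python
import math

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Binomial(n, p)."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    """P(X = r) = e^(-lam) * lam^r / r! for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**r / math.factorial(r)

n, p = 2500, 0.001   # n is large, np is not
lam = n * p          # Poisson rate: lambda = np

# Print r, the Binomial probability and the Poisson probability side by side.
for r in range(8):
    print(r, round(binomial_pmf(r, n, p), 5), round(poisson_pmf(r, lam), 5))
```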
Half a percent of 500 students in a
course are likely to resort to unfair
means.
Find the probability that
• Exactly 4 students will resort to unfair means
(Binomial: 0.133836; Poisson approximation with λ = 2.5: 0.133602)
• At most 6 students will resort to unfair means
• More than 4 students will resort to unfair means
• Fewer than 4 students will resort to unfair means
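One way to work through the four questions above is with the Poisson approximation, λ = 500 × 0.005 = 2.5. The sketch below is a hedged illustration: the helper names are hypothetical, and the exact Binomial value is computed only for the "exactly 4" case so that it can be set beside the approximation.

```python
import math

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**r / math.factorial(r)

def poisson_cdf(r, lam):
    """P(X <= r), summing the probabilities of 0, 1, ..., r."""
    return sum(poisson_pmf(k, lam) for k in range(r + 1))

n, p = 500, 0.005
lam = n * p                                   # lambda = np = 2.5

exactly_4 = poisson_pmf(4, lam)               # P(X = 4)
at_most_6 = poisson_cdf(6, lam)               # P(X <= 6)
more_than_4 = 1 - poisson_cdf(4, lam)         # P(X > 4) = 1 - P(X <= 4)
fewer_than_4 = poisson_cdf(3, lam)            # P(X < 4) = P(X <= 3)

# Exact Binomial probability of exactly 4, for comparison.
binomial_exactly_4 = math.comb(n, 4) * p**4 * (1 - p)**(n - 4)

print(f"exactly 4 (Poisson)  = {exactly_4:.6f}")
print(f"exactly 4 (Binomial) = {binomial_exactly_4:.6f}")
print(f"at most 6            = {at_most_6:.6f}")
print(f"more than 4          = {more_than_4:.6f}")
print(f"fewer than 4         = {fewer_than_4:.6f}")
```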
Continuous Probability
Distributions
The Uniform distribution
This distribution is useful for modelling
situations where a random outcome may
be imagined to have realizations along
a straight line with equal probability
Suppose that commuters waiting on a
platform 50 metres long are likely to
spread out evenly
Suppose we call X the location chosen by a
typical commuter.
Then 0 ≤ X ≤ 50 and X has a uniform
distribution
For every value of X, f(X) = 1/50
[Figure: f(X) against X, with 1/50 marked on the vertical axis and 0, 25, 50 on the X axis]
The distribution is perfectly symmetric
The Expected Value of X = 25
The Median = 25
Any realization of X is a mode
For every value of X, f(X) = 1/(b-a)
[Figure: f(X) against X, with 1/(b-a) marked on the vertical axis and a, (a+b)/2, b on the X axis]
The distribution is perfectly symmetric
The Expected Value of X = (a+b)/2
The Median = (a+b)/2
Any realization of X is a mode
The variance of X = $(b-a)^2/12$
So the St. Dev. of X = $(b-a)/\sqrt{12}$
[Figure: f(X) against X, with 1/50 marked on the vertical axis and 0, 25, 50 on the X axis]
The variance of X = $(50-0)^2/12$ = 208.3
The St. Dev. of X = $(50-0)/\sqrt{12}$ = 14.43
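The values above follow directly from the Uniform (a, b) formulas, and a small simulation shows how close a sample of 100 gets to them. This is a minimal sketch: `uniform_summary`, the seed and the sample size are assumptions chosen for illustration.

```python
import math
import random
import statistics

def uniform_summary(a, b):
    """Mean, variance and standard deviation of Uniform(a, b)."""
    mean = (a + b) / 2
    variance = (b - a) ** 2 / 12
    return mean, variance, math.sqrt(variance)

mean, variance, sd = uniform_summary(0, 50)
print(f"mean = {mean}, variance = {variance:.1f}, S.D. = {sd:.2f}")

# Simulate 100 commuter positions along the 50-metre platform.
rng = random.Random(1)
sample = [rng.uniform(0, 50) for _ in range(100)]
print(f"sample mean = {statistics.mean(sample):.2f}, "
      f"sample S.D. = {statistics.stdev(sample):.2f}")
```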
[Figure: histogram (Frequency against Bin) of the simulation from Uniform(0, 50)]
Summary statistics for the simulation from Uniform(0, 50):
Mean                 24.40611
Standard Error       1.417486
Median               23.62972
Mode                 #N/A
Standard Deviation   14.17486
Sample Variance      200.9266
Kurtosis             -0.99496
Skewness             0.051839
Range                49.22636
Minimum              0.334178
Maximum              49.56053
Sum                  2440.611
Count                100
The Triangular distribution
Can you locate the mean,
median and the mode?
[Figure: triangular distribution plotted against X]
The Normal distribution
Every normal distribution is
symmetric about the mean.
The two shaded parts must be equal in area.
[Figure: normal curve with Mean - a, Mean and Mean + a marked on the z axis]
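The two inner green areas are equal as well.
[Figure: normal curve with Mean - a, Mean and Mean + a marked]
The area shaded brown is approximately 68% of the whole
[Figure: standard normal curve shaded between z = -1 and z = 1]
The area shaded orange is approximately 90% of the whole
[Figure: standard normal curve shaded between z = -1.645 and z = +1.645]
The area shaded orange is approximately 95% of the whole
[Figure: standard normal curve shaded between z = -2 and z = +2]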
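The 68%, 90% and 95% figures above can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library (Python 3.8 or later); the helper name `central_area` is an illustrative choice.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def central_area(z):
    """Area under the standard normal curve between -z and +z."""
    return Z.cdf(z) - Z.cdf(-z)

for z in (1, 1.645, 2):
    print(f"area between -{z} and +{z}: {central_area(z):.4f}")
```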
[Figure: f(x) against x. Two Normal Distribution curves with the same mean
(μ) but different standard deviations]
[Figure: f(z) against z. Two Normal Distribution curves with the same
standard deviation but different values of the mean (μ and μ+2)]
Find the area shaded black
Answer = 0.5 + 0.4452 = 0.9452
The area shaded black is 0.9452
as well (by symmetry)
Find the area shaded black
The black shaded area below has the
same area (by symmetry)
First find this area. This area is 0.3944
Then add 0.5
Answer = 0.5 + 0.3944 = 0.8944
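Look for z so that this area is 0.25
[Figure: standard normal curve with two areas marked 25% and the points Q1 and Q3]
The answer is 0.675
Q1 = -0.675
Q3 = 0.675
Find the first decile of the SND
Look for z so that this area is 0.4
[Figure: standard normal curve with two areas marked 10% and the points D1 and D9]
D1 = -1.28
D9 = 1.28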
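Instead of searching the table for the z value, the quartiles and deciles of the standard normal distribution can be read off an inverse CDF. The following is a small sketch using `statistics.NormalDist.inv_cdf` from the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution (SND)

# inv_cdf(q) returns the z value with area q to its left.
q1, q3 = Z.inv_cdf(0.25), Z.inv_cdf(0.75)   # first and third quartiles
d1, d9 = Z.inv_cdf(0.10), Z.inv_cdf(0.90)   # first and ninth deciles

print(f"Q1 = {q1:.3f}, Q3 = {q3:.3f}")
print(f"D1 = {d1:.3f}, D9 = {d9:.3f}")
```

The results differ from the table values in the third decimal place because the table lookup rounds z.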
Find the first and the third quartile of
X ~ Normal (20, $5^2$)
Q1 for z is -0.675
So Q1 for x = zσ + μ = -0.675(5) + 20 = 16.625
Q3 for z is 0.675
So Q3 for x = zσ + μ = 0.675(5) + 20 = 23.375
Exercise: Find the first decile of
X ~ Normal (20, $5^2$)
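The conversion x = zσ + μ used above can be sketched in a few lines, with a cross-check against the inverse CDF of Normal (20, 5²). The helper name `from_z` is hypothetical, and `statistics.NormalDist` takes the standard deviation (5), not the variance, as its second argument.

```python
from statistics import NormalDist

def from_z(z, mu, sigma):
    """Convert a standard normal value z to the x scale: x = z*sigma + mu."""
    return z * sigma + mu

mu, sigma = 20, 5

# Quartiles from the table values z = -0.675 and z = 0.675.
q1 = from_z(-0.675, mu, sigma)
q3 = from_z(0.675, mu, sigma)

# Cross-check using the inverse CDF of Normal(20, 5^2) directly.
X = NormalDist(mu, sigma)
q1_check, q3_check = X.inv_cdf(0.25), X.inv_cdf(0.75)

print(f"Q1 = {q1:.3f} (inverse CDF: {q1_check:.3f})")
print(f"Q3 = {q3:.3f} (inverse CDF: {q3_check:.3f})")
```

The same `from_z` call with the first-decile z value handles the exercise.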
Normal Approximation of the
Binomial (Chapter P4 of the text)
35% of British men are at least 185 cm tall.
If I meet 200 British men on any given day,
what is the probability that 100 or more
of them are 185 cm or taller?
Prob(M = 100) = ?
The normal approximation is P(99.5 < M < 100.5)
[Figure: Probability(M) against M, with M = 99, 100, 101, 102, 103, 104 marked]
Prob(M ≥ 100) = Prob(M = 100) + Prob(M = 101) + Prob(M = 102) + ... + Prob(M = 200)
This area is shaded black
But it is also the area under the red polygon
to the right of 99.5, or Prob(M > 99.5)
So Prob(M = 100) = P(99.5 < M < 100.5)
[Figure: Probability(M) against M, with M = 99, 100, 101, 102, 103, 104 marked]
Find Prob(M < 103)
This area is shaded black
It is also the area under the red
polygon, or Prob(M ≤ 102.5)
[Figure: Probability(M) against M, with M = 99, 100, 101, 102, 103, 104 marked]
Binomial            Normal Approximation
Prob(M = 100)   =   Prob(99.5 < M < 100.5)
Prob(M ≥ 100)   =   Prob(M > 99.5)
Prob(M < 103)   =   Prob(M ≤ 102.5)
We can similarly show that
Prob(M ≤ 103)   =   Prob(M ≤ 103.5)
The approximation works if both
np and n(1-p) are ≥ 5
Read page 500 (inset)
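The continuity correction summarised above can be tried on the example of 200 men with p = 0.35. The sketch below sets the exact Binomial tail beside its normal approximation; the function names are illustrative, and the normal curve uses mean np and S.D. $\sqrt{np(1-p)}$ as in the Binomial section.

```python
import math
from statistics import NormalDist

def binomial_tail(n, p, k):
    """Exact P(M >= k) for M ~ Binomial(n, p)."""
    return sum(math.comb(n, r) * p**r * (1 - p)**(n - r)
               for r in range(k, n + 1))

def normal_tail(n, p, k):
    """Normal approximation to P(M >= k), computed as P(M > k - 0.5)."""
    approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    return 1 - approx.cdf(k - 0.5)

n, p = 200, 0.35

# Rule of thumb from the notes: np and n(1-p) should both be at least 5.
rule_ok = n * p >= 5 and n * (1 - p) >= 5

exact = binomial_tail(n, p, 100)
approx = normal_tail(n, p, 100)
print(f"rule of thumb satisfied: {rule_ok}")
print(f"exact  P(M >= 100) = {exact:.3e}")
print(f"normal P(M > 99.5) = {approx:.3e}")
```

Both probabilities are very small here because 100 is far above the mean np = 70; near the centre of the distribution the two values agree much more closely in relative terms.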