Transcript File

Lecture 3
Summer Semester
2009
BEA 140
By Leon Jiang
BEA140
Leon Jiang, University of Tasmania
1
Some more points on
univariate data
Central tendency



Mean
Median
Mode
Variance

Population: $\sigma^2 = \left( \sum X_i^2 - (\sum X_i)^2 / N \right) / N$

Sample: $s^2 = \left( \sum X_i^2 - (\sum X_i)^2 / n \right) / (n - 1)$
Standard deviation

$s^2 = \left( \sum X_i^2 - (\sum X_i)^2 / n \right) / (n - 1)$

$s = \sqrt{s^2} = \sqrt{\dfrac{\sum X_i^2 - (\sum X_i)^2 / n}{n - 1}}$
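The computational formulas for the variance and standard deviation can be sketched in Python as follows. This is only an illustration: the function name `variance` and the small data set are assumptions made for the example and are not part of the lecture.

```python
import math

def variance(values, sample=True):
    """Variance via the computational formula: (sum X^2 - (sum X)^2 / n) / divisor."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    sum_sq_dev = sum_x2 - sum_x ** 2 / n
    # Divide by n - 1 for a sample, by n for a population.
    return sum_sq_dev / (n - 1) if sample else sum_sq_dev / n

# Hypothetical data, used only to exercise the formulas.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = variance(data, sample=False)   # population variance
samp_var = variance(data)                # sample variance
pop_sd = math.sqrt(pop_var)              # population standard deviation
samp_sd = math.sqrt(samp_var)            # sample standard deviation
print(pop_var, pop_sd, samp_var, samp_sd)
```

For this data set the population variance is 4 and the population standard deviation is 2; the sample versions are slightly larger because of the n - 1 divisor.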
The meaning of the standard deviation



“ For most data batches around two thirds
( or 68%) of the data will fall within one
standard deviation of the mean, and around
95% within two standard deviations of the
mean.”
- empirical rule
- rule of thumb
MEASURING FROM
GROUPED DATA
Measuring For Grouped Data

When no raw data are available and we only have a secondary source, we have to analyze that secondary data set, which has already been grouped for reporting purposes.

A set of grouped data is not like a set of raw data, in that the observations in it have already been grouped, somewhat arbitrarily.

Because the grouping involves subjective choices, grouped data are not as objective as raw data, and measures computed from them contain small errors.
Generally we use a frequency distribution table to show the grouping of data.

Time         Number of calls   Class mark   fjxj   fjxj²   Cum. freq.
             fj                xj
1<=X<3            11                2         22      44       11
3<=X<5            19                4         76     304       30
5<=X<7            10                6         60     360       40
7<=X<9             9                8         72     576       49
9<=X<11            2               10         20     200       51
11<=X<13           1               12         12     144       52
13<=X<15           1               14         14     196       53
15<=X<17           0               16          0       0       53
17<=X<19           1               18         18     324       54
19<=X<21           0               20          0       0       54
Total             54                         294    2148
Class mark for frequency distribution of
grouped data

The class mark, Xj, is a representative value for all observations located in the class.

A class mark is determined by the largest value and the smallest value in the class:

Xj = (RUCL + RLCL) / 2 = (largest value + smallest value) / 2

where RUCL is the largest value and RLCL is the smallest value in the class.
Central tendency for grouped data

Mean of g.d (grouped data) is defined as the
weighted sum of class marks, with class
frequencies as weights. i.e.

X(mean) = (Σfj xj ) / n

X ( mean ) = 294/54=5.44
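As a rough check, the grouped mean of the call data can be reproduced in a few lines of Python. The tuple layout `(lower limit, upper limit, frequency)` and the function name `grouped_mean` are assumptions made for this sketch.

```python
# Each class of the call-duration table as (lower limit, upper limit, frequency).
call_classes = [
    (1, 3, 11), (3, 5, 19), (5, 7, 10), (7, 9, 9), (9, 11, 2),
    (11, 13, 1), (13, 15, 1), (15, 17, 0), (17, 19, 1), (19, 21, 0),
]

def grouped_mean(classes):
    """Weighted mean of class marks, with class frequencies as weights."""
    n = sum(freq for _, _, freq in classes)
    # The class mark is the midpoint of the class limits.
    weighted_sum = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    return weighted_sum / n

mean_calls = grouped_mean(call_classes)
print(round(mean_calls, 2))  # 294 / 54, i.e. about 5.44
```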
Median for g.d
1. Locating the median class: the class containing the median. But how and where?
- The total number of calls in the frequency distribution is 54 (an even number).
- Therefore, according to the formula for the position of the median, (n + 1) / 2, the median ought to be the 27.5th value.
- The class containing the 27.5th value is the median class.
FORMULA FOR MD:


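MD = LCL + class width * (how far into class) / (how many in class)

where LCL is the lower limit of the median class, "how far into class" is the median position minus the cumulative frequency below the median class, and "how many in class" is the frequency of the median class.

For the call data: MD = 3.0 + 2 * (27.5 - 11) / 19 = 4.74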
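The median formula can be sketched in Python as below. This is an illustration under assumptions: the function name `grouped_median` and the `(lower limit, upper limit, frequency)` layout are invented for the example, and the median position is taken as (n + 1) / 2 as in the lecture.

```python
def grouped_median(classes):
    """Median from grouped data: LCL + width * (how far into class) / (how many in class)."""
    n = sum(freq for _, _, freq in classes)
    position = (n + 1) / 2          # position of the median, as used in the lecture
    cumulative = 0                  # cumulative frequency below the current class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= position:
            # This is the median class: interpolate within it.
            return lower + (upper - lower) * (position - cumulative) / freq
        cumulative += freq
    raise ValueError("no observations")

# Call-duration table: (lower limit, upper limit, frequency).
call_classes = [
    (1, 3, 11), (3, 5, 19), (5, 7, 10), (7, 9, 9), (9, 11, 2),
    (11, 13, 1), (13, 15, 1), (15, 17, 0), (17, 19, 1), (19, 21, 0),
]

median_calls = grouped_median(call_classes)
print(round(median_calls, 2))  # 3 + 2 * (27.5 - 11) / 19, about 4.74
```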
Small errors likely exist most of the time

Median from raw data = 4.4

Median from grouped data = 4.74
An example: MD = LCL + class width * (how far into class) / (how many in class)

Class               Freq.   Cum. freq.
80 and under 90       1         1
90 and under 100      2         3
100 and under 110     6         9
110 and under 120     3        12
120 and under 130     2        14
130 and under 140     2        16
Total                16

Median position = (16 + 1) / 2 = 8.5, so the median class is 100 and under 110.

MD = 100 + 10 * (8.5 – 3) / (9 – 3)
Median = 109.17
Mode for g.d.

With grouped data, we tend to talk more of a modal class – the
class (classes) with the highest frequency rather than the mode.

But, if asked for a mode with grouped data, the best we can do is
to tell the class mark of modal class as follows:
Modal class: 3 and under 5 (19 observations)
Mode: 4 (class mark of modal class)
Dispersion ( variance ) for grouped data

The sample variance formula is:

$s^2 = \left\{ \sum f_j X_j^2 - (\sum f_j X_j)^2 / n \right\} / (n - 1)$

The population variance formula is:

$\sigma^2 = \left\{ \sum f_j X_j^2 - (\sum f_j X_j)^2 / N \right\} / N$

Standard deviation = $\sqrt{s^2}$ or $\sqrt{\sigma^2}$
Preparing a table to help work out S.d.
Class               Freq.   Class mark   fjXj    fjXj²    Cum. freq.
                    fj      xj
80 and under 90       1        85          85     7225        1
90 and under 100      2        95         190    18050        3
100 and under 110     6       105         630    66150        9
110 and under 120     3       115         345    39675       12
120 and under 130     2       125         250    31250       14
130 and under 140     2       135         270    36450       16
Total                16       660        1770   198800
Working out the standard deviation for the example~!

$s^2 = \left\{ \sum f_j X_j^2 - (\sum f_j X_j)^2 / n \right\} / (n - 1) = (198800 - 1770^2 / 16) / 15 = 199.58$

Standard deviation = $\sqrt{s^2}$

S = 14.13

Mean = 1770 / 16 = 110.625
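The worked example can be checked with a short Python sketch. The `(class mark, frequency)` layout and the variable names are assumptions made for illustration; the figures come from the table for this example.

```python
import math

# (class mark, frequency) pairs for the classes 80-90, 90-100, ..., 130-140.
marks_and_freqs = [(85, 1), (95, 2), (105, 6), (115, 3), (125, 2), (135, 2)]

n = sum(f for _, f in marks_and_freqs)                 # total frequency
sum_fx = sum(f * x for x, f in marks_and_freqs)        # sum of fj * Xj
sum_fx2 = sum(f * x * x for x, f in marks_and_freqs)   # sum of fj * Xj^2

grouped_mean = sum_fx / n
sample_var = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)     # sample variance
sample_sd = math.sqrt(sample_var)                      # sample standard deviation

print(n, sum_fx, sum_fx2)
print(round(grouped_mean, 3), round(sample_var, 2), round(sample_sd, 2))
```

The sums reproduce the column totals of the table (16, 1770 and 198800), which is a quick way to catch arithmetic slips before taking the square root.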
Shape

Skewness – relates to symmetry of
distribution.

Positively skewed or right skewed: tail
extends to right , mean > Median > Mode

Negatively skewed or left skewed: tail
extends to left, mean < median < mode
Standard scores


The standard score expresses any observation in terms of the number of standard deviations it is from the mean.

t score (for sample): $t = \dfrac{X - \bar{X}}{s}$

z score (for population): $z = \dfrac{X - \mu}{\sigma}$
Interpretation of standard
score

Mean 5, standard deviation 2, for a sample

t score for 8 = (8-5)/2=1.5

Interpretation: the observation is 1.5 standard
deviations above the sample mean.
Bivariate Variables
Summary measures
Bivariate variables

In the previous parts, we were always talking about a single numerical variable, such as the rate of return of mutual funds.

From this lecture on, we shall study two variables that may be correlated.
Two numerical variables


A case:
In a call center, operators were trained to receive phone calls. However, call durations differ considerably from one operator to another. The shorter the duration of a call, the more efficient the operator proves to be.

Suppose the call center manager wants to know whether the training hours the operators received are correlated with the duration of the phone calls they handled.

The data collected are as follows:
X: training hours
Y: call duration in minutes
The data look like this:
X (training hours): 6.5 7.5 6 8.5 5.5 3.5 8.5 8 8 7 8.5 9.5
Y (duration mins): 6.2 2.9 9.2 3.2 8.9 13.6 2.5 4.2 4.3 3.1 3.4 2.7
X (training hours): …
Y (duration mins): …
In total, there are 54 phone calls in the data set being studied.
* What we want to find out is whether these two variables (X, training hours of operators; Y, duration of calls in minutes) are really correlated. Put simply, the call center manager wants to know whether operators who receive more training hours handle calls in less time.
Setting up a scatter diagram for the data here ~!

A scatter diagram ( scattergram ) between two variables will
indicate the form, type and strength of the relation.

Form – whether linear or non-linear

Type – direct (positive) or inverse (negative)

Strength – how closely data are co-ordinated, e.g. if linear,
how close ordered pairs are to a line describing their
relationship. This is indicated by a correlation measure.
(Pearson’s) Coefficient of Correlation





This is a summary measure that describes the form, type and strength of the relationship shown in a scattergram.
r ranges from –1 to 1:
-1: perfect negative relationship – all points exactly on a negatively sloping line
0: no linear correlation
1: perfect positive relationship – all points exactly on a positively sloping line

$r = \dfrac{\sum XY - (\sum X)(\sum Y)/n}{\sqrt{\left[ \sum X^2 - (\sum X)^2/n \right] \left[ \sum Y^2 - (\sum Y)^2/n \right]}}$

Back to the case study



r( Pearson’s coefficient of correlation) = - 0.9209
This means X and Y have a very strong negative
linear relationship.
Or , let’s say the training hours the operators
received really show a strong negative relationship
with the duration of calls they handled.
In-depth analysis of this linear relationship
– linear regression

Determining the Coefficient of Correlation is concerned with
summarizing the form, type and strength of the relationship
between two variables.

The motivation for regression is the desire to quantify the
relationship, often for the purposes of using the knowledge of one
variable to predict the other.

Say , using one variable ( X ) to predict the other
variable ( Y ).
The regression line is mathematically expressed by this
equation

Yc = a + bX
Yc is the computed value of Y.

a is the sample regression constant, or Y-intercept.

b is the sample regression coefficient, or slope of the
line.

Least squares method

This is a mathematical technique that determines what values of
a and b minimize the sum of squared differences. Any values for
a and b other than those determined by the least-squares
method result in a greater sum of squared differences between
the actual value of Y and the predicted value of Y.

Simply put, least-squares method is used to find a
line of best fit for two correlated variables.
Working out the linear regression ~!




A residual is defined as the vertical distance between the actual value and the predicted value (the point on the line of best fit).
In least-squares regression, we find the values of a and b such that the sum of squares of the residuals is a minimum.
Actual pairs: (X1, Y1), (X2, Y2), …
Predicted (calculated) pairs: (X1, Yc1), (X2, Yc2), …
Back to the case study~!

We now know that the training hours are correlated with the duration of calls. In other words, if we know the training hours an operator received, we can predict how many minutes, on average, he or she will take to handle a phone call.

In linear regression terms: given X, we use the line fitted by the least squares method to calculate Y.
Solutions for a & b

Two formulae, for b and a respectively:

$b = \dfrac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2}$

$a = \dfrac{\sum Y}{n} - b \dfrac{\sum X}{n}$
Establishing a table to work out linear regression
        Xi       Yi       Xi²       XiYi      Yi²
        6.5      6.2      42.25     40.3      38.44
        7.5      2.9      …         …         …
        6        9.2      …         …         …
        …        …        …         …         …
        8.5      2.8      …         …         …
        7.5      5.9      …         …         …
        6        6.5      36        39        42.25
Total   391.5    290.7    2974.25   1863.55   2081.69
Outcomes ~!


b = -1.79595
a = 18.40399

Then Yc = 18.404 – 1.796X
This is the linear regression.
Interpretation: for each extra hour of training, there is an associated decrease of 1.796 minutes in call duration.
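The slope, intercept and correlation coefficient can be recomputed from the column totals of the working table (n = 54). The sketch below is an illustration only; the variable names and the helper `predict` are assumptions, and the training value of 7 hours is a hypothetical input.

```python
import math

# Column totals from the working table for the call-center case (n = 54 calls).
n = 54
sum_x, sum_y = 391.5, 290.7
sum_x2, sum_xy, sum_y2 = 2974.25, 1863.55, 2081.69

# Least-squares slope and intercept.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Pearson's coefficient of correlation from the same totals.
r = (sum_xy - sum_x * sum_y / n) / math.sqrt(
    (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
)

def predict(training_hours):
    """Computed call duration Yc = a + bX for a given number of training hours."""
    return a + b * training_hours

print(round(b, 5), round(a, 5), round(r, 4))
print(round(predict(7), 2))  # hypothetical operator with 7 training hours
```

The printed slope and intercept should agree with b = -1.79595 and a = 18.40399, and r with the value of about -0.9209 reported for the case study.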
One consideration~!





Note: regression says nothing about causation, only about association~!
This means X does not necessarily cause a change in Y.
That is, the training hours do not necessarily change the duration of calls; the two are merely correlated.
Think about: does smoking cigarettes cause shorter life expectancy?
Not really~! ?
The standard error of the estimate

Standard error measures how well actual Y and computed Y are matched – the smaller Se, the better the match and predictive accuracy.

$S_e^2 = \dfrac{\sum (Y - Y_c)^2}{n - 2}$

$S_e = \sqrt{S_e^2}$
Note!

Standard error is very similar to standard deviation.

Standard error is for bivariate data (scatter around the regression line), whilst standard deviation is for univariate data (scatter around the mean).
Computational form for Se.

You can use this computational form to find Se:

$S_e^2 = \dfrac{\sum Y^2 - a \sum Y - b \sum XY}{n - 2}$
Coefficient of determination

Total variation = SST = $\sum Y^2 - (\sum Y)^2 / n$

Explained variation = SSR

Unexplained variation = SSE = $\sum Y^2 - a \sum Y - b \sum XY$

Coefficient of determination = SSR / SST = $1 - \dfrac{SSE}{SST}$
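The variation measures and the standard error of the estimate can be sketched from the same column totals used for the regression. The variable names are assumptions, and SSR is obtained here as SST minus SSE, which is a standard identity rather than something stated on the slide.

```python
import math

# Column totals for the call-center case (n = 54 calls).
n = 54
sum_x, sum_y = 391.5, 290.7
sum_x2, sum_xy, sum_y2 = 2974.25, 1863.55, 2081.69

# Least-squares coefficients.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

sst = sum_y2 - sum_y ** 2 / n            # total variation
sse = sum_y2 - a * sum_y - b * sum_xy    # unexplained variation (computational form)
ssr = sst - sse                          # explained variation (assumed identity SST = SSR + SSE)

r_squared = ssr / sst                    # coefficient of determination
se = math.sqrt(sse / (n - 2))            # standard error of the estimate

print(round(r_squared, 3), round(se, 3))
```

The coefficient of determination computed this way should be close to 0.848.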
Coefficient of determination - $r^2$

The Coefficient of determination by calculation
turned out to be 0.848

This means 85% of total variation in call duration
(around the average duration level) has been
explained by a linear relation between duration and
training hours.
We just saw summary
measures for dealing with
two numerical variables.
What about ordinal data?
Two ordinal variables



A scattergram can also be used to illustrate a possible relationship between two ordinal variables.
We often have ordinal variables in fields such as Marketing and Management where people have been asked to rank some attribute.
An example could be a series of taste trials carried out during product development, such as the example below, where a panel was asked to rank soft drinks by “Refreshingness” and “Sweetness”.
Understanding this example

This example illustrates which one of the
drinks is the most refreshing and which is the
second most refreshing …

Likewise, which is the sweetest and which is
the second sweetest …
Drink        Refresh Rank   Sweetness Rank
Slurp             1               8
Fizz              2               7
Fizz Plus         5              10
Binge             6               9
Slam              3               5
Dunk              4               6
Whizz            10               2
Pling             9               3
Tweak             7               1
Blitz             8               4
[Figure: scatter diagram "Sweetness vs Refreshingness"; horizontal axis: Refreshing Rank, vertical axis: Sweetness Rank]
Spearman’s Rank Correlation Coefficient

Spearman’s Rank Correlation Coefficient can be used as a summary measure to gauge the degree of relationship between two ordinal variables.

Spearman’s Rank C.C. is given the symbol $r_s$ for sample data (and $\rho_s$ for population data). $\rho$ is the Greek letter ‘rho’ (the Greek equivalent to ‘r’).

It is usually calculated using the following short cut formula:
$r_s = 1 - \dfrac{6 \sum_{i=1}^{n} d_i^2}{n (n^2 - 1)}$

where $d_i$ is the difference between the ranks of the ith pair of observations, and n is the number of pairs of observations.
Notes to this short formula

Strictly speaking this formula only works
when the number of ties is relatively small. If
more than about 1/4 to 1/3 of the
observations of a variable are in ties then the
shortcut formula starts to get unreliable. We
will deal with ties later. When there are too
many ties we need to use the “long” formula
What are ties?
Dealing with ties: we allocate the average rank of all
observations involved in the tie, to each observation
involved in the tie.

Standard & Poor’s bond ratings for a random
sample of 12 bonds:
C BB A AA A BBB CC D B A AA AAA
As sampled:  C  BB  A  AA  A  BBB  CC  D  B  A  AA  AAA

Sorted:      AAA   AA    AA    A    A    A    BBB   BB   B    CC   C    D
Position:    1     2     3     4    5    6    7     8    9    10   11   12
Rank:        1     2.5   2.5   5    5    5    7     8    9    10   11   12
Two people came equal third (that is, the next person
came fifth). These share the 3rd & 4th positions and
thus each is given a rank of 3.5.
placing 1.0 2.0 3* 3* 5.0 6.0 7.0
ranking 1.0 2.0 3.5 3.5 5.0 6.0 7.0
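Allocating average ranks to tied observations can be sketched in Python as follows. The function name `average_ranks` and the list `scale` are assumptions for this illustration; `scale` lists only the rating grades that appear in the bond sample, ordered from best to worst as in the sorted row of the example.

```python
from collections import defaultdict

def average_ranks(items, key):
    """Give each distinct item the average of the positions it occupies when sorted."""
    ordered = sorted(items, key=key)
    positions = defaultdict(list)
    for position, item in enumerate(ordered, start=1):
        positions[item].append(position)
    # Tied observations share the mean of the positions they cover.
    return {item: sum(pos) / len(pos) for item, pos in positions.items()}

# Rating grades in the sample, best to worst (only the grades used in the example).
scale = ["AAA", "AA", "A", "BBB", "BB", "B", "CC", "C", "D"]
bonds = ["C", "BB", "A", "AA", "A", "BBB", "CC", "D", "B", "A", "AA", "AAA"]

ranks = average_ranks(bonds, key=scale.index)
print(ranks["AA"], ranks["A"])  # 2.5 and 5.0, as in the bond example
```

The two AA bonds occupy positions 2 and 3 and therefore share rank 2.5; the three A bonds occupy positions 4, 5 and 6 and share rank 5.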
Rankings with ties



When rankings involve ties they provide us
with two extra problems:

how to deal with the ties

the short cut formula may be unreliable
if there are too many ties, and we need to
use a longer formula –
The Full Spearman formula
- Use when there are ties!
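$r_s = \dfrac{n \sum_{i=1}^{n} X_i Y_i - \left( \sum_{i=1}^{n} X_i \right) \left( \sum_{i=1}^{n} Y_i \right)}{\sqrt{\left[ n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2 \right] \left[ n \sum_{i=1}^{n} Y_i^2 - \left( \sum_{i=1}^{n} Y_i \right)^2 \right]}}$

where $X_i$ and $Y_i$ are the two ranks of the ith pair of observations.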
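A minimal Python sketch of the full formula is given below. The function name `spearman_full` and the small set of tied ranks are hypothetical, chosen only to show the calculation when average ranks such as 2.5 and 3.5 are present.

```python
import math

def spearman_full(rank_x, rank_y):
    """Full Spearman formula: the Pearson formula applied to the two sets of ranks."""
    n = len(rank_x)
    sum_x, sum_y = sum(rank_x), sum(rank_y)
    sum_xy = sum(x * y for x, y in zip(rank_x, rank_y))
    sum_x2 = sum(x * x for x in rank_x)
    sum_y2 = sum(y * y for y in rank_y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical ranks containing ties (average ranks already allocated).
rank_x = [1, 2.5, 2.5, 4, 5]
rank_y = [2, 1, 3.5, 3.5, 5]

rs_ties = spearman_full(rank_x, rank_y)
print(round(rs_ties, 3))
```

When there are no ties, this formula gives the same value as the short cut formula, so it can be used in both situations.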
Example - using the “short cut” formula
Drink        Refresh Rank   Sweetness Rank    di    di²
Slurp             1               8           -7     49
Fizz              2               7           -5     25
Fizz Plus         5              10           -5     25
Binge             6               9           -3      9
Slam              3               5           -2      4
Dunk              4               6           -2      4
Whizz            10               2            8     64
Pling             9               3            6     36
Tweak             7               1            6     36
Blitz             8               4            4     16
Total                                               268
Result !

rs = 1 - 6*268 / (10*99) = - 0.624

Indicating quite a strong negative relationship
between refreshingness and sweetness, (as
we saw in the scattergram).
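The short cut calculation can be reproduced with a few lines of Python. The function name `spearman_shortcut` is an assumption for this sketch; the ranks are those of the ten drinks in the example.

```python
def spearman_shortcut(rank_x, rank_y):
    """Short cut Spearman formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks for Slurp, Fizz, Fizz Plus, Binge, Slam, Dunk, Whizz, Pling, Tweak, Blitz.
refresh_rank = [1, 2, 5, 6, 3, 4, 10, 9, 7, 8]
sweetness_rank = [8, 7, 10, 9, 5, 6, 2, 3, 1, 4]

sum_d2 = sum((x - y) ** 2 for x, y in zip(refresh_rank, sweetness_rank))
rs = spearman_shortcut(refresh_rank, sweetness_rank)
print(sum_d2, round(rs, 3))  # 268 and -0.624
```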
Example – using the “long” formula

A students association’s satisfaction
ratings for 8 courses, and the seniority of
the person taking the course are listed
below. Use Spearman’s Rank C.C. to
investigate the relationship between the
two.
End of Module 2
We now move on to Module 3!
Module 3
Probability & Probability
Distributions
Probability

What is meant by the word – probability?

Probability is the likelihood or chance that a particular
event will occur.

Three approaches to probability
1. A priori classical probability
2. Empirical classical probability
3. Subjective probability
A priori classical probability

The probability of success is based on prior
knowledge of the process involved.

Probability of occurrence = X / T

X = number of ways in which the event occurs
T = total number of elementary outcomes
Example for a priori classical probability

A box of 20 chocolate beans contains 10 red-colored beans and 10 green-colored beans.

The probability of selecting a red-colored bean on a single draw is 10 / 20 = 0.5.

Because we know in advance the total number of beans and the proportion of each color, we call it “a priori probability”.
Empirical classical probability

Empirical classical probability adopts the same formula to calculate the probability of occurrence:

Probability of occurrence = X / T

However, in empirical classical probability, the probability of success is based on observed data instead of prior knowledge (a priori).
Example for empirical classical probability

Your mid-term exam is coming and this exam is said to be optional, which means you can choose whether or not to take it.

If we take a poll asking how many students intend to sit the exam and 99% say they do, we say there is a 0.99 probability that an individual student will sit the exam.
Remember, in this example we did not know in advance how many students wished to take the exam. This is different from the a priori classical example, in which we already knew that 50% of the beans were red and 50% were green.

So empirical probability rests on observed outcomes rather than on prior knowledge of the process.
Subjective probability

As the name suggests, this approach to probability is based on a person’s own judgment.

For instance:
You think you have a 90% probability of passing the CPA exam, while your supervisor thinks your probability of passing is 60%.
Both probabilities are based on personal judgment and experience rather than on objective evidence.
Sample spaces and events






Event:
Each possible type of occurrence is referred to as an event.
Simple event:
A simple event can be described by a single characteristic.
Sample space:
The collection of all the possible events is called the sample space.
Axioms about probability

Given a sample space S = {E1, E2, …, En}, the probabilities assigned to the Ei must satisfy:

0 ≤ P(Ei) ≤ 1, for each i
P(E1) + P(E2) + … + P(En) = ∑P(Ei) = 1

If an event has no chance of occurring, its probability is 0; if an event is certain to occur, its probability is 1.

Probability of Event A = sum of the probabilities of the simple events comprising A.
Contingency tables

By example:

Intent to purchase investigation
This kind of investigation often takes place in
sales and marketing research scenario.


In this example, the sample space is 1,000 households, classified by their purchase behavior for a laptop computer.

In the investigation, there are basically two intents regarding the purchase.

Sub-samples:
1. Planned to purchase – 300 households
2. Not planned to purchase – 700 households

After the purchase behavior has taken place, we can further subdivide the sample of 1,000 households into:
1. Actually purchased
2. Not purchased

Now, in this example, within the big sample of 1,000 we have four different sub-samples:
1. Planned to purchase
2. Not planned to purchase
3. Purchased
4. Not purchased

Later, however, the outcomes (actual purchase or no purchase) turned out not to be fully consistent with the originally stated intents.

In the first category (planned to purchase – 300 households), 200 out of 300 actually purchased and the remaining 100 did not.

In the second category (not planned to purchase – 700 households), 50 out of 700 actually purchased; the remaining 650 acted consistently with their initial intent.
Complement and joint event

The complement of event A includes all events that are not part of event A. The complement of A is given by the symbol A’ or Ā.

In the above example, “planned to purchase” (300 households) is the complement of “not planned to purchase” (700 households).

Joint event:
A joint event is an event that has two or more characteristics.

In the above example, the event “planned to purchase and actually purchased” is a joint event.
Usually there are two ways to depict events in a sample space:

Contingency table - also called “table of cross-classification”
Venn diagram

Now, based on the above example, we learn to construct the contingency table and Venn diagram.
Contingency table
                        Planned to purchase
Actually purchased      Yes      No       Total
Yes                     200      50        250
No                      100      650       750
Total                   300      700      1000
Terms





Intersection
A∩B: both A and B occur together, the joint event (sometimes simply written as AB).
Union
A∪B: either A or B or both.
Other common forms of notation include A∨B, A+B, A OR B.
Example of using the above two notations

Number (n) of cards that are a heart or an ace in a standard deck of 52 playing cards:

n(H∪A) = n(H) + n(A) - n(H∩A)
       = 13 + 4 - 1
       = 16
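The union and intersection counts for the card example can be verified with Python sets. The way the deck is represented here, as (rank, suit) pairs, is an assumption made for this sketch.

```python
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]

# A deck of 52 cards as (rank, suit) pairs.
deck = set(product(ranks, suits))

hearts = {card for card in deck if card[1] == "hearts"}   # event H
aces = {card for card in deck if card[0] == "A"}          # event A

n_union = len(hearts | aces)           # n(H ∪ A)
n_intersection = len(hearts & aces)    # n(H ∩ A): the ace of hearts

# The addition rule for counts: n(H ∪ A) = n(H) + n(A) - n(H ∩ A).
print(len(hearts), len(aces), n_intersection, n_union)  # 13 4 1 16
```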
Complement

Complement - A’: event A does not occur, or another form :
NOT A.
Example: Non-hearts = H’, n(H’) = 39

Complement rule: P(A) = 1- P(A’)

Mutually exclusive and collectively exhaustive
Mutually exclusive: occurrence of one event precludes occurrence of another. If A and B are mutually exclusive, then n(A∩B) = 0.

Collectively exhaustive:
Events together comprise the sample space; at least one event is certain to occur.
Example: n(female students ∪ male students) = 26, the whole class (QM course).
More to understand mutually exclusive and
collectively exhaustive




Everyone is either female or male (collectively exhaustive), but no one is both (mutually exclusive).
Being female and being male are therefore mutually exclusive and collectively exhaustive events.
In the example of TV set purchase:
Planned to purchase or not planned to purchase. Every household either planned to purchase or did not (collectively exhaustive), but no household is both “planned to purchase” and “not planned to purchase” (mutually exclusive).
Probability contingency table
Numbers         O         O'        Total
M               7         24        31
M'              14        35        49
Total           21        59        80

Probabilities   O         O'        Total
M               0.0875    0.3       0.3875
M'              0.175     0.4375    0.6125
Total           0.2625    0.7375    1
General form of a 2×2 contingency table
Probabilities   A           A'           Total
B               P(A∩B)      P(A'∩B)      P(B)
B'              P(A∩B')     P(A'∩B')     P(B')
Total           P(A)        P(A')        1
Simple (marginal) probability : P(A)

The most fundamental rule for probabilities is that
they range from 0 to 1.

Simple (marginal) probability refers to the
probability of occurrence of a simple event. P(A).

Example: what is the probability that a heart is selected from a deck of 52 playing cards?
P(heart) = 13 / 52 = 0.25
Joint probability : P(A∩B)

Joint probability refers to situations involving two or more
events, such as the probability of planned to purchase and
actually purchased in the big-screen TV set purchase
example.

Joint probability means that both event A and B must occur
simultaneously.

So, P(planned ∩purchased ) = 200/1000 = 0.2
Computing marginal probability

In fact, the marginal probability of an event consists of a set of joint probabilities.
The formula:

P(A) = P(A and B1) + P(A and B2) + … + P(A and Bk)

where B1, B2, …, Bk are mutually exclusive and collectively exhaustive events.

In the previous example:
P(planned to purchase) = P(planned to purchase and purchased) + P(planned to purchase and did not purchase)
= 200/1000 + 100/1000
= 0.30
Addition rule
P(A∪B) = P(A) + P(B) – P(A∩B)
N.B. If A, B are mutually exclusive, then P(A∩B) = 0, and
P(A∪B) = P(A) + P(B)


Multiplication rule

P(A∩B) = P(B|A)P(A) = P(A|B)P(B) and it follows that
P(B|A) = P(A∩B) / P(A) or
P(A|B) = P(A∩B) / P(B)

The bar symbol “|” means “given”.

P(B|A) is the probability of B happening given that A happens. This is known as a conditional probability.
Conditional probability


To spot conditional probabilities, look for words like “of”, “if” and “given”. Suppose D = “part is defective” and B = “part was produced by B”. Each of the following would tell you P(D|B):
- If a part was produced by B, there is a 5% chance it is defective.
- 5% of the parts produced by B are defective.
- There is a 5% chance that a part is defective, given that it was produced by B.
Back to the TV purchase example
P(actually purchased | planned to purchase)
= P(planned to purchase and actually purchased) / P(planned to purchase)
= 200 / 300
= 0.67

P(B|A) = P(A and B) / P(A)

Here: A = planned to purchase
B = actually purchased
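The joint, marginal and conditional probabilities for the purchase example can be worked out from the contingency table counts. The dictionary layout and the variable names below are assumptions made for this sketch; the counts are the ones given for the 1,000 households.

```python
import math

# Counts from the contingency table: (intent, outcome) -> number of households.
counts = {
    ("planned", "purchased"): 200,
    ("planned", "not purchased"): 100,
    ("not planned", "purchased"): 50,
    ("not planned", "not purchased"): 650,
}
total = sum(counts.values())

# Joint probability P(planned and purchased).
p_joint = counts[("planned", "purchased")] / total

# Marginal probabilities are sums of joint probabilities.
p_planned = sum(v for (intent, _), v in counts.items() if intent == "planned") / total
p_purchased = sum(v for (_, outcome), v in counts.items() if outcome == "purchased") / total

# Conditional probability P(purchased | planned) = P(planned and purchased) / P(planned).
p_purchased_given_planned = p_joint / p_planned

# Independence would require P(planned and purchased) = P(planned) * P(purchased).
independent = math.isclose(p_joint, p_planned * p_purchased)

print(p_joint, p_planned, p_purchased)
print(round(p_purchased_given_planned, 2), independent)
```

Because the conditional probability (about 0.67) differs from the marginal probability of purchasing (0.25), the two events are not independent in this sample.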
Independence

Two events, A and B, are independent if the probability of A occurring is not affected by B, and vice versa.

A and B are independent if:
P(A) = P(A|B), P(B) = P(B|A)
P(AB) = P(A)P(B) only if A and B are independent.
Bayes’ Theorem






Bayes’ rule is useful in decision analysis.
Let’s learn it through the following example:
A machine is known to be in good condition 90% of the time.
If in good condition, only 1% of output is defective.
If in poor condition, 10% of output is defective.
An item of output is observed to be defective. Given this information, what is the probability that the machine is in good condition?
Solution





G: condition of machine is good.
D: an item of output is defective.
Probabilities:
Prior (pre-condition): P(G) = 0.9, P(G’) = 0.1
Conditional: P(D|G) = 0.01, P(D|G’) = 0.10

P(G|D) = P(D|G)*P(G) / P(D) - conditional probability
But we need to find P(D).

P(defect) = P(defect and good condition) + P(defect and poor condition)
P(D) = P(G∩D) + P(G’∩D)
     = P(D|G)P(G) + P(D|G’)P(G’)
     = 0.01*0.9 + 0.10*0.1
     = 0.019

Then: P(G|D) = 0.009 / 0.019 = 0.47
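The machine example can be written as a small Python function. The function name `posterior` and its parameter names are assumptions made for this sketch; it handles only the two-state case (good or poor condition) used in the example.

```python
def posterior(prior, likelihood, likelihood_complement):
    """Bayes' rule for two states: P(G|D) = P(D|G)P(G) / [P(D|G)P(G) + P(D|G')P(G')]."""
    # P(D) is the sum of the two joint probabilities.
    evidence = likelihood * prior + likelihood_complement * (1 - prior)
    return likelihood * prior / evidence

# P(G) = 0.9, P(D|G) = 0.01, P(D|G') = 0.10
p_good_given_defect = posterior(prior=0.9, likelihood=0.01, likelihood_complement=0.10)
print(round(p_good_given_defect, 2))  # 0.009 / 0.019, about 0.47
```

Although the machine is in good condition 90% of the time, observing one defective item lowers that probability to about 0.47.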
Expression of Bayes’ Rule

P(A|B) = P(B|A)P(A) / P(B)

This follows from the formula for joint probability: P(A∩B) = P(B|A)P(A) = P(A|B)P(B).
Counting Rule 1

If any one of k different mutually exclusive and collectively exhaustive events can occur on each of n trials, the number of possible outcomes is equal to $k^n$
Example for counting rule 1

A coin (two sides) is tossed 10 times; the number of outcomes is $2^{10} = 1024$
Counting Rule 2

If there are k1 events on the first trial, k2 events on the second trial, … and kn events on the nth trial, then the number of possible outcomes is

(k1)(k2)…(kn)
Example for counting rule 2

A license plate consists of 3 letters (26 letters in total, a, b, c, …, z) followed by 3 digits (10 digits in total, 0 – 9); the number of possible outcomes is:

26 × 26 × 26 × 10 × 10 × 10 = 17,576,000
Counting Rule 3





The number of ways that n objects can be
arranged in order is:
n!=(n)(n-1)…(1)
0!=1
“!” reads “factorial”.
“n!” is read “n factorial”.
Example for counting rule 3


The number of ways that 6 books can be arranged
is:
n!=6!= 6*5*4*3*2*1=720
Counting Rule 4

Permutations: the number of ways of arranging X objects selected from n objects in order is:

$\dfrac{n!}{(n - X)!}$

Permutation: each possible arrangement is called a permutation.
Example for counting rule 4

The number of ordered arrangements of 4 books selected from 6 books is:

$\dfrac{n!}{(n - X)!} = \dfrac{6!}{(6 - 4)!} = 360$
Counting Rule 5

Combinations: the number of ways of selecting X objects out of n objects, irrespective of order, is:

$\binom{n}{X} = \dfrac{n!}{X!(n - X)!}$
Example for counting rule 5
– also called rule of combinations

Selecting 4 books out of 6 books, the number of selections is (note: order is irrelevant):

$\binom{n}{X} = \dfrac{n!}{X!(n - X)!} = \dfrac{6!}{4!(6 - 4)!} = 15$
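The five counting rules can be checked with Python's standard `math` module. This is a sketch that assumes Python 3.8 or later, where `math.perm` and `math.comb` are available; the variable names are invented for the example.

```python
import math

# Rule 1: k events on each of n trials gives k^n outcomes (coin tossed 10 times).
coin_outcomes = 2 ** 10

# Rule 2: multiply the number of events on each trial (3 letters, then 3 digits).
plate_outcomes = 26 * 26 * 26 * 10 * 10 * 10

# Rule 3: n objects can be arranged in n! ways (6 books).
book_arrangements = math.factorial(6)

# Rule 4: permutations, n! / (n - X)!  (ordered arrangements of 4 books from 6).
ordered_selections = math.perm(6, 4)

# Rule 5: combinations, n! / (X! (n - X)!)  (unordered selections of 4 books from 6).
unordered_selections = math.comb(6, 4)

print(coin_outcomes, plate_outcomes, book_arrangements,
      ordered_selections, unordered_selections)
# 1024 17576000 720 360 15
```

Each permutation count is the combination count multiplied by X!, since every unordered selection of 4 books can be arranged in 4! = 24 orders (15 × 24 = 360).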