Lecture Notes for Week 5
Introduction to Probability and Statistics
Chapter 12
Topics
• Types of Probability
• Fundamentals of Probability
• Statistical Independence and Dependence
• Expected Value
• The Normal Distribution
Sample Space and Event
• Probability is associated with performing an
experiment whose outcomes occur randomly
• The sample space contains all possible outcomes of an
experiment
• An event is a subset of the sample space
• Probability of an event is always greater than or
equal to zero
• Probabilities of all the events must sum to one
• Events in an experiment are mutually exclusive if
only one can occur at a time
Types of Probability
• Objective Probability
• Stated prior to the occurrence of the event
• Based on the logic of the process producing the outcomes
• Relative frequency is the more widely used definition of
objective probability.
• Subjective Probability
• Based on personal belief, experience, or knowledge of a
situation.
• Frequently used in making business decisions.
• Different people often arrive at different subjective
probabilities.
Fundamentals of Probability Distributions
• Frequency Distribution
• An organization of numerical data about events
• Probability Distribution
• A list of corresponding probabilities for each event
• Mutually Exclusive Events
• If two or more events cannot occur at the same
time
• Probability that one or more events will occur is
found by summing the individual probabilities of
the events:
P(A or B) = P(A) + P(B)
Fundamentals of Probability
A Frequency Distribution Example
• Grades for past four years.
Event (Grade)    Number of Students    Relative Frequency    Probability
A                300                   300/3,000             .10
B                600                   600/3,000             .20
C                1,500                 1,500/3,000           .50
D                450                   450/3,000             .15
F                150                   150/3,000             .05
Total            3,000                                       1.00
Fundamentals of Probability
Non-Mutually Exclusive Events & Joint Probability
• The probability that either of two non-mutually exclusive events M
and F, or both, will occur is expressed as:
P(M or F) = P(M) + P(F) - P(MF)
• A joint (intersection) probability, P(MF), is the probability
that two or more events that are not mutually exclusive can
occur simultaneously.
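As a quick illustration of the addition rule for non-mutually exclusive events, here is a minimal Python sketch; the probability values are hypothetical, chosen only for illustration:

```python
# Addition rule for two non-mutually exclusive events M and F.
# The probability values below are hypothetical, chosen only for illustration.
p_m = 0.40    # P(M)
p_f = 0.35    # P(F)
p_mf = 0.15   # joint probability P(MF): both events occur

p_m_or_f = p_m + p_f - p_mf   # P(M or F) = P(M) + P(F) - P(MF)
print(round(p_m_or_f, 2))     # 0.6
```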
Fundamentals of Probability
Cumulative Probability Distribution
• Determined by adding the probability of an event to the sum
of all previously listed probabilities
Event (Grade)    Probability    Cumulative Probability
A                .10            .10
B                .20            .30
C                .50            .80
D                .15            .95
F                .05            1.00
Total            1.00
• Probability that a student will get a grade of C or higher:
• P(A or B or C) = P(A) + P(B) + P(C) = .10 + .20 + .50 = .80
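The cumulative distribution above can be reproduced with a minimal Python sketch, using the grade probabilities from the table:

```python
# Grade probabilities from the frequency distribution above.
probabilities = {"A": 0.10, "B": 0.20, "C": 0.50, "D": 0.15, "F": 0.05}

# Cumulative probability: add each event's probability to the sum of all
# previously listed probabilities.
cumulative = {}
running_total = 0.0
for grade, p in probabilities.items():
    running_total += p
    cumulative[grade] = round(running_total, 2)

print(cumulative)          # {'A': 0.1, 'B': 0.3, 'C': 0.8, 'D': 0.95, 'F': 1.0}
print(cumulative["C"])     # 0.8 = P(A or B or C), i.e., a grade of C or higher
```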
Statistical Independence and Dependence
Independent Events
• Events that do not affect each other are independent.
• The probability that both events will occur is computed by multiplying
the probabilities of each event:
P(AB) = P(A) P(B)
• For a coin tossed three consecutive times, the probability of getting a head
on the first toss, a tail on the second, and a tail on the third is:
• P(HTT) = P(H) P(T) P(T) = (.5)(.5)(.5) = .125
Statistical Independence and Dependence
Independent Events – Bernoulli Process Definition
• Properties of a Bernoulli Process:
• Two possible outcomes for each trial.
• Probability of the outcome remains constant over
time.
• Outcomes of the trials are independent.
• The number of trials is discrete and an integer.
Binomial Distribution
• Used to determine the probability of a number of successes
in n trials.
P(r) = [n! / (r!(n - r)!)] p^r q^(n - r)
where:
p = probability of a success
q = 1- p = probability of a failure
n = number of trials
r = number of successes in n trials
• Determine the probability of getting exactly two tails in three tosses of a
coin:
P(2 tails) = P(r = 2) = [3! / (2!(3 - 2)!)] (.5)^2 (.5)^(3 - 2)
           = [(3·2·1) / ((2·1)(1))] (.25)(.5)
           = (6/2)(.125)
P(r = 2) = .375
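The binomial calculation above can be checked in Python with the standard library (a sketch, not part of the original notes):

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n Bernoulli trials with success probability p."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# Exactly two tails in three tosses of a fair coin (a tail counts as a "success").
print(round(binomial_pmf(2, 3, 0.5), 3))   # 0.375
```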
Example
• Microchips are inspected at the quality
control station
• From every batch, four are selected and
tested for defects
• Given a defective rate of 20%, what is the
probability that each batch contains
exactly two defectives?
Binomial Distribution Example – Quality Control
• What is the probability that each batch will contain exactly two
defectives?
P(r = 2 defectives) = [4! / (2!(4 - 2)!)] (.2)^2 (.8)^2
                    = [(4·3·2·1) / ((2·1)(2·1))] (.04)(.64)
                    = 6(.0256)
                    = .1536
• What is the probability of getting two or more defectives?
P(r ≥ 2) = [4! / (2!(4 - 2)!)] (.2)^2 (.8)^2 + [4! / (3!(4 - 3)!)] (.2)^3 (.8)^1 + [4! / (4!(4 - 4)!)] (.2)^4 (.8)^0
         = .1536 + .0256 + .0016
         = .1808
• Probability of less than two defectives:
P(r<2) = P(r=0) + P(r=1) = 1.0 - [P(r=2) + P(r=3) + P(r=4)]
= 1.0 - .1808 = .8192
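A short Python sketch that reproduces the quality-control figures above, using the same binomial formula:

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n trials with success probability p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 4, 0.2   # four chips sampled per batch, 20% defective rate

exactly_two = binomial_pmf(2, n, p)
two_or_more = sum(binomial_pmf(r, n, p) for r in range(2, n + 1))
less_than_two = 1 - two_or_more

print(round(exactly_two, 4))     # 0.1536
print(round(two_or_more, 4))     # 0.1808
print(round(less_than_two, 4))   # 0.8192
```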
Dependent Events
• If the occurrence of one event affects the probability of the
occurrence of another event, the events are dependent.
• Coin toss to select bucket, draw for blue ball.
• If tail occurs, 1/6 chance of drawing blue ball from bucket 2; if head
results, no possibility of drawing blue ball from bucket 1.
• The probability of the event “drawing a blue ball” is dependent on the
event “flipping a coin.”
Dependent Events – Conditional Probabilities
• Unconditional: P(H) = .5; P(T) = .5, must sum to one.
• Conditional: P(R|H) = .33, P(W|H) = .67, P(R|T) = .83,
P(W|T) = .17
Math Formulation of Conditional Probabilities
• Given two dependent events A and B:
P(A|B) = P(AB)/P(B)  or  P(AB) = P(A|B)·P(B)
• With data from previous example:
P(RH) = P(R|H) P(H) = (.33)(.5) = .165
P(WH) = P(W|H) P(H) = (.67)(.5) = .335
P(RT) = P(R|T) P(T) = (.83)(.5) = .415
P(WT) = P(W|T) P(T) = (.17)(.5) = .085
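A brief Python sketch of the same joint-probability calculations, using the conditional probabilities from the bucket example:

```python
# Conditional probabilities of drawing a red (R) or white (W) ball,
# given the coin toss result: heads (H, bucket 1) or tails (T, bucket 2).
conditional = {("R", "H"): 0.33, ("W", "H"): 0.67,
               ("R", "T"): 0.83, ("W", "T"): 0.17}
marginal_coin = {"H": 0.5, "T": 0.5}

# Joint probability: P(ball and coin) = P(ball | coin) * P(coin).
joint = {(ball, coin): p * marginal_coin[coin]
         for (ball, coin), p in conditional.items()}

for outcome, p in joint.items():
    print(outcome, round(p, 3))
# ('R', 'H') 0.165, ('W', 'H') 0.335, ('R', 'T') 0.415, ('W', 'T') 0.085
```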
Summary of Example Problem Probabilities
Bayesian Analysis
• In Bayesian analysis, additional information is used to alter
(improve) the marginal probability of the occurrence of an
event.
• Improved probability is called posterior probability
• A posterior probability is the altered marginal probability of
an event based on additional information.
• Bayes’ Rule for two events, A and B, and third event, C,
conditionally dependent on A and B:
P(A|C) = P(C|A)P(A) / [P(C|A)P(A) + P(C|B)P(B)]
Bayesian Analysis – Example (1 of 2)
• Machine setup; if the setup is correct, there is a 10% chance of a
defective part; if incorrect, a 40% chance.
• There is a 50% chance that the setup is correct and a 50% chance that it is incorrect.
• What is probability that machine setup is incorrect if sample
part is defective?
• Solution: P(C) = .50, P(IC) = .50, P(D|C) = .10, P(D|IC) = .40,
where C = correct, IC = incorrect, D = defective
P(IC|D) = P(D|IC)P(IC) / [P(D|IC)P(IC) + P(D|C)P(C)]
        = (.40)(.50) / [(.40)(.50) + (.10)(.50)]
        = .20 / .25
        = .80
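The posterior probability can be reproduced with a small Python sketch of Bayes' rule for two states, using the numbers from this example:

```python
def posterior(p_e_given_a, p_a, p_e_given_b, p_b):
    """Bayes' rule for two mutually exclusive states A and B and evidence E: returns P(A | E)."""
    return (p_e_given_a * p_a) / (p_e_given_a * p_a + p_e_given_b * p_b)

# Machine-setup example: IC = incorrect setup, C = correct setup, D = defective part.
p_ic_given_d = posterior(p_e_given_a=0.40, p_a=0.50,   # P(D | IC), P(IC)
                         p_e_given_b=0.10, p_b=0.50)   # P(D | C),  P(C)
print(round(p_ic_given_d, 2))   # 0.8
```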
Statistical Independence and Dependence
Bayesian Analysis – Example (2 of 2)
• Previously, the manager knew there was a 50% chance
that the machine was set up incorrectly.
• Now, after testing a part and finding it defective, he knows
there is a .80 probability that the machine was set up
incorrectly.
Expected Value
Random Variables
• When the values of variables occur in no particular order or
sequence, the variables are referred to as random
variables.
• Random variables are represented by letters such as x, y, and z.
• Possible to assign a probability to the occurrence of
possible values.
• Examples: the possible values of the number of heads in a series of
coin tosses, or the possible values of demand per week.
Expected Value
Example (1 of 4)
• Machines break down 0, 1, 2, 3, or 4 times per month.
• Relative frequency of breakdowns, or a probability
distribution:
Random Variable x (Number of Breakdowns)    P(x)
0                                           .10
1                                           .20
2                                           .30
3                                           .25
4                                           .15
Total                                       1.00
Expected Value
Example (2 of 4)
• The expected value is computed by multiplying each possible value of the
variable by its probability and summing these products.
• The weighted average, or mean, of the probability
distribution of the random variable.
• Expected value of number of breakdowns per month:
E(x) = (0)(.10) + (1)(.20) + (2)(.30) + (3)(.25) + (4)(.15)
= 0 + .20 + .60 + .75 + .60
= 2.15 breakdowns
Expected Value
Example (3 of 4)
• Variance is a measure of the dispersion of random variable
values about the mean.
• Variance is computed as follows:
• Square the difference between each value and the expected value.
• Multiply the resulting amounts by the probability of each value.
• Sum the values computed in step 2.
• General formula:
σ² = Σ [xi - E(x)]² P(xi)
Expected Value
Example (4 of 4)
• Standard deviation computed by taking the square root of
the variance.
• For example data:
xi    P(xi)    xi - E(x)    [xi - E(x)]²    [xi - E(x)]² P(xi)
0     .10      -2.15        4.62            .462
1     .20      -1.15        1.32            .264
2     .30      -0.15        0.02            .006
3     .25       0.85        0.72            .180
4     .15       1.85        3.42            .513
      1.00                                  1.425

σ² = 1.425 breakdowns per month
standard deviation σ = √1.425 = 1.19 breakdowns per month
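The expected value, variance, and standard deviation above can be recomputed with a short Python sketch (small differences from the table reflect rounding of the table entries):

```python
from math import sqrt

# Probability distribution of monthly breakdowns from the example above.
distribution = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.25, 4: 0.15}

expected = sum(x * p for x, p in distribution.items())
variance = sum((x - expected) ** 2 * p for x, p in distribution.items())
std_dev = sqrt(variance)

print(round(expected, 2))   # 2.15 breakdowns per month
print(round(variance, 2))   # 1.43 (the table's 1.425 uses rounded entries)
print(round(std_dev, 2))    # 1.19 breakdowns per month
```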
Poisson Distribution
• Based on the number of outcomes occurring during
a given time interval or in a specified region
• Examples
– # of accidents that occur on a given highway during a 1-week period
– # of customers coming to a bank during a 1-hour interval
– # of TVs sold at a department store during a given week
– # of breakdowns of a washing machine per month
Conditions
• Consider the # of breakdowns of a washing
machine per month example
– Each breakdown is called an occurrence
– Occurrences are random in that they do not
follow any pattern (they are unpredictable)
– An occurrence is always considered with respect
to an interval (here, one month)
The Probability Mass Function (PMF)
• X = number of counts in the interval
• X is a Poisson random variable with parameter λ > 0
• PMF:
f(x) = e^(-λ) λ^x / x!,   x = 0, 1, 2, …
• Mean and variance:
E[X] = λ,   V(X) = λ
Example
• If a bank gets on average λ = 6 bad checks per
day, what are the probabilities that it will receive
four bad checks on any given day? 10 bad checks
on any two consecutive days?
• Solution
For x = 4 and λ = 6:  f(4) = 6^4 e^(-6) / 4! ≈ 0.134
For two consecutive days, λ = 12 and x = 10:  f(10) = 12^10 e^(-12) / 10! ≈ 0.105
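A minimal Python sketch of the Poisson PMF that reproduces these values (standard library only):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences in an interval with mean rate lam."""
    return exp(-lam) * lam**x / factorial(x)

print(round(poisson_pmf(4, 6), 3))     # 0.134 -> four bad checks in one day (lambda = 6)
print(round(poisson_pmf(10, 12), 3))   # 0.105 -> ten bad checks in two days (lambda = 12)
```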
Example
• The number of failures of a testing instrument from
contamination particles on the product is a Poisson
random variable with a mean of 0.02 failures per
hour.
– What is the probability that the instrument does not fail in
an 8-hour shift?
– What is the probability of at least one failure in one 24-hour day?
Solution
a) Let X denote the number of failures in 8 hours. Then X has a
Poisson distribution with λ = (8)(0.02) = 0.16
P(X = 0) = e^(-0.16) = 0.8521
b) Let Y denote the number of failures in 24 hours.
Then Y has a Poisson distribution with λ = (24)(0.02) = 0.48
P(Y ≥ 1) = 1 - P(Y = 0) = 1 - e^(-0.48) = 0.3812
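The same two probabilities, computed with a brief Python sketch:

```python
from math import exp

rate_per_hour = 0.02                  # mean failures per hour

lam_shift = 8 * rate_per_hour         # lambda for an 8-hour shift = 0.16
lam_day = 24 * rate_per_hour          # lambda for a 24-hour day   = 0.48

p_no_failure_shift = exp(-lam_shift)      # P(X = 0) for a Poisson variable
p_at_least_one_day = 1 - exp(-lam_day)    # P(Y >= 1) = 1 - P(Y = 0)

print(round(p_no_failure_shift, 4))   # 0.8521
print(round(p_at_least_one_day, 4))   # 0.3812
```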
The Normal Distribution
Continuous Random Variables
• A continuous random variable can take on an infinite number
of values within some interval.
• Continuous random variables have values that are not
countable.
• A unique probability cannot be assigned to each individual value.
The Normal Distribution
Definition
• The normal distribution is a continuous probability
distribution that is symmetrical on both sides of the mean.
• The center of a normal distribution is its mean μ.
• The area under the normal curve represents probability,
and total area under the curve sums to one.
The Normal Distribution
Example (1 of 5)
• Mean weekly carpet sales of 4,200 yards, with standard
deviation of 1,400 yards.
• What is probability of sales exceeding 6,000 yards?
• μ = 4,200 yd; σ = 1,400 yd; the probability that the number of yards
of carpet sold will be equal to or greater than 6,000 is expressed
as P(x ≥ 6,000).
The Normal Distribution
Example (2 of 5)
The Normal Distribution
Standard Normal Curve (1 of 2)
• The area or probability under a normal curve is measured
by determining the number of standard deviations from the
mean.
• The number of standard deviations a value is from the mean
is designated Z:
Z = (x - μ)/σ
The Normal Distribution
Standard Normal Curve (2 of 2)
The Normal Distribution
Example (3 of 5)
Z = (x - μ)/σ = (6,000 - 4,200)/1,400
  = 1.29 standard deviations
P(x ≥ 6,000) = .5000 - .4015 = .0985
The Normal Distribution
Example (4 of 5)
• Determine probability that demand will be 5,000 yards or
less.
Z = (x - μ)/σ = (5,000 - 4,200)/1,400 = .57 standard deviations
P(x ≤ 5,000) = .5000 + .2157 = .7157
The Normal Distribution
Example (5 of 5)
• Determine probability that demand will be between 3,000
yards and 5,000 yards.
Z = (3,000 - 4,200)/1,400 = -1,200/1,400 = -.86
P(3,000 ≤ x ≤ 5,000) = .2157 + .3051 = .5208
Using a Different (Cumulative) Normal Table
• P(3,000 ≤ x ≤ 5,000)
= P((3,000 - 4,200)/1,400 ≤ z ≤ (5,000 - 4,200)/1,400)
= P(-0.86 ≤ z ≤ 0.57)
= P(z ≤ 0.57) - P(z ≤ -0.86)
= P(z ≤ 0.57) - P(z ≥ 0.86)
= P(z ≤ 0.57) - [1 - P(z ≤ 0.86)]
= (0.7157) - [1 - 0.8051] = 0.5208
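These normal-curve probabilities can be checked with the standard normal CDF, which is available through math.erf in Python's standard library; the results differ slightly from the slides because the slides round Z to two decimal places:

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Cumulative probability P(X <= x) for a normal distribution with mean mu, std dev sigma."""
    z = (x - mu) / sigma
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 4200, 1400   # weekly carpet demand parameters from the example

p_over_6000 = 1 - normal_cdf(6000, mu, sigma)                           # P(x >= 6,000)
p_under_5000 = normal_cdf(5000, mu, sigma)                              # P(x <= 5,000)
p_between = normal_cdf(5000, mu, sigma) - normal_cdf(3000, mu, sigma)   # P(3,000 <= x <= 5,000)

print(round(p_over_6000, 3))    # 0.099 (the slide's .0985 rounds Z to 1.29)
print(round(p_under_5000, 3))   # 0.716
print(round(p_between, 3))      # 0.52
```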
The Normal Distribution
Sample Mean and Variance
• The population mean and variance are for the entire set of
data being analyzed.
• The sample mean and variance are derived from a subset
of the population data and are used to make inferences
about the population.
The Normal Distribution
Computing the Sample Mean and Variance
Sample mean:  x̄ = (Σ xi) / n,  summing over i = 1, …, n
Sample variance:  s² = Σ (xi - x̄)² / (n - 1)
Sample standard deviation:  s = √s²
The Normal Distribution
Example Problem Re-Done
Sample mean = 42,000/10 = 4,200 yd
Sample variance = [(190,060,000) - (1,764,000,000/10)]/9
= 1,517,777
Sample std. dev. = sqrt(1,517,777)
= 1,232 yd
Week i    Demand xi (yd)
1         2,900
2         5,400
3         3,100
4         4,700
5         3,800
6         4,300
7         6,800
8         2,900
9         3,600
10        4,500
Total     42,000
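A short Python sketch that recomputes the sample statistics from the demand data above:

```python
from math import sqrt

demand = [2900, 5400, 3100, 4700, 3800, 4300, 6800, 2900, 3600, 4500]

n = len(demand)
sample_mean = sum(demand) / n
sample_variance = sum((x - sample_mean) ** 2 for x in demand) / (n - 1)
sample_std = sqrt(sample_variance)

print(sample_mean)              # 4200.0 yd
print(round(sample_variance))   # 1517778 (the slide truncates to 1,517,777)
print(round(sample_std))        # 1232 yd
```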
The Normal Distribution
Chi-Square Test for Normality (1 of 2)
• It can never be simply assumed that data are normally
distributed.
• A statistical test must be performed to determine the exact
distribution.
• The Chi-square test is used to determine if a set of data fit a
particular distribution.
• It compares an observed frequency distribution with a
theoretical frequency distribution that would be expected to
occur if the data followed a particular distribution (testing
the goodness-of-fit).
The Normal Distribution
Chi-Square Test for Normality (2 of 2)
• In the test, the actual number of frequencies in each range
of the frequency distribution is compared to the theoretical
frequencies that should occur in each range if the data
follow a particular distribution.
• A Chi-square statistic is then calculated and compared to a
number, called a critical value, from a chi-square table.
• If the test statistic is greater than the critical value, the
data do not follow the distribution being tested; if it
is less, the hypothesized distribution cannot be rejected.
• Chi-square test is a form of hypothesis testing.
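As an illustration of the goodness-of-fit idea (not an example from these notes), the following Python sketch computes a chi-square statistic for hypothetical observed and expected frequencies; the frequencies and the critical value are assumptions chosen only for illustration:

```python
# Hypothetical observed frequencies in six ranges of a frequency distribution,
# and the theoretical frequencies expected if the data followed the hypothesized
# distribution. These numbers are illustrative only, not data from the notes.
observed = [8, 14, 20, 25, 18, 15]
expected = [10, 15, 22, 22, 17, 14]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all ranges.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical value from a chi-square table (11.07 is the tabled value for 5 degrees
# of freedom at a .05 significance level, used here purely for illustration).
critical_value = 11.07

print(round(chi_square, 2))
if chi_square > critical_value:
    print("Reject the hypothesized distribution.")
else:
    print("Cannot reject the hypothesized distribution (the fit is acceptable).")
```

In a test for normality, the degrees of freedom used to look up the critical value would also be reduced by the number of parameters (the mean and standard deviation) estimated from the data.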
Statistical Analysis with Excel (1 of 3)
Statistical Analysis with Excel (2 of 3)
Statistical Analysis with Excel (3 of 3)