
Random Variables and
Probability Distributions
Random Variables
• Definition:
– A rule that assigns one (and only one)
numerical value to each simple event of an
experiment; or
– A function that assigns numerical values to
the possible outcomes of an experiment.
• Two types:
– Discrete
– Continuous
Discrete Random Variables
• Definition:
– A random variable whose numerical values are
limited to specific values within its range; or
– A random variable that can assume a countable
number of values.
• Examples:
– number of traffic accidents
– mortgage rates
– shoe size
Continuous Random Variables
• Definition:
– A random variable that can take any value over a
continuous range of values; or
– A random variable that can assume values
corresponding to any of the points contained in one or
more intervals.
• Examples:
– length of right foot of a person
– length of time between arrivals
– weight of a food item bought at a store
Examples
• Are the following discrete or continuous
random variables?
– The pump price of a gallon of gasoline in
dollars.
– The time taken by a flight from New York to
London.
– The age of a grocery store shopper in years.
Probability Distribution for a
Discrete Random Variable
• Definition:
– A list of all the possible values of the random variable
and their respective probabilities; or
– A graph, table or formula that specifies the probability
associated with each possible value the random
variable can assume.
• Requirements:
– 1. $p(x_i) \ge 0$ for all values of $x_i$
– 2. $\sum_{i} p(x_i) = 1$
Example
• Let the random variable of interest be the face
value shown when tossing a die:
– For x=1, P(x)=1/6,
– For x=2, P(x)=1/6,
– For x=3, P(x)=1/6,
– For x=4, P(x)=1/6,
– For x=5, P(x)=1/6,
– For x=6, P(x)=1/6.
• Check: $p(x) \ge 0$ for all x, and $\sum_x p(x) = 1$.
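The two requirements can be checked mechanically for the die example. The following is a minimal sketch; the helper name `is_valid_pmf` is hypothetical, and exact fractions are used so the sum is tested without rounding error.

```python
from fractions import Fraction

# Probability distribution of a fair die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def is_valid_pmf(pmf):
    """Check both requirements: p(x) >= 0 for all x, and the p(x) sum to 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return non_negative and sums_to_one


print(is_valid_pmf(die_pmf))  # True
```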
Example
• Let the random variable of interest be the
number of heads observed when two fair
coins are tossed:
– {No heads observed} For x=0,
P(x=0)=P(T1,T2)=1/4
– {One head observed} For x=1,
P(x=1)=P(T1,H2)+P(H1,T2)=1/4+1/4=1/2
– {Two heads observed} For x=2,
P(x=2)=P(H1,H2)=1/4
Example
• Let the random variable of interest (x) be the
number of candy bars sold by a vending
machine (which holds 500 bars) in one day.
• X has a range of 0 to 500 and each value of X is
equally likely.
– What is the probability that exactly 250 candy bars
will be sold?
– What is the probability that more than 250 candy bars
will be sold?
– What is the probability that an odd number of candy
bars will be sold?
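One way to work the three questions is to count values directly. This sketch takes the slide's assumption at face value (every value from 0 to 500 is equally likely, which gives 501 possible values); the variable names are illustrative.

```python
from fractions import Fraction

values = range(0, 501)             # 0, 1, ..., 500: 501 possible values
p_each = Fraction(1, len(values))  # equally likely, as the slide assumes

p_exactly_250 = p_each
p_more_than_250 = sum(p_each for x in values if x > 250)
p_odd = sum(p_each for x in values if x % 2 == 1)

print(p_exactly_250)    # 1/501
print(p_more_than_250)  # 250/501
print(p_odd)            # 250/501
```

Note that "more than 250" covers 251 through 500, which is 250 values, not 251.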
Cumulative Distribution Function
• Definition:
--The cumulative distribution function, F(x), of the random
variable X is defined for each real number x as follows:
F(x) = P(X ≤ x) for -∞ < x < ∞
where P(X ≤ x) means the probability associated with the
event {X ≤ x}.
– Thus, F(x) is the probability that, when the experiment is
done, the random variable X will have taken on a value
no larger than the number x.
– When X is discrete, $F(x) = \sum_{x_i \le x} p(x_i)$
Cumulative Distribution Function
• Requirements for F(x):
– 1. F is a non-decreasing function:
» if a < b then F(a) ≤ F(b)
– 2. $\lim_{x \to \infty} F(x) = 1$
– 3. $\lim_{x \to -\infty} F(x) = 0$
--Thus, 0 ≤ F(x) ≤ 1
--We can easily show from these requirements that:
P(a < X ≤ b) = F(b) – F(a), for all a < b.
(When X is continuous, P(X = a) = 0, so P(a ≤ X ≤ b) equals F(b) – F(a) as well.)
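A short sketch of the discrete case, using the fair die as the example. The function name `cdf` and the choice a = 2, b = 5 are illustrative assumptions. For a discrete variable the difference F(b) – F(a) counts the values with a < x ≤ b, so the left endpoint is excluded.

```python
from fractions import Fraction

die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def cdf(pmf, x):
    """F(x) = P(X <= x): add up p(x_i) over every x_i that is <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))


a, b = 2, 5
direct = sum(p for xi, p in die_pmf.items() if a < xi <= b)  # P(a < X <= b)
via_cdf = cdf(die_pmf, b) - cdf(die_pmf, a)                  # F(b) - F(a)

print(direct, via_cdf)  # 1/2 1/2
```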
Mean (Expected Value) of a
Discrete Random Variable
• The mean, or expected value, of a discrete
random variable is given by:
$\mu = E(x) = \sum_x x\,p(x)$
– It is possible that a discrete
random variable may
never equal its mean.
• Example:
– Expected value of rolling a die.
Example
• From earlier die toss experiment:
– x=1, P(x)=1/6,
– x=2, P(x)=1/6,
– x=3, P(x)=1/6,
– x=4, P(x)=1/6,
– x=5, P(x)=1/6,
– x=6, P(x)=1/6.
• Mean or expected value:
$\mu = E(x) = \sum_x x\,p(x)$
$= (1 \times \tfrac{1}{6}) + (2 \times \tfrac{1}{6}) + (3 \times \tfrac{1}{6}) + (4 \times \tfrac{1}{6}) + (5 \times \tfrac{1}{6}) + (6 \times \tfrac{1}{6}) = 3.5$
Variance of a Discrete Random
Variable
• The variance of a discrete random variable
is given by:
$\sigma^2 = E[(x-\mu)^2] = \sum_x (x-\mu)^2 p(x)$
• Examples:
– Variance of rolling a die.
• Standard deviation is the positive square
root of the variance.
Example
• From earlier die toss experiment:
– x=1, P(x)=1/6; x=2, P(x)=1/6; x=3, P(x)=1/6; x=4,
P(x)=1/6; x=5, P(x)=1/6; x=6, P(x)=1/6.
– E(x)=3.5
• Variance:
$\sigma^2 = E[(x-\mu)^2] = \sum_x (x-\mu)^2 p(x)$
$= ((1-3.5)^2 \times \tfrac{1}{6}) + ((2-3.5)^2 \times \tfrac{1}{6}) + ((3-3.5)^2 \times \tfrac{1}{6})$
$+ ((4-3.5)^2 \times \tfrac{1}{6}) + ((5-3.5)^2 \times \tfrac{1}{6}) + ((6-3.5)^2 \times \tfrac{1}{6}) = 2.9167$
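The die-toss mean and variance can be reproduced with a few lines of code. This is a minimal sketch; the function names `mean` and `variance` are illustrative, and exact fractions are used so that 3.5 and 2.9167 appear as 7/2 and 35/12.

```python
from fractions import Fraction
from math import sqrt

die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def mean(pmf):
    """mu = sum over x of x * p(x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """sigma^2 = sum over x of (x - mu)^2 * p(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


mu = mean(die_pmf)       # 7/2, i.e. 3.5
var = variance(die_pmf)  # 35/12, about 2.9167
sd = sqrt(var)           # positive square root of the variance

print(mu, var, round(sd, 4))
```

The mean 3.5 is not a face of the die, which illustrates that a discrete random variable may never equal its mean.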
Other Related Topics
• Excel’s RAND() function generates a uniformly distributed
random number that is at least 0 and less than 1.
• When two random variables are related in the sense
that they both depend on which of several possible
scenarios occurs, the covariance and correlation
are summary measures of the relationship between
them.
• p(xi, yi), the joint probability, is the probability that
the random variables X and Y equal the values xi
and yi, respectively.
• When X and Y are independent random variables,
the joint probability is equal to the product of the
marginals.
Binomial Random Variable
• Definition:
– The random variable (x) which represents the
number of successes that occur in n
independent trials is said to be a binomial
random variable with parameters (n,p) where
p is the probability of success on a given trial.
– Counts the number of successes (or failures)
in n trials.
Characteristics of a Binomial
Random Variable
• The experiment consists of n identical trials.
• There are only two possible outcomes on each
trial (S for Success or F for Failure).
• The probability of a success (S) is p for each
trial. P(S)= p; P(F)= q; p+q=1.
• The trials are independent.
• The binomial random variable x is the number of
Successes in n independent trials.
Example
• Flip a coin 50 times. Count the number of
heads.
• A type of machine breaks down 10% of the time
on a production run. Count the breakdowns in
60 production runs.
• Some customers purchase gum when checking
out at a store. Count the number of customers
who purchase gum.
Binomial Probability Distribution
$p(x) = \binom{n}{x} p^x q^{n-x}$
$(x = 0, 1, 2, 3, \ldots, n)$
• where
– p=probability of a success on a single trial
– q=1-p
– n=Number of trials
– x=Number of successes in n trials
Binomial Probability Distribution
$p(x) = \binom{n}{x} p^x q^{n-x}$
$(x = 0, 1, 2, 3, \ldots, n)$
• $\binom{n}{x}$: number of simple events with x Successes in n trials.
• $p^x q^{n-x}$: probability of x Successes and (n-x) Failures in
any simple event.
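The formula translates directly into code. This sketch uses Python's standard `math.comb` for the binomial coefficient; the function name `binomial_pmf` and the choice of four fair trials (n = 4, p = 0.5) are illustrative assumptions.

```python
from math import comb


def binomial_pmf(x, n, p):
    """p(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)


# Full distribution for n = 4 trials with success probability 0.5.
distribution = [binomial_pmf(x, 4, 0.5) for x in range(5)]
print(distribution)  # [0.0625, 0.25, 0.375, 0.25, 0.0625]
```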
Example
• Toss four coins:
– What is the probability of obtaining two heads
and two tails?
$p(2) = \binom{4}{2} (0.5)^2 (0.5)^{4-2} = \frac{4!}{2!\,(4-2)!} (0.5)^2 (0.5)^{4-2} = 0.375$
– What is the probability of obtaining one head
and three tails?
$p(1) = \binom{4}{1} (0.5)^1 (0.5)^{4-1} = \frac{4!}{1!\,(4-1)!} (0.5)^1 (0.5)^{4-1} = 0.250$
Example
• A machine produces defective items with a
probability of 0.1:
– What is the probability that in a sample of five items,
at most one item will be defective?
– What is the probability that in a sample of five items,
exactly two items will be defective?
– What is the probability that in a sample of five items,
more than three items will be defective?
Example
• Let x be the number of defective items out
of five:
$p(0) = \binom{5}{0} (0.1)^0 (0.9)^{5-0} = 0.59049$
$p(1) = \binom{5}{1} (0.1)^1 (0.9)^{5-1} = 0.32805$
$p(2) = \binom{5}{2} (0.1)^2 (0.9)^{5-2} = 0.07290$
$p(3) = \binom{5}{3} (0.1)^3 (0.9)^{5-3} = 0.00810$
$p(4) = \binom{5}{4} (0.1)^4 (0.9)^{5-4} = 0.00045$
$p(5) = \binom{5}{5} (0.1)^5 (0.9)^{5-5} = 0.00001$
Example
– What is the probability that in a sample of five items,
at most one item will be defective?
$p(x \le 1) = p(x=0) + p(x=1) = 0.59049 + 0.32805 = 0.91854$
– What is the probability that in a sample of five items,
exactly two items will be defective?
$p(x = 2) = 0.07290$
– What is the probability that in a sample of five items,
more than three items will be defective?
$p(x > 3) = p(x=4) + p(x=5) = 0.00045 + 0.00001 = 0.00046$
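The three answers for the defective-item example can be cross-checked in code. This is a sketch only; `binomial_pmf` is a hypothetical helper that implements the binomial formula, with n = 5 and p = 0.1 taken from the example.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.1  # sample of five items, defect probability 0.1

at_most_one = sum(binomial_pmf(x, n, p) for x in (0, 1))
exactly_two = binomial_pmf(2, n, p)
more_than_three = sum(binomial_pmf(x, n, p) for x in (4, 5))

print(round(at_most_one, 5))      # 0.91854
print(round(exactly_two, 5))      # 0.0729
print(round(more_than_three, 5))  # 0.00046
```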
Mean, Variance, Standard Deviation of
a Binomial Random Variable
• Mean: $\mu = E(x) = np$
• Variance: $\sigma^2 = npq$
• Standard Deviation: $\sigma = \sqrt{npq}$
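These shortcut formulas agree with the general definitions of mean and variance applied to the binomial distribution. The sketch below compares the two routes, using n = 10 and p = 0.7 (the values in the example that follows); the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 10, 0.7
q = 1 - p

# Shortcut formulas.
mean_formula = n * p
var_formula = n * p * q
sd_formula = sqrt(var_formula)

# General definitions, summing over the whole binomial distribution.
pmf = {x: comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)}
mean_sum = sum(x * px for x, px in pmf.items())
var_sum = sum((x - mean_sum) ** 2 * px for x, px in pmf.items())

print(round(mean_formula, 3), round(var_formula, 3), round(sd_formula, 3))
print(round(mean_sum, 3), round(var_sum, 3))
```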
Example
• Let x be a binomial random variable with
p=0.7 and n=10:
– The mean is:
$\mu = E(x) = np = 10 \times 0.7 = 7.0$
– The variance is:
$\sigma^2 = npq = 10 \times 0.7 \times 0.3 = 2.1$
– The standard deviation is:
$\sigma = \sqrt{npq} = \sqrt{10 \times 0.7 \times 0.3} = \sqrt{2.1} = 1.449$
Binomial Example
• Experiment:
– Flip a coin three times and record the value of
the up face.
– What is the probability of getting exactly two
heads?
– Eight possible sequences of heads and tails
(why?): two outcomes per flip and n = 3 flips give $2^n = 2^3 = 8$.
– HHH, HHT, HTH, HTT, THH, THT, TTH, TTT.
Example
• Each sequence is equally likely, that is
p(x)=1/8=0.125:
– How many ways to get 2 heads?
$\binom{n}{x} = \binom{3}{2} = \frac{3!}{2!\,(3-2)!} = \frac{3!}{2!} = \frac{3 \times 2}{2} = 3$
– Probability of each sequence is:
$p^x q^{n-x} = (0.5)^2 (0.5)^1 = 0.125$
– Probability of exactly two heads is 3 out of 8
(3/8=0.375) by counting or (3*0.125=0.375) by
binomial formula.
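The counting argument can be confirmed by listing every sequence. A minimal sketch using the standard `itertools.product`; the variable names are illustrative.

```python
from itertools import product

# All sequences of three flips, each flip being H or T.
sequences = ["".join(flips) for flips in product("HT", repeat=3)]
two_heads = [s for s in sequences if s.count("H") == 2]

prob_by_counting = len(two_heads) / len(sequences)

print(len(sequences))    # 8
print(two_heads)         # ['HHT', 'HTH', 'THH']
print(prob_by_counting)  # 0.375
```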
Probability Table
• A table that lists the probability of any two characteristics
where each characteristic can take on multiple values.
• Example:
– Grocery shoppers by gender and senior citizen
status.
            Senior Citizen   Not Senior Citizen   Total
Male             0.02               0.28           0.30
Female           0.08               0.62           0.70
Total            0.10               0.90           1.00
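The margins of the shopper table are the row and column sums of the joint probabilities, and they allow a check of the earlier rule that independence means each joint probability equals the product of its marginals. This is a sketch; the dictionary layout and the helper name `marginal` are assumptions made for illustration.

```python
# Joint probabilities from the grocery shopper table.
joint = {
    ("Male", "Senior"): 0.02,
    ("Male", "Not senior"): 0.28,
    ("Female", "Senior"): 0.08,
    ("Female", "Not senior"): 0.62,
}


def marginal(joint, index, value):
    """Sum the joint probabilities whose key has `value` at position `index`."""
    return sum(p for key, p in joint.items() if key[index] == value)


p_male = marginal(joint, 0, "Male")      # row total, about 0.30
p_senior = marginal(joint, 1, "Senior")  # column total, about 0.10

# Independent only if every cell equals the product of its two marginals.
independent = all(
    abs(p - marginal(joint, 0, g) * marginal(joint, 1, s)) < 1e-9
    for (g, s), p in joint.items()
)

print(round(p_male, 2), round(p_senior, 2), independent)
```

Here P(Male and Senior) = 0.02 while P(Male) × P(Senior) = 0.03, so in this table gender and senior status are not independent.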
Continuous Random Variables
• f(x) is the probability density function of the
continuous random variable x if these conditions
are met for any values a and b:
– 1. $f(x) \ge 0, \; -\infty < x < \infty$
– 2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
– 3. $P(a \le X \le b) = \int_a^b f(x)\,dx$
Mean (Expected Value) for a
Continuous Random Variable
• The expected value of a continuous
random variable x is the average or mean
value of x and is given by:
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Variance of a Continuous
Random Variable
• The variance of a continuous random
variable x is the expectation of the
squared difference between x and its
mean $\mu$ and is given by:
$Var(X) = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$
– Alternatively:
$Var(X) = E(X^2) - \mu^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$
Cumulative Distribution
• Nondecreasing function of the random
variable x with the properties:
• 1. $P(X \le x) = F(x) = \int_{-\infty}^{x} f(t)\,dt$
• 2. $F(-\infty) = 0$
• 3. $F(\infty) = 1$
• 4. $P(a \le X \le b) = F(b) - F(a)$
• 5. $\frac{dF(x)}{dx} = f(x)$
Normal Distribution
(Continuous Random Variable)
• Properties:
– Many real-life observations follow the normal
distribution (or are very close to being normally
distributed);
– The probability distribution is bell-shaped and
continuous;
– The probability distribution is symmetric about the
mean and is uni-modal;
– Two parameters define the normal distribution, the
mean and the standard deviation.
Normal Distribution Examples
– Height of adult males;
– Number of ounces of soft drink dispensed by
a filling machine;
– Distribution of scores on a test.
• Probability density function:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
– x can take any value in the range $(-\infty, \infty)$
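The density can be evaluated and integrated numerically to see that it behaves like a probability density. This is a minimal sketch, assuming a standard normal (mean 0, standard deviation 1) purely for illustration; the trapezoidal-rule helper `integrate` is hypothetical and stands in for any numerical integration method.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


mu, sigma = 0.0, 1.0  # illustrative parameters


def density(x):
    return normal_pdf(x, mu, sigma)


# Area over mu +/- 8 sigma stands in for the whole real line.
total_area = integrate(density, mu - 8 * sigma, mu + 8 * sigma)
within_one_sd = integrate(density, mu - sigma, mu + sigma)

print(round(total_area, 6))     # close to 1
print(round(within_one_sd, 4))  # close to 0.6827
```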
Notes: Normal Distribution
• If x is a continuous random variable which
follows a normal distribution:
– x can assume any value over a specified
range.
– The probability that x is a specific value is
equal to 0.
– Typically, we are interested in the probability
that x falls between two points.
– The density has no closed-form antiderivative, so probabilities
come from numerical (approximate) integration or from tables.
Z-score and the Normal
Distribution
• Difficult to integrate the normal probability
density function. Instead, use z-score:
– Standard normal table shows areas under
curve for a normal curve with mean=0 and
standard deviation=1.
– Need to standardize x values of interest by
using: $z = \frac{x-\mu}{\sigma}$
Steps for Calculating Probability
Using the Z-score
• Sketch a bell-shaped curve, indicate the mean
and the value(s) of x of interest.
• Shade the area (which represents the
probability) you are interested in obtaining.
• Use the z-score formula to calculate z-value(s)
for the values of x of interest.
• Look up z-values in table (or use Excel) to find
corresponding area(s). You may need to use
symmetry.
Examples
• Life of rechargeable battery for laptop computer
has a normal distribution with a mean of 4 hours
and a standard deviation of 2 hours:
– What is the probability that the battery will last
between 5 and 6 hours?
• Gas mileage for a car is normally distributed with
a mean of 25 mpg and a standard deviation of 6
mpg:
– What is the probability that a car will have a gas
mileage between 20 and 25 mpg?
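The z-score steps can be sketched in code, with the table lookup replaced by the standard library's error function, using the identity Φ(z) = 0.5 (1 + erf(z / √2)). The function names are illustrative, and this is one possible way to obtain the areas, not the method the slides prescribe (they use a table or Excel).

```python
from math import erf, sqrt


def z_score(x, mu, sigma):
    """Standardize x: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def prob_between(low, high, mu, sigma):
    """P(low < X < high) for a normal X with the given mean and sd."""
    z_low = z_score(low, mu, sigma)
    z_high = z_score(high, mu, sigma)
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)


battery = prob_between(5, 6, mu=4, sigma=2)     # z from 0.5 to 1.0
mileage = prob_between(20, 25, mu=25, sigma=6)  # z from about -0.83 to 0

print(round(battery, 4), round(mileage, 4))
```

For the battery, the z-values are 0.5 and 1.0, giving a probability of about 0.15; for the mileage, the interval ends at the mean, so symmetry gives 0.5 minus the left-tail area, about 0.30.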
Discrete Random Variables: Binomial Example
• The probability that a patient fails to
recover from a particular operation in
Fairfax Hospital is 0.1:
– What is the probability that exactly two of the
next eight patients having this operation will
not recover?
– What is the probability that at most one
patient of the next eight patients having this
operation will not recover?
Discrete Random Variables: Binomial Example 2
• Thirty percent of the defective brake calipers
manufactured by Dana can be fixed by rework:
– What is the probability that in a batch of six defective
calipers at least three can be fixed by rework?
– What is the probability that none of them can be fixed
by rework?
– What is the probability that all of them can be fixed by
rework?
Discrete Random Variables: Binomial Example 3
• Western Digital expects only 2% of its hard disks
to malfunction during the warranty period. In a
sample of ten disk drives:
– What is the probability that none will malfunction
during the warranty period?
– What is the probability that exactly one will
malfunction during the warranty period?
– What is the probability that at least two will
malfunction during the warranty period?
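The three binomial practice problems above can be checked with two small helpers. This is a sketch under the assumption that each problem is a binomial experiment with the stated n and p; `binomial_pmf` and `binomial_cdf` are hypothetical helper names.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """Probability of at most x successes in n trials."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


# Patients: n = 8, p = 0.1 (a "success" is a patient who does not recover).
patients_exactly_two = binomial_pmf(2, 8, 0.1)
patients_at_most_one = binomial_cdf(1, 8, 0.1)

# Calipers: n = 6, p = 0.3 (a "success" is a caliper fixable by rework).
calipers_at_least_three = 1 - binomial_cdf(2, 6, 0.3)
calipers_none = binomial_pmf(0, 6, 0.3)
calipers_all = binomial_pmf(6, 6, 0.3)

# Hard disks: n = 10, p = 0.02 (a "success" is a malfunction).
disks_none = binomial_pmf(0, 10, 0.02)
disks_exactly_one = binomial_pmf(1, 10, 0.02)
disks_at_least_two = 1 - binomial_cdf(1, 10, 0.02)

for name, value in [
    ("patients, exactly two", patients_exactly_two),
    ("patients, at most one", patients_at_most_one),
    ("calipers, at least three", calipers_at_least_three),
    ("calipers, none", calipers_none),
    ("calipers, all", calipers_all),
    ("disks, none", disks_none),
    ("disks, exactly one", disks_exactly_one),
    ("disks, at least two", disks_at_least_two),
]:
    print(f"{name}: {value:.4f}")
```

"At least" questions are computed through the complement, which needs fewer terms than summing the upper tail directly.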
Continuous Random Variables: Normal Example
• Let x be a random variable depicting human
intelligence as measured by IQ tests. If x has a
normal distribution with a mean of 100 and a
standard deviation of 10, determine:
– The probability of an IQ greater than 100;
– The probability of an IQ less than 85;
– The probability of an IQ of at least 110;
– The probability of an IQ between 85 and 125;
– The probability of an IQ between 110 and 200.
Continuous Random Variables: Normal Example 2
• Suppose the outer diameter of a ball bearing
produced by a stable manufacturing process
follows a normal distribution with a mean of 3.5
cm and a standard deviation of 0.02 cm. If the
diameter of this type of ball bearing must be no
smaller than 3.47 cm and no larger than 3.53 cm
to be usable, what percentage of bearings must
be scrapped?
Continuous Random Variables: Normal Example 3
• A time and motion study was conducted at the
Volvo-GM manufacturing plant in Dublin (VA) to
determine the time it takes a worker to assemble
the rear drive unit for a large truck. The data
was found to be normally distributed with a
mean of 75 seconds and a standard deviation of
6 seconds. In order for the assembly process to
flow smoothly, this unit has to be assembled in
84 seconds or less. Approximately what
proportion of the time will the assembly process
flow smoothly?
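The three normal practice problems above can be checked with Python's standard `statistics.NormalDist`, whose `cdf` method returns the area to the left of a value. This is a sketch that swaps the table lookup for a library call; the variable names are illustrative.

```python
from statistics import NormalDist

# IQ scores: mean 100, standard deviation 10.
iq = NormalDist(mu=100, sigma=10)
iq_greater_100 = 1 - iq.cdf(100)
iq_less_85 = iq.cdf(85)
iq_at_least_110 = 1 - iq.cdf(110)
iq_85_to_125 = iq.cdf(125) - iq.cdf(85)
iq_110_to_200 = iq.cdf(200) - iq.cdf(110)

# Ball bearings: usable when the diameter is within 3.47 cm to 3.53 cm.
bearing = NormalDist(mu=3.5, sigma=0.02)
usable = bearing.cdf(3.53) - bearing.cdf(3.47)
scrapped_percent = 100 * (1 - usable)

# Assembly time: flows smoothly when the unit takes 84 seconds or less.
assembly = NormalDist(mu=75, sigma=6)
smooth_proportion = assembly.cdf(84)

print(round(iq_greater_100, 4), round(iq_less_85, 4), round(iq_at_least_110, 4))
print(round(iq_85_to_125, 4), round(iq_110_to_200, 4))
print(round(scrapped_percent, 2), round(smooth_proportion, 4))
```

The bearing limits and the assembly limit all sit 1.5 standard deviations from the mean, so both problems reduce to the same table value for z = 1.5.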