Chapter 16 Random Variables
math2200
Life insurance
• A life insurance policy:
– Pays $10,000 when the client dies
– Pays $5,000 if the client is permanently
disabled
– Charges $50 per year
Random variable
• We call the variable X a random variable if
the numeric value of X is based on the
outcome of a random event, e.g. the
amount the company pays out on one
policy.
– Random variable is often denoted by a capital
letter, e.g. X, Y and Z. A particular value of
the variable is often denoted by the
corresponding lower case letter, e.g. x, y and
z
Random variable
• Discrete
– The possible values can be listed (finitely
many or countably many), e.g. the amount the
insurance pays out is either $10,000, $5,000 or $0
• Continuous
– The variable can take any numeric value within
a range of values.
• Example: the time it takes you to get from home to school
Probability model
• The collection of all possible values and
the probabilities that they occur is called
the probability model for the random
variable.
Example
• Death rate: 1 out of every 1000 people per year
• Disability rate: 2 out of every 1000 people per year
• Probability model
Policyholder outcome    Payment (x)    Probability Pr(X = x)
Death                   10,000         1/1000
Disability              5,000          2/1000
Neither                 0              997/1000
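One way to make the model concrete is to store it as a mapping from payment to probability. The Python sketch below is only an illustration; the name `model` is an assumption of this example, not part of the slides. It checks the two conditions any probability model should satisfy.

```python
from fractions import Fraction

# Hypothetical name: payment in dollars -> probability, copied from the table.
model = {
    10_000: Fraction(1, 1000),   # death
    5_000: Fraction(2, 1000),    # disability
    0: Fraction(997, 1000),      # neither
}

# Every probability is non-negative, and together they sum to 1.
all_non_negative = all(p >= 0 for p in model.values())
total = sum(model.values())
print(all_non_negative, total)  # True 1
```

Exact fractions are used instead of floats so that the sum comes out as exactly 1.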
What does the insurance company
expect?
• Suppose 1000 people are insured. In a typical year,
– 1 dies
– 2 become disabled
– the company pays $10,000 + $5,000*2 = $20,000
– payment per customer: $20,000/1000 = $20
– charge per customer: $50
– profit: $30 per customer!
Expected value
• $20 is the expected payment per customer
• E(X) = 20
=10000 * 1/1000 + 5000 * 2/1000 + 0*997/1000
• If X is a discrete random variable,
\[ \mu = E(X) = \sum x \cdot P(x) \]
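The weighted sum for the insurance example can be checked with a few lines of Python. This is a minimal sketch; the names `expected_value`, `payout` and `mean_payout` are assumptions of this example rather than anything defined on the slides.

```python
from fractions import Fraction

def expected_value(model):
    """Return the sum of x * P(x) over a dict {value: probability}."""
    return sum(x * p for x, p in model.items())

# Payment in dollars -> probability, from the insurance example.
payout = {
    10_000: Fraction(1, 1000),
    5_000: Fraction(2, 1000),
    0: Fraction(997, 1000),
}

mean_payout = expected_value(payout)
print(mean_payout)        # 20
print(50 - mean_payout)   # 30, the expected profit per customer at a $50 premium
```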
How about spread?
• Most of the time, the company makes $50
per customer
• But, with small probabilities, the company
needs to pay a lot ($10000 or $5000)
• The variation is big
• How to measure the variation?
Spread
• The variance of a random variable is:
\[ \sigma^2 = Var(X) = \sum (x - \mu)^2 \, P(x), \quad \text{where } \mu = E(X) \]
• The standard deviation for a random
variable is:
\[ \sigma = SD(X) = \sqrt{Var(X)} \]
Variance and standard deviation
Policyholder outcome    Payment (x)    Probability Pr(X = x)    Deviation x - E(X)
Death                   10,000         1/1000                   10000 - 20 = 9980
Disability              5,000          2/1000                   5000 - 20 = 4980
Neither                 0              997/1000                 0 - 20 = -20
Var(X) = Σ [x - E(X)]^2 * P(X=x)
Variance = 9980^2 (1/1000) + 4980^2 (2/1000) + (-20)^2 (997/1000) = 149,600
Standard deviation = square root of variance
SD(X) = $386.78
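The variance and standard deviation above can be reproduced as follows. This is a sketch under the same assumptions as the table; the variable names are chosen for this example only.

```python
from fractions import Fraction
from math import sqrt

# Payment in dollars -> probability, from the insurance example.
payout = {
    10_000: Fraction(1, 1000),
    5_000: Fraction(2, 1000),
    0: Fraction(997, 1000),
}

mu = sum(x * p for x, p in payout.items())               # expected value
var = sum((x - mu) ** 2 * p for x, p in payout.items())  # squared deviations, weighted
sd = sqrt(var)                                           # back to dollars

print(mu, var, round(sd, 2))  # 20 149600 386.78
```

The variance is in squared dollars, which is why the standard deviation is the quantity usually reported.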
Properties of Expected value and
Standard deviation
• Shifting
– E(X+c) = E(X) + c
– Var(X+c) = Var(X)
Example: Consider everyone in a company
receiving a $5000 increase in salary.
• Rescaling
– E(aX) = aE(X)
– Var(aX) = a^2 Var(X)
Example: Consider everyone in a company
receiving a 10% increase in salary.
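The shifting and rescaling rules can be verified numerically. The salary figures below are hypothetical and exist only to have something to compute with; `mean` and `variance` are helper names assumed for this sketch.

```python
from fractions import Fraction

def mean(model):
    return sum(x * p for x, p in model.items())

def variance(model):
    mu = mean(model)
    return sum((x - mu) ** 2 * p for x, p in model.items())

# Hypothetical company: four salaries, each equally likely to be drawn.
quarter = Fraction(1, 4)
salary = {40_000: quarter, 52_000: quarter, 61_000: quarter, 75_000: quarter}

c = 5_000               # everyone gets a $5000 raise (shifting)
a = Fraction(11, 10)    # everyone gets a 10% raise (rescaling)
shifted = {x + c: p for x, p in salary.items()}
scaled = {a * x: p for x, p in salary.items()}

print(mean(shifted) == mean(salary) + c)             # True
print(variance(shifted) == variance(salary))         # True
print(mean(scaled) == a * mean(salary))              # True
print(variance(scaled) == a ** 2 * variance(salary)) # True
```

A flat raise moves the center without changing the spread, while a percentage raise stretches both.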
Properties of expected value and
standard deviation
• Additivity
– E(X ± Y) = E(X) ± E(Y)
– If X and Y are independent
• Var(X ± Y) = Var(X) + Var(Y)
• Suppose the payments for two customers are
independent. Then the variance of the total payment
to these two customers is
Var(X+Y) = Var(X) + Var(Y) = 149600 + 149600 =
299200
• If one customer is insured for twice as much, the
variance is
– Var(2X) = 4Var(X) = 4*149600 = 598400
– SD(2X) = 2SD(X)
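The difference between insuring two independent customers (X + Y) and doubling one policy (2X) can be checked by building both probability models explicitly. This is a sketch; the helper `variance` and the dictionary names are assumptions of this example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

payout = {
    10_000: Fraction(1, 1000),
    5_000: Fraction(2, 1000),
    0: Fraction(997, 1000),
}

def variance(model):
    mu = sum(x * p for x, p in model.items())
    return sum((x - mu) ** 2 * p for x, p in model.items())

# Model for X + Y with independent customers: P(x and y) = P(x) * P(y).
two_customers = defaultdict(Fraction)
for (x, px), (y, py) in product(payout.items(), repeat=2):
    two_customers[x + y] += px * py

# Model for 2X: one customer, every payment doubled.
doubled = {2 * x: p for x, p in payout.items()}

print(variance(two_customers))  # 299200
print(variance(doubled))        # 598400
```

Both totals have the same expected value, but spreading the risk over two independent customers gives half the variance of doubling a single policy.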
Example: Combining Random Variables
• I sell a used Isuzu Trooper and purchase a
new Honda motor scooter
– The Isuzu sells for a mean of $6940 with a
standard deviation of $250
– A new scooter costs a mean of $1413
with a standard deviation of $11
• How much money do I expect to have
after the transaction? What is the standard
deviation?
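One way to work this out is to apply the rules for differences directly. The sketch below assumes the selling price and the purchase price are independent, which the slide does not state; the variable names are chosen for this example.

```python
from math import sqrt

mean_sale, sd_sale = 6940, 250          # selling the Isuzu
mean_purchase, sd_purchase = 1413, 11   # buying the scooter

# Expected values subtract: E(X - Y) = E(X) - E(Y).
expected_left = mean_sale - mean_purchase

# Variances add even for a difference (ASSUMES independence).
sd_left = sqrt(sd_sale ** 2 + sd_purchase ** 2)

print(expected_left, round(sd_left, 2))  # 5527 250.24
```

The standard deviation of the money left over is almost entirely due to the uncertainty in the selling price, since 250^2 dominates 11^2.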
Combining Random Variables
• Bad news: the probability model for the
sum of two random variables is often different from
the models we started with.
• Good news: the Normal model is special. The
probability model for the sum of
independent Normal random variables is
still Normal.
Example: Combining normal
random variables
• Packaging stereos
– Packing the system
• Normal with mean 9 min and sd 1.5 min
– Boxing the system
• Normal with mean 6 min and sd 1 min
• What is the probability that packing two
consecutive systems takes over 20 minutes?
• What percentage of the stereo systems
take longer to pack than to box?
• X1: mean=9, sd = 1.5
• X2: mean=9, sd = 1.5
• T=X1+X2: total time to pack two systems
– E(T) = E(X1)+E(X2) = 9+9=18
– Var(T) = Var(X1)+Var(X2) = 1.5^2 + 1.5^2 = 4.5
(assuming independence)
– T is Normal with mean 18 and sd √4.5 ≈ 2.12
– P(T>20) = normalcdf(20, 1E99, 18, 2.12)
≈ 0.173
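The same calculation can be sketched in Python using `statistics.NormalDist` from the standard library (Python 3.8 or later) in place of the calculator's normalcdf. Adding two `NormalDist` objects treats them as independent, matching the assumption above. The variable names are chosen for this example.

```python
from statistics import NormalDist

X1 = NormalDist(mu=9, sigma=1.5)   # packing time, first system
X2 = NormalDist(mu=9, sigma=1.5)   # packing time, second system

# Sum of independent Normals: means add, variances add.
T = X1 + X2

p_over_20 = 1 - T.cdf(20)
print(T.mean, round(T.stdev, 2), round(p_over_20, 3))  # 18.0 2.12 0.173
```

A value read from a z-table after rounding z to two decimals can differ from this in the third decimal place.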
• What percentage of the stereo systems
take longer to pack than to box ?
– P: time for packing
– B: time for boxing
– D=P-B: difference in times to pack and box a
system
– The question is P(D>0) = ?
– Assuming P and B are independent
• E(D) = E(P-B) = E(P)-E(B) = 9-6=3
• Var(D) = Var(P-B) = Var(P)+Var(B) = 1.5^2 + 1^2 =
3.25
• SD(D) = 1.80
• D is Normal with mean 3 and sd 1.80
• P(D>0) = normalcdf(0, 1E99, 3, 1.80) ≈ 0.952
• About 95% of all the stereo systems will require
more time for packing than for boxing
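The difference D = P - B can be handled the same way in Python. This is a sketch using `statistics.NormalDist`, whose subtraction operator assumes the two variables are independent; the names `pack`, `box` and `D` are chosen for this example.

```python
from statistics import NormalDist

pack = NormalDist(mu=9, sigma=1.5)   # packing time in minutes
box = NormalDist(mu=6, sigma=1)      # boxing time in minutes

# Difference of independent Normals: means subtract, variances still add.
D = pack - box

p_pack_longer = 1 - D.cdf(0)
print(D.mean, round(D.stdev, 2), round(p_pack_longer, 3))  # 3.0 1.8 0.952
```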
Correlation and Covariance
(OPTIONAL)
• If E(X)=µ and E(Y)=ν, then the covariance
of the random variables X and Y is defined
as
Cov(X,Y)=E((X-µ)(Y-ν))
• The covariance measures how X and Y
vary together.
Properties of covariance
• Cov(X,Y) = Cov(Y,X)
• Cov(X,X) = Var(X)
• Cov(cX,dY) = c*d*Cov(X,Y)
• Cov(X,Y) = E(XY) - µν
• If X and Y are independent, Cov(X,Y) = 0
– The converse is NOT true
• Var(X ± Y) = Var(X) + Var(Y) ± 2Cov(X,Y)
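A small example shows why zero covariance does not imply independence. In the sketch below, X takes the values -1, 0, 1 with equal probability and Y = X^2, so Y is completely determined by X. This joint model and the helper names `E` and `var` are assumptions made for illustration.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Joint model {(x, y): probability} with Y = X^2.
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def E(f):
    """Expected value of f(X, Y) under the joint model."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

def var(f):
    m = E(f)
    return E(lambda x, y: (f(x, y) - m) ** 2)

mu, nu = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: (x - mu) * (y - nu))
print(cov)                                      # 0
print(cov == E(lambda x, y: x * y) - mu * nu)   # True: Cov(X,Y) = E(XY) - µν

# Not independent: P(X=0 and Y=0) = 1/3, but P(X=0) * P(Y=0) = 1/9.
p_x0, p_y0 = third, third
print(joint[(0, 0)] == p_x0 * p_y0)             # False

# The general variance rule still holds.
var_x, var_y = var(lambda x, y: x), var(lambda x, y: y)
var_sum = var(lambda x, y: x + y)
print(var_sum == var_x + var_y + 2 * cov)       # True
```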
Correlation and Covariance
(cont.)
• Covariance, unlike correlation, doesn’t
have to be between -1 and 1.
• To fix the “problem” we can divide the
covariance by each of the standard
deviations to get the correlation coefficient:
\[ Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \, \sigma_Y}, \quad \text{where } \sigma_X = SD(X),\ \sigma_Y = SD(Y) \]
What Can Go Wrong?
• Don’t assume everything’s Normal.
– You must Think about whether the Normality
Assumption is justified.
• Watch out for variables that aren’t
independent:
– You can add expected values for any two
random variables, but
– you can only add variances of independent
random variables.
What Can Go Wrong? (cont.)
• Don’t forget: Variances of independent
random variables add. Standard
deviations don’t.
• Don’t forget: Variances of independent
random variables add, even when you’re
looking at the difference between them.
• Don’t write independent instances of a
random variable with notation that looks
like they are the same variables.