Distribution of a function of a random variable


Theorems about mean, variance
• Properties of mean, variance for one random variable X, where
a and b are constant:
• E[aX+b] = aE[X] + b
• Var(aX+b) = a²Var(X)
• Var(X) = E[X²] – (E[X])²
• Theorem. Let X and Y be independent random variables and
let g and h be real valued functions of a single real variable.
Then E[g(X)h(Y)] = E[g(X)]E[h(Y)].
• Theorem. For random variables X1, X2, ... , Xn, defined on the
same sample space, and for constants a1, a2, ... , an, we have
E[Σ_{i=1}^n ai Xi] = Σ_{i=1}^n ai E[Xi].
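As a quick numerical aside (not part of the original slides), the sketch below checks E[aX+b] = aE[X] + b, Var(aX+b) = a²Var(X), the product rule for independent X and Y, and linearity of expectation by simulation; the distributions, the constants a, b, a1, a2, and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000                                # number of simulated draws (arbitrary)

# Arbitrary example distributions: X ~ exponential(1), Y ~ uniform(0, 1), independent
X = rng.exponential(scale=1.0, size=n)
Y = rng.uniform(0.0, 1.0, size=n)

a, b = 3.0, 2.0
print((a * X + b).mean(), a * X.mean() + b)      # E[aX+b]  vs  aE[X] + b
print((a * X + b).var(),  a**2 * X.var())        # Var(aX+b) vs a^2 Var(X)

# Independence: E[g(X)h(Y)] = E[g(X)] E[h(Y)], here with g = sin and h(y) = y^2
print((np.sin(X) * Y**2).mean(), np.sin(X).mean() * (Y**2).mean())

# Linearity of expectation: E[a1 X + a2 Y] = a1 E[X] + a2 E[Y]
a1, a2 = 2.0, -5.0
print((a1 * X + a2 * Y).mean(), a1 * X.mean() + a2 * Y.mean())
```

Each printed pair should agree to within simulation error.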
Mean and median may differ
• Consider an exponential r. v. with λ = 1. The density is
f(x) = e^(–x) for x ≥ 0.
[Figure: graph of the density, marking the median m = ln 2 ≈ 0.693 and the mean µ = 1.]
• Note that the mean µ and the median m are different. The
density has a lot of weight “in the tail” which causes the mean
to be larger. We say that this density is “skewed to the right”.
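As a small illustration (not in the original slides), the simulation below estimates the mean and the median of an exponential(1) distribution; the seed and the sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)   # λ = 1, so scale = 1/λ = 1

print("sample mean   ≈", x.mean())       # theory: µ = 1
print("sample median ≈", np.median(x))   # theory: m = ln 2 ≈ 0.693
```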
Statistical Estimation
• Suppose we are given a random variable X with some
unknown probability distribution. We want to estimate the
basic parameters of this distribution, like the expectation of X
and the variance of X.
• The usual way to do this is to observe n independent variables
all with the same distribution as X. To estimate the unknown
mean µ of X, we use the sample mean described on the next
slide. The values of the observations yield a value for the
sample mean, which is used as an estimate for µ. In a similar
way, the sample variance (discussed later) is used to estimate
the variance of X.
The sample mean
• Let X1,X2,…,Xn be independent and identically distributed
random variables having c. d. f. F and expected value μ. Such
a sequence of random variables is said to constitute a sample
from the distribution F. The sample mean is denoted by X̄
and is defined by
X̄ = (Σ_{i=1}^n Xi) / n.
• By using the theorem on the previous slide, we have E[X̄] = μ.
• Thus, the expected value of the sample mean is μ, the mean of
the distribution. For this reason, X̄ is said to be an unbiased
estimator of μ.
• The random variable X̄ is an example of a statistic. That is, it
is a function of the observations which does not depend on the
unknown parameter μ.
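As an informal check of unbiasedness (added here, not part of the slides), the sketch below draws many independent samples of size n, computes X̄ for each, and averages the results; the exponential distribution with µ = 2, the sample size, and the number of repetitions are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 10, 100_000        # sample size and number of repetitions (arbitrary)
mu = 2.0                     # true mean of the chosen exponential distribution

samples = rng.exponential(scale=mu, size=(reps, n))
xbar = samples.mean(axis=1)  # one sample mean per repetition

# E[X-bar] = µ, so the average of many sample means should be close to µ
print("average of sample means ≈", xbar.mean(), "  (true µ =", mu, ")")
```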
Expectation of Bernoulli and binomial random variables
• Recall that a Bernoulli random variable Xi is defined by
Xi = 1, if trial i is a success (with probability p)
Xi = 0, if trial i is a failure (with probability 1 – p)
• Since Xi is a discrete random variable, we have
E[Xi] = 1·p + 0·(1 – p) = p.
• Let X be a binomial random variable with parameters (n, p).
Then X = X1+ X2+…+ Xn where each Xi is Bernoulli. By the
theorem from the previous slide,
E[X] E[X1 ]  E[X2 ]  ...  E[Xn ]  np,
which agrees with the direct computation we did earlier.
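A short simulation (an aside, not from the slides) that builds a binomial X as a sum of n Bernoulli trials and compares its simulated mean with np; the parameters n = 20, p = 0.3 and the number of repetitions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, reps = 20, 0.3, 200_000               # arbitrary illustrative parameters

# Each row is one experiment: n Bernoulli(p) trials, summed to give one binomial X
trials = rng.random((reps, n)) < p
X = trials.sum(axis=1)

print("simulated E[X] ≈", X.mean(), "  vs  np =", n * p)
```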
Covariance, variance of sums, and correlation
Definition. The covariance between r.v.’s X and Y, denoted by
Cov(X,Y), is defined by
Cov(X,Y) = E[(X – E[X])(Y – E[Y])].
• Theorem. Cov(X,Y) = E[XY] – E[X]E[Y].
• Corollary. If X and Y are independent, then Cov(X, Y) = 0.
• Example. Two dependent r. v.'s X and Y might have
Cov(X, Y) = 0. Let X be uniform over (–1, 1) and let Y = X².
Then Cov(X, Y) = E[X³] – E[X]E[X²] = 0 by symmetry, even though
Y is completely determined by X (see the sketch below).
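A minimal sketch of the example above (not part of the slides), estimating Cov(X, Y) = E[XY] – E[X]E[Y] by simulation; the seed and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.uniform(-1.0, 1.0, size=1_000_000)
Y = X**2                          # Y is completely determined by X, hence dependent

# Cov(X, Y) = E[XY] - E[X]E[Y]; by symmetry this is 0 for this example
cov = (X * Y).mean() - X.mean() * Y.mean()
print("estimated Cov(X, Y) ≈", cov)   # close to 0 despite the dependence
```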
Properties of covariance
• Let X and Y be random variables. Then
(i) Cov(X,Y) = Cov(Y,X)
(ii) Cov(X,X) = Var(X)
(iii) Cov(aX,Y) = aCov(X,Y)
(iv) Cov(Σ_{i=1}^n Xi, Σ_{j=1}^n Yj) = Σ_{i=1}^n Σ_{j=1}^n Cov(Xi, Yj)
• If we take Yj = Xj, then (iv) implies that
Var(Σ_{i=1}^n Xi) = Σ_{i=1}^n Var(Xi) + 2 Σ_{i<j} Cov(Xi, Xj).
• If Xi and Xj are independent when i and j differ, then the latter
equation becomes
Var(Σ_{i=1}^n Xi) = Σ_{i=1}^n Var(Xi).
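As a numerical aside (not from the slides), the sketch below checks that the variance of a sum of independent random variables is the sum of the variances; the three distributions are arbitrary choices with known variances.

```python
import numpy as np

rng = np.random.default_rng(5)
reps = 500_000                         # number of simulated sums (arbitrary)

# Three independent random variables with known variances (arbitrary choices)
X1 = rng.exponential(2.0, size=reps)   # Var = scale^2 = 4
X2 = rng.uniform(0.0, 1.0, size=reps)  # Var = 1/12
X3 = rng.normal(0.0, 3.0, size=reps)   # Var = 9

S = X1 + X2 + X3
print("variance of the sum  ≈", S.var())
print("sum of the variances =", 4 + 1/12 + 9)
```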
Sample variance
• Let X1,X2,…,Xn be independent and identically distributed
random variables having c. d. f. F, expected value μ, and
variance σ². Let X̄ be the sample mean. The random variable
S² = [Σ_{i=1}^n (Xi – X̄)²] / (n – 1)
is called the sample variance.
• Using the results from previous slides, we have
Var(X̄) = σ²/n, and E[S²] = σ².
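The sketch below (an aside, not from the slides) checks Var(X̄) = σ²/n and E[S²] = σ² by repeated sampling; the normal distribution with σ² = 4, the sample size n = 8, and the number of repetitions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 8, 200_000          # small sample size, many repetitions (arbitrary)
sigma2 = 4.0                  # true variance of the chosen normal distribution

samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(reps, n))
xbar = samples.mean(axis=1)
s2 = samples.var(axis=1, ddof=1)   # divisor n - 1, matching the definition of S^2

print("Var(X-bar) ≈", xbar.var(), "  vs  σ²/n =", sigma2 / n)
print("E[S²]      ≈", s2.mean(),  "  vs  σ²   =", sigma2)
```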
Variance of a binomial random variable
• Recall that a Bernoulli random variable Xi is defined by
Xi = 1, if trial i is a success (with probability p)
Xi = 0, if trial i is a failure (with probability 1 – p)
Also, Var(Xi) = p – p² as an easy computation shows (taking
advantage of the fact that Xi² = Xi).
• Let X be a binomial random variable with parameters (n, p).
Then X = X1 + X2 + … + Xn where each Xi is Bernoulli. By the
result from a previous slide, Var(X) = Σ_{i=1}^n Var(Xi).
• Upon combining the above results, we have
Var(X) = np(1 – p),
which agrees with our earlier result.
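A one-line check (not from the slides) comparing the simulated variance of binomial draws with np(1 – p); the parameters match the earlier illustrative choice n = 20, p = 0.3.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, reps = 20, 0.3, 200_000

X = rng.binomial(n, p, size=reps)   # binomial draws simulated directly
print("simulated Var(X) ≈", X.var(), "  vs  np(1-p) =", n * p * (1 - p))
```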
Possible relations between two random variables, X and Y
• For random variables X and Y, Cov(X,Y) might be positive,
negative, or zero.
• If Cov(X, Y) > 0, then X and Y tend to increase together or decrease
together. In this case, we say X and Y are positively correlated.
• If Cov(X, Y) < 0, then X tends to increase while Y decreases, or vice
versa. In this case, we say X and Y are negatively correlated.
• If Cov(X, Y) = 0, we say that X and Y are uncorrelated. Recall
that uncorrelated random variables may be dependent, however.