#### Transcript Stats ch06.s03

Chapter 6: Continuous Random Variables and Probability Distributions

**Continuous Random Variables.** A random variable is continuous if it can take any value in an interval.

**Cumulative Distribution Function.** The cumulative distribution function, F(x), of a continuous random variable X expresses the probability that X does not exceed the value x, as a function of x:

$$F(x) = P(X \le x)$$

[Figure: the cumulative distribution function of a random variable uniform over 0 to 1 rises linearly from 0 to 1.]

Let X be a continuous random variable with cumulative distribution function F(x), and let a and b be two possible values of X, with a < b. The probability that X lies between a and b is

$$P(a < X < b) = F(b) - F(a)$$

**Probability Density Function.** Let X be a continuous random variable, and let x be any number lying in the range of values this random variable can take. The probability density function, f(x), of the random variable is a function with the following properties:

1. f(x) > 0 for all values of x.
2. The area under the probability density function f(x) over all values of the random variable X is equal to 1.0.
3. Suppose this density function is graphed, and let a and b be two possible values of the random variable X, with a < b. Then the probability that X lies between a and b is the area under the density function between these points.
4. The cumulative distribution function $F(x_0)$ is the area under the probability density function f(x) up to $x_0$:

$$F(x_0) = \int_{x_m}^{x_0} f(x)\,dx$$

where $x_m$ is the minimum value of the random variable X.

[Figure: the shaded area under the density between a and b is the probability that X lies between a and b.]

[Figure: the probability density function of a uniform 0-to-1 random variable is f(x) = 1 on the interval from 0 to 1.]

**Areas Under Continuous Probability Density Functions.** Let X be a continuous random variable with probability density function f(x) and cumulative distribution function F(x). Then the following properties hold:

1. The total area under the curve f(x) is 1.
2. The area under the curve f(x) to the left of $x_0$ is $F(x_0)$, where $x_0$ is any value that the random variable can take.

[Figure: the total area under the uniform probability density function is 1.]

[Figure: the area under the uniform probability density function to the left of $x_0$ is $F(x_0)$, which equals $x_0$ for this uniform distribution because f(x) = 1.]

**Rationale for Expectations of Continuous Random Variables.** Suppose that a random experiment leads to an outcome that can be represented by a continuous random variable. If N independent replications of this experiment are carried out, then the expected value of the random variable is the average of the values taken, as the number of replications becomes infinitely large. The expected value of a random variable is denoted by E(X).

Similarly, if g(X) is any function of the random variable X, then the expected value of this function is the average value taken by the function over repeated independent trials, as the number of trials becomes infinitely large. This expectation is denoted E[g(X)]. Using calculus, we can define expected values for continuous random variables in the same spirit as for discrete random variables:

$$E[g(X)] = \int_x g(x) f(x)\,dx$$

**Mean, Variance, and Standard Deviation.** Let X be a continuous random variable. There are two important expected values that are used routinely to define continuous probability distributions.

i. The mean of X, denoted by $\mu_X$, is defined as the expected value of X:

$$\mu_X = E(X)$$

ii. The variance of X, denoted by $\sigma_X^2$, is defined as the expectation of the squared deviation, $(X - \mu_X)^2$, of the random variable from its mean:

$$\sigma_X^2 = E[(X - \mu_X)^2]$$

An alternative but equivalent expression can be derived:

$$\sigma_X^2 = E(X^2) - \mu_X^2$$

iii. The standard deviation of X, $\sigma_X$, is the square root of the variance.
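The integral definition of E[g(X)] can be checked numerically. A minimal sketch (not from the text; function names are illustrative), using the Uniform(0, 1) density f(x) = 1 as the example:

```python
# Minimal sketch: approximate E[g(X)] = ∫ g(x) f(x) dx with a midpoint
# Riemann sum, illustrated for X ~ Uniform(0, 1), where f(x) = 1 on [0, 1].

def expect(g, f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g(x) * f(x) over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) * f(lo + (i + 0.5) * h) for i in range(n))

uniform_pdf = lambda x: 1.0                           # density of Uniform(0, 1)

mu = expect(lambda x: x, uniform_pdf, 0.0, 1.0)       # mean: exact value is 1/2
ex2 = expect(lambda x: x * x, uniform_pdf, 0.0, 1.0)  # E(X^2): exact value is 1/3
var = ex2 - mu ** 2                                   # E(X^2) - mu^2: exact value is 1/12
```

Both results agree with the exact values for the uniform distribution, $\mu = 1/2$ and $\sigma^2 = 1/12$, and the last line exercises the alternative variance expression $\sigma_X^2 = E(X^2) - \mu_X^2$.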
**Linear Functions of Random Variables.** Let X be a continuous random variable with mean $\mu_X$ and variance $\sigma_X^2$, and let a and b be any constant fixed numbers. Define the random variable W as

$$W = a + bX$$

Then the mean and variance of W are

$$\mu_W = E(a + bX) = a + b\mu_X \qquad \text{and} \qquad \sigma_W^2 = \operatorname{Var}(a + bX) = b^2\sigma_X^2$$

and the standard deviation of W is

$$\sigma_W = |b|\,\sigma_X$$

An important special case of these results is the standardized random variable

$$Z = \frac{X - \mu_X}{\sigma_X}$$

which has mean 0 and variance 1.

**Reasons for Using the Normal Distribution.**

1. The normal distribution closely approximates the probability distributions of a wide range of random variables.
2. Distributions of sample means approach a normal distribution given a "large" sample size.
3. Computations of probabilities are direct and elegant.
4. The normal probability distribution has led to good business decisions for a number of applications.

[Figure: probability density function for a normal distribution, a bell-shaped curve.]

**Probability Density Function of the Normal Distribution.** The probability density function for a normally distributed random variable X is

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / (2\sigma^2)} \qquad \text{for } -\infty < x < \infty$$

where $\mu$ and $\sigma^2$ are any numbers such that $-\infty < \mu < \infty$ and $0 < \sigma^2 < \infty$, and where e = 2.71828... and $\pi$ = 3.14159... are physical constants.

**Properties of the Normal Distribution.** Suppose that the random variable X follows a normal distribution with parameters $\mu$ and $\sigma^2$. Then the following properties hold:

i. The mean of the random variable is $\mu$: $E(X) = \mu$.

ii. The variance of the random variable is $\sigma^2$: $E[(X - \mu_X)^2] = \sigma^2$.

iii. The shape of the probability density function is a symmetric bell-shaped curve centered on the mean, as shown in Figure 6.8.
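The standardization $Z = (X - \mu_X)/\sigma_X$ is just the linear-function rules applied with $a = -\mu_X/\sigma_X$ and $b = 1/\sigma_X$. A minimal sketch (illustrative names; the example numbers are assumed, not from the text):

```python
# Minimal sketch of the linear-function rules: W = a + b*X has mean a + b*mu_X,
# variance b^2 * var_X, and standard deviation |b| * sd_X.

def linear_mean_var(a, b, mu_x, var_x):
    """Return the mean, variance, and standard deviation of W = a + b*X."""
    return a + b * mu_x, b ** 2 * var_x, abs(b) * var_x ** 0.5

# Standardizing X is the special case a = -mu_X/sd_X, b = 1/sd_X, which by
# these same rules yields Z with mean 0 and variance 1:
mu_x, var_x = 100.0, 25.0      # assumed example: mean 100, standard deviation 5
sd_x = var_x ** 0.5
mean_z, var_z, sd_z = linear_mean_var(-mu_x / sd_x, 1.0 / sd_x, mu_x, var_x)
```

Running this gives mean 0 and variance 1 (up to floating-point rounding) regardless of the values chosen for $\mu_X$ and $\sigma_X^2$, which is the point of standardization.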
By knowing the mean and variance we can define the normal distribution by using the notation

$$X \sim N(\mu, \sigma^2)$$

[Figure: effect of $\mu$ on the normal density; curves with mean 5 and mean 6 have the same shape but different centers.]

[Figure: effect of $\sigma^2$ on the normal density; the curve with variance 0.0625 is taller and narrower than the curve with variance 1.]

**Cumulative Distribution Function of the Normal Distribution.** Suppose that X is a normal random variable with mean $\mu$ and variance $\sigma^2$; that is, $X \sim N(\mu, \sigma^2)$. Then the cumulative distribution function is

$$F(x_0) = P(X \le x_0)$$

This is the area under the normal probability density function to the left of $x_0$, as illustrated in Figure 6.10. As for any proper density function, the total area under the curve is 1; that is, $F(\infty) = 1$.

[Figure: the shaded area under f(x) to the left of $x_0$ is the probability that X does not exceed $x_0$.]

**Range Probabilities for Normal Random Variables.** Let X be a normal random variable with cumulative distribution function F(x), and let a and b be two possible values of X, with a < b. Then

$$P(a < X < b) = F(b) - F(a)$$

The probability is the area under the corresponding probability density function between a and b.

[Figure: the shaded area under f(x) between a and b.]

**The Standard Normal Distribution.** Let Z be a normal random variable with mean 0 and variance 1; that is,

$$Z \sim N(0, 1)$$

We say that Z follows the standard normal distribution. Denote the cumulative distribution function by F(z), and let a and b be two numbers with a < b; then

$$P(a < Z < b) = F(b) - F(a)$$

[Figure: standard normal distribution with the area to the left of z = 1.25 shaded; F(1.25) = 0.8944.]

**Finding Range Probabilities for Normally Distributed Random Variables.** Let X be a normally distributed random variable with mean $\mu$ and variance $\sigma^2$.
Then the random variable $Z = (X - \mu)/\sigma$ has a standard normal distribution: $Z \sim N(0, 1)$. It follows that if a and b are any numbers with a < b, then

$$P(a < X < b) = P\!\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = F\!\left(\frac{b - \mu}{\sigma}\right) - F\!\left(\frac{a - \mu}{\sigma}\right)$$

where Z is the standard normal random variable and F(z) denotes its cumulative distribution function.

**Computing Normal Probabilities.** A very large group of students obtains test scores that are normally distributed with mean 60 and standard deviation 15. What proportion of the students obtained scores between 85 and 95?

$$P(85 < X < 95) = P\!\left(\frac{85 - 60}{15} < Z < \frac{95 - 60}{15}\right) = P(1.67 < Z < 2.33) = F(2.33) - F(1.67) = 0.9901 - 0.9525 = 0.0376$$

That is, 3.76% of the students obtained scores in the range 85 to 95.

**Approximating Binomial Probabilities Using the Normal Distribution.** Let X be the number of successes from n independent Bernoulli trials, each with probability of success P. The number of successes, X, is a binomial random variable, and if $nP(1 - P) > 9$ a good approximation is

$$P(a \le X \le b) \approx P\!\left(\frac{a - nP}{\sqrt{nP(1 - P)}} \le Z \le \frac{b - nP}{\sqrt{nP(1 - P)}}\right)$$

Or, if $5 < nP(1 - P) < 9$, we can use the continuity correction factor to obtain

$$P(a \le X \le b) \approx P\!\left(\frac{a - 0.5 - nP}{\sqrt{nP(1 - P)}} \le Z \le \frac{b + 0.5 - nP}{\sqrt{nP(1 - P)}}\right)$$

where Z is a standard normal variable.

**The Exponential Distribution.** The exponential random variable T (t > 0) has probability density function

$$f(t) = \lambda e^{-\lambda t} \qquad \text{for } t > 0$$

where $\lambda$ is the mean number of occurrences per unit time, t is the number of time units until the next occurrence, and e = 2.71828... Then T is said to follow an exponential probability distribution. The cumulative distribution function is

$$F(t) = 1 - e^{-\lambda t} \qquad \text{for } t > 0$$

The distribution has mean $1/\lambda$ and variance $1/\lambda^2$.

[Figure: probability density function for an exponential distribution with $\lambda = 0.2$, decaying from 0.2 at t = 0.]

**Joint Cumulative Distribution Functions.** Let $X_1, X_2, \ldots, X_k$ be continuous random variables.

i. Their joint cumulative distribution function, $F(x_1, x_2, \ldots, x_k)$, defines the probability that simultaneously $X_1$ is less than $x_1$, $X_2$ is less than $x_2$, and so on; that is,

$$F(x_1, x_2, \ldots, x_k) = P(X_1 < x_1 \cap X_2 < x_2 \cap \cdots \cap X_k < x_k)$$
ii. The cumulative distribution functions $F(x_1), F(x_2), \ldots, F(x_k)$ of the individual random variables are called their marginal distribution functions. For any i, $F(x_i)$ is the probability that the random variable $X_i$ does not exceed the specific value $x_i$.

iii. The random variables are independent if and only if

$$F(x_1, x_2, \ldots, x_k) = F(x_1)F(x_2)\cdots F(x_k)$$

**Covariance.** Let X and Y be a pair of continuous random variables, with respective means $\mu_x$ and $\mu_y$. The expected value of $(X - \mu_x)(Y - \mu_y)$ is called the covariance between X and Y; that is,

$$\operatorname{Cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)]$$

An alternative but equivalent expression can be derived:

$$\operatorname{Cov}(X, Y) = E(XY) - \mu_x\mu_y$$

If the random variables X and Y are independent, then the covariance between them is 0. However, the converse is not true.

**Correlation.** Let X and Y be jointly distributed random variables. The correlation between X and Y is

$$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

**Sums of Random Variables.** Let $X_1, X_2, \ldots, X_k$ be k random variables with means $\mu_1, \mu_2, \ldots, \mu_k$ and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$. The following properties hold:

i. The mean of their sum is the sum of their means; that is,

$$E(X_1 + X_2 + \cdots + X_k) = \mu_1 + \mu_2 + \cdots + \mu_k$$

ii. If the covariance between every pair of these random variables is 0, then the variance of their sum is the sum of their variances; that is,

$$\operatorname{Var}(X_1 + X_2 + \cdots + X_k) = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_k^2$$

However, if the covariances between pairs of random variables are not 0, the variance of their sum is

$$\operatorname{Var}(X_1 + X_2 + \cdots + X_k) = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_k^2 + 2\sum_{i=1}^{k-1}\sum_{j=i+1}^{k} \operatorname{Cov}(X_i, X_j)$$

**Differences Between a Pair of Random Variables.** Let X and Y be a pair of random variables with means $\mu_X$ and $\mu_Y$ and variances $\sigma_X^2$ and $\sigma_Y^2$. The following properties hold:

i. The mean of their difference is the difference of their means; that is,

$$E(X - Y) = \mu_X - \mu_Y$$

ii. If the covariance between X and Y is 0, then the variance of their difference is

$$\operatorname{Var}(X - Y) = \sigma_X^2 + \sigma_Y^2$$
iii. If the covariance between X and Y is not 0, then the variance of their difference is

$$\operatorname{Var}(X - Y) = \sigma_X^2 + \sigma_Y^2 - 2\operatorname{Cov}(X, Y)$$

**Linear Combinations of Random Variables.** A linear combination of two random variables X and Y is

$$W = aX + bY$$

where a and b are constant numbers. The mean of W is

$$\mu_W = E[W] = E[aX + bY] = a\mu_X + b\mu_Y$$

The variance of W is

$$\sigma_W^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\operatorname{Cov}(X, Y)$$

or, using the correlation,

$$\sigma_W^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\operatorname{Corr}(X, Y)\sigma_X\sigma_Y$$

If both X and Y are jointly normally distributed random variables, then the resulting random variable W is also normally distributed, with the mean and variance derived above.
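Because a linear combination of jointly normal variables is itself normal, its probabilities follow from the mean and variance formulas above together with the standard normal CDF. A minimal sketch, using the standard identity $\Phi(z) = \tfrac{1}{2}(1 + \operatorname{erf}(z/\sqrt{2}))$; the function names and example numbers are assumed illustrations, not from the text:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, Phi(z) = P(Z <= z), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def combo_mean_var(a, b, mu_x, mu_y, var_x, var_y, cov_xy):
    """Mean and variance of W = a*X + b*Y, from the formulas above."""
    mean_w = a * mu_x + b * mu_y
    var_w = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy
    return mean_w, var_w

# Assumed example: X ~ N(10, 4), Y ~ N(20, 9), Cov(X, Y) = 2, and W = X + Y.
mean_w, var_w = combo_mean_var(1.0, 1.0, 10.0, 20.0, 4.0, 9.0, 2.0)
# Then W ~ N(30, 17), and standardizing gives P(W <= 30) = Phi(0) = 0.5.
p = phi((30.0 - mean_w) / sqrt(var_w))
```

As a cross-check, `phi(1.25)` reproduces the tabulated standard normal value 0.8944 quoted earlier for z = 1.25.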