Transcript May 25

Random Variable Overview
• What are random variables?
• Intro to probability distributions
  • Discrete
  • Continuous
• Linear transformations of RVs
• Combinations of RVs
What are random variables?
Let X represent a quantitative variable that is measured or
observed in an experiment.
The value that X takes on in a given experiment is a
random outcome.
•Counting the number of defective lightbulbs in a case of
bulbs
•Measuring daily rainfall in inches
•Measuring the average depression score of computer
science majors
Sample means, standard deviations, proportions,
and frequencies are all random variables.
Two types of random variables
Discrete: the observations can take only a finite or countable
number of values.
• The number of heads in four coin tosses
• The number of anorexics in a random sample of 500 people
Continuous: the observations can take on any of the infinitely
many values in an interval.
• The average response time of a random sample of 200 depressed
patients
• The average IQ of a random sample of 22 statistics students
In general, averages are continuous and counts are
discrete.
The average anger response
The number of juvenile delinquents
What is a probability distribution?
Discrete: the probabilities associated with each specific value
of the RV.
Continuous: the probabilities associated with a range of values
of the RV.
The Sample Space
Suppose that we toss three coins.
Let X = the number of heads
appearing.
X is a random variable taking on
one of the values 0,1,2,3
Two balls are randomly chosen
from an urn of blue and red balls.
We win $1 for every blue and lose
$1 for every red. Let X = our total
winnings.
X is a random variable taking on
one of the values -2, 0, 2
The Sample Space
Suppose that we toss two dice.
Let X = the sum of the two tosses.
X is a random variable taking on
one of the values _________
Suppose that we toss two dice.
Let X = the difference of the two
tosses.
X is a random variable taking on
one of the values ___________
Probability Distributions
The probability distribution of X lists the values in the sample
space and their associated probabilities.
Suppose that we toss a fair die.
Let X = the outcome of the toss.
X is a random variable taking on
one of the values 1, 2, 3, 4, 5, 6
xi    1     2     3     4     5     6
pi    1/6   1/6   1/6   1/6   1/6   1/6
[Figure: bar chart of the distribution above; vertical axis: probability (0 to 1), horizontal axis: outcome (1 to 6)]
Probability Distributions
Suppose that we toss two coins.
Let X = the number of heads.
Make the probability distribution
xi    0     1     2
pi    1/4   2/4   1/4
[Figure: bar chart; vertical axis: probability (0 to 1), horizontal axis: outcome (0 to 2)]
Probability Distributions
Sometimes you can estimate discrete probability distributions
using a really large sample
Cryptography: frequencies of
letters in a 1000-letter sample
xi    A         B         C         D         E          F
pi    73/1000   9/1000    30/1000   44/1000   130/1000   28/1000
[Figure: bar chart; vertical axis: probability (0 to 1), horizontal axis: outcome (A to F)]
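Estimating a discrete distribution from a sample amounts to computing relative frequencies. The following is a minimal Python sketch of that idea; the function name `estimate_distribution` and the tiny sample string are assumptions made for illustration and are not data from the lecture.

```python
from collections import Counter


def estimate_distribution(text):
    """Estimate letter probabilities as relative frequencies in `text`."""
    # Keep letters only and ignore case.
    letters = [ch.upper() for ch in text if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    # Relative frequency = count / sample size.
    return {letter: count / total for letter, count in sorted(counts.items())}


# Hypothetical tiny sample; a real estimate would use far more letters.
sample = "A bad cafe faded"
dist = estimate_distribution(sample)
print(dist)
```

With only 13 letters the estimates are rough; a sample of 1000 letters, as on the slide, gives more stable frequencies.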
Expected Values
The mean of a discrete probability distribution (called the
“expected value”) can be found using this formula
$$\mu_X = \sum_i x_i p_i$$
It is a weighted average of the possible values of X, each value
being weighted by its probability of occurrence.
Expected Values
Suppose that we toss a fair die.
Let X = the outcome of the toss.
X is a random variable taking on
one of the values 1, 2, 3, 4, 5, 6
xi    1     2     3     4     5     6
pi    1/6   1/6   1/6   1/6   1/6   1/6
What is the expected value?
$$\mu_X = (1 \cdot \tfrac{1}{6}) + (2 \cdot \tfrac{1}{6}) + (3 \cdot \tfrac{1}{6}) + (4 \cdot \tfrac{1}{6}) + (5 \cdot \tfrac{1}{6}) + (6 \cdot \tfrac{1}{6}) = 3.5$$
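The weighted-average calculation for the fair die can be checked with a short Python sketch. The helper name `expected_value` is an assumption for illustration; exact fractions are used so that no rounding enters the result.

```python
from fractions import Fraction


def expected_value(values, probs):
    """Return the sum of x_i * p_i for a discrete distribution."""
    return sum(x * p for x, p in zip(values, probs))


die_values = [1, 2, 3, 4, 5, 6]
die_probs = [Fraction(1, 6)] * 6
mu = expected_value(die_values, die_probs)
print(mu)         # 7/2
print(float(mu))  # 3.5
```

The same helper works for any table of values and probabilities, as long as the probabilities sum to 1.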
Expected Values
Suppose we draw one marble out
of a bowl containing 3 green and 7
black marbles. We win $10 if we
draw a green marble but we lose
$2 if we draw a black marble. Let
X = our winnings. What is the
expected value of X? Should you
play this game?
xi    10     -2
pi    3/10   7/10
$$\mu_X = (10 \cdot \tfrac{3}{10}) + (-2 \cdot \tfrac{7}{10}) = 1.6$$
Variance
The variance of a discrete probability distribution can be found
using this formula
$$\sigma_X^2 = \sum_i (x_i - \mu_X)^2 p_i$$
It is a weighted average of the squared deviations in X
Suppose that we toss a fair die.
Let X = the outcome of the toss.
X is a random variable taking on
one of the values 1, 2, 3, 4, 5, 6
• μx = 3.5
• σx² = ?
xi    1     2     3     4     5     6
pi    1/6   1/6   1/6   1/6   1/6   1/6
xi           1      2      3      4      5      6
pi           1/6    1/6    1/6    1/6    1/6    1/6
(xi - μX)²   6.25   2.25   .25    .25    2.25   6.25
$$\sigma_X^2 = (6.25 \cdot \tfrac{1}{6}) + (2.25 \cdot \tfrac{1}{6}) + (.25 \cdot \tfrac{1}{6}) + (.25 \cdot \tfrac{1}{6}) + (2.25 \cdot \tfrac{1}{6}) + (6.25 \cdot \tfrac{1}{6}) \approx 2.92$$
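The variance of the fair die can be reproduced with a short Python sketch of the weighted average of squared deviations. The helper names `mean` and `variance` are assumptions for illustration.

```python
from fractions import Fraction


def mean(values, probs):
    """Weighted average of the values: sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))


def variance(values, probs):
    """Weighted average of squared deviations from the mean."""
    mu = mean(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))


values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
var = variance(values, probs)
print(var)                   # 35/12
print(round(float(var), 2))  # 2.92
```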
Suppose that we toss a coin. Let X =
1 if it’s heads and 0 if it’s tails.
What is the expected value of X?
What is the variance?
Table columns to fill in: xi, pi, (xi - μX)²
μ = .50
σ² = .25
Suppose that we toss 3 coins. For
every head we get $1 and for every tail
we lose $1. Let X = our winnings.
What is the expected value of X?
What is the variance of X?
Table columns to fill in: xi, pi, (xi - μX)²
μ = 0
σ² = 3
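One way to check the three-coin answers is to enumerate all eight equally likely outcomes and build the distribution of the winnings. This is a sketch under the assumption that the coins are fair and independent; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Each coin pays +1 (head) or -1 (tail); all 8 outcomes are equally likely.
outcomes = list(product([1, -1], repeat=3))
p = Fraction(1, len(outcomes))

# Build the distribution of X = total winnings.
dist = {}
for outcome in outcomes:
    x = sum(outcome)
    dist[x] = dist.get(x, 0) + p

mu = sum(x * px for x, px in dist.items())
var = sum((x - mu) ** 2 * px for x, px in dist.items())
print(sorted(dist.items()))
print(mu, var)  # 0 3
```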
Known Discrete Distributions
• The Bernoulli (heads versus tails)
• The binomial (# heads in n tosses)
• The Poisson (# customers entering a post office in a day)
Continuous Probability Distributions
• We talk about probabilities for a range of values, not a
particular value.
• Probability for a range of values is determined by the area
under the probability distribution curve (use calculus or a
table).
• Expected value
$$\mu_X = \int_{-\infty}^{\infty} x f(x)\,dx$$
• Variance
$$\sigma_X^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \left(\int_{-\infty}^{\infty} x f(x)\,dx\right)^2$$

Known Continuous Distributions
• The uniform distribution
• The normal distribution
• The t distribution
• The F distribution
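For a continuous distribution, the expected value and variance are integrals rather than sums. As a rough sketch, the integrals can be approximated numerically for the uniform distribution on [0, 1]. The interval, the midpoint-rule integrator, and the function names are all assumptions chosen for illustration.

```python
def integrate(func, lower, upper, steps=100_000):
    """Approximate the integral of func on [lower, upper] (midpoint rule)."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


def uniform_density(x):
    """Density of the uniform distribution on [0, 1] (assumed example)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


# Expected value: integral of x * f(x).
mean = integrate(lambda x: x * uniform_density(x), 0.0, 1.0)
# Variance: integral of x^2 * f(x), minus the squared mean.
second_moment = integrate(lambda x: x ** 2 * uniform_density(x), 0.0, 1.0)
variance = second_moment - mean ** 2
print(round(mean, 4), round(variance, 4))  # 0.5 0.0833
```

The density is zero outside [0, 1], so integrating over that interval alone is enough here.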
Normal Distribution
• The probability distribution curve for the normal distribution
N(µ,σ) is defined by this function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$
• Luckily, you can find the probabilities for this curve
using Table E
• Expected value: μ
• Variance: σ²
Standard Normal Distribution
• A normal distribution has mean μx and variance σx²
• The standard normal distribution is a normal distribution that has been
transformed to have mean 0 and variance 1
• If raw scores are normally distributed, the distribution of z-scores will be
standard normal
• Thus if raw scores are normally distributed, we can associate z-scores
with standard normal probabilities
• (whether or not raw scores are normally distributed, a z-score
accurately indexes/positions a score in terms of the number of standard
deviations away from the mean)
Interpreting Z Scores
[Figure: Z axis from -3 to 3, with a central region labeled "Ordinary Values" and the two tails labeled "Unusual Values"]
Z-scores: Handy for thinking about the normal
probability distribution
• If the distribution of raw scores is
normal, the z distribution will be
“standard normal”
[Figure: probability density curve; horizontal axis: Z scores]
• This is a probability density curve
• In particular, it is the “standard normal” probability
distribution
• Probability corresponds to area under the curve
• Total area under the curve is 1
Standard Normal Distribution
[Figure: standard normal curve; horizontal axis: Z scores; marked area = 0.3413]
• ± 1 standard deviation includes about 68% of cases (34% on each side)
• ± 2 standard deviations includes about 95% of cases
• ± 3 standard deviations includes about 99.7% of cases
ASSUMING RAW SCORES DISTRIBUTED NORMALLY
Using Appendix E.10
• For positive z scores, gives you area under curve that
corresponds to probability
• For negative z scores, use the complement rule
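[Figure: three normal curves marking μ and z, shading the areas "mean to z", "larger portion", and "smaller portion"]
z      mean to z   larger portion   smaller portion
.00    0.0000      0.5000           0.5000
.50    0.1915      0.6915           0.3085
1.0    0.3413      0.8413           0.1587
1.5    0.4332      0.9332           0.0668
2.0    0.4772      0.9772           0.0228
2.5    0.4938      0.9938           0.0062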
Standard Normal Distribution
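[Figure: standard normal curve, Score (z) from -3 to 3; area found in Appx E.10 = 0.3413; tail area = 0.1587; two areas marked "?"]
[Figure: standard normal curve, Score (z), with z = 1.58 marked; area from 0 to z = 0.4429; area beyond z = 0.0571]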
Exercise
1. What is the probability of getting a Z greater than 1.96?
2. What z-score will give you a probability of 5% in the upper
tail?
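Table lookups like the two in this exercise can be cross-checked in software. The sketch below uses `NormalDist` from Python's standard `statistics` module in place of Appendix E.10; this substitution is an assumption for illustration, not the method used in the lecture.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Upper-tail area: P(Z > z) = 1 - P(Z <= z).
upper_tail = 1 - standard_normal.cdf(1.96)
print(round(upper_tail, 4))  # 0.025

# z-score that leaves 5% in the upper tail, i.e. 95% of the area below it.
z_cutoff = standard_normal.inv_cdf(0.95)
print(round(z_cutoff, 3))  # 1.645
```

Here `cdf` plays the role of the "larger portion" column for positive z, and `inv_cdf` reads the table in reverse.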

Applications
Let’s say the population of bartenders has a mean IQ of 100
and a standard deviation of 10
If we measure the IQ of any one bartender, how likely is
it that her score would be greater than 80?
P(x > 80) = ?
Step 1: Translate the score into a z-score
$$z = \frac{X - \mu}{\sigma} = \frac{80 - 100}{10} = -2$$
Step 2: Use E.10 to get the probability
P(Z > -2) = 0.5 + 0.4772 = 0.9772, or about 98%
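The two steps of this worked example can be sketched in Python, assuming (as the example does) that IQ scores are normally distributed. The helper name `z_score` is an assumption, and `NormalDist` from the standard `statistics` module stands in for the table lookup.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Step 1: translate the raw score into a z-score.
z = z_score(80, mean=100, sd=10)
# Step 2: area to the right of z under the standard normal curve.
p_greater = 1 - NormalDist().cdf(z)
print(z)                    # -2.0
print(round(p_greater, 4))  # 0.9772
```

For a range such as "between two scores", the same approach applies by subtracting the `cdf` values at the two z-scores.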
Exercise
Let’s say the population of bartenders has a
mean IQ of 100 and a standard deviation of 10
If we measure the IQ of any one bartender,
how likely is it that her score would be
• Greater than 80?
• Between 90 and 110?
• Greater than 115?
Rules for Expected Values
1. Linear Transformations:
If you add/subtract a constant to the RV, then add/subtract that number to μX
If you multiply/divide the RV by a constant, then multiply/divide μX by that number
$$\mu_{a+bX} = a + b\mu_X$$
2. Combining Two Random Variables
If you add random variable X to random variable Y, then add μX to μY
If you subtract random variable X from random variable Y, then subtract μX from μY
$$\mu_{X+Y} = \mu_X + \mu_Y$$
$$\mu_{X-Y} = \mu_X - \mu_Y$$
Rules for Variances of Random Variables
1. Linear Transformations:
If you add/subtract a constant to the RV, then nothing happens to σX²
If you multiply/divide the RV by a constant, then multiply/divide σX² by that number
squared
$$\sigma^2_{a+bX} = b^2 \sigma^2_X$$
2. Combining Two Independent Random Variables
If you add random variable X to random variable Y, then add σX² to σY²
If you subtract random variable X from random variable Y, then add σX² to σY²
$$\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y$$
$$\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$$
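The rules for means and variances can be checked exactly on a small case. The sketch below assumes two independent fair dice and arbitrary constants a = 5 and b = 2; these choices and the helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def mean_var(dist):
    """Return (mean, variance) of a dict mapping value -> probability."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var


def distribution_of(func):
    """Distribution of func(x, y) for two independent fair dice."""
    dist = {}
    for x, y in product(range(1, 7), repeat=2):
        value = func(x, y)
        dist[value] = dist.get(value, 0) + Fraction(1, 36)
    return dist


mu_x, var_x = mean_var(distribution_of(lambda x, y: x))
a, b = 5, 2  # arbitrary constants chosen for illustration

# Linear transformation: mean becomes a + b*mu, variance becomes b^2 * var.
assert mean_var(distribution_of(lambda x, y: a + b * x)) == (a + b * mu_x, b ** 2 * var_x)

# Sum and difference of independent variables: variances add in both cases.
assert mean_var(distribution_of(lambda x, y: x + y)) == (2 * mu_x, 2 * var_x)
assert mean_var(distribution_of(lambda x, y: x - y)) == (0, 2 * var_x)
print(mu_x, var_x)  # 7/2 35/12
```

The additive rule for variances relies on independence; for dependent variables the check on the sum and difference would generally fail.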
Example 1
Suppose that we toss a coin. Let X =
1 if it’s heads and 0 if it’s tails. What
is the expected value of X?
μX = .50
Now we go into a special “double or
nothing” round. All dollar values are
doubled in this round. What is the new
expected value of X?
μX = 1.00
What is the variance of X?
σX² = .25
What is the new variance of X?
σX² = 1.00
Example 2
Suppose that we toss a coin. Let X =
1 if it’s heads and 0 if it’s tails. What
is the expected value of X?
μX = .50
Now suppose we toss three coins.
What is the expected value of all three
tosses combined?
μX+Y+Z = 1.50
What is the variance of X?
σX² = .25
What is the new variance?
σ²X+Y+Z = .75
Example 3
50 vegetarians and 100 non-vegetarians participate in a study of cardiovascular
health.
On average, the vegetarians received a score of 80 with a standard deviation of 5.
The non-vegetarians scored 70 points on average with a standard deviation of 10.
A sneaky researcher tries to fudge the data by multiplying the scores of the non-vegetarians by 1.2 and adding 5 points. What happens to the mean and standard deviation?
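One way to approach this example is to apply the linear-transformation rules to the non-vegetarian scores, with a = 5 and b = 1.2. The sketch below is a hedged illustration; the helper name `transform_mean_sd` is an assumption, and the standard deviation is scaled by |b| because it is the square root of the variance.

```python
def transform_mean_sd(mean, sd, a, b):
    """Mean and sd of a + b*X, given the mean and sd of X."""
    # Adding a shifts the mean only; multiplying by b scales both.
    return a + b * mean, abs(b) * sd


new_mean, new_sd = transform_mean_sd(mean=70, sd=10, a=5, b=1.2)
print(round(new_mean, 2), round(new_sd, 2))  # 89.0 12.0
```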
Example 4
100 pairs of male-female siblings participate in a study of repressive coping.
For the women, the average repressive coping score was 6 with a standard
deviation of .5. For men, the average repressive coping score was 5, with a
standard deviation of .5.
What are the mean and standard deviation of the set of male-female difference
scores?
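A sketch of this calculation using the rules for combining random variables follows. It assumes the two scores in each pair are independent, which is what the variance rule requires; sibling scores may well be correlated, in which case the standard deviation computed here would not apply. The helper name `difference_mean_sd` and the choice to take women minus men are assumptions for illustration.

```python
import math


def difference_mean_sd(mean_x, sd_x, mean_y, sd_y):
    """Mean and sd of X - Y, ASSUMING X and Y are independent."""
    mean_diff = mean_x - mean_y
    # Variances add for a difference of independent variables.
    sd_diff = math.sqrt(sd_x ** 2 + sd_y ** 2)
    return mean_diff, sd_diff


# Women's score minus men's score (direction chosen for illustration).
mean_diff, sd_diff = difference_mean_sd(mean_x=6, sd_x=0.5, mean_y=5, sd_y=0.5)
print(mean_diff, round(sd_diff, 3))  # 1 0.707
```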