
AP Statistics
Chapter 6
Random Variables
What is a random variable?
• A random variable is a variable whose value is a numerical outcome of a
random phenomenon.
EXAMPLE
If we toss four coins, how would we record the results?
We could record it as a string of tails and heads like “HTTH” or “HTHH”.
This is not a random variable because it has no numerical value to work with.
Instead, we may elect to record the number of heads in the four tosses.
This would make our sample space {0, 1, 2, 3, 4}, which consists entirely of numerical outcomes.
Discrete vs. Continuous Variables
• A discrete random variable has a countable number of possible values.
• A continuous random variable can take any possible value over an
interval.
EXAMPLES
The number of heads in four coin tosses (discrete).
A number generated by a spinner that covers the numbers between 0 and 1 (continuous).
Discrete Random Variables
• The probability distribution of a discrete random variable lists the values and their
probabilities.
Value X:   $x_1$   $x_2$   $x_3$   …   $x_k$
P(X):      $p_1$   $p_2$   $p_3$   …   $p_k$
• The probabilities must satisfy two requirements:
• Every probability is between 0 and 1.
• $p_1 + p_2 + \dots + p_k = 1$
• Find the probability of any event by adding the individual probabilities that
make up that event.
EXAMPLE
Determine the probability distribution of the discrete random variable X that
counts the number of heads in four coin tosses.
We can do this if we make two reasonable assumptions:
1. The coin is balanced, so each toss is equally likely to give an H or T.
2. The coin has no memory, so each toss is independent.
Since each outcome is equally likely, what is the probability of each combination?
Continued…
The number X represents the number of heads in four tosses. These values
are NOT equally likely.
Use this information to complete your probability distribution.
What is the probability of getting 2 or more heads?
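One way to check the four-coin distribution is to list all 16 equally likely outcomes and count heads. The sketch below assumes a fair coin and independent tosses, as stated in the example; the variable names are illustrative only.

```python
from itertools import product
from fractions import Fraction

# Enumerate all 2^4 equally likely outcomes of four fair, independent tosses.
outcomes = list(product("HT", repeat=4))

# Tally the probability of each head count X = 0, 1, 2, 3, 4.
distribution = {k: Fraction(0) for k in range(5)}
for outcome in outcomes:
    distribution[outcome.count("H")] += Fraction(1, len(outcomes))

# P(X >= 2) is the sum of the probabilities for X = 2, 3, and 4.
p_two_or_more = sum(p for k, p in distribution.items() if k >= 2)

for k, p in distribution.items():
    print(k, p)
print("P(X >= 2) =", p_two_or_more)
```

Each single outcome has probability 1/16, and the head counts 0 through 4 have probabilities 1/16, 4/16, 6/16, 4/16, and 1/16, so P(X ≥ 2) = 11/16.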
Means and Variances
• The mean of a set of observations is the ordinary average, $\bar{x}$.
• The mean of a random variable X is also an average of the
possible values of X.
• This average must take into account that some values of X
may occur more frequently than others.
• We can handle this adjustment by multiplying each outcome
by its probability.
Value X:   $x_1$   $x_2$   $x_3$   …   $x_k$
P(X):      $p_1$   $p_2$   $p_3$   …   $p_k$
$\mu_X = x_1 p_1 + x_2 p_2 + \dots + x_k p_k$
$\mu_X = \sum x_i p_i$
EXAMPLE
According to Benford’s Law, the distribution of the first digit V in a
set of legitimate business records is:
First Digit V:   1      2      3      4      5      6      7      8      9
P(V):            0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046
Use this information to compute the expected value of any
randomly selected first digit. (expected value = mean)
The mean of V is:
$\mu_V = 1(0.301) + 2(0.176) + 3(0.125) + 4(0.097) + 5(0.079) + 6(0.067) + 7(0.058) + 8(0.051) + 9(0.046)$
$\mu_V = 3.441$
EXAMPLE Continued…
• While the mean of 3.441 is not a possible outcome of V, it still
gives us an idea of where we can expect most values to occur.
• If every first digit from 1 to 9 were equally likely, we would have a uniform
distribution.
• What would the mean be in this case?
• Notice how this compares to the distribution of Benford’s Law.
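The comparison between Benford's Law and a uniform first digit can be sketched in a few lines. The helper name expected_value is hypothetical, and the uniform case assumes each digit from 1 to 9 has probability 1/9.

```python
# First-digit probabilities from the Benford's Law table in the example.
benford = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}

# Assumed uniform alternative: each digit 1 through 9 has probability 1/9.
uniform = {digit: 1 / 9 for digit in range(1, 10)}


def expected_value(distribution):
    """Mean of a discrete random variable: the sum of value * probability."""
    return sum(value * prob for value, prob in distribution.items())


benford_mean = expected_value(benford)
uniform_mean = expected_value(uniform)
print(round(benford_mean, 3), round(uniform_mean, 3))
```

The Benford mean of 3.441 sits well below the uniform mean of 5, which reflects how heavily Benford's Law favors small first digits.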
Variance
• In a set of discrete values, the variance is based on
how much each value “varies” from the expected
amount.
• In the case of a random variable’s distribution, we
must account for the differences in frequency
among outcomes.
Value X:   $x_1$   $x_2$   $x_3$   …   $x_k$
P(X):      $p_1$   $p_2$   $p_3$   …   $p_k$
$\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \dots + (x_k - \mu_X)^2 p_k$
$\sigma_X^2 = \sum (x_i - \mu_X)^2 p_i$
• …and the standard deviation is the square root of
the variance.
EXAMPLE
Gain Communications sells aircraft communication units to both
military and civilian markets.
Gain uses the modern practice of using probability estimates to
estimate sales for the upcoming year.
The military division of the company estimates its sales as
follows:
Units Sold (X):   1000   3000   5000   10,000
P(X):             0.1    0.3    0.4    0.2
Calculate the expected number of sales and the standard
deviation.
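A short sketch of the calculation for the military division, applying the mean and variance formulas for a discrete random variable directly. The variable names are illustrative.

```python
import math

# Military sales estimates from the table: units sold and their probabilities.
units = [1000, 3000, 5000, 10000]
probs = [0.1, 0.3, 0.4, 0.2]

# Mean: weight each outcome by its probability.
mean_units = sum(x * p for x, p in zip(units, probs))

# Variance: probability-weighted squared deviations from the mean.
variance_units = sum((x - mean_units) ** 2 * p for x, p in zip(units, probs))

# Standard deviation: square root of the variance.
sd_units = math.sqrt(variance_units)

print(mean_units, variance_units, round(sd_units, 2))
```

The expected number of sales is 5000 units, with a variance of 7,800,000 and a standard deviation of roughly 2793 units.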
HOMEWORK
Complete the problems: pg. 353 (#1 – 16). This assignment will be due for
completion at the start of the next session of class.
Continuous Random Variables
• As mentioned before, continuous random variables deal with an infinite
number of possible outcomes over a pre-determined interval.
• Since there are infinitely many possibilities, the probability of any
individual value is zero.
• Suppose we wanted to make a probability distribution for an event like
$0.3 \le x \le 0.7$.
• What would be the theoretical probability assigned to 0.47?
$P(0.47) = \frac{1}{\infty} = 0$
Density Curves
• In order to assign probabilities to events we can use density curves to describe a distribution.
• The horizontal axis of the density curve will represent all of the occurrences and its height
over each occurrence will represent its frequency.
• The area under the curve over an interval will represent the probability of an event within that
interval occurring.
• The total area under the curve will equal 1.
EXAMPLE
Let’s revisit the spinner that generates a random number between 0 and 1.
What would be the probability of generating a number X between 0.3 and 0.7 ?
$P(0.3 \le X \le 0.7)$
EXAMPLE Continued
• Since each number on the spinner has an equal chance of being
generated, we will call this a uniform distribution.
• The area under the curve is 1. Since this is uniform, the curve will
be rectangular in shape.
• The probability of getting a value between 0.3 and 0.7 will be the
area between those two values.
$P(0.3 \le X \le 0.7) = 0.4$
Taking it further…
• With the same example in mind, what would be the
following:
$P(X \le 0.5) = 0.5$
$P(X > 0.8) = 0.2$
$P(X \le 0.5 \text{ or } X > 0.8) = 0.7$
Is there a difference between $P(X > 0.8)$ and $P(X \ge 0.8)$?
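The spinner probabilities can be checked by treating each one as the area of a rectangle under a flat density curve. This is a minimal sketch; the function uniform_prob is a hypothetical helper, not a library call.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """Area under the uniform density on [low, high] between a and b."""
    # Clip the interval to the range the spinner can actually produce.
    a, b = max(a, low), min(b, high)
    # Rectangle area: width of the interval times the height 1 / (high - low).
    return max(b - a, 0.0) / (high - low)


p_middle = uniform_prob(0.3, 0.7)   # P(0.3 <= X <= 0.7)
p_low = uniform_prob(0.0, 0.5)      # P(X <= 0.5)
p_high = uniform_prob(0.8, 1.0)     # P(X > 0.8)
p_either = p_low + p_high           # the two intervals do not overlap
p_point = uniform_prob(0.8, 0.8)    # P(X = 0.8), an interval of zero width

print(p_middle, p_low, p_high, p_either, p_point)
```

Because a single point has zero width, it contributes zero area, so including or excluding an endpoint does not change the probability for a continuous random variable.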
The Normal Distribution
• We have discussed a density curve in prior chapters.
It was the NORMAL CURVE.
• The normal distribution is considered a probability
distribution.
• Recall that N(μ, σ) is our shorthand way of referring
to the normal distribution having a mean of μ and a
standard deviation of σ.
• To standardize our values and use our normal
distribution table, we must use a z-score.
$Z = \frac{X - \mu}{\sigma}$
EXAMPLE
An opinion poll asks an SRS of 1500 American adults what the
biggest issue facing schools is.
Based on the sample data, 30% of the adults said drugs. We will
learn how to analyze this later, but for now, we will say that
this is an estimate of the population with a distribution mean
of 0.3 and a standard deviation of 0.0118.
In other words… N (0.3, 0.0118)
What is the probability that the result differs from the truth by
more than two percentage points?
In other words…
$P(\hat{p} < 0.28 \text{ or } \hat{p} > 0.32)$
Hint: Start off by “standardizing” the data.
EXAMPLE Continued…
$P(\hat{p} < 0.28) = P\left(Z < \frac{0.28 - 0.3}{0.0118}\right) = P(Z < -1.69) = 0.0455$
$P(\hat{p} > 0.32) = P\left(Z > \frac{0.32 - 0.3}{0.0118}\right) = P(Z > 1.69) = 0.0455$
$P(\hat{p} < 0.28 \text{ or } \hat{p} > 0.32) = 0.0910$
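The table lookup can be cross-checked in software. The sketch below builds the normal CDF from Python's math.erf; the function name normal_cdf is a hypothetical helper. Because it does not round z to two decimal places first, it gives a total near 0.090 rather than the table-based 0.0910.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution N(mu, sigma)."""
    z = (x - mu) / sigma  # standardize to a z-score
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma = 0.3, 0.0118

# Each tail: more than two percentage points below or above the mean.
lower_tail = normal_cdf(0.28, mu, sigma)
upper_tail = 1.0 - normal_cdf(0.32, mu, sigma)
total = lower_tail + upper_tail

# Table-style value using z rounded to -1.69, as in the worked example.
table_tail = normal_cdf(-1.69)

print(round(total, 4), round(table_tail, 4))
```

By symmetry the two tails are equal, so the answer is simply twice one tail.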
HOMEWORK
Complete the problems: pg. 355 (#17 – 30). This assignment will be due for
completion at the start of the next session of class.
Rules for Means
• If the values of a random variable, X, are increased
or decreased by addition or subtraction, then the
mean value of X is increased or decreased by the
same amount.
• If the values of a random variable, X, are multiplied
by a constant, then the mean value of X is
multiplied by the same constant.
• In other words,
$\mu_{a+bX} = a + b\mu_X$
Rules for Means
• If we have two random variables, X and Y, then the
sum of those two variables will have a mean that is
equal to the sum of their individual means.
• In other words,
$\mu_{X+Y} = \mu_X + \mu_Y$
EXAMPLE
Gain Communications sells aircraft communication units to both
military and civilian markets.
Gain uses the modern practice of using probability estimates to
estimate sales for the upcoming year.
The military division of the company estimates its sales as
follows:
Units Sold (X):   1000   3000   5000   10,000
P(X):             0.1    0.3    0.4    0.2
The civilian division of the company estimates its sales as
follows:
Units Sold (Y):   300   500   750
P(Y):             0.4   0.5   0.1
Compute the mean sales of each.
EXAMPLE
• Gain makes a profit of $2000 on each military unit
and $3500 on each civilian unit that is sold.
• The mean military sales profit is:
$\mu_{2000X} = \$2000(5000) = \$10{,}000{,}000$
• The mean civilian sales profit is:
$\mu_{3500Y} = \$3500(445) = \$1{,}557{,}500$
• The total profit, Z, is the sum of all sales profits.
• The mean value of Z would be:
$Z = \$2000X + \$3500Y$
$\mu_Z = \mu_{2000X} + \mu_{3500Y}$
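The rules for means can be applied to the Gain Communications figures as a quick check. This is a sketch with illustrative names; the profit per unit ($2000 military, $3500 civilian) comes from the example.

```python
# Sales distributions for each division: units sold -> probability.
military = {1000: 0.1, 3000: 0.3, 5000: 0.4, 10000: 0.2}
civilian = {300: 0.4, 500: 0.5, 750: 0.1}


def mean(distribution):
    """Mean of a discrete random variable: the sum of value * probability."""
    return sum(value * prob for value, prob in distribution.items())


mean_x = mean(military)
mean_y = mean(civilian)

# Rule for means: the mean of 2000X + 3500Y is 2000*mean(X) + 3500*mean(Y).
mean_profit = 2000 * mean_x + 3500 * mean_y

print(mean_x, mean_y, mean_profit)
```

The mean total profit works out to $10,000,000 + $1,557,500 = $11,557,500.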
Rules for Variance
• We can apply similar rules to the variances of
random variables.
• In order to do this, we must know whether the two
random variables are independent of one another.
• Independence would mean that there is a correlation of
ZERO between them.
• If there is a correlation between them, we must
account for that correlation when we try to
combine variances.
• It should also be noted that we are working with
variances here and not standard deviations.
Rules for Variance
• If X is a random variable and a and b are fixed
numbers, then:
$\sigma^2_{a+bX} = b^2 \sigma^2_X$
• Notice that addition to X does not affect the
variation. Only multiplication does.
• If X and Y are random variables with complete
independence (no correlation):
$\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y$
$\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$
EXAMPLE
A college uses SAT scores as one criterion for
admission. Experience has shown that the
distribution of SAT scores among its entire
population of applicants is:
SAT Math Score (X):     $\mu_X = 625$   $\sigma_X = 90$
SAT Verbal Score (Y):   $\mu_Y = 590$   $\sigma_Y = 100$
What are the mean and standard deviation of the total
score X + Y among students applying to this college?
$\mu_{X+Y} = 625 + 590 = 1215$
$\sigma_{X+Y} = \sqrt{90^2 + 100^2} = \sqrt{18100} \approx 134.54$
NOTE: This is based on the assumption that the scores
are independent, which many would argue they
are not.
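The SAT calculation is a direct use of the rules for means and variances. The sketch below assumes the two scores are independent, as the note says; without that assumption the variance step would not be valid.

```python
import math

# Means and standard deviations from the example.
mu_x, sigma_x = 625, 90    # SAT Math
mu_y, sigma_y = 590, 100   # SAT Verbal

# Means always add.
mu_total = mu_x + mu_y

# Variances add only under the independence assumption; standard deviations do not add.
variance_total = sigma_x ** 2 + sigma_y ** 2
sigma_total = math.sqrt(variance_total)

print(mu_total, variance_total, round(sigma_total, 2))
```

Note that simply adding the standard deviations (90 + 100 = 190) would overstate the spread of the total score.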
EXAMPLE
• A large auto dealership keeps track of sales and lease agreements
made during each hour of the day. Let X = the number of cars
sold, and let Y = the number of cars leased during the first hour
of a randomly selected Friday.
• Based on previous records, the distributions of X and Y are:
Sold X:     0     1     2     3
p:          0.3   0.4   0.2   0.1
Leased Y:   0     1     2
p:          0.4   0.5   0.1
CONTINUED…
• Find the mean and standard deviation of both X and Y.
$\mu_X = 1.1$, $\sigma_X = 0.943$
$\mu_Y = 0.7$, $\sigma_Y = 0.64$
• Now let’s define the total number of deals as T. (T = X + Y)
• Find and interpret the mean of T.
$\mu_T = \mu_X + \mu_Y = 1.8$
• Now compute the standard deviation of T.
CONTINUED
• Remember that you must deal with variances instead of standard
deviations, and that adding variances assumes X and Y are independent.
$\sigma_T^2 = 0.943^2 + 0.64^2 = 1.2988$
$\sigma_T = 1.14$
• The dealership’s manager receives a $500 bonus for each car sold
and a $300 bonus for each car leased. Find the mean and
standard deviation of the manager’s total bonus.
$\mu_B = 500(1.1) + 300(0.7) = \$760$
$\sigma_B = \sqrt{500^2 (0.943)^2 + 300^2 (0.64)^2} = \$509.09$
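The dealership example can be reproduced from the two probability tables. This sketch assumes X and Y are independent, and the helper mean_and_variance is hypothetical. It carries unrounded variances (0.89 and 0.41) through the calculation, so the bonus standard deviation comes out near $509.31 rather than the $509.09 obtained from rounded standard deviations.

```python
import math

# Distributions for cars sold (X) and cars leased (Y): value -> probability.
sold = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}
leased = {0: 0.4, 1: 0.5, 2: 0.1}


def mean_and_variance(distribution):
    """Return (mean, variance) of a discrete random variable."""
    mu = sum(v * p for v, p in distribution.items())
    var = sum((v - mu) ** 2 * p for v, p in distribution.items())
    return mu, var


mu_x, var_x = mean_and_variance(sold)
mu_y, var_y = mean_and_variance(leased)

# Total deals T = X + Y: means add, and variances add under independence.
mu_t = mu_x + mu_y
sd_t = math.sqrt(var_x + var_y)

# Bonus B = 500X + 300Y: each variance is scaled by the square of its multiplier.
mu_b = 500 * mu_x + 300 * mu_y
sd_b = math.sqrt(500 ** 2 * var_x + 300 ** 2 * var_y)

print(round(mu_t, 2), round(sd_t, 2), round(mu_b, 2), round(sd_b, 2))
```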
Check Your Understanding
Complete the Check Your Understanding problem on the top of pg.
372. We will discuss the answers in a moment.
HOMEWORK
Complete the problems: pg. 378 (#37 – 51). This assignment will be due for
completion at the start of the next session of class.
The Binomial Setting
• We have a binomial situation when the following things are in place:
1. Each observation will fall into one of two categories, usually considered
“success” or “failure”.
2. There is a fixed number of observations, “n”.
3. All of the n observations are independent.
4. The probability of success is the same for each observation.
Binomial Distributions
• In a binomial setting, the random variable X is equal to the number of
successes.
• The probability distribution of X in this case is considered a binomial
distribution.
• The parameters of the distribution are n and p.
• n represents the number of observations
• p is the probability of success on any observation.
• As an abbreviation, we say that X is B(n, p).
EXAMPLE
Blood type is a trait that is passed through heredity. If both parents carry the
genes for both O and A blood types, there is a probability of 0.25 of
having a child with Type O blood.
If these parents have 5 children, how many of the children would have Type O
blood?
The count X of children with Type O blood has the binomial distribution B(5, 0.25).
Deal 10 cards and let X be the count of the number of red cards.
This would not be a binomial distribution because the cards are dealt without
replacement, so the observations are not independent.
Computing Binomial Probabilities
• If X has the binomial distribution with n observations, having a
probability of p for success on each, then the possible values of X are 0, 1,
2, …, n. If k is any of these values,
$P(X = k) = \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k}$
This formula can be applied to find the probability of k number of successes
in the situation described.
EXAMPLE
A quality engineer selects an SRS of 10 switches from a large shipment for detailed inspection.
Unknown to the engineer, 10% of the switches in the shipment fail to meet the
specifications. What is the probability that exactly 1 of the ten switches in the sample will
fail inspection?
This is a distribution defined as B(10, 0.1).
In this situation, k = 1.
$P(X = 1) = \frac{10!}{1!(10-1)!} (0.1)^1 (0.9)^9 = 0.3874$
What would be the probability of the engineer finding 1 or fewer defective switches?
$P(X \le 1) = 0.7361$
EXAMPLE 2
Each child of a particular pair of parents has a probability 0.25 of having type
O blood. If they have 5 children, what is the probability that exactly 3 of
the children have type O blood?
$P(X = 3) = \frac{5!}{3!(5-3)!} (0.25)^3 (0.75)^2 = 10(0.25)^3(0.75)^2 = 0.08789$
There is roughly an 8.8% chance that this could happen!
What is the probability that MORE THAN 3 of the children have type O
blood?
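Both binomial examples (the switches and the blood types) can be checked with one small function built on the binomial formula. The name binomial_pmf is a hypothetical helper; math.comb requires Python 3.8 or later.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) when X is B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Switch inspection, B(10, 0.1): exactly 1 failure, then 1 or fewer failures.
p_one_switch = binomial_pmf(1, 10, 0.1)
p_at_most_one = binomial_pmf(0, 10, 0.1) + binomial_pmf(1, 10, 0.1)

# Blood type, B(5, 0.25): exactly 3 children, then more than 3 (that is, 4 or 5).
p_three = binomial_pmf(3, 5, 0.25)
p_more_than_three = binomial_pmf(4, 5, 0.25) + binomial_pmf(5, 5, 0.25)

print(round(p_one_switch, 4), round(p_at_most_one, 4))
print(round(p_three, 5), round(p_more_than_three, 6))
```

"More than 3" means adding the probabilities for 4 and 5 children, which gives about 0.0156.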
HOMEWORK
Complete the problems pg. 403 (#69 – 80). This assignment will be due for
completion at the start of the next session of class.
Geometric Probability
• We have a geometric setting when the following characteristics are in
place:
1. Each observation will fall into one of two categories, usually considered
“success” or “failure”.
2. The probability of success is the same for each observation.
3. All of the observations are independent.
4. The variable of interest, X, is the number of trials required to obtain the
first success.
EXAMPLE
If we are rolling a single die, and we want to roll a “5”, then how many rolls
would it take to get a five for the first time?
GEOMETRIC DISTRIBUTION
If we are rolling a die four times, and we want to count the number of fives
that we roll …
BINOMIAL DISTRIBUTION
Calculating Geometric Probabilities
• If each trial has a probability p of success and a probability q = 1 − p of
failure, the possible values of X are 1, 2, 3, …
• If n is any of these values, the probability that the first success occurs on
the nth trial is:
$P(X = n) = q^{n-1} p$
• What would be the probability that it would take 3 rolls before we got our
first five? 6 rolls?
Using the TI-84: pdf
• Just as with binomial probabilities, we can use the calculator to quickly
compute geometric probabilities.
• The geometpdf function will quickly compute the probability for a set
number of trials being required to achieve first success.
• To compute the probability that it would take five rolls to roll a “5” for
the first time, we would use:
geometpdf(1/6, 5) = 0.0804
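The same calculation can be done without a calculator by coding the geometric formula directly. The function geometric_pmf is a hypothetical Python helper that mirrors what the calculator command computes; it is not part of the TI-84.

```python
def geometric_pmf(n, p):
    """P(first success occurs on trial n): (n - 1) failures, then one success."""
    return (1 - p) ** (n - 1) * p


p_five = 1 / 6  # probability of rolling a "5" on a fair die

first_on_third = geometric_pmf(3, p_five)
first_on_fifth = geometric_pmf(5, p_five)
first_on_sixth = geometric_pmf(6, p_five)

print(round(first_on_third, 4), round(first_on_fifth, 4), round(first_on_sixth, 4))
```

The probabilities shrink as n grows, since each extra trial requires one more failure before the first success.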
The Geometric Distribution
• The geometric probability distribution also has a mean and standard
deviation.
• The mean, or expected value, of a geometric random variable is:
$\mu = \frac{1}{p}$
• The standard deviation of a geometric random variable is:
$\sigma = \sqrt{\frac{q}{p^2}} = \frac{\sqrt{q}}{p}$
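As an illustration, here are the mean and standard deviation for the die-rolling setting, where success means rolling a "5". This is a sketch that assumes a fair die, so p = 1/6.

```python
import math

p = 1 / 6   # probability of success on each roll (assumed fair die)
q = 1 - p   # probability of failure on each roll

# Mean number of rolls needed to see the first "5".
mean_rolls = 1 / p

# Standard deviation: the square root of the variance q / p^2.
sd_rolls = math.sqrt(q) / p

print(round(mean_rolls, 2), round(sd_rolls, 2))
```

On average it takes 6 rolls to get the first "5", with a standard deviation of about 5.48 rolls.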
HOMEWORK
Complete the problems 8.37 – 8.46. This assignment will be due for
completion at the start of the next session of class.