6.1 Discrete and Continuous Random Variables

Objectives
SWBAT:
• COMPUTE probabilities using the probability distribution of a discrete random variable.
• CALCULATE and INTERPRET the mean (expected value) of a discrete random variable.
• CALCULATE and INTERPRET the standard deviation of a discrete random variable.
• COMPUTE probabilities using the probability distribution of certain continuous random
variables.
What is a random variable? Give some examples.
• A random variable takes numerical values that describe the outcomes
of some chance process.
Let X = the score on hole #13 at
Augusta National golf course for a
randomly selected golfer on day 1 of
the 2011 Masters.
Consider tossing a fair coin 3 times.
Define X = the number of heads obtained
X = 0: TTT
X = 1: HTT THT TTH
X = 2: HHT HTH THH
X = 3: HHH
Value:        0     1     2     3
Probability:  1/8   3/8   3/8   1/8
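The coin-tossing table can be rebuilt by brute force. The following is a minimal Python sketch, not part of the original slides; the variable names (`outcomes`, `counts`, `distribution`) are illustrative assumptions. It lists all eight equally likely outcomes and counts the heads in each.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely outcomes of three fair coin tosses, e.g. "HTT".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# X = number of heads; count how many outcomes give each value of X.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each value = (outcomes giving that value) / (total outcomes).
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the 1/8, 3/8, 3/8, 1/8 in the table.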
What is a probability distribution?
• The probability distribution of a random variable gives its possible values and
their probabilities.
• A probability distribution can be displayed as a table or as a histogram.
Example: Let’s define the random variable X as the number of games played in
a randomly selected World Series.
As a histogram:
Its probability distribution:
What is a discrete random variable? Give some examples.
• There are two main types of random variables: discrete and continuous. If we can find a
way to list all possible outcomes for a random variable and assign probabilities to each
one, we have a discrete random variable.
A discrete random variable X takes a fixed set of possible values with gaps between.
The probability distribution of a discrete random variable X lists the values xi and their
probabilities pi:
Value:        x1   x2   x3   …
Probability:  p1   p2   p3   …
The probabilities pi must satisfy two requirements:
1. Every probability pi is a number between 0 and 1.
2. The sum of the probabilities is 1.
To find the probability of any event, add the probabilities pi of the particular values xi that
make up the event.
• Even though a discrete random variable takes a fixed set of values, that set
could be infinite (for example, a geometric distribution; more to come on this).
• To illustrate “gaps between,” think about shoe size. Shoe size usually
goes by halves, i.e. …8, 8.5, 9, 9.5, 10… There are gaps between these
values because you cannot get a shoe in size 8.1 or size 8.1356712…
• Compare this to measuring someone’s foot length. There would be
no gaps in measuring foot length because someone’s foot could
measure 8.1356712. Foot length is known as a continuous random
variable.
• Think about "gaps between" as placing the possible values on a number
line: there would be gaps in between the values. For example, the
variable X in the coin-tossing example is a discrete random variable
because there are gaps between the possible values 0, 1, 2, and 3 on a
number line. (Its probabilities also add to 1, as every legitimate
distribution requires.)
• Often, discrete random variables are things you can count.
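The two requirements and the "add the probabilities" rule can be sketched in a few lines of Python. This is an illustration under assumptions, not something from the slides: the helper names `is_legitimate` and `event_probability` are hypothetical, and the example reuses the coin-tossing distribution.

```python
from fractions import Fraction

def is_legitimate(distribution):
    """True if every probability is between 0 and 1 and they sum to 1."""
    probs = list(distribution.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

def event_probability(distribution, event):
    """Add the probabilities of the values x for which event(x) is True."""
    return sum(p for x, p in distribution.items() if event(x))

# Coin-tossing example: X = number of heads in 3 tosses of a fair coin.
coin = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

print(is_legitimate(coin))                        # True
print(event_probability(coin, lambda x: x >= 2))  # 1/2
```

The exact `== 1` comparison works here because the probabilities are `Fraction` objects. With decimal probabilities stored as floats, a tolerance check such as `math.isclose` would be the safer choice.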
How many languages?
Imagine selecting a U.S. high school student at random. Define the random
variable X = number of languages spoken by the randomly selected student.
The table below gives the probability distribution of X, based on a sample of
students from the U.S. Census at School database.
a) Show that the probability distribution for X is legitimate.
All the probabilities are between 0 and 1 and they add to 1, so this is a legitimate
probability distribution.
b) Make a histogram of the probability distribution. Describe what you see.
Shape: skewed right
Center: The median is 1 (more than half
the distribution is 1), but the mean will be
slightly higher due to the skewness.
Spread: The number of languages varies
from 1 to 5, but nearly all of the students
speak just one or two languages.
c) What is the probability that a randomly selected student speaks at least 3 languages?
More than 3?
Roulette: One wager players can make in Roulette is called a “corner bet.” To make this bet, a
player places his chips on the intersection of four numbered squares on the Roulette table. If
one of these numbers comes up on the wheel and the player bet $1, the player gets his $1
back plus $8 more. Otherwise, the casino keeps the original $1 bet. If X = net gain from a
single $1 corner bet, the possible outcomes are x = –1 or x = 8. Here is the probability
distribution of X:
Value:        –1      8
Probability:  34/38   4/38
If a player were to make this $1 bet over and over again, what would be the player’s average gain?
In the long run, the player loses $1 in 34 of every 38 games and gains $8 in 4 of every 38 games.
Imagine a hypothetical 38 bets. The player’s average gain is:
\[ \frac{34(-1) + 4(8)}{38} = \frac{-2}{38} \approx -0.05 \]
If a player were to make $1 corner bets many, many times, the average gain would be about –$0.05 per bet.
In other words, in the long run, the casino keeps about 5 cents of every dollar bet in roulette.
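The "hypothetical 38 bets" arithmetic and the long-run claim can both be checked numerically. The sketch below is an illustration, not part of the original lesson. It assumes a wheel with 38 equally likely pockets, 4 of which win the corner bet. The seed and the number of simulated bets are arbitrary choices.

```python
import random
from fractions import Fraction

# Hypothetical 38 bets: 34 losses of $1 and 4 wins of $8.
total_gain = 34 * (-1) + 4 * 8           # net dollars over the 38 bets
average_gain = Fraction(total_gain, 38)  # exact dollars per bet

print(total_gain)
print(average_gain, round(float(average_gain), 4))

# Long-run check: simulate many $1 corner bets on a 38-pocket wheel.
rng = random.Random(2011)  # arbitrary seed so the run is reproducible
n_bets = 200_000
simulated_total = sum(8 if rng.randrange(38) < 4 else -1 for _ in range(n_bets))
simulated_average = simulated_total / n_bets

print(simulated_average)
```

The simulated average will not equal the exact value, but with this many bets it should land close to it, which is what "in the long run" means here.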
How do you calculate the mean (expected value) of a discrete random
variable? Is the formula on the formula sheet?
• The mean of any discrete random variable is an average of the
possible outcomes, with each outcome weighted by its probability.
Suppose that X is a discrete random variable whose probability distribution is
Value:        x1   x2   x3   …
Probability:  p1   p2   p3   …
To find the mean (expected value) of X, multiply each possible value by its
probability, then add all the products:
\[ \mu_X = E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3 + \cdots = \sum x_i p_i \]
This formula is on the formula sheet under probability.
• This is saying that the mean value of X is equal to the expected value of
X, which is equal to the sum of the X values times their probabilities.
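As a rough illustration of "multiply each value by its probability, then add," here is a short Python sketch applied to the coin-tossing distribution from earlier in this section. The function name `expected_value` is an assumption made for this example.

```python
from fractions import Fraction

def expected_value(distribution):
    """Mean of a discrete random variable: the sum of x_i * p_i."""
    return sum(x * p for x, p in distribution.items())

# Coin-tossing example: X = number of heads in 3 tosses of a fair coin.
coin = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mu = expected_value(coin)
print(mu, float(mu))  # 3/2 1.5
```

The result, 1.5 heads, is not a value X can actually take, which is typical for an expected value.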
Let’s go back to the World Series example, in which X = the number of games
played in a randomly selected World Series.
Let’s find the expected value of X.
How do you interpret the mean (expected value) of a discrete random variable?
Let’s look at the World Series example. The expected value of X is 5.86 games. How do
we interpret this value?
If we were to randomly select World Series over and over, the average number of
games in the selected Series would be about 5.86.
Calculate and interpret the mean of the random variable X in the languages example.
If we were to select many, many U.S. high school students at random,
the average number of languages spoken would be about 1.457.
Does the expected value of a random variable have to equal one of the possible
values of the random variable? Should expected values be rounded?
• No, the expected value of a random variable does not have to equal one of the
possible values of the random variable.
• Expected values should NOT be rounded.
• Expected value is really the mean. Think about finding the mean of some
test scores. It would be perfectly normal to find a mean test score of 70.4,
even though 70.4 (usually) would not be a possible value for a test score.
We also wouldn’t round that mean of 70.4. Expected value is the same thing
as the mean, so we wouldn’t round it either.
How do you calculate the variance and standard deviation of a discrete random variable? Are
these formulas on the formula sheet?
• Since we use the mean as the measure of center for a discrete random variable, we use the
standard deviation as our measure of spread. The definition of the variance of a random
variable is similar to the definition of the variance for a set of quantitative data.
This formula is on the formula sheet.
Suppose that X is a discrete random variable whose probability distribution is
Value:        x1   x2   x3   …
Probability:  p1   p2   p3   …
and that µX is the mean of X. The variance of X is
\[ \mathrm{Var}(X) = \sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + (x_3 - \mu_X)^2 p_3 + \cdots = \sum (x_i - \mu_X)^2 p_i \]
The formula says to take each value of the random variable, subtract the expected value of the
random variable (finding the deviation), square that result, and multiply it by the probability of
that value. Then, you add up those products.
• To get the standard deviation of a random variable, take the square root of the variance.
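The variance and standard deviation steps can be sketched the same way. The example below is illustrative only and uses the coin-tossing distribution, since the languages table is not reproduced in this transcript. The function names are assumptions.

```python
import math
from fractions import Fraction

def mean(distribution):
    """Expected value: the sum of x_i * p_i."""
    return sum(x * p for x, p in distribution.items())

def variance(distribution):
    """Sum of squared deviations from the mean, each weighted by its probability."""
    mu = mean(distribution)
    return sum((x - mu) ** 2 * p for x, p in distribution.items())

def standard_deviation(distribution):
    """Square root of the variance."""
    return math.sqrt(variance(distribution))

# Coin-tossing example: X = number of heads in 3 tosses of a fair coin.
coin = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

print(variance(coin))                      # 3/4
print(round(standard_deviation(coin), 3))  # 0.866
```

So the number of heads typically varies by about 0.866 from the mean of 1.5.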
How do you interpret the standard deviation of a discrete random variable?
• The standard deviation of a random variable X is a measure of how much the
values of the variable typically vary from the expected value.
• In other words, it measures roughly how far the outcomes typically fall
from the mean.
Skip the follow-up question on roulette; go to the languages example.
Use your calculator to calculate and interpret the standard deviation of X in the
languages example.
Reminder: the expected value was 1.457.
Step 1: You must substitute into the formula to show where your calculation is coming from!
Step 2: Enter the values of the random variable in L1 and the corresponding probabilities in L2.
Step 3: Use one-variable statistics with the values in L1 and the FreqList as L2.
The standard deviation is 0.671.
Interpretation: The number of languages spoken by a randomly selected U.S. high school student typically varies
by about 0.671 languages from the mean (1.457).
Are there any dangers to be aware of when using the calculator to find
the mean and standard deviation of a discrete random variable?
You must show some work to get credit. Writing out the first couple of
terms of the formula is fine.
What is a continuous random variable? Give some examples.
• Discrete random variables commonly arise from situations that involve counting
something. Situations that involve measuring something often result in a continuous
random variable.
A continuous random variable X takes on all values in an interval of numbers. The
probability distribution of X is described by a density curve. The probability of any event is the
area under the density curve and above the values of X that make up the event.
• To think of an example, think about foot length as described before. There are no gaps in
between values.
• Other examples:
• the amount of time it takes to run the 110 meter hurdles (continuous) vs the number of
hurdles cleanly jumped over (discrete)
• a student’s age (continuous) vs the number of birthdays they have had (discrete)
• The probability model of a discrete random variable X assigns a probability between 0 and 1
to each possible value of X.
• A continuous random variable Y has infinitely many possible values. All continuous
probability models assign probability 0 to every individual outcome. Only intervals of
values have positive probability.
Is it possible to have a shoe size = 8? Is it possible to have a foot length = 8
inches?
• Yes, it is possible to have a shoe size of 8.
• No, not exactly: a foot length of exactly 8 inches has probability 0. You
can always measure more precisely and find, for example,
8.00000000000000000000001 inches.
How many possible foot lengths are there? How can we graph the distribution
of foot length?
• There are an infinite number of foot lengths.
• To graph, we can create a histogram with an infinite number of really skinny
rectangles whose areas add to 1. This ends up looking like a density curve
(think Chapter 2).
• This is an area where statistics and calculus overlap. You will see this when
you start examining integrals.
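The "skinny rectangles" idea can be tried numerically. The sketch below is an illustration under assumptions: it uses a standard Normal density as a stand-in for the foot-length density, which the slides do not specify, and the function name `rectangle_area_sum` is hypothetical. It adds up rectangle areas (height × width) for narrower and narrower rectangles.

```python
from statistics import NormalDist

# Assumed example density: a standard Normal curve.
density = NormalDist(mu=0, sigma=1)

def rectangle_area_sum(lo, hi, n_rectangles):
    """Total area of n equal-width rectangles under the density on [lo, hi].

    Each rectangle's height is the density at the midpoint of its base.
    """
    width = (hi - lo) / n_rectangles
    return sum(
        density.pdf(lo + (i + 0.5) * width) * width
        for i in range(n_rectangles)
    )

for n in (6, 60, 600):
    print(n, rectangle_area_sum(-6, 6, n))
```

As the rectangles get skinnier, the total area settles very close to 1, the total area under a density curve.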
How do we find probabilities for continuous random variables?
• We find probabilities by examining the area under a density curve.
• For a continuous random variable, P(X < a) and P(X ≤ a) are the same thing!
The boundary line adds no area.
• When thinking about why a continuous probability model assigns
probability 0 to every individual outcome, remember that each
outcome is just one of an infinite number of possible outcomes, so,
informally, the probability is 1/∞, which is 0.
• Another way to think about it is as if you are finding the area of a
rectangle with a width of 0.
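To see the "width of 0" idea numerically, the following sketch shrinks an interval around a single value and watches its probability fall to 0. It is only an illustration: the standard Normal distribution, the point a = 1, and the helper name `interval_probability` are assumptions chosen for the example.

```python
from statistics import NormalDist

# Assumed example distribution: standard Normal.
X = NormalDist(mu=0, sigma=1)

def interval_probability(dist, a, b):
    """P(a <= X <= b): area under the density curve between a and b."""
    return dist.cdf(b) - dist.cdf(a)

a = 1.0
for half_width in (0.1, 0.01, 0.001, 0.0):
    p = interval_probability(X, a - half_width, a + half_width)
    print(f"P({a - half_width} <= X <= {a + half_width}) = {p}")
```

The last line of output is the probability of the single outcome X = 1, which is 0. This is also why including or excluding an endpoint does not change the probability of an interval.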
Weights of Three-Year-Old Females
The weights of three-year-old females closely follow
a Normal distribution with a mean of 𝜇 = 30.7 pounds
and a standard deviation of 𝜎 = 3.6 pounds.
Randomly choose one three-year-old female and call
her weight X.
a) Find the probability that a randomly selected
three-year-old female weighs at least 30 pounds.
Step 1: N(30.7, 3.6)
Step 2:
Step 3: There is about a 57.71% chance that the randomly selected three-year-old female will weigh at least 30 pounds.
b) Find the probability that a randomly selected three-year-old female weighs
between 25 and 35 pounds.
Step 1: N(30.7, 3.6)
Step 2:
Step 3: There is about an 82.72% chance that the randomly selected three-year-old female will weigh between 25 and 35 pounds.
c) If P(X<k) = 0.8, find the value of k.
Step 1: N(30.7, 3.6)
Step 2:
Step 3: The value of k is 33.7298 pounds.
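The three answers above can be cross-checked in software. The sketch below uses Python's standard-library `statistics.NormalDist` as one possible tool. It is not the method shown on the slides, and the variable names are assumptions.

```python
from statistics import NormalDist

# Weight X of a randomly chosen three-year-old female: N(30.7, 3.6).
weight = NormalDist(mu=30.7, sigma=3.6)

# a) P(X >= 30): area to the right of 30.
p_at_least_30 = 1 - weight.cdf(30)

# b) P(25 <= X <= 35): area between 25 and 35.
p_between_25_and_35 = weight.cdf(35) - weight.cdf(25)

# c) The value k with P(X < k) = 0.8 (the 80th percentile).
k = weight.inv_cdf(0.8)

print(round(p_at_least_30, 4))
print(round(p_between_25_and_35, 4))
print(round(k, 4))
```

The printed values should agree with the 57.71%, 82.72%, and 33.7298 pounds reported in Step 3 of each part, up to rounding.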