Random Variables - Arizona State University


Probability Distributions
Random Variables: Finite and Continuous
A review
MAT174, Spring 2004
Finite Random Variables
We want to associate probabilities with the
values that the random variable takes on.
There are two types of functions that allow us to do this:
– Probability Mass Functions (p.m.f.)
– Cumulative Distribution Functions (c.d.f.)
Probability Distributions


The pattern of probabilities for a random
variable is called its probability distribution.
In the case of a finite random variable we call
this the probability mass function (p.m.f.),
$f_X(x)$, where $f_X(x) = P(X = x)$
Thus, $0 \le f_X(x) \le 1$ for any value of x and
$\sum_{i=1}^{n} P(X = x_i) = \sum_{\text{all } x} f_X(x) = 1$
Probability Mass Function



This is a p.m.f which is a
histogram representing the
probabilities
The bars are centered above
the values of the random
variable
The heights of the bars are
equal to the corresponding
probabilities (when the width
of your rectangles is 1)
[Figure: histogram of a p.m.f.; vertical axis P(X=x) from 0 to 0.5, bars above x = 0, 1, 2]
Cumulative Distribution Function




The same probability information is often given
in a different form, called the cumulative
distribution function (c.d.f) or FX
FX(x) = P(X ≤ x)
0 ≤ FX(x) ≤ 1, for all x
In the finite case, the graph of a c.d.f. should
look like a step function, where the maximum is
1 and the minimum is 0.
Cumulative Distribution Function
[Figure: Cumulative Distribution Function; vertical axis F_X(x) from 0.0 to 1.0, horizontal axis x from 0 to 14]
Binomial Random Variable

Let X stand for the number of successes in n Bernoulli Trials
where X is called a Binomial Random Variable

Binomial Setting:
1. You have n repeated trials of an experiment
2. On a single trial, there are only two possible outcomes
3. The probability of success is the same from trial to trial
4. The outcome of each trial is independent
Expected Value of a Binomial R.V is represented by
E(X)=n*p
BINOMDIST


BINOMDIST is a built-in Excel function that
gives values for the p.m.f and c.d.f of any
binomial random variable
It is located under Statistical in the Function
menu
– BINOMDIST(x, n, p, false) = P(X=x)
– BINOMDIST(x, n, p, true) = P(X ≤ x)
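The two BINOMDIST modes can be mirrored outside Excel. The following is a minimal Python sketch, not Excel's implementation: the function names `binom_pmf` and `binom_cdf` and the values n = 10, p = 0.5 are made up for illustration.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial X with n trials and success probability p."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x): the running sum of the p.m.f. from 0 up to x (x an integer)."""
    return sum(binom_pmf(k, n, p) for k in range(0, min(x, n) + 1))

# Hypothetical example: n = 10 trials, p = 0.5
print(binom_pmf(3, 10, 0.5))   # 120/1024 = 0.1171875
print(binom_cdf(3, 10, 0.5))   # 176/1024 = 0.171875
```

Here `binom_pmf` plays the role of the `false` mode and `binom_cdf` the role of the `true` mode.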
Expected Value
$E(X) = \sum_{\text{all } x} x \cdot f_X(x)$
This is the average value of X (what happens on average in infinitely many repeated trials of the underlying experiment)
– It is denoted by $\mu_X$
For a Binomial Random Variable, E(X)=n*p, where n is the number of independent trials and p is the probability of success
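As a rough check of the binomial shortcut, the sum of x times the p.m.f. can be computed directly and compared with n*p. This is a sketch only: the helper name `expected_value` and the values n = 4, p = 0.25 are assumptions chosen for illustration.

```python
from math import comb

def expected_value(pmf):
    """E(X): sum of x * P(X = x) over all x; pmf maps each value x to its probability."""
    return sum(x * prob for x, prob in pmf.items())

# Hypothetical binomial example: n = 4 trials, p = 0.25
n, p = 4, 0.25
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

total = sum(pmf.values())      # a valid p.m.f. sums to 1
mean = expected_value(pmf)     # should agree with n * p = 1.0
print(total, mean)
```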
Continuous Random Variable



Continuous random variables take on values in an
interval; you cannot list all the possible values
Examples:
1. Let X be a randomly selected number between 0
and 1
2. Let R be a future value of a weekly ratio of closing
prices for IBM stock
3. Let W be the exact weight of a randomly selected
student
For a continuous random variable, only probabilities for intervals of values of X are useful, since P(X=x) = 0 for every single value x;
however, we can still look at its c.d.f., FX(x).
Probability Density Function (p.d.f)

Represented by fX(x)
– fX(x) is the height of the graph of fX at an input of x
– This function does not give probabilities
For any continuous random variable, X, P(X=a)=0 for every number a.
Look at probabilities associated with X taking on an interval of values
– P(a ≤ X ≤ b)
Probability Density Function (p.d.f)


To find P(a ≤ X ≤ b), we need to look at the
portion of the graph that corresponds to this
interval.
How can we relate this to integration?
[Figure: Probability Density Function; graph of f_X with the area A under the curve between a and b]
$A = F_X(b) - F_X(a) = P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$
Cumulative Distribution Function

CDF:
– FX(x)=P(X ≤ x)
– 0 ≤ FX(x) ≤ 1, for all x
NOTE: Regardless of whether the random variable is finite or continuous, the c.d.f., FX, has the same interpretation
– I.e., FX(x)=P(X ≤ x)
Cumulative Distribution Function


For the finite case, our c.d.f graph was a step
function
For the continuous case, our c.d.f. graph will
be a continuous graph
[Figure: Cumulative Distribution Function; vertical axis F_T(t) from 0.0 to 1.2, horizontal axis t from -1 to 3]
Fundamental Theorem of Calculus
(FTC)

Given that $G(x) = \int g(x)\,dx$
– Differentiate both sides and what happens?
Well, from the previous slide we can see that $F_X(x) = \int f_X(x)\,dx$
– If we differentiate both sides, we get that $F_X'(x) = f_X(x)$
– What does this say?
– How can we verify this claim?
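One informal way to check the claim is to compare the slope of a c.d.f. with the height of its p.d.f. at a few points. The sketch below uses a made-up density, f(x) = 2x on [0, 1] with c.d.f. F(x) = x^2, which is not from the course files; `slope` is a hypothetical helper.

```python
def F(x):
    """c.d.f. of the hypothetical density f(x) = 2x on [0, 1]."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x * x

def f(x):
    """The matching p.d.f."""
    return 2 * x if 0 <= x <= 1 else 0.0

def slope(func, x, h=1e-6):
    """Central-difference approximation to the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.2, 0.5, 0.8):
    # The last two columns should nearly agree.
    print(x, slope(F, x), f(x))
```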
Example 7 from Course Files

Define the following function:
$f_X(x) = \begin{cases} -7.5x^4 + 30x^3 - 37.5x^2 + 15x & \text{if } 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$
– What are the possible values of X?
– Set up an integral that would give you the following probabilities:
    P(X < 0.5)
    P(X > 0.6)
    P(0.1 ≤ X ≤ 0.9)
    P(0.1 ≤ X ≤ 5)
– Verify that the function is a density function
– What is E(X)?
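The questions in this example can be explored numerically. The sketch below assumes the density is -7.5x^4 + 30x^3 - 37.5x^2 + 15x on [0, 1]; these signs are an assumption, chosen because they are the only combination of these coefficients that integrates to 1. The helper `integrate` is a plain midpoint rule written for illustration, not a course tool.

```python
def f_X(x):
    """Density from Example 7 (the signs of the terms are an assumption)."""
    if 0 <= x <= 1:
        return -7.5 * x**4 + 30 * x**3 - 37.5 * x**2 + 15 * x
    return 0.0

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation to the integral of g from a to b."""
    width = (b - a) / n
    return width * sum(g(a + (i + 0.5) * width) for i in range(n))

total = integrate(f_X, 0, 1)                    # density check: should be close to 1
p_below_half = integrate(f_X, 0, 0.5)           # P(X < 0.5)
p_above_06 = integrate(f_X, 0.6, 1)             # P(X > 0.6)
p_middle = integrate(f_X, 0.1, 0.9)             # P(0.1 <= X <= 0.9)
p_wide = integrate(f_X, 0.1, 1)                 # P(0.1 <= X <= 5): the density is 0 past 1
mean = integrate(lambda x: x * f_X(x), 0, 1)    # E(X)

print(total, p_below_half, p_above_06, p_middle, p_wide, mean)
```

The upper limit 5 can be cut back to 1 because the density is zero outside [0, 1], so the integrals for P(0.1 ≤ X ≤ 5) and P(0.1 ≤ X ≤ 1) are equal.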
Expected Value



For a finite random variable, we summed over all
possible values of x
For a continuous random variable, we want to integrate
over all possible values of x

This implies that $E(X) = \mu_X = \int_{-\infty}^{\infty} x \cdot f_X(x)\,dx$


Example 8 from the Course Files

0
Let T be the amount of

time between
fT (t )   1  t
consecutive computer
e 16.8

16.8
crashes and has the
following p.d.f. and c.d.f.
0
FT (t )  
t

– What type of r.v. is T?
1  e 16.8
–
–
Calculate P(1 < T < 5) in
two different ways.
What is E(X)?
if t  0
if t  0
if t  0
if t  0
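The "two different ways" can be compared side by side: once from the c.d.f. and once by integrating the p.d.f. This sketch assumes the p.d.f. is (1/16.8)e^(-t/16.8) and the c.d.f. is 1 - e^(-t/16.8) for t ≥ 0, with both equal to 0 for t < 0. The names `SCALE` and `integrate` are hypothetical.

```python
from math import exp

SCALE = 16.8  # the constant that appears in Example 8

def f_T(t):
    """Assumed p.d.f. of T."""
    return exp(-t / SCALE) / SCALE if t >= 0 else 0.0

def F_T(t):
    """Assumed c.d.f. of T."""
    return 1 - exp(-t / SCALE) if t >= 0 else 0.0

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation to the integral of g from a to b."""
    width = (b - a) / n
    return width * sum(g(a + (i + 0.5) * width) for i in range(n))

way_1 = F_T(5) - F_T(1)          # difference of c.d.f. values
way_2 = integrate(f_T, 1, 5)     # area under the p.d.f. between 1 and 5
print(way_1, way_2)              # the two numbers should nearly agree
```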
Exponential Distribution


Exponential random variables usually describe the waiting time between
consecutive events.
In general, the p.d.f. and c.d.f. for an exponential random variable X are given as follows:
$f_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{\alpha}\, e^{-x/\alpha} & \text{if } 0 \le x \end{cases}$
$F_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 - e^{-x/\alpha} & \text{if } 0 \le x \end{cases}$
Any EXPONENTIAL random variable X, with parameter $\alpha$, has $E(X) = \alpha$

How can we verify this?
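One way to verify it is integration by parts. The sketch below writes the parameter as $\alpha$ (an assumed symbol) and assumes the density $\frac{1}{\alpha} e^{-x/\alpha}$ for $x \ge 0$ and 0 otherwise.

```latex
\begin{aligned}
E(X) &= \int_{0}^{\infty} x \cdot \frac{1}{\alpha}\, e^{-x/\alpha}\, dx \\
     &= \Bigl[ -x\, e^{-x/\alpha} \Bigr]_{0}^{\infty} + \int_{0}^{\infty} e^{-x/\alpha}\, dx \\
     &= 0 + \Bigl[ -\alpha\, e^{-x/\alpha} \Bigr]_{0}^{\infty} \\
     &= \alpha
\end{aligned}
```

The boundary term vanishes because $x\, e^{-x/\alpha} \to 0$ as $x \to \infty$ and equals 0 at $x = 0$.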
Continuous R.V. with exponential
distribution
[Figure: two panels. Left: Probability Density Function, f_X(x) against x from -3 to 15. Right: Cumulative Distribution Function, F_X(x) against x from -3 to 15.]
• How can we verify that the graph on the left is the graph of a p.d.f.?
Uniform Distribution



If the probability that X falls in a subinterval is the same for all subintervals of equal length within an interval [0,u], then we have a continuous uniform random variable
X is equally likely to assume any value in [0,u]
If X is uniform on the interval [0,u], then we have the
following formulas:
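$f_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{u} & \text{if } 0 \le x \le u \\ 0 & \text{if } u < x \end{cases}$
$F_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{x}{u} & \text{if } 0 \le x \le u \\ 1 & \text{if } u < x \end{cases}$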
Continuous R.V. with uniform
distribution
• In general, if X is a continuous random variable with a UNIFORM distribution on [0,u], then $E(X) = \frac{u}{2}$
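The value u/2 follows from the integral definition of expected value. This is a short worked derivation, assuming the density is $\frac{1}{u}$ on $[0, u]$ and 0 elsewhere.

```latex
E(X) = \int_{0}^{u} x \cdot \frac{1}{u}\, dx
     = \left[ \frac{x^{2}}{2u} \right]_{0}^{u}
     = \frac{u^{2}}{2u}
     = \frac{u}{2}
```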
[Figure: two panels. Probability Density Function: f_X(x) against x from -100 to 800, vertical axis 0.0000 to 0.0016. Cumulative Distribution Function: F_X(x) against x from -100 to 800, vertical axis 0.0 to 1.0.]
Focus on the Project

Look at the file Auction Focus.xls in the course files
– This file contains 22 prior leases
– Looking at each prior lease, we see that if each company bid their signal, every company that won the auction would have lost money
– We want to devise a new bidding strategy using this data
Use data to simulate thousands of similar auctions
Identify Random Variables

We need random variables
– Let V be the continuous random variable that gives the fair profit value, in millions of dollars, for an oil lease similar to the 22 tracts
– Each signal is an observation of the continuous random variable S_v, where v is the actual fair value of the tract
– Look through Auction Focus.xls to see the statistics for the sample
It is assumed that E(S_v) = v for every lease
R_v gives the error in a company’s signal
– Given by the signal minus the actual fair profit value of the lease
– E(R_v) = 0 for every value of v
What should you do?

From slide 65 in MBD 2 Proj 2.ppt:
1. Start an Excel file which incorporates the historical data on the lease values and your team’s particular set of signals
2. Use these to compute the complete sample of signal errors, and then analyze this sample. Specifically, you should compute the maximum, minimum, and sample mean of the errors. You should also plot a histogram that approximates the actual p.d.f., f_R, of R
– Go to slide 50 to see information about relative frequencies
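The project itself is done in Excel, but the error analysis can be sketched in a few lines of Python to show what is being computed. All numbers below are made up for illustration and are NOT taken from Auction Focus.xls. The helper `relative_frequencies` and its bin settings are assumptions as well.

```python
# Made-up illustration data, NOT taken from Auction Focus.xls.
fair_values = [10.0, 12.5, 8.0, 15.0]   # hypothetical fair values v, in $ millions
signals = [11.2, 11.0, 9.5, 13.1]       # hypothetical signals for the same leases

# Signal error = signal minus actual fair value of the lease.
errors = [s - v for s, v in zip(signals, fair_values)]

largest = max(errors)
smallest = min(errors)
sample_mean = sum(errors) / len(errors)

def relative_frequencies(data, low, high, bins):
    """Fraction of data in each of `bins` equal-width bins on [low, high].

    Assumes every data value lies between low and high.
    """
    width = (high - low) / bins
    counts = [0] * bins
    for value in data:
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return [count / len(data) for count in counts]

freqs = relative_frequencies(errors, low=-2.0, high=2.0, bins=4)
print(largest, smallest, sample_mean, freqs)
```

Dividing each relative frequency by the bin width gives bar heights whose total area is 1, which is the sense in which the histogram approximates a p.d.f.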