CPSC 531: Probability & Statistics:
Review II
Instructor: Anirban Mahanti
Office: ICT 745
Email: [email protected]
Class Location: TRB 101
Lectures: TR 15:30 – 16:45 hours
Class web page:
http://pages.cpsc.ucalgary.ca/~mahanti/teaching/F05/CPSC531
Notes derived from “Probability and Statistics” by
M. DeGroot and M. Schervish, Third edition,
Addison Wesley, 2002, and
“Discrete-event System Simulation” by Banks,
Carson, Nelson, and Nicol, Prentice Hall, 2005.
CPSC 531: Probability Review
Objective and Outline
The world the model-builder sees is probabilistic
rather than deterministic.
Some statistical model might well describe the variations.
An appropriate model can be developed by sampling
the phenomenon of interest:
Select a known distribution through educated guesses
Estimate the parameters
Test for goodness of fit
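The three modeling steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic (drawn with `random.expovariate`), the candidate model is an exponential distribution, and the helper names and the choice of 10 equal-probability bins are hypothetical, not taken from the course notes.

```python
import math
import random

def estimate_mean(sample):
    """Step 2: the sample mean estimates the exponential mean."""
    return sum(sample) / len(sample)

def chi_square_statistic(sample, mean, bins=10):
    """Step 3: chi-square statistic over equal-probability bins
    of the fitted exponential distribution."""
    n = len(sample)
    expected = n / bins
    counts = [0] * bins
    for x in sample:
        # The fitted cdf maps x into [0, 1]; equal-width cells in cdf
        # space correspond to equal-probability bins in x space.
        u = 1.0 - math.exp(-x / mean)
        counts[min(int(u * bins), bins - 1)] += 1
    return sum((c - expected) ** 2 / expected for c in counts)

rng = random.Random(531)  # fixed seed so the run is repeatable
# Step 1 (educated guess): treat the data as exponential.
# Hypothetical data with true mean 2; expovariate takes the rate 1 / mean.
sample = [rng.expovariate(1 / 2.0) for _ in range(10_000)]

mean_hat = estimate_mean(sample)
chi2 = chi_square_statistic(sample, mean_hat)
print(f"estimated mean = {mean_hat:.3f}, chi-square = {chi2:.2f}")
```

The statistic would be compared with a chi-square critical value with bins - 2 = 8 degrees of freedom, since one parameter was estimated from the data.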
Goal is to review:
Random variables
Discrete and continuous random variables
Cumulative distribution functions
Expectation, variance, etc.
Random Variables
A random variable is a real-valued mapping defined on a
sample space.
Suppose that X is a random variable defined on a sample space
S; then X assigns a real number X(s) to each possible
outcome s ∈ S.
Typically, X, Y, Z etc denote random variables; x, y, z,
etc denote values attained by random variables.
Example: Rolling a pair of dice. Let X be the random
variable corresponding to the sum of the dice on a roll.
If we think of the sample points as pairs (i, j), where i
= value rolled by the first die and j = value rolled by
the second die, we have:
X(s) = i + j
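As a small sketch of the dice example, the mapping X can be written as a function on the sample space and its distribution tabulated by enumeration. The dice are assumed fair (all 36 outcomes equally likely), which the slide does not state explicitly.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S: all ordered pairs (i, j) of die faces.
sample_space = list(product(range(1, 7), repeat=2))

def X(s):
    """The random variable: maps an outcome s = (i, j) to i + j."""
    i, j = s
    return i + j

# Assumption: fair dice, so each outcome has probability 1/36.
counts = Counter(X(s) for s in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)
```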
Discrete Random Variables
A random variable X is said to be discrete if the
number of possible values of X is finite or countably
infinite (the values can be listed as a sequence x_1, x_2, …).
Example: Consider jobs arriving at a job shop.
• Let X be the number of jobs arriving each week at a job shop.
• S = possible values of X (range space of X) = {0, 1, 2, …}
• $p(x_i)$ = probability that the random variable equals $x_i$ = $P(X = x_i)$
$p(x_i)$, i = 1, 2, …, must satisfy:
1. $p(x_i) \ge 0$, for all i
2. $\sum_{i=1}^{\infty} p(x_i) = 1$
The collection of pairs $[x_i, p(x_i)]$, i = 1, 2, …, is called the
probability distribution of X, and $p(x_i)$ is called the probability
mass function (pmf) of X.
The pmf is referred to as “probability function” in some texts.
Discrete Random Variables
Consider a random variable X that takes on values 1, 2,
3, and 4 with probabilities 1/6, 1/3, 1/3, and 1/6, respectively.
[Figure: plot of the pmf p(x) against x = 1, 2, 3, 4; vertical axis from 0.00 to 0.35]
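The four-point distribution in this example can be checked against the two pmf conditions (non-negative values that sum to 1). The sketch below uses exact fractions; the helper names `is_valid_pmf` and `prob` are hypothetical.

```python
from fractions import Fraction

pmf = {1: Fraction(1, 6), 2: Fraction(1, 3), 3: Fraction(1, 3), 4: Fraction(1, 6)}

def is_valid_pmf(p):
    """Check p(x_i) >= 0 for all i and that the probabilities sum to 1."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1

def prob(p, event):
    """P(X in event) = sum of p(x_i) over the x_i in the event."""
    return sum(v for x, v in p.items() if x in event)

print(is_valid_pmf(pmf))   # True
print(prob(pmf, {2, 3}))   # 2/3
```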
Continuous Random Variables
X is a continuous random variable if there exists a non-negative
function f(x) such that for any set of real numbers A ⊆ S:
$P(X \in A) = \int_A f(x)\,dx$
The probability that X lies in the interval [a, b] is given by:
$P(a \le X \le b) = \int_a^b f(x)\,dx$
f(x), denoted as the pdf of X, satisfies:
1. $f(x) \ge 0$, for all x in S
2. $\int_S f(x)\,dx = 1$
3. $f(x) = 0$, if x is not in S
Properties
1. $P(X = x_0) = 0$, because $\int_{x_0}^{x_0} f(x)\,dx = 0$
2. $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$
Continuous Random Variables
Example: Life of an inspection device is given by X, a
continuous random variable with pdf:
$f(x) = \begin{cases} \frac{1}{2} e^{-x/2}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
X has an exponential distribution with mean 2 years
Probability that the device’s life is between 2 and 3 years is:
$P(2 \le X \le 3) = \frac{1}{2} \int_2^3 e^{-x/2}\,dx \approx 0.14$
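The integral in this example can be checked numerically. The sketch below uses composite Simpson's rule, which is a substitute chosen for illustration (the slide evaluates the integral analytically); the function names are hypothetical.

```python
import math

def f(x):
    """pdf of the device life: (1/2) e^{-x/2} for x >= 0, else 0."""
    return 0.5 * math.exp(-x / 2) if x >= 0 else 0.0

def simpson(g, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        # Interior points alternate weights 4 (odd k) and 2 (even k).
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

p_2_3 = simpson(f, 2, 3)
print(round(p_2_3, 4))  # 0.1447
```

The result agrees with the closed form e^{-1} - e^{-3/2}, which the slide reports as 0.14.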
Cumulative Distribution Function
The cumulative distribution function (cdf) of a random variable X
is a function F(x), defined for each real number x:
$F(x) = P(X \le x)$, for $-\infty < x < \infty$
If X is discrete, then
$F(x) = \sum_{\text{all } x_i \le x} p(x_i)$
If X is continuous, then
$F(x) = \int_{-\infty}^{x} f(t)\,dt$
Properties
1. F is a nondecreasing function. If $a < b$, then $F(a) \le F(b)$
2. $\lim_{x \to \infty} F(x) = 1$
3. $\lim_{x \to -\infty} F(x) = 0$
All probability questions about X can be answered in terms of the
cdf, e.g.:
$P(a < X \le b) = F(b) - F(a)$, for all $a < b$
Cumulative Distribution Function
Example: An inspection device has cdf:
$F(x) = \frac{1}{2} \int_0^x e^{-t/2}\,dt = 1 - e^{-x/2}$
The probability that the device lasts for less than 2 years:
$P(0 \le X \le 2) = F(2) - F(0) = F(2) = 1 - e^{-1} = 0.632$
The probability that it lasts between 2 and 3 years:
$P(2 \le X \le 3) = F(3) - F(2) = (1 - e^{-3/2}) - (1 - e^{-1}) = 0.145$
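The two probabilities in this example follow directly from the cdf. A minimal sketch, with the hypothetical helper `prob_between` standing in for the F(b) - F(a) rule:

```python
import math

def F(x):
    """cdf of the device life: 1 - e^{-x/2} for x >= 0, else 0."""
    return 1.0 - math.exp(-x / 2) if x >= 0 else 0.0

def prob_between(a, b):
    """P(a < X <= b) = F(b) - F(a); for continuous X the endpoints carry no mass."""
    return F(b) - F(a)

print(round(prob_between(0, 2), 3))  # 0.632
print(round(prob_between(2, 3), 3))  # 0.145
```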
Expectation
The expected value of X is denoted by E(X)
If X is discrete
$E(X) = \sum_{\text{all } x} x\,p(x)$
If X is continuous
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
The mean, μ, is the 1st moment of X
A measure of the central tendency
Properties:
E(cX) = cE(X), where c is a constant
E(Y) = aE(X) + b, where Y=aX+b, a & b are constants
E(X + Y) = E(X) + E(Y) regardless of whether X and Y are
independent
E(XY) = E(X)E(Y) if X and Y are independent
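The expectation properties can be verified exactly on a small discrete case. In this sketch X has the four-point pmf used earlier in these notes, and the second variable (a fair 0/1 coin) and the constants a = 3, b = 1 are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

p_x = {1: Fraction(1, 6), 2: Fraction(1, 3), 3: Fraction(1, 3), 4: Fraction(1, 6)}
p_y = {0: Fraction(1, 2), 1: Fraction(1, 2)}  # hypothetical second variable

def expectation(pmf, g=lambda v: v):
    """E[g(V)] = sum of g(v) p(v) over all v."""
    return sum(g(v) * p for v, p in pmf.items())

# Joint pmf under independence: p(x, y) = p_x(x) * p_y(y).
joint = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}

e_x, e_y = expectation(p_x), expectation(p_y)
e_sum = expectation(joint, lambda xy: xy[0] + xy[1])   # E(X + Y)
e_prod = expectation(joint, lambda xy: xy[0] * xy[1])  # E(XY)
e_lin = expectation(p_x, lambda x: 3 * x + 1)          # E(aX + b), a = 3, b = 1

print(e_x, e_y, e_sum, e_prod, e_lin)  # 5/2 1/2 3 5/4 17/2
```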
Variance
The variance of X is denoted by V(X) or
var(X) or $\sigma^2$
Definition:
$V(X) = E[(X - E[X])^2]$
Also,
$V(X) = E(X^2) - [E(X)]^2$
The variance is a measure of the dispersion or
spread of a random variable about its mean
The standard deviation of X is denoted by $\sigma$
Definition: square root of V(X)
Expressed in the same units as the mean
Properties:
$V(cX) = c^2 V(X)$
V(X + Y) = V(X) + V(Y) if X, Y are independent
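A short exact check of the two variance formulas and the scaling property, again using the four-point pmf from the earlier discrete example. The helper names and the constant c = 3 are hypothetical.

```python
from fractions import Fraction

p_x = {1: Fraction(1, 6), 2: Fraction(1, 3), 3: Fraction(1, 3), 4: Fraction(1, 6)}

def mean(pmf):
    """E(X) for a discrete pmf."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = E[(X - E[X])^2], computed directly from the definition."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def variance_shortcut(pmf):
    """V(X) = E(X^2) - [E(X)]^2."""
    return sum(x * x * p for x, p in pmf.items()) - mean(pmf) ** 2

def scale(pmf, c):
    """pmf of cX: same probabilities, values multiplied by c."""
    return {c * x: p for x, p in pmf.items()}

v = variance(p_x)
print(v, variance_shortcut(p_x), variance(scale(p_x, 3)))  # 11/12 11/12 33/4
```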
Small vs. Large Variance
[Figure: density functions for continuous random variables
with large and small variances σ², each centered at the mean µ (Source: LK00, Fig. 4.6)]
Expectations and Variance (example)
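Example: The mean life of the previous inspection device
is:
$E(X) = \frac{1}{2} \int_0^\infty x e^{-x/2}\,dx = \left. -x e^{-x/2} \right|_0^\infty + \int_0^\infty e^{-x/2}\,dx = 2$
To compute the variance of X, we first compute $E(X^2)$:
$E(X^2) = \frac{1}{2} \int_0^\infty x^2 e^{-x/2}\,dx = \left. -x^2 e^{-x/2} \right|_0^\infty + 2 \int_0^\infty x e^{-x/2}\,dx = 8$
Hence, the variance and standard deviation of the device’s
life are:
$V(X) = 8 - 2^2 = 4$
$\sigma = \sqrt{V(X)} = 2$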
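The mean of 2 and variance of 4 derived above can be sanity-checked by simulation. This is a Monte Carlo sketch rather than the analytical method on the slide; the seed and sample size are arbitrary choices.

```python
import random
import statistics

rng = random.Random(2005)  # fixed seed for repeatability
# expovariate takes the rate lambda = 1 / mean, so a mean of 2 years is rate 0.5.
lives = [rng.expovariate(0.5) for _ in range(200_000)]

mean_hat = statistics.fmean(lives)
var_hat = statistics.pvariance(lives, mu=mean_hat)
sd_hat = var_hat ** 0.5

print(f"mean = {mean_hat:.3f}, variance = {var_hat:.3f}, std dev = {sd_hat:.3f}")
```

The estimates should land close to 2, 4, and 2, with small sampling error.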
Joint Distributions
Let X and Y each have a discrete distribution.
Then X and Y have a discrete joint distribution
if there exists a function p(x,y) such that:
p(x,y) = P[X=x and Y=y]
Random variables X and Y are jointly
continuous if there exists a non-negative
function f(x,y) called the joint probability
density function of X and Y, such that for all
sets of real numbers A and B
$P(X \in A, Y \in B) = \int_B \int_A f(x, y)\,dx\,dy$
Covariance
The covariance between the random variables
X and Y, denoted by Cov(X, Y), is defined by
Cov(X, Y) = E{[X - E(X)][Y - E(Y)]}
= E(XY) - E(X)E(Y)
The covariance is a measure of the dependence
between X and Y. Note that Cov(X, X) = V(X).
Covariance
Cov(X, Y) = 0: X and Y are uncorrelated
Cov(X, Y) > 0: X and Y are positively correlated
Cov(X, Y) < 0: X and Y are negatively correlated
Independent random variables are also
uncorrelated.
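The covariance formula Cov(X, Y) = E(XY) - E(X)E(Y) can be computed exactly from a joint pmf. The sketch below uses a hypothetical example (X uniform on {-1, 0, 1}, Y = X²) to show that zero covariance does not by itself imply independence, and also checks Cov(X, X) = V(X).

```python
from fractions import Fraction

def cov(joint):
    """Cov(X, Y) = E(XY) - E(X)E(Y) for a joint pmf {(x, y): p}."""
    e_x = sum(x * p for (x, _), p in joint.items())
    e_y = sum(y * p for (_, y), p in joint.items())
    e_xy = sum(x * y * p for (x, y), p in joint.items())
    return e_xy - e_x * e_y

third = Fraction(1, 3)
# Hypothetical example: X uniform on {-1, 0, 1} and Y = X^2, so Y depends on X.
dependent = {(x, x * x): third for x in (-1, 0, 1)}
# Cov(X, X): all probability mass sits on the diagonal (x, x).
diagonal = {(x, x): third for x in (-1, 0, 1)}

print(cov(dependent))  # 0
print(cov(diagonal))   # 2/3
```

Here Y is a function of X, yet the covariance is 0, so uncorrelated variables need not be independent.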
Statistical Models
Application areas where statistical models find
widespread use:
Queueing systems
Inventory and supply-chain systems
Reliability and maintainability
Limited data
Queueing Systems
In a queueing system, interarrival and service-time
patterns can be probabilistic (e.g., our M/M/1 example).
Sample statistical models for interarrival or service
time distribution:
Exponential distribution: if service times are completely
random
Normal distribution: fairly constant but with some random
variability (either positive or negative)
Truncated normal distribution: similar to normal distribution
but with values restricted to a limited range.
Gamma and Weibull distributions: more general than
exponential (they differ in the location of the modes of the
pdfs and the shapes of the tails).
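The candidate service-time models listed above can all be sampled with Python's standard `random` module. In this sketch every parameter value is hypothetical, chosen so that each model has a mean near 2, and the truncated normal is drawn by simple rejection sampling, which is one possible approach and not necessarily the one used in the course.

```python
import random

rng = random.Random(42)  # fixed seed for repeatability

def truncated_normal(mu, sigma, low, high):
    """Rejection sampling: redraw until the normal variate falls in [low, high]."""
    while True:
        x = rng.normalvariate(mu, sigma)
        if low <= x <= high:
            return x

# Hypothetical parameters; each model has a mean service time near 2.
samplers = {
    "exponential": lambda: rng.expovariate(0.5),               # rate = 1 / mean
    "normal": lambda: rng.normalvariate(2.0, 0.3),             # mean, std dev
    "truncated normal": lambda: truncated_normal(2.0, 0.3, 1.5, 2.5),
    "gamma": lambda: rng.gammavariate(4.0, 0.5),               # shape, scale
    "weibull": lambda: rng.weibullvariate(2.25, 2.0),          # scale, shape
}

for name, draw in samplers.items():
    xs = [draw() for _ in range(10_000)]
    print(f"{name:17s} mean = {sum(xs) / len(xs):.3f}  min = {min(xs):.3f}  max = {max(xs):.3f}")
```

Comparing the minimum and maximum across models shows how the exponential spreads far more widely around the same mean than the normal-based models.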
Inventory and supply chain
In realistic inventory and supply-chain systems, there
are at least three random variables:
The number of units demanded per order or per time period
The time between demands
The lead time
Sample statistical models for lead time distribution:
Gamma
Sample statistical models for demand distribution:
Poisson: simple and extensively tabulated.
Negative binomial distribution: longer tail than Poisson (more
large demands).
Geometric: special case of negative binomial given at least one
demand has occurred.
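The relation between the geometric and negative binomial distributions can be checked numerically. The sketch assumes the "number of trials until the k-th success" parameterization, which the slide does not spell out; under that assumption the geometric pmf is the negative binomial pmf with k = 1.

```python
from math import comb, isclose

def negative_binomial_pmf(y, k, p):
    """P(the k-th success occurs on trial y), for y = k, k + 1, ..."""
    return comb(y - 1, k - 1) * (1 - p) ** (y - k) * p ** k

def geometric_pmf(x, p):
    """P(the first success occurs on trial x), for x = 1, 2, ..."""
    return (1 - p) ** (x - 1) * p

p = 0.3  # hypothetical success probability
for x in range(1, 6):
    print(x, round(geometric_pmf(x, p), 4), round(negative_binomial_pmf(x, 1, p), 4))
```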
Reliability and maintainability
Time to failure (TTF)
Exponential: failures are random
Gamma: for standby redundancy where each
component has an exponential TTF
Weibull: failure is due to the most serious of a large
number of defects in a system of components
Normal: failures are due to wear
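The gamma model for standby redundancy can be illustrated by simulation: if each component has an exponential TTF and the next component takes over when the previous one fails, the system TTF is a sum of exponentials. The numbers below (3 components, mean TTF of 2 per component) are hypothetical.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed for repeatability
K, MEAN_TTF = 3, 2.0    # hypothetical: 3 components, each with mean TTF 2

def system_ttf():
    """Standby redundancy: the system TTF is the sum of the component TTFs."""
    return sum(rng.expovariate(1 / MEAN_TTF) for _ in range(K))

ttfs = [system_ttf() for _ in range(100_000)]
sim_mean = statistics.fmean(ttfs)
sim_var = statistics.pvariance(ttfs, mu=sim_mean)

# A gamma distribution with shape K and scale MEAN_TTF has
# mean K * MEAN_TTF and variance K * MEAN_TTF ** 2.
print(f"simulated mean = {sim_mean:.3f} (gamma: {K * MEAN_TTF})")
print(f"simulated var  = {sim_var:.3f} (gamma: {K * MEAN_TTF ** 2})")
```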
Our next stop
Discrete distributions, such as:
Bernoulli trials and Bernoulli distribution
Binomial distribution
Geometric and negative binomial distribution
Poisson distribution
Continuous distributions, such as:
Uniform
Exponential
Normal
Weibull
Lognormal