Quantitative Techniques * Class I

Quantitative Techniques – Class II
Quantifying Randomness – Probability and Probability Distributions
Probability - Definitions
• Sample Space – Like a population, the entire set of possible outcomes
• Event – A subset of the sample space; the outcome (or set of outcomes) actually realized
• Union – The event that at least one of several events occurs
• Intersection – The event that both events occur
• Complement – Everything in the sample space that is not in the event
• Mutual Exclusivity – If one event occurs, then the other cannot
• Independence – When the events are not related to each other; that is,
the probability of one does not affect the probability of the other
• Permutations – The number of ways to arrange some objects, where order matters
• Combinations – The number of ways to select some objects, where order does not matter
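The difference between permutations and combinations can be checked with a small sketch. The values n = 5 and k = 3 are illustrative assumptions, not taken from the slides; `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math

n, k = 5, 3  # hypothetical example: pick 3 objects out of 5

# Ordered arrangements of k objects out of n: n! / (n - k)!
permutations = math.perm(n, k)

# Unordered selections of k objects out of n: n! / (k! (n - k)!)
combinations = math.comb(n, k)

print(permutations)  # 60
print(combinations)  # 10

# Each unordered selection can be ordered in k! ways.
assert permutations == combinations * math.factorial(k)
```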
Probability
• Quantifying randomness
• The context: An “experiment” that admits several
possible outcomes
– Some outcome will occur
– The observer is uncertain which (or what) before
the experiment takes place
• Event space = the set of possible outcomes. (Also
called the “sample space.”)
• Probability = a measure of “likelihood” attached to
the events in the event space. (Try to define
probability without using a word that means
probability.)
Rules (Axioms) of Probability
• An “event” E will occur or not occur
• P(E) is a number that equals the probability that E will
occur.
• By convention, 0 ≤ P(E) ≤ 1.
• E' = the event that E does not occur
• P(E') = the probability that E does not occur.
Essential Results for Probability
• If P(E) = 0, then E cannot (will not) occur
• If P(E) = 1, then E must (will) occur
• E and E' are exhaustive – either E or E' will occur.
• Something will occur: P(E) + P(E') = 1
• Only one thing can occur. If E occurs, then E' will not occur – E
and E' are exclusive.
Joint Events
• Pairs (or groups) of events: A and B
– One or the other occurs: A or B ≡ A ∪ B
– Both events occur: A and B ≡ A ∩ B
• Independent events: Occurrence of A does not affect
the probability of B
• An addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
• The product rule for independent events:
P(A ∩ B) = P(A)P(B)
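As a rough check of the addition and product rules, the sketch below enumerates the outcomes of one roll of a fair die. The die and the events A ("even") and B ("greater than 4") are assumptions chosen for illustration; they are not from the slides.

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # the roll is even
B = {5, 6}     # the roll is greater than 4

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Product rule: these two events happen to be independent.
assert prob(A & B) == prob(A) * prob(B)

print(prob(A | B))  # 2/3
print(prob(A & B))  # 1/6
```

Using exact fractions avoids floating-point rounding, so both rules can be tested with strict equality.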
Using Conditional Probabilities: Bayes Theorem
Bayes’ Theorem gives the probability of an event A given the result B of a test.
It is therefore very important for interpreting survey results or test results of any kind.
The Theorem: P(A|B) = P(B|A) × P(A) / P(B)
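A minimal sketch of the theorem applied to a test result follows. All three input numbers (1% prevalence, 95% true-positive rate, 5% false-positive rate) are made up for illustration. P(B) is obtained from the law of total probability, which the slide does not state.

```python
# Hypothetical inputs, not from the slides.
p_a = 0.01              # P(A): prior probability that the condition is present
p_b_given_a = 0.95      # P(B|A): test is positive when the condition is present
p_b_given_not_a = 0.05  # P(B|A'): test is positive when the condition is absent

# Law of total probability: P(B) = P(B|A) P(A) + P(B|A') P(A')
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' Theorem: P(A|B) = P(B|A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(f"P(B)   = {p_b:.4f}")          # 0.0590
print(f"P(A|B) = {p_a_given_b:.4f}")  # 0.1610
```

With these assumed numbers, a positive test raises the probability of A from 1% to only about 16%, because most positives come from the much larger group without the condition.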
Random Variable
• Definition: A variable that will take a value assigned to it
by the outcome of a random experiment.
• Realization of a random variable: The outcome of the
experiment after it occurs. The value that is assigned to
the random variable is the realization.
X = the variable, x = the realization
• Use random variables to organize the information about a
random occurrence.
• Can be continuous or discrete
Probability Distribution
• Range of the random variable = the set of values it
can take
– Discrete: A set of integers. May be finite
or infinite
– Continuous: A range of values
• Probability distribution: Probabilities associated
with values in the range.
Binary Random Variable
• Like a coin toss – or any event that has
only 2 alternatives
• Event occurs → X = 1
• Event does not occur → X = 0
• Probabilities: P(X = 1) = θ, P(X = 0) = 1 - θ
Bernoulli Random Variable
• X = 0 or 1
• Probabilities: P(X = 1) = θ, P(X = 0) = 1 – θ
• (X = 0 or 1 corresponds to an
event)
Probability Function
This is called a Probability Density Function (PDF); for a discrete variable it is also called a probability mass function.
• Define the probabilities as a function of X
• Bernoulli random variable
– Probabilities: P(X = 1) = θ, P(X = 0) = 1 – θ
• Function: P(X = x) = θ^x (1 – θ)^(1–x), x = 0, 1
Mean and Variance
• E[X] = 0(1 – θ) + 1(θ) = θ
• Variance = [0²(1 – θ) + 1²θ] – θ² = θ(1 – θ)
• Application: If X is the number of male
children in a family with 1 child, what
is E[X]? With θ = 0.5, E[X] = 0.5: the expected
number of male children in families
with one child.
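The mean and variance can be recomputed directly from the Bernoulli probability function. This is a small sketch using θ = 0.5 from the application; the function name `bernoulli_pmf` is an illustrative choice.

```python
def bernoulli_pmf(x, theta):
    """P(X = x) = theta^x * (1 - theta)^(1 - x) for x in {0, 1}."""
    return theta**x * (1 - theta) ** (1 - x)

theta = 0.5  # probability that the single child is male

# E[X] = sum over x of x * P(X = x)
mean = sum(x * bernoulli_pmf(x, theta) for x in (0, 1))

# Var[X] = E[X^2] - (E[X])^2
variance = sum(x**2 * bernoulli_pmf(x, theta) for x in (0, 1)) - mean**2

print(mean)      # 0.5, equal to theta
print(variance)  # 0.25, equal to theta * (1 - theta)
```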
Models
• Settings in which the probabilities can
only be approximated
• Models “describe” reality but don’t match
it exactly
– Assumptions are descriptive
– Outcomes are not limited to a finite range
Poisson Model
The Poisson distribution is a model that fits
situations such as
Number of accidents in a location
Number of people with a disease in a
population
P[X = k] = e^(-λ) λ^k / k!, k = 0, 1, 2, ... (not limited)
e is the base of the natural logarithms, approximately equal to 2.7183.
e^(something) is often written as the exponential function, exp(something)
Poisson Variable
• X is the random variable
• λ is the mean of X
• √λ is the standard deviation
[Figure: "Poisson Probabilities with Lambda = 4", showing P[X = x] for a Poisson variable with λ = 4, for x from 0 to 16.]
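The claim that a Poisson variable has mean λ and standard deviation √λ can be checked numerically for λ = 4. This sketch truncates the infinite sum at k = 59, which is an assumption; the remaining tail is negligible for this λ.

```python
import math

def poisson_pmf(k, lam):
    """P[X = k] = exp(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 4.0
support = range(60)  # truncation point chosen for illustration

total = sum(poisson_pmf(k, lam) for k in support)
mean = sum(k * poisson_pmf(k, lam) for k in support)
variance = sum(k**2 * poisson_pmf(k, lam) for k in support) - mean**2
std_dev = math.sqrt(variance)

print(f"total probability = {total:.6f}")    # 1.000000
print(f"mean              = {mean:.6f}")     # 4.000000
print(f"std deviation     = {std_dev:.6f}")  # 2.000000
```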
Application
• The arrival rate of customers at a pizza
store is 3.2 people per hour.
• What is the probability of exactly 5
customers walking in during a particular
hour? And of at most 5 people walking in
(less than or equal to 5)?
• We can create a table like this in Excel
very easily (will learn in Analytics)
• Probability of exactly 0 customers
walking in is 0.0408 (or a 4% chance)
– Of exactly 1 customer = 0.1304 (13%)
– Of exactly 2 customers = 0.2087 (20.8%)
– Of at most 2 customers = 4% + 13% + 21% =
38% and so on…
----------------------------------------------
Probability = Exp(-3.2) × 3.2^customers / customers!
----------------------------------------------
Customers    Probability
0            0.0407622
1            0.130439
2            0.208702
3            0.222616
4            0.178093
5            0.113979
6            0.060789
7            0.0277893
8            0.0111157
9            0.00395225
10           0.00126472
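The table can also be reproduced outside a spreadsheet. The sketch below uses the arrival rate of 3.2 customers per hour and answers the question for exactly 5 customers and for 5 or fewer. The probability of 5 or more is included as an extra comparison that the slides do not compute.

```python
import math

ARRIVAL_RATE = 3.2  # customers per hour, as given in the application

def poisson_pmf(k, lam):
    """Probability = exp(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Same rows as the table: 0 through 10 customers.
table = {k: poisson_pmf(k, ARRIVAL_RATE) for k in range(11)}

print("Customers  Probability")
for customers, probability in table.items():
    print(f"{customers:>9}  {probability:.6f}")

p_exactly_5 = table[5]
p_at_most_5 = sum(table[k] for k in range(6))       # P(X <= 5)
p_at_least_5 = 1 - sum(table[k] for k in range(5))  # P(X >= 5) = 1 - P(X <= 4)

print(f"P(X = 5)  = {p_exactly_5:.3f}")   # 0.114
print(f"P(X <= 5) = {p_at_most_5:.3f}")   # 0.895
print(f"P(X >= 5) = {p_at_least_5:.3f}")  # 0.219
```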
The Normal Distribution
• The most useful distribution in all branches of
statistics and econometrics.
• Strikingly accurate model for elements of human
behavior and interaction
• Strikingly accurate model for any random outcome
that comes about as a sum of small influences.
Applications
• Biological measurements of all sorts (not just human mental
and physical)
• Accumulated errors in experiments
• Numbers of events accumulated in time
– Amount of rainfall per interval
– Number of stock orders per (longer) interval. (We used the
Poisson for short intervals)
– Economic aggregates of small terms.
• And on and on…..
The Empirical Rule and the Normal Distribution
For the normal distribution, values less than one standard deviation from the mean
(dark blue in the figure) account for about 68% of the set; values within two standard
deviations (medium and dark blue) account for about 95%; and values within three
standard deviations (light, medium, and dark blue) account for about 99.7%.
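The 68%, 95%, and 99.7% figures can be recovered from the standard normal distribution. This sketch relies on the identity P(|Z| < k) = erf(k / √2), a standard result not stated on the slide.

```python
import math

def within_k_sd(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within_k_sd(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```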
The Logistic Distribution
• Used when the distribution has a fat tail (high kurtosis)
• Many instances in real-world data, including
– marketing surveys
– financial markets data
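One way to see the fat tail is to compare how much probability the logistic and the normal distribution place more than three standard deviations from the mean. The sketch below is an illustration rather than part of the slides. It assumes a logistic distribution centered at 0 and scaled to have standard deviation 1, using the fact that a logistic with scale s has variance s²π²/3.

```python
import math

def logistic_cdf(x, scale):
    """CDF of a logistic distribution centered at 0."""
    return 1.0 / (1.0 + math.exp(-x / scale))

def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

# Scale that gives the logistic a standard deviation of exactly 1.
unit_sd_scale = math.sqrt(3) / math.pi

# Two-sided probability of landing more than 3 standard deviations from the mean.
logistic_tail = 2 * (1 - logistic_cdf(3, unit_sd_scale))
normal_tail = 2 * (1 - normal_cdf(3))

print(f"logistic: {logistic_tail:.4f}")  # 0.0086
print(f"normal:   {normal_tail:.4f}")    # 0.0027
```

Under these assumptions the logistic puts roughly three times as much probability beyond three standard deviations as the normal does.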