Transcript Document

Review of Probability & Statistics
• Living with (Im)probability
• Probability Theory versus Calculus
• Permutations & Combinations
• Independence and Conditional probabilities
• Random variables
• Probability mass/density functions
• Expectation/Mean
• Variance/Standard deviation
• Probability distributions
Lecture 2, CS567
(Im)Probability a blessing in disguise
• Imagine a deterministic world
– Everything pre-ordained; “Free-will” a myth
– Everything is known about the past, present and future
• Uncertainty = Spice of Life
– “What would you like to be when you grow up?”
– Chances of winning an Olympic medal, the lottery
– Odds of marrying Elizabeth Taylor!?
– Foundation of Quantum Mechanics
– Multiple histories of the Universe
– “The only thing you can be certain of in life is uncertainty”
• “The most important questions of life are, for the most part,
really only problems of probability” – Pierre-Simon Laplace
• One of the reasons for this course!
Probability Theory versus Calculus
• Theory
– Mapping real world to a mathematical model
– Differing interpretations of probability possible
• Frequentist
– Probability = Countable frequencies
– Idealized (“All men are created equal”)
– Tacit and unspecified underlying assumptions
• Subjective
– Measure of belief
– Explicit representation of all aspects of uncertainty
– More general case, includes the frequentist view as a special case
– More recent trend
– Focus of this course (Bayesian framework)
• Calculus
– Manipulation of the mathematical representation
– Consensus on approach exists, same for all “Probability Theories”
Permutations and Combinations
• Order important in permutation
– Biological sequences are permutations (with repetition) of their
respective alphabets
– What is the total possible number of proteins?
• Order irrelevant in combination (set or multiset)
– Set of genes of a species
– How many different instances of Homo sapiens can be
created?
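The two counting ideas on this slide can be sketched in a few lines. This is only an illustration: the 20-letter amino acid alphabet, the protein length of 10 residues, and the pool of 5 genes are assumptions chosen for the example, not figures from the lecture.

```python
import math

ALPHABET_SIZE = 20   # standard amino acids (assumption for this sketch)
length = 10          # hypothetical protein length

# Order matters and letters may repeat: one independent choice per position.
num_sequences = ALPHABET_SIZE ** length

# Order matters, no repeats: arrange 3 distinct letters out of 20.
num_ordered = math.perm(ALPHABET_SIZE, 3)

# Order irrelevant: choose 3 genes out of a hypothetical pool of 5.
num_subsets = math.comb(5, 3)

print(num_sequences, num_ordered, num_subsets)  # 10240000000000 6840 10
```

Even a very short protein already has trillions of possible sequences, which is why the count for realistic lengths is astronomically large.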
Independence and Conditional Probability
• Independence (Between two events/distributions)
– Probability of getting an A in this course and having purple hair
(Independent or Dependent?)
– Probability of a student attending this class, and being registered
for this class (Independent or Dependent?)
• Conditional Probability (Calculation based on prior event)
– Given that a student is attending this class, what is the probability
of having registered? (Not independent)
– Given that a student is attending this class, what is the probability
that the student is smart? (Independent)
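A minimal sketch of the attendance and registration example, assuming made-up head counts for a campus of 1000 students (none of these numbers come from the lecture). It computes P(registered | attending) as P(both) / P(attending) and checks the product rule that independence would require.

```python
from fractions import Fraction

# Hypothetical head counts (assumptions for illustration only).
total = 1000
attending = 50                 # students sitting in the class
registered = 60                # students registered for the class
attending_and_registered = 45  # students who are both

p_attending = Fraction(attending, total)
p_registered = Fraction(registered, total)
p_both = Fraction(attending_and_registered, total)

# Conditional probability: P(registered | attending) = P(both) / P(attending)
p_registered_given_attending = p_both / p_attending

# Independence would require P(both) == P(attending) * P(registered).
independent = p_both == p_attending * p_registered

print(p_registered_given_attending, independent)  # 9/10 False
```

With these counts, knowing that a student attends raises the probability of registration from 6% to 90%, so the two events are clearly not independent.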
Random variables
• “Everyone/Everything is a stochastic statistic”
• Variable whose value in a specific trial/sample cannot be
exactly predicted
• Discrete
– Non-zero probabilities exist for the variable to take on each of a set
of values
– How many A’s will a student get? (Variable = Number of A grades)
• Continuous
– Non-zero probabilities exist for the variable to lie within a set of
ranges of values
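– What is the probability that the Instructor’s weight is 135 pounds?
(Exactly 135 has probability zero; only a range such as 134.5 to
135.5 pounds has non-zero probability)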
Joint random variables
• Particular values of multiple variables being
observed together
– Probability of getting a 7’0” center who has a free throw
percentage greater than 90%
• Marginal probability of one of the joint variables
– Consider all possible values of the remaining variables
(summation for the discrete case, integration for the
continuous case)
– Probability of getting a 7’0” center. Period. (No matter
what the free throw shooting percentage is)
– Probability of spotting a car with mileage better than 30
mpg = [|ToyotaModels| + |HondaModels|…..]/|All Cars|
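Marginalization for the discrete case can be sketched as a sum over a joint table. The joint probabilities below are invented for illustration, and `marginal_height` is a hypothetical helper name.

```python
# Hypothetical joint probabilities P(height, free-throw bracket); values are made up.
joint = {
    ("7ft", "above 90%"): 0.01,
    ("7ft", "90% or below"): 0.09,
    ("shorter", "above 90%"): 0.10,
    ("shorter", "90% or below"): 0.80,
}

def marginal_height(joint_table, height):
    """Sum the joint probability over every free-throw bracket."""
    return sum(p for (h, _), p in joint_table.items() if h == height)

# Probability of a 7-footer, no matter what the free-throw percentage is.
p_seven_footer = marginal_height(joint, "7ft")
print(round(p_seven_footer, 2))  # 0.1
```

For continuous variables the sum would be replaced by an integral over the variable being marginalized out.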
Probability mass/density functions
• Probability mass function
– Function for computing probability for discrete variables
• Probability density function
– Function for computing probability for continuous variables
(probability = density integrated over a range of values)
• Cumulative distribution function
– For either discrete or continuous variables, compute probability for
all values greater/less than a specified limit
• The Probability Mass and Cumulative Distribution
Functions have values ranging from 0 to 1, while the
Probability Density Function has only a lower bound of 0
(a density can exceed 1)
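A small sketch of why a density is bounded below by 0 but not above by 1, assuming a continuous uniform variable on the interval [0, 0.5] (an example chosen for this note, not taken from the lecture). The density there is 2, yet every probability obtained from the cumulative distribution function stays between 0 and 1.

```python
def uniform_pdf(x, low, high):
    """Density of a continuous uniform variable on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def uniform_cdf(x, low, high):
    """P(X <= x) for the same variable."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

# The density is 2 everywhere on [0, 0.5]: larger than 1, and still valid.
density = uniform_pdf(0.25, 0.0, 0.5)

# Probability of landing in [0.2, 0.3] is a difference of two CDF values.
prob = uniform_cdf(0.3, 0.0, 0.5) - uniform_cdf(0.2, 0.0, 0.5)

print(density, round(prob, 3))  # 2.0 0.2
```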
Expectation/Mean
• Expectation
– The general case of “average”
• Not restricted to counts
• Any experimental result that is numeric
– What can one expect on average?
– Captures the central tendency
– Linear operator
• Additive
– Example:
• Expected number of heads in a single toss of a fair coin = 0.5
• Expected number of heads in 2 tosses of a fair coin = 1 (the probability of 2 heads is 0.25)
• Expectation of the Instructor winning the Nobel prize = 10^-10000000000……
• Average price of an airline ticket to Hawaii
– Frequently used to establish the null case in experiments (Mind
Reading/ Gender prediction scams!)
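A minimal sketch of expectation as a probability-weighted average, using fair coin tosses. The helper name `expectation` is hypothetical. It also illustrates additivity: the expected number of heads in two tosses is twice the expected number in one toss, which is different from the probability of getting two heads.

```python
from fractions import Fraction

def expectation(pmf):
    """Expected value of a discrete variable given {value: probability}."""
    return sum(value * prob for value, prob in pmf.items())

# Number of heads in one toss of a fair coin.
one_toss = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Number of heads in two independent tosses of a fair coin.
two_tosses = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

e_one = expectation(one_toss)   # 1/2
e_two = expectation(two_tosses) # 1

print(e_one, e_two, two_tosses[2])  # 1/2 1 1/4
```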
Variance/Standard Deviation
• Captures inconsistency/irreproducibility/error
– “Always scores 20 points per game” versus “Anywhere between 0
and 40 points per game”
• Measure of dispersion
• Squaring the deviations in the expression for variance counts
dispersion above and below the mean equally, so the two cannot cancel
• Standard deviation
– Same unit as variable
– Useful for normalized comparisons between experiments
– Useful for estimating significance of observed values
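The contrast between a consistent scorer and an erratic one can be sketched with Python's `statistics` module. The two score lists are hypothetical; both average 20 points per game, so only the dispersion tells them apart.

```python
import statistics

steady = [20, 20, 20, 20, 20]   # hypothetical points per game
streaky = [0, 40, 10, 30, 20]   # same mean, wide spread

mean_steady = statistics.mean(steady)
mean_streaky = statistics.mean(streaky)

# Population variance: mean of squared deviations from the mean.
var_streaky = statistics.pvariance(streaky)

# Standard deviation is the square root, in the same unit as the scores.
sd_steady = statistics.pstdev(steady)
sd_streaky = statistics.pstdev(streaky)

print(mean_steady, mean_streaky, var_streaky, sd_steady, round(sd_streaky, 2))
```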
Probability distributions
• Not just the Expectation, but the whole shebang
– Probabilities of all possible values a variable can take
• Finding the probability distribution for a variable is
important
• If probability distribution is known
– Expectation can be calculated
– Variance/Standard deviation can be calculated
– Unusualness/Significance of an experimental finding can be
assessed
– Data can be assigned to different probability distributions
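Once the whole distribution is known, the quantities listed above follow mechanically. The sketch below assumes a fair six-sided die as the known distribution (an example chosen for this note) and derives the expectation, the variance, and a simple tail probability as a measure of how unusual an observation is.

```python
from fractions import Fraction

# PMF of a fair six-sided die (assumed example distribution).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Expectation: probability-weighted average of the values.
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# "Unusualness" of an observation: probability of a result at least this large.
observed = 6
tail = sum(p for x, p in pmf.items() if x >= observed)

print(mean, variance, tail)  # 7/2 35/12 1/6
```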
Types of Probability distributions
• Uniform
– Everything equally possible
– Probability of an event inversely proportional to range of
possibilities
– Probability that a student entering Twin Oaks lives on a
specific floor
– “Truly random” Roulette wheel
– Coin Toss
Types of Probability distributions
• Bernoulli
– Variable can take value A xor B
– “She has her mother’s complexion!” “No, she doesn’t!”
– Odds of getting a head in a single coin toss
• Binomial
– Result of multiple Bernoulli trials
– Odds of m of n children inheriting their mother’s complexion
– Odds of getting m heads in n coin tosses
• Multinomial
– “Generalized binomial”: More than two outcomes
– Number distribution in multiple rolls of a die
– Residue composition of DNA or protein sequence
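The odds of m heads in n coin tosses can be sketched with the standard binomial formula, C(n, m) p^m (1-p)^(n-m). The function name `binomial_pmf` is a hypothetical helper, and the fair coin (p = 0.5) is an assumption.

```python
import math

def binomial_pmf(m, n, p):
    """Probability of exactly m successes in n independent Bernoulli(p) trials."""
    return math.comb(n, m) * p ** m * (1 - p) ** (n - m)

p_two_of_two = binomial_pmf(2, 2, 0.5)    # both tosses are heads
p_five_of_ten = binomial_pmf(5, 10, 0.5)  # exactly half of ten tosses are heads

# The probabilities over all possible counts must add up to 1.
total = sum(binomial_pmf(m, 10, 0.5) for m in range(11))

print(p_two_of_two, p_five_of_ten, total)
```

A single Bernoulli trial is the special case n = 1.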
Types of Probability distributions
• Poisson
– Number of events occurring in an interval, given an average rate
– Approximation of a Binomial distribution with large number of
trials and low probability of success
– Number of traffic accidents
• Gaussian
– Classical bell shape/also called Normal distribution
– Distribution of estimates of Mean of a variable for large samples,
largely irrespective of the underlying distribution of the variable
per se (Central Limit Theorem)
– Unit normalized distribution allows useful interpretation
• Extreme value distribution
– Skewed
– Fashion/Toy industries thrive on this! “Everyone has a Star Wars
light saber. I’ve got to have one!!”
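The Poisson approximation to the Binomial mentioned on this slide can be checked numerically. The sketch assumes n = 1000 trials with success probability p = 0.002 (illustrative values only), so the Poisson rate is np = 2, and compares the two probability mass functions for small counts.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

n, p = 1000, 0.002   # large n, small p (assumed values)
lam = n * p          # matching Poisson rate

# Largest disagreement between the two PMFs over counts 0..10.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))

print(lam, max_gap)
```

The gap shrinks further as n grows and p falls with np held fixed.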
Types of Probability distributions
• Student t distribution
– Bell shaped, but with heavier tails
– Distribution of estimate of mean, normalized by estimate
of standard deviation (“error compounded by error”)
• Chi-squared distribution
– Distribution of sums of squares of multiple independent random
variables, each having a standard normal distribution
– Error in hitting a baseball = Sum(Error in seeing ball,
error in timing hit, error in centering hit..)
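A simulation sketch of the Chi-squared construction: each sample is a sum of squares of independent standard normal draws. The choice of 4 degrees of freedom, 20000 samples, and the fixed seed are assumptions for illustration. The sample mean and variance should land near the theoretical values n and 2n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def chi_squared_sample(dof):
    """Sum of squares of `dof` independent standard normal draws."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(dof))

dof = 4
samples = [chi_squared_sample(dof) for _ in range(20000)]

sample_mean = statistics.fmean(samples)     # theory: dof
sample_var = statistics.pvariance(samples)  # theory: 2 * dof

print(round(sample_mean, 2), round(sample_var, 2))
```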
Types of Probability distributions
• F distribution
– Distribution of ratio of a pair of normalized Chi-square
variables (each divided by its degrees of freedom)
– “Distribution of distributions”
– Compares variances
• Multi-variate normal distribution
– Generalization of normal distribution to multiple
variables
– Distribution of a vector of random variables
Types of Probability distributions
Distribution   Comment                     Expectation   Variance
Uniform        A ≤ x ≤ B                   (A+B)/2       (B-A)^2/12
Bernoulli                                  p             p(1-p)
Binomial                                   np            np(1-p)
Multinomial                                np_i          np_i(1-p_i)
Poisson        Large n, small p            λ = np        λ
Student t      n-1 degrees of freedom      0             (n-1)/(n-3)
Chi-squared                                n             2n