
Stat 155, Section 2, Last Time
• Pepsi Challenge:
When are results “significant” vs. “random”?
• Independence
– Conditional Prob’s = Unconditional Prob’s
– Special case of and rule (of probability)
• Random Variables
– Discrete vs. Continuous
– Discrete:
• Summarize probability with table
• Sum entries to calculate prob’s
Reading In Textbook
Approximate Reading for Today’s Material:
Pages 277-286, 291-305
Approximate Reading for Next Class:
Pages 291-305, 334-351
Midterm I
Coming up: Tuesday, Feb. 27
Material:
HW Assignments 1 – 6
Extra Office Hours:
Mon. Feb. 26, 8:30 – 12:00, 2:00 – 3:30
(Instead of Review Session)
Bring Along:
1 8.5” x 11” sheet of paper with formulas
Midterm I
Suggestions for studying:
• Exam based on HW, not text or class
• Constructed by modifying HW problems
• So rework HW problems
Note: different from “looking over HW”
Midterm I
Use Posted Old Exams:
On Class Web Page
• Least effective: read over solutions
• Moderate: read test, think, then look
• Best: Rework, then check
Midterm I
Warning about Old Exams:
• Have slightly less material
• In particular probability not covered then
• Because of different calendar
(drop date used to be earlier)
• Maybe something we haven’t covered
• Clarify by email…
Midterm I
Warning about Old Exams:
• Famous last words….
“But I knew everything on the practice exam”
• Practice exam only about “method” of questions
• Not representative of material
• Only a sample
• As present exam will be
Random Variables
Now consider continuous random variables
Recall: for measurements (not counting)
Model for continuous random variables:
Calculate probabilities as areas,
under “probability density curve”, f(x)
Continuous Random Variables
Model probabilities for continuous random
variables, as areas under “probability
density curve”, f(x):
$P\{a \le X \le b\} = \text{Area under } f(x) \text{ between } a \text{ and } b = \int_a^b f(x)\,dx$
(calculus notation)
Continuous Random Variables
Note:
Same idea as “idealized distributions” above
Recall discussion from:
Page 8, of Class Notes, Jan. 23
Continuous Random Variables
e.g. Uniform Distribution
Idea:
choose random number from [0,1]
Use constant density:
f(x) = C
Models “equally likely”
To choose C, want:
1 = P{X in [0,1]} = Area = C
(area of the rectangle of height C over [0,1])
So want C = 1.
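A quick check of the area idea: for the Uniform[0,1] density f(x) = 1, the probability P{a ≤ X ≤ b} is just the area of a rectangle of height 1, i.e. the width of the interval. A minimal Python sketch (the function name uniform_prob is made up for illustration, not from the text):

```python
def uniform_prob(a, b):
    """P{a <= X <= b} for X ~ Uniform[0, 1].

    The density is f(x) = 1 on [0, 1] and 0 elsewhere, so the area
    under f between a and b is the length of [a, b] clipped to [0, 1].
    """
    lo = max(a, 0.0)
    hi = min(b, 1.0)
    return max(hi - lo, 0.0)

print(uniform_prob(0.2, 0.7))  # area over [0.2, 0.7]
print(uniform_prob(0.0, 1.0))  # total area under the density
print(uniform_prob(0.5, 0.5))  # a single point has zero area
```

Note that a single point gets probability 0, which is one way continuous random variables differ from discrete ones.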
Uniform Random Variable
HW:
4.54
(0.73, 0, 0.73, 0.2, 0.5)
4.56
(1, ½, 1/8)
Continuous Random Variables
e.g. Normal Distribution
Idea: Draw at random from a normal
population
f(x) is the normal curve (studied above)
Review some earlier concepts:
Normal Curve Mathematics
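The “normal $\mu, \sigma$ density curve” is:
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
where:
f(x) is the usual “function” of x
$\pi$ is the circle constant = 3.14…
$e$ is the natural number = 2.7…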
Normal Curve Mathematics
Main Ideas:
• Basic shape is: $e^{-\frac{1}{2}x^2}$
• “Shifted to mu”: $e^{-\frac{1}{2}(x-\mu)^2}$
• “Scaled by sigma”: $e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
• Make Total Area = 1: divide by $\sqrt{2\pi}\,\sigma$
• $f(x) \to 0$ as $x \to \pm\infty$, but never $= 0$
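To see the “Total Area = 1” idea numerically, here is a minimal Python sketch that codes the normal density and adds up thin rectangles under it. The helper names (normal_density, area) and the choice of mean 1 and standard deviation 0.5 are illustrative assumptions, not part of the text.

```python
import math

def normal_density(x, mu, sigma):
    """Normal density: basic shape exp(-z^2/2), shifted by mu, scaled by sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

def area(f, a, b, n=100_000):
    """Approximate the area under f between a and b with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 1.0, 0.5  # assumed example values
# Integrate over mu +/- 10 sigma; the area outside this range is negligible.
total = area(lambda x: normal_density(x, mu, sigma), mu - 10 * sigma, mu + 10 * sigma)
print(total)                                       # very close to 1
print(normal_density(mu + 10 * sigma, mu, sigma))  # tiny, but still positive
```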
Computation of Normal Areas
EXCEL
Computation:
works in terms of
“lower areas”
E.g. for N (1,0.5)
Area < 1.3
Computation of Normal Probs
EXCEL Computation:
probs given by “lower
areas”
E.g. for X ~ N(1,0.5)
P{X < 1.3} = 0.73
Normal Random Variables
As above, compute probabilities as areas,
In EXCEL, use NORMDIST & NORMINV
E.g. above:
X ~ N(1,0.5)
P{X < 1.3} =NORMDIST(1.3,1,0.5,TRUE)
= 0.73
(as in pic above)
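For anyone working outside EXCEL, the same “lower area” can be sketched in Python with the standard library's statistics.NormalDist (available in Python 3.8 and later). Here cdf plays the role of NORMDIST(..., TRUE) and inv_cdf the role of NORMINV; as in the EXCEL call, 0.5 is taken to be the standard deviation.

```python
from statistics import NormalDist

# X ~ N(1, 0.5): mean 1, standard deviation 0.5
X = NormalDist(mu=1, sigma=0.5)

p = X.cdf(1.3)          # lower area: P{X < 1.3}
print(round(p, 2))      # 0.73

cutoff = X.inv_cdf(p)   # inverse: the cutoff with lower area p
print(round(cutoff, 2)) # 1.3
```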
Normal Random Variables
HW:
4.57, 4.58 (0.965, ~0)
And now for something
completely different
Recall
Stat 155, Section 2, Majors
[Figure: bar chart of the distribution of majors of students in this course; vertical axis: Frequency, from 0 to 0.4]
And now for something
completely different
A photographer for a national magazine was assigned
to get photos of a great forest fire.
Smoke at the scene was too thick to get any good
shots, so he frantically called his home office to
hire a plane.
"It will be waiting for you at the airport!" he was
assured by his editor.
And now for something
completely different
As soon as he got to the small, rural airport, sure
enough, a plane was warming up near the
runway.
He jumped in with his equipment and yelled, "Let's
go! Let's go!"
The pilot swung the plane into the wind and soon they
were in the air.
And now for something
completely different
"Fly over the north side of the fire," said the
photographer, "and make three or four low level
passes."
"Why?" asked the pilot.
"Because I'm going to take pictures! I'm a
photographer, and photographers take pictures!"
said the photographer with great exasperation.
And now for something
completely different
After a long pause the pilot said, "You mean you're
not the instructor?"
…
Means and Variances
(of random variables)
Text, Sec. 4.4
Idea: Above population summaries, extended
from populations to probability distributions
Connection:
frequentist view
Make repeated draws, $X_1, X_2, \ldots, X_n$,
from the distribution
Discrete Prob. Distributions
Recall table summary of distribution:
Values   x1   x2   …   xk     (taken on by random variable X)
Prob.    p1   p2   …   pk     (probabilities: P{X = xi} = pi)
(note: big difference between X and x!)
Discrete Prob. Distributions
Table summary of distribution:
Values   x1   x2   …   xk
Prob.    p1   p2   …   pk
Recall power of this:
Can compute any prob., by summing pi
Mean of Discrete Distributions
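Frequentist approach to mean:
$\bar{X} = \frac{X_1 + \cdots + X_n}{n}$
$= \frac{\#\{X_i = x_1\} \cdot x_1 + \cdots + \#\{X_i = x_k\} \cdot x_k}{n}$
$= \frac{\#\{X_i = x_1\}}{n} \cdot x_1 + \cdots + \frac{\#\{X_i = x_k\}}{n} \cdot x_k$
$\approx p_1 x_1 + \cdots + p_k x_k = \sum_{i=1}^{k} p_i x_i$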
Mean of Discrete Distributions
Frequentist approach to mean:
$\mu_X = \sum_{i=1}^{k} p_i x_i$
a weighted average of values
where weights are probabilities
Mean of Discrete Distributions
E.g. Above Die Rolling Game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
Mean of distribution =
= (1/3)(9) + (1/6)(0) + (1/2)(-4) = 3 - 2 = 1
Interpretation: on average (over large number
of plays) winnings per play = $1
Conclusion: should be very happy to play
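The weighted average above can be checked with a few lines of Python. This is only a sketch: the variable and function names are made up for illustration, and exact fractions are used so there is no rounding.

```python
from fractions import Fraction

# Table for the die rolling game: winnings (in dollars) and probabilities.
winnings = [9, -4, 0]
probs = [Fraction(1, 3), Fraction(1, 2), Fraction(1, 6)]

def mean(values, probabilities):
    """Mean of a discrete distribution: sum of p_i * x_i."""
    assert sum(probabilities) == 1, "probabilities must sum to 1"
    return sum(p * x for x, p in zip(values, probabilities))

mu = mean(winnings, probs)
print(mu)  # 1
```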
Mean of Discrete Distributions
Terminology:
mean is also called:
“Expected Value”
E.g. in above game “expect” $1 (per play)
(caution: on average over many plays)
Expected Value
HW:
4.59,
4.61
Expected Value
An application of Expected Value:
Assess “fairness” of games (e.g. gambling)
Major Caution: Expected Value is not what is
expected on one play, but instead is
average over many plays.
Cannot say what happens in one or a few
plays, only in long run average
Expected Value
E.g. Suppose have $5000, and need $10,000
(e.g. you owe mafia $5000, clean out safe
at work. If you give to mafia, you go to jail,
so decide to try to raise additional $5000
by gambling)
And can make even bets, where P{win} = 0.48
(can really do this, e.g. bets on Red in
roulette at a casino)
Expected Value
E.g. Suppose have $5000, and need $10,000
and can make even bets, w/ P{win} = 0.48
Pressing Practical Problem:
•
Should you make one large bet?
•
Or many small bets?
•
Or something in between?
Expected Value
E.g. Suppose have $5000, and need $10,000
and can make even bets, w/ P{win} = 0.48
Expected Value analysis:
E(Winnings) = P{lose} x $0 + P{win} x $2
(amount returned per $1 bet)
= 0.52 x $0 + 0.48 x $2
= $0.96
Thus expect to lose $0.04 for every dollar bet
Expected Value
E.g. Suppose have $5000, and need $10,000
and can make even bets, w/ P{win} = 0.48
Expect to lose $0.04 for every dollar bet
• This is why gambling is very profitable
(for the casinos, been to Las Vegas?)
• They play many times
• So expected value works for them
• And after many bets, you will surely lose
• So should make fewer, not more bets?
Expected Value
E.g. Suppose have $5000, and need $10,000
and can make even bets, w/ P{win} = 0.48
Another view:
Strategy            P{get $10,000}
one $5000 bet       0.48 ~ 1/2
two $2500 bets      ~ (0.48)^2 ~ 1/4
four $1250 bets     ~ (0.48)^4 ~ 1/16
“many”              “no chance”
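The table above uses rough approximations (the chance of winning every bet in a row). A different way to look at the same question, sketched below in Python, assumes you keep making even bets of one fixed size until you either reach $10,000 or go broke, and uses the classical “gambler's ruin” formula for that setup. This model is an assumption swapped in for illustration, so its numbers differ from the table, but the ordering of the strategies comes out the same. The function name and argument names are hypothetical.

```python
def prob_reach_goal(start, goal, bet, p_win):
    """Chance of reaching `goal` before $0, making even bets of size `bet`.

    Assumes `start` and `goal` are whole multiples of `bet`.
    Gambler's ruin formula, written with s = p/q so that for p_win < 0.5
    the powers shrink toward 0 instead of overflowing.
    """
    i = start // bet   # current fortune, in units of one bet
    n = goal // bet    # target fortune, in units of one bet
    if p_win == 0.5:
        return i / n   # fair game: chance is start / goal
    s = p_win / (1 - p_win)
    return (s ** (n - i) - s ** n) / (1 - s ** n)

for bet in (5000, 2500, 1250, 1):
    print(bet, prob_reach_goal(5000, 10000, bet, 0.48))
```

Under this model the single $5000 bet still gives the best chance (0.48), and the chance shrinks as the bets get smaller.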
Expected Value
E.g. Suppose have $5000, and need $10,000
and can make even bets, w/ P{win} = 0.48
Surprising (?) answer:
• Best to make one big bet
• Not much fun…
• But best chance at winning
Casino Folklore:
• This really happens
• Folks walk in, place one huge bet….
Expected Value
Warning about Expected Value:
Excellent for some things, but not all decisions:
• Good guide, e.g. if will play many times
• Poor guide, e.g. if only play once
(so don’t have long run)
Expected Value
Real life decisions against Expected Value:
1. State Lotteries
– State sells tickets
– Keeps about half of $$$
– Gives rest to ~ one (randomly chosen) player
– So Expected Value is clearly negative
– Why do people play? Totally irrational?
– Players buy faint hope of humongous gain
– Could be worth joy of thinking about it
Expected Value
Real life decisions against Expected Value:
1. State Lotteries
– Support ours in North Carolina?
Interesting (and deep) philosophical balances:
– Only totally voluntary tax
– Yet tax burden borne mostly by poor
– Is that fair?
– Otherwise lose revenue to other states…
Expected Value
Real life decisions against Expected Value:
2. Casino Gambling
– Always lose in long run (expected value…)
– Yet people do it. Are they nuts?
– Depends on how many times they play
– If really enjoy being ahead sometimes
– Then could be worth price paid for the thrill
– Serious societal challenge:
(some are totally consumed by thrill)
Expected Value
Real life decisions against Expected Value:
3. Insurance
– Everyone pays about 2 x Expected Loss
– Insurance Company keeps the rest!
– So very profitable.
– But e.g. car insurance is required by law!
– Sensible, since if lose, can lose very big
– Yet purchase is totally against Expected Value
– OK, since you only play once (not many times)
– Insurance Co’s play many times (Expected Value works for them)
– So they are an evening out mechanism
And now for something
completely different
A Fun Movie Clip:
405 Landing – Video File
Functions of Expected Value
Important Properties of the Mean:
i. Linearity:
$\mu_{aX+b} = a \mu_X + b$
Why?
$\mu_{aX+b} = \sum_i p_i (a x_i + b) = \sum_i a p_i x_i + \sum_i p_i b$
$= a \left( \sum_i p_i x_i \right) + b \left( \sum_i p_i \right) = a \mu_X + b$
(using $\sum_i p_i = 1$)
i. e. mean “preserves linear transformations”
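Linearity can be checked on the die rolling game from earlier. In this Python sketch the choice a = 2 and b = -1 (double the stakes, then pay a $1 entry fee) is a made-up example, not from the text.

```python
from fractions import Fraction

# Die rolling game: winnings -> probability
game = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

def mean(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())

a, b = 2, -1  # hypothetical: double the stakes, pay a $1 entry fee

# Distribution of aX + b: same probabilities, transformed values.
transformed = {a * x + b: p for x, p in game.items()}

lhs = mean(transformed)   # mean of aX + b, computed directly from its table
rhs = a * mean(game) + b  # a * (mean of X) + b
print(lhs, rhs)           # both equal 1
```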
Functions of Expected Value
Important Properties of the Mean:
ii. Summability:
$\mu_{X+Y} = \mu_X + \mu_Y$
Why is harder, so won’t do here
i. e. can add means to get mean of sums
i. e. mean “preserves sums”
Functions of Expected Value
E. g. above game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
If we “double the stakes”, then want:
“mean of 2X” $= \mu_{2X} = 2 \mu_X = \$2$
Recall $1 before
i.e. have twice the expected value
Functions of Expected Value
E. g. above game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
If we “play twice”, then have
$\mu_{X_1 + X_2} = \mu_{X_1} + \mu_{X_2} = \$1 + \$1 = \$2$
Same as above?
But isn’t playing twice different from doubling
stake?
Yes, but not in means
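To see “different, but not in means” concretely, the Python sketch below builds the full table for each option. For “play twice” it assumes the two plays are independent, so probabilities of pairs multiply; that assumption is only needed to build the table, not for the mean itself. All names are illustrative.

```python
from fractions import Fraction
from itertools import product

game = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

def mean(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())

# Double the stakes: every outcome is doubled, probabilities unchanged.
doubled = {2 * x: p for x, p in game.items()}

# Play twice (assumed independent): add winnings, multiply probabilities.
twice = {}
for (x1, p1), (x2, p2) in product(game.items(), repeat=2):
    twice[x1 + x2] = twice.get(x1 + x2, 0) + p1 * p2

print(sorted(doubled))             # possible totals when doubling
print(sorted(twice))               # possible totals when playing twice
print(mean(doubled), mean(twice))  # the two means agree
```

The lists of possible totals differ, so the two distributions are different, yet both means come out to $2.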
Functions of Expected Value
HW:
4.73
Indep. Of Random Variables
Independence: Random Variables X & Y are
independent when knowledge of value of
X does not change chances of values of
Y
Indep. Of Random Variables
HW:
4.71, 4.72
(Indep., Dep., Dep.)
Independence
Application: Law of Large Numbers
IF $X_1, \ldots, X_n$ are independent draws from the
same distribution, with mean $\mu$,
THEN:
$"\lim_{n \to \infty}" \bar{X} = \mu$
(needs more mathematics to make precise,
but this is the main idea)
Independence
Application: Law of Large Numbers
Note: this is the foundation of the
“frequentist view of probability”
Underlying thought experiment is based on
many replications, so limit works….
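The thought experiment can be imitated by simulation. This Python sketch draws repeatedly from the die rolling game (mean $1) and prints the sample mean for growing n; the seed value and the sample sizes are arbitrary choices for illustration.

```python
import random

random.seed(155)  # arbitrary seed, only to make the run repeatable

values = [9, -4, 0]
weights = [2, 3, 1]  # proportional to probabilities 1/3, 1/2, 1/6

def sample_mean(n):
    """Average of n independent draws from the game's distribution."""
    draws = random.choices(values, weights=weights, k=n)
    return sum(draws) / n

for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))
```

For small n the sample mean can be far from 1; for large n it should settle close to 1, though any single run only illustrates the idea and does not prove it.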
Variance of Random Variables
Again consider discrete random variables:
Where distribution is summarized by a table,
Values   x1   x2   …   xk
Prob.    p1   p2   …   pk
Variance of Random Variables
Again connect via frequentist approach:
$\mathrm{var}(X_1, \ldots, X_n) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
$= \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n-1}$
$= \frac{\#\{X_i = x_1\} (x_1 - \bar{X})^2 + \cdots + \#\{X_i = x_k\} (x_k - \bar{X})^2}{n-1}$
Variance of Random Variables
Again connect via frequentist approach:
$\mathrm{var}(X_1, \ldots, X_n) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
$= \frac{\#\{X_i = x_1\}}{n-1} (x_1 - \bar{X})^2 + \cdots + \frac{\#\{X_i = x_k\}}{n-1} (x_k - \bar{X})^2$
$\approx p_1 (x_1 - \bar{X})^2 + \cdots + p_k (x_k - \bar{X})^2$
$= \sum_{i=1}^{k} p_i (x_i - \bar{X})^2$
Variance of Random Variables
So define:
Variance of a distribution (or random variable)
As:
$\sigma_X^2 = \sum_{j=1}^{k} p_j (x_j - \mu_X)^2$
Variance of Random Variables
E. g. above game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
$\sigma_X^2 = \frac{1}{2}(-4 - 1)^2 + \frac{1}{6}(0 - 1)^2 + \frac{1}{3}(9 - 1)^2$
=(1/2)*5^2+(1/6)*1^2+(1/3)*8^2
Note: one acceptable Excel form,
e.g. for exam (but there are many)
Standard Deviation
Recall standard deviation is square root of
variance (same units as data)
E. g. above game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
Standard Deviation
=sqrt((1/2)*5^2+(1/6)*1^2+(1/3)*8^2)
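The Excel forms for the variance and standard deviation of the game can be cross-checked with a short Python sketch (function and variable names are illustrative; exact fractions avoid rounding in the variance):

```python
import math
from fractions import Fraction

game = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

def mean(dist):
    """Mean: sum of p_j * x_j."""
    return sum(p * x for x, p in dist.items())

def variance(dist):
    """Variance: sum of p_j * (x_j - mean)^2."""
    mu = mean(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

var_x = variance(game)
sd_x = math.sqrt(var_x)  # standard deviation, in dollars

print(var_x)           # 34
print(round(sd_x, 2))  # 5.83
```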
Variance of Random Variables
HW:
C16:
Find the variance and standard
deviation of the distribution in 4.59.
(1.21, 1.10)
Properties of Variance
i. Linear transformation:
$\sigma_{aX+b}^2 = a^2 \sigma_X^2$
I.e. “ignore shifts”: var(X + b) = var(X)
(makes sense)
And scales come through squared
(recall s.d. on scale of data, var is square)
Properties of Variance
ii. For X and Y independent (important!)
$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$
I. e. Variance of sum is sum of variances
Here is where variance is “more natural”
than standard deviation:
$\sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$
Properties of Variance
E. g. above game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
Recall “double the stakes”, gave same mean,
as “play twice”, but seems different
Doubling: $\sigma_{2X}^2 = 4 \sigma_X^2$
Play twice, independently: $\sigma_{X_1 + X_2}^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2 = 2 \sigma_X^2$
Note: playing more reduces uncertainty
(var quantifies this idea, will do more later)
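The difference between doubling and playing twice shows up when the variances are computed from the full tables. The Python sketch below does this for the die rolling game, assuming the two plays are independent; names are illustrative.

```python
from fractions import Fraction
from itertools import product

game = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

def mean(dist):
    return sum(p * x for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

# Double the stakes: values doubled, probabilities unchanged.
doubled = {2 * x: p for x, p in game.items()}

# Play twice, independently: add winnings, multiply probabilities.
twice = {}
for (x1, p1), (x2, p2) in product(game.items(), repeat=2):
    twice[x1 + x2] = twice.get(x1 + x2, 0) + p1 * p2

print(variance(game))     # 34
print(variance(doubled))  # 136 = 4 * 34
print(variance(twice))    # 68  = 2 * 34
```

Same mean ($2) either way, but playing twice has half the variance of doubling the stakes.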
Variance of Random Variables
HW:
C17: Suppose that the random variable X
models winter daily maximum
temperatures, and that X has mean 5° C
and standard deviation 10° C.
(a) Let Y be the temp. in degrees Fahrenheit
What is the mean of Y? (41° F)
Hint: Recall the conversion: C=(5/9)(F-32)