Transcript Slide 1
Putting Into Practice What I
Learned from FSU Statistics
Professors
Michael Proschan
NIAID
The Indian Connection
• I recently tried to prove a theorem related to the
monitoring of clinical trials
– Last step: If X ~ N(μ, 1) and A is an event such that
P(X ∈ A) does not depend on μ, then P(X ∈ A) = 0 or 1
• Pretty obvious, but how do you prove it?
• Fred Flintstone called on
The Great Gazoo!
The Indian Connection
• I call on
The Great Basu!
The Indian Connection
• I(X ∈ A) is ancillary: its distribution does not depend on μ
• X is a complete, sufficient statistic
• Basu’s theorem: X is independent of I(X ∈ A)
\begin{align*}
P(X \in A) &= P\{X \in A,\ I(X \in A) = 1\} \\
&= P(X \in A)\, P[I(X \in A) = 1] \\
&= [P(X \in A)]^2 \\
\Rightarrow\ P(X \in A) &= 0 \text{ or } 1
\end{align*}
The Indian Connection
• What I will remember most about Dr.
Basu:
– His ability to make the most complicated
topics simple
• “Let me ask you a question like this”
• “Let me show you what he was trying to do”
– His beautiful examples/counterexamples
• 10 coin flips with P(heads) = p, test p = .5 against
p > .5 at α = 2^(-9); most powerful test throws out the
last observation
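The coin-flip example can be checked with exact arithmetic. The sketch below assumes the usual Neyman-Pearson construction for all 10 flips (reject on 10 heads, randomize on exactly 9 heads so the size is exactly 2^(-9)) and compares it with the test that ignores flip 10. The function names are illustrative, not from the talk.

```python
from fractions import Fraction
from math import comb

N_FLIPS = 10
ALPHA = Fraction(1, 2**9)  # significance level from the slide


def binom_pmf(k, n, p):
    """P(exactly k heads in n flips) when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def randomization_probability():
    """Chance of rejecting on exactly 9 heads, chosen so the size equals ALPHA."""
    half = Fraction(1, 2)
    return (ALPHA - binom_pmf(10, N_FLIPS, half)) / binom_pmf(9, N_FLIPS, half)


def power_randomized(p):
    """Power of the randomized test that uses all 10 flips."""
    gamma = randomization_probability()
    return binom_pmf(10, N_FLIPS, p) + gamma * binom_pmf(9, N_FLIPS, p)


def power_first_nine(p):
    """Power of the test that drops flip 10 and rejects when flips 1-9 are all heads."""
    return p**9


p = Fraction(3, 4)  # an arbitrary alternative, chosen only for illustration
print(randomization_probability())
print(power_randomized(p) == power_first_nine(p))
```

Both tests have size 2^(-9) and power p^9, so dropping the last flip costs nothing. The nonrandomized test that demands 10 heads has size 2^(-10) and power only p^10.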
The Indian Connection
• The other half of the Indian connection was Dr.
Sethuraman, who taught limit theory
• I took that class at just the right time to solidify
what I learned in Dr. McKeague’s probability class
• I learned so much from watching how Sethu
thought
• I also learned how to be careful about probability
and asymptotic arguments
The Indian Connection
• To work out asymptotics of monitoring clinical
trials, we discuss a multivariate Slutsky theorem
• To this day I worry it may be wrong because
Sethu had me prove the following “theorem”
If X_n → X in distribution and Y_n → Y in distribution, and
a_n → a and b_n → b in probability, where a and b are constants,
then a_n X_n + b_n Y_n → aX + bY in distribution
• After I “proved” it on the board, Sethu pointed
out the following counterexample
The Indian Connection
X_n ~ N(0,1) (X_n "twittles like" N(0,1))
Y_n = -X_n (so Y_n also "twittles like" N(0,1))
Let X, Y be iid N(0,1)
Then X_n → X in distribution and Y_n → Y in distribution,
but X_n + Y_n = 0 and X + Y ~ N(0, 2)
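The counterexample is easy to see in a simulation. This is a minimal sketch using only the standard library; the seed and number of draws are arbitrary choices. It takes Y_n to be the negative of X_n, so each marginal is N(0,1) while the sum is identically 0, and compares that with the sum of two independent N(0,1) draws.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is reproducible
N_DRAWS = 50_000

# Dependent pair from the counterexample: Y_n is the negative of X_n
x_n = [rng.gauss(0.0, 1.0) for _ in range(N_DRAWS)]
y_n = [-value for value in x_n]
dependent_sums = [a + b for a, b in zip(x_n, y_n)]

# Independent limits: X and Y iid N(0, 1)
x = [rng.gauss(0.0, 1.0) for _ in range(N_DRAWS)]
y = [rng.gauss(0.0, 1.0) for _ in range(N_DRAWS)]
independent_sums = [a + b for a, b in zip(x, y)]

var_y_n = statistics.pvariance(y_n)
var_independent = statistics.pvariance(independent_sums)

print("variance of Y_n:", var_y_n)                    # close to 1
print("largest |X_n + Y_n|:", max(abs(s) for s in dependent_sums))
print("variance of X + Y:", var_independent)          # close to 2
```

Marginal convergence alone says nothing about the joint behavior of (X_n, Y_n), which is what the sum depends on.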
Not Very Probable!
• I learned a lot from my probability
professor, Dr. McKeague
– Even though he hated it when I used
Skorohod’s representation theorem!
• Several years ago, my sister-in-law’s
boyfriend, Pablo, said he was helping a
doctor accused of overcharging Medicaid
• He asked for my help to defend her
Not Very Probable!
• State’s approach
– Take random sample of the doctor’s Medicaid
claims and compute sample mean overcharge
– Construct 90% confidence interval for
population mean overcharge, μ
– Charge doctor nL, where
• n is # of Medicaid claims that year for the doctor
• L is the lower limit of the confidence interval for μ
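The state's calculation can be sketched as follows. The slide does not say how the interval was built, so this assumes a two-sided 90% normal-approximation interval; the sample values and claim count are made up for illustration and are not data from the case.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def lower_confidence_limit(sample, confidence=0.90):
    """Lower end of a two-sided normal-approximation CI for the mean (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    standard_error = stdev(sample) / sqrt(len(sample))
    return mean(sample) - z * standard_error


def amount_charged(sample, n_claims, confidence=0.90):
    """n * L, where L is the lower confidence limit for the mean overcharge."""
    return n_claims * lower_confidence_limit(sample, confidence)


# Made-up overcharges (in dollars) for five sampled claims.
hypothetical_sample = [10.0, 20.0, 30.0, 40.0, 50.0]
hypothetical_n_claims = 1000

print(round(lower_confidence_limit(hypothetical_sample), 2))
print(round(amount_charged(hypothetical_sample, hypothetical_n_claims), 2))
```

Using the lower limit L rather than the sample mean gives the doctor the benefit of sampling error, which is why the approach looks reasonable if the sample really is random.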
Not Very Probable!
• I told Pablo I thought state’s approach was pretty
reasonable
– The only point of contention was whether the state
really took a random sample
• It appeared to be a convenience sample
• Then I found out who the state’s expert witness
was:
– Dr. McKeague!
– I’m not going against McKeague
• They settled the case
Not Very Probable?
• Recall the disputed election between Bush and
Gore
• Amazingly, almost an exact tie in popular vote
• What is the probability of that?
• From Dr. Leysieffer’s beautifully clear lecture
notes on stochastic processes:
\[
\binom{2n}{n} \sim \frac{2^{2n}}{\sqrt{\pi n}}
\]
Not Very Probable?
\[
P(\text{exact tie}) = \binom{2n}{n} (1/2)^{2n} \approx \frac{1}{\sqrt{\pi n}}
\]
With 100 million voters (2n = 10^8, so n = 5 × 10^7), P(exact tie) ≈ 1/12,500
Much more probable than you would think!
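The approximation can be compared with the exact binomial probability. This sketch assumes the model behind the formula (2n voters, each choosing independently with probability 1/2) and uses log-gamma to avoid overflow; the function names are illustrative.

```python
from math import comb, exp, lgamma, log, pi, sqrt


def tie_probability_exact(n):
    """P(exactly n of 2n fair-coin voters pick each candidate), via log-gamma."""
    log_prob = lgamma(2 * n + 1) - 2 * lgamma(n + 1) - 2 * n * log(2)
    return exp(log_prob)


def tie_probability_approx(n):
    """Stirling approximation 1 / sqrt(pi * n)."""
    return 1 / sqrt(pi * n)


# Sanity check against the direct formula for a small case (20 voters)
print(comb(20, 10) / 2**20, tie_probability_exact(10))

for n in (10, 1_000, 50_000_000, 100_000_000):
    exact = tie_probability_exact(n)
    approx = tie_probability_approx(n)
    print(f"n = {n:>11,}  exact = {exact:.6e}  approx = {approx:.6e}")
```

The approximation gives about 1/12,533 for n = 50 million (100 million voters) and about 1/17,725 for n = 100 million (200 million voters).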
Linear Models
• One area I have worked on is adaptive
sample size calculation in clinical trials
• Consider a trial with paired differences
X_1, …, X_n, where we want to test whether μ = 0
• Sample size depends on σ²
• If we change sample size midstream
based on updated within-trial variance,
how different might the final variance be?
Linear Models
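\[
A = \begin{pmatrix}
1 & -1 & 0 & 0 & \cdots & 0 \\
1 & 1 & -2 & 0 & \cdots & 0 \\
1 & 1 & 1 & -3 & \cdots & 0 \\
\vdots & & & & \ddots & \vdots \\
1 & 1 & 1 & 1 & \cdots & 1
\end{pmatrix}
\]
Y = HX, where
\[
H = \begin{pmatrix}
\frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} & 0 & 0 & \cdots & 0 \\
\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & \frac{-2}{\sqrt{6}} & 0 & \cdots & 0 \\
\frac{1}{\sqrt{12}} & \frac{1}{\sqrt{12}} & \frac{1}{\sqrt{12}} & \frac{-3}{\sqrt{12}} & \cdots & 0 \\
\vdots & & & & \ddots & \vdots \\
\frac{1}{\sqrt{n}} & \frac{1}{\sqrt{n}} & \frac{1}{\sqrt{n}} & \frac{1}{\sqrt{n}} & \cdots & \frac{1}{\sqrt{n}}
\end{pmatrix}
\]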
Linear Models
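\[
\|Y\|^2 = \|HX\|^2 = X'H'HX = X'X = \|X\|^2
\]
\[
\sum_{i=1}^{n-1} Y_i^2 = \sum_{i=1}^{n} Y_i^2 - Y_n^2 = \sum_{i=1}^{n} X_i^2 - \left( \frac{\sum_{i=1}^{n} X_i}{\sqrt{n}} \right)^2
\]
\[
\Rightarrow\ \sum_{i=1}^{n-1} Y_i^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2
\]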
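The identities above can be checked numerically. This is a minimal sketch that builds the Helmert matrix with its constant row last (rows 1 to n-1 are the contrasts, row n is 1/sqrt(n) throughout); the data vector is arbitrary and the helper names are illustrative.

```python
from math import sqrt


def helmert_matrix(n):
    """n x n Helmert matrix as a list of rows; the last row is constant 1/sqrt(n)."""
    rows = []
    for i in range(1, n):
        scale = sqrt(i * (i + 1))
        # i copies of 1, then -i, then zeros, all divided by sqrt(i * (i + 1))
        rows.append([1 / scale] * i + [-i / scale] + [0.0] * (n - i - 1))
    rows.append([1 / sqrt(n)] * n)
    return rows


def transform(matrix, x):
    """Matrix-vector product Y = H X."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in matrix]


x = [2.0, -1.0, 0.5, 3.0, 1.5]  # arbitrary illustrative data
h = helmert_matrix(len(x))
y = transform(h, x)

x_bar = sum(x) / len(x)
sum_sq_dev = sum((xi - x_bar) ** 2 for xi in x)

# Sum of the first n-1 squared Y's versus the sum of squared deviations of X
print(sum(yi**2 for yi in y[:-1]), sum_sq_dev)
# Last coordinate versus sqrt(n) times the sample mean
print(y[-1], sqrt(len(x)) * x_bar)
```

Because row i involves only the first i+1 observations, the first k-1 coordinates of Y depend only on the first k observations, which is what makes the interim variance recoverable from the same transformation.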
Linear Models
• H called the Helmert transformation
• By Helmert, if interim and final variance
estimates are s_k² and s_n²,
\[
\{(k-1)s_k^2,\ (n-1)s_n^2\} = \left\{ \sum_{i=1}^{k-1} Y_i^2,\ \sum_{i=1}^{n-1} Y_i^2 \right\}
\]
• Makes it easy to derive the distribution of
(n-1)s_n² given (k-1)s_k²
Linear Models
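\begin{align*}
P\{(n-1)s_n^2 \le v \mid (k-1)s_k^2 = u\}
&= P(Y_1^2 + \dots + Y_{n-1}^2 \le v \mid Y_1^2 + \dots + Y_{k-1}^2 = u) \\
&= P\{(Y_1^2 + \dots + Y_{k-1}^2) + (Y_k^2 + \dots + Y_{n-1}^2) \le v \mid Y_1^2 + \dots + Y_{k-1}^2 = u\} \\
&= P(u + Y_k^2 + \dots + Y_{n-1}^2 \le v) = P(Y_k^2 + \dots + Y_{n-1}^2 \le v - u)
\end{align*}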
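The last step uses the fact that the increment (n-1)s_n² - (k-1)s_k² is a sum of squared Y's that is independent of the interim sum of squares. The simulation below is a rough check under assumed normal data; n = 20, k = 10, σ = 1, the seed, and the number of trials are hypothetical choices. It looks at whether the increment is nonnegative, averages about (n-k)σ², and is roughly uncorrelated with the interim value.

```python
import random
from statistics import mean, variance

rng = random.Random(1)  # fixed seed for reproducibility
N_FINAL, K_INTERIM, SIGMA = 20, 10, 1.0  # hypothetical design values
N_TRIALS = 5_000

interim_ss, increments = [], []
for _ in range(N_TRIALS):
    x = [rng.gauss(0.0, SIGMA) for _ in range(N_FINAL)]
    u = (K_INTERIM - 1) * variance(x[:K_INTERIM])  # (k-1) s_k^2
    v = (N_FINAL - 1) * variance(x)                # (n-1) s_n^2
    interim_ss.append(u)
    increments.append(v - u)                       # sum of Y_i^2 for i = k, ..., n-1


def correlation(a, b):
    """Sample Pearson correlation, written out to avoid version-specific helpers."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    ss_a = sum((p - mean_a) ** 2 for p in a)
    ss_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (ss_a * ss_b) ** 0.5


print("smallest increment:", min(increments))          # never negative
print("mean increment:", mean(increments))             # near (n - k) * sigma^2 = 10
print("correlation with interim SS:", correlation(interim_ss, increments))  # near 0
```

Zero correlation is weaker than independence, so this is only a plausibility check of the Helmert argument, not a substitute for it.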
Influences on Teaching
• I learned different lessons about teaching
from different professors
– Clarity and organization
• Dr. Leysieffer, Dr. Doss, Dr. Huffer
– How to derive things yourself
• My dad and the Indian connection (Drs. Basu and
Sethuraman)
– How to teach outside the box
• Dr. Zahn
Influences on Teaching
• A quincunx is a board with balls rolling down through a
triangular pattern of nails
– Left or right bounce at row i is -1 or +1
independent of outcomes of previous rows
– Each ball’s position at bottom represents sum
of n iid displacements
– Collection of balls in bins at bottom illustrates
distribution of sum
• Illustrates CLT if # rows large
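A quincunx is simple to simulate. The sketch below assumes a board with 12 rows and 10,000 balls (both hypothetical) and treats each bounce as an independent -1 or +1 with probability 1/2 each; it prints a crude text histogram of the bins.

```python
import random
from collections import Counter

rng = random.Random(2)  # fixed seed for reproducibility
N_ROWS, N_BALLS = 12, 10_000  # hypothetical board size and ball count


def drop_ball(n_rows, rng):
    """Final position of one ball: the sum of n_rows independent -1/+1 bounces."""
    return sum(rng.choice((-1, 1)) for _ in range(n_rows))


bins = Counter(drop_ball(N_ROWS, rng) for _ in range(N_BALLS))

# Crude text histogram of the bins at the bottom of the board
for position in sorted(bins):
    print(f"{position:+3d} {'#' * (bins[position] // 50)}")
```

With more rows the histogram gets closer to a bell shape, with variance equal to the number of rows.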
Influences on Teaching
• Can modify quincunx for non-iid rvs
• Permutation test in paired setting
T    C    Paired difference (T-C)
5    2    3
Influences on Teaching
• Can modify quincunx for non-iid rvs
• Permutation test in paired setting
C    T    Paired difference (T-C)
5    2    -3
Influences on Teaching
• Test statistic:
\[
S_n = \sum_i X_i, \qquad
X_i = \begin{cases} -d_i & \text{w.p. } 1/2 \\ +d_i & \text{w.p. } 1/2 \end{cases}
\]
• Sn is sum of independent, symmetric binary rvs
• Is Sn asymptotically normal?
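For small n the permutation distribution of S_n can be enumerated exactly, since swapping the T and C labels within a pair just flips the sign of that difference. This is a minimal sketch; apart from the first pair (5 - 2 = 3), the differences are made up for illustration.

```python
from itertools import product


def sign_flip_distribution(diffs):
    """All 2^n equally likely values of S_n, one per pattern of sign flips."""
    magnitudes = [abs(d) for d in diffs]
    return [
        sum(sign * m for sign, m in zip(signs, magnitudes))
        for signs in product((-1, 1), repeat=len(magnitudes))
    ]


def one_sided_p_value(diffs):
    """P(S_n >= observed sum) when each sign is flipped independently w.p. 1/2."""
    observed = sum(diffs)
    sums = sign_flip_distribution(diffs)
    return sum(s >= observed for s in sums) / len(sums)


# Hypothetical paired differences (T - C); only the first comes from the slide.
hypothetical_diffs = [3, 1, 4, -2, 5]
print(one_sided_p_value(hypothetical_diffs))  # 0.09375, i.e. 3 of 32 sign patterns
```

Enumeration costs 2^n evaluations, which is why the normal approximation matters for larger n.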
Influences on Teaching
• Think about modified quincunx where horizontal
distance between nails differs by row
• When might normality not hold?
• Suppose largest distance exceeds sum of all
other distances
• E.g., suppose d_1 > Σ_{i≥2} d_i
Very abnormal!
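When the largest distance exceeds the sum of the others, |S_n| can never fall below the difference between them, so S_n puts no mass near 0. The sketch below enumerates the support of S_n for a hypothetical set of row spacings (20 followed by six 1s).

```python
from itertools import product


def support_of_sum(distances):
    """Sorted distinct values of S_n = sum of +/- d_i over all sign patterns."""
    return sorted({
        sum(sign * d for sign, d in zip(signs, distances))
        for signs in product((-1, 1), repeat=len(distances))
    })


# Hypothetical spacings: the first exceeds the sum of all the others (20 > 6).
distances = [20, 1, 1, 1, 1, 1, 1]
support = support_of_sum(distances)
gap = distances[0] - sum(distances[1:])

print(support)
print("smallest |S_n|:", min(abs(s) for s in support), "gap bound:", gap)
```

The support splits into two clusters around -20 and +20 with nothing in between, so no amount of standardizing makes this look normal.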
Influences on Teaching
• The quincunx shows that some conditions are
needed on the di to conclude asymptotic
normality, but can noncomplying di arise as
realizations of iid random variables?
• Theorem: If the di are realizations from iid
random variables with finite variance, then with
probability 1,
\[
\frac{S_n}{\sqrt{\sum_{i=1}^{n} d_i^2}} \to N(0,1) \text{ in distribution}
\]
Influences Beyond Statistics
• Several professors helped me in ways that went
beyond statistics
– My dad
– Dr. Hollander
– Dr. Toler
– Dr. Zahn
• He drove me to the edge, but brought me back!
Unforgettable Quotes
• “This theorem is true only in general”
– My dad
• “Where is my duster”
– Dr. Basu
• “What belief, attitude, or position must have
been present…”
– Dr. Zahn
• “Bob’s your uncle”
– Dr. Meeter
• “One upon n” (for 1/n)
– Dr. Sethuraman