Math Stat Course:
Making Incremental Changes
Mary Parker
University of Texas at Austin
Intro to Mathematical Statistics
(M378K at University of Texas)
Prerequisite: Probability course (which is
required for all math majors)
 Students: Math majors, actuarial students,
other science majors
 Previous statistics courses? Some took
an applied stat course, either freshman
course or after probability course. Some
didn’t.

Math Stat Topics





Sampling distributions of statistics
Estimation of parameters: confidence intervals, method
of moments estimation, maximum likelihood estimation,
comparison of estimators using mean square error and
efficiency, sufficient statistics
Hypothesis tests: p-values, power, likelihood ratio tests
Distributions used include normal, binomial, Poisson,
uniform, gamma, beta, t, F, chi-squared, and other
standard distributions.
Other topics as time permits.
Some students took this course:
M358K Applied Statistics description
New course in the last five years
Prerequisite: Probability course. Taken by math majors
with concentration in secondary school teaching,
statistics, and some others. If they take both M358K and
M378K, they are encouraged to take M358K first.
Introducing this course has not decreased enrollment in
M378K, and some students who take the new course
without planning to take more statistics do go on to
M378K.
Questions
MAIN: How can a teacher who doesn't have the time/inclination to
completely revamp her course make incremental changes that will
better prepare students to understand and use contemporary
statistics techniques?
Preliminary:




What aspects of the reform of the first course are also appropriate for the math
stat course?
What should we preserve in the current math stat course so that it continues to
give mathematically sophisticated students a strong foundation in statistics?
What additional tools and techniques of theoretical statistics should be introduced
at this level?
Within twenty years, when all students will be using the equivalent of a
Mathematica-level program, what can/should we be teaching in theoretical
statistics courses?
Incrementally changing Math Stat

Focus on assumptions throughout.
 Check assumptions.
 Mention alternative techniques if assumptions not met.
 Discuss robustness of methods.
 Briefly introduce nonparametric statistics and
 Bayesian inference to illustrate different assumptions /
 framework.

Have students do explorations.
What explorations?
Main idea: Simulate and explore sampling distributions of
various statistics. Use to illustrate theoretical ideas and
to check on robustness of procedures.
Preliminary idea 1: Create a complete sampling
distribution themselves and check its properties to see
that they agree with the theoretical results.
Preliminary idea 2: Think of some interesting estimators to
investigate. (See that there are more possible
estimators for a parameter than the sample mean.)
Why explorations?

Explorations help make the theory
concrete

Robustness of statistical techniques: The
concept seems strange to math students
and they appreciate tools to explore it on
their own.
Simulate and explore a sampling
distribution
1. The population is the numbers of potatoes in a 5-lb sack of potatoes
   from a certain company. Assume the counts are distributed as
   discrete uniform, from 12 potatoes to 18 potatoes. Choose a
   reasonable sampling method and construct the sampling distribution
   of the sample mean for samples of size 2.
2. Find the mean and variance of the population and then find the
   mean and variance of the sampling distribution.
3. Comment on the results, based on your theoretical understanding
   from the formulas we proved about the mean and variance of a
   sample mean.
4. Discuss what would be different for samples of size 9.
5. Investigate the sampling distribution of the sample range.
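The exploration above can be sketched in a few lines of code. This is only a sketch under stated assumptions, not the course's own solution: it assumes sampling with replacement, treats (13,14) and (14,13) as distinct outcomes, and uses Python rather than MINITAB. The function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = range(12, 19)  # discrete uniform on 12..18 potatoes


def exact_sampling_distribution(statistic, n):
    """Enumerate all ordered samples of size n drawn WITH replacement
    (an assumption) and return {value of statistic: probability}."""
    samples = list(product(population, repeat=n))
    weight = Fraction(1, len(samples))
    dist = Counter()
    for s in samples:
        dist[statistic(s)] += weight
    return dict(dist)


def mean_and_variance(dist):
    """Mean and variance of a discrete distribution (denominator is N, not N-1)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var


def sample_mean(s):
    return Fraction(sum(s), len(s))


def sample_range(s):
    return max(s) - min(s)


pop_dist = {x: Fraction(1, 7) for x in population}
pop_mu, pop_var = mean_and_variance(pop_dist)
xbar_mu, xbar_var = mean_and_variance(exact_sampling_distribution(sample_mean, 2))
range_dist = exact_sampling_distribution(sample_range, 2)

print(pop_mu, pop_var)    # 15 4
print(xbar_mu, xbar_var)  # 15 2
```

The variance of the sample mean comes out as the population variance divided by 2, matching the formula for independent draws. For samples of size 9 the same enumeration would visit 7^9 = 40,353,607 ordered samples, which is one reason to switch to simulation at that point.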
Strategy




Given very early in the semester.
Student groups of 2-3.
Grading and instructions encourage students to
think about it over a couple of weeks without
spending much time on it at first, because
this assignment is not as well-defined as it looks
to many students.
Difficulties often encountered
Should (13,14) be a different element of
the sample space from (14,13)?
Should I sample with replacement or
without replacement? Why?
When computing the standard deviations
here, is the denominator n or n-1?

Extensions
Sampling without replacement: what
changes? What does that tell us about
the language/formulas of our text?
(independence of samples)
 Where could we find the equivalent
formulas to those in our text for sampling
without replacement? What’s different?
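One way to see what changes is to enumerate the without-replacement case and compare it with the finite population correction. The sketch below assumes a hypothetical finite population of exactly seven sacks holding 12 through 18 potatoes (the original exercise describes a distribution, not a finite set of sacks), and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

sacks = list(range(12, 19))  # hypothetical finite population of 7 sacks
N, n = len(sacks), 2

# All ordered samples of size n drawn WITHOUT replacement.
means = [Fraction(sum(s), n) for s in permutations(sacks, n)]
mu = sum(means) / len(means)
var_without = sum((m - mu) ** 2 for m in means) / len(means)

# Population variance (denominator N) and the finite population correction.
pop_mean = Fraction(sum(sacks), N)
sigma2 = sum((x - pop_mean) ** 2 for x in sacks) / N
var_formula = sigma2 / n * Fraction(N - n, N - 1)

print(mu, var_without, var_formula)  # 15 5/3 5/3
```

The mean of the sampling distribution is unchanged, but the variance drops from 2 (with replacement) to 5/3 because the draws are no longer independent.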

Constructing various estimators
“German Tank Problem”
Assume German tanks had consecutive ID numbers from 001 to ???.
We need to estimate the size of the population of German tanks
(the maximum ID in the population), based on the IDs from the
sample of tanks we have captured.
In groups, think of at least three different reasonable estimators. Then
draw a sample of size 5 from my “population of German tank IDs” in
the envelope. Give your three estimates.
Use a computer to simulate the three sampling distributions.
Strategy



Done in class before beginning to talk about
estimation.
Usually students will use (1) two times the
mean, (2) the maximum, and then, after a bit of
time, will come up with something else.
Students will need help simulating the sampling
distributions. Again, arrange the timing/grading
to encourage them to think about it and discuss
it before spending a lot of time doing it.
Difficulties in simulating sampling
distributions




How do you describe the original population to
the computer? (Discrete uniform on 1 to 600,
maybe)
Is it fairly easy to obtain a random sample from
that distribution in your software? (If not, find
other software!)
Distinguish between the sample size and the
number of points from the sampling distribution.
What should you do with the sampling
distribution?
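The points above can be made concrete with a short simulation. This is a minimal sketch, not the course's MINITAB assignment: it assumes the population is the IDs 1 to 600 (the "maybe" value above), samples without replacement, and uses `max * (n+1)/n - 1` as a stand-in for the third estimator, which the slides leave open. All names are illustrative.

```python
import random
import statistics

N_TRUE = 600         # assumed population maximum
SAMPLE_SIZE = 5      # tanks captured in each sample
REPLICATES = 10_000  # number of points in each simulated sampling distribution

estimators = {
    "2 * mean": lambda ids: 2 * statistics.mean(ids),
    "max": lambda ids: max(ids),
    # Illustrative third estimator; the slides do not specify one.
    "max * (n+1)/n - 1": lambda ids: max(ids) * (len(ids) + 1) / len(ids) - 1,
}


def simulate(rng, replicates=REPLICATES):
    """Draw `replicates` samples and record every estimator on each sample."""
    results = {name: [] for name in estimators}
    for _ in range(replicates):
        # rng.sample draws without replacement from the IDs 1..N_TRUE.
        ids = rng.sample(range(1, N_TRUE + 1), SAMPLE_SIZE)
        for name, estimator in estimators.items():
            results[name].append(estimator(ids))
    return results


results = simulate(random.Random(2004))
for name, values in results.items():
    print(f"{name:>18}: mean={statistics.fmean(values):7.1f}  "
          f"sd={statistics.stdev(values):6.1f}")
```

Keeping `SAMPLE_SIZE` and `REPLICATES` as separate named constants mirrors the distinction between the sample size and the number of points from the sampling distribution.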
Looking at sampling distributions




What should you look at to summarize a sampling dist’n?
(histogram, summary statistics)
Is it close to normally distributed? (Discuss normal
scores plots.)
(More advanced) Is it close to a __ dist’n? (Make
available information about probability plots in more
generality.)
If the statistic is unbiased, what characteristic will the
sampling dist’n have? (If yours doesn’t have the mean
exactly what it’s supposed to, is that because you made
an error? Why or why not?)
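A hedged sketch of these checks follows. It assumes an illustrative sampling distribution (means of 5 draws with replacement from a discrete uniform on 1 to 600), approximates normal scores with the plotting positions (i - 0.375)/(m + 0.25), which is one common convention, and reports the correlation from a normal scores plot rather than drawing it. The function names are hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def normal_scores_correlation(values):
    """Correlation between the sorted values and approximate normal scores.
    Values near 1 suggest the distribution is close to normal."""
    xs = sorted(values)
    m = len(xs)
    scores = [NormalDist().inv_cdf((i - 0.375) / (m + 0.25)) for i in range(1, m + 1)]
    mean_x, mean_s = statistics.fmean(xs), statistics.fmean(scores)
    sxy = sum((x - mean_x) * (s - mean_s) for x, s in zip(xs, scores))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sss = sum((s - mean_s) ** 2 for s in scores)
    return sxy / (sxx * sss) ** 0.5


def summarize(values, target):
    """Summary statistics plus a check of the simulated mean against `target`."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    mc_se = sd / len(values) ** 0.5  # Monte Carlo standard error of the mean
    return {
        "mean": mean,
        "sd": sd,
        "mc_se": mc_se,
        "z_vs_target": (mean - target) / mc_se,
        "normal_scores_r": normal_scores_correlation(values),
    }


rng = random.Random(378)
sim_means = [
    statistics.fmean([rng.randint(1, 600) for _ in range(5)])
    for _ in range(2000)
]
summary = summarize(sim_means, target=300.5)  # 300.5 is the population mean
for key, value in summary.items():
    print(f"{key:>16}: {value:.3f}")
```

The simulated mean will almost never equal the target exactly. A difference of a couple of Monte Carlo standard errors is what random simulation produces even for an unbiased statistic, so it is not by itself evidence of an error.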
Focus on Assumptions

Checking assumptions for typical normal-theory
techniques
 Already discussed normal probability plots
 Discuss what types of deviations from assumptions
cause problems for a particular technique and why
 In two-sample t procedures, help them see exactly
why the equal-variance assumption is more popular
among theorists than among those working in applications.

Robustness
 Central Limit Theorem. Explorations of various types
 of distributions – how large must n be?
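The "how large must n be?" question can be explored with a simulation along these lines. This sketch assumes an exponential population with rate 1 as one example of a skewed distribution, and uses the skewness of the simulated sample means as a rough measure of non-normality. Both choices are illustrative, not taken from the course materials.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def simulate_means(rng, n, replicates=5000):
    """Simulate `replicates` sample means, each from n exponential(1) draws."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]


rng = random.Random(2004)
skew_by_n = {}
for n in (2, 10, 30, 100):
    skew_by_n[n] = skewness(simulate_means(rng, n))
    print(f"n={n:>3}: skewness of sample means = {skew_by_n[n]:.2f}")
```

For this population the theoretical skewness of the sample mean is 2/sqrt(n), so the simulated values should shrink toward 0 as n grows. Swapping in a different population inside `simulate_means` lets students compare how quickly different shapes approach normality.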
Focus on Assumptions II

Nonparametric techniques



Sign test, signed rank test, and rank-sum test
Compare results with those from t-test for some examples to
further illustrate conditions for robustness of t-tests
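A small comparison along these lines might look like the following sketch. The paired differences are invented for illustration (nine small positive values and one large negative outlier), and only the t statistic is computed, since its p-value needs a t table or statistical software.

```python
import statistics
from math import comb, sqrt


def sign_test_p_value(diffs):
    """Exact two-sided sign test of H0: median difference = 0.
    Differences equal to zero are dropped."""
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    positives = sum(d > 0 for d in nonzero)
    tail = min(positives, n - positives)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


def t_statistic(diffs):
    """One-sample t statistic for H0: mean difference = 0."""
    n = len(diffs)
    return statistics.fmean(diffs) / (statistics.stdev(diffs) / sqrt(n))


# Hypothetical paired differences with a single large outlier.
diffs = [1.2, 0.8, 1.5, 0.9, 1.1, 0.7, 1.3, 1.0, -9.5, 1.4]

p_sign = sign_test_p_value(diffs)
t_stat = t_statistic(diffs)
print(f"sign test p-value: {p_sign:.4f}")  # 0.0215
print(f"t statistic:       {t_stat:.3f}")
```

Nine of the ten differences are positive, so the sign test rejects at the 5% level, while the single outlier inflates the standard deviation enough to push the t statistic close to zero.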
Bayesian statistics



Very brief introduction, contrasting assumptions of frequentist
and Bayesian approaches
Do examples from binomial or normal with conjugate priors and
indicate that choosing the prior mean and variance gives quite a
lot of flexibility
Mention that using more general, non-conjugate priors leads to
the need for more computationally-intensive methods
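The binomial case with a conjugate beta prior can be sketched as follows. The prior mean and variance and the data (7 successes in 10 trials) are invented for illustration, and the function names are hypothetical. The code relies on the standard facts that a Beta(a, b) prior has mean a/(a+b) and variance m(1-m)/(a+b+1), and that the posterior after k successes in n trials is Beta(a+k, b+n-k).

```python
from fractions import Fraction


def beta_from_mean_var(mean, var):
    """Solve for Beta(a, b) parameters from a chosen prior mean and variance."""
    total = mean * (1 - mean) / var - 1  # this is a + b
    if total <= 0:
        raise ValueError("variance too large for a beta prior with this mean")
    return mean * total, (1 - mean) * total


def posterior(a, b, successes, trials):
    """Conjugate update for binomial data with a Beta(a, b) prior."""
    return a + successes, b + trials - successes


def beta_mean(a, b):
    return a / (a + b)


# Illustrative prior: mean 1/2, variance 1/20, which gives Beta(2, 2).
a0, b0 = beta_from_mean_var(Fraction(1, 2), Fraction(1, 20))
a1, b1 = posterior(a0, b0, successes=7, trials=10)

print(a0, b0)             # 2 2
print(a1, b1)             # 9 5
print(beta_mean(a1, b1))  # 9/14
```

Trying several prior means and variances with the same data shows how much flexibility the conjugate family offers and how the posterior mean moves between the prior mean and the sample proportion.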
Actual assignments



Construct a sampling distribution
German tank problem
Simulating sampling distributions in MINITAB
Find the actual assignments and supporting
material at the website listed on the handout for
this session
http://www.ma.utexas.edu/users/parker/jsm04/