Probability distributions - CensusAtSchool New Zealand


Probability
distributions
AS91586 Apply probability distributions in solving
problems
NZC level 8
• Investigate situations that involve elements of
chance
• calculating and interpreting expected values
and standard deviations of discrete random
variables
• applying distributions such as the Poisson,
binomial, and normal
AS91586 Apply probability distributions in solving problems
• Methods include a selection from those related to:
• discrete and continuous probability distributions
• mean and standard deviation of random variables
• distribution of true probabilities versus distribution
of model estimates of probabilities versus
distribution of experimental estimates of
probabilities.
AO8-4 TKI
Calculating and interpreting expected values and standard deviations of
discrete random variables:
A statistical data set may contain discrete numerical variables. These have
frequency distributions that can be converted to empirical probability
distributions. Distributions from both sources have the same set of possible
features (centre, spread, clusters, shape, tails, and so on) and we can calculate
the same measures (mean, SD, and so on) for them.
• Makes a reasonable estimate of mean and standard deviation from a plot of
the distribution of a discrete random variable.
• Solves and interprets solutions of problems involving calculation of mean,
variance and standard deviation from a discrete probability distribution.
• Solves and interprets solutions of problems involving linear transformations
and sums (and differences) of discrete random variables.
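As an illustration of the calculations listed above, here is a minimal Python sketch. The distribution (number of heads in two fair coin tosses) and the transformation Y = 3X + 2 are hypothetical examples, not taken from the standard.

```python
from math import sqrt

# Hypothetical distribution: X = number of heads in two fair coin tosses.
dist = {0: 0.25, 1: 0.5, 2: 0.25}

def mean(d):
    """Expected value: sum of x * P(X = x)."""
    return sum(x * p for x, p in d.items())

def variance(d):
    """Variance: sum of (x - mean)^2 * P(X = x)."""
    mu = mean(d)
    return sum((x - mu) ** 2 * p for x, p in d.items())

mu = mean(dist)
var = variance(dist)
sd = sqrt(var)

# Linear transformation Y = 3X + 2 (hypothetical):
# E(Y) = 3 E(X) + 2 and SD(Y) = 3 SD(X).
y_dist = {3 * x + 2: p for x, p in dist.items()}

print(f"E(X) = {mu}, Var(X) = {var}, SD(X) = {sd:.3f}")
print(f"E(Y) = {mean(y_dist)}, SD(Y) = {sqrt(variance(y_dist)):.3f}")
```

Computing the transformed distribution directly lets students check the rules for linear transformations rather than take them on trust.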
Applying distributions such as the Poisson, binomial, and normal:
• They learn that some situations that satisfy certain conditions can be
modelled mathematically. The model may be Poisson, binomial,
normal, uniform, triangular, or others, or be derived from the
situation being investigated.
• Recognises situations in which probability distributions such as
Poisson, binomial, and normal are appropriate models,
demonstrating understanding of the assumptions that underlie the
distributions.
• Selects and uses an appropriate distribution to model a situation in
order to solve a problem involving probability.
• Selects and uses an appropriate distribution to solve a problem,
demonstrating understanding of the link between probabilities and
areas under density functions for continuous outcomes (for example,
normal, triangular, or uniform, but nothing requiring integration).
• Selects and uses an appropriate distribution to solve a
problem, demonstrating understanding of the way a
probability distribution changes as the parameter values
change.
• Selects and uses an appropriate distribution to solve a
problem involving finding and using estimates of parameters.
• Selects and uses an appropriate distribution to solve a
problem, demonstrating understanding of the relationship
between true probability (unknown and unique to the
situation), model estimates (theoretical probability) and
experimental estimates.
• Uses a distribution to estimate and calculate probabilities,
including by simulation.
AS 3.14 summary
• Includes expected value and standard deviation (and
variance).
• Includes sums and differences (and linear combinations) of
random variables.
• Includes binomial, Poisson and normal, but also includes
uniform, triangular distributions and experimental
distributions.
• Requires consideration of context as well as appearance of
the distribution when selecting a model.
Looking at distributions
(simulated normal distribution)
• Small samples do not
always have distributions
like the population they
come from.
• When looking at
distributions, a sample of
30 is much too small to
give a good picture of the
whole population
distribution.
Looking at distributions
(simulated normal distribution)
• Large samples do have
distributions like the
population they come
from.
• When looking at
distributions, a sample of
about 200 is sufficient to
give a picture of the whole
population distribution.
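The contrast between samples of 30 and 200 can be explored with a short simulation. This is a sketch only: the standard normal population, the seed, and the bin width of 0.5 are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def text_histogram(sample, bin_width=0.5):
    """Return a plain-text histogram, one row per bin (label = left edge)."""
    bins = Counter(int(x // bin_width) for x in sample)
    rows = []
    for b in range(min(bins), max(bins) + 1):
        rows.append(f"{b * bin_width:5.1f} | {'*' * bins[b]}")
    return "\n".join(rows)

rng = random.Random(1)  # fixed seed so the run is repeatable
small = [rng.gauss(0, 1) for _ in range(30)]
large = [rng.gauss(0, 1) for _ in range(200)]

print("n = 30")
print(text_histogram(small))
print("n = 200")
print(text_histogram(large))
```

Changing the seed and rerunning shows how much the shape of the small sample varies from run to run compared with the large one.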
Estimating mean and
standard deviation
To estimate mean and standard deviation,
students need to know that:
• The mean is pulled towards extreme values
• The SD is stretched by extreme values
If the distribution is approximately normal, the
mean is in the middle, and the SD is roughly 1/6 of
the range (99.7% within μ ± 3σ).
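The range/6 rule of thumb rests on how much of a normal distribution lies within 3 SD of the mean. A quick check with Python's standard library is sketched below; the helper sd_from_range and the example range of 140 to 200 are hypothetical.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

# Share of a normal distribution within 1, 2 and 3 SD of the mean.
within = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}
for k, share in within.items():
    print(f"within {k} SD: {share:.1%}")  # 68.3%, 95.4%, 99.7%

def sd_from_range(lowest, highest):
    """Rough SD estimate for approximately normal data: range / 6."""
    return (highest - lowest) / 6

# Hypothetical data running from about 140 to about 200.
print(sd_from_range(140, 200))  # 10.0
```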
Estimating mean and standard
deviation for any distribution
Estimating the mean:
• Estimate the median and adjust towards
extreme values.
Estimating the standard deviation:
• Estimate the median distance from the mean
and adjust it (stretch it if there are extreme
values).
Estimate the mean and standard deviation of the age
of students completing the census@school survey.
Mean = 12.3 years
SD = 1.8 years
Words remembered in Kim’s Game
Mean = 13.1
SD = 2.4
Mean = 9.0
SD = 2.8
Text messages sent in a day by stage one
university students
Mean = 38 messages
SD = 57 messages
Number of pairs of shoes owned by stage one university students
Mean = 10.4 pairs
SD = 8.9 pairs
Words memorised with music
(graph: frequency against number of words, 1 to 10)
Mean = 5.9 words
SD = 2.5 words
Words memorised without music
(graph: frequency against number of words, 1 to 10)
Mean = 7.0 words
SD = 2.3 words
Introducing distributions
How do you introduce:
• Binomial
• Poisson
• Normal
• Uniform
• Triangular distributions?
Introducing the binomial distribution
• Combinations and permutations are still in the
curriculum, so you can still teach them if you
want to.
• You can teach the binomial distribution
without using combinations by using trees.
Introduce the binomial distribution as a
shortcut for complicated trees.
Chuck-a-luck
• A gambling game played at carnivals, played
against a banker.
• A player pays a dollar to play and rolls 3 dice.
• If no 4s are rolled, the player loses.
• Otherwise the player gets back one dollar for
every 4 rolled and gets their original dollar
back.
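The rules above can be turned into a probability distribution for the player's net winnings. The sketch below assumes three fair, independent dice, so the number of 4s is binomial with n = 3 and p = 1/6; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 3                 # dice rolled
p = Fraction(1, 6)    # chance of a 4 on one fair die

# Binomial probabilities for the number of 4s rolled.
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# Net winnings in dollars: lose the $1 stake with no 4s,
# otherwise win $1 per 4 (the stake is returned).
net = {0: -1, 1: 1, 2: 2, 3: 3}

expected = sum(net[k] * pmf[k] for k in pmf)

for k in pmf:
    print(f"{k} fours: P = {pmf[k]}, net = {net[k]}")
print(f"Expected net winnings per game: {expected} = {float(expected):.3f}")
```

Under these assumptions the expected value is -17/216, a loss of about 8 cents per game, which gives students a concrete reason to care about expected values.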
Introducing the binomial distribution
Once students see the pattern emerge, they can start to generalise it, using
Pascal’s triangle or an understanding of combinations to get the coefficients.
For some students, it may be enough to know that the calculator is a shortcut
method for working out probabilities from trees like these.
Poisson distributions
• Hokey Pokey ice-cream
– is Tip Top really the best?
• Choc chip cookies: number of
choc chips visible on an area of
cookie (Farmbake Triple Choc
works well - do white chips and
dark chips separately).
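Once chip counts have been collected, a Poisson model can be fitted by using the sample mean as the estimate of the parameter. The sketch below uses made-up counts (chip_counts is hypothetical data, not from a real cookie) and assumes chips are scattered independently at a constant average rate.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

# Hypothetical counts of visible chips on ten equal areas of cookie.
chip_counts = [2, 4, 3, 1, 5, 3, 2, 4, 3, 3]

# Estimate the Poisson parameter with the sample mean.
lam_hat = sum(chip_counts) / len(chip_counts)

p_none = poisson_pmf(0, lam_hat)
p_at_least_5 = 1 - sum(poisson_pmf(k, lam_hat) for k in range(5))

print(f"estimated mean = {lam_hat}")
print(f"P(no chips) = {p_none:.3f}")
print(f"P(5 or more chips) = {p_at_least_5:.3f}")
```

Comparing these model probabilities with the observed proportions is one way to discuss how well the Poisson model fits.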
Discrete uniform and
triangular distributions
Uniform: roll of one die
Triangular: Sum of two dice
(Graphs: one die, with a probability axis from 0 to 0.4; sum of two dice, with frequency from 0 to 25 against dice sum.)
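The model distributions for one die and for the sum of two dice can be built by listing equally likely outcomes. This sketch assumes fair six-sided dice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Uniform: each face of one fair die has probability 1/6.
one_die = {face: Fraction(1, 6) for face in faces}

# Triangular: count the 36 equally likely (first, second) pairs by their sum.
counts = Counter(a + b for a, b in product(faces, repeat=2))
two_dice = {total: Fraction(c, 36) for total, c in sorted(counts.items())}

for total, prob in two_dice.items():
    print(f"{total:2d}: {'#' * counts[total]} {prob}")
```

The printed bars rise to a peak at 7 and fall away symmetrically, which is the triangular shape students can compare with their experimental frequencies.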
Continuous probability
graphs
What are the units on the vertical axis for a
continuous probability function?
Continuous probability graphs are
probability density functions
The vertical axis measures probability per unit of x, which is
called probability density.
Probability density is only meaningful in terms of area.
bus waiting time (1)
The downtown inner link bus in Auckland arrives
at a stop every ten minutes, but has no set times.
If I turn up at the bus stop, how long will I expect
to wait for a bus?
What will the distribution of wait times look like?
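(Graphs: three candidate shapes labelled a, b and c; 0.1 marked on the vertical axis, 0 and 10 on the horizontal axis.)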
Which is more likely: a wait of between 2 and
5 minutes, or a wait of more than 6 minutes,
measured to the nearest minute?
(Graph: 0.1 marked on the vertical axis; horizontal axis from 0 to 10.)
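If the wait is modelled as uniform on 0 to 10 minutes (an assumption that matches a bus every ten minutes with no set times), each probability is the area of a rectangle of height 0.1. The sketch below also shows one reading of "measured to the nearest minute", in which a recorded 2 to 5 minutes means a true wait of 1.5 to 5.5 minutes and a recorded wait of more than 6 minutes means a true wait above 6.5 minutes; that reading is an assumption.

```python
def uniform_prob(a, b, low=0.0, high=10.0):
    """P(a < X < b) for X uniform on [low, high]: width times height."""
    a, b = max(a, low), min(b, high)
    return max(b - a, 0.0) / (high - low)

# Exact (continuous) waits.
p_2_to_5 = uniform_prob(2, 5)        # 0.3
p_over_6 = uniform_prob(6, 10)       # 0.4

# Waits recorded to the nearest minute (assumed reading, see above).
p_2_to_5_rounded = uniform_prob(1.5, 5.5)   # 0.4
p_over_6_rounded = uniform_prob(6.5, 10)    # 0.35

print(p_2_to_5, p_over_6)
print(p_2_to_5_rounded, p_over_6_rounded)
```

Under these assumptions the answer to "which is more likely?" reverses once rounding is taken into account, which makes a useful discussion point.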
Bus waiting time (2)
• My own bus route (277) runs only every half
hour, and isn’t as reliable as the inner link.
• I know that the bus is most likely to appear on
time, but could in fact turn up at any time
between the time it is due and half an hour
later.
What is the best model for wait time, given the
available information?
In the real world:
Uniform models are used for modelling
distributions when the only information you have
is the maximum and minimum.
Triangular models are used for modelling
distributions when the only information you have
is the maximum, minimum and average (which could
be the mode).
(Graphs: three candidate shapes labelled a, b and c.)
What is the probability that I will have to wait
longer than 20 minutes for a bus?
(Graph: 1/15 marked on the vertical axis; horizontal axis from 0 to 30.)
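One possible model for this question is a triangular density with its peak at 0 minutes (the bus is most likely on time) falling in a straight line to zero at 30 minutes. That shape is an assumption; with it, the peak height must be 2/30 = 1/15 so that the total area is 1, and the probability of a long wait is the area of the small triangle to the right of the chosen time.

```python
from fractions import Fraction

def prob_wait_longer(t, max_wait=30):
    """P(wait > t) for a triangular density peaking at 0, zero at max_wait."""
    peak = Fraction(2, max_wait)                         # height at 0 minutes
    height_at_t = peak * Fraction(max_wait - t, max_wait)
    # Area of the triangle between t and max_wait: half base times height.
    return Fraction(1, 2) * (max_wait - t) * height_at_t

print(prob_wait_longer(20))  # 1/9
print(prob_wait_longer(0))   # 1
```

No integration is needed: every probability is found from the area of a triangle.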
My interpretation of AS 3.14
• Expect more questions giving experimental
data to be fitted to a theoretical model.
• Expect more evaluation of how well a
theoretical model fits experimental data.
• Expect more interpretation of the application
of a model in context.
Teaching and learning
Students should:
• record their hunch every time they start an
investigation, and compare the results to their
hunch.
• always consider the context and the distribution
they would expect in that context, as well as their
observations of the data available.
• estimate the mean and the standard deviation
every time they look at a distribution (write down
the estimate, then check to see how close they
were).
Learning could start with:
• Questions to investigate, and gathering data:
What is the probability that at least 4 people in a
class have the same birth month?
• Data in tables: which distribution (if any) would
you use to model it? Estimate probabilities.
• Data in graphs: estimate mean and standard
deviation, which distribution (if any) would model
it? Estimate probabilities.
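The birth month question can be investigated by simulation before or after gathering class data. The sketch below assumes a class of 30 and that all 12 months are equally likely; both are simplifying assumptions, and the function name is illustrative.

```python
import random
from collections import Counter

def estimate_shared_month(class_size=30, at_least=4, trials=10_000, seed=0):
    """Estimate P(some month is shared by at least `at_least` students)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Give each student a random birth month, coded 0 to 11.
        months = Counter(rng.randrange(12) for _ in range(class_size))
        if max(months.values()) >= at_least:
            hits += 1
    return hits / trials

print(estimate_shared_month())
print(estimate_shared_month(class_size=15))
```

Students can record their hunch first, then compare it with the simulated estimate and with the result from their own class.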
A learning activity
• From Teaching Statistics: a bag of tricks
(Gelman and Nolan)
What do you notice?
• Students tend to group their guessed
histogram into large groups.
• Different bin widths will give different
estimates of probability.
• What else do you notice?
Misunderstanding of
probability may be the greatest
of all impediments to scientific
literacy.
Stephen J Gould