AS91586 Probability distributions (3.14) Anna Martin
Avondale College
What are the big ideas and how do we effectively teach them?
Uncertainty... it’s all around us - deal with it!
★ Resources that are already available
http://new.censusatschool.org.nz/resources/3-14/
★ 2013 and 2014 NZQA exams
http://www.nzqa.govt.nz/ncea/assessment/viewdetailed.do?standardNumber=91586
Starter sketches...
Number 1 - 5 in your books.
Ready??
For each distribution, you will get some
descriptions of its features.
You have to use these descriptions to try to
sketch the shape of the distribution. You will
need to draw an x/horizontal axis WITH the
correct variable and an attempt at the scale.
Distribution 1
My distribution is of ages in years e.g. 2.7 years
I am symmetrical.
Sketch me!
I have a mean and median of 10 years.
I am uniformly distributed.
My outcomes lie between 0 and 20 years inclusive.
Distribution 2
My distribution is of heights of flowers in cm.
I am symmetrical.
Sketch me!
I have a mean and median of 50 cm.
I am normally distributed.
My standard deviation is around 10 cm.
Distribution 3
My distribution is of lengths of hair in cm.
I am bimodal.
Sketch me!
I have a small peak at 5 cm and a larger peak at 25
cm.
My outcomes pretty much range between 0 cm and
30 cm, although a few people have longer hair than
30 cm.
Distribution 4
My distribution is of hours worked per week.
I am negatively skewed.
Sketch me!
My median (and peak) is at 40 hours, but my mean
is lower at 35 hours.
My outcomes pretty much range between 0 and 60
hours; working longer than 60 hours per week is
unlikely.
Distributions 5
Sketch us!
We are both distributions for the amount of water
consumed in mL. We are both normally distributed.
For athletes, my distribution has a mean of around
1500 mL and a standard deviation of around 200 mL.
For non-athletes, my distribution has a mean of
around 1200 mL and a standard deviation of around
400 mL.
Let’s see how you went!
Distribution 1
My distribution is of ages in years.
I am symmetrical.
I have a mean and median of 10 years.
I am uniformly distributed.
My outcomes lie between 0 and 20 years inclusive.
(Sketch with horizontal axis: Age (in years), marked at 0, 10 and 20.)
Distribution 2
My distribution is of heights of flowers in cm.
I am symmetrical.
I have a mean and median of 50 cm.
I am normally distributed.
My standard deviation is around 10 cm.
(Sketch with horizontal axis: Heights (in cm), marked from 20 to 80 in steps of 10.)
Distribution 3
My distribution is of lengths of hair in cm.
I am bimodal.
I have a small peak at 5 cm and a larger peak at 25 cm.
My outcomes pretty much range between 0 cm and 30 cm, although a few people
have longer hair than 30 cm.
(Sketch with horizontal axis: Hair lengths (in cm), marked from 0 to 30 in steps of 5.)
Distribution 4
My distribution is of hours worked per week.
I am negatively skewed.
My median (and peak) is at 40 hours, but my mean is lower at 35 hours.
My outcomes pretty much range between 0 and 60 hours; working longer than
60 hours per week is unlikely.
(Sketch with horizontal axis: Hours worked (per week), marked from 0 to 60 in steps of 10.)
Distributions 5
We are both distributions for the amount of water consumed in mL. We are both
normally distributed.
For athletes, my distribution has a mean of around 1500 mL and a standard
deviation of around 200 mL.
For non-athletes, my distribution has a mean of around 1200 mL and a standard
deviation of around 400 mL.
The non-athletes' peak is
lower because they are
more spread out (imagine
an ice-cream melting...)
(Sketch with horizontal axis: Water consumed (mL), marked from 0 to 2400 in steps of 400.)
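The "melting ice-cream" idea can be checked numerically. The following is a minimal sketch, assuming the two models are exactly normal with the means and standard deviations given in the descriptions; the function name `normal_pdf` is illustrative only. It computes the height of each curve at its own mean, which for a normal distribution is 1 / (sd × √(2π)).

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

# Parameters taken from the Distributions 5 descriptions (treated as exact here).
athletes = {"mean": 1500, "sd": 200}
non_athletes = {"mean": 1200, "sd": 400}

# A normal curve is tallest at its mean.
peak_athletes = normal_pdf(athletes["mean"], **athletes)
peak_non_athletes = normal_pdf(non_athletes["mean"], **non_athletes)

print(f"Athletes peak height:     {peak_athletes:.5f}")
print(f"Non-athletes peak height: {peak_non_athletes:.5f}")
print(f"Ratio of peak heights:    {peak_athletes / peak_non_athletes:.1f}")
```

Because both curves enclose the same total area of 1, doubling the standard deviation halves the peak height, so the non-athletes curve should be sketched wider and about half as tall.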
What are the foundation ideas from Level 2?
AS91267 Probability
Models - what are the uniform and
normal distributions?
Features of distributions - what
are the key features of experimental or
sampling distributions?
Locating values in a distribution - how do you use theoretical models or
experimental distributions to estimate
probabilities?
Comparing models with sample
data - do I have enough data to
identify the features of the random
variable?
Expectation - what would be typical
values for this distribution (middle
80%)? What would be unlikely? How
many times would I expect to see this
happen?
AS91268 Simulations
Randomness - what specifically is the
random process for the situation?
Independence - what specific things
are you assuming don’t influence each
other? Why does independence matter?
Probabilities given - why are you
assuming these will stay the same? Will
these always stay the same? Will things
run out or change? How would this
affect your simulation?
Number values given - why are you
assuming these will stay the same? Will
these always be the same? Could they
be higher or lower? How would this
affect your simulation?
Estimates - why can’t you answer the
problem with an exact number?
What are the key concepts for Level 3 probability
distributions?
Model (theoretical)
Random variable
Solve problems involving uncertainty
Working with theoretical distributions or models...
★ Using contextual clues/information to select an
appropriate model
★ Justifying the selection of the model
Consider how you can get students using distributions to solve problems sooner,
rather than initially getting stuck on calculating probabilities and
navigating the graphics calculator.
NZQA 2014
What are the key concepts for Level 3 probability
distributions?
Model (theoretical)
Random variable
Solve problems involving uncertainty
Experimental (simulated or sample data)*
Beliefs, misconceptions or claims
★ A certain teacher is slightly
obsessed with the TV show
“The Block”
★ She notices that every year a
big deal is made about the
auction order
★ Watch this video from the
most recent series of “The
Block NZ” and consider the
question below
YouTube video link (start at 14:40)
What do the presenter and contestants seem to believe
about the auction order and winning the competition?
Initial exploration of data
This same teacher has
looked at the Australian and
NZ versions of “The Block”
and recorded who won
each series and which
order they went in the
auction.
There have been 12 series of “The Block” in Australia
and New Zealand. What proportion of these series did the
team that went 1st or 4th in the auction win?
Considering the random variable and expectations
If the auction order doesn’t
make a difference (and no
other factors influenced
who was the winner - a big
assumption!), then the
theoretical probability
distribution could be
uniform. This means each
different auction order has a
25% chance of resulting in
a win.
How many times out of 12 would you expect the team that
went first to win using the uniform distribution?
Assessing reasoning skills
Ben and Quinn (from the
NZ 2014 series) say that
the past results show that
going 1st is the way to win
the whole competition,
because 5 out of 12 times
the couple that went first
won the competition!
Explain to Ben and Quinn in simple terms that they can
understand why this may not be very good reasoning :-)
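One way to put numbers behind this explanation is sketched below. It assumes the uniform model described earlier (a 25% chance of winning for the team going first) and, as a further assumption, that the 12 series are independent, so the number of wins for the team going first follows a binomial distribution with n = 12 and p = 0.25. The helper `binomial_pmf` is an illustrative name, not something from the original slides.

```python
from math import comb

n_series = 12   # series of "The Block" in the teacher's data
p_win = 0.25    # chance that the team going first wins, under the uniform model

# Expected number of wins for the team going first.
expected_wins = n_series * p_win

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of seeing 5 or more wins out of 12 by chance alone.
p_five_or_more = sum(binomial_pmf(k, n_series, p_win) for k in range(5, n_series + 1))

print(f"Expected wins for the team going first: {expected_wins}")
print(f"P(5 or more wins out of 12) = {p_five_or_more:.3f}")
```

Under these assumptions the expected count is 3 wins, and 5 or more wins turns up by chance in roughly one set of 12 series in six, so 5 out of 12 is not strong evidence that going first helps.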
Visualising chance variation
This same teacher ran a
simulation to investigate
how many times certain
auction orders win out of
12 seasons, if the chance
of winning was 25% for
each auction order. This is
an animated gif of 10 of
these simulations.
How many times does a result of 5 or more wins come up
(regardless of auction order)?
How could you use these graphs to explain to the producers of “The
Block” why there may be insufficient evidence to support a claim
that the order of the auction makes a difference to who wins?
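For teachers who want to regenerate graphs like these, here is a minimal sketch of such a simulation. It is not the teacher's original code: the function name `simulate_wins`, the seed, and the choice of Python's `random` module are assumptions for illustration. Each season's winner is drawn with equal chance from the four auction positions, and the wins per position are counted over 12 seasons.

```python
import random
from collections import Counter

random.seed(1)  # arbitrary seed so the sketch gives the same results each run

def simulate_wins(n_seasons=12, n_positions=4):
    """Return the number of wins for each auction position over n_seasons."""
    winners = [random.randint(1, n_positions) for _ in range(n_seasons)]
    counts = Counter(winners)
    return [counts.get(pos, 0) for pos in range(1, n_positions + 1)]

# Ten simulated sets of 12 seasons, like the ten frames of the animated gif.
simulations = [simulate_wins() for _ in range(10)]

for wins in simulations:
    note = "  <- some position won 5 or more times" if max(wins) >= 5 else ""
    print(f"wins by auction position 1-4: {wins}{note}")
```

Counting how many of the ten rows are flagged gives a quick feel for how often a result as lopsided as 5 wins appears when auction order makes no difference at all.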
Identifying likely and unlikely outcomes in the
distribution
This same teacher ran a computer simulation (1000 trials), assuming that going 1st
has a fixed chance of winning (p = 0.25), to investigate by chance alone the variation
in how many times the team going 1st would come up the winner in sets of 12 (to mimic
the 12 different series of “The Block”).
In how many seasons would you expect the team that went first to win by chance alone?
What would be an unlikely outcome(s)?
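A simulation of this kind can be sketched as follows. This is an illustrative reconstruction under the stated assumptions (p = 0.25, 12 independent series per trial, 1000 trials), not the teacher's actual program; the function name `wins_for_first` and the seed are hypothetical.

```python
import random
from collections import Counter

random.seed(2014)  # arbitrary seed so the sketch is repeatable

def wins_for_first(n_series=12, p=0.25):
    """Number of series (out of n_series) won by the team going first."""
    return sum(random.random() < p for _ in range(n_series))

# 1000 trials, each mimicking 12 series of the show.
trials = [wins_for_first() for _ in range(1000)]
tally = Counter(trials)

# Text version of the experimental distribution.
for wins in range(13):
    share = tally.get(wins, 0) / len(trials)
    print(f"{wins:2d} wins: {share:6.1%} {'#' * round(share * 50)}")

mean_wins = sum(trials) / len(trials)
share_five_plus = sum(1 for t in trials if t >= 5) / len(trials)
print(f"Mean wins per set of 12: {mean_wins:.2f}")
print(f"Share of trials with 5 or more wins: {share_five_plus:.1%}")
```

The tallest bars show the likely outcomes, and the bars that barely register show the unlikely ones; students can then decide where 5 wins sits in that picture.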
Considering chance variation as part of solving a
problem
Ben and Quinn (from the NZ 2014 series) say that the past
results show that going 1st is the way to win the whole
competition, because 5 out of 12 times the couple that went
first won the competition!
Explain to Ben and Quinn using the experimental distribution
why this is not very good reasoning :-)
Can you tell young Paul Rudd from old Paul Rudd?
★ Paul Rudd (an actor) never
seems to actually age!
★ In the following quiz, you'll
see eight pairings of
pictures of Rudd at different
ages — some are four
years apart, some are
eleven, some are
somewhere in between —
and you have to guess in
which of the two he is older.
★ Ready?
Link to source material
NZQA 2013 Question 3(c)
Which is the older Paul Rudd? Left or Right
(asked for each of the eight picture pairs)
Let’s look at the results!
For each pair, record if you correctly identified the older Paul
Rudd :-)
Can you tell older Paul Rudd from younger Paul Rudd?
How many did you get right?
What is the theoretical probability of someone getting a successful outcome for
each trial through guessing?
To apply the binomial distribution to this situation, what conditions do you
need (or need to assume)?
Using the binomial distribution, is there sufficient evidence that you can tell
"old Paul Rudd" from "young Paul Rudd"?
This could be extended by
providing students with data
from the website on which this
survey was initially conducted.
Does this data suggest people
were just guessing?
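The binomial question can be explored with a short calculation. This is a minimal sketch that assumes the conditions students are asked to identify: eight independent pairs, two outcomes per pair, and a fixed 0.5 chance of a correct answer when purely guessing. The function name `prob_at_least` is illustrative.

```python
from math import comb

n_pairs = 8     # picture pairs in the quiz
p_guess = 0.5   # chance of a correct answer on each pair when guessing

def prob_at_least(k, n=n_pairs, p=p_guess):
    """Probability of k or more correct answers out of n by guessing alone."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

for correct in range(n_pairs + 1):
    print(f"P(at least {correct} correct by guessing) = {prob_at_least(correct):.4f}")
```

Each person can look up their own score in the output. A score that a guesser would reach or beat fairly often is weak evidence of real skill, while a score that guessing would rarely reach is more convincing.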
Selecting models using data and conditions...
Summary sheet
Note: Sometimes we use probability distribution models
because they are “fit for purpose” even if they do not
technically meet all of the mathematical conditions.
This is often the case with the Poisson distribution, because
one of its properties is that as the mean of the random
variable increases, the variation also tends to increase - this
is a common feature of data and random processes.
We also need to keep in mind how much data we have, and
remind ourselves (and our students) about all the thinking
we have around sample-to-population inferences...
World War 2 London bombing
During World War II, London was assaulted
with German flying-bombs and V-2 rockets.
The British were interested in whether or not
the Germans could actually target their bomb
hits or were limited to random hits with their
flying-bombs.
Based on the work by the British statistician
R. D. Clarke, “An Application of the Poisson
Distribution” (1946)
World War 2 London bombing
This analysis was very important. If the Germans
could only hit targets at random, then spreading
the various security installations throughout the
countryside would protect them quite well, because
random bombing over a wide area was unlikely to hit
any given target. However, if the Germans could actually
target their flying-bombs, then the British were
faced with a more potent opponent and spreading out
the security installations would do little to protect
them.
World War 2 London bombing
The British mapped off the central 24 km by 24 km region of
London into 1/2 km by 1/2 km square areas. Then they recorded
the number of bomb hits, noting their location, and this data is in
the following table:
No. bombs    0     1     2     3     4     5 or over
No. areas    229   221   93    35    7     1
Imagine that you are a young Lieutenant in His Majesty's
Service. You are charged with ascertaining if the British are up
against an adversary who can target their flying-bombs or one
who can only randomly toss these bombs at London.
Considering data and theoretical distributions...
The data collected (n = 586)
Estimate the mean number of bombs per area
Check your estimate with a calculation
Why could we use Poisson
to model this random
variable and what would be
its parameter?
Plot the values from the model distribution on the graph
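The calculation and the model values for the graph can be sketched as below. The counts are copied from the table as given on the slide; treating the "5 or over" group as exactly 5 bombs is an assumption made only so that a mean can be calculated. The function name `poisson_pmf` is illustrative.

```python
import math

# Number of areas with each bomb count, as given in the slide's table.
# Assumption: the "5 or over" group is treated as exactly 5 bombs.
observed = {0: 229, 1: 221, 2: 93, 3: 35, 4: 7, 5: 1}

n_areas = sum(observed.values())
total_bombs = sum(bombs * areas for bombs, areas in observed.items())
mean_bombs = total_bombs / n_areas

# For a Poisson model the variance should be close to the mean.
variance = sum(areas * (bombs - mean_bombs) ** 2 for bombs, areas in observed.items()) / n_areas

def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Expected number of areas for each bomb count if hits were random (Poisson).
expected = {k: n_areas * poisson_pmf(k, mean_bombs) for k in range(5)}
expected["5 or over"] = n_areas - sum(expected.values())

print(f"n = {n_areas}, mean = {mean_bombs:.3f}, variance = {variance:.3f}")
for label, areas in expected.items():
    seen = observed[5] if label == "5 or over" else observed[label]
    print(f"{label!s:>9} bombs: observed {seen:3d}, Poisson model {areas:6.1f}")
```

Plotting the model column next to the observed column lets students judge by eye how well a Poisson distribution with the sample mean as its parameter fits the data.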
Considering data and theoretical distributions...
The data collected (n = 586)
What can you conclude?
Imagine that you are a young Lieutenant in His Majesty's Service. You are charged
with ascertaining if the British are up against an adversary who can target their flying-bombs or one who can only randomly toss these bombs at London.