Statistics Introduction 2

• The word Probability derives from the Latin
probabilitas, which can also mean probity, a
measure of the authority of a witness in a legal
case in Europe, and often correlated with the
witness's nobility.
• In a sense, this differs greatly from the modern
meaning of probability, which, in contrast, is a
measure of the weight of empirical evidence, and
is arrived at from inductive reasoning and
statistical inference.
True Probability vs. experimental
probability vs. theoretical probability
• We tend to give answers to probability
questions based on ‘theoretical’ probability
but all three types are usually involved in a
problem.
TRUE probability
• True probability involves an exact
understanding of all the factors involved that
lead to a certain outcome.
• It is a deterministic model.
Deterministic model
• Mathematical model in which outcomes are
precisely determined through known
relationships among states and events,
without any room for random variation.
Deterministic Model
• In such models, a given input will always
produce the same output, such as in a known
chemical reaction.
• A deterministic system is a system in which no
randomness is involved in the development of
future states of the system.
The important idea
• A deterministic model will thus always
produce the same output from a given
starting condition or initial state.
Deterministic Model
• With a deterministic model, the assumptions
and equations you select "determine" the
results. The only way the outputs change is if
you change an assumption (or an equation).
Experimental Probability
• Experimental probability is the ratio of the
number of outcomes in which a specified
event occurs to the total number of trials, not
in a theoretical sample space but in an actual
experiment.
• In a more general sense, experimental
probability estimates probabilities from
experience and observation.
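The ratio described above can be sketched in a few lines of Python. This is only an illustration: the die-rolling experiment, the function name experimental_probability and the fixed seed are assumptions made for the example, not part of the slides.

```python
import random

def experimental_probability(event, trial, n_trials, seed=0):
    """Ratio of trials in which `event` occurred to the total number of trials."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = sum(1 for _ in range(n_trials) if event(trial(rng)))
    return hits / n_trials

# Hypothetical experiment: roll a fair six-sided die, event = "even number".
roll_die = lambda rng: rng.randint(1, 6)
is_even = lambda outcome: outcome % 2 == 0

for n in (10, 100, 10_000):
    print(n, experimental_probability(is_even, roll_die, n))
```

With more trials the printed ratio should settle near the theoretical value of 1/2, although any single run can differ from it.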
Advantage of experimental probability
• An advantage of estimating probabilities using
experimental probabilities is that this
procedure is relatively free of assumptions.
• For example, consider estimating the probability
among a population of men that they satisfy two
conditions:
• 1. that they are over 2 metres in height
• 2. that they prefer strawberry jam to raspberry
jam.
• A direct estimate could be found by counting the
number of men who satisfy both conditions to
give the experimental probability of the
combined condition.
• An alternative estimate could be found by
multiplying the proportion of men who are
over 2 metres in height by the proportion of
men who prefer strawberry jam to raspberry
jam, but this estimate relies on the
assumption that the two conditions are
statistically independent.
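The two estimates can be compared on a small data set. The survey records below are invented purely for illustration, and the function names are hypothetical; the point is only that the two methods can give different answers when independence does not hold exactly.

```python
# Hypothetical survey records: (over_2m, prefers_strawberry) for each man.
survey = [
    (True, True), (True, False), (False, True), (False, True),
    (False, False), (False, True), (False, False), (False, True),
    (False, False), (False, True),
]

def direct_estimate(records):
    """Count the men who satisfy both conditions (no independence assumed)."""
    both = sum(1 for tall, straw in records if tall and straw)
    return both / len(records)

def independence_estimate(records):
    """Multiply the two separate proportions (assumes independence)."""
    n = len(records)
    p_tall = sum(1 for tall, _ in records if tall) / n
    p_straw = sum(1 for _, straw in records if straw) / n
    return p_tall * p_straw

print("direct:", direct_estimate(survey))
print("assuming independence:", independence_estimate(survey))
```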
Disadvantage
• A disadvantage in using experimental
probabilities arises in estimating probabilities
which are either very close to zero, or very
close to one.
• In these cases very large sample sizes would
be needed in order to estimate such
probabilities to a good standard of relative
accuracy.
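A rough sketch of why rare events need large samples, assuming independent trials and the usual binomial standard error sqrt(p(1 - p)/n) for an estimated proportion. The function name trials_needed and the 10% relative-error target are illustrative choices, not from the slides. For probabilities close to one, the same reasoning applies to the complementary event.

```python
import math

def trials_needed(p, relative_error):
    """Approximate number of independent trials so that the standard error
    of the estimated proportion is about relative_error * p."""
    # From sqrt(p * (1 - p) / n) = relative_error * p, solved for n.
    return math.ceil((1 - p) / (p * relative_error ** 2))

# Compare a common event with increasingly rare ones (10% relative error).
for p in (0.5, 0.01, 0.001):
    print(p, trials_needed(p, 0.10))
```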
Theoretical Probability
• Theoretical probability is the probability that a
certain outcome will occur, as determined
through reasoning or calculation.
Example:
• Given a die which is a regular octahedron of
uniform density, and given that one and only
one of its faces is painted black, then if the die
is cast, the theoretical probability that the
outcome will be the black face is 1/8.
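As a minimal sketch, the calculation and a simulated check might look like this. The simulation assumes every face is equally likely, which is what the regular shape and uniform density are meant to justify; the function name simulate and the seed are illustrative.

```python
from fractions import Fraction
import random

FACES = 8        # a regular octahedron has eight faces
BLACK_FACES = 1  # exactly one face is painted black

# Theoretical probability: favourable faces over equally likely faces.
theoretical = Fraction(BLACK_FACES, FACES)

def simulate(n_casts, seed=1):
    """Experimental estimate: label the black face 0 and count how often it comes up."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_casts) if rng.randrange(FACES) == 0)
    return hits / n_casts

print("theoretical:", theoretical)
print("simulated:", simulate(20_000))
```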
Experimental/Theoretical probability
• In many cases, the quality of a scientific field
depends on how well the mathematical
models developed on the theoretical side
agree with results of repeatable experiments.
• Lack of agreement between theoretical
mathematical models and experimental
measurements often leads to important
advances as better theories are developed.
Summary
Type                      Model          Likely occurrence of future event
True Probability          Deterministic  Usually unknown
Experimental probability  Probabilistic  Applying past to the future
Theoretical probability   Probabilistic  Based on a model
Babies
• Worldwide, we rely on the assumption that a
newborn baby is equally likely to be a boy or a
girl.
• Is this in fact true?
True Probability
• This would involve an exact understanding of
all the factors involved that lead to a certain
outcome.
• The process by which gender is determined is
largely unknown because many factors
other than X and Y chromosomes may be
involved.
• Is gender influenced more by the father or by
the mother?
• Does gender depend on the age of the mother
and/or of the father?
• Is gender related to the parents’ occupations?
• Is there a hereditary factor involved?
• Are there only two possible outcomes?
• Are there global or economic influences?
• ???????
True Probability
• If we knew all the determining factors, we
would be able to predict the gender of the
baby.
http://www.webmd.com/baby/features/predicting-your-babys-sex
Experimental probability
• Experimental probability relies on looking at
long-run relative frequencies from the past
and projecting these patterns into the future.
51%
52%
http://en.wikipedia.org/wiki/List_of_countries_by_sex_ratio
Experimental factors that may
influence the experimental probability
• One gender may be more likely to survive from conception
to birth than the other.
• Live-birth ratios are not necessarily the same as gender
ratios at conception.
• Some cultures practise abortion or infanticide because one
gender is seen to be more desirable than the other.
• Some providers of artificial insemination or in vitro
fertilisation (test-tube babies) offer gender selection.
Theoretical probability
• This involves a mathematical model to explain
the distributions of various outcomes.
• We generally assume that the 50:50 model
applies and use the probabilities to make
predictions.
Comments
• The 50:50 model is probably close enough to
reality that most people believe it is true.
• There is an unproven belief that, because
there appear to be two outcomes and the
process is apparently random, then the
probabilities are equal. However, random
events do not have to be equally likely.
http://www.swissinfo.ch/eng/swiss_news/Third_gender_fights_for_recognition.html?cid=34791620
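To illustrate that a random process with two outcomes need not be 50:50, here is a small sketch that enumerates a sample space. The two-coin example and the helper name probability are assumptions chosen for illustration; they are not from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coin tosses: four equally likely results.
sample_space = list(product("HT", repeat=2))

def probability(event):
    """Favourable results divided by all equally likely results."""
    favourable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favourable, len(sample_space))

# Two outcomes only: "both heads" or "not both heads".
both_heads = lambda outcome: outcome == ("H", "H")
p_yes = probability(both_heads)
p_no = 1 - p_yes

print("both heads:", p_yes, "| not both heads:", p_no)
```

There are only two outcomes here and the process is random, yet the probabilities are 1/4 and 3/4 rather than equal.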
Deterministic and probabilistic models
• A deterministic model does not include
elements of randomness. Every time you run
the model with the same initial conditions you
will get the same results.
• A probabilistic model does include elements
of randomness. Every time you run the
model, you are likely to get different results,
even with the same initial conditions.
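The contrast can be sketched with a toy growth model. The growth rate, the size of the random noise and both function names are hypothetical choices made for this illustration.

```python
import random

def deterministic_model(initial, rate, steps):
    """Same initial conditions always give the same result."""
    value = initial
    for _ in range(steps):
        value *= 1 + rate
    return value

def probabilistic_model(initial, rate, steps, rng):
    """Each step's growth rate is drawn at random around `rate`."""
    value = initial
    for _ in range(steps):
        value *= 1 + rng.gauss(rate, 0.02)  # 0.02 is an assumed noise level
    return value

rng = random.Random()  # unseeded, so results vary from run to run
print(deterministic_model(100, 0.05, 10), deterministic_model(100, 0.05, 10))
print(probabilistic_model(100, 0.05, 10, rng), probabilistic_model(100, 0.05, 10, rng))
```

The first line prints the same number twice; the second will almost certainly print two different numbers, even though the initial conditions are identical.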
Randomness
• http://www.youtube.com/watch?v=Lf4ZmWc_jmA
• Randomly put one cross on your paper.
• Where are most of the crosses?
What does randomness look like?
file://localhost/Users/marionsteel/Desktop/workshops/probability workshop/random scatter.xls
Is this random scatter?
Or is this random scatter?
Benford’s law
• http://www.nctm.org/resources/content.aspx?id=7636
Benford’s Law
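• Benford's law says that in many real-world data
sets the leading digit is more often small than
large (about 30% of values begin with 1). This
result has been found to apply to a wide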
variety of data sets, including electricity bills,
street addresses, stock prices, population
numbers, death rates, lengths of rivers,
physical and mathematical constants, and
processes described by power laws (which are
very common in nature). It tends to be most
accurate when values are distributed across
multiple orders of magnitude.
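Benford's law is usually stated as P(d) = log10(1 + 1/d) for leading digit d = 1, ..., 9. The sketch below compares that formula with the leading digits of the first 1000 powers of 2, a data set chosen here only because it spans many orders of magnitude; the function names are illustrative.

```python
import math
from collections import Counter

def benford_expected(d):
    """Benford's law: expected share of values whose leading digit is d."""
    return math.log10(1 + 1 / d)

def leading_digit_frequencies(values):
    """Observed share of each leading digit 1-9 among positive integers."""
    counts = Counter(int(str(v)[0]) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Assumed test data: 2**0, 2**1, ..., 2**999.
powers_of_two = [2 ** k for k in range(1000)]
observed = leading_digit_frequencies(powers_of_two)

for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_expected(d), 3))
```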
TED Talk
• http://www.ted.com/talks/peter_donnelly_shows_how_stats_fool_juries.html
Is there such a thing as a biased coin?