StatsI - Flow in Sports
Statistics
Math 416
Game Plan
Introduction
Census / Poll / Survey
Population – Sample – Bias
Sample Proportion
Mean Median Mode
Box and Whisker Plot
Box and Whisker Interpretation
Stats Intro
There are lies, there are damn lies and then
there are statistics
- Mark Twain
The goal is to use numbers to describe
a characteristic of a population.
The idea is to win your argument by
providing facts, but too many people
consider statistics to be absolute facts.
Stats Intro
In general, most people do not understand
statistics.
Hypothesis: Student A has a school average of
10%
Conclusion: Student A is a bad person.
The statistic does not measure the person’s
goodness or badness.
What does that statistic mean?
If all their marks were the same in every course, each mark
would be 10%
Statistics
Life is a continual battle to get your ideas
across and have other people trying to get their
ideas across to you.
You are constantly being bombarded by
arguments and statistics.
1. Commercials
2. Teachers
To understand the world around you, you need to
be aware of what statistics mean and how reliable they are.
Where do statistics come from?
Population
First we establish the population.
Population: the complete group that we are
investigating
Characteristic: a particular identifying trait
exhibited by the population
e.g. hair colour, favorite colour,
math knowledge, political opinion, etc.
Population
The next problem is deciding how to
measure a characteristic and obtain the
data.
Obtaining the Data: Three methods
Census
Method #1: Ask the whole population
- Called a census
- Problem: hard to do, depending on the population
Poll
Method #2: Ask a representative
“sample” of the population
- Called a poll
- Problem: making the sample representative may be tricky
Survey
Method #3: Ask only experts of the
population
- Called a survey
- Problem: who counts as an expert?
Representative sample?
Bias
If data are obtained or presented in an unfair
manner, then the conclusions drawn from them cannot be trusted. The
results are said to be biased (or unfair).
In collection
How you ask and whom you ask are the main sources of bias.
There are 4 types of bias: bad sampling, non-pertinence,
wording of the question, and attitude of the pollster.
Bias
Bad sampling: e.g. asking 5-year-olds their favorite beer
Non-pertinence: e.g. asking “Do you like to play an instrument?” (to find
favorite color)
Wording of questions: e.g. “Man, I hate Bush. Are you in favor of war?”
Attitude of pollster: e.g. a policeman asking “Were you speeding?”
Presentation Bias
In presentation, imagine you disregard a
grade level and claim that it does not
matter in a school’s decision.
“I need to prove my product is the best;
how can I get these numbers to show that?”
Buy This Stock! Not!
[Chart: stock price by month (Jan, Feb, March, April, Jan) on a vertical axis from $0 to $500]
Stencil #1-3
A statistical presentation is always biased
Representative Sample
Creating a representative sample can be an
art form in itself. Ideally the sample would match the
population's proportions in every respect, which is impossible.
You must focus on the characteristics the
poll or survey is actually measuring!
Representative Sample
Consider a school with 50 boys and 25 girls,
from which a representative sample of 10 needs to
be created.
We note the population is described in terms
of boys and girls, hence we will need to create
our sample on that basis
Three steps
Representative Sample
1) Relative (by percent), n = 75
   Boys: 50/75 = 67%      Girls: 25/75 = 33%
2) Theory, sample = 10
   Boys: 10 x .67 = 6.7   Girls: 10 x .33 = 3.3
   Difficult to get .7 or .3 of a person!
3) Reality
   Boys: 7                Girls: 3
   Total of 10 & has added bias
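The three steps (relative, theory, reality) can be sketched in a few lines of Python. This is only an illustration of the boys and girls example above; the function name `proportional_sample` and the use of plain rounding for the "reality" step are assumptions, not part of the course material.

```python
def proportional_sample(groups, sample_size):
    """Return (relative, theory, reality) for each group in the population."""
    total = sum(groups.values())
    plan = {}
    for name, count in groups.items():
        relative = count / total          # step 1: share of the population
        theory = sample_size * relative   # step 2: ideal, possibly fractional, count
        reality = round(theory)           # step 3: whole people only (assumed rule)
        plan[name] = (relative, theory, reality)
    return plan


school = {"boys": 50, "girls": 25}
plan = proportional_sample(school, 10)
for name, (relative, theory, reality) in plan.items():
    print(f"{name}: {relative:.0%} -> {theory:.1f} -> {reality}")
```

Rounding each group separately does not always add up to the requested sample size, so the total should be checked afterwards, as the slide does with "Total of 10".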
Some Rules
If it starts at zero, it stays at zero
If it appears to be zero, be careful!
Make decisions category by category, not overall
Creating a Sample
1) Given the following, create a sample of 10 (n = 109)
              Hudson   Non-Hudson
Young         0        1
Middle Aged   20       24
Old           31       33
Relative
              Hudson   Non-Hudson
Y             0        0
MA            18%      22%
O             28%      30%
Is it really 0 people? (Non-Hudson Young is 1 out of 109.)
Creating a Sample
Theory
              Hudson   Non-Hudson
Y             0        0
MA            1.8      2.2
O             2.8      3
Reality
              Hudson   Non-Hudson
Y             0        0
MA            2        2
O             3        3
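As a rough check of the Hudson table, the sketch below recomputes the "reality" counts from the population counts. It also flags any category that exists in the population but ends up with nobody in the sample, which is the "appears to be zero" situation. The variable names and the plain-rounding rule are assumptions made for illustration.

```python
# Population counts copied from the slide (age group, town).
population = {
    ("Young", "Hudson"): 0,
    ("Young", "Non-Hudson"): 1,
    ("Middle Aged", "Hudson"): 20,
    ("Middle Aged", "Non-Hudson"): 24,
    ("Old", "Hudson"): 31,
    ("Old", "Non-Hudson"): 33,
}
SAMPLE_SIZE = 10

n = sum(population.values())
reality = {}
lost = []  # categories present in the population but absent from the sample
for category, count in population.items():
    theory = SAMPLE_SIZE * count / n      # fractional number of people
    reality[category] = round(theory)     # whole people (assumed rounding rule)
    if count > 0 and reality[category] == 0:
        lost.append(category)

print("n =", n)
print(reality)
print("In the population but not in the sample:", lost)
```

The one young non-Hudson person is the only category that disappears from the sample, which is the kind of added bias the rules warn about.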
Stencil #4, 5, 6
Do relative, theory and reality for #4;
in #5 & 6 put theory & relative together
Statistics Central Tendency - Mean
Mean means the average
Symbol $\bar{x}$
Found by dividing the sum $\sum x_i$ by the
number of elements $n$, i.e. $\bar{x} = \frac{\sum x_i}{n}$
The mean is the value every element would be equal
to if they were all the same, e.g. for (5,9,3,6):
$\bar{x} = \frac{\sum x_i}{n} = \frac{5+9+3+6}{4} = 5.75$
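A short Python sketch of the same calculation, using the slide's sample (5,9,3,6). It also checks the interpretation given above: if every value were replaced by the mean, the total would stay the same. The variable names are illustrative only.

```python
values = [5, 9, 3, 6]

# Mean: the sum of the values divided by how many there are.
mean = sum(values) / len(values)
print(mean)

# Interpretation check: n copies of the mean add up to the original sum.
same_total = mean * len(values) == sum(values)
print(same_total)
```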
Mode
Symbol M
It is the number that appears the most
It is possible not to have any mode, or to have more
than one mode
Eg (1,2,5) M = none (nothing repeats)
Eg (1,6,6,8) M = 6
Eg (1,3,3,4,4,8) M = 3 & 4
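The three cases above (no mode, one mode, several modes) can be reproduced with a small helper. This is a sketch that follows the slide's convention that a sample with no repeated value has no mode; the function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value that appears most often, or [] if nothing repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # slide convention: nothing repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 5]))
print(modes([1, 6, 6, 8]))
print(modes([1, 3, 3, 4, 4, 8]))
```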
Median
Symbol M
The median is the middle value
Note the sample must be in order!
There are two possibilities (odd & even)
Odd: (1,5,7), n = 3
Only 1 middle; M = 5
Even: (1,5,7,8), n = 4
You must find the mean of both middles: M = (5 +
7)/2 = 6
Do #7
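The odd and even cases can be sketched as one small function. It sorts first, since the sample must be in order, and assumes the sample is not empty; the function name `median` is chosen for illustration.

```python
def median(values):
    """Middle value of a non-empty sample; mean of the two middles if n is even."""
    ordered = sorted(values)   # the sample must be in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]    # odd: exactly one middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even: mean of both middles


print(median([1, 5, 7]))
print(median([1, 5, 7, 8]))
```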
Box & Whisker Plot
(2,5,1,6,9,8)
The Construction
1) Make sure your sample is in order
(1,2,5,6,8,9)
2) Find the min, max & median
Min = 1; max = 9; median = (5 + 6)/2 = 5.5 = Q2
These three points will serve you as part of the
box and whisker diagram. Draw it on board…
Box & Whisker Plot
(1,2,5,6,8,9)
-1 0 1 2 3 4 5 6 7 8 9 10 11
3) Create a number line with a vertical line (a hinge) at each of the
three points
4) Find the median between min and Q2, called Q1
It is 2; make another hinge
5) Find the median between Q2 and max, called Q3
It is 8; make another hinge. Complete it!
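The five hinges for the sample (2,5,1,6,9,8) can be checked with a short sketch. It uses the median-of-each-half approach from the steps above. For a sample with an odd number of values, it assumes the middle value is left out of both halves, a convention the slides do not state. The function name `five_number_summary` is hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max) using the median of each half."""
    ordered = sorted(values)            # step 1: put the sample in order
    n = len(ordered)
    lower = ordered[: n // 2]           # values below the median position
    upper = ordered[(n + 1) // 2 :]     # values above the median position
    return (
        ordered[0],
        median(lower),                  # Q1: median between min and Q2
        median(ordered),                # Q2: the median
        median(upper),                  # Q3: median between Q2 and max
        ordered[-1],
    )


low, q1, q2, q3, high = five_number_summary([2, 5, 1, 6, 9, 8])
print(low, q1, q2, q3, high)
print("Interquartile range:", q3 - q1)
```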
Words & Facts
We have broken the data into four parts
called quartiles.
[Diagram: a box and whisker plot labelled Min, Q1, Q2, Q3 and Max. The box runs from Q1 to Q3, with whiskers out to Min and Max.]
Interquartile range = Q3 - Q1
Words & Facts
Each quartile should hold about ¼ of the
data BUT you cannot be sure
You cannot tell the mean or the mode
Do not jump to conclusions!
A box and whisker plot gives you an idea about
the spread, concentration or dispersion
of the data
Example #1
[Box and whisker plot on a number line from 1 to 12]
A General View
The data are very close together below 4,
more spread out between 4 and 11,
and close together once again between 11 and 12.
Some Questions…
What is the Mean? No idea
Questions
What is the mode? No idea
What is the median? Q2 = 4
What is the interquartile range? Q3 - Q1 = 11 - 3 = 8
How many are below 11?
75%, but no idea of the number
Where does the lowest concentration of numbers lie?
Between 4 and 11
Lowest concentration vs. highest concentration
Example #2
Class A: n = 20
Class B: n = 40
[Two box and whisker plots, one per class, on a shared axis of marks from 30 to 100]
a) Which class did better? Hard to tell, but Class A
b) What are the means? No idea
c) Altogether, approximately how many were over 60%?
¾ x 20 + 2/4 x 40 = 15 + 20 = 35
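The estimate in part c) relies on each quartile holding about a quarter of the class. A minimal sketch of that arithmetic follows, with the shares above 60% (three quarters of Class A, two quarters of Class B) taken from the slide's own calculation. The dictionary layout and key names are assumptions.

```python
from fractions import Fraction

# Class sizes and the share of each class above 60%, as used on the slide.
classes = {
    "A": {"n": 20, "share_above_60": Fraction(3, 4)},
    "B": {"n": 40, "share_above_60": Fraction(2, 4)},
}

# Each quartile holds roughly 1/4 of the class, so this is only an estimate.
estimate = sum(c["n"] * c["share_above_60"] for c in classes.values())
print(estimate)
```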
Example #2 – More Questions
Which class had the highest mark, and what was it?
Class B, at approximately 97%
Which class has the lowest range?
Class A: 87 - 55 = 32
Class B: 97 - 40 = 57
Answer: Class A
Finish Stencil