1. Statistics - hills


Unit 1
Statistics
Analyzing Data
• Measures of Central Tendency
– How can we best describe the middle of the
data?
• Measures of Variability
– How can we describe how the data varies, how
spread out it is?
Measures of Central Tendency
• Arithmetic Mean ($\bar{X}$)
– Average
• Find the total of all the data
• Divide by the number of pieces of data.
• If X1 is the first piece of data, X2 is the second, and Xn is the
last (nth) piece, then
$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
Measures of Central Tendency
• Median
– Middle number
• List the data in numerical order, find the middle
number.
– If you have an even number of values, find the mean of
the two centre data pieces.
• Mode
– Most popular response
• Appears the most frequently.
Measures of Variability
• Range
– Find the difference between the highest
and the lowest piece of data.
Example
• The marks for a chemistry test were
• 65, 66, 100, 63, 64, 63, 60
– Find the
• Arithmetic Mean
• Median
• Mode
• Range
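The four measures for the chemistry marks can be checked with a short Python sketch. It relies on the standard-library `statistics` module; the variable names are illustrative and not part of the original lesson.

```python
import statistics

# Chemistry test marks from the example above
marks = [65, 66, 100, 63, 64, 63, 60]

mean = sum(marks) / len(marks)        # total of the data divided by the number of values
median = statistics.median(marks)     # middle value once the data are sorted
mode = statistics.mode(marks)         # most frequent value
data_range = max(marks) - min(marks)  # highest value minus lowest value

print(f"mean = {mean:.1f}, median = {median}, mode = {mode}, range = {data_range}")
```

The single mark of 100 pulls the mean (about 68.7) above six of the seven marks, while the median (64) stays near the middle of the group.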
Assignment
WS: Measures of Central
Tendency
Measures of Central Tendency
A new kind of average....
Trimmed Mean
• An average calculated AFTER
removing a small percentage of the
largest and smallest values.
– After removing the specified values, the
trimmed mean is found the same way as
an arithmetic mean.
Example 1
• At a figure skating
competition, judges
drop the highest and
lowest score before
determining the
competitor's average
score.
• Determine the trimmed
mean for a competitor
with the following
scores:
– 6.0, 8.1, 8.3, 9.1, 9.9.
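A minimal sketch of this trimmed mean in Python, assuming exactly one score is dropped from each end as the example describes. The variable names are illustrative.

```python
# Judges' scores from Example 1
scores = [6.0, 8.1, 8.3, 9.1, 9.9]

# Sort, then drop the single lowest and single highest score
kept = sorted(scores)[1:-1]

# From here on it is an ordinary arithmetic mean
trimmed_mean = sum(kept) / len(kept)
print(f"kept = {kept}, trimmed mean = {trimmed_mean:.2f}")
```

For comparison, the untrimmed mean of all five scores is 8.28, so dropping the extremes raises the result to 8.5 here.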
Weighted Mean
• Used in situations where not all data
points will be treated as having equal
weight.
– Your term mark is a weighted mean, and
then so is your final mark!
• Each category receives an average
score, and each score is assigned a
percentage of worth.
Example 2
• In her job as a
server at a
restaurant, Glenna
earned:
– 2 tips of $6.00,
– 3 tips of $8.00,
– 3 tips of $10.00,
– 6 tips of $12.00.
• Determine her
average tip.
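Here the weights are the number of times each tip amount occurred. A small Python sketch of that weighted mean (the names are illustrative):

```python
# (number of tips, tip amount in dollars) from Example 2
tips = [(2, 6.00), (3, 8.00), (3, 10.00), (6, 12.00)]

# Weight each amount by how often it occurred, then divide by the total count
total_money = sum(count * amount for count, amount in tips)
total_tips = sum(count for count, _ in tips)

average_tip = total_money / total_tips
print(f"{total_tips} tips totalling ${total_money:.2f}, average ${average_tip:.2f}")
```

Averaging the four distinct amounts alone ($6, $8, $10, $12) would give $9.00, which ignores that the $12.00 tip occurred most often; the weighted result is about $9.86.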
Example 3
• A student receives
– A term mark of 84%
– A score on the midyear of 75%
– A final exam score
of 80%
• Determine her final
mark!
– Remember 55%,
15%, 30%
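The final mark can be sketched the same way with percentage weights. The slide does not say which weight belongs to which category, so this sketch assumes that 55%, 15% and 30% apply to the term mark, midyear and final exam in that order.

```python
# Assumption: weights 55%, 15%, 30% map to term, midyear, final exam in that order
marks_and_weights = [
    ("term", 84, 0.55),
    ("midyear", 75, 0.15),
    ("final exam", 80, 0.30),
]

# The weights should account for the whole mark
assert abs(sum(weight for _, _, weight in marks_and_weights) - 1.0) < 1e-9

# Multiply each mark by its weight and add the results
final_mark = sum(mark * weight for _, mark, weight in marks_and_weights)
print(f"final mark = {final_mark:.2f}%")
```

Under that assumption the final mark works out to 81.45%.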
Assignment
WS: Measures of Central
Tendency
Measures of Variability
Percentile Rank
• Comparison of averages is not
enough.
• Consider a class with the following
marks
• 80%, 80%, 80%, 90%, 20%, 70%, and 65%
– The mean of the class is about 69%
» A person with 70% could claim to be “above
average”.
Percentile Rank
• Percentage of scores less than or
equal to a particular score
• Our example
– 7 scores altogether
• Only 2 of them are lower than 70%
– The student was in the bottom half of the class!
Percentile Rank
Percentile Rank = $\frac{B + 0.5E}{n} \times 100$
• Where
– B = number of scores below a given score
– E = number of scores equal to the given score
• E=1 if no equal scores
– n = number of scores
• Percentile Rank is always rounded up.
Our Example
• B = 2
– 2 scores < 70%
• E = 1
– 1 score = 70%
• n = 7
– 7 scores total
P.R. = $\frac{2 + 0.5(1)}{7} \times 100 \approx 35.7$
• PR = 36
– P36
– 36th percentile
– Scored as well as or
better than 36% of
the class.
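The percentile rank formula can be sketched as a small Python function. The function name `percentile_rank` is an illustrative choice; the rounding up follows the rule stated in this lesson.

```python
import math

def percentile_rank(scores, score):
    """Percentile rank of `score` within `scores`, rounded up as the lesson states."""
    below = sum(1 for s in scores if s < score)   # B: scores below the given score
    equal = sum(1 for s in scores if s == score)  # E: scores equal to it (includes itself)
    n = len(scores)                               # n: number of scores
    return math.ceil((below + 0.5 * equal) / n * 100)

# Class marks from the example
class_marks = [80, 80, 80, 90, 20, 70, 65]
print(percentile_rank(class_marks, 70))
```

For the mark of 70% this gives 36, matching the worked example.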
Special Percentiles
• Median
– 50th Percentile
– P50
• Upper quartile
– Median of upper half of data
– 75th Percentile
– P75
• Lower quartile
– Median of lower half of data
– 25th Percentile
– P25
Example
• Wendy is 1.7 m tall. She is taller than 65 of the
students in her grade and no one is the same
height as she is. There are 139 students in her
grade.
– What percentage of students are taller than Wendy?
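The slides do not show which method is intended for this question, so the sketch below is only one reasonable reading: it counts the taller students directly and also reports Wendy's percentile rank with the formula from this section.

```python
import math

n = 139          # students in the grade
shorter = 65     # students Wendy is taller than (B)
same_height = 1  # Wendy herself (E = 1 when no one else matches)

# Direct count: everyone who is neither shorter than Wendy nor Wendy herself
taller = n - shorter - same_height
percent_taller = taller / n * 100

# Wendy's percentile rank, rounded up as the lesson states
wendy_pr = math.ceil((shorter + 0.5 * same_height) / n * 100)

print(f"{taller} students ({percent_taller:.1f}%) are taller; Wendy is at P{wendy_pr}")
```

The direct count gives 73 students, about 52.5% of the grade, and the formula places Wendy at the 48th percentile.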
Assignment
Page 390 #1 – 7
Measures of Variability
Standard Deviation
Calculating Standard Deviation
1. Find the mean: $\bar{X}$
2. Find the difference between each number and the mean: $X - \bar{X}$
3. Square the differences: $(X - \bar{X})^2$
4. Add up (sum) the squares: $\sum (X - \bar{X})^2$
5. Divide the sum of the squares by n – 1: $\frac{\sum (X - \bar{X})^2}{n - 1}$
6. Find the square root of this number: $\sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
• You may find a chart useful to help keep
track of the calculations…
$X$ | $X - \bar{X}$ | $(X - \bar{X})^2$
$X_1$ | |
$X_2$ | |
Etc. | |
Sum: $\sum (X - \bar{X})^2$
Then find: $\sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
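The six steps translate almost line for line into Python. This is a sketch, with a function name chosen for illustration; the data are the chemistry marks from the earlier central-tendency example, and the standard library's `statistics.stdev` (which also divides by n – 1) serves as a cross-check.

```python
import math
import statistics

def sample_standard_deviation(data):
    """Follow the six steps; divides by n - 1 as in the lesson."""
    n = len(data)
    mean = sum(data) / n                      # step 1: find the mean
    differences = [x - mean for x in data]    # step 2: difference from the mean
    squares = [d ** 2 for d in differences]   # step 3: square the differences
    sum_of_squares = sum(squares)             # step 4: add up the squares
    variance = sum_of_squares / (n - 1)       # step 5: divide by n - 1
    return math.sqrt(variance)                # step 6: take the square root

# Chemistry marks from the earlier example
marks = [65, 66, 100, 63, 64, 63, 60]
print(round(sample_standard_deviation(marks), 2))
print(round(statistics.stdev(marks), 2))  # library result for comparison
```

Both printed values should agree, since the library function performs the same calculation.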
Example
• A group of Senior 4 students had the following scores on a math test
– 42, 53, 59, 66, 68, 68, 71, 76, 83, 94
• Find the mean, range, and Standard Deviation for the data

$X$ | $X - \bar{X}$ | $(X - \bar{X})^2$
42 | |
53 | |
59 | |
66 | |
68 | |
68 | |
71 | |
76 | |
83 | |
94 | |
$\sum (X - \bar{X})^2 =$
$\sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} =$
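One way to check a completed chart is to generate its rows in Python. The sketch below builds one row per score for this example and then finishes the calculation; the column layout and names are illustrative.

```python
import math

# Math test scores from this example
scores = [42, 53, 59, 66, 68, 68, 71, 76, 83, 94]

mean = sum(scores) / len(scores)
data_range = max(scores) - min(scores)

# One chart row per score: the score, its difference from the mean, the squared difference
rows = [(x, x - mean, (x - mean) ** 2) for x in scores]
for x, diff, square in rows:
    print(f"{x:>3} {diff:>7.1f} {square:>8.1f}")

sum_of_squares = sum(square for _, _, square in rows)
std_dev = math.sqrt(sum_of_squares / (len(scores) - 1))

print(f"mean = {mean}, range = {data_range}, sum of squares = {sum_of_squares}")
print(f"standard deviation = {std_dev:.2f}")
```

For these scores the mean is 68, the range is 52, the squares sum to 1960, and the standard deviation is about 14.76.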
Example 2
• Another school wrote the same test and had the following scores
– 42, 62, 66, 66, 68, 68, 70, 70, 74, 94
• Find the mean, range, and Standard Deviation for the data

$X$ | $X - \bar{X}$ | $(X - \bar{X})^2$
42 | |
62 | |
66 | |
66 | |
68 | |
68 | |
70 | |
70 | |
74 | |
94 | |
$\sum (X - \bar{X})^2 =$
$\sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} =$
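A short sketch using Python's `statistics` module puts the two schools side by side. `statistics.stdev` is the sample standard deviation (it divides by n – 1, as in the lesson); the variable names are illustrative.

```python
import statistics

# Scores from the two examples
school_1 = [42, 53, 59, 66, 68, 68, 71, 76, 83, 94]
school_2 = [42, 62, 66, 66, 68, 68, 70, 70, 74, 94]

mean = statistics.mean(school_2)
data_range = max(school_2) - min(school_2)
std_dev = statistics.stdev(school_2)  # sample standard deviation (divides by n - 1)

print(f"school 2: mean = {mean}, range = {data_range}, standard deviation = {std_dev:.2f}")
print(f"school 1: standard deviation = {statistics.stdev(school_1):.2f}")
```

Both schools have a mean of 68 and a range of 52, yet the second school's standard deviation (about 12.65) is smaller than the first's (about 14.76), because most of its scores sit closer to the mean.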
Assignment
Page 399 #1 – 7
Measures of Variability
Normal Distributions
• A family of graphs that have the same
general shape, and characteristics.
– Bell-shaped curves
– Mean is in the center of the curve
• Curve is symmetric about mean
– Mean equals median
• Equal # of data pieces above and below mean
– Data is more concentrated in the middle than the
ends.
Normal Distributions
• Many data sets have a normal distribution if
you collect a very large sample
– Height
– Weight
– IQ score
– Marks
– Life Expectancy
Normal Distribution
• Closely related to standard deviation
• Curve is divided into sections,
– Each section is one more standard deviation
away from the mean
Normal Distribution
• All normal distributions have the same percentage
of data in each section
– 68% of the data are within 1 standard deviation of the
mean (34% on each side).
– Only about 0.3% of the data are more than 3 SDs away from
the mean.
Example
• A company manufactures batteries whose life
follows a normal distribution. The batteries have an
average (mean) lifespan of 90 hours, with a
standard deviation of three hours.
Example
• What percentage of batteries will have a lifespan of less than
87 hrs?
• What percentage of batteries will have a lifespan greater than
87 hrs but less than 96 hrs?
• If 5000 batteries are made per day, how many will have a
lifespan greater than 93 hrs?
• If your school bought 200 of these batteries, how many of
these would last less than 84 hrs?
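These four questions can be worked through with the section percentages of the normal curve. The slides state only the 68% and 0.30% figures, so the sketch below assumes the standard 68-95-99.7 rule for the remaining sections; the names are illustrative, and the function only handles lifespans that fall a whole number of standard deviations from the mean.

```python
# Approximate percentage of data in each one-SD-wide section of a normal curve,
# from far left (below -3 SD) to far right (above +3 SD).
# Assumption: the standard 68-95-99.7 rule, not spelled out in full on the slides.
SECTION_PERCENTS = [0.15, 2.35, 13.5, 34.0, 34.0, 13.5, 2.35, 0.15]

MEAN_HOURS = 90
SD_HOURS = 3

def percent_below(hours):
    """Percentage of batteries lasting less than `hours` (whole SDs from the mean only)."""
    z = (hours - MEAN_HOURS) / SD_HOURS
    if z != int(z) or not -3 <= z <= 3:
        raise ValueError("this sketch only handles whole SDs between -3 and +3")
    # Add up every section that lies entirely to the left of z
    return sum(SECTION_PERCENTS[: int(z) + 4])

less_than_87 = percent_below(87)
between_87_and_96 = percent_below(96) - percent_below(87)
over_93_per_day = 5000 * (100 - percent_below(93)) / 100
under_84_of_200 = 200 * percent_below(84) / 100

print(round(less_than_87, 2), round(between_87_and_96, 2),
      round(over_93_per_day), round(under_84_of_200))
```

Under that assumption the answers are about 16%, 81.5%, 800 batteries per day, and 5 of the 200 batteries.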
Assignment
Page 408 #1 – 8
Statistics
Strength of Relationships
Correlation Coefficient
• Statisticians try to find out whether a relationship
exists between two variables.
• Correlation coefficient (r)
– Numerical value assigned to a
relationship
– Describes the strength of the relationship.
– Between -1 and +1
Positive Correlation
• The closer the r-value is to
+1, the stronger the
positive correlation.
• As the values of one
set of data (x)
increase, the values of
the second set of data
(y) also increase.
Negative Correlation
• The closer the r-value is
to -1, the stronger the
negative correlation.
• As the values of one
set of data (x)
increase, the values of
the second set of data
(y) decrease.
Golf Scores
Zero Correlation
• An r-value of 0 indicates a zero correlation
• Weak correlations will have values close to
zero.
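The slides do not give a formula for r. The sketch below assumes the commonly used Pearson formula, $r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$, and uses small data sets invented purely for illustration.

```python
import math

def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r for paired data (assumed formula)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

# Hypothetical data, invented purely for illustration
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # y increases as x increases
falling = [10, 8, 6, 4, 2]   # y decreases as x increases
no_trend = [1, 3, 2, 3, 1]   # no overall linear trend

for label, y in [("rising", rising), ("falling", falling), ("no trend", no_trend)]:
    print(label, round(correlation_coefficient(x, y), 2))
```

These made-up sets sit at the three extremes: r is +1 for the rising data, -1 for the falling data, and 0 for the set with no linear trend. Real data usually fall somewhere in between.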
Assignment
Page 422 #1 – 7