Measures of Position

Statistics
Numerical Representation of
Data
Part 3 – Measures of Position
Warm-up

The mean commute time in the U.S. is
24.4 minutes with a standard deviation
of 6.5 minutes. What is the minimum
percentage of commuters that have
commute times between 11.4 minutes
and 37.4 minutes?
Warm-up

The average age of U.S. astronaut candidates
has been 34 years, but the ages have ranged
from 26 to 46. What would be the
approximate standard deviation?
Warm-up

If the SATs have a mean of 1500 with a
standard deviation of 300, would a score of
2200 be considered an unusual score? Can
you determine the approximate percentile
score of 2200?
Warm-up

Compare the variability of the heights and
weights of men. Which is more variable? Men
have a mean height of 69 in. with a standard
deviation of 2.5 in. and a mean weight of
172.6 lbs with a standard deviation of 26.3
lbs.
Warm-up

Suppose that the height of college
males has a bell-shaped distribution with a
mean of 70 inches and a standard
deviation of 2 inches. Approximately what
percentage of college males are between
66 and 74 inches?
a) 68%
b) 90%
c) 95%
d) 99.7%
e) 100%
Agenda

Warm-up
Homework Review
Lesson Objectives
  • Determine the quartiles of a data set
  • Determine the interquartile range of a data set and determine outliers
  • Create a box-and-whisker plot
  • Interpret other fractiles such as percentiles
  • Determine and interpret the standard score (z-score)
Summary
Homework
Quartiles


Fractiles are numbers that partition (divide)
an ordered data set into equal parts.
Quartiles approximately divide an ordered
data set into four equal parts.



First quartile, Q1: About one quarter of the data
fall on or below Q1.
Second quartile, Q2: About one half of the data
fall on or below Q2 (median).
Third quartile, Q3: About three quarters of the
data fall on or below Q3.
Example: Finding Quartiles
The test scores of 15 employees enrolled in a CPR
training course are listed. Find the first, second, and
third quartiles of the test scores.
13 9 18 15 14 21 7 10 11 20 5 18 37 16 17
Q2 divides the data set into two halves.
Lower half: 5 7 9 10 11 13 14
Q2 = 15
Upper half: 16 17 18 18 20 21 37
Solution: Finding Quartiles
The first and third quartiles are the medians of
the lower and upper halves of the data set.
Lower half: 5 7 9 10 11 13 14, so Q1 = 10
Q2 = 15
Upper half: 16 17 18 18 20 21 37, so Q3 = 18
About one fourth of the employees scored 10 or less, about one half scored
15 or less, and about three fourths scored 18 or less.
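The median-of-halves method used in this solution can be sketched in Python. This is a minimal illustration, not a standard library routine: the function name `quartiles` is an assumption made here, and when the number of entries is odd the middle entry is left out of both halves, as in the example above.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method (hypothetical helper)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # entries below the middle position
    upper = ordered[half + n % 2:]    # entries above it; the middle entry is skipped when n is odd
    return median(lower), median(ordered), median(upper)

scores = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]
q1, q2, q3 = quartiles(scores)
print(q1, q2, q3)  # 10 15 18
```

Statistical software often uses other quartile conventions, so results from a package may differ slightly from this hand method.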
Interquartile Range
Interquartile Range (IQR)
• The difference between the third and first
quartiles.
• IQR = Q3 – Q1
Example: Finding the
Interquartile Range
Find the interquartile range of the test scores.
Recall Q1 = 10, Q2 = 15, and Q3 = 18
Solution:
• IQR = Q3 – Q1 = 18 – 10 = 8
The test scores in the middle portion of the data
set vary by at most 8 points.
Box-and-Whisker Plot
Box-and-whisker plot
• Exploratory data analysis tool.
• Highlights important features of a data set.
• Requires the five-number summary:
  • Minimum entry
  • First quartile Q1
  • Median Q2
  • Third quartile Q3
  • Maximum entry
Drawing a Box-and-Whisker
Plot
1. Find the five-number summary of the data set.
2. Construct a horizontal scale that spans the range of the
data.
3. Plot the five numbers above the horizontal scale.
4. Draw a box above the horizontal scale from Q1 to Q3
and draw a vertical line in the box at Q2.
5. Draw whiskers from the box to the minimum and
maximum entries.
[Figure: a box-and-whisker plot with the box, the two whiskers, the minimum entry, Q1, the median (Q2), Q3, and the maximum entry labeled]
Example: Drawing a Box-and-Whisker Plot
Draw a box-and-whisker plot that represents
the 15 test scores.
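Recall Min = 5, Q1 = 10, Q2 = 15, Q3 = 18, and Max = 37.
Solution:
[Figure: box-and-whisker plot of the test scores, with the five-number summary marked at 5, 10, 15, 18, and 37]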
About half the scores are between 10 and 18. By looking at the length of the right
whisker, you can conclude 37 is a possible outlier.
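The drawing steps can be mimicked with a rough text rendering. This is only a sketch under stated assumptions: `ascii_boxplot` is a name invented for this illustration, the plot is a single line of characters rather than a real graph, and it assumes the maximum is larger than the minimum.

```python
def ascii_boxplot(minimum, q1, q2, q3, maximum, width=33):
    """Render a five-number summary as a one-line text box plot (hypothetical helper)."""
    span = maximum - minimum

    def column(value):
        # Map a data value onto a character column of the horizontal scale.
        return round((value - minimum) / span * (width - 1))

    lo, a, m, b, hi = (column(v) for v in (minimum, q1, q2, q3, maximum))
    line = [" "] * width
    for i in range(lo, hi + 1):
        line[i] = "-"              # whiskers run from the minimum to the maximum
    for i in range(a, b + 1):
        line[i] = "="              # the box runs from Q1 to Q3
    line[lo] = line[hi] = "|"      # minimum and maximum entries
    line[a], line[b] = "[", "]"    # box edges at Q1 and Q3
    line[m] = ":"                  # median marker inside the box
    return "".join(line)

print(ascii_boxplot(5, 10, 15, 18, 37))
```

With the default width of 33 characters, each column corresponds to one point on the test, so the long run of dashes to the right of the box makes the stretched right whisker easy to see.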
Outliers


An outlier can be identified as follows:
• Any value that is more than
Q3 + 1.5IQR
or
• Any value that is less than
Q1 – 1.5IQR
Outliers

Is 37 an outlier?




IQR = 18-10 = 8
1.5IQR = 1.5 x 8 = 12
Q3 + 1.5IQR = 18 + 12 = 30
Since 37 > 30, 37 may be considered an outlier.
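The same check can be written as a short Python sketch. The function name `outlier_fences` is an assumption for illustration; the quartiles Q1 = 10 and Q3 = 18 are taken from the earlier example rather than recomputed.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

scores = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]
low, high = outlier_fences(10, 18)

# Keep only the values that fall outside the two cutoffs.
outliers = [x for x in scores if x < low or x > high]
print(low, high, outliers)  # -2.0 30.0 [37]
```

No test score falls below the lower cutoff of -2, so 37 is the only value flagged.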
How to Interpret a Box and
Whisker Plot




Here is how to read a boxplot. The median
is indicated by the vertical line that runs
down the center of the box.
Additionally, boxplots display two common
measures of the variability or spread in a
data set.
Range. If you are interested in the spread
of all the data, it is represented on a
boxplot by the horizontal distance between
the smallest value and the largest value,
including any outliers. If you ignore
outliers, the range is illustrated by the
distance between the opposite ends of the
whiskers.
Interquartile range (IQR). The middle half
of a data set falls within the interquartile
range. In a boxplot, the interquartile range
is represented by the width of the box (Q3
minus Q1).
Percentiles and Other Fractiles
Fractiles     Summary                              Symbols
Quartiles     Divides data into 4 equal parts      Q1, Q2, Q3
Deciles       Divides data into 10 equal parts     D1, D2, D3,…, D9
Percentiles   Divides data into 100 equal parts    P1, P2, P3,…, P99
Example: Interpreting
Percentiles
The ogive represents the
cumulative frequency
distribution for SAT test
scores of college-bound
students in a recent year.
What test score
represents the 72nd
percentile? How should
you interpret this? (Source:
College Board Online)
Solution: Interpreting
Percentiles
The 72nd percentile
corresponds to a test
score of 1700.
This means that 72% of
the students had an
SAT score of 1700 or
less.
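The "percent of values at or below a score" reading of a percentile can be illustrated in Python. The SAT data behind the ogive is not reproduced in this transcript, so the sketch below uses the 15 CPR test scores from the quartile example instead; the function name `percent_at_or_below` is an assumption made for this illustration.

```python
def percent_at_or_below(data, value):
    """Percentage of entries in data that are less than or equal to value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

scores = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]
print(percent_at_or_below(scores, 18))  # 80.0
```

Here 12 of the 15 scores are 18 or less, so a score of 18 sits at about the 80th percentile of this small data set.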
The Standard Score
Standard Score (z-score)
• Represents the number of standard
deviations a given value x falls from the mean
μ.
• z = (value – mean) / (standard deviation) = (x – μ) / σ
• A z-score greater than +2 or less than -2
would be considered an unusual value.
Standard Scores – Example 1
In 2007, Forest Whitaker won the Best Actor Oscar at age
45 for his role in the movie The Last King of Scotland.
Helen Mirren won the Best Actress Oscar at age 61 for her
role in The Queen. The mean age of all best actor winners
is 43.7, with a standard deviation of 8.8. The mean age of
all best actress winners is 36, with a standard deviation of
11.5. Find the z-score that corresponds to the age for each
actor or actress. Then compare your results.
Standard Scores – Example 1
• Forest Whitaker
z = (x – μ) / σ = (45 – 43.7) / 8.8 ≈ 0.15
0.15 standard deviations above the mean
• Helen Mirren
z = (x – μ) / σ = (61 – 36) / 11.5 ≈ 2.17
2.17 standard deviations above the mean
Solution: Comparing z-Scores from Different Data Sets
[Figure: z = 0.15 (Forest Whitaker) and z = 2.17 (Helen Mirren)]
The z-score corresponding to the age of Helen Mirren is more than two
standard deviations from the mean, so it is considered unusual. Compared to
other Best Actress winners, she is relatively older, whereas the age of Forest
Whitaker is only slightly higher than the average age of other Best Actor
winners.
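The two z-scores and the "more than two standard deviations" check can be reproduced with a few lines of Python. The function names `z_score` and `is_unusual` are assumptions made for this sketch; the cutoff of 2 is the rule of thumb stated earlier in the lesson.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

def is_unusual(z, cutoff=2.0):
    """Apply the rule of thumb: unusual if z is above +cutoff or below -cutoff."""
    return abs(z) > cutoff

whitaker = z_score(45, 43.7, 8.8)   # Best Actor winners: mean 43.7, s.d. 8.8
mirren = z_score(61, 36, 11.5)      # Best Actress winners: mean 36, s.d. 11.5

print(round(whitaker, 2), is_unusual(whitaker))  # 0.15 False
print(round(mirren, 2), is_unusual(mirren))      # 2.17 True
```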
Standard Score – Example 2

The z-score can also be used to compare values from different data sets:

A student received a 75 on a statistics test that
has a mean of 85 and a standard deviation of 7.
The same student had a 27 on a psychology test
that had a mean of 30 and a standard deviation of
2.5. On which test did the student perform better?
Standard Score – Example 2
(cont.)

Calculate the z-score for each test:
• Statistics: z = (75 – 85)/7 ≈ -1.43
• Psychology: z = (27 – 30)/2.5 = -1.20
The student performed better on the
psychology test as the z-score was greater.
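The comparison generalizes to any number of tests: convert each score to a z-score, then pick the largest. The sketch below assumes a dictionary layout of (score, mean, standard deviation) chosen only for this illustration.

```python
# Each entry: test name -> (student's score, class mean, standard deviation)
tests = {
    "statistics": (75, 85, 7),
    "psychology": (27, 30, 2.5),
}

# Standardize every score so results from different tests share one scale.
z_scores = {name: (x - mean) / sd for name, (x, mean, sd) in tests.items()}

# The better relative performance is the test with the greater z-score.
best = max(z_scores, key=z_scores.get)
print(best)  # psychology
```

Both z-scores are negative, so the student scored below average on both tests; the psychology score is simply the less far below its mean.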
Summary





• Determined the quartiles of a data set
• Determined the interquartile range of a data set
• Created a box-and-whisker plot
• Interpreted other fractiles such as percentiles
• Determined and interpreted the standard score (z-score)
Homework

Pg 100 – 104 # 1-39 Odd