Introduction
Previous lessons have demonstrated that the normal
distribution provides a useful model for many situations
in business and industry, as well as in the physical and
social sciences. Determining whether or not it is
appropriate to use normal distributions in calculating
probabilities is an important skill to learn, and one that
will be discussed in this lesson.
1.1.3: Assessing Normality
Introduction, continued
There are many methods to assess a data set for
normality. Some can be calculated without a great deal
of effort, while others require advanced techniques and
sophisticated software. Here, we will focus on three
useful methods:
• Rules of thumb using the properties of the
standard normal distribution (including symmetry
and the 68–95–99.7 rule).
• Visual inspection of histograms for symmetry,
clustering of values, and outliers.
• Use of normal probability plots.
Introduction, continued
With advances in technology, it is now more efficient to
calculate probabilities based on normal distributions.
With our new understanding of a few important
concepts, we will be ready to conduct research that was
formerly reserved for a small percentage of people in
society.
Key Concepts
• Although the normal distribution has a wide range of
useful applications, it is crucial to assess a distribution
for normality before using the probabilities associated
with normal distributions.
• Assessing a distribution for normality requires
evaluating the distribution’s four key components: a
sample or population size, a sketch of the overall
shape of the distribution, a measure of average (or
central tendency), and a measure of variation.
• It is difficult to assess normality in a distribution
without a proper sample size. When possible, a
sample with more than 30 items should be used.
Key Concepts, continued
• Outliers are values far above or below other values of
a distribution.
• The use of mean and standard deviation is
inappropriate for distributions with outliers.
Probabilities based on normal distributions are
unreliable for data sets that contain outliers.
• Some outliers, like those caused by mistakes in data
entry, can be eliminated from a data set before a
statistical analysis is performed.
Key Concepts, continued
• Other outliers must be considered on a case-by-case
basis.
• Histograms and other graphs provide more efficient
methods to assess the normality of a distribution.
• If a histogram is approximately symmetric with a
concentration of values near the mean, then using a
normal distribution is reasonable (assuming there are
no outliers).
Key Concepts, continued
• If a histogram has most of its weight on the right side
of the graph with a long “tail” of isolated, spread-out
data points to the left of the median, the distribution is
said to be skewed to the left, or negatively skewed:
• In a negatively skewed distribution, the mean is often,
but not always, less than the median.
Key Concepts, continued
• If a histogram has most of its weight on the left side of
the graph with a long tail on the right side of the
graph, the distribution is said to be skewed to the
right, or positively skewed:
Key Concepts, continued
• In a positively skewed distribution, the mean is often,
but not always, greater than the median.
• Histograms should contain between 5 and 20
categories of data, including categories with
frequencies of 0 (empty categories).
• Recall that the 68–95–99.7 rule, also known as the
Empirical Rule, states that the percentages of data under the
normal curve are as follows: μ ± 1σ ≈ 68%,
μ ± 2σ ≈ 95%, and μ ± 3σ ≈ 99.7%.
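The percentages in the 68–95–99.7 rule can be checked numerically. The following is a minimal sketch in Python, assuming the standard identity P(|Z| < k) = erf(k/√2) for a standard normal variable Z; the function name is illustrative and is not part of the lesson.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints the fraction as a percentage rounded to two decimal places.
    print(f"within {k} standard deviation(s): {fraction_within(k):.2%}")
```

The printed values are close to 68%, 95%, and 99.7%, which is where the rule gets its name.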
Key Concepts, continued
• The 68–95–99.7 rule can also be used for a quick
assessment of normality. For example, in a sample
with fewer than 100 items, obtaining a z-score below
–3.0 or above +3.0 indicates possible outliers or skew.
• Graphing calculators and computers can be used to
construct normal probability plots, which are a more
advanced method for assessing normality.
• In a normal probability plot, the z-scores in a data set
are paired with their corresponding x-values.
Key Concepts, continued
• If the points in the normal plot are approximately
linear with no systematic pattern of values above and
below the line of best fit, then it is reasonable to
assume that the data set is normally distributed.
Common Errors/Misconceptions
• treating a data set that has outliers as if it were a
normal distribution
• removing outliers without justification
• adhering too strictly to the rules of thumb for assessing
normality
• deeming a distribution as normal when it is actually
skewed left or right
Ex. 1
What do we know?
μ = 165 mg/dL
σ = 20 mg/dL
What are the upper and lower bounds for values within
1 standard deviation of the mean?
𝜇 − 1𝜎 = 165 − 20 = 145
𝜇 + 1𝜎 = 165 + 20 = 185
So we now can calculate the z-scores of the upper
bound and the lower bound…
Then look up each z-score in a z-table…
* Percent of data below the z-score of the upper
bound: _______
* Percent of data below the z-score of the lower bound: _______
Then subtract the percent for the lower bound from the
percent for the upper bound:
* Is the percent of students within 1 standard deviation of the
mean close to 68%?
* Now repeat this for 2 standard deviations from the mean, and
then for 3 standard deviations from the mean.
* Is the percent of students within 2 standard deviations of the
mean close to 95%?
* Is the percent of students within 3 standard deviations of the
mean close to 99.7%?
What can you conclude from your findings?
In other words, is the data normally distributed?
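The bound, z-score, and lookup steps of Example 1 can be sketched in Python. This sketch assumes the table lookup gives the fraction of data below a z-score, which the standard library's `NormalDist().cdf` also returns; the names `MU`, `SIGMA`, and `percent_between` are illustrative only.

```python
from statistics import NormalDist

MU = 165.0     # mean from the slide, in mg/dL
SIGMA = 20.0   # standard deviation from the slide, in mg/dL

standard_normal = NormalDist()  # mean 0, standard deviation 1

def percent_between(k: int) -> float:
    """Percent of a normal model lying within k standard deviations of the mean."""
    lower, upper = MU - k * SIGMA, MU + k * SIGMA
    z_lower = (lower - MU) / SIGMA   # z-score of the lower bound
    z_upper = (upper - MU) / SIGMA   # z-score of the upper bound
    # cdf(z) plays the role of the z-table: the fraction of data below z.
    return 100 * (standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower))

for k in (1, 2, 3):
    print(f"{k} SD: {MU - k * SIGMA} to {MU + k * SIGMA} mg/dL, "
          f"{percent_between(k):.2f}% under the normal model")
```

These are the percentages a normal model predicts. To assess normality, compare them with the percentages actually observed in the data set, which is not reproduced in this transcript.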
Guided Practice
Example 2
In order to constantly improve instruction, Mr. Hoople
keeps careful records on how his students perform on
exams. The histogram on the next slide displays the
grades of 40 students on a recent United States history
test. The table next to it summarizes some of the
characteristics of the data. Use the properties of a
normal distribution to determine if a normal distribution is
an appropriate model for the grades on this test.
[Histogram: Recent U.S. History Test Scores. Horizontal axis: test score; vertical axis: number of students.]

Summary statistics
n         40
μ         80.5
Median    85
σ         18.1
Minimum   0
Maximum   98
Guided Practice: Example 2, continued
1. Analyze the histogram for symmetry and
concentration of values.
The histogram is asymmetric; there is a skew to the
left (or a negative skew). The mean (80.5) is
85.0 – 80.5 = 4.5 points less than the median (85.0).
Also, there appears to be a higher concentration of
values above the mean (80.5) than below the mean.
Guided Practice: Example 2, continued
2. Examine the distribution for outliers and
evaluate their significance, if any outliers
exist.
There is one low outlier (a score of 0) on this test. There
may be outside factors that affected this student’s
performance on the test, such as illness or lack of
preparation.
Guided Practice: Example 2, continued
3. Determine whether a normal distribution
is an appropriate model for this data.
Because of the outlier, the normal distribution is not
an appropriate model for this population.
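The slides identify the outlier by inspecting the histogram. One additional check, sketched here in Python, is to apply the z-score rule of thumb from the Key Concepts to the summary statistics. The variable names are illustrative, and the |z| > 3 threshold is the rule of thumb quoted earlier, not a formal test.

```python
mean = 80.5      # μ from the summary statistics
std_dev = 18.1   # σ from the summary statistics
minimum = 0      # lowest test score
maximum = 98     # highest test score

z_min = (minimum - mean) / std_dev
z_max = (maximum - mean) / std_dev

print(f"z-score of the minimum: {z_min:.2f}")
print(f"z-score of the maximum: {z_max:.2f}")

# Rule of thumb: in a small sample, |z| > 3 suggests a possible outlier or skew.
is_possible_outlier = abs(z_min) > 3
print("minimum flagged as a possible outlier:", is_possible_outlier)
```

The minimum lies more than 4 standard deviations below the mean, while the maximum is less than 1 standard deviation above it, which is consistent with the left skew and the outlier described above.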
Example 3
Rent at the Cedar Creek apartment complex includes all
utilities, including water. The operations manager at the
complex monitors the daily water usage of its residents. The
following table shows water usage, in gallons, for residents
of 36 apartments. To better assess the data, the manager
sorted the values from lowest to highest. Does the data
show an approximate normal distribution?
1 – Create a histogram…
How many categories?
Let's make an educated choice…
What is the range of the data?
Range = maximum value – minimum value
Range = 431 – 181 = 250
36 data points (apartments) span this range.
Let's make 6 categories.
2 – What’s the category width?
Category width = range ÷ number of categories
Category width = 250 ÷ 6 ≈ 41.67
… so, approximately 40 gallons.
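The category arithmetic can be sketched in Python as a way to check a choice of width before building the frequency table. The starting value of 180 gallons and the helper names `category_edges` and `frequency_table` are assumptions made for illustration; the transcript does not show which boundaries were actually used.

```python
def category_edges(minimum, maximum, width, start):
    """Return equal-width category boundaries, beginning at start, that cover the maximum."""
    if start > minimum:
        raise ValueError("start must not exceed the minimum value")
    edges = [start]
    # Keep adding boundaries until the last one lies strictly above the maximum.
    while edges[-1] <= maximum:
        edges.append(edges[-1] + width)
    return edges

def frequency_table(values, edges):
    """Count how many values fall in each category [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

edges_40 = category_edges(181, 431, width=40, start=180)  # assumed start of 180
edges_42 = category_edges(181, 431, width=42, start=181)  # 41.67 rounded up instead

print(len(edges_40) - 1, "categories with width 40:", edges_40)
print(len(edges_42) - 1, "categories with width 42:", edges_42)

# Only the minimum and maximum are known from the slide, so only they are counted here.
print(frequency_table([181, 431], edges_40))
```

Under these assumptions, a width of 40 starting at 180 needs seven categories to reach 431, one more than the six planned, while rounding 41.67 up to 42 and starting at 181 gives exactly six. Either choice stays within the recommended 5 to 20 categories.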
3 – Create a frequency table…
4 – Sketch a graph of this table…
5 – Describe the overall shape of the distribution.
• Is there any kind of skew?
• Where is the highest concentration of values?
• Are there any outliers?
6 – Drawing Conclusions…
• Statistical analysis requires your judgment.
• Can you assume normality?
Things to think about:
• What is the context of the problem?
• What will the calculations be used for?
Is the data to be used to make an informed
decision?
OR
Is it life impacting?
6 – Drawing Conclusions, Continued…
In Example 3…
• Why is one apartment using more water than
another?
Number of residents?
Do they have a washing machine or a dishwasher?
Do we know any of this information?
Can we assume normality?
Guided Practice
Example 4
Use a graphing calculator to construct a normal
probability plot of the following values. Do the data
appear to come from a normal distribution?
{1, 2, 4, 8, 16, 32}
Guided Practice: Example 4, continued
1. Use a graphing calculator or computer
software to obtain a normal probability plot.
Different graphing calculators and computer software
will produce different graphs; however, the following
directions can be used with TI-83/84 or TI-Nspire
calculators.
Guided Practice: Example 4, continued
On a TI-83/84:
Step 1: Press [STAT] to bring up the statistics menu. The
first option, 1: Edit, will already be highlighted.
Press [ENTER].
Step 2: Arrow up to L1 and press [CLEAR], then
[ENTER], to clear the list. Repeat this process to
clear L2 and L3 if needed.
Step 3: From L1, press the down arrow to move your
cursor into the list. Enter each number from the
data set, pressing [ENTER] after each number to
navigate down to the next blank spot in the list.
Guided Practice: Example 4, continued
Step 4: Press [Y=]. Press [CLEAR] to delete any
equations.
Step 5: Set the viewing window by pressing
[WINDOW]. Enter the following values, using
the arrow keys to navigate between fields
and [CLEAR] to delete any existing values:
Xmin = 0, Xmax = 35, Xscl = 5, Ymin = –3,
Ymax = 3, Yscl = 1, and Xres = 1.
Step 6: Press [2ND][Y=] to bring up the STAT
PLOTS menu.
Guided Practice: Example 4, continued
Step 7: The first option, Plot 1, will already be
highlighted. Press [ENTER].
Step 8: Under Plot 1, press [ENTER] to select “On”
if it isn’t selected already. Arrow down to
“Type,” then arrow right to the normal
probability plot icon (the last of the six icons
shown) and press [ENTER].
Step 9: Press [GRAPH].
Guided Practice: Example 4, continued
On a TI-Nspire:
Step 1: Press the [home] key.
Step 2: Arrow over to the spreadsheet icon and
press [enter].
Step 3: The cursor will be in the first cell of the first
column. Enter each number from the data
set, pressing [enter] after each number to
navigate down to the next blank cell.
Guided Practice: Example 4, continued
Step 4: Arrow up to the topmost cell of the column,
labeled “A.” Name the column “exp1” using
the letters and numbers on your keypad.
Press [enter].
Step 5: Press the [home] key. Arrow over to the data
and statistics icon and press [enter].
Step 6: Press the [menu] key. Arrow down to 2: Plot
Properties, then arrow right to bring up the
sub-menu. Arrow down to 4: Add X Variable,
if it isn’t already highlighted. Press [enter].
Guided Practice: Example 4, continued
Step 7: Arrow down to {…}exp1 if it isn’t already
highlighted. Press [enter]. This will graph the
data values along an x-axis.
Step 8: Press [menu]. The first option, 1: Plot Type,
will be highlighted. Arrow right to bring up
the next sub-menu. Arrow down to
4: Normal Probability Plot. Press [enter].
Guided Practice: Example 4, continued
Your graph should show the general shape of the plot
as follows.
Guided Practice: Example 4, continued
2. Analyze the graph to determine whether it
follows a normal distribution.
Do the points lie close to a straight line? If the data
lies close to the line, is roughly linear, and does not
deviate from the line of best fit with any systematic
pattern, then the data can be assumed to be normally
distributed. If any of these criteria are not met, then
normality cannot be assumed.
Guided Practice: Example 4, continued
The data does not lie close to the line; the data is not
roughly linear. The data seems to curve about the line,
which suggests a pattern. Therefore, normality cannot
be assumed. The normal distribution is not an
appropriate model for this data set.
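For readers without a graphing calculator, the points of a normal probability plot can be approximated in Python. This is a sketch under stated assumptions: it uses the plotting position (i − 0.5)/n for the i-th smallest value, which may differ slightly from what a particular calculator uses, and it fits a least-squares line that predicts x from z. The function names are illustrative only.

```python
from statistics import NormalDist

data = [1, 2, 4, 8, 16, 32]

def normal_plot_points(values):
    """Pair each sorted value with the z-score of its assumed plotting position."""
    xs = sorted(values)
    n = len(xs)
    std_normal = NormalDist()
    # Assumed plotting position for the i-th smallest value: (i - 0.5) / n.
    zs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return xs, zs

def residuals_from_line(xs, zs):
    """Residuals of the least-squares line that predicts x from z."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxz = sum((z - mean_z) * (x - mean_x) for x, z in zip(xs, zs))
    szz = sum((z - mean_z) ** 2 for z in zs)
    slope = sxz / szz
    intercept = mean_x - slope * mean_z
    return [x - (intercept + slope * z) for x, z in zip(xs, zs)]

xs, zs = normal_plot_points(data)
residuals = residuals_from_line(xs, zs)
signs = ["+" if r > 0 else "-" for r in residuals]

for x, z, r in zip(xs, zs, residuals):
    print(f"x = {x:2d}  z = {z:+.2f}  residual = {r:+.2f}")
print("residual signs:", " ".join(signs))
```

The residuals are positive at both ends and negative in the middle, the kind of systematic curved pattern described above, rather than a random scatter about the line.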