Displaying Categorical Variables Frequency Table

Displaying Categorical Variables
Frequency Table
Variable
Categories of
the Variable
Count of elements
from sample in each
category. Total = 498
Section 2.1, Page 24
1
Displaying Categorical Variables
Relative Frequency Circle Graph
Relative Frequency
or Proportion.
74/498 = 15%
Section 2.1, Page 25
2
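The relative frequency on this slide is just the category count divided by the sample total. A minimal sketch of that arithmetic, using the 74 and 498 from the slide (the variable names are illustrative, not from the original):

```python
# Count and sample total are taken from the slide; names are illustrative.
category_count = 74
sample_total = 498

relative_frequency = category_count / sample_total

print(f"{relative_frequency:.4f}")  # proportion: 0.1486
print(f"{relative_frequency:.0%}")  # rounded percentage: 15%
```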
Displaying Categorical Variables
Vertical Bar Graph
Section 2.1, Page 25
3
Displaying Categorical Variables
Pareto Chart for Hate Crimes USA 1993
Cumulative
count/relative
frequency
Frequency, Relative Frequency, and
Cumulative Relative Frequency Table
Section 2.1, Page 25
4
Displaying Quantitative Data
Dot Plots
Section 2.2, Page 26
5
Displaying Quantitative Data
Stem-and-Leaf Displays
To make a stem-and-leaf display, first find the minimum
and maximum values, 52 and 96. Then list the tens
digits, 5 through 9, in the left column as the stems.
Finally, plot each number opposite its tens digit. The
plotted leaf is the ones digit.
Section 2.2, Page 27
6
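The stem-and-leaf procedure above can be sketched in a few lines. The scores below are hypothetical; only the minimum of 52 and the maximum of 96 come from the slide, and the function name stem_and_leaf is an illustration rather than anything from the course materials.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]}: tens digits are stems, ones digits are leaves."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    # List every stem from the minimum to the maximum, even one with no leaves.
    low, high = min(values) // 10, max(values) // 10
    return {s: stems.get(s, []) for s in range(low, high + 1)}

# Hypothetical data: only the min (52) and max (96) match the slide.
scores = [52, 58, 63, 67, 67, 71, 74, 78, 80, 85, 89, 96]
display = stem_and_leaf(scores)

for stem, leaves in display.items():
    print(stem, "|", " ".join(str(d) for d in leaves))
```

Sorting first keeps the leaves on each stem in ascending order, which makes the display easier to read as a sideways histogram.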
Stem-and-Leaf Displays
Problems
Problems, Page 50
7
Displaying Quantitative Data
Ungrouped Frequency Distribution
Values of the variable in the data set
Frequency, or number of times each value occurs in the data set
Section 2.2, Page 29
8
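An ungrouped frequency distribution pairs each distinct value with the number of times it occurs. A small sketch with hypothetical data (the slide's own table did not survive in this transcript):

```python
from collections import Counter

# Hypothetical data set; the slide's actual values are not available here.
data = [3, 1, 2, 3, 0, 2, 3, 1, 2, 2]

frequency = Counter(data)

# Print each value of the variable next to its frequency, in ascending order.
for value in sorted(frequency):
    print(value, frequency[value])
```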
Displaying Quantitative Data
Grouped Frequency Distribution
Classes or
bins, usually
5 to 12 of
equal width
95 or more to less than 105
Section 2.2, Page 30
9
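A grouped frequency distribution counts how many values fall in each equal-width class, where a class such as "95 or more to less than 105" includes its lower boundary and excludes its upper one. The sketch below assumes a starting boundary of 95, a class width of 10, and hypothetical data; the function name grouped_frequency is illustrative.

```python
def grouped_frequency(values, start, width, n_classes):
    """Count values per class [lower, upper): lower included, upper excluded."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(n_classes)]

# Hypothetical data; only the class "95 to less than 105" is from the slide.
values = [97, 104, 105, 112, 118, 121, 99, 133, 126, 110, 140, 138]
table = grouped_frequency(values, start=95, width=10, n_classes=5)

for lower, upper, count in table:
    print(f"{lower} or more to less than {upper}: {count}")
```

Note that the value 105 lands in the second class, not the first, because each class excludes its upper boundary.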
Displaying Quantitative Data
Histograms
Histogram: A bar graph that represents a
frequency distribution of a quantitative variable.
[Histogram figure: vertical axis is Count; horizontal axis shows 5 to 12 equal-sized classes or bins]
Section 2.2, Page 32
10
Histograms
Shapes of Distributions
Section 2.2, Page 33
11
Calculator Fundamentals
Clear the Home screen: “Clear” key
Setting the calculator decimal places:
“Mode” key: Down arrow key to
“FLOAT”, right arrow key to desired
number, then “ENTER”
Entering data into a List: “STAT-Edit-Enter”: Type in each number followed
by “ENTER” or down arrow.
Deleting a number from a list: Position
the cursor over the number and press
“Del” key.
Clear a List: Position the cursor over the list
title, press “Clear-Enter”. Caution: if you
press “DEL” you will eliminate the entire list
position.
Return to Home Screen: “2nd – Quit”
Scientific Notation: When an answer will
be less than 1 with more than three zeros
after the decimal point, the calculator will
return the answer in Scientific Notation. For
example, the number 5.3E-5 is converted to
normal notation by moving the decimal point
5 places left: 0.000053.
Section 2.2
12
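The scientific notation conversion can be checked away from the calculator. This is a sketch in Python, not a TI-83 feature: it parses the slide's example 5.3E-5 and prints it in ordinary decimal notation.

```python
# Parse the calculator-style notation from the slide.
value = float("5.3E-5")

# Fixed-point formatting with six decimal places shows the shifted decimal point.
print(f"{value:.6f}")  # 0.000053
```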
Constructing Histogram
TI-83 Calculator
(50 States)
Enter Data:
STAT-1:Edit-ENTER Type all the data in L1
Set up Plot:
2nd Stat Plot Enter --Turn plot ON, select Histogram Icon,
enter XList: as L1 and Freq: as 1
Set the Viewing Window
Zoom 9: ZoomStat – Hit Trace key then arrows to view
axes values.
Change category size to 7
Window –Make Xscl= 7. Then hit Graph Key
Display class width and frequency.
Trace
Section 2.2
13
TI-83 Histogram Display
# of
States
% College Students Enrolled in Public Institutions
The leftmost class or bin shows the number of
states with values from 44 to less than 51. There
are 2 states in this bin. To see the next bin, hit
the right arrow button.
Section 2.2 WS #21
14
Histogram Problem
2.4 Heights of NBA players selected in the June
2004 Draft.
a. Construct a histogram. Be sure to
show the scale and the label for the x
and y axes.
b. Describe the shape of the distribution.
Section 2.2, Page 50
15
Cumulative Frequency Distribution
Final Exam Scores for 50 Students
Cumulative Relative
Frequency
2/50 = .04
4/50 = .08
11/50 =.22
24/50 = .48
35/50 = .70
46/50 = .92
50/50 = 1.0
Cumulative Relative Frequency
For classes <65,
11/50 = .22
Section 2.2, Page 34
16
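Each cumulative relative frequency above is a running total of class counts divided by n = 50. The sketch below rebuilds the column; the per-class frequencies are not shown in this transcript, so they are inferred here by differencing the cumulative counts 2, 4, 11, 24, 35, 46, 50 from the slide.

```python
from itertools import accumulate

# Per-class frequencies inferred from the slide's cumulative counts (an assumption).
class_frequencies = [2, 2, 7, 13, 11, 11, 4]
n = sum(class_frequencies)

cumulative_counts = list(accumulate(class_frequencies))
cumulative_relative = [count / n for count in cumulative_counts]

for count, rel in zip(cumulative_counts, cumulative_relative):
    print(f"{count}/{n} = {rel:.2f}")
```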
Measures of Central Tendency
Mean
Find the sample mean for the set {6, 3, 8, 6, 4}
Section 2.3, Page 35
17
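The sample mean is the sum of the values divided by the number of values. A short sketch for the set on this slide:

```python
data = [6, 3, 8, 6, 4]  # the set from the slide

total = sum(data)   # 27
n = len(data)       # 5
sample_mean = total / n

print(sample_mean)  # 5.4
```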
Measures of Central Tendency
Median
The median is the value of the middle number when the
data are ranked according to size.
Find the median for the following data set with an odd
n: {3, 3, 5, 6, 8}, n=5. The data values are in ascending
order. Depth of median = (n+1)/2. For this set: (5+1)/2 = 3.
The median is the 3rd number, 5.
Find the median for the following data values that are in
ascending order with even n: {6, 7, 8, 9, 9, 10}, n=6.
Depth of median = (n+1)/2 = 3.5
The median is then the average of the 3rd and 4th
numbers. The median is (8+9)/2 = 8.5
Section 2.3, Page 36
18
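The depth rule for the median translates directly into code. In this sketch the function name median_by_depth is illustrative, and the input is assumed to be already sorted in ascending order, as on the slide.

```python
def median_by_depth(sorted_values):
    """Median via depth = (n + 1) / 2 on data already in ascending order."""
    n = len(sorted_values)
    depth = (n + 1) / 2
    if depth.is_integer():
        # Odd n: the median is the value at that position (1-based).
        return sorted_values[int(depth) - 1]
    # Even n: average the two values on either side of the half position.
    lower = sorted_values[int(depth) - 1]
    upper = sorted_values[int(depth)]
    return (lower + upper) / 2

print(median_by_depth([3, 3, 5, 6, 8]))       # 5
print(median_by_depth([6, 7, 8, 9, 9, 10]))   # 8.5
```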
Measures of Central Tendency
Mode and Midrange
(L+H)/2 = (3+8)/2 = 5.5
Section 2.3, Page 37
19
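The midrange is the average of the lowest value L and the highest value H, and the mode is the most frequent value. The slide's data set is not shown in this transcript, so the sketch below assumes the set {6, 3, 8, 6, 4} from the mean slide, which has the same L = 3 and H = 8.

```python
from statistics import mode

# Assumed data set (borrowed from the mean slide); it matches L = 3 and H = 8.
data = [6, 3, 8, 6, 4]

low, high = min(data), max(data)
midrange = (low + high) / 2
mode_value = mode(data)

print(midrange)    # 5.5
print(mode_value)  # 6
```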
Measures of Central Tendency
Summary
The most useful measure is the mean. However,
when a set of numbers has outliers, the mean gets
distorted and may not be representative of the
central tendency. When this happens, the median is
a better measure of central tendency because it is
not affected by outliers.
Section 2.3, Page 37
20
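A quick way to see the effect of an outlier is to change one value and recompute both measures. The sketch below starts from the set {3, 3, 5, 6, 8} used on the median slide and replaces 8 with a hypothetical outlier of 80.

```python
from statistics import mean, median

original = [3, 3, 5, 6, 8]
with_outlier = [3, 3, 5, 6, 80]  # hypothetical outlier replaces the 8

print(mean(original), median(original))          # 5 5
print(mean(with_outlier), median(with_outlier))  # 19.4 5
```

The mean jumps from 5 to 19.4 while the median stays at 5.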
Measures of Dispersion
Range
Section 2.4, Page 39
21
Measures of Dispersion
Variance and Standard Deviation
{6, 3, 8, 5, 3 }
Section 2.4, Page 41
22
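For the set on this slide, the sample variance is the sum of squared deviations from the mean divided by n - 1, and the standard deviation is its square root. A step-by-step sketch, assuming the sample (n - 1) form is the one intended:

```python
import math

data = [6, 3, 8, 5, 3]  # the set from the slide
n = len(data)

sample_mean = sum(data) / n                               # 25 / 5 = 5.0
squared_deviations = [(x - sample_mean) ** 2 for x in data]
sum_of_squares = sum(squared_deviations)                  # 1 + 4 + 9 + 0 + 4 = 18.0

variance = sum_of_squares / (n - 1)                       # 18 / 4 = 4.5
standard_deviation = math.sqrt(variance)

print(variance)                       # 4.5
print(round(standard_deviation, 4))   # 2.1213
```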
Measures of Position
Percentiles
Percentiles: Values of the variable that divide a set of
ranked data into 100 equal subsets: each set of data
has 99 percentiles.
Each percentile is a specific number from
within the range of values
in the set.
Section 2.5, Page 42
23
Finding Percentiles
Example
Sample data set of 20 numbers in ascending rank
order:
{6, 12, 14, 17, 23, 27, 29, 33, 42, 51,
59, 65, 69, 74, 79, 82, 84, 88, 92, 97}
Find the 21st Percentile. Sample size n=20.
Calculate the depth: percentile*n/100 =
21*20/100 = 4.2.
(If the depth is an integer, Pk is the average of
the number and the next number. If the depth
contains a decimal, Pk is the next number.)
Since the depth contains a decimal, Pk is the
next number, the 5th number, Pk = 23.
Find the 75th Percentile:
Depth = 75*20/100 = 15. Since the depth is an
integer, the 75th percentile is the average of the
15th and 16th numbers, (79+82)/2=80.5.
Section 2.5, Page 43
24
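The depth rule in this example can be written as a short function. This is a sketch of the rule as stated on the slide, not of any library's percentile method (other software may interpolate differently). The function name percentile is illustrative, and k is assumed to be a whole number from 1 to 99.

```python
def percentile(sorted_values, k):
    """k-th percentile by the depth rule: depth = k * n / 100."""
    n = len(sorted_values)
    whole, remainder = divmod(k * n, 100)
    if remainder == 0:
        # Integer depth: average the number at that depth and the next number.
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Depth with a decimal part: take the next number.
    return sorted_values[whole]

data = [6, 12, 14, 17, 23, 27, 29, 33, 42, 51,
        59, 65, 69, 74, 79, 82, 84, 88, 92, 97]

print(percentile(data, 21))  # 23
print(percentile(data, 75))  # 80.5
```

Using divmod on k * n keeps the integer-or-decimal test in whole-number arithmetic, so it does not depend on floating-point rounding.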
Using the TI-83 to Find Percentiles
Find the 21st and the 75th percentile of the following data
set.
{6, 12, 14, 17, 23, 27, 29, 33, 42, 51,
59, 65, 69, 74, 79, 82, 84, 88, 92, 97}
STAT-EDIT: Enter the data in L1
PRGM: down arrow to PRCNTILE
ENTER: (Copies program to home
screen)
ENTER: (Displays Program Input Page)
2nd L1: (Enters the List name)
ENTER: (Asks for Percentile)
21.0: (Enters the desired percentile)
ENTER: (Displays the 21st percentile)
ENTER-2ND L1-75: (Displays the 75th percentile)
CLEAR: (Clears the home screen)
Section 2.5, Page 43
25
5-Number Summary
Box and Whisker Display
5-number summary, in order: L (lowest value), Q1, Med (median), Q3, H (highest value)
Interquartile Range = Q3-Q1
Range of middle 50% of values
Measure of dispersion resistant to outliers.
Section 2.5, Page 44
26
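A 5-number summary and interquartile range can be sketched by treating Q1, Med, and Q3 as the 25th, 50th, and 75th percentiles under the depth rule from the percentile slides. That equivalence is an assumption here, as is reusing the 20-number data set from the percentile example; calculators and other software may use a slightly different quartile method.

```python
def percentile(sorted_values, k):
    """k-th percentile by the depth rule (k a whole number from 1 to 99)."""
    whole, remainder = divmod(k * len(sorted_values), 100)
    if remainder == 0:
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    return sorted_values[whole]

# Data set reused from the percentile example (an assumption for illustration).
data = [6, 12, 14, 17, 23, 27, 29, 33, 42, 51,
        59, 65, 69, 74, 79, 82, 84, 88, 92, 97]

low, high = data[0], data[-1]
q1 = percentile(data, 25)
med = percentile(data, 50)
q3 = percentile(data, 75)

summary = (low, q1, med, q3, high)
iqr = q3 - q1

print(summary)  # (6, 25.0, 55.0, 80.5, 97)
print(iqr)      # 55.5
```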
TI – 83 Problem (1)
a. Find the mean and standard deviation of the sample
data.
STAT – EDIT: Enter the data in L1
PRGM – SAMPSTAT - ENTER
2ND L1 - ENTER
DISPLAY: Sample Mean, Standard Deviation, Variance
Problems, Page 52
27
TI-83 Problem (2)
b. Find the Interquartile range (IQR)
Q3 – Q1 = 32 – 28 = 4
c. Find the range.
Max – Min = 34 – 25 = 9
Problems, Page 50
28
TI-83 Problem (3)
d. Make a box and whisker display of the data.
2ND STAT PLOT-ENTER
ENTER: Sets plot to ON
DOWN ARROW
RIGHT ARROW 5 TIMES: Select Box Plot
DOWN ARROW – 2nd L1: Select List
ZOOM – 9: Display the plot
TRACE, then RIGHT-LEFT ARROW: Display 5-number
summary
Problems, Page 50
29
Summary: Measures of Center
and Spread
The mean and median are measures of the center of
a distribution. Outliers will distort the mean, so when
outliers are present the mean is not a good measure
of the center. The median is not distorted by outliers.
The standard deviation, variance, range, and
Interquartile range (IQR) are measures of the
spread or variability of a distribution. Outliers will
distort the standard deviation, variance, and range,
so when outliers are present, these are not good
measures of the spread or variability. The
Interquartile range is not distorted by outliers.
When outliers are present, then use the median and
IQR as measures of the center and spread.
When no significant outliers are present, use the
mean and standard deviation as measures of center
and spread. These measures allow use of the
maximum number of statistical tools that rely on
the distribution.
Section 2.4
30
Problem
a. Find the mean, variance, and standard deviation.
b. Find the 5-number summary.
c. Make a box and whisker display and label the
numbers.
d. Calculate the Interquartile range and the range
e. Describe the shape of the distribution
f. Find the 33rd percentile.
Problems, Page 50
31
Problem
a. Find the mean, variance, and standard deviation.
b. Find the 5-number summary.
c. Make a box and whisker display and label the
numbers.
d. Calculate the Interquartile range and the range
e. Describe the shape of the distribution.
f. Find the 90th percentile.
Problems, Page 50
32