Transcript The mean

Data
Description
Instructor: Alaa saud
Note: This PowerPoint is only a summary and your main source should
be the book.
Outline:
• Introduction.
3-1 Measures of Central Tendency.
3-2 Measures of Variation.
3-3 Measures of Position.
3-4 Exploratory Data Analysis
3-1 Measures of Central
Tendency.
Mean, Median, Mode, Midrange,
Weighted Mean.
A statistic is a characteristic or measure obtained by using all the data values from a sample.
A parameter is a characteristic or measure obtained by using all the data values from a specific population.
1-The Mean:
• The mean is the sum of the values divided by the total number of values.
• The symbol for the sample mean is $\bar{X}$:
$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$
where n is the number of values in the sample.
• The symbol for the population mean is $\mu$:
$\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N}$
where N is the number of values in the population.
Example 3-1: Days Off per Year
The data represent the number of days off per year for
a sample of individuals selected from nine different
countries. Find the mean.
20, 26, 40, 36, 23, 42, 35, 24, 30
$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$
$\bar{X} = \frac{20 + 26 + 40 + 36 + 23 + 42 + 35 + 24 + 30}{9} = \frac{276}{9} \approx 30.7$
The mean number of days off is 30.7 days.
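The days-off calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `sample_mean` is an illustrative choice and does not come from the slides or the book.

```python
def sample_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Data from Example 3-1 (days off per year in nine countries).
days_off = [20, 26, 40, 36, 23, 42, 35, 24, 30]

print(sum(days_off), len(days_off))      # 276 9
print(round(sample_mean(days_off), 1))   # 30.7
```

The same function reproduces the boat registration result when given that data set.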
Example 3-1P(106):Area Boat Registrations.
• The data are:
3782 6367 9002 4208 6843 11008
Find the mean.
• Solution:
$\bar{X} = \frac{3782 + 6367 + 9002 + 4208 + 6843 + 11008}{6} = \frac{41210}{6} \approx 6868.3$
• The mean of the six county boat registrations is 6868.3.
Rounding Rule: Mean
The mean should be rounded to one more decimal
place than occurs in the raw data.
The mean, in most cases, is not an actual data
value.
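One possible way to apply the rounding rule automatically is sketched below. It assumes the raw values are supplied as text, so that the number of decimal places shown in the data is preserved. The helper names `decimal_places` and `rounded_mean` are hypothetical.

```python
from decimal import Decimal


def decimal_places(raw_text):
    """Count the digits after the decimal point in one raw value given as text."""
    exponent = Decimal(raw_text).as_tuple().exponent
    return max(0, -exponent)


def rounded_mean(raw_values):
    """Round the mean to one more decimal place than occurs in the raw data."""
    places = max(decimal_places(v) for v in raw_values)
    mean = sum(Decimal(v) for v in raw_values) / len(raw_values)
    return round(mean, places + 1)


# Whole-number data, so the mean is reported to one decimal place.
days_off = ["20", "26", "40", "36", "23", "42", "35", "24", "30"]
boats = ["3782", "6367", "9002", "4208", "6843", "11008"]

print(rounded_mean(days_off))  # 30.7
print(rounded_mean(boats))     # 6868.3
```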
2-The Median:
• The median is the midpoint of the data array, where the data array is the data set arranged in order.
• The median will be one of the data values if there is an odd number of values.
• The median will be the average of two data values if there is an even number of values.
• The symbol for the median is MD.
Example 3-4: Hotel Rooms
The number of rooms in the seven hotels in
downtown Pittsburgh is
713, 300, 618, 595, 311, 401, and 292.
Find the median.
• Sort in ascending order:
292, 300, 311, 401, 595, 618, 713
• The number of data values is odd, so select the middle value.
MD = 401
The median is 401 rooms.
Example 3-6: Tornadoes in the U.S.
The number of tornadoes that have occurred in the
United States over an 8-year period follows. Find the
median.
684, 764, 656, 702, 856, 1133, 1132, 1303
• Sort in ascending order:
656, 684, 702, 764, 856, 1132, 1133, 1303
• The number of data values is even, so average the two middle values.
$MD = \frac{764 + 856}{2} = \frac{1620}{2} = 810$
The median number of tornadoes is 810.
*Example 3-8 P(111):Magazines Purchased.
• The data are:
1 7 3 2 3 4
Find the median.
• Solution:
1, 2, 3, 3, 4, 7
$MD = \frac{3 + 3}{2} = 3$
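The odd and even cases of the median can be illustrated in Python. The following is a minimal sketch, with `median` as an illustrative function name, checked against the three median examples in this section.

```python
def median(values):
    """Return the midpoint of the data array (the sorted data set)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: the median is the middle data value.
        return ordered[middle]
    # Even number of values: average the two middle data values.
    return (ordered[middle - 1] + ordered[middle]) / 2


hotel_rooms = [713, 300, 618, 595, 311, 401, 292]           # Example 3-4
tornadoes = [684, 764, 656, 702, 856, 1133, 1132, 1303]     # Example 3-6
magazines = [1, 7, 3, 2, 3, 4]                              # Example 3-8

print(median(hotel_rooms))  # 401
print(median(tornadoes))    # 810.0
print(median(magazines))    # 3.0
```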
2-The Mode:
The value that occurs most often in a data set is
called the mode.
A data set that has only one mode is said to be
unimodal.
A data set that has two modes is said to be
bimodal.
A data set that has more than two modes is said to
be multimodal.
When no data value occurs more than once, the
data set is said to have no mode.
Example 3-9: NFL Signing Bonuses
Find the mode of the signing bonuses of eight NFL
players for a specific year. The bonuses in millions
of dollars are
18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10
You may find it easier to sort first.
10, 10, 10, 11.3, 12.4, 14.0, 18.0, 34.5
Select the value that occurs the most.
The mode is 10 million dollars.
Example 3-10: Coal Employees in PA
Find the mode for the number of coal employees per
county for 10 selected counties in southwestern
Pennsylvania.
110, 731, 1031, 84, 20, 118, 1162, 1977, 103, 752
No value occurs more than once.
There is no mode.
Example 3-11: Licensed Nuclear Reactors
The data show the number of licensed nuclear
reactors in the United States for a recent 15-year period. Find the mode.
104 104 104 104 104 107 109 109 109 110
109 111 112 111 109
104 and 109 both occur the most. The data set
is said to be bimodal.
The modes are 104 and 109.
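The three mode examples (one mode, no mode, two modes) can be reproduced with a short Python sketch. The function name `modes` is an illustrative choice. Returning an empty list when no value repeats is an assumption that mirrors the "no mode" convention used in these slides.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often, or [] if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # No data value occurs more than once: the data set has no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


bonuses = [18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10]                    # Example 3-9
coal = [110, 731, 1031, 84, 20, 118, 1162, 1977, 103, 752]              # Example 3-10
reactors = [104, 104, 104, 104, 104, 107, 109, 109, 109, 110,
            109, 111, 112, 111, 109]                                    # Example 3-11

print(modes(bonuses))   # [10]
print(modes(coal))      # []
print(modes(reactors))  # [104, 109]
```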
3-The Midrange:
• The midrange is defined as the sum of the
lowest and highest values in the data set,
divided by 2.
• The symbol for the midrange is MR.
$MR = \frac{\text{lowest value} + \text{highest value}}{2} = \frac{L.V + H.V}{2}$
Example 3-15: Water-Line Breaks
In the last two winter seasons, the city of
Brownsville, Minnesota, reported these
numbers of water-line breaks per month.
Find the midrange.
2, 3, 6, 8, 4, 1
$MR = \frac{1 + 8}{2} = \frac{9}{2} = 4.5$
The midrange is 4.5.
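As a quick check of the water-line example, the midrange can be computed in Python. This is a minimal sketch, and `midrange` is an illustrative function name.

```python
def midrange(values):
    """Return the sum of the lowest and highest values, divided by 2."""
    if not values:
        raise ValueError("the midrange of an empty data set is undefined")
    return (min(values) + max(values)) / 2


# Data from Example 3-15 (water-line breaks per month).
breaks = [2, 3, 6, 8, 4, 1]
print(midrange(breaks))  # 4.5
```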
4-The Weighted Mean:
• Find the weighted mean of a variable X by multiplying each value by its corresponding weight and dividing the sum of the products by the sum of the weights.
• $\bar{X} = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum wX}{\sum w}$
• where $w_1, w_2, \ldots, w_n$ are the weights
• and $X_1, X_2, \ldots, X_n$ are the values.
Example 3-17: Grade Point Average
A student received the following grades. Find
the corresponding GPA.
Course                       Credits, w   Grade, X
English Composition          3            A (4 points)
Introduction to Psychology   3            C (2 points)
Biology                      4            B (3 points)
Physical Education           2            D (1 point)

$\bar{X} = \frac{\sum wX}{\sum w} = \frac{3 \cdot 4 + 3 \cdot 2 + 4 \cdot 3 + 2 \cdot 1}{3 + 3 + 4 + 2} = \frac{32}{12} \approx 2.7$
The grade point average is 2.7.
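The GPA calculation can be reproduced in Python. The sketch below assumes the values and weights are given as two lists of equal length. The name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Return the sum of weight * value products divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Data from Example 3-17: credits are the weights, grade points are the values.
credits = [3, 3, 4, 2]
grade_points = [4, 2, 3, 1]

gpa = weighted_mean(grade_points, credits)
print(round(gpa, 1))  # 2.7
```

When every weight is equal, the result reduces to the ordinary mean.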
Summary of Measures of Central Tendency.
Measure    Definition                                          Symbol
Mean       Sum of values, divided by total number of values    $\mu$, $\bar{X}$
Median     Middle point in data array                          MD
Mode       Most frequent data value                            None
Midrange   Lowest value plus highest value, divided by 2       MR
Properties and Uses of
Central Tendency
Properties of the Mean
• Uses all data values.
• Varies less than the median or mode when samples are taken from the same population.
• Used in computing other statistics, such as the variance.
• Unique, and usually not one of the data values.
• Cannot be computed for a frequency distribution with an open-ended class.
• Affected by extremely high or low values, called outliers.
Properties of the Median
• Gives the midpoint.
• Used when it is necessary to find out whether the data values fall into the upper half or lower half of the distribution.
• Can be used for an open-ended distribution.
• Affected less than the mean by extremely high or extremely low values.
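The difference in sensitivity to extreme values can be seen in a small experiment. The sketch below uses Python's standard `statistics` module on the days-off data from Example 3-1. The added value 300 is a hypothetical outlier, not part of the original data.

```python
from statistics import mean, median

typical = [20, 26, 40, 36, 23, 42, 35, 24, 30]  # days-off data from Example 3-1
with_outlier = typical + [300]                  # 300 is a made-up extreme value

print(round(mean(typical), 1), median(typical))            # 30.7 30
print(round(mean(with_outlier), 1), median(with_outlier))  # 57.6 32.5
```

One extreme value moves the mean from 30.7 to 57.6, while the median moves only from 30 to 32.5.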
Properties of the Mode
• Used when the most typical case is desired.
• Easiest average to compute.
• Can be used with nominal data.
• Not always unique, and may not exist.
Properties of the Midrange
• Easy to compute.
• Gives the midpoint.
• Affected by extremely high or low values in a data set.
Distributions
Exercises:
Q(1): What is the most appropriate measure of central tendency for the following data set?
male, female, female, male, male, male, female
A) The midrange
B) The mode
C) The median
D) The mean
Q(2): Which value in the given data set would affect the mean the most?
5000, 9000, 7000, 40, 6000, 8000
A) 40
B) 5000
C) 8000
D) None of the above
Q(3): Calculate the mean, median, and midrange for the following numbers:
0, 14, 9, 0, 12
A) mean = 9, median = 9, midrange = 11.5
B) mean = 7, median = 9, midrange = 7
C) mean = 11.7, median = 0, midrange = 14
D) mean = 35, median = 12, midrange = 7
Use this stem and leaf plot to answer questions 4 and 5:
3 | 2 3
4 | 4 8
5 | 1 4 5
6 | 5 6 6 7
7 | 3 4 5 5 6 9
8 | 0 1 4 4 4 4 8 8 9
9 | 0 1 4 7
Q(4): Find the mode.
A) 84
B) 4
C) 9
D) 48
Q(5): Based on the distribution shape of the stem and leaf plot, choose the correct
statement that describes the relationship between the measures: mean, median, and
mode.
A) Mean = Median = Mode
B) Mean < Median < Mode
C) Mean > Median > Mode
D) The exact relationship cannot be determined