Data and Data Analysis
Measures of Central Tendency
Used to interpret data by choosing one number to represent all the
numbers in the data set.
Types of Measures
Measures of central tendency:
Mean
Median
Mode
Measures of spread (variability):
Range
Quartiles and Interquartile Range
Mean (Average)
The sum of the numbers in a data set divided by
how many numbers are in that set.
Ex: Find the mean in the data.
12, 18, 22, 27, 27, 29, 30, 33, 33, 33, 45
(12 + 18 + 22 + 27 + 27 + 29 + 30 + 33 + 33 + 33 + 45) / 11 = 309 / 11 ≈ 28.09
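The same calculation can be sketched in a few lines of Python. This is only an illustration of the definition above; the variable names are chosen for this example.

```python
# Data set from the mean example.
data = [12, 18, 22, 27, 27, 29, 30, 33, 33, 33, 45]

total = sum(data)     # add up all the numbers
count = len(data)     # how many numbers are in the set
mean = total / count  # mean = sum divided by count

print(total, count, round(mean, 2))  # 309 11 28.09
```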
Median (Middle)
The middle number in a data set when the set is
arranged in order from least to greatest.
Example: Find the Median…
33, 17, 16, 23, 45, 21
First, put the data in order from least to greatest.
16, 17, 21, 23, 33, 45
There are 2 numbers in the middle, so to find the
median, you need to add the 2 numbers together and
divide by 2.
(21 + 23) / 2 = 44 / 2 = 22
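A minimal Python sketch of the median steps: sort the data, then take the middle number, or the average of the two middle numbers when the count is even. The function name median is a hypothetical helper for this example.

```python
def median(values):
    """Return the middle value of a data set (hypothetical helper)."""
    ordered = sorted(values)  # arrange from least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: there is exactly one middle number.
        return ordered[mid]
    # Even count: add the two middle numbers and divide by 2.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([33, 17, 16, 23, 45, 21]))  # 22.0
```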
Mode (Most)
The number that appears most often in a data set.
A data set may contain more than one mode.
Example: Find the Mode:
22, 16, 15, 31, 31, 10, 31, 15
The mode is 31.
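One way to find the mode in Python is to count how often each number appears and keep every number that reaches the highest count. Because a data set may have more than one mode, this sketch returns a list. The function name modes is a hypothetical helper.

```python
from collections import Counter

def modes(values):
    """Return every value that appears most often (hypothetical helper)."""
    counts = Counter(values)        # number -> how many times it appears
    highest = max(counts.values())  # the largest count in the data set
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([22, 16, 15, 31, 31, 10, 31, 15]))  # [31]
```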
Range
A measure of the variability in a data set (how much the
numbers vary or change).
To find the range, calculate the difference between
the largest and smallest numbers.
Example: Find the Range
11, 15, 18, 21, 27, 33, 33, 35, 40
40-11=29
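In Python, the range of this example can be checked with the built-in max and min functions. The variable names are chosen for this sketch.

```python
# Data set from the range example.
data = [11, 15, 18, 21, 27, 33, 33, 35, 40]

# Range = largest number minus smallest number.
data_range = max(data) - min(data)

print(data_range)  # 29
```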
Quartiles and Interquartile Range
Quartile
One of the values that divide an ordered data set into four equal parts, each holding one fourth of the data.
Lower Quartile
The median of the lower half of the data set
Second Quartile
Median of the data set
Upper Quartile
The median of the upper half of the data set.
Interquartile Range
The difference between the upper quartile and the lower
quartile.
Quartile Example
What are the lower quartile, upper quartile, and
interquartile range of the following numbers?
21, 33, 45, 52, 47, 35, 39, 60, 63, 58, 70, 49
• Arrange the numbers from least to greatest to find
the medians of the upper and lower halves.
21, 33, 35, 39, 45, 47, 49, 52, 58, 60, 63, 70
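Lower Quartile: the median of 21, 33, 35, 39, 45, 47 is (35 + 39) / 2 = 37
Upper Quartile: the median of 49, 52, 58, 60, 63, 70 is (58 + 60) / 2 = 59
Interquartile Range: 59 - 37 = 22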
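The quartile example can be sketched in Python using the "median of each half" method described above. The function names are hypothetical helpers. One assumption is labeled in the code: when the data set has an odd count, the middle number is left out of both halves. Other conventions exist and can give slightly different quartiles.

```python
def median(ordered):
    """Median of a list that is already sorted (hypothetical helper)."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    Assumes at least two values. Assumption: for an odd count,
    the middle number is excluded from both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]   # numbers below the median
    upper_half = ordered[-half:]  # numbers above the median
    return median(lower_half), median(ordered), median(upper_half)

data = [21, 33, 45, 52, 47, 35, 39, 60, 63, 58, 70, 49]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # interquartile range

print(q1, q3, iqr)  # 37.0 59.0 22.0
```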
Percentiles
A measure that tells what percent of the total
frequency (the total number of numbers in the data
set) is scored at or below that measure.
To find the percentile of a data set, arrange the data
in order from least to greatest.
Compute the index, the position of the percentile in
the ordered set, by multiplying the percent by the
frequency.
If the product is not an integer, round up.
Percentile Example
The following list represents the scores that 15 students
received on the last science quiz.
13, 14, 16, 17, 19, 19, 20, 20, 21, 21, 21, 22, 24, 24, 25
If Wilson’s score was at the 93rd percentile, what score
did Wilson receive?
Convert 93% to a decimal (0.93) and multiply by the
frequency (15).
0.93 × 15 = 13.95
Since 13.95 is not an integer, round up to 14. Wilson’s
score is the 14th score listed. Therefore, Wilson received a
score of 24 on the quiz.
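The index rule above can be sketched in Python. The function name score_at_percentile is a hypothetical helper, and the percent is passed as a whole number (93 rather than 0.93) so that the multiplication stays exact before rounding up. The slide only covers the case where the product is not an integer. Using the index unchanged when the product is an integer is an assumption of this sketch.

```python
import math

def score_at_percentile(ordered_scores, percent):
    """Return the score at the given percentile (hypothetical helper).

    ordered_scores must already be sorted from least to greatest.
    """
    # Index = percent x frequency, rounded up if it is not an integer.
    index = math.ceil(percent * len(ordered_scores) / 100)
    # Positions in the slide count from 1; Python lists count from 0.
    return ordered_scores[index - 1]

scores = [13, 14, 16, 17, 19, 19, 20, 20, 21, 21, 21, 22, 24, 24, 25]
print(score_at_percentile(scores, 93))  # 24
```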
Representing Data
Here are the different graphs that can be used to represent data…
Types of Data
Discrete Data
Data that can be counted
Continuous Data
Data that can take any of an infinite number of values
between whole numbers.
The recorded values are approximations (measurements).
Bar Graph
Used to compare amounts.
Uses vertical or
horizontal bars to show
data.
Sports Drink Sales
Color     Number Sold
Blue      170
Orange    106
Red       145
Purple    98
Histogram
A type of bar graph that is
used to show continuous
data:
Bars are always vertical.
Bars are always connected
to each other.
The horizontal axis is
labeled using intervals.
Theater Arrivals
Time         Number of People
2:45-2:46    8
2:47-2:48    10
2:49-2:50    15
2:51-2:52    40
2:53-2:54    6
2:55-2:56    18
2:57-2:58    35
2:59-3:00    28
Line Graph
Useful for showing trends
in data over a period of
time.
Trend
A clear direction or pattern
in a graph that suggests
how the data values will
behave in the future.
Pine Tree Growth
Year    Height (inches)
2001    6
2002    22
2003    36
2004    51
2005    61
2006    69
Circle Graph (Pie Chart)
Used to show how different parts of a whole compare to one another.
Each part can be expressed as a fraction or as a percent.
Shows data from one particular time and does not show trends or
changes over a period of time.
Favorite Sports
Sport         Number of Votes    Percent
Soccer        75                 30%
Basketball    100                40%
Volleyball    50                 20%
Tennis        25                 10%
Total         250                100%
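The percent column of the Favorite Sports table can be reproduced by dividing each part by the whole. This is a short Python sketch, and the variable names are chosen for this example.

```python
# Votes from the Favorite Sports table.
votes = {"Soccer": 75, "Basketball": 100, "Volleyball": 50, "Tennis": 25}

total = sum(votes.values())  # the whole

# Each part as a percent of the whole.
percents = {sport: 100 * n / total for sport, n in votes.items()}

for sport, pct in percents.items():
    print(sport, votes[sport], f"{pct:.0f}%")
print("Total", total, f"{sum(percents.values()):.0f}%")
```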
Scatterplot
Used to show how closely 2
data sets are related.
Water Depth
Time (min)    Depth of Water (mm)
1             20
2             37
3             49
4             64
5             77
6             90
7             105
8             120
9             137
10            150
11            165
12            178
13            193
14            209
15            226
16            240
Correlation and Trend Lines
When data are plotted in a scatterplot, the closer the points
come to forming a straight, slanted line, the stronger the
correlation.
If 2 data sets are correlated, a trend line can be drawn to
approximate missing data.
A trend line has close to the same number of points above and below it.
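The slides do not say how to draw a trend line. One common approach, used here as an assumption, is a least-squares fit. The Python sketch below applies it to the Water Depth data, counts the points above and below the fitted line, and uses the line to approximate a value that was not measured. The function name fit_trend_line is a hypothetical helper, and the depths 209 and 226 are read from digits that were split in the table.

```python
# Water Depth data: time in minutes, depth in mm.
# Assumption: 209 and 226 are reconstructed from split digits in the table.
times = list(range(1, 17))
depths = [20, 37, 49, 64, 77, 90, 105, 120,
          137, 150, 165, 178, 193, 209, 226, 240]

def fit_trend_line(xs, ys):
    """Least-squares slope and intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_trend_line(times, depths)

# Count how many points fall above and below the fitted line.
above = sum(1 for x, y in zip(times, depths) if y > slope * x + intercept)
below = sum(1 for x, y in zip(times, depths) if y < slope * x + intercept)

# Approximate a missing value: the depth at 8.5 minutes.
estimate = slope * 8.5 + intercept

print(round(slope, 2), round(intercept, 2), above, below, round(estimate, 1))
```

With these values the slope comes out to roughly 14.6 mm per minute, which matches the steady rise visible in the table.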