DM15: Visualization and Data Mining


Visualization and Data Mining
Introduction
 What is Data Visualization?
 How does Data Visualization Work?
 History: Jacques Bertin
 Image Theory
 “Image”: a definition
 Data Visualization and its use today
 What are the benefits of Data Visualization?
 Examples of Data Visualization
 Conclusion
 References
What is Data Visualization?
 Data visualization is the process of converting raw data
into easily understood pictures of information that
enable fast and effective decisions.
Data -> Easily Understood Pictures
 Jacques Bertin, who wrote the classic work on graphical
visualization, “Semiology of Graphics,” states that the
“transformation from numbers to insight requires two stages.”
Data/Processes -> (Algorithm) -> Image -> (Perception) -> Insight
Bertin’s 7 Visual Variables
 Seven Visual Variables
 position
 form
 orientation
 color
 texture
 value
 size
 combined with a visual semantics for linking data attributes to visual elements
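One way to picture the "visual semantics" mentioned above is as a lookup table from data attributes to visual variables. The sketch below is only an illustration: the attribute names (`mileage`, `price`, `make`) and the helper `unused_variables` are hypothetical and do not come from Bertin.

```python
from enum import Enum


class VisualVariable(Enum):
    """Bertin's seven visual variables, as listed on the slide."""
    POSITION = "position"
    FORM = "form"
    ORIENTATION = "orientation"
    COLOR = "color"
    TEXTURE = "texture"
    VALUE = "value"
    SIZE = "size"


# Hypothetical encoding for a used-car data set (assumed attribute names).
encoding = {
    "mileage": VisualVariable.POSITION,
    "price": VisualVariable.SIZE,
    "make": VisualVariable.COLOR,
}


def unused_variables(mapping):
    """Return the visual variables still free to carry another attribute."""
    used = set(mapping.values())
    return [v for v in VisualVariable if v not in used]


for attribute, variable in encoding.items():
    print(f"{attribute} -> {variable.value}")
print("still available:", [v.value for v in unused_variables(encoding)])
```

Keeping the mapping explicit makes it easy to see which variables are already carrying data and which remain available.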
Image Theory
 Visual Processing occurs in 3 steps.
 1) formation of the retinal image,
 2) decomposition of the retinal image information into
an array of specialized representations and
 3) reassembly of the information into object perception.
Uses Today
 Data-driven actions are increasingly taken without access
to the information provided by traditional information
presentation.
 Information visualization is emerging as an important
fusion of graphics, scientific visualization, database, and
human-computer interaction.
 The military and commercial industries use data visualization to convey
complex results as understandable images.
What are the benefits of Data Visualization?
 Data visualization allows users to see several different
perspectives of the data.
 Data visualization makes it possible to interpret vast
amounts of data.
 Data visualization offers the ability to note exceptions in
the data.
 Data visualization allows the user to analyze visual
patterns in the data.
Outline
 Graphical excellence and lie factor
 Representing data in 1, 2, and 3-D
 Representing data in 4+ dimensions
 Parallel coordinates
 Scatterplots
 Stick figures
Visualization Role
 Support interactive exploration
 Help in result presentation
 Disadvantage: requires human eyes
 Can be misleading
Bad Visualization:
Spreadsheet
Year Sales
1999 2,110
2000 2,105
2001 2,120
2002 2,121
2003 2,124
(Chart: Sales by year, 1999 to 2003; Y-axis runs from 2095 to 2130)
What is wrong with this graph?
Bad Visualization:
Spreadsheet with misleading Y-axis
Year Sales
1999 2,110
2000 2,105
2001 2,120
2002 2,121
2003 2,124
(Chart: Sales by year, 1999 to 2003; Y-axis runs from 2095 to 2130)
Y-Axis scale gives WRONG
impression of big change
Better Visualization
Year Sales
1999 2,110
2000 2,105
2001 2,120
2002 2,121
2003 2,124
(Chart: Sales by year, 1999 to 2003; Y-axis runs from 0 to 3000)
Axis scale from 0 to 3000 gives
correct impression of small change
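The effect of the axis choice can be put in numbers. The sketch below assumes each value is drawn as a bar rising from the bottom of the Y-axis (the slides do not say which chart type was used) and compares the change in drawn height with the change in the data. The axis ranges 2095 to 2130 and 0 to 3000 are read from the tick labels on the slides; the function names are illustrative.

```python
# Sales figures from the spreadsheet on the slides.
sales = {1999: 2110, 2000: 2105, 2001: 2120, 2002: 2121, 2003: 2124}


def bar_height(value, axis_min, axis_max):
    """Fraction of the plot height occupied by a bar drawn for `value`."""
    return (value - axis_min) / (axis_max - axis_min)


def apparent_change(first, last, axis_min, axis_max):
    """Relative change in drawn bar height between two values."""
    h_first = bar_height(first, axis_min, axis_max)
    h_last = bar_height(last, axis_min, axis_max)
    return (h_last - h_first) / h_first


# Relative change in the data itself, 1999 to 2003.
actual = (sales[2003] - sales[1999]) / sales[1999]
# Relative change as drawn with the truncated axis and the zero-based axis.
truncated = apparent_change(sales[1999], sales[2003], 2095, 2130)
zero_based = apparent_change(sales[1999], sales[2003], 0, 3000)

print(f"actual change in data:       {actual:.2%}")
print(f"apparent change, 2095-2130:  {truncated:.2%}")
print(f"apparent change, 0-3000:     {zero_based:.2%}")
```

Under this assumption the truncated axis makes a change of well under 1% look like a bar that nearly doubles in height, while the zero-based axis reproduces the change in the data.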
Lie Factor
$$\text{Lie Factor} = \frac{\text{size of effect shown in graphic}}{\text{size of effect in data}}$$

$$\text{Lie Factor} = \frac{(5.3 - 0.6)/0.6}{(27.5 - 18.0)/18.0} = \frac{7.833}{0.528} = 14.8$$
Tufte requirement: 0.95<Lie Factor<1.05
(E.R. Tufte, “The Visual Display of Quantitative Information”, 2nd edition)
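The arithmetic in the Lie Factor example can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the slide does not say what the four numbers measure, so they are treated here simply as a first and second value in the graphic and in the data.

```python
def relative_effect(first, second):
    """Size of an effect as a relative change from `first` to `second`."""
    return abs(second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect in the data."""
    shown = relative_effect(graphic_first, graphic_second)
    real = relative_effect(data_first, data_second)
    return shown / real


def meets_tufte_requirement(factor):
    """Tufte requirement quoted on the slide: 0.95 < Lie Factor < 1.05."""
    return 0.95 < factor < 1.05


# Numbers from the slide: 0.6 -> 5.3 in the graphic, 18.0 -> 27.5 in the data.
factor = lie_factor(0.6, 5.3, 18.0, 27.5)
print(round(factor, 1))                 # 14.8
print(meets_tufte_requirement(factor))  # False
```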
Tufte’s Principles of
Graphical Excellence
 Give the viewer
 the greatest number of ideas
 in the shortest time
 with the least ink in the smallest space.
 Tell the truth about the data!
(E.R. Tufte, “The Visual Display of Quantitative Information”, 2nd edition)
Visualization Methods
 Visualizing in 1-D, 2-D and 3-D
 well-known visualization methods
 Visualizing more dimensions
 Parallel Coordinates
 Other ideas
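In a parallel coordinates plot, each dimension gets its own vertical axis and each record becomes a polyline crossing every axis at the height of its value. The sketch below computes those polyline vertices only, with no drawing. The car records and the function name are hypothetical, and scaling each axis to its own minimum and maximum is one common choice rather than the only one.

```python
def parallel_coordinates(records, dimensions):
    """Return one polyline per record as (axis_index, height) pairs.

    Heights are scaled to [0, 1] per dimension so that axes with
    different units can sit side by side.
    """
    lows = {d: min(r[d] for r in records) for d in dimensions}
    highs = {d: max(r[d] for r in records) for d in dimensions}
    polylines = []
    for record in records:
        line = []
        for axis_index, d in enumerate(dimensions):
            span = highs[d] - lows[d]
            # A constant dimension has no spread; place it mid-axis.
            height = (record[d] - lows[d]) / span if span else 0.5
            line.append((axis_index, height))
        polylines.append(line)
    return polylines


# Hypothetical 4-dimensional records.
cars = [
    {"price": 10000, "mileage": 20, "age": 5, "doors": 4},
    {"price": 20000, "mileage": 30, "age": 1, "doors": 2},
]
lines = parallel_coordinates(cars, ["price", "mileage", "age", "doors"])
for line in lines:
    print(line)
```

Because every record crosses every axis, adding a dimension only adds one more axis, which is why the method extends past three dimensions.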
1-D (Univariate) Data
 Representations
 Tukey box plot (marks low, middle 50%, high, and mean)
 Histogram
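Both 1-D representations reduce a list of values to a handful of numbers before anything is drawn. The sketch below computes them with the Python standard library. It assumes "low" and "high" mean the minimum and maximum (some box plot variants place the whiskers elsewhere) and uses the "inclusive" quartile method, which is one of several conventions. The sample values are made up.

```python
import statistics


def box_plot_summary(values):
    """Numbers a box plot marks: low, middle 50% (q1 to q3), high, mean."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "low": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "high": ordered[-1],
        "mean": statistics.fmean(ordered),
    }


def histogram(values, bin_width):
    """Count values per bin; keys are the lower edge of each bin."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


sample = [1, 3, 5, 7, 9]  # made-up data
summary = box_plot_summary(sample)
bins = histogram(sample, 5)
print(summary)
print(bins)
```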
2-D (Bivariate) Data
 Scatter plot, …
(Figure: scatter plot of price and mileage)
3-D Data (projection)
(Figure: 3-D data projection; one axis is labeled price)
3-D image
(requires 3-D blue and red glasses)
Taken by Mars Rover Spirit, Jan 2004
Visualization Summary
 Many methods
 Visualization is possible in more than 3-D
 Aim for graphical excellence