Visual Computing
Lecture 2
Visualization, Data, and Process
Pipeline 1
High Level Visualization Process
1. Data Modeling
2. Data Selection
3. Data to Visual Mappings
4. Scene Parameter Settings (View Transforms)
5. Rendering
Pipeline 2
Computer Graphics
1. Modeling
2. Viewing
3. Clipping
4. Hidden Surface Removal
5. Projection
6. Rendering
Pipeline 3
Visualization Process
Pipeline 4
Knowledge Discovery
(Data Mining)
A Data Analysis Pipeline
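[Figure: pipeline of data states with four labeled stages, A through D]
Data states: Raw Data, Processed Data, Hypotheses / Models, Results
• A: Cleaning, Filtering, Transforming
• B: Statistical Analysis, Pattern Recognition, Knowledge Discovery
• C: Validation
• D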
Where Does Visualization Come In?
• All stages can benefit from visualization
• A: identify bad data, select subsets, help
choose transforms (exploratory)
• B: help choose computational techniques, set
parameters, use vision to recognize, isolate,
classify patterns (exploratory)
• C: Superimpose derived models on data
(confirmatory)
• D: Present results (presentation)
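Stage A of the pipeline (cleaning, filtering, transforming) can be sketched in a few lines of Python. This is only an illustration under assumed choices: the function names, the 0 to 100 filter range, and min-max normalization as the transform are all hypothetical and do not come from the lecture.

```python
import math

def clean(records):
    """Cleaning: drop missing (None) and non-finite (NaN, inf) values."""
    return [r for r in records if r is not None and math.isfinite(r)]

def filter_range(records, low, high):
    """Filtering: keep only values inside an assumed plausible range."""
    return [r for r in records if low <= r <= high]

def transform(records):
    """Transforming: min-max normalize to [0, 1] (one possible transform)."""
    lo, hi = min(records), max(records)
    span = hi - lo
    return [(r - lo) / span if span else 0.0 for r in records]

# Hypothetical raw data containing a missing value, a NaN, and an outlier.
raw = [12.0, None, 15.5, float("nan"), 900.0, 14.0]
processed = transform(filter_range(clean(raw), 0.0, 100.0))
```

Plotting `raw` before running such a pipeline is where visualization helps in stage A: the outlier at 900.0 is obvious in a plot, which suggests the filter range.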
What do we need to know to do Information
Visualization?
• Characteristics of data
– Types, size, structure
– Semantics, completeness, accuracy
• Characteristics of user
– Perceptual and cognitive abilities
– Knowledge of domain, data, tasks, tools
• Characteristics of graphical mappings
– What are possibilities
– Which convey data effectively and efficiently
• Characteristics of interactions
– Which support the tasks best
– Which are easy to learn, use, remember
Visualization Components
• Human Abilities (imply Design Principles)
– Visual perception
– Cognition
– Motor skills
• Design Principles
– Visual display
– Interaction
• Design Process (informs design)
– Iterative design
– Design studies
– Evaluation
• Frameworks (constrain design)
– Data types
– Tasks
• Techniques
– Graphs & plots
– Maps
– Trees & Networks
– Volumes & Vectors
– …
Issues Regarding Data
• Type may indicate which graphical mappings are appropriate
– Nominal vs. ordinal
– Discrete vs. continuous
– Ordered vs. unordered
– Univariate vs. multivariate
– Scalar vs. vector vs. tensor
– Static vs. dynamic
– Values vs. relations
• Trade-offs between size and accuracy needs
• Different orders/structures can reveal different features/patterns
Types of Data
• Quantitative (allows arithmetic operations)
– 123, 29.56, …
• Categorical (group, identify & organize; no arithmetic)
– Nominal (name only, no ordering)
  • Direction: North, East, South, West
– Ordinal (ordered, not measurable)
  • First, second, third …
  • Hot, warm, cold
– Interval (starts out as quantitative, but is made categorical by subdividing into ordered ranges)
  • Time: Jan, Feb, Mar
  • 0-999, 1000-4999, 5000-9999, 10000-19999, …
– Hierarchical (successive inclusion)
  • Region: Continent > Country > State > City
  • Animal > Mammal > Horse
Adapted from Stone & Zellweger
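Turning a quantitative value into an interval category amounts to looking up which ordered range it falls in. The sketch below uses the four ranges listed on the slide. The final catch-all label "20000+" is an assumption added only so the function is total (the slide's list simply continues with "…"), and values are assumed to be non-negative.

```python
from bisect import bisect_right

# Lower bounds of the second and later ranges, taken from the slide's example.
EDGES = [1000, 5000, 10000, 20000]
# "20000+" is a hypothetical catch-all; the slide does not name further ranges.
LABELS = ["0-999", "1000-4999", "5000-9999", "10000-19999", "20000+"]

def to_interval(value):
    """Map a non-negative quantitative value to its ordered range label."""
    return LABELS[bisect_right(EDGES, value)]

# The labels keep their order, so the result is ordered categorical data.
categories = [to_interval(v) for v in (42, 1000, 7500, 19999)]
```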
Quantitative Data
• Characterized by its dimensionality and the
scales over which the data has been measured
• Data scales comprise:
– Interval scales: real data values, such as degrees Celsius, that do not have a natural zero point.
– Ratio scales: like interval scales, but with a natural zero point; they can be defined in terms of arbitrary units.
– Absolute scales: ratio scales that are defined in terms of non-arbitrary units.
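A small worked example may make the interval/ratio distinction concrete. Differences are meaningful on an interval scale such as Celsius, but ratios are not, because the zero point is arbitrary. Converting to Kelvin (a ratio scale) gives a physically meaningful ratio. The two temperatures are arbitrary illustrative values.

```python
def celsius_to_kelvin(c):
    """Shift to a scale whose zero is a natural zero point (absolute zero)."""
    return c + 273.15

a_c, b_c = 10.0, 20.0  # illustrative temperatures in degrees Celsius

# Differences agree on both scales, so they are meaningful on an interval scale.
diff_c = b_c - a_c
diff_k = celsius_to_kelvin(b_c) - celsius_to_kelvin(a_c)

# The Celsius ratio is 2.0, but 20 °C is not "twice as hot" as 10 °C.
naive_ratio = b_c / a_c
# The Kelvin ratio is about 1.035, which is the physically meaningful one.
true_ratio = celsius_to_kelvin(b_c) / celsius_to_kelvin(a_c)
```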
Data Dimensions
• Scalar – single value
– e.g., speed: it specifies how fast an object is traveling.
• Vector – multiple values
– e.g., velocity: it gives both speed and direction.
• Tensor – multiple values
– Scalars and vectors are special cases of tensors with degree (n) equal to 0 and 1, respectively.
– The number of tensor components is d^n, where d is the dimensionality of the coordinate system.
– In a three-dimensional coordinate system (d = 3), a scalar (n = 0) requires one value, a vector (n = 1) requires three values, and a tensor (n = 2) requires nine values.
– There is a difference between a vector and a collection of scalars.
– A multidimensional vector is a unified entity whose components are physically related.
– The three components of the velocity vector of a particle moving through three-space are coherently linked, while a collection of scalar measurements such as weight, temperature, and index of refraction is not.
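The component count can be checked with a one-line function: a tensor of degree n in a d-dimensional coordinate system has d^n components. With d = 3 that gives 1 value for a scalar, 3 for a vector, and 9 for a degree-2 tensor. The function name is illustrative only.

```python
def tensor_components(d, n):
    """Number of components of a degree-n tensor in d dimensions: d ** n."""
    if d < 1 or n < 0:
        raise ValueError("d must be >= 1 and n must be >= 0")
    return d ** n

# Scalar (n=0), vector (n=1), and degree-2 tensor (n=2) in three dimensions.
counts_3d = [tensor_components(3, n) for n in (0, 1, 2)]
```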
Metadata
• Metadata provides a description of the data and
the things it represents.
– e.g., a data value of 98.6 °F has two metadata attributes: temperature and temperature scale.
– The value 98.6 has little meaning without the metadata attribute of temperature.
– By adding the Fahrenheit attribute, we know the Fahrenheit scale is used.
• Metadata may also include descriptions of
experimental conditions and documentation of
data accuracy and precision.
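One way to keep a value and its metadata together is a small record type. The sketch below is a hypothetical design, not something prescribed by the lecture: the class name, field names, and the `notes` dictionary for experimental conditions are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Measurement:
    value: float
    quantity: str                               # what is measured, e.g. "temperature"
    scale: str                                  # unit or scale, e.g. "Fahrenheit"
    notes: dict = field(default_factory=dict)   # conditions, accuracy, precision

    def describe(self):
        return f"{self.quantity}: {self.value} ({self.scale})"

def to_celsius(m):
    """Convert a Fahrenheit temperature; the metadata makes this check possible."""
    if m.quantity != "temperature" or m.scale != "Fahrenheit":
        raise ValueError("expected a Fahrenheit temperature")
    return Measurement((m.value - 32) * 5 / 9, "temperature", "Celsius", dict(m.notes))

body_temp = Measurement(98.6, "temperature", "Fahrenheit", {"precision": 0.1})
body_temp_c = to_celsius(body_temp)
```

Without the `quantity` and `scale` fields, the bare number 98.6 could not be converted or even interpreted safely.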
Issues Regarding Mappings
• Variables include shape, size, orientation,
color, texture, opacity, position, motion….
• Some of these have an order, others don’t
• Some use up significant screen space
• Sensitivity to occlusion
• Domain customs/expectations
www3.sympatico.ca/blevis/Image10.gif
Importance of Evaluation
• Easy to design bad visualizations
• Many design rules exist – many conflict, many routinely violated
• 5 E’s of evaluation: effective, efficient, engaging, error tolerant, easy to learn
• Many styles of evaluation (qualitative and quantitative):
– Use/case studies
– Usability testing
– User studies
– Longitudinal studies
– Expert evaluation
– Heuristic evaluation
Categories of Mappings
• Based on data characteristics
– Numbers, text, graphs, software, ….
• Logical groupings of techniques (Keim)
– Standard: bars, lines, pie charts, scatterplots
– Geometrically transformed: landscapes, parallel coordinates
– Icon-based: stick figures, faces, profiles
– Dense pixels: recursive segments, pixel bar charts
– Stacked: treemaps, dimensional stacking
• Based on dimension management (Ward)
– Dimension subsetting: scatterplots, pixel-oriented methods
– Dimension reconfiguring: glyphs, parallel coordinates
– Dimension reduction: PCA, MDS, Self Organizing Maps
– Dimension embedding: dimensional stacking, worlds within worlds
Scatterplot Matrix
• Each pair of dimensions generates a single scatterplot
• All combinations arranged in a grid or matrix, each dimension controls a row or column
• Look for clusters, outliers, partial correlations, trends
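The grid layout described above can be sketched without any plotting library: each dimension is assigned one row and one column, and each cell holds the scatterplot for that pair. The dimension names and records below are made up for illustration, and the convention that the column dimension goes on the x axis is an assumption.

```python
from itertools import product

def scatterplot_matrix_cells(dimensions):
    """Map each (row, col) grid cell to the (x_dim, y_dim) pair it displays."""
    indexed = list(enumerate(dimensions))
    cells = {}
    for (row, y_dim), (col, x_dim) in product(indexed, repeat=2):
        cells[(row, col)] = (x_dim, y_dim)
    return cells

def cell_points(records, x_dim, y_dim):
    """Project the records onto the two dimensions shown in one cell."""
    return [(r[x_dim], r[y_dim]) for r in records]

# Hypothetical data set with three dimensions.
dims = ["mpg", "hp", "weight"]
records = [
    {"mpg": 30, "hp": 90, "weight": 2100},
    {"mpg": 18, "hp": 150, "weight": 3400},
]
cells = scatterplot_matrix_cells(dims)
points_row0_col1 = cell_points(records, *cells[(0, 1)])
```

For N dimensions the grid has N × N cells, so the number of plots grows quadratically with the dimension count.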
Parallel Coordinates
• Each variable/dimension is a vertical line
• Bottom of line is low value, top is high
• Each record creates a polyline across all dimensions
• Similar records cluster on the screen
• Look for clusters, outliers, line angles, crossings
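The mapping from a record to its polyline can be sketched as follows: each axis gets an x position, and the record's value on that axis is scaled so the minimum lands at the bottom (y = 0) and the maximum at the top. The function names, the unit-square drawing area, and placing constant dimensions at mid-height are assumptions made for this sketch.

```python
def axis_ranges(records, dims):
    """Per-dimension (min, max), used to scale each vertical axis."""
    return {d: (min(r[d] for r in records), max(r[d] for r in records)) for d in dims}

def polyline(record, dims, ranges, width=1.0, height=1.0):
    """Vertices of one record's polyline; y=0 is the bottom of each axis."""
    step = width / (len(dims) - 1) if len(dims) > 1 else 0.0
    points = []
    for i, d in enumerate(dims):
        lo, hi = ranges[d]
        # A constant dimension has no spread; draw it at mid-height (assumption).
        t = (record[d] - lo) / (hi - lo) if hi > lo else 0.5
        points.append((i * step, t * height))
    return points

# Hypothetical records over three dimensions.
dims = ["a", "b", "c"]
records = [
    {"a": 0, "b": 10, "c": 5},
    {"a": 10, "b": 20, "c": 5},
    {"a": 5, "b": 15, "c": 5},
]
ranges = axis_ranges(records, dims)
lines = [polyline(r, dims, ranges) for r in records]
```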
Star Glyph
• Glyphs are shapes whose attributes are controlled by data values
• Star glyph is a set of N rays spaced at equal angles
• Length of each ray proportional to value for that dimension
• Line connects all endpoints of shape
• Lay glyphs out in rows and columns
• Look for shape similarities and differences, trends
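The star glyph geometry follows directly from the description: N rays at equal angles, each with length proportional to the value for its dimension. In this sketch, dividing each value by a per-dimension maximum is an assumed normalization, and the function and parameter names are illustrative.

```python
import math

def star_glyph(values, maxima, center=(0.0, 0.0), radius=1.0):
    """Endpoints of the N rays; connect them in order and close the shape."""
    n = len(values)
    points = []
    for i, (v, vmax) in enumerate(zip(values, maxima)):
        angle = 2 * math.pi * i / n                    # rays spaced at equal angles
        length = radius * (v / vmax if vmax else 0.0)  # proportional to the value
        points.append((center[0] + length * math.cos(angle),
                       center[1] + length * math.sin(angle)))
    return points

# One hypothetical record with four dimensions, each with maximum 2.0.
glyph = star_glyph([1.0, 2.0, 0.5, 2.0], [2.0, 2.0, 2.0, 2.0])
```

Laying out one such polygon per record in rows and columns gives the grid of glyphs the slide describes.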
Other Types of Glyphs
Dimensional Stacking
• Break each dimension range into bins
• Break the screen into a grid using the number of bins for 2 dimensions
• Repeat the process for 2 more dimensions within the subimages formed by first grid, recurse through all dimensions
• Look for repeated patterns, outliers, trends, gaps
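The recursive grid subdivision can be expressed as index arithmetic. In the sketch below, dimensions are listed outermost first and alternate between horizontal and vertical placement, so dimensions 0 and 1 form the outer grid and dimensions 2 and 3 subdivide each of its cells. That ordering convention, the equal-width binning, and the function names are assumptions.

```python
def bin_index(value, low, high, n_bins):
    """Equal-width binning of a value into 0 .. n_bins - 1."""
    if high <= low:
        return 0
    i = int((value - low) / (high - low) * n_bins)
    return min(max(i, 0), n_bins - 1)

def stacked_cell(bins, counts):
    """(row, col) of a record's cell given its bin index in each dimension.

    Even-numbered dimensions refine the column, odd-numbered ones the row.
    """
    row = col = 0
    for i, (b, n) in enumerate(zip(bins, counts)):
        if not 0 <= b < n:
            raise ValueError("bin index out of range")
        if i % 2 == 0:
            col = col * n + b
        else:
            row = row * n + b
    return row, col

# Four dimensions with 2, 2, 3, and 3 bins give a 6 x 6 grid of cells.
cell = stacked_cell([1, 0, 2, 1], [2, 2, 3, 3])
```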
Pixel-Oriented Techniques
• Each dimension creates an image
• Each value controls color of a pixel
• Many organizations of pixels possible (raster, spiral, circle segment, space-filling curves)
• Reordering data can reveal interesting features, relations between dimensions
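For the simplest pixel organization, the raster layout, the values of one dimension are mapped to colors and filled into the image row by row. The sketch below assumes a linear grayscale color map with 256 levels. The function names are illustrative, and the other layouts named on the slide (spiral, circle segment, space-filling curves) are not covered.

```python
def to_gray(value, low, high):
    """Linear map of a value to a gray level in 0 .. 255 (assumed color map)."""
    if high <= low:
        return 0
    t = min(max((value - low) / (high - low), 0.0), 1.0)
    return int(t * 255)

def raster_image(values, width):
    """One dimension's values as rows of pixels, filled left to right."""
    low, high = min(values), max(values)
    pixels = [to_gray(v, low, high) for v in values]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

# Six hypothetical values of one dimension, drawn three pixels per row.
image = raster_image([0, 5, 10, 10, 5, 0], width=3)
```

Sorting or otherwise reordering the input values before calling `raster_image` changes which records become neighbors in the image, which is how reordering can expose structure.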
Methods to Cope with Scale
• Many modern datasets contain a large number of
records (millions or billions) and/or
dimensions (hundreds or thousands)
• Several strategies to handle scale problems
– Sampling
– Filtering
– Clustering/aggregation
• Techniques can be automated or user-controlled
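The three strategies can each be sketched with the standard library. Note the assumptions: sampling here is uniform random sampling with a fixed seed, filtering uses a caller-supplied predicate, and the aggregation example groups by a known key and averages, which is a simpler stand-in for true clustering. All names and the sample records are hypothetical.

```python
import random
from collections import defaultdict

def sample(records, k, seed=0):
    """Sampling: keep a uniform random subset of at most k records."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))

def filter_records(records, predicate):
    """Filtering: keep only records that satisfy a user-chosen condition."""
    return [r for r in records if predicate(r)]

def aggregate(records, key, value):
    """Aggregation: replace each group of records by the mean of its values."""
    groups = defaultdict(list)
    for r in records:
        groups[key(r)].append(value(r))
    return {k: sum(v) / len(v) for k, v in groups.items()}

records = [
    {"region": "N", "sales": 10},
    {"region": "N", "sales": 20},
    {"region": "S", "sales": 5},
]
subset = sample(records, 2)
large = filter_records(records, lambda r: r["sales"] >= 10)
means = aggregate(records, key=lambda r: r["region"], value=lambda r: r["sales"])
```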
Examples of Data Clustering
Example of Dimension Clustering
Example of Data Sampling
The Visual Data Analysis (VDA) Process
• Overview
• Filter/cluster/sample
• Scan
• Select “interesting”
• Details on demand
• Link between different views
Issues Regarding Users
• What graphical attributes do we perceive
accurately?
• What graphical attributes do we perceive
quickly?
• Which combinations of attributes are
separable?
• Coping with change blindness
• How can visuals support the development of
accurate mental models of the data?
• Relative vs. absolute judgements – impact on
tasks
Role of Perception
MC Escher
Consider the Following
Role of Perception
• Users interact with visualizations based on
what they see. (e.g. black spots at
intersection of white lines)
• Must understand how humans perceive
images.
• Primitive image attributes: shape, color,
texture, motion, etc.
Visualization Example
Op Art - Victor Vasarely
OpGlyph (Marchese)
Gestalt Psychology: Rules of Visual Perception
• Proximity
• Similarity
• Continuity
• Closure
• Symmetry
• Foreground & Background
• Size
Principles of Art & Design
• Emphasis / Focal Point
• Balance
• Unity
• Contrast
• Symmetry / Asymmetry
• Movement / Rhythm
• Pattern / Repetition
Issues Regarding Interactions
• Interaction is a critical component
• Many categories of techniques
– Navigation, selection, filtering, reconfiguring,
encoding, connecting, and combinations of
above
• Many “spaces” in which interactions can
be applied
– Screen/pixels, data, data structures,
graphical objects, graphical attributes,
visualization structures