Software and Data Visualization
Its place in Software Engineering
Jonathan Reese
Visual representation of data
Broad topic!
Software creates new possibilities
◦ Powerful and automatic generation
◦ Massive amounts of data
Make data
◦ Understandable
◦ Manageable
◦ Exciting
We generate and store data at a much faster
rate than we can understand it
◦ Corporation’s records
◦ Police reports
◦ Statistics and Surveys
Tap into the wealth of data
Automated process of interpreting data to
extract more specific information.
Data mining is not a part of data visualization
Both are independent, but can be used
together
In the XKCD color survey, 225,000 users gave names to
5,000,000 randomly generated colors.
◦ Big data
An algorithm was created to extract commonly
understood color names and their values
from the data
◦ Data mining
A 3D rotating model was created to show the
spectrum of color names.
◦ Data visualization
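The data mining step above can be sketched roughly as follows. This is only a minimal illustration, not the algorithm used for the survey: the function name `common_color_names`, the `min_count` threshold, and the sample responses are all hypothetical. It groups responses by normalized name, keeps names that several users agreed on, and averages their RGB values.

```python
from collections import defaultdict

def common_color_names(responses, min_count=2):
    """Return {name: average RGB} for names used at least min_count times."""
    groups = defaultdict(list)
    for name, rgb in responses:
        # Normalize case and whitespace so "Red" and "red " count as one name.
        groups[name.strip().lower()].append(rgb)

    result = {}
    for name, samples in groups.items():
        if len(samples) >= min_count:
            n = len(samples)
            # Average each of the three channels separately.
            result[name] = tuple(round(sum(s[i] for s in samples) / n) for i in range(3))
    return result

# Hypothetical survey responses: (name typed by a user, RGB color shown).
responses = [
    ("Red", (250, 10, 10)),
    ("red", (230, 30, 20)),
    ("teal", (0, 128, 128)),
    ("Teal ", (10, 140, 130)),
    ("blurple", (80, 60, 200)),
]
names = common_color_names(responses)
print(names)
```

Names that only one user typed (here "blurple") are dropped, which is one simple way to keep only commonly understood names.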
Make it easier to find what you are looking
for in data
◦ Exclude unwanted information
◦ Represent numbers visually
Colors
Distance
Location
Size
Visualization can assist more than just end
users
Valve monitors and records statistics from
their players.
They use data visualization to make sense of
their own recordings
Heat maps
◦ Valve monitors the frequency and location where
action takes place in their FPS games
◦ Helps them make map design choices, and monitor
effects of changes
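A heat map of this kind can be built by counting events per grid cell. The sketch below is an assumption about the general technique, not Valve's implementation: the event coordinates, the `cell_size`, and the text rendering are made up for illustration.

```python
from collections import Counter

def bin_events(events, cell_size):
    """Count events per grid cell; events are (x, y) map coordinates."""
    counts = Counter()
    for x, y in events:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

def render(counts, width, height):
    """Draw the grid as text, using denser characters for busier cells."""
    shades = " .:*#"
    peak = max(counts.values(), default=1)
    rows = []
    for row in range(height):
        line = ""
        for col in range(width):
            # Scale the cell count to an index into the shade characters.
            level = counts.get((col, row), 0) * (len(shades) - 1) // peak
            line += shades[level]
        rows.append(line)
    return "\n".join(rows)

# Hypothetical event positions on a 30 x 30 map.
events = [(5, 5), (6, 4), (7, 7), (25, 5), (26, 6), (5, 25)]
counts = bin_events(events, cell_size=10)
print(render(counts, width=3, height=3))
```

A real heat map would draw colored, semi-transparent cells over the game map, but the counting step is the same.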
A prospective customer of Verizon does not
want to know the locations of all the Verizon
towers.
The customer wants to see where there is
coverage.
A customer can quickly decide if there will be
coverage issues by looking at the map
Statistical Analysis Software
◦ Provides data visualization software
◦ Customers purchase their software to essentially
look at their data differently
◦ “The Power to Know”
Magnaview
◦ “Visualize anything, visualize everything”
Companies providing software that simply
makes data viewable
Data visualization can improve or harm an
interface
◦ It can overcomplicate an interface
◦ Cause confusion
◦ Prevent a viewer from finding specific data
It is important to know when it is appropriate
Criteria
◦ Raw data is overwhelming
◦ There is interest in understanding the data
◦ The specific numbers are less important than what
the numbers represent
◦ We can safely assume what a viewer will be looking
for in the data
There are many methods for visualization
filling different niches
Choose a method that
◦ Includes only relevant information
◦ Has minimal distractions
◦ Represents data in an intuitive way
Pros
◦ Interprets big data about media to find similar
content
◦ Intuitive
Connections represent relation
Distance represents how related the items are
Cons
◦ Distractions
Color and size are meaningless/unexplained
Connections are redundant
Compare to a list of related artists
Scenario
◦ A convenience store owner has a database with
purchase records of customers
◦ The owner wants to reorganize to maximize sales
and customer satisfaction
Correlation matrices would be helpful
◦ A matrix for how often item categories are
purchased together
◦ A matrix for how often individual items are
purchased together
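As a rough sketch of the second matrix, the code below counts how often pairs of items appear in the same purchase. It computes a simple co-purchase rate (the share of purchases containing both items) rather than a statistical correlation coefficient, and the basket data and the function name `co_purchase_matrix` are hypothetical.

```python
from itertools import combinations

def co_purchase_matrix(baskets):
    """Return (items, matrix) where matrix[i][j] is the share of
    baskets that contain both items[i] and items[j]."""
    items = sorted({item for basket in baskets for item in basket})
    index = {item: i for i, item in enumerate(items)}
    matrix = [[0.0] * len(items) for _ in items]

    for basket in baskets:
        # Count each unordered pair once per basket, in both directions.
        for a, b in combinations(sorted(set(basket)), 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1

    total = len(baskets) or 1  # avoid dividing by zero on empty input
    matrix = [[count / total for count in row] for row in matrix]
    return items, matrix

# Hypothetical purchase records from the store's database.
baskets = [
    ["chips", "soda"],
    ["chips", "soda", "candy"],
    ["milk", "bread"],
    ["soda", "candy"],
]
items, matrix = co_purchase_matrix(baskets)
for item, row in zip(items, matrix):
    print(f"{item:>6}", " ".join(f"{value:.2f}" for value in row))
```

The same function works for the category matrix if each item in a basket is first replaced by its category.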
Correlations between categories would help
decide
◦ What categories to put in the same aisle
◦ What aisles to put next to each other
Correlations between individual items in
categories
◦ Organize sections appropriately
Using color to represent numbers works in
that example for a couple of reasons
◦ A large matrix of percentages would not make the
interesting data pop out.
◦ The exact percentages are not as important as how
the percentages compare to each other
When using color, SAS says it is ideal for color
shade to represent a value instead of color hue
If the color is transparent, then changes in hue
will be more easily visible (as in Valve’s heat maps)
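One way to let shade carry the value is to blend from white to a single hue. The sketch below is an assumption for illustration only: the function name `value_to_shade`, the linear blend, and the default blue hue are not taken from SAS.

```python
def value_to_shade(value, low, high, hue_rgb=(0, 0, 255)):
    """Map a value in [low, high] to a shade of one hue.

    low gives white, high gives the full hue; values outside
    the range are clamped.
    """
    if high == low:
        t = 0.0
    else:
        t = (value - low) / (high - low)
    t = min(1.0, max(0.0, t))
    # Linear blend per channel: white (255) at t=0, hue at t=1.
    return tuple(round(255 + (c - 255) * t) for c in hue_rgb)

# Percentages from a matrix cell mapped to shades of blue.
for pct in (0, 25, 100):
    print(pct, value_to_shade(pct, 0, 100))
```

Because only the shade changes, a viewer can compare cells by darkness without needing a legend for several hues.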
Represents coordinate data with a point on
the map
Useful if location is relevant
With Google Maps and other similar resources
it became very easy to add distractions
◦ Satellite view
◦ Points of Interest
◦ Roads
These should be included only if they are
relevant
Violent crime in Milwaukee by neighborhood
Roads are relevant
Satellite view would add clutter and warp colors.
Similar to a pie chart; effective at
demonstrating the portion taken up by items
Rectangular containers and items
Hierarchical
◦ Containers can hold containers or items
Each item’s size is based on the portion of
its category that it takes up
Thus each category’s size is based on the total
taken up by what it holds
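The sizing rules above can be sketched with a "slice-and-dice" layout, one of the simplest treemap algorithms: each container splits its rectangle among its children in proportion to their sizes, alternating the split direction at each level. This is a minimal sketch under assumed data; the tuple-based tree format and the example tree are hypothetical.

```python
def size_of(node):
    """A leaf is (name, number); a container is (name, [children])."""
    _, content = node
    if isinstance(content, list):
        return sum(size_of(child) for child in content)
    return content

def layout(node, x, y, w, h, horizontal=True, out=None):
    """Return a list of (name, x, y, width, height) for every leaf."""
    if out is None:
        out = []
    name, content = node
    if not isinstance(content, list):
        out.append((name, x, y, w, h))
        return out

    total = size_of(node)
    offset = 0.0  # fraction of the rectangle already handed out
    for child in content:
        share = size_of(child) / total if total else 0.0
        if horizontal:
            # Split left to right; children then split top to bottom.
            layout(child, x + offset * w, y, share * w, h, False, out)
        else:
            layout(child, x, y + offset * h, w, share * h, True, out)
        offset += share
    return out

# Hypothetical hierarchy: a container holding a container and an item.
tree = ("disk", [("photos", [("a.jpg", 30), ("b.jpg", 10)]), ("notes.txt", 60)])
rects = layout(tree, 0, 0, 100, 100)
for rect in rects:
    print(rect)
```

Each leaf's area ends up proportional to its size, so the area of a container is the sum of what it holds.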
WinDirStat is an open-source disk usage
analysis application for Windows
Draws disk usage as a treemap (unrelated to the “TreeMap”
class in Java’s libraries, which is a sorted map)
Turns the big data of a storage device’s file
system into a TreeMap
◦ Folders – Categories
◦ Files – Items
◦ Portion of used disk space taken – Size
◦ Type of file – Color
◦ Glare shows folders without using border space!
Customizable and Interactive TreeMap
visualization of news headlines
◦ Color – Type of news
◦ Category – Location
◦ Portion – How big the story is (Attention it’s getting)
◦ Color Shade – How “breaking” the news is
Interactive
◦ clicking on a link sends you to the news article
Customizable
◦ If there is information that doesn’t interest you,
remove it
Currently, the internet presents a highly disorganized
collage of information. Many of us are working in an
information-soaked world. There is too much of everything.
We are subject everywhere to a sensory overload of images,
bombarded with information; in magazines and
advertisements, on TV, radio, in the cityscape. The internet
is a wonderful communication tool, but day after day we find
ourselves constantly dealing with information overload.
Today, the internet presents a new challenge, the wide and
unregulated distribution of information requires new visual
paradigms to organize, simplify and analyze large amounts
of data. New user interface challenges are arising to deal
with all that overwhelming quantity of information.
- Marcos Weskamp
Big data
◦ A wealth of useful information, but is overwhelming
◦ Data visualization helps to make big data
manageable
Data visualization represents the data in a
meaningful way
Different types are useful based on the
situation
Visualizations can often be improved by
adding Interactivity and Customizability
Data visualization helps make useful
discoveries in data
Can make an interface stand out
There is room for creativity! We only covered
a few templates
Next time you have big data on your hands,
think of it as an opportunity instead of a
problem
[1] Statistical Analysis Solutions (2012). Data Visualization Techniques. Retrieved from
http://www.sas.com/reg/wp/corp/51989
[2] Stephen Few (2009). Introduction to Geographical Data Visualization. Retrieved from
http://www.perceptualedge.com/articles/visual_business_intelligence/geographical_data_visualization.pdf
[3] Crime safety map (2013). [Geographical data visualization of Milwaukee March 19, 2013] Location Inc.
Retrieved from http://www.neighborhoodscout.com/wi/milwaukee/crime/
[4] Newsmap news feed (2013). [Treemap data visualization of Google News March 19, 2013]
Marcos Weskamp. Retrieved from http://newsmap.jp
[5] Marcos Weskamp. Newsmap [pg 6]. Message posted to http://marumushi.com/projects/newsmap
[6] WinDirStat Developers (Open Source) (2007). WinDirStat (Version 1.1.2) [Software]. Available from
http://windirstat.info/index.html
[7] Verizon coverage locator (2013). [Verizon mobile phone coverage locator March 19, 2013] Verizon. Retrieved
from http://www.verizonwireless.com/b2c/support/coverage-locator
[8] MagnaView (2013) Website. http://www.magnaview.com/
[9] Howard Yeend (May 4, 2010). XKCD Colour Survey – a 3D visualization. [Web log post]. Retrieved from
http://www.puremango.co.uk/2010/05/xkcd-color-survey-3d-visualization/
[10] Duncan Graham-Rowe (2007). Mapping the Internet. MIT Technology Review. Retrieved from
http://www.technologyreview.com/news/408104/mapping-the-internet/
[11] Valve (2007). CP_dustbowl heat map. Retrieved from http://www.tfportal.de/?site=news_details&id=447
[12] Time (2013). Population Density US Map. Retrieved from
http://www.time.com/time/interactive/0,31813,1549966,00.html
[13] Wordle (2013). Used to generate UWP Facebook Word Map
[14] Itoh, T., Yamaguchi, Y., Ikehata, Y., Kajinaga, Y. (2004). Hierarchical data visualization using a fast
rectangle-packing algorithm. IEEE Transactions on Visualization and Computer Graphics, vol. 10, no. 3,
pp. 302-313, May 2004. Retrieved from
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=1272729&contentType=Journals+%26+Magazines&searchField%3DSearch_All%26queryText%3D.QT.Data+visualization.QT.
[15] Yau, N. (2012). Visualize this, the flowingdata guide to design, visualization, and statistics. Wiley.