Software and Data Visualization


Its place in Software Engineering
Jonathan Reese



Visual representation of data
Broad topic!
Software creates new possibilities
◦ Powerful and automatic generation
◦ Massive amounts of data

Make data
◦ Understandable
◦ Manageable
◦ Exciting

We generate and store data at a much faster
rate than we can understand it
◦ Corporation’s records
◦ Police reports
◦ Statistics and Surveys

Tap into the wealth of data



Data mining: an automated process of interpreting
data to extract more specific information.
Data mining is not a part of data visualization
Both are independent, but can be used
together

In the XKCD color survey, 225,000 users gave
names to 5,000,000 randomly generated colors.
◦ Big data

An algorithm was then created to extract commonly
understood color names and their values
from the data
◦ Data mining

A 3D rotating model was created to show the
spectrum of color names.
◦ Data visualization
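
The data mining step above can be sketched in a few lines. This is only an illustration, not the algorithm used on the survey data: it assumes responses arrive as (name, (r, g, b)) pairs, and the function name, the `min_votes` threshold, and the sample responses are all made up for the example.

```python
from collections import defaultdict

def common_color_names(responses, min_votes=2):
    """Average the RGB values of every name used at least min_votes times."""
    sums = defaultdict(lambda: [0, 0, 0, 0])  # r, g, b totals and vote count
    for name, (r, g, b) in responses:
        entry = sums[name.strip().lower()]  # treat "Red" and "red " as one name
        entry[0] += r
        entry[1] += g
        entry[2] += b
        entry[3] += 1
    return {
        name: (r // n, g // n, b // n)
        for name, (r, g, b, n) in sums.items()
        if n >= min_votes
    }

# Hypothetical responses, not real survey data
responses = [
    ("Red", (250, 10, 10)),
    ("red", (230, 30, 20)),
    ("teal", (0, 128, 128)),
    ("Teal ", (10, 140, 120)),
    ("blurple", (90, 60, 200)),  # named only once, so it is filtered out
]
print(common_color_names(responses))
```

The result is a small table of agreed-upon names and values, which is the kind of input a visualization such as the 3D model can then draw.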

Make it easier to find what you are looking
for in data
◦ Exclude unwanted information
◦ Represent numbers visually





Colors
Distance
Location
Size
Visualization can assist more than just end
users



Valve monitors and records statistics from
their players.
They use data visualization to make sense of
their own recordings
Heat maps
◦ Valve monitors the frequency and location where
action takes place in their FPS games
◦ Helps them make map design choices, and monitor
effects of changes
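
A heat map of this kind boils down to counting events per map cell. The sketch below assumes events are plain (x, y) map coordinates; the grid size, the function names, and the sample events are hypothetical and are not taken from Valve's tooling.

```python
from collections import Counter

def heat_grid(events, cell_size, width, height):
    """Count events per grid cell; events are (x, y) map coordinates."""
    cols, rows = width // cell_size, height // cell_size
    counts = Counter()
    for x, y in events:
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= col < cols and 0 <= row < rows:  # ignore events off the map
            counts[(row, col)] += 1
    return [[counts[(r, c)] for c in range(cols)] for r in range(rows)]

def render(grid, shades=" .:*#"):
    """Draw the grid as text, with denser characters for busier cells."""
    peak = max(max(row) for row in grid) or 1
    return "\n".join(
        "".join(shades[count * (len(shades) - 1) // peak] for count in row)
        for row in grid
    )

# Made-up event positions on a 40 x 40 map
events = [(5, 5), (7, 3), (6, 8), (25, 5), (35, 35)]
grid = heat_grid(events, cell_size=10, width=40, height=40)
print(render(grid))
```

Comparing the grid before and after a map change is one simple way to monitor the effect of that change.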



A prospective customer of Verizon does not
want to know the locations of all the Verizon
towers.
The customer wants to see where there is
coverage.
A customer can quickly decide if there will be
coverage issues by looking at the map

SAS (Statistical Analysis System)
◦ Provides data visualization software
◦ Customers purchase their software to essentially
look at their data differently
◦ “The Power to Know”

MagnaView
◦ “Visualize anything, visualize everything”

Companies providing software that simply
makes data viewable

Data visualization can improve or harm an
interface
◦ It can overcomplicate an interface
◦ Cause confusion
◦ Prevent a viewer from finding specific data

It is important to know when it is appropriate

Criteria
◦ Raw data is overwhelming
◦ There is interest in understanding the data
◦ The specific numbers are less important than what
the numbers represent
◦ It is safe to assume what a viewer will be looking
for in the data


There are many methods for visualization
filling different niches
Choose a method that
◦ Includes only relevant information
◦ Has minimal distractions
◦ Represents data in an intuitive way

Pros
◦ Interprets big media data to find similar
content
◦ Intuitive
▪ Connections represent relation
▪ Distance represents how related the items are

Cons
◦ Distractions
▪ Color and size are meaningless/unexplained
▪ Connections are redundant

Compare to a list of related artists

Scenario
◦ A convenience store owner has a database with
purchase records of customers
◦ The owner wants to reorganize to maximize sales
and customer satisfaction

Correlation matrices would be helpful
◦ A matrix for how often item categories are
purchased together
◦ A matrix for how often individual items are
purchased together

Correlations between categories would help
decide
◦ What categories to put in the same aisle
◦ What aisles to put next to each other

Correlations between individual items in
categories
◦ Organize sections appropriately
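
One way to build such a matrix from purchase records is sketched below. It computes a simple co-purchase rate (the fraction of baskets containing both entries) rather than a statistical correlation coefficient, and it assumes each purchase record is just a list of category names; the function name and the baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations

def co_purchase_rates(baskets):
    """Fraction of baskets in which each pair of categories appears together."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) makes each pair unique and order-independent
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[(a, b)] += 1
    return {pair: count / len(baskets) for pair, count in pair_counts.items()}

# Made-up purchase records
baskets = [
    ["chips", "soda"],
    ["chips", "soda", "candy"],
    ["milk", "bread"],
    ["soda", "candy"],
]
rates = co_purchase_rates(baskets)
for pair, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(pair, rate)
```

The same function works at either level: pass category names for the aisle-level matrix, or item names for the matrix within a category.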

Using color to represent numbers works in
that example for a couple of reasons
◦ A large matrix of percentages would not make the
interesting data pop out.
◦ The exact percentages are not as important as how
the percentages compare to each other


When using color, SAS says it is ideal for the
shade of a color, rather than its hue, to represent a value
If the color is transparent, changes in hue
will be more easily visible (Valve’s heat maps)
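
Representing a value with the shade of a single color, rather than with different hues, can be sketched as a blend from white toward one base color. The base color, the value range, and the function names here are assumptions chosen for the example.

```python
def shade(value, low, high, base=(0, 90, 200)):
    """Blend from white (at low) toward the base color (at high)."""
    if high == low:
        t = 0.0
    else:
        # Clamp so out-of-range values still map to a valid color
        t = min(1.0, max(0.0, (value - low) / (high - low)))
    return tuple(round(255 + (channel - 255) * t) for channel in base)

def to_hex(rgb):
    """Format an (r, g, b) tuple as a CSS-style hex color."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# Percentages from a hypothetical matrix cell, 0 to 100
for percent in (0, 25, 50, 75, 100):
    print(percent, to_hex(shade(percent, 0, 100)))
```

Because only the shade changes, a viewer can compare cells without a legend: darker always means more.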



Represents coordinate data with a point on
the map
Useful if location is relevant
With Google Maps and other similar resources
it became very easy to add distractions
◦ Satellite view
◦ Points of Interest
◦ Roads

These should be included only if they are
relevant



Violent Crime in
Milwaukee by
neighborhood
Roads are relevant
Satellite view would
add clutter and
warp colors.



A treemap is similar to a pie chart: effective at
demonstrating the portion taken up by items
Rectangular containers and items
Hierarchical
◦ Containers can hold containers or items


Each item's size is based on the portion of its
category that it takes up
Thus each category's size is based on the total
portion taken up by what it holds
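
The sizing rule above can be sketched with the simple "slice-and-dice" treemap layout, which splits a rectangle among children in proportion to their size and alternates the cut direction at each level. This is one basic layout method, not necessarily the one used by the tools discussed here; the node format (dicts with `name`, `size`, `children`) and the sample tree are assumptions.

```python
def size(node):
    """An item has its own size; a container's size is the sum of its contents."""
    if "children" in node:
        return sum(size(child) for child in node["children"])
    return node["size"]

def layout(node, x, y, w, h, depth=0, out=None):
    """Return (name, x, y, width, height) rectangles for node and its descendants."""
    if out is None:
        out = []
    out.append((node["name"], round(x, 2), round(y, 2), round(w, 2), round(h, 2)))
    children = node.get("children", [])
    total = sum(size(child) for child in children)
    offset = 0.0
    for child in children:
        share = size(child) / total if total else 0.0
        if depth % 2 == 0:   # even levels: cut left to right
            layout(child, x + offset * w, y, w * share, h, depth + 1, out)
        else:                # odd levels: cut top to bottom
            layout(child, x, y + offset * h, w, h * share, depth + 1, out)
        offset += share
    return out

# Made-up hierarchy: one container with two items, plus one loose item
tree = {"name": "root", "children": [
    {"name": "A", "children": [
        {"name": "a1", "size": 30},
        {"name": "a2", "size": 10},
    ]},
    {"name": "B", "size": 60},
]}
for rect in layout(tree, 0, 0, 100, 100):
    print(rect)
```

Each rectangle's area is proportional to its size, and a container's rectangle always encloses the rectangles of what it holds.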



WinDirStat is an open-source disk usage
viewer for Windows
Written in C++; its treemap is unrelated to Java’s “TreeMap” class, which is a sorted map
Turns the big data of a storage device’s file
system into a treemap
◦ Folders – Categories
◦ Files – Items
◦ Portion of used disk space taken – Size
◦ Type of file – Color
◦ Glare shows folders without using border space!
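
The mapping from a file system to treemap data can be sketched as a recursive directory scan. This is an illustration of the idea only, not WinDirStat's code; the function name and the dict layout are assumptions.

```python
import os

def scan(path):
    """Build a nested dict: folders become containers, files become sized items."""
    node = {"name": os.path.basename(path) or path, "children": []}
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            node["children"].append(scan(entry.path))
        elif entry.is_file(follow_symlinks=False):
            node["children"].append({
                "name": entry.name,
                "size": entry.stat(follow_symlinks=False).st_size,  # bytes
                "type": os.path.splitext(entry.name)[1].lower(),    # drives the color
            })
    # A folder's size is the total size of everything it holds
    node["size"] = sum(child["size"] for child in node["children"])
    return node
```

Calling `scan(".")` returns the tree for the current directory. Each file carries a size and a type, and each folder carries the sum of its contents, which is the data a treemap needs for rectangle area and color.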

Customizable and Interactive TreeMap
visualization of news headlines
◦ Color – Type of news
◦ Category – Location
◦ Portion – How big the story is (Attention it’s getting)
◦ Color Shade – How “breaking” the news is

Interactive
◦ clicking on a link sends you to the news article

Customizable
◦ If there is information that doesn’t interest you,
remove it
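
Customization of this kind amounts to filtering before sizing. A minimal sketch, assuming each story is a dict with a title, a news type, and an article count standing in for "attention"; these field names and the sample stories are invented and do not reflect Newsmap's actual data format.

```python
def visible_shares(stories, hidden_types=()):
    """Drop hidden news types, then give each story its share of the remaining attention."""
    kept = [story for story in stories if story["type"] not in hidden_types]
    total = sum(story["articles"] for story in kept)
    if not total:
        return {}
    return {story["title"]: story["articles"] / total for story in kept}

# Made-up stories
stories = [
    {"title": "Cup final", "type": "sports", "articles": 30},
    {"title": "Summit talks", "type": "world", "articles": 50},
    {"title": "Market rally", "type": "business", "articles": 20},
]
print(visible_shares(stories))
print(visible_shares(stories, hidden_types={"sports"}))
```

Removing a type frees its space, so the remaining stories grow to fill the whole view.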
Currently, the internet presents a highly disorganized
collage of information. Many of us are working in an
information-soaked world. There is too much of everything.
We are subject everywhere to a sensory overload of images,
bombarded with information; in magazines and
advertisements, on TV, radio, in the cityscape. The internet
is a wonderful communication tool, but day after day we find
ourselves constantly dealing with information overload.
Today, the internet presents a new challenge, the wide and
unregulated distribution of information requires new visual
paradigms to organize, simplify and analyze large amounts
of data. New user interface challenges are arising to deal
with all that overwhelming quantity of information.
- Marcos Weskamp

Big data
◦ A wealth of useful information, but is overwhelming
◦ Data visualization helps to make big data
manageable



Data visualization represents the data in a
meaningful way
Different types are useful based on the
situation
Visualizations can often be improved by
adding Interactivity and Customizability




Data visualization helps make useful
discoveries in data
Can make an interface stand out
There is room for creativity! We only covered
a few templates
Next time you have big data on your hands,
think of it as an opportunity instead of a
problem

[1] Statistical Analysis Solutions (2012). Data Visualization Techniques. Retrieved from
http://www.sas.com/reg/wp/corp/51989
[2] Stephen Few (2009). Introduction to Geographical Data Visualization. Retrieved from
http://www.perceptualedge.com/articles/visual_business_intelligence/geographical_data_visualization.pdf
[3] Crime safety map (2013). [Geographical data visualization of Milwaukee March 19, 2013] Location Inc.
Retrieved from http://www.neighborhoodscout.com/wi/milwaukee/crime/
[4] Newsmap news feed (2013). [Treemap data visualization of Google News March 19, 2013]
Marcos Weskamp. Retrieved from http://newsmap.jp
[5] Marcos Weskamp. Newsmap [pg 6]. Message posted to http://marumushi.com/projects/newsmap
[6] WinDirStat Developers (Open Source) (2007). WinDirStat (Version 1.1.2) [Software]. Available from
http://windirstat.info/index.html
[7] Verizon coverage locator (2013). [Verizon mobile phone coverage locator March 19, 2013] Verizon. Retrieved
from http://www.verizonwireless.com/b2c/support/coverage-locator
[8] MagnaView (2013) Website. http://www.magnaview.com/
[9] Howard Yeend (May 4, 2010). XKCD Colour Survey – a 3D visualization. [Web log post]. Retrieved from
http://www.puremango.co.uk/2010/05/xkcd-color-survey-3d-visualization/
[10] Duncan Graham-Rowe (2007). Mapping the Internet. MIT Technology Review. Retrieved from
http://www.technologyreview.com/news/408104/mapping-the-internet/
[11] Valve (2007). CP_dustbowl heat map. Retrieved from http://www.tfportal.de/?site=news_details&id=447
[12] Time (2013). Population Density US Map. Retrieved from
http://www.time.com/time/interactive/0,31813,1549966,00.html
[13] Wordle (2013). Used to generate UWP Facebook Word Map
[14] Itoh, T.; Yamaguchi, Y.; Ikehata, Y.; Kajinaga, Y. (2004). “Hierarchical data visualization using a fast
rectangle-packing algorithm,” Visualization and Computer Graphics, IEEE Transactions on, vol. 10, no. 3,
pp. 302-313, May 2004. Retrieved from
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=1272729&contentType=Journals+%26+Magazines&searchField%3DSearch_All%26queryText%3D.QT.Data+visualization.QT.
[15] Yau, N. (2012). Visualize this, the flowingdata guide to design, visualization, and statistics. Wiley.