Spreadsheet Visualisation to Improve End

Download Report

Transcript Spreadsheet Visualisation to Improve End

Spreadsheet Visualisation to
Improve End-user Understanding
Daniel Ballinger, Robert Biddle and James Noble
School of Mathematical and Computing Sciences
Email: {db, robert, kjx}@mcs.vuw.ac.nz
http://www.mcs.vuw.ac.nz/~db/honours.html
Motivation
 Spreadsheets are a common form of end-user
programming.
 Unfamiliar spreadsheets can contain daunting
amounts of information in the layout and intercell dependencies.
 Visualisation can be used to aid in end-user
understanding of spreadsheets.
 Working outside the spreadsheet application
allows for greater flexibility in visualisation.
 We focused on Microsoft Excel due to its large
market share.
Excel’s Current Visualisation Support –
Range Finder
Invoked by clicking in
a cell and then in the
formula bar.
Components are
coloured in the bar and
sheet.
Allows for visual
manipulation.
Mainly only useful for
spatially close cells.
Range Finder
Excel’s Current Visualisation Support –
Formula Auditing Tools
Auditing Tools
Invoked using Formula
Auditing Toolbar.
Trace dependents or
precedents.
Arrows always point to
referenced cell.
Users may navigate
spatially disjoint cells.
(semantic navigation)
Complicated
spreadsheets can create a
tangle of arrows.
Related Work
 Takeo Igarashi
– Spreadsheets augment “a visible tabular layout with invisible formulas”.
– Created visualisations to help reveal the hidden dataflow graphs and
superficial tabular layouts of spreadsheets.
 Markus Clermont
– Most end-users are not trained programmers.
– Many spreadsheets exist beyond being simple scratch pads.
 Raymond Panko
– Studies of empirical data into spreadsheet errors.
– Found error rates can be disturbingly high.
– Errors attributed to over confidence and lack of formal checking.
 Margaret Burnett
– The importance of scalability in visualisations.
– Forms/3 and an embedded testing methodology.
Spreadsheet Application Toolkit
Query
Find and store spreadsheets
from the Internet.
Extract low level structures.
E.g. Cell values and
formulas.
Analyse spreadsheet
structures. Either individual
or corpus.
Conveying the findings
through visualisation.
Gobbler
Google
Fetcher
Web Servers
Extractor
BIFF Reader
URLs
XLS files
Toolkit Files
Algorithms
Analyser
Processed
Metrics
Data
Visualisation Tools
Image
Visualisations








Spreadsheet layout
Clustering
Data Dependency Flow
Data Dependency Direction
Graph Structure
Fisheye view
Formula Inspection
Corpus Analysis
Spreadsheet layout – Real-estate Utilisation 2D
Understanding layout is an
important first step in learning about
a new spreadsheet.
Actual values and formulas are only
shown as occupied cells.
The visualisation layout mimics that
of Excel, with columns along the top
of the x-axis and rows running down
the y-axis.
Cells with a higher occupancy level
are coloured towards the red end of
the colour spectrum.
Spreadsheet layout – Real-estate Utilisation 3D
Occupancy data is
projected into 3D to
create a surface map.
Discrete to continuous
data transformation helps
smooth the effects of
spikes.
Coloured to give a
Topographical terrain
effect.
Full benefit is seen with
user interaction.
BIRCH Clustering
BIRCH clustering partitions records into clusters that
are similar according to two or more attributes.
Current visualisations use the Euclidean distance
between cells as the similarity metric.
Data Dependency Flow
Visualising just average
unit vectors for each cell
can reduce the visual
clutter.
3D can be used to
separate vectors that occur
at different sheet levels.
Note the curvature back
towards the origin for this
workbook.
Data Dependency Direction
Concentrate purely on
the directions of data
flow relative to cells.
Angles are sorted into
36 buckets then feed to
Excel to create the graph.
After the four main axis
the next significant
measure occurs between
300 and 360º.
Radar Graph for
Outgoing Dependencies
Graph Structure
Disregarding spatial bounds allows some structures to
become clearer.
Spring view
Source Data
Fisheye view - Focus + Context
Addresses formula
dependencies that span large
distances or are many cells
deep.
Trees are warped over a
hyperbolic lens to achieve
both focus in the centre and
context.
An artificial red root node is
introduced to connect disjoint
trees.
Formula Inspection – Data Flow
Visualising formula components and flow direction.
Fully trace worksheets in
one view.
Basic Referencing Components
Formula Inspection - Dependency Types
Row Absolute
Excel allows for
Fully Absolute
combinations of relative
and absolute positioning. Column Absolute
Relative
Understanding the
referencing type is
important when
replicating formula and
identifying regular
patterns.
Corpus Analysis of 259 Workbooks
Spatial Centre
Demonstrations of
visualisations created from a
corpus.
With this sample corpus the
average worksheet centre is
more column centric.
Function utilisation suggests
that the logical functions,
such as IF, actually
outnumber simpler math
functions like SUM.
Number of non-empty Worksheets: 227
Number of empty Worksheets: 195
Average Row: 1.348
Average Column 18.098
Max Row: 1384
Max Col: 82
Total Occupied cells: 55491
Orphans : 51570
Root Cells : 2105
Leafs: 1031
Nodes in Cyclic References: 29
Local Formula: 108
Family Trees: 509
Max Tree Depth: 22
Max Tree Breadth: 150
Function Utilisation
Summary
 Spreadsheets are significant examples of end-user
programming
 Visualisation can assist end-users in better
understanding the structure of spreadsheets
– In particular, the “hidden structures” created by formula
 Reviewed literature to investigate the implications
of the hidden structures.
 Developed a toolkit to externally access the
spreadsheet structure and generate visualisations.
 Created several sample visualisations to help
improve end-user understanding.
Current and Future Work
 Detailed user studies, including usability
evaluations
 Domain specific visualisations
 Spreadsheet corpus analysis to find large
patterns
 Visualisation scalability to larger more
complex spreadsheets
http://www.mcs.vuw.ac.nz/~db/honours.html