Beyond BirdSpot a proposal

Beyond BirdSpot
learnings and a proposal
L. Shyamal
[email protected]
16 Nov 2005
What BirdSpot was meant to do
• Compile records of birds seen, drawn from all available published material, to build a true picture of distribution in a database system amenable to searches and analyses
  – thereby also making it a bibliography of bird reports
• Start collecting trip reports to build more accurate snapshots of birdlife in various parts of the country - provide reliable measures of commonness (frequency of sighting) and seasonal status
  – a trip report being a list of all bird species seen on a particular trip to a location at a particular point of time, along with an estimate of time spent in the field
• Generate simple visualizations of the data and generate reports
• Ensure quality of data by giving users a feeling of ownership of the data and by keeping it offline
  – most users have no motivation to store junk / frivolous data on their own systems
• Ensure preservation of the data by replicating it across a user community
• Data sharing achieved on a peer-to-peer basis, so that users can build their own compilation of data based on their judgement of the accuracy of observers - sharing via email
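The "frequency of sighting" measure above can be illustrated with a minimal sketch. It assumes a reporting rate is the fraction of trip lists that include a species. The function name `reporting_rates` and the sample lists are hypothetical and are not taken from BirdSpot.

```python
from collections import Counter


def reporting_rates(trip_lists):
    """Return {species: fraction of trip lists that include it}."""
    if not trip_lists:
        return {}
    counts = Counter()
    for species_seen in trip_lists:
        # A set ensures a species counts at most once per trip list.
        counts.update(set(species_seen))
    total = len(trip_lists)
    return {species: n / total for species, n in counts.items()}


# Hypothetical trip lists for one location.
trips = [
    ["House Crow", "Common Myna", "Black Kite"],
    ["House Crow", "Common Myna"],
    ["House Crow", "Indian Roller"],
    ["House Crow", "Common Myna", "Common Myna"],
]
rates = reporting_rates(trips)
for species, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{species}: {rate:.2f}")
```

Because the denominator is the number of complete trip lists, stray single-species records would not fit this measure, which is one reason to keep them separate from trips.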
The implementation
• Visual Basic 6.0 - MS Access 97
• User interface - guiding principles
  – high data/ink ratio (Edward Tufte)
  – high interaction (Ben Shneiderman / Spotfire inspired)
• Database schema
  – separation of the Trip concept from the stray records concept
  – use of a single linear taxonomic definition list
• Installer
  – Inno Setup
  – MDAC installation not sufficiently well integrated
The database schema
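As an illustration only of the two schema ideas named earlier (trips kept separate from stray records, and a single linear taxonomic list), here is a minimal sketch. Every table and column name is hypothetical, and SQLite stands in for MS Access 97, so this is not BirdSpot's actual schema.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from BirdSpot.
SCHEMA = """
CREATE TABLE species (
    seq_no INTEGER PRIMARY KEY,      -- position in the single linear taxonomic list
    name   TEXT NOT NULL UNIQUE
);
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    latitude    REAL,
    longitude   REAL
);
CREATE TABLE trip (                  -- a complete list for one visit
    trip_id        INTEGER PRIMARY KEY,
    location_id    INTEGER NOT NULL REFERENCES location(location_id),
    trip_date      TEXT NOT NULL,
    hours_in_field REAL,
    source         TEXT
);
CREATE TABLE trip_species (
    trip_id INTEGER NOT NULL REFERENCES trip(trip_id),
    seq_no  INTEGER NOT NULL REFERENCES species(seq_no),
    PRIMARY KEY (trip_id, seq_no)
);
CREATE TABLE stray_record (          -- an isolated sighting, not part of a full list
    record_id   INTEGER PRIMARY KEY,
    location_id INTEGER NOT NULL REFERENCES location(location_id),
    seq_no      INTEGER NOT NULL REFERENCES species(seq_no),
    record_date TEXT,
    source      TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO species VALUES (1, 'House Crow')")
conn.execute("INSERT INTO location VALUES (1, 'Example Lake', 12.97, 77.59)")
conn.execute("INSERT INTO trip VALUES (1, 1, '2005-11-16', 2.5, 'example trip report')")
conn.execute("INSERT INTO trip_species VALUES (1, 1)")
conn.execute("INSERT INTO stray_record VALUES (1, 1, 1, NULL, 'example publication')")

trip_count = conn.execute("SELECT COUNT(*) FROM trip_species").fetchone()[0]
stray_count = conn.execute("SELECT COUNT(*) FROM stray_record").fetchone()[0]
print(trip_count, stray_count)
```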
Data entry
• Initially data entered using MS Excel
• Modified using a data grid
  – data grid modifications - short lists of species
  – checking of input data using database
• Fuzzy text mining and matching (Fixit) - not integrated
• Current data contents almost entirely entered by author
  – data entry has taken at least 5 years to reach the current status
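Fuzzy matching of typed names against a known species list can be sketched with Python's standard `difflib`. This is not how Fixit works, since the slides do not describe its method. The function name `match_species`, the cutoff value and the sample list are assumptions made for illustration.

```python
import difflib


def match_species(raw_name, known_names, cutoff=0.8):
    """Return the closest known species name, or None if nothing is close enough."""
    # Compare in lower case, but return the canonical spelling.
    lookup = {name.lower(): name for name in known_names}
    hits = difflib.get_close_matches(
        raw_name.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None


# Hypothetical checklist.
KNOWN = ["House Crow", "Common Myna", "Indian Roller", "Black Kite"]

print(match_species("comon mynah", KNOWN))
print(match_species("Penguin", KNOWN))
```

A stricter cutoff rejects more misspellings, while a looser one risks matching the wrong species, so any unmatched or borderline name is best left for manual review.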
Current User Interface: Geographic view
• Gets list of species for location or region marked, along with consolidated reporting rates
• Generates HTML report
• Shows sources of information
• Background maps can be changed
• Map can be customized
• Rectangular and lasso selector supported for region marking
• Tool tips provide lat-long info
• Maps have to be orthonormal raster images of fixed size
• Help and other items relegated to a flat strip at the bottom
• Dialog itself is not resizable (inability to manage layout easily)
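Region marking with a rectangle or a lasso amounts to testing which locations fall inside a shape. The sketch below uses the standard ray-casting test for the lasso. It is one common way to do this, not necessarily what BirdSpot does, and the function names and coordinates are hypothetical.

```python
def in_rectangle(point, corner_a, corner_b):
    """True if point lies within the axis-aligned rectangle given by two corners."""
    x, y = point
    (x1, y1), (x2, y2) = corner_a, corner_b
    return min(x1, x2) <= x <= max(x1, x2) and min(y1, y2) <= y <= max(y1, y2)


def in_lasso(point, polygon):
    """Ray casting: toggle 'inside' at each polygon edge crossed by a ray towards +x."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical lasso and locations, as (longitude, latitude) pairs.
lasso = [(77.0, 12.0), (78.0, 12.0), (78.0, 13.0), (77.5, 13.5), (77.0, 13.0)]
locations = {"Site A": (77.5, 12.5), "Site B": (76.5, 12.5)}
selected = [name for name, pt in locations.items() if in_lasso(pt, lasso)]
print(selected)
```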
Current User Interface: Species view
• Gets locations for a selected species
• Generates HTML report
• Shows sources of information
• Background maps can be changed
• Map can be customized
• Month range highlighting
• Shading of spots based on report count
• Gridding and rank based spot generation
• Help and other items relegated to a flat strip at the bottom
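The slides do not say how gridding and rank-based shading are computed. The sketch below is one plausible reading: records are binned into latitude-longitude cells, and each cell is shaded by the rank of its report count rather than by the raw count. The cell size, number of shade levels and function names are assumptions.

```python
import math
from collections import Counter


def grid_counts(records, cell_deg=1.0):
    """Count records per grid cell; records are (latitude, longitude) pairs."""
    counts = Counter()
    for lat, lon in records:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


def rank_shades(counts, levels=4):
    """Map each cell to a shade 1..levels by the rank of its count (ties share a shade)."""
    distinct = sorted(set(counts.values()))
    rank_of = {value: rank for rank, value in enumerate(distinct)}
    return {
        cell: 1 + (rank_of[n] * levels) // len(distinct)
        for cell, n in counts.items()
    }


# Hypothetical sighting coordinates for one species.
records = [
    (12.9, 77.6), (12.5, 77.2), (12.1, 77.9),
    (13.2, 77.5), (13.8, 77.1),
    (28.6, 77.2),
]
counts = grid_counts(records)
shades = rank_shades(counts, levels=3)
print(counts)
print(shades)
```

Shading by rank keeps a few heavily reported cells from washing out the rest of the map, at the cost of hiding the absolute difference between counts.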
Current User Interface: Data entry
• Initially data entered using MS Excel
• Modified using a data grid
  – data grid modifications
  – short lists of species
  – checking of input data using database
• Fuzzy text mining and matching (Fixit) - not integrated
• Current data contents almost entirely entered by author
  – data entry has taken at least 5 years to reach the current status
Current User Interface: Exporter
• Separate application
• Generates a text output with multiple sections
• File needs to be emailed to the configured database compiler
• Has not been used by many users
Current User Interface: Importer
• Separate application
• Parses text file sections and inserts into database
• No checking for duplication done
• Has not been used by many users
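The missing duplication check could look roughly like the sketch below. The export format is not described in the slides, so the `[SECTION]` headers and tab-separated rows are invented for illustration, as are the function names. A Python set stands in for the database table.

```python
def parse_sections(text):
    """Split a sectioned text export into {section name: [row tuples]}."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(tuple(line.split("\t")))
    return sections


def import_records(existing, incoming):
    """Insert only rows not already present; return the number inserted."""
    added = 0
    for row in incoming:
        if row not in existing:
            existing.add(row)
            added += 1
    return added


# Hypothetical export file contents.
EXPORT = "\n".join([
    "[TRIPS]",
    "T1\tExample Lake\t2005-11-16",
    "[SIGHTINGS]",
    "T1\tHouse Crow",
    "T1\tCommon Myna",
    "T1\tHouse Crow",
])

sections = parse_sections(EXPORT)
database = {("T1", "House Crow")}  # row already imported earlier
inserted = import_records(database, sections["SIGHTINGS"])
print(inserted, sorted(database))
```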
Suggestions, feedback and problems
• Unable to resize window
• How do I store only my data?
• Near zero data contribution from user base
• Failure to handle multiple species treatments
• Lack of support for reassignment or editing of reports
• Lack of support for version history maintenance
• Installation troubles, especially due to MDAC
• Difficulty of georeferencing
• How do I modify this to work with ‘butterflies/…/...’?
• Can’t this also have photographs and identification information?
• Slow speed of operation, especially ranking and gridding
• Failure to reflect changes immediately (requirement to flatten data)
What is needed
• Web-based system with backend database
  – Taxonomic subsystem
  – User subsystem
  – Database maintenance
  – Species information
  – Multimedia storage
  – Reporting and analysis
• Utilities for database synchronization via web services
  – Offline data entry and analysis tools
• Support for linking with a multimedia database and wiki-style identification and species material
• Participation - data entry, compilation of species information, contribution of images, recordings and text, and compilation of reference material from existing sources
• Prevention of large scale data loss by allowing all users to synchronize various parts of the data into offline systems
• Ensure high quality data by having users first enter data into their personal offline systems and then synchronize it with the server
• All new users are registered through ‘references’ or invitations given by existing users
Backend database
• Species concept
  – Should essentially support the idea of ‘species’ being a named set of organisms
  – Should support the idea of multiple ‘sets’ of species - overlapping, non-overlapping, equal and containment
  – Should support the idea of nested hierarchies of these sets for higher taxon levels
  – Should provide tools for marking and updating of lists and taxonomic concepts (see TDWG, Taxonomic Concept Transfer Schema)
• User management
  – Should track user activities
• Database control
  – Should maintain database history and support undo/rollback
  – Should automatically taint bad records or users
  – data grid modifications - short lists of species
  – Should aid in easier data entry
  – Should try to emphasise trip lists and dissuade stray record entry except for rarer species
  – Should support synchronization, version control, merging and diff-ing apart from ease of replication
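The idea of a species as a named set, with set relations between competing treatments, can be sketched with Python's built-in sets. The concept names and population identifiers below are hypothetical, and this is not the TDWG Taxonomic Concept Transfer Schema, only an illustration of the four relations listed above.

```python
def relation(a, b):
    """Name the set relation between two taxon concepts given as sets of members."""
    if a == b:
        return "equal"
    if a < b:
        return "contained in"
    if a > b:
        return "contains"
    if a & b:
        return "overlapping"
    return "non-overlapping"


# Hypothetical concepts, each a named set of population identifiers.
concepts = {
    "X (broad treatment)": frozenset({"pop1", "pop2", "pop3"}),
    "X (narrow treatment)": frozenset({"pop1", "pop2"}),
    "Y (split from X)": frozenset({"pop3"}),
    "Z (alternative split)": frozenset({"pop2", "pop3"}),
}

broad = concepts["X (broad treatment)"]
narrow = concepts["X (narrow treatment)"]
print(relation(broad, narrow))
print(relation(narrow, concepts["Y (split from X)"]))
print(relation(narrow, concepts["Z (alternative split)"]))
```

A record identified under the narrow treatment can then be reassigned safely to any concept that contains it, while records under overlapping concepts need review.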
Web Interfaces
• Querying
  – temporal brushing
  – spatial querying
• Reporting
  – maps
  – seasonal histograms
  – latitudinal trends, gridding, smoothing, contouring
• Data entry
  – Fuzzy text matching, text mining (extracting from emails)
• User management
• Browsing
• Image upload
• Authoring
[Figure: range selectors, showing a sliding window over the available data range]
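A seasonal histogram and a month-range filter (one simple form of temporal brushing) can be sketched as follows. The function names and sample dates are hypothetical. The month window is allowed to wrap across the year end so that a winter range such as November to February works.

```python
from collections import Counter
from datetime import date


def seasonal_histogram(dates):
    """Return a list of 12 counts, one per calendar month (January first)."""
    counts = Counter(d.month for d in dates)
    return [counts.get(month, 0) for month in range(1, 13)]


def in_month_range(d, start, end):
    """True if d falls in months start..end inclusive, wrapping over the year end."""
    if start <= end:
        return start <= d.month <= end
    return d.month >= start or d.month <= end


# Hypothetical sighting dates for one species.
sightings = [
    date(2003, 11, 2), date(2003, 12, 14), date(2004, 1, 9),
    date(2004, 1, 23), date(2004, 2, 1), date(2004, 6, 30),
]
histogram = seasonal_histogram(sightings)
winter = [d for d in sightings if in_month_range(d, 11, 2)]
print(histogram)
print(len(winter))
```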
Offline client utilities
• Taxonomy editor
• Database synchronization
• Data entry
• Data analysis
• Secure authenticated communication with website
  – Allow low bandwidth users to work with breaks in communications
• Offline tools and the web-based application should have similar capabilities
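Synchronization between an offline client and the server could start from a comparison like the sketch below. It assumes each record carries a version tag (for example a content hash) and that the client remembers the tags from its last successful sync. The proposal specifies none of these details, and the function name `plan_sync` is hypothetical.

```python
def plan_sync(local, server, last_synced):
    """Compare {record_id: version} maps and decide what to upload, download or review."""
    upload, download, conflicts = [], [], []
    for record_id in sorted(set(local) | set(server)):
        base = last_synced.get(record_id)
        local_changed = local.get(record_id) != base
        server_changed = server.get(record_id) != base
        if local_changed and server_changed:
            # Both sides changed since the last sync: a conflict unless they agree.
            if local.get(record_id) != server.get(record_id):
                conflicts.append(record_id)
        elif local_changed:
            upload.append(record_id)
        elif server_changed:
            download.append(record_id)
    return upload, download, conflicts


# Hypothetical version tags.
last_synced = {"r1": "a", "r2": "b", "r3": "c"}
local = {"r1": "a", "r2": "b2", "r3": "c2", "r4": "d"}
server = {"r1": "a", "r2": "b", "r3": "c3", "r5": "e"}

upload, download, conflicts = plan_sync(local, server, last_synced)
print(upload, download, conflicts)
```

Because the plan is computed from small version maps rather than full records, a low bandwidth client can work out what to send after a break in communication and resume from there.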