Migration of the Natural History Museum Entomology Department’s

Migrating Entomology’s
Collection Management System to
EMu
Adrian Hine
Programme Timeline
• Separate sub-projects owing to different requirements and very
different legacy systems between departments.
• For Entomology, initial planning started in March 2005 and we
went live in February 2007.
[Timeline chart: sub-project schedules for Mineralogy, Entomology, Palaeontology, Zoology and Botany, plotted by quarter (Jan - Mar to Oct - Dec) across 2005, 2006 and 2007.]
Legacy System - Paradox
• The core system was the Collection Management System, written in-house
in 1995 on a Paradox for DOS database.
• In addition, many more specimen and/or taxonomic datasets were held in
Access.
Advantages
• Simplicity to enter data on one/two screens.
• Stable and reliable system running for more than a decade.
• Requires minimal training to use it.
• Good departmental user knowledge built up over this time.
Disadvantages
• Largely non-relational design (unnormalised data).
• Data input into free text fields (no constraints through drop downs/lookups or validation).
• Unsupported programming language (PAL).
• Limitations as a Collection Management tool, especially tracking material as it moves around the museum - Location is a simple text field. Limited tracking of changes to taxonomic names.
• Relatively few fields to record data.
Legacy System - Datasets
• Four main kinds of datasets:
1) Accession Register (3,300 records)
2) Specimen level (231,000 records)
3) Collection Index (798,000 records)
4) Taxonomic (55,600 records)
• All these have different data models.
• In total, 47 datasets were migrated into EMu. The data structures
were varied and all now had to fit into the EMu data model.
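Fitting differently structured datasets into one target model usually starts with a field map per dataset. The sketch below illustrates that idea only; the dataset keys, legacy column names and target field names are all hypothetical and are not real Paradox or EMu column names.

```python
# Hypothetical per-dataset field maps: legacy column -> target field.
# None of these names come from the real Paradox tables or from EMu.
FIELD_MAPS = {
    "accession_register": {"AccNo": "accession_number", "Donor": "source_party"},
    "collection_index": {"Genus": "genus", "Species": "species",
                         "Location": "location_text"},
}


def map_record(dataset, row):
    """Return (mapped fields, legacy columns that have no mapping yet)."""
    mapping = FIELD_MAPS[dataset]
    mapped = {
        target: row[legacy].strip()
        for legacy, target in mapping.items()
        if row.get(legacy)  # skip missing or empty legacy values
    }
    unmapped = sorted(set(row) - set(mapping))
    return mapped, unmapped


mapped, unmapped = map_record(
    "collection_index",
    {"Genus": " Agabus ", "Species": "montanus", "Notes": "checked"},
)
```

Reporting the unmapped columns alongside the mapped record makes it easier to spot legacy fields that still need a home in the target model.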
Data Mapping
• It was a huge task to reconcile the numerous and varied datasets
into a single data repository for the first time ever in the department
(and eventually the museum).
• There was a range of data models across the different
datasets. These had to be mapped to the EMu model.
• Probably the most critical stage of the process was data mapping of
our existing data model to the EMu model.
• Numerous EMu records can be generated from a single Paradox CI
record and some quite complex logic was necessary to achieve this.
• In particular, for each CI record up to four taxonomy records could be
generated, and these all had to be correctly created and attached to
one another.
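The expansion of one flat CI record into several linked taxonomy records can be sketched as follows. This is a minimal illustration, not the migration code that was actually used: the input keys (`filed_as`, `filed_as_original`, `synonym`, `synonym_original`), the role labels and the `linked_to` attribute are all assumptions, and the real logic was more complex.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaxonomyRecord:
    name: str
    role: str  # hypothetical labels: "current", "synonym", "original combination"
    linked_to: Optional["TaxonomyRecord"] = None  # record this one is attached to


def expand_ci_record(ci: dict) -> List[TaxonomyRecord]:
    """Build up to four linked taxonomy records from one flat CI row."""
    current = TaxonomyRecord(ci["filed_as"], "current")
    records = [current]
    if ci.get("filed_as_original"):
        # Original combination of the filed-as name, attached to it.
        records.append(
            TaxonomyRecord(ci["filed_as_original"], "original combination", current)
        )
    if ci.get("synonym"):
        synonym = TaxonomyRecord(ci["synonym"], "synonym", current)
        records.append(synonym)
        if ci.get("synonym_original"):
            # Original combination of the synonym, attached to the synonym.
            records.append(
                TaxonomyRecord(ci["synonym_original"], "original combination", synonym)
            )
    return records


row = {
    "filed_as": "Agabus montanus",
    "filed_as_original": "Ilybius montanus",
    "synonym": "Agabus melanocornis",
    "synonym_original": "Ilybius melanocornis",
}
records = expand_ci_record(row)
```

Creating the filed-as record first means every later record has something to attach to, which is one simple way to guarantee the links are valid.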
EMu Records for Collection Index
[Diagram: EMu records generated for one Collection Index entry, Agabus montanus. Modules shown: Collection Index (Agabus montanus); Taxonomy (Agabus montanus, Agabus melanocornis, Ilybius montanus, Ilybius melanocornis, with the labels Synonym and Original Combination); Catalogue (Index Lot), two records, both Agabus montanus; Locations (Dry Collection, Spirit Collection); Bibliography (Systema naturae); Parties (Linnaeus).]
EMu Catalogue
• Our main record types in the Catalogue are:
1) Specimen – Specimen level datasets
2) Preparation – Specimen level datasets
3) Acquisition – Accession Register
4) Index Lot – Collection Index
• The focus here is on the Index Lot record type and its relationship to another
module named the Collection Index.
EMu Index Lot & Collection Index
• A Catalogue Index Lot is the presence of a species (but can be other
taxonomic rank) at a particular location in the museum.
• The species may be present at multiple locations around the
museum in different collections, hence have multiple Index Lot
records in the Catalogue.
• The material is disparate in nature. To be able to manage it
more effectively we wanted to view all Index Lot records together.
• A new module was created for the NHM client to achieve this – the
Collection Index module.
• There is a single Collection Index record for each species. This
record brings together all Index Lots for the same species to be
viewable and editable in a single view.
• The client also has built-in functionality to display all Synonyms of
that name.
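The relationship described above, one Collection Index entry per species gathering every Index Lot and its synonym lots, can be sketched as a simple grouping. This is only an illustration of the idea, not how the EMu module is implemented: the dictionary keys and the `synonym_of` lookup are assumptions, and the pairing of names with locations is illustrative.

```python
from collections import defaultdict

# Illustrative Index Lot records; the field names are hypothetical.
index_lots = [
    {"taxon": "Agabus montanus", "location": "Main Dry Collection"},
    {"taxon": "Agabus montanus", "location": "Accessory Collection"},
    {"taxon": "Agabus melanocornis", "location": "Spirit Collection"},
]

# Hypothetical lookup: synonym -> the name it is filed under.
synonym_of = {"Agabus melanocornis": "Agabus montanus"}


def build_collection_index(lots, synonym_of):
    """Group Index Lots under their filed-as name, including synonym lots."""
    index = defaultdict(list)
    for lot in lots:
        filed_as = synonym_of.get(lot["taxon"], lot["taxon"])
        index[filed_as].append(lot)
    return dict(index)


collection_index = build_collection_index(index_lots, synonym_of)
for filed_as, lots in collection_index.items():
    print(filed_as, [lot["location"] for lot in lots])
```

Resolving each lot to its filed-as name before grouping is what lets a single view show the synonym lot alongside the lots stored under the current name.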
EMu Index Lot & Collection Index
[Diagram: a Collection Index record for Agabus montanus (Filed as Name) linked to its Taxonomy record and to a Synonym Taxonomy record, Agabus melanocornis. Three Catalogue (Index Lot) records are shown (Agabus montanus, Agabus montanus, Agabus melanocornis), each with a Locations record (Main Dry Collection, Accessory Collection, Spirit Collection). Annotation: "Displays Synonym Lots as well".]
EMu - Collection Index
EMu - Index Lot
What Now?
• Now that migration is complete, there is still considerable work to do.
• Data clean-up – now that the data is in the same repository, a whole raft of new
issues has arisen, e.g. duplication of records that sat in separate
datasets – a legacy of the previous non-relational model.
• Training – we have a large body of staff: ca. 30 core collection
management staff, plus up to 30 others to train.
• The department is now in the process of learning the new system –
all core staff have undertaken a two day training course followed up
by one-to-one sessions. It will take time for everyone to reach a
good level of expertise with the new system.
• EMu is a complex system compared to the old Paradox system. The key to
its success is ensuring there is an excellent skill base across
the department.
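The duplication issue mentioned under data clean-up, the same record having sat in more than one legacy dataset, can be illustrated with a simple cross-dataset check. This is a hedged sketch only: the `taxon` and `source` keys, the sample records and the normalisation rule are assumptions, and real clean-up would need far more careful matching.

```python
from collections import defaultdict


def normalise(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def find_cross_dataset_duplicates(records):
    """Return names that occur in more than one source dataset."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalise(rec["taxon"])].append(rec)
    return {
        name: recs
        for name, recs in groups.items()
        if len({r["source"] for r in recs}) > 1
    }


# Hypothetical sample records from two legacy datasets.
sample = [
    {"taxon": "Agabus montanus", "source": "collection_index"},
    {"taxon": "agabus  montanus ", "source": "access_dataset"},
    {"taxon": "Agabus melanocornis", "source": "collection_index"},
]
duplicates = find_cross_dataset_duplicates(sample)
```

Free-text entry in the old system means candidate duplicates often differ in case or spacing, which is why the names are normalised before comparison.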
Conclusion
• Migrating our data into a single data
repository has been a challenging process – it is not easy! However, there are
considerable advantages to having our data in one place.
• The process has inevitably thrown up many data clean-up issues
that we will be dealing with for a long time to come.
• User training is critical to the success of the project. EMu is
substantially more complex than the old system and we need to
invest a lot of time and energy bringing everybody in the department
to a good standard of competency.
• We are hopeful that EMu will provide a significantly more effective
system to manage our collections at the museum.