general-modern-db-overview-peet-aug-2001


The challenge of biodiversity:
Plot, organism and taxonomic databases
Robert K. Peet
University of North Carolina
The National Plots Database Committee
John Harris
NCEAS
A case study:
The US National Plots Database
Project organized and directed by:
Robert K. Peet, University of North Carolina
Marilyn Walker, USDA Forest Service & U. Alaska
Dennis Grossman, The Nature Conservancy / ABI
Michael Jennings, USGS-BRD & UCSB
Project supported by:
National Center for Ecological Analysis & Synthesis
U.S. National Science Foundation
USGS-BRD Gap Analysis Program
ABI / The Nature Conservancy
Biodiversity data structure
Locality
Observation/Collection Event (plot databases)
Object or specimen (specimen databases)
Taxon (taxonomic databases)
Taxonomic database challenge
The problem:
Integration of data potentially representing
different times, places, investigators and
taxonomic standards
The traditional solution:
A standard list of kinds of organisms.
Current standards
• Biological organisms are named following international
rules of nomenclature.
• Database standards are being developed by
TDWG, GBIF, IOPI, etc.
• Metadata standards have been developed. For example,
the Darwin Core is a profile describing the minimum set of
standards for search and retrieval of natural history
collections and observation databases.
(http://tsadev.speciesanalyst.net/DarwinCore/)
There exist numerous compilations of
organism names.
For example:
• Species 2000 http://www.sp2000.org/default.html
(Composed of 18 participant databases)
• All Species http://www.all-species.org
• ITIS http://www.itis.usda.gov/
(The US government standard list)
• Index to organism names http://www.biosis.org.uk/triton/indexfm.htm
Taxon-specific standard lists are available.
Representative examples for higher plants include:
North America
USDA Plants http://plants.usda.gov/
ITIS http://www.itis.usda.gov/
ABI http://www.natureserve.org
World
IPNI International Plant Names Index http://www.ipni.org/
IOPI Global Plant Checklist http://iopi.csu.edu.au/iopi/iopigpc1.html
Most standardized plant lists fail to allow
effective integration of datasets.
The reasons include:
• The user cannot reconstruct the database as
viewed at an arbitrary time in the past.
• Taxonomic concepts are often not defined.
• Multiple party perspectives on taxonomic
concepts and names cannot be supported or
reconciled.
Three concepts of shagbark hickory
Splitting one species into two illustrates the ambiguity
often associated with scientific names. If you encounter
the name “Carya ovata (Miller) K. Koch” in a database,
you cannot be sure which of two meanings applies.
sec. Gleason 1952:
  Carya ovata (Miller) K. Koch
sec. Radford et al. 1968:
  Carya carolinae-sept. (Ashe) Engler & Graebner
  Carya ovata (Miller) K. Koch
Multiple concepts of Rhynchospora plumosa s.l.
[Figure: successive treatments of R. plumosa s.l. by Elliot 1816, Gray 1834,
Chapman 1860, Kral 1998 and Peet 2002. Names shown: R. plumosa,
R. plumosa v. plumosa, R. plumosa v. intermedia, R. plumosa v. interrupta,
R. plumosa v. pineticola, R. intermedia, R. pineticola, R. sp. 1.
Concepts in the figure are numbered 1 to 3.]
An assertion represents a unique
combination of a name and a reference.
Assertion is equivalent to
"potential taxon" and "taxonomic concept".
Assertion = Name + Reference
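The name-plus-reference idea can be sketched as a small record type. This is only an illustration, not the National Plots Database schema: the class name `Assertion` and its field names are assumptions, and the two example values are the shagbark hickory concepts shown earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """A taxonomic concept: a scientific name as used in one reference."""
    name: str        # scientific name, e.g. "Carya ovata"
    reference: str   # publication in which the name was used

    def label(self) -> str:
        # "sec." (secundum) reads as "in the sense of"
        return f"{self.name} sec. {self.reference}"


broad = Assertion("Carya ovata", "Gleason 1952")
narrow = Assertion("Carya ovata", "Radford et al. 1968")

# The same name with different references gives two distinct assertions.
same_name = broad.name == narrow.name
same_concept = broad == narrow
```

Because equality covers both fields, a record identified only by "Carya ovata" cannot be matched to either assertion, which is the ambiguity the slides describe.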
Five shagbark hickory assertions
Possible taxonomic synonyms are listed together
Names
Carya ovata
Carya carolinae-septentrionalis
Carya ovata var. australis
References
Gleason 1952 Britton & Brown
Radford et al. 1968 Flora Carolinas
Stone 1997 Flora North America
Assertions
(One shagbark)
C. ovata sec Gleason ‘52
(Southern shagbark)
C. carolinae-s. sec Radford ‘68
C. ovata australis sec FNA ‘97
(Northern shagbark)
C. ovata sec Radford ‘68
C. ovata sec FNA ‘97
A usage represents a unique combination of
a taxon and a name.
Usages can be used to track nomenclatural synonyms.
Usage = Name + Taxon
Published names
1. Carya ovata
2. C. carolinae-septentrionalis
3. C. ovata var. australis
Species concepts
A. One shagbark
B. Southern shagbark
C. Northern shagbark
Usages (name-concept)
1-A
1-C
2-B
3-B
An example of a nomenclatural synonym is
the linkage of the assertion “Carya ovata
var. australis sec. FNA 1997” with the
name “Carya carolinae-septentrionalis” by
both ITIS and ABI.
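As a rough sketch of how usages expose nomenclatural synonyms, the four usages from the shagbark example (1-A, 1-C, 2-B, 3-B) can be stored as name/taxon pairs and grouped by taxon. The function names here are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# (name, taxon) pairs for the usages 1-A, 1-C, 2-B and 3-B
USAGES = [
    ("Carya ovata", "One shagbark"),
    ("Carya ovata", "Northern shagbark"),
    ("Carya carolinae-septentrionalis", "Southern shagbark"),
    ("Carya ovata var. australis", "Southern shagbark"),
]


def names_by_taxon(usages):
    """Group the names that have been applied to each taxon."""
    grouped = defaultdict(set)
    for name, taxon in usages:
        grouped[taxon].add(name)
    return grouped


def synonyms_of(name, usages):
    """Other names that share at least one taxon with `name`."""
    found = set()
    for names in names_by_taxon(usages).values():
        if name in names:
            found |= names
    found.discard(name)
    return found


def taxa_for(name, usages):
    """All taxa a single name has been applied to."""
    return sorted(taxon for n, taxon in usages if n == name)


southern_synonyms = synonyms_of("Carya carolinae-septentrionalis", USAGES)
ovata_taxa = taxa_for("Carya ovata", USAGES)
```

Here `southern_synonyms` contains only "Carya ovata var. australis", while `ovata_taxa` lists two taxa, showing that one name can point to more than one concept.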
A usage (name assignment) and
assertion (taxon concept) can be
combined in a single model
Name
Usage
Reference
Assertion
Party Perspective
The Party Perspective on an Assertion includes:
• Status – standard, nonstandard, undetermined.
• Correlation with other assertions –
Equal, Greater, Lesser, Overlap, Undetermined.
• Lineage – Predecessor and Successor assertions.
• Start & Stop dates.
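The four elements of a party perspective listed above might be held in a record like the following. This is a minimal sketch under stated assumptions: the class and field names are invented for illustration, and the status and dates in the example are placeholders rather than values from any real party's list.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    STANDARD = "standard"
    NONSTANDARD = "nonstandard"
    UNDETERMINED = "undetermined"


class Correlation(Enum):
    EQUAL = "equal"
    GREATER = "greater"
    LESSER = "lesser"
    OVERLAP = "overlap"
    UNDETERMINED = "undetermined"


@dataclass
class PartyPerspective:
    party: str
    assertion: str
    status: Status
    start: date
    stop: Optional[date] = None                        # None: still in force
    correlations: dict = field(default_factory=dict)   # other assertion -> Correlation
    predecessors: list = field(default_factory=list)   # lineage, earlier assertions
    successors: list = field(default_factory=list)     # lineage, later assertions

    def in_force(self, on: date) -> bool:
        """True if this perspective applied on the given day."""
        return self.start <= on and (self.stop is None or on < self.stop)


# Placeholder status and start date; the correlation reflects the split of
# one shagbark concept into two narrower ones.
p = PartyPerspective(
    party="ITIS",
    assertion="Carya ovata sec Gleason 1952",
    status=Status.NONSTANDARD,
    start=date(1996, 1, 1),
    correlations={"Carya ovata sec Radford 1968": Correlation.GREATER},
)
```

Keeping the perspective separate from the assertion lets several parties hold different opinions about the same concept without overwriting each other.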
Party
Assertion
ITIS
FNA Committee
ABI
Carya ovata sec Gleason 1952
Carya ovata sec Radford 1968
Carya carolinae sec Radford 1968
Carya ovata sec FNA 1997
Carya ovata australis sec FNA 1997
Status
Party   Assertion          Status   Start
ITIS    ovata – G52        S        1996
ITIS    ovata – R68        A        1996
ITIS    carolinae – R68    A        1996
ITIS    carolinae – R68    S        2000
ITIS    ovata aust – FNA   A        2000
Name
ovata
carolinae
carolinae
Concept-based taxonomy is coming soon
• All organisms in databases should be identified by
linkage to an assertion = name and reference!
• Various standards are being developed by
FGDC, TDWG, IOPI, GBIF, etc.
• Most major databases are working toward
inclusion of assertions (e.g. ITIS, IOPI, ABI).
• Until standard assertion lists are available,
databases that track organisms should include
couplets containing both a scientific name and a
reference.
National Taxonomic Database?
• Concept-based
• Party-neutral
• Synonymy and lineage tracking
• Perfectly archived
An upgrade for ITIS & Species 2000?
Specimen/object databases
Information on specimens/objects
should be tracked by reference to
• Place (place or collection)
• Unique identifier (accession number)
• Time
A museum is a place
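One possible reading of the place, identifier and time triple is a dated location history for each specimen. The sketch below assumes that reading; the record type, the helper `where_on`, and the museum names and accession numbers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SpecimenLocation:
    place: str              # a museum or collection is treated as a place
    accession_number: str   # unique identifier within that place
    since: date             # when the specimen arrived there


def where_on(history, on):
    """Latest location record that started on or before `on`, else None."""
    candidates = [h for h in history if h.since <= on]
    return max(candidates, key=lambda h: h.since) if candidates else None


# Hypothetical history of a single specimen moving between two collections.
history = [
    SpecimenLocation("Museum A", "A-0001", date(1950, 6, 1)),
    SpecimenLocation("Museum B", "B-7342", date(1988, 3, 15)),
]

held_1970 = where_on(history, date(1970, 1, 1))
held_2001 = where_on(history, date(2001, 8, 1))
```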
Database systems for tracking specimens
The following are a few of the many available
• BioLink http://www.ento.csiro.au/biolink/index.html
• Specify http://usobi.org/specify/default.htm
• Biota http://viceroy.eeb.uconn.edu/Biota
• Taxis http://taxis.virtualave.net/
TDWG maintains links to multiple software systems
http://www.bgbm.fu-berlin.de/TDWG/acc/Software.htm
Core elements of the National Plots Database
• Project
• Plot
• Plot Observation
• Taxon Observation
• Taxon Interpretation
• Plot Interpretation
Support multiple interpretations of which
concept applies to an organism or community.
Various observers will associate different taxonomic
concepts with records in a database.
Provision must be made for inclusion of these taxonomic
interpretations.
Minimal attributes include
• Concept applied
• Date applied
• Who made the interpretation
• Links to supporting information
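The minimal attributes listed above could be modeled as an append-only list of interpretations attached to each observation record. This is a sketch only: the class names, field names, interpreter labels and dates are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Interpretation:
    concept: str          # concept applied (name plus reference)
    applied_on: date      # date applied
    interpreter: str      # who made the interpretation
    evidence: tuple = ()  # links to supporting information


@dataclass
class TaxonObservation:
    recorded_name: str
    interpretations: list = field(default_factory=list)

    def interpret(self, interpretation: Interpretation) -> None:
        # Append rather than overwrite so earlier opinions are retained.
        self.interpretations.append(interpretation)

    def latest(self):
        """Most recently applied interpretation, or None if there is none."""
        return max(self.interpretations, key=lambda i: i.applied_on, default=None)


obs = TaxonObservation("Carya ovata")
obs.interpret(Interpretation("Carya ovata sec Gleason 1952", date(1995, 7, 1), "Observer A"))
obs.interpret(Interpretation("Carya ovata sec FNA 1997", date(2001, 8, 1), "Observer B"))
```

Storing every interpretation, rather than a single current value, keeps the original record intact while later reviewers add their own views.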
Interface tools
• Desktop version for data preparation and local use.
• Loaders for legacy data.
• Data export.
• Tools for linking taxonomic concepts.
• Standard query, flexible query, SQL query.
• Flexible export.
• Local data refresh.
• Easy web access with consistent interface.
Conclusions for database designers
1. Records of organisms should always contain
(or point to) couplets consisting of a scientific name
and a reference where the name was used.
2. Design for future annotation of organism concepts.
3. Track specimens/objects by location, unique identifier
& time.
4. Design for reobservation! Separate permanent from
transient attributes.
5. Archival databases should provide time-specific
views.
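A time-specific view (conclusion 5) can be sketched as a filter over dated status rows. The rows below follow the ITIS status example earlier in the talk, with labels written in plain ASCII; treating a later row as superseding an earlier one for the same assertion is an assumption, and `view_as_of` is a hypothetical helper name.

```python
# (party, assertion, status code, start year)
ROWS = [
    ("ITIS", "ovata - G52", "S", 1996),
    ("ITIS", "ovata - R68", "A", 1996),
    ("ITIS", "carolinae - R68", "A", 1996),
    ("ITIS", "carolinae - R68", "S", 2000),
    ("ITIS", "ovata aust - FNA", "A", 2000),
]


def view_as_of(rows, year):
    """Status of each (party, assertion) as it stood in `year`."""
    latest = {}
    # Walk rows in chronological order so later rows replace earlier ones.
    for party, assertion, status, start in sorted(rows, key=lambda r: r[3]):
        if start <= year:
            latest[(party, assertion)] = status
    return latest


view_1998 = view_as_of(ROWS, 1998)
view_2001 = view_as_of(ROWS, 2001)
```

Because no row is ever deleted or edited, the same table answers queries for any past year, which is what an archival database needs.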