Building a Distributed Geospatial Library
Alexandria Digital Library Project
Building a
Distributed Geospatial Library
where we are now
where we’re going
what we’re facing
Greg Janée
Goals
Digital library for georeferenced information
distributed, autonomous nodes
heterogeneous
rich services
scalable
– many providers
– collections, large and small
Standard components, interfaces
Greg Janée • ADEPT retreat • November 8, 2002
The big picture
collection registry: collection-level search
thesaurus: shared vocabularies
library: item-level search, metadata management
content: data access (collections, each holding items)
gazetteer: maps placenames to locations
map: background imagery, layering capability
*many interconnections between services*
Library server
[Architecture diagram, layers from top to bottom]
user interface, metadata mapper, harvest loader, item tracker
client interface (XML / Java, HTTP, RMI)
middleware
– access control; query fan-out; query result caching & ranking
– collection referencing & registration
collection interface (XML / Java)
internal collections, generic database driver, Z39.50 driver, proxy driver, collection aggregator
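A minimal sketch of the query fan-out, result caching, and ranking that the middleware layer is described as performing. The class name Middleware, the driver signature (a callable returning (item_id, score) pairs), and the scores are assumptions made for illustration; they are not ADL's actual interfaces.

```python
from concurrent.futures import ThreadPoolExecutor


class Middleware:
    """Hypothetical middleware: fans a query out to collection drivers."""

    def __init__(self, drivers):
        # drivers: collection name -> callable(query) -> [(item_id, score), ...]
        self.drivers = drivers
        self.cache = {}  # query -> merged, ranked hits

    def search(self, query, limit=10):
        # Serve repeated queries from the result cache.
        if query in self.cache:
            return self.cache[query][:limit]
        # Fan out: run every collection driver concurrently.
        with ThreadPoolExecutor(max_workers=max(1, len(self.drivers))) as pool:
            futures = {name: pool.submit(driver, query)
                       for name, driver in self.drivers.items()}
            hits = [(name, item_id, score)
                    for name, future in futures.items()
                    for item_id, score in future.result()]
        # Rank: highest score first (the sort is stable for ties).
        hits.sort(key=lambda hit: hit[2], reverse=True)
        self.cache[query] = hits
        return hits[:limit]
```

Each driver hides how its collection is actually searched (internal database, Z39.50, proxy to another server), so the middleware only merges and ranks.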
Issues
1. Finding the right participation model
I have a collection o’ stuff, how do I join ADL?
2. Providing a complete solution
I’m a map library, I want a library-in-a-box
3. Gaining adoption
How do I add spatial searching to my DL?
4. Simple, effective spatial searching
I want spatial search but I’m cheap and lazy
Participation via database mapping
Assumes a relational database of metadata
Collection described as a view of the database
ADL provides
template-based report generator
mapping language
extensible library of composable mapping components (“paradigms”)
offline software package to generate collection statistics
Diagram labels: provider, RDBMS, view, config, ADL node
Sample paradigms
Spatial: Informix Geodetic blade; 4 box coordinates
Textual: SQL LIKE substring matching; Verity text engine; IIT SIRE; constant
Temporal: begin, end dates; single integer year
Hierarchical: integer codes w/ code ancestor relationships
Numeric, Identification, ...
Field adaptors: qualification; union; concatenation; constant
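A rough sketch of how two of the paradigms above ("4 box coordinates" and "SQL LIKE substring matching") could each contribute a WHERE fragment to a generated query. The table name items, its column names, and the function names are hypothetical; the sketch also assumes boxes that do not cross the date line and does not escape LIKE wildcards in the search term.

```python
import sqlite3


def spatial_overlaps(north, south, east, west):
    """WHERE fragment: the stored box overlaps the query box."""
    sql = "(south <= ? AND north >= ? AND west <= ? AND east >= ?)"
    return sql, [north, south, east, west]


def textual_contains(column, term):
    """WHERE fragment: case-insensitive substring match via LIKE."""
    return f"(UPPER({column}) LIKE ?)", ["%" + term.upper() + "%"]


def search(conn, *constraints):
    """AND together one or more (sql, params) constraints."""
    clauses = [sql for sql, _ in constraints]
    params = [p for _, ps in constraints for p in ps]
    query = ("SELECT identifier FROM items WHERE "
             + " AND ".join(clauses) + " ORDER BY identifier")
    return [row[0] for row in conn.execute(query, params)]


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (identifier TEXT, subject TEXT, "
             "north REAL, south REAL, east REAL, west REAL)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?, ?, ?)", [
    ("a", "Oceanographic data", 35.0, 33.0, -119.0, -121.0),
    ("b", "Road maps", 35.0, 33.0, -119.0, -121.0),
    ("c", "Oceanographic cruise", 10.0, 5.0, 10.0, 5.0),
])

matches = search(conn,
                 spatial_overlaps(34.5, 34.0, -119.5, -120.0),
                 textual_contains("subject", "ocean"))
print(matches)  # ['a']
```

Because each paradigm only emits a fragment plus its parameters, paradigms of different types can be combined without knowing about one another.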
A bucket mapping
"subject-related-text" : UT.Bucket("textual",
    UT.standardTextualOperators,
    P.Adaptor_Concatenation(
        { "tag:sio.ucsd.edu:sioexplorer/nsdl_mif_dbc/subject" :
            P.Textual_LikeSubstring(
                "nsdl.nsdl_mif_dbc",
                "identifier",
                "subject",
                UT.Cardinality("1"),
                P.TextUtils.mappings.uppercaseAlphanumericOthersToWhitespace,
                P.TextUtils.deleteLists.keepAll,
                "UPPER"),
          "tag:sio.ucsd.edu:sioexplorer/subject-keywords" :
            P.Textual_Constant(
                "nsdl.nsdl_mif_dbc",
                "identifier",
                UT.Cardinality("1"),
                ["oceanographic data", "Stephen’s baby"])
    ...
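To illustrate the idea behind the mapping above (one bucket fed by the concatenation of a database column and a constant), here is a toy sketch. It is not the ADL UT/P library: the functions concatenation, column, and constant, and the record layout, are all hypothetical stand-ins.

```python
def column(name, normalize=str.upper):
    """Source: the (normalized) value of one metadata column, if present."""
    def values(record):
        value = record.get(name)
        return [normalize(value)] if value else []
    return values


def constant(*fixed_values):
    """Source: the same fixed values for every record in the collection."""
    def values(record):
        return list(fixed_values)
    return values


def concatenation(*sources):
    """Field adaptor: a bucket's values are those of all sources, in order."""
    def values(record):
        return [v for source in sources for v in source(record)]
    return values


# A "subject-related-text" bucket built from a column plus a constant.
subject_related_text = concatenation(
    column("subject"),
    constant("oceanographic data"),
)

print(subject_related_text({"identifier": "x1", "subject": "Seafloor sonar"}))
# ['SEAFLOOR SONAR', 'oceanographic data']
```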
Database mapping: an assessment
What’s good
data stays close to provider
collection-as-DB-view parallels real-world funding situation
– nobody is paid to be an ADL node
What’s bad
high bar
– must have database, good metadata, reasonable data modeling, appropriate indexes
complex configuration
– multiple, different representations of same info
– requires superhuman diligence
complex software
– generic query translator compiler
Participation via metadata transfer
Database is internal to ADL
“Universal” schema
supports all buckets, bucket types
automates all indexing, bucket mappings, collection statistics
enforces collection policies
Provider supplies metadata
entire XML documents
via OAI or otherwise
Mapping to ADL metadata views (bucket, browse, access) still required, but...
simpler, higher-level
no duplication
Diagram labels: provider, metadata, mapper, ADL node, config, RDBMS
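For the "via OAI" route, a harvester could look roughly like the sketch below, which builds OAI-PMH ListRecords request URLs and extracts record identifiers, metadata elements, and the resumption token from a response. The function names and the example base URL are assumptions; fetching the URL and loading records into the ADL database are left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 XML namespace


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return ([(identifier, metadata_element), ...], next_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI + "record"):
        identifier = record.findtext(OAI + "header/" + OAI + "identifier")
        metadata = record.find(OAI + "metadata")  # whole XML document, unparsed
        records.append((identifier, metadata))
    token = root.findtext(".//" + OAI + "resumptionToken")
    return records, (token or None)  # an empty token means the list is complete
```

A harvest loop would call list_records_url, fetch and parse the response, hand each metadata element to the mapper, and repeat with the returned token until it is None.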
Issue 2: providing a complete solution
ADL provides:
discovery
Missing:
ingest, editing tools
management of...
– metadata
– data
– data services
...and synchronization of the above
workflow
A reasonable goal (?):
ADL provides complete map library solution
Issue 3: gaining adoption
Adoption by other DLs has been difficult
features (spatial search, buckets) not separable from architecture
nobody understands buckets anyway
The world speaks Dublin Core
we don’t
close doesn’t count
Adoption strategies
New, compelling reasons to use ADL!
harvesting automates collection building
metadata mapping will support qualified Dublin Core
Our proposal to NSDL/CI:
“search semantics” profile for qualified DC
generic search framework that supports
– typed searches
– over federated search services
Issue 4: design philosophy
“The right thing”
1 : interface simplicity, correctness, consistency
2 : implementation simplicity, completeness
“Worse is better”
1 : implementation simplicity
2 : interface simplicity
3 : correctness, consistency
4 : completeness
exemplified by Unix, C
(Richard Gabriel, early ’90s)
Our approach
We have the “right” interfaces
searching based on continuous geodetic coordinates
complex spatial representations (polygons, polylines, ...)
gazetteer (content & protocol) provides mapping to names
simple!
But... implementation is very difficult
polygons, etc. make life difficult at all levels
polygons require $$$ 3rd-party software
client integration with gazetteer is difficult
still don’t have a usable gazetteer
Other approaches
We pay a big price for our approach
spatial search was motivator for typed metadata
typed metadata is responsible for much of complexity
Might other approaches be equally effective?
simplified spatial models, e.g., boxes only
other coordinate systems (discrete, coded, ...)
cataloging against fixed gazetteer w/ topological relationships
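As one illustration of a discrete, coded coordinate system, the sketch below catalogs each item against fixed grid cells, so that spatial search reduces to set intersection on cell codes with no polygon software needed. The one-degree cell size, the "row:col" code format, and the function name are assumptions; boxes crossing the date line are not handled, and coarse cells can produce false positives.

```python
import math


def box_cells(north, south, east, west, size=1.0):
    """Codes of all grid cells touched by a latitude/longitude box."""
    rows = range(math.floor((south + 90) / size),
                 math.floor((north + 90) / size) + 1)
    cols = range(math.floor((west + 180) / size),
                 math.floor((east + 180) / size) + 1)
    return {f"{r}:{c}" for r in rows for c in cols}


# Cataloging: store the cell codes with each item (illustrative data).
catalog = {
    "coastal-survey": box_cells(36.0, 34.9, -118.0, -119.2),
    "equatorial-chart": box_cells(10.0, 9.0, 10.0, 9.0),
}

# Searching: an item matches if it shares any cell with the query box.
query = box_cells(34.5, 34.2, -119.5, -119.9)
matches = sorted(name for name, cells in catalog.items() if cells & query)
print(matches)  # ['coastal-survey']
```

The trade-off matches the "worse is better" argument: the implementation is trivial and works in any database, at the cost of precision that depends on cell size.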
Summary
Future directions
simpler participation model
collection-level discovery
remote deployment
NSDL/CI
Legacy
production-quality software
– copiously documented
– no known bugs, omissions, or bottlenecks
in step with MIL
Cast of characters
Dave Valentine
client, databases, testing, deployment
Catherine Masi
MIL collection development
Rudolf Nottrott
outreach, software development
Greg Janée
overall design, core software development
Jim Frew
guru