Summary of NSF Databases and the

Recap @ Science on the Semantic Web,
Rutgers, October 2002
Invitational Workshop on Database and Information
Systems Research
For Semantic Web and Enterprises
Amit Sheth & Robert Meersman
NSF Information & Data Management PI’s Workshop
Amit Sheth & Isabel Cruz
“Ask not what the Semantic Web Can
do for you, ask what you can do for
the Semantic Web”
Hans-Georg Stork, European Union
http://lsdis.cs.uga.edu/SemNSF
Context for Amicalola workshop
• Series of Workshops and upcoming conferences:
Lisbon (9/00), Hong Kong (5/01), Palo Alto (7/01),
Amsterdam (12/01); since then WWW2002/ISWC
– Observation: visible lack of DB/IS involvement
• “Semantic Web – The Road Ahead,”
[Decker, Hans-Georg Stork, Sheth, … SemWeb’2001 at WWW10,
Hong Kong, May 1, 2001.]
• Semantic Web: Rehash or Research Goldmine
[Fensel, Mylopoulos, Meersman, Sheth (Chair), CoopIS’01]
• At Castel Pergine, Italy
Semantics & IDM – Brief History
(partial)
• Semantic Data Modeling
M. Hammer and D. McLeod: "The Semantic Data Model: A Modelling Mechanism for Data
Base Applications"; Proc. ACM SIGMOD, 1978.
• Conceptual Modeling
Michael Brodie, John Mylopoulos, and Joachim W. Schmidt. On Conceptual Modeling. Springer
Verlag, New York, NY, 1984. A series of preceding workshops.
• Data Semantics: What, Where and How?
- "Database Semantics", R.A. Meersman and T.B. Steel (eds), Proceedings of the IFIP DS-1
Conference, North-Holland (1985).
- So Far (Schematically) yet So Near (Semantically) – Sheth, Keynote at DS-5
- Meersman, Navathe, Rosenthal, Sheth (Chair); IFIP DS-6 Panel
• Semantic Interoperability on Web
many projects in 90s
– 1994 CIKM paper on Semantic Information Brokering talked about query
processing in a multi-ontology environment
• Domain Modeling, Metadata, Context, Ontologies, Semantic
Interoperability, Semantics in Schema Integration, Semantic
Information Brokering, Spatio-temporal, geographic, image, video and multimodal semantics
• All of these involved Semantics, Databases, IS and even the Web,
before the term “Semantic Web” was coined
Challenges – unique role of IDM
SCALE and PERFORMANCE
Acceptable computation (query/analysis) time when
you have millions and billions of instances
(documents, digital content) and metadata
(annotation)
• locking for sharing/storage management
• Semantic similarity, mappings, interoperability
(schema transformation/integration aka ontology
mismatch)
• indexing for expediting computations
• workflow for Web Services-based processes
Organization/Output
• 20+ senior researchers/practitioners
• 2.5 days in Georgia Mountains
• Proceedings of position papers (also talks)
• Three workgroups: Application Pull (Brodie/Dayal),
Ontology (Decker/Kashyap) and Web Services
(Fensel/Singh)
• <SWIS WG at IDM PI’s meeting>
• Review at OntoWeb3 Panel
• Final Report
• SIGMOD Record special issue December 2002
Everything is at lsdis.cs.uga.edu/SemNSF/
Participants
Karl Aberer, LSIR, EPFL, Switzerland
Mike Brodie, Verizon
Isabel Cruz, The University of Illinois at Chicago
Umeshwar Dayal, Hewlett-Packard Labs
Stefan Decker, Stanford University
Max Egenhofer, University of Maine
Dieter Fensel, Vrije Universiteit Amsterdam
William Grosky, University of Michigan-Dearborn
Michael Huhns, University of South Carolina
Ramesh Jain, UC-San Diego, and Praja
Yahiko Kambayashi, Kyoto University
Vipul Kashyap, National Library of Medicine
Ling Liu, Georgia Institute of Technology
Frank Manola, The MITRE Corporation
Robert Meersman, Vrije Universiteit Brussel (VUB)
Amit Sheth, University of Georgia and Voquette
Munindar Singh, North Carolina State University
Hans-Georg Stork, EU
Rudi Studer, AIFB Universität Karlsruhe
Bhavani Thuraisingham, NSF-CISE-IIS
Michael Uschold, The Boeing Company
Medical metaphor
• Ontologies: anatomy
• Processes: physiology
• Applications: pathology 
Application Pull …Agenda
• Premises
– Every resource meaningfully available
– Current & Planned Web Services
– Beneficiaries and Requirements
• Potential Semantic Services
– B2B, C2C, Intra-Enterprise
– Example Semantic Web Services
• Challenges / Questions / Concepts
• What the Semantic Web Will Look Like
Application Pull …Scenarios
• Scenarios
– Tax preparation (Individual)
– Supply Chain (B2B)
– Scientific Research
• Semantics will be added at three
different levels in successive phases
– Information
– Transactions
– Collaborations
Application Pull …Benefits / Requirements
• Lowering barriers to entry
– Costs
– Entrants
• Consumers
• Service providers
• Dynamic
– Ability to adjust to rapidly changing circumstances
• Continuous
– Continuous activity (e.g., taxes, financial activity) monitoring
– Event Detection
– Do taxes anytime, anywhere
• X-Internet
– Executable
– Extended
• Improved
– Transparency
– Timeliness
– Accuracy
– Optimization
– Eliminate mundane tasks
• Additional services
• Reliability and trust
• Archiving
– Data
– Meta-data
– Transaction histories
Application Pull …Challenges
• Upper ontologies
– Entities
• Personal
• Organizations
– Activities / Events
– Processes
• Ontologies
– Products
– Services
– Financial contracts
– Business objects
– Tax laws (all agencies)
– Financial activities
– Service providers
– Financial planning
– Supply chain processes
– Activities (to be monitored)
• Ontology activities
– Search
– Select
– Create, refine
– Maintain, version
• Local
• Shared
• Global
– Mapping
• Ontology-based activities
– Accountability
• Arbitration
• Trust
• Tracing
• Engineering
– Managing ontologies and
mappings
– Scalability, robustness,
[Figure: ontology lifecycle. Stages shown: Ontology Search; Compare/Similarity; Requirements/Analysis; Ontology Learning; Merge/Refine/Assemble; Evaluation; Maintenance; Versioning; Creation/Change; Consistency Checking; Deployment (e.g., Hypothesis Generation, Query)]
DB Research in the Ontology LifeCycle
• Operations to compare
Models/Ontologies
• Scalability/Storage Indexing of
Ontologies
– DB approaches data model specific
– Need to support graph based data
models
• Temporal Query Languages
Lots of work in Schema Integration/translation
Ontology WG: DB Research in the Ontology
LifeCycle II
• Schema Mapping
– Meta Model specific
– Representation of exceptions, e.g.,
Tweety
– Specification of Inexact Schema
Correspondences
• E.g., 40% of animals are 30% of humans
• Meta Model
Transformations/Mappings (e.g., UML
to RDF Schema)
Ontology WG: DB Research in the
Ontology LifeCycle III
• Ontology Versioning
– Collaborative editing
– Meta Model specific versioning
– Version of Schema/Meta Model
Transformations
Ontology WG: DB Research &
Semantic Interoperation
• Inference vs. Query Rewriting/Processing for Semantic
Integration:
• E.g., RichPerson = (AND Person (> Salary 100))
• Can Query Processing/Concept Rewriting provide the
same functionality as inferences? More efficiently?
• Distributed Inferences and Loss of Information
• Query Languages for combining metadata and data queries
• Graph-based data models and query languages
• Schema Correspondences/Mappings
• Intensional Answers (Answers are descriptions,
e.g. (AND Person (> Salary 100)) instead of a list of all rich people)
• Semantic Associations (identification of meaningful
relationships between different documents and entities)
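The contrast between rewriting a concept into a query and returning an intensional answer can be illustrated with a minimal Python sketch. It assumes the RichPerson definition from the slide; the `Concept` class, the `CONCEPTS` registry, the record layout and the sample rows are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Concept:
    """A named concept with an intensional description and a membership test."""
    name: str
    description: str              # the definition itself, e.g. an (AND ...) term
    test: Callable[[dict], bool]  # predicate obtained by rewriting the definition


# Hypothetical registry holding the slide's example definition.
CONCEPTS: Dict[str, Concept] = {
    "RichPerson": Concept(
        name="RichPerson",
        description="(AND Person (> Salary 100))",
        test=lambda r: r.get("type") == "Person" and r.get("salary", 0) > 100,
    ),
}


def rewrite_and_run(concept_name: str, rows: List[dict]) -> List[str]:
    """Query rewriting: expand the concept into its defining predicate and filter."""
    concept = CONCEPTS[concept_name]
    return [r["name"] for r in rows if concept.test(r)]


def intensional_answer(concept_name: str) -> str:
    """Intensional answer: return the description instead of the matching instances."""
    return CONCEPTS[concept_name].description


# Made-up sample data, not taken from the workshop material.
ROWS = [
    {"name": "ann", "type": "Person", "salary": 150},
    {"name": "bob", "type": "Person", "salary": 80},
    {"name": "acme", "type": "Organization", "salary": 500},
]

print(rewrite_and_run("RichPerson", ROWS))  # ['ann']
print(intensional_answer("RichPerson"))     # (AND Person (> Salary 100))
```

The extensional result scans the data, while the intensional answer is available without touching any instance, which is one reason the slide raises it as a research topic for large collections.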
Semantic Index
Semantic WS Scope
[Figure: scope of Semantic Web Services. Labels shown: Worth pursuing; Std; Program; All; Formally self-described; currency.com; Amazon; html; Self-described; Hard code; People]
Mike’s Humor
• Services vs. Ontologies
“Well done is better than well
said.”
Ben Franklin
Research Issues
• Environment: scalable, openness, autonomy, heterogeneity, evolving
• Representation: self-description, conversation, contracts, commitments, QoS
• Programming: compose & customize, workflow, negotiation
• Interaction (system): trust, security, compliance
• Architecture: P2P, privacy
• Utilities: discovery, binding, trust service
SWS – Fitting in and expanding IS/DB/DM:
Or why Bhavani & George should care?
Data => services, similar yet more challenging:
– Modeling <functional and operational>
– Organizing collections
– Discovery and comparison (reputation)
– Distribution and replication
– Access and fuse (composition)
– Fulfillment
• Contracts, coordination versus transactions
• Quality: more general than correctness or precision
• Compliance
– Dynamic, flexible information security and
trust.
Research Issues
• Conversational (state-based, event-based, history-based)
• Interoperability of conversational services – compose,
translate
• Representations for services: programmatic self-description
• Commitments, contracts, negotiation, compliance,
cooperation
• Discovery, location, binding
• Transactional workflow: rollback, roll-forward, semantic
exception handling, recovery
• Trustworthy service (discovery, provisioning,
composition, description)
• Security; privacy vs. personalization
• Quality-of-Service, w.r.t. various aspects, negotiable
DB / IS subcommunities: how each is relevant to research on the SW, and how the SW may stimulate research in that community
• DB theory
– Relevance to SW research: Type theory, complexity, theory of concurrency
– How the SW may stimulate research: Ontology axiomatics and theory; formal semantics; semantics for incomplete, inconsistent and evolving representations
• Data(base) semantics
– Relevance to SW research: Everything; in particular ontology language development; constraints; data structures
– How the SW may stimulate research: Ontology modeling; formal semantics of web services
• Normalization/design
– Relevance to SW research: Not specifically as such; some work on Non-First Normal Form
– How the SW may stimulate research: Requirement for formal properties for ontology organization; perhaps ontology design guidelines or “semantic normal forms”; conflict resolution; redundancy checks in general
• Data modeling
– Relevance to SW research: reuse/extend/map DM formalisms, techniques and methods, e.g. EER, ORM, UML, for ontology (content) specification and design
– How the SW may stimulate research: semantic data modeling; ontology content creation techniques and methods; complex ontological relationships; domain models
• View integration
– Relevance to SW research: Ontology alignment, translation, object identities, updateable views…; model mappings
– How the SW may stimulate research: see Federated DBs; ontology support for view and application integration; ontology composition and update
• Schema integration
– Relevance to SW research: apply to autonomously designed schemas; global schemas as pre-ontologies? conflict detection
– How the SW may stimulate research: Ontology alignment; new kinds of models will pose new kinds of problems
• Deductive DB/Datalog
– Relevance to SW research: Learn from its failure, processing and F-logic
– How the SW may stimulate research: how to handle different complexity levels efficiently
• Multimedia DB
– Relevance to SW research: Image ontologies; semantic indexing; similarity-based search
– How the SW may stimulate research: Image-based ontologies?
• Temporal/Spatial DB
– Relevance to SW research: GIS semantics and archiving; histories data management
– How the SW may stimulate research: requirement to model temporal knowledge as first class citizen in ontologies; spatial, temporal modeling in upper ontologies; versioning of GIS becomes critical issue
• Document DB
– Relevance to SW research: Digital libraries, unstructured data; standards for digital library resource descriptions to be used on the SW
– How the SW may stimulate research: Lack of a priori global model presents a research challenge
• OO DB
– Relevance to SW research: Object-oriented and object-based models for ontologies, extensible databases; modeling of object behavior; build OODB into Java
– How the SW may stimulate research: management of large collections of object-, behavior- and resource identifiers
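• Visual DB
– Relevance to SW research: Visualization for the SW; visual queries; ontology visualization
– How the SW may stimulate research: semantic upgrades of image databases to be used as visual ontologies
• XML/Web DB
– Relevance to SW research: Most relevant, caching
– How the SW may stimulate research: Size and semantics; XML shortcomings for semantics definition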
• Distributed DB
– Relevance to SW research: everything
– How the SW may stimulate research: trust/privacy/compliance issues in distributed DBMS; design/dynamic tailoring of DDBMS underlying web services
• Constraint DB
– Relevance to SW research: Constraint enforcement as semantics mechanism; semantics-based query processing
– How the SW may stimulate research: Non-closed world assumption issues
• Transaction modeling
– Relevance to SW research: loosening of ACID properties
– How the SW may stimulate research: Web services, extended distributed transaction models; non-CWA issues; smart user profiling
• Transaction processing
– Relevance to SW research: limits of what can/must be transactional
– How the SW may stimulate research: ACID properties of Web services; semantic support for very long transactions
• Mobile DB
– Relevance to SW research: not directly; “mobile” is a platform issue
– How the SW may stimulate research: context-aware computing; device location-independent semantics; mobility issues raised/enabled by the (Semantic) Web
• Main memory DB
– Relevance to SW research: Semantic caching
– How the SW may stimulate research: possibly semantic caching, i.e. using application semantics or context
• Parallel DB
– Relevance to SW research: unclear at present; straightforward reuse/apply (e.g. parallel queries, transactions, …) in certain niches
• DB machines
– Relevance to SW research: Not clear at present Web SoA; parallel architectures for ontology servers?
– How the SW may stimulate research: Not clear at present Web SoA
• DB security
– Relevance to SW research: A lot, e.g., access control
– How the SW may stimulate research: trust and privacy, QoS; dynamically changing and conflicting security requirements
• Federated DB
– Relevance to SW research: Autonomy; approaches for integrating heterogeneous data sources, in particular web information sources; mediator/wrapper-based architectures
– How the SW may stimulate research: www = huge federated DB; develop more powerful (scalable) approaches for ontology alignment and integration; heterogeneous sources may have different credibility; service composition
• Query processing
– Relevance to SW research: high applicability; e.g. “smart” query enhancement
• Query optimization
– Relevance to SW research: high applicability; e.g. use domain knowledge to optimize query execution and rewriting
• Information retrieval
– Relevance to SW research: broad applicability of techniques and theory
• DB interoperability
– Relevance to SW research: Everything; esp. see federated DBs; see schema integration
– How the SW may stimulate research: Semantic aspects of interoperability; see federated DBs; quality of interoperation
• DB versioning
– Relevance to SW research: Link maintenance; versioning
– How the SW may stimulate research: versioning of instance data
• Metadata
– Relevance to SW research: Annotations, ontology modeling
– How the SW may stimulate research: Annotations, ontology modeling, ontology versioning
• Mediation/Middleware
– Relevance to SW research: Web services will benefit
– How the SW may stimulate research: P2P, collaboration, new mediating components
• DB warehousing
– Relevance to SW research: DW architectures for decision support; improve e.g. web service efficiency; see the (S)Web as a giant DW
– How the SW may stimulate research: Smart data warehousing; share/compose application semantics; ontology behind “real” data
• Data(base) mining
– Relevance to SW research: web mining; clustering; learning; information extraction profiles
– How the SW may stimulate research: market for mining from text; exploit semantics in mining; derive semantics inductively from query results on “real” data including exceptions; machine learning
• Database architectures and DBMS
– Relevance to SW research: DBMS (components) as web service(s); add semantics to every function/module in a DBMS’s architectures
– How the SW may stimulate research: Ontology support in data dictionaries; new, more flexible DB architectures for better SW support and processing on the web
• Web-IS architectures
– Relevance to SW research: fitting enterprise IS (components) into the SW; Web IS; also see DBMS architectures
– How the SW may stimulate research: New architectures and design principles for Web IS
• Functional modeling
– Relevance to SW research: design of web services; functional modeling that deals explicitly with a domain’s semantics
– How the SW may stimulate research: Decomposition and composition of web services; event modeling
• IS in organizations
– Relevance to SW research: looser coupling required, provide potential for organizations to morph into the SW; see also workflow modeling
– How the SW may stimulate research: serving new organizations of business, community and government with emergent SW-based IS technology
• Web-IS applications
– How the SW may stimulate research: smart (ontology-driven) SW portals and search engines (“Google++”-type); SW-based “direct marketing”-style systems; smart user profiling
• IS workflow modeling
– Relevance to SW research: exception handling in long (business) transactions; workflows as “the” paradigm for “programming” the SW
– How the SW may stimulate research: unreliability of components; unavailability of services
• IS methodologies
– Relevance to SW research: ontology lifecycle issues; as IS components become more intelligent, work shifts to self-organization
– How the SW may stimulate research: New thinking required! E.g. Web IS in enterprises; how must business processes change to deal with existence of the SW; develop/maintain SW-based systems for user community unknown a priori
• CASE tools
– Relevance to SW research: ontology management systems
• User interfaces
– Relevance to SW research: new applications of design principles for GUIs
– How the SW may stimulate research: New and complex requirements, methods, immersive environments
• DB application architectures
– Relevance to SW research: Web application service
• AI-and-DB
– Relevance to SW research: knowledge representation and inference
• Uncharted territory 1
– Sensor input and stream data management
• Uncharted territory 2
– In general, most algorithms in DM are poor when they are applied to access, report etc. data on the web. Domain semantics in such requests need to be exploited; however, “centralized” solutions (where resources need to notify potential requestors) will not be scalable.