17/07/2015
ebXML Day
(Barcelona 23.5.2002)
Implementing ebXML Registry
Information Model
Peter Burgess
TietoEnator©2002
Some background...
Technical Architect (Media and Telecom section,
TietoEnator in Brussels, Belgium).
Specialising in system architecture + development of
Java/ XML solutions.
Open-source evangelist.
New committer on ebxmlrr open-source project.
Previously worked for Nokia (Finland), IBM Global
Services (Belgium)
Some background...
TietoEnator
Staff of 10,000 and annual net sales of 1.1 billion euros.
IT Services organization with strong base in Scandinavia
esp. Finland, Sweden
Consulting, systems development and integration,
operation and support, product development services,
and software services.
In Belgium, working with both commercial and public
sector clients.
http://www.tietoenator.com/
Our Project
Implementation of MIReG metadata model and
framework
MIReG = Managing Information Resources for eGovernment
Sponsored by European Commission’s IDA initiative
IDA = Interchange of Data between Administrations
IDA’s mission: ‘using advances in information and
communications technology to support rapid
electronic exchange of information between Member
State administrations’
http://europa.eu.int/ISPO/ida/
Project Goal
To implement a system for managing metadata about
information resources, documents and services.
To implement a system that facilitates:
content interoperability
simplification of administrative processes
improved information flows
To allow users to:
locate and track documents, metadata and versions
search and manage content
search and manage administrative metadata
What is Metadata?
Data about data.
Metadata describes a resource – e.g:
– Name
– Title
– Subject
– Date issued
– Version
– Date modified
– Identifier
The Dublin Core standard is a simple standard for describing a wide
range of networked resources. See http://dublincore.org
Dublin Core Elements
Creator
Publisher
Contributor
Relation
Coverage
Rights
Date
Subject
Description
Source
Format
Title
Identifier
Type
Language
MIReG Metadata Model &
Framework
Metadata management system manages metadata about
information resources, documents and services
Describes citizens, enterprises, public servants, long-lived
information (e.g. archived documents).
Dublin Core + MIReG extensions:
administrative metadata - to describe how the
resource should be managed and processed
access rights
security classification
disposal
long-term preservation
etc…
Functional requirements
Metadata management system should support:
Exporting documents and their metadata
Converting existing metadata to Dublin Core/RDF
Adding or updating administrative metadata
Storing metadata
Providing metadata search capability
Importing documents and their metadata.
Why ebXML?
Metadata Management System should:
be flexible and evolutionary;
facilitate ‘content interoperability’ – i.e. information
exchange between organisations;
be standards compliant and open;
provide well defined interfaces, allowing Creation,
Update, Retrieval and Deletion of metadata and
content
Non-functional requirements
All open-source solution, adopting ‘best of breed’
solutions (e.g. Apache WS, Apache Tomcat, Apache
Xindice, Castor Java-XML binding)
Co-operate with open source community wherever
possible.
Delivered system based on standards such as ebXML,
W3C Schema, RDF, Dublin Core.
Total XML solution – from database to user interface.
Tools and APIs: Requirements
Open-source!! Ability to see the code and make
changes if necessary...
Support – open-source community responds quickly
to bug reports and questions.
Don’t reimplement ebXML Registry from scratch –
co-operate with existing open-source team(s).
Don’t reinvent the wheel – reuse best existing
solutions.
Use stable and well adopted tools (e.g. Apache Web
Server 1.3, Apache Tomcat 4.0+)
Tools and APIs: Problems
Steep learning curve ... many new tools and APIs to master.
Support for full W3C Schema standard not available in Castor
Java-XML binding.
No concrete JAXB implementation available that supports W3C
Schema (only DTD)
Xindice 1.0 only really supports US-ASCII (UTF-8 patch now
available in Xindice 1.1 development)
Xindice XPath contains() search is slow. Must use equality tests
to gain benefits of indexing.
Xindice’s transaction support not yet available ...
Standards & Technologies (1)
Standards:
JAXP – Java API for XML Processing
JAXB – Java Architecture for XML Binding
JAXR – Java API for XML Registries
JAXM – Java API for XML Messaging
SOAP Version 1.1
W3C XML Schema
RDF & RDF Schema
XSLT (Version 1.0)
XPath (Version 1.0)
XML:DB
Java Servlet specification (Sun Microsystems) – version 2.3
JSP specification (Sun Microsystems) – version 1.2
Standards & Technologies (2)
Open Source technologies & tools:
Apache Web Server 1.3
Apache Tomcat 4.0.3 servlet engine
Apache SOAP (XML messaging API)
Apache Xerces XML parser
Apache Xalan XSLT processor
Apache Xindice (DBXML) native XML database
Castor open source framework for Java – XML binding
All software written using the Java programming
language (Java versions 1.3 and 1.4)
Architecture Overview
(High Level)
Architecture Overview
(ebXML Service Layer)
Architecture Overview
(XML database layer)
Xindice v Relational
Relational database model:
– Tables
– Views
– Data is structured, based on pre-defined schema
– Standardised queries via SQL (SELECT, INSERT,
UPDATE, DELETE etc.)
– Most RDBMS support JOIN operations
– Possible to make XML to Relational mapping (e.g. IBM
DB2 XML Extender)
Xindice v Relational
Xindice database model:
– Hierarchical organisation of data
– The root of the hierarchy is a database instance
– Data managed as XML Documents
– Insert the data as XML and retrieve it as XML
– Sets of documents form a Collection (similar in
idea to a file system folder)
– Queries use standard XPath (query engine built around
Apache Xalan)
– Indexing system speeds up XPath query performance
Mapping ebXML RIM to Xindice
Main Concepts:
All ebXML RIM components stored as separate XML documents
The ebXML RegistryObject id is used as the Xindice document id
Use Association to link two RegistryObjects e.g:
<rim:ObjectRef id="urn:uuid:b2345678-1234-1234-123456789077"/>
<rim:ObjectRef id="urn:uuid:c2345678-1234-1234-123456789012"/>
<!-- Association describes relationship between these two objects -->
<rim:Association associationType="Packages"
sourceObject="urn:uuid:b2345678-1234-1234-123456789077"
targetObject="urn:uuid:c2345678-1234-1234-123456789012"/>
Mapping ebXML RIM to Xindice
All XML data is wrapped in a custom <RegistryData> wrapper.
<RegistryData> wrapper contains namespace declaration
<RegistryData xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0" xmlns:rim="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0">
XPath queries include namespace prefix e.g.
//rim:ExtrinsicObject
All ebXML RIM components are stored in same collection
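Why the queries carry the rim: prefix can be checked with a small sketch. It uses the JDK's javax.xml.xpath engine as a stand-in for Xindice's query service, which is an assumption made only for illustration; the class name, the object id "obj1" and the helper count() are hypothetical. In XPath 1.0 an unprefixed name never matches an element that sits in a default namespace, so only the prefixed query finds the wrapped object.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespacePrefixSketch {
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";

    // Hypothetical wrapped document: one ExtrinsicObject in the default RIM namespace.
    static final String WRAPPED =
        "<RegistryData xmlns=\"" + RIM_NS + "\" xmlns:rim=\"" + RIM_NS + "\">"
      + "<ExtrinsicObject id=\"obj1\"/>"
      + "</RegistryData>";

    /** Binds the prefix "rim" to the RIM namespace for XPath evaluation. */
    static final NamespaceContext RIM_CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "rim".equals(prefix) ? RIM_NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return RIM_NS.equals(uri) ? "rim" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            return RIM_NS.equals(uri)
                ? Collections.singletonList("rim").iterator()
                : Collections.<String>emptyIterator();
        }
    };

    /** Returns how many nodes the expression selects in the given XML text. */
    static int count(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-qualified matching
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(RIM_CONTEXT);
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count(WRAPPED, "//ExtrinsicObject"));     // unprefixed: no match
        System.out.println(count(WRAPPED, "//rim:ExtrinsicObject")); // prefixed: one match
    }
}
```

Declaring both the default namespace and the rim prefix on the wrapper keeps stored documents readable while still giving queries a prefix to bind to.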
ebXML RIM and Dublin Core
Metadata mapped to ExtrinsicObject slots
e.g: creator=‘Arthur C. Clarke’ maps to:
<Slot name="creator" slotType="dc-metadata">
<ValueList>
<Value>Arthur C. Clarke</Value>
</ValueList>
</Slot>
Subset of ebXML RIM implemented in the short term: User, Slot,
ExtrinsicObject, AuditableEvent, Association, ExternalLink
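The slot mapping above can be sketched in a few lines of Java. This is only an illustration of the idea, not the project's code: the class and method names are hypothetical, the "language" entry is an invented second example, and the markup is built as plain strings rather than through the Castor binding used in the project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: turn Dublin Core name/value pairs into ebXML RIM Slot markup. */
class DublinCoreSlotSketch {

    /** Escapes the characters that are unsafe in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Builds one Slot element holding a single value. */
    static String toSlot(String name, String value, String slotType) {
        return "<Slot name=\"" + escape(name) + "\" slotType=\"" + escape(slotType) + "\">"
             + "<ValueList><Value>" + escape(value) + "</Value></ValueList>"
             + "</Slot>";
    }

    public static void main(String[] args) {
        // Insertion order is kept so slots come out in the order they were added.
        Map<String, String> dublinCore = new LinkedHashMap<>();
        dublinCore.put("creator", "Arthur C. Clarke");
        dublinCore.put("language", "en"); // illustrative value, not from the slides
        for (Map.Entry<String, String> entry : dublinCore.entrySet()) {
            System.out.println(toSlot(entry.getKey(), entry.getValue(), "dc-metadata"));
        }
    }
}
```

Keeping the element name in the slot's name attribute means new metadata elements need no schema change, only a new slot.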
Querying Xindice with XPath (1)
W3C standard XPath
Advanced path like expressions, allowing node selection and
filtering
Example
<rim:ExtrinsicObject id="urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339">
<rim:Name>
<rim:LocalizedString value="ebXML RIM Schema metadata"/>
</rim:Name>
<rim:Description>
<rim:LocalizedString value="metadata about ebXML RIM schema"/>
</rim:Description>
<!-- metadata here as slots -->
<rim:Slot name="title" slotType="schema-metadata">
<rim:ValueList>
<rim:Value>ebXML RIM W3C Schema</rim:Value>
</rim:ValueList>
</rim:Slot> etc . . .
Querying Xindice with XPath (2)
Select ExtrinsicObject with id
‘urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339’:
//rim:ExtrinsicObject[@id='urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339']
Select ExtrinsicObject with a Slot whose name is ‘title’ and whose
value list entry contains the word ‘ebXML’ (the match is case sensitive):
//rim:ExtrinsicObject[rim:Slot[@name='title']/rim:ValueList/rim:Value[contains(.,'ebXML')]]
Querying Xindice with XPath (3)
XPath JOIN
One Xindice collection can be queried as one large document:
Example
<rim:ExternalLink id="acmeLink2">
<rim:Name>
<rim:LocalizedString value="Link #2"/>
</rim:Name>
<rim:Description>
<rim:LocalizedString value="ACME's Link #2"/>
</rim:Description>
</rim:ExternalLink>
<rim:Association id="acmeLink2-alreadySubmittedCPP-Assoc"
associationType="ExternallyLinks" sourceObject="acmeLink2"
targetObject="urn:uuid:a2345678-1234-1234-123456789012"/>
Querying Xindice with XPath (4)
XPath JOIN
XPath:
Get the RegistryObject whose id is the same as the
targetObject’s id of the Association whose sourceObject’s id is
‘acmeLink2’
//*[@id=//rim:Association[@sourceObject='acmeLink2']/@targetObject]
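The join can be tried outside Xindice with the JDK's XPath engine, used here as an assumed stand-in for the Xindice query service. The sketch treats one collection as a single document, as described above. The RegistryData wrapper contents, the two ExtrinsicObject targets and the helper selectIds() are hypothetical additions for the demonstration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathJoinSketch {
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";

    // One "collection" viewed as a single document. The two ExtrinsicObjects are
    // hypothetical: one is the association target, the other is unrelated.
    static final String COLLECTION =
        "<rim:RegistryData xmlns:rim=\"" + RIM_NS + "\">"
      + "<rim:ExternalLink id=\"acmeLink2\"/>"
      + "<rim:Association id=\"acmeLink2-alreadySubmittedCPP-Assoc\""
      + " associationType=\"ExternallyLinks\" sourceObject=\"acmeLink2\""
      + " targetObject=\"urn:uuid:a2345678-1234-1234-123456789012\"/>"
      + "<rim:ExtrinsicObject id=\"urn:uuid:a2345678-1234-1234-123456789012\"/>"
      + "<rim:ExtrinsicObject id=\"urn:uuid:unrelated\"/>"
      + "</rim:RegistryData>";

    // The inner path yields the targetObject value; the outer predicate matches it
    // against the id of every element.
    static final String JOIN =
        "//*[@id=//rim:Association[@sourceObject='acmeLink2']/@targetObject]";

    static final NamespaceContext RIM_CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "rim".equals(prefix) ? RIM_NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return RIM_NS.equals(uri) ? "rim" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            return RIM_NS.equals(uri)
                ? Collections.singletonList("rim").iterator()
                : Collections.<String>emptyIterator();
        }
    };

    /** Evaluates an element-selecting expression and returns the id of each hit. */
    static List<String> selectIds(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(RIM_CONTEXT);
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                ids.add(((Element) hits.item(i)).getAttribute("id"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectIds(COLLECTION, JOIN));
    }
}
```

The comparison works because XPath 1.0 treats an equality between two node-sets as true when any pair of their string values matches, which is what makes a single expression behave like a join.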
Querying Xindice with XPath (5)
XPath JOIN
Example
<ExtrinsicObject id="urn:uuid:548b6bf0-cf77-4450-9efe-ee465b504484" status="Submitted"
xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0">
…
</ExtrinsicObject>
<AuditableEvent
id="urn:uuid:724719b2-6b4f-41ca-b910-af5219ebcdd9"
objectType="AuditableEvent"
eventType="Created"
registryObject="urn:uuid:548b6bf0-cf77-4450-9efe-ee465b504484"
timestamp="2002-05-15T11:38:56.980"
user="urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7"
xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0" />
Querying Xindice with XPath (6)
XPath JOIN
XPath:
Get all RegistryObjects created by the user with id
'urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7':
//*[@id=//rim:AuditableEvent[@eventType='Created' and @user='urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7']/@registryObject]
XUpdate
• XML:DB initiative specification http://www.xmldb.org/xupdate
• Batch modifications against XML document set.
Example:
<xupdate:update select="//rim:User[@id='urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7']/rim:EmailAddress/@address">peter.burgess@tietoenator.com</xupdate:update>
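What the update instruction does can be imitated with plain DOM and XPath from the JDK. This is an assumed stand-in, not an XUpdate processor: in a real request the xupdate:update element sits inside an xupdate:modifications element in the http://www.xmldb.org/xupdate namespace and is handed to the database. The class name, the helper methods and the old address "old.address@example.org" are hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XUpdateStandInSketch {
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";
    static final String USER_ID = "urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7";

    // Hypothetical stored User document with an outdated address.
    static final String USER_DOC =
        "<rim:User xmlns:rim=\"" + RIM_NS + "\" id=\"" + USER_ID + "\">"
      + "<rim:EmailAddress address=\"old.address@example.org\"/>"
      + "</rim:User>";

    // Same select expression as the xupdate:update instruction.
    static final String SELECT =
        "//rim:User[@id='" + USER_ID + "']/rim:EmailAddress/@address";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    /** Creates an XPath evaluator with the prefix "rim" bound to the RIM namespace. */
    static XPath rimXPath() {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "rim".equals(prefix) ? RIM_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return RIM_NS.equals(uri) ? "rim" : null;
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                return RIM_NS.equals(uri)
                    ? Collections.singletonList("rim").iterator()
                    : Collections.<String>emptyIterator();
            }
        });
        return xpath;
    }

    /** Replaces the content of every selected node; returns how many were changed. */
    static int update(Document doc, String select, String newValue) {
        try {
            NodeList hits = (NodeList) rimXPath().evaluate(select, doc, XPathConstants.NODESET);
            for (int i = 0; i < hits.getLength(); i++) {
                hits.item(i).setTextContent(newValue); // works for attributes and elements
            }
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("update failed", e);
        }
    }

    /** Reads the string value of the first node the expression selects. */
    static String read(Document doc, String select) {
        try {
            return rimXPath().evaluate(select, doc);
        } catch (Exception e) {
            throw new IllegalStateException("read failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(USER_DOC);
        int changed = update(doc, SELECT, "peter.burgess@tietoenator.com");
        System.out.println(changed + " node(s) updated, address is now " + read(doc, SELECT));
    }
}
```

The point of XUpdate is that the same select-and-replace runs inside the database against a whole document set, so documents need not be fetched, edited and stored again by the client.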
Castor Java-XML Binding (1)
• Implements most of the W3C XML Schema recommendation (with gaps, e.g. no union types)
• Unmarshal a java.io.Reader into a Java object
StringReader stringReader = new StringReader(extObjXML);
ExtrinsicObject extObj = ExtrinsicObject.unmarshal(stringReader);
• Marshal a Java object to a java.io.Writer
StringWriter stringWriter = new StringWriter();
extObj.marshal(stringWriter);
String marshalledXML = stringWriter.toString();
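The round trip can be followed end to end with a hand-written stand-in that has the same method shape as the calls above. This is not Castor-generated code and does not use the Castor library: the class ExtrinsicObjectSketch is hypothetical, it handles only the id and status attributes, and it assumes attribute values contain no XML special characters.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hand-written stand-in for a binding class; a real one would be generated from the schema. */
class ExtrinsicObjectSketch {
    private String id;
    private String status;

    String getId() { return id; }
    String getStatus() { return status; }
    void setStatus(String status) { this.status = status; }

    /** Reads XML from the reader and copies the two attributes into a Java object. */
    static ExtrinsicObjectSketch unmarshal(Reader reader) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(reader)).getDocumentElement();
            ExtrinsicObjectSketch obj = new ExtrinsicObjectSketch();
            obj.id = root.getAttribute("id");
            obj.status = root.getAttribute("status");
            return obj;
        } catch (Exception e) {
            throw new IllegalStateException("unmarshal failed", e);
        }
    }

    /** Writes the object back out as XML (values assumed free of special characters). */
    void marshal(Writer writer) {
        try {
            writer.write("<ExtrinsicObject id=\"" + id + "\" status=\"" + status + "\"/>");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String extObjXML =
            "<ExtrinsicObject id=\"urn:uuid:548b6bf0-cf77-4450-9efe-ee465b504484\""
          + " status=\"Submitted\"/>";
        ExtrinsicObjectSketch extObj = unmarshal(new StringReader(extObjXML));
        extObj.setStatus("Approved"); // manipulate the document as a plain Java object
        StringWriter stringWriter = new StringWriter();
        extObj.marshal(stringWriter);
        System.out.println(stringWriter);
    }
}
```

Between unmarshal and marshal the application works only with getters and setters, which is the benefit a binding framework gives over walking a DOM tree by hand.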
Castor Java-XML Binding (2)
• Fast, reliable, performant
• Uses SAX
• High level interface.
• Manipulate XML document as Java Object
• No need to walk the DOM tree, or build custom SAX handlers
Lessons Learned
• Reuse of existing solutions saves a lot of time in the long term
• Access to all software sources was invaluable: we could make our own bug
fixes on the spot.
• Open-source is a two-way street. Use others’ solutions and also
contribute your own.
• Solid architecture because we took time to carefully design the system
(plus prototyping, learning new APIs)
• XML databases offer a very realistic solution for projects with XML data
storage needs.
• XPath is very powerful – even possible to implement JOIN in Xindice.
References
• ebxmlrr project http://sourceforge.net/projects/ebxmlrr
• Apache Xindice http://xml.apache.org/xindice
• XML:DB initiative http://www.xmldb.org
• Castor Java-XML Binding http://castor.exolab.org/
• IDA (European Commission) http://europa.eu.int/ISPO/ida
• TietoEnator http://www.tietoenator.com
• Peter Burgess - [email protected]