XML DB Architecture

Oracle XML DB and XQuery
Chris Gianfrancesco
Aruna Apuri
Oleg Rekutin
Jason Wilson
Introduction
- XML Type abstraction
- Storage
  - Shredded or LOB
- Publishing
  - XML Views of relational tables
- SQL / XML functions and constructs
  - XMLQuery, XMLTable, and more...
  - XML Updates
- XQuery evaluation and processing
XML Type
[Figure: XQuery, XPath, XSLT and SQL / XML all work against the XML Type abstraction. Beneath the abstraction, physical storage is CLOB, shredded, hybrid, or binary XML; XML Type Views sit over relational data.]
XML View
- Create a virtual XML version of object-relational data
  - Allows XQuery to access relational data
  - Uses XML Publishing
  - ora:view()
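To make the idea of a virtual XML version of a table concrete, here is a small Python sketch that publishes relational rows as XML fragments, each wrapped in a ROW element. It is a model of the idea only, not Oracle's ora:view() implementation: the SQLite table, the `view_as_xml` helper, and the choice to skip NULL columns are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

def view_as_xml(conn, table):
    """Return one <ROW> element per table row, with one child element per column."""
    cursor = conn.execute(f'SELECT * FROM "{table}" ORDER BY rowid')
    columns = [description[0].upper() for description in cursor.description]
    fragments = []
    for record in cursor:
        row = ET.Element("ROW")
        for column, value in zip(columns, record):
            if value is not None:  # assumption of this sketch: NULL columns are omitted
                ET.SubElement(row, column).text = str(value)
        fragments.append(row)
    return fragments

# Hypothetical sample data standing in for a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE REGIONS (REGION_ID INTEGER, REGION_NAME TEXT)")
conn.executemany("INSERT INTO REGIONS VALUES (?, ?)", [(1, "Europe"), (3, "Asia")])

rows = view_as_xml(conn, "REGIONS")
serialized = [ET.tostring(row, encoding="unicode") for row in rows]
print(serialized[1])
```

The relational table is never copied into an XML store here; the XML is produced on demand from the rows, which is the sense in which the view is virtual.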

SQL/XML Functions
- SQL/XML querying function and construct
  - XMLQuery, XMLTable
- SQL/XML functions for creating XML from SQL
  - XMLElement(), XMLForest(), XMLConcat(), XMLAttributes()
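As a rough illustration of what the publishing functions do, the Python sketch below builds one XML element per relational row. It mimics the roles of XMLElement(), XMLAttributes() and XMLForest() with the standard library; the table, the column names and the `publish_employees` helper are hypothetical, and this is not how Oracle implements these functions.

```python
import sqlite3
import xml.etree.ElementTree as ET

def publish_employees(conn):
    """Build one <employee> element per row from plain relational columns."""
    docs = []
    query = "SELECT id, name, job FROM employees ORDER BY id"
    for emp_id, name, job in conn.execute(query):
        # Role of XMLElement + XMLAttributes: a named element carrying an attribute.
        elem = ET.Element("employee", {"id": str(emp_id)})
        # Role of XMLForest: one child element per non-NULL column.
        for tag, value in (("name", name), ("job", job)):
            if value is not None:
                ET.SubElement(elem, tag).text = value
        docs.append(ET.tostring(elem, encoding="unicode"))
    return docs

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, job TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "John Doe", "Adjuster"), (2, "Michael Smith", None)],
)

docs = publish_employees(conn)
for doc in docs:
    print(doc)
```

The second row has no job, so its element simply has no job child, which matches how XMLForest() leaves out NULL values.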
More XML Functions
- Other XML functions
  - XMLColAttVal(), XMLSequence(), ExtractValue(), Extract(), XMLTransform()
- To support XML updates
  - UpdateXML(), DeleteXML(), InsertChildXML(), InsertXMLBefore(), AppendChildXML()
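The update functions all follow the same pattern: select nodes with a path, then change, remove or extend them. The Python sketch below shows that pattern on an in-memory document using the standard library. It is an analogy for three of the functions named above, not Oracle code, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<employees>"
    "<employee><name>John Doe</name><job>Adjuster</job></employee>"
    "<employee><name>Sam Adams</name><job>Engineer</job></employee>"
    "</employees>"
)

# In the spirit of UpdateXML: replace the value of the nodes a path selects.
for job in doc.findall("./employee[name='John Doe']/job"):
    job.text = "Investigator"

# In the spirit of DeleteXML: remove the nodes a path selects.
for employee in doc.findall("./employee[name='Sam Adams']"):
    doc.remove(employee)

# In the spirit of AppendChildXML: add a new last child under a parent.
new_employee = ET.SubElement(doc, "employee")
ET.SubElement(new_employee, "name").text = "Michael Smith"

updated = ET.tostring(doc, encoding="unicode")
print(updated)
```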
XQuery Hybrid Evaluation
- Transform XMLTable into XMLQuery
- Static analysis and type checking
- If possible, compiles into native SQL data structures
- If not possible, XMLQuery is left as is for processing by the XMLQuery processor
XQuery Hybrid Evaluation
[Figure: evaluation flow]
1. SQL query containing XMLQuery/XMLTable
2. Transform XMLTable to XMLQuery, giving a SQL query containing XMLQuery
3. Native compilation of XMLQuery
   - If it succeeds: SQL structures with XML operators; XQuery evaluated natively
   - If it does not: SQL structures containing XMLQuery; co-processor evaluates XMLQuery expressions
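The decision at the heart of this flow, compile natively when possible and otherwise hand the expression to a co-processor, can be sketched in a few lines of Python. Everything here is a toy model under stated assumptions: `compile_to_sql` is a hypothetical stand-in for native compilation that only understands paths of the form employee/<column>, SQLite stands in for the relational engine, and ElementTree plays the co-processor.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML = (
    "<employees>"
    "<employee><name>John Doe</name><job>Adjuster</job></employee>"
    "<employee><name>Michael Smith</name><job>Investigator</job></employee>"
    "<employee><name>Sam Adams</name><job>Engineer</job></employee>"
    "</employees>"
)
doc = ET.fromstring(XML)

# Relational copy of the same data, standing in for shredded storage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, job TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(e.findtext("name"), e.findtext("job")) for e in doc.findall("employee")],
)

def compile_to_sql(path):
    """Hypothetical native compiler: handles only employee/<column> paths."""
    steps = path.split("/")
    if len(steps) == 2 and steps[0] == "employee" and steps[1] in ("name", "job"):
        return f"SELECT {steps[1]} FROM employee ORDER BY rowid"
    return None  # not compilable: leave the expression for the co-processor

def evaluate(path):
    """Return which engine ran the expression, and its result."""
    sql = compile_to_sql(path)
    if sql is not None:
        return "native", [row[0] for row in conn.execute(sql)]
    return "co-processor", [node.text for node in doc.findall(path)]

print(evaluate("employee/name"))
print(evaluate("employee[job='Engineer']/name"))
```

The first expression is answered by the SQL engine; the second contains a predicate the toy compiler does not understand, so it falls through to the XML evaluator and still returns a correct answer.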
Input & Data Representation
- All data in one row, one XMLType column (the whole document below is a single XMLType value)
<employees>
<employee>
<name>John Doe</name>
<job>Adjuster</job>
</employee>
<employee>
<name>Michael Smith</name>
<job>Investigator</job>
</employee>
<employee>
<name>Sam Adams</name>
<job>Engineer</job>
</employee>
</employees>
Input & Data Representation
- Each data row in a separate DB row, in a column of XMLType
[Figure: each <employee> element below is one DB row; the slide labels each row with XMLType and VARCHAR.]
<employees>
<employee>
<name>John Doe</name>
<job>Adjuster</job>
</employee>
<employee>
<name>Michael Smith</name>
<job>Investigator</job>
</employee>
<employee>
<name>Sam Adams</name>
<job>Engineer</job>
</employee>
</employees>
Input & Data Representation
- Each data row in a separate DB row, contents in separate columns
[Figure: each <employee> element below is one DB row; each <name> and <job> value is a VARCHAR column.]
<employees>
<employee>
<name>John Doe</name>
<job>Adjuster</job>
</employee>
<employee>
<name>Michael Smith</name>
<job>Investigator</job>
</employee>
<employee>
<name>Sam Adams</name>
<job>Engineer</job>
</employee>
</employees>
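The three representations above can be put side by side in a short Python sketch. SQLite TEXT columns stand in for both XMLType and VARCHAR, and the table names are hypothetical; the point is only how the same document ends up as one row, as one row per employee holding a fragment, or as one row per employee with scalar columns.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML = (
    "<employees>"
    "<employee><name>John Doe</name><job>Adjuster</job></employee>"
    "<employee><name>Michael Smith</name><job>Investigator</job></employee>"
    "<employee><name>Sam Adams</name><job>Engineer</job></employee>"
    "</employees>"
)
root = ET.fromstring(XML)
employees = root.findall("employee")
conn = sqlite3.connect(":memory:")

# Layout 1: the whole document in one row, one column.
conn.execute("CREATE TABLE doc_store (doc TEXT)")
conn.execute("INSERT INTO doc_store VALUES (?)", (XML,))

# Layout 2: one row per <employee>, each row holding an XML fragment.
conn.execute("CREATE TABLE emp_fragments (fragment TEXT)")
conn.executemany(
    "INSERT INTO emp_fragments VALUES (?)",
    [(ET.tostring(e, encoding="unicode"),) for e in employees],
)

# Layout 3: one row per employee, one scalar column per child element.
conn.execute("CREATE TABLE emp_columns (name TEXT, job TEXT)")
conn.executemany(
    "INSERT INTO emp_columns VALUES (?, ?)",
    [(e.findtext("name"), e.findtext("job")) for e in employees],
)

counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("doc_store", "emp_fragments", "emp_columns")
}
engineers = [
    row[0]
    for row in conn.execute("SELECT name FROM emp_columns WHERE job = 'Engineer'")
]
print(counts, engineers)
```

Only the third layout can answer the job filter with plain SQL on scalar columns; the first two need the XML to be parsed or queried with XML-aware functions.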
Input Tools
- Straight XML in SQL
  - INSERT ... VALUES (XMLType('<xml>goes here</xml>'))
- JDBC using special XMLType (also C)
- SQL*Loader w/ direct path load mode
- XML-SQL Utility (XSU)
  - Maps XML to columns
  - Rigid default mapping
  - No support for attributes
Storage in Database
- XMLType CLOB
  - File preserved as complete text (whitespace, comments, etc.) [textual fidelity]
  - Can still be validated against a schema
  - Data internally is not “typed”
  - Slow querying
  - Fastest storage and retrieval
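What textual fidelity buys can be seen in a small Python sketch. Storing the document as text returns exactly what was stored, while a parse-and-serialize round trip keeps only what the parser models. This uses SQLite and ElementTree purely as stand-ins; it illustrates the concept and makes no claim about how any particular Oracle storage model serializes documents.

```python
import sqlite3
import xml.etree.ElementTree as ET

original = (
    "<employee>\n"
    "  <!-- reviewed 2005 -->\n"
    "  <name>John Doe</name>\n"
    "</employee>"
)

# Text storage: the document goes in and comes out character for character.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")
conn.execute("INSERT INTO docs VALUES (?)", (original,))
stored_as_text = conn.execute("SELECT body FROM docs").fetchone()[0]

# Parsed storage: only what the parser keeps survives a round trip.
# ElementTree's default parser discards comments.
round_tripped = ET.tostring(ET.fromstring(original), encoding="unicode")

print(stored_as_text == original)      # text storage returns the exact input
print("reviewed" in round_tripped)     # the comment is gone after parsing
```

The price of the text form is the one the slide names: nothing inside the document is typed or indexed, so every query has to parse it first.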
Storage in Database
- XMLType View
  - Create a virtual XML document on top of relational tables
  - Fast querying, manipulation using pure SQL
  - Deeply nested views are slow
  - Updating/inserting requires triggers
  - Lose strict order guarantee, no textual fidelity
  - Supports multiple XML schemas on top of one relational schema
Storage in Database
- Native XML type (Structured Storage)
  - Preserves DOM fidelity (not textual fidelity: insignificant whitespace can be lost)
  - Shreds into SQL tables
  - Complete validation, full SQL support
  - No triggers to update tables (built-in rewriting)
  - Some overhead
  - Cannot change schema w/o reloading all data
  - Requires a schema
Structured Storage Detail
- Annotate XML schema to control how nested collections are stored, as:
  - CLOB
  - Array of serialized SQL objects
  - Nested table of serialized SQL objects
  - Array of XMLType
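Working with XML Schema
- Registering schema
BEGIN
  dbms_xmlschema.registerSchema(
    'http://namespace',                    -- schema URL, used as the registration key
    xdbURIType('schema.xsd').getClob(),    -- schema document read from the XML DB repository
    TRUE, TRUE, FALSE, TRUE);              -- local, genTypes, genBean, genTables
END;
/
- Creating table w/ schema
CREATE TABLE TableName OF XMLType
  XMLSCHEMA "http://namespace"
  ELEMENT "RootElement";                   -- placeholder: a global element declared in the schema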
XQuery Support in Oracle
- XMLDB integrated database engine
  - SQL / XML standard support
  - Optimized queries: rewrite to relational
- Standalone Java query engine
  - 100% Java
  - Integrated into Oracle App Server - XDS
  - Interoperates with XSLT/XPath
XQuery database support
- Production in Oracle Database 10gR2
- Supports XMLQuery and XMLTable constructs
- Native compilation into SQL/XML structures
- Returns XMLType(Content)
- Can query over relational, O-R, XMLType data
- fn:doc maps to XDB Repository on server
- SQL*Plus provides the xquery command to execute XQuery
- XSL-T will also get compiled to XQuery
Architecture
[Figure: compilation pipeline from XQuery to SQL. Labels in order: XQuery / XSL-T parser (XQueryX); normalization; static type check; normalized tree (casts, treat); compiler; compiled XQuery tree; XQuery type check; rewrite to SQLX; SQL/XML operand tree; SQLX rewrite; SQL operand tree; relational optimizer; execution structures; execution engine with XQuery F&O. Supporting metadata: SQL metadata, XMLSchema repository, XML indexes, text indexes.]
XQuery Java implementation
- XQuery or XQueryX input
- Extensible function implementation
- Compiles into rowsource-like structures
- Optimization: push XQuery to XMLDB
- XQJ API driver for accessing mid tier/backend
- Shared data model with XSL/XPath
- Shared F&O: pre-defined & external
  - Standard implementation interfaces
  - Write a Java function once, use it in XQuery/XSLT
Processing XQuery
Oracle XQuery Compilation Engine
- Parser converts XQuery into XQueryX
- XQueryX is an XML representation of XQuery (another W3C candidate recommendation)
- XML parser constructs a DOM tree from XQueryX
- Work is done on the DOM afterward
- Corresponding components are extended for XQuery too
Sample XQuery
For each author in the bibliography, list the author's name and the titles of all books by that author, grouped inside a "result" element.
<results>
FOR $a IN distinct(document("http://www.bn.com")//author)
RETURN
<result>
$a,
FOR $b IN document("http://www.bn.com")/bib/book[author = $a]
RETURN $b/title
</result>
</results>
What Is XQueryX
- An XML representation of an XQuery.
- Created by mapping the productions of the XQuery abstract syntax directly into XML productions.
- The result is not particularly convenient for humans to read and write.
- Easy for programs to parse, and because XQueryX is represented in XML, standard XML tools can be used to create, interpret, or modify queries.
Environments in which XQueryX is useful
- Parser Reuse. In heterogeneous data environments, a variety of systems may be used to execute a query. One parser can generate XQueryX for all of these systems.
- Queries on Queries. Because XQueryX is represented in XML, queries can be queried and can be transformed into new queries. For instance, a query can be performed against a set of XQueryX queries to determine which queries use FLWOR expressions to range over a set of invoices.
- Generating Queries. In some XML-oriented programming environments, it may be more convenient to build a query in its XQueryX representation than in the corresponding XQuery representation, since XML tools can be used to do so.
- Embedding Queries in XML. XQueryX can be embedded directly in an XML document.
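The "queries on queries" case can be sketched with ordinary XML tooling. The Python example below uses a deliberately simplified, made-up XML vocabulary for queries (the `query`, `flwor`, `for` and `return` elements are hypothetical and are not the real XQueryX schema) and then searches a set of stored queries for those whose FLWOR expression ranges over invoices.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML encodings of three queries (not real XQueryX).
QUERIES = {
    "q1": "<query><flwor><for var='i' in='//invoice'/>"
          "<return path='$i/total'/></flwor></query>",
    "q2": "<query><path value='//customer/name'/></query>",
    "q3": "<query><flwor><for var='c' in='//customer'/>"
          "<return path='$c/name'/></flwor></query>",
}

def queries_ranging_over(element_name):
    """Names of stored queries with a 'for' clause that iterates over element_name."""
    hits = []
    for name, text in QUERIES.items():
        tree = ET.fromstring(text)
        for clause in tree.iter("for"):
            if clause.get("in", "").endswith("/" + element_name):
                hits.append(name)
                break
    return hits

matches = queries_ranging_over("invoice")
print(matches)
```

Because each query is itself a document, no query-language parser is needed for this kind of analysis; the same tools that process the data can process the queries.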
Why XQuery static type checking?
- XQuery static type checking is very useful when the input XML structure is known at compile time.
- It lets errors be caught early, at compile time rather than at run time.
XQuery Static Type-Checking in Oracle XML DB
- Oracle XML DB performs static (that is, compile-time) type-checking of XQuery expressions. It also performs dynamic (runtime) type-checking.
Example: Static Type-Checking of XQuery Expression
- The XML view produced on the fly by Oracle XQuery function ora:view has ROW as its top-level element, but this example incorrectly lacks that ROW wrapper element. This omission raises a compile-time error. Forgetting that ora:view wraps relational data in this way is an easy mistake to make, and one that could be difficult to diagnose without static type-checking.
Static Type-Checking of XQuery Expressions: ora:view
This produces a static-type-check error, because "ROW" is missing.
SELECT XMLQuery('for $i in ora:view("REGIONS"), $j in ora:view("COUNTRIES")
                 where $i/REGION_ID = $j/REGION_ID and $i/REGION_NAME = "Asia"
                 return $j'
                RETURNING CONTENT) AS asian_countries
  FROM DUAL;
SELECT XMLQuery('for $i in ora:view("REGIONS"), $j in ora:view("COUNTRIES")
*
ERROR at line 1:
ORA-19276: XP0005 - XPath step specifies an invalid element/attribute name: (REGION_ID)
Correct code
SELECT XMLQuery('for $i in ora:view("REGIONS"), $j in ora:view("COUNTRIES")
                 where $i/ROW/REGION_ID = $j/ROW/REGION_ID
                   and $i/ROW/REGION_NAME = "Asia"
                 return $j'
                RETURNING CONTENT) AS asian_countries
  FROM DUAL;
Result Sequence (sample ora:view output; these rows come from a DEPARTMENTS table rather than from the query above, and show the ROW wrapper)
<ROW><DEPARTMENT_ID>10</DEPARTMENT_ID><DEPARTMENT_NAME>Administration</DEPARTMENT_NAME><MANAGER_ID>200</MANAGER_ID><LOCATION_ID>1700</LOCATION_ID></ROW>
<ROW><DEPARTMENT_ID>20</DEPARTMENT_ID><DEPARTMENT_NAME>Marketing</DEPARTMENT_NAME><MANAGER_ID>201</MANAGER_ID><LOCATION_ID>1800</LOCATION_ID></ROW>
<ROW><DEPARTMENT_ID>30</DEPARTMENT_ID><DEPARTMENT_NAME>Purchasing</DEPARTMENT_NAME><MANAGER_ID>114</MANAGER_ID><LOCATION_ID>1700</LOCATION_ID></ROW>
<ROW><DEPARTMENT_ID>40</DEPARTMENT_ID><DEPARTMENT_NAME>Human Resources</DEPARTMENT_NAME><MANAGER_ID>203</MANAGER_ID><LOCATION_ID>2400</LOCATION_ID></ROW>
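The reason the missing ROW step can be rejected before any data is read is that the shape of a table view is known from metadata alone. The Python sketch below models that check. The `TABLE_COLUMNS` catalog, the `check_path` helper and the error text are hypothetical; this is a simplified picture of static path checking, not Oracle's type checker.

```python
class StaticTypeError(Exception):
    """Raised at 'compile time', before any row is fetched."""

# Hypothetical catalog metadata: the column names of each table.
TABLE_COLUMNS = {
    "REGIONS": {"REGION_ID", "REGION_NAME"},
    "COUNTRIES": {"COUNTRY_ID", "COUNTRY_NAME", "REGION_ID"},
}

def check_path(table, path):
    """Check a relative path against the known shape ROW/<column> of a table view."""
    allowed_per_step = [{"ROW"}, TABLE_COLUMNS[table]]
    steps = path.split("/")
    if len(steps) > len(allowed_per_step):
        raise StaticTypeError(f"path too deep for a flat table view: ({steps[-1]})")
    for step, allowed in zip(steps, allowed_per_step):
        if step not in allowed:
            raise StaticTypeError(f"invalid element name: ({step})")

check_path("REGIONS", "ROW/REGION_ID")    # accepted: goes through the ROW wrapper

try:
    check_path("REGIONS", "REGION_ID")    # ROW wrapper forgotten
    error = None
except StaticTypeError as exc:
    error = str(exc)
print(error)
```

Without such a check, the faulty path would simply select nothing at run time and the query would return an empty result, which is much harder to diagnose than a compile-time error naming the bad step.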
XQuery Processing
Choices: co-processor or native compilation?
- Co-processor:
  - “off-the-shelf” XQuery processor
  - opaque to DBMS
- Native compilation:
  - XQuery processing added to database engine
  - DBMS-specific processor
Co-processor Advantages
- Easy to implement and install
- Modularity of XQuery processor
- Standard XQuery processor between applications
- Third-party development
- Flexibility
Co-processor Limitations
- Storage Optimization
  - Advanced Oracle XML DB features are wasted (e.g. indexed XML)
- Query Optimization
  - Cannot use already-established Oracle query engine optimizations
  - No support for SQL/XML query optimization
Oracle's Native Processing
- XQueries are compiled into sub-blocks and execution structures usable by the existing DB engine
  - “tightly integrate XQuery and SQL/XML support within the database kernel”
- Focuses on utilizing existing optimization techniques (algebra optimizations)
- XQuery interpreter for unsupported operations
Native Processor Architecture
Advantages of Oracle's Approach
- Fully utilizes mature optimization techniques
- Integration of SQL and XQueries
  - Much stronger support for SQL/XML mixed query optimizations
  - No need for development of a separate set of optimizations
- “performance that is orders of magnitude faster than the co-processor approach”
Conclusion
- XMLType
- Variety of ways for data to be stored
- XQuery parsing and static type checking
- XQuery native processing and co-processor
References
- Zhen Hua Liu, Muralidhar Krishnaprasad, Vikas Arora. Native XQuery Processing in Oracle XMLDB. SIGMOD 2005.
- Ravi Murthy, Zhen Hua Liu, Muralidhar Krishnaprasad, et al. Towards An Enterprise XML Architecture. SIGMOD 2005.
- Mark Scardina. XML Storage Models: One Size Does Not Fit All. http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/scardina_xmldb.html
- XML Query (XQuery) Support in Oracle Database 10g Release 2. Oracle White Paper. May 2005.
- XML und Datenbanken. http://www.dbis.ethz.ch/education/ws0506/xml_db_ws2005
- http://www.dbspecialists.com
- http://www.w3.org/TR/2003/WD-xqueryx-20031219/#N1016C
- http://www.w3schools.com/xquery/xquery_example.asp