XML DB Architecture
Download
Report
Transcript XML DB Architecture
Oracle XML DB and
XQuery
Chris Gianfrancesco
Aruna Apuri
Oleg Rekutin
Jason Wilson
Introduction
XML Type abstraction
Storage
Shredded
Publishing
XML
or LOB
Views of relational tables
SQL / XML functions and constructs
XMLQuery, XMLTable,
XML Updates
and more...
XQuery evaluation and processing
XML Type
XQuery
XPath
XSLT
SQL / XML
XML Type abstraction
Physical Storage
CLOB
XML Type Views
Shredded
Relational Data
Hybrid
Binary XML
XML View
Create virtual XML version of objectrelational data
Allows XQuery to access relational data
Uses XML Publishing
ora:view()
SQL/XML Functions
SQL/XML querying function and construct
XMLQuery,
XMLTable
SQL/XML functions for creating XML from
SQL
XMLElement(),
XMLForest()
XMLConcat(), XMLAttributes(),
More XML Functions
Other XML functions
XMLColAttVal(),
XMLSequence(),
ExtractValue(), Extract(), XMLTransform()
To support XML updates
UpdateXML(),
DeleteXML(),
InsertChildXML(), InsertXMLBefore(),
AppendChildXML()
XQuery Hybrid Evaluation
Transform XMLTable into XMLQuery
Static analysis and type checking
If possible, compiles into native SQL data
structures
If not possible, XMLQuery is left as is for
processing by XMLQuery processor
XQuery Hybrid Evaluation
SQL query containing
XMLQuery/XMLTable
Transform XMLTable to XMLQuery
SQL query containing
XMLQuery
Native compilation of XMLQuery
SQL structures with
XML operators
SQL structures containing
XMLQuery
XQuery evaluated natively
Co-processor evaluates
XMLQuery expressions
Input & Data Representation
All
data in one row, one XMLType column
<employees>
XMLType
<employee>
<name>John Doe</name>
<job>Adjuster</job>
</employee>
<employee>
<name>Michael Smith</name>
<job>Investigator</job>
</employee>
<employee>
<name>Sam Adams</name>
<job>Engineer</job>
</employee>
</employees>
Input & Data Representation
Each
data row in separate DB row,
column of XMLType
<employees>
<employee>
VARCHAR
XMLType
<name>John
Doe</name>
row
<job>Adjuster</job>
</employee>
<employee>
XMLType
VARCHAR
<name>Michael Smith</name>
row
<job>Investigator</job>
</employee>
<employee>
XMLType
VARCHAR
<name>Sam Adams</name>
row
<job>Engineer</job>
</employee>
</employees>
Input & Data Representation
Each
data row in separate DB row,
contents in separate columns
<employees>
<employee>
<name>John Doe</name>
row
<job>Adjuster</job>
</employee>
<employee>
<name>Michael Smith</name>
row
<job>Investigator</job>
</employee>
<employee>
<name>Sam Adams</name>
row
<job>Engineer</job>
</employee>
</employees>
VARCHAR
VARCHAR
VARCHAR
VARCHAR
VARCHAR
VARCHAR
Input Tools
Straight XML in SQL
INSERT
VALUES(
XMLType(‘<xml>goes here</xml>’))
JDBC using special XMLType (also C)
SQL*Loader w/ direct path load mode
XML-SQL Utility (XSU)
Maps
XML to columns
Rigid default mapping
No support for attributes
Storage in Database
XMLType CLOB
File
preserved as complete text (whitespace,
comments, etc) [textual fidelity]
Can still be validated against a schema
Data internally is not “typed”
Slow querying
Fastest storage and retrieval
Storage in Database
XMLType View
Create
a virtual XML document on top of
relational tables
Fast querying, manipulation using pure SQL
Deeply nested views are slow
Updating/inserting requires triggers
Lose strict order guarantee, no textual fidelity
Supports multiple XML schemas on top of one
relational schema
Storage in Database
Native XML type (Structured Storage)
Preserves
textual fidelity
Shreds into SQL tables
Complete validation, full SQL support
No triggers to update tables (built-in rewriting)
Some overhead
Cannot change schema w/o reloading all data
Requires a schema
Structured Storage Detail
Annotate XML schema to control nested
collections storage, as:
CLOB
Array
of serialized SQL objects
Nested table of serialized SQL objects
Array of XMLType
Working with XML Schema
Registering schema
begin dbms_xmlschema.registerSchema( ‘http://namespace',
xdbURIType('schema.xsd').getClob(),
TRUE,TRUE,FALSE,TRUE);
end;
Creating table w/ schema
CREATE TABLE TableName of XMLType
XMLSCHEMA "http://namespace"
XQuery Support in Oracle
XMLDB integrated database engine
SQL / XML standard support
Optimized queries – rewrite to relational
Standalone Java query engine
100% Java
Integrated into Oracle App Server -XDS
Interoperates with XSLT/XPath
XQuery database support
Production in Oracle Database 10gr2
Supports XMLQuery and XMLTable construct
Native compilation into SQL /XML structures
Returns XMLType(Content)
Can query over relational, O-R, XMLType data
fn:doc - Maps to XDB Repository on server
SQLPlus provides xquery command to execute
XQuery
XSL-T will also get compiled to XQuery
Architecture
X
Q
U
E
R
Y
XQuery XSL-T Parser
XQueryX
Normalization
Statically
Type checked
Tree
Compiler
Normalized Tree (casts, treat )
Rewrite to SQLX
Compiled XQuery Tree
XQuery Type check
SQL/XML Operand Tree
SQLX rewrite
SQL Operand Tree
SQL Metadata
XMLSchema
Repository
XML Indexes,
Text Indexes
Relational Optimizer
Execution Structures
XQuery F&O
Execution engine
S
Q
L
XQuery Java implementation
XQuery or XQueryX input
Extensible function implementation
Compiles into rowsource like structures
Optimization – push XQuery to XMLDB
XQJ API driver – for accessing mid tier/backend
Shared data model with XSL/XPath
Shared F&O – pre-defined & external
Standard implementation interfaces
Write Java Function once use it in
XQuery/XSLT
Processing XQuery
Oracle XQuery Compilation Engine
Parser convert XQuery into
XQueryX
XQueryX is an XML
representation of XQuery
(another W3C candidate
recommendation)
XML parser construct a DOM
tree from XQueryX
Work on the DOM afterward
Corresponding components are
extended for XQuery too
Sample XQuery
For each author in the bibliography, list the author's name and the titles of all books
by that author, grouped inside a "result" element."
<results>
FOR $a IN
distinct(document("http://www.bn.co
m")//author)
RETURN
<result>
$a,
FOR $b IN
document("http://www.bn.com")/bib/b
ook[author = $a]
RETURN $b/title
</result>
</results>
WHAT IS XQueryX
Is an XML representation of an XQuery.
Created by mapping the productions of the XQuery
abstract syntax directly into XML productions.
The result is not particularly convenient for humans to
read and write.
Easy for programs to parse, and because XQueryX is
represented in XML, standard XML tools can be used to
create, interpret, or modify queries
Environments in which XQueryX
useful
Parser Reuse. In heterogeneous data environments, a variety of systems
may be used to execute a query. One parser can generate XQueryX for all
of these systems.
Queries on Queries. Because XQueryX is represented in XML, queries can
be queried and can be transformed into new queries.
For instance, a query can be performed against a set of XQueryX queries
to determine which queries use FLWOR expressions to range over a set of
invoices.
Generating Queries. In some XML-oriented programming environments, it
may be more convenient to build a query in its XQueryX representation
than in the corresponding XQuery representation, since XML tools can be
used to do so.
Embedding Queries in XML. XQueryX can be embedded directly in an XML
document
Why XQuery static type checking?
XQuery static type checking is very
useful when the input XML structure is
known during compile time.
The feature itself enables early error
recovery.
XQuery Static Type-Checking in
Oracle XML DB
Oracle XML DB performs static (that is, compile-time)
type-checking of XQuery expressions. It also performs
dynamic (runtime) type-checking.
Example Static Type-Checking of XQuery Expression
The XML view produced on the fly by Oracle XQuery
function ora:view has ROW as its top-level element, but
this example incorrectly lacks that ROW wrapper element.
This omission raises a compile-time error. Forgetting that
ora:view wraps relational data in this way is an easy
mistake to make, and one that could be difficult to
diagnose without static type-checking.
Static Type-Checking of XQuery
Expressions: ora:view
This produces a static-type-check error, because "ROW" is missing.
SELECT XMLQuery('for $i in ora:view("REGIONS"), $j in
ora:view("COUNTRIES")
where $i/REGION_ID = $j/REGION_ID and
$i/REGION_NAME = "Asia"
return $j'
RETURNING CONTENT) AS asian_countries
FROM DUAL;
SELECT XMLQuery('for $i in ora:view("REGIONS"), $j in
ora:view("COUNTRIES")
*
ERROR at line 1:
ORA-19276: XP0005 - XPath step specifies an invalid
element/attribute name:
(REGION_ID)
Correct code
SELECT XMLQuery('for $i in ora:view("REGIONS"), $j in
ora:view("COUNTRIES")
where $i/ROW/REGION_ID = $j/ROW/REGION_ID
and $i/ROW/REGION_NAME = "Asia"
return $j'
RETURNING CONTENT) AS asian_countries
FROM DUAL;
Result Sequence
<ROW><DEPARTMENT_ID>10</DEPARTMENT_ID><DEPARTMENT_NAME>Administr
ation</DEPARTMENT_NAME><MANAGER_ID>200</MANAGER_ID><LOCATION_I
D>1700</LOCATION_ID></ROW>
<ROW><DEPARTMENT_ID>20</DEPARTMENT_ID><DEPARTMENT_NAME>Marketing
</DEPARTMENT_NAME><MANAGER_ID>201</MANAGER_ID><LOCATION_ID>180
0</LOCATION_ID></ROW>
<ROW><DEPARTMENT_ID>30</DEPARTMENT_ID><DEPARTMENT_NAME>Purchasin
g</DEPARTMENT_NAME><MANAGER_ID>114</MANAGER_ID><LOCATION_ID>17
00</LOCATION_ID></ROW>
<ROW><DEPARTMENT_ID>40</DEPARTMENT_ID><DEPARTMENT_NAME>Human
Resources</DEPARTMENT_NAME><MANAGER_ID>203</MANAGER_ID><LOCATI
ON_ID>2400</LOCATION_ID></ROW>
XQuery Processing
Choices: co-processor or native
compilation?
Co-processor:
“off-the-shelf”
XQuery processor
opaque to DBMS
Native compilation:
XQuery
processing added to database engine
DBMS-specific processor
Co-processor Advantages
Easy to implement and install
Modularity of XQuery processor
Standard XQuery processor between
applications
Third-party development
Flexibility
Co-processor Limitations
Storage Optimization
Advanced
Oracle XML DB features being
wasted (e.g. indexed XML)
Query Optimization
Cannot
use already-established Oracle query
engine optimizations
No support for SQL/XML query optimization
Oracle's Native Processing
XQueries are compiled into sub-blocks and
execution structures usable by existing DB
engine
“tightly integrate XQuery and SQL/XML
support within the database kernel”
Focuses on utilizing existing optimization
techniques (algebra optimizations)
XQuery interpreter for unsupported
operations
Native Processor Architecture
Advantages of Oracle's Approach
Fully utilizes mature optimization
techniques
Integration of SQL and XQueries
Much
stronger support for SQL/XML mixed
query optimizations
No need for development of a separate set of
optimizations
“performance that is orders of magnitude
faster than the co-processor approach”
Conclusion
XMLType
Variety of ways for data to be stored
XQuery parsing and static type checking
XQuery native processing and coprocessor
References
Zhen Hua Liu, Maralidhar Krishnaprasad, Vikas Aora. Native XQuery
Processing in Oracle XMLDB. SIGMOD2005.
Ravi Murthy, Zhen Hua Liu, Muralidhar Krishnaprasad, et al. Towards An
Enterprise XML Architecture. SIGMOD2005.
Mark Scardina. XML Storage Models: One Size Does Not Fit All.
http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/s
cardina_xmldb.html
XML Query (XQuery) Support in Oracle Database 10g Release 2. Oracle
White Paper. May 2005.
XML and Datenbanken.
http://www.dbis.ethz.ch/education/ws0506/xml_db_ws2005
http://www.dbspecialists.com
http://www.w3.org/TR/2003/WD-xqueryx-20031219/#N1016C
http://www.w3schools.com/xquery/xquery_example.asp