Strategies for All Your Data
Session id: 40236
Sandeepan Banerjee
Vishu Krishnamurthy
Oracle Corporation
Where are you spending your money?
Data Management
Labor
Software Integration
Hardware and System Integration
Too much information in too many places
Specialty Servers For Different Kinds Of Data
Relational, Documents, Multimedia, Specialized …, Location, Messages, XML
Data Isolation
High Systems Admin And Management Costs
Scalability Problems
High Training Costs
Complex Support Problems
One Management System for All Your Data
Relational
Characters, Numbers and Dates
Complete
Integrated
Robust
Scalable
Secure
Available on all platforms
XML DB
Integrated Native XML Database
Oracle Text & Ultra Search
Text management and search
Oracle Locator & Spatial
Location and Proximity Searching
Oracle interMedia
Multimedia management
Oracle Collaboration Suite
Unified Messaging and Files
Extensibility Framework
Chemical, Genetic, Engineering,…
What is Oracle XML DB?
Database support for the XML data model
– XMLType, XML Schema, DOM fidelity, XPath, …
Hierarchical organization of the data
– WebDAV compliant with indexing for fast access
Transparent storage optimizations
Query Language: SQLX and XQuery
Classes of XML DB Applications
Exchanging Structured Documents
– Well-formed, templated business documents, e.g. Purchase Orders, Phone Bills, …
Managing Unstructured Documents
– Documents, Messages, Instructions
Integrating and normalizing data from diverse sources
Structured Document Exchange
Relational storage remains the “right” way to store highly structured data
As an XML programmer, you do not want to think about “tables”
– A hierarchical data model is what you want to manipulate
XML DB’s XMLType is about preserving the XML paradigm while getting the benefits of relational performance and scalability
Structured Document Exchange with Oracle XML DB
XML data model and APIs familiar to XML programmers
– XML Schema, Schema Validation, DOM Fidelity
– JNDI, DOM, XPath, SQLX, XQuery
Enterprise Class Performance & Scalability
– Piecewise updates
– Schema caching
– Lazy materialization
– Server-based XSL transformations
Structured Data: Temenos
GLOBUS Banking platform: #1 selling platform, major banks worldwide
Contract-based system, deeply nested data model, user-customizable
80+ major subsystems, 6000 Tables, 100s of GB
“Using Oracle XML DB, we successfully benchmarked 22 million banking transactions per day, which translated to 2500 database transactions per second, for Temenos' GLOBUS banking platform. Oracle XML DB’s performance assured us that powerful XML innovations can be operationalized and deployed without sacrificing enterprise-class scalability.”
- TEMENOS
Managing Unstructured Data
More and more content is being produced as XML (Microsoft Word, Corel XMetal, Arbortext Epic, …)
– Markup improves search, processing, organization, …
XML DB’s Repository enables XML document content to be stored as ‘files’ in ‘folders’ without losing strong management, queryability, unbreakable security, etc.
XML is doing for unstructured data what Relational did for structured: create a standard way to store, query and manage unstructured data
Managing Unstructured Data with Oracle XML DB
XML data model and APIs familiar to Content Developers
Integrated Repository
– WebDAV compliant
– XPath index for fast traversal of foldering hierarchies
– SQL Queryable
Integrated Text Processing
– Optimizations such as “tag aware” search
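One plausible reading of “tag aware” search is that a term can be matched only inside a named element rather than anywhere in the document. The sketch below illustrates that idea with the standard JAXP XPath API, not with Oracle Text or XML DB. The sample document, its element names, and the class and method names are all hypothetical, and the code scans the document rather than using an index.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class TagAwareSearch {
    // Hypothetical sample document; element names are illustrative only.
    static final String ARTICLE =
        "<article>"
      + "<title>Protein folding</title>"
      + "<body>Methods for simulation of enzymes.</body>"
      + "</article>";

    /** True if the term occurs anywhere in the document text. */
    static boolean containsAnywhere(String xml, String term) {
        return evalBoolean(xml, "contains(string(/), '" + term + "')");
    }

    /** True if the term occurs inside an element with the given tag name. */
    static boolean containsInTag(String xml, String tag, String term) {
        return evalBoolean(xml, "boolean(//" + tag + "[contains(., '" + term + "')])");
    }

    // Evaluates an XPath 1.0 expression to a boolean.
    // Assumption: term and tag contain no single quotes (no escaping is done).
    private static boolean evalBoolean(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(containsAnywhere(ARTICLE, "enzymes"));        // true
        System.out.println(containsInTag(ARTICLE, "title", "enzymes"));  // false
        System.out.println(containsInTag(ARTICLE, "body", "enzymes"));   // true
    }
}
```

The second call shows the point of restricting by tag: the term is in the document, but not in the title.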
Reed Elsevier
Large technical publishing conglomerate
More than 1700 scientific, technical & medical peer-reviewed journals
Over 59 million abstracts
Over two million full-text scientific journal articles, another one million full-text articles via CrossRef (http://www.crossref.org/) to other publishers' platforms
XML DB chosen as Repository Database
10g: What’s new in XML DB
Broad Performance Improvements
– SQLX query rewrites
– XSLT optimizations
– Repository Access and Query optimizations
– Direct loader support, loading large XML documents
– Storage optimizations
I18N: support for differing character sets on client and server
Schema Evolution
– Transparently achieves data load/reload
Unified XML API between XDK and XML DB
– Unified C interfaces
XML-based Integration: XQuery
Why XQuery?
– Declarative way to query XML documents
Why Java?
– Run in mid-tier or database
– Future server implementation in C
Why XML Database?
– Native XML storage
– XML data management
– Performance optimizations
– SQL/XML or XQuery depending on data
Status
– OTN downloads (pending W3C standard finalization in ’04)
[Diagram: XQuery Engine in iAS (J2EE™ Platform); XQuery Engine in XML DB (Server JVM)]
XQuery Example
Assume a document – emp.xml
<empset>
<emp empno="21" ename="SCOTT" salary="120000"/>
<emp empno="22" ename="JONES" salary="344000"/>
</empset>
To get the names of employees with salary > 200000
for $i in document('emp.xml')/empset/emp
let $j := 200000
where $i/@salary > $j
return $i/@ename
Result (attribute node)
JONES
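The same selection can be tried without an XQuery engine. The sketch below uses the standard JAXP XPath API (XPath 1.0), not Oracle's XQuery implementation, and expresses the query as a single path with a predicate. The class and method names are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EmpQuery {
    // The emp.xml document from the slide, inlined as a string.
    static final String EMP_XML =
        "<empset>"
      + "<emp empno=\"21\" ename=\"SCOTT\" salary=\"120000\"/>"
      + "<emp empno=\"22\" ename=\"JONES\" salary=\"344000\"/>"
      + "</empset>";

    /** Returns the ename values of employees whose salary exceeds the threshold. */
    static List<String> highEarners(String xml, int threshold) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // XPath equivalent of the FLWOR expression: filter emp, return @ename.
            String expr = "/empset/emp[@salary > " + threshold + "]/@ename";
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeValue());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(highEarners(EMP_XML, 200000)); // [JONES]
    }
}
```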
Differences from SQL
Navigation-oriented (using XPath expressions)
Different type system (XMLSchema based simple types)
Identity-based (XML Node identities and document order)
Namespace aware name-resolution (functions, variables, element creation)
Row based versus Item based
Results are heterogeneous sequences
Does not have all SQL extensions (e.g., OLAP, Full-Text, …)
Oracle XQuery API
JXQI – Java API (ongoing standards discussions)
import java.io.FileReader;
import java.io.Reader;
import oracle.xquery.*;

// Prepare the query read from the file
XQueryContext ctx = new XQueryContext();
Reader strm = new FileReader("exmpl1.xml");
XQueryPreparedStatement xq = ctx.prepareStatement(strm);
// Execute the query and print each result node
XQueryResultSet rset = xq.executeQuery();
while (rset.next())
    rset.getNode().print(System.out);
XQLPlus tool! (like SQLPlus)
Datasources
Enables arbitrary input sources
– files, cache, JCA datasources
xmldatasrc – Oracle language addition
Datasource API
– initialize
– describe
– execute
– fetch
Bind (an existing DOM)
Rewrite to SQL
XQuery over Oracle databases – Rewrite!
for $i in view("scott.emp")/ROW
where $i/SALARY > 200000
return $i/ENAME
-- is translated to --
select "$i".ename
from scott.emp "$i"
where "$i".salary > 200000;
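To make the mapping concrete, here is a toy sketch of how the pieces of that one query line up with the pieces of the SQL. It is only string templating for a single-table, single-predicate query, not Oracle's rewrite engine. The class, method, and parameter names are hypothetical.

```java
import java.util.Locale;

public class ToyRewrite {
    /** Maps a one-step path such as $i/SALARY to a column reference such as "$i".salary. */
    static String column(String path) {
        int slash = path.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("expected $var/NAME but got: " + path);
        }
        String var = path.substring(0, slash);
        String name = path.substring(slash + 1).toLowerCase(Locale.ROOT);
        return "\"" + var + "\"." + name;
    }

    /**
     * Builds the SQL for: for VAR in view(TABLE)/ROW where WHEREPATH OP LITERAL return RETURNPATH.
     * The loop variable becomes the table alias; each path step becomes a column.
     */
    static String rewrite(String var, String table, String wherePath,
                          String op, String literal, String returnPath) {
        return "select " + column(returnPath)
             + " from " + table + " \"" + var + "\""
             + " where " + column(wherePath) + " " + op + " " + literal;
    }

    public static void main(String[] args) {
        // prints: select "$i".ename from scott.emp "$i" where "$i".salary > 200000
        System.out.println(rewrite("$i", "scott.emp", "$i/SALARY", ">", "200000", "$i/ENAME"));
    }
}
```

Once the query is in this form, the database can evaluate it with its ordinary relational optimizer and indexes.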
More SQL rewrite
for $i in view('purchaseOrder')/ROW/PurchaseOrder
where $i/ShipAddr/City = 'San Francisco'
return <PO ponum="{$i/@Poid}">{$i/ShipAddr}</PO>
-- is translated to --
select xmlelement("PO",
XMLAttributes(extractvalue("$i", '/PurchaseOrder/@Poid') as "ponum"),
extract("$i", '/PurchaseOrder/ShipAddr'))
from scott.purchaseorder "$i"
where extractvalue("$i", '/PurchaseOrder/ShipAddr/City') = 'San Francisco'
DEMONSTRATION: XQuery
Oracle Text
Rich Full-Text Capabilities built into the Oracle database
Integrated Search support for Applications
– OCS, Portal, E-Business Suite
Catalog Search
Document Archives and Warehouses
Infrastructure for Intranet and Extranet Search (via Ultra Search)
Oracle Text: Rich Full-Text
10g: What’s new in Oracle Text?
Supervised Classification – Rule-based and SVM
Unsupervised Classification (Clustering) – KMeans and Hierarchical
Query-Log Analysis
Query-Templating for Progressive-Relaxation, Query-rewriting, Alternative scoring etc.
Index creation improvements – Real-time synchronization
Better Partitioning: Create local-partitioned indexes in parallel
Filtering enhancements
– Filter and index RFC-822 email messages
Language Enhancements
– Japanese stemming, Customization of Japanese & Chinese Lexicons
Information Visualization – Stretch viewer
Oracle Ultra Search
Out-of-the-box heterogeneous search-and-locate capabilities
– DB, Web Servers, Files, E-Mail, Apps
High performance threaded Java crawlers
Web-style interface
Extensible, customizable (Java API)
– Customizable metadata search
– Custom crawling
– Custom rendering
Integrated administration
Fully multilingual and globalized
Integrated with Oracle Portal (repository, portlet) and Oracle Collaboration Suite
10g: What’s new in Ultra Search?
Enhanced Security
– Secure Crawling (HTTPS support)
– Better Authentication: HTTP Digest and Forms
– ACL-secured search hitlist: role-based ACLs per datasource, or custom ACLs stamped by crawler
Federated Search
– JCA-compliant Searchlet API
Unified Search
– Secure Crawler API
OID Integration
DEMONSTRATION: Information Visualization
The Media-enabled Oracle Platform
Oracle Database 10g
– Storage, management, & retrieval of image, audio, video data
– Native format understanding, metadata extraction, methods for image processing
– Support for leading streaming media servers
Oracle Application Server 10g
– JSP, servlet and PL/SQL application development support
– Media Adaptation Services for Wireless
– JDeveloper (BC4J/UIX) and Portal integration
Oracle Collaboration Suite
– Metadata extraction for OCS Files
New Oracle10g Multimedia Features
Standards Support – SQL/MM Still Image
New version of Java Advanced Imaging (JAI 1.1.1_01) and additional image processing operators
Support for additional media formats
– Microsoft ASF, MPEG2 & MPEG4
– Microsoft Windows Media Server Plugin
– Real Server Plugin for Helix Server
XML DB integration
How Oracle’s Multimedia capabilities are better
Only Oracle10g:
Supports media content natively
– No manual initiation of separate processes to enable database tablespace to accept media data
– No need for DBAs to initiate these processes for each table where they wish to store media data
Stores all media and its metadata in the same table as the associated relational data
– No triggers on each and every media object created to update the separate “administration” tables that contain media objects and metadata
– No added processing and I/O overhead for access and retrieval
Provides Java class libraries and JSP Tag libraries for application development and media access
Oracle is the Leading Spatial Database
“In repeated surveys, IDC has found that Oracle is used in an 80%-90% share of Spatial Information Management oriented database installations.”
IDC, December 2002
Oracle 10g Locator feature: Beginning with Oracle9i, LOCATION capabilities have been part of EVERY database at NO ADDITIONAL COST
– Enables business, web and LBS applications
Oracle Spatial 10g: Enterprise Edition Option
– Supports advanced Land Management, GIS, Transportation, Energy / Utilities, Remote Sensing, Defense and Intelligence applications
Oracle10g Location Features
Locator
Points, lines, polygons
2D, 3D, 4D data
Spatial Operators
– Distance
– Relationships
Coordinate Systems
Long Transactions
Table Partitioning*
Object Replication**
Parallel Query* – NEW!
Deferred Spatial Indexes – NEW!
Spatial (Enterprise Option)
All Locator features
Spatial functions
– area/length calculation
– buffer, centroid, intersection, union, etc.
Linear Referencing
Spatial Aggregates
Coordinate Transforms
GeoRaster – NEW!
Topology Data Model – NEW!
Network Data Model – NEW!
GeoCoder – NEW!
Spatial Data Analysis & Mining – NEW!
* Requires Enterprise Edition with Partitioning Option
** Some replication features on Enterprise Ed. only
Location features in the Oracle “Stack”
[Diagram labels: Any device; B2B, B2E, B2C; Web Services (SOAP, WSDL)]
e-Business Suite: CRM & ERP Applications, TCA schema
Application Server (Oracle Application Server 10g): iAS MapViewer / JDeveloper, iAS LBS Components
Data Server (Oracle Database 10g): Spatial, Locator
Oracle Location Technology: Online Service, Oracle core technologies
Oracle’s Extensibility Framework
Open API to plug in new data types
and access methods
Specialty Data Types
Chemical
Genetic
Engineering
Biometric
Multimedia
Driven by specialized-domain ISVs: MDL, NetGene, Informax, Protegrity, …
Extensibility: In Silico Chemistry
Chemistry searching requires special techniques
– Chemical name is not unique
– Chemists think graphically
[Figure: chemical structure diagram labeled “Viagra®” and “sildenafil citrate”]
The solution:
– A graphical search engine
– Specialized operators such as substructure search (“sss”) = a chemical “contains”
Oracle Collaboration Suite
Consolidate management of unstructured data (email, shared documents and other collaborative content)
Before grid computing, resources such as storage and CPUs had to be managed separately for each component of the suite (e.g. email vs files vs web conferencing).
OCS 10g takes advantage of grid infrastructure for greater efficiency, reduced cost and easier management
Extended Data Management
Oracle Collaboration Suite, Oracle Portal, eBusiness Suite provide solutions
Ultra Search crawls and (where desirable) federates non-Oracle or legacy sources, and brings these into the ambit of uniform access
Oracle provides the most robust open and extensible platform and the important services for all your data
• Storage and Management
• Search, Interchange, Visualization
• Analytics and Mining
• Structured data will stay Relational
• Documents & Messages will move to XML
• Multimedia will be in BLOBs, with metadata annotated in XML
QUESTIONS
ANSWERS