Transcript Document

JTC1 SC32N1637
Query Language for MDR/XMDR
February 05-08, 2007
ISO/IEC JTC 1/SC 32 Study Period: Future Database Needs
Clearwater, Florida, USA
Doo-Kwon Baik ([email protected]), Korea University
Dongwon Jeong ([email protected]), Kunsan National University
SungKong Park ([email protected]), MetaRight, Inc.
Contents
 Introduction
 What is Metadata registry?
 Metadata registries built
 Background and Need
 Issue and Proposal
 A Solution










How to Define MDR Operators
Query Patterns of Metadata Registry
Summary of MDR (Search) Query Patterns
An Example of Search Operators
A Brief BNF for Search Query
Examples of SQL/MDR
Advantages
Restriction (Disadvantage? Barrier?)
Another Issue: SQL/MDR vs. SQL/XMDR
Conclusion
Future Database Issue: Query Language for MDR/XMDR
2
Introduction: What is Metadata registry?
 ISO/IEC 11179, which is an international standard for

sharing and exchanging of data,
 dynamically managing metadata, and so on
 Key components of this standard

Data element
–

data format + data descriptions
Data element concept, Conceptual Domain, Object Class, etc.
 MDR concept

Figure that depicts concepts of data element and metadata, and relationship between them
Exchange Unit
Database
File
Etc.
Data Element
Identifier
Definition
Name
Etc.
Metadata registry
= set of data elements
Future Database Issue: Query Language for MDR/XMDR
instance
Identifier=“. . .”;
Definition=“. . .”;
...
Name=“Title”;.
Database
File
Etc.
3
Introduction: Metadata registries built
 Many MDRs have been building for dynamic metadata management
 EDR (Environement Data Registry),
– http://www.epa.gov/edr/
 ANHIK(Australian National Health Information Knowledgebase),
– http://www.aihw.gov.au/
 ITS(Intelligent Transportation System),
– http://www.dot.gov/
 Metadata Registry For Science and Technology Bibliographic Information,
– http://www.kisti.re.kr/
 Component Metadata Registry, and so on
– http://etri.re.kr/
Future Database Issue: Query Language for MDR/XMDR
4
Background and Need
 Needs
 Interoperability between MDRs in the different fields depends on effective sharing
metadata.
 Successful sharing of metadata, i.e., metadata registries, requires a standardized
and consistent access method to exchange metadata between them.
 However,
 ISO/IEC 11179 does not provide the standard access method between registries
 Previous metadata registry management systems developed and used different their
own access method respectively.
 In addition, existing metadata registries do not exactly follows (or misuse) the
standard specification of ISO/IEC 11179.
 No standardized method
 causes incompatibility of metadata
 causes heterogeneity between metadata registry management systems
 makes realization of integrated metadata retrieval system difficult
Future Database Issue: Query Language for MDR/XMDR
5
Issue and Proposal
 Issue
 How to exchange and share metadata consistently between registries?
 Our proposal
 SQL-based query language for consistent access to metadata registries.
 To achieve this goal, we
– analyze metadata registry access patterns and define MDR operators
– integrate the operators into SQL3
MDR
(11179)
XMDR
SQL3
(SQL4)
Extract and
Define
Operations
(MDR
Operators)
Future Database Issue: Query Language for MDR/XMDR
SQL-based
Query Lang.
(SQL/MDR)
6
How to Define MDR Operators
 A Methodology to achieve the goal: Definition Processes
Analyzing
query patterns
Defining
operators
Components of
11179
Pattern-1
Life Cycle of DE
Pattern-2
Search Operators
Update Operators
(Add, Update, and
Delete)
...
Mandatory or
Optional?
Pattern-N
Future Database Issue: Query Language for MDR/XMDR
DDL
(Create, Update,
Rename, Delete,
etc.)
7
Query Patterns of Metadata Registry
 Key components of metadata registry
 Data element vs. Group element
 Most components are used as a logical grouping unit of data elements.
 Group elements (GEs)
– can logically cluster data elements
– For example, conceptual domain, data element concept, class object, and so on.
Data Element Concept
Conceptual Domain
Value Domain
Data Element
Object Class
Representation Class
Group elements
Future Database Issue: Query Language for MDR/XMDR
8
Query Patterns of Metadata Registry (cont.)
 Lifecycle of a data element
 Submitted, Recorded, Qualified, Standard, Preferred standard, and Retired
<Notation>
REC
SER
QUA
SUB
Lifecycle
HIS
RET
STA
• SER: Submitter
• SUB: Submitted
• REC: Recorded
• QUA: Qualified
• STA: Standard
• PRE: Preferred standard
• RET: Retired
• HIS:Retired data element set
PRE
 A Data element has mandatory attributes and optional attributes.
 Mandatory attributes
– Must be captured to achieve the goal of this paper
– For example, name, definition, and context of data elements should be described necessarily.
 Optional attributes
– There is no need to be captured indispensably, but ???
Future Database Issue: Query Language for MDR/XMDR
9
Summary of MDR (Search) Query Patterns
 MDR query operators for searching are defined by using the following
factors:
Service patterns of the
existing MDR management
systems
Search Operator types extracted
•DE search operators with mandatory attributes
•DE search operators with registration statuses
Key objects of MDR
DE lifecycle
(Registration statuses)
•DE search operators using GE(Group elements)
MDR Search
Operators
•GE search operators
•Search options
(exact matching, partial matching, ...)
MDR Operator types for DML
Mandatory DE attributes
MDR Operator types for DDL
Future Database Issue: Query Language for MDR/XMDR
MDR Operators
(for DML & DDL)
10
An Example of Search Operators
 MDR Query Operators for Retrieval: Search Operators
Search Operators
Description
Notations
KW
: Keyword given by users
DE_name(KW, OPT)
DE_definition((KW, OPT)
DE_context(KW, OPT)
DE_identifier(KW, OPT)
DE_reg_organization(KW, OPT), . . .
Query operators to retrieve DEs with
mandatory attributes
DE_status(RA, DA, KW, OPT)
DE_status(RA)
DE_status_submitted(DA, KW, OPT)
DE_status_retired(DA, KW, OPT), . . .
Query operators to retrieve DEs with
registration statuses
RA
: Registration attribute
DE_object_class(KW, OPT)
DE_conceptual_domain(KW, OPT)
DE_concept(KW, OPT)
Query operators to retrieve DEs using the
group elements
DA
: Mandatory attributes of
data elements such as name,
definition, context, and so
on.
object_class(KW, PT)
conceptual_domain(KW, OPT)
element_concept(KW, OPT), . . .
Query operators
Elements)
OPT
: Search options (Partial,
exact, and starting with
matching)
Future Database Issue: Query Language for MDR/XMDR
to
retrieve
GE(Group
11
A Brief BNF for Search Query
<extended query specification>
::= SELECT <extended attribute list>
FROM <extended relation list>
WHERE <extended attribute qualification>;
<extended attribute list>
::= <attribute list>|<MDR attribute list>|<MDR operator>;
<extended relation list>
::= [COMMA]<general relation list> |<MDR relation list> [<extend attribute list>];
<extended attribute qualification>
::= <general qualification> |<MDR qualification>;
<MDR qualification>
::= [<boolean term>]<MDR operator>[<extend attribute qualification>];
<MDR operator>
::= <DE mandatory attribute name> | <DE registration status> | DE_STATUS
L_PAREN<MDR param list>R_PAREN;
<DE mandatory attribute name>
::= NAME | DEFINITION | … | ORGANIZATION;
<DE registration status>
::= DE_STATUS_SUBMITTED | DE_STATUS_RECORDED | DE_STATUS_QUALIFIED
| DE_STATUS_STANDARD | DE_STATUS_PREFERRED | DE_STATUS_RETIRED;
<MDR relation list>
::= DATA_ELEMENT | DATA_ELEMENT_CONCEPT | … | OBJECT_CLASS;
Future Database Issue: Query Language for MDR/XMDR
12
Examples of SQL/MDR
 Query 1. Retrieve all data elements where registration status is ‘RECORDED’.
SQL/MDR> SELECT DE_status (RECORDED)
FROM data_element ...................................................................(1-a)
 Query 2. List all data elements where registration status is 'RECORDED' and name is
‘KOREA’.
SQL/MDR> SELECT * FROM data_element
WHERE DE_status (RECORDED)
AND name=‘KOREA’ …………..……..........................……….(2-a)
SQL/MDR> SELECT * FROM data_element
WHERE DE_ status (RECORDED, name, ‘KOREA’)………...…(2-b)
SQL/MDR> SELECT * FROM data_element
WHERE DE_status_recorded (name, ‘KOREA’)......……………..(2-c)
Future Database Issue: Query Language for MDR/XMDR
13
Advantages
 Advantages
 Ease of use
 Familiarity to users
 Consistent and Standardized access method
– Independent query description on physical structures (table structures)
– Low modeling cost (decreases query description time)
– Simplicity of query statement for accessing distributed registries
– Low complexity for distributed query composition
 In its application aspect, SQL/MDR
 can be used Unified Access Method to Various Registries
 provides Unified query and Unified query result form
 can be used for developing distributed registries management system
 can be used as a communication protocol (method) for developing semi-automatic
tools: Registries Mapping and Integrating tools
 Data Integration (among databases following their own registry)
Future Database Issue: Query Language for MDR/XMDR
14
Advantages: with a Scenario
 The below query shows advantages such as efficiency and simplicity of
SQL/MDR.
 Let’s assume the following situation to show contributions of SQL/MDR.
 Assumption (Situation): There are two registries with their own physical structure
– There exist two metadata registries.
– Each metadata registry has different MDR structure each other.
– The first metadata registry, MDR1 is designed as follows:
(1) data_element_table includes all of the data elements,
(2) data_element_name is a field name of data_element_table and means name of data elements, and
(3) Status is a field name of data_element_table and means registration status of data elements
– The second metadata registry, MDR2 consists of two tables as follows:
(1) table1 and table2 include all of the data elements together. Most attributes are included in table1 and
some attributes including registration status are in table2,
(2) table1 and table2 use name as their join key, and
(3) Registration status is involved in reg_status, a field of table2.
Future Database Issue: Query Language for MDR/XMDR
15
Advantages: with a Scenario (cont.)

Table Structures of Two MDRs, MDR1 and MDR2
MDR1
data_element_table
*data_element_name
definition
status
...

MDR2
DE_table1
*name
definition
...
DE_table2
name
reg_status
...
Query 3. (Access to two metadata registries) Retrieve all data elements, where registration status
is ‘RECORDED’ from metadata registries, MDR1 and MDR2.
SQL/MDR> SELECT DE_status (RECORDED)
FROM data_element ...............................(3-a)
SQL> SELECT data_element_name
FROM data_element_table
WHERE status = ‘RECORDED’ ........................(3-b)
SQL> SELECT table1.name
FROM table1, table2
WHERE table1.name=table2.name
AND table2.reg_status = ‘RECORDED’ .............(3-c)
Future Database Issue: Query Language for MDR/XMDR
In case of SQL/MDR based approach
In case of use of SQL
//query rewriting is required, i.e., requires several queries
//high design time, complicated processing, etc.
16
Restriction (Disadvantage? Barrier?)
Query processor should know the followings:
(1) data_element  de_table
(2) name  de_name
(3) reg_status holds registration information
SELECT *
FROM data_element
WHERE DE_status (RECORDED)
AND name=‘%Korea%’
In a word, a preprocessing (mapping) is required
for realization of the proposal.
Query Processor
1st approach to define SQL/MDR
: Loose Coupling
MDR
de_table
de_name
Korea01
Korea02
...
vs.
...
...
...
reg_status
Submitted
Recorded
...
...
Another approach to define SQL/MDR
: Using a Predefined MDR Schema (Tightly Coupled)
If a relational schema is given, this
problem can be easily solved ?!
Future Database Issue: Query Language for MDR/XMDR
17
Another Issue: SQL/MDR vs. SQL/XMDR
 XMDR (Extended Metadata Registry)
 Extension of MDR
 Captures ontology concept
 Another Query Language for XMDR?
 There are several alternatives
 Alternative I.
– SQL/XMDR (MDR is a subset of XMDR)
– SQL/MDR won’t be developed
 Alternative II.
– SQL/MDR, and then could be extended to SQL/XMDR
– XMDR is not a standard yet; after selected as an international standard
 Alternative III.
– Only SQL/MDR
– SQL/XMDR won’t be?
 And .......???
Future Database Issue: Query Language for MDR/XMDR
18
Conclusion?
 Issue-I: SQL/MDR
 SQL-based query language for consistent access to metadata registries
(ISO/IEC 11179)
 Advantages
– Unified access method: One described, All returned (i.e., same query, same
formatted results)
– Strengthen standardization of registries
 Issue-II: Query Language for XMDR, SQL/XMDR
 SQL-based query language for consistent access to ontologies, XMDRs
 Relationship between SQL/MDR and SQL/XMDR
 And...
Future Database Issue: Query Language for MDR/XMDR
19
Q/A
Thank you!
Dongwon Jeong ([email protected]), Kunsan National University, Korea
Doo-Kwon Baik ([email protected]), Korea University, Korea
SungKong Park ([email protected]), MetaRight, Inc., Korea
Future Database Issue: Query Language for MDR/XMDR
20