XML Query Language Project
Dennis Petesch
Mentors and Affiliations: Lola Olsen & GCMD
What is GCMD?
Global Change Master Directory
* Sponsored by NASA
* Accessed online
* Contains data sets and portals
* Learning page
* Has Docbuilder, which aids in submitting information to the GCMD

Division of Time

Participated in a user testing survey for the Global
Change Master Directory website (www.gcmd.nasa.gov).
The survey provided insight into the website that should
help future visitors understand how to use and navigate
the site.

Remaining Time was spent on:

* Researching XML databases
* Developing a Java program
* Testing
* Documentation
Goal

Provide support by implementing a Java-based parser that will query the GCMD
database using XPath expressions
(a standardized XML-based query language)
instead of the current in-house query
language.
Research Questions
* What is XPath? The primary purpose of XPath is to address
  parts of an XML document by navigating its hierarchical
  structure.
* Will XPath be sufficient for filtering XML metadata? Can I
  use XPath to pinpoint specific information exactly?
Approach
• Practice XML and Java
• Research XPath functionality
• Research database functionality
• Test XPath expressions
• Black box testing
• Open source XML databases: eXist and Xindice
XML EXAMPLE

<?xml version="1.0" encoding="ISO-8859-1" ?>
<DIF>
  <Parameters>
    <Topic>ATMOSPHERE</Topic>
    <Term>ATMOSPHERIC PHENOMENA</Term>
    <Variable>STORMS</Variable>
  </Parameters>
  <Parameters>
    <Topic>ATMOSPHERE</Topic>
    <Term>ATMOSPHERIC ANOMALIES</Term>
    <Variable>HURRICANES</Variable>
  </Parameters>
</DIF>
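To illustrate the research question of whether XPath can pinpoint specific information, here is a minimal sketch that evaluates XPath expressions against the sample record using the standard javax.xml.xpath API. The class name DifXPathDemo, the helper select, and the inlined copy of the sample are assumptions made for illustration; this is not the project's actual parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DifXPathDemo {
    // Inlined copy of the sample DIF record (XML declaration omitted).
    static final String SAMPLE_DIF =
        "<DIF>"
        + "<Parameters><Topic>ATMOSPHERE</Topic>"
        + "<Term>ATMOSPHERIC PHENOMENA</Term><Variable>STORMS</Variable></Parameters>"
        + "<Parameters><Topic>ATMOSPHERE</Topic>"
        + "<Term>ATMOSPHERIC ANOMALIES</Term><Variable>HURRICANES</Variable></Parameters>"
        + "</DIF>";

    // Hypothetical helper: returns the text content of every node the expression selects.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Every Variable in the record.
        System.out.println(select(SAMPLE_DIF, "/DIF/Parameters/Variable"));
        // prints [STORMS, HURRICANES]

        // Only the Variable whose sibling Term matches the predicate.
        System.out.println(select(SAMPLE_DIF,
            "/DIF/Parameters[Term/text()='ATMOSPHERIC PHENOMENA']/Variable"));
        // prints [STORMS]
    }
}
```

The second expression shows the filtering idea: a predicate in square brackets narrows the hierarchical path to the one Parameters block of interest.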
GCMD Query Language
Simple Queries
[Entry_ID='ZZZ415']
Boolean Queries
[Location:Location_Name='NORTH AMERICA'] AND
[Source_Name:Short_Name='LANDSAT']
Grouping with Parentheses
[Location:Location_Name='NORTH AMERICA'] OR
([Source_Name:Short_Name='LANDSAT'] AND
[Project:Short_Name='DODS'])
EXAMPLE
GCMD Query Language
[Entry_ID='ds018.0']
XPath
/DIFS/DIF[Entry_ID/text()='ds018.0']
Boolean
GCMD Query Language
[Location:Location_Name='NORTHERN HEMISPHERE'] AND
[Data_Set_Language='English']
XPath
/DIFS/DIF[Location/Location_Name/text()='NORTHERN HEMISPHERE' and
Data_Set_Language/text()='English']
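The translation pattern in the two examples above (colons become path steps, AND/OR become lowercase XPath operators, and everything is wrapped in a /DIFS/DIF[...] predicate) can be sketched in Java as follows. The class GcmdToXPath and its translate method are hypothetical names. Passing parentheses through unchanged and accepting both straight and typographic quotes are assumptions, since the full grammar of the GCMD query language is not shown here.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcmdToXPath {
    // One token: a [Field:Sub='value'] condition, an AND/OR operator, or a parenthesis.
    // Assumption: values contain no quote characters.
    private static final Pattern TOKEN = Pattern.compile(
        "\\[\\s*([A-Za-z_:]+)\\s*=\\s*['\u2018]([^'\u2019]*)['\u2019]\\s*\\]"
        + "|\\b(AND|OR)\\b"
        + "|([()])");

    // Hypothetical translator following the pattern of the slide examples.
    static String translate(String query) {
        StringBuilder predicate = new StringBuilder();
        Matcher m = TOKEN.matcher(query);
        while (m.find()) {
            if (m.group(1) != null) {
                // Location:Location_Name -> Location/Location_Name/text()='...'
                String path = m.group(1).replace(':', '/');
                predicate.append(path).append("/text()='").append(m.group(2)).append("'");
            } else if (m.group(3) != null) {
                // XPath boolean operators are lowercase.
                predicate.append(' ').append(m.group(3).toLowerCase(Locale.ROOT)).append(' ');
            } else {
                // Parentheses are copied as-is (assumed to group the same way in XPath).
                predicate.append(m.group(4));
            }
        }
        return "/DIFS/DIF[" + predicate + "]";
    }

    public static void main(String[] args) {
        System.out.println(translate("[Entry_ID='ds018.0']"));
        // prints /DIFS/DIF[Entry_ID/text()='ds018.0']
        System.out.println(translate(
            "[Location:Location_Name='NORTHERN HEMISPHERE'] AND [Data_Set_Language='English']"));
    }
}
```

A regular-expression tokenizer keeps the sketch short; a production parser would also need to validate the input and escape quote characters inside values.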
Grouping()
Database Comparison

eXist
* Open source
* Written in Java
* Makes use of a GUI
* Supports XPath and XQuery
* 2^31 small documents
* Minimal command-line documentation

Xindice
* Open source
* Written in Java
* Makes use of a GUI
* Does not support XQuery
* Many small documents
* Maintained by Apache
Results
• The Xindice XML database fit the requirements, and it
  was easier to use from the command line.
• Use a SOAP interface to interact with the user. The SOAP
  interface would tell the XML database what to search for,
  and the formatted results would then be displayed to the
  user on a web page.
Visualization
Summary
* Review XML and Java
* Research XPath
* Develop test data
* Black box testing on open source databases
* Develop Java code to interact with the database

Questions