How to Build a Database-driven MIRC Teaching File

Download Report

Transcript How to Build a Database-driven MIRC Teaching File

How to Build a Databasedriven MIRC Teaching File
System – A Case Study
Stephen Moore
Mallinckrodt Institute of Radiology
Financial Disclaimer
• Stephen Moore is employed by Washington
University and have received no extramural
funding for this project.
• The Mallinckrodt Institute does not charge
for access to the NM teaching file or for the
storage service software.
Intended Audience
• Those who wish to add a Storage Service to
an existing Teaching File
• Those who wish to build a Teaching File
that includes a Storage Service
• Not someone who wants to install/run
someone else’s software
– You need a different session or exhibit
Who Knows How the MIRC
Community Works?
MIRC
MIRC site
User
Query
Service
RSNA site
Index
Server
Internet
Index
MIRC site
Index
Index
Server
Server
Server
MIRC site
MIRC Terminology
• Query Service
– A point of access to the entire MIRC community. It
provides a query form to the user, distributes the search
criteria to all selected storage services, collates the
responses, and presents them to the user.
• Storage Service
– Responds to the query received from the query service,
searches its index for documents meeting the search
criteria, and returns abstracts and locations of the
matching documents to the query service
Approaches to Creating a
Storage Service
• Install, manage RSNA software
– Attractive solution if you don’t already have a teaching
file
• Write new software to interface to your existing
teaching file (live link to existing database)
– Allows existing teaching file to continue and you get
automatic updates to Storage Service
• Map data to MIRC-based database; write new
software to interface to new database
– Snapshot approach; minimal interference with existing
teaching file
Storage Service Model
User
Storage Service
XML
Parser
Control
DB
Logic Interface
Query
Service
Output
Generator
Web Server
DB
Storage Service Processing
Steps
• Web Server passes query to Control Logic
• Control Logic invokes XML parser to
produce a “query”
• Control Logic invokes DB Interface (or DB
directly) to perform search
• Output generator produces legal XML that
is passed back to Query Service
Software Development Steps
• Define system architecture
• Make technology choices for web server, parser,
database
• Map MIRC Document Schema to your database
schema (4-12 hours)
• Map XML query to existing database schemas (26 hours)
• Extend MIR software based on new schemas (2
days)
MIR MIRC Components
http post, XML
Java Tomcat
Web Server
Servlet API
Java Servlet
(MIR)
SQL
PostgreSQL
Relational
Database
Java Servlets Are a
Technology Choice
“Java Servlet technology provides Web
developers with a simple, consistent
mechanism for extending the
functionality of a Web server and for
accessing existing business systems.
A servlet can almost be thought of as an
applet that runs on the server side -without a face.”
Sun Java 2 Software
Development Kit (J2SDK)
• The J2SDK provides core technology for
the Tomcat web server and for servlets
• http://java.sun.com
• We are using an older version (1.2.2)
• We intend to upgrade to 1.4.2
Apache Tomcat Server
“Tomcat is the servlet container that is used in
the official Reference Implementation for
the Java Servlet and JavaServer Pages
technologies.
The Java Servlet and JavaServer Pages
specifications are developed by Sun under
the Java Community Process. “
Apache Tomcat Server
• Provides a simple interface to obtain the
http commands and return results
• http://jakarta.apache.org/tomcat/index.html
• We use version 4.1.29 (today)
• We have used 4.0.3 and 4.1.12 previously
Apache Xerces XML Parser
• “Xerces2 is a fully conforming XML
Schema processor.”
• http://xml.apache.org/xerces2-j/index.html
• We use Xerces-J-bin-2.5.0. I see a 2.6.0
release is available as of 11/20
PostgreSQL Relational
Database
• A robust, relational database supporting
SQL queries
• Software is freely available without
licensing fees
• We use version 7.1, 7.3
• http://www.postgresql.org
Java Interface to PostgreSQL
• PostgreSQL contains an optional package
that implements JDBC
• This is compiled and installed one time
• It allows a Java class (servlet) to execute
SQL operations
Other Technology to Consider
• Apache web server / CGI / PHP
• Microsoft IIS / ASP
• Relational databases
– MySQL
– Oracle, SQL Server, Sybase
• XML index
MIR Storage Service
(Moore/MIR)
Web Server
Apache
Control Logic
Custom (Java servlets)
XML Parser
Xerces-J (2.5.0)
DB Interface
JDBC (PostgreSQL)
Database
PostgreSQL / Unix
Output Generator
Custom (Java servlets)
UCSF Storage Service
(Tellis/UCSF)
Web Server
4th Dimension
Control Logic
XML Parser
Custom (scripting
language)
Expat4D
DB Interface
4th Dimension
Database
4th Dimension
Output Generator
Custom (scripting
language)
Software Development Steps
• Define system architecture
• Make technology choices for web server, parser,
database
• Map MIRC Document Schema to your database
schema (4-12 hours)
• Map XML query to existing database schemas (26 hours)
• Extend MIR software based on new schemas (2
days)
One MIRC Document = One
NM Teaching File Case
• We decided not to replicate Teaching File
functions with this Storage Service
• The MIRC Document that we return gives
two pointers
– One reference to the specific case that is chosen
– Second reference to the Nuclear Medicine
Teaching File main page
MIRC Query Schema
• 21 first tier elements
–
–
–
–
Title
Author
Abstract
Keywords
• Patient and image elements have subelements (9 such child elements)
MIR Nuclear Med /MIRC TF
Schema
• We mapped the 30 MIRC schema elements
to 12 columns in one table
• Remaining 18 elements are not
implemented (but should be)
MIRCQuery <-> MIR DB
Columns
MIRC Query
NM
MIR
Database MIRC DB
title
brief history title
author
author
author
abstract
full_history
abstr
diagnosis
diagnosis
diag_dx
Mapping NM Database to
MIRC Database (Snapshot)
perl script
HTML Files
SQL insert
statements
PostgreSQL
Relational DB
Questions to Ask When
Mapping to Your Schema
• If starting from scratch, will you create a flat
database design that matches the MIRC Document
schema?
– Do you have different tables for different concepts
(patient, study, image)?
• How will your system change in the future should
the query schema change?
• If you have an existing system, can you create a
database view to get close to the query
requirements?
Loading Data MIRC Database
• MIR Nuclear Medicine Teaching File can
be exported as a series of HTML files
• We wrote a perl script to scan each
file/record and produce an equivalent record
in our SQL table
A Simple MIRC Document
Can Be Returned
• Attributes that we return in response include
–
–
–
–
docref
title
author
abstract
• RSNA Query Service will take care of
formatting the output for us
Software Development Steps
• Define system architecture
• Make technology choices for web server, parser,
database
• Map MIRC Document Schema to your database
schema (4-12 hours)
• Map XML query to existing database schemas (26 hours)
• Extend MIR software based on new schemas (2
days)
Mapping MIRCQuery to Your
Database
• If underlying technology is a SQL database,
you need only parse the XML query and
produce appropriate SQL and harvest results
• select title, author, abstr
from doc_reference where
keywords ilike ‘%paget%’;
Other Considerations
• Is there any private data in your existing database
that you need to protect?
• Can you easily run case-insensitive queries against
your database?
• Is it simple to provide a document + URL that will
get the user directly to your teaching file case?
• Is this a system to also manage your teaching file
or is it designed to provide another entry point?
MIR Implementation Details
Lines of Java Code
•
•
•
•
Class to parse XML
Setup JDBC interface
JDBC/SQL query
Publish results
600 lines *
50 lines
70 lines
30 lines
• * The XML parsing code uses the Xerces-J
software. Our software could be more
efficient.
Java Parser Sample
public void doPost (HttpServletRequest req,
HttpServletResponse res){
DocumentBuilder b =
DocumentBuilderFactory.newInstance.newDocumentBuilder();
BufferedReader rdr = req.getReader();
org.xml.sax.InputSource is =
new org.xml.sav.InputSource(rdr);
Document d = builder.parse(is);
JDBC Code Sample
Class.forName(“org.postgresql.Driver”);
mDB = DriverManager.getConnection(
“jdbc:postgresql:doc_reference?user=postgres”);
Statement stmt = mDB.createStatement();
Stmt.execute(“select title, author, … where …”);
ResultSet rs = stmt.getResultSet();
while (rs.next()) {
// process row from database
String title = rs.getString(1).trim();
String author = rs.getString(2).trim();
…
}
Setup Steps
Install J2SDK
5 min
Install Tomcat server
5 min
Compile/install
PostgreSQL (Solaris)
Install/configure
PostgreSQL (Linux)
3 hours **
1 hour **
Conclusion
• The process to build a Storage Service
interface is relatively easy
• Free tools exist (web server, servlets,
relational database) that assist the developer
• Our software available next week
• http://www.erl.wustl.edu/RSNA-MIRC
• Questions?