CCP4(i) Database Development
Wanjuan (Wendy) Yang
CCP4 annual developers' meeting
March 28, 2006, York
Outline of the talk
  Background of the project
  System architecture & component functionalities
  Choice of technologies
  Current status & immediate plans
Background of the project

As part of BIOXHIT:
  Expand the current CCP4i job database to include extended tracking and storage of crystallographic data.
  Separate project database handling from the main CCP4i process, allowing access to the project database from non-CCP4i applications.
Expand the scope of the CCP4i database to store a richer dataset.
Useful for:
  Users of CCP4i.
  Automated structure determination pipelines: to record the progress, steps and results within pipelines.
  Non-CCP4i applications that want to use the database: sharing data between different applications.
System Architecture
[Architecture diagram: CCP4i, XIA, HAPPy, Data Visualiser, Coot, CCP4 MG, and other CCP4 and non-CCP4 applications each use a Client API to talk to the DB Handler; behind the handler sit the project DB backends (database.def, SQL db) and other databases, e.g. PIMS.]
Component functionalities
DB handler:
  Runs in server mode: one handler per user; many clients can connect at the same time.
  Can be invoked from a client or started separately.
  Shutdown procedure: shuts down automatically when no clients remain, or when a client sends a shutdown message.
  Broadcasting: when the database changes state, the handler notifies the clients.
  Can deal with different types of database through DB APIs, e.g.:
    i. CCP4i project database
    ii. SQL database
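
The handler behaviour listed above (many clients, broadcasting on change, automatic shutdown) can be sketched in a few lines of Python. This is only an illustration of the logic: the class name `DbHandler`, its methods and the event strings are hypothetical, the socket transport is left out, and the rule that every connected client receives every notification is an assumption.

```python
class DbHandler:
    """Toy handler: tracks connected clients and notifies them of changes."""

    def __init__(self):
        self.clients = {}    # client id -> callback that receives notifications
        self.projects = []   # stand-in for the real project database
        self.running = True

    def connect(self, client_id, on_notify):
        self.clients[client_id] = on_notify

    def disconnect(self, client_id):
        self.clients.pop(client_id, None)
        # Automatic shutdown once the last client has gone away.
        if not self.clients:
            self.running = False

    def shutdown(self):
        """Explicit shutdown requested by a client."""
        self.broadcast("shutdown")
        self.clients.clear()
        self.running = False

    def broadcast(self, event):
        # Assumption: every connected client is notified, including the sender.
        for on_notify in list(self.clients.values()):
            on_notify(event)

    def new_project(self, name):
        self.projects.append(name)
        self.broadcast("project_added:" + name)
        return len(self.projects)


handler = DbHandler()
seen_by_gui, seen_by_pipeline = [], []
handler.connect("gui", seen_by_gui.append)
handler.connect("pipeline", seen_by_pipeline.append)
project_id = handler.new_project("demo")
handler.disconnect("gui")
handler.disconnect("pipeline")
```

After the last `disconnect` call the handler marks itself as stopped, which mirrors the "no clients left" shutdown rule.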
Component functionalities
Project database:
  Project history tracking data: stores the steps/jobs taken within a software pipeline.
  Knowledge base data: common crystallographic data items used within the software pipeline.
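
To make the two kinds of data concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`job`, `data_item`, `task_name` and so on) are invented for illustration and are not the project's schema; the point is only that tracking records and knowledge-base items can live side by side and be linked.

```python
import sqlite3

# Hypothetical schema: one table per kind of data described above.
SCHEMA = """
CREATE TABLE job (
    job_id     INTEGER PRIMARY KEY,
    task_name  TEXT NOT NULL,
    status     TEXT NOT NULL
);
CREATE TABLE data_item (
    item_id  INTEGER PRIMARY KEY,
    job_id   INTEGER REFERENCES job(job_id),
    name     TEXT NOT NULL,
    value    TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Tracking data: a step taken within the pipeline.
cur = conn.execute(
    "INSERT INTO job (task_name, status) VALUES (?, ?)",
    ("refinement", "FINISHED"),
)
job_id = cur.lastrowid

# Knowledge base data: a crystallographic item attached to that step.
conn.execute(
    "INSERT INTO data_item (job_id, name, value) VALUES (?, ?, ?)",
    (job_id, "spacegroup", "P 21 21 21"),
)
conn.commit()

rows = conn.execute(
    "SELECT j.task_name, d.name, d.value "
    "FROM job AS j JOIN data_item AS d ON d.job_id = j.job_id"
).fetchall()
```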
Component functionalities
Client API:
  A list of methods/commands for interacting with the handler.
  Hides the details of socket communications.
  Implemented in different programming languages to support different applications.
Visualisation tool:
  For viewing and analysing data.
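
The idea that the Client API hides the socket communications can be sketched as a small wrapper class. Everything named here is an assumption made for illustration: the class `DbClient`, the method `new_project`, the one-message-per-line framing, and the `fake_handler` that stands in for the real DB handler. The XML element names follow the request and response examples given in this talk.

```python
import socket
import threading
import xml.etree.ElementTree as ET


class DbClient:
    """Hypothetical client wrapper: callers never touch the socket directly."""

    def __init__(self, sock):
        self._sock = sock
        self._reader = sock.makefile("r", encoding="utf-8")

    def _call(self, command, *arguments):
        root = ET.Element("db_request")
        ET.SubElement(root, "command").text = command
        for value in arguments:
            ET.SubElement(root, "argument").text = str(value)
        # Assumed framing: one XML message per line.
        message = ET.tostring(root, encoding="unicode") + "\n"
        self._sock.sendall(message.encode("utf-8"))
        reply = ET.fromstring(self._reader.readline())
        if reply.findtext("status") != "ok":
            raise RuntimeError("handler reported an error")
        return reply.findtext("result")

    def new_project(self, name, directory):
        return int(self._call("NewProject", name, directory))

    def close(self):
        self._reader.close()
        self._sock.close()


def fake_handler(sock, received):
    """Stand-in for the DB handler: answers a single request with a fixed reply."""
    with sock, sock.makefile("r", encoding="utf-8") as reader:
        request = ET.fromstring(reader.readline())
        received.append(request.findtext("command"))
        sock.sendall(
            b"<db_response><status>ok</status><result>3</result></db_response>\n"
        )


received = []
client_sock, handler_sock = socket.socketpair()
worker = threading.Thread(target=fake_handler, args=(handler_sock, received))
worker.start()
client = DbClient(client_sock)
project_id = client.new_project("demo", "/tmp/demo")
worker.join()
client.close()
```

An application only ever calls `client.new_project(...)`; the XML encoding and the socket traffic stay inside `_call`.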
Choice of technologies

Client-server architecture:
  Socket communications between the server and its clients.
XML used as the messaging technology:
  Allows the system to be language independent.
Request example:
<db_request>
<command>NewProject</command>
<argument>ProjectName</argument>
<argument>ProjectDirectory</argument>
</db_request>
Response example:
<db_response>
<status>ok</status>
<result>3</result>
</db_response>
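
Messages of this shape can be built and parsed with Python's standard `xml.etree.ElementTree` module. The helper names `build_request` and `parse_response` are hypothetical, and the project name and directory are made-up values; only the element names are taken from the examples above.

```python
import xml.etree.ElementTree as ET


def build_request(command, *arguments):
    """Serialise a command and its arguments as a <db_request> message."""
    root = ET.Element("db_request")
    ET.SubElement(root, "command").text = command
    for value in arguments:
        ET.SubElement(root, "argument").text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_response(xml_text):
    """Return (status, result) from a <db_response> message."""
    root = ET.fromstring(xml_text)
    if root.tag != "db_response":
        raise ValueError("unexpected message type: " + root.tag)
    return root.findtext("status"), root.findtext("result")


request = build_request("NewProject", "myproject", "/home/user/myproject")
status, result = parse_response(
    "<db_response><status>ok</status><result>3</result></db_response>"
)
```

Because both sides only exchange text in an agreed XML layout, a Tcl client and a Python handler can interoperate without sharing any code.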
Choice of technologies

Database backend choices:
We provide multiple database backends:
  Embedded database: we are using SQLite. Features:
    Simple to use: a small C library; a self-contained, embedded SQL database engine.
    Portable: each database is a single file on the file system.
    Zero-configuration: no set-up or administration needed.
    Fast random access to the data.
  Flat files: .def file backend implemented in Python. Replicates the current CCP4i database.
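
One way to offer several backends behind one handler is a common interface with interchangeable implementations. The sketch below is an assumption about how that could look, not the project's DB API: `ProjectBackend`, `SqliteBackend` and `InMemoryBackend` are hypothetical names, and the in-memory class is only a stand-in for the flat-file backend, since the .def file format is not modelled here.

```python
import sqlite3
from abc import ABC, abstractmethod


class ProjectBackend(ABC):
    """Operations the handler needs, whatever the storage is."""

    @abstractmethod
    def new_project(self, name, directory):
        """Store a project and return its identifier."""

    @abstractmethod
    def list_projects(self):
        """Return project names in creation order."""


class SqliteBackend(ProjectBackend):
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS project ("
            "id INTEGER PRIMARY KEY, name TEXT, directory TEXT)"
        )

    def new_project(self, name, directory):
        cur = self.conn.execute(
            "INSERT INTO project (name, directory) VALUES (?, ?)",
            (name, directory),
        )
        self.conn.commit()
        return cur.lastrowid

    def list_projects(self):
        rows = self.conn.execute("SELECT name FROM project ORDER BY id")
        return [name for (name,) in rows]


class InMemoryBackend(ProjectBackend):
    """Stand-in for the flat-file backend; the real .def format is not modelled."""

    def __init__(self):
        self.projects = []

    def new_project(self, name, directory):
        self.projects.append((name, directory))
        return len(self.projects)

    def list_projects(self):
        return [name for name, _ in self.projects]


def register_demo_projects(backend):
    # The caller does not need to know which backend it was given.
    backend.new_project("lysozyme", "/tmp/lysozyme")
    backend.new_project("insulin", "/tmp/insulin")
    return backend.list_projects()


sqlite_names = register_demo_projects(SqliteBackend())
memory_names = register_demo_projects(InMemoryBackend())
```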

Client APIs implemented in:
  Tcl API to support CCP4i and Tcl-based applications.
  Python API to support Python-based automation projects.
Current Status & immediate plans

Python DB handler has been implemented.
  Next step is to integrate it into CCP4i with the .def file backend.
  Allow other applications to make use of database.def data.
Currently developing the SQL schema for the “rich database” backend.
  Consists of a knowledge base (i.e. crystallographic data) and tracking (project history).
  Being developed in conjunction with the HAPPy and XIA projects.
  Several iterations will be required to stabilise the schema and API.
  Next step is to make the Client APIs available to HAPPy and XIA.
Begin to develop a visualisation tool to integrate project tracking data.
Acknowledgements
  Peter Briggs
  Graeme Winter
  Charles Ballard
  Daniel Rolfe
  Steve Ness
  Many others …