Leveraging Database
Technologies in Condor
Jeff Naughton
March 14, 2005
Overview
› Introducing ourselves
› How we got involved
› What we are doing and what we hope
to do
› Request for input
Who we are
› Faculty: David DeWitt, Jeff
Naughton
› Students: Jiansheng Huang, Ameet
Kini, Christine Reilly, Eric Robinson,
Srinath Shankar, Lakshmikant
Shrinivas
Wisconsin DB Group
› A world-leading DB research group for over 20 years.
› Strong presence in:
 Research publications.
 Grads on faculty at top schools (Berkeley X 2, Cornell X
2, CMU)
 Grads at top industrial DB research centers (IBM
Almaden, MS Research)
 Grads in development organizations of main DB companies
(IBM DB2, Oracle, MS SQL Server)
› History of influential software artifacts (WiSS,
Gamma, Exodus, SHORE, Paradise)
So how did we get to
Condor/Paradyn week?
› 4th floor of CS building: 4361 Naughton, 4367 DeWitt, 4369 Livny (adjacent offices!)
› Miron was very persuasive. His algorithm:
1. Enter our offices.
2. Describe some challenging and interesting data management problem Condor faces or will face.
3. Leave office, get on airplane.
4. Return to Madison, go to 1.
Why Condor and DBMS?
› Premise: A running Condor system is awash
in data:
 Operational data
 Historical data
 User data
› DBMS technology can help capture,
organize, manage, archive, and query this
data.
Three potential levels of
involvement
1. Passively collect and organize data, expose it through DB query interfaces.
2. Move/extend some data-related portions of Condor to DBMS (Condor writes to and reads from DBMS)
3. Provide services to help users manage their data.
Why do this?
› For Condor developers:
 Easier to troubleshoot and debug the system;
 Easier to implement new functionality;
• Less time hassling with data management issues;
• Power of declarative data management language.
 Easier to make data management aspects of the
system scalable;
• Leverage 25 years of DBMS research on scalable data
management.
Why do this?
› For Condor administrators:
 Easier to analyze and troubleshoot;
 Easier to audit;
 Easier to explore current and past system status and behavior.
Why do this?
› For Condor users:
 An ever-improving system due to more productive
developers and administrators.
 Easier to monitor and understand performance of their
jobs.
 Easier to analyze history of their use of the system.
• Complete record of every job they have submitted, and
everything that happened to every job while it was running.
• Support for detailed data lineage queries.
 Data management facilities to assist them in handling
large, complex, inter-related data sets.
Our projects and plans
› Quill: Transparently provides a DBMS query interface to job_queue and history data. [ready to deploy!]
› CondorDB: Transparently captures and provides interface to critical data from all Condor daemons. [status: partial prototype working in our own “sandbox”]
Longer-term plans
› Tight integration of DBMS
technology and Condor [status:
thinking hard!].
› DBMS-inspired data management
services to help Condor users manage
their own data. [status: thinking
really hard!]
Why doesn’t Condor currently
use DBMS technology?
› Simple answer: Condor and DBMSs
“grew up” together.
Condor project started 1986.
Postgres project started 1986.
› Now both are ready for each other.
Project 1: Quill
› Non-invasive approach to capturing job related information
› Works by sniffing updates to the job queue log
› Serves condor_q and condor_history queries
› Independent, reliable, and efficient querying of job related information
So how does it work?
Quill Architecture
[Diagram: Master, Startd, …, Schedd, and Quill daemons; Quill reads the Schedd's job queue log and maintains Queue + History tables in the RDBMS]
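
The log-sniffing approach can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Quill's actual implementation: the log line format (SET and DEL records), the table name job_attrs, and the function apply_log_lines are all hypothetical, and SQLite stands in for the RDBMS.

```python
import sqlite3

def open_db():
    """Create an in-memory database with one row per (job, attribute)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE job_attrs ("
        " job_id TEXT, attr TEXT, value TEXT,"
        " PRIMARY KEY (job_id, attr))"
    )
    return db

def apply_log_lines(db, lines, offset=0):
    """Replay log lines starting at `offset`; return the new offset.

    Hypothetical format (NOT the real Condor job queue log format):
      SET <job_id> <attr> <value>   set or overwrite one attribute
      DEL <job_id>                  the job left the queue
    """
    for line in lines[offset:]:
        parts = line.split(maxsplit=3)
        if not parts:
            continue  # skip blank lines
        if parts[0] == "SET" and len(parts) == 4:
            _, job_id, attr, value = parts
            db.execute(
                "INSERT OR REPLACE INTO job_attrs VALUES (?, ?, ?)",
                (job_id, attr, value),
            )
        elif parts[0] == "DEL" and len(parts) == 2:
            db.execute("DELETE FROM job_attrs WHERE job_id = ?", (parts[1],))
    db.commit()
    return len(lines)

# Initial contents of the (hypothetical) log.
log = [
    "SET 1.0 Owner alice",
    "SET 1.0 JobStatus idle",
    "SET 2.0 Owner bob",
]
db = open_db()
offset = apply_log_lines(db, log)

# More entries are appended later; only the new tail is replayed.
log += ["SET 1.0 JobStatus running", "DEL 2.0"]
offset = apply_log_lines(db, log, offset)

rows = db.execute(
    "SELECT job_id, attr, value FROM job_attrs ORDER BY job_id, attr"
).fetchall()
print(rows)
```

Because the sniffer only remembers an offset into the log, it never has to contact the schedd, which is the sense in which the approach is non-invasive.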
Querying Job Related
Information
[Diagram: Master, Startd, …, Schedd, Quill, and RDBMS. Callout on the Schedd: "Querying an already busy schedd!!" Callout on the RDBMS: "Independent and a more powerful query functionality"]
Quill benefits
› Robustness: Monitored by master just like other Condor daemons; resilient to failure
› Independence: Not in critical path of any other Condor daemons
› Performance: Derive benefits of SQL to serve job related queries an order of magnitude faster
› Functionality: A broader range of queries
› Extensibility: Easy to add more complex queries
› Downside: only handles job queue and history data.
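
To make the Functionality and Extensibility points concrete, here is a hedged sketch of the kind of query an RDBMS makes easy. The jobs table, its columns, and the sample rows are invented for illustration and are not Quill's actual schema; SQLite stands in for the RDBMS.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical schema: one row per job.
db.execute(
    "CREATE TABLE jobs (job_id TEXT PRIMARY KEY, owner TEXT,"
    " status TEXT, runtime_sec INTEGER)"
)
db.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?)",
    [
        ("1.0", "alice", "running", 120),
        ("1.1", "alice", "idle", 0),
        ("2.0", "bob", "running", 300),
        ("3.0", "bob", "completed", 900),
    ],
)

# A condor_q-style listing: jobs that are still in the queue.
queue = db.execute(
    "SELECT job_id, owner, status FROM jobs"
    " WHERE status != 'completed' ORDER BY job_id"
).fetchall()

# A broader query: job count and total runtime per owner.
summary = db.execute(
    "SELECT owner, COUNT(*), SUM(runtime_sec) FROM jobs"
    " GROUP BY owner ORDER BY owner"
).fetchall()

print(queue)
print(summary)
```

Adding a new report in this style means writing one more SELECT statement rather than new traversal code.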
Project 2: CondorDB
› CondorDB is a passive approach to
capturing operational data in a Condor
pool
› Modified daemons log events to the
database at run time – no log sniffing
› Central database serves entire pool
› Web-based query GUI
Data Capture in CondorDB
› Condor daemons augmented to record important events in a database
› Database is in addition to standard daemon logs
› Pool will run unaffected even in the absence of a database
[Diagram: Schedd, Shadow, Startd, Starter, and Negotiator daemons on "A Machine"]
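
The three points above (events go to a database, the standard log is still written, and the pool keeps running without a database) suggest a best-effort recorder. The following is a minimal sketch, assuming a hypothetical EventRecorder class and events table; it is not CondorDB's actual code, a Python list stands in for the daemon log file, and SQLite stands in for the central database.

```python
import sqlite3

class EventRecorder:
    """Record each event in the daemon log and, if possible, in a database."""

    def __init__(self, db=None):
        self.db = db          # None models a pool running without a database
        self.daemon_log = []  # stands in for the standard daemon log file

    def record(self, daemon, event):
        """Return True if the event also reached the database."""
        # The standard log is always written, database or not.
        self.daemon_log.append(f"{daemon}: {event}")
        if self.db is None:
            return False
        try:
            self.db.execute(
                "INSERT INTO events (daemon, event) VALUES (?, ?)",
                (daemon, event),
            )
            self.db.commit()
            return True
        except sqlite3.Error:
            # Database trouble must never stop the daemon.
            return False

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (daemon TEXT, event TEXT)")

# A connection that has been closed models a database that went away.
broken = sqlite3.connect(":memory:")
broken.close()

with_db = EventRecorder(db)
without_db = EventRecorder(None)
with_broken_db = EventRecorder(broken)

stored = with_db.record("schedd", "job 1.0 submitted")
skipped = without_db.record("startd", "claim activated")
failed = with_broken_db.record("shadow", "job 1.0 started")
count = db.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(stored, skipped, failed, count)
```

In all three cases the event lands in the daemon log, so losing the database costs query convenience but not information.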
CondorDB User Interface
› Users can access
Condor through a
web-interface
 Job queue, job
history, machine
info, match and
reject info,
aggregates and
summaries, etc…
› The web server
queries the
database with PHP
Users see only their own job
information
Users see only their own job
queue on a shared machine
Drill-down to get detailed job
information
Matchmaking data at your
fingertips
Matches
Rejects
Machine information in a
single central repository
The data-centric approach
makes many tasks easier
› Privacy enhanced by presenting user with queue/history information about her jobs only
› Intuitive “drill-down” navigation to get increasingly detailed information
› All information about a job from submit-time until present available from a single screen
› Useful summary information presented in tabular and graphical format
› Optionally query database directly for ad hoc information on job queue, job history, matchmaking and file usage
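
The privacy and drill-down points can be illustrated with two parameterized queries. This is a sketch under stated assumptions: the jobs and job_events tables and the functions my_jobs and job_detail are hypothetical, Python stands in for the web layer, and SQLite stands in for the central database.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical schema and sample data.
db.executescript("""
CREATE TABLE jobs (job_id TEXT PRIMARY KEY, owner TEXT, status TEXT);
CREATE TABLE job_events (job_id TEXT, seq INTEGER, event TEXT);
INSERT INTO jobs VALUES ('1.0', 'alice', 'running');
INSERT INTO jobs VALUES ('2.0', 'bob', 'idle');
INSERT INTO job_events VALUES ('1.0', 1, 'submitted');
INSERT INTO job_events VALUES ('1.0', 2, 'matched');
INSERT INTO job_events VALUES ('1.0', 3, 'started');
INSERT INTO job_events VALUES ('2.0', 1, 'submitted');
""")

def my_jobs(db, user):
    """Top-level view: only the jobs owned by `user`."""
    return db.execute(
        "SELECT job_id, status FROM jobs WHERE owner = ? ORDER BY job_id",
        (user,),
    ).fetchall()

def job_detail(db, user, job_id):
    """Drill-down: the event history of one job, only if `user` owns it."""
    return db.execute(
        "SELECT e.seq, e.event FROM job_events e"
        " JOIN jobs j ON j.job_id = e.job_id"
        " WHERE j.owner = ? AND e.job_id = ? ORDER BY e.seq",
        (user, job_id),
    ).fetchall()

print(my_jobs(db, "alice"))
print(job_detail(db, "alice", "1.0"))
print(job_detail(db, "alice", "2.0"))  # bob's job: nothing is returned
```

Putting the owner filter inside every query, rather than filtering in the page template, means a drill-down link cannot be edited to reveal another user's job.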
Acknowledgement
› The Condor team has been
wonderfully responsive and supportive
throughout this effort.
Demos!
› Come see demos of Quill and
CondorDB in room 4360 CS on Wed.
afternoon.
Virtuous Cycle
› As we learn where Condor can use DBMS
technology, we also learn where DBMS
technology can be (must be?) improved.
 Support for dynamic-schema sparse data sets.
 Extreme requirements of self-installation and
self-maintenance.
 Pushing match-making style operations into
DBMS.
› Improving DBMS technology will lead to
more places that it can be installed.
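
As a rough picture of what pushing match-making style operations into a DBMS might look like, matching can be written as a join between job requirements and machine properties. This is a deliberately simplified sketch: the tables, columns, and sample values are hypothetical, and requirements are reduced to two fixed columns, whereas real Condor matchmaking evaluates ClassAd expressions, which is part of why it is listed here as a challenge for DBMS technology.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical fixed-column schema for machines and job requirements.
db.executescript("""
CREATE TABLE machines (name TEXT PRIMARY KEY, arch TEXT, memory_mb INTEGER);
CREATE TABLE jobs (job_id TEXT PRIMARY KEY, req_arch TEXT,
                   req_memory_mb INTEGER);
INSERT INTO machines VALUES ('m1', 'INTEL', 512);
INSERT INTO machines VALUES ('m2', 'INTEL', 2048);
INSERT INTO machines VALUES ('m3', 'SUN4u', 1024);
INSERT INTO jobs VALUES ('1.0', 'INTEL', 1024);
INSERT INTO jobs VALUES ('2.0', 'SUN4u', 4096);
""")

# Matches: every (job, machine) pair where the machine meets the job's needs.
matches = db.execute(
    "SELECT j.job_id, m.name FROM jobs j JOIN machines m"
    " ON m.arch = j.req_arch AND m.memory_mb >= j.req_memory_mb"
    " ORDER BY j.job_id, m.name"
).fetchall()

# Rejects: jobs for which no machine qualifies.
rejects = db.execute(
    "SELECT j.job_id FROM jobs j WHERE NOT EXISTS ("
    " SELECT 1 FROM machines m"
    " WHERE m.arch = j.req_arch AND m.memory_mb >= j.req_memory_mb)"
    " ORDER BY j.job_id"
).fetchall()

print(matches)
print(rejects)
```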
Request
› We want your input!
› We have a lot of ideas but want to
filter, modify, and augment them
through the benefit of your
experience.
› Send mail to [email protected]
anytime.