What's New in Condor - Computer Sciences Dept.

Condor RoadMap
Condor Week 2007
Todd Tannenbaum
Computer Sciences Department
University of Wisconsin-Madison
http://www.cs.wisc.edu/condor
Current Situation
› Stable Series
 Current: Condor ver 6.8.4. (Feb 5th)
› Development Series
 Current: Condor ver 6.9.2. (April 10th)
2
Major v6.9 Changes
› Virtual Machine Universe
 See Jaeyoung’s talk
› Quill 2.0
 See Jeff and Erik’s talk
› Scalability Improvements
› GCB Improvements
› Privilege Separation
3
Team Scalability!
Dan “Two-Faces” Bradley (half CMS, half Condor), “Papa” Todd Tannenbaum, and “Uncle” Greg Thain
4
The new condor_q GUI?
How we looked…
› Intuition / Heated hallway discussion
› Wagering
› Examination of the log files
› Some Tools
 callgrind and kcachegrind
• Or Compuware DevPartner on Win32
 gprof
 strace
 tcpdump
7
What we found…
› Inappropriate Data Structures
schedd
› Deadly embraces
Bill and Monica
› Implementation issues
“I am sooo embarrassed!”
 Bad hash function, table sizes
 Buffer copy, copy, copy, copy, copy, copy
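To illustrate why a bad hash function hurts no matter how the table is sized, here is a minimal sketch. It is not Condor's code: the "cluster.proc" key format, the deliberately poor `weak_hash`, and the use of CRC32 as a stand-in for a better hash are all assumptions made for illustration.

```python
import zlib
from collections import Counter

def weak_hash(key: str) -> int:
    # Deliberately poor (hypothetical): only the key length matters.
    return len(key)

def crc_hash(key: str) -> int:
    # CRC32 stands in for "a better hash"; the slides do not say what Condor used.
    return zlib.crc32(key.encode())

def longest_chain(keys, hash_fn, table_size):
    """Size of the fullest bucket when keys are chained into table_size buckets."""
    buckets = Counter(hash_fn(k) % table_size for k in keys)
    return max(buckets.values())

# Hypothetical job IDs of the form "cluster.proc".
keys = [f"{cluster}.{proc}" for cluster in range(100) for proc in range(10)]
weak = longest_chain(keys, weak_hash, 64)
better = longest_chain(keys, crc_hash, 64)
print(f"longest chain: weak={weak}, crc32={better}")
```

With the weak hash, every key lands in one of two buckets, so lookups degrade to a linear scan even though the table has 64 slots.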
8
What we did…
› More non-blocking I/O in critical areas to eliminate timeouts/embraces
› Cleansed a bunch of embarrassments
› Reuse of a claim
 Carefully cache candidate jobs
 Use “autoclusters”
› Auto-adapting parameters based on workload
 Old way: “do some bookkeeping every 5 seconds”
 New way: “spend 5% of time doing bookkeeping”
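The "spend 5% of time" idea can be sketched as a duty-cycle calculation: measure how long the last bookkeeping pass took, then wait long enough that the pass is about 5% of the total. This is a minimal sketch under that assumption; the function name `next_delay` and its parameters are hypothetical, not Condor configuration knobs.

```python
def next_delay(last_duration_s: float, budget: float = 0.05) -> float:
    """Seconds to idle before the next bookkeeping pass.

    Chosen so that duration / (duration + delay) == budget, i.e. the
    bookkeeping work takes roughly `budget` of wall-clock time.
    """
    if not 0.0 < budget < 1.0:
        raise ValueError("budget must be strictly between 0 and 1")
    if last_duration_s < 0.0:
        raise ValueError("duration cannot be negative")
    return last_duration_s * (1.0 - budget) / budget

# A pass that took 1 second is followed by roughly 19 seconds of other work;
# a pass that took 10 seconds backs off to roughly 190 seconds.
for took in (0.1, 1.0, 10.0):
    print(f"pass took {took:5.1f}s -> wait {next_delay(took):6.1f}s")
```

Unlike a fixed 5-second timer, the interval stretches automatically as the queue grows and each pass gets slower.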
9
Creepy fork()
Got This: 0.125sec/fork!
Wanted this:
10
Let’s fix this one, please.
(patch circa 2003)
11
And last but not least,
thank you to
(i.e. Francesco Prelz!)
Numerous scalability improvements to Condor-C
Buffered writes to schedd transaction log
12
Let’s see some results to date
13
SWEEEEET!!
Todd, let’s see that graph again!!!
15
condor_q performance
(sans Quill)
› Already done
 Batch sending of ads (eliminates latency, lets the TCP window warm up)
 Projection of attributes (note: “condor_q -l” is now more expensive than “condor_q -format”)
› Still to do?
 I/O in another thread
 href protocol on the wire
 Caching of parsed expressions -- ClassAds are very redundant
 Same improvements into condor_status
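The two "already done" items can be sketched together: project each ad down to the attributes the client asked for, then send the ads in batches rather than one at a time. This is an illustrative sketch only; ads are modeled as plain dicts, and `project`, `batches`, and the batch size are hypothetical names and values, not the wire protocol.

```python
def project(ad: dict, wanted: list) -> dict:
    """Keep only the requested attributes that the ad actually has."""
    return {name: ad[name] for name in wanted if name in ad}

def batches(items: list, size: int):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical job ads; real ClassAds carry many more attributes.
ads = [
    {"ClusterId": 7, "ProcId": p, "Owner": "alice", "JobStatus": 1, "Cmd": "/bin/sim"}
    for p in range(5)
]
wanted = ["ClusterId", "ProcId", "Owner"]
projected = [project(ad, wanted) for ad in ads]
for chunk in batches(projected, 2):
    print(len(chunk), "ads in this batch:", chunk)
```

A full listing has to ship every attribute, which is why the long format costs more than a query that names only a few attributes.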
21
Collector Performance
› Fixing dropped updates
 Increased incoming buffer sizes; problems caused by synchronization via condor_reconfig -all, etc.
 Also, with Winsock, UDP sendto() is always successful. (!)
› Added DNS caching for unauthenticated connections in Condor.
 Profiling was important; we had no suspicion this was the problem. The collector was spending 20% of its time in the DNS resolver library!
› Todo: Ian Alderman discovered that authentication methods requiring round-trips violate our non-blocking communication assumptions.
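A DNS cache of the kind mentioned above can be sketched as a small time-to-live map in front of the resolver. This is not Condor's implementation: the class name `HostnameCache`, the 300-second TTL, and the choice of a reverse lookup as the default resolver are assumptions for illustration.

```python
import socket
import time

class HostnameCache:
    """Resolver results cached for a fixed time-to-live (illustrative sketch)."""

    def __init__(self, resolver=None, ttl_s=300.0, clock=time.monotonic):
        # Default resolver does a reverse lookup; any callable ip -> name works.
        self._resolver = resolver or (lambda ip: socket.gethostbyaddr(ip)[0])
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries = {}  # ip -> (hostname, expiry time)

    def lookup(self, ip: str) -> str:
        now = self._clock()
        entry = self._entries.get(ip)
        if entry is not None and entry[1] > now:
            return entry[0]  # fresh cache hit: no resolver call
        name = self._resolver(ip)
        self._entries[ip] = (name, now + self._ttl_s)
        return name
```

Injecting the resolver and clock keeps the sketch testable without touching the network; a daemon receiving repeated updates from the same hosts would hit the resolver once per host per TTL instead of once per update.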
23
Negotiation Performance
› v6.8 -> automatic “significant attributes”.
› v6.9 -> “resource request” ads
 Simple explanation: Resource request ad == a count plus all significant attributes.
 Inserted into a schedd submitter ad.
 “Give me 400 resources like this, and 200 resources like that, etc”.
› Matchmaking algorithm remains the same; only how it “learns” about jobs changes.
› Disabled by default.
› Possibilities, possibilities…
 More robust against unresponsive schedds
 No startd Rank preemption?
 Others?
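The "count plus all significant attributes" description can be sketched as a grouping step: jobs that agree on every significant attribute collapse into one request with a count. This is a minimal sketch, not the schedd's code; jobs are modeled as dicts, and the `Count` key and the `resource_requests` function are hypothetical names.

```python
from collections import Counter

def resource_requests(jobs: list, significant: list) -> list:
    """Collapse jobs into one request per distinct set of significant values."""
    counts = Counter(
        tuple((attr, job.get(attr)) for attr in significant) for job in jobs
    )
    # Each request: how many resources are wanted, plus the shared attributes.
    return [{"Count": n, **dict(key)} for key, n in counts.items()]

# Hypothetical queue: 400 small jobs and 200 large ones.
jobs = (
    [{"Owner": "alice", "RequestMemory": 512, "Cmd": f"/bin/a{i}"} for i in range(400)]
    + [{"Owner": "alice", "RequestMemory": 2048, "Cmd": f"/bin/b{i}"} for i in range(200)]
)
for request in resource_requests(jobs, ["Owner", "RequestMemory"]):
    print(request)
```

Six hundred jobs become two requests, which is the "400 like this, 200 like that" shape described above.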
25
Impact of negotiation changes
› UW CS Pool – Negotiation cycle times:
 2583 seconds baseline
 Dropped to 366 seconds with autoclustering
 Adding matchlist caching dropped it to 223 seconds
 Adding resource request ads dropped it again to 129 seconds
 CM memory footprint increased by 80k.
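As a quick worked calculation, the cycle times above translate into the following cumulative speedups over the baseline. Only the four timings come from the slide; the variable names are illustrative.

```python
baseline_s = 2583
cycle_times_s = {
    "autoclustering": 366,
    "+ matchlist caching": 223,
    "+ resource request ads": 129,
}

# Speedup relative to the 2583-second baseline, rounded to one decimal place.
speedups = {name: round(baseline_s / t, 1) for name, t in cycle_times_s.items()}
for name, factor in speedups.items():
    print(f"{name:24s} {factor:5.1f}x faster than baseline")
```

The three steps together shorten the negotiation cycle by roughly a factor of twenty.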
26
Team GCB!
Derek “Mr. Follow-CVS-rules-or-ELSE” Wright,
Alan “Ask me about Social Security #s” DeSmet,
and Jaime “The GridMan” Frey
28
› Improved Scalability: Only use the broker if required!
 Local Host Optimizations (6.9.1)
• Bypass GCB if two daemons are talking on the same host
 Local Network Optimizations (6.9.3)
• Two hosts on the same private net bypass the broker
• Every network is assigned a unique network name
• Daemons advertise (a) publicly accessible IP; (b) real IP; (c) network name.
• Names match ? use real IP : use public IP.
› Improved Robustness
 Broker dies -> master finds another broker and restarts.
 When master starts up, it pings a list of brokers and randomly chooses from those that respond.
 Bug fixes
› Improved Logging – the logs are now helpful and sane.
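The network-name rule and the broker selection described above can be sketched as two small functions. This is illustrative only and not GCB's code: the dict keys, the function names, and the `is_alive` callback (standing in for a ping) are hypothetical.

```python
import random

def address_to_use(local_network: str, peer: dict) -> str:
    """Names match ? use real IP : use public IP."""
    if peer["network_name"] == local_network:
        return peer["real_ip"]   # same private network: bypass the broker
    return peer["public_ip"]     # otherwise go through the public address

def pick_broker(brokers: list, is_alive, rng=random) -> str:
    """Choose randomly among the brokers that answer a liveness check."""
    responding = [b for b in brokers if is_alive(b)]
    if not responding:
        raise RuntimeError("no broker responded")
    return rng.choice(responding)

# Hypothetical advertisement from a daemon behind a broker.
peer = {"public_ip": "192.0.2.10", "real_ip": "10.0.0.5", "network_name": "lab-a"}
print(address_to_use("lab-a", peer))   # peer is on our private network
print(address_to_use("lab-b", peer))   # peer is on a different network
print(pick_broker(["b1", "b2", "b3"], lambda b: b != "b2"))
```

Choosing randomly among responders spreads daemons across brokers instead of piling them onto the first entry in the list.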
29
Team Privilege Separation!
“Cousin” Greg Quinn, Pete “Psilord” Keller, and
Zach “When the grid relaxes, it’s Zmiller time” Miller
30
Condor’s Privilege Separation
› Apply principle of least privilege to Condor
› No more root / superuser privilege required
› Currently completed on execute side (v6.9.3), “almost” on submit side
› Use glexec or Condor’s own sudo
› Can still run the “old way” if you want
› Refer to Greg Quinn’s Talk
31
Minor v6.9 changes
› Leases added to COD.
› Simple best-fit algorithm added to dedicated scheduler.
› Can reference resource usage and quota information in preemption policy.
› condor_config_val -dump [-v]
› Chirp improvements
 Jobs can write messages into the user log
 Can use proc 0 ClassAd as a “scratch pad”
› Condor shutdown via expressions
 External Awareness
 Plug: Talk w/ Joe Meehan @ the Research BOF!
32
Minor v6.9 changes, cont.
› More types of jobs can survive across a shutdown/crash of submit machine
 Such as jobs that stream stdout/err.
› User’s job log changes.
 Can have a centralized job log file.
 Get values of any job ad attribute in log.
33
Hoping for v6.9,
but no promises
› Rich Wolski’s prediction work
› Support for VOMS attributes
› Update condor binaries on job boundaries
› Secure install by default
 Via pool password?
34
So 2-3 more developer
releases, then new stable
series Condor 7.0
(… or Condor Vista? … )
And the next developer series
after v6.9 ?
35
Terms of License
Any and all dates in these slides are
relative from a date hereby unspecified in
the event of a likely situation involving a
frequent condition. Viewing, use,
reproduction, display, modification and
redistribution of these slides, with or without
modification, in source and binary forms, is
permitted only after a deposit by said user into
PayPal accounts registered to Todd Tannenbaum
….
Beyond v6.9
› For the next year, so far we have identified the following initial focus areas:
 Continue our work w/ Storage Management
(MOPS)
• Refer to Dan Fraser’s talk
 Continue our work w/ Virtual Machines
• Refer to Jaeyoung Yoon’s talk
 Scheduling Work
 Startd Enhancements
37
Scheduling in Condor Today
[Diagram: multiple CMs, schedds, and startds]
› Distributed Ownership
› Settings reflect 3 separate viewpoints:
 Pool manager, Resource Owner, Job Submitter
38
But some sites want to use
Condor like this:
[Diagram: one schedd and five startds]
› Just one submission point (schedd)
› All resources owned by one entity
› We can do better for these sites.
 Policy configurations are complicated.
 Some useful policies are not present because they are hard to do in a wide-area distributed system.
 Today the dedicated “scheduler” only supports FIFO and a naive Best Fit algorithm.
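For readers unfamiliar with the term, a naive best-fit choice can be sketched as "pick the resource that fits with the least left over". The slides do not describe what the dedicated scheduler actually compares, so memory in MB, the slot names, and the `best_fit` function are assumptions for illustration.

```python
from typing import Optional

def best_fit(request_mb: int, free_mb_by_slot: dict) -> Optional[str]:
    """Return the slot that satisfies the request with the least spare memory."""
    candidates = [
        (free_mb - request_mb, name)          # leftover first, so min() picks tightest
        for name, free_mb in free_mb_by_slot.items()
        if free_mb >= request_mb
    ]
    if not candidates:
        return None                           # nothing is large enough
    return min(candidates)[1]

# Hypothetical free memory per slot.
slots = {"slot1": 4096, "slot2": 1024, "slot3": 2048}
print(best_fit(900, slots))    # tightest fit is slot2
print(best_fit(1500, slots))   # slot2 is too small, so slot3
print(best_fit(8192, slots))   # no slot fits
```

FIFO, by contrast, would simply take jobs in arrival order and place each on the first resource that fits.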
39
So what to do?
[Diagram: one schedd and five startds]
› Give the schedd more scheduling options.
 Examples: why can’t the schedd do priority preemption without the matchmaker’s help? Or move jobs from slow to fast claimed resources?
› Pluggable scheduler routines.
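One way to picture "pluggable scheduler routines" is a table of interchangeable ordering functions that the schedd consults by name. This is purely a sketch of the idea; the registry, the policy names, and the job fields `priority` and `submit_time` are hypothetical and do not describe a Condor interface.

```python
def fifo(jobs: list) -> list:
    """Oldest submission first."""
    return sorted(jobs, key=lambda j: j["submit_time"])

def by_priority(jobs: list) -> list:
    """Highest priority first; ties broken by submission time."""
    return sorted(jobs, key=lambda j: (-j["priority"], j["submit_time"]))

# Hypothetical registry: adding a policy means adding one entry here.
SCHEDULERS = {"fifo": fifo, "priority": by_priority}

def order_jobs(jobs: list, policy: str) -> list:
    try:
        routine = SCHEDULERS[policy]
    except KeyError:
        raise ValueError(f"unknown scheduling policy: {policy}") from None
    return routine(jobs)

queue = [
    {"id": "1.0", "priority": 0, "submit_time": 100},
    {"id": "2.0", "priority": 5, "submit_time": 200},
    {"id": "3.0", "priority": 5, "submit_time": 150},
]
print([j["id"] for j in order_jobs(queue, "fifo")])
print([j["id"] for j in order_jobs(queue, "priority")])
```

A single-schedd site could then switch policies by changing one name rather than rewriting policy expressions.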
40
StartD Enhancements:
New sources of work
› Condor-G enabled the SchedD to talk to many
different scheduling systems to run jobs…
› Now the StartD will be able to talk to different
managers to fetch jobs and work.
› StartD configured to be “claimed at boot” so that
you don’t need the overhead of match-making.
› Don’t necessarily need a SchedD -- fetch jobs
(work units) from other systems (DB of jobs, etc).
41
StartD Enhancements:
Dynamic slots
› Currently, resource slots are static -- some changes require restarting the StartD.
› Would like to add dynamic computing slots:
 Dual-core machines are ubiquitous.
 1 or 2 gigs of RAM is “commodity”.
 Instead of statically partitioning the RAM (1 gig for each slot), it’d be nice to advertise “2 CPUs, 2 gigs of RAM”, and once 1 CPU is claimed for 1/2 gig, to advertise “1 CPU, 1.5 gigs of RAM” for the other slot.
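The dual-core example above works out as simple subtraction from a shared pool of resources. The sketch below reproduces that arithmetic; the `Machine` class and its method names are hypothetical and say nothing about how the StartD would implement dynamic slots.

```python
from dataclasses import dataclass

@dataclass
class Machine:
    cpus: int
    memory_mb: int

    def claim(self, cpus: int, memory_mb: int) -> dict:
        """Carve a slot out of the remaining resources."""
        if cpus > self.cpus or memory_mb > self.memory_mb:
            raise ValueError("not enough unclaimed resources")
        self.cpus -= cpus
        self.memory_mb -= memory_mb
        return {"cpus": cpus, "memory_mb": memory_mb}

    def advertised(self) -> tuple:
        """What is still on offer: (CPUs, memory in MB)."""
        return (self.cpus, self.memory_mb)

machine = Machine(cpus=2, memory_mb=2048)      # "2 CPUs, 2 gigs of RAM"
slot = machine.claim(cpus=1, memory_mb=512)    # 1 CPU claimed for 1/2 gig
print(machine.advertised())                    # remaining: 1 CPU, 1536 MB (1.5 gigs)
```

With static partitioning, the second slot would have been stuck at 1 gig regardless of what the first job needed.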
42
StartD Enhancements:
Dynamic slots (cont’d)
› Can also be used to simplify complex policies:
 Currently “checkpoint to swap” implemented with static
slots and pre-configured policy.
 Would like to just dynamically allocate new slots, and
make it easier to have global, slot-wide policy
expressions, not just per-slot policies.
› Could have implications for COD, GlideIn and other
uses of the StartD…
 GlideIn under an existing Condor pool might just allocate
a new slot on the “parent” StartD, instead of spawning a
whole new StartD under the parent StartD
 COD claims could allocate new dynamic slots, too…
43
Thank you!
44