51. Gustavo Alonso - Swiss Federal Institute of Technology


Proactive computing:
Adaptability, autonomic behavior,
dependability
Gustavo Alonso
Computer Science Department
Swiss Federal Institute of Technology
ETH Zürich
http://www.inf.ethz.ch/department/IS/iks/
Proactive computing
CENTRAL TOPIC in DeFINE
“The challenge is to build proactive systems that
regulate themselves and reduce the involvement of humans,
whether these be administrators, operators or end users.
Humans can thus concentrate on the main task instead of
dedicating unnecessary efforts to tasks that can be performed
by computers”
(Define documentation 18.11.02)
©Gustavo Alonso. ETH Zürich.
The All vs. All

Problem: Compute a cross-comparison of the SwissProt v38 protein database.
80,000 entries.
Average sequence length: 400 amino acid residues.
1.5 years (CPU time) on a single machine.
Several months to run just updates (using a medium-size cluster).
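To get a feel for the scale, a back-of-the-envelope calculation helps. The sketch below assumes (this is not stated on the slide) that the cross-comparison aligns every unordered pair of distinct entries exactly once, and it spreads the quoted 1.5 CPU-years evenly over those pairs. The function names are illustrative.

```python
# Rough scale of the all-vs-all problem, using only the figures quoted above.
# Assumption (not from the slide): each unordered pair of distinct entries
# is compared exactly once.

ENTRIES = 80_000
CPU_YEARS = 1.5
SECONDS_PER_YEAR = 365 * 24 * 3600  # ignoring leap years


def pair_count(n: int) -> int:
    """Number of unordered pairs among n entries: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def seconds_per_pair(n: int, cpu_years: float) -> float:
    """Average CPU time per comparison if the work is spread evenly."""
    return cpu_years * SECONDS_PER_YEAR / pair_count(n)


if __name__ == "__main__":
    pairs = pair_count(ENTRIES)
    per_pair_ms = seconds_per_pair(ENTRIES, CPU_YEARS) * 1000
    print(f"pairwise comparisons: {pairs:,}")
    print(f"average CPU time per pair: {per_pair_ms:.1f} ms")
```

Under these assumptions the job consists of about 3.2 billion comparisons at roughly 15 ms each, which is why even incremental updates keep a cluster busy for months.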
<E><ID>CY2_RHOVI</ID><AC>P00083;</AC>
<DE>CYTOCHROME C2 PRECURSOR.</DE>
<OS>RHODOPSEUDOMONAS VIRIDIS.</OS>
<OC>PROKARYOTA; GRACILICUTES; ANOXYPHOTOBACTERIA; PURPLE BACTERIA; RHODOSPIRILLACEAE.</OC>
<DR>PDB; 1CRY;</DR>
<KW>ELECTRON TRANSPORT; PHOTOSYNTHESIS; HEME; SIGNAL; 3DSTRUCTURE.</KW>
<SEQ>MRKLVFGLFVLAASVAPAAAQDAASGEQVFKQCLVCHSIGPGAKNKVGPVLNGLFGRHSGTIEGFAYSDANKNSGITWTEEVFREYIRDPKAKIPGTKMIFAGVKDEQKVSDLIAYIKQFNADGSKK</SEQ>
<RES>1.6</RES></E>
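A record in this form can be read with a standard XML parser. The following is a minimal sketch, assuming each entry is a well-formed `<E>` element with the child tags shown above. The function name `parse_entry` and the dictionary keys are illustrative choices, not part of the original system, and only some of the fields are extracted.

```python
import xml.etree.ElementTree as ET

# The sample entry from the slide (a subset of its fields), with the
# line breaks introduced by text extraction removed.
ENTRY_XML = (
    "<E><ID>CY2_RHOVI</ID><AC>P00083;</AC>"
    "<DE>CYTOCHROME C2 PRECURSOR.</DE>"
    "<OS>RHODOPSEUDOMONAS VIRIDIS.</OS>"
    "<DR>PDB; 1CRY;</DR>"
    "<SEQ>MRKLVFGLFVLAASVAPAAAQDAASGEQVFKQCLVCH"
    "SIGPGAKNKVGPVLNGLFGRHSGTIEGFAYSDANKNSGITWT"
    "EEVFREYIRDPKAKIPGTKMIFAGVKDEQKVSDLIAYIKQFN"
    "ADGSKK</SEQ>"
    "<RES>1.6</RES></E>"
)


def parse_entry(xml_text: str) -> dict:
    """Extract a few fields from one <E> record (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    def text(tag: str) -> str:
        node = root.find(tag)
        return node.text.strip() if node is not None and node.text else ""

    return {
        "id": text("ID"),
        "accession": text("AC").rstrip(";"),
        "description": text("DE"),
        "organism": text("OS"),
        # Drop any whitespace that line wrapping left inside the sequence.
        "sequence": "".join(text("SEQ").split()),
    }


if __name__ == "__main__":
    entry = parse_entry(ENTRY_XML)
    print(entry["id"], entry["accession"], len(entry["sequence"]))
```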
lengths=97,110 simil=275.3, PAM_dist=85.9883, offsets=5978923,5974267, identity=40.4%, similarity=12.3%
QDAASGEQVFKQCLVCHSIGPGAKNKVGPVLNGLFGRHSGTIEGFAYSDANKNS___GITWTEEVFREYIRDPKA_____
|||.:||.|||||!.||.   ..||.|||.|.|:.||.:||..||.||..|.||   |:.||.! . .|:.||.|
QDAKAGEAVFKQCMTCHR___ADKNMVGPALGGVVGRKAGTAAGFTYSPLNHNSGEAGLVWTADNIINYLNDPNAFLKKF
_________KIPGTKMIFAGVKDEQKVSDLIAYI
         .!. |||.|. :.:||:..|:!||:
LTDKGKADQAVGVTKMTFK_LANEQQRKDVVAYL
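The identity figure in the header can be checked directly from the two gapped sequences. The sketch below assumes that identity means the number of columns with identical residues divided by the total number of alignment columns, gap columns included. That definition is inferred from the numbers on the slide, not stated there, and the function name is illustrative.

```python
GAP = "_"

# Both alignment blocks from the slide, concatenated per sequence.
SEQ_A = (
    "QDAASGEQVFKQCLVCHSIGPGAKNKVGPVLNGLFGRHSGTIEGFAYSDANKNS___GITWTEEVFREYIRDPKA_____"
    "_________KIPGTKMIFAGVKDEQKVSDLIAYI"
)
SEQ_B = (
    "QDAKAGEAVFKQCMTCHR___ADKNMVGPALGGVVGRKAGTAAGFTYSPLNHNSGEAGLVWTADNIINYLNDPNAFLKKF"
    "LTDKGKADQAVGVTKMTFK_LANEQQRKDVVAYL"
)


def percent_identity(a: str, b: str, gap: str = GAP) -> float:
    """Identical, non-gap columns as a percentage of all alignment columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * identical / len(a)


if __name__ == "__main__":
    print(f"identity: {percent_identity(SEQ_A, SEQ_B):.1f}%")
```

For the alignment above this counts 46 identical columns out of 114, about 40.4%, which agrees with the header.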
Proactive vs. manual computing
[Figure: activity from Dec 17 to Jan 21. Vertical axes: number of processors (0 to 30). Plotted series: available processors, jobs on CPU, running jobs, successful jobs, failed jobs. Annotated events: other user needs cluster; all jobs manually killed; disk space shortage; cluster failure; server down for maintenance; cluster busy with other jobs.]
PROPOSAL
A RESEARCH AGENDA ON
PROACTIVE COMPUTING
The challenge is to do it all
Flexible, efficient architectures
High level representations
PROACTIVE COMPUTING
Models and instrumentation
The basis for proactive computing

High level representations of complex systems and computations
  • programming languages are at the wrong level for doing this
  • processes, new languages for composition, visual languages

Flexible and efficient architectures in which almost any aspect of the system can be dynamically adapted (beyond typical tuning knobs)
  • separation of concerns and fully modular architectures
  • reflective middleware
  • dynamic AOP

Models of behavior and adequate instrumentation (which is itself an adaptation): instrumentation acquires the data, adaptation algorithms process it, and the flexibility of the architecture lets them change the system dynamically as needed
  • better understanding and tested models of non-functional properties
  • ability to cope with complex data sets, and algorithms for mining this information both online and offline
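The interplay of the three ingredients (instrumentation, a model, and an adaptable architecture) can be pictured as a feedback loop. The following is only an illustrative sketch: the class `AdaptationLoop`, its fields, the moving-average "model", and the scale-up/scale-down rule are all assumptions made for this example and do not come from the talk.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List


@dataclass
class AdaptationLoop:
    """Hypothetical monitor -> model -> adapt cycle (all names illustrative)."""

    read_metric: Callable[[], float]     # instrumentation: observes the system
    apply_change: Callable[[int], None]  # architecture hook: reconfigures it
    target: float                        # desired upper bound for the metric
    workers: int = 1
    window: int = 3
    history: List[float] = field(default_factory=list)

    def step(self) -> int:
        """Take one measurement and adapt the worker count if needed."""
        self.history.append(self.read_metric())
        # Trivial stand-in for a behavior model: a moving average.
        predicted = mean(self.history[-self.window:])
        if predicted > self.target:
            self.workers += 1
            self.apply_change(self.workers)
        elif predicted < self.target / 2 and self.workers > 1:
            self.workers -= 1
            self.apply_change(self.workers)
        return self.workers


if __name__ == "__main__":
    samples = iter([0.9, 0.95, 0.2, 0.1, 0.1])
    applied: List[int] = []
    loop = AdaptationLoop(
        read_metric=lambda: next(samples),
        apply_change=applied.append,
        target=0.8,
    )
    for _ in range(5):
        loop.step()
    print("reconfigurations applied:", applied)
```

A real system would replace the moving average with a tested model of the non-functional property of interest, and `apply_change` with whatever reconfiguration mechanism the architecture exposes.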
These are hard problems

A consensus is emerging on what is needed, but many of these goals contradict each other (an improvement here makes life very difficult there):
  • lack of models, programming paradigms, and suitable architectures
  • autonomic behavior and automatic reconfiguration make it very complex to understand what is going on (code evolution, instrumentation, monitoring)
  • dynamic AOP, reflective middleware, and application extensibility change the nature of the application on the fly, making it more difficult to monitor and instrument
  • programming for composition may actually lead to new programming paradigms where extensibility is an essential part of the language and not a middleware feature (which is better?)
  • end-to-end understanding of the problem is the key, but we don't know how to deal with composite systems
Dependability: autonomic behavior

True dependability encompasses a broad range of issues usually ignored by research:

Advanced management support:
  • change almost every aspect of the system at run time
  • monitor and dynamically alter the configuration of the system
  • decision support mechanisms:
      • what needs to be stopped if machine A is taken off-line?
      • given the current load, what is the best schedule for maintenance of each sub-cluster?
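The first decision-support question can be answered mechanically once the system records where each computation runs. The sketch below assumes a hypothetical placement map from job names to the set of machines hosting them. The map, the job names, and both function names are invented for illustration.

```python
from typing import Dict, Set

# Hypothetical placement map: job name -> machines it currently runs on.
Placement = Dict[str, Set[str]]


def jobs_to_stop(placement: Placement, machine: str) -> Set[str]:
    """Jobs left with no machine at all once `machine` goes off-line."""
    return {job for job, hosts in placement.items() if hosts == {machine}}


def jobs_degraded(placement: Placement, machine: str) -> Set[str]:
    """Jobs that lose one node but can keep running on the others."""
    return {
        job
        for job, hosts in placement.items()
        if machine in hosts and len(hosts) > 1
    }


if __name__ == "__main__":
    placement: Placement = {
        "align-1": {"A"},
        "align-2": {"A", "B"},
        "index": {"C"},
    }
    print("must stop:", jobs_to_stop(placement, "A"))
    print("degraded:", jobs_degraded(placement, "A"))
```

The second question (the best maintenance schedule under the current load) is a scheduling problem and needs a load model in addition to the placement map.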

Autonomic behavior:
  • automatic rejuvenation (fast kill and reboot)
  • automatic recovery of complex distributed computations
  • add and remove nodes without stopping computations
  • low-level instrumentation for a run-time awareness model (off-line nodes, overloaded nodes, defective networks, load prediction, QoS, etc.)
  • instrumentation as a dynamic extension
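"Instrumentation as a dynamic extension" means that probes are attached to, and removed from, a running system rather than compiled in. The sketch below is a loose Python analogue using run-time method wrapping. It is not the dynamic AOP or reflective middleware machinery the talk refers to, and `attach_timer` and `Worker` are hypothetical names.

```python
import time
from typing import Any, Callable, List


def attach_timer(obj: Any, method_name: str, log: List[float]) -> Callable[[], None]:
    """Wrap obj.method_name so each call appends its duration to `log`.

    Returns a function that removes the probe again.
    """
    original = getattr(obj, method_name)

    def timed(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            log.append(time.perf_counter() - start)

    setattr(obj, method_name, timed)

    def detach() -> None:
        # Put the uninstrumented callable back in place.
        setattr(obj, method_name, original)

    return detach


class Worker:
    """Stand-in for a component of the running system."""

    def compare(self, a: str, b: str) -> bool:
        return a == b


if __name__ == "__main__":
    worker = Worker()
    durations: List[float] = []
    detach = attach_timer(worker, "compare", durations)
    worker.compare("x", "x")   # measured
    detach()
    worker.compare("x", "y")   # no longer measured
    print("measured calls:", len(durations))
```

The point of the design is that the probe can be added only while a problem is being investigated, so the normal execution path carries no monitoring overhead.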