Improving Separation of Concerns
Improving Separation of Concerns in
the Development of
Scientific Applications
S. Masoud Sadjadi, Juan Martinez,
Tatiana Soldo, Luis Atencio
Florida International University
Miami, Florida, U.S.A.
Rosa M. Badia and
Jorge Ejarque
Barcelona Supercomputing Center
Barcelona, Spain
SEKE-2007 July 10, 2007
Outline
Motivation
Background
GRID superscalar
TRAP/J
Case Study: Matmul
Transparent Grid Enablement
Results
Related Work
Conclusions
Future Work
Improving Separation of Concerns in the Development of Scientific Applications, by Masoud Sadjadi et al., SEKE 2007. 2
Motivation
High performance computing (HPC) is gaining popularity for running
complex scientific applications.
The current HPC programming standards (e.g., MPI, OpenMP, and
Grid computing toolkits) are not designed for scientists developing
their own scientific applications.
For example, the Weather Research and Forecasting (WRF) model is
200,000+ lines of Fortran 90 code that uses MPI and OpenMP.
This lack of separation of concerns has resulted in scientific
applications with rigid code, in which non-functional
concerns (e.g., the parallel code and the platform-specific code) are
entangled with functional concerns (i.e., the core business logic).
In effect, this tangled code hinders the maintenance and
evolution of these applications.
Transparent Grid Enablement: Goals
To separate the task of developing the business logic of a scientific
application from the task of improving its performance.
To increase the modularity of the code by separating
crosscutting parallel-programming code from the business
logic of the scientific application.
To develop an automatic (or semi-automatic) Grid enablement
process that requires no manual modifications to the business logic
of the scientific application and is hence “transparent” to the scientists
and their sequential code.
Transparent Grid Enablement (TGE) achieves these goals by integrating
two existing software tools, namely, TRAP/J and GRID superscalar.
Background: GRID superscalar
• Inspired by superscalar processors, GRID superscalar provides
an easy programming paradigm for developing parallel programs.
• Just as superscalar processors execute machine instructions out of
order and in parallel by keeping track of their dependencies,
GRID superscalar parallelizes the functions of a program written in
a high-level programming language such as Java.
• GRID superscalar enables the development of applications for a
computational Grid by hiding the details of job deployment, scheduling,
and dependencies, and it exploits the concurrency of these
applications at runtime.
• In TGE, the actual gridification of the application is obtained through
GRID superscalar, and the GRID superscalar calls are woven
transparently into the scientific application using TRAP/J.
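
The dependency-driven execution described on this slide can be pictured with a small sketch. It uses only the standard java.util.concurrent classes and is not the GRID superscalar API; the class name DependencySketch and its methods are hypothetical and exist only for illustration. Two independent calls may run in parallel, while the call that consumes their results waits for both.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only: this is NOT the GRID superscalar API.
final class DependencySketch {

    // Stand-in for a coarse-grained task that a runtime could ship to a worker.
    static int square(int x) { return x * x; }

    static int add(int x, int y) { return x + y; }

    // square(a) and square(b) share no data, so they may run in parallel;
    // add(...) consumes both results and therefore waits for both.
    static int sumOfSquares(int a, int b) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<Integer> left =
                CompletableFuture.supplyAsync(() -> square(a), pool);
            CompletableFuture<Integer> right =
                CompletableFuture.supplyAsync(() -> square(b), pool);
            return left.thenCombine(right, DependencySketch::add).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(3, 4)); // prints 25
    }
}
```

In this sketch the programmer states the dependencies explicitly; the point of GRID superscalar, as described above, is that the runtime does this bookkeeping on the programmer's behalf.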
Background: TS, TRAP, and TRAP/J
• Transparent Shaping is a programming model that enables
software adaptation through interception and redirection of
interactions among different parts of a software system, without the
need to manually modify the code.
• TRAP (Transparent Reflective Aspect Programming) is an extension of
Transparent Shaping for object-oriented programming languages.
• TRAP/J is a realization of TRAP in Java that enables static and
dynamic adaptation of Java programs at startup time and at runtime,
respectively.
• Other realizations: TRAP/C++, TRAP/BPEL, and TRAP.NET.
Background: TRAP/J
• Using TRAP/J, we can insert generic hooks (interceptors) at
important or sensitive points in a Java program.
• Later, we can use these hooks to intercept the flow of control
and redirect it to new code.
[Figure: flow of control. In the original application, invoking the original task executes it directly. In the adapt-ready application, the invocation passes through a TRAP/J hook that asks "Adapt?": if no, the original task executes; if yes, the new task executes.]
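
The "Adapt?" decision on this slide can be sketched by hand in a few lines of Java. TRAP/J generates its hooks automatically; the class HookSketch, its delegate field, and the adapt() method below are hypothetical names chosen for illustration and are not taken from TRAP/J.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical, hand-written illustration of a hook; not TRAP/J-generated code.
final class HookSketch {

    // No delegate is installed by default, so the original behavior is preserved.
    private static volatile IntUnaryOperator delegate = null;

    // Installs (or, with null, removes) the new task.
    static void adapt(IntUnaryOperator newTask) { delegate = newTask; }

    // The original task as the application developer wrote it.
    private static int originalTask(int x) { return x + 1; }

    // The hook: callers keep invoking task(), and the hook decides
    // whether control goes to the original task or to the new one.
    static int task(int x) {
        IntUnaryOperator d = delegate;
        return (d != null) ? d.applyAsInt(x) : originalTask(x);
    }

    public static void main(String[] args) {
        System.out.println(task(1)); // original path
        adapt(x -> x * 10);
        System.out.println(task(1)); // redirected path
    }
}
```

The caller's code is identical on both paths, which is what lets the adaptation stay invisible to the original application.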
Background: TRAP/J
• TRAP/J allows crosscutting concerns to be separated
from the functional logic not only at development time
but also at run time.
[Figure: application structure before and after TRAP/J]
Case Study: Matmul
Matmul is a simple matrix multiplication program written
in Java.
It uses a sequential matrix multiplication algorithm, which
computes C = A × B, where A, B, and C are matrices of size
N×N.
[Figure: A × B = C]
This typical “row by column” sequential algorithm
involves O(N^3) operations.
Transparent Grid Enablement
C00=C00+A00*B00
C00=C00+A01*B10
C01=C01+A00*B01
C01=C01+A01*B11
C10=C10+A10*B00
C10=C10+A11*B10
C11=C11+A10*B01
C11=C11+A11*B11
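
The eight block updates above are the 2×2 case of a blocked matrix product. The sketch below is an illustration under stated assumptions: it keeps the matrices in memory as double[][] arrays rather than in block files, and the names BlockedMatmul, multiplyAcc, and blocked are hypothetical. It shows that applying C[i][j] += A[i][k] * B[k][j] over all block indices yields the same result as the plain row-by-column algorithm.

```java
// Illustrative sketch: in-memory arrays stand in for the block files of the slides.
final class BlockedMatmul {

    // Plain row-by-column product, O(N^3) scalar operations.
    static double[][] naive(double[][] a, double[][] b) {
        int n = a.length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < n; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    // One block task: C[bi][bj] += A[bi][bk] * B[bk][bj], with blocks of size s x s.
    static void multiplyAcc(double[][] c, double[][] a, double[][] b,
                            int bi, int bj, int bk, int s) {
        for (int i = bi * s; i < (bi + 1) * s; i++)
            for (int j = bj * s; j < (bj + 1) * s; j++)
                for (int k = bk * s; k < (bk + 1) * s; k++)
                    c[i][j] += a[i][k] * b[k][j];
    }

    // Splits each N x N matrix into pieces x pieces blocks
    // (assumes N is divisible by pieces).
    static double[][] blocked(double[][] a, double[][] b, int pieces) {
        int n = a.length;
        int s = n / pieces;
        double[][] c = new double[n][n];
        for (int bi = 0; bi < pieces; bi++)
            for (int bj = 0; bj < pieces; bj++)
                for (int bk = 0; bk < pieces; bk++)
                    multiplyAcc(c, a, b, bi, bj, bk, s);
        return c;
    }
}
```

With pieces = 2, the three nested block loops issue exactly the eight updates listed on the slide, two per block of C.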
Transparent Grid Enablement
[Figure: the TGE process. The scientist supplies the sequential Matmul application and the HPC expert supplies the new parallel approach; TRAP/J and GRID superscalar combine them, through startup-time adaptation, into the adapt-ready/Grid-enabled application.]
Finer-grain parallelism: adaptive code for a maximum parallelism of 9.
Coarser-grain parallelism: adaptive code for a maximum parallelism of 4.
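
The two granularities follow from the block decomposition. The short sketch below assumes a pieces × pieces split of each matrix, as in the 2×2 example; the class and method names are hypothetical. Updates to the same block of C accumulate into it and must run one after another, so the number of C blocks bounds how many updates can be active at once.

```java
// Sketch of the task structure implied by a pieces x pieces block decomposition.
final class Granularity {

    // One block update per (i, j, k) triple.
    static int tasks(int pieces) { return pieces * pieces * pieces; }

    // Updates to the same C block are dependent, so at most one update
    // per C block can be active at a time.
    static int maxParallelism(int pieces) { return pieces * pieces; }

    // Number of dependent updates that accumulate into each C block.
    static int chainLength(int pieces) { return pieces; }

    public static void main(String[] args) {
        for (int p = 2; p <= 3; p++) {
            System.out.println(p + "x" + p + " blocks: " + tasks(p) + " tasks, "
                + "max parallelism " + maxParallelism(p));
        }
    }
}
```

A 2×2 split gives 8 tasks with a maximum parallelism of 4, and a 3×3 split gives 27 tasks with a maximum parallelism of 9, matching the coarser-grain and finer-grain cases on the slide.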
Transparent Grid Enablement
[Figure: TRAP/J turns the Matmul application into the adapt-ready Matmul application, whose MultiplyMatrices call is redirected to a delegate; the delegate issues multiply_acc() calls, declared in the Matmul IDL, through GRID superscalar.]
Transparent Grid Enablement
Sequential matrix multiplication
public static void main(String[] args)
{
    . . .
    Multiply_Matrices(size, args[1], args[2], args[3]);
}

// Reads A and B from disk, computes C = A * B, and writes C back to disk.
public static void Multiply_Matrices(int size, String fileC, String fileA,
                                     String fileB)
{
    Block A = new Block(fileA, size);
    Block B = new Block(fileB, size);
    Block C = new Block(size);
    C.Multiply(A, B);
    C.blockToDisk(fileC);
}

Matmul IDL
interface MATMUL
{
    void multiply_acc(inout File f3, in File f1, in File f2,
                      in int size);
};
Transparent Grid Enablement
Multiply Matrices delegate class
public class Matmul_Del implements DelegateInterface
{
    public static void Multiply_Matrices(int size, String fileC, String fileA, String fileB)
    {
        GSMaster.On();
        for (int i = 0; i < num_of_pieces; i++)
        {
            for (int j = 0; j < num_of_pieces; j++)
            {
                for (int k = 0; k < num_of_pieces; k++)
                {
                    // Method sent to each node in the grid
                    Matmul.multiply_acc(C[i][j], A[i][k], B[k][j], …);
                }
            }
        }
        GSMaster.Off();
        MergeFiles();
        // Merge files after computation …
    }
}
Results
Initially we got:
Matrix Size (N)   Sequential (ms)   Parallel with 4 blocks (ms)   Speedup (S/P)
144               674               61512                         0.010957212
288               2031              66096                         0.030728032
576               9527              69365                         0.137345924
1152              62269             172787                        0.360380121
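
The Speedup (S/P) column is the sequential time divided by the parallel time, so values below 1 mean the parallel run was slower. A minimal sketch that recomputes the column from the timings in the table (the class name Speedup is hypothetical):

```java
// Recomputes the Speedup (S/P) column from the timings in the table.
final class Speedup {

    // Sequential time over parallel time; values below 1 mean a slowdown.
    static double speedup(double sequentialMs, double parallelMs) {
        return sequentialMs / parallelMs;
    }

    public static void main(String[] args) {
        // Rows: matrix size N, sequential ms, parallel (4 blocks) ms.
        long[][] rows = {
            {144, 674, 61512},
            {288, 2031, 66096},
            {576, 9527, 69365},
            {1152, 62269, 172787},
        };
        for (long[] r : rows) {
            System.out.printf("N=%d speedup=%.6f%n", r[0], speedup(r[1], r[2]));
        }
    }
}
```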
Results
More disappointment!
Matrix Size (N)   Seq. (ms)   Par. w/ 4 blocks and   Par. w/ 4 blocks and   Par. w/ 9 blocks and
                              2 workers (ms)         4 workers (ms)         6 workers (ms)
144               5576        79221                  57656                  145331
288               14934       86259                  62013                  146744
576               44755       108107                 78096                  148240
1152              19318       176464                 133058                 176464
2304              79837       643925                 441891                 474215
Results: Optimization
Problem in GS: GS_Off()
  GS_Off() frees resources and deletes temporary files after the
  calls to the grid methods have finished.
  Because all the data is distributed across the nodes, this
  cleanup costs extra time.
  Solution: Avoid the cleanup! ;)
Optimizing the use of GridFTP
  TCP connections begin in slow start.
  GridFTP can be instructed to open more TCP connections, each with a
  larger starting window, to compensate for the slow start of
  TCP.
Results: Optimization
Using Network File System
<?xml version="1.0" encoding="UTF-8"?>
<project isSimple="yes" masterBandwidth="100000" masterBuildScript=""
    masterInstallDir="/home/lion-e/globus2/matmul_java_master"
    masterName="la-blade-01.cs.fiu.edu"
    masterSourceDir="/a/lion.cs.fiu.edu./disk/216/e/globus2/matmul_java_master"
    name="Matmul"
    workerBuildScript=""
    workerSourceDir="/a/lion.cs.fiu.edu./disk/216/e/globus2/matmul_java_worker">
  <disks>
    <disk name="_MasterDisk_"/>
    <disk name="_WorkingDisk_la-blade-02_cs_fiu_edu_"/>
    <disk name="_WorkingDisk_la-blade-03_cs_fiu_edu_"/>
  </disks>
  <directories>
    <directory disk="_MasterDisk_" isWorkingPath="yes" path="/home/lion-e/globus2/matmul_java_master"/>
  </directories>
  <workers>
    <worker Arch="" GFlops="1.0" LimitOfJobs="1" Mem="16" NCPUs="1"
        NetKbps="100000" OpSys="" Queue="none" Quota="0"
        deploymentStatus="deployed"
        installDir="/home/lion-e/globus2/matmul_java_worker" name="la-blade-02.cs.fiu.edu">
      <directories>
        <directory disk="_WorkingDisk_la-blade-02_cs_fiu_edu_"
            isWorkingPath="yes" path="/home/lion-e/globus2/matmul_java_worker"/>
      </directories>
    </worker>
<?xml version="1.0" encoding="UTF-8"?>
<project isSimple="yes" masterBandwidth="100000" masterBuildScript=""
    masterInstallDir="/home/lion-e/globus2/matmul_java_master"
    masterName="la-blade-01.cs.fiu.edu"
    masterSourceDir="/a/lion.cs.fiu.edu./disk/216/e/globus2/matmul_java_master"
    name="Matmul"
    workerBuildScript=""
    workerSourceDir="/a/lion.cs.fiu.edu./disk/216/e/globus2/matmul_java_worker">
  <disks>
    <disk name="_MasterDisk_"/>
    <disk name="_WorkingDisk_la-blade"/>
    <disk name="_sharedDisk_la-blade"/>
  </disks>
  <directories>
    <directory disk="_MasterDisk_" isWorkingPath="yes" path="/home/lion-e/globus2/matmul_java_master"/>
  </directories>
  <workers>
    <worker Arch="" GFlops="1.0" LimitOfJobs="1" Mem="16" NCPUs="1"
        NetKbps="100000" OpSys="" Queue="none" Quota="0"
        deploymentStatus="deployed"
        installDir="/home/lion-e/globus2/matmul_java_worker" name="la-blade-01.cs.fiu.edu">
      <directories>
        <directory disk="_WorkingDisk_la-blade" isWorkingPath="yes"
            path="/home/lion-e/globus2/matmul_java_worker"/>
        <directory path="shared_path" disk="_SharedDisk_la-blade"
            isWorkingPath="no"/>
      </directories>
    </worker>
Results
After the optimizations we got (speedup relative to sequential):
Matrix Size (N)   Sequential   Parallelism (4) -   Parallelism (4) -   Parallelism (9) -
                               2 workers           4 workers           6 workers
144               1            0.070385378         0.09671153          0.038367588
288               1            0.17312976          0.240820473         0.101769067
576               1            0.413987993         0.573076726         0.301909066
1152              1            1.094750204         1.451878128         1.094750204
Results
The speedup graph shows that our approach runs almost twice
as fast as the sequential version.
[Figure: "Algorithms Speedup" chart. Y-axis: Speedup (0 to 2). X-axis: Matrix Size (144, 288, 576, 1152, 2304). Series: Sequential; Parallelism (4) - 2 workers; Parallelism (4) - 4 workers; Parallelism (9) - 6 workers.]
Related Work
Satin
  A Java-based programming model for the Grid that allows
  explicit expression of divide-and-conquer parallelism.
  Satin uses marker interfaces to indicate that certain method
  invocations should be considered for potentially parallel
  (spawned) execution.
  Synchronization is also marked explicitly wherever the program
  must wait for the results of parallel method invocations.
Higher-Order Components (HOCs)
  A component-oriented approach based on a master-worker
  schema.
  HOCs express recurring patterns of parallelism that are
  provided to the user as program building blocks, pre-packaged
  with distributed implementations.
Related Work
ASSIST
  A programming environment aimed at providing parallel
  programmers with user-friendly, efficient, portable, and fast ways of
  implementing parallel applications.
  It includes a skeleton-based parallel programming language
  (ASSISTcl, where cl stands for coordination language) and a set of
  compiling tools and runtime libraries.
  The ensemble allows parallel programs written in ASSISTcl
  to run seamlessly on networks of workstations that support
  POSIX and ACE (the Adaptive Communication Environment, an
  external, open-source library used within the ASSISTcl
  runtime support).
ProActive
  A Java Grid middleware library for parallel, distributed, and
  multi-threaded computing.
  With a reduced set of simple primitives, ProActive provides a
  comprehensive API that simplifies the programming of Grid
  computing applications distributed on a local area network (LAN),
  on clusters of workstations, or on Internet Grids.
  ProActive consists only of standard Java classes and requires no
  changes to the Java Virtual Machine, no preprocessing, and no compiler
  modification, so programmers write standard Java code.
  Architected with interception and reflection, the library is itself
  extensible, making the system open to adaptations and
  optimizations.
  The current implementation focuses on the CoreGRID NoE
  specification of the Grid Component Model (GCM).
Related Work
None of the above-mentioned approaches provides an
explicit separation of concerns that identifies separate
tasks for scientist developers and HPC expert developers.
TGE can be extended to use these works instead of, or as a
complement to, GRID superscalar, and it can serve as an
enabler for interoperation among the above-mentioned
approaches.
Conclusions
In this work, we have presented an innovative approach to the
transparent Grid enablement of scientific applications.
We achieved this goal by combining two of our previously
developed toolkits, namely, GRID superscalar and TRAP/J.
Although this work is still at a preliminary stage, we were able to
show its effectiveness through a simple case study.
We acknowledge that in some applications it may not be easy (and
may even be impossible) to separate the parallel code
from the business logic of the application; however, many
existing applications can benefit from TGE.
Future Work
• We are applying TGE to a real case study, namely, hurricane
mitigation simulation and visualization applications.
• Currently, TGE supports static adaptation at startup time. We plan to
extend it to support dynamic adaptation.
• Currently, TGE supports self-configuration and self-optimization. We
plan to extend TGE to support other autonomic behaviors, including
self-healing and self-protection.
• We plan to extend the self-optimization and self-configuration of TGE
so that it can take advantage of more worker nodes becoming
available at runtime.
Acknowledgements
This work was supported in part by IBM (SUR and Student
Support awards), the National Science Foundation (grants
OCI-0636031, REU-0552555, and HRD-0317692), the Spanish
CICYT (contract TIN2004-07739-CO2-01), and the BSC-IBM
Master R&D Collaboration agreement. This work is part of the
Latin American Grid (LA Grid) project.
Questions/Comments
Contact Information:
S. Masoud Sadjadi
Autonomic Computing Research Lab. (ACRL)
School of Computing and Information Sciences (SCIS)
Florida International University (FIU)
TGE, TRAP/J, TRAP.NET, TRAP/BPEL, ACT/J, and other
Transparent Shaping tools can be downloaded from
http://acrl.cis.fiu.edu/