Present status of the ILMT software

Poels, J. and Bartczak, P.
ARC Liège, 19 February 2009
ILMT system components
database
storage
applications
The ideal solution, although out of reach (highly expensive)
One multiprocessor computer and RAID storage
[Diagram labels: Database (Oracle); Storage (RAID); Applications (Process 1, Process 2, Process 3; Thread 1 with Threads 1A and 1B; Thread 2 with Threads 2A and 2B)]
Monolithic systems
Advantages
• Multiple CPUs in one box (8, 16, …)
• Very low latency, because the processors share the RAM (16 GB, 32 GB, …) over a high-speed FSB (backbone memory bus)
• Easy setup of multiple virtual machines
• High-speed disk transactions
Drawbacks
• Costs out of reach (purchase, maintenance, …)
• Dedicated hardware
• Contract tied to a single company
ILMT cluster
[Diagram: one gateway PC for users, one database PC and six node PCs]
The clustering concept
Advantages
• A way to break up large programs and split them across a series of
workstations. Since 1997, terms like "Beowulf" have been in circulation
for cheap supercomputers running the Linux OS.
• Heterogeneous hardware brands
• A modern PC (node) can now accommodate up to 4 TB of SATA disks and
plenty of RAM
Drawbacks
• High latency due to network communication between nodes
• The data to process are not necessarily stored on the target node ->
TCP/IP transfers from a NAS (multi-TB network-attached storage)
• Beowulf solutions, although well suited to procedural languages like
C, are not well suited to OO (object-oriented) programming.
• A middleware layer is needed as an interface between software and
hardware to ease distributed computing in an OO environment.
Solutions
• High latency can be overcome by avoiding inter-node communication;
processes must therefore be independent. However, interaction with the
cluster scheduler running on the ORACLE database server is still necessary.
• The cluster scheduler (implemented in Oracle) should know the location
of the data to be used by a process and run that process on the closest
node if possible.
• The overhead time decreases as Ethernet performance increases
(100 Mb/s now; 1 Gb/s is becoming affordable, so the future is bright).
• As a rule of thumb, one should write the code such that the total overhead
time is much smaller than (<<) the process execution time.
[Diagram labels: PCs on the network LAN; Database, Storage, ORACLE; NODE, Application(s), POSIX threads, CORBA]
Common Object Request Broker Architecture (CORBA)
client side
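class testclient {
public:
    testclient();   // establishes the link with the CORBA server
    ~testclient();
    bool RequestTestSum();
private:
    // Object reference obtained in the constructor (declaration assumed; not shown on the slide)
    CORBA::Object_var obj1;
    // Resolved and narrowed CORBA object for proxy calls
    Data::test_var m_Data;
};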
bool testclient::RequestTestSum()
{
    CORBA::Long num1 = 4, num2 = 5, num3 = 6;
    // Narrow the generic object reference to the Data::test interface
    m_Data = Data::test::_narrow(obj1.in());
    if (CORBA::is_nil(m_Data.in())) return false;
    // This is the CORBA call which is executed remotely on the server
    CORBA::Long sum = m_Data->testsum(num1, num2, num3);
    std::cout << "sum=" << sum << std::endl;
    return true;
}
int main(int argc, char** argv)
{
    // Constructor establishes the link with the CORBA server.
    testclient testcl;
    return testcl.RequestTestSum() ? 0 : 1;
}
Common Object Request Broker Architecture (CORBA)
server side
class test_i : public POA_Data::test,
               public PortableServer::RefCountServantBase
{
public:
    test_i();
    virtual ~test_i();
    virtual CORBA::Long testsum(CORBA::Long num1,
                                CORBA::Long num2,
                                CORBA::Long num3);
};
test_i::test_i() { }
test_i::~test_i() { }
// Servant implementation of the remote call: returns the sum of the three arguments
CORBA::Long test_i::testsum(CORBA::Long num1,
                            CORBA::Long num2,
                            CORBA::Long num3)
{
    CORBA::Long result = num1 + num2 + num3;
    return result;
}
Pipeline(s)
• A pipeline (C++, PyRAF) is an ordered sequence of tasks or processes where
the output of each becomes the input to the next in the sequence, the final
result being images or alphanumeric data:
• Preliminary astrometry and photometry on incoming ILMT data (daily)
• Image subtraction, aimed at measuring the variability of point-like sources
blended with extended sources and at the early discovery of supernovae (daily)
• Build and refine reference images (monthly)
• Extract potential periods from the photometry of point-like sources
(Oracle SQL database)
• Etc.
• As the project is international, the cluster must accommodate
independent pipelines developed by collaborators (Canada, India)
• For cluster performance reasons, a pipeline must run on one and only
one node, which returns its results to the cluster master
[Diagram labels: new observation data; database server; welcome daemon; storage; JOBS list; JOBS watchdog; results; NODE computers running Pipeline 1, Pipeline 2 and Pipeline 3, each shown as a column of tasks (flatfield, darkframe, etc.)]