Heterogeneous Parallelization for RNA Structure Comparison

Download Report

Transcript Heterogeneous Parallelization for RNA Structure Comparison

Heterogeneous Parallelization for RNA Structure Comparison
Eric Snow, Eric Aubanel, and Patricia Evans
University of New Brunswick Faculty of Computer Science
Abstract
Parallel Algorithm Design
Parallel and grid computing have had a significant impact on the feasibility of running many time- and space-intensive
applications. However, the additional computing power supplied is not without additional costs; development of algorithms
and applications that run efficiently in parallel is generally more complex than traditional sequential development. In
particular, consideration of the underlying heterogeneity of the computing platform can be very difficult when using
multicomputers or grid computing, but can lead to significantly more efficient parallel applications.
Overall Design
•Modified master-slave paradigm.
•Master responsible for load balancing, slaves responsible for task creation and evaluation.
•New tasks stored in each slave’s task list prior to evaluation.
In this work, we analyze a sequential dynamic programming algorithm for the comparison of RNA structures including those
with pseudoknots, and come up with a corresponding parallel design. Particular attention is paid to the fact that the
underlying platform which will be used to test the algorithm will be heterogeneous.
Data Structures
•4-d and 8-d tables distributed among slaves.
•Due to unpredictable “shape”, a best guess initial division of tables takes place, and is improved during load balancing.
•Division takes place along first dimension since its limit is always n (the length of the first RNA chain), and a single array
can be used to hold data structure mapping information on the master.
Finding Common RNA Pseudoknot Structures
Purpose of Sequential Algorithm: Given two RNA chains and their chemical bonds, find the maximum common ordered
substructure (MCOS) formed by the bonds, such that the order of the bonds is maintained.
Master
RNA Chain 1
Data Structure-to-Processor
Mapping Array
RNA Chain 2
1
1
1
2
2
2
2
3
3
3
3
Maximum Common
Ordered Substructure
Example of finding the MCOS of a pair of RNA chains (matched arcs share the same colour, unmatched are black).
Slave 1
Slave 2
Slave 3
While the general form of the maximum common ordered substructure problem is known to be NP-complete, by restricting
the allowable bonds to only those consistent with actual RNA (including pseudoknots and pseudoknot-like structures), a
working algorithm was created that had worst case O(n10) time and O(n8) space requirements [1].
Implementation
•Recursive dynamic programming using memoization.
•One 4- and one 8-dimensional dynamic programming table.
•Selective allocation/calculation performed, so that only values consistent with the input structures would be calculated.
•Only an average of 10-14 of the worst-case space used.
Sequential Algorithm Analysis
Dynamic programming dependency analysis showed that calculations of any partial result in the dynamic programming
tables has the following characteristics:
•Polyadic - Each calculation must call on at least 2 other partial results.
•Non-serial - Each calculation can require calls to partial results 1 or more levels away in the recursive graph of the problem.
•Irregular Dependency - The number of calls required in each calculation is not static; calculations in the 4-dimensional table
can require between 2 and 4 calls, while calculations in the 8-dimensional table require between 4 and 13 calls.
Local Index
0
1
2
0
1
2
3
0
1
2
3
Global Index
0
1
2
3
4
5
6
7
8
9
10
Example of initial data structure distribution for 4 processors with n = 11.
Load Balancing
•Dynamic load balancing used, based on excess data structure size or computational demand
•Master responsible for load-balancing decisions based on periodic updates from slaves.
•Support for heterogeneous platforms is encapsulated in load balancing procedures; master node must have information
about the underlying platform, including network characteristics and processors’ available memory.
Communication
•Master-slave messages are used only to exchange load balancing information (generally very small in size), and must be
passed synchronously to avoid out-of-date load balancing information being used by slaves.
•Slave-slave communication can take place on two fronts:
–Task evaluation requests and results: short, asynchronous messages.
–Load balancing communications: large, synchronous messages.
Master
Result
Performance/Data Structure Information
Load Balancing Updates
Example of a recursive graph of a point in a non-serial, polyadic dynamic programming problem with irregular dependencies (only the final two levels’
dependencies are completely shown).
The unpredictability of the recursions based on the three characteristics above make this problem among the most difficult
type to parallelize, due to the unpredictability of communication.
Data Structures
The major data structures are the 4- and 8-dimensional tables. Due to the selective allocation used, the “shape” of the tables
can be very odd, with only two dimensions’ sizes being predictable ahead of time (in the case of the 8-dimensional table, this
leaves 6 dimensions that must have their sizes calculated each time a new hyperplane is allocated). The image below
illustrates the odd shape that can take place in the 4-dimensional table in the 2nd and 4th dimensions.
m=2
n=3
Task Evaluation Requests/Results
Slave 1
Slave 2
Table/Task Data (Load Balancing)
Example of communication patterns for 3 processors, where green arrows indicate asynchronous communication and red arrows indicate synchronous
communication.
Conclusions and Remaining Work
In this work we have described the design for a parallel version of the MCOS algorithm for RNA chains with pseudoknots
and pseudoknot-like structures.
Remaining work on this project includes the completion of the parallel implementation, and testing to ensure both
correctness and efficiency. In addition, testing on the heterogeneous platforms available through ACEnet (www.ace-net.ca)
will take lace to test the effectiveness of the heterogeneous platform-specific features.
References
[1] P. Evans. Finding Common RNA Pseudoknot Structures in Polynomial Time. Combinatorial Pattern Matching, Lecture
Notes in Computer Science. 2006, pp. 223-232.