FTELMIN Network: A new MIN with Fault-Tolerance

Download Report

Transcript FTELMIN Network: A new MIN with Fault-Tolerance

FTELMIN Network: A new MIN
with Fault-Tolerance
Characteristics
Presenter:陳韋仁
Data:2005/06/23
Outline
•
•
•
•
•
•
Abstract
Introduce
Structure of Baseline MIN
Fault Tolerant MINs
Routing in FTELMIN
Fault-Tolerance of FTELMIN
Abstract
•
The purpose of adding redundancy to unique path MINs is to increase their
reliablity
•
In this paper, a fault-tolerant multistage interconnection network (MIN), a 2dilated baseline network is analyzed for reliability and performance
improvements
•
This MIN provides 2 paths between any pair of source and destination,
where n is the number of stages in the MIN
n
Introduce
•
With the progress of research on parallel processing, interconnection
networks provide the communications needed in a parallel processing
system between processors, between processors or memory modules
benefits, enhancing the performance of the system, requiring lower
hardware overhead
•
If no extra SEs and links are provided for MIN, even one fault disable us
from establishment some paths between input and output terminals
•
The modest cost of unique path MINs (a single path between any source
and destination), make them attractive for large multiprocessor systems, but
then lack of fault tolerance is a major drawback
Introduce [contd.]
•
To mitigate this problem and to provide multiple paths for a sourcedestination pair so that alternate paths can be used in case of faults, many
methods are used
(1)replication of the entire network
(2)adding additional links
(3)addition of extra stages
(4)partial addition of links/stages
(5)dulication of switches/links at various stages
•
To design a fault tolerant MIN general goals followed are high reliability,
good performance even in the presence of faults, low cost and simple
control
Introduce [contd.]
•
In this paper a new class of regular fault tolerant multistage interconnect
network named FTELMIN (Fault Tolerant Extra link Multistage
interconnection network), which is a modification of the link connection
patterns of 2-dilated MIN
•
The 2-dilated ELMIN suffered from the problem of healthy SEs to be unused
because of the presence of adjacent SE failures in the stage just before it
•
New MIN’s SE
SE
SE
Structure of Baseline MIN
•
Baseline Network with 8
0
0
1
1
2
3
2
3
4
4
5
5
6
7
6
7
Structure of Baseline MIN [contd.]
•
In the Baseline network, the routing of a message from a given source to a
given destination is based on the destination address
110
•
0
0
1
1
2
3
2
3
4
4
5
5
6
7
6
7
This is termed as Self-routing or Destination-routing
Structure of Baseline MIN [contd.]
•
Blocking can occur when to establish the path for the requests generated
some of the paths may pass through an identical output of SE according to
the binary numbers assigned to the destination
000 0
1
0
001 2
3
2
3
4
4
5
5
6
7
6
7
1
Fault Tolerant MINs
•
Previously Proposed MINs
►A 2-dilated MIN as show in below, it is superior to a conventional MIN with 2╳2
SEs in throughput
►It tolerants at most one link fault in any path established from an input terminal
to
an output one
►A SE fault, however, invalidates this fault tolerant
Fault Tolerant MINs [contd.]
•
►A 2-dilated ELMIN is show in below
►A path is always established between any input terminal and any output one even
if at most four SE faults occur in the intermediate stages
Fault Tolerant MINs [contd.]
•
•
•
Construction procedure of FTELMIN network
The method for determining link connection patterns of MIN with N input
and N output terminals is summarized as follows
[Step1]Beginning at the highest pair of IN1 and IN2 (or Out1 and Out2) in
any stage, number the pairs from 0 to N-1. Note that an identical
number assigned to the pair of In1 and In2 (or Out1 and Out2) of SE
is different from that to the lower pair. GO TO STEP2
Fault Tolerant MINs [contd.]
•
•
[Step2]For In1s and Out1s of any SE, employ the link connection patterns
similar to those of SEs in the baseline network. GO TO STEP3
Fault Tolerant MINs [contd.]
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
[Step3]For the i-th and (i+1)-th stages, convert the assigned decimal
numbers to binary ones.
► Assume that In1, having
of
SE in the (i+1)-th stage is connected with
Out1 having
of SE in the i-th
stage where 1≤i≤n-2 and
►Then connection Out2 having
of the latter SE with In2 having
of SE in the (i+1)-th stage where X  1  X where X={0,1}
►For the links between Out2s in the (n-1)-th stages and In2s in the
n-th stage, employ the connection patterns similar to those in the
baseline network.
►Finally connect In2 (Out2) having
in the first stage (or (n+1)-th stage) with the input (output) terminal
having
Fault Tolerant MINs [contd.]
•
•
A FTELMIN, with N=8
[Step1]
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
Fault Tolerant MINs [contd.]
•
[Step3] i=2 ,
X X X
1
0
2

X X X
2
1
0
000
000
001
001
010
010
011
011
100
100
101
101
110
110
111
111
Fault Tolerant MINs [contd.]
X X X
1
0
2

X X X
2
1
0
000
000
001
001
010
010
011
011
100
100
101
101
110
110
111
111
Fault Tolerant MINs [contd.]
•
For the link between Out2s in the(n-1)-th and In2s in the n-th stage
000
000
001
001
010
010
011
011
100
100
101
101
110
110
111
111
Fault Tolerant MINs [contd.]
•
Final(IN2s and Out2s)
000
000
001
001
010
010
011
011
100
100
101
101
110
110
111
111
Fault Tolerant MINs [contd.]
•
Final
000
000
001
001
010
010
011
011
100
100
101
101
110
110
111
111
Fault Tolerant MINs [contd.]
•
FTLEMIN
Routing in FTELMIN
•
•
If Out1 in the n-th stage is used as part of the path established form any
input terminal to output one with  X n1 X n2 ... X 1 X 0 
If Out2 in the n-th stage is used as part of the path this output terminal, the
routing is executed according to X n1 X n2 ... X 1 X 0 
000
000
000
000
001
001
001
001
010
010
010
010
011
011
011
011
100
100
100
100
101
101
101
101
110
110
110
110
111
111
111
111
Fault-Tolerance of FTELMIN
•
The proposed FTELMIN network satisfies tolerance criteria as it can operate even in
presence of certain faults
•
The following theorems characterize the faults that can be tolerated in the FTELMIN
•
•
Theorem 1
In an FTELMIN network,
if the faults occur such that at most switches affected in 1 to
m
(n+1)/2 stages is 2m where m varies from 1 to (n+1)/2 and the stages after the
(n+1)/2 stages is 2 where m varies from (n+1)/2 to 1 if the number of stages is
odd
Fault-Tolerance of FTELMIN
[contd.]
•
•
Theorem 2
If the number
of stages is even then at most switches affected in 1 to n/2
m
stages is 2 where m varies from 1 to n/2 and the stages after the n/2
m
stages is 2 where m varies from (n/2)-1 to 1
•
•
Theorem 3
FTELMIN is single switch fault tolerant in the first and the last stagess