投影片 1 - CSIE -NCKU
Download
Report
Transcript 投影片 1 - CSIE -NCKU
Author : N. Sertac Artan, Haowei Yuan, and H. Jonathan Chao
Publisher/Conf : IEEE GLOBECOM 2008
Speaker : Chen Deyu
Data : 2009.10.14
Network applications such as IP traceback , route lookup, TCP
flow state monitoring, and malware detection often require large
data storage resources, fast queries, and frequent updates.
Hash tables are traditional data structures that allow large
amounts of data to be stored, queried, and updated in a spaceand time-efficient manner on average.
However, in the worst case, which is critical for network
applications, hash tables perform poorly. This poor performance
is due to hash collisions.
To reduce the impact of hash collisions on worst-case
performance, the hash table can be modified to store multiple
keys, say up to G max key in a single hash bucket.
Unfortunately, even when multiple keys are allowed to be stored
in one bucket, occasional overflows cannot be prevented.
(a) Co-processor architecture
(b) The LB hash table reduces number of
overflows compared to the naïve hash
table
The hash co-processor consists of
four parts:
(1) Group Counts Table(GCT)
(2) Bin-Table (BT)
(3) Hash Function
(4) CAMs.
BothGCT
GCTconsists
and BT are
The
of gdivided
entries,
into equal-sized
segments.
Let
where
each entry,
GCTi shows
the number of
in a
the
of groups
keys currently
in
single
segment
of GCT be Gs
the
group
i
and the number of bins in a
single segment of BT be s.
Insertion (Include three algorithm)
(1) Non-Search Algorithm (NSA)
(2) Single-Search Algorithm (SSA)
(3) Double-Search Algorithm (DSA)
Query
Deletion
K1
2
0
14
GCTi < Gmax ?
overflow
CAM
K2
43
54
17
16
13
11
15
gmin
K3
b39
2
43
98
8
16
15
13
15
gmin
If in the system, the queried key is either in the hash table
or in the CAMs. If the key is found in the CAMs, it is
returned immediately. Otherwise, the group of the bin
where the key is hashed is read as a single burst from the
hash table.
So, the query operation takes at most one burst access.
To delete a key, the key is first queried. If the key is in
the CAMs, it is removed from the CAMs and deletion
is completed. If this key is not in the CAMs, the group to
which this key belongs is read as a single burst from the
off-chip structure.
The key is deleted and the remaining keys are written back
to the off-chip structure in another burst. So, the delete
operation takes at most two burst accesses.
On-chip Memory Consumption
(1) One bin requires: M b log 2 (Gmax 1) log 2 (Gs )
(2) Each entry in GCT requires: M g log 2 (Gmax 1)
(3) CAM storage: M c n K s log 2 ( g )
(4) On-chip memory requirement per key:
m Mb g M g 2 Mc
C l
• n is the number of keys stored in CAMs and Ks denotes key size.
• C is the hash table capacity.
• l is the ratio between the keys stored in the system(i.e. in hash table and
CAMs) at a given time and hash table capacity.
Time Complexity for On-chip Search
The main contributor to on-chip time complexity is the search times for the
SSA and DSA. The group search used by both algorithms takes time
proportional to Gs, since the search is limited to a single segment and Gs −1
groups need to be searched. Note that for a hardware implementation, since
Gs is small, the group search can be done in parallel using a simple priority
encoder.
External Memory Access (Per key on average)
N n
(1) NSA: AavgA
N
(2) SSA、DSA: AavgB
( N W ) 1 (W n) 3
2W 3N
1
N
N
Where W is the number of overflows in a give time period.
N is the total number of keys inserted into the system.
C=65,536 keys Gmax=8 l=0.8
Performance comparison for NSA, SSA, and DSA, where GSI
and BSI stands for group and bin search per insertion, respectively.
AI is external memory access for insert.