Reflections on the Development of A/P Nets

ECE 526 – Network Processing
Systems Design
System Implementation Principles I
Varghese Chapter 3
1
Outline
• What is network algorithmics?
• Clarifying example
• Implementation principles
─ Reflecting on what we have learned
2
Network Algorithmics
• Different from network algorithms
• Network Algorithmics
─ Interdisciplinary system approach to streamlining network
implementation
• Using a design problem to clarify difference
3
Security Forensics
• Identifying a network intrusion (e.g., port scanning) requires a
period of time (e.g., 10 seconds)
• A suspected intrusion is recorded as flow state in a
suspicion table
• Flow state is updated by checking each incoming packet
from the intruder
• Once the suspicion table deems a suspicious flow bad,
evidence (all packets from the intruder) is reported to the
forensic system
4
Packet Buffering
• A packet buffer is needed to store all packets (100,000) arriving during
the 10 s window, as evidence if needed later
• The buffer is bounded by inserting new packets at the head of the queue
and discarding at the tail if the buffer is full
5
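The bounded buffer above can be sketched with a fixed-capacity deque; the buffer size and packet IDs here are illustrative (the slide's actual buffer holds 100,000 packets):

```python
from collections import deque

# A minimal sketch of the bounded evidence buffer described above.
# BUFFER_SIZE is illustrative; the slide uses 100,000 packets (~10 s of traffic).
BUFFER_SIZE = 5

# A deque with maxlen: appendleft() at the head silently discards
# the oldest packet at the tail once the buffer is full.
buffer = deque(maxlen=BUFFER_SIZE)

for pkt_id in range(8):          # 8 arrivals into a 5-slot buffer
    buffer.appendleft(pkt_id)

print(list(buffer))              # newest first; the oldest arrivals were discarded
```

Note that `maxlen` gives the insert-at-head / discard-at-tail policy for free; a hardware implementation would use a circular buffer with head and tail pointers instead.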
Evidence Collection
• Collecting all packets belonging to the flow deemed an attack
from the 100,000-packet queue
• The collection has to be performed fast enough that new
incoming packets do not overwrite the evidence
6
Hash Table -- Algorithm
• Real problem: find all packets belonging to flow F quickly
• Store pointers to the packets of each flow in a linked list
• A pointer to the head of each linked list is stored in a hash table
• Advantage: quickly find the malicious flow
• Disadvantage:
─ Maintain a hash table
─ Update the linked list of each flow
─ Read out the packets of the malicious flow
7
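The hash-table scheme above can be sketched as follows; flow IDs and payloads are made up for illustration, and Python lists stand in for the per-flow linked lists of packet pointers:

```python
# Sketch of the hash-table approach: for each buffered packet, append a
# pointer (here, the packet's buffer index) to its flow's list.
flow_index = {}      # hash table: flow ID -> list of packet indices

packets = [("F1", b"a"), ("F2", b"b"), ("F1", b"c"),
           ("F3", b"d"), ("F1", b"e")]

for idx, (flow, _payload) in enumerate(packets):
    flow_index.setdefault(flow, []).append(idx)

# Once flow F1 is deemed malicious, its packets are found with one
# hash lookup plus a walk down its list:
evidence = [packets[i] for i in flow_index["F1"]]
```

This makes the read-out fast, but at the cost the slide lists: every incoming packet pays for a hash-table update on the fast path.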
Algorithmics Approach
• Once attack flow F is identified, don’t attempt to
immediately identify all packets of flow F
• Instead, identify each packet individually as it reaches
the end of the packet queue
• Shift computation in time – evaluate lazily
8
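The lazy approach can be sketched like this (names and the tiny buffer size are illustrative): flagging a flow costs nothing up front, and each packet's flow is checked only at the moment it is about to be evicted from the tail:

```python
from collections import deque

BUFFER_SIZE = 3
buffer = deque()
bad_flows = {"F1"}                   # flows already deemed malicious
evidence = []                        # packets reported to the forensic system

def insert(packet):
    flow, _payload = packet
    if len(buffer) == BUFFER_SIZE:
        old = buffer.pop()           # evict at the tail...
        if old[0] in bad_flows:      # ...and only now check its flow
            evidence.append(old)
    buffer.appendleft(packet)

for pkt in [("F1", 1), ("F2", 2), ("F1", 3), ("F3", 4), ("F2", 5)]:
    insert(pkt)
```

No per-flow index is maintained; the only per-packet work is one set-membership test, paid at eviction time rather than at identification time.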
Taxonomy of Principles
• P1-P5: System-oriented Principles
─ These recognize/leverage the fact that a system is made up of
components
─ Basic idea: move the problem to somebody else’s subsystem
• P6-P10: Improve efficiency without destroying modularity
─ “Pushing the envelope” of module specifications
─ Basic engineering: system should satisfy spec but not do more
• P11-P15: Local optimization techniques
─ Speeding up a key routine
─ Apply these after you have looked at the big picture
9
P1: Avoid Obvious Waste
• Key Concept: look for ways to avoid doing something, or
to avoid doing it multiple times
• Example: copying in protocol stacks
─ In the old days, copying was not a bottleneck
─ But transmission & processor speeds increased much faster
than memory speeds
─ Eventually multiple copies became a bottleneck
• Today, reading/writing from/to memory is the slowest operation on a
packet
• “Zero-copy” protocol stacks never copy packet data once it is stored
in memory by the NIC
10
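A loose Python analogue of the copy-avoidance idea: slicing a `bytes` object copies the data, while a `memoryview` exposes the header and payload without copying (the 4-byte header split is illustrative):

```python
# Sketch of P1 in Python terms: avoid an obvious copy.
packet = bytes(range(20))            # pretend packet: 4-byte header + payload

view = memoryview(packet)
header, payload = view[:4], view[4:]   # no bytes are copied here

# The views still expose the same data as copying slices would:
assert header.tobytes() == packet[:4]
```

Real zero-copy stacks apply the same principle at a different layer, handing the NIC's buffer through the stack by reference instead of copying it between kernel and user space.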
P2: Shift Computation in Time
• Key Concept: Move expensive operations from where
time is scarce to where it is more plentiful
─ P2a: Precompute (= shift earlier in time)
• Example: computing a function by table lookup
─ P2b: Evaluate Lazily (= shift later in time)
• Example: network forensics
─ P2c: Share Expenses (= collect things in time)
• Example: batch processing
11
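P2a (precompute) in its simplest form is a function turned into a table lookup. A common illustration, not from the slides, is a 256-entry popcount table built once so that counting set bits per byte becomes a single index:

```python
# Sketch of P2a: shift the bit-counting work earlier in time by
# precomputing the answer for every possible byte value.
POPCOUNT = [bin(b).count("1") for b in range(256)]   # built once, up front

def bits_set(data: bytes) -> int:
    # Per-packet work is now one table lookup per byte, no bit loops.
    return sum(POPCOUNT[b] for b in data)

print(bits_set(b"\xff\x0f"))   # 8 + 4 = 12
```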
P3: Relax Subsystem Requirements
• Key Concept: make one subsystem’s job easier by
relaxing the specification (possibly at the expense of
another subsystem’s job getting slightly harder)
─ P3a: Trade certainty for time (= use probabilistic solutions)
• Example: random sampling
─ P3b: Trade accuracy for time (= use approximate solutions)
• Example: hashing, Bloom filters
─ P3c: Shift computation in space (= let someone else do it)
• Example: fast path/slow path
12
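P3b's Bloom filter trades accuracy for time: membership answers can be falsely positive but never falsely negative. A minimal sketch, with the array size, hash count, and hash construction chosen for illustration rather than tuned:

```python
import hashlib

M, K = 64, 3                     # bit-array size, number of hash functions
bits = 0                         # use a Python int as the bit array

def _positions(item: str):
    # Derive K positions by salting a cryptographic hash (illustrative).
    for i in range(K):
        h = hashlib.sha256(f"{i}:{item}".encode()).digest()
        yield int.from_bytes(h[:4], "big") % M

def add(item: str):
    global bits
    for p in _positions(item):
        bits |= 1 << p

def might_contain(item: str) -> bool:
    # True may be a false positive; False is always correct.
    return all(bits >> p & 1 for p in _positions(item))

add("flow-A")
add("flow-B")
```

The filter stores K bits per item instead of the item itself, which is why it answers in constant time and small space at the cost of occasional false positives.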
P4: Leverage Off-System Components
• Key Concept: design bottom-up, use hardware
─ P4a: Exploit locality (= exploit caches)
• Note: not always useful (cf. IP forwarding lookups)
─ P4b: Trade Memory for Speed
• Note: this can be interpreted two ways
– Use more memory to avoid computation (cf P12)
» Example: table lookup
– Use less memory to make data structures fit in cache
» Example: data compression
─ P4c: Exploit hardware features
– Example: Incremental checksum computation
• Examples
─ Onboard address recognition & filtering: shift “processing”
duty from CPU to NIC
13
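The P4c example, incremental checksum computation, follows the update rule of RFC 1624: when one 16-bit header word m changes to m', the new checksum is HC' = ~(~HC + ~m + m') in one's-complement arithmetic, with no need to re-sum the whole header. A sketch (the header words are an illustrative IPv4-style fragment):

```python
def ones_complement_sum(*words):
    s = 0
    for w in words:
        s += w
        s = (s & 0xFFFF) + (s >> 16)   # fold the carry back in
    return s

def checksum(words):
    # Full recomputation over all header words.
    return ~ones_complement_sum(*words) & 0xFFFF

def update_checksum(old_ck, old_word, new_word):
    # Incremental update: HC' = ~(~HC + ~m + m')
    return ~ones_complement_sum(~old_ck & 0xFFFF,
                                ~old_word & 0xFFFF,
                                new_word) & 0xFFFF

header = [0x4500, 0x0054, 0x1C46, 0x4000, 0x4001]
ck = checksum(header)
header[4] = 0x3F01                     # e.g., TTL field decremented
assert update_checksum(ck, 0x4001, 0x3F01) == checksum(header)
```

This is why a router can decrement TTL and patch the checksum in O(1) instead of touching every header word.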
P5: Add Hardware to Improve Performance
• Key Concept: Hardware is inherently parallel, can be very fast,
and is cheap in volume; consider adding hardware to perform
operations that must be done often and are expensive in software
─ P5a: Use memory interleaving, pipelining (= parallelism)
─ P5b: Use Wide-word parallelism (save memory accesses)
─ P5c: Combine SRAM and DRAM (keep the low-order bits of each
counter in SRAM to support a large number of counters)
14
P6: Replace inefficient general routines with
efficient specialized ones
• Key Concept: General-purpose routines cannot
leverage problem-specific knowledge that can improve
performance
• Example: NAT using forward and reverse tables
15
P7: Avoid Unnecessary Generality
• Key Concept: Do not include features or handle cases
that are not needed.
─ "When in doubt, leave it out"
• Example: RISC, microengine
16
Reminder
• Read Varghese Chapter 3
• NO Class, Wed. 11/5
─ on-line quiz
17