COE 590 Special Topics: Parallel Architectures
Programming Multi-Core Processors based Embedded Systems
A Hands-On Experience on Cavium Octeon based Platforms
Lecture 4: Layering & Deep Packet Inspection
Course Outline
Introduction
Multi-threading on multi-core processors
Applications for multi-core processors
Application layer computing on multi-core
Application layer protocols
Deep packet inspection and content processing
Performance measurement and tuning
Copyright © 2009
4-2
KICS, UET
Agenda for Today
Application layer protocols
Application layering
Layer 4-7 applications
Header and deep packet inspection
Performance considerations
Case study: deep packet inspection
Case study: deep message inspection
Multi-Core Systems for Application Layer
Internet architecture typically consists of
Network core
Network edge and access network
End-point hosts
Compute capabilities within networks
Historically, network does simple operations
Compute intensive tasks left to end hosts
Examples: packet re-assembly, stateful protocol processing, content filtering and matching, etc.
However, “intelligence” is moving to networks from end hosts with increasing processor capabilities
Multi-Core Systems for Application Layer (2)
Increasing processing capabilities in networks
Moore’s law in 90’s: increasing clock frequencies
Moore’s law in 00’s: increasing number of cores
Additional capabilities are being used for:
Access control, filtering, caching, load balancing, packet header and content inspection, and intrusion detection
In addition to regular switching and routing
Enabling new computing paradigms:
Network as a platform
Software as a Service (SaaS)
Cloud Computing
Multi-Core Systems for Application Layer (3)
Networks are leveraging multi-core systems
One or more cores dedicated to routing/switching
Other cores for application layer processing
Performance challenges:
Layer 2 and 3 devices operate at link speeds
Users expect the same performance even with application layer processing
Multiple challenges to deliver expected performance:
Processor-memory speed disparities
Processor-I/O speed disparities
OS and system software level overheads
Thread synchronization overheads
We address these topics today by:
Understanding the nature of application layer protocols
Specific examples: layer 4-7 filtering and deep packet inspection
Discussion of performance considerations at application layer
Application Layering
Distributed Operating Systems
By Andrew S. Tanenbaum
Application Layering
Increasing number of distributed applications
Multiple end-points (clients, servers, databases, etc.)
At geographically dispersed locations
Connected through public or private (IP) networks
Application level issues:
Hide low level networking details: interface
Application level abstraction: protocols
Convenient to deal with at application layer
No need to change lower level protocols (layer 1-4)
Layered Protocols
Message-passing requires agreements at different levels
Signal voltage levels to represent 0 or 1 bits
Detection of the last bit by the receiver
How a receiver can detect message errors or loss and what it should do in that case
How numbers and strings should be represented
Open System Interconnection (OSI) reference model
Identifies various levels and gives them standard names
Points out the functions that belong to a particular level
Designed to allow open systems to communicate through standard rules (protocols)
Protocols govern message format, content, and their meanings
A group of computers must agree on protocols to communicate
The OSI Reference Model
Communication divided into 7 layers:
Each layer deals with a specific aspect
Provides an interface to the one above
Interface defines its services
Message-Passing Using OSI Model
Process A on machine #1 needs to send a message to process B on machine #2
Message is built up through layers before it goes to the network
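The layer-by-layer build-up can be illustrated with a minimal Python sketch. The bracketed text headers, the LAYERS list, and the function names encapsulate and decapsulate are hypothetical, chosen only for illustration. Real protocol headers are binary structures, and the data link layer usually adds a trailer as well.

```python
# Hypothetical layer names, listed from the top of the stack downwards.
# The physical layer is omitted because it transmits bits and adds no header.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data-link"]


def encapsulate(payload: bytes, layers: list[str]) -> bytes:
    """Sender side: each layer prepends its own header on the way down."""
    message = payload
    for name in layers:
        message = f"[{name}]".encode() + message
    return message


def decapsulate(message: bytes, layers: list[str]) -> bytes:
    """Receiver side: each layer strips its own header on the way up."""
    for name in reversed(layers):
        header = f"[{name}]".encode()
        if not message.startswith(header):
            raise ValueError(f"missing {name} header")
        message = message[len(header):]
    return message


if __name__ == "__main__":
    wire = encapsulate(b"hello B", LAYERS)
    print(wire)                       # outermost header is the data-link one
    print(decapsulate(wire, LAYERS))  # original payload recovered
```

The outermost header belongs to the lowest layer, which is why a device in the network core can forward the message after reading only the first few headers.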
Protocols at various Levels
Several protocols at different layer levels
Protocol suite or stack used in a particular system
Implementation different from reference model itself
Three lower level protocols in OSI model:
Physical layer
Data link layer
Network layer
Transport protocols
Higher level protocols
Session and presentation layer protocols
Application protocols
Middleware protocols
Example of a Data Link Layer Protocol
Message exchange between A and B at link layer level
Examples of Transport Protocols
Regular TCP
TCP for transaction processing (T/TCP)
Middleware Protocols
Consist of general-purpose protocols below application layer
Examples: authentication, distributed commit, and synchronization protocols
An adapted OSI model:
Example: Organization of a Search Engine
Client-Server Architectures
Using three levels, a simple architecture includes:
A client machine containing an implementation of only the user-interface level
A server machine containing the rest: processing and data level
Problem: this is not a truly distributed system
Everything is handled by the server
Client is simply a dumb terminal
Multitiered Architectures
Physically, a 2-tiered architecture
Consists of two types of machines: clients and servers
Various organizations are possible at three levels
Fig. (e) example: a browser with local cache of WWW pages
Example: Server Acting as a Client
Physically, a 3-tiered architecture
Server can act as a client
Programs or data may be distributed across multiple servers
Example: a transaction processing system
Modern Architectures
Vertical distribution
Achieved by placing logically different parts on different machines
Example: a multitiered architecture
Horizontal distribution
Client or server is split in logically equivalent parts
Each part operates on its own share of complete data set
Goal is often to balance the load
Example: replicated web servers
Peer-to-peer distribution
No server
Clients collaborate with one another
Other clients can dynamically join or leave a group of clients
Example: Napster
Example: Horizontal Distribution
Web service
Replicated across three servers
Useful for highly popular web sites with enough bandwidth
Layer 4-7 Applications
Examples and Multi-Core Platforms
Layer 4-7 Switching Examples
Load balancing
HTTP/HTTPS
VPN
NAT
Proxying
Filtering
HTTP/HTTPS traffic proxy
Forward or reverse proxy
Web caching
Application delivery
Firewalls
Content based processing and routing
Business intelligence
Software as a Service (SaaS) gateways
Cloud computing
Policy enforcement
Layer 4-7 Switch Usage
A Web Proxy
[Figure: client, web proxy server, Internet, and web server. Steps: 1. Client request, 2. Proxy request, 3. Server response, 4. Proxy response]
A proxy works as an intermediary
Between client and server
Typically at the “edge” of the network
Network aggregation points are administered by ISPs
Services
Infrastructure, caching, access control, authentication, and firewall
Caching Services
Caching is beneficial for entire web infrastructure
Upstream bandwidth saving
Improved latency for the end user
Content distribution to downstream users
Less load on servers
Types of transactions that can be cached
HTTP
FTP
Gopher
NNTP
Streaming media
Proxy should be transparent for other types of transactions
For example, SSL tunneling
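A minimal Python sketch of the caching idea follows. The class name CachingProxy and the injected fetch_from_origin callable are hypothetical. The sketch deliberately ignores expiry, validation, and cache-control headers, all of which a real proxy has to honor.

```python
from typing import Callable


class CachingProxy:
    """Toy forward proxy: answers repeated requests from a local cache."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # stands in for the upstream request
        self._cache: dict[str, bytes] = {}   # URL -> cached response body
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._cache:
            self.hits += 1                   # served locally: no upstream traffic
            return self._cache[url]
        self.misses += 1
        body = self._fetch(url)              # one upstream request per new URL
        self._cache[url] = body
        return body


if __name__ == "__main__":
    def origin(url: str) -> bytes:
        return b"page for " + url.encode()

    proxy = CachingProxy(origin)
    proxy.get("http://example.com/")
    proxy.get("http://example.com/")
    print(proxy.hits, proxy.misses)
```

Every hit saves one upstream request, which is where the bandwidth saving, the lower latency, and the reduced server load come from.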
Modes of Proxy Usage
Server vs. appliance
Proxy is a software application
Examples: Microsoft Proxy, Inktomi Traffic Server, and Squid
Proxy is a pre-packaged box
Examples: Cisco, CacheFlow, and Network Appliances caching products
Explicit vs. transparent
Explicit: user explicitly points the browser to a proxy
Example: ITC proxy
Transparent: user is unaware of the proxy
Example: WCCP, L-4 switches, WPAD (Web Proxy Auto Discovery), and others
Forward vs. reverse
Forward: used for bandwidth saving at aggregation points
Reverse: used as a server accelerator
Single vs. hierarchical (chained) vs. cluster (array) configurations
Distributed caching
Load balancing
Platforms for Layer 4-7: Multi-Core
Requirements for layer 4-7 devices:
Suitable processor for a spectrum of applications
Efficient memory hierarchy
Efficient I/O and networking support
A flexible operating environment
A generic OS (typically POSIX compliant)
System level support for development and maintenance
High throughput to match user expectations
Multi-core systems are the choice
Technology and price-performance dictates it
OS and system software support
Multi-threading to achieve high application performance
Deep Packet Inspection
T. Lam et al., “XML Document Parsing: Operational and Performance Characteristics,” IEEE Computer, Sept. 2008
Current Internet Architecture
Source: Fang Yu, “High Speed Deep Packet Inspection with Hardware Support,” PhD Dissertation, University of California, Berkeley, 2006. http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-156.html.
Characteristics of Internet Architecture
Internet core
Limited to packet routing and switching
Typically, little or no “intelligence” involved
Use only packet header information
Network edges
Perform application level functions
Increasing “intelligence” at edges and hosts
Involves processing packet header as well as content: deep packet inspection
One or more packets
Stateless or stateful
Deep Packet Inspection (DPI)
Unlike traditional packet inspection, DPI checks the complete payload instead of just the packet header (5-tuple) to make the decision.
When protocol ID is hidden, DPI is required to find out protocol information in the packet payload based on predefined protocol features.
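The difference between header-only inspection and DPI can be sketched in Python as below. The Packet type, the port table, and the payload prefixes are assumptions made for illustration, not a complete protocol signature set.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    # The first five fields form the classic 5-tuple.
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    payload: bytes


# Hypothetical well-known port table used by header-only classification.
WELL_KNOWN_PORTS = {25: "smtp", 53: "dns", 80: "http"}


def classify_by_header(pkt: Packet) -> str:
    """Traditional inspection: decide from the destination port alone."""
    return WELL_KNOWN_PORTS.get(pkt.dst_port, "unknown")


def classify_by_payload(pkt: Packet) -> str:
    """DPI: look for predefined protocol features inside the payload."""
    if pkt.payload.startswith((b"GET ", b"POST ", b"HEAD ")):
        return "http"
    if pkt.payload.startswith(b"SSH-"):
        return "ssh"
    return "unknown"


if __name__ == "__main__":
    # HTTP on a non-standard port: the header hides the protocol.
    pkt = Packet("10.0.0.1", "10.0.0.2", 40000, 8081, "tcp",
                 b"GET /index.html HTTP/1.1\r\n")
    print(classify_by_header(pkt), classify_by_payload(pkt))
```

The header check costs one table lookup, while the payload check must touch packet content, which is the source of the performance concerns discussed later in the lecture.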
DPI—Classification and Matching
Packet classification depends on:
Inspecting header: layer 2-4 protocols
Inspecting payload: layer 7 protocols (DPI)
DPI applications
Detect virus, spam, etc.
Network intrusion detection
Network monitoring
Load balancing
DPI is all about comparing content against a set of specified patterns
Example DPI Applications
Stopping worms from spreading
Current schemes rely on end-host anti-virus software and are not very effective
DPI allows content checking at network edges
Content based routing
Identifying HTML traffic for a web server
Directing XML traffic to an application server
Balancing load
Representing a Pattern
An explicit text string
Represents what we are looking for
One string can represent only one pattern
A regular expression
Replacing strings for regular text searching
Enhance the expressive power of a pattern
More flexible and widely understood
Other possibilities e.g., XPath
Standard language for matching in XML content
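The three representations can be compared with a short Python sketch. The sample pattern and the helper names are hypothetical. The XPath helper relies on the limited XPath subset supported by the standard library's xml.etree.ElementTree, not a full XPath engine.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical signature: one regular expression covers several commands
# that would otherwise need one explicit string each.
SUSPICIOUS_COMMAND = re.compile(rb"cmd=(rm|wget|curl)\b")


def match_explicit(payload: bytes, keyword: bytes) -> bool:
    """Explicit string: one string represents exactly one pattern."""
    return keyword in payload


def match_regex(payload: bytes) -> bool:
    """Regular expression: one pattern represents a family of strings."""
    return SUSPICIOUS_COMMAND.search(payload) is not None


def match_xpath(xml_text: str, path: str) -> list[str]:
    """XPath-style matching: select elements by their place in the XML tree."""
    root = ET.fromstring(xml_text)
    return [element.text or "" for element in root.findall(path)]


if __name__ == "__main__":
    payload = b"GET /cgi-bin/test.cgi?cmd=wget%20x HTTP/1.1"
    print(match_explicit(payload, b"cgi-bin"))
    print(match_regex(payload))
    print(match_xpath("<order><item><part>12345a</part></item></order>",
                      "./item/part"))
```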
Complexities with DPI
Patterns are pre-defined keywords or strings
Explicit strings
Regular expressions
Patterns related requirements:
Match multiple patterns
Simultaneously or in specified sequence
For wide variety of attacks
One packet to be matched against 1000’s of signatures
Example: SNORT has more than 4000 signatures
Complexities with DPI (2)
Keywords complexities:
Can be of any length
Can be anywhere in the payload of a packet
Performance challenge:
Match at line speed
Continuously increasing: 10, 100, 1,000, 10,000 Mbps
Flexibility to accommodate new patterns
XML Pattern Matching
XML is widely used at application layer
Extensible Markup Language
De facto standard for Internet document format
Stages for XML processing:
Parsing
Access
Modification
Serialization
Parsing is the most expensive operation
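The four stages can be seen in a few lines of Python using the standard library's xml.etree.ElementTree. The order document and the process_order function are hypothetical examples.

```python
import xml.etree.ElementTree as ET


def process_order(xml_text: str) -> str:
    """Run one message through parsing, access, modification, serialization."""
    root = ET.fromstring(xml_text)                 # 1. parsing: text -> tree
    quantity = int(root.findtext("qty"))           # 2. access: read a value
    root.find("qty").text = str(quantity * 2)      # 3. modification: update it
    return ET.tostring(root, encoding="unicode")   # 4. serialization: tree -> text


if __name__ == "__main__":
    print(process_order("<order><part>12345a</part><qty>50</qty></order>"))
```

Parsing has to scan every character of the input and build the tree before any other stage can run, which is consistent with it being the most expensive step.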
XML Processing Stages
DPI Case Study: A Layer 7 Filter
Linux L7 filter:
An extension to Netfilter
Classify packets in connection
D. Guo et al. “A Scalable Multithreaded L7-filter Design for Multi-Core Servers,” ANCS ’08, 2008
Multi-core architecture
Multi-core architecture duplicates hardware resources such as ALU, L1 cache, etc. on the same die, and hence allows multiple processes to run concurrently on different cores.
Implementation Details
Offline model
Connection level parallelism
Well-controlled research environment
Faster processing
Better cache performance
Connection-based affinity for multi-core scalability
Direct mapping in VM
Transplant native optimization to VM
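Connection-based affinity can be sketched as a mapping from a connection's 5-tuple to a fixed matching thread, so that all packets of one connection reuse the same core's cache. The Python sketch below uses a static CRC32 hash, which is an assumption made for illustration. It is not the scheduler of the cited paper, and the function names are hypothetical.

```python
import zlib


def connection_key(src_ip: str, src_port: int,
                   dst_ip: str, dst_port: int, protocol: str) -> bytes:
    """Build a direction-independent key so both directions share a thread."""
    first, second = sorted((f"{src_ip}:{src_port}", f"{dst_ip}:{dst_port}"))
    return f"{protocol}|{first}|{second}".encode()


def pick_thread(key: bytes, num_threads: int) -> int:
    """Map a connection to one matching thread; same key, same thread."""
    # zlib.crc32 is deterministic across runs, unlike Python's built-in hash().
    return zlib.crc32(key) % num_threads


if __name__ == "__main__":
    forward = connection_key("10.0.0.1", 40000, "10.0.0.2", 80, "tcp")
    reverse = connection_key("10.0.0.2", 80, "10.0.0.1", 40000, "tcp")
    print(pick_thread(forward, 7), pick_thread(reverse, 7))
```

Static hashing keeps per-connection state on one core, but it does not balance load when a few connections carry most of the traffic.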
Trace-Driven L7 Filter Data Flow
Affinity Based Multithreaded L7 Filter Architecture
[Figure labels: Xen Driver Domain; Incoming Packets; libnids; Preprocessing; Scheduler (schedule new packets to MTs); PT #0; MT #1 to MT #7; VCPU #0 to #7; Xen Hypervisor; Direct Mapping; PCPU #0 to #7; Physical Devices]
Data Flow in Scheduler
Throughput and Core Utilization
[Figure: throughput (Gbps, 0.0 to 1.4) and CPU utilization (0% to 100%) versus number of MTs (1 to 8); series: T-ori, T-aff, U-ori, U-aff]
L2 Cache Misses
Execution Time Comparisons
Execution Time Profile
Summary of Case Study Results
L7 filter implementation
With multiple threads
A scheduling mechanism using core affinity to optimize L2 cache utilization
Improvements compared to native Linux
51% higher throughput
15% lower core utilization
Performance Consideration
Case Studies
DPI Performance Evaluation
We consider two case studies
Case study: DPI
String matching using regular expression
Regular vs. optimized regular expressions
Case study: deep message inspection
XML based message content
TCP termination: messages instead of packets
Performance of XML operations: parsing, validation, and XPath based matching
Case Study: Efficient Regular Expression Matching for Deep Packet Inspection
Fang Yu et al., ANCS ’06, 2006
DPI—Performance Requirements
High speed
Match core link speeds (10 Gbps)
Match edge link speeds (1 Gbps)
Match increasing processor performance
Moore’s law in 90’s: increasing clock speeds
Moore’s law now: increasing number of cores
Such rates are difficult to achieve with software based worm/virus pattern detection
Example: SNORT can achieve up to 250 Mbps
DPI—Performance Requirements (2)
Low cost
High throughput requirement can be fulfilled with powerful parallel architectures but at high costs
Currently available multi-core systems provide highly parallel and high performance compute systems at low cost
Layer 4-7 application performance challenges
Cache and memory latencies (architecture)
Multi-threading complexities (software development)
Performance Optimization
Techniques evolved for all representations
Explicit text representations
Algorithmic optimizations e.g., Bloom filters
Architectural optimizations e.g., TCAMs
Regular expression based representations
Nondeterministic Finite Automaton (NFA) based approaches: inefficient
Deterministic Finite Automaton (DFA) based approaches
Grouping for over-lapping patterns
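The appeal of a DFA is that scanning costs exactly one table lookup per input byte, with no backtracking. The Python sketch below builds a DFA for a single explicit string using the textbook Knuth-Morris-Pratt automaton construction. This is a swapped-in, simplified technique chosen for illustration: it does not handle regular expressions and is not the grouped DFA of the case study. The function names are hypothetical.

```python
def build_dfa(pattern: bytes) -> list[list[int]]:
    """Return dfa[state][byte] -> next state for one non-empty pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    size = len(pattern)
    dfa = [[0] * 256 for _ in range(size)]
    dfa[0][pattern[0]] = 1
    restart = 0  # state reached by the pattern shifted one byte to the right
    for state in range(1, size):
        dfa[state] = list(dfa[restart])           # mismatch: behave like restart
        dfa[state][pattern[state]] = state + 1    # match: advance one state
        restart = dfa[restart][pattern[state]]
    return dfa


def dfa_search(dfa: list[list[int]], payload: bytes) -> int:
    """Scan the payload once; return the first match offset or -1."""
    accept = len(dfa)
    state = 0
    for index, byte in enumerate(payload):
        state = dfa[state][byte]                  # one lookup per input byte
        if state == accept:
            return index - accept + 1
    return -1


if __name__ == "__main__":
    print(dfa_search(build_dfa(b"abab"), b"aabaabab"))
```

The table has one row of 256 entries per state, which hints at why DFAs for large rule sets are fast but memory hungry.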
Optimization Case Study: RegEx
RegEx pattern optimizations:
NFA: NFA based implementation
DFA RP: Flex generated DFA based repeated scan engine
DFA OP: Optimized DFA with grouping
Experimental setup:
NFA based algorithms from Linux L7 filter and SNORT
Packet traffic based on real network dumps from MIT and UCB
Throughput Comparison of Scanners
Results Summary for Case Study
Implemented DFA based approach
Instead of the commonly used NFA based approach
Re-writing results in reduced memory needs compared to exponential memory with regular DFA
Selective grouping enhances throughput
2 to 3 orders of magnitude higher than NFA
1 to 2 orders of magnitude faster than regular DFA
Suitable for multi-core processor implementation
Case Study:
Benchmarking XML Based Application Oriented Network Infrastructure and Services
Abdul Waheed and Jason Ding
SAINT ’07
January 18, 2007
Outline
Application oriented networking
Application services in network infrastructure
Role of XML for AON
Benchmarking
AON benchmarking requirements
State of application services benchmarks
AONBench
Methodology
Specifications
Case studies of using AONBench
Conclusions
Application Oriented Networking (AON)
Increasing “intelligence” in the network
AON provides application awareness in the network
Beyond switching, routing, and traffic engineering functions
Traditionally, complex functions left to end hosts
Cost-effective processing capabilities enable increasingly complex functions within the network
Provides value added functions in network infrastructure
Adds value to a vast array of enterprise applications
Examples:
Network edge devices for caching, filtering, and security
Protocol translation for integration at enterprise level
Network policing and monitoring conformance to SLAs
AON is the “Network for Applications”
[Figure labels: Applications, Processes, People; Application Oriented Networking; Message Routing; Event Capture; Application Security; Packet Network]
Next stage in the evolution of the network:
Process at application message level
Understand the content and context of messages
Enable application to focus on business logic and user interaction while offloading application overhead aspects to the network with no changes to the existing systems
AON understands Application Messages
Conventional networks:
Provide intelligent packet level services
Cannot interpret message contents
AON interprets application message contents for much richer detailed information:
Example: Ship To, Part#, Qty, $, SLA
Allows business driven policies to be executed on application messages at runtime
[Figure: packet networking sees only a bit stream (101011001011011011010100110101); application-oriented networking sees the message below]
PURCHASE ORDER #: 012345678
FROM: BigWig Co, Anytown
TO: Cisco Systems DATE: 04/01/05
QTY: 50
PART#: Widget #12345a
PRICE:=$500 ea. TOTAL: = $25,000
DELIVERY: Urgent SLA:= 2 days
Cisco AON Core Capabilities
Intelligent Messaging
• Reliable messaging
• Content based routing
• Transformation
• Protocol switching
• Message distribution
• Message load balance
Application level Security
• Authentication
• Authorization
• Encryption/Decryption
• Data integrity / N-R
• Digital signatures
• Centralized PKI mgt.
Event Visibility
• Event capture, filtering
• Logging for audit
• Automatic notification
• Policy controlled
• Feed to dashboards
• Link to Network events
Application Optimization
• Hardware Acceleration (SSL, Crypto, XML)
• Message level Caching and Compression
• High Availability, Failover, Load Balancing
Extensibility
• ADK (for custom adapters)
• SDK (for custom bladelets)
• AON Technology Partners
Application Services without AON
[Figure labels: Client end; WAN; Application Server; Common services; Back-end]
Application server hosts the business services
Uses back-end services and network infrastructure
Back-end: to conduct common or compute intensive tasks
Network infrastructure: for connectivity and routing
Application Services with AON
[Figure labels: Client end; WAN; AON module; Application Server]
AON modules push common services to the network
Application servers can leverage AON to:
Off-load compute-intensive tasks
Off-load routing, security, load balancing, caching, protocol classification, and application-specific integration
Focus on providing business logic
Cisco AON Products
Pushes common application services to infrastructure
Security
Firewalls and access control
Content caching and delivery
XML content processing
Load balancing
Protocol matching
Integration of heterogeneous standards/applications
Multiple AON form factors:
AON modules in Catalyst 6500 series switches
AON modules in Cisco 2600/2800/3700/3800 series routers
Cisco 8340 AON appliance
Role of XML for AON
XML is becoming ubiquitous for network applications
Similar to IP, which is ubiquitous for packet routing
Application message content is increasingly XML
XML features
Self defining, extensible format
Convenient to parse, validate, and transform
Suitable for integrating multiple services in enterprise networks
Performance is a challenge
Different from stateless packet (header) processing
XML processing and security are compute-intensive
Benchmarking is needed as AON awareness grows
Requirements for an AON Benchmark
Benchmark should be vendor-neutral
Need a level playing field for everyone
Should focus on services rather than specific products
It should be open
Open in terms of specifications
Also, open source tools
User should be able to employ their own choice of tools
Benchmarks should exercise
Networking/web services and
XML content processing services, such as parsing, pattern matching, validation, transformation, and security
State of Application Services Benchmarks
Two distinct classes of benchmarks: WWW and XML
Web services benchmarks
XML microbenchmarks
XML storage management benchmarks
XML security benchmarks
Available benchmarks do not meet all requirements:
Benchmark either web services or XML processing
Product-specific benchmarks
Commercial tools that are not open benchmarks or specifications for benchmarking
Our approach:
Provide open specifications for benchmarking
Leave tool development/selection to the evaluator
AONBench
AONBench is not a benchmarking tool
Consists of two parts: methodology and specifications
Tool development or selection is left to the evaluator
Methodology
Setup needed for measurements
Setup for service request and response
Criteria for XML functional success and failure
Specifications
Application services based use cases
XML messages with schemas and style sheets
We do have two realizations of AONBench
Using public domain tools
Using custom tools
AONBench Methodology (1)
[Figure: client end, GigE switch, ingress/egress AON module, default server, and second server endpoint; labels: HTTP text/xml message, HTTP response]
Isolated measurement setup
GigE network connection: client, servers, and AON module
It can scale with traffic generation needs
AONBench Methodology (2)
Client/server setup
Client sends XML content as HTTP POST messages
AON module intercepts (explicitly or implicitly)
Forwards the message to an endpoint
Service selection
Based on unique URL in POST request
AON module provides pre-configured services
Services can succeed or fail
Modes of AON forwarding
AON selects an end-point based on success/failure of a service request
No need to use any product-specific interface
XML function can be verified by modifying the message
AONBench Methodology (3)
Connection type
Default: keep-alive connections with unlimited messages
Keep-alive can be disabled as needed
Content sizes
Application services frequently use small messages
Small: 1 KB to 5 KB messages
Large: 500 KB messages
Basic performance metrics
Throughput
Messages per second
Megabits per second
End-to-end request-response latency
Other metrics can be derived from the basic metrics
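The two throughput metrics are related by the message size: megabits per second equals messages per second times message size in bytes times 8, divided by one million. A minimal Python sketch of the basic metrics follows. The function names are hypothetical, and the example assumes 1 KB means 1000 bytes.

```python
def throughput_mbps(messages_per_second: float, message_size_bytes: int) -> float:
    """Convert a message rate into megabits per second of payload."""
    return messages_per_second * message_size_bytes * 8 / 1_000_000


def summarize(latencies_s: list[float], message_size_bytes: int,
              wall_time_s: float) -> dict[str, float]:
    """Derive the basic metrics from one run of request-response pairs."""
    if not latencies_s or wall_time_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    rate = len(latencies_s) / wall_time_s
    return {
        "messages_per_second": rate,
        "megabits_per_second": throughput_mbps(rate, message_size_bytes),
        "mean_latency_ms": 1000 * sum(latencies_s) / len(latencies_s),
    }


if __name__ == "__main__":
    # 1000 messages per second of 5 KB each (assuming 1 KB = 1000 bytes).
    print(throughput_mbps(1000, 5000))
    print(summarize([0.01, 0.03], 5000, wall_time_s=2.0))
```

Reporting both rates matters because small messages stress per-message overhead while large messages stress per-byte processing.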
Specifications for Use Cases
We specify three classes of use cases
Network infrastructure services
Forwarding Request (FR) without modifications
Provide HTTP proxying baseline
XML content based services
Content Based Routing (CBR)
Schema Validation (SV)
XML Transformations (Trans)
XML security services
Encryption, Decryption, Digital Signature, and Verification
We also specify XML messages for these use cases
Message contents
Schemas and style sheets
Specifications: Forward Request
AON module actions
Receives the HTTP POST request from a client
Identifies it as FR use case
Forwards the message to specified server end point
Receives the response from server end point
Forwards the response back to the client
Message
Any XML message can be used
Contents are irrelevant
Size: one of 1, 5, or 500 KB
Performance significance
Provides a baseline for all other cases
Should result in highest throughput compared to other UCs
Specifications: XML Processing Use Cases
AON module actions:
Receives the HTTP POST request
Identifies the type of service: CBR, SV, or Trans
Performs the selected service on content
If no errors, forwards to specified end point; else forwards to the default end point
Receives response from the end point
Forwards response back to the client
Messages:
CBR: XML message with SOAP envelope containing a string of interest
SV: XML message with name of schema XSD file
Trans: XML message that uses a pre-stored style sheet
Performance significance:
Exercise XML compute-intensive operations
Also exercise network I/O
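The forwarding rule for the content based routing use case can be sketched in Python as follows. The endpoint URLs, the route_message function, and the namespace-free envelope are hypothetical and only illustrate the success/failure decision. They do not describe how an AON module is implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical endpoints, used only for illustration.
DEFAULT_ENDPOINT = "http://default.example/"
TARGET_ENDPOINT = "http://second.example/"


def route_message(xml_text: str, keyword: str,
                  target_endpoint: str = TARGET_ENDPOINT) -> str:
    """Pick the end point: target if the service succeeds, default otherwise."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return DEFAULT_ENDPOINT               # malformed XML: service fails
    for element in root.iter():               # visit every element in the tree
        if element.text and keyword in element.text:
            return target_endpoint            # string of interest found
    return DEFAULT_ENDPOINT                   # no match: service fails


if __name__ == "__main__":
    message = ("<Envelope><Body><order><delivery>Urgent</delivery>"
               "</order></Body></Envelope>")
    print(route_message(message, "Urgent"))
    print(route_message(message, "Standard"))
```

Because the outcome is visible in which end point receives the message, a benchmark client can verify the function without any product-specific interface.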
Specifications: XML Security Use Cases
AON module actions:
Receives the HTTP POST request
Identifies the type of service: Enc/Dec/Sign/Verify
Performs the selected service on content
If no errors, forwards to specified end point; else forwards to the default end point
Receives response from the end point
Forwards response back to the client
Messages:
Encryption/Signature: XML message with SOAP envelope containing a PO
Decryption: uses message resulting from Encryption
Verification: uses result of Sign
Performance significance:
Exercise XML crypto operations
CPU intensive use cases
AONBench Implementations
Implementation does not depend on any tool
Users can select tools of their choice
Need client/server setup
Tool needs to support HTTP POST request
Our setup
Client end point: ApacheBench
Server end point: Apache web server
Multiple AON form factors
We use specified messages, schema, and style sheet
Exercise services through unique URIs
In-house tools can also be used
Studies of Using AONBench
We have used AONBench at various phases of AON product development
Product requirements and definition
Product design phase for architects
Product execution phase: performance regression testing
Product release phase: for marketing
Case studies of applying AONBench
Competitive analysis
Selection of appliance hardware
Evaluation of acceleration hardware
Case Study: Selection of Appliance H/W
Hardware Platform | Speedup: Transformation | Speedup: Schema Validation
2x3.2 GHz CPUs, 1 MB L2, 2 MB L3, 667 MHz FSB, and 4 GB DRAM | Baseline-1 | Baseline-2
4x3.2 GHz CPUs, 1 MB L2, no L3, 667 MHz FSB, and 4 GB DRAM | 1.25x | 1.1x
2x3.2 GHz CPUs, 1 MB L2, 8 MB L3, 667 MHz FSB, and 4 GB DRAM | 1.6x | 1.8x
Increased L3 cache size more effective than doubling the number of CPUs
Case Study: Competitive Analysis
[Figure: bar chart of relative performance (0 to 25) for Design-1, Design-2, Design-3, Competitor-1, and Competitor-2 across ValidateSchema, Encryption, Decryption, Signature, and VerifySignature]
Competitive analysis of three designs using AONBench
Case Study: Evaluating Accelerator H/W
Performance speedup due to an accelerator
Use Case | 5 KB messages | 500 KB messages
Content Based Routing | 1.08 | 3.08
Schema Validation | 1.12 | 1.11
Encryption | 1.06 | 1.44
Decryption | 1.39 | 1.18
Digital Signature | 1.11 | 1.02
Signature Verification | 1.03 | 1.02
Hardware accelerator is more valuable for processing larger messages
Case Study Conclusions
AONBench contributions
Introduces specification based benchmarking for networked application services
Use cases
XML messages
Neutral to vendor and measurement tools while focusing on the performance of networked services
Useful for architects, developers, as well as end users
Future work
Use current “atomic” use cases to develop more realistic UCs
Extend the scope of use cases: caching, load balancing, message filtering, protocol classification, and application integration
Key Takeaways for Today’s Session
Most of the action at application layer
Wide range of protocols and applications
Simpler to deploy on existing IP networks
Exponentially increasing demand
Multi-core systems are suitable
To deliver high throughput
To match link speeds despite compute/memory intensive tasks at application layer (e.g., DPI)
To concurrently perform compute intensive security operations for scalability