Some Thoughts on E2EPI

Shawn McKee <[email protected]>
Pipefitters Meeting,
Internet2 Spring Meeting
8 April, 2003
The Problem
How do you solve a problem along a path?
[Figure: the parties along an end-to-end path, with the end-site roles shown at both ends: Applications Developer, System Administrator, LAN Administrator, Campus Networking, Gigapop, Backbone. The Applications Developer says “Hey, this is not working right!” The replies from the other parties: “Others are getting in OK”, “Not our problem”, “Talk to the other guys”, “Everything is AOK”, “The computer is working OK”, “No other complaints”, “Looks fine”, “All the lights are green”, “We don’t see anything wrong”, “The network is lightly loaded”.]
The issue…
We all know high bandwidth links are
not sufficient to provide high network
performance.
• General users of the network, even technically
proficient ones, can’t be expected to be network
wizards or to have intimate knowledge of their end-to-end path…
• Problems arise continually and can be due to
hardware, applications, hosts and misconfiguration
• What are we to do to make the most impact in the
least amount of time?
End to End Performance Issues
We require knowledge of the end systems as well as the intervening
network segments to evaluate the
observed performance and diagnose
problems
There are a number of pieces to the
puzzle…
Major Impacts in E2E Performance
• What are the CPU, disks, interfaces and memory of the end hosts…and what is their performance?
• System state is critical; what is the CPU load and estimated bus loading?
• What is the network interface: type, firmware, parameters and expected performance, both long-term and based upon recent results?
• Final piece: what is the upcoming workload in the network, both locally and globally?
Adapting Existing Tools for E2E
Many of the tools being developed for data-intensive Grids have to address similar issues
in monitoring and planning.
These tools should be looked at for how well
they can meet the requirements of the End-to-end Initiative
MonaLisa is one example of an already
deployed grid-monitoring package which may
be a good match for the needs of E2E…
Start at the ends: Hosts
We must enable “data acquisition” from
the hosts involved.
• These hosts represent the logical dividing point
between the “network” and the “user”
• Many problems are related to host configs, TCP/IP
stacks, OS version, NICs, firmware, application
design, etc
PROBLEMS: Hosts run many different OS’s on many
different hardware platforms…how to generically
capture needed info while minimizing user
involvement?
First pass at Host info gathering…
We need a system which can dynamically
download a data gathering application which
can run on most systems…
Java seems to be the most likely candidate:
• Pervasive
• Can be cryptographically signed
• Permissions can be fine grained
• Runs on MANY OS’s
What to “acquire” from each host?
Stable Info
• Operating system and version (e.g. RedHat V8.0 or WindowsXP SP1)
• Processor details
• NIC info (firmware, brand, type)
• Memory info
• TCP stack parameters
Dynamic Info
• Interrupts/sec
• CPU usage
• NIC bandwidth, errors, queue lengths
• Memory usage
• Bus usage
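The split between stable and dynamic information could look roughly like the sketch below. It is written in Python for brevity (the slides propose Java for the real applet) and uses only the standard library, so it covers just the operating system, processor and load figures. NIC details, TCP stack parameters, interrupt rates and bus usage would need OS-specific code that is not shown. The function names and dictionary keys are assumptions made for illustration.

```python
import json
import os
import platform


def stable_info():
    """Details that rarely change between runs (OS, processor)."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "os_version": platform.version(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "cpu_count": os.cpu_count(),
    }


def dynamic_info():
    """Details that change from moment to moment (only load average here)."""
    info = {}
    # os.getloadavg() exists only on Unix-like systems and may fail there too.
    if hasattr(os, "getloadavg"):
        try:
            one, five, fifteen = os.getloadavg()
            info["load_avg"] = {"1min": one, "5min": five, "15min": fifteen}
        except OSError:
            pass
    return info


if __name__ == "__main__":
    record = {"stable": stable_info(), "dynamic": dynamic_info()}
    print(json.dumps(record, indent=2))
```

Keeping the two groups in separate functions lets the stable part be collected once and the dynamic part be sampled repeatedly during a test.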
Standards are critical…
Whatever we do for host data acquisition, we
should ensure the output is in some “standard”
format
The GGF Network Measurement group is just
now grappling with measurement profiles and
data schema. We should plan to use this and
contribute to its development
Host information exists in CIM, DMTF and
others…let’s pick something capable of storing
what we need and move on.
Host applets
Having a system accessible through the
web and supporting Linux and Windows
would give us the broadest initial
coverage.
First-time users connect to an E2E server
or peer-to-peer system and download a
signed Java applet to their host.
Starting the Applets
The user allows the applet to start (security
signing)
The applet starts and creates a GUID for this
host and records, in standard format, the
“stable” host details. Each new invocation of
the applet will verify the currency of the stable
information
The applet can provide the GUID and host
details to a registration server with or without
“anonymization” of identifying details
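The start-up steps above (create a GUID, record the stable details, re-verify them on later runs, optionally anonymize before registering) might be sketched as follows. This is an illustrative Python sketch, not the applet itself. The record layout, the use of a SHA-256 fingerprint to check currency, and the list of identifying fields are all assumptions.

```python
import hashlib
import json
import uuid

# Hypothetical set of fields treated as identifying; the slides do not list any.
IDENTIFYING_FIELDS = ("hostname", "ip_address", "mac_address")


def fingerprint(stable):
    """Hash of the stable details, used to detect changes between runs."""
    canonical = json.dumps(stable, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def register(stable, previous=None):
    """Create a record on first run; refresh it if the stable details changed."""
    if previous is None:
        # First invocation on this host: mint a new GUID.
        return {"guid": str(uuid.uuid4()), "stable": stable,
                "fingerprint": fingerprint(stable)}
    if previous["fingerprint"] == fingerprint(stable):
        return previous  # stored details are still current
    # Details changed (e.g. OS upgrade): keep the GUID, update the rest.
    return {"guid": previous["guid"], "stable": stable,
            "fingerprint": fingerprint(stable)}


def anonymize(record):
    """Copy of the record without identifying fields, for remote registration."""
    stable = {k: v for k, v in record["stable"].items()
              if k not in IDENTIFYING_FIELDS}
    return {"guid": record["guid"], "stable": stable}
```

Keeping the GUID when details change lets a central database follow one host across upgrades, which matters for the baseline comparisons discussed later.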
User Interface
Once the info is locally (and optionally
remotely) stored, the user can be presented
with a user interface:
• Register host
• Test path (Client or Server)
• Log Problem
• Search Database
Testing the path
A user with a problem could initiate path
testing (in conjunction with a remote user or a
PMP)
User specifies “client” or “server” mode and
partner IP information
The test could be a defined set of
measurements:
• Ping (reachability/RTT)
• Traceroute (both forward and backward)
• One-way loss (each direction)
• Iperf (bandwidth EACH way, measured simultaneously)
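One way to express such a defined set of measurements is as a plan of commands built from the user's mode and partner address. The Python sketch below only builds the plan and does not run anything. It assumes a Unix-style `ping` and `traceroute` and iperf version 2, whose `-d` option runs both directions at once. The one-way loss test is left as a placeholder because the slides do not name a tool for it, and the backward traceroute has to be run by the partner.

```python
def build_test_plan(mode, partner):
    """Return (test name, command) pairs for one end of a path test.

    mode    -- "client" or "server", as chosen by the user
    partner -- IP address or hostname of the other end
    A command of None marks a test with no tool chosen in this sketch.
    """
    if mode not in ("client", "server"):
        raise ValueError("mode must be 'client' or 'server'")
    plan = [
        ("ping", ["ping", "-c", "5", partner]),          # reachability / RTT
        ("traceroute", ["traceroute", partner]),         # this end's direction
        ("one-way loss", None),                          # placeholder
    ]
    if mode == "client":
        # iperf 2: -c connects to the server, -d tests both ways at once.
        plan.append(("iperf", ["iperf", "-c", partner, "-d"]))
    else:
        plan.append(("iperf", ["iperf", "-s"]))          # wait for the client
    return plan
```

Each command list could be handed to `subprocess.run` by the test driver, with the output saved for later analysis.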
Path testing
The series of tests is run by a Java
application.
Missing components are downloaded from
servers.
Bandwidth testing is done in both directions
simultaneously, to find duplex problems on the
path
Dynamic host information is recorded at both
ends during each sub-test
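A first check on the simultaneous bandwidth results could be as simple as comparing the two directions. The Python sketch below is only a rough heuristic and is not taken from the slides: the function name and the 0.5 ratio threshold are assumptions, and a lopsided result can also come from a legitimately asymmetric path, so a flag here means "look closer" rather than "duplex mismatch confirmed".

```python
def duplex_suspect(forward_mbps, reverse_mbps, min_ratio=0.5):
    """Flag a simultaneous two-way test whose directions differ sharply.

    forward_mbps, reverse_mbps -- throughput measured in each direction
                                  while both were running at once
    min_ratio -- hypothetical threshold: slower/faster below this is flagged
    """
    slower, faster = sorted((forward_mbps, reverse_mbps))
    if faster <= 0:
        return True  # nothing got through in either direction
    return slower / faster < min_ratio
```

For example, `duplex_suspect(900, 40)` flags the path, while `duplex_suspect(900, 880)` does not.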
Test results
Test results would be saved locally and a
summary given to the user.
A Java analysis applet could parse the info,
looking for common problems
Results could be “logged” with a central service
Logged events could be further analyzed by
central servers with access to current network
details. Users could be referred to the most likely
problem domain with current contact information
provided by the central server
Advantages of such a system
Creating such a service could bootstrap the
effort.
First step toward improving user experience is
determining what is limiting performance
Database of test results is enormously important:
• Sets “baseline” for various hosts, locations, applications, etc.
• Provides problem frequency data so we can focus on fixing the
most pervasive or restrictive problems first
• Allows analysis of components: NIC vs OS, Firmware X vs Y,
etc.
• Will require host applets (OS-specific), host-specific
flavors of applications, central servers
and a distributed database
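The three uses of the results database listed above (baselines, problem frequency, component comparison) can be illustrated with a small Python sketch. The record fields (`os`, `nic`, `mbps`, `problem`) are hypothetical, since no schema has been chosen, and the median is used as the baseline only as one reasonable option.

```python
from collections import Counter, defaultdict
from statistics import median


def baselines(results):
    """Median throughput per (OS, NIC) pair, as a simple baseline."""
    groups = defaultdict(list)
    for r in results:
        groups[(r["os"], r["nic"])].append(r["mbps"])
    return {key: median(values) for key, values in groups.items()}


def problem_frequency(results):
    """Count how often each logged problem type appears."""
    return Counter(r["problem"] for r in results if r.get("problem"))
```

Sorting the output of `problem_frequency` with `most_common()` gives the "fix the most pervasive problems first" ordering, and comparing entries of `baselines` that differ in only one component gives the NIC vs OS style of analysis.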
Some Goals
Put the “wizard” knowledge into the applets
Enable ordinary users to perform state of the
art testing
Provide a reference set of network testing
applications by host type for users
Define a network measurements database for
the network users community
Interoperate with PMP stations in the network
Instrument applications to automatically
provide data to system
HENP Efforts
The HENP Sponsored Interest Group is also
focused on end-to-end issues.
http://www.internet2.edu/henp
We have a list of 9 goals related to networking,
many of which are also related to the issues the E2E
initiative is trying to address
HENP can help build the infrastructure and serve as
a test case for the E2E Pipes effort
MonaLisa is a deployed application which provides a
measurement framework and could be adapted for
the E2E effort
Monitors and Beacons at Michigan
We have an effort at Michigan to instrument our
gigabit backbone and selected sites with “Monitor
& Beacon” boxes.
Each station has dual gigabit adapters and fast
RAID disks for real-time traffic capture and analysis
We are extending the GARA GT2.2 code to
provide fully authorized functionality, both for
individual “one-off” testing and scheduled testing
via our web portal
Some of this effort may be usable to
further the E2E goals
Applets Needed
• Host “stable” data acquisition
• Host “dynamic” data acquisition
• Network “measurement”
• Network testing results “logger”
• Network measurement “analysis”
• Host network “tuner”
• Network problem “logger”
Future Developments
• “Finger pointer” (working with PMPs and databases)
• “Search” (find info on your NIC, OS, bandwidth history, etc)
• “Visualization” (Dynamic plotting of network data)
Conclusion
We need to get some capability deployed
which can make a difference in the user’s
experience
HENP has a very strong interest in solving
this problem and is willing to put in resources
to help get things done
Starting from the host is a logical first step
which has the potential to make the biggest
impact