Presentation to UG scholars about NIIT SLAC collaboration


Stanford University, SLAC, NIIT, the
Digital Divide & Projects
Prepared by Les Cottrell, SLAC
for the NIIT Under Graduate Students,
March 15, 2007
Stanford University
• Location
Some facts
• Founded in the 1890s by Governor Leland Stanford &
wife Jane
– in memory of son Leland Stanford Jr.
– Apocryphal story of foundation
• Movies invented at Stanford
• 1600 freshman entrants/year (12% acceptance), 7:1
student:faculty, students from 53 countries
• 169K living Stanford alumni
Some alumni
• Sports: Tiger Woods, John McEnroe
• Sally Ride, astronaut
• Vint Cerf, “father of the Internet”
• Industry:
– Hewlett & Packard, Steve Ballmer (CEO, Microsoft), Scott
McNealy (Sun) …
• Former national leaders: Ehud Barak (Israel), Alejandro Toledo
(Peru)
• US Politics: Condoleezza Rice, George Shultz,
President Hoover
Some Startups
• Founded Silicon Valley (turned orchards into
companies):
– Started by providing land and encouragement (investment)
for companies started by Stanford alumni, such as HP &
Varian
– More recently: Sun (Stanford University Network), Cisco,
Yahoo, Google
Excellence
• 17 Nobel prizewinners
• Stanford Hospital
• Stanford Linear Accelerator Center (SLAC) – my home:
– National Lab operated by Stanford University funded by US
Department of Energy
– Roughly 1400 staff, + contractors & outside users => 3000, ~ 2000
on site at a given time
– Fundamental research in:
• Experimental particle physics
• Theoretical physics
• Accelerator research
• Astro-physics
• Synchrotron Light research
– Has faculty to pursue above research and awards degrees, 3
Nobel prizewinners
Work with NIIT
• Started 2004: MoU, funding from Pak MoST & US State Dept.
• Development of students, build research capacity, develop
publicly available tools, publish etc., e.g.:
– Quantify the Digital Divide:
• Develop a robust measurement infrastructure to provide information on the
extent of the Digital Divide
• Develop innovative visualization tools
• Improve understanding, provide planning information, expectations, identify
needs (e.g. last mile problems, fragility, congestion …), report to politicians,
funding agencies, net operators, end users, is it good enough for Grids,
telemedicine (e.g. consulting expertise for poor communities)…
• Case studies for S. Asia, Pakistan, Africa
• Provide and deploy tools in Pakistan (NIIT, QAU, PERN)
– Geo-location of hosts
– Network Weather Forecasting; Anomaly: detection, diagnosis and
alerting
Students
• About a dozen students at NIIT under joint SLAC/NIIT supervision
• Plus 6 chosen students with internships at SLAC for one
year each:
– Exposure to National Lab and world class network experts, work
on state of the art projects, exposure to high speed networks such
as will be available in Pakistan with PERN2,
– Take courses at Stanford
– 3 currently at SLAC
– 3 students completed their year, will return to NIIT as research
assistants to share experiences:
• One returned to NIIT to pursue PhD
• One to startup company in Silicon Valley
• One to U of New South Wales
Experiences
• Extremely successful on-going collaboration
– Developed and refined effective ways of communicating at a
distance
– Useful tool kits developed and made publicly available
– Many publications and public presentations
• Students hard-working, dedicated, enthusiastic and
innovative
• Have performed well in course work at Stanford, compare
well with Stanford students
• Next step: proposal to join the International perfSONAR
project (currently: Europe, US & Brazilian NRENs):
– provide open set of protocols + ref. implementation for
cross-domain sharing of network measurements
PingER Project
• Arguably the world’s most extensive active end-to-end
Internet Performance Project
– Digital Divide emphasis
– Partially funded by MoST, US State Department
• Last three years a joint development effort of SLAC & NIIT
• Many NIIT students cut their teeth on it, many papers, presentations
• Results:
– Highly successful
– Identified & quantified rates of improvement for
regions/countries
• How far behind, catching up, falling behind
• Many presentations to funding agencies, politicians, NRENs,
recommendations
– Case studies identified: fragility of e2e connections, last mile
congestion problems, inefficient routing
PingER Methodology
• Uses ubiquitous ping
• Measure Round Trip Time & Loss
[Diagram: Monitoring host <-> Internet <-> Remote host (typically a server); Data Repository @ SLAC]
Architecture
• Monitor hosts send 21 pings each 30 mins to Remote
Hosts and cache results
• Archive hosts gather data daily, save, analyze & make
results available publicly via web
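The measurement step above can be sketched in a few lines. This is a minimal illustration, not the PingER code itself: the function name `summarize_ping_set` and the dictionary layout are assumptions, and only the figure of 21 pings per set comes from the slide.

```python
from statistics import mean

PINGS_PER_SET = 21  # from the slide: 21 pings every 30 minutes


def summarize_ping_set(rtts_ms, sent=PINGS_PER_SET):
    """Summarize one set of pings to a single remote host.

    rtts_ms holds the round trip time (in ms) of every reply received;
    pings that got no reply are simply absent from the list.
    """
    received = len(rtts_ms)
    loss_pct = 100.0 * (sent - received) / sent
    if received == 0:
        # No replies at all: RTT statistics are undefined.
        return {"sent": sent, "received": 0, "loss_pct": 100.0,
                "min_rtt": None, "avg_rtt": None, "max_rtt": None}
    return {"sent": sent, "received": received, "loss_pct": loss_pct,
            "min_rtt": min(rtts_ms), "avg_rtt": mean(rtts_ms),
            "max_rtt": max(rtts_ms)}
```

A monitor host would cache one such summary per remote host every 30 minutes, and the archive host would collect the cached summaries once a day.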
PingER Deployment
• PingER project originally (1995) to measure
network performance for US, Europe and Japanese
HEP community
• Extended this century to measure Digital Divide:
– Collaboration with ICTP Science Dissemination Unit
http://sdu.ictp.it
– ICFA/SCIC: http://icfa-scic.web.cern.ch/ICFA-SCIC/
• >120 countries (99% world’s connected population)
• >30 monitor sites in 14 countries
• Monitor 44 sites in S. Asia
Time Series results
• Divides into 2
– India, Maldives, Pakistan, Sri Lanka
– Bangladesh, Nepal, Bhutan, Afghanistan
• Weekend vs. weekday indicates heavy congestion
World Measurements: Min RTT from US
• Maps show increased coverage
• Min RTT indicates best possible, i.e. no queuing
• >600ms probably geo-stationary satellite
• Between developed regions min-RTT dominated by distance
– Little improvement possible
• Only a few places still using satellite for international
access, mainly Africa & Central Asia
[Maps: 2000 and 2006]
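The arithmetic behind the satellite and distance remarks can be checked with a short calculation. This is a sketch under stated assumptions: the geostationary altitude (about 35,786 km) and the rule of thumb that light travels at roughly two thirds of c in fibre are standard figures, not values from the slides, and the function names are illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786   # assumption: altitude of geostationary orbit
FIBRE_FRACTION_OF_C = 2 / 3  # assumption: typical propagation speed in fibre


def geo_satellite_min_rtt_ms(hops=1):
    """Lowest possible RTT via a geostationary satellite directly overhead.

    Each direction goes ground -> satellite -> ground, so a round trip
    crosses the altitude four times per satellite hop.
    """
    path_km = 4 * GEO_ALTITUDE_KM * hops
    return path_km / SPEED_OF_LIGHT_KM_S * 1000


def fibre_min_rtt_ms(distance_km):
    """Lowest possible RTT over a fibre path of the given one-way length."""
    speed_km_s = SPEED_OF_LIGHT_KM_S * FIBRE_FRACTION_OF_C
    return 2 * distance_km / speed_km_s * 1000


print(round(geo_satellite_min_rtt_ms()))   # roughly 477 ms
print(round(fibre_min_rtt_ms(10_000)))     # roughly 100 ms
```

The satellite floor of just under 480 ms, plus slant-path and terrestrial delays, is consistent with reading a minimum RTT above 600 ms as a likely satellite link, while a 10,000 km fibre path has a floor near 100 ms that no upgrade can reduce.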
Losses from SLAC to world
• # hosts monitored increased seven-fold
• Increase in fraction with good loss
– Despite adding more hosts in developing world
[Map legend, loss bands: >=12%; >=5% <12%; >=2.5% <5%; >=1% <2.5%; <1%]
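As an illustration of how hosts could be sorted into the loss bands of the map legend, the sketch below uses the band edges from the legend; the function names and the idea of reporting a fraction per band are assumptions made for the example.

```python
from collections import Counter

# Lower band edges in percent loss, taken from the map legend.
BANDS = [
    (12.0, ">=12%"),
    (5.0, ">=5% <12%"),
    (2.5, ">=2.5% <5%"),
    (1.0, ">=1% <2.5%"),
]


def loss_band(loss_pct):
    """Return the legend band that a packet loss percentage falls into."""
    for lower_edge, label in BANDS:
        if loss_pct >= lower_edge:
            return label
    return "<1%"


def band_fractions(losses_pct):
    """Fraction of monitored hosts in each band (bands with no hosts omitted)."""
    counts = Counter(loss_band(x) for x in losses_pct)
    total = len(losses_pct)
    return {label: n / total for label, n in counts.items()}
```

Comparing `band_fractions` for two years of data would show whether the share of hosts in the lowest loss band is growing.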
Unreachability
• All pings of a set fail ≡ unreachable
• Shows fragility, ~ distance independent
• Developed regions US, Canada, Europe, Oceania, E Asia lead
– Factor of 10 improvement in 8 years
• Africa, S. Asia followed by M East & L. America worst off
• Africa NOT improving
[Chart: unreachability by region, grouped as Developing Regions and Developed Regions. Region labels: SE Asia, C Asia, Oceania, L America, M East, Africa, S Asia, SE Europe, Russia, E Asia, US & Canada, Europe]
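The definition above (a host is unreachable when every ping of a set fails) translates directly into code. A minimal sketch, with hypothetical function and parameter names:

```python
def unreachability_pct(replies_per_set):
    """Percentage of ping sets in which no reply at all came back.

    replies_per_set lists, for each measurement set, how many of the
    pings sent were answered.
    """
    if not replies_per_set:
        raise ValueError("need at least one ping set")
    failed_sets = sum(1 for replies in replies_per_set if replies == 0)
    return 100.0 * failed_sets / len(replies_per_set)
```

Because only complete failures count, this metric reflects outages rather than congestion, which fits the slide's remark that it is roughly independent of distance.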
World thruput seen from US
Throughput ~ 1460 Bytes / (RTT * sqrt(loss)) (Mathis et al)
Behind Europe:
6 Yrs: Russia, Latin America
7 Yrs: Mid-East, SE Asia
10 Yrs: South Asia
11 Yrs: Cent. Asia
12 Yrs: Africa
South Asia, Central Asia, and Africa are in danger of falling even farther behind
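The throughput estimate quoted from Mathis et al. can be evaluated as below. This is a sketch of the formula as written on the slide; it assumes that loss is expressed as a fraction (0.01 for 1%) and RTT in milliseconds, neither of which the slide states, and the function name is illustrative.

```python
from math import sqrt

MSS_BYTES = 1460  # segment size used in the slide's formula


def mathis_throughput_bps(rtt_ms, loss_fraction, mss_bytes=MSS_BYTES):
    """Estimated TCP throughput in bits per second.

    Implements throughput ~ MSS / (RTT * sqrt(loss)), with loss given
    as a fraction between 0 and 1 (an assumption, see lead-in).
    """
    if rtt_ms <= 0 or loss_fraction <= 0:
        raise ValueError("RTT and loss must both be positive")
    rtt_s = rtt_ms / 1000.0
    return 8 * mss_bytes / (rtt_s * sqrt(loss_fraction))


# Example: 200 ms RTT with 1% loss gives about 584 kbit/s.
print(round(mathis_throughput_bps(200, 0.01)))
```

Halving the RTT doubles the estimate, while loss has to fall by a factor of four to achieve the same gain.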
Normalized for Details
• Note step changes
• Africa v. poor
• S. Asia improving
• N. America, Europe, E Asia, Oceania lead
Conclusions
• Last mile problems, network fragility, poor routing
• Decreasing use of satellites, expensive, but still needed
for many remote countries in Africa and C. Asia
• Africa ~ 10 years behind and falling further behind,
leads to “information famine”
• Africa big target of opportunity
– Growth in # users 2000-2005: world 200%, Africa 625%
– Need more competitive pricing
• Fibre competition, government divest for access, low cost VSAT
licenses
• Consortiums to aggregate & get better pricing ($/BW reduces with BW)
– Need better routing - IXPs
– Need training & skills for optimal bandwidth management
• Internet performance correlates strongly with UNDP & ITU
development indices
– Increase coverage of monitoring to understand Internet performance
Application to PERN
• Place PingER monitoring node(s) inside PERN
– V. modest host, trivial install
– Add traceroute/landmark server for geolocation
• PERN configures to monitor to border routers &/or
to end hosts at sites (e.g. site web servers)
• Currently gathers data daily, analyze, present via
SLAC/FNAL
• NIIT/SLAC plans to develop front end to
analyze/visualize results on real time basis using
cached data & RRD/smokeping
perfSONAR: Next Generation Network Monitoring
• Partnership of Internet2 (US), GEANT (EU), ESnet
(US), RNP (Brazil)
– Plus in the US: SLAC, U Delaware, GATech
– 13 EU related NREN deployments of perfSONAR
Needs
• Advancements in networks improve scientific
collaborations, help accelerate discoveries
– E.g. High Energy Physics (HEP), seismology, tele-medicine, astrophysics, global weather, education …
• Modern science relies on global Internet
– Data exchange, interaction & teleconferencing, Grids …
• Network problems have increased significance for science
• Thus dependent on cyber infrastructure to support efficient
network problem diagnosis along paths traversing multiple
network domains
– This is an unresolved issue today
– Hard to overstate amount of effort today to resolve problems
• Often duplicated
• Scientists forced to become part-time network engineers
Why is this hard?
• Internet very diverse, hard to find “invariants”, phone models do not
work
• Constantly changing both short and long-term
– Changes are not smooth but usually in steps, findings may be out of date
• No central organization
– Scientific communities span multiple organizations in many countries
– Typical path requires crossing at least 5 administrative domains (campus,
regional, backbone, regional and campus)
– Domains are autonomous
• Measurement not high on vendors’ priorities
• ISPs concerned about privacy, competitive advantage, public
embarrassment
• Diagnosis hard:
– Convince ADs there is a problem and that they could/should help
– Need multiple pieces of information from multiple sources (ends, multiple
middles…), with no coordinating body
– Gets even harder for layer 2 networks
New Proposal to Address
• Widespread demand for net info by:
– Researchers to know how network is performing
– Advanced net apps such as Grids (e.g. place data)
– Net Ops staffs to diagnose problems
– Education
• Flexibility in extracting net performance data, needed since
– Network changes quickly, diagnostic data is moving target
– New tools, metrics and types of analysis are constantly developed
– Lack of effective ways to share performance data across domains
perfSONAR Infrastructure
• Provide/Enable Measurement Points and Archives
• Provide Authentication/Authorization
• Provide registration, discovery & distributed lookup
services
• Provide open set of protocols + reference implementation
for cross-domain sharing of network measurements
– Common performance middleware
– Open Grid Forum NMWG = extensible XML data representation
– All development is open source to encourage widespread
development, deployment, ownership & involvement
• Early framework prototypes deployed in Europe, N and S
America (Brazil), also adopted by LHC
Next Steps
• Develop scalable, distributed, redundant Federated Lookup Service
(like DNS)
• Integrate common, existing authentication management into
perfSONAR
• Design and build the Resource Protector to implement policy
• Provide specific, useful example diagnostic services as high quality
examples (e.g. for traceroute, ping, one-way delay, SNMP, Layer-2
link services etc.)
• Provide a Topology service to provide layer-2 & 3 interconnection
information
• Promote perfSONAR to research community
– Students get reliable data from perfSONAR, request on demand
measurements, provide new analyses
• Turn into hardened/production quality distributable code
Impact NRENs & Customers
• R&E relies on reliable networking.
– Debugging problems across domains extraordinarily
difficult today, increased switched networks will make
harder.
• PerfSONAR enables divide and conquer between
end & intermediate points:
– provides easy access to relevant data, enables on-demand
measurements
– reduces need to coordinate multi-domain admins
(scientist > local net admin > Regional net admin >
Backbone admin > …), telephone tag, explaining
– Reduces participants, hours, days, frustration etc.
Some Projects
One Big Challenge
• Elegant graphics are great to understand problems
BUT:
– Can be thousands of graphs to look at (many site pairs,
many devices, many metrics)
– Need automated problem recognition AND diagnosis
• So developing tools to reliably detect significant,
persistent changes in performance
– Initially using simple plateau algorithm to detect step
changes
• Provide reliable alerts
• Automatically partially diagnose events
– Gather info from routers, monitors etc and eliminate less
likely causes
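To make the idea of detecting significant, persistent step changes concrete, here is a simplified sketch. It is not the plateau algorithm used in the project: the two-buffer scheme, the parameter names and their default values are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def detect_step_changes(values, history_len=20, trigger_len=5, sensitivity=3.0):
    """Return the indices at which a persistent shift in level begins.

    A baseline (history) buffer tracks recent normal values. Points that
    deviate from the baseline mean by more than `sensitivity` standard
    deviations go into a trigger buffer. Only when `trigger_len`
    consecutive points deviate is a step change reported, so isolated
    spikes are ignored.
    """
    events = []
    history = list(values[:history_len])
    trigger = []  # (index, value) pairs of consecutive deviating points
    for i in range(history_len, len(values)):
        v = values[i]
        mu = mean(history)
        sigma = pstdev(history) or 1e-9  # avoid a zero threshold
        if abs(v - mu) > sensitivity * sigma:
            trigger.append((i, v))
            if len(trigger) == trigger_len:
                events.append(trigger[0][0])
                # The new plateau becomes the baseline.
                history = [x for _, x in trigger]
                trigger = []
        else:
            trigger = []
            history.append(v)
            history = history[-history_len:]
    return events
```

Requiring several consecutive deviating points is what separates a persistent change worth an alert from a single noisy measurement.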
Challenges: Finding Hosts
• Best via contacts
• Also use Google to provide hosts for country (eg .ly)
– Found 844 hosts => 702 unique names => 600 ping
– 88 unique IP addresses
– 6 in Libya according to Geo IP Tool www.geoiptool.com/
– Automated by Akbar Mehdi of NIIT at SLAC (see
https://confluence.slac.stanford.edu/display/IEPM/PingER+Host+Searcher )
• Verified with TULIP geolocator
– Locates hosts using RTT from multiple landmarks to target
– Also see Octant for US www.cs.cornell.edu/~bwong/octant/
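The narrowing from candidate host names to unique names to unique IP addresses can be sketched as follows. This is not the PingER Host Searcher itself: the function names are hypothetical, the search and ping steps are left out, and the resolver is passed in as a parameter so the logic can be tried without network access.

```python
import socket


def unique_names(hostnames):
    """Deduplicate host names, ignoring case and a trailing dot."""
    seen = set()
    result = []
    for name in hostnames:
        key = name.strip().lower().rstrip(".")
        if key and key not in seen:
            seen.add(key)
            result.append(key)
    return result


def resolve_unique_ips(names, resolver=socket.gethostbyname):
    """Map each distinct IP address to the host names that resolve to it.

    Names that fail to resolve are skipped.
    """
    by_ip = {}
    for name in names:
        try:
            ip = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
        by_ip.setdefault(ip, []).append(name)
    return by_ip
```

Grouping by IP address explains how several hundred names can collapse to far fewer addresses: many names are virtual hosts served from the same machine, which need not be located in the country its domain suggests.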
TULIP geolocator (Faran)
• www.slac.stanford.edu/comp/net/wan-mon/tulip/
– Java applet (needs Java Webstart)
– Friendly client, easy visualization
– Need landmarks around world
[Screenshot labels: Enter target; Pings min/avg/max from landmarks to targets; Target; Traceroutes from landmarks to targets; Landmarks]
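The principle of locating a host from RTTs to several landmarks can be illustrated by turning each minimum RTT into an upper bound on distance. This is a sketch of the general idea only, not TULIP's algorithm: the conversion of about 100 km per millisecond of RTT assumes propagation at roughly two thirds of c in fibre, and all names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
KM_PER_MS_RTT = 100.0  # assumption: ~200,000 km/s in fibre, halved for one way


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def max_distance_km(min_rtt_ms):
    """Farthest the target can be from a landmark that saw this min RTT."""
    return min_rtt_ms * KM_PER_MS_RTT


def is_consistent(candidate, measurements):
    """Check a candidate (lat, lon) against every landmark's distance bound.

    measurements is a list of (landmark_lat, landmark_lon, min_rtt_ms).
    """
    lat, lon = candidate
    return all(
        haversine_km(lat, lon, l_lat, l_lon) <= max_distance_km(rtt)
        for l_lat, l_lon, rtt in measurements
    )
```

Each landmark rules out everything beyond its bound, so landmarks spread around the world shrink the region where the target can be, which is why the slide asks for landmarks around the world.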
Real Time Display of PingER data
• Data only gathered once a day, so results are old
• On monitor host, copy data into RRD data base
• Use Smokeping or something similar to select &
look at data
• Important for PERN
Case Studies of PingER data
• For example we have Sub-Sahara & S. Asia
• Need Latin America, Middle-East
• How is region doing relative to the world, catching
up, falling behind, how far behind?
• How are countries doing, RTT, losses, reliability,
throughput
• How do they compare to development indices
• What is routing like
Build a Make/Install package for IEPM-BW
• We have a new version of our integrated
monitoring, archive, analysis, visualization package
– Called Internet End-to-end Monitoring for BandWidth
• Hard to install, done twice, once at QAU
• Need to make easier and more robust
Visualization of IEPM-BW data
• Display in real time on a map the connections
• Mouse over for information on hosts
• Click for performance graphs
• Color line by test
• Thickness by performance
• Up to you to think about what is available and how
to use this to drill down to it…
Only a Sample
• Lots of work on perfSONAR
– Data access via web services
– Data registration, discovery lookup
– Topology
– Event detection
– Etc.