Network Infrastructure - Digital Science Center


Grids and the Harmony and
Prosperity of Civilizations
“Beijing Forum” (2004)
The Harmony and Prosperity of Civilizations
http://www.beijingforum.org/english/index.htm
Geoffrey Fox
Professor of Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University Bloomington IN 47401
[email protected]
http://www.infomall.org
CPU and Network Infrastructure


Moore’s law predicts that electronic components will
improve in performance by a factor of 100 or so every
ten years (double every 18 months)
Networks are increasing in performance every year
much faster than this as more and better technology is
deployed (Gilder’s law)
• Last-mile versus backbone performance
• Latency versus bandwidth
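The Moore's law figures above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `growth_factor` is illustrative and not from the talk.

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Performance multiplier after `years` if performance doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


# Ten years at one doubling every 18 months is 120/18 doublings,
# which lands close to the "factor of 100 or so" quoted on the slide.
ten_year_factor = growth_factor(10)
```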


Cable, DSL, Satellite, Optical fiber, wireless are
competing to provide high speed connectivity to the
citizens of the world
By 2006, GTRN (Global Terabit Research Network)
aims at a 1000:1000:100:10:1 gigabit performance ratio
representing international backbone: national:
organization: optical desktop: Copper desktop links.
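To make the GTRN ratio concrete, the sketch below estimates ideal transfer times at each level. Reading the ratio as absolute speeds in gigabits per second is an assumption of this sketch, as are the names `LINK_GBPS` and `transfer_seconds`; latency and protocol overhead are ignored.

```python
# Link speeds in gigabits per second, read off the 1000:1000:100:10:1 ratio.
# Treating the ratio as absolute Gb/s values is an assumption of this sketch.
LINK_GBPS = {
    "international backbone": 1000,
    "national": 1000,
    "organization": 100,
    "optical desktop": 10,
    "copper desktop": 1,
}


def transfer_seconds(size_gigabytes: float, link_gbps: float) -> float:
    """Ideal time to move `size_gigabytes` over a link, ignoring latency and overhead."""
    return size_gigabytes * 8.0 / link_gbps


# Ideal time to move one terabyte (1000 gigabytes) over each kind of link.
terabyte_times = {name: transfer_seconds(1000, gbps) for name, gbps in LINK_GBPS.items()}
```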
Global Enterprises


As communication improves, activities are spread more
and more across the globe.
Faster physical transportation (cars, trains, aircraft)
enabled
• Increasing international tourism
• Separation of manufacturing, design and sales of vehicles,
consumer electronics, clothes

Universal networking is allowing instant global
information
• The latest event at the Olympic Games or
• The latest terrorist event

e-Infrastructure is allowing more and more
sophisticated activities to become distributed
• Scientific research, business and, for this meeting, civilization
e-Infrastructure






e-Infrastructure builds on the inevitably increasing performance
of networks and computers linking them together to support new
flexible linkages between computers, data systems and people
• Grids and peer-to-peer networks are the technologies that
build e-Infrastructure
• e-Infrastructure called CyberInfrastructure in USA
We imagine billions of conventional local or global connections
• Phones, web page accesses, plane trips, hallway conversations
On this we superimpose high value multi-way linkages
• Such as collection of people at this meeting
If N items are joined to M others, added value goes like N × M
for small M but in broadcast limit M ≈ N, the value decreases to
a constant × N. (A Complex System theorem)
Conventional Internet technology manages billions of low (2-way)
or broadcast links
Grids superimpose multiple M-way overlaid organizations with
optimized resources and system support
On Complex Systems Language







Web and Grid resources (people, pages, databases, computers) are
“just spins”
Local Interactions are terms in an energy function
• $E = \sum_{\langle i,j \rangle} \mathrm{weight}(i,j)\, s(i)\, s(j)$, summed over nearest neighbor pairs $i,j$
“Internet Communication” corresponds to a long range force with
• $E = \sum_{i} H\, s(i)$, summed over all spins $i$
And behaves like a magnetic field aligning spins in physics
(complex systems) analogy
• Aligning is harmonizing
Maximizing Prosperity is minimizing “Complex Systems Energy”
Abrupt Social changes are phase transitions
In this language, Grids provide different local energy functions
(enhanced interaction) and harmonizing forces through
community shared resources
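The spin analogy can be sketched numerically. The code below evaluates the two energy terms on a small ring of spins. The ring layout, the single uniform `weight` (the slide allows a weight per pair) and the negative signs chosen so that aligned spins have lower energy are all assumptions of this sketch, not statements from the talk.

```python
from typing import Sequence


def local_energy(spins: Sequence[int], weight: float) -> float:
    """Sum of weight * s(i) * s(j) over nearest-neighbor pairs on a ring (use 3+ spins)."""
    n = len(spins)
    return sum(weight * spins[i] * spins[(i + 1) % n] for i in range(n))


def field_energy(spins: Sequence[int], H: float) -> float:
    """Sum of H * s(i) over all spins: the long range 'Internet Communication' term."""
    return sum(H * s for s in spins)


def total_energy(spins: Sequence[int], weight: float, H: float) -> float:
    """Local (nearest neighbor) term plus long range field term."""
    return local_energy(spins, weight) + field_energy(spins, H)


# With negative weight and H (an assumed sign convention), aligned spins
# give a lower energy than alternating spins.
aligned = total_energy([1, 1, 1, 1], weight=-1.0, H=-0.5)
alternating = total_energy([1, -1, 1, -1], weight=-1.0, H=-0.5)
```

Under this sign convention the aligned ("harmonized") configuration is the lower energy one, which matches the slide's reading of alignment as harmonizing.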
4×N Interactions


In days gone by people communicated with their local community
Nearest neighbour communications in a physics analogy with
communication = force
N plus N Interactions


Television and the Web allow individuals to communicate
instantly with each other via Web Pages and Headline News
acting as proxies
N resources deposit information and N can view it: call this N plus N
M² Interactions
• Superimpose M-way “Grids” on the sea (heatbath) of “2 by N”
or N plus N ordinary interactions
Implement Grids as a software overlay network
[Figure: 4 Overlay Networks with a 5th superimposed. Labels: Dynamic
light-weight Peer-to-peer Collaboration Training Grid; Enterprise Grid;
Information Grid; Compute Grid; Campus Grid; Students; Teacher; R1; R2]
Large and Small Grids






N resources in a community (N is billions for the world and 1000-10000 for many scientific fields)
Communities are arranged hierarchically with real work being
done in “groups” of M resources – M could be 10-100 in e-Science
Metcalfe’s law: value of network grows like square of number of
nodes M; we call Grids where this is true Metcalfe or M² Grids
Nature of Interaction depends on size of M or N
• N plus N Shared Information Grids for large N
• M² Metcalfe Grids for smaller M
Technology support depends on M – might use a relatively static
DHT (Distributed Hash Table) for large M and a distributed
shared memory for small M
Grids must merge with peer-to-peer networks to support both N
plus N and M² Grids
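The two kinds of Grid scale differently, which a short calculation makes visible. This is a sketch only: the function names are illustrative, and summing the per-group value over N/M groups is one assumed way of combining the slide's statements.

```python
def n_plus_n_interactions(n: int) -> int:
    """N resources deposit information and N can view it: N + N interactions."""
    return n + n


def metcalfe_value(m: int) -> int:
    """Metcalfe's law: the value of one group grows like the square of its size M."""
    return m * m


def community_metcalfe_value(n: int, m: int) -> int:
    """Total over N/M groups of size M, each with value M^2 (assumes M divides N)."""
    if n % m != 0:
        raise ValueError("this sketch assumes M divides N")
    return (n // m) * metcalfe_value(m)


# A scientific field of N = 1000 resources working in groups of M = 10.
shared_information = n_plus_n_interactions(1000)
grouped_value = community_metcalfe_value(1000, 10)
```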
Architecture of (Web Service) Grids




We view the “ordinary” Internet as providing support for the
huge number of low-complexity interactions which are the
dominant traffic
We superimpose multiple Grids on top of these; each Grid
supports a high value high complexity interaction
• Grids built from Web Services communicating through an
overlay network
Grids provide the special quality of service (security,
performance, fault-tolerance) and customized services needed for
“distributed complex enterprises”
We need to work with Web Service community as they debate the
60 or so proposed Web Service specifications
• Use Web Service Interoperability WS-I as “best practice”
• Must add further specifications to support high performance
• Database “Grid Services” for N plus N case
• Streaming support for M² case
Application Specific Grids
Generally Useful Services and Grids
Workflow WSFL/BPEL
Service Management (“Context etc.”)
Service Discovery (UDDI) / Information
Service Internet Transport  Protocol
Service Interfaces WSDL
Base Hosting Environment
Protocol HTTP FTP DNS …
Presentation XDR …
Session SSH …
Transport TCP UDP …
Network IP …
Data Link / Physical
[Figure side labels: Higher Level Services; Service Context; Service
Internet; Bit level Internet (OSI Stack)]
Layered Architecture for Web Services and Grids
Working up from the Bottom





We have the classic (Cisco, Juniper …) Internet routing the
flood of ordinary packets in OSI stack architecture
Web Services build the “Service Internet” or IOI (Internet on
Internet) with
• Routing via WS-Addressing not IP header
• Fault Tolerance (WS-RM not TCP)
• Security (WS-Security/SecureConversation not IPSec/SSL)
• Information Services (UDDI/WS-Context not
DNS/Configuration files)
• At message/web service level and not packet/IP address level
Software-based Service Internet possible as computers “fast”
Familiar from Peer-to-peer networks and built as a software
overlay network defining Grid (analogy is VPN)
SOAP Header contains all information needed for the “Service
Internet” (Grid Operating System) with SOAP Body containing
information for Grid application service
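The split between SOAP Header and SOAP Body can be illustrated with a small envelope builder. This is a hedged sketch using only the Python standard library: the endpoint URL, action URN, message ID and the `runFilter` payload element are hypothetical, and the WS-Addressing namespace shown is the W3C one, which may differ from the draft in use at the time of the talk.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
WSA_NS = "http://www.w3.org/2005/08/addressing"        # W3C WS-Addressing namespace


def build_envelope(to: str, action: str, message_id: str, payload: ET.Element) -> ET.Element:
    """Put routing information in the SOAP Header and the application payload in the Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    for tag, value in (("To", to), ("Action", action), ("MessageID", message_id)):
        ET.SubElement(header, f"{{{WSA_NS}}}{tag}").text = value
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# Hypothetical application message: ask a filter service to process an image.
payload = ET.Element("runFilter")
payload.text = "telescope-image-42"
envelope = build_envelope(
    to="http://example.org/grid/filter",
    action="urn:example:runFilter",
    message_id="urn:uuid:0001",
    payload=payload,
)
xml_text = ET.tostring(envelope, encoding="unicode")
```

An intermediary in the "Service Internet" would only need to read the Header entries to route the message; the Body is left to the Grid application service.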
Service Context
• On top of “Service Internet”, one supports dynamic context or
the “shared memory” supporting groups (M from 2 to more) of
services that are inevitable for Grids
• Context information defines “state” (a token linking messages
and services together), policy/implementation for security, fault
tolerance, lifetime etc.
– Includes generalization of “environment” and “configuration” variables
• This context can be implemented as a Service itself – using
SOAP message interactions with a database
– This is a lightweight highly dynamic database
• Interesting debate between shared (a single service) memory
or distributed memory (Collection of messages with context in
header) architectures
– Familiar from parallel computing with “distributed shared memory” a
natural solution
• Note this can only be done dynamically if Grids are small; the full
Internet case needs larger but less dynamic context support
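The "shared memory" style of context can be sketched as a small in-memory store keyed by a token. This stands in for a context service and is not a real WS-Context implementation: the class name `ContextStore`, its methods and the example keys are all hypothetical.

```python
import time
import uuid
from typing import Any, Dict, Optional


class ContextStore:
    """In-memory stand-in for a context service shared by a small group of services."""

    def __init__(self) -> None:
        self._contexts: Dict[str, Dict[str, Any]] = {}
        self._deadlines: Dict[str, float] = {}

    def create(self, lifetime_seconds: float, **values: Any) -> str:
        """Create a context with a lifetime and return the token that links messages to it."""
        token = str(uuid.uuid4())
        self._contexts[token] = dict(values)
        self._deadlines[token] = time.monotonic() + lifetime_seconds
        return token

    def _alive(self, token: str) -> bool:
        """True if the token exists and has not expired; expired contexts are dropped."""
        deadline = self._deadlines.get(token)
        if deadline is None:
            return False
        if time.monotonic() >= deadline:
            del self._contexts[token]
            del self._deadlines[token]
            return False
        return True

    def get(self, token: str, key: str) -> Optional[Any]:
        """Read one context variable, or None if the context is unknown or expired."""
        return self._contexts[token].get(key) if self._alive(token) else None

    def set(self, token: str, key: str, value: Any) -> bool:
        """Write one context variable; returns False if the context is unknown or expired."""
        if not self._alive(token):
            return False
        self._contexts[token][key] = value
        return True


store = ContextStore()
session = store.create(60.0, security_policy="signed-messages", fault_tolerance="retry")
```

In the distributed memory alternative the same values would travel in each message header instead of living in one service.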
Alternative definitions of a Grid




Supporting human decision making with a network of at least
four large computers, perhaps six or eight small computers,
and a great assortment of disc files and magnetic tape units - not to mention remote consoles and teletype stations - all
churning away. (Licklider 1960)
Coordinated resource sharing and problem solving in
dynamic multi-institutional virtual organizations
Infrastructure that will provide us with the ability to
dynamically link together resources as an ensemble to support
the execution of large-scale, resource-intensive, and
distributed applications.
Realizing thirty year dream of science fiction writers that
have spun yarns featuring worldwide networks of
interconnected computers that behave as a single entity.
e-Business e-Science and the Grid






e-Business captures an emerging view of corporations as
dynamic virtual organizations linking employees, customers
and stakeholders across the world.
• The growing use of outsourcing is one example
e-Science is the similar vision for scientific research with
international participation in large accelerators, satellites or
distributed gene analyses.
The Grid integrates the best of the Web, traditional
enterprise software, high performance computing and Peer-to-peer systems to provide the information technology
infrastructure for e-moreorlessanything.
[Figure labels: DATA ACQUISITION; ADVANCED VISUALIZATION, ANALYSIS;
COMPUTATIONAL RESOURCES; IMAGING INSTRUMENTS; LARGE-SCALE DATABASES]
A deluge of data of unprecedented and inevitable size must
be managed and understood.
People, computers, data and instruments must be linked.
On demand assignment of experts, computers, networks and
storage resources must be supported
e-Defense and e-Crisis

Grids support Command and Control and provide
Global Situational Awareness
• Link commanders and frontline troops to themselves and to
archival and real-time data; link to what-if simulations
• Dynamic heterogeneous wired and wireless networks
• Security and fault tolerance essential

System of Systems; Grid of Grids
• The command and information infrastructure of each ship is
a Grid; each fleet is linked together by a Grid; the President
is informed by and informs the national defense Grid
• Grids must be heterogeneous and federated

Crisis Management and Response enabled by a Grid
linking sensors, disaster managers, and first responders
with decision support
e-Business and (Virtual) Organizations





Enterprise Grid supports information system for an
organization; includes “university computer center”, “(digital)
library”, sales, marketing, manufacturing …
Outsourcing Grid links different parts of an enterprise together
(Gridsourcing)
• Manufacturing plants with designers
• Animators with electronic game or film designers and
producers
• Coaches with aspiring players (e-NCAA or e-NFL etc.)
Customer Grid links businesses and their customers as in many
web sites such as amazon.com
e-Multimedia can use secure peer-to-peer Grids to link creators,
distributors and consumers of digital music, games and films
respecting rights
Distance education Grid links teacher at one place, students all
over the place, mentors and graders; shared curriculum,
homework, live classes …
Information/Knowledge Grids


Distributed data sources (10’s to 1000’s): instruments,
file systems, curated databases …
Data Deluge: 1 petabyte/year (now) to 100’s of petabytes/year (2012)
• Moore’s law for Sensors
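The Data Deluge figures imply a doubling time that can be compared with Moore's law. The sketch below assumes "now" means 2004 (the year of the talk) and takes 100 petabytes/year as the lower end of "100's"; both are assumptions, and `doubling_time_months` is an illustrative name.

```python
import math


def doubling_time_months(start_volume: float, end_volume: float, years: float) -> float:
    """Doubling time implied by growing from start_volume to end_volume over `years`."""
    doublings = math.log2(end_volume / start_volume)
    return years * 12.0 / doublings


# 1 petabyte/year in 2004 growing to 100 petabytes/year in 2012 (assumed endpoints).
implied_months = doubling_time_months(1.0, 100.0, 2012 - 2004)
```

Under these assumptions the data rate doubles somewhat faster than the 18 months quoted for Moore's law.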




Possible filters assigned dynamically (on-demand)
• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data
Good example of N plus N Grid
Metadata (provenance) critical to annotate data
Integrate across experiments as in multi-wavelength astronomy
Data Deluge comes from pixels/year available
Virtual Observatory Astronomy
N plus N Grid that Integrates Experiments
[Figure panels: Radio; Far-Infrared; Visible; Dust Map; Visible + X-ray;
Galaxy Density Map]
CERN LHC Data Analysis Grid
• Typical experiment at LHC has 2000 physicists
• Analyzing data from LHC is an “N plus N Grid” with huge scale
• 30,000 CPUs simultaneously processing LHC data
• In a few years, over 100 Petabytes of data
• Physics discovery is an M² Grid with perhaps M=10
• Lots of such groups working simultaneously
• Note hierarchical structure
• M=10 in Physics analysis
• M=2,000 in one LHC Experiment
• M=10,000 physicists in particle physics
• M= 100,000 total physicists
• M=? Scientists
• M= Billions People
DAME
Rolls Royce and UK e-Science Program
Distributed Aircraft Maintenance Environment
[Figure: in-flight data from ~5000 engines, ~ Gigabyte per aircraft per
engine per transatlantic flight. Labels: Airline; Ground Station; Global
Network such as SITA; Engine Health (Data) Center; Maintenance Centre;
Internet, e-mail, pager]
Several small M² Grids, one for each aircraft, back-ended by an
N plus N Grid of reference data of all engines
Information Complexity I

Consider a community of N resources with groups of size
M with each group complexity C
• N/M Groups

Information in systems varies from coherent
(harmonious) to incoherent limits
• Web and Grid data resources supply coherence as in curated
astronomy or bioinformatics database
• Can consider N plus N Grids as Coherent or Harmonious
Grids


$I = (NM)^{0.5} \cdot (C/M)$ (incoherent) to $I = N \cdot (C/M)$ (coherent)
In this language Grids do one or both of
• Coherence/Harmony – common shared asynchronous
resources
• Interactivity – Increase complexity to M² with real-time
linkage of interacting resources
Information Complexity II




N plus N Community database has $I = N$ (coherent)
• Improving on the $N^{0.5}$ incoherent case
Nearest Neighbor groups have $I = (NM)^{0.5}$
• Becoming $I = N$ in the limit $M = N$
• M is the correlation length in the Complex Systems approach
M-ary Interactive groups (M² Metcalfe Grids) have $C = M^2$
and
$I = (NM^3)^{0.5}$ (incoherent)
to $I = NM$ (coherent)
• Coherent case most natural in science due to synergy
between Metcalfe and Coherence Grids
“Small World (logarithmic) networks” and hierarchical
group structure require more discussion
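The special cases above all follow from the general expression on the previous slide, I = (NM)^0.5 (C/M) in the incoherent limit and I = N (C/M) in the coherent limit. The sketch below evaluates that expression; the function name `information` is illustrative, and reading the nearest neighbor case as C = M is an inference from the formulas rather than something the slides state.

```python
import math


def information(n: float, m: float, c: float, coherent: bool) -> float:
    """I for N resources in groups of size M, each group with complexity C."""
    if coherent:
        return n * c / m
    return math.sqrt(n * m) * c / m


N, M = 10_000, 100

# N plus N community database: M = 1, C = 1.
database_incoherent = information(N, 1, 1, coherent=False)    # N^0.5
database_coherent = information(N, 1, 1, coherent=True)       # N

# Nearest neighbor groups: C = M (inferred), incoherent.
nearest_neighbor = information(N, M, M, coherent=False)       # (N M)^0.5

# M-ary interactive groups (Metcalfe): C = M^2.
metcalfe_incoherent = information(N, M, M * M, coherent=False)  # (N M^3)^0.5
metcalfe_coherent = information(N, M, M * M, coherent=True)     # N M
```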
Grids and e-globalcommunity

Peer-to-peer networks already are a good example of
value of Information Technology supporting broad
global communities
• File sharing, text chats, bulletin boards



Grids must include these capabilities and extend in
terms of increased functionality and quality of service
This will support business and cultural interactions
between nations
Several interesting applications can be supported by
• Replacing files by multi-media streams so can collaborate in
real-time
• Adding traditional tools like audio-video conferencing and
shared applications to P2P set

This integration of P2P and Grid to give M² Grids
impacts e-Business as well as e-globalcommunity
Outsourcing or Not?




In the USA, over last 30 years people worried about loss of
manufacturing jobs from the first wave of enterprise distribution
created by “physical communication”
Now they worry about the next wave of outsourcing seen in areas
like software, and movie/game animation created by e-Infrastructure (electronic communication)
Probably this globalization of enterprises will increase not
decrease as it allows one to tap the cheapest and best expertise
for a particular task
• Further the core software and electronic infrastructure will
continue dramatic improvements
Assuming global enterprises are inevitable each community
should identify its expertise and enhance its ability to work in a
distributed fashion
• Suggests increasing specialization within communities
Streaming M² Grids







e-Textilemanufacturing involves Clothes designers in USA and
manufacturers in Hong Kong exchanging designs which are
streams of images
e-Sports is a possible collaboration between Indiana University and
Beijing Sport University
• Basketball coaches (teachers) interact with aspiring NBA players
in China
• Martial Arts masters in China train neophytes in Indiana
• Faculty recreational sports adviser works from university with
faculty exercising at home
• Hope to have this working incredibly well by the 2008 Olympics
Interactive TV Grid: allows anybody to discuss professional or
home video (of sports or other events) within a custom Grid
Multi-player distributed games which should be supported with
exactly the same overlay Grid
Video Game Production Grid links artistic direction (design) in one
country with digital animation (manufacturing) in another
e-Science: Physics and Environmental Science Sensors
Surveillance Grid enables security personnel to annotate and
discuss suspicious remote camera images/streams
Some Technology for Streaming M² Grids





Basic capability is collaborative annotatable multimedia
tool for images, sensors and real-time video streams
• Allow Grid participants to view real-time streams,
rewind on the fly and add text and graphical
comments
• Similar to instant replay on TV but far more flexible
Need rich metadata system to label and correlate streams,
images and annotations
Extend Grid and P2P file access paradigms to stream
storage, browsing and access
Core Technologies shared with distance education
Using http://www.globalmmcs.org for multimedia
services and http://www.naradabrokering.org for overlay
network
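The basic capability described above (view a live stream, rewind on the fly, add comments) can be sketched as a bounded buffer of timestamped frames. This is an illustrative data structure only and is not how GlobalMMCS or NaradaBrokering implement it; `Frame` and `AnnotatedStream` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List


@dataclass
class Frame:
    timestamp: float
    data: bytes
    annotations: List[str] = field(default_factory=list)


class AnnotatedStream:
    """Keeps the most recent frames so participants can rewind and annotate them."""

    def __init__(self, max_frames: int) -> None:
        # The deque silently discards the oldest frame once max_frames is reached.
        self._frames: Deque[Frame] = deque(maxlen=max_frames)

    def append(self, timestamp: float, data: bytes) -> None:
        """Add a newly received frame to the buffer."""
        self._frames.append(Frame(timestamp, data))

    def rewind(self, seconds: float) -> List[Frame]:
        """Frames from the last `seconds` of the buffered stream, oldest first."""
        if not self._frames:
            return []
        cutoff = self._frames[-1].timestamp - seconds
        return [f for f in self._frames if f.timestamp >= cutoff]

    def annotate(self, timestamp: float, text: str) -> bool:
        """Attach a comment to the buffered frame closest in time; False if buffer is empty."""
        if not self._frames:
            return False
        nearest = min(self._frames, key=lambda f: abs(f.timestamp - timestamp))
        nearest.annotations.append(text)
        return True


stream = AnnotatedStream(max_frames=3)
for t in range(5):
    stream.append(float(t), b"frame")
stream.annotate(3.2, "watch the defender's position")
```

A shared deployment would also need the metadata system mentioned above to correlate annotations across streams; that part is outside this sketch.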
P2P and Server based solutions






Peer-to-peer architectures have advantage that they can be deployed
just using client resources and no system commitment is needed
Typically clients do not have good network QoS and it is hard for
example to support rich multi-point audio video conferencing in this
way
M² Grids typically require multicast so average load in P2P case on
client legs goes like O(M)
• Server-side multicast puts O(M) load on backbone and O(1) load
on clients and can lead to much better scaling and performance
• N plus N Grids may not see such large improvements with server
side support
So Grids should support initial P2P deployment with a seamless
upgrade to add better QoS using Servers.
Extend familiar P2P paradigms like BitTorrent to Grids and
Streaming
[Figure labels: Grid Farm in the Sky (clouds); Grid Servers; P2P]
Grid and peer-to-peer linkage combines scalable performance with
ease of deployment
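The O(M) versus O(1) client load can be made concrete by counting copies per published message. This sketch assumes a full-mesh P2P group in which the sender transmits directly to every other member, and a single relaying server in the server-based case; the function names are illustrative.

```python
from typing import Tuple


def p2p_client_sends(m: int) -> int:
    """Copies the sending client transmits itself to reach the other M - 1 members."""
    return m - 1


def server_multicast_sends(m: int) -> Tuple[int, int]:
    """(client sends, server sends) when a server relays each message to the group."""
    return 1, m - 1


# For a group of M = 10, the client's uplink carries 9 copies in the P2P case
# but only 1 when a server does the fan-out on the backbone.
p2p_load = p2p_client_sends(10)
client_load, server_load = server_multicast_sends(10)
```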