Transcript IP Network

Changes
- Increasing processing power
- Increasing storage capability and smarter devices
- Reducing cost
- Popularity of computing & communication devices
  - desktop, laptop, PDA, cell phone
- Ubiquitous connection
  - high-speed network, wireless network
- Emerging applications
  - video conference, collaborative scientific computing, digital library
- Mobile user & increasing user demands
- Computing environment
  - solitary, fixed-location event
    -> widely distributed, highly interactive, mobile activity
Challenges for data management
- Large-scale system: effective management
  - clients, devices, amount of data
- Diversity: flexibility/adaptive performance
  - distributed and heterogeneous storage devices
  - various applications that demand different kinds of support
  - clients with different abilities (mobile users)
  - different networks
- Multiple administrative domains: security
  - data belongs to different administrative domains
  - clients belong to different administrative domains
  - data privacy and anonymity
Efficient, secure, and effective location- and context-aware data access from anywhere at any time
Towards Grid Computing
A novel paradigm that enables the sharing, selection, & aggregation of
geographically distributed (wide-area) resources anywhere & anytime:
• Computers – PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.
• Software – e.g., ASPs renting expensive special-purpose applications on demand.
• Catalogued data and databases – e.g., transparent access to the human genome database.
• Special devices/instruments – e.g., radio telescope – SETI@Home searching for life in the galaxy.
• People/collaborators.
Resources are chosen depending on their availability, capability, cost, and user QoS requirements for
solving large-scale problems/applications,
thus enabling the creation of “virtual enterprises”.
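The selection criteria named on this slide (availability, capability, cost, user QoS) can be illustrated with a small sketch. Everything here is hypothetical: the `Resource` fields, the GFLOPS and cost units, and the "cheapest resource that meets the QoS floor" policy are assumptions made for illustration, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    available: bool
    cpu_gflops: float      # capability, in a hypothetical unit
    cost_per_hour: float   # cost, in arbitrary units


def select_resource(resources, min_gflops, max_cost):
    """Return the cheapest available resource that meets the QoS floor, or None."""
    candidates = [
        r for r in resources
        if r.available and r.cpu_gflops >= min_gflops and r.cost_per_hour <= max_cost
    ]
    return min(candidates, key=lambda r: r.cost_per_hour, default=None)


RESOURCES = [
    Resource("cluster-a", True, 500.0, 12.0),
    Resource("workstation-b", True, 40.0, 1.0),
    Resource("supercomputer-c", False, 9000.0, 80.0),
    Resource("cluster-d", True, 650.0, 9.5),
]

chosen = select_resource(RESOURCES, min_gflops=400.0, max_cost=15.0)
print(chosen.name if chosen else "no suitable resource")
```

A real grid broker would also weigh deadlines, data locality, and queue lengths; this sketch only filters and then minimizes cost.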
Towards Grid Computing
[Figure: resources R1, R2, R3, R4, ..., RN. Caption: Unification of geographically distributed resources]
Plethora of Challenges
Security
Computational Economy
Uniform Access
Resource Discovery
System Management
Data locality
Resource Allocation & Scheduling
Application Construction
Network Management
Technology Needs: Present, Future
• Distributed Supercomputing:
– Computational science.
• High-Capacity/Throughput Computing:
– Large-scale simulation/chip design & parameter studies.
• Content Sharing (free or paid):
– Sharing digital content among peers.
• Remote software access/renting services:
– Application service providers (ASPs) & Web services.
• Data-intensive computing:
– Drug design, particle physics, stock prediction...
• On-demand, real-time computing:
– Medical instrumentation & mission-critical applications.
• Collaborative Computing:
– Collaborative design, data exploration, education.
• Service-Oriented Computing (SOC):
– Towards economic-based utility computing: new paradigm, new applications, new industries, and new business.
Datanomic Computing
System behavior driven by characteristics of the data
• System automatically optimizes itself to complement ever-changing data requirements
– Allocate resources according to increases in demand for the data
– Transform data formats to support different applications
• Seamless data access from anywhere at anytime
– Location and context aware access to data
– Consistent view of each user’s data
• Data access independent of platforms, operating systems,
and data formats
• Potential platform for cyberinfrastructure
– High performance computing, large data stores, better
bandwidth for communication
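As a rough illustration of "allocate resources according to demand", the sketch below scales the number of replicas of a data object with its observed request rate. The function name, the per-replica capacity, and the replica bounds are all hypothetical parameters chosen for the example; the slides do not describe a concrete policy.

```python
import math


def target_replicas(requests_per_min, per_replica_capacity=100,
                    min_replicas=1, max_replicas=8):
    """Replicas needed to serve the observed demand, clamped to fixed bounds."""
    needed = math.ceil(requests_per_min / per_replica_capacity)
    return max(min_replicas, min(max_replicas, needed))


for rate in (0, 250, 5000):
    print(rate, "req/min ->", target_replicas(rate), "replica(s)")
```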
Objectives
• Develop a self-optimizing global infrastructure
• Exploit active objects and intelligent storage
devices
– Objects can be uniquely identified
– Objects can automatically migrate, replicate or
transcode to satisfy varied user demands
• Store, search, and manage large amounts of data
efficiently
• Adaptive performance
– Objects dynamically adapt to the level of available service
– Means to handle intermittent connectivity
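One way to picture an object that adapts to the level of available service is to let it choose among pre-transcoded variants by measured bandwidth and fall back to a cached copy when the link is down. The variant names and bit-rate thresholds below are hypothetical and serve only to illustrate the idea.

```python
# (variant name, minimum bandwidth in kbit/s), richest first; values are hypothetical
VARIANTS = [("high", 4000), ("medium", 1200), ("low", 300)]


def pick_variant(available_kbps):
    """Return the richest variant the link can carry, or None if none fits."""
    for name, required_kbps in VARIANTS:
        if available_kbps >= required_kbps:
            return name
    return None


def fetch(available_kbps, cached_variant=None):
    """Prefer a live variant; fall back to the cached copy when connectivity is poor."""
    variant = pick_variant(available_kbps)
    if variant is not None:
        return ("network", variant)
    if cached_variant is not None:
        return ("cache", cached_variant)
    return ("unavailable", None)


print(fetch(5000))
print(fetch(0, cached_variant="low"))
```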
Objectives (Cont.)
• Support data intensive applications for
– Data mining
– Multimedia (MPEG-21)
• Ensure end-to-end (E2E) security, strong authentication,
anonymity, and confidentiality over IP
networks
Datanomic Computing: Hybrid Architecture
[Figure: four regions, each with a Regional Manager, laptop, desktop, and app server on an "IP Network Within a Region", interconnected through a central IP Network]
Hybrid Architecture: P2P Interaction
• Network is partitioned into an arbitrary
number of variable-sized regions
• Region-to-region interaction: P2P
– Distributed Indexing with Hashing Architecture
(DIHA) for inter-region communication
– Focus on enterprise environment => assume a
greater level of trust between regions and greater
homogeneity in regional interaction
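The slides name DIHA but do not describe how it works. As a stand-in, the sketch below uses a generic consistent-hashing ring to decide which region is responsible for indexing a given object ID. The class name `RegionRing`, the virtual-node count, and the use of SHA-256 are assumptions for illustration and should not be read as the DIHA design.

```python
import bisect
import hashlib


def _h(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class RegionRing:
    """Consistent-hashing ring that maps object IDs to regions."""

    def __init__(self, regions, vnodes=64):
        # Each region owns several virtual points so load spreads more evenly.
        self._points = sorted(
            (_h(f"{region}#{i}"), region) for region in regions for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def region_for(self, object_id: str) -> str:
        # First ring point clockwise from the object's hash, wrapping at the end.
        idx = bisect.bisect_right(self._keys, _h(object_id)) % len(self._keys)
        return self._points[idx][1]


ring = RegionRing(["region-a", "region-b", "region-c"])
print(ring.region_for("object-42"))
```

With this layout, adding a region moves only the objects that now hash to the new region; all other objects keep their previous index location.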
Hybrid Architecture: Regional Organization
[Figure: a single region with a Regional Manager, laptop, desktop, app server, and intelligent OSD connected by an IP network]
Hybrid Architecture: Regional Organization
• Partition of regions: based on physical or
logical affinity
• Single regional manager
• Clients
• Intelligent object-based storage devices
Hybrid Architecture: Regional component (1)
• Regional Manager
– Object metadata management
– Security-related issues within/outside the region
– Naming service
– Object replication, migration, and consistency
– Client and OSD device management
(including mobile clients and devices)
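Two of the responsibilities listed above, the naming service and object metadata management, can be sketched as a pair of in-memory maps. The class and method names (`RegionalManager`, `register`, `resolve`, `locate`, `add_replica`) and the use of random UUIDs as object IDs are hypothetical; the slides only say that objects can be uniquely identified.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ObjectMetadata:
    object_id: str
    name: str
    osd_ids: list = field(default_factory=list)  # OSDs that hold a replica


class RegionalManager:
    """Toy regional manager: name -> object ID -> replica locations."""

    def __init__(self, region):
        self.region = region
        self._by_id = {}
        self._name_to_id = {}

    def register(self, name, osd_id):
        if name in self._name_to_id:
            raise ValueError(f"name already registered: {name}")
        object_id = uuid.uuid4().hex  # assumption: any unique ID scheme would do
        self._by_id[object_id] = ObjectMetadata(object_id, name, [osd_id])
        self._name_to_id[name] = object_id
        return object_id

    def resolve(self, name):
        """Naming service: human-readable name to object ID (None if unknown)."""
        return self._name_to_id.get(name)

    def locate(self, object_id):
        """Metadata lookup: OSDs holding the object (empty list if unknown)."""
        meta = self._by_id.get(object_id)
        return list(meta.osd_ids) if meta else []

    def add_replica(self, object_id, osd_id):
        meta = self._by_id[object_id]
        if osd_id not in meta.osd_ids:
            meta.osd_ids.append(osd_id)


manager = RegionalManager("region-a")
oid = manager.register("report.pdf", "osd-1")
manager.add_replica(oid, "osd-2")
print(manager.locate(manager.resolve("report.pdf")))
```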
Hybrid Architecture: Regional component (2)
• Client
– End users or applications that access objects
within a region
– Client has a home region that stores important
client information. The home region is allowed
to move
– Client can move freely among regions
Hybrid Architecture: Regional component (3)
• Intelligent Object-based Storage Devices
– OSD decides if a specific client is allowed to
perform some operations
– Perform data-directed operations specified by
the object itself
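The two OSD behaviours above, deciding whether a client may perform an operation and running operations that the object itself specifies, can be sketched as follows. The `ActiveObject` and `IntelligentOSD` classes, the per-object ACL, and the method table are hypothetical structures chosen for illustration.

```python
class ActiveObject:
    """Data plus the operations it carries and who may invoke them."""

    def __init__(self, data: bytes, acl: dict, methods: dict):
        self.data = data
        self.acl = acl          # client ID -> set of allowed operation names
        self.methods = methods  # operation name -> function(bytes) -> result


class IntelligentOSD:
    def __init__(self):
        self._objects = {}

    def store(self, object_id, obj):
        self._objects[object_id] = obj

    def invoke(self, client_id, object_id, operation):
        obj = self._objects[object_id]
        # The device itself enforces access; no manager round trip in this sketch.
        if operation not in obj.acl.get(client_id, set()):
            raise PermissionError(f"{client_id} may not {operation} {object_id}")
        # Data-directed operation: the code to run comes from the object.
        return obj.methods[operation](obj.data)


osd = IntelligentOSD()
osd.store("o1", ActiveObject(
    data=b"alpha beta gamma",
    acl={"alice": {"read", "word_count"}, "bob": {"read"}},
    methods={"read": lambda d: d, "word_count": lambda d: len(d.split())},
))
print(osd.invoke("alice", "o1", "word_count"))
```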
Hybrid Architecture: Scenario within a region
[Figure: Regional Manager, laptop, desktop, app server, and intelligent OSD on an IP network; interactions numbered 1 to 5]
Hybrid Architecture: Scenario inter-region
[Figure: three Regional Managers with laptops, desktops, and app servers on their IP networks; interactions numbered 1 to 9, including a Lookup(object ID/name) request]
General Picture
[Figure: four Regional Managers, each with an intra-region IP network (IP-N/W), app server, and OSDs, interconnected through a central IP Network]
Comparison
Criteria | OceanStore | StorageTank
Scope | Internet/WAN | Data center
Architecture | Peer to peer | Server/client (client, metadata server, storage device)
Connection/Communication | Varied, wireless, intermittent | Normally permanent, high speed
Trust model | Untrusted infrastructure (data encrypted) | Not mentioned (trusted?)
Channel security | Insecure | Secure
Admin domain | Peer to peer | One
Object capability | Active/archival | File data
Data replication and migration | Introspective, bottleneck at parent node | Static, app level
Consistency | Conflict resolution (a range of consistency semantics) | Strong
Scalability
Performance
Separation of control and data
Support of OSD
good
Cluster arena
Very high
Comparison - Similarities
Aspects | Datanomic | PAST
Separation of control and data | MDS, OBT and OBD | N/A
Access level | Object (file) | File
Storage device | Intelligent | Storage from node
Transport | TCP, IB, RDMA | TCP
Comparison- Differences
Criteria | Datanomic | PAST
Scope | Global area | Global area
Architecture | Hybrid, 2 levels | P2P (DHT)
Connection/Communication | Varied, wireless, intermittent | N/A
Trust model | Untrusted | Untrusted
Channel security | Insecure | Insecure
Comparison – Differences (cont.)
Aspects | Datanomic | PAST
Admin domain | No | No
Object capability | Active with method | File data
Data replication and migration | Automatic, three levels | Automatic
Consistency | Strong/weak | N/A
Scalability | Unlimited | Unlimited
Performance | High | High
Comparison – Differences (cont.)
Aspects | Datanomic | PAST
Data privacy and integrity | Supported | Supported
Load balancing | Intelligent | Based on randomization
Data striping | Supported | Not supported
Target | Enterprise | Global, archival storage
Comparison
Criteria | Datanomic | NASD | Lustre
Scope | Global area | Data center | Restricted, data center
Architecture | Hybrid, 2 levels | Client-server | Client-server
Connection/Communication | Varied, wireless, intermittent | Could be varied | Normally permanent, high speed
Access control | TBD | Capability based: Issued per |
Comparison
Aspects | Datanomic | NASD | Lustre
Revocation | TBD | Fast, since a capability is issued per open request | Use existing schemes
Granularity of access control | Method | Part of object (specified in terms of bytes) | File
Data replication and migration | Automatic, three levels | Not addressed | Static, app level
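The NASD column mentions capabilities issued per open request, byte-range granularity, and fast revocation. The sketch below shows the general idea of a keyed capability that a storage device can check on its own. It is not NASD's actual format: the field layout, the HMAC-SHA256 tag, the key shared between manager and drive, and revocation by bumping an object version number are all assumptions made for illustration.

```python
import hashlib
import hmac


def _mac(key: bytes, fields: tuple) -> str:
    """Keyed tag over the capability fields (hypothetical encoding)."""
    msg = "|".join(str(f) for f in fields).encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def issue_capability(key, object_id, version, start, end, rights):
    """Manager side: grant `rights` (e.g. "r" or "rw") on bytes [start, end)."""
    fields = (object_id, version, start, end, rights)
    return {"fields": fields, "tag": _mac(key, fields)}


def drive_allows(key, cap, object_id, current_version, offset, length, op):
    """Drive side: verify the tag, then check object, version, right, and range."""
    cap_obj, cap_ver, start, end, rights = cap["fields"]
    if not hmac.compare_digest(cap["tag"], _mac(key, cap["fields"])):
        return False
    return (
        cap_obj == object_id
        and cap_ver == current_version  # bumping the version revokes old capabilities
        and op in rights
        and start <= offset
        and offset + length <= end
    )


KEY = b"secret shared by manager and drive"
cap = issue_capability(KEY, "obj-1", version=1, start=0, end=4096, rights="r")
print(drive_allows(KEY, cap, "obj-1", 1, offset=0, length=512, op="r"))
print(drive_allows(KEY, cap, "obj-1", 2, offset=0, length=512, op="r"))
```

Because the drive validates requests locally, revoking access in this sketch needs only a version change on the drive rather than contacting every client.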