Berkeley NOW

Progress on System Architecture for Extreme Devices
[Title-slide diagram labels: Massive Cluster, Clusters, Gigabit Ethernet]
David Culler
http://www.cs.berkeley.edu/~culler
U.C. Berkeley
Endeavour Retreat
1/20/2000
Outline
• The Very Large
– Millennium cluster-of-clusters resources available to you
– System Architecture
• The Middle
– Kiosks, laptops, and PDAs
• The Small
– Embedded Servers
– Low power wireless
– Architecture for Zillions of devices
1/20/2000
Endeavour Sys. Arch
2
Large Resource Deployment
• In place:
– Full 64-PIII Linux Cluster w/ Myrinet & ethernet (Gb rdy)
» demonstrated on Ninja DDS
– several remote 16-PIII clusters
– DLIB 4-P + 1/4 TB cluster
– 1/2 TB storage server
– Gb connection to dept, OC48 to NTON
– Rootstock cluster dissemination facility
– REXEC econ-based remote execution facility
– 100 KVA PDU
• Demonstrated
– intercampus network, routing, config (as per CNS)
– full Gb/s with 2 source, 2 sinks
• Deploying
– 45 x 4 of main cluster (2/4 GB mem, 18 GB disk, Gb ether)
Vineyard Cluster Architecture
• dissemination, resource control, remote execution, and communication
[Layer diagram labels, in slide order]
– Applications / Services (ISPACE/Kiosks)
– PBS, I/O, MPI, VEXEC, TOOLS
– REXEC
– VIA / GM, GbE; Multicast
– Mgmt / Monitoring
– NT / Linux (2.2.x); Stride Scheduler
– Rootstock Distribution
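The slide names a Stride Scheduler at the OS layer but does not show how it works. As a rough illustration only, the following is a minimal sketch of textbook stride scheduling (proportional-share allocation by tickets), not the cluster's actual kernel code. The names `STRIDE1` and `run_stride` and the ticket values are assumptions made for the example.

```python
import heapq

STRIDE1 = 1 << 20  # assumed large constant used to derive integer strides


def run_stride(tickets_by_name, quanta):
    """Return the order in which clients receive `quanta` time slices.

    Each client gets stride = STRIDE1 // tickets. The client with the
    smallest pass value runs next, then its pass advances by its stride,
    so clients holding more tickets run proportionally more often.
    """
    heap = []
    for name, tickets in tickets_by_name.items():
        stride = STRIDE1 // tickets
        # Heap entries are (pass, name, stride); ties break on name.
        heapq.heappush(heap, (stride, name, stride))

    schedule = []
    for _ in range(quanta):
        pass_value, name, stride = heapq.heappop(heap)
        schedule.append(name)
        heapq.heappush(heap, (pass_value + stride, name, stride))
    return schedule


# Hypothetical clients holding 3, 2, and 1 tickets.
schedule = run_stride({"A": 3, "B": 2, "C": 1}, quanta=6)
counts = {name: schedule.count(name) for name in "ABC"}
```

Over six quanta the three clients receive time slices in a 3:2:1 ratio, which is the proportional-share behaviour a credit-based execution facility can build on.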
Dissemination: Rootstock
[Diagram labels: Cluster System Distribution Center holding K leased builds and the cluster stock (build, os, drvrs, mill SW, os mods); IP network; Cluster with CS and CAN]
1. Cluster Stock
- Rootstock build pages
- Full Current Linux
- all fixes and pckgs
- SSL, SSH
- Cluster Drivers
- Cluster System Layers
- rexec, mpe, pbs
- Optional SW ($)
- Cluster Kernel Mods
2. Make the CS “graft”
- specify IP address
- pckg removes
- dhcp, dns, nis, ...
- resolv.conf, /etc/hosts, ...
- sanity check and build
- constructs cluster build (lease)
- download CS build floppy
3. CS power-on build
- xfer and localize DT
- add local admin scripts
- node build floppy
4. Node power-on build
- local stock from CS
5. Cluster Update button (future)
- 2nd dialtone, CF engine, rolling update
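The graft step (step 2) mentions specifying an IP address and producing files such as resolv.conf and /etc/hosts. The slide does not say how Rootstock generates them, so the following is only a sketch of what such localization could look like. The function `graft_config`, the `node` hostname prefix, and the example addresses and domain are all hypothetical.

```python
import ipaddress


def graft_config(base_ip, node_count, domain, nameserver, prefix="node"):
    """Build the text of /etc/hosts and resolv.conf for one cluster.

    Nodes are assumed to receive consecutive addresses from base_ip.
    """
    first = ipaddress.ip_address(base_ip)
    hosts = ["127.0.0.1\tlocalhost"]
    for i in range(node_count):
        addr = first + i  # address arithmetic on IPv4Address objects
        name = f"{prefix}{i}"
        hosts.append(f"{addr}\t{name}.{domain}\t{name}")
    resolv = [f"search {domain}", f"nameserver {nameserver}"]
    return "\n".join(hosts) + "\n", "\n".join(resolv) + "\n"


# Hypothetical three-node cluster.
hosts_text, resolv_text = graft_config(
    "10.0.0.10", 3, "cluster.example", "10.0.0.1"
)
```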
REXEC / VEXEC
• Resource Management, Autoconfig,
Mechanism/Policy, Enforcement
[Diagram: Nodes A, B, C, and D each run rexecd; the rexecd daemons, two vexecd daemons (Policy A, Policy B), and the rexec client share the Cluster IP Multicast Channel]
– rexec client: % rexec -n 2 -r 3 indexer
– “Nodes AB”, minimum $
– run indexer on Nodes AB at 3 credits/min
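The slide shows a request for 2 nodes at 3 credits/min being resolved to "Nodes AB" under a "minimum $" policy, without defining that policy. The sketch below is one possible reading, assuming the policy prefers nodes with the lowest competing credit rate and that a node's CPU is shared in proportion to the credit rates bid on it. The function names and load figures are hypothetical and are not the vexecd implementation.

```python
def pick_min_cost_nodes(competing_rate, n):
    """Return the n node names with the lowest competing credit rate.

    competing_rate maps node name -> credits/min already bid on that node.
    """
    if len(competing_rate) < n:
        raise ValueError("not enough nodes to satisfy the request")
    ranked = sorted(competing_rate, key=lambda node: (competing_rate[node], node))
    return ranked[:n]


def expected_share(rate, competing):
    """Fraction of a node's CPU a job bidding `rate` would get (assumed model)."""
    return rate / (rate + competing)


# Hypothetical load on the four nodes, in credits/min.
load = {"A": 0, "B": 1, "C": 5, "D": 9}
chosen = pick_min_cost_nodes(load, n=2)  # mirrors "rexec -n 2"
shares = {node: expected_share(3, load[node]) for node in chosen}  # "-r 3"
```

With this made-up load, the request lands on nodes A and B, matching the slide's example outcome.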
“Intelligent” Middle
• Deployed many laptops with 802.11, 3 base
stations, many PDAs with IR
• Solved the PDA-to-IR-dongle and PDA-to-Annex
serial port connections (J. Hill)
• Deployed two kiosks: touch-LCD, IR-ppp
– act as server for management
• Demonstrated key aspects of the service
infrastructure
– e.g., get device applet from service point
• eSticky notes application
• motivated xcoding-security infrastructure
• => Need to harvest and extend
Small: Embedded Servers
• Tested commercial products - promising
– Axis camera server
– SOHO NAT, DHCP, Firewall server
• plumbing = ethernet
• Identified platform for building embedded
servers
– DIMM PC + Linux + …
– 486 + 16 MB RAM + 16 MB Flash Disk
Low-power Wireless
• Tested available options
– RF Monolithics (used in Smart Rocks)
» “virtual wire” is brain-dead, but good transceiver
– RadioMetrics (used in ISI RF Tags)
» simple, primitive packet controller, no pwr down
– World Wireless
» nice MAC, but only infrastructure mode
• Selected RF Monolithics
– working with BSAC and ISI on building-block
– new packet controller + MAC
Zillions of Little Devices
• Connected device as client well-established
– distiller in the infrastructure spoonfeeds client
» powerful services in power-limited devices!
– How to get the illusion of continuous connectivity?
• What about sensor-based devices?
– they should behave as servers
» eg: camera server
– How to scale tiny server to need?
– How to get illusion of continuous connectivity?
» use the infrastructure
• First a demonstration: note server in a PDA
Assumptions
• Computation and storage in the infrastructure is
plentiful
• Wired bandwidth is pervasive and essentially
free
• [Multicast is widely accessible]
=> every device has a representative proxy in the
infrastructure
Cautionary enabling thought
[Diagram: four apps above TCP and IP]
• TCP protocol (acks, fsm, etc.) terminates in the
kernel.
• System-specific protocol exists between the kernel
TCP agent and applications.
– SSI clusters have another layer of network!
• Think of device as an application (not big laptop)
• Where is its kernel agent?
Device Access Architecture
[Diagram labels: Physical Device; low power local device link; AP (three shown); Dev MC; persistent named representative; Scalable, Available Ninja Base; Clients; Services]
• infra proxy provides name, state, queuing, etc.
• extend toward AP as optimization
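To make the role of the infrastructure proxy concrete, here is a minimal sketch of a representative that holds a name, caches state, and queues commands while the device is unreachable. The class `DeviceProxy`, its method names, and the `link` callable are assumptions for illustration and are not part of Ninja or the demo code.

```python
from collections import deque


class DeviceProxy:
    """Infrastructure-side representative for one intermittently connected device."""

    def __init__(self, name):
        self.name = name      # persistent name clients use to find the device
        self.state = {}       # last state the device reported
        self.queue = deque()  # commands waiting for the device to reconnect
        self.link = None      # callable delivering one command, None when offline

    def read(self, key):
        """Answer from cached state, so clients never wait on the radio."""
        return self.state.get(key)

    def send(self, command):
        """Queue a command; it is delivered at once if the device is reachable."""
        self.queue.append(command)
        self._drain()

    def on_connect(self, link, reported_state):
        """Called when the device appears at an access point."""
        self.link = link
        self.state.update(reported_state)
        self._drain()

    def on_disconnect(self):
        self.link = None

    def _drain(self):
        while self.link is not None and self.queue:
            self.link(self.queue.popleft())


# Usage: a client writes to the note server while the PDA is out of range.
delivered = []
proxy = DeviceProxy("note-server")
proxy.send("add note: hello")
queued_while_offline = len(proxy.queue)
proxy.on_connect(delivered.append, {"notes": 0})
```

Because clients talk only to the proxy, the device can power down between contacts while still appearing continuously reachable.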
Demo Mapping
[Diagram labels: Laptop w/ Ninja iSpace; serial; PDA; laptop AP w/ pppd; TCP; Laptop Browser; persistent named representative; AP (two shown); BayStacker 802.11 Wireless]
Key piece to build
• Low power controller with 2 stream devices
– X = sensor + actuator for devices
– X = host interface for AP and Embedded server
[Diagram: Application over Tiny Kernel over Tiny flow drivers, which drive the two stream devices, RF tcvr and X; further labels: host, svr, s, a]