Data - The Computer Laboratory
Download
Report
Transcript Data - The Computer Laboratory
Clean Slate Ubicomp Device
System Architecture
Jon Crowcroft,
http://www.cl.cam.ac.uk/~jac22
Thank you but you are in
the opposite direction!
I have 100M bytes of
data, who can carry
for me?
I can also carry for
you!
Give it to me, I have
1G bytes phone flash.
Don’t give to me! I
am running out of
storage.
Reach an access
point.
There is one
in my
Search La
pocket…
Bonheme.mp3 for
me
Internet
Finally, it
arrive…
Search La
Bonheme.mp3 for
me
Search La
Bonheme.mp3 for
me
System Architecture
Radical Device Stuff +
Cloud Stuff
Research:
split/migrate/cache/proxy/dynamics
Motivation #1
Mobile users currently have a very bad
experience with networking
Applications do not work without networking
infrastructure such as 802.11 access points or cell
phone data coverage
Local connectivity is plentiful (WiFi, Bluetooth,
etc), but very hard for end users to configure and
use
Example: Train/plane on the way to London
How to send a colleague sitting opposite some
slides to review?
How to get information on restaurants in London?
(Clue: someone else is bound to have it cached on
their device)
Underlying Problem
Applications tied to network details and
operations via use of IP-based socks
interface
Apps survive by using directory services
What interface to use
How to route to destination
When to connect
Address book maps names to email addresses
Google maps search keywords to URLs
DNS maps domain names to IP addresses
Directory services mean infrastructure
Phase transitions and networks
Solid networks: wired, or fixed wireless mesh
Liquid networks: Mobile Ad-Hoc Networking
(MANET)
Short lived end-to-gateway routes
Gaseous networks: Delay Tolerant Networking
(DTN), Pocket Switched Networking (PSN)
Long lived end-to-end routes
No routes at all!
Opportunistic, store and forward networking
One way paths, asymmetry, node mobility carries
data
Haggle targets all three, so must work in most
Haggle Overview
Clean-slate redesign of mobile node
Spans MAC to application layers
(inclusive), but is itself layerless –
uses six “managers”
Key Features:
Store-and-forward architecture with
data persisting inside Haggle not
separate file sys
App-layer protocols (SMTP, HTTP,
etc) moved into Haggle rather than
apps themselves
Forwarding decisions made on “name
graphs” allowing just-in-time binding
Resource management integrated
API is asynchronous and data-centric
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Overview of D3N Architecture
Each node is responsible for storing, indexing,
searching, and delivering data
Primitive functions associated with core D3N
calculus syntax are part of the runtime system
Prototype on MS Mobile .Net + Haggle
Runtime
8
Data-Driven Declarative Networking (D3N)
How to program distributed
computation?
Declarative is new idea in
networking
Ex. P2: Building overlay: Network
properties specified declaratively
Use of Functional Programming:
Functions are first class values and
can be both the input and the result
of other functions
Example: Query to Networks
Queries are part of source level syntax
Distributed execution (single node programmer model)
Familiar syntax
D3N:
select name from poll() where institute = “Computer Laboratory”
F#:
poll()
E
|> filter (fun r -> r.institute = “Computer Laboratory”)
C
|> map (fun r -> r.name)
A
Message:
(code, nodeid, TTL, data)
B
D
Trust, the Cloud Society:Cloud Atlas…
Anil Madhavapeddy (Cambridge/Imperial)
and
Daniele Quercia (UCL/MIT)…
Who do you trust?
Your phone
The cloud
Unreachable, goes broke, spy on you
Solution spaces
lost/stolen/broken/hacked
Migrate cloud state on/off fone
Need encapsulation of this (social vm)?
P2P soln too (nearby devices)
Resource (sensor) pooling
Social Sign on Scales/Usability?
Empirical Stuff…
Why measure human mobility?
Mobility increases capacity of dense
mobile network [tse/grossglauser]
Also create dis-connectivities
Human mobility patterns determine
communication opportunities
And discover social groupings - see later
for resource allocation (e.g. spectrum)
Experimental setup
iMotes
ARM processor
Bluetooth radio
64k flash memory
Bluetooth Inquiries
5 seconds every 2 minutes
Log {MAC address, start time, end time}
tuple of each contact
Experimental devices
Contact and Inter-contact time
Inter-contact is important
Affect the feasibility of opportunistic network
Nature of distribution affects choice of forwarding
algorithm
Rarely studied
Infocom 2005 experiment
54 iMotes distributed
Experiment duration: 3 days
41 yielded useful data
11 with battery or packaging problem
2 not returned
[data on crawdad, have several other datasets from
Hong Kong, Barcelona, Cambridge, Lisbon, and of
course other people have done this too:-]
Brief summary of data
41 iMotes
182 external devices
22459 contacts between iMotes
5791 contacts between iMote/external
device
External devices are non-iMote devices in
the environment, e.g. BT mobile phone,
Laptop.
Contacts seen by an iMote
iMoites
External Devices
Analysis of Conference Mobility
Patterns
Contact and Inter-contact
Distribution
Contacts
Inter-contacts
What do we see?
Power law distribution for contact and
Inter-contact time
Both iMotes and external nodes
Does not agree with currently used
mobility model, e.g. random way point
Power law coefficient < 1
K-clique Communities in Cambridge
Dataset
K-clique Communities in Infocom06
Dataset
Paris Groups
Lausanne Group
Barcelona Group
Barcelona Group
Paris Group A
Paris Group B
Lausanne Group
K=4
Backup Architecture
Data Objects (DOs)
DO = set of attributes
= {type, value} pairs
Can link to other DOs
Exposing metadata
facilitates search
To structure data that
should be kept together
To allow apps to
categorise/organise
Apps/Haggle managers
can “claim” DOs to
assert ownership
Message
DO-Type
Data
ContentType
message/rfc822
From
James Scott
To
Richard Gass
Subject
Check this photo out!
Body
[text]
Attachment
DO-Type
Data
ContentType
image/jpeg
Keywords
Sunset, London
Creation
time
05/06/06 2015
GMT
Data
[binary]
DO Filters
Queries on fields of data objects
E.g. “content-type” EQUALS “text/html” AND
“keywords” INCLUDES “news” AND
“timestamp” >= (now() – 1 hour)
DO filters are also a special case of DOs
Haggle itself can match DOFilters to DOs –
apps don’t have to be involved
Can be persistent or be sent remotely…
DO Filter is a powerful mechanism
One-Off
Local
“Desktop”
Search
(find mp3s with
artist “U2”)
Remote “Web” Search
(find “london
restaurants”)
Persistent
Listen
(wants to receive
webpages)
Subscribe
(send all photos
created by user X to
X’s PC)
Layerless Naming
Haggle needs just-in-time binding of user
level names to destinations
Q: when messaging a user, should you send to
their email server or look in the
neighbourhood for their laptop’s MAC
address?
A: Both, even if you already reached one. E.g. you
can send email to a server and later pass them in
the corridor, or you could see their laptop directly,
but they aren’t carrying it today so you’d better
email it too…
Current layered model requires ahead-of-time
resolution by the user themselves in the
Name Graphs comprised of Name
Objects
Name Graph represents full
variety of ways to reach a userlevel name
NO = special class of DO
Used as destinations for data in
transit
Names and links between names
obtained from
Applications
Network interfaces
Neighbours
Data passing through
Directories
DOType
Name
Name
James Scott
DO-Type
Name
Name
00:0E:F6:23:91:3
4
DO-Type
Name
Name
[email protected]
rg
Forwarding Objects
Special class of DO
used for storing
metadata about
forwarding
TTL,expiry, etc
Since full structure
of naming and data is
sent, “intermediate”
nodes are empowered
to:
Use data as they see
fit
FO
DO
DO
NO
DO
DO
NO
NO
NO
Connectivities and Protocols
Connectivities (network interfaces) say which
“neighbours” are available (including
“Internet”)
Protocols use this to determine which NOs
they can deliver to, on a per-FO basis
P2P protocol says it can deliver any FO to
neighbour-derived NOs if corresponding neighbour
is visible
HTTP protocol can deliver FOs which contain a
DOFilter asking for a URL, if “Internet” neighbour
is present
Protocols can also perform tasks directly
POP protocol creates EmailReceiveTask when
Forwarding Algorithms
{Protocol, Name, Neighbour}
FOs
x
xx
x
x
x
x
x
algorithm 1
algorithm 2
x = scalar
“benefit” of
forwarding task
Forwarding algorithms create Forwarding
Tasks to send data to suitable next-hops
Can also create Tasks to perform signalling
Many forwarding algs can run simultaneously
Resource Management –
Tasks and Cost/Benefit
Task( Benefit getBenefit(), Cost getCost() )
Cost = {Energy, Time on network X, Money}
Benefit = {App, User, Forwarding}
Resource manager does cost/benefit
comparison using some utility function
Owners preferences must also be applied
E.g. don’t spend my money on others’ traffic
Implications of using Tasks
Tasks can come from all managers – http fetch, email
receive, neighbour discovery, etc
Tasks are executed at dynamic times/intervals (or
not done at all) based on
Key illustration of “layerless” architecture
Current resource costs
Other tasks
User priorities/preferences
“Too many” tasks is fine, even encouraged – unlike
with IP where network operations are queued
E.g. it is hard for a current web client to express
“predictively download these pages, if energy and bandwidth
are plentiful and free”
Connectivities have
responsibility for
neighbour
discovery
Protocols use
neighbours to
“mark” NOs
“nearby”
Resource
management
controls frequency
of neighbour
discovery
Applications (messaging, web, etc)
Haggle Application Interface
Name
5. Insert new names
Neighbour
Discovery
Protocol
Connectivities (WiFi, BT, GPRS, etc)
Haggle Application Interface
Name
4. Decide
next hop X
1. Insert data
3. Call “send”
API for send split
into three (sets
of) calls
FO can be sent to
many nodes using
many protocols
Asynchronous
Benefits of send
change with time
and context
2. Insert names
Applications (messaging, web, etc)
10. Connect & transmit
Sending
Data
Protocol
Connectivities (WiFi, BT, GPRS, etc)
Receiving
Data
Haggle Application Interface
Name
2. Incoming connection
Incoming data is
still processed using
tasks
Eventually inserted
into Data Manager
Apps “listen” by
registering
interests
(DOFilters)
Applications (messaging, web, etc)
Protocol
Connectivities (WiFi, BT, GPRS, etc)
Aside on security etc
Security was “left out” for version 1 in this 4year EU project, but threats were considered
Data security can reuse existing solutions of
authentication/encryption
Some new threats to privacy
With proviso that it is not possible to rely on a
synchronously available trusted third party
Neighbourhood visibility means trackability
Name graphs could include quite private
information
Incentives to cooperate an issue
Why should I spend any bandwidth/energy on your
stuff?
Implementation Status
Implemented in Java for XP and Linux
Ported to Windows Mobile using C# (Java also
runs)
Code at http://haggle.cvs.sourceforge.net/
Connectivity: WiFi, GPRS, (Bluetooth, SMS)
Protocol: P2P, SMTP, POP, HTTP, (Search)
Data: SQL, In-Memory, (SQLite)
Name, Resource, Forwarding: “Good defaults”
Forwarding Algs: Direct, Epidemic, (…)
Apps: Email, Web (both via proxies)
D N*
Programming Distributed
Computation
in Pocket Switched Networks
Extra slides…
Eiko Yoneki, Ioannis Baltopoulos
and Jon Crowcroft
University
of Cambridge
Computer
* Data
Driven Declarative
Networking
Declarative Networking
Declarative is new idea in
networking
e.g. Search: ‘what to look for’
rather than ‘how to look for’
Abstract complexity in
networking/data processing
P2: Building overlay using
Overlog
Network properties specified
D3N Data-Driven Declarative Networking
How to program distributed
computation?
Use Declarative Networking
Use of Functional Programming
Simple/clean semantics, expressive,
inherent parallelism
Queries/Filer etc. can be
expressed as higher-order
D3N and Functional Programming I
Functions are first-class values
They can be both input and
output of other functions
They can be shared between
different nodes (code mobility)
Not only data but also functions
flow
Language syntax does not have
state
Variables are only ever assigned
D3N and Functional Programming II
Integrated features from query
language
Assurance as in logical
programming
Appropriate level of abstraction
Imperative languages closely
specify the implementation
details (how); declarative
Overview of D3N Architecture
Each node is responsible for storing, indexing,
searching, and delivering data
Primitive functions associated with core D3N
calculus syntax are part of the runtime system
Prototype on MS Mobile .NET
47
D3N Syntax and Semantics I
Very few primitives
Integer, strings, lists, floating
point numbers and other
primitives are recovered
through constructor application
Standard FP features
Declaring and naming
functions through let-bindings
48
D3N Syntax and Semantics II
Advanced features
Concurrency (fork)
Communication (send/receive
primitives)
Query expressions (local and
distributed select)
49
D3N Language (Core Calculus
Syntax)
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
50
Runtime System
Language relies on a small
runtime system
Operations implemented in the
runtime system written in F#
Each node is responsible on
data:
Storing
Indexing
51
Kernel Event Handler
Kernel maintains
An event queue (queue)
A list of functions for each event
(fenc, fdep)
Kernel processes
It removes an event from the
front of the queue (e)
Pattern matches against the
52
Example: Query to Networks
Queries are part of source level
syntax
Distributed execution (single
name from
poll() where institutemodel)
= “Computer Laboratory”
D N: select
node
programmer
Familiar syntax
poll()
F#:
3
E
|> filter (fun r -> r.institute = “Computer Laboratory”)
|> map (fun r -> r.name)
C
A
Message:
(code, nodeid, TTL, data)
B
D
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Example: Vote among Nodes
Voting application: implements
a distributed voting protocol of
choosing location for dinner
Rules
Each node votes once
A single node initiates the
application
Ballots should not be counted
twice
54
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Sequential Map function (smap)
Inner working
It sends the code to execute on
the remote node
It blocks waiting for a response
waiting from the node
Continues mapping the function
to the rest of the nodes in a
sequential fashion
An unavailable node blocks the
55
Parallel Map Function (pmap)
Inner working
Similar to the sequential case
The send/receive for each node
happen in a separate thread
An unavailable node does not
A
block the entire computation
pmap
B
C
D
E
F
G
56
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Reduce Function
Inner working
The reduce function aggregates
the results from a map
The reduce gets executed on
the initiator node
All results must have been
received before the reduce can
proceed
57
Voting Application Code
58
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Cascaded Map Function
Social Graph can be exploited
for map function
Logical topology
extracted
from
D
G
B
social
networks
E
F
A
C
Construct a minimum spanning
(a) Social Graph
tree with node A
D
D
B
B
Use tree as
navigation
of
task
E
E
A
F
C
(b) Nodes for Map at A
(c) Nodes for Map at B
59
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Outlook and Future Work
Current reference
implementation:
F# targeting .NET platform
taking advantage of a vast
collection of .NET libraries for
implementing D3N primitives
Future work:
Security issues are currently out
of the scope of this paper.
Prototype Application: Email
Standard email client
Haggle provides localhost
SMTP and POP servers
Emails become FO/DO/NOs
Protocols
Email client
on Haggle
email
SMTP marks email addrs
Internet
“nearby” when Internet is
available
Haggle Data Object
encapsulated in email
P2P marks MAC addrs
“nearby” when visible
POP creates “task” to
P2P
email
receive email when Internet
Haggle
visible (allows dynamic email
DOs
check interval…)
Email client
on Haggle
email
Standard
email
client
Prototype Application: Web
HTTP request becomes DO
filter for URL
Handled via:
Standard web clients used
Haggle acts as web proxy
Search page translated into DO
filter on content not URL
Search protocol indirects via
search engine and gets top N
links
Object request
http
Haggle
Local DOs (including data being
routed for others)
Internal
HTTP protocol to Internet
object
P2P protocol to neighbours
cache
Extension: search
Web client
Internal
object
cache
http
http
Web server
Internet
Haggle
Email Performance
Haggle/infrastructure and w/out-Haggle are comparable
(i.e. proxying etc adds acceptably low overhead)
Haggle/ad hoc wins – not just in elapsed time but also
because email account would not accept >5Mb attachments
100
100Kb
1Mb
5Mb
10
10Mb
1
ag
gl
e/
in
fra
H
ag
gl
e/
ad
ho
c
H
ag
gl
e/
bo
th
Due to neighbour sensing,
email checking, etc
causing 802.11 to switch
between AP and adhoc
mode often
Being addressed with
MultiNet approach
[Chandra,Bahl2]
10b
H
1000
ag
gl
e
Haggle/both is in
between, with more
variability
W
/o
ut
H
End-to-end email time (s)
Future work idea:
Resource-aware media sharing
Devices could do a lot of media sharing proactively
But resource management is crucial
Downloading public content predictively (e.g. webpages)
Backing up/keeping multiple caches of personal content (e.g.
photos, media)
Sharing media with friends/family/public
Contrast laptop in docking station to laptop on plane
Proactive tasks can exhaust battery life easily
Current architecture does not suffice
IP stack is difficult to configure with user priorities (e.g.
when should smart phone use GPRS and when WiFi?)
Current file system makes it hard for apps to fill disk with
“predictive” content
Using Haggle for
resource-aware media sharing
Haggle can store application data predictively
and evict it if “more important” data comes
along
Haggle makes forwarding decisions with
knowledge of resource consumption and
application priorities
Keep your photo albums in empty laptop space
Keep “public” media in TiVo-like fashion
Doesn’t ship holiday snaps over GPRS from
Australia
Haggle can share data with neighbours,
facilitating use of free fast local connectivity
wherever possible
Backup Empirical Stuff…
Implication of Power Law
Coefficient
Large coefficient => Smaller delay
Consider 2-hops relaying [tse/grossglauser] analysis [TechRep]
Denote power law coefficient as a
For a > 2
Any stateless algorithm achieves a finite expected delay.
For a > (m+1)/m and #{nodes} ≥2m :
There exist a forwarding algorithm with m copies and a finite expected
delay.
For a < 1
No stateless algorithm (even flooding) achieve a bounded delay (Orey’s
theorem).
Frequency of sightings and pairwise
contact
What do we see?
Nodes are not equal, some active and
some not
Does not agree with current mobility model,
equally distributed.
iMotes are seen more often than
external address
More iMotes pair contact
Identify Sharing Communities to improve
forwarding algorithm
Influence of time of day
What do we see?
Still a power law distribution for any
three-hour period of the day
Different power law coefficient for
different time
Maybe different forwarding algorithm for
different time of the day
Consequences for mobile
networking
Mobility models needs to be redesigned
Exponential decay of inter contact is wrong
Mechanisms tested with that model need to
be analyzed with new mobility assumptions
Stateless forwarding does not work
Can we benefit from heterogeneity to
forward by communities ?
Should we consider different algorithm for
different time of the day?
Human Hubs: Popularity
Reality
Infocom06
Cambridge
HK
Forwarding
Explicit Social Structure Scheme Design Space
Bubble
Label
Human
Dimension
Structure in
Cohesive Group
Clique
Label
Network Plane
Rank, Degree
Structure in Degree
Social Structure Based Forwarding in
DTNs & MANETs
Majority of DTN routing based
on epidemic, optimized
flooding algorithms:
• Limited number of copies
• Context-based [PROPHET]
• Use of mobility [FERRY]
BUBBLE RAP: Use of social hubs (e.g.
celebrities and postman) as betweenness
centrality and combining community structure
for improved forwarding efficiency.
Social Structure Based
Forwarding:
BUBBLE
RANK: Centrality based
forwarding. Each node has
global and local ranking of
popularity
LABEL: Community based
forwarding
BUBBLE RAP: Combination of
centrality and community
Implementation as a forwarding component of
Haggle (http://www.haggleproject.org)
framework. Haggle provides data centric
opportunistic networking with persistence of
search, delay tolerance and opportunism.
Metrics and Forwarding:
Preferred neighbors can be
•clique based on distributed detection, or explicit
label (e.g. same squad).
•based on centrality in detected graph.
Bubble combines both.
75
Social Structure Based Forwarding in
DTNs & MANETs (Cont)
BUBBLE performance versus flood or wait or pure LABEL: Cost reduction in
BUBBLE is significant.
BUBBLE will be effective for hierarchical community structure. To be
evaluated.
Scaled Haggle experiments (up to 200 devices) in the university campus is
in plan. This will demonstrate BUBBLE-LABEL performance in real world.
Social Based Forwarding (e.g. use of community structure and centrality) shows
significant improvement for forwarding efficiency
76
Use affiliation+hubs to fwd
inter+intra cliques
The End!
Some NOs for you:
[email protected]
http://www.haggleproject.org/
Thanks for listening
Any questions?