Data - The Computer Laboratory

Download Report

Transcript Data - The Computer Laboratory

Clean Slate Ubicomp Device
System Architecture
Jon Crowcroft,
http://www.cl.cam.ac.uk/~jac22
Thank you but you are in
the opposite direction!
I have 100M bytes of
data, who can carry
for me?
I can also carry for
you!
Give it to me, I have
1G bytes phone flash.
Don’t give to me! I
am running out of
storage.
Reach an access
point.
There is one
in my
Search La
pocket…
Bonheme.mp3 for
me
Internet
Finally, it
arrive…
Search La
Bonheme.mp3 for
me
Search La
Bonheme.mp3 for
me
System Architecture
Radical Device Stuff +
Cloud Stuff
Research:
split/migrate/cache/proxy/dynamics
Motivation #1

Mobile users currently have a very bad
experience with networking



Applications do not work without networking
infrastructure such as 802.11 access points or cell
phone data coverage
Local connectivity is plentiful (WiFi, Bluetooth,
etc), but very hard for end users to configure and
use
Example: Train/plane on the way to London


How to send a colleague sitting opposite some
slides to review?
How to get information on restaurants in London?
(Clue: someone else is bound to have it cached on
their device)
Underlying Problem

Applications tied to network details and
operations via use of IP-based socks
interface




Apps survive by using directory services




What interface to use
How to route to destination
When to connect
Address book maps names to email addresses
Google maps search keywords to URLs
DNS maps domain names to IP addresses
Directory services mean infrastructure
Phase transitions and networks

Solid networks: wired, or fixed wireless mesh


Liquid networks: Mobile Ad-Hoc Networking
(MANET)


Short lived end-to-gateway routes
Gaseous networks: Delay Tolerant Networking
(DTN), Pocket Switched Networking (PSN)




Long lived end-to-end routes
No routes at all!
Opportunistic, store and forward networking
One way paths, asymmetry, node mobility carries
data
Haggle targets all three, so must work in most
Haggle Overview


Clean-slate redesign of mobile node
Spans MAC to application layers
(inclusive), but is itself layerless –
uses six “managers”
Key Features:

Store-and-forward architecture with
data persisting inside Haggle not
separate file sys

App-layer protocols (SMTP, HTTP,
etc) moved into Haggle rather than
apps themselves

Forwarding decisions made on “name
graphs” allowing just-in-time binding
 Resource management integrated

API is asynchronous and data-centric
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Overview of D3N Architecture
 Each node is responsible for storing, indexing,
searching, and delivering data
 Primitive functions associated with core D3N
calculus syntax are part of the runtime system
 Prototype on MS Mobile .Net + Haggle
Runtime
8
Data-Driven Declarative Networking (D3N)


How to program distributed
computation?
Declarative is new idea in
networking
Ex. P2: Building overlay: Network
properties specified declaratively
 Use of Functional Programming:
Functions are first class values and
can be both the input and the result
of other functions

Example: Query to Networks
 Queries are part of source level syntax
 Distributed execution (single node programmer model)
 Familiar syntax
D3N:
select name from poll() where institute = “Computer Laboratory”
F#:
poll()
E
|> filter (fun r -> r.institute = “Computer Laboratory”)
C
|> map (fun r -> r.name)
A
Message:
(code, nodeid, TTL, data)
B
D
Trust, the Cloud Society:Cloud Atlas…
Anil Madhavapeddy (Cambridge/Imperial)
and
Daniele Quercia (UCL/MIT)…
Who do you trust?

Your phone


The cloud


Unreachable, goes broke, spy on you
Solution spaces





lost/stolen/broken/hacked
Migrate cloud state on/off fone
Need encapsulation of this (social vm)?
P2P soln too (nearby devices)
Resource (sensor) pooling
Social Sign on Scales/Usability?
Empirical Stuff…
Why measure human mobility?




Mobility increases capacity of dense
mobile network [tse/grossglauser]
Also create dis-connectivities
Human mobility patterns determine
communication opportunities
And discover social groupings - see later
for resource allocation (e.g. spectrum)
Experimental setup

iMotes




ARM processor
Bluetooth radio
64k flash memory
Bluetooth Inquiries


5 seconds every 2 minutes
Log {MAC address, start time, end time}
tuple of each contact
Experimental devices
Contact and Inter-contact time

Inter-contact is important



Affect the feasibility of opportunistic network
Nature of distribution affects choice of forwarding
algorithm
Rarely studied
Infocom 2005 experiment






54 iMotes distributed
Experiment duration: 3 days
41 yielded useful data
11 with battery or packaging problem
2 not returned
[data on crawdad, have several other datasets from
Hong Kong, Barcelona, Cambridge, Lisbon, and of
course other people have done this too:-]
Brief summary of data





41 iMotes
182 external devices
22459 contacts between iMotes
5791 contacts between iMote/external
device
External devices are non-iMote devices in
the environment, e.g. BT mobile phone,
Laptop.
Contacts seen by an iMote
iMoites
External Devices
Analysis of Conference Mobility
Patterns
Contact and Inter-contact
Distribution
Contacts
Inter-contacts
What do we see?




Power law distribution for contact and
Inter-contact time
Both iMotes and external nodes
Does not agree with currently used
mobility model, e.g. random way point
Power law coefficient < 1
K-clique Communities in Cambridge
Dataset
K-clique Communities in Infocom06
Dataset
Paris Groups
Lausanne Group
Barcelona Group
Barcelona Group
Paris Group A
Paris Group B
Lausanne Group
K=4
Backup Architecture
Data Objects (DOs)

DO = set of attributes
= {type, value} pairs


Can link to other DOs



Exposing metadata
facilitates search
To structure data that
should be kept together
To allow apps to
categorise/organise
Apps/Haggle managers
can “claim” DOs to
assert ownership
Message
DO-Type
Data
ContentType
message/rfc822
From
James Scott
To
Richard Gass
Subject
Check this photo out!
Body
[text]
Attachment
DO-Type
Data
ContentType
image/jpeg
Keywords
Sunset, London
Creation
time
05/06/06 2015
GMT
Data
[binary]
DO Filters





Queries on fields of data objects
E.g. “content-type” EQUALS “text/html” AND
“keywords” INCLUDES “news” AND
“timestamp” >= (now() – 1 hour)
DO filters are also a special case of DOs
Haggle itself can match DOFilters to DOs –
apps don’t have to be involved
Can be persistent or be sent remotely…
DO Filter is a powerful mechanism
One-Off
Local
“Desktop”
Search
(find mp3s with
artist “U2”)
Remote “Web” Search
(find “london
restaurants”)
Persistent
Listen
(wants to receive
webpages)
Subscribe
(send all photos
created by user X to
X’s PC)
Layerless Naming


Haggle needs just-in-time binding of user
level names to destinations
Q: when messaging a user, should you send to
their email server or look in the
neighbourhood for their laptop’s MAC
address?


A: Both, even if you already reached one. E.g. you
can send email to a server and later pass them in
the corridor, or you could see their laptop directly,
but they aren’t carrying it today so you’d better
email it too…
Current layered model requires ahead-of-time
resolution by the user themselves in the
Name Graphs comprised of Name
Objects




Name Graph represents full
variety of ways to reach a userlevel name
NO = special class of DO
Used as destinations for data in
transit
Names and links between names
obtained from





Applications
Network interfaces
Neighbours
Data passing through
Directories
DOType
Name
Name
James Scott
DO-Type
Name
Name
00:0E:F6:23:91:3
4
DO-Type
Name
Name
[email protected]
rg
Forwarding Objects

Special class of DO
used for storing
metadata about
forwarding


TTL,expiry, etc
Since full structure
of naming and data is
sent, “intermediate”
nodes are empowered
to:

Use data as they see
fit
FO
DO
DO
NO
DO
DO
NO
NO
NO
Connectivities and Protocols


Connectivities (network interfaces) say which
“neighbours” are available (including
“Internet”)
Protocols use this to determine which NOs
they can deliver to, on a per-FO basis



P2P protocol says it can deliver any FO to
neighbour-derived NOs if corresponding neighbour
is visible
HTTP protocol can deliver FOs which contain a
DOFilter asking for a URL, if “Internet” neighbour
is present
Protocols can also perform tasks directly

POP protocol creates EmailReceiveTask when
Forwarding Algorithms
{Protocol, Name, Neighbour}
FOs
x
xx



x
x
x
x
x
algorithm 1
algorithm 2
x = scalar
“benefit” of
forwarding task
Forwarding algorithms create Forwarding
Tasks to send data to suitable next-hops
Can also create Tasks to perform signalling
Many forwarding algs can run simultaneously
Resource Management –
Tasks and Cost/Benefit





Task( Benefit getBenefit(), Cost getCost() )
Cost = {Energy, Time on network X, Money}
Benefit = {App, User, Forwarding}
Resource manager does cost/benefit
comparison using some utility function
Owners preferences must also be applied

E.g. don’t spend my money on others’ traffic
Implications of using Tasks

Tasks can come from all managers – http fetch, email
receive, neighbour discovery, etc


Tasks are executed at dynamic times/intervals (or
not done at all) based on




Key illustration of “layerless” architecture
Current resource costs
Other tasks
User priorities/preferences
“Too many” tasks is fine, even encouraged – unlike
with IP where network operations are queued

E.g. it is hard for a current web client to express
“predictively download these pages, if energy and bandwidth
are plentiful and free”



Connectivities have
responsibility for
neighbour
discovery
Protocols use
neighbours to
“mark” NOs
“nearby”
Resource
management
controls frequency
of neighbour
discovery
Applications (messaging, web, etc)
Haggle Application Interface
Name
5. Insert new names
Neighbour
Discovery
Protocol
Connectivities (WiFi, BT, GPRS, etc)



Haggle Application Interface
Name
4. Decide
next hop X
1. Insert data
3. Call “send”
API for send split
into three (sets
of) calls
FO can be sent to
many nodes using
many protocols
Asynchronous
Benefits of send
change with time
and context
2. Insert names

Applications (messaging, web, etc)
10. Connect & transmit
Sending
Data
Protocol
Connectivities (WiFi, BT, GPRS, etc)
Receiving
Data


Haggle Application Interface
Name
2. Incoming connection

Incoming data is
still processed using
tasks
Eventually inserted
into Data Manager
Apps “listen” by
registering
interests
(DOFilters)
Applications (messaging, web, etc)
Protocol
Connectivities (WiFi, BT, GPRS, etc)
Aside on security etc


Security was “left out” for version 1 in this 4year EU project, but threats were considered
Data security can reuse existing solutions of
authentication/encryption


Some new threats to privacy



With proviso that it is not possible to rely on a
synchronously available trusted third party
Neighbourhood visibility means trackability
Name graphs could include quite private
information
Incentives to cooperate an issue

Why should I spend any bandwidth/energy on your
stuff?
Implementation Status









Implemented in Java for XP and Linux
Ported to Windows Mobile using C# (Java also
runs)
Code at http://haggle.cvs.sourceforge.net/
Connectivity: WiFi, GPRS, (Bluetooth, SMS)
Protocol: P2P, SMTP, POP, HTTP, (Search)
Data: SQL, In-Memory, (SQLite)
Name, Resource, Forwarding: “Good defaults”
Forwarding Algs: Direct, Epidemic, (…)
Apps: Email, Web (both via proxies)
D N*
Programming Distributed
Computation
in Pocket Switched Networks
Extra slides…
Eiko Yoneki, Ioannis Baltopoulos
and Jon Crowcroft
University
of Cambridge
Computer
* Data
Driven Declarative
Networking
Declarative Networking
Declarative is new idea in
networking
 e.g. Search: ‘what to look for’
rather than ‘how to look for’
 Abstract complexity in
networking/data processing
 P2: Building overlay using
Overlog
 Network properties specified

D3N Data-Driven Declarative Networking


How to program distributed
computation?
Use Declarative Networking

Use of Functional Programming


Simple/clean semantics, expressive,
inherent parallelism
Queries/Filer etc. can be
expressed as higher-order
D3N and Functional Programming I

Functions are first-class values
They can be both input and
output of other functions
 They can be shared between
different nodes (code mobility)
 Not only data but also functions
flow


Language syntax does not have
state

Variables are only ever assigned
D3N and Functional Programming II


Integrated features from query
language
 Assurance as in logical
programming
Appropriate level of abstraction
 Imperative languages closely
specify the implementation
details (how); declarative
Overview of D3N Architecture
 Each node is responsible for storing, indexing,
searching, and delivering data
 Primitive functions associated with core D3N
calculus syntax are part of the runtime system
 Prototype on MS Mobile .NET
47
D3N Syntax and Semantics I
 Very few primitives


Integer, strings, lists, floating
point numbers and other
primitives are recovered
through constructor application
Standard FP features

Declaring and naming
functions through let-bindings
48
D3N Syntax and Semantics II

Advanced features
Concurrency (fork)
 Communication (send/receive
primitives)
 Query expressions (local and
distributed select)

49
D3N Language (Core Calculus
Syntax)
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
50
Runtime System
 Language relies on a small
runtime system


Operations implemented in the
runtime system written in F#
Each node is responsible on
data:
Storing
 Indexing

51
Kernel Event Handler
 Kernel maintains
An event queue (queue)
 A list of functions for each event
(fenc, fdep)


Kernel processes
It removes an event from the
front of the queue (e)
 Pattern matches against the

52
Example: Query to Networks

Queries are part of source level
syntax
Distributed execution (single
name from
poll() where institutemodel)
= “Computer Laboratory”
D N: select
node
programmer
 Familiar syntax
poll()
F#:

3
E
|> filter (fun r -> r.institute = “Computer Laboratory”)
|> map (fun r -> r.name)
C
A
Message:
(code, nodeid, TTL, data)
B
D
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Example: Vote among Nodes
 Voting application: implements
a distributed voting protocol of
choosing location for dinner

Rules
Each node votes once
 A single node initiates the
application
 Ballots should not be counted
twice

54
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Sequential Map function (smap)
 Inner working
It sends the code to execute on
the remote node
 It blocks waiting for a response
waiting from the node
 Continues mapping the function
to the rest of the nodes in a
sequential fashion
 An unavailable node blocks the

55
Parallel Map Function (pmap)
 Inner working
Similar to the sequential case
 The send/receive for each node
happen in a separate thread
 An unavailable node does not
A
block the entire computation
pmap

B
C
D
E
F
G
56
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Reduce Function
 Inner working
The reduce function aggregates
the results from a map
 The reduce gets executed on
the initiator node
 All results must have been
received before the reduce can
proceed

57
Voting Application Code
58
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Cascaded Map Function
 Social Graph can be exploited
for map function
Logical topology
extracted
from
D
G
B
social
networks
E
F
A
C
 Construct a minimum spanning
(a) Social Graph
tree with node A
D
D
B
B
 Use tree as
navigation
of
task
E
E
A
F

C
(b) Nodes for Map at A
(c) Nodes for Map at B
59
Event.register( Event.OnEncounter, fun d:device ->
if d.nID = “B” && distance(self,d) < 3 then
dispatch NodeEncountered(d);
)
Outlook and Future Work
 Current reference
implementation:


F# targeting .NET platform
taking advantage of a vast
collection of .NET libraries for
implementing D3N primitives
Future work:

Security issues are currently out
of the scope of this paper.
Prototype Application: Email

Standard email client



Haggle provides localhost
SMTP and POP servers
Emails become FO/DO/NOs
Protocols



Email client
on Haggle
email
SMTP marks email addrs
Internet
“nearby” when Internet is
available
Haggle Data Object
encapsulated in email
P2P marks MAC addrs
“nearby” when visible
POP creates “task” to
P2P
email
receive email when Internet
Haggle
visible (allows dynamic email
DOs
check interval…)
Email client
on Haggle
email
Standard
email
client
Prototype Application: Web

HTTP request becomes DO
filter for URL



Handled via:




Standard web clients used
Haggle acts as web proxy

Search page translated into DO
filter on content not URL
Search protocol indirects via
search engine and gets top N
links
Object request
http
Haggle
Local DOs (including data being
routed for others)
Internal
HTTP protocol to Internet
object
P2P protocol to neighbours
cache
Extension: search

Web client
Internal
object
cache
http
http
Web server
Internet
Haggle
Email Performance
Haggle/infrastructure and w/out-Haggle are comparable
(i.e. proxying etc adds acceptably low overhead)
 Haggle/ad hoc wins – not just in elapsed time but also
because email account would not accept >5Mb attachments
100
100Kb
1Mb
5Mb
10
10Mb
1
ag
gl
e/
in
fra
H
ag
gl
e/
ad
ho
c
H
ag
gl
e/
bo
th

Due to neighbour sensing,
email checking, etc
causing 802.11 to switch
between AP and adhoc
mode often
Being addressed with
MultiNet approach
[Chandra,Bahl2]
10b
H

1000
ag
gl
e
Haggle/both is in
between, with more
variability
W
/o
ut
H

End-to-end email time (s)

Future work idea:
Resource-aware media sharing

Devices could do a lot of media sharing proactively




But resource management is crucial



Downloading public content predictively (e.g. webpages)
Backing up/keeping multiple caches of personal content (e.g.
photos, media)
Sharing media with friends/family/public
Contrast laptop in docking station to laptop on plane
Proactive tasks can exhaust battery life easily
Current architecture does not suffice


IP stack is difficult to configure with user priorities (e.g.
when should smart phone use GPRS and when WiFi?)
Current file system makes it hard for apps to fill disk with
“predictive” content
Using Haggle for
resource-aware media sharing

Haggle can store application data predictively
and evict it if “more important” data comes
along



Haggle makes forwarding decisions with
knowledge of resource consumption and
application priorities


Keep your photo albums in empty laptop space
Keep “public” media in TiVo-like fashion
Doesn’t ship holiday snaps over GPRS from
Australia
Haggle can share data with neighbours,
facilitating use of free fast local connectivity
wherever possible
Backup Empirical Stuff…
Implication of Power Law
Coefficient




Large coefficient => Smaller delay
Consider 2-hops relaying [tse/grossglauser] analysis [TechRep]
Denote power law coefficient as a
For a > 2
Any stateless algorithm achieves a finite expected delay.

For a > (m+1)/m and #{nodes} ≥2m :
There exist a forwarding algorithm with m copies and a finite expected
delay.

For a < 1
No stateless algorithm (even flooding) achieve a bounded delay (Orey’s
theorem).
Frequency of sightings and pairwise
contact
What do we see?

Nodes are not equal, some active and
some not



Does not agree with current mobility model,
equally distributed.
iMotes are seen more often than
external address
More iMotes pair contact

Identify Sharing Communities to improve
forwarding algorithm
Influence of time of day
What do we see?


Still a power law distribution for any
three-hour period of the day
Different power law coefficient for
different time

Maybe different forwarding algorithm for
different time of the day
Consequences for mobile
networking

Mobility models needs to be redesigned



Exponential decay of inter contact is wrong
Mechanisms tested with that model need to
be analyzed with new mobility assumptions
Stateless forwarding does not work


Can we benefit from heterogeneity to
forward by communities ?
Should we consider different algorithm for
different time of the day?
Human Hubs: Popularity
Reality
Infocom06
Cambridge
HK
Forwarding
Explicit Social Structure Scheme Design Space
Bubble
Label
Human
Dimension
Structure in
Cohesive Group
Clique
Label
Network Plane
Rank, Degree
Structure in Degree
Social Structure Based Forwarding in
DTNs & MANETs
Majority of DTN routing based
on epidemic, optimized
flooding algorithms:
• Limited number of copies
• Context-based [PROPHET]
• Use of mobility [FERRY]
BUBBLE RAP: Use of social hubs (e.g.
celebrities and postman) as betweenness
centrality and combining community structure
for improved forwarding efficiency.
Social Structure Based
Forwarding:
BUBBLE
RANK: Centrality based
forwarding. Each node has
global and local ranking of
popularity
LABEL: Community based
forwarding
BUBBLE RAP: Combination of
centrality and community
Implementation as a forwarding component of
Haggle (http://www.haggleproject.org)
framework. Haggle provides data centric
opportunistic networking with persistence of
search, delay tolerance and opportunism.
Metrics and Forwarding:
Preferred neighbors can be
•clique based on distributed detection, or explicit
label (e.g. same squad).
•based on centrality in detected graph.
Bubble combines both.
75
Social Structure Based Forwarding in
DTNs & MANETs (Cont)
BUBBLE performance versus flood or wait or pure LABEL: Cost reduction in
BUBBLE is significant.
BUBBLE will be effective for hierarchical community structure. To be
evaluated.
Scaled Haggle experiments (up to 200 devices) in the university campus is
in plan. This will demonstrate BUBBLE-LABEL performance in real world.
Social Based Forwarding (e.g. use of community structure and centrality) shows
significant improvement for forwarding efficiency
76
Use affiliation+hubs to fwd
inter+intra cliques
The End!

Some NOs for you:




[email protected]
http://www.haggleproject.org/
Thanks for listening 
Any questions?