Three Challenges in Reliable Data Transport over Heterogeneous


Middleware Design
• Goals
  - Identify the issues for middleware design in
    wireless and mobile environments
  - An illustrative middleware framework
  - Detailed design for an image transcoding proxy
    and application session handoff

Middleware Definition
• RFC 2768:
  - Def 1: those services found above the transport
    (i.e., over TCP/IP) layer set of services but below
    the application environment (i.e., below
    application-level APIs)
  - Def 2: a reusable, expandable set of services and
    functions that are commonly needed by many
    applications to function well in a networked
    environment
• Industry usage:
  - Software gateway (“glue”) between two apps

Issues for Middleware Design
• Legacy systems and protocols
• Diverse networks (wireline, indoor & outdoor
wireless)
• Network dynamics: congestion, link errors,
failures, attacks
• Device and platform heterogeneity
• User mobility
• Thin-client support
• Large number of users and devices

Some middleware design goals for wireless and mobile devices
• Improve user experience across heterogeneous
devices (e.g., PDAs, laptops, desktops)
  - e.g., transcoding (adaptive content delivery)
• Provide new services for heterogeneous devices
  - e.g., application state migration
• Minimal change to the existing infrastructure and
applications: may add/change a few more “boxes”
• Adaptation to network dynamics (induced by
mobility and wireless links)
• Scalable and secure service
• Service availability (in the presence of failures,
attacks and large user population)

Transcoding middleware service
• Client variations along 3 dimensions:
  - Network variation
    • bandwidth, latency and error behavior
  - Hardware variation
    • screen size/resolution, color/grayscale bit depth, memory, CPU
  - Software variation
    • Applications for specific MIME types (PDF, PS, PPT, AVI, etc.)
    • Codecs for specific encodings (H.263, JPEG, etc.)
• Transcoding goals:
  - Reduce latency experienced by user
    • Reduce image’s color depth, resolution to get smaller file
  - Provide access to new types of content
    • PDF → text, speech → text

Transcoding design issues
• Design issues for adapting to variation:
  - How: datatype-specific lossy compression
    mechanisms: distillation and refinement based on
    the (MIME) type of the data
  - Where: at the content server or at a proxy
  - When: static or on-demand

Distillation and refinement
• Main idea: high-level semantic types (MIME types)
dictate datatype-specific operations
  - Images: can discard color info, high-frequency
    components, or pixel resolution
  - Video: can additionally reduce the frame rate
  - Formatted text: can discard some formatting information
• Datatype-specific distillation: highly lossy,
datatype-specific compression that preserves most
of the semantic content of a data object while
adhering to a particular set of constraints
• Datatype-specific refinement: fetching some part
(possibly all) of a source object at increased quality,
possibly the original representation
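
A minimal sketch of image distillation and refinement, assuming a grayscale image held as a list of pixel rows. The function names `distill` and `refine`, the subsampling approach, and the test image are illustrative assumptions, not the deck's implementation.

```python
from typing import List

Image = List[List[int]]  # rows of 8-bit grayscale pixel values (0-255)


def distill(image: Image, scale: int, bits: int) -> Image:
    """Lossy distillation: keep every `scale`-th pixel and quantize to `bits` bits."""
    if scale < 1 or not 1 <= bits <= 8:
        raise ValueError("scale must be >= 1 and bits must be in 1..8")
    shift = 8 - bits  # number of low-order bits to discard
    return [
        [(pixel >> shift) << shift for pixel in row[::scale]]
        for row in image[::scale]
    ]


def refine(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Refinement: return one region of the source at its original quality."""
    return [row[left:left + width] for row in image[top:top + height]]


# A synthetic 16x16 source image (made-up data).
original = [[(x * 16 + y) % 256 for x in range(16)] for y in range(16)]
thumbnail = distill(original, scale=4, bits=2)   # 4x4 pixels, 4 gray levels
detail = refine(original, top=0, left=0, height=4, width=4)
```

The thumbnail gives up both resolution and gray levels; the refinement call then fetches only the region the user asks to see at full quality.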

Choices to handle client variations
• Server ignores variations:
  - low-end clients may suffer
• Server uses the most basic types and minimal graphics:
  - high-end clients suffer
• Servers provide multiple formats:
  - used today by major websites (ESPN, Amazon, Yahoo)
  - need to categorize clients into discrete classes
• Progressive encodings:
  - typically assume that all parts of the encoded documents
    are equally important
• On-demand distillation and refinement:
  - generate on-the-fly based on client characteristics

An Adaptive-Proxy Based Middleware Design Framework
• Three-tier model: client – proxy – server
• A programming model for proxy-based design: TACC
  - Transformation: distillation, filtering, format
    conversion, etc.
  - Aggregation: collect and collate data from
    various sources
  - Caching: both original and transformed content
  - Customization: user-customized service (user
    profiling, adaptive service to each user’s needs or
    device characteristics)
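
The four TACC roles can be sketched as a toy proxy. Everything here is an assumption made for illustration: the class name `TaccProxy`, the `max_chars` profile key, and the use of text truncation as a stand-in for real distillation.

```python
from typing import Callable, Dict, List, Tuple

Profile = Dict[str, int]      # hypothetical per-user settings
Fetch = Callable[[str], str]  # stands in for a request to the origin server


class TaccProxy:
    def __init__(self, fetch: Fetch) -> None:
        self.fetch = fetch
        self.originals: Dict[str, str] = {}                # caching: original content
        self.transformed: Dict[Tuple[str, int], str] = {}  # caching: transformed content

    def _get_original(self, url: str) -> str:
        if url not in self.originals:
            self.originals[url] = self.fetch(url)
        return self.originals[url]

    def get(self, urls: List[str], profile: Profile) -> str:
        limit = profile["max_chars"]  # customization: per-user parameter
        parts = []
        for url in urls:              # aggregation: collate several sources
            key = (url, limit)
            if key not in self.transformed:
                # transformation: a stand-in "distillation" that truncates text
                self.transformed[key] = self._get_original(url)[:limit]
            parts.append(self.transformed[key])
        return "\n".join(parts)


def fake_fetch(url: str) -> str:  # hypothetical origin server
    return "content of " + url


proxy = TaccProxy(fake_fetch)
page = proxy.get(["a", "b"], {"max_chars": 9})
```

Caching both the original and the transformed copy means a second user with a different profile costs one transformation but no extra fetch from the server.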

Why do we need a proxy?
• Advantages for servers:
  - Servers concentrate on serving high-quality content,
    rather than having to keep multiple versions
  - Servers do not pay the costs required to do on-demand
    distillation
• Advantages for clients:
  - Low-end clients can rely on the proxy to optimize content
    from servers designed for high-end clients
  - Client communicates with a single logical entity, the proxy,
    allowing the client to manage bandwidth at the
    application level
• Advantages for both:
  - Pushing the complexity away from both clients and
    servers by relocating it into the network infrastructure
  - Distillation and refinement can be offered as a
    value-added service by a service provider

A Scalable Cluster-based Infrastructure
• Address three issues: incremental scalability,
24x7 availability, and cost effectiveness
• A cluster-based architecture for scalable
network services
  - Exploit the strength of cluster computing
  - Cluster-based servers
• BASE data semantics: basically available,
soft state, eventual consistency

Cluster architecture
[Figure: cluster architecture with a front-end, load balancer, workstation cluster, and web server]

Why do we need clusters?
• Scalability:
  - Well suited to network service workloads that are
    highly parallel
  - Clusters can grow incrementally over time
• High availability:
  - Natural redundancy due to the independence of the nodes
  - Hot upgrade: disable a node and upgrade it in place
• Commodity building blocks:
  - Use low-end, high-volume PCs rather than high-end,
    low-volume machines
• Drawbacks of clusters:
  - Administration; component and system replication
    (software should decompose into loosely coupled
    modules); partial failures; shared state
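
Incremental growth and hot upgrade can be sketched with a toy front-end that sends each request to the least-loaded enabled node. The class name `FrontEnd`, its methods, and the least-loaded policy are assumptions for illustration; the deck does not specify a dispatch policy.

```python
from typing import Dict, Iterable, Set


class FrontEnd:
    """Dispatches requests to the least-loaded worker that is not disabled."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self.load: Dict[str, int] = {node: 0 for node in nodes}
        self.disabled: Set[str] = set()

    def add_node(self, node: str) -> None:
        # Incremental scalability: new nodes join without a restart.
        self.load.setdefault(node, 0)

    def disable(self, node: str) -> None:
        # Hot upgrade: take a node out of rotation while it is upgraded.
        self.disabled.add(node)

    def enable(self, node: str) -> None:
        self.disabled.discard(node)

    def dispatch(self) -> str:
        candidates = [n for n in self.load if n not in self.disabled]
        if not candidates:
            raise RuntimeError("no worker nodes available")
        node = min(candidates, key=lambda n: self.load[n])
        self.load[node] += 1
        return node

    def finish(self, node: str) -> None:
        # Called when a request completes on `node`.
        self.load[node] -= 1
```

For example, after `disable("w1")` every request goes to the remaining nodes until `enable("w1")` puts the upgraded node back in rotation.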

An example: adaptive transcoding proxy
• Web server → transcoding proxy → web browser
• Proxy architecture:
  - Content analysis
  - Adaptive transcoding policies: when and how much to
    transcode
  - Transformation modules: text modification, image
    decoding and compression
• Key design goal:
  - Improve latency experienced by users on heterogeneous
    devices
  - Fixed quality or fixed delay

Design
• Two scenarios:
  - Store-and-forward image transcoding
  - Streamed image transcoding
• Two main issues:
  - Whether to transcode
  - How much to transcode

How to decide whether to transcode?
[Figure: server, proxy, and client in series; Bsp is the bandwidth of the server-proxy link and Bpc that of the proxy-client link]
• Dp = transcoding delay, S = original size, Sp = transcoded size
• Latency without transcoding:
  - 2*RTTpc + 2*RTTsp + S / min(Bpc, Bsp)
• Latency with transcoding:
  - 2*RTTpc + 2*RTTsp + Dp + S/Bsp + Sp/Bpc
• If Bpc < Bsp, proxy-based transcoding is useful when:
  - Dp + S/Bsp < (S - Sp)/Bpc
• How to predict transcoding delay?
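
The decision rule can be written directly from the two latency expressions. This is a sketch: the function names and the example numbers are assumptions, with sizes in bytes, bandwidths in bytes per second, and delays in seconds.

```python
def latency_without(S: float, rtt_pc: float, rtt_sp: float,
                    b_pc: float, b_sp: float) -> float:
    """Response time when the proxy forwards the original object."""
    return 2 * rtt_pc + 2 * rtt_sp + S / min(b_pc, b_sp)


def latency_with(S: float, Sp: float, Dp: float, rtt_pc: float, rtt_sp: float,
                 b_pc: float, b_sp: float) -> float:
    """Response time when the proxy downloads, transcodes, then forwards."""
    return 2 * rtt_pc + 2 * rtt_sp + Dp + S / b_sp + Sp / b_pc


def should_transcode(S: float, Sp: float, Dp: float,
                     b_pc: float, b_sp: float) -> bool:
    """True when transcoding lowers latency; the RTT terms cancel out.

    For b_pc < b_sp this reduces to Dp + S/b_sp < (S - Sp)/b_pc.
    """
    return Dp + S / b_sp + Sp / b_pc < S / min(b_pc, b_sp)


# Made-up example: 100 kB image shrunk to 20 kB in 0.5 s,
# slow wireless hop (10 kB/s) behind a fast wired hop (1 MB/s).
slow_link = should_transcode(100_000, 20_000, 0.5, b_pc=10_000, b_sp=1_000_000)
fast_link = should_transcode(100_000, 20_000, 0.5, b_pc=1_000_000, b_sp=1_000_000)
```

With the slow last hop the transfer time saved far exceeds the transcoding delay, so `slow_link` is true; when both hops are fast the delay dominates and `fast_link` is false.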

Details for store-and-forward image transcoding
• Prediction
  - Transcoded image’s output size in bytes: high correlation
    between output size and the image area (number of
    pixels) → linear interpolation
  - Transcoding delay: approximated by a linear
    function of the input image area
• Policies:
  - Fixed-quality transcoder: if transcoding is feasible,
    transcode according to the user’s parameter vector
  - Fixed-delay transcoder: if transcoding is feasible, search
    the space of transcoding parameters for the set that
    maximizes quality subject to the given response time,
    then transcode using those parameters
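
A sketch of the linear predictors and the fixed-delay search, under stated assumptions: the only transcoding parameter is a scale factor, a larger scale means higher quality, output size is linear in the scaled area, and the calibration samples are made up. None of these names or numbers come from the slides.

```python
from typing import Optional, Sequence, Tuple

Line = Tuple[float, float]  # (slope, intercept)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Line:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(line: Line, x: float) -> float:
    return line[0] * x + line[1]


def fixed_delay_scale(area: float, S: float, b_pc: float, b_sp: float,
                      size_model: Line, delay_model: Line, target_s: float,
                      scales: Sequence[float] = (1.0, 0.75, 0.5, 0.25)
                      ) -> Optional[float]:
    """Largest scale whose predicted response time meets the target, else None."""
    Dp = predict(delay_model, area)  # delay depends on the input area
    for scale in sorted(scales, reverse=True):
        Sp = predict(size_model, area * scale * scale)  # size depends on output area
        if Dp + S / b_sp + Sp / b_pc <= target_s:
            return scale
    return None


# Made-up calibration samples: (pixels, output bytes) and (pixels, seconds).
areas = [10_000, 40_000, 90_000]
size_model = fit_line(areas, [2_000, 8_000, 18_000])
delay_model = fit_line(areas, [0.02, 0.08, 0.18])
chosen = fixed_delay_scale(250_000, 200_000, 5_000, 1_000_000,
                           size_model, delay_model, target_s=5.0)
```

A fixed-quality transcoder would skip the search and apply the user's parameters whenever the feasibility test passes.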

Transcoding internal stages
• Determine target parameters
  - In-band or out-of-band data
  - Use HTTP headers
  - Use a client profile and/or network conditions
• Download data and characterize it
  - E.g., get image’s type, resolution, and color depth
• Apply heuristics and policies
  - How to match data’s characteristics to target parameters?
  - Multi-dimensional constraint satisfaction
• Execute the transcoding
  - Typically can use off-the-shelf software
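
Matching an image's characteristics to a client's limits can be sketched as a small constraint problem over resolution and bit depth. The `ImageInfo` and `ClientProfile` types and the fit-to-screen heuristic are assumptions for illustration; a real proxy would weigh more dimensions, such as bandwidth and memory.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ImageInfo:
    width: int
    height: int
    bit_depth: int


@dataclass(frozen=True)
class ClientProfile:  # hypothetical profile, e.g. derived from request headers
    screen_width: int
    screen_height: int
    max_bit_depth: int


def target_parameters(image: ImageInfo, client: ClientProfile) -> ImageInfo:
    """Shrink (never enlarge) to fit the screen, keeping the aspect ratio."""
    # Exact fractions avoid floating-point rounding when computing the size.
    scale = min(
        Fraction(1),
        Fraction(client.screen_width, image.width),
        Fraction(client.screen_height, image.height),
    )
    return ImageInfo(
        width=max(1, int(image.width * scale)),
        height=max(1, int(image.height * scale)),
        bit_depth=min(image.bit_depth, client.max_bit_depth),
    )


pda = ClientProfile(screen_width=320, screen_height=240, max_bit_depth=8)
target = target_parameters(ImageInfo(1600, 1200, 24), pda)
```

The resulting target would then be handed to whatever off-the-shelf transcoder executes the final stage.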

Streamed image transcoding
• Perform transcoding under two stability
conditions:
  - No buffer overflow
  - Output transmission link is not saturated
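
The slide names the two conditions but not their formulas. The inequalities below are one plausible reading, assuming input arrives at the server-proxy bandwidth and the transcoder handles S bytes in Dp seconds; they are not taken from the deck.

```python
def streamed_transcoding_is_stable(S: float, Sp: float, Dp: float,
                                   b_sp: float, b_pc: float) -> bool:
    """Check two assumed stability conditions for streamed transcoding.

    S, Sp are the original and transcoded sizes in bytes, Dp the transcoding
    delay in seconds, b_sp and b_pc the link bandwidths in bytes per second.
    """
    # Assumed condition 1: the transcoder consumes input at least as fast as
    # it arrives (S / Dp >= b_sp), so the proxy's input buffer does not grow.
    no_buffer_overflow = Dp * b_sp <= S
    # Assumed condition 2: output is produced at b_sp * (Sp / S) bytes per
    # second, which must not exceed what the proxy-client link can carry.
    link_not_saturated = b_sp * Sp <= b_pc * S
    return no_buffer_overflow and link_not_saturated


# Made-up numbers: 10:1 compression, 1 MB/s wired hop.
ok = streamed_transcoding_is_stable(100_000, 10_000, 0.05, 1_000_000, 200_000)
too_slow_link = streamed_transcoding_is_stable(100_000, 10_000, 0.05, 1_000_000, 50_000)
```

Both conditions are written with multiplication rather than division so that a zero transcoding delay does not cause an error.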

Another middleware service: Application session handoff
• We want continuous access to our data across
all of our machines
• Middleware software will integrate data
across devices
  - for immediate access to information anytime,
    anywhere
• Move applications across multiple computers

More application session handoff
• Applications will have session state
  - discrete data
  - multimedia, streaming data
• Application session handoff: application’s
state will move automatically and seamlessly
across devices
• Data will be transcoded for each device
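
A minimal sketch of handing off discrete session state through a proxy, assuming the state fits in a small record that can be serialized as JSON. The `SessionState` fields and the `HandoffProxy` class are hypothetical; streaming state and per-device transcoding are left out.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class SessionState:
    user: str
    document_url: str
    position: int  # e.g. scroll offset or playback position
    device: str    # device class the session is currently running on


class HandoffProxy:
    """Holds serialized session state so another device can pick it up."""

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}

    def suspend(self, state: SessionState) -> None:
        # Serialize to JSON so the state is independent of the source device.
        self._sessions[state.user] = json.dumps(asdict(state))

    def resume(self, user: str, device: str) -> SessionState:
        state = SessionState(**json.loads(self._sessions[user]))
        state.device = device  # a real proxy would also transcode data for this device
        return state


proxy = HandoffProxy()
proxy.suspend(SessionState("alice", "http://example.com/report", 42, "desktop"))
resumed = proxy.resume("alice", device="pda")
```

Keeping the state at the proxy rather than on either device is what lets the session follow the user without the two devices talking to each other.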

Broad view of system
[Figure: application server, high-bandwidth network, middleware cluster, wireless network, and clients]

Application session handoff in action
[Figure: application session handoff in action, including a legacy multimedia DBMS]

Middleware design issues for ASH
• Client must incorporate application-layer
library code to participate with proxy
• Protocol gateway
  - Client ↔ proxy: custom control protocol +
    application-specific protocol
  - Proxy ↔ server: HTTP, SMTP, RTP, etc.
• Service discovery
• Data consistency protocols
• Scalability across cluster of proxies
• PKI-based security

Summary
• Middleware provides improved user
experience or additional functionality
• Middleware runs within the limits of existing
legacy systems or protocols
• New functionality is typically implemented at a
proxy
• Clustering provides scalability for proxy
services

Goals for Middleware Design
• Minimal change to the existing infrastructure and
applications: may add/change a few more “boxes”
• Adaptation to network dynamics (induced by
mobility and wireless links)
• Support for heterogeneous devices (e.g., laptops,
desktops, Pocket PCs, palm devices)
• Customized service (e.g., adaptive content delivery)
• Scalable and secure service
• Portability: seamless migration across computing
platforms
• User-friendly design
• Service availability (in the presence of failures,
attacks and large user population)

On-demand dynamic distillation
• Issues to address: client variations along 3
dimensions:
  - Network variation: bandwidth, latency and error behavior
  - Hardware variation: screen size and resolution, color or
    grayscale bit depth, memory, CPU power
  - Software variation: application-level data encodings, etc.
• Design principles for adapting to variation:
  - Datatype-specific lossy compression mechanisms:
    distillation and refinement based on semantic type of the
    data
  - On-the-fly adaptation: compute a desired representation
    of a typed object on demand
  - Complexity away from both clients and servers: done at
    an intermediate proxy

Sharing semantics
• Traditional transactional database model: ACID
(atomicity, consistency, isolation, and durability)
  - Strongest semantics at the highest cost and complexity
  - No guarantee of availability
  - Suited for e-commerce transactions, billing users,
    maintaining user profile info, etc.
• Many users/services prefer availability to
strong consistency or durability:
  - Stale data can be temporarily tolerated as long as all
    copies of data eventually reach consistency after a short
    time
  - Soft state: can be used to improve performance
  - Approximate answers delivered quickly are preferred
    to exact but slow answers

BASE semantics
• BASE: basically available, soft state,
eventual consistency
  - Handle partial failures in clusters with less
    complexity and cost
  - Trading consistency for simplicity
  - Trading consistency for availability
  - Use of soft state to allow each watcher process to
    detect that its peer is alive (rather than mirroring
    the peer’s state) and to restart its peer (rather
    than take over its peer’s duties)
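
The watcher idea can be sketched with a timestamp as the only soft state: if no beacon arrives within a timeout, the watcher restarts its peer instead of taking over its work. The `Watcher` class, the beacon mechanism, and the timeout value are assumptions for illustration; time is passed in explicitly so the example is deterministic.

```python
from typing import Callable


class Watcher:
    """Tracks a peer's liveness using only the time of its last beacon."""

    def __init__(self, timeout_s: float, restart_peer: Callable[[], None],
                 now: float) -> None:
        self.timeout_s = timeout_s
        self.restart_peer = restart_peer
        self.last_beacon = now  # soft state: can be lost and rebuilt from beacons

    def on_beacon(self, now: float) -> None:
        # The peer periodically announces that it is alive.
        self.last_beacon = now

    def check(self, now: float) -> bool:
        """Restart the peer if its beacon is overdue; return True if restarted."""
        if now - self.last_beacon <= self.timeout_s:
            return False
        self.restart_peer()
        self.last_beacon = now  # give the restarted peer a full timeout to report in
        return True


restarts = []
watcher = Watcher(timeout_s=3.0, restart_peer=lambda: restarts.append("peer"), now=0.0)
watcher.on_beacon(now=1.0)
alive_at_2 = watcher.check(now=2.0)      # beacon is recent: no action
restarted_at_5 = watcher.check(now=5.0)  # beacon overdue: peer is restarted
```

Because the watcher never mirrors the peer's state, a crashed watcher can itself be restarted with no recovery step beyond waiting for the next beacon.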
