Dryad: Distributed Data-Parallel Programs from Sequential Building
Download
Report
Transcript Dryad: Distributed Data-Parallel Programs from Sequential Building
Dryad: Distributed Data-Parallel
Programs from Sequential
Building Blocks
Michael Isard, Mihai Budiu, Yuan Yu,
Andrew Birrell, Dennis Fetterly
Microsoft Research, Silicon Valley
Eurosys 2007
Overview
Mechanism to express parallel computation
Large scale internet services
Chip multiprocessors
Drawing on previous work
Condor
GPU shader languages
MapReduce
Parallel databases
Main Idea
Represent the computation as a DAG
of communicating sequential
processes
Claims
The Dryad execution engine:
Schedules across resources
Optimizes the level of concurrency in a node
Manages failure resilience
Delivers data where needed
Performance is good and scales
The programming abstraction is at the right
level
API mastery only takes a couple of weeks
Higher level abstractions have been built on
Dryad
Dryad System
Name server enumerates all resources
Including location relative to other
resources
Daemon running on each machine for
vertex dispatch
Communication
Constructing the Job
Use graph operators implemented in
C++ to describe the graph.
Database Query Example
Execution
Job manager not currently fault tolerant
Vertices may be scheduled multiple times
Each execution versioned
Execution record kept- including versions of
incoming vertices
Outputs are uniquely named (versioned)
Final outputs selected if job completes
Non-file communication may cascade failures
Vertices specify hard constraints or
preferences for placement
Scheduling is greedy assuming only one job
Run-time Graph Refinement
Results – I
SQL Query
10 Machines
2 dualcore 2 GHz
8 GB Mem
1 Gb Ethernet
4x400GB disks
Winows Server
2003
Results – II
Map then reduce style
Builds histogram of MSN Search query
frequency
1800 Machines
10.2 TB source data
11072 Vertices
Refinement
Dryad