PowerPoint 프레젠테이션

Download Report

Transcript PowerPoint 프레젠테이션

TreadMarks: Distributed Shared Memory on
Standard Workstations and Operating Systems
Jeongho Kim, Youngseok Son
M.S. Student.
Real-Time Operating Systems Laboratory
Seoul National University
Paper Presentation
2011 Fall Distributed Information Processing
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
Contents
I.
II.
III.
IV.
Introduction
Techniques
Implementation: TreadMarks
Evaluation
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Distributed Shared Memory
 The distributed shared memory (DSM) implements the shared
memory model in distributed systems, which have no physical
shared memory
 The shared memory model provides a virtual address space
shared between all processors
 The high cost of communication in distributed systems is that
DSM systems move data between processors
Processor 1
Processor 2
Processor N
Mem 1
Mem 2
Mem N
network
Shared memory
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Problems of Existing DSM
 OS dependency
 Kernel modifications
 Poor performance
① Communication overhead
② False sharing
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Problems of Existing DSM
① Communication overhead
 If the communication occurs whenever the x variable is changed, it
costs a lot of overhead.
Process A
w(x)
w(x)
w(x)
page
page
page
t
t
Process B
 Solution) The communication occurs, when the x variable is changed
finally
Process A
w(x)
w(x)
w(x)
t
page
Process B
t
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Problems of Existing DSM
② False sharing
 A situation in which two or more processes access different variables
within a page.
• If only one process is allowed to write to a page at a time, false sharing
leads to unnecessary communication, called the “ping-pong” effect.
Process
Code using
variable a
Process
Code using
variable b
① Access to write variable a
Page
③ Access to read variable b
④ wait
X
Page
a
b
② Transfer the page
a
b
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Objectives
 User-level implementation
 Good performance
 Reduction of communication overhead
• Lazy release consistency
• Invalidate-based protocol
 Solving false sharing
• multiple writer protocol
Paper Presentation
2011 Fall Distributed Information Processing
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
Contents
I.
II.
III.
IV.
Introduction
Techniques
Implementation: TreadMarks
Evaluation
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
Related Works
1. Lazy release consistency
2. Multiple-writer protocol
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
 Release Consistency (RC) is a memory consistency model
that permits a node to delay making its changes of shared
data visible to other nodes until certain synchronization
accesses occur
 Shared memory accesses
 Ordinary access
 Synchronization access
• acquire access
• release access
 Example
Process
acq(L1)
w(x)1
w(y)1
rel(L1)
t
L1: lock related to x, y
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
 Two types of RC
① Eager release consistency
② Lazy release consistency
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
① Eager release consistency (ERC) postpones sending the
modifications to the next release.
 Understanding ERC
apply changes
r(x)0 r(x)0
Process A
acq(L1) r(y)1
x=1
y=1
acq(L1) w(x)1 w(y)1
Process B
apply changes
r(z)1
t
z=1
rel(L1)
apply changes
r(z)0
x=1
y=1
r(z)1
t
z=1
acq(L2) w(z)1
r(x)1
t
Process C
apply changes
rel(L2)
• Release operation does not complete (is not performed) until the
acknowledgements from all the processes are received.
• L1 is lock related to x, y and L2 is lock related to z
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
① Eager release consistency (ERC) postpones sending the
modifications to the next release.
 Understanding ERC
apply changes
r(x)0 r(x)0
Process A
acq(L1) r(y)1
x=1
y=1
acq(L1) w(x)1 w(y)1
Process B
apply changes
r(z)1
t
z=1
rel(L1)
apply changes
r(z)0
x=1
y=1
r(z)1
t
z=1
acq(L2) w(z)1
r(x)1
t
Process C
apply changes
rel(L2)
• Release operation does not complete (is not performed) until the
acknowledgements from all the processes are received.
• L1 is lock related to x, y and L2 is lock related to z
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
② Lazy release consistency (LRC) postpones sending of
modifications until a remote processor actually needs them
 Understanding ERC
w(x)1
Process A
Process B
Process C
rel(L1)
r(x)0
r(x)0
r(x)0
acq(L1)
r(x)1
r(x)0
r(x)0
r(x)0
acq(L2)
r(x)0
How did process B know whether
t
variable x has to be updated?
1) Write notice
• It is guaranteed that the acquirer of the2)same
lock sees the modification
Timestamp
that precede the release in program order.
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
② Lazy release consistency (LRC)
1) Write Notice
• The releaser send write notice to all other processes
w(x)1
Process A
rel(L1)
x=1
write notice
Process B
r(x)0
r(x)0
r(x)0
acq(L1)
r(x)1
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
② Lazy release consistency (LRC)
2) Timestamp
• Satisfying the happened-before relationship between all operations is
enough to satisfy LRC
• The ordering is applied to process intervals.
– Interval is a segment of time in the execution of a single process.
– New interval begins each time a process executes a synchronization
operation.
acq(L3)
rel(L1)
Process
1
2
3
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
② Lazy release consistency (LRC)
2) Timestamp
• It is implemented as a vector timestamp
Process 1
(2,3,0)
(1,0,0)
rel(L1)
Process 2
(1,1,0)
(1,2,0)
acq(L1)
Process 2
(1,0,1)
(1,3,0)
rel(L1)
(1,3,2)
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
2. Multiple-Writer Protocol
 Multiple-writer protocol is designed to solve false sharing
 It is possible that several processes make modifications to different
variables at the same page
Process1
Variable
Page
Process2
Variable
X
Y
 Two techniques
 Twin
• It is a copied page of original page
• It will be compared with a changed page.
 Diff
• diff is a difference between twin and ‘copyset’
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
2. Multiple-Writer Protocol
 Two techniques (cont’d)
 Twin
write P
twin
writable working copy
 Diff
release:

diff
 Note that twinning and diffing not only allows multiple independent
writers but also significantly reduces the amount of data sent
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
Summary
 Solving communication overhead
 Lazy release consistency
 Solving false sharing
 Multiple-writer protocol
 Implementation
 Lazy release consistency + Multiple-writer protocol
 Eager release consistency + Multiple-writer protocol
Paper Presentation
2011 Fall Distributed Information Processing
t
II. Related Works
Summary
 Eager release consistency + multiple-writer protocol
twin diff
P1
w(x)
rel(L1)
inv(P)
acq(L2)
P2
P3
inv(P)
w(y)
twin diff
acq(L1) r(x)
r(y)
page P
x
y
rel(L2)
Paper Presentation
2011 Fall Distributed Information Processing
t
II. Related Works
Summary
 Lazy release consistency + multiple-writer protocol
twin diff
P1
w(x)
rel(L1)
inv(P)
acq(L2)
P2
P3
inv(P)
w(y)
twin diff
acq(L1) r(x)
r(y)
page P
x
y
rel(L2)
Paper Presentation
2011 Fall Distributed Information Processing
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
Contents
I.
II.
III.
IV.
Introduction
Techniques
Implementation: TreadMarks
Evaluation
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
Implementation
1)
2)
3)
4)
5)
Data structures
Interval and Diff creation
Access misses
Garbage collection
Etc
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
1) Data Structure (1/2)
 Lazy release consistency
 Write notice
 Timestamp
 Multiple-writer protocol
 Diff
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
1) Data Structure (2/2)
1
2
3
Page State
4
...
Page Array
Write Notice
Records
Copyset
Write Notice
Record 1
Write Notice
Record 2
If there is a write notice record in a page,
happen before relationship should be considered
through the vector time stamp
Write Notice
Record 3
Diff value is managed by diff pool
for efficient memory management
Diff Pool
Vector
Timestamp
1
1
2
3
2
4
...
3
4
Process Array
...
Processor Array
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
2) Interval and Diff creation
 Interval craetion
 logically
• a new interval begins at each release and acquire
 In practice
• Interval creation can be postponed until we communicate with another
process, avoiding overhead if a lock is reacquired by the same processor
 Diff creation
 With lazy diff creation, these pages remain writable until a diff request
or a write notice arrives for that page
• a subsequent write results in a write notice for the next interval
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
3) Access Misses
 When access miss occurs
 without write notice
• Initially setup that processor 0 has the page
 with write notice
1. Get the diffs from the write notice with small timestamp
2. Create an actual diff which is correction of all diff related to the page
3. The twin is discarded and the result is copied to copyset
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
4) Etc.
 Lock & barrier
 Statically assigned manager
 Garbage collection
 It reclaim the space used by write notice records, interval records, and
diffs
 It is triggered when the free space drops below a threshold
 Unix Aspects
 TreadMarks relies on Unix standard libraries
• Remote process creation, interprocessor communication, and memory
management.
Paper Presentation
2011 Fall Distributed Information Processing
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
Contents
I.
II.
III.
IV.
Introduction
Techniques
Implementation: TreadMarks
Evaluation
Paper Presentation
2011 Fall Distributed Information Processing
lV. Evaluation
Experimental Environment
 Experimental Environment
 8 DECstation-5000/240
 Connected to a 100-Mbps ATM LAN and a 10-Mbps Ethernet
 Applications
 Water – molecular dynamics simulation , 343 molecules for 5 steps
 Jacobi – Successive Over-Relaxation with a grid of 2000 by 1000
elements
 TSP – branch & bound algorithm to solve the traveling salesman
problem for a 19 cities
 Quicksort – sorts an array of 256K integers. Using bubblesort to sort
subarray of less than 1K element
 ILINK – genetic linkage analysis
Paper Presentation
2011 Fall Distributed Information Processing
lV. Evaluation
Results
 Speedup
 Message rate
 messages / sec
Paper Presentation
2011 Fall Distributed Information Processing
lV. Evaluation
Results
 Diff creation rate
 diffs / rate
Paper Presentation
2011 Fall Distributed Information Processing
End of Doc.
Paper Presentation
2011 Fall Distributed Information Processing