PowerPoint 프레젠테이션
Download
Report
Transcript PowerPoint 프레젠테이션
TreadMarks: Distributed Shared Memory on
Standard Workstations and Operating Systems
Jeongho Kim, Youngseok Son
M.S. Student.
Real-Time Operating Systems Laboratory
Seoul National University
Paper Presentation
2011 Fall Distributed Information Processing
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
Contents
I.
II.
III.
IV.
Introduction
Techniques
Implementation: TreadMarks
Evaluation
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Distributed Shared Memory
The distributed shared memory (DSM) implements the shared
memory model in distributed systems, which have no physical
shared memory
The shared memory model provides a virtual address space
shared between all processors
The high cost of communication in distributed systems is that
DSM systems move data between processors
Processor 1
Processor 2
Processor N
Mem 1
Mem 2
Mem N
network
Shared memory
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Problems of Existing DSM
OS dependency
Kernel modifications
Poor performance
① Communication overhead
② False sharing
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Problems of Existing DSM
① Communication overhead
If the communication occurs whenever the x variable is changed, it
costs a lot of overhead.
Process A
w(x)
w(x)
w(x)
page
page
page
t
t
Process B
Solution) The communication occurs, when the x variable is changed
finally
Process A
w(x)
w(x)
w(x)
t
page
Process B
t
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Problems of Existing DSM
② False sharing
A situation in which two or more processes access different variables
within a page.
• If only one process is allowed to write to a page at a time, false sharing
leads to unnecessary communication, called the “ping-pong” effect.
Process
Code using
variable a
Process
Code using
variable b
① Access to write variable a
Page
③ Access to read variable b
④ wait
X
Page
a
b
② Transfer the page
a
b
Paper Presentation
2011 Fall Distributed Information Processing
I. Introduction
Objectives
User-level implementation
Good performance
Reduction of communication overhead
• Lazy release consistency
• Invalidate-based protocol
Solving false sharing
• multiple writer protocol
Paper Presentation
2011 Fall Distributed Information Processing
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
Contents
I.
II.
III.
IV.
Introduction
Techniques
Implementation: TreadMarks
Evaluation
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
Related Works
1. Lazy release consistency
2. Multiple-writer protocol
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
Release Consistency (RC) is a memory consistency model
that permits a node to delay making its changes of shared
data visible to other nodes until certain synchronization
accesses occur
Shared memory accesses
Ordinary access
Synchronization access
• acquire access
• release access
Example
Process
acq(L1)
w(x)1
w(y)1
rel(L1)
t
L1: lock related to x, y
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
Two types of RC
① Eager release consistency
② Lazy release consistency
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
① Eager release consistency (ERC) postpones sending the
modifications to the next release.
Understanding ERC
apply changes
r(x)0 r(x)0
Process A
acq(L1) r(y)1
x=1
y=1
acq(L1) w(x)1 w(y)1
Process B
apply changes
r(z)1
t
z=1
rel(L1)
apply changes
r(z)0
x=1
y=1
r(z)1
t
z=1
acq(L2) w(z)1
r(x)1
t
Process C
apply changes
rel(L2)
• Release operation does not complete (is not performed) until the
acknowledgements from all the processes are received.
• L1 is lock related to x, y and L2 is lock related to z
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
① Eager release consistency (ERC) postpones sending the
modifications to the next release.
Understanding ERC
apply changes
r(x)0 r(x)0
Process A
acq(L1) r(y)1
x=1
y=1
acq(L1) w(x)1 w(y)1
Process B
apply changes
r(z)1
t
z=1
rel(L1)
apply changes
r(z)0
x=1
y=1
r(z)1
t
z=1
acq(L2) w(z)1
r(x)1
t
Process C
apply changes
rel(L2)
• Release operation does not complete (is not performed) until the
acknowledgements from all the processes are received.
• L1 is lock related to x, y and L2 is lock related to z
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
② Lazy release consistency (LRC) postpones sending of
modifications until a remote processor actually needs them
Understanding ERC
w(x)1
Process A
Process B
Process C
rel(L1)
r(x)0
r(x)0
r(x)0
acq(L1)
r(x)1
r(x)0
r(x)0
r(x)0
acq(L2)
r(x)0
How did process B know whether
t
variable x has to be updated?
1) Write notice
• It is guaranteed that the acquirer of the2)same
lock sees the modification
Timestamp
that precede the release in program order.
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
② Lazy release consistency (LRC)
1) Write Notice
• The releaser send write notice to all other processes
w(x)1
Process A
rel(L1)
x=1
write notice
Process B
r(x)0
r(x)0
r(x)0
acq(L1)
r(x)1
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
② Lazy release consistency (LRC)
2) Timestamp
• Satisfying the happened-before relationship between all operations is
enough to satisfy LRC
• The ordering is applied to process intervals.
– Interval is a segment of time in the execution of a single process.
– New interval begins each time a process executes a synchronization
operation.
acq(L3)
rel(L1)
Process
1
2
3
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
1. Lazy Release Consistency
② Lazy release consistency (LRC)
2) Timestamp
• It is implemented as a vector timestamp
Process 1
(2,3,0)
(1,0,0)
rel(L1)
Process 2
(1,1,0)
(1,2,0)
acq(L1)
Process 2
(1,0,1)
(1,3,0)
rel(L1)
(1,3,2)
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
2. Multiple-Writer Protocol
Multiple-writer protocol is designed to solve false sharing
It is possible that several processes make modifications to different
variables at the same page
Process1
Variable
Page
Process2
Variable
X
Y
Two techniques
Twin
• It is a copied page of original page
• It will be compared with a changed page.
Diff
• diff is a difference between twin and ‘copyset’
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
2. Multiple-Writer Protocol
Two techniques (cont’d)
Twin
write P
twin
writable working copy
Diff
release:
diff
Note that twinning and diffing not only allows multiple independent
writers but also significantly reduces the amount of data sent
Paper Presentation
2011 Fall Distributed Information Processing
II. Related Works
Summary
Solving communication overhead
Lazy release consistency
Solving false sharing
Multiple-writer protocol
Implementation
Lazy release consistency + Multiple-writer protocol
Eager release consistency + Multiple-writer protocol
Paper Presentation
2011 Fall Distributed Information Processing
t
II. Related Works
Summary
Eager release consistency + multiple-writer protocol
twin diff
P1
w(x)
rel(L1)
inv(P)
acq(L2)
P2
P3
inv(P)
w(y)
twin diff
acq(L1) r(x)
r(y)
page P
x
y
rel(L2)
Paper Presentation
2011 Fall Distributed Information Processing
t
II. Related Works
Summary
Lazy release consistency + multiple-writer protocol
twin diff
P1
w(x)
rel(L1)
inv(P)
acq(L2)
P2
P3
inv(P)
w(y)
twin diff
acq(L1) r(x)
r(y)
page P
x
y
rel(L2)
Paper Presentation
2011 Fall Distributed Information Processing
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
Contents
I.
II.
III.
IV.
Introduction
Techniques
Implementation: TreadMarks
Evaluation
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
Implementation
1)
2)
3)
4)
5)
Data structures
Interval and Diff creation
Access misses
Garbage collection
Etc
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
1) Data Structure (1/2)
Lazy release consistency
Write notice
Timestamp
Multiple-writer protocol
Diff
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
1) Data Structure (2/2)
1
2
3
Page State
4
...
Page Array
Write Notice
Records
Copyset
Write Notice
Record 1
Write Notice
Record 2
If there is a write notice record in a page,
happen before relationship should be considered
through the vector time stamp
Write Notice
Record 3
Diff value is managed by diff pool
for efficient memory management
Diff Pool
Vector
Timestamp
1
1
2
3
2
4
...
3
4
Process Array
...
Processor Array
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
2) Interval and Diff creation
Interval craetion
logically
• a new interval begins at each release and acquire
In practice
• Interval creation can be postponed until we communicate with another
process, avoiding overhead if a lock is reacquired by the same processor
Diff creation
With lazy diff creation, these pages remain writable until a diff request
or a write notice arrives for that page
• a subsequent write results in a write notice for the next interval
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
3) Access Misses
When access miss occurs
without write notice
• Initially setup that processor 0 has the page
with write notice
1. Get the diffs from the write notice with small timestamp
2. Create an actual diff which is correction of all diff related to the page
3. The twin is discarded and the result is copied to copyset
Paper Presentation
2011 Fall Distributed Information Processing
t
III. Implementation: TreadMarks
4) Etc.
Lock & barrier
Statically assigned manager
Garbage collection
It reclaim the space used by write notice records, interval records, and
diffs
It is triggered when the free space drops below a threshold
Unix Aspects
TreadMarks relies on Unix standard libraries
• Remote process creation, interprocessor communication, and memory
management.
Paper Presentation
2011 Fall Distributed Information Processing
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
Contents
I.
II.
III.
IV.
Introduction
Techniques
Implementation: TreadMarks
Evaluation
Paper Presentation
2011 Fall Distributed Information Processing
lV. Evaluation
Experimental Environment
Experimental Environment
8 DECstation-5000/240
Connected to a 100-Mbps ATM LAN and a 10-Mbps Ethernet
Applications
Water – molecular dynamics simulation , 343 molecules for 5 steps
Jacobi – Successive Over-Relaxation with a grid of 2000 by 1000
elements
TSP – branch & bound algorithm to solve the traveling salesman
problem for a 19 cities
Quicksort – sorts an array of 256K integers. Using bubblesort to sort
subarray of less than 1K element
ILINK – genetic linkage analysis
Paper Presentation
2011 Fall Distributed Information Processing
lV. Evaluation
Results
Speedup
Message rate
messages / sec
Paper Presentation
2011 Fall Distributed Information Processing
lV. Evaluation
Results
Diff creation rate
diffs / rate
Paper Presentation
2011 Fall Distributed Information Processing
End of Doc.
Paper Presentation
2011 Fall Distributed Information Processing