NC: A Verifiable, Executable Coordination Paradigm for
Download
Report
Transcript NC: A Verifiable, Executable Coordination Paradigm for
NC: A Verifiable, Executable
Coordination Paradigm for
Real-Time Communication
Sebastian Fischmeister
[email protected]
Department of Computer and Information Science
University of Pennsylvania
1
Real-Time Systems
• Right value at the wrong time can cause an error!
How to guarantee timeliness in the system?
No RT
Computer
simulation
Soft RT
User
interface
Internet
video, audio
Cruise
control
Hard RT
Tele
communication
Flight
control
Electronic
engine
2
Distributed RT Systems
• Multiple nodes together realize one real-time application.
How to bound communication delays in the system?
Shared
medium
Brake pedal
Fuel tank
Tires
Brake
Board
computer
Collisions, retransmissions, delays!
need for a coordination model for controlling access.
3
Time Division Multiple Access
• TDMA is a method for realizing collision-free access to a shared medium.
• Partition access into time slices (=slots) and rounds.
• Assigns slots to nodes for exclusive access using some algorithm.
• Example: Round robin with four nodes
Slot
N1
N2
N3
N4
t
Round
Slot #
1
2
3
4
Sender
N1
N2
N3
N4
Receiver
…
…
…
…
4
Flavors of TDMA
• Great for safety-critical systems, but too inflexible.
statically allocated
One static phase:
dynamic
Only dynamic:
Static + dynamic:
Static, dynamic, free:
static
static
dynamic
dynamic
fre
e
• Still:
– Decisions are made at the beginning of the round/phase.
– Phase order stays the same.
– Typically independent of the software.
5
Goals of Network Code
• Goals:
– Allow close interaction between the application and the medium
access control.
– Allow developers to program their own access control.
– Allow decisions during the communication round.
– Should be apt for safety-critical hard real-time systems.
6
Network Code
[ISCC05]
• Executable programmable coordination model for real-time
communication with a domain specific language.
– Split application into: program logic + communication logic
– Communication logic is programmed separately (=Network Code program)
– An interpreter (=Network Code machine) executes the program at run time
Program logic
NC program
Node1
Node2
Application
set X
Application
get X
Value space
get X
Value space
set X
NCM
NCM
Medium
tx M/ch
7
Research Results/Status
• Theory:
– Underlying concept and idea of Network Code [ISCC05]
– Verification of Network Code programs [RTAS06]
• Given a set of programs, are they safe to run concurrently?
– Metrics framework for measuring programs [EmSoft06]
• Overhead, throughput, avg. cycle length, avg. waiting time
– Composition algorithms for programs [submitted]
• Given several applications, how can they run together?
• Simulation:
– In round decisions increase throughput and performance. [EmSoft06]
– Gains are kept despite overhead introduced during composition. [submitted]
• Implementation:
– Implementation RTLinuxPro[RTAS06,TC07]+Ethernet, PIC18F+CAN , FPGA
8
How does it work?
9
Example: Transmit X in M
•
•
•
Slot length is 10t.
Node 1 transmits variable X every 20t in message M on channel ch.
Node 2 receives this message and stores it Y.
0
Sender
10
20
t
L0: create(M, X)
send(ch, M, _)
wait(20)
goto(L0)
Receiver
La: wait(10)
receive(ch, Y)
wait(10)
goto(La)
10
Complex Example
• Transmit a value with temporal replication tolerating 2 failures
– Sensor values A to E. (2n+1)
– Decide at the end of the round what the sensor value is.
– Take the majority vote (e.g., ‘4,4,3,2,4’ is 4)
Standard:
A
B
C
D
E
t
Vote decisive? Skip ahead!
Improved version:
A
B
C
D
E
t
11
Complex Example Code
A
B
C
D
E
t
Node A:
La: create(MA, A)
send(ch, MA)
wait(20)
receive(ch, B)
wait(10)
receive(ch, C)
if(voteDecisive, La)
wait(10)
receive(ch, D)
if(voteDecisive, La)
wait(10)
receive(ch, E)
goto(La)
12
System Overview
Application
layer
Network code
program
RT application
set/get
Dynamic
TDMA
layer
Queue
Variable space
Soft queue
Hard queue
Message transceiver
Queue
Network code
machine
Value det.
layer
set/get
Coordination
Access control
Communication medium
13
Data Flow (Hard Values)
Application
layer
Network code
program
RT application
set/get
Dynamic
TDMA
layer
Queue
Variable space
Soft queue
Hard queue
Message transceiver
Queue
Network code
machine
Value det.
layer
set/get
Coordination
Access control
Communication medium
14
Data Flow (Soft Values)
Application
layer
Network code
program
RT application
set/get
Dynamic
TDMA
layer
Queue
Variable space
Soft queue
Hard queue
Message transceiver
Queue
Network code
machine
Value det.
layer
set/get
Coordination
Access control
Communication medium
15
Verification & Checking
[RTAS06]
• What if I have a bug in my Network Code program?
Prgm 1:
La:
wait(20)
send(_,_,_)
if(guardX, La)
wait(10)
send(_,_,_)
goto(La)
Prgm 2:
Lb: wait(10)
send(_,_,_)
wait(30)
goto(Lb)
a verification framework with model checking
• Use operational semantics of instructions to model system state.
• Convert program instructions to transformations of the system state.
• Check properties are reachability questions using model checking.
16
Example Properties
• Integrity checks
–
–
–
–
• Behavioral checks
All guards present
Valid jump addresses
Successor address
Consequential code
– Non-empty send
– Local message lifecycle
• Distributed checks
–
–
–
–
Collision-free communication
Schedulability
Non-empty receive
Message reception within timeout
17
Metrics Framework
[EmSoft06]
• What is the overhead? What is the throughput? What is the
cycle?
– Convert the Network Code Program into a timed tree communication
schedule.
– Compute metrics on the tree schedule using DFS & model checking.
• Timed tree communication schedule:
– Nodes encode messages releases and delays
– Transitions encode mutually excl. decisions
(exhaustive, deterministic)
– Leaves cause a reset
(X,3)
!g1
(Y,4)
g1
(Z,2)
18
Example Metrics
A
A
B
C
D
g1
E
g2
B
t
• Longest cycle length = longest branch
• Guard overhead
• End-to-end constraints for messages
g1
C
-
!g1
g2
-
D
!g2
E
• Probabilistic tree schedules:
– Average cycle length = average branch length
– Average waiting times
19
Is it better?
20
Inverted Pendulum
• Pendulum sitting on a cart with one degree of freedom.
• Measure the angle w.r.t. to straight up.
• The cart moves horizontally to balance the pendulum.
• Temporal replication to tolerate two faults:
– 5 measurement units (A to E)
– Report measurements to the controller.
– Controller updates the kart at the round.
21
Simulation Results
Standard schedule:
A
B
C
D
E
t
Network code:
Vote decisive? Skip ahead!
A
B
C
D
E
t
Network Code is better*
22
Simulation Results
Mean time to failure
With an increase in the mean time to failure, standard TDMA stays the same
Network Code utilize this increase in reliability.
23
Implementation Measurements
• Given the shortest program, what is the fastest execution rate?
– Output queue must not overflow.
– Input queue must not under-/overflow.
– Slot structure must not be violated.
• RTLinuxPro + direct link (100 MBit/s)
– 10% latency increase due to computation time
– 2.96MByte/s with a 99% guarantee of a successful transmission
– 2.65MByte/s with 99.999% guarantee
• Interrupts preempt instructions, cause jitter, cause delays.
– Send() instruction: 372ns (39.47%), 733ns (99%), 19090ns (99.999%)
=> FPGA! (10us)
24
Conclusions
• Communication behavior differs among applications;
custom protocols/schedules can:
– Significantly improve the application’s performance
– Incorporate/guarantee new properties at the communication layer
• Network Code allows programming custom TT communication behavior.
– Programs are checkable, guarantee properties in the distributed
system.
– Programs are measurable with tree schedules.
– Can run real-time and non-real-time traffic on the same network.
• Future work: multi-resource tree schedules and resource DSLs,
program generation, development methodology
25
Other/Past Research
• Logical execution model has deficits when considering communication.
– Task chains[DASC04]
– Case study[RTAS05]
• How to deterministically reconfigure RT software?
– Use virtualization and separate behavior & functionality[ECRTS05]
• Are mobile agents systems secure? (year 2000)
– No![MA01]
• How to build efficient proactive location-based systems?
– Decentralized architecture[LNAI]
– Use notification service[PERCOM03]
26
Outlook
27
DSLs & Interfaces
Current Approach
close()
Resources
open()
DSL1
send()
x++
X()
read()
malloc()
Software
Component
DSL2
DSL3
receive()
y+=5%x
write()
spawn()
fun()
Network Code
Medium
• What are good abstractions for resources, what are good DSLs?
• What makes the interface of a software component? Is the DSL the interface?
• How can we generate code for a component?
• How do we provide hardware support for DSLs? (e.g., programmable memory)
28