
Is Java Ready for Real-Time Embedded Systems?
Angelo Corsaro
DOC Group
Washington University
St. Louis, USA
David Sharp, Jim Urnes, Jr.
james.m.urnes-jr@boeing.com
Boeing, St. Louis, USA
Table of Contents
 Real Time Java
 General Performance
 JVM98 SPEC Benchmark
 Filtering
 Hanoi Tower
 Garbage Collection
 RT Performance
 Thread Preemption Behavior
 Priority Inversion Avoidance
 Jitter in Periodic Event Handling
 Timing in Periodic Event Handling
 Concluding Remarks
What’s Going On?
Real Time Java
 During 2000, two specifications for extending Java with real-time capabilities appeared:
 RTSJ, from the Real Time Java Expert Group (headed by Greg Bollella)
 J-Consortium Specification for Real Time Java
 Sun supports the RTSJ, and the market seems to be heading in the same direction
 Ajile had already implemented a subset of the spec in hardware last August
 IBM provided us with the first “implementation” of an RTJVM
 TimeSys has been selected by Sun to implement the reference JVM for the RTSJ specification
What is RTJava
 Real Time Java is an extension of the Java platform that allows the development of real-time applications
 The RTJava specification enhances 7 areas:
 Thread Scheduling and Dispatching
 Memory Management
 Synchronization and Resource Sharing
 Asynchronous Event Handling
 Asynchronous Transfer of Control
 Asynchronous Thread Termination
 Physical Memory Access
General Performance
JVM98 SPEC Benchmark 1/3
 The JVM98 SPEC Benchmark is composed of the following applications:
 _200_check
 _201_compress
 _202_jess
 _209_db
 _213_javac
 _222_mpegaudio
 _227_mtrt
 _228_jack
JVM98 SPEC Benchmark
[Figure: bar chart of JVM98 grades (axis 0 to 120) by application: _201_compress, _202_jess, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack. Series: MS-Max, Sun-Max, IBM-Max. Chart labels: J9 JVM98 Results, JView JVM98 Results, Javac JVM98 Results.]
Application Description
 Finite Impulse Response (FIR) filter with 21 taps
 The filter implements noise attenuation on audio samples
 Filter coefficients and sample data are doubles
[Diagram: Audio Sample x(k) -> Filter -> Filtered Audio Sample y(k)]
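As a minimal sketch of the kind of computation this benchmark exercises, the code below applies a direct-form FIR filter, y(k) = sum over i of h(i) * x(k - i), to an array of doubles. The class name FirFilter and the moving-average coefficients in main are assumptions for illustration; the slides give neither the actual coefficients nor the benchmark source.

```java
final class FirFilter {
    /**
     * Direct-form FIR filter: y[k] = sum_i h[i] * x[k - i].
     * Samples before the start of the input are treated as zero.
     */
    static double[] filter(double[] h, double[] x) {
        double[] y = new double[x.length];
        for (int k = 0; k < x.length; k++) {
            double acc = 0.0;
            for (int i = 0; i < h.length && i <= k; i++) {
                acc += h[i] * x[k - i];
            }
            y[k] = acc;
        }
        return y;
    }

    public static void main(String[] args) {
        // Hypothetical 21-tap moving-average coefficients (not from the slides).
        double[] h = new double[21];
        java.util.Arrays.fill(h, 1.0 / 21.0);

        // Synthetic input: 30000 samples of a slow sine wave.
        double[] x = new double[30000];
        for (int k = 0; k < x.length; k++) {
            x[k] = Math.sin(2.0 * Math.PI * k / 1000.0);
        }

        long start = System.nanoTime();
        double[] y = filter(h, x);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Filtered " + y.length + " samples in " + elapsedMs + " msec");
    }
}
```

Timing the call for 30000 to 120000 samples would give measurements of the same shape as the Filter Performance chart, though not necessarily comparable numbers.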
Filter Performance
[Figure: filter execution time in msec (axis 0 to 200) versus number of samples (30000, 60000, 90000, 120000) for C++, Jview, Sun-JDK, and IBM-J9.]
Application Description
 Java implementation of the algorithm that solves the well-known Hanoi Tower game
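The classic recursive solution can be sketched as follows. This is an illustrative version, not the benchmark source; the class name Hanoi and the choice to count moves rather than print them are assumptions.

```java
final class Hanoi {
    /**
     * Moves `disks` disks from peg `from` to peg `to` using peg `via`
     * and returns the number of single-disk moves performed.
     */
    static long solve(int disks, char from, char to, char via) {
        if (disks == 0) {
            return 0;
        }
        long moves = solve(disks - 1, from, via, to); // clear the way
        moves++;                                      // move the largest disk
        moves += solve(disks - 1, via, to, from);     // restack on top of it
        return moves;
    }

    public static void main(String[] args) {
        int disks = 20;
        long start = System.nanoTime();
        long moves = solve(disks, 'A', 'C', 'B');
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(disks + " disks: " + moves + " moves in " + elapsedMs + " msec");
    }
}
```

The move count is 2^n - 1 for n disks, so each added disk roughly doubles the work. That makes the benchmark a stress test of method-call overhead.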
Hanoi Tower
[Figure: Hanoi. Execution time in msec (axis 0 to 18000) versus number of disks (20, 25, 26, 27) for Sun-Jit, J9-Jit, J9-AOC, and JView.]
Application Description
 Build a full binary tree of height H
 Each node of the binary tree allocates a slot of memory of B bytes
 Analyze the following cases:
 Once the tree is built, remove the reference to the root and force garbage collection
 Create and release nodes while building the tree, to investigate whether the JVM has some sort of incremental GC
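The first case can be sketched in plain Java as below. The class and method names are hypothetical, "levels" here means the number of node levels (so a tree has 2^levels - 1 nodes), and System.gc() is only a request that the JVM may handle differently from the JVMs tested in the slides.

```java
final class GcTreeSketch {
    /** A tree node that allocates `bytes` bytes of payload on construction. */
    static final class Node {
        final byte[] payload;
        final Node left;
        final Node right;

        Node(int levels, int bytes) {
            payload = new byte[bytes];
            left = levels > 1 ? new Node(levels - 1, bytes) : null;
            right = levels > 1 ? new Node(levels - 1, bytes) : null;
        }
    }

    static long countNodes(Node n) {
        return n == null ? 0 : 1 + countNodes(n.left) + countNodes(n.right);
    }

    /** Returns {build time, forced-GC time} in msec. */
    static long[] timeBuildAndCollect(int levels, int bytes) {
        long t0 = System.nanoTime();
        Node root = new Node(levels, bytes);
        long t1 = System.nanoTime();
        root = null;   // drop the only reference to the tree
        System.gc();   // request a collection (a hint, not a guarantee)
        long t2 = System.nanoTime();
        return new long[] {(t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000};
    }

    public static void main(String[] args) {
        long[] t = timeBuildAndCollect(14, 64);
        System.out.println("build: " + t[0] + " msec, forced GC: " + t[1] + " msec");
    }
}
```

The two returned values roughly correspond to the "Method Call" and "Imposed GC" rows reported in the results, assuming those rows separate tree construction from the forced collection.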
Garbage Collection Performance 1/5
Garbage Collection Test (J9), times in msec

Tree Depth     14     16     18
Tot Time       78    438   1902
Method Call    62    407   1782
Imposed GC     16     31    125
Garbage Collection Performance 2/5
Garbage Collection Test J9-RAC, times in msec

Tree Depth     14     16     18     20
Total Time     16     63    235    906
Method Call    16     63    235    906
Imposed GC      0      0      0      0
Garbage Collection Performance 3/5
[Figure: C++ vs. Java Memory Alloc/Dealloc. Time in msec (axis 0 to 2000) versus tree depth (14, 16, 18, 20) for C++-RAC, C++, J9-RAC, and Sun-RAC.]
Garbage Collection Performance 4/5
[Figure: GC time in msec (axis 0 to 160) versus tree depth (14, 16, 18) for J9-GC Time, Sun-GC Time, Sun-Xincgc GC Time, J9-GC-rac, and Sun-GC-rac.]
Garbage Collection Performance 5/5
[Figure: time in msec (axis 0 to 20000) versus tree depth (14, 16, 18, 20) for C++, J9, Sun-JVM, C++RAC, J9-RAC, and Sun-RAC.]
RT Performance
Test Cases
 Real-time determinism test cases
 RealtimeThread preemption handling (PreemptTest)
 Priority inversion avoidance (PriInvertTest)
 Dispatching of AsyncEvents (EventDispatchTest2)
 Jitter in periodic event handling (EventJitterTest)
 Timing of periodic event handling (EventTimingTest)
PreemptTest Scenario
 Purpose:
 Measure whether priority preemption occurs correctly for
multiple RealtimeThreads of different priorities
 Method:
 Stagger the start of fixed-duration, processor-holding
RealtimeThreads of increasing or decreasing priority. Using
timestamp logging, see when threads enter and exit in
relation to each other
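The method can be approximated in standard Java as sketched below. This is not the test used in the slides: it uses java.lang.Thread (whose priorities are only hints on most desktop JVMs) in place of RealtimeThread, which lives in javax.realtime and is not part of the standard JDK. The class name, the log format, and the scaled-down timings are all assumptions.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class PreemptSketch {
    /**
     * Starts `threads` busy-waiting threads of decreasing priority, one every
     * `staggerMs`, each holding the processor for about `holdMs` of wall-clock
     * time. Returns the enter/exit log in the order entries were recorded.
     */
    static List<String> run(int threads, long staggerMs, long holdMs) {
        List<String> log = new CopyOnWriteArrayList<>();
        long start = System.nanoTime();
        Thread[] workers = new Thread[threads];
        try {
            for (int i = 0; i < threads; i++) {
                int id = i + 1;
                workers[i] = new Thread(() -> {
                    log.add(stamp(start, id, "enter"));
                    long deadline = System.nanoTime() + holdMs * 1_000_000L;
                    while (System.nanoTime() < deadline) {
                        // busy-wait: keep the processor instead of sleeping
                    }
                    log.add(stamp(start, id, "exit"));
                });
                // Thread 1 gets the highest priority, later threads lower ones.
                workers[i].setPriority(Math.max(Thread.MIN_PRIORITY, Thread.MAX_PRIORITY - i));
                workers[i].start();
                Thread.sleep(staggerMs);
            }
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    private static String stamp(long start, int id, String what) {
        long ms = (System.nanoTime() - start) / 1_000_000;
        return String.format("%6d ms  thread %d %s", ms, id, what);
    }

    public static void main(String[] args) {
        run(3, 20, 50).forEach(System.out::println);
    }
}
```

On a single processor with strict priority preemption, each thread's "exit" would be logged before any lower-priority thread's "enter". On a multi-core machine with a general-purpose scheduler the entries will interleave, which is exactly what the timestamp log is meant to reveal.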
PreemptTest Results
Starting at priority 7, the threads are started every 2 seconds in decreasing priority order. Each thread tries to keep the processor for 7 seconds.
[Figure: thread entry/exit timeline. Annotations: "This works!"; "Problem! Enter and leave in blocks of 3".]
Result: Thread 1 worked correctly: it kept the processor until it was done. But threads 2, 3, and 4 all started immediately when thread 1 finished. They ran simultaneously, and when they completed, threads 5, 6, and 7 all started and ran simultaneously. Thread preemption did not work correctly.
PreemptTest Analysis
 Problem
 The IBM J9 maps multiple Java thread priority levels to a single underlying RTOS thread priority level
 RTSJ requires at least 28 unique priority levels

Java Thread Priority    QNX Neutrino Thread Priority
28, 29, 30              15r
25, 26, 27              14r
22, 23, 24              13r
19, 20, 21              12r
16, 17, 18              11r
13, 14, 15              10r
10, 11, 12              9r
7, 8, 9                 8r
4, 5, 6                 7r
1, 2, 3                 6r

 Suggestion
 Allow underlying RTOS threads to be FIFO as well as Round Robin
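The table can be written as a closed-form mapping, which makes the "blocks of 3" behavior easy to see. The formula below is inferred from the table rows and is not taken from J9 itself; the class and method names are hypothetical.

```java
final class PriorityMapping {
    /**
     * Maps a Java priority (1..30) to the QNX Neutrino priority shown in the
     * table: every three consecutive Java levels share one QNX level (6..15).
     */
    static int toQnx(int javaPriority) {
        if (javaPriority < 1 || javaPriority > 30) {
            throw new IllegalArgumentException("Java priority must be 1..30: " + javaPriority);
        }
        return 6 + (javaPriority - 1) / 3; // integer division groups levels in threes
    }

    public static void main(String[] args) {
        // The PreemptTest threads: Java priorities 7 down to 1.
        for (int thread = 1, pri = 7; pri >= 1; thread++, pri--) {
            System.out.println("thread " + thread + ": Java pri " + pri
                    + " -> QNX pri " + toQnx(pri) + "r");
        }
    }
}
```

Under this mapping thread 1 (Java priority 7) is alone at QNX priority 8, threads 2 to 4 (Java 6, 5, 4) share QNX priority 7, and threads 5 to 7 (Java 3, 2, 1) share QNX priority 6. That is consistent with the observed groups of three running time-sliced together.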
PriInvertTest Scenario
 Purpose:
 Measure whether priority inversion is properly avoided.
 Method:
 A low priority thread obtains a lock on a synchronized method, a medium priority thread preempts the low priority thread, and a high priority thread preempts the medium priority thread and attempts to obtain a lock on the same synchronized method. Log entry and exit times for the threads and the synchronized method.
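The scenario can be illustrated with a small tick-based simulation, assuming a strict fixed-priority scheduler on a single processor. It models the behavior rather than measuring a real JVM. The class name, arrival ticks, and work amounts are hypothetical values chosen to reproduce the three-thread setup described above.

```java
import java.util.ArrayList;
import java.util.List;

final class InversionSim {
    static final class Task {
        final String name;
        final int priority;
        final int arrival;
        final boolean needsLock; // true if all of its work runs inside the shared lock
        int remaining;

        Task(String name, int priority, int arrival, int work, boolean needsLock) {
            this.name = name;
            this.priority = priority;
            this.arrival = arrival;
            this.remaining = work;
            this.needsLock = needsLock;
        }
    }

    /** Runs the scenario and returns task names in completion order. */
    static List<String> run(boolean inheritance) {
        Task[] tasks = {
            new Task("low", 1, 0, 4, true),
            new Task("medium", 2, 1, 3, false),
            new Task("high", 3, 2, 2, true),
        };
        List<String> finished = new ArrayList<>();
        Task holder = null; // current owner of the shared lock
        for (int tick = 0; finished.size() < tasks.length; tick++) {
            Task next = null;
            int best = -1;
            for (Task t : tasks) {
                if (t.arrival > tick || t.remaining == 0) continue;          // not ready
                if (t.needsLock && holder != null && holder != t) continue;  // blocked on lock
                int effective = t.priority;
                if (inheritance && t == holder) {
                    // The lock holder inherits the priority of any waiting task.
                    for (Task w : tasks) {
                        boolean waiting = w != t && w.needsLock
                                && w.arrival <= tick && w.remaining > 0;
                        if (waiting) effective = Math.max(effective, w.priority);
                    }
                }
                if (effective > best) {
                    best = effective;
                    next = t;
                }
            }
            if (next == null) continue;          // idle tick
            if (next.needsLock) holder = next;   // acquire (or keep) the lock
            if (--next.remaining == 0) {
                if (holder == next) holder = null; // release on completion
                finished.add(next.name);
            }
        }
        return finished;
    }

    public static void main(String[] args) {
        System.out.println("without inheritance: " + run(false));
        System.out.println("with inheritance:    " + run(true));
    }
}
```

Without inheritance the medium priority task finishes before the high priority one, which is the inversion the test looks for. With inheritance the low priority task is boosted while it holds the lock, so the high priority task finishes ahead of the medium one.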
PriInvertTest Results
Result: The low priority thread did NOT get elevated to high priority to finish the shared method. Therefore,
the medium priority thread got to finish before the high priority thread. Priority inversion occurred.
PriInvertTest Analysis
 Problem
 Priority inheritance does not currently work
 RTSJ specifies priority inheritance as the default priority inversion avoidance method for synchronized blocks
EventDispatchTest Scenario
 Purpose:
 Measure the execution order for multiple
AsyncEventHandlers of different priority when an
AsyncEvent fires for which they are all registered.
 Method:
 Set up three AsyncEventHandlers of different priority, all
registered to the same AsyncEvent. Issue the event and
have the handlers log timestamps on entry and exit.
EventDispatchTest Results
In this test, the Java priorities are such that the resulting QNX priorities are different.
Result: This is correct. The highest priority handler runs first, then the next highest, and then the lowest.
In this test, the Java priorities are such that the resulting QNX priorities are the same.
Result: This is incorrect. The handlers don’t start in priority order, and they preempt each other.
EventDispatchTest Analysis
 Problem
 The OTI J9 maps multiple Java thread priority levels to a single underlying RTOS thread priority level
 RTSJ requires 28 unique priority levels

Java Thread Priority    QNX Neutrino Thread Priority
28, 29, 30              15r
25, 26, 27              14r
22, 23, 24              13r
19, 20, 21              12r
16, 17, 18              11r
13, 14, 15              10r
10, 11, 12              9r
7, 8, 9                 8r
4, 5, 6                 7r
1, 2, 3                 6r
EventJitterTest Scenario
 Purpose:
 Measure the amount of time variation between runs of a PeriodicTimer driven AsyncEventHandler with various other activity occurring.
 Method:
 Set up a PeriodicTimer object to fire an AsyncEventHandler at a fixed rate, while other lower priority, processor-keeping RealtimeThreads run. Log a timestamp each time the handler runs. After the run, import the data into Excel for analysis.
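The post-run analysis can be sketched as below. It assumes the logged timestamps are in seconds and that "error range" means the spread between the longest and shortest observed period; the slides do not define the term, so that definition and all names here are assumptions.

```java
final class JitterStats {
    /** Differences between consecutive timestamps, i.e. the observed periods. */
    static double[] periods(double[] timestamps) {
        if (timestamps.length < 2) {
            throw new IllegalArgumentException("need at least two timestamps");
        }
        double[] p = new double[timestamps.length - 1];
        for (int i = 1; i < timestamps.length; i++) {
            p[i - 1] = timestamps[i] - timestamps[i - 1];
        }
        return p;
    }

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    /** Population standard deviation. */
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) sumSq += (x - m) * (x - m);
        return Math.sqrt(sumSq / xs.length);
    }

    /** Longest period minus shortest period. */
    static double errorRange(double[] xs) {
        double min = xs[0];
        double max = xs[0];
        for (double x : xs) {
            min = Math.min(min, x);
            max = Math.max(max, x);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Hypothetical handler timestamps for a nominal 50 msec period.
        double[] ts = {0.000, 0.050, 0.101, 0.150, 0.200};
        double[] p = periods(ts);
        System.out.printf("mean=%.4f s, std dev=%.5f s, range=%.4f s%n",
                mean(p), stdDev(p), errorRange(p));
    }
}
```

These are the same three quantities (mean period, standard deviation, and error range) that the later analysis slides plot against the number of other threads.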
EventJitterTest Results
In this 1 second test, our AsyncEventHandler runs at Java priority 30. Another RealtimeThread runs at Java priority 6. The PeriodicTimer event fires every 50 msecs (20 Hz).
Result: Quite good. Jitter between runs of the handler is within the RTOS timer resolution.
EventJitterTest Results (cont)
In this 1 second test, our AsyncEventHandler runs at Java priority 30. 200 other RealtimeThreads run at Java priority 6. The PeriodicTimer event fires every 50 msecs (20 Hz).
Result: Not bad. There is some jitter (+/- 1.1 msec) between runs, but lower priority threads do seem to affect jitter.
EventJitterTest Results (cont)
In this 1 second test, our AsyncEventHandler runs at Java priority 30. Another RealtimeThread runs at Java priority 10. The PeriodicTimer event fires every 50 msecs (20 Hz).
Result: Bad. The periodic events never get to the handler, even though the handler has higher priority than the other RealtimeThread.
EventJitterTest Analysis
 Apparent AsyncEventHandler mechanization
 At least for PeriodicTimer driven AsyncEvents
[Diagram: the PeriodicTimer’s fire() method is called on a “Spawner” thread (QNX pri=8r). A handler thread of the desired priority (Java pri=30, QNX pri=15r) is spawned for each firing. Another thread runs at Java pri=10, QNX pri=9r.]
Problem! If the other thread (Java pri=10, QNX pri=9r) is using the processor, the “Spawner” thread will not get the opportunity to spawn the handler thread. Priority inversion occurs.
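The interaction with the "Spawner" can be checked with a few lines of code. The sketch assumes the Java-to-QNX mapping groups Java levels in threes (as in the priority table shown earlier) and that the spawner runs at QNX priority 8 as the diagram suggests. The class, enum, and method names are hypothetical.

```java
final class SpawnerCheck {
    /** QNX priority of the "Spawner" thread as reported in the diagram. */
    static final int SPAWNER_QNX_PRIORITY = 8;

    enum Effect { NONE, TIME_SLICED_CONTENTION, STARVES_SPAWNER }

    /** Assumed mapping: Java priorities 1..30 map to QNX 6..15 in groups of three. */
    static int toQnx(int javaPriority) {
        if (javaPriority < 1 || javaPriority > 30) {
            throw new IllegalArgumentException("Java priority must be 1..30: " + javaPriority);
        }
        return 6 + (javaPriority - 1) / 3;
    }

    /** Effect of a processor-keeping thread at the given Java priority on the spawner. */
    static Effect effectOf(int otherJavaPriority) {
        int qnx = toQnx(otherJavaPriority);
        if (qnx > SPAWNER_QNX_PRIORITY) {
            return Effect.STARVES_SPAWNER;        // spawner never gets the processor
        }
        if (qnx == SPAWNER_QNX_PRIORITY) {
            return Effect.TIME_SLICED_CONTENTION; // round-robin with the spawner
        }
        return Effect.NONE;                       // spawner preempts the other thread
    }

    public static void main(String[] args) {
        for (int pri : new int[] {1, 6, 9, 10}) {
            System.out.println("other thread at Java pri " + pri + ": " + effectOf(pri));
        }
    }
}
```

Under these assumptions a Java priority 10 thread outranks the spawner even though it is far below the handler's priority 30. That matches the result in which the periodic events never reached the handler.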
EventJitterTest Analysis (cont)
 Results for AsyncEventHandler with priority 30 (highest)
 All data points are 100 second runs at 20 Hz (50 msecs)
 3 sets of data analyzed, with other RealtimeThreads run at priority 1, 6, and 9 respectively
[Figure: Num of Other Threads vs. Error Range (i.e. Jitter). Error range in secs (axis 0 to 0.06) versus number of other threads (0 to 350); series pri=1, pri=6, pri=9.]
 Dip at exactly 50 threads is repeatable
 Other threads are processor-keeping and run time-sliced at the same priority
 For priority 9, jitter is off the scale due to contention with the “spawner” (previous slide)
EventJitterTest Analysis (cont)
Absolute period measurements
[Figure: three "Period for Each Sample" panels plotting period in secs against sample number (0 to 2000). Two panels are titled "50 other threads, pri=6" and one "30 other threads, pri=6". Axis tick ranges in the extraction include 0.02 to 0.08, 0.0485 to 0.0515, and 0 to 0.08.]
 0 or 50 other RT threads result in jitter values around +/- 1 msec, probably due to RTOS timer resolution
 For other numbers of RT threads, some jitter outside +/- 1 msec occurs
EventJitterTest Analysis (cont)
Standard deviation and mean
[Figure: Number of Other Threads vs. Std. Dev. Std dev (axis 0 to 0.0025) versus number of other threads (0 to 350); series pri=1, pri=6.]
 Std dev trend increases with the number of other RT threads
[Figure: Number of Other Threads vs. Mean Period. Mean period in secs (axis 0 to 0.1) versus number of other threads (0 to 350); series pri=1, pri=6, pri=9.]
 Degradation above 250 threads, perhaps hitting the processor throughput limit
EventTimingTest Scenario
 Purpose:
 Measure the time it takes for an AsyncEvent to get from firing to its handler. Vary priority and other processor activity.
 Method:
 Set up an AsyncEvent and associated AsyncEventHandler at a specified priority. Over a number of iterations, log the timestamp before firing and upon entry into the handler. Run a variable number of other processor-keeping RealtimeThreads. Import results into Excel for analysis.
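The measurement pattern (timestamp before firing, timestamp on handler entry) can be sketched with standard Java. javax.realtime is not part of the standard JDK, so this sketch substitutes a single-thread ExecutorService for the AsyncEvent and its handler. That substitution, and all names below, are assumptions; the numbers it produces say nothing about an RTSJ implementation.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class LatencySketch {
    /**
     * For each sample, records the time just before handing work to the
     * handler thread and the time on entry into the handler, and returns
     * the differences in nanoseconds.
     */
    static long[] measure(int samples) {
        ExecutorService handlerThread = Executors.newSingleThreadExecutor();
        long[] latencyNanos = new long[samples];
        try {
            for (int i = 0; i < samples; i++) {
                final long fired = System.nanoTime();           // timestamp before "firing"
                latencyNanos[i] = handlerThread
                        .submit(() -> System.nanoTime() - fired) // timestamp on handler entry
                        .get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            handlerThread.shutdown();
        }
        return latencyNanos;
    }

    public static void main(String[] args) {
        long[] latencies = measure(100);
        long sum = 0;
        for (long l : latencies) sum += l;
        System.out.printf("average latency over %d samples: %.3f msec%n",
                latencies.length, sum / (double) latencies.length / 1_000_000.0);
    }
}
```

Running additional busy threads alongside this loop, at varying priorities, would mirror the "variable number of other processor-keeping RealtimeThreads" part of the method.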
EventTimingTest Results
High priority (30) AsyncEventHandler with 50 other processor-keeping RT threads of priority 1. 100 samples taken. Result plotted.
[Figure: Latency for each sample, 50 other threads, pri=1. Latency in secs (axis 0.00196 to 0.00206) versus sample number (0 to 100).]
EventTimingTest Analysis (cont)
High priority (30) AsyncEventHandler with various numbers of other processor-keeping RT threads. 100 samples taken for each point.
[Figure: Num of Other Threads vs. Avg. Event Latency; handler pri=30, other pri=1, 6, 12. Avg. latency in secs (axis 0 to 0.004) versus number of other threads (100 to 900); series pri=1, pri=6, pri=12.]
 About 2 msec average latency for this condition below 300 other threads
 Still under investigation
 If handler priority is lowered to 15, the average latency improves to about 0.6 msec
 Does not exhibit the “spawner” thread priority inversion of the PeriodicTimer driven AsyncEventHandler
Summary Of Experimental Results
 Java thread priority preemption must be maintained for each of the 28 required thread levels.
 AsyncEventHandler threads driven by PeriodicTimers must not be prevented from running by lower priority threads.
 Priority inheritance should work by default.
 Programmers need better control of the mapping from Java threads to underlying RTOS threads.
Concluding Remarks
 JIT compilation gives Java acceptable performance, and improvements in this area will close the performance gap with C++ (which is already within a factor of 2)
 The J9 RT-Java extensions are still very minimal and cannot yet be considered for industrial-strength development
 The J9 RT-Java extensions suffer from some major problems, mainly:
 Priority Mapping
 Priority Inversion Effect induced by the “Spawner”
 This investigation has provided us with a test bed that will make it easier to evaluate the “performance” of an RTJVM once mature (complete) implementations start to appear