Configuring Global Payroll for Optimal Performance

Download Report

Transcript Configuring Global Payroll for Optimal Performance

Configuring
PeopleSoft Global
Payroll for Optimal
Performance
Session 507
[email protected]
www.go-faster.co.uk
[email protected]
www.omniasolutions.com
Who are we?
 David Kurtz
 Gene Pirogovsky
• Independent Consultant
working for UBS
• Independent Consultant
working for UBS
 Swiss GP project
 Swiss GP project
• Systems Performance
Tuning
 Oracle Databases
 Unix
 PeopleSoft
Applications
• Global Payroll
• Interfaces
• Customizations
2
Configuring Global Payroll
 Physical Database
Considerations
• Oracle specific
 GP Changes
• Reducing I/O
• Reduce CPU
consumption of rules
Engine
• CPU overhead
• Data Migration
3
Initial Impressions
 Payroll is calculated by a Cobol program
• GPPDPRUN
• Single non-threaded process
• Four Stages
 Cancel
 Identify
 (Re-)Calculate
 Finalise
4
Two stages with different behaviours
 Identify
 Calculation
• Populating temporary
work tables
• Evaluation of rules
• Opening cursors
• Batch insert of results
into database
 (Cobol only)
• Database Intensive
• Cobol (CPU) Intensive
• ~20 minutes
• ~5000 segments / hour
/ stream (was 1200)
5
What is Streaming?
 Employees are split into groups defined by
ranges of employee ID
 Each group/range can be processed by a
different instance/stream of GPPDPRUN
 The streams can then be run in parallel.
 Vanilla PeopleSoft functionality.
6
Why is Streaming Necessary?
 GPPDPRUN is a standard Cobol program.
 It is a single threaded process
 One Cobol process can only run on one
CPU at any one time
 33000 employees at 5000
employees/hour/stream
• 6.6hrs if run in one stream
• 27.5 hours at 1200/hr
 On a multi-processor server streaming
enables consumption of extra CPU.
7
Calculation of Stream Definitions
 Objective is roughly equal processing time
for all stream
• PS_GP_PYE_PRC_STAT indicates work to be
done by payroll.
• Calculate ranges of roughly equal numbers of
rows for this table
• Script using Oracle’s Analytic functions that
directly populates PS_GP_STRM
 This does NOT lead to equally sized
GP_RSLT* tables.
8
Partition Boundary Creep
 As new employees hired EMPLIDs allocated
into the same stream.
• That stream starts to run longer.
• Effective execution time is maximum execution
time for all streams.
• Need to periodically recalculate stream ranges
• Need to reflect this is physical changes.
9
Database Contention
 Rollback Contention
 Snapshot Too Old
 Insert Contention
 I/O Volume
• Datafile I/O
• Redo/Archive Log Activity
10
Rollback Contention
 Working Storage Tables
• Shared by all streams
• Rows inserted/deleted during run
• Different Streams never create locks that block
each other,
• Do update different rows in same block during
processing
• 1 interested transaction per stream in many
blocks.
• There is a additional rollback overhead of 16
bytes per row if two rows in same block -vdifferent blocks
 updates of ~<100 bytes / row
11
Read Consistency
 Oracle guarantees that data is consistent
throughout life of a query
• If a block has been updated by another
transaction since a long running query started,
it must be possible to reconstruct the state of
that block at the time the query started using
the rollback segment.
• If that information cannot be found in the
rollback segment the long running query fails
with ORA-01555.
12
ORA-01555 Snapshot Too Old
 Rollback segments are not extended for
read consistency.
 Additional rollback overhead can cause
rollback segments to spin and wrap.
 Error message also described a ‘rollback
segments too small.’
 In this case, to simply extend the
segments is the wrong response.
 CPU overhead to navigate rollback
segment header chain
13
Insert Contention
 During the calculation phase results are
written to the result tables.
 A number of stream can simultaneously
insert into the same result tables.
 Increases chance that one block will
contain rows relating to more than one
stream.
 This in turn causes rollback problems
during the cancel in the next calculation.
14
Another cause of ORA-1555
• If not processing calendar for the first time,
previous results cancelled
 Result table are deleted
 Monolithic deletes from each table.
• If Streams start together tend to delete same
table at same time in each stream.
• A long running delete is also a query for the
purposes of read consistency.
 It is necessary to reconstruct a block as at the time
the long running delete started in order to delete a
row from it.
 Reconstruction occurs during ‘consistent read’.
 Deletes by primary key columns, thus Oracle tends to
look each row up row by index. Thus index reads also
‘consistent’.
15
Datafile and Log Physical I/O Activity
 During the identify phase data is shuffled from
table to table
• This generates datafile and redo log I/O
 Rollback activity is also written to disk, undo
information is also written to the redo log.
 All the data placed in the temporary working
tables by a stream is of no use to any other
instance of the calculation process.
 It will be deleted by a future process.
16
High Water Marks
 The working storage tables tend to be used to
drive processing.
 Thus, the SQL tends to use full table scans.
 In Oracle, High Water Mark is the highest block
that has ever contained data.
 Full Scans scan the table up to the high water
mark.
 Temporary tables contain data for ALL streams.
 All streams can have to scan data for all streams.
17
How to avoid inter-stream contention?
 Keep rows from different streams in different
blocks
 Each block should contain rows for one and only
one stream.
 Two Oracle Features
18
What is Partitioning?
 Logically,
• a partitioned table is a still a single table
 Physically,
• each partition is a separate table.
• in a range partitioned table, the partition in
which a row is placed is determined by the
value of one or more columns.
 Local Index
• is partitioned on the same logical basis as the
table.
19
How should Partitioning used in GP?
 Largest Result tables range each
partitioned on EMPLID to match GP
streaming
• 1 stream : 1 partition
• Thus each stream references one partition in each result
table.
• Only 1 interested transaction per block
• Indexes ‘locally’ partitioned
 Partitioning really designed for DSS
systems. Only efficient for large tables.
• GP_RSLT_ACUM, GP_RSLT_ERN_DED,
• GP_RSLT_PIN, GP_RSLT_PI_DATA
• GP_PYE_PRC_STAT, GP_PYE_SEG_STAT
20
Global Temporary Tables
 Definition is permanently defined in database
catalogue.
 Physically created on demand by database in
temporary tablespace for duration of
session/transaction. Then dropped.
 Each session has its own copy of each referenced
GT table.
 Each physical instance of each GT table only
contains data for one stream.
 Working Storage Tables PS_GP_%_WRK converted
to GT tables.
21
Global Temporary Tables
 Advantages
 Disadvantages
• Not recoverable,
therefore no
Redo/Archive Logging
• Does consume
temporary tablespace
but only during payroll
 some undo
information
 improved
performance
 reduce rollback
• No High Water Mark
problems
• No CBO Statistics
• Can hamper debugging
• New in Oracle 8.1, some
bugs.
• Smaller object to scan.
• No permanent
tablespace overhead.
22
How many streams should be run?
 Cobol run on database
server
 Cobol and database on
different servers
• Either Cobol is active or
database is active
• Cobol active for 2/3 of
execution time.
• No more than one
stream per CPU
• Up to 1.5 streams per
CPU on Cobol server
• Perhaps CPUs -1
• Up to 3 streams per
CPU on database server
 be careful not to
starve database of
CPU
 run process
scheduler at lower
OS priority
23
UBS Production Payroll Configuration
 2 nodes
• Database Node
• Application Server/Process Scheduler Node
 20 CPUs each
 30 Streams
• 2/3 of 30 is 20, so all 20 application server node
CPUS active during calculate phase
 ‘nice’ the Cobol processes
• 1/3 of 30 is 10, so 10 of 16 CPUs active
 important to leave some free CPU for database
else spins escalated to sleeps generating latch
contention
24
QA Payroll Configuration
 2 nodes
• Database Node
• Application Server/Process Scheduler Node
 10 CPUs each
 15 Streams
• Full production volume payroll
• < 1 hour
25
Goals
How to create and test efficient rules that work without
adversely effecting performance
How best to identify problems particularly in the area of
system setup/data versus a problem in a rule or underlying
program
How to use GP payroll debugging tools
26
Efficient Rules
Responsible for two thirds of the execution time, and so
could produce the greatest saving, it will also require the
greatest effort.
Detailed functional and technical analysis of the definition of
the payroll rules.
27
Efficient Rules
 The process ideally starts during the design stage when
various implementation schemes are analysed, intermediate
tests are performed and the most efficient scheme is
chosen.
 All aspects of Global Payroll must be considered since
creating rules to simplify calculation can adversely affect
reporting or other online and batch areas and vice versa.
28
Efficient Rules
 Arrays
 Re-calculate?
 Store / Don’t store
 Formulas
 Proration and Count
 Historical rules
 Generation control versus conditional section
29
Efficient Rules
 Keyed by Employee - 1 select, multiple fetches, small result
set to search
 User Defined - 1 select, multiple fetches, all searches in
memory.
 User Defined with the Reload Option - multiple selects,
multiple fetches, small result set to search.
30
Efficient Rules
31
Efficient Rules
32
Efficient Rules
33
Efficient Rules
34
Efficient Rules
35
Efficient Rules
36
Efficient Rules
37
Efficient Rules
PIN_NUM
PIN_NM
1677 CH_CA_AR010
1679 CH_CA_AR020
1269 CH_CA_AR015
1986 CH_FK_AR004
1521 CH_CA_AR030
1408 CH_TX_AR005
1912 CH_CA_AR012
1206 CH_TX_AR003
Before
Count Time Average Count
1569
4379
3138
3138
4610
704
1569
704
2818.56 1.796405
2316.64 0.529034
1663.98 0.530268
1663.98 0.530268
115.52 0.025059
111.43 0.158281
68.27 0.043512
40.5 0.057528
1569
4379
3138
3138
4610
704
1569
704
Total
8798.88
Difference 2 hours 13 minutes 40 seconds running 1569
After
Time
509.08
55.81
40.72
34.3
38.11
22.37
69.76
7.88
Average
0.324461
0.012745
0.012976
0.010931
0.008267
0.031776
0.044461
0.011193
778.03
employees
38
Efficient Rules
After Modification 8%
Before
Modification 92%
39
t126297:
Migration/Customization
 PI v. Array
• PI can be used during identification.
• PI has special considerations during eligibility
checking.
• PI allows easy override of components on
element definition such as Unit, Rate, Percent
or Base.
• The Array cannot handle multiple instances of
earning/deduction.
40
t126297:
Debugging Tools
 Audit Trace
• Trace All
• Trace Errors
• Large number of records, potential rollback
segment size problems
• View on-line
• Query with SQL
41
t126297:
Debugging Tools
42
t126297:
Debugging Tools
43
t126297:
Debugging Tools
select * from sysadm.ps_gp_audit_tbl
where emplid = '884324'
and cal_run_id = 'ErrMigr'
and pin_num = 40811
order by audit_sort_key, audit_seq_num;
select * from sysadm.ps_gp_audit_tbl
where emplid = '884324'
and cal_run_id = 'ErrMigr'
and audit_sort_key = 229
order by audit_seq_num;
44
t126297:
Debugging Tools
select emplid, audit_sort_key as key
,audit_seq_num as seq, pin_chain_rslt_num as rslt_num
,b.pin_nm, a.pin_num
,pin_status_ind as status, c.pin_nm
,a.pin_parent_num as parent, a.fld_fmt as fmt
,calc_rslt_val as num, date_pin_val as dateval
,chr_pin_val as chr, pin_val_num as pin
from ps_gp_audit_tbl a, ps_gp_pin b, ps_gp_pin c
where cal_run_id = 'U_22_CI0101'
and (emplid, audit_sort_key) in
(select emplid, audit_sort_key
from ps_gp_audit_tbl
where cal_run_id = 'U_22_CI0101'
and pin_num =
(select pin_num
from ps_gp_pin
where pin_nm = 'CH_EP_CHK_1002FF'))
and a.pin_num = b.pin_num
and a.pin_parent_num = c.pin_num
order by emplid, audit_sort_key, audit_seq_num
45
t126297:
Debugging Tools
46
Debugging Tools
AUDIT_SORT_KEY
AUDIT_SEQ_NUM
INSTANCE_NUM
SLICE_BGN_DT
SLICE_END_DT
PIN_CHAIN_LVL_NUM
PIN_CHAIN_RSLT_NUM
PIN_NUM
PIN_STATUS_IND
PIN_PARENT_NUM
FLD_FMT
OLD_VALUE_IND
CALC_RSLT_VAL
CALC_ADJ_VAL
CALC_RAW_VAL
DATE_PIN_VAL
CHR_PIN_VAL
PIN_VAL_NUM
DIFF_SECONDS
BAD_TRACE_IND
SUM_INSTANCE_IND
47
Conclusion
 Use of Partitioning and Global Temporary
Tables reduce (almost eliminate) interstream contention.
 This permits use of streaming to utilise all
available CPUs.
 GP will always be a CPU bound process
• Rule Tuning will reduce CPU overhead
• It is an on-going process
48
And there’s more
 This has been a very concentrated session
 Round Table Discussion session 626
• 15.30-16.30
• Discuss some areas in more detail.
 However, we do have time for some
questions now...
49
Configuring
PeopleSoft Global
Payroll for Optimal
Performance
Session 507
[email protected]
www.go-faster.co.uk
[email protected]
www.omniasolutions.com