Progress Database and Hardware Internals

Download Report

Transcript Progress Database and Hardware Internals

Progress Database Setup,
Maintenance and Tuning
Adam Backman
V.P. of Technology
White Star Software
[email protected]
Agenda
Progress architecture
 Hardware configuration discussion
 Database maintenance
 Performance tuning
 Database future direction

Database Internals

Internal Blocks




Data blocks
Index blocks
Other block types
Physical Layout




Data storage areas
Primary recovery area
After image journal
Other storage
Database Blocks
Master Blocks
 Data Block (RM Block)
 Index Block (Ix Block)
 Index Anchor Block
 Free Blocks
 Empty Blocks

Master Block
This stores the “master” information for the
database including:
Area Status (opened, closed, crashed)
 Last opened date & time
 High water mark for the area
 Last backup date & time

This information and more can be retrieved
through the virtual system tables (VSTs)
Data Blocks
These are also known as RM (Record
Manager) blocks
 Can contain information from one or more
tables
 They can be “full” (RM Blocks)
 Partially full (RM Chain blocks)

Record Storage





In most environments, records are mixed from
different tables in the same block
Progress can store from 1 to 256 records per block
per storage area
All areas for a database must have the same block
size (1 – 8kb)
Total records per area is fixed
More records per block equals lower total blocks
Index Blocks
Also known as IX blocks
 Only contain information from one index
 Always considered partially full, blocks will
split to accommodate growth
 Each block contains the address of itself,
the next and previous blocks to support
forward and reverse searches

Index Structure
Balanced B-tree
 Compressed data
 All data access* through index

*Except rowid access
Progress Index Layout
Example
2-31
2-11
Record 2
.
.
.
Record 11
12-21
Record 12
.
.
.
Record 21
22-31
32-65
32-41
Record 22
.
.
.
Record 31
43-54
Record 32
.
.
.
Record 41
66-98
66-85
55-65
Record 43
.
.
.
Record 54
Record 55
.
.
.
Record 65
Record 66
.
.
.
Record 85
86-98
Record 86
.
.
.
Record 98
Free Blocks
Contain address information
 No affiliation with IX or RM until utilized
 Under high water mark of the database

Empty Blocks
White space
 No addresses
 Under total blocks of the database (area)
 Above high water mark

Storage Areas
Data objects
 Control Area
 Schema/Default Area
 Primary recovery area
 Application Data area(s)

Storage Areas – Data Objects
Index object
 Table object
 Schema object
 Sequences

Control Area
Always has a .db extension
 Describes the “physical schema” or layout
of the database
 Lists storage areas and extents associated
with a database
 Also known as the structure file

Default Storage Area
Also known as the “schema” area
 This is always area 6
 All information that is not assigned a
storage area will be stored here
 All information is stored in this area if the
data is converted from version 8 with a
conv89

Application Data Areas
An area can contain 1 or more data objects
 An area can have 1 or more extents
 Tables and indexes can share areas
 These use area numbers 7-1000

Primary Recovery Area
General information
 Format
 Reuse
 Same as bi file(s) in version 8 and earlier
 This is always area number 3

Primary Recovery Area
General Info
This process is automatic and can’t be
turned off
 This file is vital to database integrity both
physical and logical
 This file is generally sequential

Primary Recovery Area
Format
Each block is a cluster
 Each cluster contains information regarding
transactions to allow transaction undo and
redo
 The transaction information is called notes
 Each note contains a transaction id

Primary Recovery Area
Reuse
Clusters fill sequentially
 When the last formatted cluster is reached
there is a reuse decision point
 The “oldest” cluster is examined to
determine if it can be reused
 Then, either the oldest cluster is reused or
another cluster is added, formatted and used

Progress Memory Architecture
The database engine can be serverless
 The database engine can be multi-server
 Progress applications can be host-based
 Progress applications can be 2-tier
client/server
 Progress applications can be n-tier
client/server

Shared Memory
Host-based Configuration
Record locks
(-L)
Index Cursors
(-c)
Buffers (-B)
After Image
Buffers
User
Control
Table
Server
Control
Table
Latch
Control
Table
Other
Stuff
Before Image
Buffers
Hash Table
What are Latches?
Concurrency control mechanism
 Very course in old versions of Progress
 More granular in current versions of
Progress

Shared Memory
Client/Server Configuration
Record locks
(-L)
Listen
Socket
Index Cursors
(-c)
Buffers (-B)
After Image
Buffers
User
Control
Table
Server
Control
Table
Latch
Control
Table
Other
Stuff
Before Image
Buffers
Hash Table
Servers
Shared Memory
Client/Server Configuration
Database
Broker
Memory
AppServer
Listen
Socket
AppServer
Servers
Hardware Configurations
Disk Considerations
 Memory Allocation
 CPU Considerations

Disk Contention
In most environments disks are the largest
area for improvement. All of the data flows
from the disks to the other resources so this
effects both local and networked users
Balancing Disk I/O
Balancing disk I/O is the process of making
sure you are using all of the available disk
resources (filesystems, disks and controllers)
are working equally as hard at load. This is
also called eliminating variance. A well
tuned system will have less than a 15%
variance
What Causes Disk I/O?
Operating system (swapping and paging)
 Progress

 Database
(DB and BI)
 Application (code and temp files)

Other applications
What RAID Really Means
RAID has many levels. I will only cover a
few
 RAID 0: This level is also called striping.
 RAID 1: This is referred to as mirroring.
 RAID 5: Most common RAID level
 RAID 10: This is mirroring and striping.
Also known as RAID 0 + 1
Raid 0: Striping
Disk 1
Disk 2
Disk 3
Disk
Array
Stripe 1
Stripe 2
Stripe 3
Stripe 4 ...
Volume Set
Raid 0: Striping (continued)
Good for read and write I/O performance
 No failover protection
 Lower data reliability (1 fails they all fail)

Raid 1: Mirroring
Disk 1
Disk 2
Primary
Parity 1
Parity 2
Parity
Raid 1: Mirroring (continued)
OK for read and write applications
 Good failover protection
 High data reliability
 Most expensive in terms of hardware

Raid 5: Poor Man’s Mirroring
This is the kiss of death for OLTP
performance
 User information is striped
 Parity information is striped WITH user
information
 OK for 100% read only applications
 Poor performance for writes

Raid 10: Mirroring and Striping
Good for read and write applications
 High level of data reliability though not as
high as RAID 1 due to striping
 Just as expensive as RAID 1

Software Methods for I/O
Distribution
Manual spread of data across non-striped
disks
 Better control as you can see where the I/O
is going
 More attention by system administrator is
needed

Options








Progress multi-volume
Progress storage areas
8K database block size
BI Cluster size
Use page writers
Move the temp-file I/O with -T
Location of application files
Use of program libraries to reduce I/O
Multi-Volume Database
•
•
•
•
Progress-specific way to distribute I/O
Only way to eliminate I/O indirection in a
Progress environment
Only way to pre-allocate database blocks
Every database is multi-volume in Progress
version 9
Storage Areas

Benefits
 Greater
control of location of data
 Minimize downtime for utilities
 Stripe some, leave some on straight disks

Drawbacks
 More
things to break
 More complex to monitor
Storage Areas - Control
A storage area can hold 1 or more data
objects (index, table, schema, …)
 Separate schema from data if possible
 Try to keep the number of areas
manageable, only add more areas for valid
business reasons

Minimize Downtime
Smaller data areas allow utilities, such as off
line index rebuild, to run faster as they have
less blocks to scan
Database Administration Tools
Backup and restore
 After image journaling
 Other Utilities

probkup

Pros
 Progress
aware
 Supports online backup
 easy

Cons
 Slower
than OS methods
 Does not backup more than the database
prorest
Utility to restore a Progress backup
 Can restore to a different structure provided
there are enough storage areas

Syntax:
prorest dbname device_or_filename [-list |
-vp | -vf]
After Imaging

Pros
 Allows
you to recover to present
 Recover from media failure
 Only way to “repair” catastrophic user error

Cons
 Additional
point of failure
 Adds complexity to the system
 Performance impact
How After Imaging Works
FOR EACH CUSTOMER:
UPDATE CUSTOMER.
Before image note written
END.
After image note written
How to Integrate After Imaging
In conjunction with a backup site
 To update a report server
 As a means of backup

AI to Update a Backup Site
Poor man’s replication
 Allows for periodic update of a copy of the
database
 The copy can then be backed up with a
conventional backup mechanism

AI to Update a Report Server
Similar to keeping a backup site
 Requires two copies of the database in
addition to the original (one for update and
a second for reporting)
 The reporting database is a copy of the
backup that is done periodically to keep the
data synchronized

AI as a Means of Backup

Not generally a good idea
 Increased
recovery time
 Reduced reliability
Backup the database each weekend
 Backup the AI file(s) each weeknight

®
Progress Utilities

Index rebuild
 idxbuild
 idxfix
 idxcompact

DB analysis
®
Progress Utilities
(continued)
Truncate BI
 BI Grow
 Table move
 Index move
 Database log truncation

idxbuild
Can only be run on a database that has been
shutdown
 Can be run on 1 or more indexes

Syntax:
proutil <dbname> -C idxbuild [-TB n]
[-TM n] [-T dirname]
idxfix
Verifies index to record linkage
 Verifies index block to index block linkage
 Works online while in multi-user mode

Syntax:
proutil <dbname> -C idxfix
idxcompact



Fast way to compress (reorganize) indexes online
Utility will pass through the index several times
(number of index levels + 1)
Runs online
Syntax:
proutil <dbname> -C idxcompact [ownername.]table-name.index-name [n]
Database Analysis
ixanalys – analysis of indexes
 chanalys – analysis of record chains
 dbanalys – analysis of records and indexes

Syntax:
proutil <dbname> -C XXanalys
Truncate BI
Reduce for size of the BI file
 Change the cluster size of the BI file
 Change the block size of the BI file

Syntax:
proutil <dbname> -C truncate bi [-bi n]
[-biblocksize n] [-G n]
BI Grow
After truncation it is best to pre-grow your
BI file to it’s anticipated size
 Keeps BI sequential (good for performance)
 Database must be shutdown

Syntax:
proutil <dbname> -C bigrow n
Table Move
Allows the movement from one storage area
to another
 Works “online”
 Uses 4-times the amount of BI space as is
taken up by the table
Syntax:
proutil <dbname> -C tablemove [owner
name.]table-name table-area [index-area]
Index Move



Allows movement of indexes from one storage
area to another
Works “online”
Uses a significant amount of BI space
Syntax:
proutil db-name -C indexmove [ownername.]table-name.index-name area-name
Database Log Truncation
Reduces the size of the log file
 Database must be down for it to work

Syntax:
prolog <dbname>
Performance Tuning - Basics
Before Image cluster size
 Database block size
 Tuning APWs
 Memory tips
 Increasing CPU efficecy

Networking Tips

Keep things local
 No
temp files on network drives
 Move the application “close” to the user
Use -cache to speed initial connection
 Use -pls if you are using program libraries
over the network
 Application issues are magnified over a
network (field-lists, no-lock, indexes, …)

Networking Tips (Continued)
-Mm 8192 to increase the tcp packet size
from 1k to 8k
 -Ma Increase the number of servers to
reduce or eliminate server contention

Stripe Some, Leave Others Flat



Tables that are accessed sequentially may
benefit from being isolated to their own table
space
Randomly accessed tables will generally
perform better on striped volumes
Disk systems that have read ahead algorithms
will help sequential access most when placed
on a single disk (or mirror)
8k Block Size
Most systems will benefit from using 8k
block size (NT should use 4k)
 You will retrieve more information per
physical I/O especially on index reads
 I/O is done how the operating likes it to be
done

BI Cluster Size




Somewhere between 1MB and 4MB works for
most people
If you are checkpointing every 2 minutes or more
often during peak periods increase the cluster size
If you a “workgroup” version of Progress leave
your cluster size alone (512kb)
Don’t forget to use bigrow to avoid allocating
clusters one at a time
®
Progress Page
Writers
Every database that does updates should
have a before image writer (BIW)
 Every database that does updates should
have at least 1 asynchronous page writer
(APW)
 Every database that is using after imaging
should have a after image writer (AIW)

Tuning APWs
Start with 1 APW
 Monitor buffers flushed at checkpoint on
the activity screen (option 5) in promon
 If buffers flushed increases during the
“important” hours of the day add 1 APW

Use -T to Level Disk I/O
Local (host based) users and batch jobs should
use the -T parameter to place their
temporary file (.srt, .pge, .lbi, …) I/O on a
drive that is not working as hard as the other
drives on the system
Note: -T should never point to a network drive
Application Files
Keep paths short
 say run <subdir>/program to eliminate
unnecessary searches
 Put programs into libraries (prolib) to
reduce I/O to temp files
 Libraries use a hashed search mechanism
for better performance

Memory Contention
Memory should be used to reduce disk I/O.
Broker (server) side parameters should be
tuned first and then user parameters can be
modified. In a memory lean situation,
memory should be taken away from
individual users before reducing broker
parameters
Memory Hints
Swapping is bad, buy more memory or
reduce parameters to avoid it
 Increase -B in 10% increments until the
point of diminishing returns or swapping,
whichever comes first
 Use V9 private buffers (-Bp) for reporting
 Do not use private buffers (-I) prior to V9

Memory Hints(continued)
Use memory for the users closest to the
customer first (developers increase last)
 Use -Bt for large temp tables
 Set -bibufs between 50 and 120. Look at
the activity screen in promon (BI buffer
waits) to see if additional tuning is
necessary. Start with 50 as this will work
for the vast majority of people

CPU Contention
High CPU activity is not bad in and of itself
but high system CPU activity is bad and
should be corrected
Components of CPU Activity
USER - This is what you paid for
 SYSTEM - This is overhead
 WAIT - This is waste
 IDLE - This is nothing ;-)

CPU Activity Goals
The goal is to have as much USER time as
possible with as little SYSTEM and WAIT
A practical split is USER:
SYSTEM:
WAIT:
IDLE:
70%
20%
0%
10%
Eliminating High SYSTEM
CPU Activity

Always use -spin




Use a setting of 1 for single CPU systems
Use a higher setting for multiple CPU systems
Testing has shown that the optimal setting for spin is somewhere between 2000 and 10000. First
try 2000
-napmax should default to 5000 but in some late 7
and early 8 versions of Progress it is set to 100
which is way too low
Eliminating High WAIT
CPU Activity
WAIT = Waiting on I/O
 If you still have IDLE time it generally is
not a big problem
 Look at paging/swapping first
 Next look at your disk I/O

Progress Database Future
Directions
Increased uptime through online utilities
 Increased speed of utilities to maintain the
database
 Support for clusters to increase reliability
 Open standards support

Replication
New feature in 9.1D of Progress
 Fathom High Availability
 Allows for single or bi-directional
replication
 Target database can be used for update or
reporting

Replication (continued)
Source database has an agent that forwards
changes to the target database(s)
 Only one agent per database
 One or more targets per agent
 Raw record format is used to increase
performance and reduce overhead

Questions