Raw Tables for Improved Load Performance

Download Report

Transcript Raw Tables for Improved Load Performance

Raw Tables
for
Improved Load Performance
Christopher Coleman
Database Analyst
Mediware Information Systems
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
1
Christopher Coleman
• President Kansas City Informix Users Group
http://www.iiug.org/kciug/
• Database Analyst
Mediware Information Systems, Inc
http://www.mediware.com/
• Technical Support Engineer
Informix Software, Inc.
– 1998 – 2001
• [email protected]
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
2
Two Types of Permanent Tables
• STANDARD TABLE
– Logging table
• Not logged in nonlogging database
– Default
• RAW TABLE
– Nonlogging table
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
3
Standard vs. Raw
Characteristic
Permanent
Loadable
Logged
Indexes
Rollback
Enterprise Replication
Recoverable
Restorable
KCIUG 30 January, 2004
Standard RAW
Yes
Yes
Yes
Yes
Yes
No
Yes
No
Yes
No
Yes
No
Yes
Depends
Yes
Depends
Raw Tables for Improved Load Performance
4
Limitations of Raw Tables
• Raw tables do not support indexes,
referential constraints, or rollback.
• A physical backup can be used to
restore a raw table if it has not been
updated since that backup.
• Fast recovery does not roll back
incomplete transactions on raw tables.
• Enterprise Replication is not allowed on
raw tables.
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
5
Loading with Raw Tables
• Raw tables are designed for improved loads.
– From the IBM Informix Dynamic Server
Administrator’s Guide:
http://publibfi.boulder.ibm.com/epubs/pdf/ct1ucna.pdf
“ RAW
tables are intended for the initial loading and
validation of data. To load RAW tables, you can use
any loading utility, including dbexport or the HighPerformance Loader (HPL) in express mode. If an
error or failure occurs while loading a RAW table, the
resulting data is whatever was on the disk at the time
of the failure.”
• Raw tables use light appends.
– Rows added quickly to the end of each table
fragment.
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
6
Comparing Load Performance
• Create three tables with identical columns.
– One standard.
– One raw.
– One standard converted to raw before load and
converted back to standard after.
• Load 10,000,000 rows.
– Approximately 250 MB
• Compare times.
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
7
SQL for Raw Tables
• Creating raw tables
– Syntax like standard with addition of
keyword: RAW
CREATE RAW TABLE raw_table (…);
• Altering Raw Tables to Standard
ALTER TABLE raw_table TYPE (STANDARD);
• Altering Standard Tables to Raw
ALTER TABLE raw_table TYPE (RAW);
• Other SQL same as used for Standard tables
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
8
Table Definitions
• Standard Table
– last_name_dist
• Raw table
– last_name_dist_raw
• Standard table converted to raw for load and converted
back after load.
– last_name_dist_cnv
• Columns
–
–
–
–
last_name
frequency
cum_freq
rank
KCIUG 30 January, 2004
LVARCHAR
DECIMAL(5,2)
DECIMAL(5,2)
INTEGER
Raw Tables for Improved Load Performance
9
Load Performance Comparison
Table
Total Time in Minutes
Standard
30:36
Raw
19:16
Conversions
29:05
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
10
Redundant Warning
• From the IBM Informix Dynamic Server
Administrator’s Guide:
It is recommended that you do not use
RAW tables within a transaction. Once you
have loaded the data, use the ALTER
TABLE statement to change the table to
type STANDARD and perform a level-0
backup before you use the table in a
transaction.
• Ignoring this can result in data
inconsistency.
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
11
IDS Versions
• IDS 7.31
– Documented only in release notes.
• IDS 8
– Fully Documented
• IDS 9.21 and following
– Documented in release notes.
– Fully Documented for 9.4
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
12
In Closing
• RAW tables are nonlogging tables
intended for rapid loading and validation
of data.
• RAW tables do not support indexes,
referential constraints, or rollbacks.
• Backup and recovery is only possible
with physical backups.
• Fast recovery is only possible if
checkpoint followed load.
KCIUG 30 January, 2004
Raw Tables for Improved Load Performance
13