Getting the Most from Oracle Data Pump
Roy Swonger
Director, Oracle Database Utilities
January, 2007
Objectives of Talk
• Provide an overview of Oracle Data Pump
• Give a quick-start primer for users of original Export/Import
• Highlight useful features that differ from those offered in original Export/Import
Table of Contents
• Overview of Oracle Data Pump
• Data Pump Quick Start for Exp/Imp users
• New Features of Oracle Data Pump
• Advanced Features of Oracle Data Pump
• Frequently Asked Questions
Data Pump Overview
• Background
• Usage Scenarios
• Performance
• Features
Data Pump Overview: Background
• Replacement for old exp and imp utilities
• Faster and more feature-rich than older utilities
• Available starting in Oracle 10g, Data Pump is the new export/import mechanism
• As of Oracle 11g, original Export is no longer supported for general use
Data Pump Overview: Usage Scenarios
• Typical uses for Data Pump Export/Import
• Logical backup of schema/table
• Refresh test system from production
• Upgrade (either cross-platform, or with storage reorg)
• Move data from production to offline usage (e.g. data warehouse, ad-hoc query)
• Data Pump complements other Oracle features
• RMAN physical backups
• Oracle Warehouse Builder for Extract/Transform/Load operations
Data Pump Overview: Performance
• Typical results for data load/unload
• expdp is ~2x faster than original exp
• impdp is ~15-40x faster than original imp
• Using PARALLEL can further improve performance
• Your mileage may vary!
• Metadata performance essentially unchanged, sometimes slower
• Data performance vastly improved
• Small amount of fixed overhead will affect performance of small jobs
• Storage and file system characteristics are a major factor in performance
More on Data Pump performance later in this talk!
Data Pump Performance
• Test Results by Prof. Carl Dudley, University of Wolverhampton, UK
• Timings taken for sample employee tables containing 1, 0.5m, 1m, 2m, 4m, 8m and 16m rows
• Timings compared for:
• Original Export
• Data Pump Export using direct path and external table
• Original Import
• Data Pump Import using direct path and external table
• Sizes of dump file sets compared for:
• Original Export
• Data Pump Export using direct path and external table
EMPNO ENAME  JOB       MGR   HIREDATE    SAL  COMM  DEPTNO  SEQID
----- -----  --------  ----  ---------  ----  ----  ------  -----
 7369 SMITH  CLERK     7902  17-DEC-80   800            20      1
 7499 ALLEN  SALESMAN  7698  20-FEB-81  1600   300      30      2
 7521 WARD   SALESMAN  7698  22-FEB-81  1250   500      30      3
 7566 JONES  MANAGER   7839  02-APR-81  2975            20      4

Source: Prof. Carl Dudley, University of Wolverhampton, UK
Data Pump Export Timings and Sizes

Timings (seconds)

Rows                          1   0.5M     1M     2M     4M     8M
Data Pump Direct Path        28     38     58     69     90    114
Data Pump External Tables    51     70     90    105    120    146
Original Export               2     21     32     52    100    192

Size of Export File (MB)

Rows                          1   0.5M     1M     2M     4M     8M
Data Pump Direct Path     0.008     20     41     82    169    331
Data Pump External Tables 0.008     20     41     82    169    331
Original Export           0.008     23     47     89    190    389

Source: Prof. Carl Dudley, University of Wolverhampton, UK
[Chart: Data Pump Export Performance. Response time (up to 3 minutes) vs. rows (0.5m to 8m) for conventional-path Export, Data Pump direct path, and Data Pump external tables. Source: Prof. Carl Dudley, University of Wolverhampton, UK]
Data Pump Import Timings

Timings (seconds)

Rows                          1   0.5M     1M     2M     4M     8M
Data Pump Direct Path         7     12     16     25     46     86
Data Pump External Tables    25     33     33     44     63    120
Original Import               2     15     33     73    157    306

Source: Prof. Carl Dudley, University of Wolverhampton, UK
[Chart: Data Pump Import Performance. Response time (up to 5 minutes) vs. rows (0.5m to 8m) for original Import, Data Pump direct path, and Data Pump external tables. Source: Prof. Carl Dudley, University of Wolverhampton, UK]
Data Pump Overview: Features
• Improved Speed
• Direct path load/unload
• Parallel workers
• Flexibility
• INCLUDE or EXCLUDE many more object types
• REMAP schema, tablespace, data file
• Use multiple dumpfiles for parallelism and ease of file management
• Database Feature Support
• Encrypted columns
• Network move over dblinks
• Newer datatypes
Data Pump Quick Start
• New Concepts
• Syntax Changes
Data Pump Quick Start: New Concepts
• Directory Object
• Used for reading and writing dumpfiles
• Allows DBA to control where files will be written on the server system
• Default object DATA_PUMP_DIR created as of Oracle Database 10g Release 2
• Interactive Command-line
• Allows the user to monitor and control Data Pump jobs
• Many job parameters can be adjusted “on the fly” (see the sketch after this list)
• Tuning Parameters
• Data Pump handles most tuning internally
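As an illustration of the interactive command-line, this sketch attaches to a running export job and adjusts it on the fly. The job name SYS_EXPORT_SCHEMA_01 is a hypothetical system-generated default; substitute the name shown in your log file or in DBA_DATAPUMP_JOBS.

$ expdp system/<pwd> ATTACH=SYS_EXPORT_SCHEMA_01
Export> STATUS
Export> PARALLEL=4
Export> CONTINUE_CLIENT

STATUS reports job and worker progress, PARALLEL changes the degree of parallelism while the job runs, and CONTINUE_CLIENT returns the session to logging mode.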
Data Pump Quick Start: Directory Object
Example of Directory Object Usage:
• Create the directory as a privileged user:
$ sqlplus sys/<pwd> as SYSDBA
SQL> CREATE DIRECTORY scott_dir AS '/usr/apps/datafiles';
SQL> GRANT READ,WRITE ON DIRECTORY scott_dir TO scott;
SQL> exit
• User scott can then export/import using Data Pump:
expdp scott/<pwd> DIRECTORY=scott_dir dumpfile=scott.dmp
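To verify that the directory object exists and points where expected, a quick check against the standard DBA_DIRECTORIES view (run as a privileged user) might look like:

SQL> SELECT directory_name, directory_path
     FROM dba_directories
     WHERE directory_name = 'SCOTT_DIR';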
Data Pump Quick Start: Syntax Changes
• New command line clients
• expdp/impdp instead of exp/imp
• Parameter changes: a few examples
Data Pump Parameter        Original Exp/Imp Parameter
-------------------        --------------------------
SCHEMAS                    OWNER
REMAP_SCHEMA               TOUSER
CONTENT=METADATA_ONLY      ROWS=N
EXCLUDE=TRIGGER            TRIGGERS=N
Note: A full mapping of parameters from original Export/Import to Data Pump Export/Import can be found in the Oracle Database Utilities manual.
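As a worked translation (a hedged sketch; the schema and file names are placeholders), a metadata-only schema export moves from original Export:

exp scott/<pwd> OWNER=scott ROWS=N FILE=scott.dmp

to Data Pump Export:

expdp scott/<pwd> SCHEMAS=scott CONTENT=METADATA_ONLY DIRECTORY=DATA_PUMP_DIR DUMPFILE=scott.dmp

Note that DIRECTORY replaces the client-side file path: the dump file is written on the server via the directory object.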
New Features of Oracle Data Pump
•
•
•
•
•
Network Mode
Restartability
Parallelization
Include/Exclude
SQLFILE
New Features: Network Mode Export
expdp scott/tiger network_link=db1 tables=emp dumpfile=scott.dmp directory=mydir
• Produces a local dump file set using the contents of a remote database
• Only way to export from a write-locked database (e.g., a standby database or a read-only database)
• Requires a local, writeable database to act as an agent
• May be parallelized
• Will generally be significantly slower than exporting to a file on a local device
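Network mode assumes a working database link (db1 in the example above). A minimal sketch of creating one, where db1_tns is a placeholder TNS alias for the remote database:

SQL> CREATE DATABASE LINK db1 CONNECT TO scott IDENTIFIED BY tiger USING 'db1_tns';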
New Features: Network Mode Import
impdp system/manager network_link=db1 directory=mydir
• Moves a portion of a database to a new database without creating a dump file
• Ideal when the footprint of the dump file set needs to be minimized
• May be parallelized
• Primarily a convenience: will generally be slower than exporting to a file, copying the file over the network, and importing to the target
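A slightly fuller sketch, pulling a single schema across the link and renaming it on the way in; the schema names are placeholders, and SCHEMAS and REMAP_SCHEMA are standard Data Pump parameters:

impdp system/manager network_link=db1 schemas=hr remap_schema=hr:hr_test directory=mydir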
New Features: Restartability
• Data Pump jobs may be restarted without loss of data and with only minimal loss of time
• Restarts may follow:
• System failure (e.g., loss of power)
• Database shutdown
• Database failure
• User stop of Data Pump job
• Internal failure of Data Pump job
• Exceeding dumpfile space on export
New Features: Restartability - Export
expdp system/manager attach=myjob
Export> start_job
• Export writes out objects based upon object type
• On restart, any uncompleted object types are removed from the dump file and the queries to regenerate them are repeated
• For data, incompletely written data segments (i.e., partitions or unpartitioned tables) are removed and the data segments are totally rewritten when the job continues
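ATTACH requires the job name. If none was given via JOB_NAME when the job was started, Data Pump generates one; a sketch for locating restartable jobs using the standard DBA_DATAPUMP_JOBS view:

SQL> SELECT owner_name, job_name, state FROM dba_datapump_jobs;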
New Features: Restartability – Import (1)
impdp system/manager attach=myjob
Import> start_job
• Restart is based upon the state of the individual objects recorded in the master table
• If object was completed, it is ignored on restart
• If object was not completed, it is reprocessed on restart
• If object was ‘in progress’ and its creation time is consistent with the previous run, it is reprocessed, but ‘duplicate object’ errors are ignored
New Features: Restartability – Import (2)
impdp system/manager attach=myjob
Import> start_job=skip_current
• If skip_current is specified, objects that were ‘in progress’ are marked as having failed at the time of restart
• skip_current is useful for getting beyond corrupted objects in the dump file that repeatedly cause impdp to fatally exit
New Features: Parallelization
• Multiple threads of execution may be used within a Data Pump job
• Jobs complete faster, but use more database and system resources
• Only available with Enterprise Edition
• Speedup will not be realized if there are bottlenecks in I/O bandwidth, memory, or CPU
• Speedup will not be realized if the bulk of the job involves work that is not parallelizable
New Features: Parallel Export
expdp system/manager directory=mydir dumpfile=a%u.dmp parallel=2
• There should be at least one file available per degree of parallelism. Wildcarding filenames (%u) is helpful
• All metadata is exported in a single thread of execution
• Export of user data will use up all other threads of execution:
• Typically each partition or unpartitioned table will be processed by a single worker thread
• In certain cases, a very large partition will be processed across multiple threads of execution using parallel query
New Features: Parallel Import
impdp system/manager directory=mydir dumpfile=a%u.dmp parallel=6
• Degree of parallelization in import does not have to match degree of parallelization used for export
• Processing of user data is split among the workers as is done for export
• Creation of package bodies is parallelized by splitting the definitions of packages across multiple parallel workers
• Index building is parallelized by temporarily specifying a degree clause when an index is created
New Features: Include/Exclude
impdp hr/hr directory=mydir dumpfile=mydump exclude=index
• Fine-grained object selection is allowed for both expdp and impdp
• Objects may be either excluded or included
• A list of object types and a short description of each may be found in the following views (see the query sketch after this list):
• DATABASE_EXPORT_OBJECTS
• SCHEMA_EXPORT_OBJECTS
• TABLE_EXPORT_OBJECTS
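For example, to browse which object types can be filtered in a schema-mode job (the LIKE pattern here is just an illustration):

SQL> SELECT object_path, comments
     FROM schema_export_objects
     WHERE object_path LIKE '%TABLE%';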
New Features: Exclude
expdp hr/hr directory=mydir dumpfile=mydump exclude=index,trigger
• Objects described by the Exclude parameter are omitted from the job
• Objects that are dependent upon an excluded object are also excluded (e.g., grants and statistics upon an index are excluded if the index is excluded)
• Multiple object types may be excluded in a single job
New Features: Include
impdp hr/hr directory=mydir dumpfile=mydump include=procedure,function
• Objects described by the Include parameter are the only objects included in the job
• Objects that are dependent upon an included object are also included (e.g., grants upon a function are included if the function is included)
• Multiple object types may be included in a single job
Note: Include and Exclude parameters may not be mixed on the same command
New Features: SQLFILE
• Specifies a file into which the DDL that would have been executed in the import job will be written
• Actual import is not performed; only the DDL file is created
• Can be combined with EXCLUDE/INCLUDE to tailor the contents of the SQLFILE
• Example: to get a SQL script that will create just the tables and indexes contained in a dump file:
$ impdp user/pwd DIRECTORY=DPUMP_DIR1 DUMPFILE=export.dmp INCLUDE=TABLE,INDEX SQLFILE=create_tables.sql
• Output of SQLFILE is executable, but will not include passwords
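Once generated, the script can be reviewed or edited before being run like any other SQL script, e.g. (a hypothetical run):

$ sqlplus user/pwd @create_tables.sql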
Advanced Features of Oracle Data Pump
• Network Mode Transportable Move
• Exclude/Include
• With Object Paths
• With Expressions
Advanced Features: Network Mode Transportable Import
impdp system/manager network_link=db1 transport_tablespaces=tbs_1 transport_datafiles=/disk1/tbs1.f directory=mydir
• Tablespaces should be set to read only and datafiles should be copied/converted to target location before import is started (see the sketch below)
• Best way to perform transportable tablespace moves
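A minimal sketch of the preparation steps, assuming the tablespace and file names from the example above; the source path /disk0/tbs1.f is a placeholder, and cross-platform moves may additionally require datafile conversion:

SQL> ALTER TABLESPACE tbs_1 READ ONLY;
$ cp /disk0/tbs1.f /disk1/tbs1.f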
Advanced Features: Include/Exclude with Object Paths
• Object types may also be selected by their object paths, which appear in log files:
include="/TABLE_EXPORT/TABLE/TRIGGER"
• The full list of object paths is available in the DATAPUMP_PATHS view
• Abbreviated object paths may also be specified:
exclude="PACKAGE"
will exclude from the job all object types whose paths match LIKE '%/PACKAGE%':
• all package specifications
• all package bodies
• all package grants
Advanced Features: Include/Exclude with Expressions
• Named objects may be included or excluded based upon the object's name:
expdp scott/tiger directory=mydir dumpfile=scott.dmp include=table:\"LIKE \'E%\'\"
• Most commonly used operators in name expressions are "=", "LIKE", "BETWEEN", and "IN"
expdp scott/tiger directory=mydir dumpfile=scott.dmp include=table:\"in \(select table_name from my_exp_tables\)\"
• Allowed on both export and import
• Because of arcane shell quoting rules, it is best to specify name expressions within parameter files (see the parameter file sketch below)
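A hypothetical parameter file that sidesteps shell quoting entirely; PARFILE is a standard Data Pump parameter, and the file name is a placeholder:

$ cat exp_emp_tables.par
DIRECTORY=mydir
DUMPFILE=scott.dmp
INCLUDE=TABLE:"LIKE 'E%'"

$ expdp scott/tiger PARFILE=exp_emp_tables.par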
Frequently Asked Questions (1)
• Can original Export dump files be used with Data Pump?
• No. The dump file formats for original exp/imp and Data Pump Export/Import are not compatible.
• Can Data Pump work with 9i databases?
• No. Data Pump works with Oracle Database 10g and later.
• Can Data Pump work over a network link to an earlier-version database?
• No, you cannot use network links to run Data Pump Export against a database earlier than version 10g
Frequently Asked Questions (2)
• Can I use Enterprise Manager with Data Pump?
• Yes, there is an EM interface for Data Pump
• How do I pipe a Data Pump job through gzip?
• This compression technique cannot be used with Data Pump, because Data Pump cannot support named pipes
• In Oracle Database 10g Release 2, Data Pump compresses metadata by default
• Stay tuned for a data compression solution in a future release
• What will happen to original Export/Import?
• Original Export will no longer be supported for general use after Oracle Database 10g Release 2
• Original Import will be supported indefinitely, to handle existing legacy Export files
References
• Oracle® Database Utilities 10g Release 2 (10.2)
• Part Number B14215-01
• Data Pump on Oracle Technology Network:
http://www.oracle.com/technology/products/database/utilities/index.html
The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.