Get the Best Out of Oracle Data Pump Functionality

Download Report

Transcript Get the Best Out of Oracle Data Pump Functionality

1
<Insert Picture Here>
Get the Best Out of Oracle Data Pump Functionality
Dean Gagne (Oracle)
Viljo Hakala (Nokia)
Agenda
• Moving large amounts of data with Transportable
Tablespaces
• Filtering metadata using the INCLUDE and EXCLUDE
parameters
• Restarting stopped/failed jobs
• Hear about Nokia Corporation's database
environment
• How Nokia uses the REMAP_DATA parameter to
scramble data
• How Nokia regenerates primary keys without having
to use additional software or scripts
3
What is Oracle Data Pump?
• New feature starting in Oracle Database 10g Release 1
• Enables very fast bulk data and metadata movement between
Oracle databases
• High-speed, parallel Export and Import utilities (expdp and
impdp) as well as a Web-based Oracle Enterprise Manager
interface
• Jobs can be restarted without loss of data, whether or not the
stoppage was voluntary or involuntary
• Jobs support fine-grained object selection. Virtually any type of
object can be included or excluded
• Supports the ability to load one instance directly from another
(network import) and unload a remote instance (network export)
4
Moving Large Amounts of Data
With Transportable Tablespace
5
What is Transportable Tablespace?
• An Oracle Database feature that allows data file
transfer from a database to another via a simple os
copy and a light specific export/import
• Data Pump will move metadata only
• Data moves with data file copy
• Much faster than using direct path or external tables
6
Moving Large Amounts of Data
With Transportable Tablespaces
Restrictions:
• Tablespaces need to be self contained
– All dependent objects must be included in tablespace set
• Tablespaces need to be read only for duration of export and
datafile copying.
• Not restartable
• Must be privileged account
Notes:
• Use RMAN CONVERT to change endianness (if needed)
• Data Pump moves only metadata
• Can use network link
• For a full list of what is exported:
– select unique seq_num, full_path
from datapump_paths
where het_type = 'TRANSPORTABLE_EXPORT‘
order by seq_num;
7
Transportable Tablespaces
Self Containment Check
• Create table part_tab(id number) tablespace a
partition by range (id)
(partition low value less than (100) tablespace b,
partition hi values less than (1000) tablespace c);
• Create index part_ind on part_tab.id tablespace d;
• Transportable tablespace export requires all 4
tablespaces
– (a, b, c, d)
• DBMS_TTS.TRANSPORT_SET_CHECK procedure
to verify
– accepts a comma separated list of tablespace names
– SELECT * FROM transport_set_violations;
8
Transportable Tablespace Steps
Source system tasks:
• Optionally run DBMS_TTS.TRANSPORT_SET_CHECK
• Set tablespaces read only
• Run expdp command with tablespace list
– Data Pump will list required dumpfiles and datafiles in log file
• Copy datafiles and dumpfiles to target system
• Optionally set tablespaces read write
Target system tasks:
• Run impdp command
• Optionally set tablespaces to read write
9
Table Mode with Transportable=always
• Can export/import tables, partitions, and indexes
regardless of tablespace closure
– Can’t have any storage in system, sysaux, temp tablespaces
• Data Pump export will list tablespaces that need to be
read only if not already done
• Data Pump export will list datafiles and dumpfiles that
need to be copied to target
• Data Pump import can use filters – don’t have to
import the complete dumpfile set
• Can import only one partition – Data Pump creates a
non-partitioned table
• Partition_options=departition – creates nonpartitioned tables for every partition
10
Export Table Mode with Transportable=always
Expdp scott/tiger tables=pt1,pt2 transportable=always
• Table filter needs to be complete tables or partitions from
the same table
– ptab1,ptab2, etc or ptab1:lo,ptab1:hi
• ORA-29335: tablespace 'USER1' is not read only
ORA-29335: tablespace 'USER2’ is not read only
• Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is:
/oracle/work/tab_tts.dmp
• Datafiles required for transportable tablespace USER1:
/oracle/dbs/user1.f
Datafiles required for transportable tablespace USER2:
/oracle/dbs/user2.f
11
Import Using Transportable Table Mode Dumpfile
• Impdp scott/tiger tables=ptab2:hi
rename_table=ptab2:hi:my_tab1 transport_datafiles…
– Created table is my_tab1
• Impdp scott/tiger tables=ptab2
partition_options=departition transport_datafiles…
– Created tables are ptab2_lo, ptab2_hi
• Creates all tablespaces even if no objects created
due to import filters
• Cleans up any unused tablespace segments that are
not imported
12
Filtering Metadata
13
Filtering Metadata Overview
• Perform selective export/import job based on object
type
• Exp/imp had limited filtering
– Grants, index, triggers, statistics, constraints
• Data Pump has almost complete filtering capabilities
14
Filtering Metadata Using
INCLUDE and EXCLUDE Parameters
•
•
•
•
If using exclude parameter, everything else is included
If using include parameter, everything else is excluded
Can’t use exclude and include in the same Data Pump job
Specify complete path or partial path. Objects matching the
specified path will be excluded/included.
• Query to find exclude/include object types:
– select unique seq_num, full_path
from sys.datapump_paths
where het_type = 'DATABASE_EXPORT'
order by seq_num;
– Job_type:
FULL
SCHEMA
TABLE
TRANSPORTABLE
het_type:
DATABASE_EXPORT
SCHEMA_EXPORT
TABLE_EXPORT
TRANSPORTABLE_EXPORT
15
Exclude Example:
expdp system/manager schema=hr exclude=statistics …
vs
expdp system/manager schema=hr exclude=
SCHEMA_EXPORT/TABLE/STATISTICS
select unique seq_num, full_path
from sys.datapump_paths
where het_type = 'SCHEMA_EXPORT' AND full_path like '%STATISTICS%‘
order by seq_num;
77
78
221
222
223
225
226
227
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_AND_BITMAP
SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_AND_BITMAP/INDEX_STATISTICS
SCHEMA_EXPORT/TABLE/STATISTICS
SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
SCHEMA_EXPORT/TABLE/STATISTICS/USER_PREF_STATISTICS
16
Include Examples:
impdp system/manager tables=hr.employees
vs
impdp system/manager schemas=hr
include=table:\"= \'EMPLOYEES\'\"
• Same results
• Includes all objects that have table in the path
vs
impdp system/manager schemas=hr include=table/table:\"=
\'EMPLOYEES\'\“
• Only includes the table
Some of the TABLE object paths
SCHEMA_EXPORT/TABLE/TABLE
SCHEMA_EXPORT/TABLE/TABLE_DATA
SCHEMA_EXPORT/TABLE/GRANT
17
Restarting Stopped/Failed Jobs
18
Restarting Data Pump Jobs
• Restart jobs intentionally or unintentionally stopped
– User stopped
– Dumpfile exhausted
– Resumable wait
• Optionally change value of PARALLEL parameter
• Helps to know the job name
– System generated job names
• SYS_IMPORT_FULL_01
• SYS_EXPORT_TABLE_05
• What jobs are restartable
– Select * from dba_datapump_jobs
– Select * from user_datapump_jobs
19
Restarting Stopped Export Jobs
Initial Export Command:
expdp system/manager job_name=datapump_exp…
• Restart command:
– expdp system/manager attach=datapump_exp
– Export> start_job
• Resets dumpfile pointers to last completed object type
• Restarts the job after the last completed object
20
Restarting Stopped Import Jobs
initial import command:
impdp system/manager job_name=datapump_imp…
• Restart command:
– Impdp system/manager attach=system.datapump_imp
– Import> Start_job[=skip_current]
• Detects the last object created
• If creating an object and the object exists, checks to see if object
was in progress in failed job
• Data that was being loaded would have either been committed or
not. Committed data marked complete, uncommitted data will be
retried.
21
Restarting Stopped Network Import Jobs
Initial Import Command
impdp scott/tiger job_name=net_imp network_link=dbs1…
• Restart command:
– Impdp scott/tiger attach=net_imp
– Import> Start_job[=skip_current]
• Restarts on object type
• If creating object and object exists, checks to see if object was in
progress in failed job
• Data that was being loaded would have either been committed or
not. Committed data marked complete, uncommitted data will be
retried.
22
Hear about Nokia Corporation's Database
Environment and how they use Oracle Data Pump
Viljo Hakala (Nokia)
23
Nokia
• Nokia is a world leader in mobility
• Head office in Finland; R&D, production, sales, marketing activities
around the world
• World’s #1 manufacturer of mobile devices, with estimated 40%
share of global device market in 2009
• Mobile device volumes 468 million units
• Net sales EUR 50.7 billion
• Operating profit EUR 5.0 billion
• 128 445 employees at year end (including Nokia Siemens
Networks)
• Strong R&D presence in 16 countries
• R&D investment EUR 6.0 billion
• Sales in more than 150 countries
24
24
Oracle @ Nokia
• 20 DBAs, 2300 databases
• Grid Control: used for almost 5 years, #1 tool for all DBAs
• Applications:
• OLTP: 50%, Hybrid databases: 40%, DW: 10% + Teradata
• About 1800+ Oracle; also MySQL 200+, MS SQL 300+
• 11g: 4%
• 10g: 80%
•
9i: 15%
•
8i: 1%
• Host platforms:
• Solaris: 35%
• HP-UX: 35%
• Linux: 20%
• Windows 10%
25
25
Oracle @ Nokia
•
•
•
•
•
•
•
•
•
26
26
Production databases: single, HA Oracle, RAC, HA MySQL
5 MB – 7 TB range
Leverage virtually all Oracle database features
DBAs share responsibilities
Many federated databases: Streams, Advanced Replication,
Materialized views, DB links
Common Oracle platform, Common MySQL platform and Common
MS SQL platform
99% of databases backed up with RMAN
8 RMAN catalog databases
10 PB+ backed up / month
What is Data Masking?
27
27
Hints for Effective Data Masking
•
Be sure to mask all data that is not sensitive, but can be used to create
sensitive data.
•
You should not be able to reverse the masking for it to be secure. Never
should you be able to retrieve original values from masked values.
•
The masked or obfuscated data should follow the source data format.
Masking should protect data, but still allow realistic looking data to be
used for testing.
•
Integrity of relations in data should be followed. When masking a primary
key, the foreign keys should also be masked using the same masking
method.
•
You should be able to repeat the masking process for it to be an effective
day to day process.
28
Data Masking with Oracle Data Pump
• One of the best uses of Oracle Data Pump is to load test and
development systems with data from production systems for
realistic testing purposes.
• Due to company rules or legal regulations, sensitive data in the
non-production systems has to be replaced with realistic looking
data to enable effective testing.
• Effective data masking routines should support automation and
Oracle Data Pump provides several ways to automate data imports
and exports with masking.
29
29
Data Masking with Oracle Data Pump
• In addition to Data Masking Pack in Grid Control, Oracle Data
Pump provides a method to mask data: REMAP_DATA parameter
introduced in Oracle Database 11g.
• Oracle Data Pump’s REMAP_DATA feature uses a remapping
function to rewrite data.
• For example, a column with phone numbers could be replaced by a
numbers generated by a REMAP_DATA function.
30
30
Data Masking with Oracle Data Pump
• REMAP_DATA allows transformation of column’s data while exporting
(expdp) or importing (impdp) by using a remapping function in the
database.
• REMAP_DATA with Data Pump is usually faster than a custom
UPDATE statement for masking data.
• To mask multiple columns in the same process and command, the
REMAP_DATA parameter can be used multiple times.
31
31
Data Masking with Oracle Data Pump:
Quick example
• REMAP_DATA=[schema.]tablename.column_name:[schema.]pk
g.function
• impdp dumpfile=data.dmp
REMAP_DATA=scott.orders.customer_name:scott.maskpkg.mas
k
• expdp dumpfile=data.dmp
REMAP_DATA=scott.orders.customer_name:scott.maskpkg.mas
k
32
32
Data Masking with Oracle Data Pump:
Syntax
REMAP_DATA=
• [schema1.]tablename.column_na
me:
• [schema2.]pkg.function
• schema1 -- the schema with the
table to be remapped. By default,
this is the schema of the user
doing the export.
• tablename -- the table which
column will be remapped.
33
33
• column_name -- which is to be
remapped.
• schema2 -- with the PL/SQL
package for remapping function.
As a default, this is the schema of
the user doing the export.
• pkg -- the name of the PL/SQL
package with the remapping
function.
• function -- the name of the remap
function in the PL/SQL package
Data Masking with Oracle Data Pump:
Restrictions
• Data types must be same in the table column, masking function
parameter, and function return value.
• No commits or rollbacks should be done in the masking function.
• No direct path loads can be used in import process with REMAP_DATA.
• Note: Operation of long export/import data pump processes can be
monitored from the v$session_longops view, but the estimated values
do not take into account REMAP_DATA operations.
34
34
Data Masking with Oracle Data Pump:
Example
• Create a table in the CUSTOMERS schema
called phones
35
35
SQL>
CREATE TABLE CUSTOMERS.PHONES
(
MODELNAME VARCHAR(20) NOT NULL,
PHONENUMBER VARCHAR2(50)
);
insert into CUSTOMERS.PHONES
values(’N900’,’+3581234567’);
insert into CUSTOMERS.PHONES
values(’N8’,’+3589817654’);
insert into CUSTOMERS.PHONES
values(’N7’,’+3584834819’);
Data Masking with Oracle Data Pump:
Example
• We then need to create a function
for remapping
create or replace package
customers.maskpkg
as
function masknumber(phonenumber
varchar2) return varchar2;
end;
/
create or replace package body
customers.maskpkg as
function masknumber (phonenumber
varchar2) return varchar2 is
begin return
substr(phonenumber,1,4)||round(dbms_ra
ndom.value (100,999))||
lpad(round(dbms_random.value
(1,9999)),4,'0');
end;
36
end;
36
• The function masknumber will
accept a varchar2 type and
returns a random phone number
in varchar2 type
Data Masking with Oracle Data Pump:
Example
• This example will mask one
column: phonenumber to the
export output file
$ expdp customers/manager \
tables=customers.phones \
dumpfile=phones_masked.dmp \
directory=dumpdir \
remap_data=\
customers.phones.phonenumber:
\
customers.maskpkg.masknumber
37
37
• Export the data from the
customers.phones table with
expdp utility and use the
REMAP_DATA option for
masking the data in the dump file
with the function
customers.maskpkg.masknumber
created earlier
• By default the owner of the remap
function is the user running
expdp/impdp
• Now the dumpfile
phones_masked.dmp can be
used for testing environments
Data Masking with Oracle Data Pump:
Example, before masking
• SQL> select * from customers.phones
MODELNAME PHONENUMBER
---------- -------------------N900
38
38
+3591234567
N8
+3589817654
N7
+3584834819
Data Masking with Oracle Data Pump:
import/replace table
• This example will import the masked dump file
and replace the existing table with masked column data
$ impdp customers/manager \
tables=customers.phones \
dumpfile=phones_masked.dmp \
directory=dumpdir \
table_exists_action=REPLACE
39
39
Data Masking with Oracle Data Pump:
Example, after import
• SQL> select * from customers.phones
MODELNAME PHONENUMBER
---------- -------------------N900
40
40
+35815499474
N8
+3584800578
N7
+3581247839
Data Masking with Oracle Data Pump:
Import
• REMAP_DATA parameter can also be used in the import process
• Use impdp if you have an existing dump file and you want to mask the
data when loading into the database
• $ impdp customers/manager TABLE_EXISTS_ACTION=replace
dumpfile=phones.dmp directory=dumpdir
remap_data=customers.phones.phonenumber:customers.maskpkg.mask
number
41
41
Primary Key Regeneration with
REMAP_DATA
• To change primary keys after development phase of an application, use
REMAP_DATA parameter.
• It is possible to use Data Pump’s REMAP_DATA to change the primary
keys within a database by writing a remapping function which follows the
new primary key format.
• Primary key conflicts can happen when loading data to an existing table in
a database.
• With REMAP_DATA, conflicts can be avoided by changing the column
values during import.
42
42
When not to use Masking
• It can be very difficult and expensive to mask data especially in large
BI/DWH systems due to the amount of data and refining operations.
• It is often easier to use a data model in which sensitive data can be
limited to one location, referenced via surrogate keys, and
kept safe with strict access policies and auditing.
But when giving out data for simplified testing needs, masking is a good tool!
43
Other Methods to Secure Data in Oracle
Database 11g Release 2
• Oracle Virtual Private Database (VPD aka Fine Grained Access Control)
feature allows filtering of data at row-level for runtime SQL statements
according to a defined policy.
• Data Masking Pack for Enterprise Manager. Centralized masking solution
within Grid Control.
• Oracle Advanced Security Option for encrypting data at tablespace or
column level.
• Oracle Label Security for restricting access to data based on policies.
• Oracle Database Vault for separating roles and restricting access to data
per roles.
44
Oracle Data Pump at Nokia: Summary
• Oracle Data Pump is used extensively at Nokia to move data and
metadata between systems.
• REMAP_DATA option is a great way to mask sensitive data for security
purposes.
• REMAP_DATA option can also be used to change primary keys.
• Oracle has many features for securing data at different levels.
• Security for data is not just Obscurity. It is a process and requires time.
45
45
Oracle Data Pump Summary
• Transportable Tablespaces
– Tablespace mode
– Table mode
• Filtering metadata using the INCLUDE and EXCLUDE
parameters
• Restarting stopped/failed jobs
• REMAP_DATA parameter to scramble data
• How to generate primary keys without having to use additional
software or scripts
46
Questions?
• Oracle Data Pump
• Demogrounds Booth – Moscone West W-025
47
48
49