Teradata - dbmanagement.info

Download Report

Transcript Teradata - dbmanagement.info

Comparison between
Oracle and Teradata
Center of Excellence
Data Warehousing
Wipro Technologies
1
Basic Database Architecture
Differences: Oracle
Oracle SMP
DBMS Kernel
Locking
Logging
Multi-Task Control
Global Systems/Storage Area
Global Buffer Pools
I/O
I/O
I/O
I/O
Data
Partition
I/O
I/O
I/O
I/O
Data
Partition
I/O
I/O
Data
Partition
I/O
I/O
I/O
I/O
I/O
Data
Partition
2
Basic Database Architecture
Differences: Teradata
Teradata MPP
SMP Node
PE V-Proc
PE V-Proc
BYNET Connect
B
Y
N
E
T
SMP Node
PE V-Proc
PE V-Proc
BYNET Connect
AMP
AMP
AMP
AMP
AMP
AMP
AMP
AMP
M-T
M-T
M-T
M-T
M-T
M-T
M-T
M-T
Lock
Lock
Lock
Lock
Lock
Lock
Lock
Lock
Log
Log
Log
Log
Log
Log
Log
Log
BPool
BPool
BPool
BPool
BPool
BPool
BPool
BPool
I/O
I/O
I/O
I/O
I/O
I/O
I/O
I/O
Data
Partition
Data
Partition
Data
Partition
Data
Partition
Data
Partition
Data
Partition
Data
Partition
Data
Partition
@
Oracle SMP
3
Oracle Physical Storage


Tablespaces are individual units of storage
Tablespaces have associated data files

Data files can be added to, extended or removed
from a tablespace dynamically

Only partitioned objects can span tablespaces

Logical objects are stored in physical segments.



Segments consist of extents
Object space is allocated one extent at a time.
May cause fragmentation.
4
Disk Space and the Teradata
Database
SMP Node
PE V-Proc
SMP Node
PE V-Proc
BYNET Connect
AM
P
M-T
AM
P
M-T
AM
P
M-T
AM
P
M-T
Loc
k
Log
Loc
k
Log
Loc
k
Log
BPo
ol
I/O
BPo
ol
I/O
BPo
ol
I/O
PE V-Proc
B
Y
N
E
T
PE V-Proc
BYNET Connect
AM
P
M-T
AM
P
M-T
AM
P
M-T
AM
P
M-T
Loc
k
Log
Loc
k
Log
Loc
k
Log
Loc
k
Log
Loc
k
Log
BPo
ol
I/O
BPo
ol
I/O
BPo
ol
I/O
BPo
ol
I/O
BPo
ol
I/O
Data
Data
Data
Data
Partition Partition Partition Partition
Data
Data
Data
Data
Partition Partition Partition Partition
DBC
Stores
TABLE
INDEX
Products
INDEX
TABLE
5
Sample
Large Disk Space Allocation DDL
Oracle9i
drop tablespace ts_o1 including contents;
create tablespace ts_o1
datafile
’/export/home/oracle/oracle817/dbs/datafiles/o_1’ size
8190m reuse
extent management local
autoallocate
;
Repeat this for
every tablespace
.
.
.
Teradata
Execute Once
CREATE USER tpcd3000g AS PERM= 5400E9,
PASSWORD= tpcd3000g;
6
Comparing Database Creation
Database Creation Task
Oracle
Teradata
Script(s) to Create
Catalog/Data Dictionary
Tables and Views
Catalog.sql, Catclust.sql,
Catproc.sql
DIP
Manage “In-Flight
Transactional Data
Rollback/Undo Segments
Transient Journal
Database Logs
Redo Logs
Permanent Journal
Temporary Space/Work
Space
Temp Space
Spool Space
Default Permanent Space
Table Block Size
Default Tablespace
Set at Table Level for Each
but Requires DBA Setup of
Bufferpools for Each Block
Size
Default Database
Set at Table Level for
Each
Teradata Memory
Management for
Different Blocksizes is
Automatic
XCTL, Vconfig.out
Database Startup Information INIT.ORA and Control File
7
Create Table Statements
Oracle
Teradata
CREATE TABLE Table1
CREATE TABLE Table1 ,FALLBACK ,
( Col1 NUMBER,
NO BEFORE JOURNAL,
NO AFTER JOURNAL
(Col1 INTEGER,
Col2 INTEGER,
Col3 INTEGER)
UNIQUE PRIMARY INDEX ( Col1 );
Col2 NUMBER,
Col3 NUMBER )
TABLESPACE Tablespace1
STORAGE (INITIAL 6144 NEXT 6144
MINEXTENTS 1 MAXEXTENTS 5 );
or...
CREATE TABLE
<expl|impl column specification>
AS SELECT <any query>;
8
Creating Tables: Similarities
 All tables have names
 Tables have columns with data types
 You can define:




Constraints
Referential integrity
Global temporary tables
Triggers
9
Creating Tables: Differences

Teradata tables are created in the database
or user
DBC
Product
TABLE
s

Teradata provides several convenient
options:





Fallback
Permanent journaling
Table block size
Primary index Built into the Table
MULTISET
10
Partitioning
 Partitioning tables and indexes allow Oracle and Teradata to
store lots of data
 With Oracle, the process of choosing partitioning methods
and partitioning keys is the balancing of query access path,
performance, and data load requirements
 You specifically manage the partitioning constraints and their
relationship to disk storage
 With Teradata, the hash partitioning algorithm is very good at
evenly distributing (loading) data partitions and is the basis
for high performance data access and ease of user access
 Provide reasonable partitioning columns when defining the
table and Teradata does the rest
Partitioning Columns are Chosen for Even Data Distribution
in Both Teradata and Oracle
11
Partitioning Comparisons - Oracle
Oracle Partitioning Decisions - Decide on Methods as
well as partitioning columns, set up the disk
environment
Hash1
99Q4
California
99Q3
Florida
99Q2
New York
99Q1
Texas
Range Partitioning
List Partitioning
99Q4
99Q3
99Q2
99Q1
Hash2
Hash3
Hash3
Hash2
Hash4
Hash3
Hash2
Hash1
Hash1
Composite Range Partitioning
Hash Partitioning
12
Partitioning Comparisons-Teradata
 Teradata Partitioning is a Fact of the System with
Hash Data Distribution based on Primary Index
(Partitioning) Columns and system managed disk
AMP1
AMP2
AMP3
AMP4
Table 1
Table 2
Table 1
Table 2
Table 1
Table 2
Table 1
Table 2
Table 3
Table 3
Table 3
Table 3
13
Examples of DDL to Create
Partitioned Tables
Oracle9i
create table orders(
o_orderdate date ,
o_orderkey number NOT NULL,
o_custkey number NOT NULL,
o_orderpriority char(15) ,
o_shippriority number ,
o_clerk char(15) ,
o_orderstatus char(1) ,
o_totalprice number ,
o_comment varchar(79)
)
pctfree 1
pctused 99
initrans 10
storage (freelists 99 freelist groups 2)
parallel
nologging
partition by range (o_orderdate)
subpartition by hash(o_custkey)
subpartitions 16
(
partition ord1 values less than (to_date(’1992-0101’,’YYYY-MM-DD’))
tablespace ts_o1
.
.
.
partition ord84 values less than (MAXVALUE)
tablespace ts_o84
)
;
Teradata
CREATE MULTISET TABLE ORDERTBL,
DATABLOCKSIZE= 29.5 KILOBYTES
(
O_ ORDERKEY DECIMAL (15,0) not null
,O_ CUSTKEY INTEGER not null
,O_ ORDERSTATUS CHAR( 1) CASESPECIFIC not null
,O_ TOTALPRICE DECIMAL( 15,2) not null
,O_ ORDERDATE DATE FORMAT ’yyyy- mm- dd’ not null
,O_ ORDERPRIORITY CHAR( 15) CASESPECIFIC not null
,O_ CLERK CHAR( 15) CASESPECIFIC not null
,O_ SHIPPRIORITY INTEGER not null
,O_ COMMENT VARCHAR( 79) CASESPECIFIC not null
)
UNIQUE PRIMARY INDEX( O_ ORDERKEY );
Repeat 82
Times with DBA
Defined Ranges
14
Oracle Parallelism
 Oracle parallelism is not directly related to table partitioning because
of its shared disk architecture
 Dynamically splits data over parallel processing units for
Selects/Inserts means 1 or more parallel processing units per
partition
 Considers partitioning when distributing data to parallel processing
units
 One parallel processing unit applied to each partition for
Updates/Deletes
 Each user/query can get varying amounts of parallelism or run
serially depending upon the resources available at query run time
 May need to manually control parallelism to improve system
throughput and to ensure fair distribution of parallel resources
 Parallel processes may funnel down to serial processing for final
15
sort/merge, aggregate activity
Teradata Parallelism
 Teradata parallelism is directly related to its shared nothing
architecture
 Automatically applied by the database
 Architecture ensures that each major unit of parallelism (the
VAMP) has similar amounts of data and memory
 Pipelining and query step parallelism is performed within the
VAMP
 Utility Parallelism and Query/Data Manipulation Language
parallelism (Select, Insert, Update, Delete) are all the same
 All system parallelism is available to ALL operations
 Teradata parallelism is automatic, pervasive, and database
managed
 All users/queries take advantage of all the system parallel
resources
 You do not manage and control parallelism
16
Data Types
Teradata
Oracle















CHAR
VARCHAR2
NCHAR
NCHAR2
NUMBER
LONG
LONGRAW
RAW
DATE
BLOB
CLOB
NCLOB
BFILE
ROWID
UROWID


















CHAR
VARCHAR
CHAR VARYING
LONG VARCHAR
NUMERIC
DECIMAL
DOUBLE PRECISION
FLOAT
INTEGER
SMALLINT
BYTEINT
BYTE
VARBYTE
GRAPHIC
VARGRAPHIC
LONG VARGRAPHIC
DATE
REAL
17
Data Type Translations
Oracle
Teradata
Char
Char
Varchar2
Varchar
Number(m,n)
DECIMAL(m,n) or NUMERIC(m,n)
Date (includes
time)
Date (time portion)
Date (does not include time)
Date
Timestamp
Time
18
Datatypes



In Oracle the maximum precision "m" for
number is 38. In Teradata it is 19
NUMBER (without precision) has no direct
counterpart in Teradata. Determine
migration by contents
ROWID has no counterpart in Teradata
19
Constraints

Like Teradata’s Constraints 

Defined on columns and tables
Oracle is different from Teradata on the follow:



Default mandatory is NULL
Case sensitivity is fixed at ON
Constraints can be disabled and enabled
20
Oracle Specific Columns and
Table

ROWNUM will number retrieved rows

ROWID indicates physical address of row

SELECT must have FROM list

DUAL is a pseudo-table with 1 row
21
Creating Indexes
 Similarities between Teradata and Oracle:
 Indexes take up space on disk
 Indexes can be unique and non-unique
 Indexes and secondary indexes provide alternate
ways to access data
 Differences:
 Teradata indexes are not in B-tree structure
 Hash Subtables
 Teradata automatically partitions indexes across
the AMPs
 Teradata uses a Primary Index for each table
22
Index Usage Comparisons
 Oracle solutions traditionally have relied heavily on
indexes
 OLTP workloads required fast access paths to few rows
 Decision support solutions continue Oracle’s use of
indexes where tactical queries with OLTP-like response
time requirements are given more emphasis than
throughput performance
 Teradata solutions have traditionally not used lots of
indexes
 Teradata’s efficient parallel architecture emphasizes
throughput performance requirements - a result of it’s
DSS background
23
Index Usage Comparisons Contd.
 Teradata typically uses fewer indexes than the equivalent Oracle
system
 Saves on disk storage
 Reduces table maintenance windows where affected by existing
indexes
 Fewer database objects to manage and monitor
 Most of the indexes found in Oracle may not be used on Teradata
 Indexes can provide clues to ad hoc query support requirements
 No Bit-Map indexes for Teradata
 Add indexes to Teradata only as workloads (or anticipated
workloads) require them
 Monitor their usage or lack of usage
 Statistics are key - you WANT to collect statistics in Teradata!
24
Principles Behind the Use of Indexes
 Maximize the use of Single or Few AMP Operations:









Drive towards using single or few AMP operations for queries
Minimizing the number of AMPs in an operation, all other AMPs are freed
to perform other tasks.
Scalability is increased because the freed AMPs can execute more single
or few AMP operations creating greater throughput by increasing the
number of tactical queries executed in parallel
Create Efficient All AMP Operations:
Reduce the resource consumption on each AMP for all AMP operations.
Since all AMP operations are virtually impossible to remove from a data
warehousing environment, reducing the impact on each AMP is important
Gain scalability for two reasons:
Individual queries execute faster freeing the AMPs to execute other
queries
25
Decrease in resource consumption allows more queries to use the shared
resources, such as spool space, or CPU
Miscellaneous DDL
 Triggers
 Triggers function just the same (pre- and post-, insert /
update / delete) as in Teradata
 Oracle Options not in Teradata
 Database Links
 Reference (use) objects in another instance
 Synonyms
 Named references to objects
 Sequences
 Number generators
E
 Often used to generate surrogate keys
 Hierarchies
K
L
26
ANALYZE/Collect Statistics


ANALYZE/DBMS_STATS package are intended for collecting
database object statistics for Oracle’s Cost Based Optimizer
(CBO)
 Goal is to collect statistics give queries good access paths
 Once Plans are good and stable, stop analyzing tables to
preserve plans
Teradata’s Collect Statistics command collects database object
statistics
 Optimizer reacts to changing demographics (growing tables,
changing column value cardinality, etc.)
 Keeping statistics up to date ensures good plans
 Don’t freeze statistics - old statistics encourage old access
plans that may not be effective as the database changes
27
Data Control Language
 Data Control Language (DCL) is used in
both Oracle and Teradata to assign user
access
 Oracle users may be granted:

Object Privileges


Easily relate to Teradata
System Privileges

Can be simulated within Teradata
28
User Access Rights
 Teradata enables you to assign access
rights:




To individual users
To logical groups of users
On individual objects
On groups of objects
Tables
Views
Macros
29
Object Privileges
Granting Oracle
Privileges
Granting Teradata
Privileges
Grant Alter
Grant Drop
Grant Delete
Grant Delete
Grant Index
Grant Index or
Grant Drop Table
Grant Select
Grant Select
Grant Insert
Grant Insert
Grant Update
Grant Update
Grant Execute
Grant Execute Procedure/
Grant Execute
30
System Privileges
Granting Oracle Privileges
Granting Teradata Privileges
Grant Create Table
Grant Create Table or Grant Table
Grant Alter Any Table
Grant Create Table or Grant Table
Grant Delete On Any Table
Grant Delete on Database
Grant Drop Any Table
Grant Drop Table or Grant Table
Grant Insert Any Table
Grant Insert on Database
Grant Update Any Table
Grant Update on Database/Grant Update
(column list, all but column list)
Grant Select Any Table
Grant Select on Database
Grant Create View
Grant Create View or Grant View on
Database
Grant Create View or Grant View on
Database
Grant Create Any View
Grant Drop Any View
Grant Drop View or Grant View on
Database
31
Questions ?
32