Performance Optimization
Download
Report
Transcript Performance Optimization
Teradata
Performance
Optimization
Last Updated : 27th April 2004
Center of Excellence
Data Warehousing Group
Performance Optimization Tools and
Techniques
Performance Optimization Techniques
–
–
–
–
–
–
–
–
–
Indexes Choices
Macros & Views
Reducing Row Size – Compression
Optimization Through SQL Tuning
Join Indexes
Hash Indexes
Partitioned Primary Index
Priority Scheduler
Expanding your Teradata configuration.
Query Analysis Tools
–
–
–
–
–
–
Explain/Visual Explain Facility
Query Capture Facility
Target Level Emulation
Teradata System Emulation Tool
Teradata Index Wizard
Teradata Statistics Wizard
System Monitoring
– Resource Usage
– Teradata Manager
Indexes
Types of Indexes supported by Teradata
Unique Primary Index
Non-Unique Primary Index
Unique Secondary Index
Non-Unique Secondary Index
Unique/Non-Unique Indexes
A Unique Index identifies one & only one data row
A Non-Unique Index identifies one or many data
rows
Criteria For Index Selection
Primary Index
* The column (or column set) chosen should be the
set selection most frequently used to select rows
from the table and should be unique (UPI) or close
to unique (NUPI).
Cont...
Criteria For Index Selection
Secondary Index
* Secondary indexes always have an associated subtable.
* The column (or column set) chosen should be a
frequently used set selection.
* Large Table/Small table Joins :
Joins that involve three or more small tables and one large table
are called Large Table/Small Table (LT/ST) joins.
The optimizer first joins the small tables together and then joins
that result with the large table (also called as Product/Merge
Join).
To work well, the join fields of the small table must comprise an
index of the large table. The join fields do not have to be indexes
in the small table.
Collect statistics for all tables on their indexes used in a query.
Value Ordered NUSIs
NUSI rows are stores row hash order which is very
efficient for the queries with equality condition on
the SI column(s).
Queries with inequality conditions on the SI
column(s) typically does full table scan of SI
subtable.
Value ordered NUSI rows are sorted by data value.
It is possible to search only a portion of the index
subtable for a given range of key values.
Value-ordered NUSIs are very efficient for range
queries.
Partitioned Primary Index
Performance benefit from data locality is achieved
by creating local partitions of data rows within
each virtual AMP.
Following performance benefits can be achieved
by PPI
Large performance gain for range queries.
Improves load time.
Deleting entire partition is very fast.
PI access are not affected.
Explain Facility
Allows you to experiment with different
approaches to an answer, then select the one that
performs best.
Provides information about the relative time the
query would take to execute.
Identifying Columns to Index
Run EXPLAINs on typical queries with and without
indexes defined on various columns to determine
which performs best.
Run HELP INDEX tablename statements to
produce information helpful for interpreting the
EXPLAIN statements you run.
Cont….
Identifying Columns to Index
Run COLLECT STATISTICS on the tables to be
indexed to provide data for assessing the
cost/benefit balance afforded by indexes.
Macros
Teradata macros are SQL statements that are
stored on the server and executed there.
Macros are particularly useful for improving
performance.
Views
Views provide a means for application
programmers to develop and test SQL statements
that are highly optimized.
These views can then be provided to users who
can use them without worrying about tying up
system resources needlessly.
Well written macros provide the same facility.
Compressing Columns
Most frequent value of a column can be
compressed to reduce the row size.
Tables with large numbers of rows and fields with
limited numbers of unique values are very good
candidates for compression.
CPU cost overhead for compression processing is
minimal.
Optimization Through SQL Tuning
Correlated Sub queries
Correlated sub queries are faster than use of a temporary table
as Correlated sub queries fully integrated with the global join
plan to minimize the cost.
Case Expression
CASE expressions help increase performance because they
return multiple results in a single pass over the data.
SELECT item_number, item_description,
item_price as “Current//Price”
,CASE
WHEN item_season = ‘summer' and item_count < 3
THEN item_price *(1-.50)
WHEN item_season = ‘summer’ and item_count >= 3
THEN item_price *(1-.25)
WHEN item_season = ‘spring’ THEN item_price * (1-.33)
ELSE NULL END AS “Sale//Price”
FROM inventory_table
WHERE item_season in (‘spring’ or ‘summer’);
Cont….
Optimization Through SQL Tuning
Ordered Analytical Functions
Using Ordered Analytical Functions, you can perform data
analysis within the database engine itself taking full advantage
of Teradata parallel architecture.
CALENDAR view for date related operations.
Optimized Empty table INSERT/SELECT.
INSERT INTO Summary_Table
SELECT * FROM Store1;
INSERT INTO Summary_Table
SELECT * FROM Store1
INSERT INTO Summary_Table
SELECT * FROM Store2;
; INSERT INTO Summary_Table
SELECT * FROM Store2
INSERT INTO Summary_Table
SELECT * FROM Store3;
;INSERT INTO Summary_Table
SELECT * FROM Store3;
Join Indexes
A Join Index is a data structure that stores and
maintains join results.
Frequently executed joins can be stored in Join
Indexes to improve performance.
Join Indexes are maintained automatically.
Join Index Stores pre-joins without de-normalizing the database.
Stores summary data without de-normalizing the
database.
Replicates the tables on the same AMP for join.
Sparse Index - Can be used to index a portion of a
table.
Hash Indexes
Hash indexes create a full or partial replication of
a base table with a primary index on a foreign key
column table to facilitate joins of very large tables
by hashing them to the same AMP.
Eliminate data distribution for join processing.
Priority Scheduling
Can be used to control resources allocated to
users.
Administrator can specify performance group
while creating the user.
It manages resource distribution to improve
performance of one application at the expense of
Performance Group
other.
Allocation Group
R
Resource
Partition
}
Weight
H
Weight
M
L
Weight
Weight
40
20
10
5
Additional SMP/MPP Nodes
You can increase the performance of a
Teradata RDBMS by adding SMP nodes to
your system. Performance increases at a
nearly linear rate with the addition of SMP
nodes to the configuration.
Each MPP system is certified to support
as many as 512 nodes. More nodes can be
added on a custom basis to improve
Performance.